Any way to build apkg from command line without GUI? Is it possible to merge improvements and corrections to cards during apkg import without losing progress?

Another way to generate .apkg files is by programmatically reusing the desktop version with Python. Extend the module search path so the desktop modules are importable: PYTHONPATH=/usr/share/anki python. Then you can adapt the following example to your needs:

```python
import anki
from anki.exporting import AnkiPackageExporter

collection = anki.Collection(os.path.join(TMPDIR, 'collection.anki2'))
deck_id = collection.decks.id(FBASENAME + "_deck")
model = ...  # create or fetch a note model here
model['id'] = 12345678  # essential for upgrade detection
note = anki.notes.Note(collection, model)
```

As long as you keep note.guid and the model the same, you can import the DB and update cards without losing progress!

To build on gavenkoa's answer, the Anki API has built-in functionality to import from CSV. First of all, you can install the anki Python package using pip. Here's a very basic example to import from CSV and export a deck to an Anki package (.apkg) file:

```python
import anki
from anki.importing import TextImporter

collection = anki.Collection('/path/to/test.anki2')

# Create a new deck in the collection (otherwise the "Default" deck will be used)
deck_id = collection.decks.id('Deck name')

# Import cards from CSV into the new collection
importer = TextImporter(collection, '/path/to/test.csv')
```
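TextImporter consumes a plain delimited text file, so arbitrary CSV data can first be reshaped with Python's standard csv module before the file is handed to the importer. A minimal sketch, assuming a hypothetical normalize_csv helper and placeholder column positions (none of this comes from the Anki API itself):

```python
import csv

def normalize_csv(src_path, dst_path, front_col=0, back_col=1):
    """Rewrite an arbitrary CSV as a simple two-column front/back file.

    Hypothetical helper: the column positions are placeholders for
    whatever layout your source CSV actually has.
    """
    with open(src_path, newline='', encoding='utf-8') as src, \
         open(dst_path, 'w', newline='', encoding='utf-8') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Skip rows too short to contain both columns.
            if len(row) > max(front_col, back_col):
                writer.writerow([row[front_col], row[back_col]])
```

The resulting two-column file can then be passed to TextImporter unchanged.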
I am currently working on a Spring-based API which has to transform CSV data and expose it as JSON. It has to read big CSV files which will contain more than 500 columns and 2.5 million lines each. I am not guaranteed to have the same header between files (each file can have a completely different header than another), so I have no way to create a dedicated class which would provide a mapping with the CSV headers. Currently the API controller is calling a CSV service which reads the CSV data using a BufferedReader. The code works fine on my local machine, but it is very slow: it takes about 20 seconds to process 450 columns and 40,000 lines. To improve processing speed, I tried to implement multithreading with Callable(s), but I am not familiar with that kind of concept, so the implementation might be wrong. Other than that, the API is running out of heap memory when running on the server. I know that a solution would be to increase the amount of available memory, but I suspect that the replace() and split() operations on strings made in the Callable(s) are responsible for consuming a large amount of heap memory.

How could I improve the speed of the CSV reading? Is the multithreaded implementation with Callable correct? How could I reduce the amount of heap memory used in the process? Would StringBuilder be of any help here? What about StringTokenizer? Do you know of a different approach to split at commas and replace the double quotes in each CSV line?

I don't think that splitting this work onto multiple threads is going to provide much improvement, and it may in fact make the problem worse by consuming even more memory. The main problem is using too much heap memory, and the performance problem is likely to be due to excessive garbage collection when the remaining available heap is very small (but it's best to measure and profile to determine the exact cause of performance problems). The memory consumption comes less from the replace and split operations, and more from the fact that the entire contents of the file need to be read into memory in this approach. Each line may not consume much memory, but multiplied by millions of lines, it all adds up. If you have enough memory available on the machine to assign a heap size large enough to hold the entire contents, that will be the simplest solution, as it won't require changing the code. Otherwise, the best way to deal with large amounts of data in a bounded amount of memory is to use a streaming approach. This means that each line of the file is processed and then passed directly to the output, without collecting all of the lines in memory in between. This will require changing the method signature to use a return type other than List. Assuming you are using Java 8 or later, the Stream API can be very helpful.
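The streaming approach described above can be sketched with Files.lines, which exposes the file as a lazy Stream<String>: each line is split and handed to a consumer (for example a JSON writer) instead of being collected into a List. The class and method names here are illustrative, not taken from the question's code, and the naive replace/split deliberately mirrors the question's parsing, so it still mishandles quoted fields that contain commas:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;
import java.util.stream.Stream;

public class CsvStreamer {
    // Process a CSV file line by line; each parsed row is handed to
    // `rowHandler` instead of being accumulated into a List in memory.
    public static void stream(Path csvFile, Consumer<String[]> rowHandler) {
        try (Stream<String> lines = Files.lines(csvFile)) {
            lines.map(line -> line.replace("\"", "").split(",", -1))
                 .forEach(rowHandler);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because Files.lines is lazy and the try-with-resources block closes the stream, only one line is resident on the heap at a time; that bounded footprint, not the string operations themselves, is what addresses the OutOfMemoryError.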