
Using ijson to avoid loading full json in memory. #356

Open
SamimAB wants to merge 2 commits into Vision-CAIR:main from SamimAB:dataset_memory_saving


@SamimAB SamimAB commented Sep 20, 2023

Use ijson to parse the dataset JSON item by item instead of loading the whole file into memory, so that dataset/convert_cc_sbu.py and dataset/convert_laion.py can run on machines with low RAM.

Added ijson==3.2.3 to environment.yml.
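
For readers unfamiliar with ijson, the item-by-item approach described above could look roughly like the sketch below. It is not the code from this PR. It assumes the input file is a single top-level JSON array of flat objects that all share the same keys, and the names `iter_records` and `json_to_tsv` are hypothetical. The fallback to `json.load` is only there so the sketch still runs without ijson installed; it loads the whole file and gives up the memory saving.

```python
import csv
import json

try:
    import ijson  # third-party streaming JSON parser
except ImportError:
    ijson = None


def iter_records(json_path):
    """Yield the elements of a top-level JSON array one at a time."""
    if ijson is not None:
        # ijson reads the file incrementally; the "item" prefix selects
        # each element of the top-level array.
        with open(json_path, "rb") as f:
            yield from ijson.items(f, "item")
    else:
        # Fallback without ijson: loads the whole file into memory.
        with open(json_path, "r", encoding="utf-8") as f:
            yield from json.load(f)


def json_to_tsv(json_path, tsv_path):
    """Stream records from json_path into a TSV file; return the row count."""
    count = 0
    with open(tsv_path, "w", newline="", encoding="utf-8") as out:
        writer = None
        for record in iter_records(json_path):
            if writer is None:
                # Assumption: the first record's keys define the columns.
                writer = csv.DictWriter(
                    out, fieldnames=list(record.keys()), delimiter="\t"
                )
                writer.writeheader()
            writer.writerow(record)
            count += 1
    return count
```

With ijson available, only one record is held in memory at a time, so peak memory use no longer scales with the size of the input file.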

lzhhha commented Nov 27, 2023

pyarrow.lib.ArrowInvalid: CSV parse error: Expected 2 columns, got 8: a colorful arrow with several lights http://themintlist.com/media/catalog/product/cache/1/small_ ...

