Clear Examples of use with different dataset types and code changes.  #409

@Woodr7

🚀 Feature

The README should include examples, or links to examples, of how to reformat a dataset so that it works well with LitData, starting with imagenet-tiny. How can I take a file structure in which each image sits in a folder named after its class and convert it so that, once it is processed with LitData, all of the relevant information is contained in the new structure? And how do I then need to change the training code I used before in order to use the newly optimized LitData dataset?

Motivation

This is needed to make LitData self-serve. There is no good plain-English example of going from one simple, understandable dataset type and codebase to an optimized LitData dataset and the new codebase needed to use that dataset and train the same model 20x faster. We will see more adoption if there is an example of this for as many dataset types as possible.

Pitch

Start with the existing imagenet-tiny. Show how to go from the current file structure to the file structure necessary to run ld.optimize while keeping all of the necessary information. Then show an example of how the training code needs to change in order to take advantage of the optimized cloud dataset.
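As a rough sketch of the kind of example being asked for (not a confirmed recipe from the LitData docs): the code below assumes a `root/<class_name>/.../<image>` layout, that Pillow and NumPy are installed, and that all images share one size so the default collate can stack them. The helper names (`list_samples`, `encode_sample`, `build_optimized_dataset`, `make_loader`), the extension set, and the `num_workers`/`chunk_bytes` values are made up for illustration.

```python
from pathlib import Path

# Assumption: these are the image extensions present in the dataset.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def list_samples(root):
    """Walk root/<class_name>/... and return ([(path, label), ...], class_to_idx)."""
    root = Path(root)
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    samples = [
        (str(path), class_to_idx[name])
        for name in classes
        for path in sorted((root / name).rglob("*"))
        if path.suffix.lower() in IMAGE_EXTENSIONS
    ]
    return samples, class_to_idx


def encode_sample(sample):
    """Called once per input by ld.optimize; the returned dict is one stored item."""
    import numpy as np          # assumption: NumPy is installed
    from PIL import Image       # assumption: Pillow is installed

    path, label = sample
    # Stored as a raw uint8 array so the label travels with the pixels.
    return {"image": np.asarray(Image.open(path).convert("RGB")), "label": label}


def build_optimized_dataset(input_root, output_dir):
    """Convert the class-folder dataset into LitData chunks (local or cloud path)."""
    import litdata as ld

    samples, _ = list_samples(input_root)
    ld.optimize(
        fn=encode_sample,
        inputs=samples,
        output_dir=output_dir,
        num_workers=4,          # illustrative value
        chunk_bytes="64MB",     # illustrative value
    )


def make_loader(optimized_dir, batch_size=64):
    """Stand-in for the old ImageFolder + DataLoader pair in the training script."""
    import litdata as ld

    dataset = ld.StreamingDataset(optimized_dir, shuffle=True)
    return ld.StreamingDataLoader(dataset, batch_size=batch_size)
```

Under these assumptions, the training loop would read `batch["image"]` and `batch["label"]` from `make_loader(...)` instead of unpacking `(images, labels)` tuples, and would still need its own normalization and channel reordering. `build_optimized_dataset` should be called from under `if __name__ == "__main__":`, since `ld.optimize` starts worker processes.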

Labels: documentation, enhancement
