
Make dataset viewer more flexible in displaying metadata alongside images #7123

Open
@egrace479


Feature request

To display images alongside their associated metadata in the dataset viewer, a file named metadata.csv is required. For a dataset with multiple subsets, each CSV has to live in the same folder as its images, since every one of them must be named metadata.csv. The request is to make this more flexible for datasets with multiple subsets, so that a metadata.csv does not have to be placed in each image directory, where it is not as easily accessed.

Motivation

When creating datasets with multiple subsets, I can't get the images to display alongside their associated metadata (usually only one or the other shows up). Since this requires a file specifically named metadata.csv, I then have to place that file within the image directory, which makes it much more difficult to access. Even then, the images are not necessarily displayed correctly alongside their metadata (see, for instance, this discussion).

On another dataset struggling with a similar issue, it was suggested that I bring this discussion to GitHub (discussion). That dataset is a mix of subsets: some just reference image URLs, while others have the images uploaded. The subsets with uploaded images are not displaying them, but renaming their metadata file to just metadata.csv would diminish the clarity of the dataset's construction (and I'm not entirely convinced it would solve the issue).

Your contribution

I can make a suggestion for one approach to address the issue:

For instance, even just allowing the metadata filename to end in _metadata.csv or -metadata.csv would be very helpful, giving more flexibility in dataset structure without impacting clarity. I would think that the backend functionality that looks for metadata.csv could reasonably be adapted to look for such an ending on a filename (maybe also checking that the file has a file_name column?).
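
To make the suggestion concrete, here is a minimal sketch of what such a check might look like. It is not the viewer's actual backend code: the function names and the `METADATA_SUFFIXES` constant are hypothetical, and the suffix rule is only the one proposed in this issue.

```python
import csv
import io

# Hypothetical suffixes proposed in this issue; not current viewer behavior.
METADATA_SUFFIXES = ("_metadata.csv", "-metadata.csv")


def is_metadata_filename(name: str) -> bool:
    """Accept 'metadata.csv' or any basename ending in a proposed suffix."""
    base = name.rsplit("/", 1)[-1]  # drop any directory part
    return base == "metadata.csv" or base.endswith(METADATA_SUFFIXES)


def has_file_name_column(csv_text: str) -> bool:
    """Check that the CSV header row contains a 'file_name' column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return "file_name" in header
```

With this rule, a file such as `train_metadata.csv` would be picked up, while an unrelated `labels.csv` would not. The extra `file_name` check would guard against CSVs that merely happen to share the suffix.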

Presumably, requiring configs in a setup like the one on this dataset could also help in figuring out how it should work?

configs:
  - config_name: <image subset>
    data_files:
      - <image-metadata>.csv
      - <path/to/images>/*.jpg
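
As a rough illustration of how a config like this might be used to pair a metadata file with its images, the sketch below splits one config's `data_files` into CSV entries and image patterns. It assumes the config has already been parsed into a dict and that `data_files` is a list of strings. The `split_data_files` helper, the `IMAGE_EXTENSIONS` tuple, and the example names are all hypothetical, not part of any existing library.

```python
# Hypothetical set of image extensions, for illustration only.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def split_data_files(config: dict) -> tuple[list[str], list[str]]:
    """Split a config's data_files into (metadata CSVs, image patterns).

    Assumes data_files is a list of strings; entries that are neither
    CSVs nor image patterns are ignored.
    """
    metadata, images = [], []
    for pattern in config.get("data_files", []):
        lowered = pattern.lower()
        if lowered.endswith(".csv"):
            metadata.append(pattern)
        elif lowered.endswith(IMAGE_EXTENSIONS):
            images.append(pattern)
    return metadata, images


# Example config with made-up names standing in for the placeholders above.
example = {
    "config_name": "species_images",
    "data_files": ["metadata/species-metadata.csv", "images/species/*.jpg"],
}
csv_files, image_patterns = split_data_files(example)
```

Here the CSV can sit anywhere in the repository, since the config itself states which metadata file belongs to which images.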

I'd also be happy to look at whatever solution is decided upon and contribute to the ideation.

Thanks for your time and consideration! The dataset viewer really is fabulous when it works :)


Labels: enhancement (New feature or request)
