Tutorial on COCO and VOC parsers #187

Closed
lgvaz opened this issue Jul 18, 2020 · 5 comments
Labels
documentation Improvements or additions to documentation example request good first issue Good for newcomers help wanted Extra attention is needed priority-high

Comments

@lgvaz
Collaborator

lgvaz commented Jul 18, 2020

📓 New Tutorial

COCO and VOC are the two most common annotation formats. We need a tutorial (or two, one for each) that shows how to use them.

mentioned in #79


Don't remove
Main issue for examples: #39

@lgvaz lgvaz added documentation Improvements or additions to documentation help wanted Extra attention is needed good first issue Good for newcomers example request priority-high labels Jul 18, 2020
@lgvaz lgvaz self-assigned this Jul 18, 2020
@shimsan

shimsan commented Jul 19, 2020

@lgvaz, do you think it would be a good idea to have mantisshrimp automatically create the classes dictionary mapping IDs to label names?

The annotations.json has both a category_id and a category_name.

I was trying to create a tutorial for COCO Object Detection, but then realized I might need to parse the JSON myself to get this info out, which seems redundant.

I would do something like this (code from fastai's get_annotations):

import json
with open(fname) as f:  # fname: path to the COCO annotations file
    annot_dict = json.load(f)
classes = {o['id']: o['name'] for o in annot_dict['categories']}  # category id -> name

@lgvaz
Collaborator Author

lgvaz commented Jul 19, 2020

This is a topic that we've been discussing internally over the past few weeks.

It's very hard to write an automated extractor of classes that safely works for all use cases (all Parsers). But we might be able to design something specific for COCO.

  • The categories from the original COCO dataset correctly start counting from 1, but what happens if the user wasn't aware that id 0 is reserved for background and starts counting from 0?
  • To make it even more complicated, I believe there might be some models that don't follow the convention that 0 is background. For example, I think the new DETR implementation doesn't expect a background class as input at all. How can we handle such cases?
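
To make the first point concrete, here is a minimal sketch of a check a user could run on their own annotations. It is not part of mantisshrimp: the function name `find_background_conflicts` and its parameters are made up for illustration, and it assumes the usual COCO layout where `categories` is a list of dicts with `id` and `name` keys.

```python
def find_background_conflicts(categories, background_id=0, background_names=("background",)):
    """Return categories that occupy `background_id` without being named like a background class.

    Hypothetical helper for illustration only; not a mantisshrimp API.
    """
    return [
        c for c in categories
        if c["id"] == background_id and c["name"].lower() not in background_names
    ]


# Original COCO style: ids start at 1, so id 0 stays free for background.
coco_style = [{"id": 1, "name": "person"}, {"id": 2, "name": "bicycle"}]
# A user who started counting from 0 without knowing the convention.
zero_based = [{"id": 0, "name": "person"}, {"id": 1, "name": "bicycle"}]

conflicts_coco = find_background_conflicts(coco_style)
conflicts_zero = find_background_conflicts(zero_based)
```

A check like this only flags the problem. Whether id 0 should be treated as background still depends on the model being used, which is why the decision is left to the user.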

The solution we came up with for now is to let the user figure out how to create classes and handle the conventions themselves (because we cannot make any assumptions).

But maybe we can provide some helper functions so the standard use case is facilitated... I'm open to ideas =)

@shimsan

shimsan commented Jul 20, 2020

The standard COCO JSON will already contain the category_ids and category_names.

    "categories": [
        {
            "id": 0,
            "name": "Background"
        },
        {
            "id": 1,
            "name": "Label1"
        },
        {
            "id": 2,
            "name": "Label2"
        },
        {
            "id": 3,
            "name": "Label3"
        }
    ]

So it is a matter of reading the JSON, which the COCO parser already does.

But you have already implemented a helper function which now makes this simpler.

I agree that the user needs to understand the model they want to use and the requirements thereof.

@lgvaz
Collaborator Author

lgvaz commented Jul 20, 2020

I just merged this helper function. You can now do:

class_map = datasets.coco.class_map("path_to_annotations_file", background=0)

This assumes that background is not already present in your categories. If it is (or you don't need a background id), you can pass background=None.
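
As a rough illustration of the behavior described above, the following sketch builds an id-to-name mapping and optionally adds a background entry. It is not the code of the merged helper: `sketch_class_map`, its return type, and the choice to raise an error on a conflicting id are all assumptions made for this example.

```python
def sketch_class_map(categories, background=0):
    """Map category ids to names, optionally reserving an id for background.

    Hypothetical sketch; the real `datasets.coco.class_map` may behave differently.
    `categories` is the parsed "categories" list from a COCO annotations file.
    """
    id_to_name = {c["id"]: c["name"] for c in categories}
    if background is not None:
        # Assumption: refuse to overwrite a category that already uses this id.
        if background in id_to_name:
            raise ValueError(
                f"id {background} is already used by {id_to_name[background]!r}; "
                "pass background=None if background is already in the categories"
            )
        id_to_name[background] = "background"
    return id_to_name


categories = [{"id": 1, "name": "Label1"}, {"id": 2, "name": "Label2"}]
with_background = sketch_class_map(categories, background=0)
without_background = sketch_class_map(categories, background=None)
```

With annotations that already list a background category at id 0, like the JSON posted earlier in this thread, `background=None` would be the matching choice in this sketch.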

@lgvaz
Collaborator Author

lgvaz commented Aug 13, 2020

Done! See the tutorials in the docs =)

@lgvaz lgvaz closed this as completed Aug 13, 2020

2 participants