Unbalanced classes #272

@arthurdouillard

From @umbertocappellazzo:

Well, the split into train, test, and valid sets was made by the authors who created the corpus, and I don't know whether they then crafted different sets. Since I'm the first to use FSC in a CL scenario, I think it could be OK to proceed this way, and I understand your rigor on this matter. So you have the last word on this.
I'll take advantage of this thread to ask one thing: does Continuum handle the case of unbalanced classes for rehearsal? I had a look and I suppose not, but I want to be sure. If the dataset contains unbalanced classes, it's not fair to keep the same number of samples for each class. If Continuum doesn't cover this case, I can come up with a solution for my project and then make a PR (if you think it's worth it).

I'd see two solutions:

  • either use a sampler given to the data loader to {over,under}-sample classes
  • or use a custom RehearsalMemory where you'd allow sampling a different amount of samples per class (not sure this very particular case is worth adding to Continuum though)
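
To make the two options a bit more concrete, here is a rough, library-agnostic sketch in plain Python. Both helper names (`inverse_frequency_weights`, `proportional_quotas`) are hypothetical and are not part of Continuum's API. The first computes per-sample weights that could be handed to a sampler such as PyTorch's `torch.utils.data.WeightedRandomSampler`; the second splits a fixed memory budget across classes in proportion to their frequency, which is one possible reading of "a different amount of samples per class".

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per sample, inversely proportional to its class size.

    Hypothetical helper: each class ends up with the same total weight, so a
    weighted sampler would draw rare and frequent classes equally often.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def proportional_quotas(labels, memory_size):
    """Split a memory budget across classes in proportion to class frequency.

    Hypothetical helper using the largest-remainder method with integer
    arithmetic, so the quotas always sum exactly to memory_size.
    """
    counts = Counter(labels)
    total = len(labels)
    if not 0 <= memory_size <= total:
        raise ValueError("memory_size must be between 0 and len(labels)")

    quotas, remainders = {}, {}
    for label, count in counts.items():
        # Integer share of the budget, plus what is left over for this class.
        quotas[label], remainders[label] = divmod(memory_size * count, total)

    # Hand the unassigned slots to the classes with the largest remainders.
    leftover = memory_size - sum(quotas.values())
    for label in sorted(counts, key=lambda c: -remainders[c])[:leftover]:
        quotas[label] += 1
    return quotas


# Toy unbalanced label list: 5 samples of "a", 3 of "b", 2 of "c".
labels = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
weights = inverse_frequency_weights(labels)
quotas = proportional_quotas(labels, memory_size=4)
print(quotas)  # {'a': 2, 'b': 1, 'c': 1}
```

Note that a purely proportional split can give a very rare class a quota of zero when the budget is small, so a real implementation would probably want a minimum number of samples per class.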
