Improve dataset handling with adaptive folder detection #331

yyq19990828 · 2025-08-21T09:28:26Z

Summary

This PR improves the dataset handling functionality in rfdetr/datasets/coco.py by adding adaptive folder detection and better fallback mechanisms for test datasets.

Key improvements:

- Adaptive val/valid folder detection: Automatically detects whether the dataset uses the val (YOLO format) or valid (COCO format) folder name
- Smart test dataset fallback: When the test split is missing (common in many datasets), the validation split is used in its place
- Reduced manual dataset modification: Users no longer need to rename folders by hand to match the expected naming convention
- Better error handling: Raises FileNotFoundError when neither a val nor a valid folder exists
- Code internationalization: Converted Chinese comments to English
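
The detection and fallback rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in rfdetr/datasets/coco.py: the helper name resolve_split_dirs, the returned dict layout, and the choice to check val before valid are all assumptions made for the example.

```python
from pathlib import Path


def resolve_split_dirs(root):
    """Map split names to folders under root (hypothetical helper, not the PR's code)."""
    root = Path(root)

    # Accept either validation folder name; "val" is checked first (an assumption).
    val_dir = next(
        (root / name for name in ("val", "valid") if (root / name).is_dir()),
        None,
    )
    if val_dir is None:
        raise FileNotFoundError(f"Neither 'val' nor 'valid' folder found in {root}")

    # Fall back to the validation folder when the dataset has no test split.
    test_dir = root / "test"
    if not test_dir.is_dir():
        test_dir = val_dir

    return {"train": root / "train", "val": val_dir, "test": test_dir}
```

With this shape, a dataset containing only train and valid folders resolves both the "val" and "test" entries to the valid folder, so no renaming is needed.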

Why this change?

- Most datasets don't include a separate test split, so a fallback is needed
- YOLO datasets typically use a val folder, while COCO datasets use a valid folder
- Users can work with either dataset format without restructuring folders by hand

Test plan

- Verify the code handles both val and valid folder structures
- Confirm the test dataset fallback works when the test folder is missing
- Ensure proper error handling when validation folders are missing
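
One way to exercise the three checks above is a small harness that builds throwaway folder layouts. This is only a sketch: make_layout and check_layouts are hypothetical names, and the harness assumes the code under test can be wrapped as a callable resolve(root) that returns a dict with "val" and "test" paths and raises FileNotFoundError when no validation folder exists.

```python
import tempfile
from pathlib import Path


def make_layout(root, folders):
    """Create empty split folders under root to mimic a dataset layout."""
    for name in folders:
        (Path(root) / name).mkdir(parents=True, exist_ok=True)


def check_layouts(resolve):
    """Run the three test-plan checks against a resolver callable.

    resolve(root) is assumed to return a dict with "val" and "test" paths
    and to raise FileNotFoundError when no validation folder exists.
    """
    # 1. Both validation folder names are accepted.
    for val_name in ("val", "valid"):
        with tempfile.TemporaryDirectory() as root:
            make_layout(root, ["train", val_name, "test"])
            splits = resolve(root)
            assert Path(splits["val"]).name == val_name
            assert Path(splits["test"]).name == "test"

    # 2. A missing test folder falls back to the validation folder.
    with tempfile.TemporaryDirectory() as root:
        make_layout(root, ["train", "valid"])
        splits = resolve(root)
        assert Path(splits["test"]) == Path(splits["val"])

    # 3. No validation folder at all is an error.
    with tempfile.TemporaryDirectory() as root:
        make_layout(root, ["train"])
        try:
            resolve(root)
        except FileNotFoundError:
            pass
        else:
            raise AssertionError("expected FileNotFoundError")
```

Calling check_layouts with a thin wrapper around the real dataset-building code would cover all three items without needing any images or annotation files on disk.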

🤖 Generated with Claude Code

CLAassistant · 2025-08-21T09:28:39Z

All committers have signed the CLA.

…st dataset fallback

- Add adaptive val/valid folder detection to support both YOLO (val) and COCO (valid) dataset structures
- Implement fallback mechanism for test dataset since most datasets don't include test split
- Reduce the need for manual dataset modification by automatically handling different folder naming conventions
- Add proper error handling with FileNotFoundError for missing validation folders
- Convert Chinese comments to English for better internationalization

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>

isaacrob-roboflow · 2025-08-21T18:28:33Z

I think this is interesting but I will say that datasets SHOULD include a test set that's not the val set ;) often nowadays people benchmark on COCO val, but that's not because that form is fine, it's because the test set is hidden away on a private server .. ultralytics notably does NOT report test set numbers on datasets they train on, but imo that gives a misleading measure of final accuracy because they're also picking the best checkpoint based on val score so the result is biased

so I would still like it to be clear when folks train a model that they SHOULD have a test set .. but handling val vs valid seems logical to me

yyq19990828 requested review from Matvezy, SkalskiP, isaacrob-roboflow and probicheaux as code owners August 21, 2025 09:28

yyq19990828 force-pushed the feature/adaptive-dataset-folder-detection branch from fcd7819 to 2c9e689 Compare August 21, 2025 09:37
