feat: Add CLI commands for browsing and searching OpenML runs #1510
Metadata
openml runs list, openml runs info, and openml runs download"
Details
What does this PR implement/fix?
This PR adds three new CLI subcommands under openml runs to improve the user experience of the run catalogue:
- openml runs list - List runs with optional filtering (task_id, flow_id, uploader, tag, pagination, output format)
- openml runs info <run_id> - Display detailed information about a specific run, including task, flow, evaluations, and parameter settings
- openml runs download <run_id> - Download a run and save its predictions to the local cache
Why is this change necessary? What is the problem it solves?
Currently, users must write Python code to browse or search OpenML runs, even for simple tasks like listing runs for a specific task or downloading run results. This creates a barrier to entry and makes the run catalogue less accessible. Adding CLI commands allows users to interact with the run catalogue directly from the command line without writing code.
This directly addresses the ESoC 2025 goal of "Improving user experience of the run catalogue in AIoD and OpenML".
How can I reproduce the issue this PR is solving and its solution?
Before (requires Python code):
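A minimal sketch of the Python-only workflow, assuming the openml.runs.list_runs() function this PR builds on. The helper name runs_for_task and its list_fn parameter are illustrative and not part of the PR; list_fn exists only so the sketch can be exercised without network access.

```python
from typing import Any, Callable, Optional


def runs_for_task(task_id: int, size: int = 10,
                  list_fn: Optional[Callable[..., Any]] = None) -> Any:
    """List up to `size` runs for one task (hypothetical helper)."""
    if list_fn is None:
        # Imported lazily so the sketch can be read and tested without
        # the openml package installed or a network connection.
        import openml

        list_fn = openml.runs.list_runs
    # list_runs takes its ID filters as lists, so wrap the single task ID.
    return list_fn(task=[task_id], size=size)
```

Calling runs_for_task(task_id) with a real task ID queries the OpenML server; the exact return type depends on the installed openml-python version.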
After (CLI commands):
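As a rough illustration of how the new commands parse, the sketch below rebuilds the documented subcommands and the runs list flags with argparse. It is a stand-in, not the code in openml/cli.py: build_parser is a hypothetical name, the integer types for --task, --flow, and run_id are assumptions, and the argument values are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser for `openml runs` (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="openml")
    top = parser.add_subparsers(dest="command", required=True)
    runs = top.add_parser(
        "runs", description="Browse and search OpenML runs from the command line."
    )
    sub = runs.add_subparsers(dest="runs_command", required=True)

    # Flags and defaults follow the `openml runs list --help` text in this PR.
    lst = sub.add_parser("list", description="List runs with optional filtering.")
    lst.add_argument("--task", type=int, help="Filter by task ID")
    lst.add_argument("--flow", type=int, help="Filter by flow ID")
    lst.add_argument("--uploader", help="Filter by uploader name or ID")
    lst.add_argument("--tag", help="Filter by tag")
    lst.add_argument("--size", type=int, default=10, help="Number of runs to retrieve")
    lst.add_argument("--offset", type=int, default=0, help="Offset for pagination")
    lst.add_argument("--format", choices=["list", "table", "json"], default="list")
    lst.add_argument("--verbose", action="store_true", help="Show detailed information")

    # `info` and `download` each take a single positional run ID.
    for name in ("info", "download"):
        cmd = sub.add_parser(name)
        cmd.add_argument("run_id", type=int)
    return parser


# Equivalent of typing: openml runs list --task 1 --format table
args = build_parser().parse_args(["runs", "list", "--task", "1", "--format", "table"])
print(args.runs_command, args.task, args.format, args.size)
```

The same parser accepts the other two commands as openml runs info <run_id> and openml runs download <run_id>.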
Implementation Details
Files Modified:
- openml/cli.py - Added three new functions and integration into main CLI parser
  - runs_list() - List runs with filtering and formatting
  - runs_info() - Display detailed run information
  - runs_download() - Download and cache runs
  - Helpers: _format_runs_output(), _format_runs_table(), _format_runs_list(), _print_run_evaluations()
  - runs() - Dispatcher for runs subcommands
  - main() - Updated to register runs subparser
Files Created:
- tests/test_openml/test_cli.py - Comprehensive test suite with 18 tests
Key Features:
openml.runs.list_runs() and openml.runs.get_run() functions
Testing
All tests pass successfully:
Test Coverage:
- test_runs_list_simple - Basic list functionality
- test_runs_list_with_filters - Filtering with task, flow, uploader, tag
- test_runs_list_verbose - Verbose output mode
- test_runs_list_table_format - Table format output
- test_runs_list_json_format - JSON format output
- test_runs_list_empty_results - Empty result handling
- test_runs_list_error_handling - Error scenarios
- test_runs_info - Detailed run information display
- test_runs_info_with_fold_evaluations - Fold evaluation summary
- test_runs_info_error_handling - Info error scenarios
- test_runs_download - Download and cache functionality
- test_runs_download_error_handling - Download error scenarios
- test_runs_dispatcher - Command routing
- test_runs_dispatcher_invalid_subcommand - Invalid command handling
Code Quality
CLI Help Output
$ openml runs --help
usage: openml runs [-h] {list,info,download} ...

Browse and search OpenML runs from the command line.

positional arguments:
  {list,info,download}
    list                List runs with optional filtering.
    info                Display detailed information about a specific run.
    download            Download a run and cache it locally.

$ openml runs list --help
usage: openml runs list [-h] [--task TASK] [--flow FLOW] [--uploader UPLOADER] [--tag TAG] [--size SIZE] [--offset OFFSET] [--format {list,table,json}] [--verbose]

List runs with optional filtering.

options:
  --task TASK           Filter by task ID
  --flow FLOW           Filter by flow ID
  --uploader UPLOADER   Filter by uploader name or ID
  --tag TAG             Filter by tag
  --size SIZE           Number of runs to retrieve (default: 10)
  --offset OFFSET       Offset for pagination (default: 0)
  --format {list,table,json}
                        Output format (default: list)
  --verbose             Show detailed information
Benefits
Future Enhancements (Optional)
Potential improvements for future PRs:
- --sort option for custom ordering
- --export option to save results to file
- openml runs compare to compare multiple runs
Screenshots/Examples
List Command (Simple):
List Command (Table Format):
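To give a sense of what a table view involves, here is a minimal sketch of rendering run records as fixed-width text. It is not the PR's _format_runs_table(); format_table is a hypothetical helper, and the column names and values in the usage lines are placeholders rather than real OpenML output.

```python
from typing import Dict, List, Sequence


def format_table(rows: Sequence[Dict[str, object]], columns: List[str]) -> str:
    """Render dict records as a fixed-width text table (hypothetical helper)."""
    # Convert every cell to text first so column widths can be measured.
    cells = [[str(row.get(col, "")) for col in columns] for row in rows]
    widths = [
        max([len(col)] + [len(r[i]) for r in cells])
        for i, col in enumerate(columns)
    ]

    def line(values: Sequence[str]) -> str:
        # Pad each cell to its column width, then trim trailing spaces.
        return " | ".join(v.ljust(w) for v, w in zip(values, widths)).rstrip()

    rule = "-+-".join("-" * w for w in widths)
    return "\n".join([line(columns), rule] + [line(r) for r in cells])


# Placeholder records, for illustration only.
sample = [{"run_id": 1, "task_id": 20}, {"run_id": 100, "task_id": 3}]
print(format_table(sample, ["run_id", "task_id"]))
```

Measuring widths from the data keeps columns aligned regardless of how long the IDs or names are, and an empty result still prints a header.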
Info Command:
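The fold evaluation summary mentioned in the test list could be computed along these lines. This is a sketch under assumptions: it expects a nested mapping of measure -> repeat -> fold -> value, and the choice of mean and population standard deviation is mine, not necessarily what _print_run_evaluations() reports. summarize_fold_evaluations is a hypothetical name.

```python
from statistics import mean, pstdev
from typing import Dict, Tuple

# Assumed shape: measure name -> repeat index -> fold index -> score.
FoldEvals = Dict[str, Dict[int, Dict[int, float]]]


def summarize_fold_evaluations(
    fold_evals: FoldEvals,
) -> Dict[str, Tuple[float, float, int]]:
    """Return (mean, std, count) per measure over all repeats and folds."""
    summary: Dict[str, Tuple[float, float, int]] = {}
    for measure, repeats in fold_evals.items():
        # Flatten every repeat/fold score for this measure into one list.
        values = [v for folds in repeats.values() for v in folds.values()]
        if values:  # skip measures that have no recorded scores
            summary[measure] = (mean(values), pstdev(values), len(values))
    return summary
```

Reporting the count alongside the mean makes it visible when a measure was only recorded for some folds.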
Any other comments?
develop branch
This implementation makes the OpenML run catalogue accessible directly from the command line, in line with the project's goal of improving the user experience for both novice and experienced users.