@kindler-king
Metadata

Reference Issue: Fixes #1491
New Tests Added: No (tests covered in PR3 per roadmap)
Documentation Updated: No

Details

What This PR Implements

This PR introduces structured error mapping in _api_calls.py, enabling the OpenML Python client to raise typed, semantically meaningful exceptions in response to:

1. HTTP Status Codes → Typed Exceptions

  • 414 → OpenMLURITooLongError
  • 429 → OpenMLRateLimitError
  • 404 → OpenMLNotFoundError
  • 408, 504 → OpenMLTimeoutError
  • 503 → OpenMLServiceUnavailableError
  • 401 → OpenMLAuthenticationError
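The status-code table above could be expressed as a simple lookup. This is only a sketch: the exception classes are local stand-ins with the names listed in this PR, their common base class is an assumption, and `HTTP_STATUS_TO_EXCEPTION` and `exception_for_status` are hypothetical names, not the actual contents of _api_calls.py.

```python
class OpenMLServerError(Exception):
    """Stand-in for the generic server error (assumed base class)."""


class OpenMLURITooLongError(OpenMLServerError):
    """HTTP 414."""


class OpenMLRateLimitError(OpenMLServerError):
    """HTTP 429."""


class OpenMLNotFoundError(OpenMLServerError):
    """HTTP 404."""


class OpenMLTimeoutError(OpenMLServerError):
    """HTTP 408 and 504."""


class OpenMLServiceUnavailableError(OpenMLServerError):
    """HTTP 503."""


class OpenMLAuthenticationError(OpenMLServerError):
    """HTTP 401."""


# Hypothetical lookup table mirroring the list in this PR description.
HTTP_STATUS_TO_EXCEPTION = {
    401: OpenMLAuthenticationError,
    404: OpenMLNotFoundError,
    408: OpenMLTimeoutError,
    414: OpenMLURITooLongError,
    429: OpenMLRateLimitError,
    503: OpenMLServiceUnavailableError,
    504: OpenMLTimeoutError,
}


def exception_for_status(status_code: int, message: str) -> OpenMLServerError:
    """Build a typed exception, falling back to the generic error for unmapped codes."""
    exc_type = HTTP_STATUS_TO_EXCEPTION.get(status_code, OpenMLServerError)
    return exc_type(f"HTTP {status_code}: {message}")
```

Unmapped status codes fall back to the generic error, so callers that already catch OpenMLServerError would keep working under this assumed hierarchy.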

2. OpenML XML Error Codes → Typed Exceptions

Includes mappings for:

  • 107 → OpenMLDatabaseConnectionError
  • 111, 372, 482, 500, 512, 542, 674 → OpenMLServerNoResult
  • 163 → OpenMLValidationError
  • 102, 137, 310, 320, 350, 400, 460 → OpenMLNotAuthorizedError
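A minimal sketch of how the XML error codes above might be dispatched. It assumes the error body uses the `oml` namespace with `code` and `message` children; the exception classes are local stand-ins, and `CODE_TO_EXCEPTION` and `exception_from_xml` are hypothetical names rather than the PR's actual implementation.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of OpenML error documents.
OML_NS = {"oml": "http://openml.org/openml"}


class OpenMLServerException(Exception):
    """Stand-in base class carrying the OpenML error code."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code


class OpenMLDatabaseConnectionError(OpenMLServerException):
    """OpenML code 107."""


class OpenMLServerNoResult(OpenMLServerException):
    """Codes meaning that the query returned nothing."""


class OpenMLValidationError(OpenMLServerException):
    """OpenML code 163."""


class OpenMLNotAuthorizedError(OpenMLServerException):
    """Codes meaning that the caller lacks permission."""


# Hypothetical table mirroring the code lists in this PR description.
CODE_TO_EXCEPTION = {
    107: OpenMLDatabaseConnectionError,
    163: OpenMLValidationError,
}
CODE_TO_EXCEPTION.update(
    dict.fromkeys((111, 372, 482, 500, 512, 542, 674), OpenMLServerNoResult)
)
CODE_TO_EXCEPTION.update(
    dict.fromkeys((102, 137, 310, 320, 350, 400, 460), OpenMLNotAuthorizedError)
)


def exception_from_xml(xml_text):
    """Parse an error document and return the matching typed exception."""
    root = ET.fromstring(xml_text)
    code_text = root.findtext("oml:code", namespaces=OML_NS)
    message = root.findtext("oml:message", default="", namespaces=OML_NS)
    if code_text is None or not code_text.strip().isdigit():
        # No usable code: keep the generic exception.
        return OpenMLServerException(message)
    code = int(code_text)
    exc_type = CODE_TO_EXCEPTION.get(code, OpenMLServerException)
    return exc_type(message, code)


sample = (
    '<oml:error xmlns:oml="http://openml.org/openml">'
    "<oml:code>111</oml:code>"
    "<oml:message>Dataset not found</oml:message>"
    "</oml:error>"
)
error = exception_from_xml(sample)  # an OpenMLServerNoResult with code 111
```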

3. Message-Based Fallback Classification

When server responses return:

  • HTML error pages
  • Partial or malformed XML
  • Generic messages

…the client now infers the most likely error category from the message text. The categories covered are authentication failures, rate limits, validation errors, missing resources, timeouts, and server outages.
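One way such a fallback could work is a keyword scan over the raw response body. The keywords, category labels, and the `classify_message` helper below are illustrative assumptions, not the exact rules used in this PR.

```python
from typing import Optional

# Hypothetical keyword rules, checked in order; the first match wins.
KEYWORD_RULES = [
    ("authentication", ("authentication", "unauthorized", "api key")),
    ("rate_limit", ("rate limit", "too many requests")),
    ("validation", ("validation", "invalid")),
    ("missing_resource", ("not found", "does not exist")),
    ("timeout", ("timed out", "timeout")),
    ("server_outage", ("service unavailable", "maintenance")),
]


def classify_message(text: str) -> Optional[str]:
    """Return a coarse error category for free-form text, or None if nothing matches."""
    lowered = text.lower()
    for category, keywords in KEYWORD_RULES:
        if any(keyword in lowered for keyword in keywords):
            return category
    return None


# Works on HTML error pages as well, since only substrings are inspected.
category = classify_message("<html><body><h1>429 Too Many Requests</h1></body></html>")
```

Each category would then be mapped to the corresponding typed exception. Returning None leaves room for the existing generic error path when no rule applies.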

4. Pre-check for Overly Long URLs (Fix for test_too_long_uri)

Before sending a request:

if len(url) > MAX_URL_LENGTH:
    raise OpenMLURITooLongError(url)
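A self-contained version of this check might look as follows. The limit of 2000 characters is an assumed placeholder (the PR description does not state the value of MAX_URL_LENGTH), and the exception class and `ensure_url_fits` helper are stand-ins for illustration.

```python
MAX_URL_LENGTH = 2000  # assumed value; the real constant may differ


class OpenMLURITooLongError(Exception):
    """Stand-in for the typed exception introduced in this PR."""

    def __init__(self, url):
        # Truncate the URL in the message so the error stays readable.
        super().__init__(
            f"URI too long: {len(url)} characters (limit {MAX_URL_LENGTH}): {url[:80]}..."
        )
        self.url = url


def ensure_url_fits(url):
    """Raise before any network call if the URL exceeds the client-side limit."""
    if len(url) > MAX_URL_LENGTH:
        raise OpenMLURITooLongError(url)
    return url


# Hypothetical endpoint with more than 10,000 comma-separated IDs.
long_url = "https://example.org/api/data/list/data_id/" + ",".join(
    str(i) for i in range(10001)
)
```

Calling `ensure_url_fits(long_url)` raises immediately, without contacting the server.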

This resolves:

  • Unnecessary network calls
  • XML parsing failures on HTML responses
  • A previously failing test described in a now-closed issue

Why This Change Is Necessary

The prior system funneled most server errors into OpenMLServerError or OpenMLServerException, providing limited diagnostic value.

Typed exceptions allow:

  • better retry logic
  • clearer user-facing error messages
  • easier debugging
  • more reliable behavior in large-scale workflows (e.g., batch model uploads)
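As an illustration of the retry point, a caller could retry only the transient error types and let permanent ones propagate. The classes below are local stand-ins named after the exceptions in this PR; `call_with_retry` and its parameters are hypothetical and not part of the OpenML client.

```python
import time


class OpenMLRateLimitError(Exception):
    """Stand-in: transient, worth retrying."""


class OpenMLTimeoutError(Exception):
    """Stand-in: transient, worth retrying."""


class OpenMLNotFoundError(Exception):
    """Stand-in: permanent, retrying will not help."""


RETRYABLE = (OpenMLRateLimitError, OpenMLTimeoutError)


def call_with_retry(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func, retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RETRYABLE:
            if attempt == max_attempts:
                raise  # out of attempts: surface the typed error
            sleep(base_delay * 2 ** (attempt - 1))
```

With only a generic OpenMLServerError, the same loop would have to inspect message strings to decide whether a retry makes sense.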

This PR also fixes a longstanding edge case where very long URLs caused inconsistent failures.

How to Reproduce the Issue

  • Construct a request with >10,000 data IDs.
  • Call a list endpoint (e.g., openml.datasets.list_datasets(...)).

Previously:

  • Server returned HTTP 414 with HTML
  • HTML was parsed as XML → crash
  • Generic OpenMLServerError raised

After this PR:

  • URL length is checked client-side
  • A clean OpenMLURITooLongError is raised immediately

Additional Notes / Pre-commit Status

Some Ruff warnings remain open and are left for maintainers to decide on:

  • C901, PLR0912, PLR0911 for __parse_server_exception complexity
  • N818 for existing exception names (OpenMLServerException, etc.)

These were intentionally not addressed in this PR to avoid breaking changes or large refactors.

kindler-king and others added 3 commits November 21, 2025 04:08
…d exceptions for different error scenarios
- Map HTTP status codes to specific exception types
- Map OpenML error codes to appropriate exceptions
- Add message-based fallback detection
- Preserve backward compatibility with existing exception hierarchy

Note: Pre-commit checks skipped due to intentional complexity in __parse_server_exception (handles 17+ error types). Will address linting rules after maintainer review.
Development

Successfully merging this pull request may close these issues.

Refactor Error Handling: Introducing Structured, Typed Exceptions for Clearer Server & Validation Errors