feat: Enhanced model load error diagnostics with structured error types #98

Open
Sumanth-806307 wants to merge 3 commits into mlc-ai:main from Sumanth-806307:feat/model-load-error-diagnostics

@Sumanth-806307 Sumanth-806307 commented Feb 18, 2026

Summary

Implements Phase 1 of issue #96: Enhanced error diagnostics for model loading failures.

This PR addresses the model loading failures reported in issue #85 by introducing structured error handling, classification, and user-actionable error messages.

Changes Made

1. Added Structured Error Types (app/client/api.ts)

  • Added ModelLoadErrorCode enum with 7 error categories
  • Created ModelLoadError interface extending Error with diagnostic fields
  • Implemented createModelLoadError helper function
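A minimal sketch of what these types might look like, not the PR's actual code. The string values are the ones quoted elsewhere in this PR; only six of the seven categories are named on this page, so the seventh is left out rather than guessed. The `retryable` flag comes from the description, while the `context` field and the `options` parameter shape are assumptions made for illustration.

```typescript
// Sketch only: lists the error codes that are named in this PR's text.
enum ModelLoadErrorCode {
  WEBGPU_INIT_FAILED = "webgpu_init_failed",
  ARTIFACT_FETCH_FAILED = "artifact_fetch_failed",
  CACHE_INVALID = "cache_invalid",
  WORKER_INIT_FAILED = "worker_init_failed",
  NETWORK_ERROR = "network_error",
  UNKNOWN_ERROR = "unknown_error",
}

// Error extended with diagnostic fields.
interface ModelLoadError extends Error {
  code: ModelLoadErrorCode;
  retryable: boolean;
  context?: Record<string, unknown>; // hypothetical field name
}

// Helper that builds a ModelLoadError from a plain Error.
// The options object is an assumed signature, not the PR's confirmed one.
function createModelLoadError(
  code: ModelLoadErrorCode,
  message: string,
  options: { retryable?: boolean; context?: Record<string, unknown> } = {},
): ModelLoadError {
  const error = new Error(message) as ModelLoadError;
  error.name = "ModelLoadError";
  error.code = code;
  error.retryable = options.retryable ?? false;
  error.context = options.context;
  return error;
}
```

Building on a real `Error` instance keeps the stack trace and lets existing `instanceof Error` checks continue to work.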

2. Enhanced Error Classification (app/client/webllm.ts)

  • Imported new error types and utilities
  • Replaced generic error handling with structured classification
  • Added classifyError method that categorizes errors into:
    • webgpu_init_failed - Browser lacks WebGPU support
    • artifact_fetch_failed - Network/CDN download failures
    • cache_invalid - Cache corruption issues
    • worker_init_failed - Web Worker startup failures
    • unknown_error - Unclassified errors
  • Each error includes retryable flag and diagnostic context
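The classification step could be sketched roughly as follows, assuming it matches keywords in the error message (as the bot's changelog on this page describes). The regular expressions, the rule order, and the `retryable` value chosen per category are illustrative assumptions, not the PR's actual rules.

```typescript
enum ModelLoadErrorCode {
  WEBGPU_INIT_FAILED = "webgpu_init_failed",
  ARTIFACT_FETCH_FAILED = "artifact_fetch_failed",
  CACHE_INVALID = "cache_invalid",
  WORKER_INIT_FAILED = "worker_init_failed",
  UNKNOWN_ERROR = "unknown_error",
}

interface ModelLoadError extends Error {
  code: ModelLoadErrorCode;
  retryable: boolean;
}

// Ordered keyword rules; the first matching rule wins.
// Patterns and retryable flags are assumptions for this sketch.
const RULES: Array<{ pattern: RegExp; code: ModelLoadErrorCode; retryable: boolean }> = [
  { pattern: /webgpu|gpu adapter/i, code: ModelLoadErrorCode.WEBGPU_INIT_FAILED, retryable: false },
  { pattern: /fetch|network|cdn/i, code: ModelLoadErrorCode.ARTIFACT_FETCH_FAILED, retryable: true },
  { pattern: /cache|indexeddb/i, code: ModelLoadErrorCode.CACHE_INVALID, retryable: true },
  { pattern: /worker/i, code: ModelLoadErrorCode.WORKER_INIT_FAILED, retryable: true },
];

function classifyError(err: unknown): ModelLoadError {
  // Non-Error values are stringified so they can still be matched.
  const message = err instanceof Error ? err.message : String(err);
  const rule = RULES.find((r) => r.pattern.test(message));
  const classified = new Error(message) as ModelLoadError;
  classified.name = "ModelLoadError";
  classified.code = rule?.code ?? ModelLoadErrorCode.UNKNOWN_ERROR;
  classified.retryable = rule?.retryable ?? false;
  return classified;
}
```

Because the rules are checked in order, a message that mentions both fetching and the cache is classified by whichever rule appears first, so rule order is part of the behavior.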

3. Improved Error Display (app/store/chat.ts)

  • Updated onError handler to display structured errors
  • Added error code badges (❌) for visual clarity
  • Implemented contextual retry hints based on error type
  • Added specific guidance for WebGPU compatibility errors
  • Enhanced console logging with structured diagnostic data
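The display logic might look something like the sketch below. The function names `isStructuredError` and `formatErrorForDisplay`, the fallback message, and the generic retry hint wording are hypothetical; the badge and the WebGPU guidance text follow the "After" example in this PR. The type guard is one way to keep non-structured errors working.

```typescript
interface StructuredError {
  message: string;
  code: string;
  retryable: boolean;
}

// Type guard: true only for objects carrying the structured fields.
function isStructuredError(err: unknown): err is StructuredError {
  if (typeof err !== "object" || err === null) return false;
  const candidate = err as Record<string, unknown>;
  return (
    typeof candidate.code === "string" &&
    typeof candidate.message === "string" &&
    typeof candidate.retryable === "boolean"
  );
}

function formatErrorForDisplay(err: unknown): string {
  // Fallback for plain errors; the exact wording is an assumption.
  if (!isStructuredError(err)) {
    const text = err instanceof Error ? err.message : String(err);
    return `Error while initializing the model: ${text}`;
  }
  const lines = [`❌ **${err.code}**`, "", err.message];
  if (err.code === "webgpu_init_failed") {
    lines.push(
      "",
      "💡 **Your browser doesn't support WebGPU**. Please use Chrome 113+, Edge 113+, or check compatibility at https://caniuse.com/webgpu",
    );
  } else if (err.retryable) {
    // Hypothetical hint text for retryable errors.
    lines.push("", "💡 This error may be temporary. Please try again.");
  }
  return lines.join("\n");
}
```

Reading the message from `err.message` instead of interpolating the error object is what avoids the `[object Object]` output shown in the "Before" example.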

4. Documentation

  • Added analysis and planning documents
  • Created implementation prompts for reproducibility

Benefits

  • Better Diagnostics: Every error now has a classification code
  • Actionable Guidance: Users get specific steps to resolve issues
  • Improved Debugging: Console logs include structured error details
  • User Experience: Clear, helpful error messages instead of technical jargon
  • Future-Ready: Foundation for retry logic (Phase 2) and telemetry

Testing

  • ✅ No TypeScript compilation errors
  • ✅ Code follows existing patterns and style
  • ✅ All error paths now use structured errors
  • ✅ Backward compatible (gracefully handles non-structured errors)

Manual Testing Recommendations

  1. WebGPU Error: Test in a browser without WebGPU (for example, a Firefox build where WebGPU is unavailable or disabled) → should show clear upgrade message
  2. Network Error: Disconnect network during model load → should classify as artifact_fetch_failed
  3. Cache Error: Manually corrupt IndexedDB → should classify as cache_invalid
  4. Normal Flow: Load model normally → should work without regressions

Related Issues

Closes #96

Next Steps (Future PRs)

  • Phase 2: Automatic retry logic with exponential backoff
  • Phase 3: Custom artifact base URL for self-hosting
  • Phase 4: Documentation and troubleshooting guides

Screenshots

Before

Error while initializing the model [object Object]

After

❌ **webgpu_init_failed**

WebGPU is not available. Please use a browser with WebGPU support (Chrome/Edge 113+). 
See https://caniuse.com/webgpu

💡 **Your browser doesn't support WebGPU**. Please use Chrome 113+, Edge 113+, 
or check compatibility at https://caniuse.com/webgpu

- Add ModelLoadErrorCode enum with 7 error types
- Implement error classification in WebLLMApi
- Enhance error display with actionable guidance
- Add structured console logging for debugging

Addresses mlc-ai#96 (Phase 1) and helps resolve mlc-ai#85
@gemini-code-assist

Summary of Changes

Hello @Sumanth-806307, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the diagnostic capabilities for model loading failures by introducing a structured error handling system. It classifies errors into specific categories, provides users with actionable guidance and retry suggestions, and enhances debugging information available in console logs. This change aims to improve the user experience by replacing generic error messages with clear, contextual feedback, and sets the foundation for future features like automatic retry logic.

Highlights

  • Structured Error Types: Defined a ModelLoadErrorCode enum with 7 categories, created a ModelLoadError interface extending Error with diagnostic fields, and implemented a createModelLoadError helper function in app/client/api.ts.
  • Enhanced Error Classification: Replaced generic error handling in app/client/webllm.ts with structured classification, introducing a classifyError method that categorizes errors into types like webgpu_init_failed, artifact_fetch_failed, cache_invalid, and worker_init_failed.
  • Improved Error Display: Updated the onError handler in app/store/chat.ts to display structured errors with error code badges, contextual retry hints, specific guidance for WebGPU compatibility issues, and enhanced console logging.

Changelog
  • app/client/api.ts
    • Defined ModelLoadErrorCode enum for various error categories.
    • Introduced ModelLoadError interface to extend Error with diagnostic fields.
    • Implemented createModelLoadError helper for consistent error creation.
  • app/client/webllm.ts
    • Imported new error types and utilities.
    • Replaced generic error handling with structured classification in WebLLMApi.chat.
    • Added a private classifyError method to categorize errors based on keywords in messages (WebGPU, network, cache, worker) and return a ModelLoadError.
  • app/store/chat.ts
    • Updated the onError handler to process ModelLoadError objects.
    • Formatted error messages to include error codes with visual badges.
    • Added contextual retry suggestions, including specific guidance for WebGPU compatibility issues.
    • Enhanced console logging to output structured diagnostic data for errors.
Activity
  • No human activity has been recorded for this pull request.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request significantly enhances error diagnostics for model loading failures by introducing structured error types and improved error classification. The changes provide more actionable guidance to users and better debugging information, which is a substantial improvement to the application's robustness and user experience. The implementation of ModelLoadError and createModelLoadError provides a solid foundation for consistent error handling. My main feedback concerns a minor inconsistency in the error code definitions and their usage.

Comment thread app/client/api.ts
WORKER_INIT_FAILED = "worker_init_failed",
WEBGPU_INIT_FAILED = "webgpu_init_failed",
CACHE_INVALID = "cache_invalid",
NETWORK_ERROR = "network_error",

medium

The NETWORK_ERROR enum member is defined in ModelLoadErrorCode but is not currently used in the classifyError method in app/client/webllm.ts. All network-related errors are mapped to ARTIFACT_FETCH_FAILED. Consider either removing NETWORK_ERROR if ARTIFACT_FETCH_FAILED is intended to cover all network issues, or add specific logic in classifyError to utilize NETWORK_ERROR if it represents a distinct category of network failures.
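As one illustration of the second option, the sketch below separates connectivity failures from artifact-specific ones. The function name `classifyFetchFailure`, the optional `httpStatus` parameter, and the matching rules are all hypothetical and are not part of the PR. It assumes that a request which never reached the server should map to NETWORK_ERROR, while an error response or a bad download should map to ARTIFACT_FETCH_FAILED.

```typescript
enum ModelLoadErrorCode {
  ARTIFACT_FETCH_FAILED = "artifact_fetch_failed",
  NETWORK_ERROR = "network_error",
}

// Hypothetical helper: returns undefined when the failure is not network-related.
function classifyFetchFailure(
  message: string,
  httpStatus?: number,
): ModelLoadErrorCode | undefined {
  // The server answered with an error status, so connectivity itself is fine.
  if (httpStatus !== undefined && httpStatus >= 400) {
    return ModelLoadErrorCode.ARTIFACT_FETCH_FAILED;
  }
  // Typical browser wording when a request never completes
  // ("Failed to fetch" in Chrome, "NetworkError ..." in Firefox).
  if (/failed to fetch|networkerror|offline/i.test(message)) {
    return ModelLoadErrorCode.NETWORK_ERROR;
  }
  // Other download-related wording is treated as an artifact problem.
  if (/fetch|download|cdn/i.test(message)) {
    return ModelLoadErrorCode.ARTIFACT_FETCH_FAILED;
  }
  return undefined;
}
```

If the two categories would always lead to the same user guidance and retry behavior, removing NETWORK_ERROR is the simpler choice.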

Development

Successfully merging this pull request may close these issues.

Improve model load error handling with structured diagnostics, retry logic, and self-hosting support