@kovtcharov
Collaborator

@kovtcharov kovtcharov commented Jan 20, 2026

Summary

  • Adds lemonade-sdk as an optional pip dependency (pip install "amd-gaia[lemonade]")
  • Removes the separate install-lemonade GitHub Action in favor of pip installation
  • Version is managed via LEMONADE_VERSION in src/gaia/version.py
  • This should resolve the failures currently seen on sjlab-stx-2

Changes

  • setup.py: Added lemonade extra with dynamic version from version.py
  • Workflows: Updated 5 CI workflows to use [lemonade] extra instead of custom action
  • Docs: Updated setup, quickstart, and dev guides with new installation option
  • Deleted: .github/actions/install-lemonade/ directory
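
A minimal sketch of how a setup.py could derive the extra from version.py without importing the package. This is an illustration, not the code from the PR: the helper names `read_lemonade_version` and `build_extras` are hypothetical, and the sample text in the demo is a placeholder. Only `LEMONADE_VERSION`, `src/gaia/version.py`, and the `lemonade-sdk>=` requirement come from the description above.

```python
import re


def read_lemonade_version(version_source: str) -> str:
    """Extract LEMONADE_VERSION from the text of version.py without importing it."""
    match = re.search(
        r'^LEMONADE_VERSION\s*=\s*["\']([^"\']+)["\']',
        version_source,
        re.MULTILINE,
    )
    if match is None:
        raise RuntimeError("LEMONADE_VERSION not found in version.py")
    return match.group(1)


def build_extras(lemonade_version: str) -> dict:
    """Build the extras_require mapping that would be passed to setuptools.setup()."""
    return {"lemonade": [f"lemonade-sdk>={lemonade_version}"]}


if __name__ == "__main__":
    # In a real setup.py the text would be read from src/gaia/version.py.
    sample = 'LEMONADE_VERSION = "9.1.0"\n'  # placeholder file content
    print(build_extras(read_lemonade_version(sample)))
```

Parsing the file as text avoids importing the package (and its dependencies) at build time, which is the usual reason for this pattern.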

Test plan

  • Verify pip install "amd-gaia[lemonade]" installs lemonade-sdk
  • Verify lemonade-server command is available after installation
  • CI workflows pass with pip-based lemonade installation
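
The first two checks in the test plan could be scripted roughly as follows. This is a sketch under assumptions: `check_lemonade_install` is a hypothetical helper, and it only confirms that the `lemonade-sdk` distribution is installed and that a `lemonade-server` executable is on PATH, not that the server actually starts.

```python
import shutil
from importlib import metadata


def check_lemonade_install(which=shutil.which, get_version=metadata.version):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    try:
        get_version("lemonade-sdk")
    except metadata.PackageNotFoundError:
        problems.append("lemonade-sdk is not installed")
    if which("lemonade-server") is None:
        problems.append("lemonade-server is not on PATH")
    return problems


if __name__ == "__main__":
    issues = check_lemonade_install()
    print("OK" if not issues else "\n".join(issues))
```

The `which` and `get_version` parameters exist only so the check can be exercised without a real installation.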

Related Issues

@kovtcharov kovtcharov self-assigned this Jan 20, 2026
@github-actions github-actions bot added the documentation (Documentation changes), dependencies (Dependency updates), and devops (DevOps/infrastructure changes) labels Jan 20, 2026
@kovtcharov kovtcharov enabled auto-merge January 21, 2026 01:09
@kovtcharov kovtcharov added this pull request to the merge queue Jan 21, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 21, 2026
@github-actions github-actions bot added the security (Security-sensitive changes) label Jan 21, 2026
Root cause: DEFAULT_MODEL_NAME was Qwen2.5-0.5B-Instruct-CPU, an ONNX model
that requires the onnxruntime-genai package. We don't want to add more
dependencies, so switch to Qwen3-0.6B-GGUF, which uses the llamacpp backend
(built into lemonade-sdk, no extra deps needed).

This fixes the 422 errors that occurred when tests requested DEFAULT_MODEL_NAME
and the server tried to auto-load the ONNX model without onnxruntime-genai installed.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the llm (LLM backend changes) and performance (Performance-critical changes) labels Jan 21, 2026
- Update all workflows to use Qwen3-4B-Instruct-2507-GGUF as default model
- Fix summarizer tests by running in same step as Lemonade server
- Switch test_gaia_cli_windows.yml to PowerShell for better process handling
- Improve start-lemonade.bat error handling and server health checks
- Simplify workflow structure and debug output
@github-actions github-actions bot added the tests (Test changes) label Jan 21, 2026
@kovtcharov
Collaborator Author

@claude review this PR.

@github-actions

github-actions bot commented Jan 21, 2026

Claude finished @kovtcharov's task


PR Review: Add lemonade-sdk as pip dependency

  • Read pr-diff.txt to understand changes
  • Review setup.py and version management
  • Review CI workflow changes
  • Review documentation updates
  • Review new scripts and test files
  • Provide comprehensive review feedback

Summary

This PR makes significant improvements to how Lemonade Server is installed and managed across GAIA. The shift from a custom GitHub Action to a pip-based installation is a solid architectural decision that improves maintainability.

Review Findings

🟢 Strengths

1. Clean Version Management (setup.py, src/gaia/version.py)

  • Single source of truth for LEMONADE_VERSION = "9.1.0" in version.py:12
  • Dynamic version extraction in setup.py:15-20 is well-implemented
  • The [lemonade] extra with lemonade-sdk>={lemonade_version} ensures version consistency

2. Well-Documented Scripts (scripts/start-lemonade.*)

  • PowerShell script has proper help documentation with synopsis and examples (start-lemonade.ps1:4-35)
  • Both scripts handle edge cases: port conflicts, orphaned processes, server health checks
  • Good error handling with informative log output for debugging CI failures

3. Simplified CI Workflows

  • Unified approach using setup-venv action with install-package: '.[lemonade]'
  • Removed 65 lines of custom action code (.github/actions/install-lemonade/action.yml)
  • New test_lemonade_server.yml smoke test is minimal and effective

4. Comprehensive Documentation (docs/setup.mdx)

  • Clear installation instructions for both PyPI and clone workflows
  • Good separation between Windows and Linux tabs
  • Helpful tips about skipping lemonade extra if already installed

🟡 Suggestions (Non-blocking)

1. Test File Placement (test_lemonade_local.py:1-336)

  • This file is in the repo root, not in tests/. Consider moving to tests/ or tests/integration/ for consistency with project structure.
  • The interactive prompt at line 273 (input("Test API pull...")) won't work in CI - consider a CLI flag instead.
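
One way to replace the interactive prompt with a flag is sketched below. The flag name `--test-api-pull` and the `parse_args` helper are hypothetical, chosen only to illustrate the suggestion; the actual structure of test_lemonade_local.py is not shown in this review.

```python
import argparse


def parse_args(argv=None):
    """Parse options; the API pull test is opt-in so CI never blocks on input()."""
    parser = argparse.ArgumentParser(description="Local Lemonade checks")
    parser.add_argument(
        "--test-api-pull",
        action="store_true",
        help="also run the API pull test (skipped by default)",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    if args.test_api_pull:
        print("Running API pull test")
    else:
        print("Skipping API pull test (pass --test-api-pull to enable)")
```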

2. Model Consistency

  • CI uses Qwen3-4B-Instruct-2507-GGUF (e.g., test_gaia_cli_windows.yml:75)
  • lemonade_client.py:82 defaults to Qwen3-0.6B-GGUF
  • test_lemonade_client.py:26 defaults to Llama-3.2-3B-Instruct-Hybrid
  • This isn't a bug (env vars override), but documenting the hierarchy would help.
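
The override order could be documented with a small resolver like the sketch below. The precedence shown (explicit argument, then environment variable, then built-in default) is an assumption about the intended hierarchy, and the variable name `GAIA_TEST_MODEL` is hypothetical; only the model names come from the files cited above.

```python
import os

# Default reported for lemonade_client.py in this review.
DEFAULT_MODEL_NAME = "Qwen3-0.6B-GGUF"


def resolve_model(explicit=None, env=None, env_var="GAIA_TEST_MODEL"):
    """Pick a model: explicit argument, then environment variable, then default."""
    env = os.environ if env is None else env
    return explicit or env.get(env_var) or DEFAULT_MODEL_NAME


if __name__ == "__main__":
    # A CI workflow would set the (hypothetical) variable to override the default.
    ci_env = {"GAIA_TEST_MODEL": "Qwen3-4B-Instruct-2507-GGUF"}
    print(resolve_model(env=ci_env))
    print(resolve_model(env={}))
```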

3. Script Hardcoded Paths

  • start-lemonade.sh:60 and start-lemonade.ps1:80 hardcode .venv/bin/lemonade-server-dev and .venv\Scripts\lemonade-server-dev.exe
  • Consider adding a parameter for custom venv paths for non-standard setups.
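
The parameterization idea, expressed in Python rather than in the shell and PowerShell scripts themselves, might look like this sketch. `server_executable` is a hypothetical helper; the default directory and executable names mirror the hardcoded paths quoted above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def server_executable(venv_dir=".venv", windows=False):
    """Return the lemonade-server-dev path inside the given virtual environment."""
    if windows:
        return PureWindowsPath(venv_dir) / "Scripts" / "lemonade-server-dev.exe"
    return PurePosixPath(venv_dir) / "bin" / "lemonade-server-dev"


if __name__ == "__main__":
    print(server_executable())                       # standard layout
    print(server_executable("envs/gaia"))            # custom venv location
    print(server_executable(".venv", windows=True))  # Windows layout
```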

🔴 Potential Issues

1. Root Cause of CI Failures
As noted in the previous analysis, the HTTP 422 errors on Windows runners are likely caused by:

  • Missing ONNX Runtime dependencies after the transition to pip-based installation
  • The deleted install-lemonade action may have included additional setup steps

Recommendation: Verify that lemonade-sdk package includes all necessary runtime dependencies, or add them explicitly:

"lemonade": [
    f"lemonade-sdk>={lemonade_version}",
    # Add if needed:
    # "onnxruntime-directml",  # Windows DirectML
    # "vulkan-sdk",            # Vulkan backend
],

2. Missing Cleanup in Start Scripts
The PowerShell script kills orphaned processes at startup (start-lemonade.ps1:70-72), but neither script has a cleanup/stop counterpart. For CI reliability, consider adding stop-lemonade.ps1/sh scripts.
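
A stop counterpart could follow the usual terminate-then-kill pattern, sketched here in Python. This is only an illustration of the idea: `stop_process` is a hypothetical helper, and it assumes the caller still holds the `Popen` handle for the server, which would not cover orphaned processes started elsewhere.

```python
import subprocess
import sys


def stop_process(proc, timeout=10.0):
    """Terminate politely, escalate to kill on timeout, and return the exit code."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for the server: a child process that would otherwise run for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", stop_process(child))
```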

Overall Assessment

Approved with suggestions. The architectural changes are solid, documentation is comprehensive, and the code quality is good. The CI failures should be investigated separately as they appear to be related to runtime dependencies on the self-hosted runners rather than issues with this PR's code.


All 3 startup scripts now retry the health check up to 10 times
(2-second intervals) after model pull to handle server restarts.

Fixes race condition where server appears dead during restart.
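
The retry behavior described in this commit (up to 10 attempts, 2 seconds apart) can be sketched in Python as below. The startup scripts themselves are shell, batch, and PowerShell, so this is an equivalent illustration rather than their code; `wait_until_healthy` and the `probe` callable are hypothetical, and the probe is expected to wrap whatever health request the scripts make.

```python
import time


def wait_until_healthy(probe, attempts=10, interval=2.0, sleep=time.sleep):
    """Call probe() until it returns True, retrying up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except OSError:
            # Connection errors are expected while the server restarts.
            pass
        if attempt < attempts:
            sleep(interval)
    return False


if __name__ == "__main__":
    # Simulated server that becomes healthy on the third probe.
    answers = iter([False, False, True])
    print(wait_until_healthy(lambda: next(answers), interval=0.1))
```

Treating a refused connection as "not ready yet" rather than a failure is what closes the window where the server appears dead during a restart.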
Add encoding="utf-8" and errors="replace" to all subprocess.run()
calls to prevent UnicodeDecodeError on Windows with cp1252 codec.
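
A minimal sketch of the pattern this commit describes. `run_text` is a hypothetical wrapper; the `encoding` and `errors` arguments are standard `subprocess.run()` parameters. Without them, text-mode output is decoded with the locale encoding (cp1252 on many Windows setups), and bytes that codec cannot decode raise UnicodeDecodeError.

```python
import subprocess
import sys


def run_text(cmd):
    """Run a command and decode its output as UTF-8, replacing undecodable bytes."""
    return subprocess.run(
        cmd,
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=False,
    )


if __name__ == "__main__":
    # The child writes a byte (0xFF) that is not valid UTF-8; it is replaced, not raised.
    child_code = "import sys; sys.stdout.buffer.write(b'ok \\xff')"
    result = run_text([sys.executable, "-c", child_code])
    print(repr(result.stdout))
```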
@kovtcharov
Collaborator Author

@claude review the documentation updates for any discrepancies.

@github-actions

github-actions bot commented Jan 21, 2026

Claude finished @kovtcharov's task


Documentation Review: lemonade-sdk pip dependency

  • Read pr-diff.txt to understand documentation changes
  • Review docs/setup.mdx for accuracy
  • Review docs/quickstart.mdx for consistency
  • Review docs/reference/dev.mdx changes
  • Cross-check documentation with actual code
  • Identify any discrepancies

Summary

The documentation updates are well-structured and largely consistent with the code changes. The review found one discrepancy and two minor suggestions.

🔴 Discrepancy Found

docs/reference/dev.mdx:106-112 - Missing lemonade extra in install command

The dev guide shows:

uv pip install -e ".[dev]"
# Or with all extras:
uv pip install -e ".[dev,talk,rag]"

But it should include lemonade to match the new pip-based installation pattern:

uv pip install -e ".[dev,lemonade]"
# Or with all extras:
uv pip install -e ".[dev,lemonade,talk,rag]"

The docs/setup.mdx correctly shows .[dev,lemonade] for the clone workflows (lines 248, 322), but dev.mdx was not updated.

🟢 Verified Consistency

| Documentation | Matching code |
| --- | --- |
| pip install "amd-gaia[lemonade]" (setup.mdx:92, 160) | "lemonade": [f"lemonade-sdk>={lemonade_version}"] (setup.py:99-101) |
| LEMONADE_VERSION = "9.1.0" reference (setup.mdx:95) | LEMONADE_VERSION = "9.1.0" (version.py:12) |
| .[dev,lemonade,talk,rag] extras (setup.mdx:252, 326) | All extras defined in setup.py |
| Quickstart references Setup guide (quickstart.mdx:7) | Setup guide has full instructions |

🟡 Suggestions (Non-blocking)

  1. docs/reference/dev.mdx:49 - Mentions both pip and download options for Lemonade:

    "Install via pip (pip install "amd-gaia[lemonade]") or download from lemonade-server.ai"

    This is correct but could clarify that the pip approach is now the recommended method.

  2. docs/playbooks/index.mdx:314-318 - Prerequisites section mentions Lemonade Server but doesn't specify the [lemonade] extra is the recommended installation method.


Set CtxSize/CTX_SIZE to 32768 in all lemonade startup scripts
and CI workflows to meet Agent base class min_context_size
requirement. The health endpoint may not always return
context_size, causing it to default to 0 in the client.
- Add inline health check in lemonade server smoke test workflow
- Add standalone test_lemonade_health.py for context_size validation
- Will help identify if lemonade-sdk pip package differs from installer
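
The failure mode described above, where a missing `context_size` falls back to 0 and then fails the minimum check, can be illustrated with the sketch below. The helper names are hypothetical and the health payload is modeled as a plain dict; only the `context_size` field and the 32768 value come from the commit messages.

```python
MIN_CONTEXT_SIZE = 32768  # value set for CtxSize/CTX_SIZE in the startup scripts


def context_size_from_health(health: dict) -> int:
    """Read context_size from a health payload, defaulting to 0 when absent."""
    value = health.get("context_size")
    return int(value) if value is not None else 0


def meets_minimum(health: dict, minimum: int = MIN_CONTEXT_SIZE) -> bool:
    """Return True when the reported context size satisfies the minimum."""
    return context_size_from_health(health) >= minimum


if __name__ == "__main__":
    print(meets_minimum({"context_size": 32768}))  # field reported by the server
    print(meets_minimum({}))                       # field missing, treated as 0
```

A validation test built this way distinguishes "server configured too small" from "server did not report the field", which is the ambiguity the new health test is meant to expose.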
@kovtcharov kovtcharov removed this from the v0.15.2 milestone Jan 24, 2026
@kovtcharov kovtcharov marked this pull request as draft January 24, 2026 02:29

Labels

dependencies (Dependency updates), devops (DevOps/infrastructure changes), documentation (Documentation changes), llm (LLM backend changes), performance (Performance-critical changes), security (Security-sensitive changes), tests (Test changes)

Development

Successfully merging this pull request may close these issues.

  • Fix runner failing due to missing dependencies
  • Lemonade 9.1.4 cautionary upgrade

4 participants