rioliu-rh (Contributor) commented Dec 23, 2025

Summary

Fixes the "Task group is not initialized" error that occurs on subsequent MCP client calls when using HTTP transport. The first call succeeds but subsequent calls fail with 500 Internal Server Error.

Related: OCPERT-275

Root Cause

The MCP data collector called asyncio.run() for every request, which creates a new event loop each time and closes it when the call returns. FastMCP's streamable-http transport requires a persistent event loop and session management, so creating a new loop per call breaks the internal task group state.
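
The difference between the two calling patterns can be shown with the standard library alone. The following is a minimal sketch, not the collector's actual code: the helper names (`running_loop`, `loops_via_asyncio_run`, `loops_via_persistent_loop`) are hypothetical, and no MCP session is involved. It only records which event loop each call ran on.

```python
import asyncio


async def running_loop() -> asyncio.AbstractEventLoop:
    """Return the event loop that is executing this coroutine."""
    return asyncio.get_running_loop()


def loops_via_asyncio_run(calls: int) -> list:
    """Old pattern: asyncio.run() builds a fresh loop per call and closes it."""
    return [asyncio.run(running_loop()) for _ in range(calls)]


def loops_via_persistent_loop(calls: int) -> list:
    """New pattern: one loop is created once and reused for every call."""
    loop = asyncio.new_event_loop()
    try:
        return [loop.run_until_complete(running_loop()) for _ in range(calls)]
    finally:
        loop.close()


if __name__ == "__main__":
    per_call = loops_via_asyncio_run(3)
    shared = loops_via_persistent_loop(3)
    print("distinct loops with asyncio.run():", len({id(l) for l in per_call}))
    print("distinct loops with a persistent loop:", len({id(l) for l in shared}))
```

With asyncio.run(), any state bound to the loop of the first call (such as a task group) is left attached to a closed loop by the time the second call starts.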

This is a known issue in the MCP ecosystem.

Changes

MCP Server:

  • Upgrade fastmcp from 0.1.0 to >=2.14.1 (includes task group fixes)
  • Upgrade mcp from 0.1.0 to >=1.25.0 (includes session manager stability improvements)
  • Add cachetools <6.0 constraint for google-auth/streamlit compatibility
  • Update transport from "http" to "streamable-http" in server.py and run_server.sh

MCP Data Collector Client:

  • Implement persistent event loop management (reused across all calls)
  • Add thread pool execution support for Streamlit environments
  • Update to non-deprecated streamable_http_client API
  • Configure httpx timeout properly (connect/read/write/pool)
  • Add exponential backoff retry logic (5 attempts, 2s initial delay)
  • Rename sse_read_timeout to read_timeout for clarity
  • Remove unused threading import
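
The retry behaviour listed above (5 attempts, 2s initial delay) might look roughly like the sketch below. This is an illustration under assumptions rather than the code in tools/mcp_data_collector.py: the function name `retry_with_backoff`, the doubling factor, the injectable `sleep` parameter, and the choice to retry on any Exception are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    make_call: Callable[[], Awaitable[T]],
    attempts: int = 5,            # from the PR description
    initial_delay: float = 2.0,   # from the PR description, in seconds
    factor: float = 2.0,          # assumption: delay doubles after each failure
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Await make_call() until it succeeds or the attempts are used up."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except Exception:
            # Assumption: every exception is treated as retryable.
            if attempt == attempts:
                raise  # out of attempts, surface the last error
            await sleep(delay)
            delay *= factor
    raise ValueError("attempts must be at least 1")
```

Passing a factory (`make_call`) instead of a coroutine object matters here, because a coroutine can only be awaited once and each retry needs a fresh one.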

Testing

Verified with a test script making 3 consecutive calls:

  • First call: ✓ Success (200 OK)
  • Second call: ✓ Success (200 OK) - previously failed
  • Third call: ✓ Success (200 OK) - previously failed

Deployment Notes

After merge, the remote server needs:

  1. Package upgrade: pip3 install -r mcp_server/requirements.txt --upgrade
  2. Server restart to pick up new transport configuration
  3. Health check verification: curl http://localhost:8000/health

Compatibility

  • Works in standalone Python scripts
  • Works in Streamlit dashboard environment
  • Backward compatible with existing sync API
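
One way a synchronous API can keep working both in plain scripts and in hosts that may already be running an event loop (as can happen under Streamlit) is to drive a single long-lived loop from a dedicated worker thread. The sketch below shows that idea only. `PersistentLoopRunner` is a hypothetical name, and always delegating to a one-worker pool is an assumption, not necessarily how the collector decides when to use its thread pool.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


class PersistentLoopRunner:
    """Run coroutines synchronously on one long-lived event loop."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        # A single worker means the loop is only ever driven from one thread.
        self._executor = ThreadPoolExecutor(max_workers=1)

    def run(self, coro):
        """Block until coro finishes on the persistent loop; return its result."""
        future = self._executor.submit(self._loop.run_until_complete, coro)
        return future.result()

    def close(self) -> None:
        """Stop the worker thread, then close the loop."""
        self._executor.shutdown(wait=True)
        self._loop.close()


async def fetch_status() -> str:
    # Stand-in for a real MCP tool call (hypothetical).
    await asyncio.sleep(0)
    return "ok"


if __name__ == "__main__":
    runner = PersistentLoopRunner()
    try:
        # Three consecutive calls reuse the same loop.
        print([runner.run(fetch_status()) for _ in range(3)])
    finally:
        runner.close()
```

Because the caller's thread never runs the persistent loop itself, `run()` can also be called from code that is already inside another event loop, at the cost of blocking that caller until the result arrives.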

- Fix 'Task group is not initialized' error on subsequent MCP calls
- Implement persistent event loop management in MCP data collector
- Update to non-deprecated streamable_http_client API
- Add support for Streamlit environment with thread pool execution
- Upgrade MCP SDK packages (fastmcp>=2.14.1, mcp>=1.25.0)
- Configure httpx timeout for long-running operations
- Rename sse_read_timeout to read_timeout for clarity

Changes:
- mcp_server/requirements.txt: Upgrade fastmcp and mcp packages
- mcp_server/server.py: Use streamable-http transport
- mcp_server/run_server.sh: Update transport configuration
- tools/mcp_data_collector.py: Implement persistent event loop and retry logic
- tools/release_progress_dashboard/requirements.txt: Update dependencies

Root cause: Using asyncio.run() created a new event loop per call, breaking
FastMCP session management. Fixed by maintaining a persistent event loop and
using loop.run_until_complete() so sessions can be reused.
@openshift-ci openshift-ci bot requested review from barboras7 and jhuttana December 23, 2025 05:13

openshift-ci bot commented Dec 23, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ming1013 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot commented Dec 23, 2025

@rioliu-rh: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@rioliu-rh rioliu-rh merged commit 45c0868 into openshift:master Dec 23, 2025
2 of 3 checks passed
rioliu-rh added a commit to rioliu-rh/release-tests that referenced this pull request Dec 23, 2025
- Update architecture diagram from HTTP/SSE to HTTP
- Update server transport from sse to streamable-http
- Update Claude Code client config from transport:sse to type:http
- Update Python client examples to use streamable_http_client
- Update all /sse endpoints to /mcp
- Update server configuration examples with new host/port API
- Update curl testing commands

This completes the documentation updates for the HTTP transport migration
started in PR openshift#852.
rioliu-rh added a commit that referenced this pull request Dec 23, 2025
* Update MCP deployment guide for HTTP transport migration

- Update architecture diagram from HTTP/SSE to HTTP
- Update server transport from sse to streamable-http
- Update Claude Code client config from transport:sse to type:http
- Update Python client examples to use streamable_http_client
- Update all /sse endpoints to /mcp
- Update server configuration examples with new host/port API
- Update curl testing commands

This completes the documentation updates for the HTTP transport migration
started in PR #852.

* Update dashboard deployment guide for HTTP transport migration

- Update all MCP server URLs from /sse to /mcp endpoint
- Update environment variable examples
- Update systemd service configuration examples
- Update troubleshooting curl commands

This ensures dashboard deployment documentation is consistent with the
MCP server HTTP transport migration.