Feat: AI-Inferred Schema Relationships and Multi-Collection Query Context #22

Merged
ChingEnLin merged 7 commits into dev from feat/usability on Feb 7, 2026
@ChingEnLin
Owner

This pull request adds AI-powered inference and visualization of schema relationships between MongoDB collections. It also extends the natural-language-to-query (NL2Query) feature to pass multi-collection context to the AI model, so that it can generate queries that span related collections.

Key Features & Improvements:

  1. AI-Powered Schema Relationship Inference:

    • Users can now select multiple collections in the Query Generator page.
    • When two or more collections are selected, the system automatically analyzes their schemas (via sample documents) and uses an AI model (Google Gemini) to infer potential relationships (e.g., foreign keys, join conditions).
    • A new SchemaRelationshipGraph component visualizes these inferred relationships, showing source/target collections and fields, along with a confidence score and description. This helps users understand how collections might be connected.
  2. Enhanced Multi-Collection Context for NL2Query:

    • The backend's nl2query endpoint has been updated to accept account_id and a list of collection_context objects (representing multiple selected collections).
    • This allows the AI to consider the schema structure of all relevant collections when generating a query, which helps with requests involving joins, lookups, or references across collections.
    • If no collection context is explicitly provided by the frontend, the backend will now proactively fetch a schema summary for the entire database to provide broader context to the AI.
  3. Multi-Collection Selection in UI:

    • The QueryGeneratorPage now supports multi-selection of collections using Ctrl/Cmd + click.
    • Selected collection details are displayed in a stack of expandable cards, providing a clearer view when managing multiple schemas.
  4. Quick Data Explorer Launch:

    • Added a new "Explorer" button next to each Cosmos DB account and the connected database. This allows users to quickly navigate to the Data Explorer for a selected database or account, streamlining the workflow between querying and data exploration.
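
As a rough illustration of the relationship records described in item 1, the sketch below models one inferred relationship with source/target collections and fields, a confidence score, and a description. The field names, the `describe` and `confident_edges` helpers, and the 0.5 threshold are assumptions made for this example; they are not taken from the PR's actual schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    # Field names are illustrative guesses, not the PR's actual schema.
    source_collection: str
    source_field: str
    target_collection: str
    target_field: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    description: str


def describe(rel: Relationship) -> str:
    """Render one relationship as a single human-readable edge label."""
    return (
        f"{rel.source_collection}.{rel.source_field} -> "
        f"{rel.target_collection}.{rel.target_field} "
        f"({rel.confidence:.0%}): {rel.description}"
    )


def confident_edges(rels, threshold=0.5):
    """Keep relationships at or above the threshold, highest confidence first.

    The threshold is a hypothetical display filter, not something the PR states.
    """
    kept = [r for r in rels if r.confidence >= threshold]
    return sorted(kept, key=lambda r: r.confidence, reverse=True)
```

A graph component could use something like `describe` for hover text and `confident_edges` to hide weak guesses, though how the real SchemaRelationshipGraph treats low-confidence edges is not specified here.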

Technical Details (Backend):

  • backend/models/schemas.py:
    • QueryPrompt schema now includes account_id, and collection_context now accepts a list[CollectionContext].
    • Introduced new schemas: SchemaRelationshipsRequest, Relationship, and SchemaRelationshipsResponse for structured relationship data.
  • backend/routes/query.py:
    • New endpoint /query/infer-relationships added, which accepts SchemaRelationshipsRequest, performs OBO token exchange, fetches schema summaries for specific collections, and then calls the Gemini service to infer relationships.
    • The nl2query endpoint now expects account_id in the QueryPrompt and uses the new get_database_schema_summary to provide comprehensive schema context (all_collections_schema) to the Gemini model.
  • backend/services/gemini_service.py:
    • New function generate_schema_relationships implemented to interact with the Gemini model for relationship inference, using a specific prompt template and structured output (Pydantic SchemaRelationshipsResponse).
    • The generate_query_from_prompt function now accepts all_collections_schema as a parameter to provide a broader database context to the AI.
  • backend/services/mongo_service.py:
    • New function get_database_schema_summary introduced, which connects to Cosmos DB, retrieves sample documents for specified (or all) collections, and generates a textual summary of the schema.
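
To make the idea of a textual schema summary concrete, here is a minimal sketch that derives field names and value types from sample documents, and falls back to all collections when none are specified. It works on plain in-memory dicts and does not connect to Cosmos DB; the function name `summarize_schema`, its parameters, and the output layout are assumptions for illustration, not the actual get_database_schema_summary implementation.

```python
def summarize_schema(samples_by_collection, collections=None):
    """Build a plain-text schema summary from sample documents.

    samples_by_collection maps a collection name to a list of sample
    documents (dicts). If collections is None or empty, every collection
    is summarized, mirroring the whole-database fallback described above.
    """
    names = list(collections) if collections else sorted(samples_by_collection)
    lines = []
    for name in names:
        # Collect every Python type name observed for each top-level field.
        field_types = {}
        for doc in samples_by_collection.get(name, []):
            for field, value in doc.items():
                field_types.setdefault(field, set()).add(type(value).__name__)
        lines.append(f"Collection: {name}")
        for field in sorted(field_types):
            types = "|".join(sorted(field_types[field]))
            lines.append(f"  {field}: {types}")
    return "\n".join(lines)
```

A summary like this is compact enough to embed in a prompt. Only top-level fields are inspected in this sketch; nested documents would need a recursive walk.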

Technical Details (Frontend):

  • frontend/types.ts:
    • New interfaces Relationship and SchemaRelationshipsResponse added to support relationship data.
  • frontend/services/geminiService.ts:
    • New inferSchemaRelationships function added to make API calls to the new backend endpoint and handle authentication.
    • generateMongoQuery function updated to include accountId and selectedCollections parameters, aligning with the new backend schema. MSAL token acquisition logic is now part of these service calls.
  • frontend/pages/QueryGeneratorPage.tsx:
    • Extensive refactoring to manage state for multiple selected collections (selectedCollections, collectionDetailsMap).
    • Integrated the logic to trigger inferSchemaRelationships when multiple collections are selected, along with debouncing.
    • Implemented handleLaunchExplorer and handleQuickExploreAccount for improved navigation to Data Explorer.
    • The UI for displaying collection details has been updated to render expandable CollectionActionPanel cards when multiple collections are selected.
    • Removed unused isExplorerNavEnabled prop from HeaderUI.
  • frontend/components/SchemaRelationshipGraph.tsx:
    • A completely new React component designed to visually represent the inferred schema relationships using SVG, with interactive hover states.

Together, these changes let QueryPal generate queries that span multiple collections and give users a view of how their collections relate to one another.

@ChingEnLin ChingEnLin merged commit 0cd1f0a into dev Feb 7, 2026
3 checks passed
@ChingEnLin ChingEnLin deleted the feat/usability branch February 7, 2026 09:20
@github-actions
github-actions bot commented Feb 7, 2026

🎉 This PR is included in version 2.7.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
