Add hybrid vector + full-text search by richardsolomou · Pull Request #75 · get-convex/rag

richardsolomou · 2026-01-30T07:35:01Z

Summary

- Extend the search action with optional full-text search that combines vector and text search results using Reciprocal Rank Fusion (hybridRank)
- Add textSearch, textWeight, and vectorWeight options to the client SearchOptions type
- Default searchableText to chunk text content so hybrid search works out of the box for new documents
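
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of a weighted merge of two ranked ID lists. It is not the package's hybridRank implementation: the function name `weightedRrf`, the 0-based rank convention, and the constant `k = 60` are assumptions made for illustration.

```typescript
// Hypothetical helper for illustration; not the library's hybridRank.
function weightedRrf(
  rankedLists: string[][],
  weights: number[],
  k = 60, // smoothing constant; 60 is a commonly used value, assumed here
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  rankedLists.forEach((list, listIndex) => {
    const weight = weights[listIndex] ?? 1;
    list.forEach((id, rank) => {
      // rank is 0-based, so the top result of a list adds weight / (k + 1).
      const contribution = weight / (k + rank + 1);
      // An ID present in several lists accumulates into a single entry,
      // which is also what deduplicates the merged output.
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  });
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const vectorIds = ["a", "b", "c"]; // best vector match first
const textIds = ["c", "a", "d"]; // best text match first

// Equal weights: ids come back in the order a, c, b, d.
const equal = weightedRrf([vectorIds, textIds], [1, 1]);

// Tripling the text weight moves the top text hit first: c, a, d, b.
const textHeavy = weightedRrf([vectorIds, textIds], [1, 3]);
```

Because each contribution depends only on a result's position in its list, the merged scores are position-based rather than similarity-based, which is the score-semantics change described under Backwards compatibility.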

Changes

- src/component/search.ts: Add textSearch internal query using the searchableText search index; extend search action with hybrid path (vector + text → RRF merge); scope text search to namespace; apply user filters (OR semantics) to text search matching vector search behavior; guard against empty merge results
- src/component/chunks.ts: Extract shared buildRanges helper; add getRangesOfChunkIds and getChunkIdsByEmbeddingIds internal queries
- src/client/index.ts: Add textSearch/textWeight/vectorWeight to SearchOptions; pass textQuery through in RAG.search(); default searchableText in createChunkArgsBatch
- src/client/hybridRank.ts: Fix typo "Recriprocal" → "Reciprocal"
- src/component/schema.ts: Remove TODO comment
- src/component/search.test.ts: Add 9 tests covering text search query, namespace scoping, filters, chunk ID lookups, hybrid search with textQuery, deduplication, vector-only path, and weight parameters
- example/: Add hybrid search toggle to example app, passing textSearch option through search and askQuestion actions
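
The namespace scoping and OR-semantics filtering mentioned for src/component/search.ts can be pictured with an in-memory sketch. The PR applies these constraints through the search index; the types `Filter` and `Candidate` and the function `matchesScope` below are hypothetical and only illustrate the intended semantics.

```typescript
// Hypothetical shapes for illustration only.
type Filter = { name: string; value: string };
type Candidate = { id: string; namespaceId: string; filters: Filter[] };

// A candidate is in scope when it belongs to the namespace AND,
// if any filters were requested, matches at least one of them (OR semantics).
function matchesScope(
  candidate: Candidate,
  namespaceId: string,
  wanted: Filter[],
): boolean {
  if (candidate.namespaceId !== namespaceId) return false;
  if (wanted.length === 0) return true;
  return wanted.some((w) =>
    candidate.filters.some((f) => f.name === w.name && f.value === w.value),
  );
}

const candidates: Candidate[] = [
  { id: "1", namespaceId: "ns1", filters: [{ name: "category", value: "news" }] },
  { id: "2", namespaceId: "ns1", filters: [{ name: "category", value: "blog" }] },
  { id: "3", namespaceId: "ns2", filters: [{ name: "category", value: "news" }] },
];

// Either filter may match, but the namespace must always match.
const inScope = candidates.filter((c) =>
  matchesScope(c, "ns1", [
    { name: "category", value: "news" },
    { name: "category", value: "docs" },
  ]),
);
```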

Backwards compatibility

- All new args are optional; existing code is unaffected
- Existing chunks without searchableText won't appear in text search but still work for vector search
- New chunks automatically get searchableText populated
- Return types unchanged; score semantics change only when textSearch is enabled (scores become position-based via RRF)
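
The searchableText default can be sketched as follows. The real change lives in createChunkArgsBatch, which is not shown on this page, so the `ChunkInput` type and the `withSearchableText` helper are hypothetical names used only to illustrate the fallback.

```typescript
// Hypothetical chunk shape for illustration; the package's real type may differ.
type ChunkInput = { text: string; searchableText?: string };

// Fall back to the chunk's own text when no searchableText was supplied,
// so that newly created chunks are always visible to text search.
function withSearchableText(chunk: ChunkInput): Required<ChunkInput> {
  return { ...chunk, searchableText: chunk.searchableText ?? chunk.text };
}

const defaulted = withSearchableText({ text: "Convex supports vector search" });
const explicit = withSearchableText({
  text: "Convex supports vector search",
  searchableText: "convex vector search",
});
```

An explicitly supplied searchableText is left untouched, so callers who already provide their own value would see no change under this sketch.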

Test plan

- Existing tests pass (npm test)
- Test vector-only search still works identically (no textSearch param)
- Test hybrid search: add documents, search with textSearch: true
- Test that text search is scoped to namespace
- Test that user filters are applied to text search
- Test deduplication when results appear in both vector and text results
- Test with textWeight/vectorWeight to confirm ranking changes
- Test getChunkIdsByEmbeddingIds and getRangesOfChunkIds helpers

Summary by CodeRabbit

New Features
- Hybrid search modes (Vector, Text, Hybrid) with a UI selector, separate scope selector (general/category/file), and propagation to search and Q&A flows.
- Tunable text/vector weight controls and improved result grouping/context ranges.
Bug Fixes
- Fixed ranking algorithm name in docs.
Tests
- Added comprehensive hybrid search tests for ranking, deduplication, weighting, and namespace filtering.
Chores
- Expanded search options and public surface to support new modes and weights.

Extend the search action with optional text search that combines vector and full-text search results using Reciprocal Rank Fusion (hybridRank).

- Add textSearch internal query using the searchableText search index
- Add getRangesOfChunkIds and getChunkIdsByEmbeddingIds queries
- Extract shared buildRanges helper from getRangesOfChunks
- Add textSearch, textWeight, vectorWeight options to client API
- Default searchableText to chunk content in createChunkArgsBatch

coderabbitai · 2026-01-30T07:35:12Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Adds a selectable search mode (vector|text|hybrid) across UI, client, and server; threads searchType and weighting parameters through RAG search calls; implements text-only, vector-only, and hybrid ranking paths; centralizes range-building; and adds hybrid search tests.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Example app: `example/convex/example.ts`, `example/src/Example.tsx`, `example/src/components/SearchInterface.tsx` | Added `searchType` (optional) to Convex action args and UI; introduced `searchScope` in UI while keeping internal `searchType` mode; pass `searchType` into server actions and provide setters/props to SearchInterface. |
| Client API & types: `src/shared.ts`, `src/client/index.ts` | Added `vSearchType` validator and `SearchType` type; extended `SearchOptions` with `searchType`, `textWeight`, `vectorWeight`; attach `searchableText` to chunk creation and forward textQuery/weights/searchType to server. |
| Server search logic: `src/component/search.ts`, `src/client/hybridRank.ts` | Implemented vector-only, text-only, and hybrid search flows (weights, textQuery, embedding/dimension checks), added internal handlers `textSearch` and `textAndRanges`, and fixed RRF doc comment. |
| Chunk & range management: `src/component/chunks.ts` | Extracted and exported `buildRanges` helper; refactored `getRangesOfChunks` to use it (centralized range-building and deduplication). |
| Tests: `src/component/search.test.ts` | Added extensive hybrid search tests (text, vector, hybrid, namespace/filters, weighting, dedupe) and imported internal endpoints for text paths. |
| Minor: `src/component/schema.ts` | Removed a TODO comment related to text search (no behavioral change). |

Sequence Diagram(s)

sequenceDiagram
    participant UI as Client UI
    participant ClientLib as Client Library
    participant SearchAPI as Search Action
    participant TextPath as Text Search Worker
    participant VectorPath as Vector Search Worker
    participant RRF as Reciprocal Rank Fusion
    participant DB as Database

    UI->>ClientLib: search(query, searchType, textWeight?, vectorWeight?)
    ClientLib->>SearchAPI: convex.action.search(payload with searchType/textQuery/weights)

    alt searchType == "text" or "hybrid"
        SearchAPI->>TextPath: textSearch(textQuery, filters)
        TextPath->>DB: full-text lookup
        DB-->>TextPath: text results (chunk ids + scores)
    end

    alt searchType == "vector" or "hybrid"
        SearchAPI->>VectorPath: vectorSearch(embedding/dimension)
        VectorPath->>DB: similarity lookup
        DB-->>VectorPath: vector results (chunk ids + scores)
    end

    alt searchType == "hybrid"
        VectorPath-->>RRF: vector results
        TextPath-->>RRF: text results
        RRF->>RRF: merge using weights (textWeight/vectorWeight) -> ranked ids
        RRF->>SearchAPI: merged ranked results
    else single-path
        VectorPath-->>SearchAPI: vector results
        TextPath-->>SearchAPI: text results
    end

    SearchAPI->>DB: buildRanges / getRangesOfChunkIds(final ids)
    DB-->>SearchAPI: chunk contents & metadata
    SearchAPI-->>ClientLib: final results
    ClientLib-->>UI: render results

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hop through queries, soft and spry,

Vector, text, or hybrid I try,
I blend the scores, nibble weights with care,
Ranges stitched, results laid bare,
A joyful twitch — search made spry.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add hybrid vector + full-text search' accurately summarizes the main objective of the pull request, which adds hybrid search functionality combining vector and full-text search capabilities. |

Pass the textSearch option through all search and askQuestion actions. Add a toggle in the advanced options panel to enable hybrid search.

- Fix typo: "Recriprocal" → "Reciprocal" in hybridRank
- Apply user filters (OR semantics) to text search, matching vector search behavior
- Document that scores become position-based when hybrid search is enabled
- Use proper Id<"chunks"> types instead of string casts in hybrid merge

… path

- Add explicit namespaceId filter in text search when user filters are present
- Guard against empty merge results in hybrid path
- Hoist vector search above the branch point to avoid duplication

Use Doc<"chunks"> instead of inline type in textSearch toResults. Add 9 tests covering textSearch query, namespace scoping, filters, getChunkIdsByEmbeddingIds, getRangesOfChunkIds, hybrid search with textQuery, deduplication, vector-only path, and weight parameters.

pkg-pr-new · 2026-02-02T18:33:12Z

npm i https://pkg.pr.new/get-convex/rag/@convex-dev/rag@75

commit: 65fa6bc

ianmacartney

Looking really good!
Would love, in addition to the feedback below, for you to try out the pkg-pr-new link and test it with your app to make sure it behaves well for you.
Big ask here is around reducing the number of queries in search, which might affect the factoring of various pieces of code.

My bandwidth is limited, so if I don't get back to you soon, apologies! Hoping to get more folks on the team to help (we're hiring!)

src/client/hybridRank.ts

src/client/index.ts

src/component/search.ts

…eries

- Replace `textSearch?: boolean` with `searchType?: "vector" | "text" | "hybrid"` to support text-only search mode (no embedding needed)
- Combine 3 separate queries in hybrid path into single `textAndRanges` query (vectorSearch remains separate as it requires action context)
- Export shared `vSearchType` validator and `SearchType` type from shared.ts
- Update example app to use shared types instead of inline unions

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@src/client/index.ts`:
- Around line 394-427: The current handling permits array queries to silently
fall back to vector-only even when searchType === "text", which results in no
text query and empty results; update the logic so that if searchType === "text"
and Array.isArray(args.query) you throw a descriptive error (referencing
searchType and args.query) instead of only warning, while preserving the
existing warning-and-fallback behavior for hybrid/hybrid-compatible cases
(searchType === "hybrid" or vector). Ensure the change is applied around the
existing variables/flow: searchType, needsEmbedding, needsTextQuery, args.query,
the embed() call and textQuery assignment so that embedding still occurs for
array queries when allowed and textQuery remains undefined only when legitimate.
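
A minimal sketch of the behavior this comment asks for is shown below. It is not the code in src/client/index.ts: the helper name `planQuery` and the `mustEmbed` field are hypothetical, while `searchType`, `needsEmbedding`, and `textQuery` follow the names used in the comment.

```typescript
type SearchType = "vector" | "text" | "hybrid";

// Hypothetical helper: decides which inputs a search needs.
// A string query can be embedded and/or used as a text query;
// an array query is treated as a precomputed embedding.
function planQuery(searchType: SearchType, query: string | number[]) {
  const isEmbedding = Array.isArray(query);
  if (searchType === "text" && isEmbedding) {
    // Text-only search has nothing to match against without a string query.
    throw new Error(
      'searchType "text" requires a string query, but an embedding array was provided',
    );
  }
  const needsEmbedding = searchType !== "text";
  const needsTextQuery = searchType !== "vector";
  // For hybrid search with an array query, textQuery stays undefined,
  // which amounts to the vector-only fallback the comment wants to preserve.
  const textQuery =
    needsTextQuery && typeof query === "string" ? query : undefined;
  return {
    needsEmbedding,
    textQuery,
    // True when the caller still has to compute an embedding from the string.
    mustEmbed: needsEmbedding && !isEmbedding,
  };
}

const hybridFromString = planQuery("hybrid", "how do indexes work");
const hybridFromArray = planQuery("hybrid", [0.1, 0.2, 0.3]);
const textOnly = planQuery("text", "how do indexes work");
```

Throwing early for the text-plus-array combination surfaces the misuse at the call site instead of returning an empty result set.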

🧹 Nitpick comments (1)

src/component/chunks.ts (1)

314-444: Avoid repeated entry reads in buildRanges.

You already load all entries up front; reusing them avoids extra DB gets per chunk.

♻️ Suggested refactor

-  const entries = (
-    await Promise.all(
-      Array.from(
-        new Set(chunks.filter((c) => c !== null).map((c) => c.entryId)),
-      ).map((id) => ctx.db.get(id)),
-    )
-  )
-    .filter((d) => d !== null)
-    .map(publicEntry);
+  const entryDocs = (
+    await Promise.all(
+      Array.from(
+        new Set(chunks.filter((c) => c !== null).map((c) => c.entryId)),
+      ).map((id) => ctx.db.get(id)),
+    )
+  ).filter((d): d is Doc<"entries"> => d !== null);
+  const entries = entryDocs.map(publicEntry);
+  const entryDocById = new Map(entryDocs.map((d) => [d._id, d]));
@@
-    const entry = await ctx.db.get(entryId);
-    assert(entry, `Entry ${entryId} not found`);
+    const entry = entryDocById.get(entryId);
+    assert(entry, `Entry ${entryId} not found`);

src/client/index.ts

…dRanges

richardsolomou · 2026-02-08T10:23:42Z

@ianmacartney All requested changes resolved. I've been using the pkg-pr-new link for a few days now and it's been super stable. Could you approve a new build of the package so I can test with the latest changes too please?

ianmacartney

really close!
Only changes necessary are formatting

example/src/components/SearchInterface.tsx

example/src/Example.tsx

ianmacartney · 2026-02-09T22:18:24Z

src/component/search.ts

    namespace: v.string(),
-    embedding: v.array(v.number()),
+    embedding: v.optional(v.array(v.number())),
+    dimension: v.optional(v.number()),


Ah yeah, this is an interesting nuance. I wonder if finding the namespace should happen higher up, passing in a namespaceId, since doing a query from the parent isn't any more expensive than doing a query first thing in this action. The encapsulation is nice, but exposing that dimension is part of the namespace identifier is a bit funky. We can keep it like this for now, and in the future we could add a separate API that accepts a namespaceId instead of namespace.

src/component/search.ts

src/client/index.ts

- Rename local SearchType to SearchScope in example to avoid confusion with the RAG package's SearchType
- Always pass searchType verbatim instead of conditionally omitting it
- Throw error when neither embedding nor textQuery is provided
- Add explicit vSearchType validator to server search action args
- Run prettier formatting

richardsolomou · 2026-02-10T12:47:26Z

@ianmacartney should all be sorted out now 🤞

ianmacartney · 2026-02-10T16:08:33Z

FYI I'll be in the backcountry for a few days so may not get to this until next week.
The package build should work for you in the meantime

ianmacartney

Actually I'll just push this out now

ianmacartney · 2026-02-10T16:13:58Z

0.7.1

ianmacartney · 2026-02-10T16:14:02Z

thanks again!

richardsolomou added 5 commits January 30, 2026 09:45

Add hybrid search toggle to example app

dadf760

Merge branch 'main' into hybrid-text-search

29cf71d

Fix hybrid search: scope text search to namespace, deduplicate vector…

9169d13

richardsolomou marked this pull request as ready for review January 31, 2026 21:08

ianmacartney requested changes Feb 4, 2026

coderabbitai bot reviewed Feb 4, 2026

Fix array query handling for text search, reuse entry lookups in buil…

7e4d372

…dRanges

fix linter

37a8cec

ianmacartney reviewed Feb 9, 2026

richardsolomou force-pushed the hybrid-text-search branch from 78740e2 to febe896 Compare February 10, 2026 12:44

Merge branch 'main' into hybrid-text-search

65fa6bc

ianmacartney approved these changes Feb 10, 2026

ianmacartney merged commit 13bbef1 into get-convex:main Feb 10, 2026
2 checks passed
