chore(providers): bump Gemini defaults to current GA models (#370)
Conversation
Bundles two upstream PRs into one chore — both are blocking real users today and both are simple default-string bumps with no API-contract change.

**LLM default** (was PR #368, @yut304)

- `gemini-2.0-flash` is deprecated in Google's Gemini API and returns 429 rate-limit errors under load. Replace the default with `gemini-flash-latest`.
- Users on a pinned `GEMINI_MODEL` in `~/.agentmemory/.env` are unaffected.

**Embedding default** (was PR #246, @AmmarSaleh50)

- `text-embedding-004` is deprecated (shutdown Jan 14 2026). Replace with `gemini-embedding-001` (GA): 100+ languages, MRL dims (768 / 1536 / 3072), 2048-token input.
- URL path changes from `:batchEmbedContent` to `:batchEmbedContents` (plural — the new model's batch endpoint).
- Each request now sends `outputDimensionality: 768` so the returned vectors match the existing index dim guard from #248 — no reindex needed.
- L2-normalize each returned vector before pushing to the result array. `gemini-embedding-001` does not normalize by default, unlike `text-embedding-004`. Without this, the cosine-similarity math elsewhere in the search pipeline (which assumes unit-length vectors) collapses.

**Verified**

- `npm test` clean: 903 / 903.
- `npm run build` clean.

Closes #368, closes #246.
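Taken together, the embedding changes amount to: build a `:batchEmbedContents` payload with an explicit output dimensionality, then unit-normalize whatever comes back. A minimal TypeScript sketch of that shape; the `MODEL` and `DIMS` constants and helper names here are illustrative assumptions, not the PR's exact code:

```typescript
// Illustrative sketch of the request/response handling described above.
// MODEL, DIMS, and the helper names are assumptions, not the shipped code.
const MODEL = "models/gemini-embedding-001";
const DIMS = 768;

// Payload for POST .../models/gemini-embedding-001:batchEmbedContents
function buildBatchRequest(chunk: string[]) {
  return {
    requests: chunk.map((t) => ({
      model: MODEL,
      content: { parts: [{ text: t }] },
      // Match the existing 768-dim index guard; the model defaults to 3072.
      outputDimensionality: DIMS,
    })),
  };
}

// gemini-embedding-001 does not return unit-length vectors by default,
// so each embedding is L2-normalized before entering the index.
function l2Normalize(vec: number[]): number[] {
  const norm = Math.sqrt(vec.reduce((s, x) => s + x * x, 0));
  return norm === 0 ? vec : vec.map((x) => x / norm);
}
```

The normalization step is what keeps the downstream cosine-similarity math valid, since that pipeline assumes unit-length vectors.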
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/providers/embedding/gemini.ts (1)
**58-65:** ⚡ Quick win: Consider logging or throwing on zero-norm vectors.

The function silently returns the unnormalized vector when `norm === 0` (line 62). A zero-norm embedding from the API would indicate a problem upstream, but this implementation swallows it. Consider logging a warning or throwing an error to surface the issue rather than injecting an unnormalized (zero) vector into results that are expected to be unit-length for cosine similarity.

🔍 Proposed enhancement

```diff
 function l2Normalize(vec: Float32Array): Float32Array {
   let sum = 0;
   for (let i = 0; i < vec.length; i++) sum += vec[i]! * vec[i]!;
   const norm = Math.sqrt(sum);
-  if (norm === 0) return vec;
+  if (norm === 0) {
+    throw new Error("Cannot normalize zero-length embedding vector");
+  }
   for (let i = 0; i < vec.length; i++) vec[i] = vec[i]! / norm;
   return vec;
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/providers/embedding/gemini.ts` around lines 58 - 65, The l2Normalize function currently returns the original array when norm === 0, silently allowing zero-length embeddings; update l2Normalize to surface this upstream error by either throwing a descriptive Error (e.g., "zero-norm embedding returned from upstream") or logging a warning with context before failing, and ensure callers can handle the exception; refer to the function name l2Normalize and modify its norm === 0 branch to throw or log (and return a safe fallback only if explicitly wanted), including details such as the embedding length or source to aid debugging.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/config.ts`:
- Line 79: The default model string used for the config key "model" (fallback
when env["GEMINI_MODEL"] is unset) should not use the auto-updating alias
"gemini-flash-latest"; change the fallback to a stable pinned identifier such as
"gemini-2.5-flash" so production behavior is deterministic, i.e., update the
expression that sets model (the `model: env["GEMINI_MODEL"] ||
"gemini-flash-latest",` assignment) to use a stable model name as the default.
In `@src/providers/embedding/gemini.ts`:
- Around line 32-36: The requests payload is using camelCase key
outputDimensionality which Gemini expects as snake_case output_dimensionality;
update the chunk.map(...) object so the property is output_dimensionality:
this.dimensions (instead of outputDimensionality) wherever you build requests
for MODEL in the embedding/gemini provider, ensuring any other occurrences of
outputDimensionality are renamed to output_dimensionality so the API receives
the intended 768-dimension vector setting.
---
Nitpick comments:
In `@src/providers/embedding/gemini.ts`:
- Around line 58-65: The l2Normalize function currently returns the original
array when norm === 0, silently allowing zero-length embeddings; update
l2Normalize to surface this upstream error by either throwing a descriptive
Error (e.g., "zero-norm embedding returned from upstream") or logging a warning
with context before failing, and ensure callers can handle the exception; refer
to the function name l2Normalize and modify its norm === 0 branch to throw or
log (and return a safe fallback only if explicitly wanted), including details
such as the embedding length or source to aid debugging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 69176075-be9c-4340-a799-ed5ea3ec84c2
📒 Files selected for processing (3)
README.md, src/config.ts, src/providers/embedding/gemini.ts
```diff
 requests: chunk.map((t) => ({
-  model: "models/text-embedding-004",
+  model: MODEL,
   content: { parts: [{ text: t }] },
+  outputDimensionality: this.dimensions,
 })),
```
🧩 Analysis chain
🌐 Web query:
Gemini gemini-embedding-001 outputDimensionality parameter reduce dimensions
💡 Result:
For the Gemini embedding model gemini-embedding-001, the parameter you use to reduce the embedding dimensionality is `output_dimensionality` / `outputDimensionality`.

1) What it does
- By default, gemini-embedding-001 returns 3072-dimensional embeddings. [1]
- You can pass output_dimensionality (or outputDimensionality) to control the size of the output embedding vector; smaller values truncate the embedding to the requested size (to save storage and improve downstream efficiency). [1][2]

2) Parameter name(s) by API surface
- Gemini API (ai.google.dev docs): use output_dimensionality in the embed content request. [3]
- Vertex AI Generative AI API (cloud.google.com docs): use output_dimensionality in EmbedContentConfig (shown as output_dimensionality=... in examples). [1]
- The Vertex AI model reference also describes outputDimensionality as an optional int used to specify the output embedding size; it truncates if set. [2]

3) Practical example (what to change)
- Request without dimensionality: defaults to 3072 dims. [1]
- Request with dimensionality: set output_dimensionality/outputDimensionality to e.g. 768, 1536, or 3072 (Google recommends 768/1536/3072 for quality/storage tradeoffs). [3][4]

4) Common gotcha
- Ensure your client/library actually sends the parameter; otherwise you'll get the default 3072-dimension vectors, which can break downstream vector databases expecting the reduced dimension. [5]

So, to "reduce dimensions" for gemini-embedding-001, set output_dimensionality (Gemini API) or outputDimensionality/output_dimensionality (Vertex AI) in the embedContent request config; e.g., 768 instead of the default 3072. [1][2][3]
Citations:
- 1: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#:~:text=For%20gemini%2Dembedding%2D001%20%2C,of%20the%20output%20embedding%20vector.
- 2: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api
- 3: https://ai.google.dev/gemini-api/docs/embeddings
- 4: https://developers.googleblog.com/gemini-embedding-available-gemini-api/
- 5: Google Gemini embedding model outputDimensionality not working vercel/ai#8033
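The MRL truncation behavior described in the query result can be mimicked client-side. This sketch is an illustration of the technique (keep the leading dims, then renormalize), not code from the PR or from any Google client library:

```typescript
// Illustrative: client-side equivalent of requesting a smaller
// outputDimensionality from an MRL-trained embedding model.
// Keep the leading dims, then L2-renormalize so downstream cosine
// math still sees unit-length vectors.
function truncateEmbedding(vec: number[], dims: number): number[] {
  const head = vec.slice(0, dims);
  const norm = Math.sqrt(head.reduce((s, x) => s + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

In practice you would let the API do this by sending the dimensionality parameter, as the query result recommends; the sketch only shows why truncation plus renormalization preserves usable vectors.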
🏁 Script executed:
```shell
cat -n src/providers/embedding/gemini.ts | head -50
```

Repository: rohitg00/agentmemory
Length of output: 2110
Change outputDimensionality to output_dimensionality (snake_case).
The code targets the Gemini API (ai.google.dev), which expects the parameter as output_dimensionality in snake_case, not outputDimensionality in camelCase (which is the Vertex AI convention). Using the wrong parameter name will cause the API to ignore it and return default 3072-dimensional vectors instead of the intended 768 dimensions, breaking compatibility with existing indexes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/providers/embedding/gemini.ts` around lines 32 - 36, The requests payload
is using camelCase key outputDimensionality which Gemini expects as snake_case
output_dimensionality; update the chunk.map(...) object so the property is
output_dimensionality: this.dimensions (instead of outputDimensionality)
wherever you build requests for MODEL in the embedding/gemini provider, ensuring
any other occurrences of outputDimensionality are renamed to
output_dimensionality so the API receives the intended 768-dimension vector
setting.
…norm

Addresses CodeRabbit findings on PR #370.

1. Pin Gemini LLM default to gemini-2.5-flash.

   `gemini-flash-latest` is a moving alias that points to whatever Google promotes next. Production behaviour should be deterministic from a release perspective — users who upgrade agentmemory should not also get a Gemini model rotation in the same step. Switch the default to the current stable GA model `gemini-2.5-flash`. Users who want the moving alias keep getting it via `GEMINI_MODEL=gemini-flash-latest` in `~/.agentmemory/.env`.

2. Warn-once on zero-norm embedding in l2Normalize.

   `gemini-embedding-001` can return a zero-norm vector for degenerate input. The previous code silently returned the zero vector — downstream cosine-similarity math then divides by zero and the call site sees `NaN` scores with no signal as to why. Emit a one-time stderr warning naming the model + vector length so operators can correlate index quality dips with upstream embedding regressions. Behaviour otherwise unchanged: return the zero vector and let BM25 carry the search signal. Throwing was the other option — rejected because a single bad embedding in a 100-item batch would abort the whole batch and surface as an indexing pipeline halt. Soft-fail + warn matches the rest of the embedding provider's error handling.

Skipped finding:

- `outputDimensionality` → `output_dimensionality` snake_case rename. CodeRabbit asserts the REST API expects snake_case. The Gemini REST API actually uses camelCase on the wire — confirmed against ai.google.dev/api/embeddings (field labelled `outputDimensionality` in the REST schema; the Python SDK alone uses snake_case and translates internally). Current code is correct as-shipped; the snake_case rename would silently break the dim override.

Verified: 903 / 903 tests pass; build clean.
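The warn-once soft-fail in point 2 can be sketched as follows; the module-level flag, message wording, and signature are assumptions for illustration, not the shipped code:

```typescript
// Warn once on the first zero-norm vector, then soft-fail: return the
// zero vector unchanged so one bad embedding cannot abort a whole batch.
// Flag name and message wording are assumptions, not the shipped code.
let warnedZeroNorm = false;

function l2Normalize(vec: number[], model = "gemini-embedding-001"): number[] {
  const norm = Math.sqrt(vec.reduce((s, x) => s + x * x, 0));
  if (norm === 0) {
    if (!warnedZeroNorm) {
      warnedZeroNorm = true;
      console.error(
        `warn: ${model} returned a zero-norm embedding (length ${vec.length}); ` +
          "leaving it unnormalized, BM25 carries the search signal",
      );
    }
    return vec;
  }
  return vec.map((x) => x / norm);
}
```

This matches the trade-off in the commit message: a warning surfaces the upstream problem without halting the indexing pipeline the way a throw would.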
…loy templates + Gemini GA bumps (#383)

* chore(release): v0.9.13 — env-example discovery + CJK tokenizer + load harness + deploy templates + Gemini GA bumps + 14 advisories closed

Six PRs landed since v0.9.12:
- #372 .env.example discovery (this commit) — repo-root template + `init` CLI command + CI sync-checker
- #362 CJK BM25 tokenizer (`@node-rs/jieba` + tiny-segmenter + Hangul)
- #363 `benchmark/load-100k.ts` harness with p50/p90/p99 + per-release results dir
- #361 one-click deploy templates for fly.io / Railway / Render / Coolify (multi-stage Dockerfile, `iiidev/iii` base, `gosu` privilege drop, first-boot HMAC, verified end-to-end on fly.io)
- #364 Python ecosystem via `iii-sdk` example (replaces closed PR #360)
- #370 Gemini GA bumps (LLM default → gemini-2.5-flash, embedding → gemini-embedding-001 + L2-norm + 768 dims)

Plus 14 open Dependabot advisories closed in PR #348 via Next.js → 16.2.6 and PostCSS → 8.5.10 overrides.

Bumped:
- src/version.ts: VERSION 0.9.12 → 0.9.13
- package.json: 0.9.12 → 0.9.13, files += ".env.example", build script copies .env.example into dist/
- packages/mcp/package.json: 0.9.12 → 0.9.13 (lockstep with main)
- plugin/.claude-plugin/plugin.json, plugin/.codex-plugin/plugin.json: 0.9.12 → 0.9.13
- src/types.ts: ExportData.version union extended with "0.9.13"
- src/functions/export-import.ts: supportedVersions Set extended
- test/export-import.test.ts: expected version updated

New surface:
- .env.example at repo root — every env var read by src/ documented in one place, grouped by surface (LLM, embedding, auth, search tuning, behaviour flags, CLI runtime, ports, iii engine pin, Claude Code bridge, Obsidian export). Every line commented out by default so the file is a template.
- agentmemory init — copies bundled .env.example to ~/.agentmemory/.env if absent, refuses to overwrite, prints a diff command. Wired into CLI dispatch + help block.
- scripts/check-env-example.mjs — walks src/ for env-read patterns, fails CI on drift in either direction. Plugged into ci.yml after npm test. Initial bootstrap: 60 keys in sync.

Verified: npm test 903/903, npm run build clean, init smoke pass (creates ~/.agentmemory/.env on first run, refuses overwrite on second).

* fix(init): atomic copy via COPYFILE_EXCL; address CodeRabbit review

Two valid findings from the CodeRabbit pass on PR #383.

1. `runInit` race between existsSync(target) + copyFile(template, target).

   A parallel `agentmemory init` (or any other process touching ~/.agentmemory/.env between the two calls) would silently overwrite the config the operator just wrote. Switch to a single atomic `copyFile(template, target, fsConstants.COPYFILE_EXCL)` and treat the EEXIST error as the "already configured" signal — same warning + diff hint as before, but the check + copy now happen in one syscall so they cannot race. Other failure paths still surface as process exit 1.

2. Comment on `scripts/check-env-example.mjs::walk` claimed it matched ".ts / .mts / .mjs" but the regex also matched ".js". Rewrote the comment to match the regex (".ts / .mts / .mjs / .js"). Same comment pass: noted that test/ never enters because the walk is rooted at src/, not because of an explicit skip.

Skipped findings:
- WHAT-style comment on `findEnvExample` — kept a one-liner explaining the package-vs-source priority since both paths are real; reduced the block from 4 lines to 2 instead of removing it entirely.
- "Add trailing newline to .env.example" — file already ends with `\n` (verified `tail -c 5` shows `tion\n`).

Verified locally:
- `npm run build` clean.
- `npm test` 903 / 903 pass.
- First `agentmemory init` against a clean HOME creates the file.
- Second init against the same HOME hits EEXIST and prints the "leaving it untouched" warning + diff hint without overwriting.
- `node scripts/check-env-example.mjs` — in sync (60 keys).
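The atomic-copy pattern from finding 1 can be sketched as follows; the function name, return values, and paths are illustrative, not agentmemory's exact `runInit`:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// One copyFile call with COPYFILE_EXCL replaces the racy
// existsSync(target) + copyFile(template, target) pair: the copy itself
// fails with EEXIST when the target already exists, so check and copy
// cannot be interleaved with a parallel init.
function initConfig(template: string, target: string): "created" | "already-configured" {
  try {
    fs.copyFileSync(template, target, fs.constants.COPYFILE_EXCL);
    return "created";
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return "already-configured";
    throw err; // any other failure still surfaces to the caller
  }
}
```

The EEXIST branch is the "already configured" signal described in the commit: the operator's existing file is left untouched.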
Summary
Bundles two upstream PRs into one chore — both block real users today and both are default-string bumps with zero API-contract change.
LLM default
`gemini-2.0-flash` is deprecated in Google's Gemini API and returns 429 rate-limit errors under load. Default switches to `gemini-flash-latest`.

Users on a pinned `GEMINI_MODEL` in `~/.agentmemory/.env` are unaffected — defaults only.

Embedding default

`text-embedding-004` is deprecated (shutdown Jan 14 2026). Default switches to `gemini-embedding-001` (GA): 100+ languages, MRL dims (768 / 1536 / 3072), 2048-token input.

Three implementation details that go with the model swap:

- `:batchEmbedContent` → `:batchEmbedContents` (plural; the new model's batch endpoint).
- `outputDimensionality: 768` — sent on every request so returned vectors match `GeminiEmbeddingProvider.dimensions = 768` and the index-restore dim guard from PR "fix(embedding): guard provider responses against dimension mismatches" #248 — no reindex needed for existing users.
- L2-normalize each returned vector. Unlike `text-embedding-004`, `gemini-embedding-001` does not normalize by default — without this the cosine-similarity math elsewhere in the search pipeline (which assumes unit-length vectors) silently collapses recall.

Closes

Closes #368 — @yut304's bump
Closes #246 — @AmmarSaleh50's bump

Test plan

- `npm test` passes — 903 / 903.
- `npm run build` clean.
- With `GEMINI_API_KEY=...` set, `npx agentmemory doctor` reports provider = llm, model = `gemini-flash-latest`.
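The recall-collapse claim above follows from how a normalized index is usually scored: cosine similarity is computed as a plain dot product, which is only a true cosine when both vectors are unit-length. A minimal illustration of that assumption (not the PR's search code):

```typescript
// For unit vectors, dot(a, b) equals cos(angle between a and b).
// An unnormalized vector scales the score by its magnitude, so its
// ranking against properly normalized index vectors is inflated.
function dot(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * (b[i] ?? 0), 0);
}

const unit = [0.6, 0.8]; // unit length
const raw = [3, 4];      // same direction, length 5

// dot(unit, unit) is ~1 (a true cosine), while dot(unit, raw) is ~5,
// inflated by the magnitude of the unnormalized vector.
```

This is why the provider L2-normalizes every `gemini-embedding-001` vector before it enters the index.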