
fix(chroma): honor DocFilter.isTable=false in _convertFilter #325

Closed
kgarg2468 wants to merge 5 commits into rocketride-org:develop from kgarg2468:bugfix/chroma-istable-false-filter

Conversation

@kgarg2468 kgarg2468 commented Mar 22, 2026

Summary

  • Fixes a Chroma filter regression where DocFilter(isTable=False) was ignored in _convertFilter.
  • Preserves existing behavior for isTable=True and isTable=None.
  • Adds focused regression tests for isTable filter semantics, including combined filters.

Bug

nodes/src/nodes/chroma/chroma.py used a truthy check:

if docFilter.isTable:

This skipped the filter when isTable=False, so "exclude tables" requests were not applied.

Steps to Reproduce

cd /Users/krishgarg/Documents/Projects/RocketRide/rocketride-server/.worktrees/chroma-istable-false-filter

# Failing behavior before fix:
git checkout aee7d58^
/Users/krishgarg/Documents/Projects/RocketRide/rocketride-server/.worktrees/chroma-istable-false-filter/dist/server/engine -m pytest nodes/test/test_chroma_filter_semantics.py::test_convert_filter_handles_is_table_false -q

# Passing behavior after fix:
git checkout aee7d58
/Users/krishgarg/Documents/Projects/RocketRide/rocketride-server/.worktrees/chroma-istable-false-filter/dist/server/engine -m pytest nodes/test/test_chroma_filter_semantics.py::test_convert_filter_handles_is_table_false -q

Expected: filter contains {'isTable': {'$eq': False}} when DocFilter.isTable=False.

Actual (before fix): isTable clause omitted.

Root Cause

_convertFilter treated False as "not set" because it used a truthiness check instead of an explicit None check.

Fix

Changed the conditional in nodes/src/nodes/chroma/chroma.py:

  • From: if docFilter.isTable:
  • To: if docFilter.isTable is not None:

Added tests in nodes/test/test_chroma_filter_semantics.py:

  • isTable=False -> includes {'isTable': {'$eq': False}}
  • isTable=True -> includes {'isTable': {'$eq': True}}
  • isTable=None -> omits isTable clause
  • nodeId + isTable=False -> preserves both clauses
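
The four cases above can be sketched in isolation. `convert_filter` below is a hypothetical stand-in for `Store._convertFilter`, since the real method body is not shown in this description. The `$and` wrapping for combined clauses follows the CodeRabbit walkthrough; the `nodeId` clause shape and the `None` return for an empty filter are assumptions made for the sketch.

```python
from types import SimpleNamespace


def convert_filter(doc_filter):
    """Hypothetical stand-in for Store._convertFilter (assumed shape, not the real code)."""
    clauses = []
    if doc_filter.nodeId is not None:
        clauses.append({'nodeId': {'$eq': doc_filter.nodeId}})
    # The fix: an explicit None check keeps isTable=False as a real constraint.
    if doc_filter.isTable is not None:
        clauses.append({'isTable': {'$eq': doc_filter.isTable}})
    if not clauses:
        return None  # assumption: no constraints means no filter
    return clauses[0] if len(clauses) == 1 else {'$and': clauses}


def make_filter(**overrides):
    """Build a DocFilter-like object whose fields default to None (unset)."""
    return SimpleNamespace(**{'nodeId': None, 'isTable': None, **overrides})


only_false = convert_filter(make_filter(isTable=False))
only_true = convert_filter(make_filter(isTable=True))
unset = convert_filter(make_filter())
combined = convert_filter(make_filter(nodeId='node-1', isTable=False))
```

With the old truthy check, `only_false` would have come back empty and `combined` would have kept only the `nodeId` clause.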

Why This Works

is not None correctly distinguishes explicit boolean values (True/False) from "unset" (None), matching DocFilter semantics and preventing false-value drops.
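
A minimal illustration of the distinction, independent of Chroma: `False` and `None` are both falsy, so only the identity check separates an explicit `False` from an unset value. The variable names are illustrative.

```python
# Compare the old truthiness check with the explicit None check
# for each value DocFilter.isTable can take.
results = {}
for value in (True, False, None):
    old_check = bool(value)        # behaviour of `if docFilter.isTable:`
    new_check = value is not None  # behaviour of `if docFilter.isTable is not None:`
    results[value] = (old_check, new_check)

for value, (old_check, new_check) in results.items():
    print(f'isTable={value!r}: truthy={old_check}, is-not-None={new_check}')
```

Only the `False` row differs between the two checks, which is exactly the case the old code dropped.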

Testing / Validation

# Full contract suite
PATH="/Users/krishgarg/.local/cmake-3.30-stage/cmake-3.30.1-macos-universal/CMake.app/Contents/bin:$PATH" ./builder nodes:test-contracts --verbose
# result: PASS

# Full repo test pipeline
PATH="/Users/krishgarg/.local/cmake-3.30-stage/cmake-3.30.1-macos-universal/CMake.app/Contents/bin:$PATH" ./builder test --verbose
# result: PASS

# Targeted regression test
/Users/krishgarg/Documents/Projects/RocketRide/rocketride-server/.worktrees/chroma-istable-false-filter/dist/server/engine -m pytest nodes/test/test_chroma_filter_semantics.py -q
# result: 4 passed

# Nodes suite sanity
/Users/krishgarg/Documents/Projects/RocketRide/rocketride-server/.worktrees/chroma-istable-false-filter/dist/server/engine -m pytest nodes/test -q
# result: 328 passed, 65 skipped

Type

fix

Checklist

  • Tests added/updated
  • Tested locally
  • ./builder test passes
  • Commit messages follow conventional commits
  • No secrets or credentials included
  • Wiki updated (if applicable)
  • Breaking changes documented (if applicable)

Related Issues

  • None currently linked.

Summary by CodeRabbit

  • Bug Fixes

    • Preserve boolean false for the isTable field in filters so records with isTable=false are matched correctly.
  • Tests

    • Added regression tests validating isTable filter behavior for true/false/null, combined-filter handling, and that test import isolation prevents leakage or replacement of scoped modules.

Tag: #frontier-tower-hackathon

Impact

This restores correct filtering semantics: queries with an explicit isTable=False now apply the exclude-tables constraint instead of silently dropping it.

CI Evidence

@github-actions github-actions bot added the module:nodes Python pipeline nodes label Mar 22, 2026

coderabbitai bot commented Mar 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Updated Chroma Store filter construction to add the isTable equality clause only when docFilter.isTable is not None, preserving explicit False. Added regression tests validating isTable semantics and ensuring temporary import stubs do not leak from sys.modules.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Chroma Filter Logic Fix: nodes/src/nodes/chroma/chroma.py | Modify _convertFilter to include {'isTable': {'$eq': ...}} only when docFilter.isTable is not None, preserving explicit False as a constraint. |
| Filter Semantics Test Suite: nodes/test/test_chroma_filter_semantics.py | Add tests that load chroma.chroma.Store with scoped stub imports, assert _convertFilter outputs for isTable=False, isTable=True, and isTable=None (omitted), verify nodeId+isTable=False yields an $and of clauses, and confirm no scoped-module leakage in sys.modules. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I sniffed a False beneath the hay,
and pulled it out to hop and say:
None keeps quiet, True and False stand tall—
tests clap paws and catch them all. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main fix: honoring DocFilter.isTable=false in the _convertFilter method, which is exactly what the changeset addresses. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nodes/test/test_chroma_filter_semantics.py`:
- Around line 16-95: Tests install stub modules via _install_stubs() which
mutates sys.modules globally and leaks between tests; implement a
`@contextmanager` named _scoped_stubs() (with the docstring: "Temporarily install
stubs, restoring original modules on exit.") that snapshots sys.modules, calls
_install_stubs(), yields to the caller, and on exit restores the original
sys.modules state (removing any newly added stub modules and restoring replaced
ones); update _load_store_class() to use the _scoped_stubs() context manager
around the import so stubs are automatically scoped and restored after each
test.
- Line 19: Update the helper scaffolding to satisfy ruff ANN/ARG/PT rules:
rename unused lambda parameters in mod_depends.depends from *args, **kwargs to
*_args, **_kwargs; add explicit return type -> None and annotate parameters
*_args: object, **_kwargs: object on the three helper __init__ methods
referenced in the diff; add type annotations provider: object, connConfig:
object and return type -> dict to getNodeConfig and rename any unused params
with a leading underscore; add -> type return annotation to _load_store_class;
replace the compound assertion with two assertions (assert spec is not None and
assert spec.loader is not None) where spec is checked; annotate _doc_filter to
accept **overrides: object and return -> types.SimpleNamespace; and add -> None
return type annotations to the test functions at the bottom (lines 115+). Ensure
these changes reference the exact function/class names in the file so linters
pass.
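
The scoped-stub idea in the first finding can be sketched as follows. This is a minimal illustration, not the code that landed in the PR: `STUB_NAMES` and `scoped_stubs` are hypothetical names, and the stubs here are empty modules rather than the richer ones the real test installs.

```python
import sys
from collections.abc import Iterator
from contextlib import contextmanager
from types import ModuleType

# Hypothetical stub list; the real test module defines its own names.
STUB_NAMES = ('depends', 'rocketlib')


@contextmanager
def scoped_stubs() -> Iterator[None]:
    """Temporarily install stubs, restoring original modules on exit."""
    saved = {name: sys.modules.get(name) for name in STUB_NAMES}
    for name in STUB_NAMES:
        sys.modules[name] = ModuleType(name)
    try:
        yield
    finally:
        for name, module in saved.items():
            if module is None:
                sys.modules.pop(name, None)  # stub was newly added: remove it
            else:
                sys.modules[name] = module   # stub replaced a real module: restore it


before = {name: sys.modules.get(name) for name in STUB_NAMES}
with scoped_stubs():
    inside = all(name in sys.modules for name in STUB_NAMES)
after = {name: sys.modules.get(name) for name in STUB_NAMES}
```

Restoring in a `finally` block means a failing import inside the `with` body still leaves `sys.modules` as it was found.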

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 223c3f28-8743-4211-b936-fc1764096f08

📥 Commits

Reviewing files that changed from the base of the PR and between 5bdcd5b and aee7d58.

📒 Files selected for processing (2)
  • nodes/src/nodes/chroma/chroma.py
  • nodes/test/test_chroma_filter_semantics.py

Comment thread nodes/test/test_chroma_filter_semantics.py Outdated
Comment thread nodes/test/test_chroma_filter_semantics.py Outdated

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nodes/test/test_chroma_filter_semantics.py`:
- Line 15: Replace the typing import: change the import statement that currently
reads "from typing import Iterator" to import Iterator from collections.abc
instead (i.e., use collections.abc.Iterator) in the
test_chroma_filter_semantics.py file so it satisfies Ruff UP035; update any
references if necessary and run the linter/tests to confirm the warning is
resolved.
- Around line 182-203: Replace the duplicated hardcoded stub module list in
test_load_store_class_does_not_leak_stub_modules with the module-level constant
_STUB_MODULE_NAMES so the test always uses the canonical list; locate the test
function test_load_store_class_does_not_leak_stub_modules, remove the local
stub_modules assignment and use _STUB_MODULE_NAMES when building the before
mapping, and ensure _load_store_class is still invoked and subsequent assertions
remain unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b226271f-9ab7-4f9b-bc96-c311ea927efa

📥 Commits

Reviewing files that changed from the base of the PR and between aee7d58 and b277778.

📒 Files selected for processing (1)
  • nodes/test/test_chroma_filter_semantics.py

Comment thread nodes/test/test_chroma_filter_semantics.py Outdated
Comment thread nodes/test/test_chroma_filter_semantics.py Outdated

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nodes/test/test_chroma_filter_semantics.py`:
- Around line 128-142: Add a PEP 257 docstring to the helper function
_doc_filter describing its purpose (constructing a SimpleNamespace representing
a document filter with default keys), listing the accepted keyword overrides and
return type (types.SimpleNamespace), and noting that keys like nodeId, isTable,
tableIds, parent, permissions, objectIds, isDeleted, chunkIds, minChunkId, and
maxChunkId are initialized to None by default; place the docstring immediately
below the def _doc_filter(...) line.
- Around line 116-125: Add a PEP 257 docstring to the helper function
_load_store_class(): immediately beneath def _load_store_class() add a short
triple-quoted description that explains the function’s purpose (dynamically load
the chroma.Store class for tests), any important behavior (uses _scoped_stubs
and resolves the chroma.py path) and what it returns (the Store class type).
Keep the docstring concise and consistent with other helpers in the module.
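
A documented `_doc_filter` along the lines requested might look like the sketch below. The key list, the `**overrides: object` signature, and the `types.SimpleNamespace` return type come from the review comments; the function body and docstring wording are assumptions, since the actual helper is not shown here.

```python
import types


def _doc_filter(**overrides: object) -> types.SimpleNamespace:
    """Build a SimpleNamespace standing in for a document filter.

    Every known filter key defaults to None (unset); keyword overrides
    replace individual defaults.

    Returns:
        types.SimpleNamespace: a filter-like object for _convertFilter tests.
    """
    defaults: dict[str, object] = {
        'nodeId': None,
        'isTable': None,
        'tableIds': None,
        'parent': None,
        'permissions': None,
        'objectIds': None,
        'isDeleted': None,
        'chunkIds': None,
        'minChunkId': None,
        'maxChunkId': None,
    }
    defaults.update(overrides)
    return types.SimpleNamespace(**defaults)


example = _doc_filter(isTable=False)
```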

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: ea4b0c8b-8d03-4fc4-8678-3ba647c761a0

📥 Commits

Reviewing files that changed from the base of the PR and between b277778 and bfa5b77.

📒 Files selected for processing (1)
  • nodes/test/test_chroma_filter_semantics.py

Comment thread nodes/test/test_chroma_filter_semantics.py Outdated
Comment thread nodes/test/test_chroma_filter_semantics.py Outdated
Comment thread nodes/test/test_chroma_filter_semantics.py
Isolate the Chroma filter regression tests from runtime-only modules so they run reliably in CI and local environments without full engine dependencies.

Made-with: Cursor
@kgarg2468

Addressed the remaining review friction with one focused follow-up commit: 9403aa9.

What changed

  • Hardened nodes/test/test_chroma_filter_semantics.py so it loads nodes/src/nodes/chroma/chroma.py directly and stubs runtime-only modules (depends, rocketlib, ai.common.*, numpy) inside the test scope.
  • Preserved strict module cleanup/restore behavior after each scoped import.

Why this addresses the concern

  • This removes environment-coupled import failures from the regression test harness, so the isTable=false semantics tests validate the intended behavior without requiring full engine/runtime dependencies.
  • Net effect: lower reviewer friction and more deterministic test execution.

Validation (after)

  • ../../.venv/bin/pytest nodes/test/test_chroma_filter_semantics.py -q
  • Result: 5 passed

CI evidence

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
nodes/test/test_chroma_filter_semantics.py (1)

140-140: ⚠️ Potential issue | 🟡 Minor

Split the compound assertion at Line 140 for clearer pytest failures and lint compliance.

Use two assertions so failures report the exact missing condition.

♻️ Proposed fix
-        assert spec is not None and spec.loader is not None
+        assert spec is not None
+        assert spec.loader is not None

As per coding guidelines, nodes/**/*.py should use ruff for linting/formatting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/test/test_chroma_filter_semantics.py` at line 140, Split the compound
assertion into two separate assertions so failures are clearer and
lint-compliant: replace the single `assert spec is not None and spec.loader is
not None` with `assert spec is not None` followed by `assert spec.loader is not
None`, referencing the same `spec` and `spec.loader` variables in the test
function to preserve behavior.
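
For context, the split assertions fit naturally into a load-by-path helper. The sketch below assumes the test loads the module with `importlib.util.spec_from_file_location` (the `spec` and `spec.loader` names suggest this, but the real helper is not shown); the temporary file and the `demo_module` name are stand-ins for `nodes/src/nodes/chroma/chroma.py`.

```python
import importlib.util
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in source file; the real test would point at chroma.py instead.
    module_path = Path(tmp) / 'demo_module.py'
    module_path.write_text('class Store:\n    pass\n')

    spec = importlib.util.spec_from_file_location('demo_module', str(module_path))
    # Separate assertions: a failure names the exact condition that was not met.
    assert spec is not None
    assert spec.loader is not None

    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

store_class = module.Store
```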
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nodes/test/test_chroma_filter_semantics.py`:
- Around line 47-133: Extract the setup and teardown logic out of the long
_scoped_imports contextmanager into two small helpers (e.g.,
_install_scoped_imports() and _restore_scoped_imports(original_sys_path,
original_modules, original_stub_modules)), keeping _scoped_imports as a thin
wrapper that calls the installer, yields, then calls the restorer; move path
mutations, sys.modules stub creation (the depends/rocketlib/ai/numpy stubs and
assignments), and importlib.invalidate_caches() into _install_scoped_imports,
and move restoring sys.path and module cleanup/restoration (using
_is_scoped_module and original_modules/original_stub_modules captured via
_capture_scoped_modules/_STUB_MODULE_NAMES) into _restore_scoped_imports so
behavior is identical but readability and testability improve.

---

Duplicate comments:
In `@nodes/test/test_chroma_filter_semantics.py`:
- Line 140: Split the compound assertion into two separate assertions so
failures are clearer and lint-compliant: replace the single `assert spec is not
None and spec.loader is not None` with `assert spec is not None` followed by
`assert spec.loader is not None`, referencing the same `spec` and `spec.loader`
variables in the test function to preserve behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8b721b8e-0a34-4697-95cb-d8c5e6219c79

📥 Commits

Reviewing files that changed from the base of the PR and between 6566c09 and 9403aa9.

📒 Files selected for processing (1)
  • nodes/test/test_chroma_filter_semantics.py

Comment on lines +47 to +133
@contextmanager
def _scoped_imports() -> Iterator[None]:
    """Temporarily prepend canonical Chroma mock paths and restore import state."""
    original_sys_path = list(sys.path)
    original_modules = _capture_scoped_modules()
    original_stub_modules = {name: sys.modules.get(name) for name in _STUB_MODULE_NAMES}

    test_dir = Path(__file__).resolve().parent
    mock_path = test_dir / 'mocks'
    nodes_path = test_dir.parent / 'src' / 'nodes'

    sys.path.insert(0, str(nodes_path))
    sys.path.insert(0, str(mock_path))
    # `nodes/src/nodes/chroma/chroma.py` imports runtime dependencies at module
    # import time; install lightweight stubs so tests stay hermetic.
    depends_module = ModuleType('depends')
    depends_module.depends = lambda *_a, **_kw: None  # type: ignore[attr-defined]
    sys.modules['depends'] = depends_module

    rocketlib_module = ModuleType('rocketlib')
    rocketlib_module.debug = lambda *_a, **_kw: None  # type: ignore[attr-defined]
    sys.modules['rocketlib'] = rocketlib_module

    ai_module = ModuleType('ai')
    ai_common_module = ModuleType('ai.common')
    ai_common_module.__path__ = []  # type: ignore[attr-defined]
    ai_schema_module = ModuleType('ai.common.schema')
    ai_store_module = ModuleType('ai.common.store')
    ai_config_module = ModuleType('ai.common.config')
    ai_transform_module = ModuleType('ai.common.transform')
    numpy_module = ModuleType('numpy')

    class _Doc:
        pass

    class _DocFilter:
        pass

    class _DocMetadata:
        pass

    class _QuestionText:
        pass

    class _DocumentStoreBase:
        def __init__(self, *_a: object, **_kw: object) -> None:
            pass

    class _Config:
        @staticmethod
        def getNodeConfig(_provider: object, _connConfig: object) -> dict[str, object]:
            return {}

    class _IEndpointTransform:
        pass

    ai_schema_module.Doc = _Doc
    ai_schema_module.DocFilter = _DocFilter
    ai_schema_module.DocMetadata = _DocMetadata
    ai_schema_module.QuestionText = _QuestionText
    ai_store_module.DocumentStoreBase = _DocumentStoreBase
    ai_config_module.Config = _Config
    ai_transform_module.IEndpointTransform = _IEndpointTransform

    sys.modules['ai'] = ai_module
    sys.modules['ai.common'] = ai_common_module
    sys.modules['ai.common.schema'] = ai_schema_module
    sys.modules['ai.common.store'] = ai_store_module
    sys.modules['ai.common.config'] = ai_config_module
    sys.modules['ai.common.transform'] = ai_transform_module
    sys.modules['numpy'] = numpy_module
    importlib.invalidate_caches()
    try:
        yield
    finally:
        sys.path[:] = original_sys_path
        for module_name in list(sys.modules):
            if _is_scoped_module(module_name) and module_name not in original_modules:
                sys.modules.pop(module_name, None)
        for module_name, module in original_modules.items():
            sys.modules[module_name] = module
        for module_name, module in original_stub_modules.items():
            if module is None:
                sys.modules.pop(module_name, None)
            else:
                sys.modules[module_name] = module

🧹 Nitpick | 🔵 Trivial

Consider splitting _scoped_imports into install/restore helpers for readability and safer maintenance.

Line 48 introduces a long context manager that does path mutation, stub injection, cache invalidation, and cleanup in one block. Extracting setup/teardown into small helpers would reduce cognitive load and make future changes less error-prone.

🧰 Tools
🪛 Ruff (0.15.6)

[warning] 48-48: Too many statements (57 > 50)

(PLR0915)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/test/test_chroma_filter_semantics.py` around lines 47 - 133, Extract
the setup and teardown logic out of the long _scoped_imports contextmanager into
two small helpers (e.g., _install_scoped_imports() and
_restore_scoped_imports(original_sys_path, original_modules,
original_stub_modules)), keeping _scoped_imports as a thin wrapper that calls
the installer, yields, then calls the restorer; move path mutations, sys.modules
stub creation (the depends/rocketlib/ai/numpy stubs and assignments), and
importlib.invalidate_caches() into _install_scoped_imports, and move restoring
sys.path and module cleanup/restoration (using _is_scoped_module and
original_modules/original_stub_modules captured via
_capture_scoped_modules/_STUB_MODULE_NAMES) into _restore_scoped_imports so
behavior is identical but readability and testability improve.
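
The suggested split could take roughly the following shape. This is a reduced sketch, not the PR's code: the stub list is shortened, the stubs are empty modules, the `_is_scoped_module` cleanup is omitted, and the `extra_path` parameter with its `/hypothetical/mocks` default is added only so the example runs on its own.

```python
import sys
from collections.abc import Iterator
from contextlib import contextmanager
from types import ModuleType
from typing import Optional

# Shortened, illustrative list; the real module defines _STUB_MODULE_NAMES itself.
_STUB_MODULE_NAMES = ('depends', 'rocketlib')

_State = tuple[list[str], dict[str, Optional[ModuleType]]]


def _install_scoped_imports(extra_path: str) -> _State:
    """Snapshot import state, then prepend a path and install empty stubs."""
    original_sys_path = list(sys.path)
    original_stub_modules = {name: sys.modules.get(name) for name in _STUB_MODULE_NAMES}
    sys.path.insert(0, extra_path)
    for name in _STUB_MODULE_NAMES:
        sys.modules[name] = ModuleType(name)
    return original_sys_path, original_stub_modules


def _restore_scoped_imports(
    original_sys_path: list[str],
    original_stub_modules: dict[str, Optional[ModuleType]],
) -> None:
    """Put sys.path and the stubbed sys.modules entries back as they were."""
    sys.path[:] = original_sys_path
    for name, module in original_stub_modules.items():
        if module is None:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = module


@contextmanager
def _scoped_imports(extra_path: str = '/hypothetical/mocks') -> Iterator[None]:
    """Thin wrapper: install, yield, always restore."""
    state = _install_scoped_imports(extra_path)
    try:
        yield
    finally:
        _restore_scoped_imports(*state)


path_before = list(sys.path)
with _scoped_imports():
    path_inside = sys.path[0]
path_after = list(sys.path)
```

Keeping the wrapper this thin also brings each function well under the statement count that triggered the PLR0915 warning.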

@asclearuc

It looks like PR #373 is more comprehensive, so this one should be closed.

@kwit75 kwit75 added the superseded Superseded by a more comprehensive PR label Mar 23, 2026

kwit75 commented Mar 23, 2026

Hi @kgarg2468, thanks for catching the isTable=false filter bug! Your newer PR #373 applies the same is not None fix to all 6 filter fields (nodeId, tableIds, parent, isTable, nodeType, pipelineId), making it a more comprehensive solution. Marking this one as superseded in favor of #373.

Labels

module:nodes Python pipeline nodes superseded Superseded by a more comprehensive PR

4 participants