Codecov Report

❌ Patch coverage is 18.00%. Your patch status has failed because the patch coverage (18.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Coverage Diff (main vs. #379):

| | main | #379 | +/- |
|---|---|---|---|
| Coverage | 56.42% | 54.85% | -1.58% |
| Files | 60 | 61 | +1 |
| Lines | 5366 | 5589 | +223 |
| Branches | 484 | 525 | +41 |
| Hits | 3028 | 3066 | +38 |
| Misses | 2292 | 2477 | +185 |
| Partials | 46 | 46 | |
Pull request overview
This PR completes missing parts of the one-shot schema extraction plugin by adding LLM-generated website keywords, heuristic keyword generation, and schema-based text extraction, and adds a CLI example for running COMPASS against known local documents.
Changes:
- Add LLM-driven generators + caching for query templates, website keywords, and heuristic keyword lists in the one-shot plugin.
- Implement schema-based text extraction (structured output) and update plugin/extractor call paths accordingly (see the sketch after this list).
- Add a CLI “parse existing docs” example and wire it into the Sphinx examples index.
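
For concreteness, here is a minimal sketch of what schema-based, verbatim-or-null text extraction can look like. The schema fields, prompt wording, and the `llm.call` helper are illustrative assumptions, not the plugin's actual `extract_text.json5` schema or call path:

```python
# Hypothetical JSON schema: the model must return either a verbatim
# excerpt or null, never a paraphrase. Field names are illustrative.
EXTRACT_TEXT_SCHEMA = {
    "type": "object",
    "properties": {
        "text": {
            "anyOf": [{"type": "string"}, {"type": "null"}],
            "description": "Verbatim excerpt answering the question, or null if absent.",
        }
    },
    "required": ["text"],
}


async def extract_text(llm, document: str, question: str) -> str | None:
    """Ask the LLM for a verbatim excerpt, constrained by the schema above.

    ``llm`` is assumed to expose an async ``call(prompt, schema)`` method that
    returns the parsed JSON object; this is a stand-in, not COMPASS's API.
    """
    prompt = (
        f"{question}\n\nReturn only text copied verbatim from the document "
        f"below, or null if no relevant text exists.\n\n{document}"
    )
    response = await llm.call(prompt, schema=EXTRACT_TEXT_SCHEMA)
    return response.get("text")
```

Constraining the output to a verbatim string or null makes paraphrased or hallucinated extractions easy to reject downstream.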
Reviewed changes
Copilot reviewed 21 out of 22 changed files in this pull request and generated 8 comments.
Summary per file:
| File | Description |
|---|---|
| examples/parse_existing_docs/CLI/local_docs_minimal.json5 | Adds minimal local-doc mapping example (currently references a non-existent PDF filename). |
| examples/parse_existing_docs/CLI/local_docs.json5 | Adds fuller local-doc mapping example with metadata fields. |
| examples/parse_existing_docs/CLI/jurisdictions.csv | Adds sample jurisdictions input for the local-docs CLI run. |
| examples/parse_existing_docs/CLI/config.json5 | Adds sample run config demonstrating known_local_docs + disabled search. |
| examples/parse_existing_docs/CLI/README.rst | Adds CLI walkthrough for processing local docs (contains a couple of typos). |
| examples/one_shot_schema_extraction/plugin_config_simple.json5 | Updates config option name + enables heuristic keyword auto-generation. |
| examples/one_shot_schema_extraction/plugin_config.yaml | Refreshes website keywords and adds heuristic keyword lists example. |
| examples/one_shot_schema_extraction/README.rst | Updates option name and fixes a doc link. |
| docs/source/examples/index.rst | Adds the “parse existing docs via CLI” example to the docs toctree. |
| compass/services/threaded.py | Adjusts jurisdiction document info dumping (currently breaks filename reporting for local docs). |
| compass/plugin/ordinance.py | Refactors text extractors to be direct LLM callers; updates usage labeling + call path. |
| compass/plugin/one_shot/schemas/website_keywords.json5 | Adds schema for LLM-generated website keyword weights. |
| compass/plugin/one_shot/schemas/heuristic_keywords.json5 | Adds schema for LLM-generated heuristic keyword lists. |
| compass/plugin/one_shot/schemas/extract_text.json5 | Adds schema for structured-output text extraction (verbatim or null). |
| compass/plugin/one_shot/generators.py | Adds website keyword + heuristic keyword generators and keyword normalization/deduping (see the normalization sketch after the table). |
| compass/plugin/one_shot/components.py | Implements schema-based text extractor/collector components (has a prompt typo). |
| compass/plugin/one_shot/cache.py | Adds a disk cache for LLM-generated content (hashing is not stable; see the stable-key sketch after the table). |
| compass/plugin/one_shot/base.py | Wires in new generators, caching, heuristic support, and schema-based text extraction. |
| compass/plugin/noop.py | Removes legacy llm_caller init pattern for NoOp text extractor. |
| compass/plugin/interface.py | Updates text extraction instantiation and uses async get_heuristic() in filtering. |
| compass/extraction/apply.py | Improves attempt-count logging format for ngram-checked extraction retries. |
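
The keyword normalization/deduping mentioned for `generators.py` could, in the simplest case, look like the sketch below; the exact rules (casing, whitespace handling) are assumptions, not the module's actual behaviour:

```python
def normalize_keywords(keywords: list[str]) -> list[str]:
    """Lowercase, collapse whitespace, and drop duplicates while preserving order.

    A minimal sketch of what "normalization/deduping" for LLM-generated
    keyword lists could look like; generators.py's real rules may differ.
    """
    seen: set[str] = set()
    out: list[str] = []
    for kw in keywords:
        cleaned = " ".join(kw.lower().split())  # trim and collapse internal whitespace
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            out.append(cleaned)
    return out
```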
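Regarding the note that hashing in `cache.py` is not stable: one common remedy is to derive the cache key from a canonical serialization hashed with sha256, rather than Python's built-in `hash()`, which is salted per interpreter run for strings and so produces different keys across processes. The function names and key contents below are illustrative assumptions, not the plugin's actual cache API:

```python
import hashlib
import json
from pathlib import Path


def stable_cache_key(model: str, prompt: str, schema: dict) -> str:
    """Content-addressed key that is identical across runs and processes.

    Hashing a canonical JSON dump with sha256 avoids the per-process salting
    of Python's built-in ``hash()`` for str/bytes, which would otherwise make
    a disk cache silently miss after every restart.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "schema": schema},
        sort_keys=True,
        ensure_ascii=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_path(cache_dir: Path, key: str) -> Path:
    """Location of the cached LLM response for ``key``."""
    return cache_dir / f"{key}.json"
```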
Add missing components, including LLM-generated keywords, heuristic keyword generation, and text extraction.