feat: replace custom filter engine with tokf-filter crate #577
mpecan wants to merge 1 commit into rtk-ai:master
Conversation
Delegate RTK's 8-stage filter pipeline to tokf-filter::apply() while keeping the registry, command matching, build.rs concatenation, rtk verify, and omission markers unchanged. Unlocks tokf's full feature set (sections, chunks, JSON extraction, templates) for .rtk/filters.toml authors.

- All 890 unit tests pass
- All 111/111 inline verify tests pass
- 7 pre-existing verify test failures fixed (on_empty + empty input)
- One cosmetic change: truncate_lines_at uses unicode ellipsis (…)
- +2.1ms startup overhead, +0.2MB binary size

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mpecan This is a great PR that unlocks a lot of potential for RTK! I wanted to suggest an enhancement that would be incredibly valuable for MCP tool users.

Use Case: MCP-Specific Filters

With the rise of MCP (Model Context Protocol) tools, there's a growing need to filter verbose JSON responses from external tools that RTK doesn't natively support. For example, the ClickUp MCP returns massive JSON payloads with fields like

Proposed Enhancement

Would it be possible to extend the TOML DSL to support conditional filters based on MCP tool name patterns? Something like:

[[filters.clickup]]
match_mcp = "mcp__clickup__.*"
[filters.clickup.json]
# JSONPath extraction for specific fields
extract = "{tasks: [.tasks[] | {id, name, status, assignee: .assignees[0].username}]}"
# Or field exclusion
exclude_paths = [
"$..workspace_id",
"$..creator",
"$..custom_fields[*].type_config",
"$..assignees[*].profilePicture"
]
# Array limits
max_array_items = 10
# Field truncation
[filters.clickup.json.truncate]
description = 200
markdown_description = 0 # 0 = remove entirely

Why This Matters
Implementation Ideas

The filter could be triggered by:
Would this fit within the scope of tokf-filter's roadmap? Happy to help test or refine the proposal! Related: PR #535 also addresses MCP output compression but with a generic approach. These two PRs could work beautifully together—#535 for generic truncation, and this proposal for tool-specific semantic filtering. |
@Alorse The matching is done fully in RTK, so the filter that is used is completely independent of the tokf implementation; this PR only adds the filter layer. That said, the fact above makes it easier to add MCP matching into the tool and reuse tokf's JSON capabilities. Just to be clear: the use case you are proposing doesn't require any change to tokf-filter.
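To illustrate how the pieces could fit together, here is a hedged sketch of what an MCP-aware filter reusing tokf's JSON capabilities might look like. The match_mcp key is taken from the proposal above and does not exist yet; the [json] key names are likewise assumptions for discussion, not an existing schema:

```toml
# Hypothetical sketch: match_mcp and the [json] keys are assumed.
# Only the division of labor (RTK does the matching, tokf-filter
# does the JSON filtering) reflects what this PR actually enables.
[filters.clickup]
match_mcp = "mcp__clickup__.*"   # matching would live on the RTK side

[filters.clickup.json]           # filtering delegated to tokf-filter
extract = ".tasks[] | {id, name, status}"
max_array_items = 10
```

The point is that only the match_mcp trigger would need new code in RTK; the JSON extraction itself would ride on the shared library.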
Summary
Hey — I'm the maintainer of tokf. I built tokf because I wanted a configurable, locally-definable filter pipeline for reducing LLM token consumption from CLI output. RTK was an inspiration — I credited it in the README from day one — but tokf's filter engine and TOML DSL predate RTK's TOML filter engine by about three weeks (tokf's filter pipeline landed Feb 18; RTK's TOML Part 1 landed Mar 10).
When RTK added its own TOML engine, the two projects ended up with very similar designs — same core stages (skip/keep, replace, match_output, truncate, head/tail, on_empty), similar TOML schemas. Rather than let the two implementations drift apart, I added RTK format compatibility to tokf's serde layer, so RTK's field names (strip_lines_matching, keep_lines_matching, head_lines, tail_lines, message) all deserialize natively.

This PR replaces RTK's custom filter pipeline with a delegation to tokf-filter::apply(). RTK keeps everything that makes it RTK — the registry, command matching, build.rs concatenation, rtk verify, omission markers — but the actual line-by-line filtering is now handled by the shared library.

What changes
- Filter pipeline delegated to tokf-filter::apply() (net -35 lines of pipeline code)
- [filters.name] + match_command, build.rs, filter priority chain, rtk verify, RTK_NO_TOML / RTK_TOML_DEBUG all work exactly as before
- Omission markers ("... (N lines omitted)", "... (N lines truncated)") are still applied by RTK as post-processing
- 7 pre-existing verify test failures fixed (on_empty + empty input expected "" instead of the on_empty message — these were broken on master before this PR)

What this unlocks for RTK users
After this lands, anyone writing .rtk/filters.toml or built-in filters gains access to tokf's full feature set — without any breaking changes to existing filters:

- [[section]] state machines for collecting failure blocks
- [[chunk]] for splitting output into repeating blocks with aggregation
- [on_success] / [on_failure] branches with templates
- dedup / dedup_window for collapsing duplicate lines
- [json] extraction via JSONPath
- template pipes (| each:, | join:, | truncate:, | keep:)

Backward compatibility
- All 111/111 inline verify tests pass (rtk verify --require-all)
- cargo fmt --all --check && tokf run cargo clippy --all-targets clean
- No changes to any .toml filter's logic (7 test expectation fixes only)
- truncate_lines_at now uses … (unicode ellipsis) instead of ... (3 ASCII dots)
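As a concrete example of a filter that keeps working unchanged, here is a hedged sketch of a minimal RTK-style definition using the field names mentioned in this PR. The filter name, command, regexes, and limits are illustrative assumptions, not copied from either repository:

```toml
# Hypothetical filter: the field names (match_command,
# strip_lines_matching, head_lines, tail_lines, message) come from
# this PR's description; the values are made up for illustration.
[filters.cargo-build]
match_command = "cargo build"
strip_lines_matching = ['^\s*Compiling ', '^\s*Downloading ']
head_lines = 40
tail_lines = 20
message = "build output trimmed"
```

Under this PR, TOML like the above is parsed and matched by RTK exactly as before, but the line-by-line execution is delegated to tokf-filter::apply().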
Dependencies

- tokf-filter = "0.2.33" (no default features, Lua disabled — minimal binary size impact)
- tokf-common = "0.2.33" (shared config types)

Benchmark results

- Startup: 15.2ms (+2.1ms over master, measured with hyperfine)
- Binary size: +0.2MB
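In Cargo.toml form, the two dependencies listed above would be added roughly as follows. The "no default features" note is how Lua stays disabled; the exact table syntax here is an assumption about the PR's manifest, not a quote from it:

```toml
# Sketch of the manifest change; versions are as stated in the PR.
[dependencies]
tokf-filter = { version = "0.2.33", default-features = false }
tokf-common = "0.2.33"
```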
Motivation
I don't want two near-identical filter engines maintained in parallel. By sharing the core pipeline, bug fixes and new features in tokf automatically benefit RTK, and RTK's extensive filter library (47 built-in filters with 111 inline tests) has already helped me find and fix bugs in tokf — like match_output not respecting strip_ansi. The ecosystems are stronger together.
Test plan
- cargo fmt --all --check — clean
- cargo clippy --all-targets — clean
- cargo test --all — 890 passed, 0 failed
- rtk verify --require-all — 111/111 passed
- hyperfine — 15.2ms (+2.1ms over master)
- Manual smoke tests: rtk make --version, rtk git log -5, rtk ping -c 2 localhost

🤖 Generated with Claude Code