BOR-518: add raw line ingestion fallback#1

Open
bdclaw2026 wants to merge 5 commits into master from bdclaw/bor-518-phase-2ingestion-foundation-build-raw-log-line-ingestion
@bdclaw2026
Owner

Mirror of upstream PR STRRL#27 for Symphony workflow metadata.

Upstream PR: STRRL#27
Linear issue: https://linear.app/boringdesign/issue/BOR-518/phase-2ingestion-foundation-build-raw-log-line-ingestion-with-text

This fork PR is not the intended merge target; it exists because the upstream repository does not expose the required symphony label to this account.

STRRL and others added 5 commits March 9, 2026 20:29

* refactor: restructure CLI — analyze includes ingest pipeline, move ingest under debug

- `analyze` now runs the full ingest pipeline (Drain + semantic labeling +
  DuckDB storage) before launching the AI agent
- Move `ingest` command under `debug ingest` for step-by-step debugging
- Extract shared pipeline helpers into `cmd/lapp/pipeline.go`
- Remove top-level `templates` command
- Add workspace path constraint to analyzer system prompt to prevent
  the agent from scanning files outside the workspace directory
- Add Langfuse tracing support with docker-compose for local dev
- Update CLAUDE.md with new CLI structure and code style notes

* feat: add OpenTelemetry distributed tracing with Jaeger backend

Instrument the entire pipeline with OTel spans: CLI commands, multiline
merge, Drain parsing, semantic labeling, DuckDB storage, and analyzer.
HTTP clients for LLM calls use otelhttp transport for deep request traces.

- Add pkg/tracing/otel.go with OTLP HTTP exporter (env-gated via OTEL_TRACING_ENABLED)
- Add Jaeger service to docker-compose.yml (UI on port 16686)
- Wire InitOTel in main.go with graceful shutdown
- Add ctx parameter to DrainParser.Feed/Templates and multiline.Merge/MergeSlice
- Wrap eino OpenRouter HTTP clients with otelhttp.NewTransport

* fix: reuse Drain templates in analyzer to keep IDs consistent with DB

Extract AnalyzeWithTemplates() that accepts pre-computed templates,
so the analyze command passes the same DrainParser output to both
DuckDB storage and the workspace builder. Previously, Analyze()
created a second DrainParser with fresh UUIDs, causing template IDs
in the workspace to diverge from those in the database.
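The divergence described above can be reproduced with a minimal sketch. Everything here is hypothetical and for illustration only: `Template`, `parseTemplates`, `idsByPattern`, and `sameIDs` are stand-ins rather than the project's API, and random hex strings take the place of UUIDs so the example needs only the standard library.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// Template is a hypothetical stand-in for a Drain template record.
type Template struct {
	ID      string
	Pattern string
}

// newID returns a fresh random identifier (a stand-in for a UUID).
func newID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// parseTemplates simulates one parser pass: every pattern gets a new ID.
func parseTemplates(patterns []string) []Template {
	out := make([]Template, 0, len(patterns))
	for _, p := range patterns {
		out = append(out, Template{ID: newID(), Pattern: p})
	}
	return out
}

// idsByPattern indexes template IDs by their pattern text.
func idsByPattern(ts []Template) map[string]string {
	m := make(map[string]string, len(ts))
	for _, tpl := range ts {
		m[tpl.Pattern] = tpl.ID
	}
	return m
}

// sameIDs reports whether two indexes assign identical IDs to every pattern.
func sameIDs(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if b[k] != v {
			return false
		}
	}
	return true
}

func main() {
	patterns := []string{"user <*> logged in", "timeout after <*> ms"}

	// Two independent passes, as before the fix: the IDs do not line up.
	db := idsByPattern(parseTemplates(patterns))
	ws := idsByPattern(parseTemplates(patterns))
	fmt.Println("independent passes agree:", sameIDs(db, ws))

	// One pass shared by both consumers, as after the fix.
	shared := parseTemplates(patterns)
	fmt.Println("shared pass agrees:", sameIDs(idsByPattern(shared), idsByPattern(shared)))
}
```

Passing the already-computed slice to both consumers is what keeps the database and the workspace in agreement; running a second pass cannot, because each pass mints new identifiers.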

* feat: replace flat workspace with structured file-based layout for AI agents

Replace all old CLI commands (analyze, debug *) with a new `workspace`
command group (create, add-log, analyze) that builds a structured
directory with pattern directories named by LLM-generated semantic IDs.

Workspace structure: logs/, patterns/<semantic-id>/{pattern.md,samples.log},
patterns/unmatched/, notes/{summary.md,errors.md}, and AGENTS.md.

Closes STRRL#17

* fix: address PR review — sanitize dir names, deterministic order, unique stdin names

- Sanitize semantic IDs with [a-z0-9-] regex before using as directory
  names to prevent path traversal from LLM output
- Sort filenames before iterating in mergeAllLogs for deterministic
  rebuild output across runs
- Use UnixNano instead of Unix for stdin log filenames to avoid
  collisions within the same second
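
A rough sketch of the three fixes, under stated assumptions: the helper names (`sanitizeSemanticID`, `sortedNames`, `stdinLogName`), the `unnamed` fallback, and the `stdin-<nanos>.log` filename format are all hypothetical and are not taken from the PR's code. Only the `[a-z0-9-]` character class, the sorting step, and the use of `UnixNano` come from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
	"time"
)

// disallowed matches every run of characters outside [a-z0-9-].
var disallowed = regexp.MustCompile(`[^a-z0-9-]+`)

// sanitizeSemanticID lowercases an LLM-generated ID and replaces each run of
// disallowed characters with one hyphen, so the result holds no path
// separators or dots. The "unnamed" fallback is an assumption of this sketch.
func sanitizeSemanticID(id string) string {
	s := disallowed.ReplaceAllString(strings.ToLower(id), "-")
	s = strings.Trim(s, "-")
	if s == "" {
		return "unnamed"
	}
	return s
}

// sortedNames returns a sorted copy so iteration order is stable across runs.
func sortedNames(names []string) []string {
	out := append([]string(nil), names...)
	sort.Strings(out)
	return out
}

// stdinLogName builds a filename from a nanosecond timestamp, so two writes
// within the same second still get distinct names.
func stdinLogName(unixNano int64) string {
	return fmt.Sprintf("stdin-%d.log", unixNano)
}

func main() {
	fmt.Println(sanitizeSemanticID("../../etc/passwd")) // prints "etc-passwd"
	fmt.Println(sortedNames([]string{"b.log", "a.log", "c.log"}))
	fmt.Println(stdinLogName(time.Now().UnixNano()))
}
```

Because `/` and `.` fall outside the allowed class, a traversal attempt such as `../../etc/passwd` collapses into a plain directory name.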

* feat: replace positional dir args with --topic flag and auto-resolve workspace paths

Workspace commands now accept a --topic flag instead of a raw directory path.
Topics are sanitized to lower-kebab-case and resolved to ~/.lapp/workspaces/<topic>/,
giving users a simpler CLI interface without needing to manage paths directly.
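
The topic resolution might look roughly like the sketch below. `sanitizeTopic` and `workspaceDir` are hypothetical names, and the exact sanitization rules (treating every non-alphanumeric run as a single hyphen, rejecting empty results) are assumptions. Only the lower-kebab-case target and the `~/.lapp/workspaces/<topic>/` location come from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
	"strings"
)

// nonKebab matches every run of characters that cannot appear in a kebab-case word.
var nonKebab = regexp.MustCompile(`[^a-z0-9]+`)

// sanitizeTopic converts free-form input to lower-kebab-case.
func sanitizeTopic(topic string) string {
	s := nonKebab.ReplaceAllString(strings.ToLower(strings.TrimSpace(topic)), "-")
	return strings.Trim(s, "-")
}

// workspaceDir resolves a topic to <home>/.lapp/workspaces/<topic>.
// It rejects topics that sanitize to an empty string.
func workspaceDir(home, topic string) (string, error) {
	name := sanitizeTopic(topic)
	if name == "" {
		return "", fmt.Errorf("topic %q has no usable characters", topic)
	}
	return filepath.Join(home, ".lapp", "workspaces", name), nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	dir, err := workspaceDir(home, "Checkout Outage 2026")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(dir) // ends in .lapp/workspaces/checkout-outage-2026
}
```

Taking `home` as a parameter keeps the resolution logic testable without touching the real home directory.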

* feat: add workspace list command and show available workspaces on errors

Add `workspace list` subcommand to enumerate existing workspaces.
Include available workspace names in error messages when a workspace
is not found, helping users discover valid --topic values.

* feat: integrate ACP providers for workspace analyze

* refactor: remove Gemini provider, update eino-acp to 829a6c3

Drop Gemini ACP support, keeping only Claude and Codex providers.
Update eino-acp dependency and use its command builders instead of
hardcoded command slices.

---------

Co-authored-by: bdclaw2026 <262853276+bdclaw2026@users.noreply.github.com>

Matches Loki's default. Higher depth means the prefix tree routes more
precisely, reducing the number of candidate clusters that need similarity
comparison at leaf nodes. This improves both accuracy and performance for
pattern detection.

* feat: define v1 event schema fixtures

* fix: make event schema protobuf canonical

---------

Co-authored-by: bdclaw2026 <262853276+bdclaw2026@users.noreply.github.com>

bdclaw2026 added the symphony label (Managed by Symphony workflow) on Mar 11, 2026