
Cascade

AI-native terminal agent for GCP data engineering. Think Claude Code, but for BigQuery, Airflow, dbt, and your entire GCP data platform.

What It Does

Cascade is a conversational CLI that understands your data warehouse schema, pipeline dependencies, and cost profile. Ask questions in natural language, run queries with cost awareness, explore schemas, and diagnose issues — all from the terminal.

Status

Pre-alpha — actively developed. Cascade is usable today for BigQuery workflows, Cloud Logging, GCS, and platform intelligence (/morning briefing). Multi-provider support (Gemini, OpenAI, Anthropic) is live.

What's working

| Component | Status | Notes |
| --- | --- | --- |
| Conversational TUI | Done | Streaming, markdown, trackpad scroll, sweep glow spinner |
| Core tools | Done | read, write, edit, glob, grep, bash |
| Permission engine | Done | Risk classification, approval modal, 3 modes |
| BigQuery query | Done | Execute SQL, dry-run cost estimation, cost guards |
| BigQuery schema | Done | Schema cache (SQLite + FTS5), explore, search, context injection |
| Cost tracking | Done | Per-query cost, session totals, budget warnings, /cost |
| Cost intelligence | Done | /insights dashboard, INFORMATION_SCHEMA analysis, inline sparklines + bar charts |
| Billing export | Done | Cross-project billing queries, auto-discovers export table |
| Multi-project cache | Done | Unified SQLite cache across projects, no dataset collisions |
| SQL optimization hints | Done | Partition filters, clustering keys, expensive JOINs |
| Context compaction | Done | Auto at 80%, /compact manual trigger |
| One-shot mode | Done | cascade -p "..." for scripting |
| Gemini provider | Done | API key and Vertex AI |
| OpenAI provider | Done | GPT-4o, GPT-4 Turbo, o3-mini |
| Anthropic provider | Done | Claude Sonnet, Opus, Haiku |
| Interactive model picker | Done | /model with arrow key selection |
| Turn summaries | Done | Elapsed time + token counts persist after each turn |
| Cloud Logging | Done | Query/tail logs, severity coloring, /logs command |
| Cloud Storage | Done | Browse buckets, list objects, read files (capped), metadata |
| Platform Intelligence | Done | /morning briefing, cross-service signal correlation, CASCADE.md project config |
| CASCADE.md | Done | Per-project config: critical tables, refresh schedules, alert thresholds |
| Session persistence | Done | SQLite-backed, auto-save, --resume, /sessions |
| Tool timeout | Done | Configurable per-tool timeout via agent.tool_timeout |
| Color-blind bullets | Done | Shape-differentiated glyphs: ○ read, ◇ write, ● exec, △ query, □ data |

Roadmap

| Phase | Description | Status |
| --- | --- | --- |
| Cloud Composer | Airflow DAG inspection, task logs, trigger runs | Planned |
| Cost recommendations | "This table hasn't been queried in 90 days" — automated insights | Planned |
| dbt integration | Model lineage, run/test commands, source freshness | Planned |
| Schema autocomplete | Tab completion for table/column names | Planned |

Getting Started

Prerequisites

  • Go 1.26+
  • GCP credentials (for BigQuery and other GCP tools):
    gcloud auth application-default login
  • LLM provider — one of:
    • Gemini API key: export GOOGLE_API_KEY="your-key" (cheapest, recommended)
    • Vertex AI: uses your GCP credentials automatically
    • OpenAI: export OPENAI_API_KEY="sk-..." (requires API credits from platform.openai.com)
    • Anthropic: export ANTHROPIC_API_KEY="sk-ant-..." (requires API credits from console.anthropic.com)

Install

# From source
git clone https://github.com/slokam-ai/cascade.git
cd cascade
make build

# Or install directly
go install github.com/slokam-ai/cascade/cmd/cascade@latest

Cascade is pure Go — no C toolchain required. The Makefile pins CGO_ENABLED=0, so make build produces a single static binary that runs on darwin / linux / windows without gcc, and make test exercises the same pure-Go build. Run make verify-pure-go to cross-compile for every supported OS — useful before tagging a release. If a future change accidentally pulls in a CGO-only dependency, the build fails loudly rather than silently producing a non-portable binary.

Run

# Interactive mode
./bin/cascade

# One-shot mode
./bin/cascade -p "show me the largest tables in my project"

Configure

Create ~/.cascade/config.toml:

# LLM provider
[model]
provider = "gemini_api"  # gemini_api | vertex | openai | anthropic
model = "gemini-3-flash-preview"

# GCP platform access
[gcp]
project = "my-gcp-project"

[gcp.auth]
mode = "adc"  # adc | impersonation | service_account_key

# BigQuery
[bigquery]
datasets = ["my_dataset", "analytics"]  # Datasets to cache for schema-aware queries

# Cost controls
[cost]
warn_threshold = 1.0     # Dollar amount to prompt confirmation
max_query_cost = 10.0    # Dollar amount to block query
daily_budget_usd = 100.0 # Session budget warning at 80%
billing_project = ""     # Project with billing export (optional, cross-project OK)
billing_dataset = ""     # Billing export dataset name (optional)

# Permission mode
[security]
default_mode = "ask"  # ask | read-only | full-access

# DuckDB engine — all optional. The local-only paths
# (duckdb_query / duckdb_schema, bq_to_duckdb mode='local') work with
# zero config. Configure these when you want bq_to_duckdb mode='gcs'
# or want to tune the volume gate for your machine.
[duckdb]
staging_bucket = ""        # GCS bucket (no gs:// prefix) for bq_to_duckdb mode='gcs'
keep_session_db = false    # true = retain ~/.cascade/duckdb/<id>.db on exit

[duckdb.volume_gate]
warn_bytes            = 1073741824     # 1 GiB — informational warning
hard_stop_bytes       = 53687091200    # 50 GiB — refuse mode='gcs' unless force=true
local_hard_stop_bytes = 5368709120     # 5 GiB — refuse mode='local' unless force=true

Without a config file, Cascade auto-detects: GOOGLE_API_KEY for the LLM, ADC for GCP tools.

Features

Core

  • Streaming conversational TUI (Bubble Tea v2 + Lip Gloss v2 + Glamour v2)
  • Tool system: read_file, write_file, edit_file, glob, grep, bash
  • Policy-first permission engine with risk classification
  • Approval modal: allow once, allow tool for session, deny
  • Session context compaction at 80% window usage (/compact)
  • One-shot mode for scripting (cascade -p "...")
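The 80% compaction trigger above can be sketched as a simple threshold check. This is illustrative only — the exact token accounting Cascade uses is an assumption:

```go
// Sketch of an "auto-compact at 80% of the context window" trigger.
// The function name and integer-math formulation are assumptions,
// not Cascade's actual implementation.
package main

import "fmt"

// shouldCompact reports whether context usage has reached 80% of the
// model's window. Integer math avoids float comparison edge cases.
func shouldCompact(usedTokens, windowTokens int) bool {
	return usedTokens*5 >= windowTokens*4 // used/window >= 0.8
}

func main() {
	fmt.Println(shouldCompact(79_999, 100_000)) // just under the threshold
	fmt.Println(shouldCompact(80_000, 100_000)) // exactly at the threshold
}
```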

BigQuery

  • bigquery_query — Execute SQL with automatic dry-run cost estimation; dry_run=true for cost-only
  • bigquery_schema — Explore schemas: list datasets, tables, describe columns, FTS5 search
  • Multi-project schema cache (unified SQLite + FTS5) — datasets from multiple GCP projects in one index
  • Schema-aware context injection with fully-qualified project.dataset.table references
  • SQL optimization hints: missing partition filters, unused clustering keys, expensive JOINs
  • Cost intelligence: INFORMATION_SCHEMA analysis for query costs, storage, slot utilization
  • Billing export support: cross-project queries with auto-discovery of export table
  • Inline terminal charts: sparklines (▁▂▃▄▅▆▇█) and horizontal bar charts in ocean blue
  • /insights — One-command cost health dashboard (query trend, top queries, storage, slots)
  • /cost — Styled session cost breakdown
  • /sync [dataset] — Refresh schema cache (syncs all configured projects)
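The inline sparklines can be rendered by bucketing each value into one of the eight block glyphs. A minimal sketch — this is not Cascade's actual chart code, just the general technique:

```go
// Illustrative sparkline renderer: scale each value to the series'
// min..max range and pick one of eight block glyphs.
package main

import "fmt"

func sparkline(vals []float64) string {
	if len(vals) == 0 {
		return ""
	}
	ticks := []rune("▁▂▃▄▅▆▇█")
	lo, hi := vals[0], vals[0]
	for _, v := range vals {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	out := make([]rune, len(vals))
	for i, v := range vals {
		idx := 0
		if hi > lo {
			// Multiply before dividing so small-integer inputs stay exact.
			idx = int((v - lo) * float64(len(ticks)-1) / (hi - lo))
		}
		if idx >= len(ticks) {
			idx = len(ticks) - 1
		}
		out[i] = ticks[idx]
	}
	return string(out)
}

func main() {
	fmt.Println(sparkline([]float64{1, 2, 4, 8, 4, 2, 1}))
}
```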

DuckDB (zero per-scan cost)

  • duckdb_query — Run SQL against the per-session DB or any .duckdb file (database='/path/to/your.duckdb'); also queries gs://*.parquet directly via httpfs with bearer auth
  • duckdb_schema — List tables, describe columns, sample rows; works on the session DB or external files
  • bq_to_duckdb — Pull a BQ slice into a local table. Modes: local (small, no staging needed), gcs (large, EXPORT-to-Parquet via staging bucket), auto (picks based on size + config)
  • Volume gate guards against accidental terabyte pulls; honest about the BQ-scan-vs-local-disk distinction
  • Pure-Go invariant intact: subprocess wrapper around the duckdb CLI, no CGO
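The volume gate's behavior can be pictured as a three-way decision over the thresholds from the config example. The function below is a hypothetical sketch, not Cascade's actual logic:

```go
// Hypothetical volume-gate decision. Byte thresholds mirror the
// [duckdb.volume_gate] config example; the function itself is illustrative.
package main

import "fmt"

const (
	warnBytes          int64 = 1 << 30  // 1 GiB — informational warning
	hardStopBytes      int64 = 50 << 30 // 50 GiB — cap for mode='gcs'
	localHardStopBytes int64 = 5 << 30  // 5 GiB — cap for mode='local'
)

// gate returns "ok", "warn", or "refuse" for an estimated pull size.
// force=true bypasses the hard stop but still surfaces the warning.
func gate(estBytes int64, mode string, force bool) string {
	limit := hardStopBytes
	if mode == "local" {
		limit = localHardStopBytes
	}
	switch {
	case estBytes >= limit && !force:
		return "refuse"
	case estBytes >= warnBytes:
		return "warn"
	default:
		return "ok"
	}
}

func main() {
	fmt.Println(gate(200<<20, "local", false)) // 200 MiB
	fmt.Println(gate(6<<30, "local", false))   // over the local cap
	fmt.Println(gate(6<<30, "gcs", false))     // fine for gcs, but warned
}
```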

Cloud Logging

  • cloud_logging — Query and tail GCP log entries with filter syntax
  • Severity coloring: DEBUG (dim) → INFO (blue) → WARNING (amber) → ERROR (red) → CRITICAL (bright red)
  • Smart message extraction from proto/JSON payloads
  • /logs [severity] [duration] — Quick access to recent logs (default: WARNING, 1h)

Cloud Storage

  • gcs — Browse buckets, list objects, read files, inspect metadata
  • Directory-style browsing with prefix + delimiter
  • File reading capped at 100 lines (text files only, binary detection)
  • Styled output with line numbers for file content
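A common binary-sniffing heuristic for "text files only" reads is to look for a NUL byte in the first few kilobytes. Whether Cascade uses exactly this check is an assumption; the sketch shows the general idea:

```go
// Illustrative binary detection: a NUL byte in the first ~8 KB marks
// the content as binary. Not necessarily Cascade's actual heuristic.
package main

import (
	"bytes"
	"fmt"
)

func looksBinary(content []byte) bool {
	probe := content
	if len(probe) > 8000 {
		probe = probe[:8000]
	}
	return bytes.IndexByte(probe, 0) >= 0
}

func main() {
	fmt.Println(looksBinary([]byte("hello, text")))         // plain text
	fmt.Println(looksBinary([]byte{0x89, 'P', 'N', 'G', 0})) // PNG-like header
}
```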

Platform Intelligence

  • /morning — Morning briefing: "3 things need attention before standup"
  • Cross-service signal collection: BQ job failures, Cloud Logging errors, GCS freshness, schema staleness
  • Union-find correlation groups related signals into incidents (e.g., failed job + stale table = one incident)
  • Signals ranked by severity → blast radius → recency
  • Graceful degradation: each source independently optional
  • CASCADE.md — Per-project config (like CLAUDE.md) defining critical tables, expected refresh schedules, alert thresholds
  • Platform debugging playbook teaches the LLM cross-service investigation patterns
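The union-find correlation works like classic disjoint-set grouping: any two signals joined by a correlation edge end up in the same incident. A minimal sketch with hypothetical signal names (not Cascade's actual code):

```go
// Disjoint-set (union-find) grouping of platform signals into incidents.
// Signal names and edges are hypothetical examples.
package main

import "fmt"

type dsu struct{ parent []int }

func newDSU(n int) *dsu {
	p := make([]int, n)
	for i := range p {
		p[i] = i
	}
	return &dsu{parent: p}
}

func (d *dsu) find(x int) int {
	for d.parent[x] != x {
		d.parent[x] = d.parent[d.parent[x]] // path halving
		x = d.parent[x]
	}
	return x
}

func (d *dsu) union(a, b int) { d.parent[d.find(a)] = d.find(b) }

func main() {
	signals := []string{
		"bq job orders_refresh failed",
		"table orders is stale",
		"gcs drop file missing",
		"unrelated log error",
	}
	d := newDSU(len(signals))
	d.union(0, 1) // the failed job refreshes the stale table
	d.union(1, 2) // the job reads the missing GCS drop
	incidents := map[int][]string{}
	for i, s := range signals {
		root := d.find(i)
		incidents[root] = append(incidents[root], s)
	}
	fmt.Println(len(incidents)) // one correlated incident + the lone error
}
```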

Auth

  • Two independent auth planes: GCP resources + LLM provider
  • GCP: ADC, service account impersonation, or key file
  • LLM: Vertex AI (reuses GCP auth), Gemini API key, OpenAI API key, Anthropic API key
  • Auto-detection: checks env vars in order (GOOGLE_API_KEY → ANTHROPIC_API_KEY → OPENAI_API_KEY → Vertex AI)
  • Note: consumer subscriptions (ChatGPT Pro, Claude Max) cannot be used — separate API keys required
  • Startup report shows what's available
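The auto-detection order above amounts to a first-match scan over environment variables. A sketch — the provider names reuse the config example's `provider` values, but the function itself is illustrative:

```go
// Illustrative provider auto-detection: first env var present wins,
// falling back to Vertex AI (GCP credentials). Not Cascade's actual code.
package main

import (
	"fmt"
	"os"
)

func detectProvider(getenv func(string) string) string {
	order := []struct{ env, provider string }{
		{"GOOGLE_API_KEY", "gemini_api"},
		{"ANTHROPIC_API_KEY", "anthropic"},
		{"OPENAI_API_KEY", "openai"},
	}
	for _, c := range order {
		if getenv(c.env) != "" {
			return c.provider
		}
	}
	return "vertex" // reuse GCP credentials
}

func main() {
	fmt.Println(detectProvider(os.Getenv))
}
```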

UX

  • Ocean blue cascade branding with animated tilde spinner
  • Sweep glow text effect and per-turn elapsed timer with token counts
  • Welcome screen with connection dashboard (project, datasets, mode)
  • Human-friendly model names in status bar (e.g., "Gemini 3 (Flash)")
  • Interactive model picker (/model) with arrow key navigation
  • Custom markdown theme with borderless tables and alternating row dimming
  • Trackpad scroll support
  • Session persistence: auto-saves to SQLite, resume with --resume or --session <id>
  • cascade sessions — list saved sessions, cascade sessions rm <id> — delete
  • Configurable tool timeout (agent.tool_timeout in config.toml, default 120s)
  • Color-blind accessible tool bullets: shape encodes action category alongside color
  • Shell escape: ! <command> runs a shell command inline and shows output in chat
  • Slash commands: /help, /model, /compact, /sync, /cost, /insights, /logs, /morning, /sessions, /save, /reload
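The tool timeout above lives under agent in config.toml. The exact value format isn't shown elsewhere in this README, so treat this fragment as an assumed shape:

```toml
# Assumed shape for the per-tool timeout (key path from the list above;
# the duration syntax is a guess — adjust to what your build accepts).
[agent]
tool_timeout = "120s"  # default noted above is 120s
```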

Architecture

graph TD
    User([User]) --> TUI

    subgraph Cascade
        TUI[Bubble Tea TUI<br/><i>chat, input, status, confirm</i>]
        TUI --> Agent[Agent Loop<br/><i>observe - reason - tool - execute</i>]
        Agent --> Permissions[Permission Engine<br/><i>risk classification, approval modal</i>]

        Agent --> Tools

        subgraph Tools
            Core[Core Tools<br/><i>read, write, edit, glob, grep, bash</i>]
            BQTools[BigQuery Tools<br/><i>query, schema, insights</i>]
            PlatformTools[Platform Tools<br/><i>cloud_logging, gcs</i>]
        end

        Morning[Platform Intelligence<br/><i>/morning briefing, signal correlation</i>]
        Morning --> BQTools
        Morning --> PlatformTools
        Morning --> SchemaCache

        subgraph Auth[Auth Resolvers]
            Resource[Resource Plane<br/><i>ADC / impersonation / SA key</i>]
            Model[Model Plane<br/><i>vertex / gemini_api / openai / anthropic</i>]
        end

        Agent --> Model
        BQTools --> Resource
        PlatformTools --> Resource
        BQTools --> SchemaCache[Schema Cache<br/><i>SQLite + FTS5, multi-project</i>]
        BQTools --> BillingExport[Billing Export<br/><i>cross-project cost data</i>]
    end

    Model --> LLM[LLM Provider<br/><i>Gemini, Claude, GPT, ...</i>]
    Resource --> GCP[GCP APIs<br/><i>BigQuery, GCS, Logging, Composer</i>]
    SchemaCache --> GCP
    BillingExport --> GCP

    style Morning fill:#1e3a5f,stroke:#38BDF8,color:#F3F4F6
    style TUI fill:#1e3a5f,stroke:#6B9FFF,color:#F3F4F6
    style Agent fill:#1e3a5f,stroke:#6B9FFF,color:#F3F4F6
    style Permissions fill:#2d2235,stroke:#818CF8,color:#F3F4F6
    style Core fill:#1a2e1a,stroke:#34D399,color:#F3F4F6
    style BQTools fill:#1a2e1a,stroke:#34D399,color:#F3F4F6
    style PlatformTools fill:#1a2e1a,stroke:#34D399,color:#F3F4F6
    style Resource fill:#2a2510,stroke:#FBBF24,color:#F3F4F6
    style Model fill:#2a2510,stroke:#FBBF24,color:#F3F4F6
    style SchemaCache fill:#1a2e1a,stroke:#34D399,color:#F3F4F6
    style BillingExport fill:#2a2510,stroke:#FBBF24,color:#F3F4F6
    style LLM fill:#0d1117,stroke:#4B5563,color:#9CA3AF
    style GCP fill:#0d1117,stroke:#4B5563,color:#9CA3AF
    style User fill:#0d1117,stroke:#6B9FFF,color:#F3F4F6

Development

make build      # Build binary to bin/cascade
make test       # Run all tests with race detector
make test-short # Run unit tests only
make lint       # Run golangci-lint (falls back to go vet)
make clean      # Remove build artifacts

License

MIT
