Mondrian is the open-source OLAP engine that powers analytics dashboards by translating MDX queries into SQL against a relational warehouse. This Spicule fork keeps the original Mondrian semantics intact and adds an Apache Calcite-based SQL generation layer alongside the legacy hand-rolled SQL builder.
Forked from OSBI/mondrian (itself a fork of pentaho/mondrian). Full pre-fork history lives upstream.
License: Eclipse Public License v1.0 — see LICENSE.html.
Legacy Mondrian hand-codes SQL across ~30 dialect subclasses. Apache Calcite is a general-purpose SQL planner with 40+ dialects and modern optimiser machinery. Swapping Mondrian's emitter for Calcite unlocks:
- Automatic aggregate-table rewriting. Declared
<MeasureGroup type="aggregate">rollups get used without flipping Mondrian'sUseAggregates/ReadAggregatesglobals. ~10× wins on queries that match a declared aggregate. - Cleaner SQL on less-clever planners. HSQLDB benchmarks ~5× faster because Calcite emits ANSI joins + IN-lists that HSQLDB plans better than Mondrian's comma-join SQL.
- Parity on modern warehouses. On Postgres at 86.8M-row scale, Calcite-generated SQL and legacy SQL produce equivalent plans — same execution time, same results. No regression to ship this.
- Single-line dialect adds. New database backends (DuckDB, ClickHouse, Snowflake) already live in Calcite — wiring them is one dialect-mapping line, not a 300-line dialect subclass.
45-query MDX corpus, 1 warmup + 3 timed iterations, median reported.
| Workload | Calcite vs Legacy |
|---|---|
| HSQLDB (~87k rows) | 4.80× faster |
| Postgres (86.8M rows, 1000×) | 1.01× (parity) |
Declared aggregate hits (Postgres, UseAggregates=false) |
9.6× faster |
See docs/reports/perf-analysis-2026-04-22-postmerge.md for per-query numbers and methodology.
All flags are -D system properties; defaults are production-safe.
| Flag | Default | What it does |
|---|---|---|
mondrian.backend |
calcite |
Pick the SQL emitter. Set to legacy to route through the original Mondrian SQL builder — useful as a kill switch while shadow-evaluating the new path. |
mondrian.calcite.strict |
true |
When a Calcite translator gap is hit (an MDX shape the new translator doesn't cover yet), rethrow the failure instead of silently falling back to the legacy SQL builder. Set to false to revert to silent fallback for any deployment that still needs it. Strict mode is what gives the "Calcite path is the correctness path" guarantee — every silent passthrough that previously masked a result-divergence bug now surfaces in CI. |
mondrian.calcite.mvMatch |
true |
Hand-rolled MV matcher that rewrites segment-load requests onto declared aggregate tables when the query shape matches. Runs even when Mondrian's UseAggregates is off — this is the main "aggregates just work" win. |
mondrian.calcite.volcano |
true |
Use Calcite's VolcanoPlanner to apply a curated rule set during SQL generation. When false, falls back to the HepPlanner-only path. |
mondrian.calcite.calcConsume |
false |
Experimental. When on, arithmetic calc members (e.g. SUM(x) / SUM(y)) are emitted as SQL expressions in the SELECT list instead of being recomputed in Java. Off by default because the SegmentLoader path that consumes the computed column is still behind a profiler-required investigation — see "Known issues" below. |
mondrian.calcite.mvMaxSubsetSize |
4 |
Size cap on power-set enumeration of aggregate-table shapes. Each declared aggregate generates candidate shapes for the MV matcher; this bounds combinatorial blowup. |
mondrian.calcite.trace |
false |
Dumps per-request matcher + SQL capture to stderr. Use for debugging MV rewrites. Also emits [calcite-cache] and [calcite-ok]/[calcite-fallback] lines so you can confirm Calcite is engaging for a given DataSource. |
mondrian.foodmart.jdbcURL |
unset | Override the test JDBC URL. Used by the multi-DB validation; e.g. -Dmondrian.foodmart.jdbcURL=jdbc:duckdb:/tmp/foodmart.duckdb to run the FoodMart test suite against DuckDB instead of the default HSQLDB fixture. |
# Default (Calcite ON, strict ON):
mvn test
# equivalent:
mvn test -Dmondrian.backend=calcite -Dmondrian.calcite.strict=true
# Legacy SQL builder, no Calcite:
mvn test -Dmondrian.backend=legacy
# Calcite engaged but with old silent-fallback behaviour (not recommended;
# masks translator gaps as silent legacy execution):
mvn test -Dmondrian.calcite.strict=falseThe Calcite SQL backend works automatically against any database in the
list below. The "How" column describes whether Calcite's own
auto-detection from JDBC product-name strings handles the database, or
whether mondrian.calcite.CalciteDialectMap hand-curates an entry
because Calcite's pattern matcher misses the real driver's product name.
| Database | How resolved |
|---|---|
| HSQLDB | hand-curated (QUOTING_HSQLDB) |
| PostgreSQL | hand-curated (PostgresqlSqlDialect.DEFAULT) |
| DuckDB | hand-curated (QUOTING_DUCKDB) |
| Amazon Redshift | hand-curated (QUOTING_REDSHIFT) — auto-detect misses |
| Apache Hive | hand-curated (QUOTING_HIVE) — auto-detect misses |
| Trino | hand-curated (QUOTING_TRINO via PrestoSqlDialect) — auto-detect misses |
| Exasol | hand-curated (QUOTING_EXASOL) — auto-detect misses |
| Apache Spark | hand-curated (QUOTING_SPARK) — auto-detect misses ("Spark SQL") |
| Apache Phoenix | hand-curated (QUOTING_PHOENIX) — auto-detect misses ("Apache Phoenix") |
| LucidDB | hand-curated (QUOTING_LUCIDDB) — abandoned upstream; included for completeness |
| Google BigQuery | Calcite auto-detect |
| Snowflake | Calcite auto-detect |
| ClickHouse | Calcite auto-detect |
| Microsoft SQL Server | Calcite auto-detect |
| Oracle | Calcite auto-detect |
| MySQL | Calcite auto-detect |
| Presto | Calcite auto-detect |
| IBM DB2 | Calcite auto-detect |
| Apache Derby | Calcite auto-detect |
| H2 | Calcite auto-detect |
| Netezza | Calcite auto-detect |
| Sybase ASE | Calcite auto-detect |
| Teradata | Calcite auto-detect |
| Vertica | Calcite auto-detect |
| Firebird | Calcite auto-detect |
| Informix | Calcite auto-detect |
If neither the hand-curated map nor Calcite's auto-detect recognises the
JDBC product name, the planner cache returns null and queries silently
fall back to the legacy Mondrian SQL builder. A one-shot warning is
emitted to both LOGGER.warn and System.err with the unrecognised
product name and a pointer to CalciteDialectMap.forProductNameOrNull,
so the silent skip is at least observable. Multi-DB Calcite features
(strict mode, structured translator coverage, portable LIMIT n emission,
MV rewrite) are NOT active for unrecognised backends — please file an
issue with the JDBC product-name string so we can add a hand-curated entry.
To check whether Calcite is engaging for your DataSource, run with
-Dmondrian.calcite.trace=true and look for a line like:
[calcite-cache] dialect=mondrian.calcite.CalciteDialectMap$3 for ...
A dialect=null line means the backend was silently skipped.
# Default — Calcite backend, matcher on:
mvn test -Dmondrian.backend=calcite
# Compare against legacy:
mvn test -Dmondrian.backend=legacy
# Profile with matcher tracing:
mvn test -Dmondrian.backend=calcite -Dmondrian.calcite.trace=trueMondrian schemas declare aggregate tables via <MeasureGroup type="aggregate">. Each such MG is described by:
- Its base fact table (e.g.
sales_fact_1997) - A list of copy-linked columns (denormalised columns copied into the agg table, e.g.
the_year,product_family) - A list of foreign-key-linked dimensions reachable via
ForeignKeyLink
On schema load, MvRegistry.fromSchema enumerates power-set subsets of the copy-linked columns (up to mvMaxSubsetSize), adds hand-curated FK-reachable shapes, dedupes overlapping shapes (smaller row count wins), and builds a shape catalog.
At query time, MvMatcher.match compares every PlannerRequest (a segment-load's {groupBy, measures}) against the catalog via GroupColKey O(1) lookup. On match, the request is rewritten to scan the aggregate instead of the fact table — typically a 1000× row reduction on 1000×-scale FoodMart.
This runs independently of Mondrian's RolapGalaxy.findAgg code path — no configuration flip required.
mvn clean package -DskipTests # build the jar
mvn test # run unit tests (excludes slow harnesses)
mvn -Pcalcite-harness test # run the Calcite equivalence harnessCustom dependencies (eigenbase-, olap4j-spicule-, mondrian-data-foodmart-hsql) that aren't on Maven Central are vendored in-tree at lib/repo/ and resolved via a file:// repository defined in pom.xml. No external repo configuration needed.
CI: GitHub Actions builds on every push to main and publishes a JAR to GitHub Packages. See .github/workflows/build.yml.
Calc pushdown SegmentLoader consume path (commit 439438c). The SegmentLoader widening that would let the SQL-computed calc value replace the Java-side calc-member evaluation caused an unexplained 2× regression across non-calc Postgres queries (slicer-where, time-fn, topcount). Empirical bisect isolated it to that one commit; static code reading doesn't explain it. The commit lives on branch calcite-calc-pushdown-sql pending a profiler session. mondrian.calcite.calcConsume stays off by default until this is resolved.
Calcite's own MaterializedViewRule doesn't fire on Mondrian-generated plans. The cost-based MV selection infrastructure is wired but the rule engine's internal substitution logic requires exact structural matching that Mondrian's generated plans don't satisfy. The hand-rolled MvMatcher covers the practical cases; Volcano-driven MV selection is future work (see docs/plans/2026-04-21-volcano-mv-selection.md).
- Blog (short): "What We Learned Replacing a 20-Year-Old SQL Engine With Apache Calcite"
- Engineering report:
docs/reports/perf-analysis-2026-04-22-postmerge.md - Design docs:
docs/plans/(calcite-backend-rewrite-design, MV matcher, shape catalog, calc pushdown) - Upstream Mondrian docs:
doc/install.html, mondrian.pentaho.com (legacy)