
Commit 92377f8

sql cells, backed by DuckDB (#844)
* sql cells
* registerTable
* prettier
* destructuring assignment
* register sql files
* sql → @observablehq/duckdb
* incremental sql update
* test sql + data loader
* docs; table display
* more docs; better display
* echo
* fix tests, again
* remove console
* id="[{min, max}]"
* more docs
1 parent d6311b5 commit 92377f8

26 files changed: +431 -93 lines changed

docs/display-race.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+# Display race
+
+```js echo
+async function sleep(ms) {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+```
+
+```js echo
+const value = (function* () {
+  yield 2000;
+  yield 1000;
+})();
+```
+
+```js echo
+await sleep(value);
+display(value);
+```

docs/sql.md

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
+---
+sql:
+  gaia: ./lib/gaia-sample.parquet
+---
+
+# SQL
+
+Observable Framework includes built-in support for client-side SQL powered by [DuckDB](./lib/duckdb). You can use SQL to query data from [CSV](./lib/csv), [TSV](./lib/csv), [JSON](./javascript/files#json), [Apache Arrow](./lib/arrow), and [Apache Parquet](./lib/arrow#apache-parquet) files, which can either be static or generated by [data loaders](./loaders).
+
+To use SQL, first register the desired tables in the page’s [front matter](./markdown#front-matter) using the **sql** option. Each key is a table name, and each value is the path to the corresponding data file. For example, to register a table named `gaia` from a Parquet file:
+
+```yaml
+---
+sql:
+  gaia: ./lib/gaia-sample.parquet
+---
+```
+
+## SQL code blocks
+
+To run SQL queries, create a SQL fenced code block (<code>```sql</code>). For example, to query the first 10 rows from the `gaia` table:
+
+````md
+```sql
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
+```
+````
+
+This produces a table:
+
+```sql
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
+```
+
+To refer to the results of a query in JavaScript, use the `id` directive. For example, to refer to the results of the previous query as `top10`:
+
+````md
+```sql id=top10
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
+```
+````
+
+```sql id=top10
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
+```
+
+This returns an array of 10 rows, inspected here:
+
+```js echo
+top10
+```
+
+When a SQL code block uses the `id` directive, the results are not displayed by default. You can display them by adding the `display` directive, which produces the table shown above.
+
+````md
+```sql id=top10 display
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
+```
+````
+
+The `id` directive is often a simple identifier such as `top10` above, but it supports [destructuring assignment](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment), so you can refer to individual rows and columns using array and object patterns. For example, to pull out the top row:
+
+````md
+```sql id=[top]
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 1
+```
+````
+
+```sql id=[top]
+SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 1
+```
+
+```js echo
+top
+```
+
+Or to pull out the minimum value of the `phot_g_mean_mag` column:
+
+````md
+```sql id=[{min}]
+SELECT MIN(phot_g_mean_mag) AS min FROM gaia
+```
+````
+
+```sql id=[{min}]
+SELECT MIN(phot_g_mean_mag) AS min FROM gaia
+```
+
+```js echo
+min
+```
+
+<div class="tip">
+
+For complex destructuring patterns, you may need to quote the `id` directive. For example, to pull out the column named `min(phot_g_mean_mag)` to the variable named `min`, say <code style="white-space: nowrap;">id="[{'min(phot_g_mean_mag)': min}]"</code>. Or to pull out the `min` and `max` columns, say <code style="white-space: nowrap;">id="[{min, max}]"</code>.
+
+</div>
+
+For dynamic or interactive queries that respond to user input, you can interpolate values into SQL queries using inline expressions `${…}`. For example, to show the stars around a given brightness:
+
+```js echo
+const mag = view(Inputs.range([6, 20], {label: "Magnitude"}));
+```
+
+```sql echo
+SELECT * FROM gaia WHERE phot_g_mean_mag BETWEEN ${mag - 0.1} AND ${mag + 0.1};
+```
+
+The value of a SQL code block is an [Apache Arrow](./lib/arrow) table. This format is supported by [Observable Plot](./lib/plot), so you can use SQL and Plot together to visualize data. For example, below we count the number of stars in each 2°×2° bin of the sky (where `ra` is [right ascension](https://en.wikipedia.org/wiki/Right_ascension) and `dec` is [declination](https://en.wikipedia.org/wiki/Declination), representing a point on the celestial sphere in the equatorial coordinate system), and then visualize the resulting heatmap using a [raster mark](https://observablehq.com/plot/marks/raster).
+
+```sql id=bins echo
+SELECT
+  floor(ra / 2) * 2 + 1 AS ra,
+  floor(dec / 2) * 2 + 1 AS dec,
+  count() AS count
+FROM
+  gaia
+GROUP BY
+  1,
+  2
+```
+
+```js echo
+Plot.plot({
+  aspectRatio: 1,
+  x: {domain: [0, 360]},
+  y: {domain: [-90, 90]},
+  marks: [
+    Plot.frame({fill: 0}),
+    Plot.raster(bins, {
+      x: "ra",
+      y: "dec",
+      fill: "count",
+      width: 360 / 2,
+      height: 180 / 2,
+      imageRendering: "pixelated"
+    })
+  ]
+})
+```
+
+## SQL literals
+
+SQL fenced code blocks are shorthand for the `sql` tagged template literal. You can invoke the `sql` tagged template literal directly like so:
+
+```js echo
+const rows = await sql`SELECT random() AS random`;
+```
+
+```js echo
+rows[0].random
+```
+
+The `sql` tagged template literal is available by default in Markdown, but you can also import it explicitly as:
+
+```js echo
+import {sql} from "npm:@observablehq/duckdb";
+```
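
Reviewer note: the docs above describe `${…}` interpolation and the `sql` tagged template literal, and the `sql` function added in `src/client/stdlib/duckdb.js` in this commit joins the literal pieces with `"?"` and passes the interpolated values separately. The following is a minimal sketch of that mechanism in plain JavaScript. The helper name `toParameterizedQuery` and the fixed `mag` value are assumptions made for illustration; nothing here touches DuckDB.

```javascript
// Hypothetical helper (not part of the commit): shows what a tag function
// receives and how joining the literal pieces with "?" yields placeholders.
function toParameterizedQuery(strings, ...args) {
  return {text: strings.join("?"), params: args};
}

const mag = 10; // stand-in for the value of the range input
const query = toParameterizedQuery`SELECT * FROM gaia WHERE phot_g_mean_mag BETWEEN ${mag - 0.5} AND ${mag + 0.5}`;

console.log(query.text); // the query text with one "?" per interpolated value
console.log(query.params); // the interpolated values, in order: 9.5 and 10.5
```

Because the values travel separately from the query text, they are never spliced into the SQL string itself, which is presumably why the docs can recommend interpolating user input directly.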

observablehq.config.ts

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ export default {
     {name: "Markdown", path: "/markdown"},
     {name: "JavaScript", path: "/javascript"},
     {name: "Data loaders", path: "/loaders"},
+    {name: "SQL", path: "/sql"},
     {name: "Themes", path: "/themes"},
     {name: "Configuration", path: "/config"},
     {

src/build.ts

Lines changed: 1 addition & 1 deletion
@@ -112,7 +112,7 @@ export async function build(
     effects.output.write(`${faint("build")} ${clientPath} ${faint("→")} `);
     const define: {[key: string]: string} = {};
     if (config.search) define["global.__minisearch"] = JSON.stringify(relativePath(path, aliases.get("/_observablehq/minisearch.json")!)); // prettier-ignore
-    const contents = await rollupClient(clientPath, root, path, {minify: true, define});
+    const contents = await rollupClient(clientPath, root, path, {minify: true, keepNames: true, define});
     await effects.writeFile(path, contents);
   }
 }

src/client/preview.js

Lines changed: 25 additions & 15 deletions
@@ -1,5 +1,6 @@
-import {registerFile} from "npm:@observablehq/stdlib";
-import {undefine} from "./main.js";
+import {registerTable} from "npm:@observablehq/duckdb";
+import {FileAttachment, registerFile} from "npm:@observablehq/stdlib";
+import {main, undefine} from "./main.js";
 import {enableCopyButtons} from "./pre.js";

 export * from "./index.js";
@@ -26,16 +27,16 @@ export function open({hash, eval: compile} = {}) {
       }
       case "update": {
         const root = document.querySelector("main");
-        if (message.previousHash !== hash) {
+        if (message.hash.previous !== hash) {
           console.log("contents out of sync");
           location.reload();
           break;
         }
-        hash = message.updatedHash;
+        hash = message.hash.current;
         let offset = 0;
         const addedCells = new Map();
         const removedCells = new Map();
-        for (const {type, oldPos, items} of message.diffHtml) {
+        for (const {type, oldPos, items} of message.html) {
           switch (type) {
             case "add": {
               for (const item of items) {
@@ -71,34 +72,43 @@ export function open({hash, eval: compile} = {}) {
         for (const [id, removed] of removedCells) {
           addedCells.get(id)?.replaceWith(removed);
         }
-        for (const id of message.diffCode.removed) {
+        for (const id of message.code.removed) {
           undefine(id);
         }
-        for (const body of message.diffCode.added) {
+        for (const body of message.code.added) {
           compile(body);
         }
-        for (const name of message.diffFiles.removed) {
+        for (const name of message.files.removed) {
           registerFile(name, null);
         }
-        for (const file of message.diffFiles.added) {
+        for (const file of message.files.added) {
           registerFile(file.name, file);
         }
-        const {addedStylesheets, removedStylesheets} = message;
-        if (addedStylesheets.length === 1 && removedStylesheets.length === 1) {
-          const [newHref] = addedStylesheets;
-          const [oldHref] = removedStylesheets;
+        for (const name of message.tables.removed) {
+          registerTable(name, null);
+        }
+        for (const table of message.tables.added) {
+          registerTable(table.name, FileAttachment(table.path));
+        }
+        if (message.tables.removed.length || message.tables.added.length) {
+          const sql = main._resolve("sql");
+          sql.define(sql._promise); // re-evaluate sql code
+        }
+        if (message.stylesheets.added.length === 1 && message.stylesheets.removed.length === 1) {
+          const [newHref] = message.stylesheets.added;
+          const [oldHref] = message.stylesheets.removed;
           const link = document.head.querySelector(`link[rel="stylesheet"][href="${oldHref}"]`);
           link.href = newHref;
         } else {
-          for (const href of addedStylesheets) {
+          for (const href of message.stylesheets.added) {
             const link = document.createElement("link");
             link.rel = "stylesheet";
             link.type = "text/css";
             link.crossOrigin = "";
             link.href = href;
             document.head.appendChild(link);
           }
-          for (const href of removedStylesheets) {
+          for (const href of message.stylesheets.removed) {
             document.head.querySelector(`link[rel="stylesheet"][href="${href}"]`)?.remove();
           }
         }
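
Reviewer note: the `update` handler in this file applies table removals before additions and re-evaluates `sql` only when at least one of the two lists is non-empty. The sketch below models that ordering against a plain `Map` instead of DuckDB. The function name `applyTableDiff`, the `registry` map, and the sample paths are assumptions made for illustration; only the `{removed, added}` message shape and the `name`/`path` fields come from the diff.

```javascript
// Hypothetical stand-in for the handler's table loop: a Map plays the role
// of the table registry, so no DuckDB or FileAttachment is involved.
function applyTableDiff(registry, {removed, added}) {
  // Removals first, mirroring registerTable(name, null) in the handler.
  for (const name of removed) registry.delete(name);
  // Then additions, mirroring registerTable(table.name, FileAttachment(table.path)).
  for (const {name, path} of added) registry.set(name, path);
  // True when dependent sql cells would need to be re-evaluated.
  return removed.length > 0 || added.length > 0;
}

const registry = new Map([["gaia", "./lib/gaia-sample.parquet"]]);

// Swap one table for another (the "stars" table and its path are made up).
const changed = applyTableDiff(registry, {
  removed: ["gaia"],
  added: [{name: "stars", path: "./lib/stars.parquet"}]
});

// An empty diff leaves the registry alone and reports no change.
const unchanged = applyTableDiff(registry, {removed: [], added: []});
```

Processing removals first means that a table whose name appears in both lists ends up registered with its new source.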

src/client/stdlib/duckdb.js

Lines changed: 53 additions & 32 deletions
@@ -40,7 +40,28 @@ const bundle = await duckdb.selectBundle({
   }
 });

-const logger = new duckdb.ConsoleLogger();
+const logger = new duckdb.ConsoleLogger(duckdb.LogLevel.WARNING);
+
+let db;
+let inserts = [];
+const sources = new Map();
+
+export function registerTable(name, source) {
+  if (source == null) {
+    sources.delete(name);
+    db = DuckDBClient.of(); // drop existing tables and views before re-inserting
+    inserts = Array.from(sources, (i) => db.then((db) => insertSource(db._db, ...i)));
+  } else {
+    sources.set(name, source);
+    db ??= DuckDBClient.of(); // lazy instantiation
+    inserts.push(db.then((db) => insertSource(db._db, name, source)));
+  }
+}
+
+export async function sql(strings, ...args) {
+  await Promise.all(inserts);
+  return (await (db ??= DuckDBClient.of())).query(strings.join("?"), args);
+}

 export class DuckDBClient {
   constructor(db) {
@@ -139,37 +160,7 @@ export class DuckDBClient {
       config = {...config, query: {...config.query, castBigIntToDouble: true}};
     }
     await db.open(config);
-    await Promise.all(
-      Object.entries(sources).map(async ([name, source]) => {
-        source = await source;
-        if (isFileAttachment(source)) {
-          // bare file
-          await insertFile(db, name, source);
-        } else if (isArrowTable(source)) {
-          // bare arrow table
-          await insertArrowTable(db, name, source);
-        } else if (Array.isArray(source)) {
-          // bare array of objects
-          await insertArray(db, name, source);
-        } else if (isArqueroTable(source)) {
-          await insertArqueroTable(db, name, source);
-        } else if ("data" in source) {
-          // data + options
-          const {data, ...options} = source;
-          if (isArrowTable(data)) {
-            await insertArrowTable(db, name, data, options);
-          } else {
-            await insertArray(db, name, data, options);
-          }
-        } else if ("file" in source) {
-          // file + options
-          const {file, ...options} = source;
-          await insertFile(db, name, file, options);
-        } else {
-          throw new Error(`invalid source: ${source}`);
-        }
-      })
-    );
+    await Promise.all(Object.entries(sources).map(([name, source]) => insertSource(db, name, source)));
     return new DuckDBClient(db);
   }
 }
@@ -178,6 +169,36 @@ Object.defineProperty(DuckDBClient.prototype, "dialect", {
   value: "duckdb"
 });

+async function insertSource(database, name, source) {
+  source = await source;
+  if (isFileAttachment(source)) {
+    // bare file
+    await insertFile(database, name, source);
+  } else if (isArrowTable(source)) {
+    // bare arrow table
+    await insertArrowTable(database, name, source);
+  } else if (Array.isArray(source)) {
+    // bare array of objects
+    await insertArray(database, name, source);
+  } else if (isArqueroTable(source)) {
+    await insertArqueroTable(database, name, source);
+  } else if ("data" in source) {
+    // data + options
+    const {data, ...options} = source;
+    if (isArrowTable(data)) {
+      await insertArrowTable(database, name, data, options);
+    } else {
+      await insertArray(database, name, data, options);
+    }
+  } else if ("file" in source) {
+    // file + options
+    const {file, ...options} = source;
+    await insertFile(database, name, file, options);
+  } else {
+    throw new Error(`invalid source: ${source}`);
+  }
+}
+
 async function insertFile(database, name, file, options) {
   const url = await file.url();
   if (url.startsWith("blob:")) {
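
Reviewer note: `registerTable` in this file creates the database lazily on first registration and, on removal, discards it and re-inserts every remaining source into a fresh one. The sketch below is a simplified synchronous model of that lifecycle. `FakeDatabase` and the numeric sources are assumptions made for illustration; the real code works with promises and `DuckDBClient.of()`.

```javascript
// Hypothetical in-memory database used only to make the lifecycle visible.
class FakeDatabase {
  constructor() {
    this.tables = new Map();
  }
  insert(name, source) {
    this.tables.set(name, source);
  }
}

let db; // created on first registration
const sources = new Map(); // table name → source, the record of what is registered

function registerTable(name, source) {
  if (source == null) {
    sources.delete(name);
    db = new FakeDatabase(); // start from an empty database...
    for (const [n, s] of sources) db.insert(n, s); // ...and re-insert what remains
  } else {
    sources.set(name, source);
    db ??= new FakeDatabase(); // lazy instantiation
    db.insert(name, source);
  }
}

registerTable("a", 1);
registerTable("b", 2);
const before = db;
registerTable("a", null); // removal replaces the database rather than dropping one table
const after = db;
```

Rebuilding on removal trades some repeated work for simplicity: there is no need to track which tables or views a source created in order to drop them individually.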

src/client/stdlib/recommendedLibraries.js

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ export const L = () => import("npm:leaflet");
 export const mapboxgl = () => import("npm:mapbox-gl").then((module) => module.default);
 export const mermaid = () => import("observablehq:stdlib/mermaid").then((mermaid) => mermaid.default);
 export const Plot = () => import("npm:@observablehq/plot");
+export const sql = () => import("observablehq:stdlib/duckdb").then((duckdb) => duckdb.sql);
 export const SQLite = () => import("observablehq:stdlib/sqlite").then((sqlite) => sqlite.default);
 export const SQLiteDatabaseClient = () => import("observablehq:stdlib/sqlite").then((sqlite) => sqlite.SQLiteDatabaseClient); // prettier-ignore
 export const tex = () => import("observablehq:stdlib/tex").then((tex) => tex.default);
