@dt dt commented Nov 14, 2025

Previously, attempting to set temp_directory to an opfs:// path did not work.

Specifically, while 'auto' file handling scanned SQL statements for opfs:// paths to pre-open mentioned files via the async OPFS API and register them where the sync openFile call could find them, it only worked for specific files and not for a temp directory, i.e. a path in which openFile would later attempt to create individual files.

The mismatch between the sync openFile API and the async OPFS API presents a challenge here: when the DB decides to spill to a tempfile and attempts to create it, it does so via the openFile call, which is a sync call. Creating the OPFS file at that point is a problem, because the OPFS API for creating it is async.

To get around this, as of this change, registering an OPFS temp directory sets up a 'pool' of pre-created empty temp files, each with a sync access handle already open, which can then be handed out as needed to (sync) openFile calls. When closed, a file can be truncated and returned to the pool, or be deleted if the pool is full. When the pool runs low on pre-created files, new files can be opened -- async -- and added to it.
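The pool idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `TempFilePool`, `SyncHandle`, `acquire`, `release`, `refill`, the `pool_N.tmp` naming scheme, and the `minFree`/`maxFree` thresholds are all hypothetical, and the async `create`/`remove` callbacks stand in for whatever OPFS calls the real implementation makes.

```typescript
// Hypothetical subset of FileSystemSyncAccessHandle that the pool relies on.
interface SyncHandle {
  truncate(newSize: number): void;
  close(): void;
}

interface PooledFile<H extends SyncHandle> {
  poolName: string; // name of the pre-created file inside the temp directory
  handle: H;
}

class TempFilePool<H extends SyncHandle> {
  private free: PooledFile<H>[] = [];
  // Keyed by the name the DB asked for, which differs from the pool name.
  private inUse = new Map<string, PooledFile<H>>();
  private counter = 0;
  private pending: Promise<void> | null = null;
  private create: (poolName: string) => Promise<H>;
  private remove: (poolName: string) => Promise<void>;
  private minFree: number;
  private maxFree: number;

  constructor(
    create: (poolName: string) => Promise<H>, // async: create file + open handle
    remove: (poolName: string) => Promise<void>, // async: delete file
    minFree: number,
    maxFree: number,
  ) {
    this.create = create;
    this.remove = remove;
    this.minFree = minFree;
    this.maxFree = maxFree;
  }

  get freeCount(): number {
    return this.free.length;
  }

  // Async: top the pool up to maxFree. Concurrent callers share one run.
  refill(): Promise<void> {
    if (this.pending !== null) return this.pending;
    const run = this.topUp();
    this.pending = run;
    const clear = () => { this.pending = null; };
    run.then(clear, clear); // a real implementation should also log failures
    return run;
  }

  private async topUp(): Promise<void> {
    while (this.free.length < this.maxFree) {
      const poolName = `pool_${this.counter++}.tmp`;
      const handle = await this.create(poolName);
      if (this.free.length < this.maxFree) {
        this.free.push({ poolName, handle });
      } else {
        // A release() filled the pool while we were awaiting; drop the extra.
        handle.close();
        await this.remove(poolName);
      }
    }
  }

  // Sync: what an openFile call would use when asked to create a temp file.
  acquire(requestedName: string): H {
    const file = this.free.pop();
    if (file === undefined) throw new Error("temp file pool exhausted");
    this.inUse.set(requestedName, file);
    if (this.free.length < this.minFree) void this.refill(); // fire and forget
    return file.handle;
  }

  // Sync: called when the DB closes the temp file.
  release(requestedName: string): void {
    const file = this.inUse.get(requestedName);
    if (file === undefined) return;
    this.inUse.delete(requestedName);
    file.handle.truncate(0);
    if (this.free.length < this.maxFree) {
      this.free.push(file);
    } else {
      file.handle.close();
      void this.remove(file.poolName);
    }
  }
}
```

In a browser worker, `create` would presumably wrap `getFileHandle(name, { create: true })` followed by `createSyncAccessHandle()`, and `remove` would wrap `removeEntry(name)`. Note that in this sketch `acquire` throws if the pool is empty, so the pool has to be refilled before spilling starts and sized so that refills keep ahead of demand.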

This somewhat indirect approach works around the sync vs async API mismatch, while still allowing openFile calls to create arbitrarily named -- from its point of view -- files at arbitrary times. The included tests show this in action for OPFS temp directories, both when configured interactively via SET and when configured up front via the configuration passed to open().

Fixes #2061.

@dt dt force-pushed the opfs-tempdir branch 3 times, most recently from c261e12 to 4381898 on November 14, 2025 21:42
Previously, attempting to set temp_directory to an opfs:// path did not work.
Specifically, while 'auto' file handling scanned SQL statements for opfs://
paths to pre-open mentioned files via the async OPFS API and register them
where the sync openFile call could find them, it only worked for specific
_files_, not for a temp directory _in which_ openFile would later attempt to
create files.

The mismatch between the sync openFile API and the async OPFS API presents a
challenge here: when the DB decides to spill to a tempfile and attempts to
create it, it does so via the openFile call, which is a sync call. Creating
the OPFS file at that point is a problem, because the OPFS API for creating
it is async.

To get around this, when an OPFS temp directory is registered, this change
sets up a 'temp pool' of pre-created files, with sync access handles, which
can be handed out as needed in openFile calls. When closed, a file can be
truncated and returned to the pool, or deleted if the pool is already full.
When the pool runs low, new files can be opened async and added to it.

This approach works around the sync vs async API mismatch, while still
allowing openFile calls to create arbitrarily named -- from its point of
view -- files at arbitrary times.

Fixes duckdb#2061.

Development

Successfully merging this pull request may close these issues.

Support for spilling to OPFS