Implement proxy mount read caching#751
Conversation
🦋 Changeset detectedLatest commit: 9b670bf The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
commit: |
📦 Preview BuildVersion: Install the SDK preview:
|
Keep the established SigV4 signer for credential-proxy mounts while retaining the metadata cache behavior. This keeps the cache change focused on reducing redundant HEAD requests without expanding signing risk.
aron-cf
left a comment
There was a problem hiding this comment.
Thanks for digging into the HEAD traffic. Did you look at tweaking s3fs’s own stat cache? I don't really want to add an additional layer when the one in s3fs should be doing this already (unless I've missed something).
Right now sandbox-sdk sets these defaults for R2-backed s3fs mounts:
stat_cache_expire: '60',
enable_noobj_cache: true,
multipart_size: '5's3fs also supports max_stat_cache_size, stat_cache_expire, and negative caching (enable_negative_cache / disable_negative_cache; negative cache appears to be enabled by default in current s3fs).
Have we tried to repro with larger s3fs options first? for example:
s3fsOptions: [
'stat_cache_expire=300',
'max_stat_cache_size=100000',
'enable_negative_cache'
]
aron-cf
left a comment
There was a problem hiding this comment.
After looking at s3fs I think this is probably useful for common cases where a bucket is mounted and owned by a single container or the bucket is largely read only.
It's a bit risky for anything intended to be shared collaboratively. I'd suggest we roll this out under an opt-in flag, with configurable timeouts and document which usecases will benefit and which wont.
| const DEFAULT_SLOW_REQUEST_MS = 1000; | ||
| const ERROR_RESPONSE_BODY_LIMIT = 2048; | ||
| const MAX_DIAGNOSTIC_EVENTS = 500; | ||
| const HEAD_METADATA_CACHE_TTL_MS = 60_000; |
There was a problem hiding this comment.
This feels risky, if something updates/deletes the object, we keep serving the old HEAD result for up to a minute. Could we make this shorter or configurable... this works well for cases where this bucket is not changing often but if the bucket is shared with another writer then this will get problematic.
| const ERROR_RESPONSE_BODY_LIMIT = 2048; | ||
| const MAX_DIAGNOSTIC_EVENTS = 500; | ||
| const HEAD_METADATA_CACHE_TTL_MS = 60_000; | ||
| const NEGATIVE_HEAD_METADATA_CACHE_TTL_MS = 5_000; |
There was a problem hiding this comment.
Ditto here, this means an object created externally will be missing for 5s.
| } | ||
| } | ||
|
|
||
| function getHeadMetadataCacheKey( |
There was a problem hiding this comment.
The cache key is scoped by mountId, not by the underlying bucket/endpoint identity, this doesn't feel right. If you have the same bucket mounted in two places the cache should be the same right?
| return `${mountId}:${realPath}${url.search}`; | ||
| } | ||
|
|
||
| function getCachedHeadMetadataResponse( |
There was a problem hiding this comment.
This defeats s3fs’s open-time revalidation. s3fs intentionally drops its own stat cache on reopen and sends a HEAD; with this cache, that HEAD may be answered locally instead of actually revalidating against the bucket. I think this behaviour in s3fs is probably what led to this PR. But it makes me think that this cache layer should be opt-in and extremely configurable.
| }); | ||
| } | ||
|
|
||
| function cacheHeadMetadataFromPUT( |
There was a problem hiding this comment.
This is neat. How do we ensure that we're accurately matching what the provider actually responds with? Feels like this might need an integration test to verify our code matches the various provider implementations.
| } | ||
| } else if (isMutatingMethod(method)) { | ||
| deleteDirectoryMarkerCacheEntry(mountId, realPath); | ||
| deleteHeadMetadataCacheEntriesForObject(mountId, realPath); |
There was a problem hiding this comment.
For non-HEAD mutating requests, we invalidate before forwarding. There’s a possible race where a HEAD during an in-flight DELETE/POST/copy/multipart operation can cache an old upstream metadata, and then the mutation succeeds. Should we also invalidate after successful mutations?
| ); | ||
|
|
||
| if (response.ok) { | ||
| directoryMarkerCache.set( |
There was a problem hiding this comment.
The directory marker cache looks like it has no TTL. External changes can leave us with stale directory metadata for the lifetime of the mount unless something local invalidates it. Can we add the same invalidation pattern?
Summary
Reduce redundant upstream requests during credential-proxy mounted reads by adding a short-lived HEAD metadata response cache.
s3fs issues frequent HEAD requests for metadata (
getattr) on every file open, stat, and read. With credential-proxy mounts, each HEAD request was forwarded upstream through the signing proxy, adding latency to repeated reads. This PR keeps the existingaws4fetch/AwsClientsigning path and focuses the optimization on safe metadata caching.Changes
HEAD metadata cache (
s3-credential-proxy-handler.ts)path/,path_$folder$variants).content-length,content-type,etag,last-modified, andx-amz-meta-*.PUT,POST,DELETE) invalidate cached metadata. GET requests preserve cached metadata.Cache lifecycle (
sandbox.ts)evictHeadMetadataCacheForMountis called during unmount, mount-failure cleanup, and sandbox teardown, matching the existing SigV4 client and directory-marker cache cleanup paths.Request forwarding safety
content-lengthinstead of dropping the stream.aws4fetchAwsClientpath.Benchmark results (from repro)
Credential-proxy now avoids redundant upstream HEAD requests on repeated metadata reads.