Skip to content

feat: Add S3 External Storage Driver (Claim Check pattern)#2924

Open
brucearctor wants to merge 2 commits into
temporalio:masterfrom
brucearctor:s3-external-storage-driver
Open

feat: Add S3 External Storage Driver (Claim Check pattern)#2924
brucearctor wants to merge 2 commits into
temporalio:masterfrom
brucearctor:s3-external-storage-driver

Conversation

@brucearctor

Copy link
Copy Markdown
Contributor

Summary

Implements the S3 External Storage Driver for the Java SDK, as described in #2882 and the cross-SDK feature spec temporalio/features#783. This ports the Claim Check pattern already implemented in the Go and Python SDKs.

Problem

Temporal has a payload size limit (~2 MiB). Workflows dealing with large data (ML models, documents, media) hit this limit.

Solution

External Storage adds a transparent stage to the data converter pipeline that automatically offloads large payloads to S3 and replaces them with lightweight claim tokens in Event History.

Wire Format (Cross-SDK Compatible)

Claim token payloads use encoding temporal-external-storage/claim with claim data stored as prefixed metadata entries (claim.*). Compatible with Go and Python SDK implementations.

Changes

New Module: temporal-s3-external-storage

  • S3StorageDriver — StorageDriver impl with content-addressed storage (SHA-256)
  • S3StorageDriverOptions — Builder-based config (bucket, key prefix, driver name)
  • S3Client / AwsSdkV2S3Client — Abstracted S3 client for testability

Core SDK (temporal-sdk)

  • StorageDriver interface — SPI for storage backends
  • ExternalStorage — Config (threshold, driver selection)
  • CodecDataConverter — Integration as terminal pipeline stage
  • WorkflowClientOptions.setExternalStorage() — Auto-wires into data converter

Usage

S3StorageDriver driver = new S3StorageDriver(
    S3StorageDriverOptions.newBuilder()
        .setClient(new AwsSdkV2S3Client(awsS3Client))
        .setBucket("my-temporal-payloads")
        .build());

WorkflowClientOptions options = WorkflowClientOptions.newBuilder()
    .setExternalStorage(ExternalStorage.newBuilder().addDriver(driver).build())
    .build();
// Payloads > 256 KiB are now automatically offloaded to S3

Security

  • Path injection prevention: URL-encoded path segments in S3 keys
  • Confused deputy prevention: retrieve() ignores bucket from claim data
  • Metadata key collision prevention: claim data keys prefixed with claim.

Testing

  • 21 tests total (15 core + 6 S3 driver)
  • Full pipeline integration test
  • Java 8 compatible

Closes #2882

Implements temporalio#2882 — porting the External Storage
feature from the Go and Python SDKs to Java.

Core interfaces (temporal-sdk):
- StorageDriver: interface for external storage drivers
- StorageDriverClaim: claim reference token stored in Event History
- StorageDriverStoreContext/RetrieveContext: identity context for drivers
- StorageDriverSelector: multi-driver selection strategy
- ExternalStorage: configuration with builder pattern
- StorageDriverException: driver error type

Pipeline integration:
- CodecDataConverter: wired external storage as last pipeline stage
  (after PayloadCodec). Payloads >= threshold are stored externally
  and replaced with claim tokens. Claims are auto-detected and
  resolved on deserialization.
- WorkflowClientOptions: added setExternalStorage() builder method

S3 Driver (temporal-s3-external-storage module):
- S3Client: abstract interface for S3 operations (testability)
- AwsSdkV2S3Client: AWS SDK v2 adapter
- S3StorageDriver: content-addressed storage using SHA-256 hashing
- S3StorageDriverOptions: builder-based configuration

Tests:
- 11 core unit tests (ExternalStorageTest)
- 6 S3 driver unit tests (S3StorageDriverTest)
Critical fixes:
- Wire ExternalStorage from WorkflowClientOptions into CodecDataConverter
- Apply external storage to failure encode/decode paths (encodePayloads/decodePayloads)
- Namespace claim data keys with 'claim.' prefix to prevent metadata key collisions
- Remove dynamic bucket override in S3StorageDriver.retrieve() (confused deputy fix)
- URL-encode path segments in S3 key construction (path injection fix)
- Return null from buildStoreContext() when no context (instead of 'unknown'/'unknown')

Improvements:
- Add SLF4J logging to S3StorageDriver and AwsSdkV2S3Client (DEBUG/WARN)
- Add serialVersionUID to StorageDriverException
- Remove misleading 'throws StorageDriverException' from StorageDriver interface
- Add null checks in StorageDriverStoreContext factory methods
- Validate setDrivers() list elements for null in ExternalStorage.Builder
- Add equals()/hashCode() to ExternalStorage
- Replace unreachable fallback with IllegalStateException in selectDriver()
- Remove redundant IOException catch block in AwsSdkV2S3Client
- Add withExternalStorage() method to CodecDataConverter

Documentation:
- Add retry configuration guidance (AWS SDK v2 built-in retries)
- Document SSE encryption, S3 Lifecycle Policies, memory implications
- Add thread-safety notes to StorageDriver, S3Client, context classes
- Document Claim Check pattern, threshold rationale, server limits
- Warn about reserved claim data key names

Tests:
- Add full pipeline integration test (serialize -> externalize -> retrieve -> deserialize)
- Add small payload inline test (below threshold)
- Add claim key collision protection test
- Add WorkflowClientOptions wiring test
@brucearctor brucearctor requested a review from a team as a code owner June 21, 2026 04:12
@jmaeagle99

Copy link
Copy Markdown
Contributor

Thanks for making this contribution. There is an existing PR (#2907) for contributing an S3 driver in Java. I think that it is best to continue to provide feedback there and merge that PR when it is ready.

@brucearctor

Copy link
Copy Markdown
Contributor Author

@jmaeagle99 -- any reason the PR wasn't linked to the issue?

It is disappointing to devote time to address open issues -- seeing nobody assigned and no PR attached and thinking that is useful time to help the project. Then to be notified, there is unlinked work in process and that was time not well spent.

Ultimately, that is discouraging and creates a higher bar for me to be contributing if i need to check around for open PRs that are not linked and when nobody is assigned?

Any thoughts on how to assuage that concern? Is there a community expectation that PRs would be linked? That if someone is working on something they would be assigned? Ex: what is the community expectation [ ex: that I need to check for PRs that are not linked to open issues ]??

Generally, I look around for issues not assigned and not having associated PRs. Otherwise, what's the reccommended process for how to go ahead and figure out where to contribute? Is it expected I wait and ask and hope for a response before working on a contribution [ I suspect that's added friction, which doesn't help many ]?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

S3 External Storage Driver - Java

2 participants