Add versioned task hasher strategy #6927

Draft
pditommaso wants to merge 3 commits into master from task-hasher-strategy

@pditommaso
Member

Summary

  • Refactor TaskHasher into a configurable strategy pattern with versioned implementations (V1, V2) selectable via the NXF_TASK_HASH_VER environment variable
  • V2 (current hashing behavior) is the default; V1 preserves the pre-record-types hashing for backward compatibility
  • The resolved version is cached at session level to avoid re-reading the env var per task
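
The selection and caching described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `HashVersionSketch`, `SessionSketch`, and `resolve` are hypothetical names, and rejecting unknown values with an exception is an assumption. Only the `NXF_TASK_HASH_VER` variable, the V1/V2 names, and the V2 default come from the description.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: resolve the hashing version once and reuse it for every task. */
class HashVersionSketch {

    enum Version { V1, V2 }

    /** Resolve the version from an environment map; V2 is the default when the variable is unset. */
    static Version resolve(Map<String, String> env) {
        String raw = env.get("NXF_TASK_HASH_VER");
        if (raw == null || raw.trim().isEmpty()) {
            return Version.V2;
        }
        // Assumption: unknown values are rejected (valueOf throws IllegalArgumentException).
        return Version.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    /** Stand-in for the session: the version is resolved at construction and then only read. */
    static final class SessionSketch {
        private final Version hashStrategy;

        SessionSketch(Map<String, String> env) {
            this.hashStrategy = resolve(env);
        }

        Version getHashStrategy() {
            return hashStrategy;
        }
    }

    public static void main(String[] args) {
        SessionSketch session = new SessionSketch(System.getenv());
        System.out.println("hash strategy: " + session.getHashStrategy());
    }
}
```

Passing the environment in as a map keeps the resolution step testable without touching the real process environment.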

Changes

  • TaskHasher → interface with compute(), getTaskGlobalVars(), getTaskBinEntries()
  • TaskHasherV1 → original hashing (Maps by values, CacheFunnel after SerializableMarker)
  • TaskHasherV2 → current hashing (Maps by unordered entrySet, CacheFunnel before Map)
  • TaskHasherFactory → Version enum + create(TaskRun) factory
  • HashBuilder → version-aware with() method for Map/CacheFunnel handling
  • CacheHelper → new hasher(value, mode, version) overload
  • Session → hashStrategy field caches the resolved version at startup
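
To illustrate the difference between the two Map treatments listed above, here is a small sketch under stated assumptions. It reads "Maps by values" as folding only the values in iteration order, and "unordered entrySet" as hashing each entry on its own and combining the results with a commutative operation. `MapHashSketch` and both method names are hypothetical, and the arithmetic is a toy stand-in for the real hash function.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch contrasting the two Map hashing behaviors. */
class MapHashSketch {

    /** V1-style (assumed): fold only the values, in iteration order; keys are ignored. */
    static long hashByValues(Map<String, ?> map) {
        long h = 17;
        for (Object v : map.values()) {
            h = 31 * h + Objects.hashCode(v);
        }
        return h;
    }

    /** V2-style (assumed): hash each entry alone, then combine with addition, which is commutative. */
    static long hashByUnorderedEntries(Map<String, ?> map) {
        long h = 0;
        for (Map.Entry<String, ?> e : map.entrySet()) {
            h += 31L * Objects.hashCode(e.getKey()) + Objects.hashCode(e.getValue());
        }
        return h;
    }

    public static void main(String[] args) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("x", 1);
        a.put("y", 2);
        Map<String, Object> b = new LinkedHashMap<>();
        b.put("y", 2);
        b.put("x", 1);
        // Same entries, different insertion order.
        System.out.println(hashByUnorderedEntries(a) == hashByUnorderedEntries(b));
        System.out.println(hashByValues(a) == hashByValues(b));
    }
}
```

Because the two schemes yield different hashes for the same input, switching between them would change task cache keys, which is presumably why the old behavior stays available as V1.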

Test plan

  • ./gradlew :nextflow:test --tests "nextflow.processor.TaskHasherTest" passes
  • Verify V1 produces the same hashes as pre-record-types builds
  • Verify V2 (default) produces hashes identical to current master
  • Test NXF_TASK_HASH_VER=V1 env var switches strategy correctly

🤖 Generated with Claude Code

Refactor TaskHasher into a configurable strategy pattern with
versioned implementations (V1, V2) selectable via the NXF_TASK_HASH_VER
environment variable. V2 (current behavior) is the default.

- TaskHasher: now an interface defining compute(), getTaskGlobalVars(),
  getTaskBinEntries()
- TaskHasherV1: original hashing (Maps by values, CacheFunnel after
  SerializableMarker)
- TaskHasherV2: current hashing (Maps by unordered entrySet, CacheFunnel
  before Map)
- TaskHasherFactory: Version enum + create() factory method
- HashBuilder: version-aware with branching for Map/CacheFunnel handling
- Session: caches resolved hash strategy version at startup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

netlify bot commented Mar 16, 2026

Deploy Preview for nextflow-docs-staging ready!

Latest commit: 10c26f2
Latest deploy log: https://app.netlify.com/projects/nextflow-docs-staging/deploys/69bc52805311300009437446
Deploy preview: https://deploy-preview-6927--nextflow-docs-staging.netlify.app

pditommaso and others added 2 commits March 19, 2026 18:15
…der config

- Extract TaskHasher classes into `nextflow.processor.hash` package
- Convert TaskHasher from concrete class to interface + AbstractTaskHasher base
- Replace magic version int in HashBuilder with declarative boolean flags
  (withOrderIndependentMaps, withCacheFunnelFirst)
- TaskHasherV1/V2 now contain real configuration logic via createHashBuilder()
- Move getTaskGlobalVars() and getTaskBinEntries() into TaskRun
- Rename Version enum values to STD_V1/STD_V2 with string values std/v1, std/v2
- Add dedicated TaskHasherFactoryTest
- LinObserver now calls TaskRun methods directly instead of depending on hasher
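
The shape described in this commit (string-valued enum, boolean builder flags, and per-version `createHashBuilder()`) might look roughly like the sketch below. All class names ending in `Sketch` are hypothetical, as are the boolean parameters on the `with...` methods and the `of()` lookup. The flag values per version are inferred from the V1/V2 descriptions earlier in this PR (V1: maps by values, CacheFunnel after; V2: unordered maps, CacheFunnel first).

```java
/** Hypothetical sketch of versions expressed as declarative builder flags. */
class HasherConfigSketch {

    enum Version {
        STD_V1("std/v1"),
        STD_V2("std/v2");

        final String value;

        Version(String value) {
            this.value = value;
        }

        /** Look up a version by its string value (assumed helper). */
        static Version of(String s) {
            for (Version v : values()) {
                if (v.value.equals(s)) return v;
            }
            throw new IllegalArgumentException("Unknown task hash version: " + s);
        }
    }

    /** Stand-in for HashBuilder: only the two flags named in the commit message. */
    static final class HashBuilderSketch {
        boolean orderIndependentMaps;
        boolean cacheFunnelFirst;

        HashBuilderSketch withOrderIndependentMaps(boolean flag) {
            this.orderIndependentMaps = flag;
            return this;
        }

        HashBuilderSketch withCacheFunnelFirst(boolean flag) {
            this.cacheFunnelFirst = flag;
            return this;
        }
    }

    /** Shared base: each version differs only in how it configures the builder. */
    abstract static class AbstractHasherSketch {
        abstract HashBuilderSketch createHashBuilder();
    }

    static final class HasherV1Sketch extends AbstractHasherSketch {
        @Override
        HashBuilderSketch createHashBuilder() {
            return new HashBuilderSketch().withOrderIndependentMaps(false).withCacheFunnelFirst(false);
        }
    }

    static final class HasherV2Sketch extends AbstractHasherSketch {
        @Override
        HashBuilderSketch createHashBuilder() {
            return new HashBuilderSketch().withOrderIndependentMaps(true).withCacheFunnelFirst(true);
        }
    }

    /** Factory: map a version to its hasher implementation. */
    static AbstractHasherSketch create(Version version) {
        switch (version) {
            case STD_V1:
                return new HasherV1Sketch();
            case STD_V2:
                return new HasherV2Sketch();
            default:
                throw new IllegalArgumentException("Unsupported version: " + version);
        }
    }
}
```

Named flags make each version's behavior readable at the point of configuration, instead of scattering `version == 1` checks through the builder.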

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>