Support multi-layer algorithms that receive a data product from an unfold #571

Merged: knoepfel merged 5 commits into Framework-R-D:main from knoepfel:multi-layer-failure on May 20, 2026

@knoepfel commented May 6, 2026

Summary

This PR fixes a bug where transforms operating on a hierarchy layer downstream of an unfold could not receive data products produced by that unfold. The root cause was that the index_router had no mechanism to learn from the unfold itself how many children it had generated, so it could not emit the correct flush token at the right time.

Solution

The fix introduces several new model components, reworks the flush-message types, and refactors the flush/routing pipeline. The previous data_cell_counter / flush_counts abstraction is replaced by smaller, more focused pieces.

New: data_cell_counts (phlex/model/)

A concurrent map from layer_hash to count, replacing the old flush_counts type. It supports emplace, increment, add_to, and count operations and is used both as a mutable accumulator (data_cell_counts_ptr) and as an immutable payload on flush messages (data_cell_counts_const_ptr).
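
The interface described above can be pictured with a minimal sketch. This is not the phlex implementation: the class name data_cell_counts_sketch, the layer_hash alias, the freeze helper, and the choice of a mutex-guarded std::unordered_map are all assumptions made for illustration. Only the four operation names (emplace, increment, add_to, count) and the mutable/immutable pointer split come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <unordered_map>

// Assumption: the real layer_hash type is not shown in the PR description.
using layer_hash = std::size_t;

// Hypothetical sketch of a thread-safe layer -> count map.
class data_cell_counts_sketch {
public:
  // Insert a layer with an initial count; an existing entry is left untouched.
  // Returns true if a new entry was created.
  bool emplace(layer_hash layer, std::size_t initial = 0)
  {
    std::lock_guard<std::mutex> lock{mutex_};
    return counts_.try_emplace(layer, initial).second;
  }

  // Add one to the count for a layer (creating the entry if needed).
  void increment(layer_hash layer) { add_to(layer, 1); }

  // Add n to the count for a layer (creating the entry if needed).
  void add_to(layer_hash layer, std::size_t n)
  {
    std::lock_guard<std::mutex> lock{mutex_};
    counts_[layer] += n;
  }

  // Current count for a layer; 0 if the layer has never been seen.
  std::size_t count(layer_hash layer) const
  {
    std::lock_guard<std::mutex> lock{mutex_};
    auto it = counts_.find(layer);
    return it == counts_.end() ? 0 : it->second;
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<layer_hash, std::size_t> counts_;
};

// Mutable accumulator vs. immutable payload, mirroring the two pointer roles.
using counts_ptr = std::shared_ptr<data_cell_counts_sketch>;
using counts_const_ptr = std::shared_ptr<data_cell_counts_sketch const>;

// Hypothetical helper: hand out a read-only view once accumulation is done.
inline counts_const_ptr freeze(counts_ptr counts)
{
  assert(counts != nullptr);
  return counts;
}
```

Handing out a pointer-to-const lets a flush message share the accumulated counts without copying the map while preventing downstream nodes from modifying it.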

New: flush_messages.hpp (phlex/model/)

Defines the message structs that flow through the new flush pipeline:

  • index_flush: an (index, counts) pair carrying the child-count map for an index.
  • unfold_flush: a simpler (index, layer_hash, count) message emitted by an unfold, since each unfold produces children in exactly one child layer.
  • ready_flushes_then_emit: bundles the closeout flushes that must be emitted before a new index.
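
As a rough picture of the three message shapes listed above, the sketch below uses plain structs. The field names, the _sketch suffixes, the index_id and layer_hash aliases, and the widen helper are hypothetical; the actual definitions live in flush_messages.hpp and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Assumptions: stand-ins for the real layer_hash and data_cell_index types.
using layer_hash = std::size_t;
using index_id = std::string;
using child_counts = std::map<layer_hash, std::size_t>;

// (index, counts): child counts for every child layer of one index.
struct index_flush_sketch {
  index_id index;
  std::shared_ptr<child_counts const> counts;
};

// (index, layer_hash, count): an unfold only ever fills one child layer.
struct unfold_flush_sketch {
  index_id index;
  layer_hash layer;
  std::size_t count;
};

// Closeout flushes that must go out before the next index is emitted.
struct ready_flushes_then_emit_sketch {
  std::vector<index_flush_sketch> ready_flushes;
  index_id next_index;
};

// Hypothetical helper: express an unfold_flush in the general index_flush form.
inline index_flush_sketch widen(unfold_flush_sketch const& f)
{
  assert(!f.index.empty());
  auto counts = std::make_shared<child_counts>();
  (*counts)[f.layer] = f.count;
  return {f.index, counts};
}
```

The widen helper only illustrates why the unfold message can be simpler: a single (layer, count) pair is a one-entry special case of the general child-count map.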

New: data_cell_tracker (phlex/model/)

Tracks the sequence of incoming data_cell_index values from the driver and determines which flush tokens are ready to be emitted when a new index arrives or the job ends. This logic was previously implicit in the source/index-router interaction.
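
One way to picture the closeout logic is a stack of open indices: when a new index arrives, every open index that is not an ancestor of it is finished and can be flushed. The sketch below is an illustration under assumptions, not the data_cell_tracker code. It assumes indices arrive in depth-first order, models an index as a path of (layer name, number) pairs, and counts only the children seen directly under the nearest open ancestor. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumption: an index is modeled as its path from the root, e.g. {run 1, spill 0}.
using cell_index = std::vector<std::pair<std::string, std::size_t>>;

struct closeout_flush {
  cell_index index;
  std::map<std::string, std::size_t> child_counts; // children seen per child layer
};

class tracker_sketch {
public:
  // Register the next index from the driver; returns the flushes that became
  // ready because the new index is not a descendant of those open indices.
  std::vector<closeout_flush> on_index(cell_index const& next)
  {
    assert(!next.empty());
    std::vector<closeout_flush> ready;
    while (!open_.empty() && !is_ancestor(open_.back().index, next)) {
      ready.push_back(std::move(open_.back()));
      open_.pop_back();
    }
    if (!open_.empty()) {
      // The nearest open ancestor gained one child in the new index's layer.
      ++open_.back().child_counts[next.back().first];
    }
    open_.push_back({next, {}});
    return ready;
  }

  // End of job: everything still open is flushed, deepest index first.
  std::vector<closeout_flush> drain()
  {
    std::vector<closeout_flush> ready(open_.rbegin(), open_.rend());
    open_.clear();
    return ready;
  }

private:
  // True if a is a proper prefix of b.
  static bool is_ancestor(cell_index const& a, cell_index const& b)
  {
    return a.size() < b.size() && std::equal(a.begin(), a.end(), b.begin());
  }

  std::vector<closeout_flush> open_;
};
```

For the sequence run 1, spill 0, spill 1, run 2, this sketch reports spill 0 as ready when spill 1 arrives, then spill 1 and run 1 (with two spill children) when run 2 arrives, and finally run 2 on drain.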

New: child_tracker (phlex/model/)

Per-index tracker that records how many children have been processed in each child layer. Once the expected count (received from the unfold's flush result) is satisfied, child_tracker fires a callback to emit the flush token for that index.
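
The counting-then-callback behavior can be sketched as follows. This is a single-threaded illustration with hypothetical names (child_tracker_sketch, set_expected, child_processed); the real child_tracker presumably has to cope with concurrent updates and may expose a different interface. The sketch assumes the expected count can arrive before or after some children have already been processed, and that the callback must fire exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Assumption: stand-in for the real layer_hash type.
using layer_hash = std::size_t;

class child_tracker_sketch {
public:
  explicit child_tracker_sketch(std::function<void()> on_complete)
    : on_complete_{std::move(on_complete)}
  {
    assert(on_complete_ != nullptr);
  }

  // Expected number of children in a layer, as reported by the unfold's flush.
  void set_expected(layer_hash layer, std::size_t n)
  {
    expected_[layer] = n;
    maybe_fire();
  }

  // One child in the given layer has been fully processed.
  void child_processed(layer_hash layer)
  {
    ++processed_[layer];
    maybe_fire();
  }

private:
  // Fire the callback once every layer with a known expectation is satisfied.
  void maybe_fire()
  {
    if (fired_ || expected_.empty()) {
      return;
    }
    for (auto const& [layer, n] : expected_) {
      auto it = processed_.find(layer);
      std::size_t const done = it == processed_.end() ? 0 : it->second;
      if (done < n) {
        return;
      }
    }
    fired_ = true;
    on_complete_();
  }

  std::function<void()> on_complete_;
  std::map<layer_hash, std::size_t> expected_;
  std::map<layer_hash, std::size_t> processed_;
  bool fired_{false};
};
```

Checking the condition from both entry points means the flush is emitted regardless of whether the expected count or the last processed child arrives first.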

declared_unfold

Now uses a third TBB output port (index_flush) carrying the child counts alongside the existing message and index-message ports.

index_router

Gains a flush_receiver input and an establish_layers() initializer that records which layers are produced by unfolds vs. consumed as inputs. route() now accepts the closeout flushes from data_cell_tracker rather than computing them internally, and drain() likewise receives the remaining flushes at job end.

framework_graph

  • Introduces an index_receiver_ node that decouples closeout flush emission from the source node.
  • Instantiates a data_cell_tracker (cell_tracker_) and feeds its output into index_router_.route().
  • Calls index_router_.establish_layers() before finalize(), using layer metadata gathered from the declared unfolds.
  • Connects each unfold's flush_sender() to index_router_.flush_receiver().

edge_maker

Refactored to return (provider_input_ports, multilayer_join_index_ports) instead of directly calling index_router_.finalize(), allowing framework_graph to call establish_layers() first.

store_counters / message

Updated to use data_cell_counts_const_ptr in place of the removed flush_counts_ptr, and the internal counts map is renamed accordingly.

fixed_hierarchy

Exposes a layer_paths() accessor so the framework graph can read the declared layer paths when establishing layers on the index_router.

Tests

  • test/data_cell_tracker_test.cpp — unit tests for data_cell_tracker in isolation.
  • test/child_tracker_test.cpp — unit-level tests for child_tracker verifying that committed child counts accumulate correctly through nested unfolds without involving the full framework_graph machinery.
  • test/data_layer_hierarchy_test.cpp — new unit tests for data_layer_hierarchy, covering ambiguous-layer-name lookups and the unnamed-layer fallback path.
  • test/fold_duplicate_layer_name_test.cpp — renamed from test/different_hierarchies.cpp; re-enables the previously disabled different_hierarchies test, now covering the case where two layers share the same name in a fold scenario.
  • test/unfold.cpp — additional unfold scenarios exercising multi-layer product consumption.
  • test/data_cell_counting.cpp — removed alongside the old data_cell_counter implementation.

Files changed

| Area | Files |
|---|---|
| New model | phlex/model/child_tracker.{hpp,cpp}, phlex/model/data_cell_tracker.{hpp,cpp}, phlex/model/data_cell_counts.{hpp,cpp}, phlex/model/flush_messages.hpp |
| Removed model | phlex/model/data_cell_counter.{hpp,cpp} |
| Modified model | phlex/model/fwd.hpp, phlex/model/fixed_hierarchy.{hpp,cpp}, phlex/model/CMakeLists.txt |
| Modified core | phlex/core/index_router.{hpp,cpp}, phlex/core/framework_graph.{hpp,cpp}, phlex/core/declared_unfold.{hpp,cpp}, phlex/core/edge_maker.hpp, phlex/core/message.hpp, phlex/core/store_counters.{hpp,cpp} |
| New tests | test/data_cell_tracker_test.cpp, test/child_tracker_test.cpp, test/data_layer_hierarchy_test.cpp |
| Renamed tests | test/different_hierarchies.cpp → test/fold_duplicate_layer_name_test.cpp |
| Removed tests | test/data_cell_counting.cpp |
| Modified tests | test/unfold.cpp, test/CMakeLists.txt |

Summary largely courtesy of Claude Sonnet 4.6

Resolves #550

@knoepfel force-pushed the multi-layer-failure branch from 381d7c1 to fa25cb1 on May 6, 2026 21:48
codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 91.48265% with 27 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| phlex/core/index_router.cpp | 90.00% | 8 Missing and 8 partials ⚠️ |
| phlex/model/data_cell_tracker.cpp | 90.90% | 0 Missing and 4 partials ⚠️ |
| phlex/model/flush_gate.cpp | 85.71% | 1 Missing and 3 partials ⚠️ |
| phlex/core/framework_graph.cpp | 97.87% | 0 Missing and 1 partial ⚠️ |
| phlex/core/store_counters.cpp | 50.00% | 0 Missing and 1 partial ⚠️ |
| phlex/model/data_cell_counts.hpp | 85.71% | 0 Missing and 1 partial ⚠️ |

@@            Coverage Diff             @@
##             main     #571      +/-   ##
==========================================
+ Coverage   82.23%   82.59%   +0.35%     
==========================================
  Files         157      161       +4     
  Lines        5760     5895     +135     
  Branches      649      682      +33     
==========================================
+ Hits         4737     4869     +132     
+ Misses        807      804       -3     
- Partials      216      222       +6     

| Flag | Coverage Δ |
|---|---|
| scripts | 76.12% <ø> (ø) |
| unittests | 85.93% <91.48%> (+0.42%) ⬆️ |


| Files with missing lines | Coverage Δ |
|---|---|
| phlex/core/declared_unfold.cpp | 100.00% <100.00%> (+16.66%) ⬆️ |
| phlex/core/declared_unfold.hpp | 96.42% <100.00%> (ø) |
| phlex/core/edge_maker.hpp | 100.00% <100.00%> (ø) |
| phlex/core/framework_graph.hpp | 100.00% <ø> (ø) |
| phlex/core/index_router.hpp | 100.00% <100.00%> (ø) |
| phlex/core/message.hpp | 100.00% <ø> (ø) |
| phlex/model/data_cell_counts.cpp | 100.00% <100.00%> (ø) |
| phlex/model/data_cell_tracker.hpp | 100.00% <100.00%> (ø) |
| phlex/model/fixed_hierarchy.cpp | 100.00% <100.00%> (ø) |
| phlex/model/fixed_hierarchy.hpp | 100.00% <100.00%> (ø) |
| ... and 7 more | |

... and 3 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2e90f07...0dafda7.


@knoepfel force-pushed the multi-layer-failure branch 12 times, most recently from eb2cf70 to f4599e5 on May 12, 2026 20:49
@Framework-R-D deleted a comment from greenc-FNAL on May 12, 2026
@knoepfel force-pushed the multi-layer-failure branch 10 times, most recently from 520789a to ab9f5eb on May 15, 2026 15:50
Comment thread on phlex/core/index_router.hpp (outdated)
Comment thread on phlex/model/child_tracker.cpp (outdated)
Comment thread on test/child_tracker_test.cpp (outdated)
@knoepfel force-pushed the multi-layer-failure branch from ab9f5eb to 6fe9261 on May 18, 2026 19:14
Comment thread on phlex/model/data_cell_tracker.hpp (outdated)
@knoepfel force-pushed the multi-layer-failure branch 6 times, most recently from 629bc84 to cf92bb1 on May 19, 2026 21:41
Comment thread on phlex/core/index_router.cpp (outdated)
Comment thread on phlex/core/declared_unfold.hpp (outdated)
@knoepfel force-pushed the multi-layer-failure branch from cf92bb1 to cd49e44 on May 20, 2026 15:38
@knoepfel force-pushed the multi-layer-failure branch from cd49e44 to 0dafda7 on May 20, 2026 16:02
@knoepfel dismissed marcpaterno’s stale review on May 20, 2026 16:04

Changes implemented in 5da5756.

@knoepfel requested a review from marcpaterno on May 20, 2026 16:05
@knoepfel merged commit 3455282 into Framework-R-D:main on May 20, 2026
40 checks passed
@knoepfel deleted the multi-layer-failure branch on May 20, 2026 18:03
Linked issue: Support multi-layer consumer nodes with input data products from unfold nodes

3 participants