Lore0599
Add support for reduction logic

Lore0599 changed the title from "Feature/reduction" to "hw: Add Reduction capabilities" on Sep 24, 2025
Lore0599 force-pushed the feature/reduction branch 9 times, most recently from 6c02065 to 777e645, on October 1, 2025 10:30
Lore0599 force-pushed the feature/reduction branch 2 times, most recently from d420a31 to 27f24dc, on October 3, 2025 08:05
Raphael and others added 16 commits October 6, 2025 09:49
…rrier operation

* hw: Extend package with configuration for parallel and offload reduction

* hw: Introduce LSB-And operation in `floo_reduction_arbiter.sv`

* hw: Add parallel reduction support to the `floo_router.sv`

* hw: Add all parameters relevant for the parallel and offload reduction to the `floo_nw_router.sv` / `floo_router.sv` without changing the port-list

* hw: Add support to the `floo_nw_chimney.sv`

(merged from commit hash: 4a7f9a1 by raroth)
* hw: add sequential reduction controller to FlooNoC

* hw: add generic offload port to all routers

* hw: add a small parameterizable ALU to FlooNoC that can run simple integer operations (Add / Mul / Min / Max)

* misc: the reduction controller supports multiple configurations depending on area / performance requirements.

* dataflow: any offload collective flit is branched off at the input of the router and forwarded to the reduction logic. The final reduced flit is merged back into the normal datastream through an additional slave port on the reduction arbiter / wormhole arbiter.
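
The integer operations listed for the ALU above can be illustrated with a small behavioral model. This is only a sketch in Python, not the SystemVerilog from this PR: the names `RedOp`, `alu`, and `reduce_operands`, the 32-bit default width, and the unsigned wrap-around semantics are all assumptions made for illustration.

```python
from enum import Enum
from functools import reduce


class RedOp(Enum):
    """Hypothetical opcode names for the four operations named in the commit."""
    ADD = "add"
    MUL = "mul"
    MIN = "min"
    MAX = "max"


def alu(op: RedOp, a: int, b: int, width: int = 32) -> int:
    """Combine two operands; values are treated as unsigned `width`-bit words (assumption)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if op is RedOp.ADD:
        return (a + b) & mask  # wraps on overflow
    if op is RedOp.MUL:
        return (a * b) & mask  # keeps only the low `width` bits
    if op is RedOp.MIN:
        return min(a, b)
    if op is RedOp.MAX:
        return max(a, b)
    raise ValueError(f"unsupported operation: {op}")


def reduce_operands(op: RedOp, operands: list[int], width: int = 32) -> int:
    """Fold the operands pairwise, the way a sequential reduction would consume them."""
    return reduce(lambda acc, x: alu(op, acc, x, width), operands)


if __name__ == "__main__":
    print(reduce_operands(RedOp.ADD, [1, 2, 3]))
    print(reduce_operands(RedOp.MAX, [5, 2, 9]))
```

The operand width is a function argument here to mirror the "parameterizable" wording; how the actual hardware parameterizes the ALU is not stated in the commit message.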

(merged from commit hash: 4a7f9a1 by raroth)
* When it is detected that the flit will leave the reduction controller after the reduction, the controller starts the next reduction even if the current one has not finished yet.
When an endpoint initiates a wide multicast DMA transfer from another
endpoint to itself (and possibly other endpoints), the following
deadlock occurs. The DMA issues an AR, causing a read burst to come
back from the router to the initiator endpoint. When the first read
beats arrive, the DMA issues a write burst. This write burst loops
back to the initiator endpoint, and may take control of the physical
link. Writes lock the link for the entire burst, but the burst
cannot complete because it needs the stalled read beats to feed it
(a DMA requirement), so the system deadlocks. Note that this can also
happen without multicast, as long as the system uses loopback.
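
The circular wait described above can be reproduced with a toy cycle-level model. This is a sketch under simplifying assumptions, not a model of the actual router: one shared loopback link, one beat per cycle, and a DMA that can only send write beat n after read beat n has arrived. The function name `simulate_loopback` and its parameters are hypothetical, and the non-locking mode only illustrates the contrast; it is not claimed to be the fix applied in this PR.

```python
def simulate_loopback(burst_len: int, write_locks_link: bool, max_cycles: int = 1000) -> str:
    """Return "completed", "deadlock", or "timeout" for one DMA transfer."""
    reads_arrived = 0         # read beats delivered to the initiator
    writes_sent = 0           # write beats the DMA has pushed onto the link
    write_holds_link = False  # True while a write burst owns the link

    for _ in range(max_cycles):
        if writes_sent == burst_len:
            return "completed"

        # DMA requirement: each write beat needs the matching read beat first.
        can_write = writes_sent < reads_arrived

        if write_holds_link:
            if not can_write:
                # The write burst owns the link but waits for a read beat,
                # and that read beat waits for the link: circular wait.
                return "deadlock"
            writes_sent += 1
            if writes_sent == burst_len:
                write_holds_link = False
        elif can_write:
            writes_sent += 1
            # In locking mode the burst keeps the link until its last beat.
            write_holds_link = write_locks_link and writes_sent < burst_len
        else:
            reads_arrived += 1  # the link is free, so a read beat gets through

    return "timeout"


if __name__ == "__main__":
    print(simulate_loopback(burst_len=4, write_locks_link=True))
    print(simulate_loopback(burst_len=4, write_locks_link=False))
```

In this model a single-beat burst never deadlocks, because the write releases the link right after its only beat; the problem needs a burst that is longer than the read data available when the write starts.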
Lore0599 force-pushed the feature/reduction branch 4 times, most recently from f58137f to 8d67524, on October 9, 2025 08:11