
Conversation

lukamac
Collaborator

@lukamac lukamac commented Oct 16, 2025

Remove the memory-aware node bindings; this makes the great parser refactor easier.

The memory-aware node bindings exist only so that Neureka can separate bindings that use the dedicated weight memory from those that don't.
But those bindings can simply be rewritten to check whether the weights reside in weight memory and change behavior accordingly.
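
Roughly, the check looks like the following sketch (illustrative only; the buffer type, memory-level name, and return values are placeholders, not the actual Deeploy API):

from dataclasses import dataclass


@dataclass
class WeightBuffer:
    name: str
    memoryLevel: str  # e.g. "WeightMemory_SRAM" or "L2"; level names are assumptions


def scheduleWeights(weights: WeightBuffer) -> str:
    # Weights already sitting in the dedicated weight memory only need
    # per-tile address offsets; everything else goes into the load schedule.
    if weights.memoryLevel == "WeightMemory_SRAM":
        return "compute weight_addr_offset per tile"
    return "append weight cube to the input load schedule"


print(scheduleWeights(WeightBuffer("conv1.weight", "WeightMemory_SRAM")))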

By removing the memory-aware node bindings, we remove another dependency on having hoisted buffers in the middle of parsing.

The RequantHelpers are a bonus: they fix the requantization mul and add hyperrectangles to keep the rank of the original tensors.

Added

  • RequantHelpers.py for Neureka's TileConstraints

Changed

  • Removed NodeMemoryLevelChecker, MemoryAwareNodeBinding
  • Removed _parseNode from MemoryNetworkDeployer since the annotations are no longer needed before typeChecking
  • Removed the Wmem variants of bindings and tile constraints from Neureka

Fixed

  • Keep mul/add rank of requantized Neureka tile constraints

PR Merge Checklist

  1. The PR is rebased on the latest devel commit and points to devel.
  2. Your PR has been reviewed and approved.
  3. All checks are passing.
  4. The CHANGELOG.md file has been updated.
  5. If the Docker image was modified, change its link back after review.

@coderabbitai
Contributor

coderabbitai bot commented Oct 16, 2025

📝 Walkthrough

Summary by CodeRabbit

  • Chores

    • Removed legacy memory-aware node bindings infrastructure and deprecated memory-level checking extensions.
    • Simplified internal Neureka target configurations by eliminating memory-aware wrapper variants.
  • Refactor

    • Consolidated tile constraint logic and streamlined bindings configuration.
    • Added internal helper utilities for constraint processing.

Walkthrough

Removed memory-aware node parsing/binding extensions and all Neureka "Wmem" binding/tiling variants; introduced requantization helper utilities and refactored Neureka tile-constraint classes to condition on weight SRAM vs non‑SRAM placement and to integrate requant load scheduling.

Changes

Cohort / File(s) Summary
Network deployer wrappers
Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py
Removed the MemoryAwareDeployer _parseNode augmentation and reduced typing imports (Tuple removed).
Memory-level extension core
Deeploy/MemoryLevelExtension/MemoryLevels.py, Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py
Removed NodeMemoryLevelChecker, MemoryAwareNodeBinding, memoryAwareNodeBindingExtension and per-node _parseNode overrides; preserved memory-level annotation optimizer in bind()/codeTransform(); removed ONNXLayer from public imports.
Neureka bindings & engine
Deeploy/Targets/Neureka/Bindings.py, Deeploy/Targets/Neureka/Engine.py, Deeploy/Targets/Neureka/Tiler.py
Deleted all NeurekaWmem* binding and tiling-ready declarations and imports; simplified mapper definitions to direct parser→tiling-binding mappings; narrowed exported tiling/binding set to non‑Wmem variants.
TileConstraints: helpers
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py
Added requantAddGeometricalConstraint and requantLoadSchedule to encapsulate requant geometrical constraints and per-cube load scheduling.
TileConstraints: dense
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py
Conditional constraints/serialization based on weight buffer memory level (SRAM vs non‑SRAM); removed direct per-cube weight-offset logic for some paths; added NeurekaRQSDenseConv2DTileConstraint subclass that composes requant helpers.
TileConstraints: depthwise
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py
Replaced legacy dimension mappings with SRAM-aware constraints and serialization; removed manual weight-cube construction in non‑SRAM path; added NeurekaRQSDWConv2DTileConstraint subclass integrating requant helpers.
TileConstraints: pointwise
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py
Consolidated PW constraints into NeurekaPWConv2DTileConstraint, renamed requant variant to NeurekaRQSPWConv2DTileConstraint, introduced SRAM-aware weight handling and integration with requantLoadSchedule; removed NeurekaWmemPWConv2DTileConstraint.
Changelog
CHANGELOG.md
Added unreleased entries documenting removals of memory-aware bindings/parse paths, addition of RequantHelpers, and related cleanup.

Sequence Diagram(s)

sequenceDiagram
    participant Parser as Parser
    participant Engine as Engine/Mapper
    participant Tiler as Tiler
    participant Binding as TilingReadyBinding

    Note over Parser,Engine: Before: parser → (WmemBinding + Binding)
    Parser->>Engine: map to (WmemBinding + Binding)
    Engine->>Tiler: provide Wmem + non‑Wmem tiling bindings
    Tiler->>Binding: use Wmem-aware constraints

    Note over Parser,Engine: After: parser → Binding (direct)
    Parser->>Engine: map to Binding
    Engine->>Tiler: provide non‑Wmem tiling bindings
    Tiler->>Binding: use unified / requant-aware constraints
sequenceDiagram
    participant Constraint as TileConstraint
    participant Ctxt as NetworkContext
    participant Requant as RequantHelpers
    participant Schedule as TilingSchedule

    Constraint->>Ctxt: lookup weight buffer (VariableBuffer)
    alt weight memory == SRAM
        Ctxt-->>Constraint: SRAM
        Constraint->>Schedule: compute per-cube weight_addr_offset (SRAM path)
    else weight memory != SRAM
        Ctxt-->>Constraint: non‑SRAM
        Constraint->>Requant: requantLoadSchedule(absoluteOutputCubes, ctxt, op)
        Requant-->>Schedule: mul/add load schedule per cube
        Constraint->>Schedule: merge weight info into input/output loads
    end
    Schedule-->>Constraint: final tiling load schedule

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

Feature

Suggested reviewers

  • Victor-Jung
  • Xeratec

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title Check ✅ Passed — The title succinctly captures the core change of removing memory-aware node bindings, aligning with the main modifications described in the pull request and clearly conveying its primary intent.
  • Description Check ✅ Passed — The PR description is clearly related to the changeset and provides meaningful context. The author describes the main objective (removing memory-aware node bindings to simplify a parser refactor), explains the rationale (bindings can check at runtime whether weights are in weight memory instead of relying on separate memory-aware variants), and lists specific changes: the addition of RequantHelpers.py, the removal of NodeMemoryLevelChecker, MemoryAwareNodeBinding, and _parseNode from MemoryNetworkDeployer, and the removal of the Wmem variants from Neureka. These stated changes align directly with the raw summary showing modifications across multiple files, confirming the description accurately represents the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1)

5-64: Restore NetworkContext import to avoid runtime failure

NetworkContext is still used in the type hints (e.g., Line 51) but is no longer imported. Because these annotations are evaluated at module import time (no from __future__ import annotations safeguard), the module will now raise a NameError as soon as it loads. Please reintroduce the import (or switch to a quoted forward reference).

🧹 Nitpick comments (12)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

227-227: Add strict=True to zip() for safety.

Python 3.10+ supports strict=True to catch length mismatches between iterables. This prevents silent bugs if outputCubes and inputLoadSchedule diverge in length.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

257-261: Add strict=True to zip() for safety.

Similar to line 227, this zip() should include strict=True to catch length mismatches.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1)

64-68: Add strict=True to zip() for safety.

For consistency and to catch potential length mismatches, add strict=True.

Apply this diff:

     newInputLoadSchedule = [{
         **schedule,
         "mul": mul,
         "add": add,
-    } for schedule, mul, add in zip(tilingSchedule.inputLoadSchedule, inputMulCubes, inputAddCubes)]
+    } for schedule, mul, add in zip(tilingSchedule.inputLoadSchedule, inputMulCubes, inputAddCubes, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (4)

161-162: Consider using _ prefix for unused unpacked variables.

Static analysis flags BatchOffset, HOffset, WOffset, and BatchSize as unused. While the current code works, prefixing with _ makes the intent explicit.

Apply this diff:

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

212-232: Code duplication with NeurekaDepthwiseConstraint.

This weight handling logic is nearly identical to lines 210-230 in NeurekaDepthwiseConstraint.py. The only difference is the weight cube construction (different dimension ordering).

Consider extracting this pattern into a helper function in RequantHelpers.py to reduce duplication across Dense, Depthwise, and Pointwise constraint classes.
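
A possible shape for such a helper (hypothetical name and simplified types, not the actual Deeploy signatures):

from typing import Dict, List, Tuple

Cube = Tuple[Tuple[int, ...], Tuple[int, ...]]  # (offset, dims) of a weight tile


def mergeWeightLoadSchedule(weightInSram: bool, weightCubes: List[Cube],
                            inputLoadSchedule: List[Dict]) -> List[Dict]:
    # If the weights already reside in weight-memory SRAM, leave the schedule
    # untouched (callers compute per-tile offsets instead); otherwise merge the
    # per-tile weight cube into every input load entry.
    if weightInSram:
        return inputLoadSchedule
    return [{**load, "weight": cube} for load, cube in zip(inputLoadSchedule, weightCubes, strict=True)]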


229-229: Add strict=True to zip() for safety.

For consistency with best practices in Python 3.10+, add strict=True.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

259-263: Add strict=True to zip() for safety.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5)

48-52: Document why SRAM weights are constrained to Max().

Unlike the Dense and Depthwise variants, the Pointwise constraint explicitly constrains weightOutChannelVar == weightOutChannelVar.Max() for SRAM weights. This difference suggests Pointwise has specific requirements for SRAM weight tiling.

Consider adding a comment explaining why Pointwise forces full channel tiling for SRAM weights while Dense/Depthwise don't.


191-192: Consider using _ prefix for unused unpacked variables.

Static analysis flags BatchOffset, HOffset, WOffset, and BatchSize as unused.

Apply this diff:

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

242-262: Code duplication with Dense and Depthwise constraints.

This weight handling logic is duplicated across NeurekaDenseConstraint.py (lines 212-232), NeurekaDepthwiseConstraint.py (lines 210-230), and here. Consider consolidating this pattern into a shared helper function.


259-259: Add strict=True to zip() for safety.

Apply this diff:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

289-293: Add strict=True to zip() for safety.

Apply this diff:

         newInputLoadSchedule = [{
             **load,
             **rqLoad
-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15c4a23 and 0bfd7fb.

📒 Files selected for processing (10)
  • Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1 hunks)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py (1 hunks)
  • Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1 hunks)
  • Deeploy/Targets/Neureka/Bindings.py (0 hunks)
  • Deeploy/Targets/Neureka/Engine.py (1 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (7 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1 hunks)
  • Deeploy/Targets/Neureka/Tiler.py (1 hunks)
💤 Files with no reviewable changes (1)
  • Deeploy/Targets/Neureka/Bindings.py
🧰 Additional context used
🧬 Code graph analysis (8)
Deeploy/Targets/Neureka/Engine.py (2)
Deeploy/DeeployTypes.py (1)
  • NodeMapper (1716-1872)
Deeploy/Targets/Neureka/Parsers.py (6)
  • NeurekaRQSPWConv2DParser (140-162)
  • NeurekaPWConv2DParser (125-137)
  • NeurekaRQSDWConv2DParser (99-122)
  • NeurekaDWConv2DParser (80-96)
  • NeurekaRQSDenseConv2DParser (180-202)
  • NeurekaDenseConv2DParser (165-177)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (6)
Deeploy/DeeployTypes.py (2)
  • NetworkContext (564-1076)
  • add (737-774)
Deeploy/TilingExtension/MemoryConstraints.py (1)
  • NodeMemoryConstraint (95-167)
Deeploy/TilingExtension/TilerModel.py (3)
  • TilerModel (34-402)
  • addTensorDimToModel (143-157)
  • getTensorDimVar (131-135)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • AbsoluteHyperRectangle (39-49)
  • HyperRectangle (24-35)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)
  • serializeTilingSolution (92-236)
  • serializeTilingSolution (247-268)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1)
Deeploy/DeeployTypes.py (3)
  • CodeGenVerbosity (49-55)
  • NetworkDeployer (3231-3596)
  • ONNXLayer (1875-2203)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • NetworkOptimizationPass (2263-2283)
  • NetworkOptimizer (2286-2309)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (14-34)
  • requantLoadSchedule (76-94)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (14-34)
  • requantLoadSchedule (76-94)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
Deeploy/Targets/Neureka/Tiler.py (3)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)
  • NeurekaDenseConv2DTileConstraint (21-236)
  • NeurekaRQSDenseConv2DTileConstraint (239-268)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)
  • NeurekaDWConv2DTileConstraint (21-234)
  • NeurekaRQSDWConv2DTileConstraint (237-266)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (2)
  • NeurekaPWConv2DTileConstraint (21-266)
  • NeurekaRQSPWConv2DTileConstraint (269-298)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (14-34)
  • requantLoadSchedule (76-94)
Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py (1)
  • computeInputCube (320-347)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
🪛 Ruff (0.14.0)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py

48-48: Undefined name cls

(F821)


68-68: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py

227-227: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


261-261: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py

161-161: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


229-229: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


263-263: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py

191-191: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


192-192: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


293-293: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🔇 Additional comments (7)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

210-230: LGTM! Clean SRAM-aware weight handling.

The conditional weight handling properly separates the SRAM path (computing offsets per tile) from the non-SRAM path (merging base offsets and adding weight to load schedule). This aligns with the PR's goal to simplify by removing memory-aware bindings while still supporting SRAM weights.


237-266: LGTM! Requantization subclass correctly delegates.

The NeurekaRQSDWConv2DTileConstraint class properly delegates to the base class and then applies requantization-specific constraints via requantAddGeometricalConstraint and requantLoadSchedule. This composition pattern is clean and maintainable.
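
The pattern, reduced to a self-contained sketch (the names mirror the review text; real Deeploy signatures differ):

class BaseDWConstraint:

    @classmethod
    def addGeometricalConstraint(cls, model: list) -> list:
        model.append("DW conv geometry")
        return model


def requantAddGeometricalConstraint(model: list) -> list:
    model.append("mul/add channel == output channel")
    return model


class RQSDWConstraint(BaseDWConstraint):

    @classmethod
    def addGeometricalConstraint(cls, model: list) -> list:
        # Delegate to the base convolution constraint, then layer requant on top.
        model = super().addGeometricalConstraint(model)
        return requantAddGeometricalConstraint(model)


print(RQSDWConstraint.addGeometricalConstraint([]))
# ['DW conv geometry', 'mul/add channel == output channel']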

Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)

14-34: LGTM! Clean constraint addition for requantization.

This helper correctly adds dimension variables for mul/add buffers and constrains their channel dimensions to match the output channel. The logic is straightforward and well-commented.


76-94: LGTM! Clean load schedule generation.

This helper correctly constructs HyperRectangles for mul/add buffers based on output channel offsets and sizes. The logic matches the pattern used in the tile constraint classes.

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (2)

54-58: LGTM! Clean conditional weight constraint.

Unlike the depthwise variant, this implementation doesn't have a TODO and cleanly handles the SRAM case with a pass statement, making the intent clear.


239-268: LGTM! Requantization integration follows established pattern.

The NeurekaRQSDenseConv2DTileConstraint class correctly implements the same pattern as the depthwise variant, delegating to the base class and then applying requantization constraints.

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (1)

269-298: LGTM! Consistent requantization pattern.

The NeurekaRQSPWConv2DTileConstraint class follows the same clean pattern as Dense and Depthwise variants.

@lukamac lukamac force-pushed the remove-memory-aware-binding branch from 6f72b86 to f0d4de7 on October 16, 2025 15:55
@lukamac lukamac force-pushed the remove-memory-aware-binding branch from f0d4de7 to 28faade on October 16, 2025 15:58
@lukamac lukamac changed the title from "Remove memory-aware node bindings" to "DRAFT: Remove memory-aware node bindings" on Oct 16, 2025
@lukamac lukamac marked this pull request as draft October 16, 2025 16:08
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (1)

86-88: Bug: kernel dims swapped (height vs width).

Height should compare to dim_kernel_y, width to dim_kernel_x.

-        tilerModel.addConstraint(inputHeightVar >= parseDict['dim_kernel_x'])
-        tilerModel.addConstraint(inputWidthVar >= parseDict['dim_kernel_y'])
+        tilerModel.addConstraint(inputHeightVar >= parseDict['dim_kernel_y'])
+        tilerModel.addConstraint(inputWidthVar >= parseDict['dim_kernel_x'])
♻️ Duplicate comments (1)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)

52-58: Clarify the TODO comment or remove it.

The TODO at line 54 suggests uncertainty about whether to uncomment the constraint weightOutChannelVar == weightOutChannelVar.Max() for SRAM weights. This conditional logic is critical for handling weights in dedicated weight memory.

If skipping the constraint is intentional for SRAM weights, document why. Otherwise, either uncomment it or remove the TODO.

Could you clarify the intended behavior for SRAM weights? Should the constraint be:

  1. Skipped entirely (current behavior)
  2. Uncommented to enforce full channel loading
  3. Something else?
🧹 Nitpick comments (7)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (2)

210-230: Consider adding strict=True to zip() call.

The conditional weight handling logic correctly separates SRAM and non-SRAM paths:

  • SRAM weights: pre-calculate offsets for direct access
  • Non-SRAM weights: include in load schedule like other inputs

However, static analysis flags the zip() call at line 227 for missing strict= parameter. While the lists should have matching lengths by construction, adding strict=True makes this assumption explicit and will fail fast if violated.

Consider applying this change:

-            for cube, load in zip(outputCubes, inputLoadSchedule):
+            for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):

237-266: LGTM: Requantized DW conv constraint correctly extends base class.

The NeurekaRQSDWConv2DTileConstraint class properly layers requantization support onto the base DW convolution constraints by:

  • Calling base addGeometricalConstraint then applying requantAddGeometricalConstraint
  • Calling base serializeTilingSolution then merging in requant load schedules

This design allows requantization to be added without duplicating the base convolution tiling logic.

Consider adding strict=True to the zip() call at line 261 for the same reason as mentioned in the previous comment:

-        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+        } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (2)

190-195: Silence unused tuple-unpack vars.

Prefix unused items with underscores to satisfy linters and reduce noise.

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, _COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

259-262: Use zip(..., strict=True) to guard against length mismatches
Since the project requires Python >=3.10, update your loops as follows:

- for cube, load in zip(outputCubes, inputLoadSchedule):
+ for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
- } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule)]
+ } for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (3)

65-66: Avoid hard-coded 3x3; use parsed kernel dims (optional).

Use dim_kernel_y/x to generalize and reduce magic numbers.

-        tilerModel.addConstraint((outputHeightVar == (effectiveHeight - (3 - 1) - 1) // strides[0] + 1))
-        tilerModel.addConstraint((outputWidthVar == (effectiveWidth - (3 - 1) - 1) // strides[1] + 1))
+        kH, kW = parseDict["dim_kernel_y"], parseDict["dim_kernel_x"]
+        tilerModel.addConstraint((outputHeightVar == (effectiveHeight - (kH - 1) - 1) // strides[0] + 1))
+        tilerModel.addConstraint((outputWidthVar == (effectiveWidth - (kW - 1) - 1) // strides[1] + 1))

161-163: Silence unused tuple-unpack vars.

Prefix unused items to satisfy linters.

-            (BatchOffset, HOffset, WOffset, COffset) = cube.offset
-            (BatchSize, HSize, WSize, CSize) = cube.dims
+            (_BatchOffset, _HOffset, _WOffset, _COffset) = cube.offset
+            (_BatchSize, HSize, WSize, CSize) = cube.dims

229-231: Add strict=True to zip() calls (optional defensive programming).

Since the project requires Python >= 3.10 (per pyproject.toml), use strict=True unconditionally rather than conditional logic. This ensures the iterables always have matching lengths:

  • Line 229: for cube, load in zip(outputCubes, inputLoadSchedule, strict=True):
  • Line 263: for load, rqLoad in zip(tilingSchedule.inputLoadSchedule, requantSchedule, strict=True)]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bfd7fb and 28faade.

📒 Files selected for processing (11)
  • CHANGELOG.md (4 hunks)
  • Deeploy/CommonExtensions/NetworkDeployers/NetworkDeployerWrapper.py (1 hunks)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py (1 hunks)
  • Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1 hunks)
  • Deeploy/Targets/Neureka/Bindings.py (0 hunks)
  • Deeploy/Targets/Neureka/Engine.py (1 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (7 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (6 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (5 hunks)
  • Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (1 hunks)
  • Deeploy/Targets/Neureka/Tiler.py (1 hunks)
💤 Files with no reviewable changes (1)
  • Deeploy/Targets/Neureka/Bindings.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • Deeploy/MemoryLevelExtension/MemoryLevels.py
  • Deeploy/Targets/Neureka/Engine.py
🧰 Additional context used
🧬 Code graph analysis (6)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (3)
Deeploy/DeeployTypes.py (1)
  • NetworkContext (564-1076)
Deeploy/TilingExtension/TilerModel.py (3)
  • TilerModel (34-402)
  • addTensorDimToModel (143-157)
  • getTensorDimVar (131-135)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • AbsoluteHyperRectangle (39-49)
  • HyperRectangle (24-35)
Deeploy/MemoryLevelExtension/NetworkDeployers/MemoryLevelDeployer.py (1)
Deeploy/DeeployTypes.py (4)
  • NetworkContext (564-1076)
  • NetworkOptimizationPass (2263-2283)
  • NetworkOptimizer (2286-2309)
  • StructBuffer (489-509)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (4)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (3)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilingCodegen.py (2)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
Deeploy/Targets/Neureka/Tiler.py (3)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (1)
  • NeurekaDenseConv2DTileConstraint (21-236)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)
  • NeurekaDWConv2DTileConstraint (21-234)
Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (1)
  • NeurekaPWConv2DTileConstraint (21-266)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py (6)
Deeploy/DeeployTypes.py (3)
  • NetworkContext (564-1076)
  • VariableBuffer (232-391)
  • lookup (776-808)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)
  • requantAddGeometricalConstraint (12-32)
  • requantLoadSchedule (35-53)
Deeploy/TilingExtension/TilerModel.py (2)
  • getTensorDimVar (131-135)
  • TilerModel (34-402)
Deeploy/Targets/PULPOpen/TileConstraints/ConvTileConstraint.py (1)
  • computeInputCube (320-347)
Deeploy/TilingExtension/TilingCodegen.py (4)
  • HyperRectangle (24-35)
  • calculateFlatOffsetInBytes (239-242)
  • TilingSchedule (53-122)
  • VariableReplacementScheme (126-158)
Deeploy/TilingExtension/TileConstraint.py (1)
  • extractBaseAddr (56-74)
🪛 Ruff (0.14.0)
Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py

227-227: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


261-261: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py

191-191: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


191-191: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


192-192: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


259-259: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


293-293: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

Deeploy/Targets/Neureka/TileConstraints/NeurekaDenseConstraint.py

161-161: Unpacked variable BatchOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable HOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


161-161: Unpacked variable WOffset is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


162-162: Unpacked variable BatchSize is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


229-229: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


263-263: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (150)
  • GitHub Check: mempool-models / test-runner-mempool
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (simpleRegression, 45000, 30000, 16000, 8, L3) / test-runner-siracusa-tiled (30000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (Attention, 60000, 10000, 5000, 2500, 8, L3) / test-runner-siracusa-tiled (60000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (miniMobileNet, 60000, 12000, 6000, 8, L3) / test-runner-siracusa-tiled (60000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (miniMobileNet, 60000, 12000, 6000, 8, L3) / test-runner-siracusa-tiled (6000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (Attention, 60000, 10000, 5000, 2500, 8, L3) / test-runner-siracusa-tiled (5000)
  • GitHub Check: siracusa-models-tiled-singlebuffer-L3 (miniMobileNet, 60000, 12000, 6000, 8, L3) / test-runner-siracusa-tiled (12000)
  • GitHub Check: siracusa-kernels-tiled-doublebuffer-L2 / test-runner-siracusa-tiled
  • GitHub Check: siracusa-kernels / test-runner-siracusa
  • GitHub Check: siracusa-kernels-tiled-singlebuffer-L2 / test-runner-siracusa-tiled
  • GitHub Check: siracusa-neureka-models-tiled-singlebuffer-L3 (Transformer, 15000, 8, L3) / test-runner-siracusa-neureka-tiled (15000)
  • GitHub Check: siracusa-neureka-models-tiled-singlebuffer-L3 (microLlama/microLlama1, 10000, 8, L3) / test-runner-siracusa-neureka-tiled (10000)
  • GitHub Check: generic-kernels / test-runner-generic
  • GitHub Check: siracusa-neureka-models-tiled-singlebuffer-L3 (miniMobileNet, 2000, 8, L3) / test-runner-siracusa-neureka-tiled (2000)
  • GitHub Check: siracusa-neureka-models-tiled-doublebuffer-L3-wmem (microLlama/microLlama1, 10000, 8, true, L3, t... / test-runner-siracusa-neureka-tiled (10000)
🔇 Additional comments (5)
Deeploy/Targets/Neureka/TileConstraints/RequantHelpers.py (2)

12-32: LGTM: Requantization constraint logic is correct.

The function properly adds mul and add tensor dimensions to the tiler model and enforces channel equality constraints. The implementation correctly uses the TilerModel API to retrieve dimension variables and add constraints.


35-53: LGTM: Load schedule generation is correct.

The function correctly generates per-cube load schedules for mul and add buffers by:

  • Extracting channel offset and size from each output cube
  • Creating HyperRectangles with zeros for all dimensions except the last (channel) dimension
  • Properly preserving the rank of the mul/add tensors as mentioned in the PR objectives

This implementation aligns with the NHWC layout convention used by Neureka.
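
In shape, the per-cube construction is roughly the following self-contained sketch (the HyperRectangle fields and the size-1 non-channel dims are assumptions, not the actual Deeploy code):

from dataclasses import dataclass
from typing import Tuple


@dataclass
class HyperRectangle:
    offset: Tuple[int, ...]
    dims: Tuple[int, ...]


def requantCube(outputCube: HyperRectangle, rank: int) -> HyperRectangle:
    # mul/add are per-output-channel tensors: tile only the last (channel)
    # dimension and keep the original rank, with zero offsets everywhere else.
    cOffset, cSize = outputCube.offset[-1], outputCube.dims[-1]
    return HyperRectangle(offset=(0,) * (rank - 1) + (cOffset,),
                          dims=(1,) * (rank - 1) + (cSize,))


tile = HyperRectangle(offset=(0, 0, 0, 32), dims=(1, 8, 8, 32))  # NHWC output tile
print(requantCube(tile, rank=4))  # HyperRectangle(offset=(0, 0, 0, 32), dims=(1, 1, 1, 32))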

Deeploy/Targets/Neureka/TileConstraints/NeurekaDepthwiseConstraint.py (1)

9-9: LGTM: Required imports added.

The imports for VariableBuffer and requantization helpers (requantAddGeometricalConstraint, requantLoadSchedule) are necessary for the new memory-level checking and requantization functionality introduced in this PR.

Also applies to: 12-12

Deeploy/Targets/Neureka/TileConstraints/NeurekaPointwiseConstraint.py (1)

48-52: SRAM-aware weight constraint is correct.

Using weightBuffer._memoryLevel to switch the channel constraint is the right replacement for the Wmem variants.

Deeploy/Targets/Neureka/Tiler.py (1)

6-13: Wmem bindings removal verified as complete.

Ripgrep confirms zero stale references to removed Wmem variants, MemoryAwareNodeBinding, or NodeMemoryLevelChecker across the codebase. The CHANGELOG documents the intentional removal at lines 78–80, and the import cleanup in Tiler.py is consistent with the broader cleanup. Code changes are sound.

@lukamac lukamac force-pushed the remove-memory-aware-binding branch from e9f8bef to b27c5ac on October 16, 2025 17:36
@lukamac lukamac marked this pull request as ready for review October 16, 2025 18:53
@lukamac lukamac changed the title from "DRAFT: Remove memory-aware node bindings" to "Remove memory-aware node bindings" on Oct 16, 2025