
Optimize Enn runner #18735

Open
Jiseong-oh wants to merge 1 commit into main from mem_zero_copy

Conversation

@Jiseong-oh
Collaborator

@Jiseong-oh Jiseong-oh commented Apr 7, 2026

Summary

  • Open Model with zero-copy using dmabufheap
  • Improve model loading with zero-copy
  • Remove unnecessary comments and headers; rename files

cc @SS-JIA @digantdesai @kimishpatel

@pytorch-bot

pytorch-bot bot commented Apr 7, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18735

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 4 Unrelated Failures

As of commit b68ac07 with merge base b5ae0b9:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 7, 2026
@github-actions

github-actions bot commented Apr 7, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@Jiseong-oh Jiseong-oh added partner: samsung For backend delegation, kernels, demo, etc. from the 3rd-party partner, Samsung module: samsung labels Apr 7, 2026
@Jiseong-oh Jiseong-oh requested a review from SS-JIA April 7, 2026 07:32
- Open Model with zero-copy using dmabufheap
- Improve model loading with zero-copy
- Remove unnecessary comments or headers. Change file name
- remove enn api from default extension/file_data_loader

Co-authored-by: Hoon Choi <hoon98.choi@samsung.com>
Signed-off-by: Jiseong oh <jiseong.oh@samsung.com>
@SS-JIA
Contributor

SS-JIA commented Apr 9, 2026

@claude help review this PR

@claude

claude bot commented Apr 9, 2026

Claude finished @SS-JIA's task in 4m 37s.


PR Review: Optimize Enn runner — Memory Zero-Copy

  • Read all changed files
  • Review ExynosFileDataLoader implementation
  • Review EnnSharedMemoryManager implementation
  • Review changes to ENN API and executor
  • Review CMakeLists.txt changes
  • Review test coverage
  • Post comprehensive review feedback

Overview

This PR introduces zero-copy model loading for the Samsung ENN backend by:

  1. Adding a SharedMemoryManager that allocates buffers via EnnCreateBuffer (dmabuf heap)
  2. Creating a custom FileDataLoader (exynos_file_data_loader.cpp) that uses these shared buffers instead of operator new
  3. Modifying EnnExecutor::initialize to detect when model data is in a shared buffer and open the model via file descriptor (EnnOpenModelFromFd) instead of copying from memory
  4. Adding new ENN API bindings for buffer creation/release/fd-query
  5. Refactoring the ENN API singleton lifecycle

The concept is sound — if the model data is already in a dmabuf-backed buffer, passing the fd directly avoids a copy. However, there are several issues ranging from correctness bugs to architectural concerns.


Critical Issues

1. ret may be used uninitialized in EnnExecutor::initialize (enn_executor.cpp:46-72)

If _sm_instance->query() returns true but the inner if condition (_out->va == binary_buf_addr && ...) is false, then ret is never assigned before the ET_CHECK_OR_RETURN_ERROR(ret == ENN_RET_SUCCESS, ...) check on line 68. This is undefined behavior.

EnnReturn ret;  // uninitialized
...
if (_sm_instance->query(&_out, binary_buf_addr, buf_size)) {
    int fd;
    if (_out->va == binary_buf_addr &&
        !enn_api_inst->EnnGetFileDescriptorFromEnnBuffer(_out, &fd)) {
      ret = enn_api_inst->EnnOpenModelFromFd(fd, &model_id_);
      ...
    }
    // <-- if inner condition is false, ret is still uninitialized
}
if (!model_id_) {  // model_id_ may also be uninitialized garbage if query returned true but inner failed
    ret = enn_api_inst->EnnOpenModelFromMemory(...);
}
// ret may be uninitialized here
ET_CHECK_OR_RETURN_ERROR(ret == ENN_RET_SUCCESS, ...);

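A minimal sketch of one possible fix, using stubs in place of the real ENN API (FakeEnnApi, its return values, and the fixed fd are stand-ins for illustration, not the PR's actual code): initialize both ret and the model id before any branch, and treat a zero model id as "fall back to the memory path", so every control-flow path leaves both values defined.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the ENN API; names echo the PR but the bodies are fake.
enum EnnReturn { ENN_RET_SUCCESS = 0, ENN_RET_FAILED = 1 };

struct FakeEnnApi {
  bool fd_path_available;  // models "query() succeeded and the fd is usable"
  EnnReturn EnnOpenModelFromFd(int /*fd*/, uint32_t* model_id) {
    *model_id = 42;
    return ENN_RET_SUCCESS;
  }
  EnnReturn EnnOpenModelFromMemory(uint32_t* model_id) {
    *model_id = 7;
    return ENN_RET_SUCCESS;
  }
};

EnnReturn open_model(FakeEnnApi& api, uint32_t* model_id_out) {
  EnnReturn ret = ENN_RET_FAILED;  // defined even if no branch assigns it
  uint32_t model_id = 0;           // 0 means "not opened yet"
  if (api.fd_path_available) {
    ret = api.EnnOpenModelFromFd(/*fd=*/3, &model_id);
  }
  if (model_id == 0) {  // zero-copy path unavailable or failed: fall back
    ret = api.EnnOpenModelFromMemory(&model_id);
  }
  *model_id_out = model_id;
  return ret;
}
```

Any equivalent restructuring works; the point is that ret and model_id_ have defined values on every path before the ET_CHECK_OR_RETURN_ERROR check runs.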

2. Test file references non-existent header and namespace (exynos_file_data_loader_test.cpp:1,13)

#include <executorch/backends/samsung/runtime/enn_api_wrapper.h>  // does not exist
...
using executorch::backends::enn::api_wrapper::SharedMemoryManager;  // wrong namespace

The file enn_api_wrapper.h does not exist in this PR or the codebase. The actual header is enn_shared_memory_manager.h and the namespace is executorch::backends::enn::shared_memory_manager::SharedMemoryManager. This will cause a compilation failure and is likely the cause of the CI build failure.


3. EnnCreateBuffer loaded twice in loadApiLib (enn_api_implementation.cpp:74,86)

ENN_LOAD_API_FUNC(libenn_public_api_, EnnCreateBuffer, this) appears on both line 74 and line 86, and both lines are introduced by this PR. The duplicate load is redundant; one of the two should be removed.



Design Concerns

4. Global replacement of FileDataLoader via CMake is fragile (extension/data_loader/CMakeLists.txt:27-34)

if(EXECUTORCH_BUILD_ENN)
  list(REMOVE_ITEM _extension_data_loader__srcs
       "extension/data_loader/file_data_loader.cpp"
  )
  list(APPEND _extension_data_loader__srcs
       "backends/samsung/runtime/extension/exynos_file_data_loader.cpp"
  )
endif()

This globally replaces the FileDataLoader implementation for the entire build when EXECUTORCH_BUILD_ENN is set. This means every user of FileDataLoader in the process will allocate from the ENN shared memory manager (dmabuf), not just the model loading path. This has several implications:

  • Any non-ENN code using FileDataLoader will get dmabuf allocations, which may be slower or have different lifetime semantics
  • The header file (exynos_file_data_loader.h) is a near-copy of the original file_data_loader.h, creating maintenance burden
  • The destructor in exynos_file_data_loader.cpp:56-66 calls std::free() on file_name_ but the name was allocated with ::strdup() (which uses malloc), so this works — but it's inconsistent with the original which used et_aligned_free. If the buffer manager allocates file_name_ differently in the future, this would break.

Consider a more targeted approach: either subclass FileDataLoader or create a separate ExynosDataLoader class rather than replacing the global implementation.
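The targeted approach could look roughly like the sketch below. This is a shape sketch only: DataLoaderBase, its allocate() hook, and the used_shared_pool flag are hypothetical names, and the real executorch FileDataLoader does not currently expose a virtual allocation hook, so adopting this would mean adding one (or writing a parallel loader class).

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical base loader with an overridable allocation hook.
class DataLoaderBase {
 public:
  virtual ~DataLoaderBase() = default;
  void* load(std::size_t size) { return allocate(size); }

 protected:
  virtual void* allocate(std::size_t size) { return std::malloc(size); }
};

// Only this subclass routes allocations to the ENN shared pool; every other
// user of the base loader keeps plain heap allocation.
class ExynosDataLoader : public DataLoaderBase {
 public:
  bool used_shared_pool = false;

 protected:
  void* allocate(std::size_t size) override {
    used_shared_pool = true;  // stand-in for SharedMemoryManager::allocate()
    return std::malloc(size);
  }
};
```

With this shape, the ENN runner constructs an ExynosDataLoader explicitly where it wants dmabuf-backed buffers, and the CMake REMOVE_ITEM/APPEND replacement disappears entirely.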

5. using namespace in header file (enn_shared_memory_manager.h:17)

using namespace torch::executor::enn;

using namespace directives in header files pollute the namespace of every translation unit that includes the header. This should be avoided. Use fully qualified names or move the using directive inside the implementation file.

6. Double initialization of ENN API (enn_executor_runner.cpp:121-127 and enn_api_implementation.cpp:41-50)

The EnnApi constructor now calls EnnInitialize() internally:

EnnApi::EnnApi() {
  auto status = loadApiLib();
  if (status == Error::Ok) {
    EnnInitialize();  // called in constructor
    ...

But the runner also calls it explicitly:

void exynos_npu_init() {
  EnnApi::getEnnApiInstance()->EnnInitialize();  // called again
}

Similarly, EnnDeinitialize is called both in the destructor and in exynos_npu_deinit(). Whether this is safe depends on the ENN API's tolerance for redundant init/deinit calls. The old EnnInitialize call in EnnExecutor::initialize was removed and the constructor now covers that path, so the executor is fine, but the runner path still initializes twice.
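One way to make the redundant calls harmless is to guard initialization with std::once_flag; sketched below with a stub class (EnnApiStub and real_init_count are inventions for illustration, not the PR's API):

```cpp
#include <mutex>

// Stub singleton mirroring the PR's shape. EnnInitialize() becomes idempotent,
// so both the constructor and an explicit exynos_npu_init() call are safe.
class EnnApiStub {
 public:
  static EnnApiStub* getEnnApiInstance() {
    static EnnApiStub instance;
    return &instance;
  }
  void EnnInitialize() {
    // The lambda runs at most once per process, however many callers race here.
    std::call_once(init_flag_, [this] { ++real_init_count_; });
  }
  int real_init_count() const { return real_init_count_; }

 private:
  std::once_flag init_flag_;
  int real_init_count_ = 0;
};
```

Deinitialization is harder to make idempotent this way, since a fresh init after a deinit must still work; if the runner genuinely needs explicit teardown, reference-counting the init/deinit pair is the safer route.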

7. Singleton SharedMemoryManager is never cleaned up (enn_shared_memory_manager.cpp:28-31)

The SharedMemoryManager is a Meyers singleton with static lifetime. Its EnnBufferPtrList may still contain entries when the process exits. Since the destructor is defaulted (~SharedMemoryManager() = default), these buffers are never released. If ENN API resources need explicit cleanup, this could be a resource leak.
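If explicit cleanup is needed, the destructor could release whatever is still registered; below is a sketch with malloc/free standing in for EnnCreateBuffer/EnnReleaseBuffer (the class and method names here are illustrative, not the PR's):

```cpp
#include <cstdlib>
#include <vector>

class SharedMemoryManagerSketch {
 public:
  static SharedMemoryManagerSketch& instance() {
    static SharedMemoryManagerSketch mgr;  // Meyers singleton, as in the PR
    return mgr;
  }
  void* allocate(std::size_t size) {
    void* p = std::malloc(size);  // stand-in for EnnCreateBuffer
    buffers_.push_back(p);
    return p;
  }
  ~SharedMemoryManagerSketch() {
    // Release anything callers never freed before process exit.
    for (void* p : buffers_) std::free(p);  // stand-in for EnnReleaseBuffer
    buffers_.clear();
  }

 private:
  std::vector<void*> buffers_;
};
```

One caveat: the dlopen'd ENN library may already be unloaded by the time static destructors run, so an explicit shutdown hook that drains the list may be safer than relying on the singleton's destructor.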


Minor Issues

8. get_size_from_fd is defined but never used (enn_executor.cpp:27-39)

This function is defined in the file but not called anywhere. Dead code should be removed.

9. free(void* ptr) calls free(ptr, {}) with a default-constructed align_val_t (enn_shared_memory_manager.cpp:74-76)

void SharedMemoryManager::free(void* ptr) {
  free(ptr, {});
}

The alignment parameter in free(void* ptr, std::align_val_t alignment) is unused (the function never references alignment), so this works, but the overload is misleading. Consider removing the alignment parameter from the free path if it's not needed.

10. ET_LOG format specifier mismatch (enn_shared_memory_manager.cpp:85)

ET_LOG(Info, "va(%p), size(%d), offset(%d) is erased from LUT",
    ptr, (*it)->size, (*it)->offset);

(*it)->size and (*it)->offset are uint32_t but logged with %d (signed int). Use %u or PRIu32 for correctness.

11. ET_LOG format specifier mismatch (enn_executor.cpp:48)

ET_LOG(Info, "Start to open model %p, %ld", binary_buf_addr, buf_size);

buf_size is size_t, which should use %zu, not %ld.
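Both mismatches have the same shape of fix. A small self-contained demonstration (format_lut_entry is a hypothetical helper echoing the log lines, not code from the PR):

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a LUT entry using the correct conversions: PRIu32 expands to the
// right specifier for uint32_t on any platform, and %zu is the dedicated
// specifier for size_t. Using %d / %ld here is undefined behavior on
// platforms where the widths differ.
std::string format_lut_entry(uint32_t size, uint32_t offset, size_t buf_size) {
  char out[128];
  std::snprintf(out, sizeof(out),
                "size(%" PRIu32 "), offset(%" PRIu32 "), buf_size(%zu)",
                size, offset, buf_size);
  return out;
}
```

The same substitutions apply verbatim inside ET_LOG, since it forwards printf-style format strings.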

12. Missing __hexagon__ in pread guard (exynos_file_data_loader.cpp:30)

The original file_data_loader.cpp includes defined(__hexagon__) in the ET_HAVE_PREAD guard, but the copy in exynos_file_data_loader.cpp only has defined(__xtensa__). This is a divergence from the upstream source that could cause issues on Hexagon platforms.

13. Unnecessary includes (enn_executor.cpp:14-18)

#include <android/log.h>
#include <sys/mman.h>

android/log.h is likely already pulled in through the logging header, and sys/mman.h is not used in this file (no mmap calls). These should be removed.


Summary

The zero-copy optimization concept is good, but the PR needs work before merging:

  • Must fix: Uninitialized ret variable (UB), missing enn_api_wrapper.h header (won't compile), duplicate EnnCreateBuffer load
  • Should address: Global FileDataLoader replacement strategy, using namespace in header, double ENN init/deinit
  • Nice to fix: Dead code, format specifier mismatches, unnecessary includes

