Conversation

wsttiger (Collaborator)

Add TensorRT Decoder Plugin for Quantum Error Correction

Overview

This PR introduces a new TensorRT-based decoder plugin for quantum error correction, leveraging NVIDIA TensorRT for accelerated neural network inference in QEC applications.

Key Features

  • TensorRT Integration: Full TensorRT runtime integration with support for both ONNX model loading and pre-built engine loading
  • Flexible Precision Support: Configurable precision modes (fp16, bf16, int8, fp8, tf32, best) with automatic hardware capability detection
  • Memory Management: Efficient CUDA memory allocation and stream-based execution
  • Parameter Validation: Comprehensive input validation with clear error messages (a sketch of such a check follows this list)
  • Python Utilities: ONNX to TensorRT engine conversion script for model preprocessing
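
As a rough illustration of the parameter validation described above, the sketch below shows one way such a check could look. It is an assumption about the shape of the code rather than the PR's actual implementation; the grounded pieces are the cudaqx::heterogeneous_map type and the onnx_load_path / engine_load_path keys used in the usage example further down.

#include <stdexcept>
// Assumed include path for cudaqx::heterogeneous_map.
#include "cuda-qx/core/heterogeneous_map.h"

// Hypothetical helper: require exactly one model source to be specified
// (assumes heterogeneous_map exposes a contains() lookup).
void validate_trt_params(const cudaqx::heterogeneous_map &params) {
  bool has_onnx = params.contains("onnx_load_path");
  bool has_engine = params.contains("engine_load_path");
  if (has_onnx == has_engine)
    throw std::runtime_error(
        "trt_decoder: provide exactly one of 'onnx_load_path' or "
        "'engine_load_path'");
}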

Technical Implementation

  • Core Decoder Class: trt_decoder implementing the decoder interface with TensorRT backend
  • Hardware Detection: Automatic GPU capability detection for optimal precision selection (see the sketch after this list)
  • Error Handling: Robust error handling with graceful fallbacks and informative error messages
  • Plugin Architecture: CMake-based plugin system with conditional TensorRT linking
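
To make the capability-based precision selection concrete, here is a minimal sketch built on the CUDA runtime API. The compute-capability thresholds and the function name are illustrative assumptions, not the plugin's actual logic.

#include <cuda_runtime_api.h>
#include <stdexcept>
#include <string>

// Illustrative only: map the device's compute capability to a default
// precision string (the SM thresholds here are assumptions).
std::string pick_default_precision(int device = 0) {
  cudaDeviceProp prop{};
  if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
    throw std::runtime_error("Failed to query CUDA device properties");
  int sm = prop.major * 10 + prop.minor;
  if (sm >= 89) return "fp8";  // e.g. Ada/Hopper and newer
  if (sm >= 80) return "bf16"; // e.g. Ampere and newer
  return "fp16";
}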

Files Added/Modified

  • libs/qec/include/cudaq/qec/trt_decoder_internal.h - Internal API declarations
  • libs/qec/lib/decoders/plugins/trt_decoder/trt_decoder.cpp - Main decoder implementation
  • libs/qec/lib/decoders/plugins/trt_decoder/CMakeLists.txt - Plugin build configuration
  • libs/qec/python/cudaq_qec/plugins/tensorrt_utils/build_engine_from_onnx.py - Python utility
  • libs/qec/unittests/test_trt_decoder.cpp - Comprehensive unit tests
  • Updated CMakeLists.txt files for integration

Testing

  • ✅ All 8 unit tests passing
  • Parameter validation tests (a hypothetical example follows this list)
  • File loading utility tests
  • Edge case handling tests
  • Error condition testing
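
For a sense of what the parameter validation tests could look like, here is a hypothetical GoogleTest case. The test name, the type and shape of H, and the expectation that construction throws are assumptions; the grounded parts are the trt_decoder(H, params) constructor and the parameter keys from the usage example below.

#include <gtest/gtest.h>

// Hypothetical test: constructing the decoder without a model source should fail.
TEST(TrtDecoderValidation, RejectsMissingModelPath) {
  cudaqx::tensor<uint8_t> H({3, 7}); // toy parity-check matrix (shape ctor assumed)
  cudaqx::heterogeneous_map params;  // neither onnx_load_path nor engine_load_path
  EXPECT_ANY_THROW(trt_decoder(H, params));
}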

Usage Example

// Load from ONNX model
cudaqx::heterogeneous_map params;
params.insert("onnx_load_path", "model.onnx");
params.insert("precision", "fp16");
auto decoder = std::make_unique<trt_decoder>(H, params);

// Or load pre-built engine
params.clear();
params.insert("engine_load_path", "model.trt");
auto decoder = std::make_unique<trt_decoder>(H, params);
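
Continuing the example, the decoder can then be queried like any other decoder in the framework. The snippet below is a sketch: it assumes the standard decoder interface in which decode() takes a floating-point syndrome vector and returns a result carrying a convergence flag and soft error estimates, and num_detectors stands in for the syndrome length implied by H.

// Run inference on a syndrome (interface details assumed as noted above).
std::vector<double> syndrome(num_detectors, 0.0); // num_detectors = rows of H
syndrome[3] = 1.0;                                // mark one fired detector
auto result = decoder->decode(syndrome);
if (result.converged) {
  // result.result holds the decoder's per-bit error estimates.
}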

Dependencies

  • TensorRT 10.13.3.9+
  • CUDA 12.0+
  • NVIDIA GPU with appropriate compute capability

Performance Benefits

  • GPU-accelerated inference for QEC decoding
  • Optimized precision selection based on hardware capabilities
  • Efficient memory usage with CUDA streams
  • Reduced latency compared to CPU-based decoders

This implementation provides a production-ready TensorRT decoder plugin that can significantly accelerate quantum error correction workflows while maintaining compatibility with the existing CUDA-Q QEC framework.

copy-pr-bot bot commented Sep 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

- Add trt_decoder class implementing TensorRT-accelerated inference
- Support both ONNX model loading and pre-built engine loading
- Include precision configuration (fp16, bf16, int8, fp8, tf32, best)
- Add hardware platform detection for capability-based precision selection
- Implement CUDA memory management and stream-based execution
- Add Python utility script for ONNX to TensorRT engine conversion
- Update CMakeLists.txt to build TensorRT decoder plugin
- Add comprehensive parameter validation and error handling
Signed-off-by: Scott Thornton <[email protected]>
Comment on libs/qec/python/cudaq_qec/plugins/tensorrt_utils/build_engine_from_onnx.py:

import tensorrt as trt


def build_engine(onnx_file,
Collaborator

Is this file exposed as part of the wheel such that regular users will be able to use this file?

Collaborator Author

I think I would rather this file be in the docs as an example of how to convert an ONNX file to a TRT engine.

Collaborator

> I think I would rather this file be in the docs as an example of how to convert an ONNX file to a TRT engine.

Is there a reason we can't just expose this utility function in our wheel? Assuming it's possible, if we are going to reference it in our docs, it would be better to just use it from the wheel rather than ask users to copy/paste code.

I, Scott Thornton <[email protected]>, hereby add my Signed-off-by to this commit: 9e97e26

Signed-off-by: Scott Thornton <[email protected]>
@wsttiger (Collaborator Author)

/ok to test fb16b36

copy-pr-bot bot commented Oct 16, 2025

> /ok to test fb16b36

@wsttiger, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@wsttiger (Collaborator Author)

/ok to test c9e563f

@wsttiger (Collaborator Author)

/ok to test 42c2b32

@wsttiger (Collaborator Author)

/ok to test 5ad505b

Signed-off-by: Scott Thornton <[email protected]>
@wsttiger (Collaborator Author)

/ok to test 2d08b88

Signed-off-by: Scott Thornton <[email protected]>
@wsttiger (Collaborator Author)

/ok to test 4defcfd

@wsttiger (Collaborator Author)

/ok to test 62cdbac

@wsttiger (Collaborator Author)

/ok to test d4e79a9

@wsttiger (Collaborator Author)

/ok to test d8489f7

@wsttiger (Collaborator Author)

/ok to test 6ba9191

@wsttiger (Collaborator Author)

/ok to test eea3198

@wsttiger (Collaborator Author)

/ok to test 33359f5

Comment on lines +72 to +76
wget https://developer.download.nvidia.com/compute/tensorrt/10.13.3/local_installers/nv-tensorrt-local-repo-ubuntu2404-10.13.3-cuda-12.9_1.0-1_amd64.deb
dpkg -i nv-tensorrt-local-repo-ubuntu2404-10.13.3-cuda-12.9_1.0-1_amd64.deb
cp /var/nv-tensorrt-local-repo-ubuntu2404-10.13.3-cuda-12.9/nv-tensorrt-local-4B177B4F-keyring.gpg /usr/share/keyrings/
apt update
apt install -y tensorrt-dev
Collaborator

IIRC, as written right now, this installs a mix of CUDA 13 and CUDA 12 stuff into the dev image. This might work better.

# Generate a pin preferences file to specify the desired CUDA version for tensorrt.
# The "cache search" will make it propagate to all of tensorrt's depdendencies.
apt-cache search tensorrt | awk '{print "Package: "$1"\nPin: version *+cuda12.9\nPin-Priority: 1001\n"}' | tee /etc/apt/preferences.d/tensorrt-cuda12.9.pref > /dev/null
apt update
apt install tensorrt tensorrt-dev

For CUDA 13, it would need to be updated to cuda13.0 instead of cuda12.9.

Collaborator

In actuality, installing both the tensorrt package and the tensorrt-dev package is a bit redundant. The tensorrt-dev package is the only thing our dev image really needs, and I suspect the tensorrt-lib package (which is much smaller) is the only thing needed for our released Docker image. (And hopefully pre-existing Python packages cover the Python environment.)

In other words, I think we can simply do this:

apt-cache search tensorrt | awk '{print "Package: "$1"\nPin: version *+cuda12.9\nPin-Priority: 1001\n"}' | tee /etc/apt/preferences.d/tensorrt-cuda12.9.pref > /dev/null
apt update
apt install tensorrt-dev

IS_ARM = _is_arm_architecture()

# Test inputs - 100 test cases with 24 detectors each
TEST_INPUTS = [[
Collaborator

These test cases span 500 lines, and it looks like many of them are redundant and/or contain only a single non-zero syndrome bit. If those are the intended test vectors, would it be possible to collapse them into something like "initialize with all 0's, and then just set the 1's where you want them"?
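
A sketch of the collapse being suggested, written in C++ for consistency with the other examples in this thread (the detector indices are hypothetical; only the 24-detector width comes from the excerpt above):

#include <cstdint>
#include <vector>

// Build sparse test syndromes: start from all zeros, then set the 1's.
std::vector<std::vector<std::uint8_t>> make_test_inputs() {
  std::vector<std::vector<std::uint8_t>> inputs;
  for (int fired : {0, 5, 17, 23}) {           // hypothetical detector indices
    std::vector<std::uint8_t> syndrome(24, 0); // 24 detectors, all zero
    syndrome[fired] = 1;                       // single non-zero entry
    inputs.push_back(std::move(syndrome));
  }
  return inputs;
}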

add_dependencies(CUDAQXQECUnitTests test_qec)
gtest_discover_tests(test_qec)

# TensorRT decoder test is only built for x86 architectures
Collaborator

Now that we support CUDA 13, we should be able to do x86 and ARM for that CUDA version.

@wsttiger (Collaborator Author)

/ok to test 5e38f6a
