Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIA's TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorch's Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate into the JIT runtime seamlessly. After compilation, using the optimized graph should feel no different from running a TorchScript module. You also have access to TensorRT's suite of configurations at compile time, so you are able to specify operating precision (FP32/FP16/INT8) and other settings for your module.
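For a sense of what the explicit AOT compile step looks like, here is a minimal C++ sketch. It assumes the C++ API described in the Torch-TensorRT documentation (`torch_tensorrt::Input`, `torch_tensorrt::torchscript::CompileSpec`, `torch_tensorrt::torchscript::compile`); the header path, namespaces, `model.ts` path, and input shape are placeholders that may differ for your release and model.

```
#include <torch/script.h>
#include "torch_tensorrt/torch_tensorrt.h" // assumed public header name

int main() {
  // Load a standard TorchScript module produced by torch.jit.trace/script
  auto mod = torch::jit::load("model.ts"); // placeholder path
  mod.to(torch::kCUDA);
  mod.eval();

  // Describe the expected input shape and the precisions TensorRT may use
  auto spec = torch_tensorrt::torchscript::CompileSpec(
      {torch_tensorrt::Input(std::vector<int64_t>{1, 3, 224, 224})});
  spec.enabled_precisions.insert(torch::kHalf); // allow FP16 kernels alongside FP32

  // Explicit AOT compile step: returns a new TorchScript module backed by a TensorRT engine
  auto trt_mod = torch_tensorrt::torchscript::compile(mod, spec);

  // The result is used (and saved) exactly like any other TorchScript module
  trt_mod.save("trt_model.ts");
  return 0;
}
```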
More Information / System Architecture:
> Note: Refer to the [NVIDIA NGC container](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-pytorch) for PyTorch libraries on JetPack.
### Dependencies
These are the dependencies used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.
- Bazel 4.0.0
- Libtorch 1.9.1 (built with CUDA 11.1)
```
bazel run //cpp/bin/torchtrtc -- $(realpath <PATH TO GRAPH>) out.ts <input-size>
```
## Compiling the Python Package
To build wheel files for different Python versions, first build the Dockerfile in `//py`, then run the following command:
```
docker run -it -v$(pwd)/..:/workspace/Torch-TensorRT build_torch_tensorrt_wheel /bin/bash /workspace/Torch-TensorRT/py/build_whl.sh
```
Python compilation expects you to use the tarball-based compilation strategy from above.
## How do I add support for a new op...
### In Torch-TensorRT?
Thanks for wanting to contribute! There are two main ways to handle supporting a new op: either write a converter for the op from scratch and register it in the NodeConverterRegistry, or, if you can map the op to a set of ops that already have converters, write a graph rewrite pass which replaces your new op with an equivalent subgraph of supported ops. Graph rewriting is preferred because then we do not need to maintain a large library of op converters. Also look at the various op support trackers in the [issues](https://github.com/NVIDIA/Torch-TensorRT/issues) for information on the support status of various operators.
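As an illustration of the converter route, here is a sketch modeled on the registration pattern used in `//core/conversion/converters` (see that README for the authoritative example). It is not standalone code: `RegisterNodeConverters`, `ConversionCtx`, `args`, and `util::node_info` are types and helpers from the converter library, and the exact way outputs are recorded (shown here as writing to `value_tensor_map`) may differ between versions.

```
// Sketch of an aten::relu converter registered in the NodeConverterRegistry,
// living inside the converter library (e.g. core/conversion/converters/impl/).
auto relu_registration = RegisterNodeConverters().pattern(
    {"aten::relu(Tensor self) -> (Tensor)",
     [](ConversionCtx* ctx, const torch::jit::Node* n, args& args) -> bool {
       // Input 0 is an ITensor produced by an already-converted node
       auto in = args[0].ITensor();

       // Add the equivalent TensorRT layer to the network under construction
       auto relu_layer = ctx->net->addActivation(*in, nvinfer1::ActivationType::kRELU);
       relu_layer->setName(util::node_info(n).c_str());

       // Associate the node's output Value with the layer's output ITensor
       auto out = relu_layer->getOutput(0);
       out->setName(n->outputs()[0]->debugName().c_str());
       ctx->value_tensor_map[n->outputs()[0]] = out;
       return true;
     }});
```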
### In my application?
# Torch-TensorRT Core
The Torch-TensorRT Core is the main graph analysis library. It processes a TorchScript module, converting method graphs to TensorRT engines and returning a new, equivalent module which, when run, passes inputs through a TensorRT engine.
## Stages
> Basic rule of thumb for organization: if the output of a component is a modified block, then it belongs in lowering; if the output is a TRT engine block, then it belongs in conversion.
## Lowering Passes
#### Call Method Insertions
Graphs from prim::CallMethods need to be inserted into the graph or used to segment the graph into convertible subgraphs.
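A minimal sketch of that insertion step, assuming the standard PyTorch JIT inlining pass is what flattens the calls (the actual lowering code may wrap or extend this):

```
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/passes/inliner.h>

// Recursively replaces prim::CallMethod / prim::CallFunction nodes with the
// bodies of the called graphs, so later stages see one flat graph.
void InlineCalls(std::shared_ptr<torch::jit::Graph>& g) {
  torch::jit::Inline(*g);
}
```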
### Torch-TensorRT Lowering
To simplify conversion, we can use the PyTorch JIT Subgraph Rewriter to simplify the set of subgraphs that need explicit TensorRT converters. This means we can aim for closer to 1->1 op conversion instead of looking for applicable subgraphs, limiting the number of converters and reducing the size of each converter.
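For example, a rewrite pass of this kind can be built on `torch::jit::SubgraphRewriter`. The sketch below unpacks `aten::addmm` into `aten::matmul` plus `aten::add` so that only those two converters are needed; it is a simplified illustration (it assumes `beta == 1`), not the exact pass shipped in the lowering library.

```
#include <string>
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/passes/subgraph_rewrite.h>

// Replace aten::addmm(bias, x, w, beta, alpha) with matmul + add.
// Simplified: assumes beta == 1, so the bias is added unscaled.
void UnpackAddMM(std::shared_ptr<torch::jit::Graph>& graph) {
  std::string addmm_pattern = R"IR(
    graph(%b, %x, %w, %beta, %alpha):
      %out : Tensor = aten::addmm(%b, %x, %w, %beta, %alpha)
      return (%out))IR";

  std::string mm_add_pattern = R"IR(
    graph(%b, %x, %w, %beta, %alpha):
      %mm : Tensor = aten::matmul(%x, %w)
      %out : Tensor = aten::add(%b, %mm, %alpha)
      return (%out))IR";

  torch::jit::SubgraphRewriter rewriter;
  rewriter.RegisterRewritePattern(addmm_pattern, mm_add_pattern);
  rewriter.runOnGraph(graph);
}
```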
## Conversion Phase
Once the graph has been simplified to a form that's easy to convert, we then set up a conversion context to manage the construction of a TensorRT INetworkDefinition from the block's nodes. The conversion context records the set of converted nodes, block inputs and outputs, and other information about the conversion of the graph. This data is then used to help converters link together layers and also to hold build-time information, like weights, required to construct the engine. After the context is created, the block converter starts iterating through the list of nodes. For each node, the converter will look at its inputs and assemble a dictionary of resources to pass to the converter. Inputs can be in a couple of states (a simplified sketch of this lookup follows the list):
- The input is a block parameter
    In this case the input should already have been stored as an IValue in the conversion context's evaluated_value_map. The conversion stage will add the IValue to the list of args for the converter.
- The input is an output of a node that has already been converted
    In this case the ITensor for the output has been added to the value_tensor_map. The conversion stage will add the ITensor to the list of args for the converter.
- The input is from a node that produces a static value
    There are nodes that produce static values, typically used to store parameters for operators. We need to evaluate these nodes at conversion time to be able to convert the op. The converter will look for a node evaluator in the evaluator registry and run it on the node. The IValue produced will be entered into the conversion context's evaluated_value_map and added to the list of args for the converter.
- The input is from a node that has not been converted
    Torch-TensorRT will error here.
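The sketch below is purely illustrative of the four cases above: the real implementation lives in `//core/conversion`, and `Arg`, the simplified `ConversionCtx`, and the `IsEvaluatable`/`EvaluateNode` hooks are hypothetical stand-ins rather than the library's actual API.

```
#include <stdexcept>
#include <unordered_map>
#include <variant>
#include <vector>
#include <torch/csrc/jit/ir/ir.h>
#include <NvInfer.h>

// Simplified stand-ins for the real conversion context and arg type.
using Arg = std::variant<nvinfer1::ITensor*, c10::IValue>;
struct ConversionCtx {
  std::unordered_map<const torch::jit::Value*, nvinfer1::ITensor*> value_tensor_map;
  std::unordered_map<const torch::jit::Value*, c10::IValue> evaluated_value_map;
};

// Hypothetical hooks into the evaluator registry (names are illustrative).
bool IsEvaluatable(const torch::jit::Node* n);
c10::IValue EvaluateNode(ConversionCtx* ctx, const torch::jit::Node* n);

// How a block converter might assemble the args handed to a node converter.
std::vector<Arg> GatherArgs(ConversionCtx* ctx, const torch::jit::Node* n) {
  std::vector<Arg> args;
  for (auto input : n->inputs()) {
    if (ctx->value_tensor_map.count(input)) {
      // Output of an already-converted node: pass the ITensor through
      args.emplace_back(ctx->value_tensor_map[input]);
    } else if (ctx->evaluated_value_map.count(input)) {
      // Block parameter (or previously evaluated value): pass the IValue
      args.emplace_back(ctx->evaluated_value_map[input]);
    } else if (IsEvaluatable(input->node())) {
      // Static producer: evaluate now, cache the IValue, and pass it along
      auto ival = EvaluateNode(ctx, input->node());
      ctx->evaluated_value_map[input] = ival;
      args.emplace_back(ival);
    } else {
      // Unconverted, non-static producer: this is the error case
      throw std::runtime_error("Unable to resolve input: " + input->debugName());
    }
  }
  return args;
}
```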
### Node Evaluation
There are some nodes that contain static data and are resources for operations. These can be evaluated at conversion time so that their values can be used during node conversion. In theory, any node kind can have a conversion-time evaluator as long as it produces a static IValue. This IValue will be stored in the conversion context so it can be consumed by any node that takes the evaluated node as an input.
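For instance, the simplest static producers are `prim::Constant` nodes, whose payload the JIT can already hand back as an IValue; evaluators for richer node kinds (list construction, arithmetic on constants, and so on) generalize the same idea. A small sketch using the public JIT helper:

```
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/ir/constants.h>

// Returns the static IValue behind a prim::Constant output, if there is one.
// In the full pipeline the result would be cached in evaluated_value_map.
c10::optional<c10::IValue> EvalConstant(const torch::jit::Value* v) {
  if (v->node()->kind() == c10::prim::Constant) {
    return torch::jit::toIValue(v); // resolves the constant's payload
  }
  return c10::nullopt;
}
```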
### Converters
See the README in //core/conversion/converters for more information.
A library for plugins (custom layers) used in a network. This component of the Torch-TensorRT library builds a separate library called `libtorchtrt_plugins.so`.
At a high level, the Torch-TensorRT plugin library interface does the following:
- Uses the TensorRT plugin registry as the main data structure to access all plugins.
- Automatically registers TensorRT plugins with an empty namespace.
- Automatically registers Torch-TensorRT plugins with the `"torch_tensorrt"` namespace.
Here is a brief description of the functionality of each file:
- `plugins.h` - Provides a macro to register any plugins with the `"torch_tensorrt"` namespace.
- `register_plugins.cpp` - Main registry class which initializes both `libnvinfer` plugins and Torch-TensorRT plugins (`Interpolate` and `Normalize`).
- `impl/interpolate_plugin.cpp` - Core implementation of the interpolate plugin. Uses PyTorch kernels during execution.
- `impl/normalize_plugin.cpp` - Core implementation of the normalize plugin. Uses PyTorch kernels during execution.
We can access a plugin via the plugin name and namespace in which it is registered.
For example, to access the Interpolate plugin, we can use
```
auto creator = getPluginRegistry()->getPluginCreator("Interpolate", "1", "torch_tensorrt");
auto interpolate_plugin = creator->createPlugin(name, &fc); // fc is the collection of parameters passed to the plugin.
```
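The snippet above assumes an already-populated `fc`; below is a hedged sketch of building one with TensorRT's `PluginField` API. The field name `"size"` and its contents are placeholders: the fields the Interpolate plugin actually expects are defined by its creator in `impl/interpolate_plugin.cpp`.

```
#include <NvInfer.h>
#include <vector>

// Assemble the PluginFieldCollection handed to createPlugin().
// Field names and types here are placeholders, not the plugin's real schema.
std::vector<int32_t> out_size = {64, 64};
std::vector<nvinfer1::PluginField> fields;
fields.emplace_back("size", out_size.data(),
                    nvinfer1::PluginFieldType::kINT32,
                    static_cast<int32_t>(out_size.size()));

nvinfer1::PluginFieldCollection fc;
fc.nbFields = static_cast<int32_t>(fields.size());
fc.fields = fields.data();
```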
### If you have your own plugin
If you'd like to compile your plugin with Torch-TensorRT (a hypothetical creator skeleton is sketched after these steps):
- Add your implementation to the `impl` directory
- Add a call to `REGISTER_TRTORCH_PLUGINS(MyPluginCreator)` in `register_plugins.cpp`. `MyPluginCreator` is the plugin creator class which creates your plugin. By adding this to `register_plugins.cpp`, your plugin will be initialized and made accessible (added to the TensorRT plugin registry) when the `libtorchtrt_plugins.so` library is loaded.
- Update the `BUILD` file with your plugin files and dependencies.
- Implement a converter op which makes use of your plugin.
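To make the first two steps concrete, here is a hedged skeleton of a creator class for the `impl` directory. It follows the `nvinfer1::IPluginCreator` interface as of TensorRT 8 (`noexcept`-qualified overrides — adjust for the TensorRT version you build against); the class name is a placeholder and the construction bodies are stubs.

```
#include <NvInfer.h>
#include <string>

// Hypothetical creator skeleton; MyPlugin would be your IPluginV2DynamicExt
// implementation living alongside this class in the impl directory.
class MyPluginCreator : public nvinfer1::IPluginCreator {
 public:
  const char* getPluginName() const noexcept override { return "MyPlugin"; }
  const char* getPluginVersion() const noexcept override { return "1"; }
  const nvinfer1::PluginFieldCollection* getFieldNames() noexcept override {
    return &field_collection_;
  }
  nvinfer1::IPluginV2* createPlugin(
      const char* name, const nvinfer1::PluginFieldCollection* fc) noexcept override {
    // Parse fc and return `new MyPlugin(...)`
    return nullptr; // stub
  }
  nvinfer1::IPluginV2* deserializePlugin(
      const char* name, const void* serial_data, size_t serial_length) noexcept override {
    // Rebuild the plugin from its serialized form
    return nullptr; // stub
  }
  void setPluginNamespace(const char* ns) noexcept override { namespace_ = ns; }
  const char* getPluginNamespace() const noexcept override { return namespace_.c_str(); }

 private:
  std::string namespace_;
  nvinfer1::PluginFieldCollection field_collection_{};
};

// register_plugins.cpp would then contain:
// REGISTER_TRTORCH_PLUGINS(MyPluginCreator);
```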
Once you've completed the above steps, upon successful compilation of the Torch-TensorRT library, your plugin should be available in `libtorchtrt_plugins.so`.
A sample runtime application showing how to run a network with plugins can be found <a href="https://github.com/NVIDIA/Torch-TensorRT/tree/master/examples/trtorchrt_example">here</a>.