Problem
Model publishers do not always assign the correct Model Architecture tags.
We want to detect when a model is a TensorRT engine and can be run via the TRT-LLM inference runtime.
isTensorrtModel Rules
At least one file ending in `.engine`.
(Optional) At least one file named `config.json`. Caveat: by design, model builders can rename this file.
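The rules above could be sketched as a small helper. This is a hypothetical sketch, not an implementation from the repo; the function name and the `files` list (flat list of file names in a model repo) are assumptions.

```python
# Hypothetical sketch of the isTensorrtModel heuristic described above.
# `files` is assumed to be a flat list of file names in the model repo.

def is_tensorrt_engine_model(files):
    """Return True if the repo looks like a TensorRT-LLM engine."""
    has_engine = any(name.endswith(".engine") for name in files)
    # config.json is optional evidence only, since builders can rename it,
    # so its presence could raise confidence but should not gate detection.
    return has_engine

# Example:
# is_tensorrt_engine_model(["rank0.engine", "config.json"])  -> True
# is_tensorrt_engine_model(["model.onnx", "config.json"])    -> False
```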
Engine compatibility rules
For context, TensorRT models are specific to:
GPU architecture, i.e. models compiled for Ada will only run on Ada
TRT-LLM release, i.e. models compiled with release v0.9.0 must run on v0.9.0
OS (optional): as of v0.9.0, models are cross-OS compatible, though we're still testing as it could be flaky
Number of GPUs, i.e. GPU topology. This can actually be detected by counting the engine files.
Unfortunately, as far as we know, `config.json` and other metadata files do not record the hardware/build-time configuration once a model is built, so model authors will have to specify this info themselves.
^ We'll update this info as it changes, and as we learn more 😄 .
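The compatibility rules above can be sketched as a check against author-supplied metadata. This is a hedged sketch: the `build_meta`/`local` dict shapes and key names are assumptions, since (as noted above) `config.json` does not actually record this information.

```python
# Hypothetical sketch of the compatibility rules above. The metadata
# shape is assumed; model authors would have to supply these fields.

def infer_gpu_count(files):
    """One engine file is typically produced per GPU rank, so counting
    them recovers the GPU topology the model was built for."""
    return sum(1 for name in files if name.endswith(".engine"))

def is_compatible(build_meta, local):
    """Compare author-supplied build metadata against the local machine."""
    return (
        build_meta["gpu_arch"] == local["gpu_arch"]                # e.g. "ada"
        and build_meta["trtllm_version"] == local["trtllm_version"]  # e.g. "0.9.0"
        # Need at least as many local GPUs as the engine was sharded for.
        and infer_gpu_count(build_meta["files"]) <= local["gpu_count"]
    )
```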
Naming
TensorRT weights can be `.plan` or `.onnx` files
TensorRT weights that run in TensorRT-LLM are `.engine` files
So we may need to distinguish between the various TRT formats, i.e. `isTensorrtEngine` vs `isTensorrtPlan`?
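If we do split detection per format, a simple extension-based classifier could look like this. A hypothetical sketch: the function name and return labels are assumptions, with extensions taken from the naming notes above.

```python
# Hypothetical sketch: map a file name to the TRT format it suggests,
# per the naming question above. Extensions follow the discussion.

def classify_tensorrt_file(name):
    if name.endswith(".engine"):
        return "isTensorrtEngine"  # TensorRT-LLM engine
    if name.endswith(".plan"):
        return "isTensorrtPlan"    # plain TensorRT plan
    if name.endswith(".onnx"):
        return "onnx"              # ONNX graph, not yet compiled
    return None
```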
0xSage changed the title from *feat: simple heuristic for isTensorrtModel* to *feat: simple heuristic for isTensorrtLLM* on May 21, 2024
0xSage changed the title from *feat: simple heuristic for isTensorrtLLM* to *feat: simple heuristic for isTensorrtEngine* on May 21, 2024
Hi @0xSage! I suggest only detecting the `.engine` files for now; they seem to be the more popular format on the Hub right now (450 models contain a `.engine` file, vs. 0 model repos containing a `.plan` file).
We can auto-tag those repos with a `tensorrt` tag, I think that'd be the easiest!