feat: simple heuristic for isTensorrtEngine #690

Open
0xSage opened this issue May 21, 2024 · 1 comment

0xSage commented May 21, 2024

Problem

  • Model publishers may not always assign the right Model Architecture tags
  • We want to detect when a model is a TensorRT engine that can be run via the TRT-LLM inference runtime.

isTensorrtModel Rules

  1. At least 1 file ending in .engine.
  2. (Optional) At least 1 file named config.json. Caveat: By design, model builders can actually rename this file.
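A minimal sketch of the two rules above, assuming the check only sees a flat list of file paths from the repo. The function names `is_tensorrt_engine` and `has_config_json` are hypothetical, chosen for illustration, and the suffix match is case-sensitive by assumption.

```python
def is_tensorrt_engine(filenames):
    """Rule 1: at least one file ending in .engine."""
    return any(name.endswith(".engine") for name in filenames)


def has_config_json(filenames):
    """Rule 2 (optional): at least one file named config.json.

    Only the final path component is compared, so a config.json in a
    subfolder also counts. Builders can rename this file, so a miss
    here should not disqualify a repo on its own.
    """
    return any(name.split("/")[-1] == "config.json" for name in filenames)


# Illustrative file list, not taken from a real repo.
files = ["README.md", "config.json", "rank0.engine"]
print(is_tensorrt_engine(files), has_config_json(files))
```

Because rule 2 is optional, only `is_tensorrt_engine` would decide the result; `has_config_json` could at most be used as a supporting signal.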

Engine compatibility rules

For context, TensorRT models are specific to:

  1. GPU architecture, i.e. models compiled for Ada will only run on Ada.
  2. TRT-LLM release, i.e. models compiled with release v0.9.0 need to run on v0.9.0.
  3. OS (optional). As of v0.9.0, models are cross-OS compatible, but we're still testing this as it could be flaky.
  4. Number of GPUs, i.e. GPU topology. This can be detected by counting the number of engine files.
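Rule 4 can be sketched as a simple count. This assumes, as the rule suggests, that a build emits one `.engine` file per GPU; the helper name `count_engine_files` and the file names in the example are hypothetical.

```python
def count_engine_files(filenames):
    """Count files ending in .engine.

    Assumption: one engine file per GPU, so the count approximates
    the number of GPUs the model was built for.
    """
    return sum(1 for name in filenames if name.endswith(".engine"))


# Illustrative file list for a model built across two GPUs.
files = ["config.json", "rank0.engine", "rank1.engine"]
print(count_engine_files(files))
```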

Unfortunately, afaik config.json and other metadata files do not track the hardware/build-time configurations once the models are built, so model authors will have to specify this info.

^ We'll update this info as it changes, and as we learn more 😄 .
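Since this information has to come from model authors, one way to picture the compatibility rules is a small record plus a check against the host. This is only a sketch: `EngineBuildInfo`, `HostInfo`, `is_compatible`, and every field name are hypothetical, not an existing schema. Treating a host with more GPUs than the build requires as compatible is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EngineBuildInfo:
    """Hypothetical author-supplied build metadata."""
    gpu_arch: str              # e.g. "ada"
    trtllm_version: str        # e.g. "0.9.0" or "v0.9.0"
    n_gpus: int                # number of GPUs the engine was built for
    os: Optional[str] = None   # None means "do not check the OS"


@dataclass(frozen=True)
class HostInfo:
    """Hypothetical description of the machine that will run the engine."""
    gpu_arch: str
    trtllm_version: str
    n_gpus: int
    os: str


def _normalize_version(version):
    # Treat "v0.9.0" and "0.9.0" as the same release.
    return version.strip().lstrip("v")


def is_compatible(build, host):
    """Apply the four rules: GPU architecture, TRT-LLM release, OS, GPU count."""
    if build.gpu_arch.lower() != host.gpu_arch.lower():
        return False
    if _normalize_version(build.trtllm_version) != _normalize_version(host.trtllm_version):
        return False
    if build.os is not None and build.os.lower() != host.os.lower():
        return False
    # Assumption: the host needs at least as many GPUs as the build used.
    return host.n_gpus >= build.n_gpus


build = EngineBuildInfo(gpu_arch="ada", trtllm_version="v0.9.0", n_gpus=1)
host = HostInfo(gpu_arch="ada", trtllm_version="0.9.0", n_gpus=2, os="linux")
print(is_compatible(build, host))
```

Leaving `os` unset skips the OS check, which matches the rule being marked optional.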

Naming

  • TensorRT weights can be .plans or .onnx
  • TensorRT weights that run in TensorRT-LLM are in .engines
  • So we may need to be specific across the various TRT formats, i.e. isTensorrtEngine vs isTensorrtPlan?
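If the formats do need separate predicates, one possible split is sketched below. The function names mirror the ones floated above but are hypothetical, and the `.plans` suffix is taken literally from this issue's wording; it is passed as a parameter so it can be adjusted to whatever suffix repos actually use.

```python
def _has_suffix(filenames, suffix):
    """True if any file path ends with the given suffix."""
    return any(name.endswith(suffix) for name in filenames)


def is_tensorrt_engine(filenames):
    """Engines meant for TensorRT-LLM: files ending in .engine."""
    return _has_suffix(filenames, ".engine")


def is_tensorrt_plan(filenames, plan_suffix=".plans"):
    """Plain TensorRT plan files.

    The default suffix follows this issue's wording and is an assumption.
    """
    return _has_suffix(filenames, plan_suffix)


# Illustrative file list, not taken from a real repo.
files = ["config.json", "rank0.engine"]
print(is_tensorrt_engine(files), is_tensorrt_plan(files))
```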
@0xSage 0xSage changed the title feat: simple heuristic for isTensorrtModel feat: simple heuristic for isTensorrtLLM May 21, 2024
@0xSage 0xSage changed the title feat: simple heuristic for isTensorrtLLM feat: simple heuristic for isTensorrtEngine May 21, 2024
julien-c (Member) commented

Hi @0xSage! I suggest only detecting .engine files for now. They seem to be the more popular format on the Hub right now (450 models contain a .engine file, vs. 0 model repos containing a .plans file).

We can auto-tag those repos with a tensorrt tag; I think that'd be the easiest!
