[Docs] Add InternVL series tutorial for single NPU #3664
Conversation
Signed-off-by: gcanlin <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adds a new tutorial for running InternVL series models on a single NPU. The tutorial is well-structured and provides clear instructions for both offline inference and online serving. However, there is a significant issue: the model path used in the examples points to a non-public model, which will prevent users from following the tutorial. My review includes a comment to address this.
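The online-serving part of such a tutorial targets vLLM's OpenAI-compatible `/v1/chat/completions` endpoint. As a rough illustration of the kind of request a reader would send to the served model, here is a minimal sketch that builds a chat-completion payload for an image-plus-text query. The model name and image URL below are placeholders for illustration, not values taken from this PR:

```python
import json

def build_chat_payload(model: str, question: str, image_url: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload carrying
    one image and one text question, as a multimodal user message."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
        "max_tokens": 128,
    }

# Placeholder model path — the tutorial should point at a public checkpoint.
payload = build_chat_payload(
    "OpenGVLab/InternVL2-8B",
    "What is in this image?",
    "https://example.com/demo.jpg",
)
print(json.dumps(payload, indent=2))
```

This payload would then be POSTed to the running `vllm serve` instance, e.g. with `curl` or the `openai` client.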
Please add an accuracy test with the same parameters as the doc before merging. https://github.com/vllm-project/vllm-ascend/tree/main/tests/e2e/models/configs
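For contributors unsure how to run such a check before pushing, a hypothetical local invocation might look like the following. The exact test entry point and config name are assumptions, not taken from this repository; consult the linked `tests/e2e/models/configs` directory for the real layout:

```shell
# Hypothetical sketch: run the e2e model accuracy tests locally.
# Adjust the path and any config selection to match the repository layout.
cd vllm-ascend
pytest -s tests/e2e/models/
```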
This pull request has conflicts; please resolve them before we can evaluate the pull request.
What this PR does / why we need it?
Closes #3508.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
No separate tests are needed. All scripts and Python code in the docs have been tested.