
Commit 66f80d4

Merge pull request #98 from anthonsu/main
Point vllm-tpu image to latest tag
2 parents 7e88b6d + f9b7b0e

File tree

4 files changed: +5 -7 lines changed


inference/trillium/vLLM/Llama3.1/README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -80,7 +80,7 @@ gcloud compute tpus tpu-vm ssh $TPU_NAME --project $PROJECT --zone=$ZONE
 ## Step 3: Use the latest vllm docker image for TPU
 
 ```bash
-export DOCKER_URI=vllm/vllm-tpu:nightly
+export DOCKER_URI=vllm/vllm-tpu:latest
 ```
 
 ## Step 4: Run the docker container in the TPU instance
````

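The exported `DOCKER_URI` is what Step 4 of each README hands to Docker. As a minimal sketch of how the retagged image would be pulled and started on the TPU VM, assuming the usual Docker workflow; the container name and the `--privileged`/`--net=host` flags are illustrative assumptions, not the exact command from the READMEs:

```bash
# Assumption: run on the TPU VM after SSHing in (the gcloud command in the hunk header).
export DOCKER_URI=vllm/vllm-tpu:latest   # the tag this commit switches to

# Pull the retagged image and start an interactive container.
# --privileged and --net=host are illustrative flags for TPU device access and
# for exposing the vLLM server port; the README's Step 4 has the authoritative command.
sudo docker pull "$DOCKER_URI"
sudo docker run -it --rm --privileged --net=host \
    --name vllm-tpu \
    "$DOCKER_URI" bash
```
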
inference/trillium/vLLM/Qwen2.5-32B/README.md

Lines changed: 1 addition & 3 deletions
````diff
@@ -35,10 +35,8 @@ gcloud compute tpus tpu-vm ssh $TPU_NAME --project $PROJECT --zone=$ZONE
 
 ## Step 3: Use the latest vllm docker image for TPU
 
-We use a pinned image but you can change it to `vllm/vllm-tpu:nightly` to get the latest TPU nightly image.
-
 ```bash
-export DOCKER_URI=vllm/vllm-tpu:nightly
+export DOCKER_URI=vllm/vllm-tpu:latest
 ```
 
 ## Step 4: Run the docker container in the TPU instance
````

inference/trillium/vLLM/Qwen2.5-VL/README.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -76,10 +76,10 @@ gcloud compute tpus tpu-vm ssh $TPU_NAME --project $PROJECT --zone=$ZONE
 ## Step 3: Use the latest vllm docker image for TPU
 
 ```bash
-export DOCKER_URI=vllm/vllm-tpu:nightly
+export DOCKER_URI=vllm/vllm-tpu:latest
 ```
 
-> **!!Important!!:** As of 10/07/2025, the `vllm/vllm-tpu:nightly` Docker image does not yet include the necessary `tpu_inference` updates to support multi-modal models like Qwen2.5-VL. The following instructions require installing [vllm-tpu](https://docs.vllm.ai/en/latest/getting_started/installation/google_tpu.html#set-up-using-python) and [tpu-inference](https://github.com/vllm-project/tpu-inference) manually on the TPU VM and run directly from the source (user can also choose to build a local Docker image) instead of using Docker published images. For production environments, we recommend waiting for an official `vllm-tpu` Docker image release that includes this support.
+> **Note:** For production deployments, we recommend using the official `vllm-tpu` Docker images to ensure a stable and supported environment. For development and testing, you have the option to perform a manual installation of [vllm-tpu](https://docs.vllm.ai/en/latest/getting_started/installation/google_tpu.html#set-up-using-python) and [tpu-inference](https://github.com/vllm-project/tpu-inference) from source on a TPU VM or to build a custom Docker image.
 
 ## Step 4: Run the docker container in the TPU instance
 
````
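For the development path the new Qwen2.5-VL note describes (install from source or build a local image), a rough sketch of the build-your-own-image option could look like the following; the Dockerfile path `docker/Dockerfile.tpu` and the `vllm-tpu:local` tag are assumptions, not commands taken from the README or the vLLM docs:

```bash
# Assumption: build a local TPU image from vLLM source instead of pulling a
# published tag. The TPU Dockerfile location may differ between vLLM releases.
git clone https://github.com/vllm-project/vllm.git
cd vllm
sudo docker build -f docker/Dockerfile.tpu -t vllm-tpu:local .

# Point the remaining README steps at the locally built image.
export DOCKER_URI=vllm-tpu:local
```
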
inference/trillium/vLLM/Qwen3/README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -69,7 +69,7 @@ gcloud compute tpus tpu-vm ssh $TPU_NAME --project $PROJECT --zone=$ZONE
 ## Step 3: Use the latest vllm docker image for TPU
 
 ```bash
-export DOCKER_URI=vllm/vllm-tpu:nightly
+export DOCKER_URI=vllm/vllm-tpu:latest
 ```
 
 ## Step 4: Run the docker container in the TPU instance
````
