This repo contains all the files necessary to create a basic load balancing worker on Runpod. For end-to-end deployment instructions, refer to the Runpod documentation.
- Build the image:
docker build --platform linux/amd64 -t YOUR_DOCKER_USERNAME/loadbalancer-example:v1.0 .
- Push to Docker Hub:
docker push YOUR_DOCKER_USERNAME/loadbalancer-example:v1.0
- Use this container image path when deploying your endpoint to Runpod:
YOUR_DOCKER_USERNAME/loadbalancer-example:v1.0
- Expose HTTP ports 5000 and 5001 in your endpoint's container configuration, and add these environment variables:
PORT = 5000
PORT_HEALTH = 5001
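
As an illustration of how a worker process could pick up these two variables, here is a minimal sketch. It is not taken from this repo's code: the `read_ports` helper and the fallback defaults are hypothetical, chosen only to match the values listed above.

```python
import os

# Hypothetical defaults mirroring the values suggested in this README.
DEFAULT_PORT = 5000
DEFAULT_PORT_HEALTH = 5001


def read_ports(env=None):
    """Return (port, health_port) read from PORT and PORT_HEALTH.

    `env` is any mapping of variable names to strings; it defaults to
    os.environ so the function can be exercised without touching the
    real environment.
    """
    env = os.environ if env is None else env
    port = int(env.get("PORT", DEFAULT_PORT))
    health_port = int(env.get("PORT_HEALTH", DEFAULT_PORT_HEALTH))
    # Reject values outside the valid TCP port range early.
    for name, value in (("PORT", port), ("PORT_HEALTH", health_port)):
        if not 1 <= value <= 65535:
            raise ValueError(f"{name} must be between 1 and 65535, got {value}")
    return port, health_port
```

Calling `read_ports()` inside the container would return the ports configured on the endpoint, and a server could then bind to them. Whichever values you choose, they need to match the HTTP ports exposed in the container configuration.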
Use the curl commands below to test your endpoint, replacing ENDPOINT_ID and RUNPOD_API_KEY with your own endpoint ID and API key:
curl -X POST "https://ENDPOINT_ID.api.runpod.ai/generate" \
-H 'Authorization: Bearer RUNPOD_API_KEY' \
-H "Content-Type: application/json" \
-d '{"prompt": "Hello, world!"}'
curl -X GET "https://ENDPOINT_ID.api.runpod.ai/ping" \
-H 'Authorization: Bearer RUNPOD_API_KEY' \
-H "Content-Type: application/json"
curl -X GET "https://ENDPOINT_ID.api.runpod.ai/stats" \
-H 'Authorization: Bearer RUNPOD_API_KEY' \
-H "Content-Type: application/json"
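
The same three requests can also be sent from a script. The sketch below uses only the Python standard library and mirrors the URLs, headers, and payload of the curl commands above. The `build_request` and `call` helpers are hypothetical names introduced for illustration, and the response format is not assumed, so the body is returned as raw text.

```python
import json
import urllib.request


def build_request(endpoint_id, api_key, path, payload=None):
    """Build a request for https://ENDPOINT_ID.api.runpod.ai/<path>.

    A JSON payload makes it a POST (as for /generate); without one it
    is a GET (as for /ping and /stats).
    """
    url = f"https://{endpoint_id}.api.runpod.ai/{path.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(request, timeout=30):
    """Send the request and return (status_code, body_text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Example usage (performs real network calls, so substitute real values first):
#   req = build_request("ENDPOINT_ID", "RUNPOD_API_KEY", "/generate",
#                       {"prompt": "Hello, world!"})
#   print(call(req))
#   print(call(build_request("ENDPOINT_ID", "RUNPOD_API_KEY", "/ping")))
#   print(call(build_request("ENDPOINT_ID", "RUNPOD_API_KEY", "/stats")))
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without touching the network. Note that `urlopen` raises `urllib.error.HTTPError` on 4xx and 5xx responses, so wrap `call` in a `try` block if you want to handle those yourself.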