|
| 1 | +# Deploying Dynamo Inference Graphs to Kubernetes using Helm |
| 2 | + |
| 3 | +This guide walks through deploying an inference graph created with the Dynamo SDK onto a Kubernetes cluster using Helm. Note that this is currently an experimental feature. |
| 4 | + |
| 5 | +## Dynamo Kubernetes Operator Coming Soon! |
| 6 | + |
| 7 | + |
| 8 | + |
| 9 | +While this guide covers deploying Dynamo inference graphs with Helm, the preferred deployment method in the future will be the Dynamo Kubernetes Operator. The Dynamo Kubernetes Operator is a soon-to-be-released cloud platform that includes a set of components (Operator, UIs, Kubernetes Custom Resources, etc.) to simplify the deployment and management of Dynamo inference graphs. |
| 10 | + |
| 11 | +Once an inference graph is defined using the Dynamo SDK, it can be deployed onto a Kubernetes cluster with a single `dynamo deploy` command that orchestrates the following deployment steps: |
| 12 | + |
| 13 | +1. Building Docker images from inference graph components on the cluster |
| 14 | +2. Composing the encoded inference graph into a complete deployment on Kubernetes |
| 15 | +3. Enabling autoscaling, monitoring, and observability for the inference graph |
| 16 | +4. Providing easy administration of deployments via a UI |
| 17 | + |
| 18 | +The Dynamo Kubernetes Operator will be released soon. |
| 19 | + |
| 20 | +## Helm Deployment Guide |
| 21 | + |
| 22 | +### Setting up MicroK8s |
| 23 | + |
| 24 | +Follow these steps to set up a local Kubernetes cluster using MicroK8s: |
| 25 | + |
| 26 | +1. Install MicroK8s: |
| 27 | +```bash |
| 28 | +sudo snap install microk8s --classic |
| 29 | +``` |
| 30 | + |
| 31 | +2. Configure user permissions: |
| 32 | +```bash |
| 33 | +sudo usermod -a -G microk8s $USER |
| 34 | +sudo chown -f -R $USER ~/.kube |
| 35 | +``` |
| 36 | + |
| 37 | +3. **Important**: Log out and log back in for the new group membership to take effect (alternatively, run `newgrp microk8s` in the current shell) |
| 38 | + |
| 39 | +4. Start MicroK8s: |
| 40 | +```bash |
| 41 | +microk8s start |
| 42 | +``` |
| 43 | + |
| 44 | +5. Enable required addons: |
| 45 | +```bash |
| 46 | +# Enable GPU support |
| 47 | +microk8s enable gpu |
| 48 | + |
| 49 | +# Enable storage support (the addon is named `storage` on older MicroK8s releases) |
| 50 | +# See: https://microk8s.io/docs/addon-hostpath-storage |
| 51 | +microk8s enable hostpath-storage |
| 52 | +``` |
| 53 | + |
| 54 | +6. Configure kubectl: |
| 55 | +```bash |
| 56 | +mkdir -p ~/.kube |
| 57 | +microk8s config > ~/.kube/config  # overwrites any existing kubeconfig; back it up first |
| 58 | +``` |
| 59 | + |
| 60 | +After completing these steps, you should be able to use the `kubectl` command to interact with your cluster. |
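
The remaining sections also rely on `helm` and `docker`, which the steps above do not install. The following is a minimal preflight sketch that reports any of these CLIs missing from `PATH`. The function name `missing_tools` is a hypothetical helper introduced here for illustration, not part of Dynamo or MicroK8s.

```bash
#!/usr/bin/env bash
# Print each tool from the argument list that is not found on PATH.
# Prints nothing when every tool is available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example usage (not executed here):
#   missing="$(missing_tools kubectl helm docker)"
#   [ -z "$missing" ] || echo "Missing tools: $missing" >&2
```

Running the check before the dependency installation makes a missing CLI show up as a clear message rather than as a failure halfway through a later step.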
| 61 | + |
| 62 | +### Installing Required Dependencies |
| 63 | + |
| 64 | +Follow these steps to set up the namespace and install required components: |
| 65 | + |
| 66 | +1. Set environment variables: |
| 67 | +```bash |
| 68 | +export NAMESPACE=dynamo-playground |
| 69 | +export RELEASE_NAME=dynamo-platform |
| 70 | +``` |
| 71 | + |
| 72 | +2. Install NATS messaging system: |
| 73 | +```bash |
| 74 | +# Navigate to the dependencies directory (relative to the repository root) |
| 75 | +cd deploy/Kubernetes/pipeline/dependencies |
| 76 | + |
| 77 | +# Add and update NATS Helm repository |
| 78 | +helm repo add nats https://nats-io.github.io/k8s/helm/charts/ |
| 79 | +helm repo update |
| 80 | + |
| 81 | +# Install NATS with custom values |
| 82 | +helm install --namespace ${NAMESPACE} "${RELEASE_NAME}-nats" nats/nats \ |
| 83 | + --create-namespace \ |
| 84 | + --values nats-values.yaml |
| 85 | +``` |
| 86 | + |
| 87 | +3. Install etcd key-value store: |
| 88 | +```bash |
| 89 | +# Install etcd using Bitnami chart |
| 90 | +helm install --namespace ${NAMESPACE} "${RELEASE_NAME}-etcd" \ |
| 91 | + oci://registry-1.docker.io/bitnamicharts/etcd \ |
| 92 | + --values etcd-values.yaml |
| 93 | +``` |
| 94 | + |
| 95 | +After completing these steps, your cluster will have the necessary messaging and storage infrastructure for running Dynamo inference graphs. |
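
Without `--wait`, `helm install` returns before the pods are ready, so it can help to block until NATS and etcd are up before deploying the pipeline. The sketch below wraps `kubectl wait` and assumes that every pod in the namespace belongs to these two releases. The helper names `run` and `wait_for_dependencies` and the `DRY_RUN` variable are hypothetical and introduced only for illustration.

```bash
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1 (useful for reviewing the command first).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# wait_for_dependencies [TIMEOUT]
# Block until all pods in the namespace report the Ready condition.
# Assumption: the namespace holds only the NATS and etcd pods at this point.
wait_for_dependencies() {
  run kubectl wait --namespace "${NAMESPACE:-dynamo-playground}" \
    --for=condition=Ready pod --all --timeout="${1:-300s}"
}

# Example usage (not executed here):
#   wait_for_dependencies 120s
```

Setting `DRY_RUN=1` prints the `kubectl` command instead of running it, which is a cheap way to confirm the namespace and timeout before touching the cluster.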
| 96 | + |
| 97 | +### Building and Deploying the Pipeline |
| 98 | + |
| 99 | +Follow these steps to containerize and deploy your inference pipeline: |
| 100 | + |
| 101 | +1. Build and containerize the pipeline: |
| 102 | +```bash |
| 103 | +# Navigate to the example directory (relative to the repository root) |
| 104 | +cd examples/hello_world |
| 105 | + |
| 106 | +# Set runtime image name |
| 107 | +export DYNAMO_IMAGE=<dynamo_runtime_image_name> |
| 108 | + |
| 109 | +# Build and containerize the Frontend service |
| 110 | +dynamo build --containerize hello_world:Frontend |
| 111 | +``` |
| 112 | + |
| 113 | +2. Push container to registry: |
| 114 | +```bash |
| 115 | +# Tag the built image for your registry |
| 116 | +docker tag <BUILT_IMAGE_TAG> <TAG> |
| 117 | + |
| 118 | +# Push to your container registry |
| 119 | +docker push <TAG> |
| 120 | +``` |
| 121 | + |
| 122 | +3. Deploy using Helm: |
| 123 | +```bash |
| 124 | +# Set release name for Helm |
| 125 | +export HELM_RELEASE=helloworld |
| 126 | + |
| 127 | +# Generate Helm values file from Frontend service |
| 128 | +dynamo get frontend > pipeline-values.yaml |
| 129 | + |
| 130 | +# Install/upgrade Helm release |
| 131 | +helm upgrade -i "$HELM_RELEASE" ./chart \ |
| 132 | + -f pipeline-values.yaml \ |
| 133 | + --set image=<TAG> \ |
| 134 | + --set dynamoIdentifier="hello_world:Frontend" \ |
| 135 | + -n "$NAMESPACE" |
| 136 | +``` |
| 137 | + |
| 138 | +4. Test the deployment: |
| 139 | +```bash |
| 140 | +# Forward the service port to localhost (this command blocks; run the curl below in a second terminal) |
| 141 | +kubectl -n ${NAMESPACE} port-forward svc/helloworld-frontend 3000:80 |
| 142 | + |
| 143 | +# Test the API endpoint |
| 144 | +curl -X 'POST' 'http://localhost:3000/generate' \ |
| 145 | + -H 'accept: text/event-stream' \ |
| 146 | + -H 'Content-Type: application/json' \ |
| 147 | + -d '{"text": "test"}' |
| 148 | +``` |
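
The first request can fail if it is sent before the port-forward is established or before the pods are ready. The following is a minimal retry sketch, not part of the Dynamo tooling. `retry` is a hypothetical helper name, and the attempt count and delay in the usage comment are arbitrary assumptions.

```bash
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS runs have failed.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while true; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example usage (not executed here), reusing the request from step 4:
#   retry 10 2 curl -sf -X POST 'http://localhost:3000/generate' \
#     -H 'accept: text/event-stream' \
#     -H 'Content-Type: application/json' \
#     -d '{"text": "test"}'
```

The `-f` flag makes `curl` exit with a non-zero status on HTTP errors, which is what lets `retry` detect a failed attempt.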
| 149 | + |
| 150 | +For convenience, you can find a complete deployment script at `deploy/Kubernetes/pipeline/deploy.sh` that automates all of these steps. |
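
The commands in this guide use angle-bracket placeholders such as `<TAG>` and `<BUILT_IMAGE_TAG>`. Pasted literally, the shell interprets `<` and `>` as redirections, so these values must be replaced first. One way to guard against this, sketched under the assumption that the values are kept in shell variables, is shown below. `is_filled_in`, `require_value`, and the registry path are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
# Return 0 only if the value is non-empty and has no '<' or '>' left in it.
is_filled_in() {
  case "$1" in
    '' | *'<'* | *'>'*) return 1 ;;
    *) return 0 ;;
  esac
}

# require_value NAME VALUE
# Report an empty or unreplaced placeholder on stderr and return 1.
require_value() {
  if ! is_filled_in "$2"; then
    printf 'error: %s is empty or still a placeholder: %s\n' "$1" "$2" >&2
    return 1
  fi
}

# Example usage (not executed here), with a hypothetical registry path:
#   IMAGE_TAG="registry.example.com/hello-world:latest"
#   require_value IMAGE_TAG "$IMAGE_TAG" && docker push "$IMAGE_TAG"
```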