This directory contains Kubernetes manifests for deploying Spinal Tap on SLAC's S3DF Kubernetes infrastructure.
Log into SLAC S3DF:
```bash
ssh USERNAME@s3dflogin.slac.stanford.edu
```

Follow the setup instructions at: https://k8s.slac.stanford.edu/neutrino-ml

This will configure your kubectl to use the neutrino-ml vCluster context.
Verify your context:
```bash
kubectl config current-context
# Should show: neutrino-ml
```

The spinal-tap namespace must be created before first deployment:

```bash
kubectl create namespace spinal-tap
```

Contact S3DF support to request approval for the sdf-data-neutrino storage class:
Email: s3df-help@slac.stanford.edu
Template:
Subject: Request storage class approval for neutrino-ml vCluster
Hello,
I need access to the sdf-data-neutrino storage class for the neutrino-ml vCluster.
Details:
- User: <your-email>@slac.stanford.edu
- vCluster: neutrino-ml
- Namespace: spinal-tap
- Required storage class: sdf-data-neutrino
- Purpose: Read-only access to /sdf/data/neutrino/spinal-tap/ for Spinal Tap data visualization
Please approve this storage class for the neutrino-ml vCluster.
Thanks!
Wait for approval before proceeding (usually quick).
Spinal Tap requires authentication when deployed to Kubernetes to control access to experiment-specific data.
Quick setup:
```bash
cd k8s
./generate-secrets.sh
kubectl apply -f secret.yaml
```

For detailed authentication setup, see AUTHENTICATION.md.
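Kubernetes Secrets store their `data` values base64-encoded (an encoding, not encryption). As a quick sanity check, you can round-trip a value the same way a generated secret.yaml stores it; `hunter2` below is a placeholder, not a real credential:

```shell
# Base64-encode a placeholder value as it would appear under `data:` in a Secret.
encoded=$(printf '%s' 'hunter2' | base64)
echo "$encoded"                      # aHVudGVyMg==

# Decode it back to verify the round trip.
printf '%s' "$encoded" | base64 -d   # hunter2
echo
```

Anyone with read access to the Secret can decode it the same way, which is why access to the namespace should be restricted.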
This deployment is pre-configured for SLAC S3DF with:
- Ingress: https://spinal-tap.slac.stanford.edu
- Storage: Read-only access to /sdf/data/neutrino/spinal-tap/ via the sdf-data-neutrino storage class
- Namespace: spinal-tap
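The storage wiring can be pictured with a PVC sketch like the following; field values are illustrative, and the manifest in this directory is authoritative:

```yaml
# Illustrative sketch only; see the actual PVC manifest in this directory.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spinal-tap-data
  namespace: spinal-tap
spec:
  storageClassName: sdf-data-neutrino   # must be approved for the vCluster
  accessModes:
    - ReadOnlyMany                      # lets multiple replicas mount the same data
  resources:
    requests:
      storage: 1Gi                      # placeholder; nominal for filesystem-backed classes
```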
For detailed configuration information including:
- How storage classes and filesystem paths work
- Setting up data symlinks
- Customizing ingress/namespace/resources
- Adapting for other facilities
See SLAC_CONFIG.md for the complete configuration guide.
1. Log into S3DF:

   ```bash
   ssh USERNAME@s3dflogin.slac.stanford.edu
   ```

2. Navigate to the k8s directory:

   ```bash
   cd /path/to/spinal-tap/k8s
   ```

3. Preview what will be deployed:

   ```bash
   make dump
   ```

4. Deploy:

   ```bash
   make apply
   ```

5. Verify the deployment:

   ```bash
   kubectl get pods,pvc,ingress -n spinal-tap
   ```
To update manifests after making changes:
1. Pull the latest changes:

   ```bash
   cd /path/to/spinal-tap
   git pull
   ```

2. Apply the updates:

   ```bash
   cd k8s
   make apply
   ```
Kubernetes will automatically perform a rolling update with zero downtime (if using multiple replicas).
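The zero-downtime behavior depends on the Deployment's update strategy. A sketch of the relevant fragment, with assumed values (deployment.yaml in this directory is authoritative):

```yaml
# Illustrative fragment; check deployment.yaml for the actual strategy.
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 1         # bring up one extra pod during the update
```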
To change experiment passwords after initial deployment:
1. Generate new secrets with updated passwords:

   ```bash
   cd k8s
   ./generate-secrets.sh
   kubectl apply -f secret.yaml -n spinal-tap
   ```

2. Restart the deployment to pick up the new secret values:

   ```bash
   kubectl rollout restart deployment spinal-tap -n spinal-tap
   kubectl rollout status deployment spinal-tap -n spinal-tap
   ```
Note: Pods don't automatically reload environment variables when secrets are updated. The rollout restart creates new pods with the updated password values.
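One way to avoid the manual restart (a sketch of an alternative, not how this repo is currently wired) is Kustomize's secretGenerator, which appends a content hash to the Secret's name so that changing the secret automatically triggers a rollout:

```yaml
# Hypothetical kustomization.yaml fragment; this repo applies secret.yaml directly.
secretGenerator:
  - name: spinal-tap-auth   # generated name gets a -<hash> suffix
    envs:
      - secrets.env         # KEY=value pairs, one per line
```

Because the hashed name changes whenever the contents change, Kustomize rewrites the Deployment's reference to the Secret and Kubernetes rolls the pods automatically.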
Alternatively, use kubectl directly:
```bash
# Apply all resources
kubectl apply -k .

# Preview changes
kubectl kustomize .
```

The deployment defaults to replicas: 1 in deployment.yaml.
When to scale:
- 1 replica: Development, low traffic (< 10 concurrent users)
- 2-3 replicas: Production, high availability, 10-50 concurrent users
- 5+ replicas: High traffic, mission-critical (50+ concurrent users)
Scale the deployment:
```bash
# Edit deployment.yaml and change replicas, then:
make apply

# Or scale directly:
kubectl scale deployment spinal-tap -n spinal-tap --replicas=3
```

Note: The ReadOnlyMany PVC access mode supports multiple replicas reading simultaneously.
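Instead of scaling by hand, the replica ranges above can be driven by a HorizontalPodAutoscaler. A sketch, with illustrative (untuned) thresholds; it relies on the same metrics API that `kubectl top` uses:

```yaml
# Illustrative HPA; adjust min/max replicas and the target before using.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: spinal-tap
  namespace: spinal-tap
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: spinal-tap
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale up when average CPU exceeds 70% of requests
```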
Current defaults in deployment.yaml:
```yaml
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "2Gi"
    cpu: "1000m"
```

Adjust based on your needs. Monitor usage with:

```bash
kubectl top pod -n spinal-tap
```

If pods are OOMKilled, increase the memory limits in deployment.yaml and reapply.
For detailed resource tuning guidance, see SLAC_CONFIG.md.
The deployment uses sdf-data-neutrino for the neutrino facility.
To use a different facility or customize storage, see SLAC_CONFIG.md for details on:
- Available storage classes
- How to adapt for other facilities
- Setting up data symlinks
- Requesting new storage classes
```bash
# Check pods
kubectl get pods -l app=spinal-tap

# Check the persistent volume claim
kubectl get pvc spinal-tap-data

# Check the ingress
kubectl get ingress spinal-tap
```

Once deployed, access Spinal Tap at: https://spinal-tap.slac.stanford.edu
This deployment follows SLAC S3DF Kubernetes best practices:
- Kustomize: All manifests managed via `kustomization.yaml`
- Makefile: Standard targets (`apply`, `dump`, `delete`)
- Storage: Facility-specific storage classes for filesystem access
- Ingress: Simple configuration, automatically handled by S3DF
- Labels: Consistent labeling with `app: spinal-tap`
For more details, see SLAC_CONFIG.md.
Error: namespaces "spinal-tap" not found
Solution: Create the namespace first:
```bash
kubectl create namespace spinal-tap
```

Error: validation error: Allowed storageClasses at vcluster--neutrino-ml
Cause: The Kyverno policy is blocking the storage class for your vCluster.
Solution: Contact S3DF support to approve sdf-data-neutrino for the neutrino-ml vCluster:
```bash
# Check PVC status
kubectl describe pvc spinal-tap-data -n spinal-tap

# Email s3df-help@slac.stanford.edu for storage class approval
```

Common issues:
- Wrong storageClassName
- Insufficient permissions for that storageClass
- Storage path doesn't exist on filesystem
```bash
# Check pod status
kubectl get pods -n spinal-tap

# Get detailed information
kubectl describe pod -n spinal-tap -l app=spinal-tap

# View logs
kubectl logs -n spinal-tap -l app=spinal-tap --tail=50
```

Error: User "username@slac.stanford.edu" cannot get resource...
Solution: You need appropriate RBAC permissions. Contact S3DF support to request access to the vCluster and namespace.
```bash
kubectl describe ingress spinal-tap -n spinal-tap
```

Verify that the ingress shows a valid address/hostname.
Check if pods are running out of memory or CPU:
```bash
# View current resource usage
kubectl top pod -n spinal-tap

# Watch resource usage continuously
kubectl top pod -n spinal-tap --watch

# Check for OOMKilled pods
kubectl get pods -n spinal-tap
kubectl describe pod -n spinal-tap -l app=spinal-tap | grep -E "State|Reason|Exit Code"
```

If you need to restart pods (e.g., to pick up a new :latest image):
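OOMKilled containers terminate with exit code 137, i.e. 128 plus SIGKILL (signal 9), which is what the Exit Code field above reports. The convention is easy to reproduce locally:

```shell
# SIGKILL is signal 9, so a SIGKILLed process exits with 128 + 9 = 137,
# the same Exit Code that `kubectl describe pod` shows for an OOMKilled container.
sh -c 'kill -KILL $$' || status=$?
echo "exit code: $status"   # exit code: 137
```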
```bash
kubectl rollout restart deployment spinal-tap -n spinal-tap

# Monitor the rollout
kubectl rollout status deployment spinal-tap -n spinal-tap
```

To remove the deployment:

```bash
make delete
```

Or:

```bash
kubectl delete -k .
```

For S3DF-specific issues:
- Email: s3df-help@slac.stanford.edu
- Docs: Check the SLAC Confluence for S3DF Kubernetes documentation
- Examples: https://github.com/slaclab/slac-k8s-examples