
poc-audio-inference

Prerequisites

  • Node.js with Corepack enabled (pnpm is activated through Corepack)
  • Docker (for dependent services and image builds)
  • pixi (for the inference server)
  • For the Kubernetes deployment: Go, kind, kubectl, and helm

Getting started

  1. Clone the repository
git clone git@github.com:nekomeowww/poc-audio-inference.git
cd poc-audio-inference
  2. Install dependencies
corepack enable # Make sure you have corepack enabled
pnpm install
  3. Start the development server
pnpm dev
  4. Open http://localhost:5173 in your browser.
  5. The backend should listen on http://localhost:8080; the frontend dev server is already configured to proxy to it (a quick check is sketched below).
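
To sanity-check the setup, you can hit both servers (a minimal sketch; the ports come from the steps above, and a useful response on the backend root path is an assumption):

curl -I http://localhost:5173 # Frontend dev server
curl -I http://localhost:8080 # Backend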

Project structure

.
├── apps     # Frontend applications
│   └── streaming-web
├── infra     # Infrastructure related
│   └── go
│       └── operator # Kubernetes streaming-backend-operator
├── packages # Shared packages
│   └── backend-shared
├── services # Backend services
│   ├── inference-server
│   └── streaming-backend
├── cspell.config.yaml  # Spell check config
├── eslint.config.mjs   # ESLint config
├── package.json        # Workspace global dependencies
├── pnpm-workspace.yaml # Monorepo config for pnpm
├── pnpm-lock.yaml
├── tsconfig.json       # TypeScript config for monorepo
├── vitest.workspace.ts # Unit test related
└── README.md

Development

Start essential dependent services

docker compose up -d
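
You can confirm the dependent services came up before continuing. Which services exist depends on docker-compose.yml; the backend's REDIS_URL suggests at least Redis (the service name below is an assumption):

docker compose ps
docker compose logs redis # Service name is an assumption; check docker-compose.yml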

Configure before starting

  1. Copy .env.example to .env.local and configure the environment variables.
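
A minimal .env.local sketch; the only variable this README mentions is REDIS_URL (consumed by the backend), so treat .env.example as the source of truth for everything else:

# .env.local (sketch; copy from .env.example and adjust)
REDIS_URL=redis://localhost:6379 # Local Redis from docker compose; host and port are assumptions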

Start each server

You can start any of the apps, packages, or services individually by running one of the following (a parallel variant is sketched after the list):

  • pnpm -F @audio-inference/web dev for frontend
  • pnpm -F @audio-inference/backend dev for backend
  • cd services/inference-server && pixi run start for inference server
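
To run the frontend and backend together, pnpm can execute a script across multiple filtered packages in parallel (a sketch using the package names above):

pnpm --filter @audio-inference/web --filter @audio-inference/backend --parallel dev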

Deployment

Build image with docker buildx

docker buildx build --platform linux/arm64 --load . -f ./apps/streaming-web/Dockerfile -t test.nekomeowww.local/streaming-audio/web:0.0.1
docker buildx build --platform linux/arm64 --load . -f ./services/streaming-backend/Dockerfile -t test.nekomeowww.local/streaming-audio/backend:0.0.1
docker buildx build --platform linux/arm64 --load . -f ./services/inference-server/Dockerfile -t test.nekomeowww.local/streaming-audio/inference-server:0.0.1

Note

For x86_64 (amd64)

docker buildx build --platform linux/amd64 --load . -f ./apps/streaming-web/Dockerfile -t test.nekomeowww.local/streaming-audio/web:0.0.1
docker buildx build --platform linux/amd64 --load . -f ./services/streaming-backend/Dockerfile -t test.nekomeowww.local/streaming-audio/backend:0.0.1
docker buildx build --platform linux/amd64 --load . -f ./services/inference-server/Dockerfile -t test.nekomeowww.local/streaming-audio/inference-server:0.0.1
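
After building, confirm the images exist locally (tags as above):

docker images | grep streaming-audio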

Deploy with docker

docker run -dit -p 8080:80 test.nekomeowww.local/streaming-audio/web:0.0.1
docker run -dit -p 8081:8081 -e REDIS_URL='URL of Redis' test.nekomeowww.local/streaming-audio/backend:0.0.1
docker run -dit -p 8082:8082 test.nekomeowww.local/streaming-audio/inference-server:0.0.1
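
A quick check that the containers are running and the web frontend responds (ports from the run commands above; a meaningful response on the root path is an assumption):

docker ps
curl -I http://localhost:8080 # Web frontend mapped to host port 8080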

Deploy with kind bootstrapped Kubernetes cluster

Create cluster

We have pre-defined kind configurations in the infra/go/operator/hack directory to help you get started.

Note

Install kind if you haven't already. You can install it using the following command:

go install sigs.k8s.io/kind@latest

To create a kind cluster with the configurations defined in infra/go/operator/hack/kind-config.yaml, run the following command:

kind create cluster --config infra/go/operator/hack/kind-config.yaml --name kind-streaming-backend

Note

You can check the nodes with the following command:

kubectl get nodes

Prepare it with GPU resources

Since streaming-backend-inference-server.yaml specifies a simulated nvidia.com/gpu resource allocation, we need to prepare the cluster with (simulated) GPU resources.

Note

This is not a real GPU, and it is not essential for the workload; you can remove the resource constraint by modifying the deployment file:

 resources:
   limits:
-    nvidia.com/gpu: "1"
   requests:
-    nvidia.com/gpu: "1"

Label all nodes

kubectl label node kind-streaming-backend-worker run.ai/simulated-gpu-node-pool=default
kubectl label node kind-streaming-backend-worker2 run.ai/simulated-gpu-node-pool=default
kubectl label node kind-streaming-backend-worker3 run.ai/simulated-gpu-node-pool=default

Install fake-gpu-operator for device plugin

helm repo add fake-gpu-operator https://fake-gpu-operator.storage.googleapis.com
helm repo update
helm upgrade -i gpu-operator fake-gpu-operator/fake-gpu-operator --namespace gpu-operator --create-namespace
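
Once the operator is running, the labeled nodes should advertise simulated nvidia.com/gpu capacity. A hedged check (node name per the labels above):

kubectl describe node kind-streaming-backend-worker | grep -i 'nvidia.com/gpu'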

Load the images into kind cluster

kind load docker-image test.nekomeowww.local/streaming-audio/web:0.0.1 --name kind-streaming-backend
kind load docker-image test.nekomeowww.local/streaming-audio/backend:0.0.1 --name kind-streaming-backend
kind load docker-image test.nekomeowww.local/streaming-audio/inference-server:0.0.1 --name kind-streaming-backend

Create the corresponding resources

kubectl apply -f deploy/kubernetes-yaml/envs/local/streaming-backend-web.yaml --server-side
kubectl apply -f deploy/kubernetes-yaml/envs/local/streaming-backend-backend.yaml --server-side
kubectl apply -f deploy/kubernetes-yaml/envs/local/streaming-backend-inference-server.yaml --server-side
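
Wait for the rollouts to finish before exposing anything (deployment names taken from the expose commands below):

kubectl rollout status deployment/web
kubectl rollout status deployment/backend
kubectl rollout status deployment/inference-server
kubectl get pods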

Expose the services

kubectl expose deployment/web --type=NodePort --name web-nodeport
# Change the nodePort to 30101 to match the extra port mappings defined in infra/go/operator/hack/kind-config.yaml
kubectl patch service web-nodeport --type='json' --patch='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":30101}]'

kubectl expose deployment/backend --type=NodePort --name backend-nodeport
# Change the nodePort to 30102 to match the extra port mappings defined in infra/go/operator/hack/kind-config.yaml
kubectl patch service backend-nodeport --type='json' --patch='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":30102}]'

kubectl expose deployment/inference-server --type=NodePort --name inference-server-nodeport
# Change the nodePort to 30103 to match the extra port mappings defined in infra/go/operator/hack/kind-config.yaml
kubectl patch service inference-server-nodeport --type='json' --patch='[{"op": "replace", "path": "/spec/ports/0/nodePort", "value":30103}]'
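
With the node ports patched, the services should be reachable through the host ports mapped in infra/go/operator/hack/kind-config.yaml. Assuming the kind config maps each node port to the same port on the host (an assumption; check the file):

curl -I http://localhost:30101 # web
curl -I http://localhost:30102 # backend
curl -I http://localhost:30103 # inference server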

Terms

  • stub: Generating a set of stub .mjs and .d.ts files for a package so that consuming packages can resolve it without a complicated watch setup.
  • workspace: A monorepo workspace that contains multiple packages, services or apps.
  • filter: Refer to the pnpm --filter documentation for more information.
