Title: Add documentation website for Llama Stack Kubernetes operator #56
Conversation
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

This pull request has merge conflicts that must be resolved before it can be merged. @zanetworker please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

This looks like a good base. We can open follow-up PRs for updates. /lgtm
### Prerequisites

- Go 1.21+
We have updated the Go version to 1.24.
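A quick local check that the toolchain matches the updated prerequisite:

```bash
# Should report go1.24 or newer after the bump
go version
```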
@mergify rebase

☑️ Nothing to do, the required conditions are not met
@zanetworker Can you resolve the conflict? Then we can merge this.
Signed-off-by: Adel Zaalouk <[email protected]>
Force-pushed from 3ac20d3 to 55f22ae
- Created detailed vLLM distribution guide with GPU/CPU variants, configuration examples, and best practices
- Created comprehensive Ollama distribution guide with model management, API usage, and troubleshooting
- Added new 'Distributions' section to navigation with dedicated pages for each distribution type
- Includes production and development examples, resource planning, and scaling strategies
- Clearly distinguishes supported distributions from BYO (bring-your-own) custom image approaches

Signed-off-by: Adel Zaalouk <[email protected]>
@VaishnaviHire updated the docs. It looks like this now: https://llama-stack-k8s-operator.pages.dev/ Also, after merging we'll probably need to set up Cloudflare tokens on the repo to automate publishing of the docs (currently that is set up on my fork).
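A rough sketch of what that automation could look like, assuming Cloudflare Pages deployed via Wrangler with repo secrets named CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID (the workflow name, the MkDocs build step, the output directory, and the Pages project name are all assumptions, not the repo's actual setup):

```bash
# Hypothetical workflow file; names, secrets, and build tooling are assumptions
cat > .github/workflows/publish-docs.yml <<'EOF'
name: publish-docs
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build site (assumes MkDocs)
        run: pip install mkdocs && mkdocs build
      - name: Deploy to Cloudflare Pages
        uses: cloudflare/wrangler-action@v3
        with:
          apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
          command: pages deploy site --project-name=llama-stack-k8s-operator
EOF
```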
Thanks, but there are a lot of docs in this PR, all of which would need to be verified and tested before approving. Maybe we should start smaller and build more manageable chunks in separate PRs?
```yaml
env:
- name: LOG_LEVEL
  value: "info"
- name: LLAMASTACK_PORT
```
Is this a thing? I can see it mentioned in the watsonx provider, but I'm not sure it's used anywhere.
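One quick way to check from the repo root (a sketch):

```bash
# List every reference to LLAMASTACK_PORT in the tree; hits only in docs would suggest it is unused
grep -rn "LLAMASTACK_PORT" .
```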
The fastest way to install the operator is using the pre-built manifests:

```bash
kubectl apply -f https://github.com/llamastack/llama-stack-k8s-operator/releases/latest/download/operator.yaml
```
```console
(base) derekh@laptop:~/workarea/llama-stack-k8s-operator$ kubectl apply -f https://github.com/llamastack/llama-stack-k8s-operator/releases/latest/download/operator.yaml
error: unable to read URL "https://github.com/llamastack/llama-stack-k8s-operator/releases/latest/download/operator.yaml", server reported 404 Not Found, status code=404
```
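One way to see which assets the latest release actually publishes (a sketch using the public GitHub API; the correct manifest name, if any, would come from this list):

```bash
# List the download URLs attached to the latest release
curl -s https://api.github.com/repos/llamastack/llama-stack-k8s-operator/releases/latest \
  | grep browser_download_url
```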
```bash
# Add the Helm repository
helm repo add llamastack https://llamastack.github.io/helm-charts
```
```console
(base) derekh@laptop:~/workarea/llama-stack-k8s-operator$ helm repo add llamastack https://llamastack.github.io/helm-charts
Error: looks like "https://llamastack.github.io/helm-charts" is not a valid chart repository or cannot be reached: failed to fetch https://llamastack.github.io/helm-charts/index.yaml : 404 Not Found
```
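A quick way to confirm whether a chart index exists at that URL at all (a sketch; a 404 on index.yaml means no Helm repo is published there yet):

```bash
# Fetch only the response status line for the chart index
curl -sI https://llamastack.github.io/helm-charts/index.yaml | head -n 1
```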
```bash
# Check operator deployment
kubectl get deployment -n llamastack-operator-system
```
```console
(base) derekh@laptop:~/workarea/llama-stack-k8s-operator$ kubectl get deployment -n llamastack-operator-system
No resources found in llamastack-operator-system namespace.
```

should be

```console
(base) derekh@laptop:~/workarea/llama-stack-k8s-operator$ kubectl get -n llama-stack-k8s-operator-system deployment/llama-stack-k8s-operator-controller-manager
NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
llama-stack-k8s-operator-controller-manager   1/1     1            1           87s
```
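If the docs keep a verification step, a variant that also waits for readiness might be friendlier (a sketch; namespace and deployment name are taken from the corrected command above):

```bash
# Block until the controller-manager deployment reports ready
kubectl -n llama-stack-k8s-operator-system rollout status \
  deployment/llama-stack-k8s-operator-controller-manager
```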
```bash
cd llama-stack-k8s-operator

# Install using Kustomize
kubectl apply -k config/default
```
This worked for me; the comments below are based on this having been run.
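For completeness, the matching uninstall for this path would presumably be the inverse apply (a sketch, untested here):

```bash
# Remove the resources created by the Kustomize install
kubectl delete -k config/default
```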
```yaml
replicas: 1
server:
  distribution:
    name: "bedrock"
```
This and a bunch of other distributions aren't built any longer; we shouldn't be documenting them, and we should probably remove them from distributions.json.
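One way to audit the documented distributions against what is actually built (a sketch; it assumes distributions.json is a flat object keyed by distribution name, which may not match the real schema):

```bash
# Print each distribution name recorded in distributions.json
jq -r 'keys[]' distributions.json
```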
2. **Apply the configuration**:

```bash
kubectl apply -f basic-deployment.yaml
```
distro doesn't start

```console
(base) derekh@laptop:~/workarea/llama-stack-k8s-operator$ oc get pod/basic-llamastack-5867bd5865-z8v77
NAME                                READY   STATUS   RESTARTS      AGE
basic-llamastack-5867bd5865-z8v77   0/1     Error    5 (87s ago)   5m1s
(base) derekh@laptop:~/workarea/llama-stack-k8s-operator$ oc logs pod/basic-llamastack-5867bd5865-z8v77
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 597, in <module>
    main()
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/server/server.py", line 422, in main
    config = replace_env_vars(config_contents)
  File "/usr/local/lib/python3.10/site-packages/llama_stack/distribution/stack.py", line 131, in replace_env_vars
    raise EnvVarError(e.var_name, e.path) from None
llama_stack.distribution.stack.EnvVarError: Environment variable 'INFERENCE_MODEL' not set or empty at models[0].model_id
```
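A sketch of a possible fix for the example manifest: pass INFERENCE_MODEL through the server container's environment. Everything below is an assumption to be verified against the CRD: the apiVersion, the containerSpec.env field path, the distribution name, and the model value are all guesses.

```bash
# Hypothetical replacement for basic-deployment.yaml; field paths and values are guesses
kubectl apply -f - <<'EOF'
apiVersion: llamastack.io/v1alpha1
kind: LlamaStackDistribution
metadata:
  name: basic
spec:
  replicas: 1
  server:
    distribution:
      name: "ollama"
    containerSpec:
      env:
        - name: INFERENCE_MODEL   # the variable the server failed on
          value: "llama3.2:1b"
EOF
```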
Description:
This PR introduces a documentation website for the Llama Stack Kubernetes operator. The documentation covers operator functionality, API references, distribution docs, and more.