
[Usage]: using open-webui with vLLM inference engine instead of ollama #10322

Open
wolfgangsmdt opened this issue Nov 14, 2024 · 3 comments
Labels
usage How to use vllm

Comments

@wolfgangsmdt

Your current environment

Not needed for this question.

How would you like to use vllm

I want to run inference through Open WebUI (or something similar) using vLLM as the backend instead of Ollama.
I have already launched the vLLM OpenAI-compatible API server and open-webui. However, it is not working.

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
wolfgangsmdt added the usage label (How to use vllm) on Nov 14, 2024
@NicolasDrapier

Hi @wolfgangsmdt

Here is the command I use to run Open-Webui with vLLM:

docker run --name=open-webui \
--env=ENABLE_RAG_WEB_SEARCH=true \
--env=RAG_WEB_SEARCH_ENGINE=duckduckgo \
--env=IMAGE_GENERATION_ENGINE=comfyui \
--env=COMFYUI_BASE_URL=http://<your-ip>:<comfy-port> \
--env=ENABLE_OLLAMA_API=false \
--env=OPENAI_API_KEY=aaaaa \
--env=ENABLE_IMAGE_GENERATION=true \
--env=IMAGE_SIZE=1024x1024 \
--env='IMAGE_GENERATION_MODEL=Stable Diffusion 3 - Medium' \
--env=WHISPER_MODEL=large \
--env=OPENAI_API_BASE_URL=http://<your-ip>:<vllm-port>/v1 \
--volume=/data/open-webui-volumes:/app/backend/data \
-p 3000:8080 \
--restart=always \
ghcr.io/open-webui/open-webui:main

It works like a charm.
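
For reference, most of the variables in that command (image generation, Whisper, web search) are optional extras; a stripped-down variant keeping only the settings that matter for the vLLM connection would look something like this (same placeholders as above, and OPENAI_API_KEY should match whatever key the vLLM server expects, if any):

docker run --name=open-webui \
--env=ENABLE_OLLAMA_API=false \
--env=OPENAI_API_BASE_URL=http://<your-ip>:<vllm-port>/v1 \
--env=OPENAI_API_KEY=aaaaa \
--volume=/data/open-webui-volumes:/app/backend/data \
-p 3000:8080 \
--restart=always \
ghcr.io/open-webui/open-webui:main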

@wolfgangsmdt (Author) commented Nov 14, 2024

Hello @NicolasDrapier,

Thank you very much for the reply. I just need a little more help.

To start vLLM server I used:

vllm serve TinyLlama/TinyLlama_v1.1 --api-key token-abc123
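
As a quick sanity check (a sketch, assuming the server is listening on the default port 8000), the OpenAI-compatible endpoints can be queried with the same key passed to --api-key; listing the models should return TinyLlama:

curl http://localhost:8000/v1/models \
-H "Authorization: Bearer token-abc123"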

To start open-webui I used your suggestion:

docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
--env=OPENAI_API_BASE_URL=http://localhost:8000/v1 \
--env=OPENAI_API_KEY=token-abc123 \
--env=ENABLE_OLLAMA_API=false \
--env=ENABLE_RAG_WEB_SEARCH=true \
--env=RAG_WEB_SEARCH_ENGINE=duckduckgo  \
ghcr.io/open-webui/open-webui:main

When I log in to open-webui, nothing shows up and I cannot access TinyLlama.
[screenshot attached]

Am I missing something in the commands?

EDIT:

I am getting Unauthorized on the server side when I start open-webui:

INFO:     localhost:43052 - "GET /v1 HTTP/1.1" 401 Unauthorized
WARNING:  Invalid HTTP request received.
WARNING:  Invalid HTTP request received.

However, when I run the following command with curl, I get an OK response, which means the server itself is running:

curl http://localhost:8000/health
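
For what it's worth, as far as I know /health is served without the API-key check, so a successful curl against /health does not exercise the token the way the /v1 endpoints do; the 401 above suggests a request reached vLLM without a valid Authorization header. A separate thing worth checking (an assumption, not something confirmed in this thread): with the container started as above, http://localhost:8000 inside the open-webui container refers to the container itself rather than to the host where vLLM is listening. A sketch of the same docker run pointed at the Docker host instead, using host.docker.internal mapped via --add-host (supported on recent Docker versions):

docker run -d -p 3000:8080 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
--add-host=host.docker.internal:host-gateway \
--env=OPENAI_API_BASE_URL=http://host.docker.internal:8000/v1 \
--env=OPENAI_API_KEY=token-abc123 \
--env=ENABLE_OLLAMA_API=false \
ghcr.io/open-webui/open-webui:main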

@wolfgangsmdt (Author)

Hello again @sri-fiddler,
Do you have any idea about this?
