
Optimization for building a more efficient backend image that requires less disk space and time #1037

Open
icejean opened this issue Jan 26, 2025 · 2 comments
icejean commented Jan 26, 2025

Docker compose build installs PyTorch 2.5.1 with the CUDA libraries automatically every time, which takes a lot of disk space and time. But you can't see the NVIDIA GPU in the container without proper setup; please read the details at How to Install PyTorch on the GPU with Docker. My solution is to run the LLM & embedding models with Ollama or vLLM outside the backend container and access them through endpoints. Thus only the CPU version of PyTorch is needed in the backend container, and the CUDA libraries are no longer needed.
So I modified the backend's Dockerfile as follows. torch 2.3.1+cpu works, along with the matching torchvision 0.18.1+cpu and torchaudio 2.3.1+cpu.
Installing them before the packages in requirements.txt is enough.

FROM python:3.10-slim
WORKDIR /code
ENV PORT=8000
EXPOSE 8000
# Install dependencies and clean up in one layer
RUN apt-get update && \
   apt-get install -y --no-install-recommends \
       libmagic1 \
       libgl1-mesa-glx \
       libreoffice \
       cmake \
       poppler-utils \
       tesseract-ocr && \
   apt-get clean && \
   rm -rf /var/lib/apt/lists/*
# Set LD_LIBRARY_PATH
ENV LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
# Copy requirements file and install Python dependencies
COPY requirements.txt /code/
# Install CPU-only builds of PyTorch, torchvision, and torchaudio
RUN pip install --no-cache-dir \
    torch==2.3.1+cpu torchvision==0.18.1+cpu torchaudio==2.3.1+cpu \
    -f https://download.pytorch.org/whl/torch_stable.html

# Install the remaining Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . /code
# Set command
CMD ["gunicorn", "score:app", "--workers", "2", "--threads", "2", "--worker-class", "uvicorn.workers.UvicornWorker", "--bind", "0.0.0.0:8000", "--timeout", "300"]

The resulting backend image is only 4GB, down from 13GB before.

(base) root@10-60-136-78:~# docker image list
REPOSITORY                   TAG       IMAGE ID       CREATED         SIZE
llm-graph-builder-frontend   latest    2183b1722a12   5 hours ago     55.8MB
llm-graph-builder-backend    latest    31779e605998   6 hours ago     4.07GB

Best regards
Jean


jexp commented Feb 12, 2025

Ideally we could do this via configuration and perhaps a multi-stage Dockerfile, so that the user doesn't need to deal with all these details.

I think if you use an external embedding model you wouldn't need PyTorch at all for the embedding?

The only place where I think it might be needed is the unstructured.io document loaders, but I'm not sure. And those should bring in their dependencies themselves?

Otherwise we can look into supporting external unstructured.io usage with an API key ... as an option.
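One way the configurable approach could be sketched is with a build argument that selects the PyTorch wheel index. This is illustrative only: the `TORCH_VARIANT` argument and this layout are an assumption, not an existing project option, and the system packages from the original Dockerfile are omitted for brevity.

```dockerfile
# Illustrative sketch: choose CPU or CUDA torch wheels at build time.
FROM python:3.10-slim
# "cpu" for CPU-only wheels, or e.g. "cu121" for CUDA 12.1 wheels
ARG TORCH_VARIANT=cpu
WORKDIR /code
COPY requirements.txt /code/
RUN pip install --no-cache-dir torch torchvision torchaudio \
        --index-url https://download.pytorch.org/whl/${TORCH_VARIANT} && \
    pip install --no-cache-dir -r requirements.txt
COPY . /code
```

Building with `docker build --build-arg TORCH_VARIANT=cpu .` would then pull only the CPU wheels, while a CUDA variant such as `cu121` would restore a GPU-capable build from the same Dockerfile.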


icejean commented Feb 12, 2025

Yes, I use an external embedding model by calling an Ollama or other provider's endpoint online, so I don't need GPU capability within the container. Many cloud services provide powerful embedding and LLM models online, so we don't need expensive local GPUs for research purposes. And if we do need a GPU locally, we can access it through Ollama, vLLM, and so on, so it isn't needed inside the container at all.
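To make the external-endpoint idea concrete, here is a minimal sketch of calling Ollama's `/api/embeddings` endpoint from the backend using only the standard library. The endpoint URL and model name (`nomic-embed-text`) are assumptions about a local Ollama deployment; the actual send is left commented out since it requires a running server.

```python
import json
import urllib.request

# Assumed endpoint of a locally running Ollama server; adjust to your deployment.
OLLAMA_EMBEDDINGS_URL = "http://localhost:11434/api/embeddings"

def build_embedding_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for Ollama's /api/embeddings API."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_EMBEDDINGS_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_embedding_request("nomic-embed-text", "hello world")

# Sending it requires a running Ollama server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     embedding = json.loads(resp.read())["embedding"]
```

With this in place, the container only needs an HTTP client, not PyTorch with CUDA, which is exactly why the CPU-only wheels suffice.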
