
ggml : build backends as libraries #10256

Merged · slaren merged 26 commits into master from sl/dl-backend on Nov 14, 2024
Conversation

@slaren (Collaborator) commented Nov 11, 2024

This PR moves each backend to its own directory with its own build script. The ggml library is split into the target ggml-base, which includes only the core ggml elements, and ggml, which bundles ggml-base and all the backends included in the build.

To completely separate the build of the CPU backend, ggml-quants.c and ggml-aarch64.c have been split so that the reference quantization and dequantization functions are in ggml-base, while the optimized quantization and dot-product functions are in ggml-cpu.

The build is organized as follows:

```mermaid
graph TD;
  application      --> libllama;
  application      --> libggml;
  libllama         --> libggml;
  libggml          --> libggml-base;
  libggml          --> libggml-cpu;
  libggml          --> libggml-backend1;
  libggml          --> libggml-backend2;
  libggml-cpu      --> libggml-base;
  libggml-backend1 --> libggml-base;
  libggml-backend2 --> libggml-base;
```

Currently, ggml needs to be linked against the backend libraries, but the ultimate goal is to load the backends dynamically at runtime, so that a single llama.cpp package can be distributed that includes all the backends, as well as multiple versions of the CPU backend compiled for different instruction sets.

Breaking changes

Applications that use ggml and llama.cpp should not require any changes; they only need to link to the ggml and llama targets as usual. However, when building with BUILD_SHARED_LIBS, additional shared libraries are produced that need to be bundled with the application: in addition to llama and ggml, the ggml-base, ggml-cpu, and any other backend libraries included in the build should be added to the application package.
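For consumers of the CMake build, the link step itself is unchanged. Below is a minimal sketch assuming llama.cpp is vendored with add_subdirectory; the my-app project name and main.cpp are hypothetical:

```cmake
cmake_minimum_required(VERSION 3.14)
project(my-app C CXX)

# llama.cpp checked out as a subdirectory (hypothetical layout)
add_subdirectory(llama.cpp)

add_executable(my-app main.cpp)

# Link the same top-level targets as before this PR; ggml-base, ggml-cpu
# and any other backend libraries in the build are pulled in transitively.
target_link_libraries(my-app PRIVATE llama ggml)
```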

  • The flag to build the HIP backend with cmake has been changed from GGML_HIPBLAS to GGML_HIP, in line with a previous change to the CUDA backend.
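For illustration, with a plain checkout the new flag would be passed at configure time as cmake -B build -DGGML_HIP=ON. A sketch of the equivalent when vendoring as in the example above:

```cmake
# Enable the HIP backend before the subdirectory is added; the old
# GGML_HIPBLAS flag is no longer recognized.
set(GGML_HIP ON)            # previously: set(GGML_HIPBLAS ON)
add_subdirectory(llama.cpp)
```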

@slaren (Collaborator, Author) commented Nov 12, 2024

This should be good now. There are a few remaining issues that I was not able to fix:

  • I tried to fix the Swift build, but I don't know how to add include directories.
  • I don't know why the Nix aarch64 build is failing.
  • The MUSA build is all over the place, and without CI there is no chance I will be able to adapt it. @yeahdongcn please take a look when you have a chance; it should be done in a similar way to the HIP build, by adding a CMakeLists.txt in a ggml-musa directory (see the sketch after this list).
  • I cannot test the CANN build either, so it is likely to be broken. @hipudding @xuedinge233 please take a look after this is merged.
  • The make build of features without CI (e.g. HIP) is likely broken.
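For the ggml-musa suggestion above, here is a minimal hypothetical sketch of what a ggml-musa/CMakeLists.txt could look like, mirroring the per-backend layout this PR introduces; the source file name and include directories are placeholders, and MUSA-specific compiler setup is omitted:

```cmake
# ggml-musa/CMakeLists.txt (hypothetical sketch, not the actual implementation)
add_library(ggml-musa
            ggml-musa.cpp)          # placeholder source file

# Each backend links only against ggml-base, per the dependency graph above.
target_link_libraries(ggml-musa PRIVATE ggml-base)
target_include_directories(ggml-musa PRIVATE . ..)
```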

Other important changes:

  • The flag to build the HIP backend with cmake has been changed from GGML_HIPBLAS to GGML_HIP, in line with a previous change to the CUDA backend.

@yeahdongcn (Contributor)

> The MUSA build is all over the place, and without CI there is no chance I will be able to adapt it. @yeahdongcn please take a look when you have a chance; it should be done in a similar way to the HIP build, by adding a CMakeLists.txt in a ggml-musa directory.

@slaren Got it! I can start working on it.

@hipudding (Collaborator)

> I cannot test the CANN build either, so it is likely to be broken. @hipudding @xuedinge233 please take a look after this is merged.

Thanks, I will check.

@yeahdongcn (Contributor)

During testing of the MUSA backend in Docker, I found that the library hierarchy in .devops/*.Dockerfile needs updates. The musa-light container, for instance, failed to run and returned the following error: `/llama-cli: error while loading shared libraries: libggml-cpu.so: cannot open shared object file: No such file or directory`.

@slaren (Collaborator, Author) commented Nov 13, 2024

Right, the build now produces more libraries that also need to be copied to the container. Thanks for testing; I will update the Dockerfiles after the Swift and MUSA fixes are merged here.

@yeahdongcn (Contributor)

> Right, the build now produces more libraries that also need to be copied to the container. Thanks for testing; I will update the Dockerfiles after the Swift and MUSA fixes are merged here.

Verified that the light-musa Docker image now functions as expected.

```sh
docker run -it -v $HOME/models:/models local/llama.cpp:light-musa \
    -m /models/llama3.2_1b_q8_0.gguf -ngl 999 -n 512 -co -cnv \
    -p "You are a helpful assistant."
```

@slaren merged commit ae8de6d into master on Nov 14, 2024, with 55 checks passed. @slaren deleted the sl/dl-backend branch on November 14, 2024 at 17:04.