ggml : build backends as libraries #10256
This should be good now. There are a few remaining issues that I was not able to fix:
Other important changes:
@slaren Got it! I can start working on it.

Thanks, I will check.

During testing of the

Right, the build now produces more libraries that also need to be copied to the container. Thanks for testing, I will update the dockerfiles after the Swift and MUSA fixes are merged here.
Verified that the `light-musa` image works:

```sh
$ docker run -it -v $HOME/models:/models local/llama.cpp:light-musa \
    -m /models/llama3.2_1b_q8_0.gguf -ngl 999 -n 512 -co -cnv \
    -p "You are a helpful assistant."
```
Moves each backend to a different directory with its own build script. The ggml library is split into the target `ggml-base`, which only includes the core ggml elements, and `ggml`, which bundles `ggml-base` and all the backends included in the build.

To completely separate the build of the CPU backend, `ggml-quants.c` and `ggml-aarch64.c` have been split such that the reference quantization and dequantization functions are in `ggml-base`, and the optimized quantization and dot product functions are in `ggml-cpu`.

The build is organized as such:
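A minimal CMake sketch of this layout (illustrative only: the target names follow the description above, but the directory structure and source file lists are assumed rather than taken from the actual build scripts):

```cmake
# Illustrative sketch -- paths and file lists are assumed, not the real scripts.
# ggml-base: core ggml plus the reference (de)quantization functions.
add_library(ggml-base
    ggml.c
    ggml-alloc.c
    ggml-backend.cpp
    ggml-quants.c      # reference quantization/dequantization
    ggml-aarch64.c)    # reference parts only, after the split

# Each backend lives in its own directory with its own build script.
add_subdirectory(ggml-cpu)              # optimized quants and dot products
if (GGML_CUDA)
    add_subdirectory(ggml-cuda)
endif()

# ggml: bundles ggml-base and all backends included in the build.
add_library(ggml ggml-backend-reg.cpp)
target_link_libraries(ggml PUBLIC ggml-base PRIVATE ggml-cpu)
if (GGML_CUDA)
    target_link_libraries(ggml PRIVATE ggml-cuda)
endif()
```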
Currently, ggml needs to be linked to the backend libraries, but ultimately the goal is to load the backends dynamically at runtime, so that we can distribute a single llama.cpp package that includes all the backends, as well as multiple versions of the CPU backend compiled with different instruction sets.
Breaking changes
- Applications that use ggml and llama.cpp should not require any changes; they only need to link to the `ggml` and `llama` targets as usual. However, when building with `BUILD_SHARED_LIBS`, additional shared libraries are produced that need to be bundled with the application: in addition to `llama` and `ggml`, the `ggml-base`, `ggml-cpu`, and any other backend libraries included in the build should be added to the application package.
- `GGML_HIPBLAS` has been renamed to `GGML_HIP`, in line with a previous change to the CUDA backend.
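In practice, bundling might look like the following hypothetical consumer-side CMake fragment (the library target names come from this PR; the application name and install layout are only examples):

```cmake
# Hypothetical packaging fragment for an application using llama.cpp.
target_link_libraries(my_app PRIVATE llama ggml)   # linking is unchanged

# With BUILD_SHARED_LIBS=ON, these shared libraries must ship alongside the
# application binary, plus any other enabled backends (e.g. ggml-cuda).
install(TARGETS my_app RUNTIME DESTINATION bin)
install(FILES
    $<TARGET_FILE:llama>
    $<TARGET_FILE:ggml>
    $<TARGET_FILE:ggml-base>
    $<TARGET_FILE:ggml-cpu>
    DESTINATION lib)
```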