sycl : Fixes broken build and test-backend-ops #10257

Alcpz · 2024-11-11T21:40:10Z

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

- Fixes the broken build of the SYCL CUDA backend, caused by a non-explicit gemm call in outprod (merged with RWKV6 in "Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration" #10133).
- Marks permuted MUL_MAT as unsupported so that test-backend-ops can run.
- Fixes asserts in norm so that debug builds work.

Tests confirmed passing on an Nvidia A100 and an Intel Data Center GPU Max 1100.
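
To illustrate what "marking permuted MUL_MAT as unsupported" can mean, here is a minimal sketch in Python rather than the backend's C++. It assumes a tensor is described by its byte strides listed from the innermost dimension outward, and treats it as permuted when those strides are not non-decreasing. The names `is_permuted` and `supports_mul_mat`, and the choice to reject the op when either operand is permuted, are hypothetical and are not taken from the PR diff.

```python
def is_permuted(strides_bytes):
    """Return True when byte strides are not non-decreasing from the
    innermost dimension outward (assumed layout convention)."""
    return any(a > b for a, b in zip(strides_bytes, strides_bytes[1:]))


def supports_mul_mat(src0_strides, src1_strides):
    """Hypothetical capability check: report MUL_MAT as unsupported
    when either operand is permuted, so a test harness skips the case
    instead of running a kernel that cannot handle it."""
    return not (is_permuted(src0_strides) or is_permuted(src1_strides))


# A contiguous f32 tensor of shape [4, 3, 2, 1] has strides [4, 16, 48, 96].
contiguous = [4, 16, 48, 96]
# Swapping the two innermost dimensions gives strides [16, 4, 48, 96].
swapped = [16, 4, 48, 96]

print(supports_mul_mat(contiguous, contiguous))  # True
print(supports_mul_mat(swapped, contiguous))     # False
```

With a check of this kind, a harness such as test-backend-ops can skip the permuted cases rather than fail on them.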

Alcpz · 2024-11-11T21:53:55Z

@airMeng I understand you were fixing the unsupported permuted MUL_MAT in #10041, but since there are some issues with the SYCL CI and it seems that it could take longer, can we merge this?

airMeng · 2024-11-12T00:53:32Z

Could you cherry-pick the norm-related cases from #10041 too? They only crash in debug builds.

Alcpz · 2024-11-12T09:38:09Z

Added the changes

Rbiessy

The oneMKL changes look good to me.

easyfab · 2024-11-14T17:02:47Z

These commits negatively affect Intel GPUs. Is this expected?

For example:
Before:

ggml_sycl_init: GGML_SYCL_FORCE_MMQ:   no
ggml_sycl_init: SYCL_USE_XMX: yes
ggml_sycl_init: found 1 SYCL devices:
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
[SYCL] call ggml_check_sycl
ggml_check_sycl: GGML_SYCL_DEBUG: 0
ggml_check_sycl: GGML_SYCL_F16: no
found 1 SYCL devices:
|  |                   |                                       |       |Max    |        |Max  |Global |                     |
|  |                   |                                       |       |compute|Max work|sub  |mem    |                     |
|ID|        Device Type|                                   Name|Version|units  |group   |group|size   |       Driver version|
|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]|                 Intel Iris Xe Graphics|    1.6|     96|     512|   32| 31604M|            1.3.31441|
| qwen2 1.5B Q5_K - Medium       |   1.22 GiB |     1.78 B | SYCL       |  99 |         pp512 |        358.62 ± 8.26 |
| qwen2 1.5B Q5_K - Medium       |   1.22 GiB |     1.78 B | SYCL       |  99 |         tg128 |         13.10 ± 0.34 |

build: 80dd7ff2 (4068)

After:

ggml_sycl_init: GGML_SYCL_FORCE_MMQ:   no
ggml_sycl_init: SYCL_USE_XMX: yes
ggml_sycl_init: found 1 SYCL devices:
| model                          |       size |     params | backend    | ngl |          test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------: | -------------------: |
[SYCL] call ggml_check_sycl
ggml_check_sycl: GGML_SYCL_DEBUG: 0
ggml_check_sycl: GGML_SYCL_F16: no
found 1 SYCL devices:
|  |                   |                                       |       |Max    |        |Max  |Global |                     |
|  |                   |                                       |       |compute|Max work|sub  |mem    |                     |
|ID|        Device Type|                                   Name|Version|units  |group   |group|size   |       Driver version|
|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]|                 Intel Iris Xe Graphics|    1.6|     96|     512|   32| 31604M|            1.3.31441|
| qwen2 1.5B Q5_K - Medium       |   1.22 GiB |     1.78 B | SYCL       |  99 |         pp512 |       276.80 ± 13.68 |
| qwen2 1.5B Q5_K - Medium       |   1.22 GiB |     1.78 B | SYCL       |  99 |         tg128 |         10.64 ± 0.25 |

build: 2e82ffa4 (4069)

Reverting these commits on top of master restores the performance.
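
For reference, the size of the slowdown can be worked out from the two llama-bench runs above. This is a small sketch using only the mean t/s values reported for builds 80dd7ff2 and 2e82ffa4; the helper name `percent_change` is illustrative.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Mean tokens/s from the runs above: (before, after).
results = {
    "pp512": (358.62, 276.80),
    "tg128": (13.10, 10.64),
}

for test, (before, after) in results.items():
    print(f"{test}: {percent_change(before, after):+.1f}%")
# pp512: -22.8%
# tg128: -18.8%
```

Both drops are larger than the reported run-to-run deviations (± 8.26 and ± 13.68 t/s for pp512, ± 0.34 and ± 0.25 t/s for tg128).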

sycl : Fixes RWKV6 broken build in the cuda backend

17b8a2e

github-actions bot added the ggml and SYCL labels Nov 11, 2024

sycl : marks permuted MUL_MAT as unsupported

f6ea8b7

Alcpz force-pushed the Alcpz/sycl-backend-build-fix branch from 06cb3c6 to f6ea8b7 Compare November 11, 2024 21:41

Alcpz requested a review from airMeng November 11, 2024 21:54

sycl : fix norm asserts in debug build

6a2c025

Alcpz requested a review from NeoZhangJianyu November 12, 2024 10:53

Rbiessy approved these changes Nov 12, 2024


airMeng approved these changes Nov 12, 2024


Alcpz merged commit 2e82ffa into ggerganov:master Nov 13, 2024
53 checks passed
