Description
Windows, tiny.en:
(base) PS F:\githubsources\whisper.cpp> .\build\bin\Release\main.exe -m F:\Downloads\ggml-tiny.en.bin -l auto F:\githubsources\whisper.cpp\samples\jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'F:\Downloads\ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 384
whisper_model_load: n_text_head = 6
whisper_model_load: n_text_layer = 4
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 1 (tiny)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs = 99
ggml_opencl: selecting platform: 'NVIDIA CUDA'
ggml_opencl: selecting device: 'NVIDIA GeForce GTX 1650'
ggml_opencl: device FP16 support: false
whisper_model_load: CPU buffer size = 77.18 MB
whisper_model_load: model size = 77.11 MB
whisper_init_state: kv self size = 8.26 MB
whisper_init_state: kv cross size = 9.22 MB
whisper_init_state: compute buffer (conv) = 12.17 MB
whisper_init_state: compute buffer (encode) = 64.92 MB
whisper_init_state: compute buffer (cross) = 4.01 MB
whisper_init_state: compute buffer (decode) = 96.02 MB
system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |
main: WARNING: model is not multilingual, ignoring language and translation options
main: processing 'F:\githubsources\whisper.cpp\samples\jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:07.960] And so my fellow Americans ask not what your country can do for you
[00:00:07.960 --> 00:00:10.760] ask what you can do for your country.
whisper_print_timings: load time = 1354.68 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 12.33 ms
whisper_print_timings: sample time = 103.26 ms / 139 runs ( 0.74 ms per run)
whisper_print_timings: encode time = 395.21 ms / 1 runs ( 395.21 ms per run)
whisper_print_timings: decode time = 7.05 ms / 2 runs ( 3.52 ms per run)
whisper_print_timings: batchd time = 166.18 ms / 133 runs ( 1.25 ms per run)
whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: total time = 2045.60 ms
Windows, small (q4_k):
(base) PS F:\githubsources\whisper.cpp> .\build\bin\Release\main.exe -m .\models\ggml-small-q4_k.bin -l auto F:\githubsources\whisper.cpp\samples\jfk.wav
whisper_init_from_file_with_params_no_state: loading model from '.\models\ggml-small-q4_k.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 768
whisper_model_load: n_text_head = 12
whisper_model_load: n_text_layer = 12
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 12
whisper_model_load: qntvr = 2
whisper_model_load: type = 3 (small)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
ggml_opencl: selecting platform: 'NVIDIA CUDA'
ggml_opencl: selecting device: 'NVIDIA GeForce GTX 1650'
ggml_opencl: device FP16 support: false
whisper_model_load: CPU buffer size = 145.05 MB
whisper_model_load: model size = 144.86 MB
whisper_init_state: kv self size = 49.55 MB
whisper_init_state: kv cross size = 55.30 MB
whisper_init_state: compute buffer (conv) = 20.23 MB
whisper_init_state: compute buffer (encode) = 128.14 MB
whisper_init_state: compute buffer (cross) = 6.31 MB
whisper_init_state: compute buffer (decode) = 97.40 MB
system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |
main: processing 'F:\githubsources\whisper.cpp\samples\jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = auto, task = transcribe, timestamps = 1 ...
whisper_full_with_state: auto-detected language: en (p = 0.472618)
[00:00:00.000 --> 00:00:30.000] .
whisper_print_timings: load time = 1398.95 ms
whisper_print_timings: fallbacks = 1 p / 1 h
whisper_print_timings: mel time = 12.33 ms
whisper_print_timings: sample time = 398.21 ms / 312 runs ( 1.28 ms per run)
whisper_print_timings: encode time = 4430.49 ms / 2 runs ( 2215.24 ms per run)
whisper_print_timings: decode time = 1839.86 ms / 227 runs ( 8.11 ms per run)
whisper_print_timings: batchd time = 456.34 ms / 82 runs ( 5.57 ms per run)
whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: total time = 8543.37 ms
Only the tiny.en run on Windows with the NVIDIA CLBlast backend produces a correct transcription. The small q4_k model on Windows yields garbage (a single "." for the whole 11-second clip, as shown above), and every model run on Android also produces incorrect results.