Description
I have a 32-bit ARMv7 CPU on an Odroid-XU4 running Armbian 21.08.6 Focal with Linux 5.4.160-odroidxu4.
The out-of-the-box build compiles, but the resulting binary crashes at runtime:
./main -m models/ggml-medium.bin -otxt samples/jfk.wav
....
Illegal instruction
Looking at the Makefile, it appears to select the wrong -mfpu (FPU) option for this CPU:
https://github.com/ggerganov/whisper.cpp/blob/ab1916fc598cc364b521a6d24752c4b092553e40/Makefile#L149
I changed the -mfpu value as shown below, but now the run ends with Killed instead:
diff --git a/Makefile b/Makefile
index 20915e3..f20bb5d 100644
--- a/Makefile
+++ b/Makefile
@@ -147,8 +147,11 @@ ifneq ($(filter armv6%,$(UNAME_M)),)
CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access
endif
ifneq ($(filter armv7%,$(UNAME_M)),)
+ # this label looks wrong - matches armv7l, which is ONLY 32-bit
# Raspberry Pi 4
- CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
+ #CFLAGS += -mfpu=neon-fp-armv8 -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
+ # cc: note: valid arguments to ‘-mfpu=’ are: auto crypto-neon-fp-armv8 fp-armv8 fpv4-sp-d16 fpv5-d16 fpv5-sp-d16 neon neon-fp-armv8 neon-fp16 neon-vfpv3 neon-vfpv4 vfp vfp3 vfpv2 vfpv3 vfpv3-d16 vfpv3-d16-fp16 vfpv3-fp16 vfpv3xd vfpv3xd-fp16 vfpv4 vfpv4-d16; did you mean ‘neon-fp-armv8’?
+ CFLAGS += -mfpu=neon -mfp16-format=ieee -mno-unaligned-access -funsafe-math-optimizations
endif
ifneq ($(filter armv8%,$(UNAME_M)),)
# Raspberry Pi 4
$ ./main -m models/ggml-medium.bin -otxt samples/jfk.wav
whisper_init_from_file: loading model from 'models/ggml-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 4
whisper_model_load: mem required = 1720.00 MB (+ 43.00 MB per decoder)
whisper_model_load: kv self size = 42.00 MB
whisper_model_load: kv cross size = 140.62 MB
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 1462.35 MB
Killed
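A bare "Killed" with no other message usually means the process received SIGKILL, and one common source is the kernel's out-of-memory killer. That is an assumption here, not something the log confirms. A minimal sketch for comparing the "mem required" figure from the load output against what the system reports as available (the helper names `mem_required_mb`, `mem_available_mb`, and `model_fits` are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from whisper.cpp.

# Read whisper.cpp load output on stdin and print the whole-MB part of
# the "mem required = NNNN.NN MB" line.
mem_required_mb() {
    sed -n 's/.*mem required *= *\([0-9][0-9]*\)\..*/\1/p'
}

# Print MemAvailable from /proc/meminfo in MB (the field is given in kB).
mem_available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# Succeed when the required amount ($1, MB) fits in the available amount ($2, MB).
model_fits() {
    [ "$1" -le "$2" ]
}

# Example use (not run here):
#   need=$(./main -m models/ggml-medium.bin samples/jfk.wav 2>&1 | mem_required_mb)
#   model_fits "$need" "$(mem_available_mb)" || echo "model probably too large for this board"
```

If the figures are close, `dmesg | grep -i 'out of memory'` right after the crash may show whether the OOM killer was involved, and trying a smaller model would be a quick way to separate a memory problem from a compiler-flag problem.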
Platform info:
$ uname -a
Linux odroidxu4 5.4.160-odroidxu4 #21.08.6 SMP PREEMPT Mon Nov 22 12:18:25 UTC 2021 armv7l armv7l armv7l GNU/Linux
$ cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 3 (v7l)
BogoMIPS : 36.00
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc07
CPU revision : 3
...
...
processor : 7
model name : ARMv7 Processor rev 3 (v7l)
BogoMIPS : 36.00
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc0f
CPU revision : 3
Hardware : Hardkernel ODROID-XU4
Revision : 0100
Serial : 0000000000000000
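The Features line above lists both neon and vfpv4, and the CPU part numbers 0xc07 and 0xc0f correspond to Cortex-A7 and Cortex-A15, neither of which implements the ARMv8 FP unit that neon-fp-armv8 targets. A rough sketch of mapping a Features string to one of the -mfpu values GCC listed as valid (the function name `suggest_mfpu` and the mapping rules are my own simplification, not anything the Makefile does):

```shell
#!/bin/sh
# Hypothetical helper: map a /proc/cpuinfo "Features" string to an -mfpu value.
# Only the NEON cases are handled; everything else falls back to "auto".
suggest_mfpu() {
    features=" $1 "                 # pad so whole-word matches work at the ends
    case "$features" in
        *" neon "*)
            case "$features" in
                *" vfpv4 "*) echo "neon-vfpv4" ;;   # NEON plus VFPv4
                *)           echo "neon" ;;         # NEON only
            esac
            ;;
        *)
            echo "auto"             # let GCC derive it from -mcpu/-march
            ;;
    esac
}

# Example use (not run here):
#   suggest_mfpu "$(grep -m1 '^Features' /proc/cpuinfo | cut -d: -f2)"
```

For the Features line reported on this board, the sketch prints neon-vfpv4, which would be one more candidate to try in place of -mfpu=neon.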
I've also tried building without an -mfpu flag, but compilation fails:
ggml.c:153:10: fatal error: immintrin.h: No such file or directory
153 | #include <immintrin.h>
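immintrin.h is an x86 intrinsics header, so this error suggests the source fell through to its x86 code path, presumably because the compiler's default FPU setting does not define `__ARM_NEON`. That reading of ggml.c is an assumption on my part. One way to check what a given flag set defines is to dump the predefined macros and look for the NEON one (the helper name `has_neon_macro` is made up):

```shell
#!/bin/sh
# Hypothetical helper: read preprocessor macro definitions on stdin and
# succeed only if __ARM_NEON itself is defined (not __ARM_NEON_FP etc.).
has_neon_macro() {
    grep -q '^#define __ARM_NEON '
}

# Example use on the ARM board (not run here):
#   cc -dM -E - </dev/null | has_neon_macro && echo "default flags: NEON on"
#   cc -mfpu=neon -dM -E - </dev/null | has_neon_macro && echo "-mfpu=neon: NEON on"
```

The `-dM -E` options make the compiler print its predefined macros instead of compiling, so the same check can be repeated for each -mfpu value under test.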
I'm in the process of working through the list of valid -mfpu values; posting in case anyone else hits this (and/or has ideas).