Looking at llvm/llvm-project#128450, I realised that our emulated Float16 FMA is inaccurate as well:
julia> for T in (Float16, Float32, Float64), f in (fma, muladd)
           @eval @show $(f)($(T)(0x1.400p+8), $(T)(0x1.008p+7), $(T)(0x1.000p-24))
       end
(fma)((Float16)(320.0), (Float16)(128.25), (Float16)(5.960464477539063e-8)) = Float16(4.102e4)
(muladd)((Float16)(320.0), (Float16)(128.25), (Float16)(5.960464477539063e-8)) = Float16(4.106e4)
(fma)((Float32)(320.0), (Float32)(128.25), (Float32)(5.960464477539063e-8)) = 41040.0f0
(muladd)((Float32)(320.0), (Float32)(128.25), (Float32)(5.960464477539063e-8)) = 41040.0f0
(fma)((Float64)(320.0), (Float64)(128.25), (Float64)(5.960464477539063e-8)) = 41040.000000059605
(muladd)((Float64)(320.0), (Float64)(128.25), (Float64)(5.960464477539063e-8)) = 41040.000000059605
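The first case looks like classic double rounding. The exact product 320 × 128.25 = 41040 sits exactly halfway between the Float16 neighbours 41024 and 41056, so the 2^-24 addend must tip the rounding upward; if the wide intermediate is first rounded through Float32, the addend is lost and the tie then resolves to even. (That the emulated path rounds through Float32 is my assumption about the failure mode, not a reading of the actual implementation.)

```julia
# The exact result fits in Float64: 41040 needs 16 bits, plus a sticky
# bit 40 positions below, well within the 53-bit significand.
exact = Float64(320.0) * 128.25 + 2.0^-24   # 41040.000000059605

# One rounding: the sticky 2^-24 forces the tie upward.
Float16(exact)               # Float16(4.106e4), i.e. 41056

# Two roundings: Float32 drops the 2^-24, then 41040 ties to even.
Float16(Float32(exact))      # Float16(4.102e4), i.e. 41024
```

This reproduces exactly the discrepancy between the fma and muladd lines above.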
julia> for T in (Float16, Float32, Float64), f in (fma, muladd)
           @eval @show $(f)($(T)(0x1.eb8p-12), $(T)(0x1.9p-11), $(T)(-0x1p-11))
       end
(fma)((Float16)(0.0004687309265136719), (Float16)(0.000762939453125), (Float16)(-0.00048828125)) = Float16(-0.0004878)
(muladd)((Float16)(0.0004687309265136719), (Float16)(0.000762939453125), (Float16)(-0.00048828125)) = Float16(-0.000488)
(fma)((Float32)(0.0004687309265136719), (Float32)(0.000762939453125), (Float32)(-0.00048828125)) = -0.00048792362f0
(muladd)((Float32)(0.0004687309265136719), (Float32)(0.000762939453125), (Float32)(-0.00048828125)) = -0.00048792362f0
(fma)((Float64)(0.0004687309265136719), (Float64)(0.000762939453125), (Float64)(-0.00048828125)) = -0.0004879236366832629
(muladd)((Float64)(0.0004687309265136719), (Float64)(0.000762939453125), (Float64)(-0.00048828125)) = -0.0004879236366832629
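The second case appears to be the same mechanism: working in units of 2^-22 (one Float16 ulp at this exponent), the exact result is -(0x800 - 0x1.7FFC) = -0x7FE.8004, just past the halfway point between the neighbours -0x7FE and -0x7FF. Rounding the intermediate through Float32 first lands exactly on the tie -0x7FE.8, which then resolves to even. (Again, the Float32 intermediate is my guess at the failure mode.)

```julia
# Both factors have short significands, so the product and the sum are
# exact in Float64 (27 significant bits).
exact = Float64(0x1.eb8p-12) * Float64(0x1.9p-11) - 0x1p-11

# One rounding: just past the tie, rounds away to -0x1.ffcp-12.
Float16(exact)               # Float16(-0.000488)

# Two roundings: Float32 lands on the tie, Float16 then ties to even.
Float16(Float32(exact))      # Float16(-0.0004878)
```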
julia> versioninfo()
Julia Version 1.13.0-DEV.204
Commit b9ac28a645* (2025-03-12 09:49 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin23.4.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LLVM: libLLVM-19.1.7 (ORCJIT, apple-m1)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 4 virtual cores)
The result of fma is 1 ULP off in both cases. Note that on this CPU, which has native support for the fp16 extension, muladd gives the "right" result, unlike fma (which uses the emulated FMA implementation because of #57783).
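For what it's worth, one workaround for binary16 FMA emulation is to widen all the way to Float64: the product of two 11-bit significands is exact there, and if I'm not mistaken the single Float64 rounding of a*b + c cannot create a fresh Float16 tie (the usual innocuous-double-rounding argument, since 53 bits is comfortably more than needed for 11-bit operands). A hedged sketch, with fma16 as a purely illustrative name rather than a proposed API:

```julia
# Hypothetical sketch: emulate Float16 fma via Float64. The product is
# exact in Float64, and the single Float64 rounding of the sum should be
# innocuous with respect to the final Float16 rounding.
fma16(a::Float16, b::Float16, c::Float16) =
    Float16(fma(Float64(a), Float64(b), Float64(c)))

fma16(Float16(320.0), Float16(128.25), Float16(2.0^-24))            # Float16(4.106e4)
fma16(Float16(0x1.eb8p-12), Float16(0x1.9p-11), Float16(-0x1p-11))  # Float16(-0.000488)
```

Both calls reproduce the "right" answers that the native-fp16 muladd path gives above.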