Slightly inaccurate emulated fma on Float16 #57784

Open
giordano opened this issue Mar 15, 2025 · 0 comments

Looking at llvm/llvm-project#128450, I realised that our emulated Float16 FMA is inaccurate as well:

julia> for T in (Float16, Float32, Float64), f in (fma, muladd)
           @eval @show $(f)($(T)(0x1.400p+8), $(T)(0x1.008p+7), $(T)(0x1.000p-24))
       end
(fma)((Float16)(320.0), (Float16)(128.25), (Float16)(5.960464477539063e-8)) = Float16(4.102e4)
(muladd)((Float16)(320.0), (Float16)(128.25), (Float16)(5.960464477539063e-8)) = Float16(4.106e4)
(fma)((Float32)(320.0), (Float32)(128.25), (Float32)(5.960464477539063e-8)) = 41040.0f0
(muladd)((Float32)(320.0), (Float32)(128.25), (Float32)(5.960464477539063e-8)) = 41040.0f0
(fma)((Float64)(320.0), (Float64)(128.25), (Float64)(5.960464477539063e-8)) = 41040.000000059605
(muladd)((Float64)(320.0), (Float64)(128.25), (Float64)(5.960464477539063e-8)) = 41040.000000059605

julia> for T in (Float16, Float32, Float64), f in (fma, muladd)
           @eval @show $(f)($(T)(0x1.eb8p-12), $(T)(0x1.9p-11), $(T)(-0x1p-11))
       end
(fma)((Float16)(0.0004687309265136719), (Float16)(0.000762939453125), (Float16)(-0.00048828125)) = Float16(-0.0004878)
(muladd)((Float16)(0.0004687309265136719), (Float16)(0.000762939453125), (Float16)(-0.00048828125)) = Float16(-0.000488)
(fma)((Float32)(0.0004687309265136719), (Float32)(0.000762939453125), (Float32)(-0.00048828125)) = -0.00048792362f0
(muladd)((Float32)(0.0004687309265136719), (Float32)(0.000762939453125), (Float32)(-0.00048828125)) = -0.00048792362f0
(fma)((Float64)(0.0004687309265136719), (Float64)(0.000762939453125), (Float64)(-0.00048828125)) = -0.0004879236366832629
(muladd)((Float64)(0.0004687309265136719), (Float64)(0.000762939453125), (Float64)(-0.00048828125)) = -0.0004879236366832629

julia> versioninfo()
Julia Version 1.13.0-DEV.204
Commit b9ac28a645* (2025-03-12 09:49 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin23.4.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LLVM: libLLVM-19.1.7 (ORCJIT, apple-m1)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 4 virtual cores)

In both examples, the Float16 result of fma is 1 ULP off.

Note that on this CPU, with native support for fp16 extension, muladd gives the "right" result, unlike fma (which is using the emulated fma implementation because of #57783).
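To make the 1 ULP claim concrete for the first example, here is a minimal sketch that checks which Float16 neighbour is actually closest to the exact result. The issue does not state the cause of the error. The sketch assumes, as one plausible explanation, that the emulated path computes the fma in Float32 and then rounds to Float16 (double rounding). The variable names (`exact`, `lo`, `hi`, `nearest`, `via32`) are made up for illustration.

```julia
# Inputs from the first example in this issue.
a, b, c = Float16(0x1.400p+8), Float16(0x1.008p+7), Float16(0x1.000p-24)

# For these inputs both the product and the sum fit exactly in Float64,
# so `exact` is the infinitely precise value of a*b + c.
exact = Float64(a) * Float64(b) + Float64(c)

# The two Float16 values that bracket the exact result (spacing is 32 here).
lo = Float16(41024)
hi = nextfloat(lo)   # 41056

# Pick the neighbour with the smaller distance to the exact result.
nearest = abs(Float64(lo) - exact) < abs(Float64(hi) - exact) ? lo : hi

# ASSUMPTION: emulation via a Float32 fma followed by rounding to Float16.
# The Float32 fma drops the 2^-24 term, leaving an exact tie at 41040 that
# round-to-nearest-even resolves downwards.
via32 = Float16(fma(Float32(a), Float32(b), Float32(c)))

@show exact nearest via32
```

Under that assumption, `via32` reproduces the value reported for `fma` above, while `nearest` matches the value reported for `muladd`.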
