mlugg commented Nov 14, 2025

See commit messages. (@alexrp, having implemented the requested legalization I now expect you to PR 7 backends /j)

Simplifies the logic, clarifies the comment, and fixes a minor bug,
which is that we exported the Windows ABI name *instead* of the standard
compiler-rt name, but it's meant to be exported *in addition* to the
standard name (this is LLVM's behavior and it is more useful).
Comment on lines +175 to +183
soft_f16,
/// Like `soft_f16`, but for 32-bit floating-point types.
soft_f32,
/// Like `soft_f16`, but for 64-bit floating-point types.
soft_f64,
/// Like `soft_f16`, but for 80-bit floating-point types.
soft_f80,
/// Like `soft_f16`, but for 128-bit floating-point types.
soft_f128,
Member

Should we take advantage of some of these in the LLVM backend for the situations where we currently do manual soft float lowering? IIRC this mainly affects f16 and f128 on some targets.

Member Author

No; it's strictly better in terms of both compiler performance and runtime performance to do the work in backends rather than Legalize, so ripping out that support would be counterproductive. Legalize is a tool for incomplete backends (currently technically all of them), not something they should be designed to rely on.

Member

> both compiler performance and runtime performance

Wait, why would runtime performance be worse?

Member Author

Doing the translation in Legalize means doing it at the AIR level rather than the MIR level, so it's typical for it to ultimately result in more bloated MIR. For instance, when calling int<->float conversion functions, we might need to zero/sign-extend the integer to an ABI size before we do it; or when operating on c_longdouble we may need to bitcast the result back from f128; or when calling extended int routines (for >128 bits) we might need to make an alloc so we can pass a pointer to an integer. When combined, this can cause one conversion, say u150 -> c_longdouble (with 80-bit long double), to turn into AIR like this:

%1 = block(c_longdouble, {
  %2 = intcast(u256, [original operand])
  %3 = alloc(*u256)
  %4 = store(%3, %2)
  %5 = legalize_compiler_rt_call(__floatuneixf, [%3, <usize, 150>])
  %6 = bitcast(c_longdouble, %5)
  %7 = br(%1, %6)
})

That's a lot of operations, and a typical non-optimizing backend might lower them much less efficiently than it could lower the operation as a whole. For instance, the operand is quite likely to already be spilled to the stack and represented in memory as a zero-extended u256, so the intcast/alloc/store should all be no-ops, but when lowering those AIR instructions a backend is pretty much guaranteed to, at the very least, reserve more stack space and shuffle memory around. This applies less to the LLVM backend than to machine code backends, because we aren't dealing with register allocation / spills manually in the LLVM backend, but it's still definitely going to happen.
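
As a rough source-level analogue of the AIR above, here is a minimal Zig sketch of the same widen / alloc / store / call sequence. It rests on assumptions: `wideIntToFloat` is a hypothetical stand-in for the compiler-rt routine (it is not the real `__floatuneixf` signature), and the result type is `f64` rather than `c_longdouble` so the example does not depend on the target's long double.

```zig
const std = @import("std");

/// Hypothetical stand-in for an arbitrary-width conversion routine that
/// takes a pointer to the integer plus the number of meaningful bits.
fn wideIntToFloat(ptr: *const u256, bits: usize) f64 {
    _ = bits; // a real routine would use this to know how many bits are valid
    return @floatFromInt(ptr.*);
}

/// Mirrors the legalized sequence step by step.
fn convertLegalized(operand: u150) f64 {
    const widened: u256 = operand; // intcast: zero-extend to the wider type
    var slot: u256 = undefined; // alloc: reserve a stack slot
    slot = widened; // store: copy the widened value into the slot
    return wideIntToFloat(&slot, 150); // call with pointer and bit count
}

test "legalized-style conversion matches the direct conversion" {
    const x: u150 = 1000;
    const direct: f64 = @floatFromInt(x);
    try std.testing.expectEqual(direct, convertLegalized(x));
}
```

Written in one step, the conversion is just `@floatFromInt(x)`. The extra local and copy in `convertLegalized` correspond to the additional stack space and memory traffic described above.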


mlugg commented Nov 14, 2025

Oh, for posterity, here's the patch I used to enable these legalizations in the x86_64 backend so as to test this PR:

commit e5d30c005ceaa792775a0e353a3a8398aed5c155
Author: Matthew Lugg
Date:   Fri Nov 14 10:54:12 2025 +0000

    DO NOT MERGE; make x86_64 use soft float

diff --git a/src/codegen/x86_64/CodeGen.zig b/src/codegen/x86_64/CodeGen.zig
index b43b359de1..a391527037 100644
--- a/src/codegen/x86_64/CodeGen.zig
+++ b/src/codegen/x86_64/CodeGen.zig
@@ -73,6 +73,12 @@ pub fn legalizeFeatures(_: *const std.Target) *const Air.Legalize.Features {
         .expand_packed_store,
         .expand_packed_struct_field_val,
         .expand_packed_aggregate_init,
+
+        .soft_f16,
+        .soft_f32,
+        .soft_f64,
+        .soft_f80,
+        .soft_f128,
     });
 }
 
@@ -173690,8 +173696,40 @@ fn genBody(cg: *CodeGen, body: []const Air.Inst.Index) InnerError!void {
                 for (ops) |op| try op.die(cg);
             },
 
-            // No soft-float `Legalize` features are enabled, so this instruction never appears.
-            .legalize_compiler_rt_call => unreachable,
+            .legalize_compiler_rt_call => {
+                const inst_data = air_datas[@intFromEnum(inst)].legalize_compiler_rt_call;
+                const extra = cg.air.extraData(Air.Call, inst_data.payload);
+                const args: []const Air.Inst.Ref = @ptrCast(cg.air.extra.items[extra.end..][0..extra.data.args_len]);
+
+                var sfba_state = std.heap.stackFallback(512, cg.gpa);
+                const sfba = sfba_state.get();
+
+                const arg_tys = try sfba.alloc(Type, args.len);
+                defer sfba.free(arg_tys);
+                const arg_tys_ip = try sfba.alloc(InternPool.Index, args.len);
+                defer sfba.free(arg_tys_ip);
+                const arg_vals = try sfba.alloc(MCValue, args.len);
+                defer sfba.free(arg_vals);
+
+                for (arg_tys, arg_tys_ip, arg_vals, args) |*ty, *ty_ip, *mcv, arg| {
+                    ty.* = cg.typeOf(arg);
+                    ty_ip.* = ty.*.toIntern();
+                    mcv.* = .{ .air_ref = arg };
+                }
+
+                assert(inst_data.func.@"callconv"(zcu.getTarget()).eql(cg.target.cCallingConvention().?));
+                const ret = try cg.genCall(.{ .extern_func = .{
+                    .return_type = inst_data.func.returnType().toIntern(),
+                    .param_types = arg_tys_ip,
+                    .sym = inst_data.func.name(cg.target),
+                } }, arg_tys, arg_vals, .{ .safety = true });
+
+                var bt = cg.liveness.iterateBigTomb(inst);
+                for (args) |arg| try cg.feed(&bt, arg);
+
+                const result = if (cg.liveness.isUnused(inst)) .unreach else ret;
+                cg.finishAirResult(inst, result);
+            },
 
             .work_item_id, .work_group_size, .work_group_id => unreachable,
         }

A new `Legalize.Feature` tag is introduced for each float bit width
(16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and
comparison operations on `f16` are converted to calls to the appropriate
compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`.
This includes casts where the source *or* target type is `f16`, or
integer<=>float conversions to or from `f16`. Occasionally, operations
are legalized to blocks because there is extra code required; for
instance, legalizing `@floatFromInt` where the integer type is larger
than 64 bits requires calling an arbitrary-width integer conversion
function which accepts a pointer to the integer, so we need to use
`alloc` to create such a pointer, and store the integer there (after
possibly zero-extending or sign-extending it).
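
For illustration only, here is a small Zig sketch of the kinds of source-level `f16` operations the paragraph above covers: arithmetic, comparison, float casts, and integer<=>float conversions. The helper `f16Ops` and the `Results` struct are hypothetical names made up for this example. The sketch does not show which compiler_rt function each operation maps to.

```zig
const std = @import("std");

const Results = struct {
    sum: f16,
    less: bool,
    widened: f32,
    narrowed: f16,
    from_int: f16,
    to_int: u32,
};

/// Each field below comes from an operation involving `f16`.
fn f16Ops(a: f16, b: f16, n: u32) Results {
    const widened: f32 = a; // cast where the source type is f16
    const narrowed: f16 = @floatCast(widened); // cast where the target type is f16
    const from_int: f16 = @floatFromInt(n); // integer to f16
    const to_int: u32 = @intFromFloat(b); // f16 to integer (truncates toward zero)
    return .{
        .sum = a + b, // arithmetic
        .less = a < b, // comparison
        .widened = widened,
        .narrowed = narrowed,
        .from_int = from_int,
        .to_int = to_int,
    };
}

test "f16 operations behave the same however they are lowered" {
    const r = f16Ops(1.5, 2.25, 7);
    try std.testing.expectEqual(@as(f16, 3.75), r.sum);
    try std.testing.expect(r.less);
    try std.testing.expectEqual(@as(f32, 1.5), r.widened);
    try std.testing.expectEqual(@as(f16, 1.5), r.narrowed);
    try std.testing.expectEqual(@as(f16, 7.0), r.from_int);
    try std.testing.expectEqual(@as(u32, 2), r.to_int);
}
```

The observable results should be identical whether a backend lowers these operations natively or through the legalization.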

No backend currently uses these new legalizations (and as such, no
backend currently needs to implement `.legalize_compiler_rt_call`).
However, for testing purposes, I tried modifying the self-hosted x86_64
backend to enable all of the soft-float features (and implement the AIR
instruction). This modified backend was able to pass all of the behavior
tests (except for one `@mod` test where the LLVM backend has a bug
resulting in incorrect compiler-rt behavior!), including the tests
specific to the self-hosted x86_64 backend.

`f16` and `f80` legalizations are likely of particular interest to
backend developers, because most architectures do not have instructions
to operate on these types. However, enabling *all* of these legalization
passes can be useful when developing a new backend to hit the ground
running and pass a good amount of tests more easily.

mlugg force-pushed the legalize-soft-float branch from 5e9f06e to 421bc43 on November 14, 2025 15:04