Commit 3e1ae47

optimizer: inline abstract union-split callsite (JuliaLang#44512)
Currently the optimizer handles an abstract callsite only when there is a single dispatch candidate (in most cases), so inlining and static dispatch are prohibited when the callsite is union-split (in other words, union-splitting happens only when all the dispatch candidates are concrete).

However, certain patterns of code (most notably our Julia-level compiler code) inherently need to deal with abstract callsites. The following example is taken from a `Core.Compiler` utility:

```julia
julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name
isType (generic function with 1 method)

julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (x isa Main.DataType)::Bool
└──      goto #3 if not %1
2 ─ %3 = π (x, DataType)
│   %4 = Base.getfield(%3, :name)::Core.TypeName
│   %5 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %6 = (%4 === %5)::Bool
└──      goto #4
3 ─      goto #4
4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool
└──      return %9
) => Bool

julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1 = (isa)(x, Nothing)::Bool
└──      goto #3 if not %1
2 ─      goto #4
3 ─ %4 = Main.isType(x)::Bool
└──      goto #4
4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool
└──      return %6
) => Bool
```

(Note that this is a limitation of the inlining algorithm, so user-provided hints such as callsite inlining annotations don't help here.)

This commit enables inlining and static dispatch for abstract union-split callsites.
The core idea here is that we can simulate our dispatch semantics by generating `isa` checks in order of the specificity of the dispatch candidates:

```julia
julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, successful inlining
           isType(x)
       end |> only
CodeInfo(
1 ─ %1  = (isa)(x, Nothing)::Bool
└──       goto #3 if not %1
2 ─       goto #9
3 ─ %4  = (isa)(x, Type)::Bool
└──       goto #8 if not %4
4 ─ %6  = π (x, Type)
│   %7  = (%6 isa Main.DataType)::Bool
└──       goto #6 if not %7
5 ─ %9  = π (%6, DataType)
│   %10 = Base.getfield(%9, :name)::Core.TypeName
│   %11 = Base.getfield(Type{T}, :name)::Core.TypeName
│   %12 = (%10 === %11)::Bool
└──       goto #7
6 ─       goto #7
7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool
└──       goto #9
8 ─       Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{}
└──       unreachable
9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool
└──       return %19
) => Bool
```

Inlining/static dispatch of abstract union-split callsites improves performance in such situations (and so this commit improves the latency of our JIT compilation). In particular, this commit helps us avoid excessive specialization of `Core.Compiler` code by statically resolving `@nospecialize`d callsites, and as a result, the number of precompiled statements is reduced from `2005` ([`master`](f782430)) to `1912` (this commit).

As a side effect, the implementation of our inlining algorithm is also much simpler now, since we no longer need the previous special handling for abstract callsites.

One possible drawback would be increased code size.
This change does seem to increase the size of the sysimage, but I think these numbers are in an acceptable range:

> [`master`](f782430)
```
❯ du -shk usr/lib/julia/*
17604   usr/lib/julia/corecompiler.ji
194072  usr/lib/julia/sys-o.a
169424  usr/lib/julia/sys.dylib
23784   usr/lib/julia/sys.dylib.dSYM
103772  usr/lib/julia/sys.ji
```

> this commit
```
❯ du -shk usr/lib/julia/*
17512   usr/lib/julia/corecompiler.ji
195588  usr/lib/julia/sys-o.a
170908  usr/lib/julia/sys.dylib
23776   usr/lib/julia/sys.dylib.dSYM
105360  usr/lib/julia/sys.ji
```
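The `isa`-chain idea from the commit message can be mimicked at the user level. The following is only a minimal sketch: `f` and `f_split` are hypothetical names that do not appear in the commit, and `invoke` stands in for the statically resolved (or inlined) call that the optimizer would emit in each branch.

```julia
# Hypothetical methods, listed from most to least specific.
f(x::Int)    = "Int"
f(x::Number) = "Number"
f(x::Any)    = "Any"

# Hand-written analogue of the `isa` chain for an abstract callsite `f(x::Any)`.
# Branches are tested in order of method specificity, and each branch calls
# exactly the method whose signature was just checked.
function f_split(@nospecialize x)
    if isa(x, Int)
        return invoke(f, Tuple{Int}, x)
    elseif isa(x, Number)
        return invoke(f, Tuple{Number}, x)
    else # `isa(x, Any)` always holds, so the call signature is fully covered
        return invoke(f, Tuple{Any}, x)
    end
end

# The chain agrees with ordinary dynamic dispatch on these inputs.
for x in (1, 2.5, "s", nothing)
    @assert f_split(x) == f(x)
end
```

If the `Number` check were placed before the `Int` check, an `Int` argument would take the wrong branch, which is why the candidates have to be processed in specificity order.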
1 parent f782430 commit 3e1ae47

3 files changed

+196
-90
lines changed

base/compiler/ssair/inlining.jl

Lines changed: 94 additions & 85 deletions
@@ -241,7 +241,7 @@ function cfg_inline_unionsplit!(ir::IRCode, idx::Int,
         push!(from_bbs, length(state.new_cfg_blocks))
         # TODO: Right now we unconditionally generate a fallback block
         # in case of subtyping errors - This is probably unnecessary.
-        if i != length(cases) || (!fully_covered || (!params.trust_inference && isdispatchtuple(cases[i].sig)))
+        if i != length(cases) || (!fully_covered || (!params.trust_inference))
             # This block will have the next condition or the final else case
             push!(state.new_cfg_blocks, BasicBlock(StmtRange(idx, idx)))
             push!(state.new_cfg_blocks[cond_bb].succs, length(state.new_cfg_blocks))
@@ -313,7 +313,6 @@ function ir_inline_item!(compact::IncrementalCompact, idx::Int, argexprs::Vector
     spec = item.spec::ResolvedInliningSpec
     sparam_vals = item.mi.sparam_vals
     def = item.mi.def::Method
-    inline_cfg = spec.ir.cfg
     linetable_offset::Int32 = length(linetable)
     # Append the linetable of the inlined function to our line table
     inlined_at = Int(compact.result[idx][:line])
@@ -459,6 +458,66 @@ end

 const FATAL_TYPE_BOUND_ERROR = ErrorException("fatal error in type inference (type bound)")

+"""
+    ir_inline_unionsplit!
+
+The core idea of this function is to simulate the dispatch semantics by generating
+(flat) `isa`-checks corresponding to the signatures of union-split dispatch candidates,
+and then inline their bodies into each `isa`-conditional block.
+This `isa`-based virtual dispatch requires few pre-conditions to hold in order to simulate
+the actual semantics correctly.
+
+The first one is that these dispatch candidates need to be processed in order of their specificity,
+and the corresponding `isa`-checks should reflect the method specificities, since now their
+signatures are not necessarily concrete.
+For example, given the following definitions:
+
+    f(x::Int) = ...
+    f(x::Number) = ...
+    f(x::Any) = ...
+
+and a callsite:
+
+    f(x::Any)
+
+then a correct `isa`-based virtual dispatch would be:
+
+    if isa(x, Int)
+        [inlined/resolved f(x::Int)]
+    elseif isa(x, Number)
+        [inlined/resolved f(x::Number)]
+    else # implies `isa(x, Any)`, which fully covers this call signature,
+         # otherwise we need to insert a fallback dynamic dispatch case also
+        [inlined/resolved f(x::Any)]
+    end
+
+Fortunately, `ml_matches` should already sorted them in that way, except cases when there is
+any ambiguity, from which we already bail out at this point.
+
+Another consideration is type equality constraint from type variables: the `isa`-checks are
+not enough to simulate the dispatch semantics in cases like:
+Given a definition:
+
+    g(x::T, y::T) where T<:Integer = ...
+
+transform a callsite:
+
+    g(x::Any, y::Any)
+
+into the optimized form:
+
+    if isa(x, Integer) && isa(y, Integer)
+        [inlined/resolved g(x::Integer, y::Integer)]
+    else
+        g(x, y) # fallback dynamic dispatch
+    end
+
+But again, we should already bail out from such cases at this point, essentially by
+excluding cases where `case.sig::UnionAll`.
+
+In short, here we can process the dispatch candidates in order, assuming we haven't changed
+their order somehow somewhere up to this point.
+"""
 function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
                                argexprs::Vector{Any}, linetable::Vector{LineInfoNode},
                                (; fully_covered, atype, cases, bbs)::UnionSplit,
@@ -468,17 +527,17 @@ function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
     join_bb = bbs[end]
     pn = PhiNode()
     local bb = compact.active_result_bb
-    @assert length(bbs) >= length(cases)
-    for i in 1:length(cases)
+    ncases = length(cases)
+    @assert length(bbs) >= ncases
+    for i = 1:ncases
         ithcase = cases[i]
         mtype = ithcase.sig::DataType # checked within `handle_cases!`
         case = ithcase.item
         next_cond_bb = bbs[i]
         cond = true
         nparams = fieldcount(atype)
         @assert nparams == fieldcount(mtype)
-        if i != length(cases) || !fully_covered ||
-           (!params.trust_inference && isdispatchtuple(cases[i].sig))
+        if i != ncases || !fully_covered || !params.trust_inference
             for i = 1:nparams
                 a, m = fieldtype(atype, i), fieldtype(mtype, i)
                 # If this is always true, we don't need to check for it
@@ -535,7 +594,7 @@ function ir_inline_unionsplit!(compact::IncrementalCompact, idx::Int,
     bb += 1
     # We're now in the fall through block, decide what to do
     if fully_covered
-        if !params.trust_inference && isdispatchtuple(cases[end].sig)
+        if !params.trust_inference
             e = Expr(:call, GlobalRef(Core, :throw), FATAL_TYPE_BOUND_ERROR)
             insert_node_here!(compact, NewInstruction(e, Union{}, line))
             insert_node_here!(compact, NewInstruction(ReturnNode(), Union{}, line))
@@ -558,7 +617,7 @@ function batch_inline!(todo::Vector{Pair{Int, Any}}, ir::IRCode, linetable::Vect
     state = CFGInliningState(ir)
     for (idx, item) in todo
         if isa(item, UnionSplit)
-            cfg_inline_unionsplit!(ir, idx, item::UnionSplit, state, params)
+            cfg_inline_unionsplit!(ir, idx, item, state, params)
         else
             item = item::InliningTodo
             spec = item.spec::ResolvedInliningSpec
@@ -1172,12 +1231,8 @@ function analyze_single_call!(
         sig::Signature, state::InliningState, todo::Vector{Pair{Int, Any}})
     argtypes = sig.argtypes
     cases = InliningCase[]
-    local only_method = nothing # keep track of whether there is one matching method
-    local meth::MethodLookupResult
+    local any_fully_covered = false
     local handled_all_cases = true
-    local any_covers_full = false
-    local revisit_idx = nothing
-
     for i in 1:length(infos)
         meth = infos[i].results
         if meth.ambig
@@ -1188,66 +1243,20 @@ function analyze_single_call!(
             # No applicable methods; try next union split
             handled_all_cases = false
             continue
-        else
-            if length(meth) == 1 && only_method !== false
-                if only_method === nothing
-                    only_method = meth[1].method
-                elseif only_method !== meth[1].method
-                    only_method = false
-                end
-            else
-                only_method = false
-            end
         end
-        for (j, match) in enumerate(meth)
-            any_covers_full |= match.fully_covers
-            if !isdispatchtuple(match.spec_types)
-                if !match.fully_covers
-                    handled_all_cases = false
-                    continue
-                end
-                if revisit_idx === nothing
-                    revisit_idx = (i, j)
-                else
-                    handled_all_cases = false
-                    revisit_idx = nothing
-                end
-            else
-                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases)
-            end
+        for match in meth
+            handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
+            any_fully_covered |= match.fully_covers
         end
     end

-    atype = argtypes_to_type(argtypes)
-    if handled_all_cases && revisit_idx !== nothing
-        # If there's only one case that's not a dispatchtuple, we can
-        # still unionsplit by visiting all the other cases first.
-        # This is useful for code like:
-        #    foo(x::Int) = 1
-        #    foo(@nospecialize(x::Any)) = 2
-        # where we where only a small number of specific dispatchable
-        # cases are split off from an ::Any typed fallback.
-        (i, j) = revisit_idx
-        match = infos[i].results[j]
-        handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
-    elseif length(cases) == 0 && only_method isa Method
-        # if the signature is fully covered and there is only one applicable method,
-        # we can try to inline it even if the signature is not a dispatch tuple.
-        # -- But don't try it if we already tried to handle the match in the revisit_idx
-        # case, because that'll (necessarily) be the same method.
-        if length(infos) > 1
-            (metharg, methsp) = ccall(:jl_type_intersection_with_env, Any, (Any, Any),
-                atype, only_method.sig)::SimpleVector
-            match = MethodMatch(metharg, methsp::SimpleVector, only_method, true)
-        else
-            @assert length(meth) == 1
-            match = meth[1]
-        end
-        handle_match!(match, argtypes, flag, state, cases, true) || return nothing
-        any_covers_full = handled_all_cases = match.fully_covers
+    if !handled_all_cases
+        # if we've not seen all candidates, union split is valid only for dispatch tuples
+        filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
     end

-    handle_cases!(ir, idx, stmt, atype, cases, any_covers_full && handled_all_cases, todo, state.params)
+    handle_cases!(ir, idx, stmt, argtypes_to_type(argtypes), cases,
+        handled_all_cases & any_fully_covered, todo, state.params)
 end

 # similar to `analyze_single_call!`, but with constant results
@@ -1258,8 +1267,8 @@ function handle_const_call!(
     (; call, results) = cinfo
     infos = isa(call, MethodMatchInfo) ? MethodMatchInfo[call] : call.matches
     cases = InliningCase[]
+    local any_fully_covered = false
     local handled_all_cases = true
-    local any_covers_full = false
     local j = 0
     for i in 1:length(infos)
         meth = infos[i].results
@@ -1275,42 +1284,39 @@ function handle_const_call!(
         for match in meth
             j += 1
             result = results[j]
-            any_covers_full |= match.fully_covers
+            any_fully_covered |= match.fully_covers
             if isa(result, ConstResult)
                 case = const_result_item(result, state)
                 push!(cases, InliningCase(result.mi.specTypes, case))
             elseif isa(result, InferenceResult)
-                handled_all_cases &= handle_inf_result!(result, argtypes, flag, state, cases)
+                handled_all_cases &= handle_inf_result!(result, argtypes, flag, state, cases, true)
             else
                 @assert result === nothing
-                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases)
+                handled_all_cases &= handle_match!(match, argtypes, flag, state, cases, true)
             end
         end
     end

-    # if the signature is fully covered and there is only one applicable method,
-    # we can try to inline it even if the signature is not a dispatch tuple
-    atype = argtypes_to_type(argtypes)
-    if length(cases) == 0
-        length(results) == 1 || return nothing
-        result = results[1]
-        isa(result, InferenceResult) || return nothing
-        handle_inf_result!(result, argtypes, flag, state, cases, true) || return nothing
-        spec_types = cases[1].sig
-        any_covers_full = handled_all_cases = atype <: spec_types
+    if !handled_all_cases
+        # if we've not seen all candidates, union split is valid only for dispatch tuples
+        filter!(case::InliningCase->isdispatchtuple(case.sig), cases)
     end

-    handle_cases!(ir, idx, stmt, atype, cases, any_covers_full && handled_all_cases, todo, state.params)
+    handle_cases!(ir, idx, stmt, argtypes_to_type(argtypes), cases,
+        handled_all_cases & any_fully_covered, todo, state.params)
 end

 function handle_match!(
     match::MethodMatch, argtypes::Vector{Any}, flag::UInt8, state::InliningState,
     cases::Vector{InliningCase}, allow_abstract::Bool = false)
     spec_types = match.spec_types
     allow_abstract || isdispatchtuple(spec_types) || return false
+    # we may see duplicated dispatch signatures here when a signature gets widened
+    # during abstract interpretation: for the purpose of inlining, we can just skip
+    # processing this dispatch candidate
+    _any(case->case.sig === spec_types, cases) && return true
     item = analyze_method!(match, argtypes, flag, state)
     item === nothing && return false
-    _any(case->case.sig === spec_types, cases) && return true
     push!(cases, InliningCase(spec_types, item))
     return true
 end
@@ -1346,7 +1352,9 @@ function handle_cases!(ir::IRCode, idx::Int, stmt::Expr, @nospecialize(atype),
         handle_single_case!(ir, idx, stmt, cases[1].item, todo, params)
     elseif length(cases) > 0
         isa(atype, DataType) || return nothing
-        all(case::InliningCase->isa(case.sig, DataType), cases) || return nothing
+        for case in cases
+            isa(case.sig, DataType) || return nothing
+        end
         push!(todo, idx=>UnionSplit(fully_covered, atype, cases))
     end
     return nothing
@@ -1442,7 +1450,8 @@ function assemble_inline_todo!(ir::IRCode, state::InliningState)

         analyze_single_call!(ir, idx, stmt, infos, flag, sig, state, todo)
     end
-    todo
+
+    return todo
 end

 function linear_inline_eligible(ir::IRCode)
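
The docstring added to `ir_inline_unionsplit!` in this file notes that per-argument `isa` checks cannot express a type-equality constraint from a type variable, and the new `filter!` calls keep only dispatch-tuple signatures when some candidates were not handled. The sketch below illustrates both points from user code; `g`, `diag_sig`, `x` and `y` are hypothetical names chosen for illustration and are not part of the commit.

```julia
# Hypothetical methods used only for illustration.
g(x::T, y::T) where {T<:Integer} = :same_type
g(x, y) = :fallback

# A signature that carries a type variable is a `UnionAll`, not a `DataType`.
diag_sig = Tuple{T,T} where {T<:Integer}
@assert diag_sig isa UnionAll

# Per-argument `isa` checks accept (Int, UInt8) ...
x, y = 1, 0x01
@assert isa(x, Integer) && isa(y, Integer)
# ... but the constraint that both arguments share one concrete `T` rejects it,
# so real dispatch falls through to the generic method.
@assert !(Tuple{typeof(x),typeof(y)} <: diag_sig)
@assert g(x, y) === :fallback
@assert g(1, 2) === :same_type

# Abstract signatures are not dispatch tuples, while fully concrete ones are.
@assert !Base.isdispatchtuple(Tuple{Integer,Integer})
@assert Base.isdispatchtuple(Tuple{Int,UInt8})
```

An `isa`-only split for `g` would send `(1, 0x01)` to the first method, which is why such `UnionAll` signatures are excluded before union-splitting.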

base/sort.jl

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ module Sort
 import ..@__MODULE__, ..parentmodule
 const Base = parentmodule(@__MODULE__)
 using .Base.Order
-using .Base: copymutable, LinearIndices, length, (:),
+using .Base: copymutable, LinearIndices, length, (:), iterate,
     eachindex, axes, first, last, similar, zip, OrdinalRange,
     AbstractVector, @inbounds, AbstractRange, @eval, @inline, Vector, @noinline,
     AbstractMatrix, AbstractUnitRange, isless, identity, eltype, >, <, <=, >=, |, +, -, *, !,
