Enhances SROA of mutables using the novel Julia-level escape analysis (on top of #43800):
1. alias-aware SROA, mutable ϕ-node elimination
2. `isdefined` check elimination
3. load-forwarding for non-eliminable but analyzable mutables
---
1. alias-aware SROA, mutable ϕ-node elimination
EA's alias analysis allows this new SROA to handle nested mutable allocations
well. We can now completely eliminate the heap allocations from
this deeply nested example in a single analysis/optimization pass:
```julia
julia> function refs(x)
           (Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref(Ref((x))))))))))))[][][][][][][][][][]
       end
refs (generic function with 1 method)
julia> refs("julia"); @allocated refs("julia")
0
```
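As a rough way to reproduce this kind of measurement outside the REPL, the sketch below wraps the warm-up call and `@allocated` in a helper. The names `refs3` and `allocated_after_warmup` are hypothetical and not part of this commit, and the number reported depends on the Julia build in use, so no particular value is claimed here.

```julia
# A shallower variant of `refs` above: three nested Refs, unwrapped again.
refs3(x) = Ref(Ref(Ref(x)))[][][]

# Hypothetical helper: call once so compilation is not counted,
# then report the bytes allocated by a second call.
function allocated_after_warmup(f, args...)
    f(args...)
    return @allocated f(args...)
end

bytes = allocated_after_warmup(refs3, "julia")
println("refs3 allocated ", bytes, " bytes")
```

Measuring inside a function keeps global-scope dispatch out of the measurement, although the splatted call may itself add some overhead.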
EA can also analyze the escape of a ϕ-node as well as its aliasing.
Mutable ϕ-nodes are eliminated even in a tricky case like this:
```julia
julia> code_typed((Bool,String,)) do cond, x
           # these allocations form multiple ϕ-nodes
           if cond
               ϕ2 = ϕ1 = Ref{Any}("foo")
           else
               ϕ2 = ϕ1 = Ref{Any}("bar")
           end
           ϕ2[] = x
           y = ϕ1[] # => x
           return y
       end
1-element Vector{Any}:
CodeInfo(
1 ─ goto #3 if not cond
2 ─ goto #4
3 ─ nothing::Nothing
4 ┄ return x
) => Any
```
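To see why `return x` is a valid result here, note that `ϕ1` and `ϕ2` refer to the same object on both branches, so the store through `ϕ2` is visible through `ϕ1`. The sketch below makes that aliasing observable at run time; `phi_alias` is a hypothetical name used only for illustration, and it checks the semantics the optimization must preserve, not the optimization itself.

```julia
function phi_alias(cond::Bool, x::String)
    if cond
        ϕ2 = ϕ1 = Ref{Any}("foo")
    else
        ϕ2 = ϕ1 = Ref{Any}("bar")
    end
    same = ϕ1 === ϕ2   # both names refer to one object on either path
    ϕ2[] = x           # store through one alias ...
    return same, ϕ1[]  # ... and load it back through the other
end

println(phi_alias(true, "x"))
println(phi_alias(false, "y"))
```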
With the alias analysis and ϕ-node handling above combined,
the allocations in the following "realistic" examples are optimized away:
```julia
julia> # demonstrate the power of our field / alias analysis with realistic end to end examples
       # adapted from http://wiki.luajit.org/Allocation-Sinking-Optimization#implementation%5B
       abstract type AbstractPoint{T} end
julia> struct Point{T} <: AbstractPoint{T}
           x::T
           y::T
       end
julia> mutable struct MPoint{T} <: AbstractPoint{T}
           x::T
           y::T
       end
julia> add(a::P, b::P) where P<:AbstractPoint = P(a.x + b.x, a.y + b.y);
julia> function compute_point(T, n, ax, ay, bx, by)
           a = T(ax, ay)
           b = T(bx, by)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end;
julia> function compute_point(n, a, b)
           for i in 0:(n-1)
               a = add(add(a, b), b)
           end
           a.x, a.y
       end;
julia> function compute_point!(n, a, b)
           for i in 0:(n-1)
               a′ = add(add(a, b), b)
               a.x = a′.x
               a.y = a′.y
           end
       end;
julia> compute_point(MPoint, 10, 1+.5, 2+.5, 2+.25, 4+.75);
julia> compute_point(MPoint, 10, 1+.5im, 2+.5im, 2+.25im, 4+.75im);
julia> @allocated compute_point(MPoint, 10000, 1+.5, 2+.5, 2+.25, 4+.75)
0
julia> @allocated compute_point(MPoint, 10000, 1+.5im, 2+.5im, 2+.25im, 4+.75im)
0
julia> compute_point(10, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75));
julia> compute_point(10, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im));
julia> @allocated compute_point(10000, MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75))
0
julia> @allocated compute_point(10000, MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im))
0
julia> af, bf = MPoint(1+.5, 2+.5), MPoint(2+.25, 4+.75);
julia> ac, bc = MPoint(1+.5im, 2+.5im), MPoint(2+.25im, 4+.75im);
julia> compute_point!(10, af, bf);
julia> compute_point!(10, ac, bc);
julia> @allocated compute_point!(10000, af, bf)
0
julia> @allocated compute_point!(10000, ac, bc)
0
```
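One way to read these results: once the temporaries are eliminated, the `MPoint` version should behave like the immutable `Point` version. The sketch below repeats the relevant definitions from the transcript and checks only that both variants compute the same values. It does not demonstrate allocation elimination by itself, since that depends on the compiler build.

```julia
abstract type AbstractPoint{T} end

struct Point{T} <: AbstractPoint{T}
    x::T
    y::T
end

mutable struct MPoint{T} <: AbstractPoint{T}
    x::T
    y::T
end

add(a::P, b::P) where P<:AbstractPoint = P(a.x + b.x, a.y + b.y)

function compute_point(T, n, ax, ay, bx, by)
    a = T(ax, ay)
    b = T(bx, by)
    for i in 0:(n-1)
        a = add(add(a, b), b)  # each iteration creates two temporaries
    end
    return a.x, a.y
end

immutable_result = compute_point(Point, 10, 1.5, 2.5, 2.25, 4.75)
mutable_result = compute_point(MPoint, 10, 1.5, 2.5, 2.25, 4.75)
println(immutable_result == mutable_result)
```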
2. `isdefined` check elimination
This commit also implements a simple optimization that eliminates
an `isdefined` call by checking load-forwardability.
This optimization may be especially useful for eliminating the extra allocation
involved in a capturing closure, e.g.:
```julia
julia> callit(f, args...) = f(args...);
julia> function isdefined_elim()
           local arr::Vector{Any}
           callit() do
               arr = Any[]
           end
           return arr
       end;
julia> code_typed(isdefined_elim)
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Vector{Any}, svec(Any, Int64), 0, :(:ccall), Vector{Any}, 0, 0))::Vector{Any}
└── goto #3 if not true
2 ─ goto #4
3 ─ $(Expr(:throw_undef_if_not, :arr, false))::Any
4 ┄ return %1
) => Vector{Any}
```
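A hand-written approximation of the pattern involved may help. A captured variable that is assigned inside a closure is stored in a box whose field starts out undefined, and reading it is guarded by an `isdefined` check. The `MyBox` type and `manual_isdefined_elim` below are hypothetical stand-ins modeled on `Core.Box`, not the code the compiler actually generates.

```julia
# Hypothetical stand-in for the box allocated for a captured variable.
mutable struct MyBox
    contents::Any
    MyBox() = new()  # the field starts out undefined
end

function manual_isdefined_elim()
    box = MyBox()
    box.contents = Any[]  # this store dominates the check below
    # Since the store always happens first, the check is always true and
    # the load can be forwarded from the store.
    isdefined(box, :contents) || throw(UndefVarError(:arr))
    return box.contents::Vector{Any}
end

println(manual_isdefined_elim())
```

When the store is known to reach the load, both the `isdefined` check and the box allocation become redundant, which is the situation this optimization targets.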
3. load-forwarding for non-eliminable but analyzable mutables
EA also allows us to forward loads even when the mutable allocation
can't be eliminated, as long as its fields are still known precisely.
Load forwarding can be useful because it may derive new type information
that succeeding optimization passes can use (or simply because it allows
simpler code transformations downstream of the load):
```julia
julia> code_typed((Bool,String,)) do c, s
           r = Ref{Any}(s)
           if c
               return r[]::String # adce_pass! will then also eliminate this type assertion
           else
               return r
           end
       end
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = %new(Base.RefValue{Any}, s)::Base.RefValue{Any}
└── goto #3 if not c
2 ─ return s
3 ─ return %1
) => Union{Base.RefValue{Any}, String}
```
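Written by hand, the transformation amounts to the rewrite below. The names `before_forwarding` and `after_forwarding` are hypothetical and exist only to compare the two forms; the allocation of `r` stays in both because `r` escapes on the `else` path.

```julia
# The source as written: the load goes through the Ref.
function before_forwarding(c::Bool, s::String)
    r = Ref{Any}(s)
    return c ? (r[]::String) : r
end

# After load forwarding: the load is replaced by the stored value `s`,
# whose type is already known, so the type assertion is unnecessary.
function after_forwarding(c::Bool, s::String)
    r = Ref{Any}(s)
    return c ? s : r
end

println(before_forwarding(true, "julia") == after_forwarding(true, "julia"))
```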
---
Please refer to the newly added test cases for more examples.
Also, EA's alias analysis can already reason about arrays, so
this EA-based SROA will hopefully be generalized to array SROA as well.