Serialization: replace findfirst-esque loop with static lookup table #61730

Open
adienes wants to merge 4 commits into JuliaLang:master from adienes:flat_sertag
Conversation

@adienes
Member

@adienes adienes commented May 6, 2026

replaces the findfirst-style tag scan with a static lookup table to get better performance

no tags are changed, just the way we compute them. main thing to call out for review is that this adds an __init__ statement
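To make the change concrete, here is a minimal sketch of the two strategies, not the PR's actual code. The names `DEMO_TAGS`, `demo_sertag_scan`, `DemoTagTable`, and `demo_sertag_table` are hypothetical, the tag list is a tiny stand-in for the real one, and the open-addressed layout keyed on `objectid` is an assumption drawn from the discussion in this thread. It uses `Vector` rather than `Memory` so that it also runs on older Julia versions.

```julia
# Hypothetical tag list; the real Serialization tag table is much longer.
const DEMO_TAGS = Any[Symbol, Int, Float64, String, :call, :ref, Tuple, Array]

# Baseline: scan the list and compare by identity (findfirst-style loop).
function demo_sertag_scan(@nospecialize(v))
    for i in eachindex(DEMO_TAGS)
        DEMO_TAGS[i] === v && return Int32(i)
    end
    return Int32(-1)
end

# Open-addressed table keyed on objectid, with linear probing.
struct DemoTagTable
    keys::Vector{Any}    # slot keys; unused slots stay unassigned
    vals::Vector{Int32}  # unboxed tag values; 0 marks an empty slot
    mask::Int            # table length minus one (length is a power of two)
end

function DemoTagTable(tags::Vector{Any})
    n = nextpow(2, 2 * length(tags))  # load factor of at most 1/2
    ks = Vector{Any}(undef, n)
    vs = zeros(Int32, n)
    mask = n - 1
    for (i, t) in enumerate(tags)
        slot = Int(objectid(t) & UInt(mask)) + 1
        while vs[slot] != 0           # probe until an empty slot is found
            slot = (slot & mask) + 1  # wraps from n back to 1
        end
        ks[slot] = t
        vs[slot] = Int32(i)
    end
    return DemoTagTable(ks, vs, mask)
end

function demo_sertag_table(table::DemoTagTable, @nospecialize(v))
    slot = Int(objectid(v) & UInt(table.mask)) + 1
    while table.vals[slot] != 0
        table.keys[slot] === v && return table.vals[slot]
        slot = (slot & table.mask) + 1
    end
    return Int32(-1)                  # hit an empty slot: not a tagged value
end

# Built at runtime here, so objectid values belong to the current process.
demo_table = DemoTagTable(DEMO_TAGS)
```

Both functions return the same tag for every input; the table version does a hash and a short probe instead of walking the whole list, which matters most for values that are not in the list at all (the scan has to visit every entry before giving up).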

julia> using Serialization, BenchmarkTools

julia> syms  = [Symbol("sym_$i") for i in 1:5_000];

julia> types = repeat(Type[Int, Float64, String, Symbol, Bool, Module, Tuple, Array], 500);

# PR
julia> @benchmark serialize(buf, $syms) setup = (buf = IOBuffer())
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  232.000 μs …  1.445 ms  ┊ GC (min … max): 0.00% … 79.35%
 Time  (median):     244.500 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   254.755 μs ± 69.008 μs  ┊ GC (mean ± σ):  3.03% ±  7.87%

julia> @benchmark serialize(buf, $types) setup = (buf = IOBuffer())
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  227.916 μs …  2.377 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     239.042 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   243.924 μs ± 42.565 μs  ┊ GC (mean ± σ):  0.58% ± 3.23%

vs master

julia> @benchmark serialize(buf, $syms) setup = (buf = IOBuffer())
BenchmarkTools.Trial: 9629 samples with 1 evaluation per sample.
 Range (min … max):  475.250 μs … 70.392 ms  ┊ GC (min … max): 0.00% … 99.26%
 Time  (median):     494.042 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   517.532 μs ±  1.010 ms  ┊ GC (mean ± σ):  4.23% ±  5.37%

julia> @benchmark serialize(buf, $types) setup = (buf = IOBuffer())
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  265.084 μs …  5.768 ms  ┊ GC (min … max): 0.00% … 94.76%
 Time  (median):     275.791 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   279.091 μs ± 61.669 μs  ┊ GC (mean ± σ):  0.72% ±  3.53%

draft written by Claude, then substantially fixed and rewritten by me.

@adienes adienes added performance Must go faster stdlib Julia's standard library labels May 6, 2026
@JeffBezanson
Member

This should use OncePerProcess to generate the table when it's needed. Hopefully there will be no performance impact.

Member

@vtjnash vtjnash left a comment

Why is this so big it has to be recreated at runtime, instead of just storing as static data?

@adienes
Member Author

adienes commented May 6, 2026

I didn't want to be relying on objectid stability

@vtjnash
Member

vtjnash commented May 6, 2026

We guarantee it is stable

@adienes
Member Author

adienes commented May 6, 2026

only within a process, right? if we make the table static data my understanding is it will be computed at precompilation time. but then runtime calls to sertag may see a different objectid value and thus access the wrong index of the lookup table.

latest commit uses OncePerProcess; luckily there seems to be no performance difference on any relevant workloads (the only regression I could find was on artificial benchmarks of calling sertag itself)
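For readers unfamiliar with the pattern, the sketch below shows one way a table keyed on `objectid` can be built lazily with `Base.OncePerProcess` (available in Julia 1.12 and later). It is an illustration under assumptions, not the code in this PR: `demo_build_count`, `demo_table`, and `demo_lookup` are hypothetical names, the tag list is a stand-in, and a plain `Dict{UInt,Int32}` replaces the real table (it ignores the possibility of two distinct objects sharing an `objectid`).

```julia
# Counts how often the initializer runs; only here to make the laziness visible.
const demo_build_count = Ref(0)

# The initializer runs on first use in each process, never at precompile time,
# so the objectid values it records are the ones valid in the running process.
const demo_table = Base.OncePerProcess{Dict{UInt,Int32}}() do
    demo_build_count[] += 1
    tags = Any[Symbol, Int, Float64, String]  # hypothetical stand-in tag list
    return Dict{UInt,Int32}(objectid(t) => Int32(i) for (i, t) in enumerate(tags))
end

# Calling demo_table() returns the cached table, building it if necessary.
demo_lookup(@nospecialize(v)) = get(demo_table(), objectid(v), Int32(-1))
```

Repeated calls to `demo_lookup` reuse the same table, so `demo_build_count[]` stays at 1 after the first lookup. The cost is a small check on each `demo_table()` call, which fits the note above that the difference only showed up in artificial benchmarks of sertag itself.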

@adienes
Member Author

adienes commented May 6, 2026

test failure appears to be #61737

@jakobnissen
Member

This seems like a re-implementation of an IdDict. Any reason to not use that?

@adienes
Member Author

adienes commented May 7, 2026

there is a big performance difference, I think mainly because IdDict stores everything in Memory{Any} but here we're specializing on the values with Memory{Int32}. for IdDict all gets and sets box the value; you can see it in @allocated

julia> d = IdDict{Any,Int32}();

julia> @allocated d[:foo] = Int32(12345)
16

julia> @allocated d[:foo]
16

I could imagine Base may want some kind of StaticDict so that other users of AOT lookup tables can make use of them (and e.g. getindex can be :foldable) but that's a separate project

Labels

performance Must go faster stdlib Julia's standard library

4 participants