Understanding which methods get cached in *.ji files (please help) #31457

Closed
timholy opened this issue Mar 23, 2019 · 1 comment

timholy commented Mar 23, 2019

I'm trying to reduce the load-time lag of the new Revise, which loads a lot more code now that it depends on JuliaInterpreter. A long-standing problem is that, despite precompile statements, certain methods still need to be re-inferred when the package is loaded. I'm trying to understand the rules governing which methods get selected for caching in the *.ji file, and whether they can be modified to cache more comprehensively.

I am a little skeptical that I'm inserting debug statements in the correct places (please help @vtjnash!), but here goes:

diff --git a/src/dump.c b/src/dump.c
index 84a84032f4..5b978b80d7 100644
--- a/src/dump.c
+++ b/src/dump.c
@@ -390,6 +390,8 @@ static void jl_serialize_datatype(jl_serializer_state *s, jl_datatype_t *dt) JL_
 
 static void jl_serialize_module(jl_serializer_state *s, jl_module_t *m)
 {
+    jl_printf(JL_STDOUT, "Serializing module ");
+    jl_(m);
     write_uint8(s->s, TAG_MODULE);
     jl_serialize_value(s, m->name);
     size_t i;
@@ -2749,14 +2751,31 @@ JL_DLLEXPORT int jl_save_incremental(const char *fname, jl_array_t *worklist)
     jl_array_t *edges = jl_alloc_vec_any(0);
 
     size_t i;
+    jl_printf(JL_STDOUT, "Worklist:\n");
+    for (i = 0; i < jl_array_len(worklist); i++)
+        jl_(jl_array_ptr_ref(worklist, i));
     size_t len = jl_array_len(mod_array);
     for (i = 0; i < len; i++) {
         jl_module_t *m = (jl_module_t*)jl_array_ptr_ref(mod_array, i);
         assert(jl_is_module(m));
         jl_collect_lambdas_from_mod(lambdas, m);
     }
+    jl_printf(JL_STDOUT, "Collected lambdas:\n");
+    for (i = 0; i < jl_array_len(lambdas); i++) {
+        jl_value_t *l = jl_array_ptr_ref(lambdas, i);
+        jl_(l);
+        if ((jl_datatype_t*)jl_typeof(l) == jl_method_type)
+            jl_(((jl_method_t*)l)->sig);
+    }
 
     jl_collect_backedges(edges);
+    jl_printf(JL_STDOUT, "Collected edges:\n");
+    for (i = 0; i < jl_array_len(edges); i++) {
+        jl_value_t *l = jl_array_ptr_ref(edges, i);
+        jl_(l);
+        if ((jl_datatype_t*)jl_typeof(l) == jl_method_type)
+            jl_(((jl_method_t*)l)->sig);
+    }
 
     jl_serializer_state s = {
         &f, MODE_MODULE,
diff --git a/src/gf.c b/src/gf.c
index f225ad7bb4..1ab79fcfd1 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -1966,6 +1966,8 @@ JL_DLLEXPORT int jl_compile_hint(jl_tupletype_t *types)
     if (li == NULL)
         return 0;
     if (jl_generating_output()) {
+        jl_printf(JL_STDOUT, "compile hint: ");
+        jl_(li);
         jl_code_info_t *src = NULL;
         // If we are saving ji files (e.g. package pre-compilation or intermediate sysimg build steps),

Test case (mimicking certain features of Revise, like the use of setindex! methods from OrderedCollections):

module Dummy

using OrderedCollections

export f

struct RelocatableExpr
    ex::Expr
end
const ExprsSigs = OrderedDict{RelocatableExpr,Union{Nothing,Vector{Any}}}

f(x::RelocatableExpr) = 1
# Base.setindex!(rex::RelocatableExpr, val, idx) = error("oops")

if ccall(:jl_generating_output, Cint, ()) == 1
    precompile(Tuple{typeof(f), RelocatableExpr})
    precompile(Tuple{typeof(setindex!), ExprsSigs, Nothing, RelocatableExpr})
end

end
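
As a sanity check (a sketch of my own, not part of the original test case), `precompile` returns a Bool indicating whether it managed to find and infer the target, so the guarded block inside Dummy could log the outcome of each directive while Base.compilecache runs. The names f, RelocatableExpr, and ExprsSigs below refer to the definitions in the module above:

if ccall(:jl_generating_output, Cint, ()) == 1
    # Same two signatures as in the module above; precompile returns false
    # if the directive could not be resolved or inferred.
    for sig in (Tuple{typeof(f), RelocatableExpr},
                Tuple{typeof(setindex!), ExprsSigs, Nothing, RelocatableExpr})
        ok = precompile(sig)
        @info "precompile directive" sig ok
    end
end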

Output:

$ julia-master --startup-file=no -e 'push!(LOAD_PATH, "/tmp/pkgs"); Base.compilecache(Base.PkgId("Dummy"))'
compile hint: f(Dummy.RelocatableExpr)
compile hint: setindex!(OrderedCollections.OrderedDict{Dummy.RelocatableExpr, Union{Nothing, Array{Any, 1}}}, Nothing, Dummy.RelocatableExpr)
Worklist:
Dummy
Collected lambdas:
Dummy.Type(...)
Tuple{Type{Dummy.RelocatableExpr}, Expr}
nothing
Dummy.Type(...)
Tuple{Type{Dummy.RelocatableExpr}, Any}
nothing
Dummy.Type(...)
Tuple{Type{Dummy.RelocatableExpr}, Expr}
nothing
Dummy.Type(...)
Tuple{Type{Dummy.RelocatableExpr}, Any}
nothing
Collected edges:
Serializing module Dummy
Serializing module OrderedCollections

Now let's see what actually gets re-inferred after loading this package. This uses #31444 and this branch of SnoopCompile:

julia> push!(LOAD_PATH, "/tmp/pkgs")
4-element Array{String,1}:
 "@"        
 "@v#.#"    
 "@stdlib"  
 "/tmp/pkgs"

julia> using Dummy, SnoopCompile

julia> rex = Dummy.RelocatableExpr(:(export Foo))
Dummy.RelocatableExpr(:(export Foo))

julia> dict = Dummy.ExprsSigs();

julia> tinfo = @snoopi begin
           Dummy.f(rex)
           dict[rex] = nothing
       end
1-element Array{Tuple{Float64,Core.MethodInstance},1}:
 (0.045538902282714844, MethodInstance for setindex!(::OrderedCollections.OrderedDict{Dummy.RelocatableExpr,Union{Nothing, Array{Any,1}}}, ::Nothing, ::Dummy.RelocatableExpr))

So clearly my debug statements are failing to pick up on the fact that f is getting compiled. (Where am I missing a step?) But note that setindex! is getting re-inferred despite the precompile statement. Interestingly, if I uncomment that bogus setindex! method in Dummy, then setindex! appears in the collected lambdas, but the particular MethodInstance still needs to be re-inferred.

Collectively, inference of various setindex! methods alone contributes nearly 0.2s to Revise's startup lag. That triples the startup time of Julia itself.
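
For scale, here is a sketch (not from the original measurements) of how one might attribute Revise's load-time inference to setindex! specializations, assuming the @snoopi branch linked above can wrap a `using` statement:

using SnoopCompile  # the @snoopi branch referenced earlier

# Snoop the whole package load; tinf is a Vector{Tuple{Float64,Core.MethodInstance}}
# like the one shown above.
tinf = @snoopi begin
    using Revise
end

# Sum the inference time of every MethodInstance whose printed form mentions setindex!.
setindex_times = [t for (t, mi) in tinf if occursin("setindex!", string(mi))]
total = isempty(setindex_times) ? 0.0 : sum(setindex_times)
println("setindex! inference: ", total, " s across ", length(setindex_times), " MethodInstances")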

timholy commented Mar 25, 2019

Superseded by #31466

timholy closed this as completed Mar 25, 2019