It appears that something around backedge tracking broke somewhat recently (I'm assuming with pkgimages, but haven't bisected yet). In particular, changing Core.Compiler no longer gets picked up by Cthulhu. To reproduce, apply a patch to base/compiler, e.g.
diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index b3e18a6ee7..fc8e2d1aee 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -547,6 +547,8 @@ function abstract_call_method(interp::AbstractInterpreter, method::Method, @nosp
return MethodCallResult(Any, false, false, nothing, Effects())
end
+ @show sig
+
# Limit argument type tuple growth of functions:
# look through the parents list to see if there's a call to the same method
# and from the same method.
This will not show up in Cthulhu:
julia> using Revise
julia> Revise.track(Core.Compiler)
julia> using Cthulhu
julia> @descend optimize=false sin(1.0)
[snip]
sin(x::T) where T<:Union{Float32, Float64} @ Base.Math special/trig.jl:29
However, if I force-reevaluate do_typeinf!
, it does show up:
julia> Core.eval(Cthulhu, quote
function do_typeinf!(interp::AbstractInterpreter, mi::MethodInstance)
result = InferenceResult(mi)
# we may want to handle the case when `InferenceState(...)` returns `nothing`,
# which indicates code generation of a `@generated` has been failed,
# and show it in the UI in some way ?
# branch on https://github.com/JuliaLang/julia/pull/42082
frame = @static hasmethod(InferenceState, (InferenceResult,Symbol,AbstractInterpreter)) ?
InferenceState(result, #=cache=# :global, interp)::InferenceState :
InferenceState(result, #=cached=# true, interp)::InferenceState
CC.typeinf(interp, frame)
return nothing
end
end)
julia> @descend optimize=false sin(1.);
:sig = Tuple{typeof(Base.abs), Float64}
:sig = Tuple{Type{Float64}, Base.Irrational{:π}}
:sig = Tuple{typeof(Base.:(/)), Float64, Int64}
:sig = Tuple{typeof(Base.promote), Float64, Int64}
:sig = Tuple{typeof(Base._promote), Float64, Int64}
This suggests that there is some cached version of do_typeinf!
that fails to notice the revision of the Core.Compiler method. @timholy any ideas - is this known?
If I do info = info_cachefile("Cthulhu")
and then look at info.edges
, I don't see an entry for Cthulhu.do_typeinf!
- shouldn't there be one?
julia> using Revise
julia> Revise.track(Core.Compiler)
julia> using Cthulhu
julia> mi = Cthulhu.get_specialization(Tuple{typeof(gcd), Int, Int})
MethodInstance for gcd(::Int64, ::Int64)
julia> interp = Cthulhu.CthulhuInterpreter();
julia> Cthulhu.do_typeinf!(interp, mi)
julia> Core.Compiler.typeinf(interp, Core.Compiler.InferenceState(Core.Compiler.InferenceResult(mi), :global, interp))
:sig = Tuple{typeof(Base.:(==)), Int64, Int64}
:sig = Tuple{typeof(Base.Checked.checked_abs), Int64}
:sig = Tuple{typeof(Base.:(==)), Int64, Int64}
:sig = Tuple{typeof(Base.Checked.checked_abs), Int64}
:sig = Tuple{typeof(Base._gcd), Int64, Int64}
:sig = Tuple{typeof(Base.signbit), Int64}
:sig = Tuple{typeof(Base.__throw_gcd_overflow), Int64, Int64}
true
Aha, I think I have the smoking gun. We collect the edges
in
jl_prepare_serialization_data(mod_array, newly_inferred, jl_worklist_key(worklist), &extext_methods, &new_specializations, &method_roots_list, &ext_targets, &edges);
but then only call
*_native_data = jl_precompile_worklist(worklist, extext_methods, new_specializations);
afterwards, so if we do extra inference there, those edges are missed.
Extra inference seems forbidden there, since we should have already prepped the method list for serialization and decided the total content to include. What changed?
By what mechanism is it supposed to be forbidden?
#0 jl_type_infer ([email protected]=0x7f96486a8380, world=33942, [email protected]=0) at /home/keno/julia/src/gf.c:278
#1 0x00007f9651e5305c in jl_ci_cache_lookup (cgparams=..., src_out=0x7ffd4ab60c08, ci_out=<synthetic pointer>,
world=<optimized out>, mi=0x7f96486a8380) at /home/keno/julia/src/aotcompile.cpp:250
#2 jl_create_native_impl (methods=0x7f964a99f4c0, llvmmod=<optimized out>,
cgparams=0x7f9651f07e60 <jl_default_cgparams>, _policy=0, _imaging_mode=<optimized out>,
_external_linkage=<optimized out>) at /home/keno/julia/src/aotcompile.cpp:335
#3 0x00007f96523d631c in jl_precompile_ ([email protected]=1, m=<optimized out>, m=<optimized out>)
at /home/keno/julia/src/precompile_utils.c:254
#4 0x00007f96523e3b55 in jl_precompile_worklist (extext_methods=<optimized out>, extext_methods=<optimized out>,
new_specializations=<optimized out>, new_specializations=<optimized out>, worklist=0x7f9648c268f0)
at /home/keno/julia/src/precompile_utils.c:303
#5 ijl_create_system_image (_native_data=<optimized out>, worklist=<optimized out>, emit_split=<optimized out>,
s=<optimized out>, z=<optimized out>, udeps=<optimized out>, srctextpos=<optimized out>)
at /home/keno/julia/src/staticdata.c:2624
It is not supposed to be calling anything Julia related (finalizers are also disabled here, and threads are supposed to be stopped), since it will corrupt the state, though nothing checks it to detect that sort of error. Sounds like someone broke it recently then?
This diff might work?
diff --git a/src/precompile_utils.c b/src/precompile_utils.c
index f251d00f76..6154a9ced7 100644
--- a/src/precompile_utils.c
+++ b/src/precompile_utils.c
@@ -188,7 +188,7 @@ static int precompile_enq_specialization_(jl_method_instance_t *mi, void *closur
(jl_ir_inlining_cost((jl_array_t*)inferred) == UINT16_MAX)) {
do_compile = 1;
}
- else if (jl_atomic_load_relaxed(&codeinst->invoke) != NULL || jl_atomic_load_relaxed(&codeinst->precompile)) {
+ else if (jl_atomic_load_relaxed(&codeinst->precompile)) {
do_compile = 1;
}
}
do_compile
seems likely to corrupt the precompile state prepared by jl_prepare_serialization_data
, if it gets called at all
I've PR'd that diff as #48054 since if fixes my immediate issue. If there's a large fix to be had, you might need to discuss with @vchuravy .
xref analysis in https://github.com/JuliaLang/julia/pull/48054#issuecomment-1372682196