-
Notifications
You must be signed in to change notification settings - Fork 55
Allow for deferred codegen in !toplevel #556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Could this get merged? |
@vchuravy gentle bump |
ce06fc4
to
0e00885
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #556 +/- ##
==========================================
- Coverage 82.86% 73.45% -9.41%
==========================================
Files 24 24
Lines 3361 3330 -31
==========================================
- Hits 2785 2446 -339
- Misses 576 884 +308 ☔ View full report in Codecov by Sentry. |
I tried reproducing the Enzyme MWE with @maleadt do you recall why you limited this to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reasoning is that deferred functions called from the entrypoint (which is what the compiler driver is architected around) can be discovered recursively without having to expand them during deferred compilation. I guess that this doesn't hold with Enzyme because the deferred compilation jobs aren't called by the main entrypoint? To support this, the compiler driver will need to be changed. For example, if you simply let them expand from within a deferred context as proposed here, the jobs
variable at top level won't list all compilation jobs, and the post-processing that happens at top level won't be executing on all functions.
Hm... Doesn't the inner call resolve its own deferred codegen first and then both of them get linked into the outer one? |
Yes, but that doesn't correctly populate the Lines 423 to 435 in 288b613
We could work around this by passing jobs to the inner deferred codegen generators, but that's a hack. Essentially the current system relies on a compilation job having a single point of entry, even into deferred jobs.
|
Hm I am a bit confused how works. I just pushed a commit that returns the job variable from the |
We probably don't really rely on the I'm not particularly happy with threading yet more state through, but if you need this... The compiler driver should be reworked to support compiling multiple translation units without relying on a single entry-point. |
Yeah I can try to improve this while working on #582 |
CUDA.jl CI failure looks related. |
Ah I see for https://github.com/JuliaGPU/CUDA.jl/blob/c2d444b0f5a76f92c5ba6bc1534a53319218b563/test/core/execution.jl#L974-L975 We have an infinite loop since we don't pass through the list of already codegen'd functions. So we recurse into the original top-level and carry on from there. |
Fixes EnzymeAD/Enzyme.jl#1173
Need to come up with a standalone test-case.