Scheduler: Use a "scheduler" task for thread sleep #57544

Merged
kpamnany merged 6 commits into master from kp-sched-task-alt2 on Mar 27, 2025

@kpamnany kpamnany commented Feb 26, 2025

A Julia thread runs Julia's scheduler in the context of the switching task. If no task is found to switch to, the thread will sleep while holding onto the (possibly completed) task, preventing the task from being garbage collected. This recent Discourse post illustrates precisely this problem.

A solution to this would be for an idle Julia thread to switch to a "scheduler" task, thereby freeing the old task.

This PR uses OncePerThread to create a "scheduler" task (that does nothing but run wait() in a loop) and switches to that task when the thread finds itself idle.
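The idea in the paragraph above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the name `idle_sched_task` is hypothetical, and the sketch only shows the lazy per-thread creation of a task whose body loops on `wait()`. The actual switch to that task happens inside the scheduler when a thread goes idle and is not reproduced here. It assumes Julia 1.12 or later, where `Base.OncePerThread` is available.

```julia
# Requires Julia 1.12 or later for Base.OncePerThread.

# Hypothetical stand-in for the "scheduler" task: one lazily created Task
# per thread whose body does nothing but call wait() in a loop.
idle_sched_task = Base.OncePerThread{Task}() do
    Task() do
        while true
            wait()
        end
    end
end

# Calling the OncePerThread object returns the current thread's value,
# running the initializer only on the first call from that thread.
t = idle_sched_task()

println(t isa Task)              # the value is a Task
println(istaskstarted(t))        # it was created but never scheduled here
println(idle_sched_task() === t) # later calls on this thread reuse the same Task
```

Because the task is created once per thread and reused, an idle thread always has somewhere to park without keeping a reference to whichever user task ran last.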

Other approaches considered and discarded in favor of this one: #57465 and #57543.

kpamnany force-pushed the kp-sched-task-alt2 branch 2 times, most recently from b345e87 to 4c08ecb, February 28, 2025 00:16

kpamnany commented Mar 5, 2025

This is currently blocked on what seems to be a bug in OncePerThread serialization/deserialization; found with @gbaraldi.

@kpamnany kpamnany marked this pull request as ready for review March 5, 2025 15:14
gbaraldi added a commit that referenced this pull request Mar 20, 2025
…simage (#57656)

This is quite tricky to test, unfortunately, but #57544 caught this and this fixes that.

---------

Co-authored-by: Jameson Nash

kpamnany force-pushed the kp-sched-task-alt2 branch from f280243 to 684637a, March 20, 2025 18:56
KristofferC pushed a commit that referenced this pull request Mar 20, 2025
…simage (#57656)
(cherry picked from commit bf01638)

kpamnany commented

Unblocked... thanks @gbaraldi!

Now, some Channel tests and a Sockets test are failing. Looking into these failures.

@vtjnash vtjnash left a comment

I think those tests are specifically testing race conditions in the code, so you might need to adjust them slightly to account for the change in when, and how many times, process_events runs.

JamesWrigley commented

Is there any chance of getting this backported to 1.12? It's quite tricky to debug and would be nice if it was fixed in the next release.

kpamnany force-pushed the kp-sched-task-alt2 branch from 684637a to 3bcde1b, March 24, 2025 19:15
This small group of tests is written with assumptions about when
and how the libuv event loop is run. As this PR changes this
behavior, the tests needed adjusting.

kpamnany commented

The channels tests are fixed, but I don't see a way to fix the Sockets test that's failing. Any ideas @gbaraldi or @vtjnash?

Previously, this test depended on scheduler behavior, which is
slightly changed in this PR. Changed the test to connect to a
non-routable IP address so that it no longer depends on task
ordering.

kpamnany commented

Thanks @vtjnash for the idea on how to fix the Sockets test.

kpamnany commented Mar 26, 2025

The channels tests that are failing on FreeBSD are a bit mystifying. How come Workqueue is empty on Linux but not on FreeBSD? Do we have an extra sticky task?

@kpamnany kpamnany merged commit 0d4d6d9 into master Mar 27, 2025
5 of 7 checks passed
@kpamnany kpamnany deleted the kp-sched-task-alt2 branch March 27, 2025 15:10