RFC: Reactor - IO threads#7230
Conversation
b59ade2 to
7782ef9
Compare
|
Thanks for working on this. I agree that using threads like this is likely a better solution that using "fadvice". I have a handful of high-level comments. Don't take these comments too seriously. In no particular order:
Thanks again, |
09624a6 to
dda9818
Compare
|
Yeah, palette2 is my hallucination. I've seen several "extensions" with UART, and it is just blended in my head. Anyway:
Where virtual_sdcard now has:
Fancy thread naming can be removed; it is mostly for monitoring what is going on. But I think it is suboptimal for everything except virtual_sdcard. Thanks, Hmmm, That way, there should be no IO or CPU spikes; it will be evenly spread out. Where the only limitation that I can think of is to correctly handle the Hmmm, I think that if I'm offloading the computation and writing to the thread, the problem is, that now the command is blocking. |
cabe873 to
9544f4c
Compare
|
I've underestimated the amount of work that is necessary to rework the accelerometer data write-out, so it has been removed from here.
Cannot reproduce, it seems that even in that case, reactor are able to execute other timers. |
3ecdf6f to
fcddb4c
Compare
|
Thank you for your contribution to Klipper. Unfortunately, a reviewer has not assigned themselves to this GitHub Pull Request. All Pull Requests are reviewed before merging, and a reviewer will need to volunteer. Further information is available at: https://www.klipper3d.org/CONTRIBUTING.html There are some steps that you can take now:
Unfortunately, if a reviewer does not assign themselves to this GitHub Pull Request then it will be automatically closed. If this happens, then it is a good idea to move further discussion to the Klipper Discourse server. Reviewers can reach out on that forum to let you know if they are interested and when they are available. Best regards, PS: I'm just an automated script, not a human being. |
db34d30 to
2444f37
Compare
62eb78b to
4fef8ef
Compare
4fef8ef to
4ed9f35
Compare
4ed9f35 to
03a0bda
Compare
03a0bda to
e67e341
Compare
In a batch mode, GCodeIO can exit before virtual_sdcard finishes execution of G-code file Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Shutdown hook tries to reread the previous and next bytes That will complicate the async file access loop Simply avoid that by always working over memory representation Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Allow to wrap the file wrapper with the AIO wrapper Which will pause greenlet upon blocking calls like read/write Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
e67e341 to
0dc7f2b
Compare
dewi-ny-je
left a comment
There was a problem hiding this comment.
For what it's worth, an automatic LLM review is better than nothing, hoping it will make the PR get further.
Significant concerns
-
gcode.pynow reaches into an extras module (the biggest red flag).# Workaround virtual_sdcard obj = self.printer.lookup_object("virtual_sdcard", None) wait_virtual_input = obj.get_status(eventtime)["is_active"] if obj else False
Core
gcode.pyshouldn't know aboutvirtual_sdcard, and the author's own "Workaround" comment admits it. This is a layering inversion that will block merge. The real issue is that async/buffered reads make the existing synchronous fileinput-EOF detection race; it deserves a cleaner abstraction (e.g. virtual_sdcard registering that input is pending) rather than a hardcoded lookup in the core. -
Shared
FileAIO+Queue(1)+raise Fullis fragile under concurrent access.Executor.submitraisesFullif a call is already in flight ("single execution flow assumed").current_fileis a single shared object; the PR was careful to make_handle_analyze_shutdownread from the in-memory cache instead of touching the file precisely to avoid this — but it relies on no other code path ever calling acurrent_filemethod while the work-timer greenlet has a read outstanding. That invariant isn't enforced or documented, and any future caller of e.g.seek/tellfrom another greenlet would tripFull(or interleave). Worth at least asserting/guarding. -
temperature_hostmixes threads on one handle. Reads run on a borrowed worker thread (_get_sampledoesfile_handle.seek/read), buthandle_disconnectcallsself.file_handle.close()synchronously on the main greenlet. If a sample is in flight at disconnect, that's a close/read race on the same handle. Closing through the executor (assave_variablesdoes its writes) would be more consistent. -
Pool grows but never shrinks, and
join()vs. an in-flight item. Each new concurrent consumer spawns a permanent thread (no upper bound, no reclamation until disconnect). Alsojoin()doesput_nowait(sentinel)into aQueue(1); it's only safe because the workerget()s before processing (leaving the queue empty), but it's timing-dependent and undocumented. A blockingputfor the sentinel, or a dedicated stop flag checked without the queue, would be more robust.
Minor / correctness nits
FilePager.read()changed semantics: it no longer returns a fixed 8 KiB but "from current offset to end of the current page," so the first read after a non-alignedseekreturns a short chunk. The work loop tolerates variable-length reads, so this is fine functionally — but it's a behavioral change worth a comment.- Cache miss in the shutdown dump:
_handle_analyze_shutdowndoesself.current_file.pages.get(page_num, ""). WithCACHE_SIZE=3the relevant page is normally resident, but if it was recycled the diagnostic dump is silently empty/truncated. Acceptable for a best-effort dump, but a regression vs. the old guaranteedseek+read. FilePager.__getattr__returns the attribute directly (no executor proxy). It's only correct because the wrappedfile_objectis itself aFileAIOthat re-proxies — a two-layer indirection that's easy to break later. A one-line comment would help.- No timeout on
submit(completion.wait()defaults to_NEVER). If the underlying I/O genuinely hangs (e.g. dead NFS), the greenlet parks forever with no cancellation. That's arguably the intended trade-off, but it's an unbounded wait worth acknowledging. - No docs/config reference or tests. New
[aio_executor]object is auto-loaded, so no user config needed, but there's no documentation and no test coverage of the pager/executor. - Copyright year
2026in the new file header — just flag for consistency.
Recommendation
Good direction and a genuinely useful capability; the executor primitive is well-built. But it's not mergeable as-is, and the title's RFC status is appropriate. The two things I'd push back on hardest:
- Remove the
virtual_sdcardknowledge fromgcode.py— find a generic hook so the core stays decoupled. - Pin down the concurrency contract on the shared
FileAIO/Queue(1)(one in-flight op per file, enforced) and maketemperature_host's close go through the same thread.
|
@dewi-ny-je, I appreciate your intention, but the methods are questionable. Generally, I think: If one cannot review and grasp the changes (for example, I'm too goofy to review some PRs or I do not use that feature, for example). But if that one person is interested in those changes/want to help. One can test those changes. That being said, it is generally hard to make a decision, whether changes are worth it or not. The above actionables can help in general. So, unless LLM helps you personally understand the changes, makes reviewing or testing easier for you. Also, I do generally like what is written here: https://mikemcquaid.com/open-source-maintainers-owe-you-nothing/ All the above is my personal opinion. |
|
@nefelim4ag My motivation is obvious: I have seen tens of what looked (to me) like a really good idea to me dropped just because no reviewer picked up the PR. And there are over 200 PRs waiting for a reviewer. After all, I'm not blindly enthusiastic about LLMs: my attempt to help here is the result of an impressive success rate (feature addition, cleanup, bug fixes), which led me to think that the other way around could be just as good. Nevertheless I did it only for the few PRs which I found potentially useful for my use case to avoid spamming too much, I put a disclaimer and (unless I clicked wrong) I did not "request changes" but I only submitted comments. If you and/or @KevinOConnor prefer not to have any automatic review, I will comply and I won't take it personally because I totally recognise that if this happens often, the time wasted could be more than the benefits. |
There are several places where the reactor's greenlet can be used to do blocking file IO.
Unfortunately, it is not possible to do file IO in a non-blocking fashion.
In some places, it is handled with a daemon (adxl345.py), which is cumbersome.
In others, ignored (virtual_sdcard).
Somewhere, it is threads (palette2.py)This is my attempt to solve the possible IO blocks inside the virtual_sdcard code.
Because it is a somewhat generic problem, I've tried to make a generic solution to do so across the code.
Hence, the weird way of wrapping the FileIO wrapper. So, it can wrap whatever wrapper supports the blocking read/write.
I did a base test, where I triggered the ENOSPC on write, and checked the virtual_sdcard reading with a large file with dummy commands.
Thanks,
-Timofey
I have to figure out how to read during the analyze shutdown event =\ where I should not pause.Hmmm, the test exits before the first read happens, because do_resume expects that the next timer is blocking.