-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Extend standalone support #18285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Extend standalone support #18285
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
I left a few nits. Perhaps we can maybe split this up.
Can we add some some new tests based on these? |
I have never developed in C before, I don't really know where to start, can you give me some hints where you want these tests added and what they should test? |
system/lib/standalone/standalone.c
Outdated
@@ -155,24 +462,32 @@ double emscripten_get_now(void) { | |||
return (1000 * clock()) / (double)CLOCKS_PER_SEC; | |||
} | |||
|
|||
__attribute__((__weak__)) | |||
void _emscripten_throw_longjmp() { | |||
REPORT_UNSUPPORTED(do an invalid longjmp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just call longjmp
would be more accurate I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I could only trigger this by doing an invalid jump, but I think you would know better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No.. all longjmp's require callout out of JS IIUC. WebAssembly, as it stands, has no support for longjmp so we call out the JS VM to throw
for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Not that it matters but do you have any C to demonstrate this? Just curious.
In my tests I was only able to trigger this by doing an invalid jmp. I think the jumps were being handled by the invoke_* methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any call to longjmp
from C code will called _emscripten_throw_longjmp
IIUC.
So here I would just do REPORT_UNSUPPORTED(longjmp)
system/include/wasi/api.h
Outdated
* - https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_14 | ||
* - https://github.com/WebAssembly/wasi-libc/blob/wasi-sdk-16/libc-bottom-half/sources/preopens.c#L215 | ||
*/ | ||
#define __WASI_FD_ROOT 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the way it works is that there can be any number of pre-opens, starting at 3: https://github.com/WebAssembly/wasi-libc/blob/30094b6ed05f19cee102115215863d185f2db4f0/libc-bottom-half/sources/preopens.c#L212-L215. And non of these pre-opens are necessarily the root, they can be mounted in various places.
Also, can you move this macros (if you keep it), into standalone.c. api.h is copied from wasi-libc, and I think its better to avoid have local modifications to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we really needs some tests for this.
The most basic test for the standalone syscalls is in test_standalone_syscalls
in test_other.py
.
For other tests look for @also_with_standalone_wasm
in test_core.py
on behalf of wasm people I know I would like to thank @jerbob92 and @sbc100 for progressing this, as it not only unlocks things like pdfium, but also generic wasm utilities that use emscripten (like wabt). Ex being able to run wat2wasm using any wasi runtime instead of whatever was released and packaged. If any one not currently involved has skills to contribute, please do as this is very likely well into "hobby time" for @jerbob92 and it is at the core a pretty substantial infrastructure change and a lot of work to test it. other wasm people will thank you, but I'll thank in advance! |
If you just wast a wasi version of the wabt tools I think the simplest path would be to use wasi-sdk to build it. Of course that doesn't mean we shouldn't land this change too. |
@sbc100 ps your last comment didn't format right. might want to polish it |
good idea, on the wabt thing and indeed a separate topic if they are open to decoupling from emscripten. |
IIUC wabt is not coupled to emscripten, we just happen to build it with emscripten for the web demo. |
(I guess maybe what you are saying is that it would be nice if the web version of wabt, which is run by emscripten, happened to also be WASI compliant.. in that case I agree that would be very cool). |
yeah sorry it was that we were looking for a wabt compiled to wasm and found the one that was compiled with emscripten for the web. I raised this issue for out of browser wasi binary WebAssembly/wabt#2101 |
Thanks @codefromthecrypt! So in hindsight, I don't think my C skills are good enough to get this PR into a merge-able state, implementing AT_FDCWD, using the preload dirs and not the FD 3, implementing tests. |
@jerbob92 no worries, I think you got things very far. I'll keep recruiting to whatever end on this. |
55abfc0
to
9fbd75d
Compare
@sbc100 I have done some more progress on this PR now:
I do have some questions though:
|
I don't feel strongly about this. Perhaps use the same split that wasi-libc uses? Also I don't think we need
The ideas is that we will be running the full wasi testsuite. I started adding some of it in #12704 and I have some
I don't think its a problem, but please document the origin and try to document any changes from upstream.
Sure. We could even consider adding wasi-libc as a submodule and including certain files directly? |
Sounds good!
Nice, that will make it a lot easier!
If that's a possibility that would be great, it would make it a lot easier because right now I'm just cherry-picking wasi-libc code to get the syscalls/WASI calls working that I require, especially if we want to pass the whole wasi-testsuite then we have to copy a lot from was-libc. Only thing I noticed while copying code is that their WASI function signature is a bit different sometimes, which prevents us from directly using the C code (AFAIK), so we might want to look into making that possible. For example: So Emscripten requires the path length and wasi-libc doesn't. But perhaps we can work around this by completely using wasi-libc for the wasi part. |
return 0; | ||
} | ||
|
||
int rmdirat(int dirfd, intptr_t path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this static
and make it clear this is an internal helper?
A stupid nit - if it was based on wasi libc, shouldn't there be copyright attribution somewhere? (Or is that not necessary?) |
No stupid at all, we should certainly consider how to handle this. I'm not sure why the right answer is. Or how we should deal with keeping the codebases in sync. One easy option wold be to add wasi-libc as a submodule instead of duplicating the code here. |
Correct. Go has proper cross-compilation support and automatically takes care of this.
Oh yeah we could. But from what I understand, the
|
IMHO it's way better to add support for WASI in Emcripten than to add Emscripten ABI/API support for every runtime out there. |
Perhaps, but I think that argument only really makes sense if the resulting binary is WASI-compaible. In this case it sounds like the resulting binary will contain a bunch of emscripten-specific stuff anyway, so won't be WASI-compliant moulde, right? (or am I misunderstanding?) |
Correct, there are still a few things that are needed to run Emscripten binaries:
|
Sure, I'd be OK with moving forward with this. I think it still needs a bit of work though. |
Would it be possible to create a TODO list for the work that still need to be done to get this PR merged? I'd be happy to help, but not quite sure where to start. Thanks! |
I want to say I am using this branch in building ghostscript (both the
ancient GPL version and the current AfferoGPL version) and Xpdf comand line
tools (latest version) and it all works as expected in wazero (go WASI
runtime).
…On Sat 15. 3. 2025 at 20:48, Bastian Müller ***@***.***> wrote:
Would it be possible to create a TODO list for the work that still need to
be done to get this PR merged? I'd be happy to help, but not quite sure
where to start. Thanks!
—
Reply to this email directly, view it on GitHub
<#18285 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAZT4NUVKP4NLMZPWFB5ST2UR7XFAVCNFSM6AAAAAASQXFBISVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDOMRWHE3DIMBXGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
[image: turbolent]*turbolent* left a comment
(emscripten-core/emscripten#18285)
<#18285 (comment)>
Would it be possible to create a TODO list for the work that still need to
be done to get this PR merged? I'd be happy to help, but not quite sure
where to start. Thanks!
—
Reply to this email directly, view it on GitHub
<#18285 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAZT4NUVKP4NLMZPWFB5ST2UR7XFAVCNFSM6AAAAAASQXFBISVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDOMRWHE3DIMBXGU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Concretely I think what is needed here is
Anything else @sbc100 ? |
case O_RDONLY: | ||
case O_RDWR: | ||
case O_WRONLY: | ||
if ((flags & O_RDONLY) != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your work on this so far!
Musl defines O_RDONLY
as 0
, so this line will not work I think. This will only work in implementations where the file access modes are single bits, one for each file access mode (like wasi-libc, where this is the case).
The robust, POSIX compliant way to deal with file access modes is to:
- first mask the file access modes part off with
flags & O_ACCMODE
- then compare the resulting value directly to the five defined access modes (
O_RDONLY
,O_WRONLY
,O_RDWR
,O_SEARCH
,O_EXEC
)
With this, it should be possible to open a file as O_RDONLY
.
Oh, and if you test with Wasmer: Right now, it doesn't support opening a file with O_RDONLY
(wasmerio/wasmer#4892). This took me quite a while to figure out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code comes from wasi-libc, which also uses musl, so then it would be broken there as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know exactly how much they use from musl, but they defined the file access mode bits themselves at least:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah they defined it both at the top and bottom half (with different values). I think we will known soon enough if this works or not if we add tests.
I'm mostly just copy pasting stuff from wasi-libc here, so I wouldn't know if it's correct, it works for me but I'm not sure if I hit this code path.
|
||
// If we can't find a preopen for it, fail as if we can't find the path. | ||
if (dirfd == -1) { | ||
errno = ENOENT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be return -ENOENT
. And similar elsewhere. In wasi-libc, this code is used in function that use the "return -1 + errno" error convention, while the __syscall_*
functions use the "return negative error" convention of the Linux syscall API.
@jerbob92 : Are you or is anyone else working on the test integration at the moment? Based on this PR I managed to get the I used this test because it is pretty simple. It just needed Should I open a new PR? |
@jiixyj Not that I know of! New MR is fine, you can also update this one if you have access. |
This is far from complete if you look at the WASI FS spec and the syscalls, but at least it gets basic operations like directory listing, file/directory opening and file/directory stat working. It was enough for me to get pdfium working in standalone mode.
I didn't really know what I should do with AT_FDCWD or what the runtime should do with it when passing AT_FDCWD. AT_FDCWD is also negative while I think
fd
in WASI is unsigned?Most of the syscall code was taken from or based on wasi-libc.