-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Implement open
/stat
/mkdir
/symlink
for standalone WASI mode
#24246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
You can run the tests locally by adding some WASM engines to ./test/runner -v test_hello_argc_standalone
# or
./test/runner -v test_fcntl_open_standalone |
Oh, and I have patched wasmer a bit, so behavior compared to a release version may differ: diff --git a/lib/wasix/src/syscalls/wasix/path_open2.rs b/lib/wasix/src/syscalls/wasix/path_open2.rs
index 6305b8bf59..c2297e05dc 100644
--- a/lib/wasix/src/syscalls/wasix/path_open2.rs
+++ b/lib/wasix/src/syscalls/wasix/path_open2.rs
@@ -148,7 +148,7 @@ pub(crate) fn path_open_internal(
//
// Maximum rights: should be the working dir rights
// Minimum rights: whatever rights are provided
- let adjusted_rights = /*fs_rights_base &*/ working_dir_rights_inheriting;
+ let adjusted_rights = fs_rights_base & working_dir_rights_inheriting;
let mut open_options = state.fs_new_open_options();
let target_rights = match maybe_inode {
@@ -276,7 +276,10 @@ pub(crate) fn path_open_internal(
}
}
Kind::Dir { .. } => {
- if fs_rights_base.contains(Rights::FD_WRITE) {
+ if fs_rights_base.contains(Rights::FD_WRITE)
+ || o_flags.contains(Oflags::TRUNC)
+ || o_flags.contains(Oflags::CREATE)
+ {
return Ok(Err(Errno::Isdir));
}
} |
The
Those are used in the preopen functions that run before main. Does anyone know how to fix this? |
fd_prestat_dir_name and fd_prestat_get are WASI syscalls which are implemented in libwasi.js.. seems strange that they would be reported as undefined symbols. |
test/common.py
Outdated
@@ -1562,6 +1584,9 @@ def run_js(self, filename, engine=None, args=None, | |||
else: | |||
self.fail('JS subprocess failed (%s): %s (expected=%s). Output:\n%s' % (error.cmd, error.returncode, assert_returncode, ret)) | |||
|
|||
if run_in_tmpdir: | |||
force_delete_contents(self.in_dir('fs')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general tests do not need to (indeed should not since helps with debugging) cleanup after themselves like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point! While there already is some existing cleanup logic it only ran after each "full" test, not in between each WASM engine run when iterating over them. This made the test fail because each test expects a blank directory.
What do you think about a subdir per each engine? So out/temp/wasmer
, out/temp/wasmtime
, etc.? Then the existing cleanup code should work just as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about a subdir per each engine? So out/temp/wasmer, out/temp/wasmtime, etc.? Then the existing cleanup code should work just as is.
I implemented this and it works very well!
test/common.py
Outdated
@@ -1505,7 +1522,11 @@ def cleanup(line): | |||
def run_js(self, filename, engine=None, args=None, | |||
assert_returncode=0, | |||
interleaved_output=True, | |||
input=None): | |||
input=None, | |||
run_in_tmpdir=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can avoid adding this extra logic to run_js
which is use in a lot of different places.
Could we instead allow just add cwd=None
here and allow higher level functions to pass CWD?
Is there some fundamental reason why the tests need to run in a subdirectory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we instead allow just add cwd=None here and allow higher level functions to pass CWD?
Good idea! I'll try to implement this.
Is there some fundamental reason why the tests need to run in a subdirectory?
The original reasons were to enable cleanup of this dir after each WASM engine and for the test not to interfere with the *.wasm
/*.js
/stdout
/... files in out/test
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I'll try to implement this.
Done!
test/common.py
Outdated
@@ -981,14 +997,14 @@ def get_nodejs(self): | |||
return None | |||
return config.NODE_JS_TEST | |||
|
|||
def require_node(self): | |||
def require_node(self, allow_wasm_engines=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems odd to mark a test as require_node
but also allow wasm engines. Maybe we can just remove require_node
from the tests in question instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wonder about this, too. The original reason I did it like this comes from this line in also_with_standalone_wasm
:
Line 639 in 1ae691e
nodejs = self.require_node() |
Eventually this ends up in require_engine()
where self.wasm_engines
is set to an empty list. So what happened is that none of the WASM engines were actually used for any of the also_with_standalone_wasm
tests!
Alternatively I could stash the self.wasm_engines
in a local variable, then call self.require_node()
, and then repopulate self.wasm_engines
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively I could stash the self.wasm_engines in a local variable, then call self.require_node(), and then repopulate self.wasm_engines?
I've implemented this, avoiding this allow_wasm_engines
parameter.
Ah, they were gated behind |
To fix some failing CI tests I had to do some additional things:
I also added attribution to wasi-libc to The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this!
I like the direction of this PR. I've not yet had a chance for fully grok the code in paths.c/h but the rest is looking good
system/lib/standalone/paths.c
Outdated
#include <string.h> | ||
#include <sysexits.h> | ||
|
||
/// A name and file descriptor pair. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much of the code in this file is actually derived from the wasi-libc code? If its actually derived from that codebase we should probably say so in the licence comment at the top.
Does wasi-libc use the triple-slash (///) for comments? If not I would probably stick to normal C comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the code is derived from (a slightly older version of) preopens.c
from wasi-libc: https://github.com/WebAssembly/wasi-libc/blob/main/libc-bottom-half/sources/preopens.c
In the meantime they added on-demand loading of the preopens (instead of doing it before main
) and the ability to reset the preopens. Both of these add complexity though since you cannot assume any longer that the preopens are immutable. So I thought as a first step the "simple" version is better.
I added license headers to paths.{h,c}
where I mention that this code is derived from wasi-libc.
I also converted to #pragma once
and converted comments and brace style to Emscripten's. Looking at the wasi-libc code they weren't consistent about it anyways...
test/common.py
Outdated
@@ -623,6 +623,9 @@ def metafunc(self, standalone): | |||
if not standalone: | |||
func(self) | |||
else: | |||
nonlocal exclude_engines | |||
if exclude_engines is None: | |||
exclude_engines = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you just put these two lines at the top of the function (before the def decorated
) and avoid the nonlocal thing.
You could also just write exclude_engines=[]
in the signature and add # noqa
to silence the warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done! exclude_engines
isn't modified so doing it in the signature sounds good.
test/common.py
Outdated
wasm_engines = [] | ||
else: | ||
wasm_engines = [engine for engine in self.wasm_engines | ||
if all(excluded not in os.path.basename(engine[0]) for excluded in exclude_engines)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find these double nested comprehension quite hard to understand. Can we expand into a loop or split in to two statements perhaps?
How about:
wasm_engines = []
if not impure:
for engine in self.wasm_engines:
basename = os.path.basename(engine[0])
if not any(pattern in basename for pattern in excluded_engines):
wasm_engines.append(engine)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this is much more readable!
test/common.py
Outdated
@@ -614,7 +614,7 @@ def can_do_standalone(self, impure=False): | |||
|
|||
# Impure means a test that cannot run in a wasm VM yet, as it is not 100% | |||
# standalone. We can still run them with the JS code though. | |||
def also_with_standalone_wasm(impure=False): | |||
def also_with_standalone_wasm(impure=False, exclude_engines=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you convert add some info about exclude_engines
to the above comment?
Something like: "exclude_engines" is a list of engine names on which the test cannot run. These are pattern that are matched against the basename of the engine executable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
else: | ||
nodejs = self.require_node() | ||
self.node_args += shared.node_bigint_flags(nodejs) | ||
# `self.require_node()` clears `self.wasm_engines`, so set them here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this, but we can refactor later I suppose.
for engine, subdir in engines: | ||
if subdir: | ||
subdir = self.in_dir(subdir) | ||
ensure_dir(subdir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it looks for wasm engines only your want to run the JS in a new directory with the name of the engine?
I guess that will work for some tests, but don't some tests need to be run in alongside the files in the test directory?
Why is this needed? What happens if you don't do the chdir here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I don't run inside a clean directory, the test for the second wasm engine would fail when trying to create test-file
here, as it already exists from the first wasm engine.
The current cleanup logic does not clean up the tests working directory after each engine, but only once in setUp
for the whole thing.
I guess that will work for some tests, but don't some tests need to be run in alongside the files in the test directory?
Yep, I have to admin that I don't have a good overview of the tests. But I thought that since the wasm_engines
didn't have access to the filesystem before this PR, I at least can't break anything with this approach. The downside is of course that adding @also_with_standalone_wasm
to a test like this won't work...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.. so the problem only exists when running a single test with mutliple engines at the same time?
This is not something that gets a lot testing.. I for one basically never do this.
I kind of wonder if we should remove this feature? For now could we just run in the first wasm engine instead of in all of them?
Actually the EMTEST_USE_ALL_ENGINES
setting is false so enless you enable it we should only be running in a single engine. Perhaps we should fix that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this solution:
- Leave
engines
as a simple list. - Add a
self.directory_per_engine
test field which defaults tofalse
- Set this field to
true
in the tests that need it. - Add this code here in the loop:
for engine, subdir in engines:
cwd = None
if self.directory_per_engine:
# Some tests require a fresh directory for each engine, for example, because
# they create file and are sensitive to existing files.
dirname = os.path.basename(engine[0])
ensure_dir(dirname)
cwd = dirname
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I'll try to implement this.
} else { | ||
return -EINVAL; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some reason to add this extra scope here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is there to limit accmode
's scope. I don't feel strongly about this, though.
oserr: | ||
_Exit(EX_OSERR); | ||
software: | ||
_Exit(EX_SOFTWARE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just abort()
above instead of these goto + exit stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question -- this seems to be a wasi-libc thing. The code in __main_void
(also derived from wasi-libc) uses EX_OSERR
as well:
__wasi_proc_exit(EX_OSERR); |
This change looks like its almost ready to land. Before we do though I wanted to check what your motivation for trying push this forward is? Is there some use case where you want to use emscripten to target non-web engines? Is there some reason why you need to use emscripten and not just wasi-sdk? |
For the fun of it I'm trying to port a hobby project of mine (CLI tool written in C++, using exceptions for error handling) to WASI. I actually tried wasi-sdk a few months ago but couldn't quite get exceptions to work. Then a few weeks ago I saw that Emscripten supported both a |
const preopen* pre = &preopens[i]; | ||
assert(pre->prefix != NULL); | ||
assert(pre->fd != (__wasi_fd_t)-1); | ||
#ifdef __wasm__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ifdef __wasm__
here? IIUC this code only compiles and run on wasm
preopen_capacity = new_capacity; | ||
free(old_preopens); | ||
|
||
assert_invariants(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just preopens = realloc(preopen, new_capacity)
here?
path++; | ||
} else if (path[0] == '.' && path[1] == '/') { | ||
path += 2; | ||
} else if (path[0] == '.' && path[1] == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use '\0
' here for the terminator?
This case looks like its trying to match .
on its own and then return the empty string.. is that right?
if (!register_preopened_fd(fd, prefix)) { | ||
goto software; | ||
} | ||
free(prefix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that register_preopened_fd
is going to store and use the prefix, why not pass owership? i.e. don't call free()
here or strdup
in register_preopened_fd
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(otherwise no comments aside from @sbc100 's)
* | ||
* The preopen code is based on wasi-libc's `preopens.c` which is licensed | ||
* under a MIT style license. This license can also be found in the LICENSE | ||
* file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep the copyright line from the original file, in addition to this text (which is good).
Based on #18285, implement just enough syscalls to make the
test_fcntl_open
test fromtest_core
work in standalone WASI mode.Compared to #18285 I moved the preopen handling into their own
paths.{h,c}
component. The idea is that this component will also be responsible for cwd handling in a follow up PR.Some notes:
main
runs. Therefore, I removed the locking functions.test_fcntl_open
test I had to extend the@also_with_standalone_wasm
decorator a bit. The test doesn't run on node and wasmer. Wasmer crashes when resolving symlinks. However, both wasmtime and toywasm successfully run the test.fs
folder inside theout/test
folder to put temporary files in. This folder is cleaned after each WASM engine is done. I couldn't make this work for wasmer, but for wasmtime and toywasm it works very nicely.