You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This benchmark runs the Argon2id hashing algorithm, with 2 iterations, 1KB of memory, and 1 parallel lane.
33
-
I had to decrease the memory usage from the default to 1KB, because especially the interpreters were struggling to finish in a reasonable amount of time.
34
-
This is something where `simd` instructions would be really useful, and it also highlights some of the issues with the current implementation of TinyWasm's Value Stack and Memory Instances.
35
-
36
38
### Fib
37
39
38
40
The first benchmark is a simple optimized Fibonacci function, which is a good way to show the overhead of calling functions and parsing the bytecode.
39
-
TinyWasm is slightly faster then Wasmi here, but that's probably because of the overhead of parsing the bytecode as TinyWasm uses a custom bytecode to pre-process the WebAssembly bytecode.
41
+
TinyWasm is slightly faster than Wasmi here, but that's probably because of the overhead of parsing the bytecode, as TinyWasm uses a custom bytecode to pre-process the WebAssembly bytecode.
40
42
41
43
### Fib-Rec
42
44
43
45
This benchmark is a recursive Fibonacci function, which highlights some of the issues with the current implementation of TinyWasm's Call Stack.
44
46
TinyWasm is a lot slower here, but that's because there's currently no way to reuse the same Call Frame for recursive calls, so a new Call Frame is allocated for every call. This is not a problem for most programs, and the upcoming `tail-call` proposal will make this a lot easier to implement.
45
47
48
+
### Argon2id
49
+
50
+
This benchmark runs the Argon2id hashing algorithm, with 2 iterations, 1KB of memory, and 1 parallel lane.
51
+
I had to decrease the memory usage from the default to 1KB, because especially the interpreters were struggling to finish in a reasonable amount of time.
52
+
This is where `simd` instructions would be really useful, and it also highlights some of the issues with the current implementation of TinyWasm's Value Stack and Memory Instances.
53
+
46
54
### Selfhosted
47
55
48
56
This benchmark runs TinyWasm itself in the VM, and parses and executes the `print.wasm` example from the `examples` folder.
49
-
This is a godd way to show some of TinyWasm's strengths - the code is pretty large at 702KB and Wasmer struggles massively with it, even with the Single Pass compiler. I think it's a decent real-world performance benchmark, but definitely favors TinyWasm a bit.
57
+
This is a good way to show some of TinyWasm's strengths - the code is quite large at 702KB and Wasmer struggles massively with it, even with the Single Pass compiler. I think it's a decent real-world performance benchmark, but it definitely favors TinyWasm a bit.
50
58
51
59
Wasmer also offers a pre-parsed module format, so keep in mind that this number could be a bit lower if that was used (but probably still on the same order of magnitude). This number seems so high that I'm not sure if I'm doing something wrong, so I will be looking into this in the future.
52
60
53
61
### Conclusion
54
62
55
-
After profiling and fixing some low hanging fruits, I found the biggest bottleneck to be Vector operations, especially for the Value Stack, and having shared access to Memory Instances using RefCell. These are the two areas I will be focusing on improving in the future, trying out to use
56
-
Arena Allocation and other data structures to improve performance. Still, I'm quite happy with the results, especially considering the use of standard Rust data structures. Additionally, typed FuncHandles have a significant overhead over the untyped ones, so I will be looking into improving that as well.
63
+
After profiling and fixing some low-hanging fruits, I found the biggest bottleneck to be Vector operations, especially for the Value Stack, and having shared access to Memory Instances using RefCell. These are the two areas I will be focusing on improving in the future, trying out Arena Allocation and other data structures to improve performance. Additionally, typed FuncHandles have a significant overhead over the untyped ones, so I will be looking into improving that as well. Still, I'm quite happy with the results, especially considering the use of standard Rust data structures.
0 commit comments