[AOT] Memory Info Restore Mechanism with Better Performance #4113
Conversation
Thank you for the keen observation and the intriguing analysis of the root cause. IIUC, the rationale behind storing the address of the memory base address is visible in `aot_check_memory_overflow()`:

```c
// in aot_check_memory_overflow()
/* Get memory base address and memory data size */
if (func_ctx->mem_space_unchanged
#if WASM_ENABLE_SHARED_MEMORY != 0
|| is_shared_memory
#endif
) {
mem_base_addr = func_ctx->mem_info[0].mem_base_addr; // This branch should be used in the majority of cases.
}
else {
if (!(mem_base_addr = LLVMBuildLoad2(
comp_ctx->builder, OPQ_PTR_TYPE,
func_ctx->mem_info[0].mem_base_addr, "mem_base"))) {
aot_set_last_error("llvm build load failed.");
goto fail;
}
}
```

Using the address of the memory base address does not allow optimization passes to recognize the pattern and decide to eliminate superfluous load instructions. However, I concur regarding the conditions for setting `mem_space_unchanged`:

```c
// in create_memory_info
bool mem_space_unchanged = true; // (!func->has_op_memory_grow && !func->has_op_func_call) || (!module->possible_memory_grow);
```

Therefore, I believe the concept of "reloading the base address when the memory might change" is excellent. If you're in agreement with my perspective, we can begin refactoring the PR by concentrating on "reloading the base address when the memory might change" and eliminating "keeping the address of the memory base address."
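To make that direction concrete, here is a rough sketch of what such a reload could look like, reusing the patterns from the snippets above; `refresh_mem_base_addr` and the `mem_base_addr_ptr` field are hypothetical names for illustration, not existing WAMR code:

```c
/* Hypothetical sketch, not actual WAMR code: after an opcode that can move
 * linear memory (memory.grow, or a call whose callee may grow memory),
 * re-read the base address from the instance so later accesses see the
 * fresh value. */
static bool
refresh_mem_base_addr(AOTCompContext *comp_ctx, AOTFuncContext *func_ctx)
{
    LLVMValueRef fresh;

    /* mem_base_addr_ptr: hypothetical field holding the address of the
       memory base address inside the AOT instance */
    if (!(fresh = LLVMBuildLoad2(comp_ctx->builder, OPQ_PTR_TYPE,
                                 func_ctx->mem_info[0].mem_base_addr_ptr,
                                 "mem_base_addr_fresh"))) {
        aot_set_last_error("llvm build load failed.");
        return false;
    }
    /* overwrite the cached base address that load/store opcodes read */
    LLVMBuildStore(comp_ctx->builder, fresh,
                   func_ctx->mem_info[0].mem_base_addr);
    return true;
}
```

In this scheme, ordinary load/store opcodes never touch the instance; they only read the cached slot, and the slot is rewritten at the few points where the memory might have moved.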
@lum1n0us Completely agree.
> If you're in agreement with my perspective, we can begin refactoring the PR by concentrating on "reloading the base address when the memory might change" and eliminating "keeping the address of the memory base address."

I agree. Let me know what I can do to help with the refactoring.
I believe the approach should be:
You're correct; having readable names for AOT functions in the IR and generated code would indeed make debugging more user-friendly. However, the reality is that .wasm files often lack a name section. Typically, in the interest of minimizing binary size, debug information is stripped, which means we need at least two naming schemes to handle scenarios with and without a name section. On the other hand, generated function names are a common assumption across AOT and JIT running modes and their supporting tools. Therefore, unless there's a comprehensive solution available, we might prefer to stick with additional scripts for the time being.
```diff
         if (!(mem_base_addr = LLVMBuildLoad2(
                   comp_ctx->builder, OPQ_PTR_TYPE,
-                  func_ctx->mem_info[0].mem_base_addr, "mem_base"))) {
+                  func_ctx->mem_info[0].mem_base_addr, "mem_base_addr"))) {
```
I assume the main purpose of this PR is to minimize or eliminate load instructions in memory operations. However, the changes eliminated all the fast accesses (the `if` branch) but retained the slow ones (the `else` branch). Does this actually address the original problem?
These two `else` branches are different. In this PR, `mem_info[0].mem_base_addr` comes from `LLVMBuildAlloca`, which allocates memory on the stack (often promoted to registers by the `mem2reg` optimization, or kept in cache), whereas the previous version loads from global memory.
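For concreteness, here is a minimal sketch (assuming the LLVM C API; `cache_mem_base` and `mem_base_addr_offset` are illustrative placeholders, not the PR's actual variables) of the alloca-based caching being described:

```c
/* Illustrative sketch of the alloca-based cache described above;
 * names are placeholders, not the PR's actual code. */
#include <llvm-c/Core.h>

static LLVMValueRef
cache_mem_base(LLVMBuilderRef builder, LLVMTypeRef opq_ptr_type,
               LLVMValueRef mem_base_addr_offset)
{
    /* entry block: one stack slot caching the memory base address */
    LLVMValueRef slot =
        LLVMBuildAlloca(builder, opq_ptr_type, "mem_base_addr");

    /* read the real base address out of the instance and cache it */
    LLVMValueRef base = LLVMBuildLoad2(builder, opq_ptr_type,
                                       mem_base_addr_offset, "mem_base_addr1");
    LLVMBuildStore(builder, base, slot);

    /* each memory access then reads the cached value back from the slot;
       mem2reg later turns these loads/stores into plain SSA values */
    return LLVMBuildLoad2(builder, opq_ptr_type, slot, "mem_base_addr9");
}
```

This maps one-to-one onto the `%mem_base_addr` / `%mem_base_addr1` / `%mem_base_addr9` IR shown later in the thread.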
I believe it's become more confusing.

There is a critical need to pass the base address of linear memory to various functions, so it's necessary to have some sort of global variable to hold this address.

In the original code, the process is as follows: the base address is always retrieved from the global variable, stored in a temporary variable `mem_base_addr`, and then this temporary variable is used to compute the final address.

```llvm
%mem_base_addr_offset = getelementptr inbounds i8, ptr %aot_inst, i32 376
%mem_base_addr = load ptr, ptr %mem_base_addr_offset, align 8
;; when using
%maddr = getelementptr inbounds i8, ptr %mem_base_addr, i64 %offset1
```

In the PR, a new local variable, `mem_base_addr`, is introduced to hold the base address after obtaining it from the temporary variable `mem_base_addr1`. Although loading from a local variable isn't a significant issue, it does raise a small question as to why this is necessary; after all, the base address is already present in the temporary variable (if using the original design).

```llvm
%mem_base_addr = alloca ptr, align 8
%mem_base_addr_offset = getelementptr inbounds i8, ptr %aot_inst, i32 376
%mem_base_addr1 = load ptr, ptr %mem_base_addr_offset, align 8
store ptr %mem_base_addr1, ptr %mem_base_addr, align 8
%mem_base_addr9 = load ptr, ptr %mem_base_addr, align 8
%maddr = getelementptr inbounds i8, ptr %mem_base_addr9, i64 %offset1
```

And after changes to linear memory, such as `memory.grow`, it is still necessary to reload the value from the global variable, then from the temporary variable, and finally save it to the local variable. This doesn't appear to be more optimized, in my view.
Your thought is correct, but you have overlooked the optimization of `LLVMBuildAlloca` variables by `mem2reg`.
> In this PR, mem_info[0].mem_base_addr comes from LLVMBuildAlloca, which allocates memory on the stack (often in registers due to mem2reg optimization or in cache).
For example:

```llvm
define i32 @foo(i32 %x) {
entry:
%y = alloca i32
store i32 %x, i32* %y
%val = load i32, i32* %y
ret i32 %val
}
```

The load/store of `%y` can be optimized away:

```llvm
define i32 @foo(i32 %x) {
entry:
ret i32 %x
}
```

Similarly,

```llvm
%mem_base_addr = alloca ptr, align 8
%mem_base_addr_offset = getelementptr inbounds i8, ptr %aot_inst, i32 376
%mem_base_addr1 = load ptr, ptr %mem_base_addr_offset, align 8
store ptr %mem_base_addr1, ptr %mem_base_addr, align 8
%mem_base_addr9 = load ptr, ptr %mem_base_addr, align 8
%maddr = getelementptr inbounds i8, ptr %mem_base_addr9, i64 %offset1
```

can be optimized to:

```llvm
%mem_base_addr_offset = getelementptr inbounds i8, ptr %aot_inst, i32 376
%mem_base_addr1 = load ptr, ptr %mem_base_addr_offset, align 8
%maddr = getelementptr inbounds i8, ptr %mem_base_addr1, i64 %offset1
```

In this PR, if you dump the IR of `substr.wasm` above, you will see that the load/store operations of `%mem_base_addr` are also optimized away. As a result, the final outcome remains the same as in the original code.
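If you want to check this promotion in isolation, here is a minimal sketch (my own example, not WAMR code; it assumes LLVM 13+ for the new pass manager C API) that runs `mem2reg` over a module and dumps the result:

```c
/* Standalone sketch: run the mem2reg pass over a module via the LLVM C API's
 * new pass manager, then dump the promoted IR. */
#include <stdio.h>
#include <llvm-c/Core.h>
#include <llvm-c/Error.h>
#include <llvm-c/Transforms/PassBuilder.h>

static void run_mem2reg(LLVMModuleRef module)
{
    LLVMPassBuilderOptionsRef options = LLVMCreatePassBuilderOptions();

    /* no target machine is needed for this IR-level transform */
    LLVMErrorRef err = LLVMRunPasses(module, "mem2reg", NULL, options);
    if (err) {
        char *msg = LLVMGetErrorMessage(err);
        fprintf(stderr, "mem2reg failed: %s\n", msg);
        LLVMDisposeErrorMessage(msg);
    }
    LLVMDisposePassBuilderOptions(options);

    /* the alloca/store/load pattern should now be promoted to SSA values */
    LLVMDumpModule(module);
}
```

After the pass runs, the dumped IR should match the "can be optimized to" form shown above.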
I've devised a straightforward test case to assess the enhancement. Here is the code:

```wasm
(module
(memory 5 10)
(func $store32 (export "store32") (param i32 i32)
(i32.store (local.get 0) (local.get 1))
)
(func $load32 (export "load32") (param i32) (result i32)
(i32.load (local.get 0))
)
(func (export "load_store") (param i32 i32) (result i32)
(local i32)
(i32.load (local.get 0))
(local.tee 2)
(i32.store (local.get 1))
(i32.load (local.get 1))
(local.get 2)
(i32.eq)
)
(func (export "load_grow_store") (param i32 i32) (result i32)
(local i32)
(i32.load (local.get 0))
(local.tee 2)
(i32.store (local.get 1))
(memory.grow (i32.const 1))
(drop)
(i32.load (local.get 1))
(local.get 2)
(i32.eq)
)
(func (export "load_store_w_func") (param i32 i32) (result i32)
(local i32)
(local.get 0)
(call $load32)
(local.tee 2)
(local.get 1)
(call $store32)
(i32.load (local.get 1))
(local.get 2)
(i32.eq)
)
(func (export "load_grow_store_w_func") (param i32 i32) (result i32)
(local i32)
(local.get 0)
(call $load32)
(local.tee 2)
(local.get 1)
(call $store32)
(memory.grow (i32.const 1))
(drop)
(i32.load (local.get 1))
(local.get 2)
(i32.eq)
)
)
```

And I used the following command-line options, `--bounds-checks=1 --format=llvmir-opt`, to create the optimized LLVM IR. The `--bounds-checks=1` option is employed to apply the `noinline` attribute.
Several intriguing findings emerged from the comparison of the before and after scenarios:
- Look at f0, f1, and f2. These are elementary cases involving load and store. As previously mentioned, the mem2reg optimization refines alloca variables, allowing the revised version to produce no additional IR compared to the original version.
- Now, consider f4 and f5. I believed they presented issues that this PR aims to address. Clearly, as seen in f5, there is no necessity to reload the memory base address after calling f1, as there is no memory growth in f4. This PR should eliminate that redundant loading. However, the modified version maintains the status quo.
🆙 If I'm mistaken, please correct me.
This leads to my confusion: if there is no difference for basic cases and no enhancement for redundant loading, what is the rationale for this change?
This PR is to eliminate redundant memory info loads between multiple load/store instructions when the memory remains unchanged.

I use the following WAST example to compare the (optimized) IR generated by this PR with the original code, identifying specific scenarios where this optimization applies. I don't need the `noinline` attribute, so I only used the `--format=llvmir-opt` option.

```wasm
(module
(memory 5 10)
(func (export "load_load") (param i32 i32) (result i32)
(i32.load (local.get 0))
(i32.load (local.get 1))
(i32.eq)
(memory.grow (i32.const 1))
(drop)
)
(func (export "load_store") (param i32 i32)
(i32.load (local.get 0))
(i32.store (local.get 1))
(memory.grow (i32.const 1))
(drop)
)
(func (export "store_store") (param i32 i32)
(i32.store (local.get 0) (i32.const 42))
(i32.store (local.get 1) (i32.const 42))
(memory.grow (i32.const 1))
(drop)
)
(func (export "store_load") (param i32 i32) (result i32)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
(memory.grow (i32.const 1))
(drop)
)
)
```

In f3 and f4, the IR generated by this PR is different. This PR primarily optimizes the store-store and store-load scenarios. As for the load-load and load-store scenarios, I believe they might already have been optimized.

In your example, f0 and f1 show that this PR produces no additional IR compared to the original version. The memory in f2 is unchanged. f3 is the load-store case. f4 and f5 contain two calls and one load instruction. Therefore, their IRs remain the same.
Let me summarize it for us. IMU, there are several scenarios we need to examine closely: single-load, single-store, load-load, load-store, store-store, store-load, load-store-grow-load-store, load-store-call-load-store, and load-store-call-grow-load-store. The rationale for the last three scenarios is that in the original implementation, the key condition for controlling the reloading of the memory base address is `mem_space_unchanged`.

After merging both test scripts, I believe we can address all the aforementioned cases. For your information, since the cases are quite straightforward, particularly the functions being called, the compilation process will likely inline these functions, causing the last two cases to lose the `call`. This is the reason I recommend using `--bounds-checks=1` to turn off the inline optimization.
N means no redundant load/store. The PR has improved two cases and left two cases unchanged.

BEFORE:

AFTER:
FYI: Test cases:

```wasm
(module
(memory 5 10)
(func $store32 (export "store32") (param i32 i32)
(i32.store (local.get 0) (local.get 1))
)
(func $load32 (export "load32") (param i32) (result i32)
(i32.load (local.get 0))
)
;; 2
(func (export "load_load") (param i32 i32) (result i32)
(i32.load (local.get 0))
(i32.load (local.get 1))
(i32.eq)
(memory.grow (i32.const 1))
(drop)
)
(func (export "load_store") (param i32 i32)
(i32.load (local.get 0))
(i32.store (local.get 1))
(memory.grow (i32.const 1))
(drop)
)
(func (export "store_store") (param i32 i32)
(i32.store (local.get 0) (i32.const 42))
(i32.store (local.get 1) (i32.const 42))
(memory.grow (i32.const 1))
(drop)
)
(func (export "store_load") (param i32 i32) (result i32)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
(memory.grow (i32.const 1))
(drop)
)
;; 6
(func (export "load_store_grow_load_store") (param i32 i32) (result i32)
(local i32)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
(local.set 2)
(memory.grow (i32.const 1))
(drop)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
)
(func (export "load_store_call_load_store") (param i32 i32) (result i32)
(local i32)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
(local.set 2)
(local.get 0)
(call $load32)
(local.tee 2)
(local.get 1)
(call $store32)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
)
(func (export "load_store_call_grow_load_store") (param i32 i32) (result i32)
(local i32)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
(local.set 2)
(local.get 0)
(call $load32)
(local.tee 2)
(local.get 1)
(call $store32)
(memory.grow (i32.const 1))
(drop)
(i32.store (local.get 0) (i32.const 42))
(i32.load (local.get 1))
)
)
```
IIUC, the key condition is `mem_space_unchanged`:
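For reference, this is the snippet quoted earlier from `create_memory_info`:

```c
// in create_memory_info
bool mem_space_unchanged = true; // (!func->has_op_memory_grow && !func->has_op_func_call) || (!module->possible_memory_grow);
```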
I found that my program runs slower in WAMR AOT mode compared to other WASM runtimes, e.g. WAVM. I compared their LLVM IRs and found that WAMR emits more load operations of the memory base.
In WAMR, functions with `mem_space_unchanged` keep the memory base address in the `mem_base_addr` field of `AOTMemInfo`, while other functions keep the address of the memory base address in that field. When emitting instructions like load/store, the former use the base address directly, while the latter must first load the base address from its address. This reload is redundant when there is no possibility of the memory changing between two consecutive load/store instructions:

Optimization passes won't recognize this redundancy because the reloaded memory base is accessed within the context.
In WAVM, the base address is reloaded when the memory possibly changes, e.g. after calling another function or after `memory.grow`. This can be redundant if there are no subsequent load/store instructions, but the dead code elimination pass handles this:

Performance
Here is a sample C++ program, `substr.cc`:

Compiled with emcc (version: 3.1.59 (0e4c5994eb5b8defd38367a416d0703fd506ad81))

Then ran wamrc and iwasm (Linux) and compared the performance:
product-mini/platforms/posix/main.c:
result:
commit e3dcf4f
commit 3f268e5
IR (optimized) comparison: