Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUGs in WAMR #2252

Open
LeoneChen opened this issue Jun 3, 2023 · 20 comments
Open

BUGs in WAMR #2252

LeoneChen opened this issue Jun 3, 2023 · 20 comments

Comments

@LeoneChen
Copy link

LeoneChen commented Jun 3, 2023

Hello~

GS register shouldn't be modified in linux-sgx platform hardware mode (64 bit)

In linux-sgx

get_thread_data use READ_TD_DATA assembly macro here, for Enclave Thread Data/Thread Local Storage usage.
image

And in SGX hardware mode, READ_TD_DATA use GS register in 64bit mode as base address of thread data. here
(Note in hardware mode SE_SIM is undefined)
image

In wamr

Thus it shouldn't modify GS register in Enclave.here for example
image

os_writegsbase macro is at here

image
@wenyongh
Copy link
Contributor

wenyongh commented Jun 5, 2023

@LeoneChen Thanks for pointing out this! I only tested in simulator mode and thought that GS register is available. It is a pity that we cannot use the register to enable the segue optimization for linux-sgx platform, I submitted PR #2255 to fix the issue.

wenyongh added a commit that referenced this issue Jun 5, 2023
Writing GS segment register is not allowed on linux-sgx since it is used as
the base address of thread data in 64-bit hw mode. Reported in #2252.
Disable writing it and disable segue optimization for linux-sgx platform.
@LeoneChen
Copy link
Author

My pleasure~

@LeoneChen
Copy link
Author

LeoneChen commented Jun 5, 2023

Hello~ We find another bug

Heap OOB & Null Pointer Dereference

It's about WAMR run in Enclave

In EDL, size of cmd_buf is specified by cmd_buf_size, but cmd_buf_size can be set by attacker from untrusted host.

public void ecall_handle_command(unsigned cmd,
[in, out, size=cmd_buf_size]uint8_t *cmd_buf,
unsigned cmd_buf_size);

When specify [size=cmd_buf_size] attribute, TBridge(*_t.c from *.edl) like below will malloc only cmd_buf_size for cmd, _len_cmd_buf in below code.

static sgx_status_t SGX_CDECL sgx_ecall_handle_command(void* pms)
{
	...
	uint8_t* _tmp_cmd_buf = ms->ms_cmd_buf;
	unsigned int _tmp_cmd_buf_size = ms->ms_cmd_buf_size;
	size_t _len_cmd_buf = _tmp_cmd_buf_size;
	...

	if (_tmp_cmd_buf != NULL && _len_cmd_buf != 0) {
		if ( _len_cmd_buf % sizeof(*_tmp_cmd_buf) != 0)
		{
			status = SGX_ERROR_INVALID_PARAMETER;
			goto err;
		}
		_in_cmd_buf = (uint8_t*)malloc(_len_cmd_buf);
		if (_in_cmd_buf == NULL) {
			status = SGX_ERROR_OUT_OF_MEMORY;
			goto err;
		}

		if (memcpy_s(_in_cmd_buf, _len_cmd_buf, _tmp_cmd_buf, _len_cmd_buf)) {
			status = SGX_ERROR_UNEXPECTED;
			goto err;
		}
	}

	ecall_handle_command(ms->ms_cmd, _in_cmd_buf, _tmp_cmd_buf_size);
	...
}

Thus, in ecall_handle_command, argc can be any number (e.g. 0), and cmd_buf only have insufficient heap memory space

ecall_handle_command(unsigned cmd, unsigned char *cmd_buf,
unsigned cmd_buf_size)
{
uint64 *args = (uint64 *)cmd_buf;
uint32 argc = cmd_buf_size / sizeof(uint64);

However, e.g. in handle_cmd_init_runtime, this assert is only enabled in DEBUG mode. This assert (or if-check) is necessary in RELEASE mode too.

handle_cmd_init_runtime(uint64 *args, uint32 argc)
{
uint32 max_thread_num;
RuntimeInitArgs init_args;
bh_assert(argc == 1);
os_set_print_function(enclave_print);
max_thread_num = (uint32)args[0];

As bh_assert disabled in RELEASE mode, when cmd_buf_size is 2 for example, args is allocated with size 2 and argc is 0, args[0] will access 8 byte cause HEAP OOB, which overflow heap memory prepared by SGXSDK. And args also can set to null by untrusted host.

Note

Same bugs in all handle_cmd_xxx serial as well

@LeoneChen LeoneChen changed the title [BUG] GS register shouldn't be modified in linux-sgx platform hardware mode (64 bit) BUGs in WAMR Jun 5, 2023
@wenyongh
Copy link
Contributor

wenyongh commented Jun 5, 2023

Hi, the cmd_buf_size of ecall_handle_command is specified when calling it from host for each command, like:
https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/product-mini/platforms/linux-sgx/enclave-sample/App/App.cpp#LL380C55-L380C55
https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/product-mini/platforms/linux-sgx/enclave-sample/App/App.cpp#L412

Not sure why "argc can be any number (e.g. 0), and cmd_buf may only have insufficient heap memory space", is there any case to reproduce the issue? Or do you mean that cmd_buf_size may be modified illegally in the host side?

@LeoneChen
Copy link
Author

LeoneChen commented Jun 5, 2023

Yes, I mean "cmd_buf_size may be modified illegally in the host side", since host and even OS are untrusted (e.g. host is controlled by untrusted cloud manager when user deploys this Enclave at cloud). I think in SGX Enclave development principle, it's necessary to keep interfaces like ECALL/OCALL are as safe as possible, which is different from traditional development.

@LeoneChen
Copy link
Author

I'm happy to help you fix SGX specific BUGs like this, if you need.

@wenyongh
Copy link
Contributor

wenyongh commented Jun 5, 2023

Yes, appreciate!
And note that these commands (init_runtime/load_module/instantiate_module, etc) which invoke ecall_handle_cmd are also called by wamr_pal_create_process. wamr_pal_create_process is called by the inclavare container project, we should ensure that the modification works for the project.

https://github.com/bytecodealliance/wasm-micro-runtime/tree/main/product-mini/platforms/linux-sgx/enclave-sample/App

https://github.com/inclavare-containers/inclavare-containers

@LeoneChen
Copy link
Author

LeoneChen commented Jun 5, 2023

Null Pointer Dereference

args is only unchecked buffer from ecall_handle_command

public void ecall_handle_command(unsigned cmd,
[in, out, size=cmd_buf_size]uint8_t *cmd_buf,
unsigned cmd_buf_size);

ecall_handle_command(unsigned cmd, unsigned char *cmd_buf,
unsigned cmd_buf_size)
{
uint64 *args = (uint64 *)cmd_buf;
uint32 argc = cmd_buf_size / sizeof(uint64);
switch (cmd) {
case CMD_INIT_RUNTIME:
handle_cmd_init_runtime(args, argc);
break;
case CMD_LOAD_MODULE:
handle_cmd_load_module(args, argc);
break;
case CMD_SET_WASI_ARGS:
handle_cmd_set_wasi_args(args, argc);
break;
case CMD_INSTANTIATE_MODULE:
handle_cmd_instantiate_module(args, argc);

enclave_module is from this unchecked buffer, which is controlled by untrusted host, and can be set to NULL pointer, cause Null Pointer Dereference

handle_cmd_instantiate_module(uint64 *args, uint32 argc)
{
uint64 *args_org = args;
EnclaveModule *enclave_module = *(EnclaveModule **)args++;
uint32 stack_size = *(uint32 *)args++;
uint32 heap_size = *(uint32 *)args++;
char *error_buf = *(char **)args++;
uint32 error_buf_size = *(uint32 *)args++;
wasm_module_inst_t module_inst;
bh_assert(argc == 5);
if (!(module_inst =
wasm_runtime_instantiate(enclave_module->module, stack_size,
heap_size, error_buf, error_buf_size))) {

@LeoneChen
Copy link
Author

Yes, appreciate!
And note that these commands (init_runtime/load_module/instantiate_module, etc) which invoke ecall_handle_cmd are also called by wamr_pal_create_process. wamr_pal_create_process is called by the inclavare container project, we should ensure that the modification works for the project.
https://github.com/bytecodealliance/wasm-micro-runtime/tree/main/product-mini/platforms/linux-sgx/enclave-sample/App
https://github.com/inclavare-containers/inclavare-containers

OK, I'll try to fix it

@LeoneChen
Copy link
Author

LeoneChen commented Jun 5, 2023

Hi~ I'll list all bugs found in this issue, and help to try to fix them~

Time of Check to Time of Use/Double Fetch

In wasm_runtime_load (e.g. called by ecall_iwasm_main), buf is in untrusted host (non Enclave) memory.

ecall_iwasm_main(uint8_t *wasm_file_buf, uint32_t wasm_file_size)
{
wasm_module_t wasm_module = NULL;
wasm_module_inst_t wasm_module_inst = NULL;
RuntimeInitArgs init_args;
char error_buf[128];
const char *exception;
os_set_print_function(enclave_print);
memset(&init_args, 0, sizeof(RuntimeInitArgs));
#if WASM_ENABLE_GLOBAL_HEAP_POOL != 0
init_args.mem_alloc_type = Alloc_With_Pool;
init_args.mem_alloc_option.pool.heap_buf = global_heap_buf;
init_args.mem_alloc_option.pool.heap_size = sizeof(global_heap_buf);
#else
init_args.mem_alloc_type = Alloc_With_System_Allocator;
#endif
/* initialize runtime environment */
if (!wasm_runtime_full_init(&init_args)) {
enclave_print("Init runtime environment failed.");
enclave_print("\n");
return;
}
/* load WASM module */
if (!(wasm_module = wasm_runtime_load(wasm_file_buf, wasm_file_size,

wasm_runtime_load(uint8 *buf, uint32 size, char *error_buf,
uint32 error_buf_size)
{
WASMModuleCommon *module_common = NULL;
if (get_package_type(buf, size) == Wasm_Module_Bytecode) {
#if WASM_ENABLE_INTERP != 0
module_common =
(WASMModuleCommon *)wasm_load(buf, size, error_buf, error_buf_size);
return register_module_with_null_name(module_common, error_buf,
error_buf_size);
#endif
}
else if (get_package_type(buf, size) == Wasm_Module_AoT) {
#if WASM_ENABLE_AOT != 0
module_common = (WASMModuleCommon *)aot_load_from_aot_file(
buf, size, error_buf, error_buf_size);
return register_module_with_null_name(module_common, error_buf,
error_buf_size);
#endif
}

get_package_type will check first four bytes in buf, while wasm_load or aot_load_from_aot_file will try to load this buf.

get_package_type(const uint8 *buf, uint32 size)
{
#if (WASM_ENABLE_WORD_ALIGN_READ != 0)
uint32 buf32 = *(uint32 *)buf;
buf = (const uint8 *)&buf32;
#endif
if (buf && size >= 4) {
if (buf[0] == '\0' && buf[1] == 'a' && buf[2] == 's' && buf[3] == 'm')
return Wasm_Module_Bytecode;
if (buf[0] == '\0' && buf[1] == 'a' && buf[2] == 'o' && buf[3] == 't')
return Wasm_Module_AoT;
}
return Package_Type_Unknown;
}

But there is a situation that, attack at OS level can replace content of buf (reside in untrusted memory) just after the check of get_package_type but before loading of wasm_load or aot_load_from_aot_file. Thus when loading buf, the first four bytes can already be changed to anything. Cuase TOCTOU

Thus it's better to copy whole buffer into Enclave, and then check and use.

This problem is very similar to TOCTOU in linux kernel or other situations which involve security boundary, and is firstly pointed out by CCS'19 paper A Tale of Two Worlds: Assessing the Vulnerability of Enclave Shielding Runtimes.

@wenyongh
Copy link
Contributor

wenyongh commented Jun 6, 2023

Hi~ I'll list all bugs found in this issue, and help to try to fix them~

Yes, many thanks!

@LeoneChen
Copy link
Author

LeoneChen commented Jun 13, 2023

Arbitrarily Enclave Write/Read/Execute

Need to check module_inst is outside Enclave with sgx_is_outside_enclave, otherwise it can point to Enclave, and overwrite Enclave critical data.

wasm_runtime_set_exception(module_inst, "allocate memory failed.");

wasm_runtime_set_exception(module_inst, "allocate memory failed.");

Need to check error_buf is out of Enclave

No check on wasm_module_inst, it used in execute_main and cause Enclave memory overwrite and execute, also it can be null, cause null pointer dereference

wasm_application_execute_main(wasm_module_inst, 0, NULL);

ret = execute_main(module_inst, argc, argv);

No check on wasm_file_buf and wasm_file_size, it can point to In-Enclave memory, and cause arbitrarily Enclave read in wasm_loader_load->load

if (!(wasm_module = wasm_runtime_load(wasm_file_buf, wasm_file_size,

version = read_uint32(p);

handle_cmd_load_module -> wasm_runtime_load,error_buf can point to In Enclave memory

wasm_runtime_load(enclave_module->wasm_file, wasm_file_size,

set_error_buf(error_buf, error_buf_size,

wasm_file can point to in Enclave

bh_memcpy_s(enclave_module->wasm_file, wasm_file_size, wasm_file,

enclave_module can point to any where

No check on module_inst

0x000000000047771d: sgxsan_backtrace at /home/leone/Documents/SGXSan/SGXSanRT/SGXSanRTApp/SGXSanRTApp.cpp:667
0x0000000000487cdf: MemAccessMgrOutEnclaveAccess at /home/leone/Documents/SGXSan/SGXSanRT/SGXSanRTApp/MemAccessMgr.cpp:37
0x0000000000472a65: __asan_load8 at /home/leone/Documents/SGXSan/SGXSanRT/SGXSanRTApp/PoisonCheck.cpp:91
0x000000000008d405: aot_deinstantiate at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/aot/aot_runtime.c:1267
0x000000000005ea76: wasm_runtime_deinstantiate_internal at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:1242
0x000000000005ebd6: wasm_runtime_deinstantiate at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:1286
0x0000000000055373: handle_cmd_deinstantiate_module(unsigned long*, unsigned int) at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/product-mini/platforms/linux-sgx/enclave-sample/Enclave/Enclave.cpp:380
0x0000000000053526: ecall_handle_command at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/product-mini/platforms/linux-sgx/enclave-sample/Enclave/Enclave.cpp:645
0x00000000000449f6: sgx_ecall_handle_command at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/product-mini/platforms/linux-sgx/enclave-sample/Enclave/Enclave_t.c:577

Null Pointer Dereference

Need to check func_name not Null before strlen(func_name)

Lack of check dir_list, env_list and so on, that can be null

p is from buf, and un-checked

section_type = read_uint8(p);

#0  0x00007ffff6abc4c0 in create_sections (buf=0x519680 "", size=4294901760, p_section_list=0x7fffffffe640, error_buf=0x7fffffffe920 "\320\351\377\377\377\177", error_buf_size=128) at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/interpreter/wasm_loader.c:3982
#1  0x00007ffff6a999d2 in load (buf=0x519680 "", size=4294901760, module=0x519f00, error_buf=0x7fffffffe920 "\320\351\377\377\377\177", error_buf_size=128) at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/interpreter/wasm_loader.c:4078
#2  0x00007ffff6a99626 in wasm_loader_load (buf=0x519680 "", size=4294901760, error_buf=0x7fffffffe920 "\320\351\377\377\377\177", error_buf_size=128) at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/interpreter/wasm_loader.c:4246
#3  0x00007ffff6a04489 in wasm_load (buf=0x519680 "", size=4294901760, error_buf=0x7fffffffe920 "\320\351\377\377\377\177", error_buf_size=128) at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/interpreter/wasm_runtime.c:56
#4  0x00007ffff69f7686 in wasm_runtime_load (buf=0x519680 "", size=4294901760, error_buf=0x7fffffffe920 "\320\351\377\377\377\177", error_buf_size=128) at /home/leone/Documents/SGXSan/SGX_APP/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:1119
#5  0x00007ffff69ee8fd in ecall_iwasm_main (wasm_file_buf=0x519680 "", wasm_file_size=4294901760) at Enclave/Enclave.cpp:695
#6  0x00007ffff69ddb2c in sgx_ecall_iwasm_main (pms=0x7fffffffeb38) at Enclave/Enclave_t.c:603

@LeoneChen
Copy link
Author

@wenyongh Hello, the current interface designed for SGX is too open, but it's due to the requirement of fully supported web assembly runtime, and it seems too hard to fix. How about trying Gramine from Intel or Occlum from Ant Group to seamlessly support non-SGX version WAMR, they are LibOS solutions run in Enclave to support legacy applications and have well-designed Enclave interfaces

@wenyongh
Copy link
Contributor

wenyongh commented Jul 3, 2023

@LeoneChen Yes, it may be very hard to fix, could you fix some simple cases? For Gramine and LibOS, maybe we can try running WAMR based on the platform linux (product-mini/platforms/linux), it should be OK if they supports the libc APIs that platform linux requires.

@LeoneChen
Copy link
Author

Hello!

My simple idea is change each handle in switch-case to ecalls, in order to leverage auto generated Tbridge code to memory clone for one-level pointer, but for nested pointer, we may need extra work.

This will change interface of Enclave, then host need change code to call these interface.

If you think it's ok, I will submit a PR for every to review

@wenyongh
Copy link
Contributor

wenyongh commented Jul 4, 2023

It is good to me, but not sure whether it is OK for other developers, they might be using the current mode in their products. How about adding a macro to control it? By default the new interface is used, and developer can use make WAMR_BUILD_XXX=1 (or 0) to use the old interfaces, and in the App code, we can use macro WASM_ENABLE_XXX=1/0 to control the code.

@LeoneChen
Copy link
Author

Ok

@LeoneChen
Copy link
Author

UAF

Malloc

ecall_iwasm_main->wasm_runtime_full_init->wasm_runtime_env_init->wasm_shared_memory_init->bh_hash_map_create->wasm_runtime_malloc
which set global wait_map

wasm_shared_memory_init()
{
if (os_mutex_init(&shared_memory_list_lock) != 0)
return false;
/* wait map not exists, create new map */
if (!(wait_map = bh_hash_map_create(32, true, (HashFunc)wait_address_hash,

Free

ecall_iwasm_main->wasm_runtime_destroy->wasm_shared_memory_destroy->bh_hash_map_destroy->wasm_runtime_free
which not set wait_map to NULL after free

wasm_shared_memory_destroy()
{
bh_hash_map_destroy(wait_map);
os_mutex_destroy(&shared_memory_list_lock);
}

Use

ecall_handle_command->handle_cmd_destroy_runtime->wasm_runtime_destroy->wasm_shared_memory_destroy->bh_hash_map_destroy
map is the wait_map passed from parameter and map->lock cause UAF

bh_hash_map_destroy(HashMap *map)
{
uint32 index;
HashMapElem *elem, *next;
if (!map) {
LOG_ERROR("HashMap destroy failed: map is NULL.\n");
return false;
}
if (map->lock) {

Conclusion

WAMR forget to set wait_map to NULL after free, and ecall can be called arbitrarily by untrusted host.

@LeoneChen
Copy link
Author

LeoneChen commented Jul 13, 2023

PR #2357

I'm a little busy recently, so late to fix the above problems

One detail is that I use indexes to replace the raw pointers when passing enclave_module and wasm_module_inst across the SGX boundary. This method is decribed in TeeRex, and at the end of section 5.3

The developers of the Rust SGX SDK acknowledged the problem and promptly updated their code. Akin to our suggestions to the developers, the enclave code now utilizes session identifiers instead of pointers to identify TLS sessions; similar to using file descriptors on Unix-like systems. Upon session creation in tls_client_new, the pointer to the TLS session object is now inserted into a hashmap, which is then used to map the identifier in subsequent ECALLs. Hence, no pointers are passed on the host-to-enclave boundary. This drastically reduces the attack surface of the enclave and eradicates both the vulnerability pattern Returning pointers to enclave memory (P2) and Pointers to Overlapping Memory (P3).

Since I'm not so familiar with WAMR, you may need further check. The Action on this PR failed on SPEC, but I have no idea about this.

I've fuzzed with this Enclave with the new interface, and no above bugs are reported anymore.

@wenyongh
Copy link
Contributor

PR #2357

I'm a little busy recently, so late to fix the above problems

One detail is that I use indexes to replace the raw pointers when passing enclave_module and wasm_module_inst across the SGX boundary. This method is decribed in TeeRex, and at the end of section 5.3

The developers of the Rust SGX SDK acknowledged the problem and promptly updated their code. Akin to our suggestions to the developers, the enclave code now utilizes session identifiers instead of pointers to identify TLS sessions; similar to using file descriptors on Unix-like systems. Upon session creation in tls_client_new, the pointer to the TLS session object is now inserted into a hashmap, which is then used to map the identifier in subsequent ECALLs. Hence, no pointers are passed on the host-to-enclave boundary. This drastically reduces the attack surface of the enclave and eradicates both the vulnerability pattern Returning pointers to enclave memory (P2) and Pointers to Overlapping Memory (P3).

Since I'm not so familiar with WAMR, you may need further check. The Action on this PR failed on SPEC, but I have no idea about this.

I've fuzzed with this Enclave with the new interface, and no above bugs are reported anymore.

Thanks @LeoneChen! We will check the PR and try to fix the issues reported by GitHub CIs.

wenyongh added a commit that referenced this issue Aug 4, 2023
…urity (#2416)

Call ecall commands arbitrarily from host when enclave's runtime isn't initialized
may cause unexpected behavior, for example, load/instantiate wasm module.
Add runtime inited status checks in enclave to improve the security.

Also fix `wait_map` issue mentioned in
#2252 (comment)
victoryang00 pushed a commit to victoryang00/wamr-aot-gc-checkpoint-restore that referenced this issue May 27, 2024
…2255)

Writing GS segment register is not allowed on linux-sgx since it is used as
the base address of thread data in 64-bit hw mode. Reported in bytecodealliance#2252.
Disable writing it and disable segue optimization for linux-sgx platform.
victoryang00 pushed a commit to victoryang00/wamr-aot-gc-checkpoint-restore that referenced this issue May 27, 2024
…urity (bytecodealliance#2416)

Call ecall commands arbitrarily from host when enclave's runtime isn't initialized
may cause unexpected behavior, for example, load/instantiate wasm module.
Add runtime inited status checks in enclave to improve the security.

Also fix `wait_map` issue mentioned in
bytecodealliance#2252 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants