Description
On Linux and some other platforms, libstd uses __cxa_thread_atexit_impl to register destructors for thread locals. The last argument to this is &__dso_handle, but this is only correct for thread locals which are inside libstd, or are in code that is statically linked with libstd.
__dso_handle is a magic symbol whose value is unique to whichever DSO references it. That is, if libfoo.so and libbar.so both look at __dso_handle (or &__dso_handle, which is really the value you use), it will have a different value in each. Yes, I know this isn't how things usually work; this is the entire point of __dso_handle (technically, it behaves as if a symbol with "hidden" visibility named __dso_handle were declared inside each DSO automatically, see https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor-runtime-api, although this is clearly a hack).
It behaves very slightly differently depending on a number of moving pieces (libdl, libc, libcxxabi, the linker, the runtime loader, ...; collectively I'm going to call these "the runtime"), and shows up in a couple of different APIs, but here it's being used to remember that the DSO has a pending thread-local dtor, which prevents the DSO from being unloaded until after said dtors have all run (when all the threads in question have exited).
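To make the "which DSO does this handle belong to" idea concrete, here is a minimal sketch that asks the dynamic loader which loaded object contains &__dso_handle and compares that with the object containing some code from the same crate. It assumes a dynamically linked Linux/glibc target and a toolchain that accepts `unsafe extern` blocks; the DlInfo struct is a hand-written mirror of the C Dl_info layout, and object_base_of and dso_handle_addr are hypothetical helper names, not anything libstd provides. In a single binary it can only show that the handle lands in the program's own object; the interesting case (two DSOs, two different values) needs a second shared library.

```rust
use std::ffi::c_void;
use std::os::raw::{c_char, c_int};

// Hand-written mirror of C's `Dl_info` (assumption: Linux/glibc layout).
#[repr(C)]
struct DlInfo {
    dli_fname: *const c_char,
    dli_fbase: *mut c_void,
    dli_sname: *const c_char,
    dli_saddr: *mut c_void,
}

unsafe extern "C" {
    // Per-DSO symbol; this reference is resolved at static link time to the
    // copy belonging to whatever object this crate ends up linked into.
    static __dso_handle: u8;

    // Asks the loader which loaded object contains `addr`.
    fn dladdr(addr: *const c_void, info: *mut DlInfo) -> c_int;
}

/// Hypothetical helper: base address of the loaded object containing `addr`,
/// or `None` if the loader does not know the address.
fn object_base_of(addr: *const c_void) -> Option<usize> {
    let mut info = DlInfo {
        dli_fname: std::ptr::null(),
        dli_fbase: std::ptr::null_mut(),
        dli_sname: std::ptr::null(),
        dli_saddr: std::ptr::null_mut(),
    };
    let found = unsafe { dladdr(addr, &mut info) };
    if found != 0 {
        Some(info.dli_fbase as usize)
    } else {
        None
    }
}

/// Hypothetical helper: the value that would be passed as `&__dso_handle`.
fn dso_handle_addr() -> *const c_void {
    unsafe { &__dso_handle as *const u8 as *const c_void }
}

fn main() {
    // Address of some code that lives in this crate.
    let code: fn() -> *const c_void = dso_handle_addr;

    let handle_base = object_base_of(dso_handle_addr());
    let code_base = object_base_of(code as *const c_void);

    // Both addresses should fall inside the same loaded object.
    assert!(handle_base.is_some());
    assert_eq!(handle_base, code_base);
}
```

If the same helper were compiled into libfoo.so and libbar.so, each copy would be expected to report its own library's base, which is exactly the property the runtime relies on.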
So, to the point: libstd always registers this using a __dso_handle which is linked from inside itself. This defeats the point of the symbol, as now "the runtime" believes that the DSO containing libstd is the one responsible for the dtor. This can cause problems in scenarios where libstd is dynamically linked, and dlopen/dlclose is used to dynamically load Rust code. (See "Memory unsafe scenario" for why.)
I believe the ideal fix here is to have thread_local! expand to contain the extern for __dso_handle on these systems. Then, &__dso_handle would be passed in as an argument to the call to unix::thread_local_dtor::register_dtor. I don't know how this interacts with weak symbols, but I'm sure this can be made to work.
(This is... inconvenient, but it's not that surprising — if library code could be the source for this value, there'd be no need for it to get passed in)
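A rough sketch of the shape of that fix, under stated assumptions: this_dso_handle! is a hypothetical macro standing in for the part of thread_local! that would declare the extern at the expansion site, and register_dtor here is a hypothetical stand-in for libstd's internal function, changed to take the handle as a parameter. It uses a strong (non-weak) reference to both symbols, so it sidesteps the weak-symbol question entirely, and it assumes Linux with a glibc that exports __cxa_thread_atexit_impl and a toolchain that accepts `unsafe extern` blocks.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;
use std::sync::atomic::{AtomicUsize, Ordering};

unsafe extern "C" {
    // glibc's registration hook for thread-exit destructors.
    fn __cxa_thread_atexit_impl(
        dtor: unsafe extern "C" fn(*mut c_void),
        obj: *mut c_void,
        dso_symbol: *mut c_void,
    ) -> c_int;
}

/// Hypothetical macro: because the extern is declared where the macro is
/// expanded, the reference is linked into the crate that uses the macro,
/// so it names that crate's DSO rather than the DSO containing the callee.
macro_rules! this_dso_handle {
    () => {{
        unsafe extern "C" {
            static __dso_handle: u8;
        }
        unsafe { &__dso_handle as *const u8 as *mut ::std::ffi::c_void }
    }};
}

/// Hypothetical stand-in for the internal register_dtor: the handle is an
/// argument supplied by the caller instead of being looked up in here.
unsafe fn register_dtor(
    obj: *mut c_void,
    dtor: unsafe extern "C" fn(*mut c_void),
    dso_handle: *mut c_void,
) -> c_int {
    unsafe { __cxa_thread_atexit_impl(dtor, obj, dso_handle) }
}

// Counts destructor invocations so the effect is observable.
static DTORS_RUN: AtomicUsize = AtomicUsize::new(0);

unsafe extern "C" fn count_dtor(_obj: *mut c_void) {
    DTORS_RUN.fetch_add(1, Ordering::SeqCst);
}

/// Registers one destructor from a fresh thread and returns the status code.
fn register_on_new_thread() -> c_int {
    std::thread::spawn(|| {
        let handle = this_dso_handle!();
        unsafe { register_dtor(std::ptr::null_mut(), count_dtor, handle) }
    })
    .join()
    .unwrap()
}

fn main() {
    let before = DTORS_RUN.load(Ordering::SeqCst);
    assert_eq!(register_on_new_thread(), 0);
    // The destructor runs as the thread exits, before join returns.
    assert_eq!(DTORS_RUN.load(Ordering::SeqCst), before + 1);
}
```

In a single statically linked binary the macro and the callee resolve to the same handle, so this only shows the plumbing; the difference would matter once the macro is expanded in a crate that lives in a different DSO from libstd.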
Memory unsafe scenario
Concretely, I think this can lead to a concerning memory unsafety problem in the following scenario:
1. libstd is dynamically linked into a program.
2. Some Rust library (which also dynamically links libstd) is loaded via dlopen. Let's call this libmycrate.so for concreteness.
3. libmycrate.so contains a thread_local! (mycrate::THE_THREAD_LOCAL) that needs its dtor to be registered.
4. A thread T0 is spawned, and T0 calls some function in libmycrate.so.
5. This function references mycrate::THE_THREAD_LOCAL, which causes the destructor to be registered via __cxa_thread_atexit_impl (inside std::sys::unix::register_dtor).
   - Note: In this hypothetical, no other threads have registered dtors for mycrate::THE_THREAD_LOCAL.
6. The library libmycrate.so is unloaded via dlclose. This is prior to T0 ending, and it is not the last Rust crate to be unloaded.
7. Later, T0 is joined, which runs the thread-specific destructors. This includes mycrate::THE_THREAD_LOCAL's dtor, despite the fact that the library has been unloaded.
   - That is, the memory for the dtor function (and internal functions it calls) may be unmapped/freed/in use as something else.
   - Any static data accessed and the like may similarly be no longer alive.
Note: between 6 and 7, some time may have to pass; dlclose is often performed in the background. Also, I'm assuming in this situation that libmycrate.so hasn't done anything else to prevent being unloaded. Finally, often the memory from the library is pushed onto a free list for later use, rather than actually being unmapped.
Anyway, this is concerning because:
- The situation I described is convoluted but not unthinkably so — the most unrealistic situation here is libstd being dynamically linked, and someone using dlopen to work with a Rust crate in this manner.
- It's a use-after-free where we call a function on freed memory, which could lead to problems (for example, if that memory gets returned to the allocator, and then used to allocate something that a potentially-hostile attacker has some control over).
On the other hand, this isn't reachable from purely safe stdlib APIs: someone had to unsafely call dlclose (perhaps by Dropping a libloading::Library), so it's on them.
While I don't find this style of argument compelling, it unfortunately has to be the answer to some extent. We can't fix this everywhere, as only some platforms allow defending against this by accepting an equivalent to &__dso_handle.
That said, this is clearly an example of us passing the wrong value, and I suspect there aren't really great arguments against fixing it. I think this is actually quite a bit of a footgun on platforms where it can't be addressed, but probably the solution is to somehow let people know that dlclose (and equivalents) are extremely spooky.