mbox series

[for-3.1,0/2] Fix qemu_thread_atexit* for OSX

Message ID 20181105135538.28025-1-peter.maydell@linaro.org (mailing list archive)
Headers show
Series Fix qemu_thread_atexit* for OSX | expand

Message

Peter Maydell Nov. 5, 2018, 1:55 p.m. UTC
Our current implementation of qemu_thread_atexit* is broken on OSX.
This is because it works by cerating a piece of thread-specific
data with pthread_key_create() and using the destructor function
for that data to run the notifier function passed to it by
the caller of qemu_thread_atexit_add(). The expected use case
is that the caller uses a __thread variable as the notifier,
and uses the callback to clean up information that it is
keeping per-thread in __thread variables.

Unfortunately, on OSX this does not work, because on OSX
a __thread variable may be destroyed (freed) before the
pthread_key_create() destructor runs. (POSIX imposes no
ordering constraint here; the OSX implementation happens
to implement __thread variables in terms of pthread_key_create((),
whereas Linux uses different mechanisms that mean the __thread
variables will still be present when the pthread_key_create()
destructor is run.)

Fix this by switching to a scheme similar to the one qemu-thread-win32
uses for qemu_thread_atexit: keep the thread's notifiers on a
__thread variable, and run the notifiers on calls to
qemu_thread_exit() and on return from the start routine passed
to qemu_thread_start(). We do this with the pthread_cleanup_push()
API. (This approach was suggested by Paolo.)

There is a slightly awkward corner here for the main thread,
which isn't started via qemu_thread_start() and so can exit
without passing through the cleanup handler. qemu-thread-win32.c
tries to handle the main thread using an atexit() handler to
run its notifiers. I don't think this will work because the atexit
handler may not run on the main thread, in which case notify
callback functions which refer to __thread variables will get
the wrong ones. So instead I have documented that the API
leaves it unspecified whether notifiers are run for thread
exits that happen when the entire process is exiting. This
is OK for our current users (who only use the callbacks to
clean up memory or win32 Fibers which are deleted on process
exit anyway), and means we don't need to worry about running
callbacks on main thread exit (which always causes the whole
process to exit).

In particular, this fixes the ASAN errors and lockups we
currently see on OSX hosts when running test-bdrv-drain.

thanks
-- PMM

Peter Maydell (2):
  include/qemu/thread.h: Document qemu_thread_atexit* API
  util/qemu-thread-posix: Fix qemu_thread_atexit* for OSX

 include/qemu/thread.h    | 22 ++++++++++++++++++++
 util/qemu-thread-posix.c | 44 ++++++++++++++++++----------------------
 2 files changed, 42 insertions(+), 24 deletions(-)

Comments

Paolo Bonzini Nov. 6, 2018, 9:53 a.m. UTC | #1
On 05/11/2018 14:55, Peter Maydell wrote:
> Our current implementation of qemu_thread_atexit* is broken on OSX.
> This is because it works by cerating a piece of thread-specific
> data with pthread_key_create() and using the destructor function
> for that data to run the notifier function passed to it by
> the caller of qemu_thread_atexit_add(). The expected use case
> is that the caller uses a __thread variable as the notifier,
> and uses the callback to clean up information that it is
> keeping per-thread in __thread variables.
> 
> Unfortunately, on OSX this does not work, because on OSX
> a __thread variable may be destroyed (freed) before the
> pthread_key_create() destructor runs. (POSIX imposes no
> ordering constraint here; the OSX implementation happens
> to implement __thread variables in terms of pthread_key_create((),
> whereas Linux uses different mechanisms that mean the __thread
> variables will still be present when the pthread_key_create()
> destructor is run.)
> 
> Fix this by switching to a scheme similar to the one qemu-thread-win32
> uses for qemu_thread_atexit: keep the thread's notifiers on a
> __thread variable, and run the notifiers on calls to
> qemu_thread_exit() and on return from the start routine passed
> to qemu_thread_start(). We do this with the pthread_cleanup_push()
> API. (This approach was suggested by Paolo.)
> 
> There is a slightly awkward corner here for the main thread,
> which isn't started via qemu_thread_start() and so can exit
> without passing through the cleanup handler. qemu-thread-win32.c
> tries to handle the main thread using an atexit() handler to
> run its notifiers. I don't think this will work because the atexit
> handler may not run on the main thread, in which case notify
> callback functions which refer to __thread variables will get
> the wrong ones. So instead I have documented that the API
> leaves it unspecified whether notifiers are run for thread
> exits that happen when the entire process is exiting. This
> is OK for our current users (who only use the callbacks to
> clean up memory or win32 Fibers which are deleted on process
> exit anyway), and means we don't need to worry about running
> callbacks on main thread exit (which always causes the whole
> process to exit).
> 
> In particular, this fixes the ASAN errors and lockups we
> currently see on OSX hosts when running test-bdrv-drain.

Queued, thanks.  As an extra cleanup we can remove
qemu_thread_atexit_remove (it is unused, and a simple notifier_remove
now works), and also remove the atexit from Win32 so we can document
that the notifier does _not_ run for the main thread.

Paolo