Message ID | 158981894794.109297.3530035833368944254.stgit@bahia.lan (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] 9p: Lock directory streams with a CoMutex | expand |
On Mon, 18 May 2020 18:22:28 +0200 Greg Kurz <groug@kaod.org> wrote: > Locking was introduced in QEMU 2.7 to address the deprecation of > readdir_r(3) in glibc 2.24. It turns out that the frontend code is > the worst place to handle a critical section with a pthread mutex: > the code runs in a coroutine on behalf of the QEMU mainloop and then > yields control, waiting for the fsdev backend to process the request > in a worker thread. If the client resends another readdir request for > the same fid before the previous one finally unlocked the mutex, we're > deadlocked. > > This never bit us because the linux client serializes readdir requests > for the same fid, but it is quite easy to demonstrate with a custom > client. > > A good solution could be to narrow the critical section in the worker > thread code and to return a copy of the dirent to the frontend, but > this causes quite some changes in both 9p.c and codir.c. So, instead > of that, in order for people to easily backport the fix to older QEMU > versions, let's simply use a CoMutex since all the users for this > sit in coroutines. > > Fixes: 7cde47d4a89d ("9p: add locking to V9fsDir") > Signed-off-by: Greg Kurz <groug@kaod.org> > --- Just to clarify, this is v2 of: 9pfs: Fix potential deadlock of QEMU mainloop https://patchwork.ozlabs.org/project/qemu-devel/patch/158826201391.1344781.9403916162733181811.stgit@bahia.lan/ > hw/9pfs/9p.h | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h > index dd1c6cb8d2f4..3ab580764cf8 100644 > --- a/hw/9pfs/9p.h > +++ b/hw/9pfs/9p.h > @@ -197,22 +197,22 @@ typedef struct V9fsXattr > > typedef struct V9fsDir { > DIR *stream; > - QemuMutex readdir_mutex; > + CoMutex readdir_mutex; > } V9fsDir; > > static inline void v9fs_readdir_lock(V9fsDir *dir) > { > - qemu_mutex_lock(&dir->readdir_mutex); > + qemu_co_mutex_lock(&dir->readdir_mutex); > } > > static inline void v9fs_readdir_unlock(V9fsDir *dir) > { > - qemu_mutex_unlock(&dir->readdir_mutex); > + qemu_co_mutex_unlock(&dir->readdir_mutex); > } > > static inline void v9fs_readdir_init(V9fsDir *dir) > { > - qemu_mutex_init(&dir->readdir_mutex); > + qemu_co_mutex_init(&dir->readdir_mutex); > } > > /* > >
On Montag, 18. Mai 2020 18:35:04 CEST Greg Kurz wrote: > On Mon, 18 May 2020 18:22:28 +0200 > > Greg Kurz <groug@kaod.org> wrote: > > Locking was introduced in QEMU 2.7 to address the deprecation of > > readdir_r(3) in glibc 2.24. It turns out that the frontend code is > > the worst place to handle a critical section with a pthread mutex: > > the code runs in a coroutine on behalf of the QEMU mainloop and then > > yields control, waiting for the fsdev backend to process the request > > in a worker thread. If the client resends another readdir request for > > the same fid before the previous one finally unlocked the mutex, we're > > deadlocked. > > > > This never bit us because the linux client serializes readdir requests > > for the same fid, but it is quite easy to demonstrate with a custom > > client. > > > > A good solution could be to narrow the critical section in the worker > > thread code and to return a copy of the dirent to the frontend, but > > this causes quite some changes in both 9p.c and codir.c. So, instead > > of that, in order for people to easily backport the fix to older QEMU > > versions, let's simply use a CoMutex since all the users for this > > sit in coroutines. > > > > Fixes: 7cde47d4a89d ("9p: add locking to V9fsDir") > > Signed-off-by: Greg Kurz <groug@kaod.org> > > --- > > Just to clarify, this is v2 of: > > 9pfs: Fix potential deadlock of QEMU mainloop > > https://patchwork.ozlabs.org/project/qemu-devel/patch/158826201391.1344781.9 > 403916162733181811.stgit@bahia.lan/ Good move! Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h index dd1c6cb8d2f4..3ab580764cf8 100644 --- a/hw/9pfs/9p.h +++ b/hw/9pfs/9p.h @@ -197,22 +197,22 @@ typedef struct V9fsXattr typedef struct V9fsDir { DIR *stream; - QemuMutex readdir_mutex; + CoMutex readdir_mutex; } V9fsDir; static inline void v9fs_readdir_lock(V9fsDir *dir) { - qemu_mutex_lock(&dir->readdir_mutex); + qemu_co_mutex_lock(&dir->readdir_mutex); } static inline void v9fs_readdir_unlock(V9fsDir *dir) { - qemu_mutex_unlock(&dir->readdir_mutex); + qemu_co_mutex_unlock(&dir->readdir_mutex); } static inline void v9fs_readdir_init(V9fsDir *dir) { - qemu_mutex_init(&dir->readdir_mutex); + qemu_co_mutex_init(&dir->readdir_mutex); } /*
Locking was introduced in QEMU 2.7 to address the deprecation of readdir_r(3) in glibc 2.24. It turns out that the frontend code is the worst place to handle a critical section with a pthread mutex: the code runs in a coroutine on behalf of the QEMU mainloop and then yields control, waiting for the fsdev backend to process the request in a worker thread. If the client resends another readdir request for the same fid before the previous one finally unlocked the mutex, we're deadlocked. This never bit us because the linux client serializes readdir requests for the same fid, but it is quite easy to demonstrate with a custom client. A good solution could be to narrow the critical section in the worker thread code and to return a copy of the dirent to the frontend, but this causes quite some changes in both 9p.c and codir.c. So, instead of that, in order for people to easily backport the fix to older QEMU versions, let's simply use a CoMutex since all the users for this sit in coroutines. Fixes: 7cde47d4a89d ("9p: add locking to V9fsDir") Signed-off-by: Greg Kurz <groug@kaod.org> --- hw/9pfs/9p.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)