| Message ID | 20200402024431.1629-1-fangying1@huawei.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | [v2] util/async: Add memory barrier to aio_ctx_prepare |
On 02/04/20 04:44, Ying Fang wrote:
> Normal VM runtime is not affected by this hang since there is always some
> timer timeout or subsequent io worker that comes and notifies the main thread.
> To fix this problem, a memory barrier is added to aio_ctx_prepare and
> it is proved to have the hang fixed in our test.

Hi Ying Fang,

this part of the patch is correct, but I am not sure if a memory barrier
is needed in aio_poll too.

In addition, the memory barrier is quite slow on x86 and not needed there.

I am sorry for dropping the ball on this bug; I had a patch using
relaxed atomics (atomic_set/atomic_read) but I never submitted it
because I had placed it in a larger series.  Let me post it now.

Thanks,

Paolo

> diff --git a/util/async.c b/util/async.c
> index b94518b..89a4f3e 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -250,7 +250,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
>      AioContext *ctx = (AioContext *) source;
>
>      atomic_or(&ctx->notify_me, 1);
> -
> +    /* Make sure notify_me is set before aio_compute_timeout */
> +    smp_mb();
>      /* We assume there is no timeout already supplied */
>      *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
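For reference, a sketch of what such a relaxed-atomics variant might look like, using QEMU's atomic_set()/atomic_read() wrappers. This is a guess at the approach described above, not the patch Paolo actually posted, and it assumes ctx->notify_me is only ever written by the thread running the event loop, so the read-modify-write below does not itself need to be atomic:

    static gboolean
    aio_ctx_prepare(GSource *source, gint *timeout)
    {
        AioContext *ctx = (AioContext *) source;

        /* Relaxed RMW: safe only if no other thread writes notify_me. */
        atomic_set(&ctx->notify_me, atomic_read(&ctx->notify_me) | 1);

        /*
         * The explicit barrier is still needed: it orders the notify_me
         * store before the loads inside aio_compute_timeout() and pairs
         * with the smp_mb() in aio_notify().
         */
        smp_mb();

        /* We assume there is no timeout already supplied */
        *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));

        /* remainder of the function unchanged */
    }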
On Thu, Apr 02, 2020 at 10:44:31AM +0800, Ying Fang wrote:
> The QEMU main thread is found to hang up in the mainloop when doing
> image format conversion on the aarch64 platform, and it is highly
> reproducible by executing the test:
>
>     qemu-img convert -f qcow2 -O qcow2 origin.qcow2 converted.qcow2
>
> This mysterious hang can be explained by a race condition between
> the main thread and an io worker thread.  There can be a chance that
> the last worker thread has called aio_bh_schedule_oneshot and it is
> checking against notify_me to deliver a notify event.  At the same
> time, the main thread is calling aio_ctx_prepare; however, it first
> calls qemu_timeout_ns_to_ms, thus the worker thread did not see
> notify_me as true and did not send a notify event.  The timeline
> can be shown in the following way:
>
>  Main Thread
>  ------------------------------------------------
>  aio_ctx_prepare
>      atomic_or(&ctx->notify_me, 1);
>      /* out of order execution goes here */
>      *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
>
>  Worker Thread
>  -----------------------------------------------
>  aio_bh_schedule_oneshot -> aio_bh_enqueue
>      aio_notify
>          smp_mb();
>          if (ctx->notify_me) {  /* worker thread checks notify_me here */
>              event_notifier_set(&ctx->notifier);
>              atomic_mb_set(&ctx->notified, true);
>          }

Paolo, I'm not sure how to interpret this case according to
docs/devel/atomics.txt.  Maybe you can clarify.

atomic_or() is sequentially consistent and I therefore expected it to
act as a barrier.  Or does sequential consistency only cover the memory
accessed via the sequentially consistent atomics APIs, so that everything
else (like aio_compute_timeout()) can be reordered?

> Normal VM runtime is not affected by this hang since there is always some
> timer timeout or subsequent io worker that comes and notifies the main thread.
> To fix this problem, a memory barrier is added to aio_ctx_prepare and
> it is proved to have the hang fixed in our test.
>
> This hang is not observed on the x86 platform; however, it can be easily
> reproduced on the aarch64 platform, thus it is architecture related.
> Not sure if this is relevant to commit eabc977973103527bbb8fed69c91cfaa6691f8ab.
>
> Signed-off-by: Ying Fang <fangying1@huawei.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Reported-by: Euler Robot <euler.robot@huawei.com>
>
> ---
> v2: add comments before the barrier
>
> ---
>  util/async.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/util/async.c b/util/async.c
> index b94518b..89a4f3e 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -250,7 +250,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
>      AioContext *ctx = (AioContext *) source;
>
>      atomic_or(&ctx->notify_me, 1);
> -
> +    /* Make sure notify_me is set before aio_compute_timeout */
> +    smp_mb();
>      /* We assume there is no timeout already supplied */
>      *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
>
> --
> 1.8.3.1
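The interleaving in the commit message is the classic store-buffering pattern. As an illustration only (not QEMU code; notify_me and bh_scheduled are stand-ins for the fields discussed above), a minimal C11 sketch of the two sides and the fences that close the race:

    /* sb_race.c -- store-buffering sketch of the race described above.
     * Build with: cc -O2 -pthread sb_race.c */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int notify_me;    /* stand-in for ctx->notify_me */
    static atomic_int bh_scheduled; /* stand-in for "a BH is pending" */

    static void *main_loop(void *arg)
    {
        /* aio_ctx_prepare: announce that we are about to sleep... */
        atomic_fetch_or_explicit(&notify_me, 1, memory_order_relaxed);
        /* Without this fence the load of bh_scheduled below can be
         * reordered before the store to notify_me on aarch64. */
        atomic_thread_fence(memory_order_seq_cst);
        /* ...then look for pending work (aio_compute_timeout). */
        if (!atomic_load_explicit(&bh_scheduled, memory_order_relaxed)) {
            printf("main loop: no work seen, would block in poll()\n");
        }
        return NULL;
    }

    static void *worker(void *arg)
    {
        /* aio_bh_enqueue: publish the work... */
        atomic_store_explicit(&bh_scheduled, 1, memory_order_relaxed);
        /* ...the smp_mb() already present in aio_notify()... */
        atomic_thread_fence(memory_order_seq_cst);
        /* ...then wake the main loop only if it said it may sleep. */
        if (!atomic_load_explicit(&notify_me, memory_order_relaxed)) {
            printf("worker: notify_me clear, skipping the wakeup\n");
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, main_loop, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        /* With both fences removed, both threads can take their quiet
         * branch in the same run: the lost wakeup behind the hang. */
        return 0;
    }

With both fences in place, at least one of the two loads is guaranteed to observe the other thread's store, so the wakeup cannot be lost.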
On 2020/4/2 16:47, Paolo Bonzini wrote:
> On 02/04/20 04:44, Ying Fang wrote:
>> Normal VM runtime is not affected by this hang since there is always some
>> timer timeout or subsequent io worker that comes and notifies the main thread.
>> To fix this problem, a memory barrier is added to aio_ctx_prepare and
>> it is proved to have the hang fixed in our test.
>
> Hi Ying Fang,
>
> this part of the patch is correct, but I am not sure if a memory barrier
> is needed in aio_poll too.
>
> In addition, the memory barrier is quite slow on x86 and not needed there.

Yes, the memory barrier has an impact on performance here.

> I am sorry for dropping the ball on this bug; I had a patch using
> relaxed atomics (atomic_set/atomic_read) but I never submitted it
> because I had placed it in a larger series.  Let me post it now.

Thanks for your work; we'd like to run some tests once you post it.

Thanks,
Ying

>
> Thanks,
>
> Paolo
>
>> diff --git a/util/async.c b/util/async.c
>> index b94518b..89a4f3e 100644
>> --- a/util/async.c
>> +++ b/util/async.c
>> @@ -250,7 +250,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
>>      AioContext *ctx = (AioContext *) source;
>>
>>      atomic_or(&ctx->notify_me, 1);
>> -
>> +    /* Make sure notify_me is set before aio_compute_timeout */
>> +    smp_mb();
>>      /* We assume there is no timeout already supplied */
>>      *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
On 02/04/20 11:32, Stefan Hajnoczi wrote:
> Paolo, I'm not sure how to interpret this case according to
> docs/devel/atomics.txt.  Maybe you can clarify.
>
> atomic_or() is sequentially consistent and I therefore expected it to
> act as a barrier.  Or does sequential consistency only cover the memory
> accessed via the sequentially consistent atomics APIs and everything
> else (like aio_compute_timeout()) can be reordered?

Yes, that's what I expected too when I wrote that code. :(  But Torvald
Riegel explained a while ago that seq-cst accesses are actually weaker
than e.g. the Linux kernel atomics [1].

The difference basically only matters if you are writing the relatively
common synchronization pattern

    write A                   write B
    smp_mb()                  smp_mb()
    read B                    read A
    if not B then sleep       if A then wake up the other side
    do something

because you must either use memory barriers as above, or use seq-cst
writes *and* reads.  You cannot rely on having a memory barrier implicit
in the writes.

Paolo

[1] https://lists.gnu.org/archive/html/qemu-arm/2019-10/msg00051.html
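Spelled out in C11 terms (a sketch, with A and B standing for the two flags in the diagram above), the distinction looks like this:

    #include <stdatomic.h>

    atomic_int A, B;

    void option_fences(void)
    {
        /* Explicit full barrier between the write and the read: what
         * the smp_mb() in this patch provides. */
        atomic_store_explicit(&A, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
        if (!atomic_load_explicit(&B, memory_order_relaxed)) {
            /* sleep */
        }
    }

    void option_seq_cst_both(void)
    {
        /* Seq-cst on *both* the write and the read also forbids the
         * reordering. */
        atomic_store(&A, 1);
        if (!atomic_load(&B)) {
            /* sleep */
        }
    }

    void broken_middle_ground(void)
    {
        /* A seq-cst write followed by a weaker read does NOT: on
         * aarch64 this compiles to stlr + ldr, and the plain load can
         * be satisfied before the store is visible to the other side.
         * This is the reordering behind the reported hang. */
        atomic_store(&A, 1);
        if (!atomic_load_explicit(&B, memory_order_relaxed)) {
            /* may sleep even though the other side already set B */
        }
    }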
On Thu, Apr 02, 2020 at 11:59:06AM +0200, Paolo Bonzini wrote:
> On 02/04/20 11:32, Stefan Hajnoczi wrote:
> > Paolo, I'm not sure how to interpret this case according to
> > docs/devel/atomics.txt.  Maybe you can clarify.
> >
> > atomic_or() is sequentially consistent and I therefore expected it to
> > act as a barrier.  Or does sequential consistency only cover the memory
> > accessed via the sequentially consistent atomics APIs and everything
> > else (like aio_compute_timeout()) can be reordered?
>
> Yes, that's what I expected too when I wrote that code. :(  But Torvald
> Riegel explained a while ago that seq-cst accesses are actually weaker
> than e.g. the Linux kernel atomics [1].
>
> The difference basically only matters if you are writing the relatively
> common synchronization pattern
>
>     write A                   write B
>     smp_mb()                  smp_mb()
>     read B                    read A
>     if not B then sleep       if A then wake up the other side
>     do something
>
> because you must either use memory barriers as above, or use seq-cst
> writes *and* reads.  You cannot rely on having a memory barrier implicit
> in the writes.
>
> Paolo
>
> [1] https://lists.gnu.org/archive/html/qemu-arm/2019-10/msg00051.html

Thanks.  I know you are probably very busy right now, but updating
atomics.txt would be great.

Stefan
diff --git a/util/async.c b/util/async.c
index b94518b..89a4f3e 100644
--- a/util/async.c
+++ b/util/async.c
@@ -250,7 +250,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
     AioContext *ctx = (AioContext *) source;

     atomic_or(&ctx->notify_me, 1);
-
+    /* Make sure notify_me is set before aio_compute_timeout */
+    smp_mb();
     /* We assume there is no timeout already supplied */
     *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
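The barrier added here pairs with the smp_mb() already present in aio_notify(), quoted earlier in the thread (abridged here for context):

    void aio_notify(AioContext *ctx)
    {
        /* Write, e.g., bh->scheduled before reading ctx->notify_me;
         * pairs with the smp_mb() added to aio_ctx_prepare() above. */
        smp_mb();
        if (ctx->notify_me) {
            event_notifier_set(&ctx->notifier);
            atomic_mb_set(&ctx->notified, true);
        }
    }

With a full barrier on both sides, at least one thread is guaranteed to see the other's write: either the main loop observes the pending work while computing its timeout, or the worker observes notify_me and raises the event notifier.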