diff mbox

[1/1] virtio: fallback from irqfd to non-irqfd notify

Message ID 20170301115004.96073-1-pasic@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Halil Pasic March 1, 2017, 11:50 a.m. UTC
The commits 03de2f527 "virtio-blk: do not use vring in dataplane"  and
9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active"
changed how notifications are done for virtio-blk substantially. Due to a
race condition interrupts are lost when irqfd is torn down after
notify_guest_bh was scheduled but before it actually runs.  Furthermore
virtio_notify_irqfd ignores the value returned by event_notifier_set
which correctly indicates that notification has failed due to bad file
descriptor.

Let's fix this by making virtio_notify_irqfd fall back to the non-irqfd
notification mechanism if event_notifier_set fails.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
---

This is probably not the only way to fix this: suggestions welcome. I
did not use a fixes tag because I'm not sure yet where exactly things got
broken. Maybe guys more familiar with dataplane an coroutines can help
(Paolo, Stefan).
---
 hw/virtio/virtio.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Michael S. Tsirkin March 1, 2017, 12:54 p.m. UTC | #1
On Wed, Mar 01, 2017 at 12:50:04PM +0100, Halil Pasic wrote:
> The commits 03de2f527 "virtio-blk: do not use vring in dataplane"  and
> 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active"
> changed how notifications are done for virtio-blk substantially. Due to a
> race condition interrupts are lost when irqfd is torn down after
> notify_guest_bh was scheduled but before it actually runs.  Furthermore
> virtio_notify_irqfd ignores the value returned by event_notifier_set
> which correctly indicates that notification has failed due to bad file
> descriptor.
> 
> Let's fix this by making virtio_notify_irqfd fall back to the non-irqfd
> notification mechanism if event_notifier_set fails.
> 
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> ---
> 
> This is probably not the only way to fix this: suggestions welcome. I
> did not use a fixes tag because I'm not sure yet where exactly things got
> broken. Maybe guys more familiar with dataplane an coroutines can help
> (Paolo, Stefan).
> ---
>  hw/virtio/virtio.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 23483c7..8e1c1e9 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1581,7 +1581,9 @@ void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
>       * to an atomic operation.
>       */
>      virtio_set_isr(vq->vdev, 0x1);
> -    event_notifier_set(&vq->guest_notifier);
> +    if (event_notifier_set(&vq->guest_notifier)) {
> +        virtio_notify_vector(vdev, vq->vector);
> +    }

Does this fail because the underlying fd got closed?
Then there's a problem: trying to write to a closed
fd might corrupt an unrelated fd.
If you want to use this way we need to set fds to -1 when we close.

>  }
>  
>  static void virtio_irq(VirtQueue *vq)
> -- 
> 2.8.4
Paolo Bonzini March 1, 2017, 12:57 p.m. UTC | #2
On 01/03/2017 12:50, Halil Pasic wrote:
> The commits 03de2f527 "virtio-blk: do not use vring in dataplane"  and
> 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active"
> changed how notifications are done for virtio-blk substantially. Due to a
> race condition interrupts are lost when irqfd is torn down after
> notify_guest_bh was scheduled but before it actually runs.

I don't think the non-irqfd notification mechanism is thread safe, and
that would be a problem for this patch.

What is the path that causes the irqfd to be torn down?  Only something
like a reset should cause it (the only call in virtio-blk is from
virtio_blk_data_plane_stop), and then the guest doesn't care anymore
about interrupts.

That path also does a qemu_bh_delete, so the notify_guest_bh should not
be invoked at all.

Paolo

>  Furthermore
> virtio_notify_irqfd ignores the value returned by event_notifier_set
> which correctly indicates that notification has failed due to bad file
> descriptor.
> 
> Let's fix this by making virtio_notify_irqfd fall back to the non-irqfd
> notification mechanism if event_notifier_set fails.
> 
> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> ---
> 
> This is probably not the only way to fix this: suggestions welcome. I
> did not use a fixes tag because I'm not sure yet where exactly things got
> broken. Maybe guys more familiar with dataplane an coroutines can help
> (Paolo, Stefan).
> ---
>  hw/virtio/virtio.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 23483c7..8e1c1e9 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1581,7 +1581,9 @@ void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
>       * to an atomic operation.
>       */
>      virtio_set_isr(vq->vdev, 0x1);
> -    event_notifier_set(&vq->guest_notifier);
> +    if (event_notifier_set(&vq->guest_notifier)) {
> +        virtio_notify_vector(vdev, vq->vector);
> +    }
>  }
>  
>  static void virtio_irq(VirtQueue *vq)
>
Halil Pasic March 1, 2017, 1:22 p.m. UTC | #3
On 03/01/2017 01:57 PM, Paolo Bonzini wrote:
> 
> 
> On 01/03/2017 12:50, Halil Pasic wrote:
>> The commits 03de2f527 "virtio-blk: do not use vring in dataplane"  and
>> 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active"
>> changed how notifications are done for virtio-blk substantially. Due to a
>> race condition interrupts are lost when irqfd is torn down after
>> notify_guest_bh was scheduled but before it actually runs.
> 
> I don't think the non-irqfd notification mechanism is thread safe, and
> that would be a problem for this patch.

Sorry, haven't looked into that thoroughly (and speculated people with
more understanding will jump in).

> 
> What is the path that causes the irqfd to be torn down?  Only something

Here a trace:

135871@1488304024.512533:virtio_blk_req_complete req 0x2aa6b117e10 status 0
135871@1488304024.512541:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
135871@1488304024.522607:virtio_blk_req_complete req 0x2aa6b118980 status 0
135871@1488304024.522616:virtio_blk_req_complete req 0x2aa6b119260 status 0
135871@1488304024.522627:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
135871@1488304024.527386:virtio_blk_req_complete req 0x2aa6b118980 status 0
135871@1488304024.527431:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
135871@1488304024.528611:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de880
135871@1488304024.528628:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de8f8
135871@1488304024.528753:virtio_blk_data_plane_stop dataplane 0x2aa6b0e5540
                         ^== DATAPLANE STOP  
135871@1488304024.530709:virtio_blk_req_complete req 0x2aa6b117e10 status 0
135871@1488304024.530752:virtio_guest_notifier_read vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
                         ^== comes from k->set_guest_notifiers(qbus->parent, nvqs, false);
                             in virtio_blk_data_plane_stop and done immediately after
                             irqfd is cleaned up by the transport
135871@1488304024.530836:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
halil: error in event_notifier_set: Bad file descriptor
                         ^== here we have the problem

If you want a stacktrace that can be arranged to.

> like a reset should cause it (the only call in virtio-blk is from
> virtio_blk_data_plane_stop), and then the guest doesn't care anymore
> about interrupts.

I do not understand this with 'doesn't care anymore about interrupts'.
I was debugging a virtio-blk device being stuck waiting for a host
notification (interrupt) after migration.

> 
> That path also does a qemu_bh_delete, so the notify_guest_bh should not
> be invoked at all.
> 

That's only for destroy. I'm migrating.

Seems I tried to fix this is the wrong way. Was not too confident about it
in the first place. Suggestions welcome!

Cheers,
Halil
Halil Pasic March 1, 2017, 1:31 p.m. UTC | #4
On 03/01/2017 01:54 PM, Michael S. Tsirkin wrote:
> On Wed, Mar 01, 2017 at 12:50:04PM +0100, Halil Pasic wrote:
>> The commits 03de2f527 "virtio-blk: do not use vring in dataplane"  and
>> 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active"
>> changed how notifications are done for virtio-blk substantially. Due to a
>> race condition interrupts are lost when irqfd is torn down after
>> notify_guest_bh was scheduled but before it actually runs.  Furthermore
>> virtio_notify_irqfd ignores the value returned by event_notifier_set
>> which correctly indicates that notification has failed due to bad file
>> descriptor.
>>
>> Let's fix this by making virtio_notify_irqfd fall back to the non-irqfd
>> notification mechanism if event_notifier_set fails.
>>
>> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>> ---
>>
>> This is probably not the only way to fix this: suggestions welcome. I
>> did not use a fixes tag because I'm not sure yet where exactly things got
>> broken. Maybe guys more familiar with dataplane an coroutines can help
>> (Paolo, Stefan).
>> ---
>>  hw/virtio/virtio.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 23483c7..8e1c1e9 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -1581,7 +1581,9 @@ void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
>>       * to an atomic operation.
>>       */
>>      virtio_set_isr(vq->vdev, 0x1);
>> -    event_notifier_set(&vq->guest_notifier);
>> +    if (event_notifier_set(&vq->guest_notifier)) {
>> +        virtio_notify_vector(vdev, vq->vector);
>> +    }
> 
> Does this fail because the underlying fd got closed?

Yes. Its data_plane_stop()->virtio_ccw_set_guest_notifier->event_notifier_cleanup().
The function event_notifier_cleanup() does not set fds to -1 :(.
Seems to me, it would be safer to do so. Should I make a patch?

> Then there's a problem: trying to write to a closed
> fd might corrupt an unrelated fd.
> If you want to use this way we need to set fds to -1 when we close.

Sorry, did not check for that. OTOH Paolo says my approach is
fundamentally flawed because virtio_notify_vector is not thread
safe. I suggest we continue the discussion there.

Regards,
Halil

> 
>>  }
>>  
>>  static void virtio_irq(VirtQueue *vq)
>> -- 
>> 2.8.4
>
Paolo Bonzini March 1, 2017, 2:29 p.m. UTC | #5
On 01/03/2017 14:22, Halil Pasic wrote:
> Here a trace:
> 
> 135871@1488304024.512533:virtio_blk_req_complete req 0x2aa6b117e10 status 0
> 135871@1488304024.512541:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> 135871@1488304024.522607:virtio_blk_req_complete req 0x2aa6b118980 status 0
> 135871@1488304024.522616:virtio_blk_req_complete req 0x2aa6b119260 status 0
> 135871@1488304024.522627:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> 135871@1488304024.527386:virtio_blk_req_complete req 0x2aa6b118980 status 0
> 135871@1488304024.527431:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> 135871@1488304024.528611:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de880
> 135871@1488304024.528628:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de8f8
> 135871@1488304024.528753:virtio_blk_data_plane_stop dataplane 0x2aa6b0e5540
>                          ^== DATAPLANE STOP  
> 135871@1488304024.530709:virtio_blk_req_complete req 0x2aa6b117e10 status 0
> 135871@1488304024.530752:virtio_guest_notifier_read vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>                          ^== comes from k->set_guest_notifiers(qbus->parent, nvqs, false);
>                              in virtio_blk_data_plane_stop and done immediately after
>                              irqfd is cleaned up by the transport
> 135871@1488304024.530836:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> halil: error in event_notifier_set: Bad file descriptor
>                          ^== here we have the problem
> 
> If you want a stacktrace that can be arranged to.
> 
>> like a reset should cause it (the only call in virtio-blk is from
>> virtio_blk_data_plane_stop), and then the guest doesn't care anymore
>> about interrupts.
> I do not understand this with 'doesn't care anymore about interrupts'.
> I was debugging a virtio-blk device being stuck waiting for a host
> notification (interrupt) after migration.

Ok, this explains it better then.  The issue is that
virtio_blk_data_plane_stop doesn't flush the bottom half, which you want
to do when the caller is, for example, virtio_ccw_vmstate_change.

Does it work if you call to qemu_bh_cancel(s->bh) and notify_guest_bh(s)
after

    blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());

?

Paolo

>> That path also does a qemu_bh_delete, so the notify_guest_bh should not
>> be invoked at all.
>>
> That's only for destroy. I'm migrating.
> 
> Seems I tried to fix this is the wrong way. Was not too confident about it
> in the first place. Suggestions welcome!
diff mbox

Patch

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 23483c7..8e1c1e9 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1581,7 +1581,9 @@  void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
      * to an atomic operation.
      */
     virtio_set_isr(vq->vdev, 0x1);
-    event_notifier_set(&vq->guest_notifier);
+    if (event_notifier_set(&vq->guest_notifier)) {
+        virtio_notify_vector(vdev, vq->vector);
+    }
 }
 
 static void virtio_irq(VirtQueue *vq)