Message ID: 20170302185942.76255-1-pasic@linux.vnet.ibm.com (mailing list archive)
State: New, archived
On Thu, 2 Mar 2017 19:59:42 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> The function virtio_notify_irqfd used to ignore the return code of
> event_notifier_set. Let's fail the device should this occur.

I'm wondering if there are reasons for event_notifier_set() to fail
beyond "we've hit an internal race and should make an effort to fix
that one, or else we have completely messed up in qemu". Marking the
device broken tells the guest that there's something wrong with the
device, but I think we want qemu bug reports when there's something
broken with the irqfd.

> Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
>
> ---
>
> This patch is most likely flawed because virtio_notify_irqfd
> is probably supposed to be thread-safe, and neither strerror
> nor virtio_error is that. Anyway, let's get the discussion
> started with this.
>
> There was a suggestion by Michael to make event_notifier_set void
> and assert inside. After some thinking, I settled on the opinion
> that neither doing nothing nor crashing the guest is a good idea
> should this failure happen in production. So I came up with this.
>
> Looking forward to your feedback.
> ---
>  hw/virtio/virtio.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 23483c7..e05f3e5 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1555,6 +1555,8 @@ static bool virtio_should_notify(VirtIODevice *vdev, VirtQueue *vq)
>  void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      bool should_notify;
> +    int rc;
> +
>      rcu_read_lock();
>      should_notify = virtio_should_notify(vdev, vq);
>      rcu_read_unlock();
> @@ -1581,7 +1583,11 @@ void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
>       * to an atomic operation.
>       */
>      virtio_set_isr(vq->vdev, 0x1);
> -    event_notifier_set(&vq->guest_notifier);
> +    rc = event_notifier_set(&vq->guest_notifier);
> +    if (unlikely(rc)) {
> +        virtio_error(vdev, "guest notifier broken for vdev at %p (%s)", vdev,
> +                     strerror(-rc));
> +    }
>  }
>
>  static void virtio_irq(VirtQueue *vq)
On 03/03/2017 01:21 PM, Cornelia Huck wrote:
> On Thu, 2 Mar 2017 19:59:42 +0100
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>
>> The function virtio_notify_irqfd used to ignore the return code of
>> event_notifier_set. Let's fail the device should this occur.
>
> I'm wondering if there are reasons for event_notifier_set() to fail
> beyond "we've hit an internal race and should make an effort to fix
> that one, or else we have completely messed up in qemu". Marking the
> device broken tells the guest that there's something wrong with the
> device, but I think we want qemu bug reports when there's something
> broken with the irqfd.
>

That's why the error is logged. I understand virtio_error as something
suitable for indicating bugs.

What do you suggest? Forcing a dump? I would rather leave it to the
user to figure out how important the state sitting in the machine
and the device is, and how much effort (s)he wants to put into
recovering from the failure.

Halil
On Fri, 3 Mar 2017 13:43:32 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
> > On Thu, 2 Mar 2017 19:59:42 +0100
> > Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> >
> >> The function virtio_notify_irqfd used to ignore the return code of
> >> event_notifier_set. Let's fail the device should this occur.
> >
> > I'm wondering if there are reasons for event_notifier_set() to fail
> > beyond "we've hit an internal race and should make an effort to fix
> > that one, or else we have completely messed up in qemu". Marking the
> > device broken tells the guest that there's something wrong with the
> > device, but I think we want qemu bug reports when there's something
> > broken with the irqfd.
> >
>
> That's why the error is logged. I understand virtio_error as something
> suitable for indicating bugs.
>
> What do you suggest? Forcing a dump? I would rather leave it to the
> user to figure out how important the state sitting in the machine
> and the device is, and how much effort (s)he wants to put into
> recovering from the failure.

How likely are those logged messages to be brought to the attention of
the admin? Does any management software flag machines with such error
messages? (That's more of a general question.)

I'd like to have some kind of trigger that rings an alarm bell so that
the admin might consider reporting this, but I don't have a good idea
on how to do that either...
On 03/03/2017 01:50 PM, Cornelia Huck wrote:
> On Fri, 3 Mar 2017 13:43:32 +0100
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>
>> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
>>> On Thu, 2 Mar 2017 19:59:42 +0100
>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>>
>>>> The function virtio_notify_irqfd used to ignore the return code of
>>>> event_notifier_set. Let's fail the device should this occur.
>>>
>>> I'm wondering if there are reasons for event_notifier_set() to fail
>>> beyond "we've hit an internal race and should make an effort to fix
>>> that one, or else we have completely messed up in qemu". Marking the
>>> device broken tells the guest that there's something wrong with the
>>> device, but I think we want qemu bug reports when there's something
>>> broken with the irqfd.
>>>
>>
>> That's why the error is logged. I understand virtio_error as something
>> suitable for indicating bugs.
>>
>> What do you suggest? Forcing a dump? I would rather leave it to the
>> user to figure out how important the state sitting in the machine
>> and the device is, and how much effort (s)he wants to put into
>> recovering from the failure.
>
> How likely are those logged messages to be brought to the attention of
> the admin? Does any management software flag machines with such error
> messages? (That's more of a general question.)
>

I admit, I did not investigate this thoroughly, also because the patch
is flawed regarding multi-threading anyway. After a quick investigation,
it seems the Linux guest won't auto-reset the device, so the guest
should end up with a non-working device. I think it's pretty likely
that the admin will check the logs if the device was important.

I fully agree that it's a very general question, and I do not feel
competent to answer it.

> I'd like to have some kind of trigger that rings an alarm bell so that
> the admin might consider reporting this, but I don't have a good idea
> on how to do that either...

There are tools for aggregating and processing logs, and for triggering
alarm bells too (for example ELK = Elasticsearch + Logstash + Kibana).
AFAIK logs are the most common way to deal with such stuff, but I'm far
from being an expert. Of course, logs are only as good as the messages
landing in them...

Halil
On Fri, 3 Mar 2017 14:08:37 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> On 03/03/2017 01:50 PM, Cornelia Huck wrote:
> > On Fri, 3 Mar 2017 13:43:32 +0100
> > Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> >
> >> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
> >>> On Thu, 2 Mar 2017 19:59:42 +0100
> >>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> >>>
> >>>> The function virtio_notify_irqfd used to ignore the return code of
> >>>> event_notifier_set. Let's fail the device should this occur.
> >>>
> >>> I'm wondering if there are reasons for event_notifier_set() to fail
> >>> beyond "we've hit an internal race and should make an effort to fix
> >>> that one, or else we have completely messed up in qemu". Marking the
> >>> device broken tells the guest that there's something wrong with the
> >>> device, but I think we want qemu bug reports when there's something
> >>> broken with the irqfd.
> >>>
> >>
> >> That's why the error is logged. I understand virtio_error as something
> >> suitable for indicating bugs.
> >>
> >> What do you suggest? Forcing a dump? I would rather leave it to the
> >> user to figure out how important the state sitting in the machine
> >> and the device is, and how much effort (s)he wants to put into
> >> recovering from the failure.
> >
> > How likely are those logged messages to be brought to the attention of
> > the admin? Does any management software flag machines with such error
> > messages? (That's more of a general question.)
> >
>
> I admit, I did not investigate this thoroughly, also because the patch
> is flawed regarding multi-threading anyway. After a quick investigation,
> it seems the Linux guest won't auto-reset the device, so the guest
> should end up with a non-working device. I think it's pretty likely
> that the admin will check the logs if the device was important.

Thinking a bit more about this, it seems setting the device broken is
not the right solution for exactly that reason. Setting the virtio
device broken is a way to signal to the guest 'you did something
broken; please reset the device and start anew' (and that's how current
callers use it). In our case, this is not the guest's fault.

Maybe go back to the assert 'solution'? But I'm not sure that's enough
if production builds disable asserts...

> I fully agree that it's a very general question, and I do not feel
> competent to answer it.
>
> > I'd like to have some kind of trigger that rings an alarm bell so that
> > the admin might consider reporting this, but I don't have a good idea
> > on how to do that either...
>
> There are tools for aggregating and processing logs, and for triggering
> alarm bells too (for example ELK = Elasticsearch + Logstash + Kibana).
> AFAIK logs are the most common way to deal with such stuff, but I'm far
> from being an expert. Of course, logs are only as good as the messages
> landing in them...

Let's hope this works properly, then.
On 03/06/2017 03:56 PM, Cornelia Huck wrote:
> On Fri, 3 Mar 2017 14:08:37 +0100
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>
>> On 03/03/2017 01:50 PM, Cornelia Huck wrote:
>>> On Fri, 3 Mar 2017 13:43:32 +0100
>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>>
>>>> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
>>>>> On Thu, 2 Mar 2017 19:59:42 +0100
>>>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
[...]
>> I admit, I did not investigate this thoroughly, also because the patch
>> is flawed regarding multi-threading anyway. After a quick investigation,
>> it seems the Linux guest won't auto-reset the device, so the guest
>> should end up with a non-working device. I think it's pretty likely
>> that the admin will check the logs if the device was important.
>
> Thinking a bit more about this, it seems setting the device broken is
> not the right solution for exactly that reason. Setting the virtio
> device broken is a way to signal to the guest 'you did something
> broken; please reset the device and start anew' (and that's how current
> callers use it). In our case, this is not the guest's fault.

Do we have something to just say stuff is broken without blaming the
guest? And a device reset might not be that stupid at all in the given
situation, if we want to save what can be saved from the perspective of
the guest. (After a reset, stuff should work again until we hit the race
again -- and since turning ioeventfd on/off should not happen that often
during normal operation, it could help limit the damage suffered --
e.g. allow a controlled shutdown.)

> Maybe go back to the assert 'solution'? But I'm not sure that's enough
> if production builds disable asserts...
>

I will wait a bit, maybe other virtio folks are going to have an
opinion too.

My concern about the assert solution is that for production it is
either too rigorous (kill off, hopefully with a dump) or not enough
(as you have mentioned, with NDEBUG assert does nothing).

I think there are setups where the loss of a device does not have to be
fatal, and I would not like to be the one who makes it fatal (for the
guest).

Regards,
Halil
On Mon, 6 Mar 2017 16:21:13 +0100
Halil Pasic <pasic@linux.vnet.ibm.com> wrote:

> On 03/06/2017 03:56 PM, Cornelia Huck wrote:
> > On Fri, 3 Mar 2017 14:08:37 +0100
> > Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> >
> >> On 03/03/2017 01:50 PM, Cornelia Huck wrote:
> >>> On Fri, 3 Mar 2017 13:43:32 +0100
> >>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> >>>
> >>>> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
> >>>>> On Thu, 2 Mar 2017 19:59:42 +0100
> >>>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> [...]
> >> I admit, I did not investigate this thoroughly, also because the patch
> >> is flawed regarding multi-threading anyway. After a quick investigation,
> >> it seems the Linux guest won't auto-reset the device, so the guest
> >> should end up with a non-working device. I think it's pretty likely
> >> that the admin will check the logs if the device was important.
> >
> > Thinking a bit more about this, it seems setting the device broken is
> > not the right solution for exactly that reason. Setting the virtio
> > device broken is a way to signal to the guest 'you did something
> > broken; please reset the device and start anew' (and that's how current
> > callers use it). In our case, this is not the guest's fault.
>
> Do we have something to just say stuff is broken without blaming the
> guest? And a device reset might not be that stupid at all in the given
> situation, if we want to save what can be saved from the perspective of
> the guest. (After a reset, stuff should work again until we hit the race
> again -- and since turning ioeventfd on/off should not happen that often
> during normal operation, it could help limit the damage suffered --
> e.g. allow a controlled shutdown.)

Checking again, the spec says:

  DEVICE_NEEDS_RESET (64) Indicates that the device has experienced an
  error from which it can’t recover.

Nothing about 'guest error'.

The only problem is that legacy devices don't have that state, which
means they'll have a broken device through no fault of their own.

> > Maybe go back to the assert 'solution'? But I'm not sure that's enough
> > if production builds disable asserts...
> >
>
> I will wait a bit, maybe other virtio folks are going to have an
> opinion too.
>
> My concern about the assert solution is that for production it is
> either too rigorous (kill off, hopefully with a dump) or not enough
> (as you have mentioned, with NDEBUG assert does nothing).
>
> I think there are setups where the loss of a device does not have to be
> fatal, and I would not like to be the one who makes it fatal (for the
> guest).

Basically, it's a host bug (and not a bug specific to a certain
device). Moving the device which was impacted to a broken state may be
a useful mitigation.

But yes, let's hear some other opinions.
On Mon, Mar 06, 2017 at 05:04:41PM +0100, Cornelia Huck wrote:
> On Mon, 6 Mar 2017 16:21:13 +0100
> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>
> > On 03/06/2017 03:56 PM, Cornelia Huck wrote:
> > > On Fri, 3 Mar 2017 14:08:37 +0100
> > > Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> > >
> > >> On 03/03/2017 01:50 PM, Cornelia Huck wrote:
> > >>> On Fri, 3 Mar 2017 13:43:32 +0100
> > >>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> > >>>
> > >>>> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
> > >>>>> On Thu, 2 Mar 2017 19:59:42 +0100
> > >>>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
> > [...]
> > >> I admit, I did not investigate this thoroughly, also because the patch
> > >> is flawed regarding multi-threading anyway. After a quick investigation,
> > >> it seems the Linux guest won't auto-reset the device, so the guest
> > >> should end up with a non-working device. I think it's pretty likely
> > >> that the admin will check the logs if the device was important.
> > >
> > > Thinking a bit more about this, it seems setting the device broken is
> > > not the right solution for exactly that reason. Setting the virtio
> > > device broken is a way to signal to the guest 'you did something
> > > broken; please reset the device and start anew' (and that's how current
> > > callers use it). In our case, this is not the guest's fault.
> >
> > Do we have something to just say stuff is broken without blaming the
> > guest? And a device reset might not be that stupid at all in the given
> > situation, if we want to save what can be saved from the perspective of
> > the guest. (After a reset, stuff should work again until we hit the race
> > again -- and since turning ioeventfd on/off should not happen that often
> > during normal operation, it could help limit the damage suffered --
> > e.g. allow a controlled shutdown.)
>
> Checking again, the spec says:
>
>   DEVICE_NEEDS_RESET (64) Indicates that the device has experienced an
>   error from which it can’t recover.
>
> Nothing about 'guest error'.
>
> The only problem is that legacy devices don't have that state, which
> means they'll have a broken device through no fault of their own.
>
> > > Maybe go back to the assert 'solution'? But I'm not sure that's enough
> > > if production builds disable asserts...
> >
> > I will wait a bit, maybe other virtio folks are going to have an
> > opinion too.
> >
> > My concern about the assert solution is that for production it is
> > either too rigorous (kill off, hopefully with a dump) or not enough
> > (as you have mentioned, with NDEBUG assert does nothing).
> >
> > I think there are setups where the loss of a device does not have to be
> > fatal, and I would not like to be the one who makes it fatal (for the
> > guest).
>
> Basically, it's a host bug (and not a bug specific to a certain
> device). Moving the device which was impacted to a broken state may be
> a useful mitigation.
>
> But yes, let's hear some other opinions.

We don't support NDEBUG really, so I think an assert is fine for now.
Handling unexpected errors more gracefully is laudable, but I think we
want a more systematic approach than just open-coding it in this
specific place.
On 03/23/2017 06:09 PM, Michael S. Tsirkin wrote:
> On Mon, Mar 06, 2017 at 05:04:41PM +0100, Cornelia Huck wrote:
>> On Mon, 6 Mar 2017 16:21:13 +0100
>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>
>>> On 03/06/2017 03:56 PM, Cornelia Huck wrote:
>>>> On Fri, 3 Mar 2017 14:08:37 +0100
>>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>>>
>>>>> On 03/03/2017 01:50 PM, Cornelia Huck wrote:
>>>>>> On Fri, 3 Mar 2017 13:43:32 +0100
>>>>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>>>>>
>>>>>>> On 03/03/2017 01:21 PM, Cornelia Huck wrote:
>>>>>>>> On Thu, 2 Mar 2017 19:59:42 +0100
>>>>>>>> Halil Pasic <pasic@linux.vnet.ibm.com> wrote:
>>> [...]
>>>>> I admit, I did not investigate this thoroughly, also because the patch
>>>>> is flawed regarding multi-threading anyway. After a quick investigation,
>>>>> it seems the Linux guest won't auto-reset the device, so the guest
>>>>> should end up with a non-working device. I think it's pretty likely
>>>>> that the admin will check the logs if the device was important.
>>>>
>>>> Thinking a bit more about this, it seems setting the device broken is
>>>> not the right solution for exactly that reason. Setting the virtio
>>>> device broken is a way to signal to the guest 'you did something
>>>> broken; please reset the device and start anew' (and that's how current
>>>> callers use it). In our case, this is not the guest's fault.
>>>
>>> Do we have something to just say stuff is broken without blaming the
>>> guest? And a device reset might not be that stupid at all in the given
>>> situation, if we want to save what can be saved from the perspective of
>>> the guest. (After a reset, stuff should work again until we hit the race
>>> again -- and since turning ioeventfd on/off should not happen that often
>>> during normal operation, it could help limit the damage suffered --
>>> e.g. allow a controlled shutdown.)
>>
>> Checking again, the spec says:
>>
>>   DEVICE_NEEDS_RESET (64) Indicates that the device has experienced an
>>   error from which it can’t recover.
>>
>> Nothing about 'guest error'.
>>
>> The only problem is that legacy devices don't have that state, which
>> means they'll have a broken device through no fault of their own.
>>
>>>> Maybe go back to the assert 'solution'? But I'm not sure that's enough
>>>> if production builds disable asserts...
>>>
>>> I will wait a bit, maybe other virtio folks are going to have an
>>> opinion too.
>>>
>>> My concern about the assert solution is that for production it is
>>> either too rigorous (kill off, hopefully with a dump) or not enough
>>> (as you have mentioned, with NDEBUG assert does nothing).
>>>
>>> I think there are setups where the loss of a device does not have to be
>>> fatal, and I would not like to be the one who makes it fatal (for the
>>> guest).
>>
>> Basically, it's a host bug (and not a bug specific to a certain
>> device). Moving the device which was impacted to a broken state may be
>> a useful mitigation.
>>
>> But yes, let's hear some other opinions.
>
> We don't support NDEBUG really, so I think an assert is fine for now.

Thanks for your reply!

You are right, virtio.c makes sure we do not have NDEBUG defined, but
at the very same place a todo comment suggests that not supporting
NDEBUG is a temporary measure (so-called technical debt).

If my understanding is correct, I should use assert instead of
virtio_error for the next version. Is that right?

> Handling unexpected errors more gracefully is laudable, but I think we
> want a more systematic approach than just open-coding it in this
> specific place.

Sorry, I mistook virtio_error for the systematic approach for handling
(expected and unexpected) errors in virtio. Should I create a patch
which documents what is considered an (in)appropriate usage of
virtio_error? AFAIU (based on Connie's comments), virtio_error should
not be used for reporting QEMU errors (or should only be used to
report guest errors).

By the way, I have read the exchange on '[PATCH 1/5] virtio: Error
object based virtio_error()'. The things you suggest there make a lot
of sense to me. Unfortunately, I can't tell if I'm going to have the
time to come up with an appropriate patch(set), so I won't volunteer
for now.

Regards,
Halil
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 23483c7..e05f3e5 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1555,6 +1555,8 @@ static bool virtio_should_notify(VirtIODevice *vdev, VirtQueue *vq)
 void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
 {
     bool should_notify;
+    int rc;
+
     rcu_read_lock();
     should_notify = virtio_should_notify(vdev, vq);
     rcu_read_unlock();
@@ -1581,7 +1583,11 @@ void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq)
      * to an atomic operation.
      */
     virtio_set_isr(vq->vdev, 0x1);
-    event_notifier_set(&vq->guest_notifier);
+    rc = event_notifier_set(&vq->guest_notifier);
+    if (unlikely(rc)) {
+        virtio_error(vdev, "guest notifier broken for vdev at %p (%s)", vdev,
+                     strerror(-rc));
+    }
 }

 static void virtio_irq(VirtQueue *vq)
The function virtio_notify_irqfd used to ignore the return code of
event_notifier_set. Let's fail the device should this occur.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>

---

This patch is most likely flawed because virtio_notify_irqfd
is probably supposed to be thread-safe, and neither strerror
nor virtio_error is that. Anyway, let's get the discussion
started with this.

There was a suggestion by Michael to make event_notifier_set void
and assert inside. After some thinking, I settled on the opinion
that neither doing nothing nor crashing the guest is a good idea
should this failure happen in production. So I came up with this.

Looking forward to your feedback.
---
 hw/virtio/virtio.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)