[4/4] scsi-disk: Always report RESERVATION_CONFLICT to guest

Message ID 20240729094702.50282-5-kwolf@redhat.com (mailing list archive)
State New, archived
Series scsi-block: Fix error handling with r/werror=stop

Commit Message

Kevin Wolf July 29, 2024, 9:47 a.m. UTC
RESERVATION_CONFLICT is not a backend error, but indicates that the
guest tried to make a request that it isn't allowed to execute. Pass the
error to the guest so that it can decide what to do with it.

Without this, if we stop the VM in response to a RESERVATION_CONFLICT,
it can happen that the VM cannot be resumed any more because every
attempt to resume it immediately runs into the same error and stops the
VM again.
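
For illustration, the failure loop as seen over QMP might look roughly like
this (payloads abbreviated; the node name is hypothetical):

    -> {"execute": "cont"}
    <- {"return": {}}
    <- {"event": "RESUME"}
    <- {"event": "BLOCK_IO_ERROR",
        "data": {"node-name": "disk0", "operation": "write", "action": "stop"}}
    <- {"event": "STOP"}

Every further "cont" just repeats the same sequence.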

One case that expects RESERVATION_CONFLICT errors to be visible in the
guest is running the validation tests in Windows 2019's Failover Cluster
Manager, which intentionally tries to execute invalid requests to see if
they are properly rejected.

Buglink: https://issues.redhat.com/browse/RHEL-50000
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 hw/scsi/scsi-disk.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Comments

Paolo Bonzini July 29, 2024, 11:55 a.m. UTC | #1
On Mon, Jul 29, 2024 at 11:47 AM Kevin Wolf <kwolf@redhat.com> wrote:
> RESERVATION_CONFLICT is not a backend error, but indicates that the
> guest tried to make a request that it isn't allowed to execute. Pass the
> error to the guest so that it can decide what to do with it.

This is only true of scsi-block (though your patch is okay here -
scsi-disk would see an EBADE and go down the ret < 0 path).
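
For readers without the file open, a simplified sketch of the dispatch at the
top of scsi_handle_rw_error(), reconstructed from the hunk context (the
ret < 0 branch is paraphrased, not quoted from the source):

    if (ret < 0) {
        /* Block-layer errno path: scsi-disk errors land here, e.g. a
         * host-side reservation conflict surfaces as -EBADE */
        status = scsi_sense_from_errno(-ret, &sense);
        error = -ret;
    } else {
        /* A passthrough command (scsi-block) completed with nonzero
         * status, i.e. a raw SCSI status such as RESERVATION CONFLICT */
        status = ret;
        /* ... the switch this patch modifies ... */
    }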

In general, for scsi-block I'd expect people to use report instead of
stop. I agree that this is the best behavior for the case where you
have a pr-manager, but it may also be better to stop the VM if a
pr-manager has not been set up.  That's probably a bit hackish, so I
guess it's okay to add a FIXME or TODO comment instead?
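
For reference, the policy in question is the per-device rerror/werror option;
a hypothetical scsi-block configuration using report instead of stop would
look something like this (controller omitted, device path made up):

    -blockdev node-name=hd0,driver=host_device,filename=/dev/sdb
    -device scsi-block,drive=hd0,rerror=report,werror=report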

> -        if (status == CHECK_CONDITION) {
> +        switch (status) {
> +        case CHECK_CONDITION:
>              req_has_sense = true;
>              error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense));
> -        } else {
> +            break;
> +        case RESERVATION_CONFLICT:
> +            /* Don't apply the error policy, always report to the guest */

This is the only case where you get error == 0. Maybe remove it from
the initializer, and set it here?
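
Concretely, something like this (only a sketch of the suggestion):

    int error;    /* no "= 0" initializer */
    ...
    case RESERVATION_CONFLICT:
        /* Don't apply the error policy, always report to the guest */
        error = 0;
        break;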

Paolo

Kevin Wolf July 29, 2024, 12:20 p.m. UTC | #2
On 29.07.2024 13:55, Paolo Bonzini wrote:
> On Mon, Jul 29, 2024 at 11:47 AM Kevin Wolf <kwolf@redhat.com> wrote:
> > RESERVATION_CONFLICT is not a backend error, but indicates that the
> > guest tried to make a request that it isn't allowed to execute. Pass the
> > error to the guest so that it can decide what to do with it.
> 
> This is only true of scsi-block (though your patch is okay here -
> scsi-disk would see an EBADE and go down the ret < 0 path).

Right, in the scsi-disk case, we probably do want to consider it a
host-side error because the guest can't see or influence what happens on
the backend.

I can change the commit message accordingly.

> In general, for scsi-block I'd expect people to use report instead of
> stop. I agree that this is the best behavior for the case where you
> have a pr-manager, but it may also be better to stop the VM if a
> pr-manager has not been set up.  That's probably a bit hackish, so I
> guess it's okay to add a FIXME or TODO comment instead?

Apparently both oVirt and Kubevirt unconditionally use the stop policy,
so I'm afraid in this case we must acknowledge that our expectations
don't match reality.

If I understand correctly, not having a pr-manager could mean that QEMU
itself is sufficiently privileged and then the same logic would apply.
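
For context, a pr-manager setup looks roughly like this (IDs and paths are
hypothetical; see QEMU's pr-manager documentation):

    -object pr-manager-helper,id=pr0,path=/run/qemu-pr-helper.sock
    -blockdev node-name=hd0,driver=host_device,filename=/dev/sdb,pr-manager=pr0
    -device scsi-block,drive=hd0,werror=stop,rerror=stop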

But even if it means that we can't change any persistent reservations
from the VM, what use would stopping the VM be? You would run into the
exact case I'm describing in the commit message: You try to resume the
VM and it immediately stops again because the request still doesn't get
through. Or do you expect the host admin to take some manual action
then?

And what would you do about the Windows cluster validation case that
intentionally sends a request which reservations don't and shouldn't
allow? There is nothing on the host side to fix there. The guest is only
happy when it gets an error back.

> > -        if (status == CHECK_CONDITION) {
> > +        switch (status) {
> > +        case CHECK_CONDITION:
> >              req_has_sense = true;
> >              error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense));
> > -        } else {
> > +            break;
> > +        case RESERVATION_CONFLICT:
> > +            /* Don't apply the error policy, always report to the guest */
> 
> This is the only case where you get error == 0. Maybe remove it from
> the initializer, and set it here?

Not sure why the initialiser was added in the first place, but yes, I
can do that.

Kevin

Paolo Bonzini July 29, 2024, 12:26 p.m. UTC | #3
On Mon, Jul 29, 2024 at 14:20, Kevin Wolf <kwolf@redhat.com> wrote:

> Apparently both oVirt and Kubevirt unconditionally use the stop policy,
> so I'm afraid in this case we must acknowledge that our expectations
> don't match reality.
>

Yeah, of course.

> If I understand correctly, not having a pr-manager could mean that QEMU
> itself is sufficiently privileged and then the same logic would apply.
>
> But even if it means that we can't change any persistent reservations
> from the VM, what use would stopping the VM be? You would run into the
> exact case I'm describing in the commit message: You try to resume the
> VM and it immediately stops again because the request still doesn't get
> through. Or do you expect the host admin to take some manual action
> then?
>

Yes, if the PR operation is not allowed then the host admin would probably
get a notification and release the PR (or perhaps shut down the VM with an
error) itself.

> And what would you do about the Windows cluster validation case that
> intentionally sends a request which reservations don't and shouldn't
> allow? There is nothing on the host side to fix there. The guest is only
> happy when it gets an error back.
>

Yes, in that case (which is the most common one) there is nothing you can
do, so the patch is a good idea even if the case without a PR manager is a
bit murky.

Paolo

Kevin Wolf July 29, 2024, 4:55 p.m. UTC | #4
On 29.07.2024 14:26, Paolo Bonzini wrote:
> On Mon, Jul 29, 2024 at 14:20, Kevin Wolf <kwolf@redhat.com> wrote:
> 
> > Apparently both oVirt and Kubevirt unconditionally use the stop policy,
> > so I'm afraid in this case we must acknowledge that our expectations
> > don't match reality.
> >
> 
> Yeah, of course.
> 
> > If I understand correctly, not having a pr-manager could mean that QEMU
> > itself is sufficiently privileged and then the same logic would apply.
> >
> > But even if it means that we can't change any persistent reservations
> > from the VM, what use would stopping the VM be? You would run into the
> > exact case I'm describing in the commit message: You try to resume the
> > VM and it immediately stops again because the request still doesn't get
> > through. Or do you expect the host admin to take some manual action
> > then?
> >
> 
> Yes, if the PR operation is not allowed then the host admin would probably
> get a notification and release the PR (or perhaps shut down the VM with an
> error) itself.
> 
> > And what would you do about the Windows cluster validation case that
> > intentionally sends a request which reservations don't and shouldn't
> > allow? There is nothing on the host side to fix there. The guest is only
> > happy when it gets an error back.
> >
> 
> Yes, in that case (which is the most common one) there is nothing you can
> do, so the patch is a good idea even if the case without a PR manager is a
> bit murky.

Ok, so modifying the commit message and removing the 'error'
initialisation it is. Maybe mention the cluster validation case in the
comment here to explain why we do this even for non-pr-manager cases,
but not as a FIXME or TODO because it's not a problem with the
implementation, but we don't have any other choice. Right?
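
A sketch of what that comment might look like (exact wording still to be
decided while applying, so this is only an assumption):

    case RESERVATION_CONFLICT:
        /*
         * Don't apply the error policy, always report to the guest.
         *
         * This is a passthrough code path, so the conflict is not a
         * backend error, but the guest's own request being rejected.
         * Windows Failover Cluster validation intentionally triggers
         * such conflicts and must see the error, and even without a
         * pr-manager there may be nothing the host could fix anyway.
         */
        break;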

Should I send a v2 for this or is it okay to do this only while applying
the patch?

Kevin
Paolo Bonzini July 30, 2024, 6:06 a.m. UTC | #5
On Mon, Jul 29, 2024 at 6:55 PM Kevin Wolf <kwolf@redhat.com> wrote:
> Ok, so modifying the commit message and removing the 'error'
> initialisation it is. Maybe mention the cluster validation case in the
> comment here to explain why we do this even for non-pr-manager cases,
> but not as a FIXME or TODO because it's not a problem with the
> implementation, but we don't have any other choice. Right?

Yep, go ahead and do it.

Paolo

Patch

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 69a195177e..e173b238de 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -235,11 +235,17 @@  static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
     } else {
         /* A passthrough command has completed with nonzero status.  */
         status = ret;
-        if (status == CHECK_CONDITION) {
+        switch (status) {
+        case CHECK_CONDITION:
             req_has_sense = true;
             error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense));
-        } else {
+            break;
+        case RESERVATION_CONFLICT:
+            /* Don't apply the error policy, always report to the guest */
+            break;
+        default:
             error = EINVAL;
+            break;
         }
     }
 
@@ -249,8 +255,9 @@  static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
      * are usually retried immediately, so do not post them to QMP and
      * do not account them as failed I/O.
      */
-    if (req_has_sense &&
-        scsi_sense_buf_is_guest_recoverable(r->req.sense, sizeof(r->req.sense))) {
+    if (!error || (req_has_sense &&
+                   scsi_sense_buf_is_guest_recoverable(r->req.sense,
+                                                       sizeof(r->req.sense)))) {
         action = BLOCK_ERROR_ACTION_REPORT;
         acct_failed = false;
     } else {