[v2,0/2] Fix SCSI async abort handling when eh_deadline is active

Message ID 20211029194311.17504-1-emilne@redhat.com (mailing list archive)

Ewan Milne Oct. 29, 2021, 7:43 p.m. UTC
There is a code path in the SCSI async abort handling that can cause error
handling of subsequent scsi_cmnds to proceed immediately to host reset with no
other attempt at recovery.

This can be seen by the following:

    modprobe scsi_debug every_nth=10 opts=4
    echo 7 > /sys/module/scsi_mod/parameters/scsi_logging_level
    echo 10 > /sys/devices/pseudo_0/adapter0/host<N>/scsi_host/host<N>/eh_deadline

and performing I/O to the scsi_debug device.  The host will then get reset
because prior aborts succeeded without ->last_reset being invalidated.
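
To make the failure mode concrete, here is a small userspace model in C of
the deadline bookkeeping.  It is only a sketch, not the kernel code: struct
host_model, the model_*() functions, and the tick values are all hypothetical,
and the "invalidate" flag merely stands in for clearing ->last_reset when an
abort succeeds and EH never runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace model (NOT kernel code); times are abstract ticks. */
struct host_model {
    long eh_deadline;         /* ticks allowed for recovery; -1 = disabled */
    unsigned long last_reset; /* 0 = no deadline currently running */
};

/* An async abort is scheduled: start the deadline clock if it is idle. */
static void model_abort_scheduled(struct host_model *h, unsigned long now)
{
    assert(now != 0); /* 0 is reserved as the "not set" marker */
    if (h->eh_deadline != -1 && h->last_reset == 0)
        h->last_reset = now;
}

/* The abort succeeds, so full error handling never runs for this command. */
static void model_abort_succeeded(struct host_model *h, bool invalidate)
{
    if (invalidate)
        h->last_reset = 0; /* assumed effect of the fix: drop the stale stamp */
}

/* Has the recovery deadline already expired at time "now"? */
static bool model_past_deadline(const struct host_model *h, unsigned long now)
{
    if (h->last_reset == 0)
        return false;
    if (h->eh_deadline != -1 &&
        now < h->last_reset + (unsigned long)h->eh_deadline)
        return false;
    return true;
}

/*
 * Scenario: one abort succeeds at tick 100, then an unrelated command fails
 * much later at tick 1000.  Returns true if that later command is already
 * considered past the deadline, i.e. would go straight to host reset.
 */
static bool model_later_error_skips_recovery(bool invalidate)
{
    struct host_model h = { .eh_deadline = 10, .last_reset = 0 };

    model_abort_scheduled(&h, 100);
    model_abort_succeeded(&h, invalidate);
    model_abort_scheduled(&h, 1000);
    return model_past_deadline(&h, 1000);
}
```

In this model, model_later_error_skips_recovery(false) reports the deadline
as already expired because the stamp from tick 100 is still in place, while
model_later_error_skips_recovery(true) gives the later command a fresh
deadline window.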

The patch series contains a fix, followed by a simplification
of the control flow to remove duplicate code.  Only the first patch
is Cc: stable, as the second patch doesn't qualify.  Yes, I know the
first patch is >100 lines; unfortunately, I couldn't make it smaller.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>

v2:
    - Introduced scsi_eh_abort_cleanup() in patch 1/2 to factor out code
      (This is then removed in patch 2/2, since after the refactoring
       it is called from only one place.)
    - Moved introduction of local "shost" to cleanup patch 2/2

Ewan D. Milne (2):
  scsi: core: avoid leaving shost->last_reset with stale value if EH
    does not run
  scsi: core: simplify control flow in scmd_eh_abort_handler()

 drivers/scsi/hosts.c      |  1 +
 drivers/scsi/scsi_error.c | 92 ++++++++++++++++++++++++++++++-----------------
 drivers/scsi/scsi_lib.c   |  1 +
 include/scsi/scsi_cmnd.h  |  2 +-
 include/scsi/scsi_host.h  |  1 +
 5 files changed, 63 insertions(+), 34 deletions(-)