Message ID | 20240903135857.455818-2-daniel@0x0f.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [1/2] m68k/mvme147: Fix SCSI IRQ numbers | expand |
Hi Daniel. [resent in plain text format, sorry...] On 4/09/24 01:58, Daniel Palmer wrote: > I have no idea if this fix is appropriate, the code in this driver > makes my brain hurt just looking at it, but sometimes when getting > scsi_pointer from cmd cmd itself is actually null and we get a weird > garbage pointer for scsi_pointer. I am not sure I read that code right (you are correct in saying all that state machine code inside the interrupt handler makes for headaches, but that's a different matter). To me, this looks like the driver takes an interrupt without a connected command (that is why cmd ends up NULL). The interrupt turns out to be a selection interrupt, and the attempt to read the current bus phase from the command private data fails. The driver does use scsi_pointer elsewhere without ever checking it's valid, so it would seem this case is not meant to happen. On the other hand, we do not expect to have a connected command while selecting, so this case is pretty much _guaranteed_ to happen! It would appear that when Bart worked on this driver to move scsi_pointer to command private data (commit dbb2da557a6a87c88bbb4b1fef037091b57f701b in my tree), he overlooked the fact that 'cmd' is reloaded inside the interrupt handler from host queues, in response to interrupt conditions or phases. The copy of scsi_pointer initially obtained (from the connected command, without checking that there is in fact a command connected) is never reloaded though. Even if there had been a connected command, scsi_pointer would be stale at the point where the phase information is needed to construct the IDENTIFY message. The old code used phase information from the just reloaded selecting command, so reloading scsi_pointer at this time (as you do) is the correct behaviour. I am not certain that the test for NULL cmd at the start of the interrupt handler is actually required. Any part of the driver that changes cmd and makes use of scsi_pointer must reload scsi_pointer afterwards. The selection interrupt part seems the only part using scsi_pointer, so your fix is sufficient. Please respin and set the appropriate Fixes: tag. Cheers, Michael > When this is accessed later on bad things happen. For my machine > this happens when the SCSI bus is initially scanned. > > With this "fix" SCSI on my MVME147 is happy again. > > [ 84.330000] wd33c93-0: chip=WD33c93B/13 no_sync=0xff no_dma=0 > [ 84.330000] debug_flags=0x00 > [ 84.350000] setup_args= > <snip> > [ 84.490000] > [ 84.510000] Version 1.26++ - 10/Feb/2007 > [ 84.520000] scsi host0: MVME147 built-in SCSI > [ 85.480000] sending SDTR 0103015e00 > [ 85.480000] 01 > [ 85.490000] 03 > [ 85.500000] 01 > [ 85.510000] 00 > [ 85.520000] 00 > [ 85.520000] sync_xfer=30 > [ 85.530000] scsi 0:0:5:0: Direct-Access BlueSCSI HARDDRIVE 2.0 PQ: 0 ANSI: 2 > [ 85.820000] st: Version 20160209, fixed bufsize 32768, s/g segs 256 > [ 85.900000] sd 0:0:5:0: Attached scsi generic sg0 type 0 > > Signed-off-by: Daniel Palmer<daniel@0x0f.com> > --- > drivers/scsi/wd33c93.c | 10 +++++++--- > drivers/scsi/wd33c93.h | 2 ++ > 2 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/wd33c93.c b/drivers/scsi/wd33c93.c > index a44b60c9004a..9789d852d541 100644 > --- a/drivers/scsi/wd33c93.c > +++ b/drivers/scsi/wd33c93.c > @@ -733,7 +733,7 @@ transfer_bytes(const wd33c93_regs regs, struct scsi_cmnd *cmd, > void > wd33c93_intr(struct Scsi_Host *instance) > { > - struct scsi_pointer *scsi_pointer; > + struct scsi_pointer *scsi_pointer = NULL; > struct WD33C93_hostdata *hostdata = > (struct WD33C93_hostdata *) instance->hostdata; > const wd33c93_regs regs = hostdata->regs; > @@ -752,7 +752,9 @@ wd33c93_intr(struct Scsi_Host *instance) > #endif > > cmd = (struct scsi_cmnd *) hostdata->connected; /* assume we're connected */ > - scsi_pointer = WD33C93_scsi_pointer(cmd); > + /* cmd could be null */ > + if (cmd) > + scsi_pointer = WD33C93_scsi_pointer(cmd); > sr = read_wd33c93(regs, WD_SCSI_STATUS); /* clear the interrupt */ > phs = read_wd33c93(regs, WD_COMMAND_PHASE); > > @@ -828,8 +830,10 @@ wd33c93_intr(struct Scsi_Host *instance) > (struct scsi_cmnd *) hostdata->selecting; > hostdata->selecting = NULL; > > - /* construct an IDENTIFY message with correct disconnect bit */ > + /* cmd should now be valid and we can get scsi_pointer */ > + scsi_pointer = WD33C93_scsi_pointer(cmd); > > + /* construct an IDENTIFY message with correct disconnect bit */ > hostdata->outgoing_msg[0] = IDENTIFY(0, cmd->device->lun); > if (scsi_pointer->phase) > hostdata->outgoing_msg[0] |= 0x40; > diff --git a/drivers/scsi/wd33c93.h b/drivers/scsi/wd33c93.h > index e5e4254b1477..898c1c7d024d 100644 > --- a/drivers/scsi/wd33c93.h > +++ b/drivers/scsi/wd33c93.h > @@ -259,6 +259,8 @@ struct WD33C93_hostdata { > > static inline struct scsi_pointer *WD33C93_scsi_pointer(struct scsi_cmnd *cmd) > { > + WARN_ON_ONCE(!cmd); > + > return scsi_cmd_priv(cmd); > } >
diff --git a/drivers/scsi/wd33c93.c b/drivers/scsi/wd33c93.c index a44b60c9004a..9789d852d541 100644 --- a/drivers/scsi/wd33c93.c +++ b/drivers/scsi/wd33c93.c @@ -733,7 +733,7 @@ transfer_bytes(const wd33c93_regs regs, struct scsi_cmnd *cmd, void wd33c93_intr(struct Scsi_Host *instance) { - struct scsi_pointer *scsi_pointer; + struct scsi_pointer *scsi_pointer = NULL; struct WD33C93_hostdata *hostdata = (struct WD33C93_hostdata *) instance->hostdata; const wd33c93_regs regs = hostdata->regs; @@ -752,7 +752,9 @@ wd33c93_intr(struct Scsi_Host *instance) #endif cmd = (struct scsi_cmnd *) hostdata->connected; /* assume we're connected */ - scsi_pointer = WD33C93_scsi_pointer(cmd); + /* cmd could be null */ + if (cmd) + scsi_pointer = WD33C93_scsi_pointer(cmd); sr = read_wd33c93(regs, WD_SCSI_STATUS); /* clear the interrupt */ phs = read_wd33c93(regs, WD_COMMAND_PHASE); @@ -828,8 +830,10 @@ wd33c93_intr(struct Scsi_Host *instance) (struct scsi_cmnd *) hostdata->selecting; hostdata->selecting = NULL; - /* construct an IDENTIFY message with correct disconnect bit */ + /* cmd should now be valid and we can get scsi_pointer */ + scsi_pointer = WD33C93_scsi_pointer(cmd); + /* construct an IDENTIFY message with correct disconnect bit */ hostdata->outgoing_msg[0] = IDENTIFY(0, cmd->device->lun); if (scsi_pointer->phase) hostdata->outgoing_msg[0] |= 0x40; diff --git a/drivers/scsi/wd33c93.h b/drivers/scsi/wd33c93.h index e5e4254b1477..898c1c7d024d 100644 --- a/drivers/scsi/wd33c93.h +++ b/drivers/scsi/wd33c93.h @@ -259,6 +259,8 @@ struct WD33C93_hostdata { static inline struct scsi_pointer *WD33C93_scsi_pointer(struct scsi_cmnd *cmd) { + WARN_ON_ONCE(!cmd); + return scsi_cmd_priv(cmd); }
I have no idea if this fix is appropriate, the code in this driver makes my brain hurt just looking at it, but sometimes when getting scsi_pointer from cmd cmd itself is actually null and we get a weird garbage pointer for scsi_pointer. When this is accessed later on bad things happen. For my machine this happens when the SCSI bus is initially scanned. With this "fix" SCSI on my MVME147 is happy again. [ 84.330000] wd33c93-0: chip=WD33c93B/13 no_sync=0xff no_dma=0 [ 84.330000] debug_flags=0x00 [ 84.350000] setup_args= <snip> [ 84.490000] [ 84.510000] Version 1.26++ - 10/Feb/2007 [ 84.520000] scsi host0: MVME147 built-in SCSI [ 85.480000] sending SDTR 0103015e00 [ 85.480000] 01 [ 85.490000] 03 [ 85.500000] 01 [ 85.510000] 00 [ 85.520000] 00 [ 85.520000] sync_xfer=30 [ 85.530000] scsi 0:0:5:0: Direct-Access BlueSCSI HARDDRIVE 2.0 PQ: 0 ANSI: 2 [ 85.820000] st: Version 20160209, fixed bufsize 32768, s/g segs 256 [ 85.900000] sd 0:0:5:0: Attached scsi generic sg0 type 0 Signed-off-by: Daniel Palmer <daniel@0x0f.com> --- drivers/scsi/wd33c93.c | 10 +++++++--- drivers/scsi/wd33c93.h | 2 ++ 2 files changed, 9 insertions(+), 3 deletions(-)