Message ID | 77f405a048b07e4451b7d7adaeba7ce4a00b7efb.camel@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | scsi: qla2xxx avoid a panic due to BUG() if a WRITE_SAME command is sent to a device that has no protection | expand |
Hi Laurence, > On Jun 28, 2023, at 4:34 AM, Laurence Oberman <loberman@redhat.com> wrote: > > In the current code, If a device does not have protection, qla2xx will > land up defaulting to a BUG() and will panic the system when > sg_write_same is sent.This is because SCSI_PROT_NORMAL is matched and > falls through to the BUG() call. > The write_same command to a device without protection is not handled > safely. > > Fix this by making two changes: > Set the bundling variable also to 0 for SCSI_PROT_NORMAL > Modify the switch statement to match on SCSI_PROT_NORMAL and handle it > appropriately removing the call to BUG() > This should go to stable kernel as well. Cc: stable@vger.kernel.org <mailto:stable@vger.kernel.org> > Supersedes prior suggested patch. > > Suggested-by: David Jeffery <djeffery@redhat.com> > Tested-by: Laurence Oberman <loberman@redhat.com> > Signed-off-by: Laurence Oberman <loberman@redhat.com> > --- > drivers/scsi/qla2xxx/qla_iocb.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/qla2xxx/qla_iocb.c > b/drivers/scsi/qla2xxx/qla_iocb.c > index b9b3e6f80ea9..82a5d195e401 100644 > --- a/drivers/scsi/qla2xxx/qla_iocb.c > +++ b/drivers/scsi/qla2xxx/qla_iocb.c > @@ -1381,7 +1381,8 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, struct > cmd_type_crc_2 *cmd_pkt, > if ((scsi_get_prot_op(cmd) == SCSI_PROT_READ_INSERT) || > (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_STRIP) || > (scsi_get_prot_op(cmd) == SCSI_PROT_READ_STRIP) || > - (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_INSERT)) > + (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_INSERT) || > + (scsi_get_prot_op(cmd) == SCSI_PROT_NORMAL)) > bundling = 0; > > /* Allocate CRC context from global pool */ > @@ -1443,6 +1444,9 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, struct > cmd_type_crc_2 *cmd_pkt, > dif_bytes = (data_bytes / blk_size) * 8; > > switch (scsi_get_prot_op(GET_CMD_SP(sp))) { > + case SCSI_PROT_NORMAL: > + total_bytes = data_bytes; > + break; > case SCSI_PROT_READ_INSERT: > case SCSI_PROT_WRITE_STRIP: > total_bytes = data_bytes; > -- > 2.39.3 The change itself looks good. Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com <mailto:himanshu.madhani@oracle.com>>
Laurence, > In the current code, If a device does not have protection, qla2xx will > land up defaulting to a BUG() and will panic the system when > sg_write_same is sent.This is because SCSI_PROT_NORMAL is matched and > falls through to the BUG() call. The write_same command to a device > without protection is not handled safely. I would like to understand why the driver PI code path was chosen for a SCSI_PROT_NORMAL cmnd. That doesn't seem right. > + case SCSI_PROT_NORMAL: > + total_bytes = data_bytes; > + break; > case SCSI_PROT_READ_INSERT: > case SCSI_PROT_WRITE_STRIP: > total_bytes = data_bytes; All this transfer size wrangling in the driver should be removed and replaced with a call to scsi_transfer_length() which takes the PI size into account.
On Wed, 2023-07-05 at 22:35 -0400, Martin K. Petersen wrote: > > Laurence, > > > In the current code, If a device does not have protection, qla2xx > > will > > land up defaulting to a BUG() and will panic the system when > > sg_write_same is sent.This is because SCSI_PROT_NORMAL is matched > > and > > falls through to the BUG() call. The write_same command to a device > > without protection is not handled safely. > > I would like to understand why the driver PI code path was chosen for > a > SCSI_PROT_NORMAL cmnd. That doesn't seem right. > > > + case SCSI_PROT_NORMAL: > > + total_bytes = data_bytes; > > + break; > > case SCSI_PROT_READ_INSERT: > > case SCSI_PROT_WRITE_STRIP: > > total_bytes = data_bytes; > > All this transfer size wrangling in the driver should be removed and > replaced with a call to scsi_transfer_length() which takes the PI > size > into account. Martin, good questions I am waiting on Marvell to decide what to do about all this. The patch was focused only on avoiding the nasty effect of what you mention happens in that code path, causing customers to have system panics. The patch is to avoid the panic at this point with minimal impact to prior code functionality. Regards Laurence >
On Thu, 2023-07-06 at 09:41 -0400, Laurence Oberman wrote: > On Wed, 2023-07-05 at 22:35 -0400, Martin K. Petersen wrote: > > > > Laurence, > > > > > In the current code, If a device does not have protection, qla2xx > > > will > > > land up defaulting to a BUG() and will panic the system when > > > sg_write_same is sent.This is because SCSI_PROT_NORMAL is matched > > > and > > > falls through to the BUG() call. The write_same command to a > > > device > > > without protection is not handled safely. > > > > I would like to understand why the driver PI code path was chosen > > for > > a > > SCSI_PROT_NORMAL cmnd. That doesn't seem right. > > > > > + case SCSI_PROT_NORMAL: > > > + total_bytes = data_bytes; > > > + break; > > > case SCSI_PROT_READ_INSERT: > > > case SCSI_PROT_WRITE_STRIP: > > > total_bytes = data_bytes; > > > > All this transfer size wrangling in the driver should be removed > > and > > replaced with a call to scsi_transfer_length() which takes the PI > > size > > into account. > > > Martin, good questions > > I am waiting on Marvell to decide what to do about all this. > > The patch was focused only on avoiding the nasty effect of what you > mention happens in that code path, causing customers to have system > panics. > > The patch is to avoid the panic at this point with minimal impact to > prior code functionality. > > Regards > Laurence > > > > > > Hello Nilesh and Marvell Any chance to get comments/eyes on this please. Given its causing system crashes we need to decide how best to deal with it. regards Laurence
Hello Nilesh and Marvell Any chance to get comments/eyes on this please. Given its causing system crashes we need to decide how best to deal with it. QT: Laurence, In understanding the severity, Does end customer uses sg_write_same as the mechanism to move data? Other than the sg_write_same utility, how common is end customer uses 32Byte CDB? It seems like upper layer doesn't have support for 32Bytes CDB at this time. The code path you're modifying is for the T10-PI disk. This disk is "non-T10-PI" where it may create some confusion for next reader n Martin on why we've wander down this code path. Will queue up a patch that plug this hole.
On Wed, 2023-07-12 at 00:34 +0000, Quinn Tran wrote: > Hello Nilesh and Marvell > > Any chance to get comments/eyes on this please. > Given its causing system crashes we need to decide how best to deal > with it. > > QT: Laurence, > In understanding the severity, Does end customer uses sg_write_same > as the mechanism to move data? > Other than the sg_write_same utility, how common is end customer uses > 32Byte CDB? It seems like upper layer doesn't have support for > 32Bytes CDB at this time. > > The code path you're modifying is for the T10-PI disk. This disk is > "non-T10-PI" where it may create some confusion for next reader n > Martin on why we've wander down this code path. > > Will queue up a patch that plug this hole. > > > OK, Thank you In this case the customer was specifically using sg_write_same. I am not sure if it was part of a script or some other use case. They were of the opinion it was severe enough of an issue to warrant fixing so they logged a case with us. Thanks for looking into this. Regards Laurence
On Wed, 2023-07-12 at 09:25 -0400, Laurence Oberman wrote: > On Wed, 2023-07-12 at 00:34 +0000, Quinn Tran wrote: > > Hello Nilesh and Marvell > > > > Any chance to get comments/eyes on this please. > > Given its causing system crashes we need to decide how best to deal > > with it. > > > > QT: Laurence, > > In understanding the severity, Does end customer uses > > sg_write_same > > as the mechanism to move data? > > Other than the sg_write_same utility, how common is end customer > > uses > > 32Byte CDB? It seems like upper layer doesn't have support for > > 32Bytes CDB at this time. > > > > The code path you're modifying is for the T10-PI disk. This disk > > is > > "non-T10-PI" where it may create some confusion for next reader n > > Martin on why we've wander down this code path. > > > > Will queue up a patch that plug this hole. > > > > > > > > OK, Thank you > In this case the customer was specifically using sg_write_same. I am > not sure if it was part of a script or some other use case. > They were of the opinion it was severe enough of an issue to warrant > fixing so they logged a case with us. > Thanks for looking into this. > > Regards > Laurence Hello QT Did I miss an update to this. Was another fix sent. We need to deal with this at Red Hat as soon as possible please. Regards Laurence
On Wed, 2023-07-26 at 08:40 -0400, Laurence Oberman wrote: > On Wed, 2023-07-12 at 09:25 -0400, Laurence Oberman wrote: > > On Wed, 2023-07-12 at 00:34 +0000, Quinn Tran wrote: > > > Hello Nilesh and Marvell > > > > > > Any chance to get comments/eyes on this please. > > > Given its causing system crashes we need to decide how best to > > > deal > > > with it. > > > > > > QT: Laurence, > > > In understanding the severity, Does end customer uses > > > sg_write_same > > > as the mechanism to move data? > > > Other than the sg_write_same utility, how common is end customer > > > uses > > > 32Byte CDB? It seems like upper layer doesn't have support for > > > 32Bytes CDB at this time. > > > > > > The code path you're modifying is for the T10-PI disk. This disk > > > is > > > "non-T10-PI" where it may create some confusion for next reader n > > > Martin on why we've wander down this code path. > > > > > > Will queue up a patch that plug this hole. > > > > > > > > > > > > > OK, Thank you > > In this case the customer was specifically using sg_write_same. I > > am > > not sure if it was part of a script or some other use case. > > They were of the opinion it was severe enough of an issue to > > warrant > > fixing so they logged a case with us. > > Thanks for looking into this. > > > > Regards > > Laurence > Hello QT > > Did I miss an update to this. Was another fix sent. > We need to deal with this at Red Hat as soon as possible please. > > Regards > Laurence QT WStillw aiting on you guys and we need to get thjis fixed, Please send the fix upstream that you were thinking of today. Regards Laurence
diff --git a/drivers/scsi/qla2xxx/qla_iocb.c b/drivers/scsi/qla2xxx/qla_iocb.c index b9b3e6f80ea9..82a5d195e401 100644 --- a/drivers/scsi/qla2xxx/qla_iocb.c +++ b/drivers/scsi/qla2xxx/qla_iocb.c @@ -1381,7 +1381,8 @@ qla24xx_build_scsi_crc_2_iocbs(srb_t *sp, struct cmd_type_crc_2 *cmd_pkt, if ((scsi_get_prot_op(cmd) == SCSI_PROT_READ_INSERT) || (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_STRIP) || (scsi_get_prot_op(cmd) == SCSI_PROT_READ_STRIP) || - (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_INSERT)) + (scsi_get_prot_op(cmd) == SCSI_PROT_WRITE_INSERT) || + (scsi_get_prot_op(cmd) == SCSI_PROT_NORMAL)) bundling = 0;