Message ID | 1482929480-16688-1-git-send-email-gpiccoli@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded, archived |
Hi Guilherme,

Can you please share the driver logs (with the driver logging_level set to 0x3f8) for the original issue (i.e. without this patch's changes) and the HBA firmware version you used?

Also, can you please share the steps you followed to reproduce this issue, so that I can try it on my setup?

Thanks,
Sreekanth

On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> wrote:
> From: Ram Pai <linuxram@us.ibm.com>
>
> The firmware or device, possibly under a heavy I/O load, can return
> on a partial unaligned boundary. Scsi-ml expects these requests to be
> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
> without checking the alignment boundary of the I/O request for the
> remaining bytes. This leads to errors, since devices cannot perform
> non-aligned read/write operations.
>
> This patch fixes the issue in the driver. It aligns unaligned
> completions of FS requests, by truncating them to the nearest
> alignment boundary.
>
> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b5c966e..55332a3 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>  	struct MPT3SAS_DEVICE *sas_device_priv_data;
>  	u32 response_code = 0;
>  	unsigned long flags;
> +	unsigned int sector_sz;
> +	struct request *req;
>
>  	mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>  	scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>  	}
>
>  	xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
> +
> +	/* In case of bogus fw or device, we could end up having
> +	 * unaligned partial completion. We can force alignment here,
> +	 * then scsi-ml does not need to handle this misbehavior.
> +	 */
> +	sector_sz = scmd->device->sector_size;
> +	req = scmd->request;
> +	if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
> +	    (xfer_cnt % sector_sz))) {
> +		sdev_printk(KERN_INFO, scmd->device,
> +		    "unaligned partial completion avoided\n");
> +		xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
> +	}
> +
>  	scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>  	if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>  		log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> --
> 2.1.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Sreekanth,

Let us discuss this internally in more detail before ACKing this patch.

Thanks,
Sathya

-----Original Message-----
From: Sreekanth Reddy [mailto:sreekanth.reddy@broadcom.com]
Sent: Wednesday, January 04, 2017 4:33 AM
To: Guilherme G. Piccoli
Cc: linux-scsi@vger.kernel.org; PDL-MPT-FUSIONLINUX; Sathya Prakash; Chaitra Basappa; Suganath Prabu Subramani; Brian King; mauricfo@linux.vnet.ibm.com; linuxram@us.ibm.com
Subject: Re: [PATCH] mpt3sas: Force request partial completion alignment

Hi Guilherme,

Can you please share the driver logs (with the driver logging_level set to 0x3f8) for the original issue (i.e. without this patch's changes) and the HBA firmware version you used?

Also, can you please share the steps you followed to reproduce this issue, so that I can try it on my setup?

Thanks,
Sreekanth
On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> wrote:
> From: Ram Pai <linuxram@us.ibm.com>
>
> The firmware or device, possibly under a heavy I/O load, can return
> on a partial unaligned boundary. Scsi-ml expects these requests to be
> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
> without checking the alignment boundary of the I/O request for the
> remaining bytes. This leads to errors, since devices cannot perform
> non-aligned read/write operations.
>
> This patch fixes the issue in the driver. It aligns unaligned
> completions of FS requests, by truncating them to the nearest
> alignment boundary.
>
> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b5c966e..55332a3 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>  	struct MPT3SAS_DEVICE *sas_device_priv_data;
>  	u32 response_code = 0;
>  	unsigned long flags;
> +	unsigned int sector_sz;
> +	struct request *req;
>
>  	mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>  	scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>  	}
>
>  	xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
> +
> +	/* In case of bogus fw or device, we could end up having
> +	 * unaligned partial completion. We can force alignment here,
> +	 * then scsi-ml does not need to handle this misbehavior.
> +	 */
> +	sector_sz = scmd->device->sector_size;
> +	req = scmd->request;
> +	if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
> +	    (xfer_cnt % sector_sz))) {
> +		sdev_printk(KERN_INFO, scmd->device,
> +		    "unaligned partial completion avoided\n");

[Sreekanth] Patch looks good. But can we print the xfer_cnt and sector_sz values along with the above message?

Also, if this is a generic drive issue, can we move this workaround to the SCSI mid layer?

> +		xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
> +	}
> +
>  	scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>  	if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>  		log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> --
> 2.1.0
>
On 01/23/2017 07:05 AM, Sreekanth Reddy wrote:
> On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli
> <gpiccoli@linux.vnet.ibm.com> wrote:
>> From: Ram Pai <linuxram@us.ibm.com>
>>
>> The firmware or device, possibly under a heavy I/O load, can return
>> on a partial unaligned boundary. Scsi-ml expects these requests to be
>> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
>> without checking the alignment boundary of the I/O request for the
>> remaining bytes. This leads to errors, since devices cannot perform
>> non-aligned read/write operations.
>>
>> This patch fixes the issue in the driver. It aligns unaligned
>> completions of FS requests, by truncating them to the nearest
>> alignment boundary.
>>
>> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
>> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
>> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
>> ---
>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> index b5c966e..55332a3 100644
>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>>  	struct MPT3SAS_DEVICE *sas_device_priv_data;
>>  	u32 response_code = 0;
>>  	unsigned long flags;
>> +	unsigned int sector_sz;
>> +	struct request *req;
>>
>>  	mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>>  	scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
>> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>>  	}
>>
>>  	xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
>> +
>> +	/* In case of bogus fw or device, we could end up having
>> +	 * unaligned partial completion. We can force alignment here,
>> +	 * then scsi-ml does not need to handle this misbehavior.
>> +	 */
>> +	sector_sz = scmd->device->sector_size;
>> +	req = scmd->request;
>> +	if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
>> +	    (xfer_cnt % sector_sz))) {
>> +		sdev_printk(KERN_INFO, scmd->device,
>> +		    "unaligned partial completion avoided\n");
>
> [Sreekanth] Patch looks good. But can we print the xfer_cnt and
> sector_sz values along with the above message?
>
> Also, if this is a generic drive issue, can we move this workaround
> to the SCSI mid layer?
>

Thank you! I'll send a v2 including your suggestion.

Regarding a fix in scsi-ml, we already tried that:
https://lkml.org/lkml/2016/12/19/591

The reception wasn't in favor of that patch; reviewers suggested we patch the driver instead, so we sent the current change only for mpt3sas.

Thanks,

Guilherme

>> +		xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
>> +	}
>> +
>>  	scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>>  	if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>>  		log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
>> --
>> 2.1.0
>>
>
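As an illustration of the review request in this thread (printing xfer_cnt and sector_sz with the message), here is a minimal userspace sketch in C. It is not the v2 patch: format_unaligned_msg() is a hypothetical helper, snprintf() stands in for the sdev_printk() call, and the exact message layout is an assumption.

```c
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the sdev_printk() call in the patch.
 * Writes the diagnostic into buf and returns the snprintf() result.
 * The "(xfer_cnt=..., sector_sz=...)" layout is an assumption, not taken
 * from any posted version of the patch.
 */
static int format_unaligned_msg(char *buf, size_t len,
				unsigned int xfer_cnt, unsigned int sector_sz)
{
	return snprintf(buf, len,
			"unaligned partial completion avoided (xfer_cnt=%u, sector_sz=%u)\n",
			xfer_cnt, sector_sz);
}
```

In the driver, the same format string and the two values would presumably be passed straight to the existing sdev_printk(KERN_INFO, scmd->device, ...) call, before xfer_cnt is truncated so that the log shows the value the firmware actually reported.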
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b5c966e..55332a3 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	struct MPT3SAS_DEVICE *sas_device_priv_data;
 	u32 response_code = 0;
 	unsigned long flags;
+	unsigned int sector_sz;
+	struct request *req;
 
 	mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
 	scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
@@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	}
 
 	xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
+
+	/* In case of bogus fw or device, we could end up having
+	 * unaligned partial completion. We can force alignment here,
+	 * then scsi-ml does not need to handle this misbehavior.
+	 */
+	sector_sz = scmd->device->sector_size;
+	req = scmd->request;
+	if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
+	    (xfer_cnt % sector_sz))) {
+		sdev_printk(KERN_INFO, scmd->device,
+		    "unaligned partial completion avoided\n");
+		xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
+	}
+
 	scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
 	if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
 		log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
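
To make the truncation in the diff concrete, the following userspace sketch in C reproduces the arithmetic outside the kernel. The helper names align_down_xfer() and residual() are hypothetical (the patch open-codes this logic inside _scsih_io_done()), and the sketch leaves out the REQ_TYPE_FS check, which depends on kernel structures.

```c
#include <stdint.h>

/*
 * Round a reported transfer count down to a whole number of sectors,
 * mirroring "xfer_cnt = (xfer_cnt / sector_sz) * sector_sz" in the patch.
 * Hypothetical helper for illustration only.
 */
static uint32_t align_down_xfer(uint32_t xfer_cnt, uint32_t sector_sz)
{
	/* Like the patch's guard: skip a zero sector size or an aligned count. */
	if (sector_sz == 0 || xfer_cnt % sector_sz == 0)
		return xfer_cnt;

	/* Integer division discards the trailing partial sector. */
	return (xfer_cnt / sector_sz) * sector_sz;
}

/*
 * Residual byte count, following the same subtraction the driver passes to
 * scsi_set_resid(): buffer length minus bytes counted as transferred.
 */
static uint32_t residual(uint32_t bufflen, uint32_t xfer_cnt)
{
	return bufflen - xfer_cnt;
}
```

For example, with 512-byte sectors and an 8192-byte request, a reported transfer count of 5000 bytes is truncated to 4608 (nine full sectors), so the residual becomes 3584 bytes, which is exactly seven sectors. The remaining part of the request therefore starts on a sector boundary.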