mpt3sas: Force request partial completion alignment

Message ID 1482929480-16688-1-git-send-email-gpiccoli@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived

Commit Message

Guilherme G. Piccoli Dec. 28, 2016, 12:51 p.m. UTC
From: Ram Pai <linuxram@us.ibm.com>

The firmware or device, possibly under heavy I/O load, can complete a
request partially at an unaligned boundary. Scsi-ml expects such
requests to be completed on an alignment boundary, and it blindly
requeues the I/O without checking the alignment of the remaining
bytes. This leads to errors, since devices cannot perform non-aligned
read/write operations.

This patch fixes the issue in the driver. It aligns unaligned
completions of FS requests by rounding the transfer count down to the
nearest sector boundary.

Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
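
As an illustration of the truncation described in the commit message, here is a minimal user-space sketch of the arithmetic. The helper names `align_xfer_cnt` and `resid_after_align` are hypothetical and do not appear in the patch, which performs the same computation inline in `_scsih_io_done()`; plain `uint32_t` values stand in for the driver's `xfer_cnt`, `sector_sz` and `scsi_bufflen(scmd)`.

```c
#include <stdint.h>

/*
 * Round a transfer count down to a multiple of the sector size.
 * Hypothetical helper for illustration only. A zero sector size leaves
 * the count unchanged, mirroring the "sector_sz &&" guard in the patch.
 */
static uint32_t align_xfer_cnt(uint32_t xfer_cnt, uint32_t sector_sz)
{
	if (sector_sz == 0 || xfer_cnt % sector_sz == 0)
		return xfer_cnt;
	return (xfer_cnt / sector_sz) * sector_sz;
}

/*
 * Residual byte count that would be reported for a request of bufflen
 * bytes once the transfer count has been aligned.
 */
static uint32_t resid_after_align(uint32_t bufflen, uint32_t xfer_cnt,
				  uint32_t sector_sz)
{
	return bufflen - align_xfer_cnt(xfer_cnt, sector_sz);
}
```

With a 512-byte sector size and a reported transfer count of 5000 bytes, the count is truncated to 4608, so an 8192-byte request would report a residual of 3584 bytes and any retry of the remainder starts on a sector boundary.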

Comments

Sreekanth Reddy Jan. 4, 2017, 11:33 a.m. UTC | #1
Hi Guilherme,

Can you please share the driver logs (with the driver logging_level set
to 0x3f8) for the original issue (i.e. without this patch's changes),
along with the HBA firmware version you used?

Also, can you please share the steps you followed to reproduce this
issue, so that I can try it on my setup?

Thanks,
Sreekanth

On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli
<gpiccoli@linux.vnet.ibm.com> wrote:
> From: Ram Pai <linuxram@us.ibm.com>
>
> The firmware or device, possibly under a heavy I/O load, can return
> on a partial unaligned boundary. Scsi-ml expects these requests to be
> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
> without checking the alignment boundary of the I/O request for the
> remaining bytes. This leads to errors, since devices cannot perform
> non-aligned read/write operations.
>
> This patch fixes the issue in the driver. It aligns unaligned
> completions of FS requests, by truncating them to the nearest
> alignment boundary.
>
> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b5c966e..55332a3 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         struct MPT3SAS_DEVICE *sas_device_priv_data;
>         u32 response_code = 0;
>         unsigned long flags;
> +       unsigned int sector_sz;
> +       struct request *req;
>
>         mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>         scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         }
>
>         xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
> +
> +       /* In case of bogus fw or device, we could end up having
> +        * unaligned partial completion. We can force alignment here,
> +        * then scsi-ml does not need to handle this misbehavior.
> +        */
> +       sector_sz = scmd->device->sector_size;
> +       req = scmd->request;
> +       if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
> +                   (xfer_cnt % sector_sz))) {
> +               sdev_printk(KERN_INFO, scmd->device,
> +                           "unaligned partial completion avoided\n");
> +               xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
> +       }
> +
>         scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>         if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>                 log_info =  le32_to_cpu(mpi_reply->IOCLogInfo);
> --
> 2.1.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sathya Prakash Veerichetty Jan. 5, 2017, 4:44 p.m. UTC | #2
Sreekanth,
Let us discuss this further internally before ACKing this patch.

Thanks
Sathya

-----Original Message-----
From: Sreekanth Reddy [mailto:sreekanth.reddy@broadcom.com]
Sent: Wednesday, January 04, 2017 4:33 AM
To: Guilherme G. Piccoli
Cc: linux-scsi@vger.kernel.org; PDL-MPT-FUSIONLINUX; Sathya Prakash; Chaitra
Basappa; Suganath Prabu Subramani; Brian King; mauricfo@linux.vnet.ibm.com;
linuxram@us.ibm.com
Subject: Re: [PATCH] mpt3sas: Force request partial completion alignment

Hi Guilherme,

Can you please share the driver logs (with the driver logging_level set
to 0x3f8) for the original issue (i.e. without this patch's changes),
along with the HBA firmware version you used?

Also, can you please share the steps you followed to reproduce this
issue, so that I can try it on my setup?

Thanks,
Sreekanth

On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli
<gpiccoli@linux.vnet.ibm.com> wrote:
> From: Ram Pai <linuxram@us.ibm.com>
>
> The firmware or device, possibly under a heavy I/O load, can return on
> a partial unaligned boundary. Scsi-ml expects these requests to be
> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
> without checking the alignment boundary of the I/O request for the
> remaining bytes. This leads to errors, since devices cannot perform
> non-aligned read/write operations.
>
> This patch fixes the issue in the driver. It aligns unaligned
> completions of FS requests, by truncating them to the nearest
> alignment boundary.
>
> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b5c966e..55332a3 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         struct MPT3SAS_DEVICE *sas_device_priv_data;
>         u32 response_code = 0;
>         unsigned long flags;
> +       unsigned int sector_sz;
> +       struct request *req;
>
>         mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>         scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         }
>
>         xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
> +
> +       /* In case of bogus fw or device, we could end up having
> +        * unaligned partial completion. We can force alignment here,
> +        * then scsi-ml does not need to handle this misbehavior.
> +        */
> +       sector_sz = scmd->device->sector_size;
> +       req = scmd->request;
> +       if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
> +                   (xfer_cnt % sector_sz))) {
> +               sdev_printk(KERN_INFO, scmd->device,
> +                           "unaligned partial completion avoided\n");
> +               xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
> +       }
> +
>         scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>         if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>                 log_info =  le32_to_cpu(mpi_reply->IOCLogInfo);
> --
> 2.1.0
>
Sreekanth Reddy Jan. 23, 2017, 9:05 a.m. UTC | #3
On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli
<gpiccoli@linux.vnet.ibm.com> wrote:
> From: Ram Pai <linuxram@us.ibm.com>
>
> The firmware or device, possibly under a heavy I/O load, can return
> on a partial unaligned boundary. Scsi-ml expects these requests to be
> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
> without checking the alignment boundary of the I/O request for the
> remaining bytes. This leads to errors, since devices cannot perform
> non-aligned read/write operations.
>
> This patch fixes the issue in the driver. It aligns unaligned
> completions of FS requests, by truncating them to the nearest
> alignment boundary.
>
> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index b5c966e..55332a3 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         struct MPT3SAS_DEVICE *sas_device_priv_data;
>         u32 response_code = 0;
>         unsigned long flags;
> +       unsigned int sector_sz;
> +       struct request *req;
>
>         mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>         scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>         }
>
>         xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
> +
> +       /* In case of bogus fw or device, we could end up having
> +        * unaligned partial completion. We can force alignment here,
> +        * then scsi-ml does not need to handle this misbehavior.
> +        */
> +       sector_sz = scmd->device->sector_size;
> +       req = scmd->request;
> +       if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
> +                   (xfer_cnt % sector_sz))) {
> +               sdev_printk(KERN_INFO, scmd->device,
> +                           "unaligned partial completion avoided\n");

[Sreekanth] Patch looks good, but can we print the xfer_cnt and
sector_sz values along with the above message?

Also, if this is a generic drive issue, can we move this workaround
to the SCSI mid-layer?

> +               xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
> +       }
> +
>         scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>         if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>                 log_info =  le32_to_cpu(mpi_reply->IOCLogInfo);
> --
> 2.1.0
>
Guilherme G. Piccoli Jan. 23, 2017, 1 p.m. UTC | #4
On 01/23/2017 07:05 AM, Sreekanth Reddy wrote:
> On Wed, Dec 28, 2016 at 6:21 PM, Guilherme G. Piccoli
> <gpiccoli@linux.vnet.ibm.com> wrote:
>> From: Ram Pai <linuxram@us.ibm.com>
>>
>> The firmware or device, possibly under a heavy I/O load, can return
>> on a partial unaligned boundary. Scsi-ml expects these requests to be
>> completed on an alignment boundary. Scsi-ml blindly requeues the I/O
>> without checking the alignment boundary of the I/O request for the
>> remaining bytes. This leads to errors, since devices cannot perform
>> non-aligned read/write operations.
>>
>> This patch fixes the issue in the driver. It aligns unaligned
>> completions of FS requests, by truncating them to the nearest
>> alignment boundary.
>>
>> Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
>> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
>> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
>> ---
>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> index b5c966e..55332a3 100644
>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> @@ -4644,6 +4644,8 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>>         struct MPT3SAS_DEVICE *sas_device_priv_data;
>>         u32 response_code = 0;
>>         unsigned long flags;
>> +       unsigned int sector_sz;
>> +       struct request *req;
>>
>>         mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
>>         scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
>> @@ -4703,6 +4705,20 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
>>         }
>>
>>         xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
>> +
>> +       /* In case of bogus fw or device, we could end up having
>> +        * unaligned partial completion. We can force alignment here,
>> +        * then scsi-ml does not need to handle this misbehavior.
>> +        */
>> +       sector_sz = scmd->device->sector_size;
>> +       req = scmd->request;
>> +       if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
>> +                   (xfer_cnt % sector_sz))) {
>> +               sdev_printk(KERN_INFO, scmd->device,
>> +                           "unaligned partial completion avoided\n");
> 
> [Sreekanth] Patch looks good, but can we print the xfer_cnt and
> sector_sz values along with the above message?
> 
> Also, if this is a generic drive issue, can we move this workaround
> to the SCSI mid-layer?
> 

Thank you! I'll send a v2 including your suggestion.
Regarding a fix in scsi-ml, we already tried that:
https://lkml.org/lkml/2016/12/19/591

That patch was not well received; reviewers suggested we patch the
driver instead, so we sent the current change for mpt3sas only.

Thanks,


Guilherme

>> +               xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
>> +       }
>> +
>>         scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
>>         if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
>>                 log_info =  le32_to_cpu(mpi_reply->IOCLogInfo);
>> --
>> 2.1.0
>>
> 

Patch

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b5c966e..55332a3 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4644,6 +4644,8 @@  _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	struct MPT3SAS_DEVICE *sas_device_priv_data;
 	u32 response_code = 0;
 	unsigned long flags;
+	unsigned int sector_sz;
+	struct request *req;
 
 	mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
 	scmd = _scsih_scsi_lookup_get_clear(ioc, smid);
@@ -4703,6 +4705,20 @@  _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
 	}
 
 	xfer_cnt = le32_to_cpu(mpi_reply->TransferCount);
+
+	/* In case of bogus fw or device, we could end up having
+	 * unaligned partial completion. We can force alignment here,
+	 * then scsi-ml does not need to handle this misbehavior.
+	 */
+	sector_sz = scmd->device->sector_size;
+	req = scmd->request;
+	if (unlikely(sector_sz && req && (req->cmd_type == REQ_TYPE_FS) &&
+		    (xfer_cnt % sector_sz))) {
+		sdev_printk(KERN_INFO, scmd->device,
+			    "unaligned partial completion avoided\n");
+		xfer_cnt = (xfer_cnt / sector_sz) * sector_sz;
+	}
+
 	scsi_set_resid(scmd, scsi_bufflen(scmd) - xfer_cnt);
 	if (ioc_status & MPI2_IOCSTATUS_FLAG_LOG_INFO_AVAILABLE)
 		log_info =  le32_to_cpu(mpi_reply->IOCLogInfo);
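
The review in this thread asks for the xfer_cnt and sector_sz values to be printed along with the message. The wording of the v2 message is not given in the thread, so the following is only a user-space sketch of one possible format: `format_unaligned_msg` is a hypothetical helper, and `snprintf` stands in for the kernel's `sdev_printk`.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Build a log line that includes both values requested in review.
 * Hypothetical helper and message format, for illustration only; the
 * driver itself would pass an equivalent format string to sdev_printk().
 * Returns the value returned by snprintf().
 */
static int format_unaligned_msg(char *buf, size_t len,
				uint32_t xfer_cnt, uint32_t sector_sz)
{
	return snprintf(buf, len,
			"unaligned partial completion avoided (xfer_cnt=%u, sector_sz=%u)\n",
			(unsigned int)xfer_cnt, (unsigned int)sector_sz);
}
```

Logging the values before the truncation is applied would show how far the reported count was from a sector boundary, which may help when comparing against firmware logs.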