diff mbox series

[v2] scsi: core: Consult supported VPD page list prior to fetching page

Message ID 20240214221411.2888112-1-martin.petersen@oracle.com (mailing list archive)
State Accepted
Commit b5fc07a5fb56216a49e6c1d0b172d5464d99a89b
Headers show
Series [v2] scsi: core: Consult supported VPD page list prior to fetching page | expand

Commit Message

Martin K. Petersen Feb. 14, 2024, 10:14 p.m. UTC
Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
page") removed the logic which checks whether a VPD page is present on
the supported pages list before asking for the page itself. That was
done because SPC helpfully states "The Supported VPD Pages VPD page
list may or may not include all the VPD pages that are able to be
returned by the device server". Testing had revealed a few devices
that supported some of the 0xBn pages but didn't actually list them in
page 0.

Julian Sikorski bisected a problem with his drive resetting during
discovery to the commit above. As it turns out, this particular drive
firmware will crash if we attempt to fetch page 0xB9.

Various approaches were attempted to work around this. In the end,
reinstating the logic that consults VPD page 0 before fetching any
other page was the path of least resistance. A firmware update for the
devices which originally compelled us to remove the check has since
been released.

Cc: stable@vger.kernel.org
Cc: Bart Van Assche <bvanassche@acm.org>
Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

---

v2: Address Bart's comments.
---
 drivers/scsi/scsi.c        | 22 ++++++++++++++++++++--
 include/scsi/scsi_device.h |  4 ----
 2 files changed, 20 insertions(+), 6 deletions(-)

Comments

Lee Duncan Feb. 14, 2024, 11:14 p.m. UTC | #1
On Wed, Feb 14, 2024 at 2:14 PM Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>
> Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
> page") removed the logic which checks whether a VPD page is present on
> the supported pages list before asking for the page itself. That was
> done because SPC helpfully states "The Supported VPD Pages VPD page
> list may or may not include all the VPD pages that are able to be
> returned by the device server". Testing had revealed a few devices
> that supported some of the 0xBn pages but didn't actually list them in
> page 0.
>
> Julian Sikorski bisected a problem with his drive resetting during
> discovery to the commit above. As it turns out, this particular drive
> firmware will crash if we attempt to fetch page 0xB9.
>
> Various approaches were attempted to work around this. In the end,
> reinstating the logic that consults VPD page 0 before fetching any
> other page was the path of least resistance. A firmware update for the
> devices which originally compelled us to remove the check has since
> been released.
>
> Cc: stable@vger.kernel.org
> Cc: Bart Van Assche <bvanassche@acm.org>
> Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Tested-by: Julian Sikorski <belegdol@gmail.com>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
>
> ---
>
> v2: Address Bart's comments.
> ---
>  drivers/scsi/scsi.c        | 22 ++++++++++++++++++++--
>  include/scsi/scsi_device.h |  4 ----
>  2 files changed, 20 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 76d369343c7a..8cad9792a562 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -328,21 +328,39 @@ static int scsi_vpd_inquiry(struct scsi_device *sdev, unsigned char *buffer,
>         return result + 4;
>  }
>
> +enum scsi_vpd_parameters {
> +       SCSI_VPD_HEADER_SIZE = 4,
> +       SCSI_VPD_LIST_SIZE = 36,
> +};
> +
>  static int scsi_get_vpd_size(struct scsi_device *sdev, u8 page)
>  {
> -       unsigned char vpd_header[SCSI_VPD_HEADER_SIZE] __aligned(4);
> +       unsigned char vpd[SCSI_VPD_LIST_SIZE] __aligned(4);
>         int result;
>
>         if (sdev->no_vpd_size)
>                 return SCSI_DEFAULT_VPD_LEN;
>
> +       /*
> +        * Fetch the supported pages VPD and validate that the requested page
> +        * number is present.
> +        */
> +       if (page != 0) {
> +               result = scsi_vpd_inquiry(sdev, vpd, 0, sizeof(vpd));
> +               if (result < SCSI_VPD_HEADER_SIZE)
> +                       return 0;
> +
> +               result -= SCSI_VPD_HEADER_SIZE;
> +               if (!memchr(&vpd[SCSI_VPD_HEADER_SIZE], page, result))
> +                       return 0;
> +       }
>         /*
>          * Fetch the VPD page header to find out how big the page
>          * is. This is done to prevent problems on legacy devices
>          * which can not handle allocation lengths as large as
>          * potentially requested by the caller.
>          */
> -       result = scsi_vpd_inquiry(sdev, vpd_header, page, sizeof(vpd_header));
> +       result = scsi_vpd_inquiry(sdev, vpd, page, SCSI_VPD_HEADER_SIZE);
>         if (result < 0)
>                 return 0;
>
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index cb019c80763b..72a6b3923fc7 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -100,10 +100,6 @@ struct scsi_vpd {
>         unsigned char   data[];
>  };
>
> -enum scsi_vpd_parameters {
> -       SCSI_VPD_HEADER_SIZE = 4,
> -};
> -
>  struct scsi_device {
>         struct Scsi_Host *host;
>         struct request_queue *request_queue;
> --
> 2.42.1
>
>

Reviewed-by: Lee Duncan <lee.duncan@suse.com>
Bart Van Assche Feb. 15, 2024, 6:28 p.m. UTC | #2
On 2/14/24 14:14, Martin K. Petersen wrote:
> Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
> page") removed the logic which checks whether a VPD page is present on
> the supported pages list before asking for the page itself. That was
> done because SPC helpfully states "The Supported VPD Pages VPD page
> list may or may not include all the VPD pages that are able to be
> returned by the device server". Testing had revealed a few devices
> that supported some of the 0xBn pages but didn't actually list them in
> page 0.
> 
> Julian Sikorski bisected a problem with his drive resetting during
> discovery to the commit above. As it turns out, this particular drive
> firmware will crash if we attempt to fetch page 0xB9.
> 
> Various approaches were attempted to work around this. In the end,
> reinstating the logic that consults VPD page 0 before fetching any
> other page was the path of least resistance. A firmware update for the
> devices which originally compelled us to remove the check has since
> been released.
> 
> Cc: stable@vger.kernel.org
> Cc: Bart Van Assche <bvanassche@acm.org>
> Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
> Reported-by: Julian Sikorski <belegdol@gmail.com>
> Tested-by: Julian Sikorski <belegdol@gmail.com>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

BTW, here is another report related to this patch:
https://lore.kernel.org/linux-scsi/64phxapjp742qob7gr74o2tnnkaic6wmxgfa3uxn33ukrwumbi@cfd6kmix3bbm/

Vitaly, can you help with testing this patch? See also:
https://lore.kernel.org/linux-scsi/20240214221411.2888112-1-martin.petersen@oracle.com/
Vitaly Chikunov Feb. 16, 2024, 2:57 p.m. UTC | #3
Bart,

On Thu, Feb 15, 2024 at 10:28:06AM -0800, Bart Van Assche wrote:
> On 2/14/24 14:14, Martin K. Petersen wrote:
> > Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
> > page") removed the logic which checks whether a VPD page is present on
> > the supported pages list before asking for the page itself. That was
> > done because SPC helpfully states "The Supported VPD Pages VPD page
> > list may or may not include all the VPD pages that are able to be
> > returned by the device server". Testing had revealed a few devices
> > that supported some of the 0xBn pages but didn't actually list them in
> > page 0.
> > 
> > Julian Sikorski bisected a problem with his drive resetting during
> > discovery to the commit above. As it turns out, this particular drive
> > firmware will crash if we attempt to fetch page 0xB9.
> > 
> > Various approaches were attempted to work around this. In the end,
> > reinstating the logic that consults VPD page 0 before fetching any
> > other page was the path of least resistance. A firmware update for the
> > devices which originally compelled us to remove the check has since
> > been released.
> > 
> > Cc: stable@vger.kernel.org
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
> > Reported-by: Julian Sikorski <belegdol@gmail.com>
> > Tested-by: Julian Sikorski <belegdol@gmail.com>
> > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> 
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> 
> BTW, here is another report related to this patch:
> https://lore.kernel.org/linux-scsi/64phxapjp742qob7gr74o2tnnkaic6wmxgfa3uxn33ukrwumbi@cfd6kmix3bbm/
> 
> Vitaly, can you help with testing this patch? See also:
> https://lore.kernel.org/linux-scsi/20240214221411.2888112-1-martin.petersen@oracle.com/

With this patch applied over 6.6.16 problem still persists.

Also reading page 0x89 (mentioned in patch description) does not show any signs
of crash (any errors in dmesg with it of with the following reads):

  [root@host-226 ~]# sg_vpd -p 0x89 /dev/sdc
  ATA information VPD page:
    SAT Vendor identification: LSI     
    SAT Product identification: LSI SATL        
    SAT Product revision level: 0008
    Device signature indicates SATA transport
    Command code: 0xec
    ATA command IDENTIFY DEVICE response summary:
      model: WDC WD2005FBYZ-01YCBB2                  
      serial number:      WD-WMC6N0J5VKLH
      firmware revision: RR07    

Thanks,
Martin K. Petersen Feb. 16, 2024, 3:08 p.m. UTC | #4
Vitaly,

> With this patch applied over 6.6.16 problem still persists.

I figured. Couldn't understand why any of these changes would lead to
the symptoms you are experiencing.

I think my original patch just exposed a timing issue. Since your device
has a SATL, we should never be sending a WRITE SAME command in the first
place.

I have another patch in the pipeline which, in combination with
Christoph's atomic queue limit update series, should close this gap.

I'll reach out when it's ready for testing.

Thanks for your help!
diff mbox series

Patch

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 76d369343c7a..8cad9792a562 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -328,21 +328,39 @@  static int scsi_vpd_inquiry(struct scsi_device *sdev, unsigned char *buffer,
 	return result + 4;
 }
 
+enum scsi_vpd_parameters {
+	SCSI_VPD_HEADER_SIZE = 4,
+	SCSI_VPD_LIST_SIZE = 36,
+};
+
 static int scsi_get_vpd_size(struct scsi_device *sdev, u8 page)
 {
-	unsigned char vpd_header[SCSI_VPD_HEADER_SIZE] __aligned(4);
+	unsigned char vpd[SCSI_VPD_LIST_SIZE] __aligned(4);
 	int result;
 
 	if (sdev->no_vpd_size)
 		return SCSI_DEFAULT_VPD_LEN;
 
+	/*
+	 * Fetch the supported pages VPD and validate that the requested page
+	 * number is present.
+	 */
+	if (page != 0) {
+		result = scsi_vpd_inquiry(sdev, vpd, 0, sizeof(vpd));
+		if (result < SCSI_VPD_HEADER_SIZE)
+			return 0;
+
+		result -= SCSI_VPD_HEADER_SIZE;
+		if (!memchr(&vpd[SCSI_VPD_HEADER_SIZE], page, result))
+			return 0;
+	}
 	/*
 	 * Fetch the VPD page header to find out how big the page
 	 * is. This is done to prevent problems on legacy devices
 	 * which can not handle allocation lengths as large as
 	 * potentially requested by the caller.
 	 */
-	result = scsi_vpd_inquiry(sdev, vpd_header, page, sizeof(vpd_header));
+	result = scsi_vpd_inquiry(sdev, vpd, page, SCSI_VPD_HEADER_SIZE);
 	if (result < 0)
 		return 0;
 
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index cb019c80763b..72a6b3923fc7 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -100,10 +100,6 @@  struct scsi_vpd {
 	unsigned char	data[];
 };
 
-enum scsi_vpd_parameters {
-	SCSI_VPD_HEADER_SIZE = 4,
-};
-
 struct scsi_device {
 	struct Scsi_Host *host;
 	struct request_queue *request_queue;