diff mbox series

[1/1] PCI/VPD: Fix blocking of VPD data in lspci for QLogic 1077:2261

Message ID 20210303224250.12618-2-aeasi@marvell.com (mailing list archive)
State Superseded
Delegated to: Bjorn Helgaas
Headers show
Series PCI/VPD: Fix blocking of VPD data in lspci for QLogic 1077:2261 | expand

Commit Message

Arun Easi March 3, 2021, 10:42 p.m. UTC
"lspci -vvv" for Qlogic Fibre Channel HBA 1077:2261 displays
"Vital Product Data" as "Not readable" today and thus preventing
customers from getting relevant HBA information. Fix it by removing
the blacklist quirk.

The VPD quirk was added by [0] to avoid a system NMI; this issue has
been long fixed in the HBA firmware. In addition, PCI also has changes
to check the VPD size [1], so this quirk can be reverted now regardless
of a firmware update.

Some more details can be found in the following thread:
    "VPD blacklist of Marvell QLogic 1077/2261" [2].

[0] 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722")
[1] 104daa71b396 ("PCI: Determine actual VPD size on first access")
[2] https://lore.kernel.org/linux-pci/alpine.LRH.2.21.9999.2012161641230.28924@irv1user01.caveonetworks.com/

Signed-off-by: Arun Easi <aeasi@marvell.com>
CC: stable@vger.kernel.org      # v4.6+
---
 drivers/pci/vpd.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Krzysztof Wilczyński March 6, 2021, 11:57 p.m. UTC | #1
Hi Arun,

> "lspci -vvv" for Qlogic Fibre Channel HBA 1077:2261 displays
> "Vital Product Data" as "Not readable" today and thus preventing
> customers from getting relevant HBA information. Fix it by removing
> the blacklist quirk.
> 
> The VPD quirk was added by [0] to avoid a system NMI; this issue has
> been long fixed in the HBA firmware. In addition, PCI also has changes
> to check the VPD size [1], so this quirk can be reverted now regardless
> of a firmware update.
> 
> Some more details can be found in the following thread:
>     "VPD blacklist of Marvell QLogic 1077/2261" [2].
> 
> [0] 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722")
> [1] 104daa71b396 ("PCI: Determine actual VPD size on first access")
> [2] https://lore.kernel.org/linux-pci/alpine.LRH.2.21.9999.2012161641230.28924@irv1user01.caveonetworks.com/
[...]

Looks good!  Assuming that this won't break QLogic ISP2722 with older
firmware, like you say, it's a nice fix.

Reviewed-by: Krzysztof Wilczyński <kw@linux.com>

Krzysztof
Bjorn Helgaas April 7, 2021, 10:13 p.m. UTC | #2
On Wed, Mar 03, 2021 at 02:42:50PM -0800, Arun Easi wrote:
> "lspci -vvv" for Qlogic Fibre Channel HBA 1077:2261 displays
> "Vital Product Data" as "Not readable" today and thus preventing
> customers from getting relevant HBA information. Fix it by removing
> the blacklist quirk.
> 
> The VPD quirk was added by [0] to avoid a system NMI; this issue has
> been long fixed in the HBA firmware. In addition, PCI also has changes
> to check the VPD size [1], so this quirk can be reverted now regardless
> of a firmware update.

This is not a very convincing argument yet since 104daa71b396 ("PCI:
Determine actual VPD size on first access") appeared in v4.6 and
0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722") appeared
in v4.11.

If 104daa71b396 really fixed the problem, why did we need
0d5370d1d852?

> Some more details can be found in the following thread:
>     "VPD blacklist of Marvell QLogic 1077/2261" [2].
> 
> [0] 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722")
> [1] 104daa71b396 ("PCI: Determine actual VPD size on first access")
> [2] https://lore.kernel.org/linux-pci/alpine.LRH.2.21.9999.2012161641230.28924@irv1user01.caveonetworks.com/
> 
> Signed-off-by: Arun Easi <aeasi@marvell.com>
> CC: stable@vger.kernel.org      # v4.6+
> ---
>  drivers/pci/vpd.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> index 7915d10..bd54907 100644
> --- a/drivers/pci/vpd.c
> +++ b/drivers/pci/vpd.c
> @@ -570,7 +570,6 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d, quirk_blacklist_vpd);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f, quirk_blacklist_vpd);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, PCI_ANY_ID,
>  		quirk_blacklist_vpd);
> -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_QLOGIC, 0x2261, quirk_blacklist_vpd);
>  /*
>   * The Amazon Annapurna Labs 0x0031 device id is reused for other non Root Port
>   * device types, so the quirk is registered for the PCI_CLASS_BRIDGE_PCI class.
> -- 
> 2.9.5
>
Arun Easi April 7, 2021, 10:57 p.m. UTC | #3
On Wed, 7 Apr 2021, 3:13pm, Bjorn Helgaas wrote:

> On Wed, Mar 03, 2021 at 02:42:50PM -0800, Arun Easi wrote:
> > "lspci -vvv" for Qlogic Fibre Channel HBA 1077:2261 displays
> > "Vital Product Data" as "Not readable" today and thus preventing
> > customers from getting relevant HBA information. Fix it by removing
> > the blacklist quirk.
> > 
> > The VPD quirk was added by [0] to avoid a system NMI; this issue has
> > been long fixed in the HBA firmware. In addition, PCI also has changes
> > to check the VPD size [1], so this quirk can be reverted now regardless
> > of a firmware update.
> 
> This is not a very convincing argument yet since 104daa71b396 ("PCI:
> Determine actual VPD size on first access") appeared in v4.6 and
> 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722") appeared
> in v4.11.
> 
> If 104daa71b396 really fixed the problem, why did we need
> 0d5370d1d852?

True, 0d5370d1d852 was not really neeeded for 104daa71b396 and newer 
kernels; my theory is that when Ethan Z. ran the tests, he was using an 
older (older than 104daa71b396) kernel, but by the time the blacklisting 
was put in place, the kernel already had the fix that made the 
blacklisting unnecessary.

More of my investigation details explained here:
	https://lore.kernel.org/linux-pci/alpine.LRH.2.21.9999.2012161641230.28924@irv1user01.caveonetworks.com/

A quick summary of which is that, when Ethan reported the crash stack, it 
had pci_vpd_pci22* calls which is seen only in older kernels. Though 
104daa71b396 too had those calls, it was very close to the commit that 
renamed those calls (f1cd93f9aabe) -- and I theorized Ethan probably was 
not running a kernel between 104daa71b396 and f1cd93f9aabe (only 3 
commits (drivers/pci/) away).

Unfortunately, Ethan's blacklisting patch did not have the kernel commit 
SHA he had used.

Regards,
-Arun

> 
> > Some more details can be found in the following thread:
> >     "VPD blacklist of Marvell QLogic 1077/2261" [2].
> > 
> > [0] 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722")
> > [1] 104daa71b396 ("PCI: Determine actual VPD size on first access")
> > [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dpci_alpine.LRH.2.21.9999.2012161641230.28924-40irv1user01.caveonetworks.com_&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=P-q_Qkt75qFy33SvdD2nAxAyN87eO1d-mFO-lqNOomw&m=lPsRSUHeHa9BMJpm_qohlCkGzpRmhdPSNythG7ljHAU&s=Orsa4H3WN7tR8DOfCIof1toWvMUQZ2Mq0HzHPIFzEhQ&e= 
> > 
> > Signed-off-by: Arun Easi <aeasi@marvell.com>
> > CC: stable@vger.kernel.org      # v4.6+
> > ---
> >  drivers/pci/vpd.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
> > index 7915d10..bd54907 100644
> > --- a/drivers/pci/vpd.c
> > +++ b/drivers/pci/vpd.c
> > @@ -570,7 +570,6 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d, quirk_blacklist_vpd);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f, quirk_blacklist_vpd);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, PCI_ANY_ID,
> >  		quirk_blacklist_vpd);
> > -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_QLOGIC, 0x2261, quirk_blacklist_vpd);
> >  /*
> >   * The Amazon Annapurna Labs 0x0031 device id is reused for other non Root Port
> >   * device types, so the quirk is registered for the PCI_CLASS_BRIDGE_PCI class.
> > -- 
> > 2.9.5
> > 
>
Bjorn Helgaas April 8, 2021, 5:27 p.m. UTC | #4
On Wed, Apr 07, 2021 at 03:57:32PM -0700, Arun Easi wrote:
> On Wed, 7 Apr 2021, 3:13pm, Bjorn Helgaas wrote:
> 
> > On Wed, Mar 03, 2021 at 02:42:50PM -0800, Arun Easi wrote:
> > > "lspci -vvv" for Qlogic Fibre Channel HBA 1077:2261 displays
> > > "Vital Product Data" as "Not readable" today and thus preventing
> > > customers from getting relevant HBA information. Fix it by removing
> > > the blacklist quirk.
> > > 
> > > The VPD quirk was added by [0] to avoid a system NMI; this issue has
> > > been long fixed in the HBA firmware. In addition, PCI also has changes
> > > to check the VPD size [1], so this quirk can be reverted now regardless
> > > of a firmware update.
> > 
> > This is not a very convincing argument yet since 104daa71b396 ("PCI:
> > Determine actual VPD size on first access") appeared in v4.6 and
> > 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722") appeared
> > in v4.11.
> > 
> > If 104daa71b396 really fixed the problem, why did we need
> > 0d5370d1d852?
> 
> True, 0d5370d1d852 was not really neeeded for 104daa71b396 and newer 
> kernels; my theory is that when Ethan Z. ran the tests, he was using an 
> older (older than 104daa71b396) kernel, but by the time the blacklisting 
> was put in place, the kernel already had the fix that made the 
> blacklisting unnecessary.
> 
> More of my investigation details explained here:
> 	https://lore.kernel.org/linux-pci/alpine.LRH.2.21.9999.2012161641230.28924@irv1user01.caveonetworks.com/
> 
> A quick summary of which is that, when Ethan reported the crash stack, it 
> had pci_vpd_pci22* calls which is seen only in older kernels. Though 
> 104daa71b396 too had those calls, it was very close to the commit that 
> renamed those calls (f1cd93f9aabe) -- and I theorized Ethan probably was 
> not running a kernel between 104daa71b396 and f1cd93f9aabe (only 3 
> commits (drivers/pci/) away).

We should put the outline of this theory in the commit log for the
benefit of future readers who have the same question I did.

Bjorn
Arun Easi April 9, 2021, 7:58 p.m. UTC | #5
On Thu, 8 Apr 2021, 10:27am, Bjorn Helgaas wrote:

> External Email
> 
> ----------------------------------------------------------------------
> On Wed, Apr 07, 2021 at 03:57:32PM -0700, Arun Easi wrote:
> > On Wed, 7 Apr 2021, 3:13pm, Bjorn Helgaas wrote:
> > 
> > > On Wed, Mar 03, 2021 at 02:42:50PM -0800, Arun Easi wrote:
> > > > "lspci -vvv" for Qlogic Fibre Channel HBA 1077:2261 displays
> > > > "Vital Product Data" as "Not readable" today and thus preventing
> > > > customers from getting relevant HBA information. Fix it by removing
> > > > the blacklist quirk.
> > > > 
> > > > The VPD quirk was added by [0] to avoid a system NMI; this issue has
> > > > been long fixed in the HBA firmware. In addition, PCI also has changes
> > > > to check the VPD size [1], so this quirk can be reverted now regardless
> > > > of a firmware update.
> > > 
> > > This is not a very convincing argument yet since 104daa71b396 ("PCI:
> > > Determine actual VPD size on first access") appeared in v4.6 and
> > > 0d5370d1d852 ("PCI: Prevent VPD access for QLogic ISP2722") appeared
> > > in v4.11.
> > > 
> > > If 104daa71b396 really fixed the problem, why did we need
> > > 0d5370d1d852?
> > 
> > True, 0d5370d1d852 was not really neeeded for 104daa71b396 and newer 
> > kernels; my theory is that when Ethan Z. ran the tests, he was using an 
> > older (older than 104daa71b396) kernel, but by the time the blacklisting 
> > was put in place, the kernel already had the fix that made the 
> > blacklisting unnecessary.
> > 
> > More of my investigation details explained here:
> > 	https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dpci_alpine.LRH.2.21.9999.2012161641230.28924-40irv1user01.caveonetworks.com_&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=P-q_Qkt75qFy33SvdD2nAxAyN87eO1d-mFO-lqNOomw&m=6AxpinjPP1ODX7dE-syNBDgke94moUP_K_jm_ZTu6pI&s=YxWnXn7ZfcD1PJlUkl82l8TK5sNX7uBwkCJxKgpAgEM&e= 
> > 
> > A quick summary of which is that, when Ethan reported the crash stack, it 
> > had pci_vpd_pci22* calls which is seen only in older kernels. Though 
> > 104daa71b396 too had those calls, it was very close to the commit that 
> > renamed those calls (f1cd93f9aabe) -- and I theorized Ethan probably was 
> > not running a kernel between 104daa71b396 and f1cd93f9aabe (only 3 
> > commits (drivers/pci/) away).
> 
> We should put the outline of this theory in the commit log for the
> benefit of future readers who have the same question I did.
> 

Sure, will post a V2. BTW, the details were mentioned as a link "[2]" in 
the commit, I will pull the contents to the commit message.

Thanks Bjorn.

Regards,
-Arun
diff mbox series

Patch

diff --git a/drivers/pci/vpd.c b/drivers/pci/vpd.c
index 7915d10..bd54907 100644
--- a/drivers/pci/vpd.c
+++ b/drivers/pci/vpd.c
@@ -570,7 +570,6 @@  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005d, quirk_blacklist_vpd);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LSI_LOGIC, 0x005f, quirk_blacklist_vpd);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, PCI_ANY_ID,
 		quirk_blacklist_vpd);
-DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_QLOGIC, 0x2261, quirk_blacklist_vpd);
 /*
  * The Amazon Annapurna Labs 0x0031 device id is reused for other non Root Port
  * device types, so the quirk is registered for the PCI_CLASS_BRIDGE_PCI class.