Message ID | 20250212193516.88741-1-helgaas@kernel.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 81f64e925c29fe6e99f04b131fac1935ac931e81 |
Series | PCI: Avoid FLR for Mediatek MT7922 WiFi |

On Wed, Feb 12, 2025 at 01:35:16PM -0600, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> The Mediatek MT7922 WiFi device advertises FLR support, but it apparently
> does not work, and all subsequent config reads return ~0:
>
>   pci 0000:01:00.0: [14c3:0616] type 00 class 0x028000 PCIe Endpoint
>   pciback 0000:01:00.0: not ready 65535ms after FLR; giving up
>
> After an FLR, pci_dev_wait() waits for the device to become ready. Prior
> to d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS"),
> it polls PCI_COMMAND until it is something other than PCI_POSSIBLE_ERROR
> (~0). If it times out, pci_dev_wait() returns -ENOTTY and
> __pci_reset_function_locked() tries the next available reset method.
> Typically this is Secondary Bus Reset, which does work, so the MT7922 is
> eventually usable.
>
> After d591f6804e7e, if Configuration Request Retry Status Software
> Visibility (RRS SV) is enabled, pci_dev_wait() polls PCI_VENDOR_ID until it
> is something other than the special 0x0001 Vendor ID that indicates a
> completion with RRS status.
>
> When RRS SV is enabled, reads of PCI_VENDOR_ID should return either 0x0001,
> i.e., the config read was completed with RRS, or a valid Vendor ID. On the
> MT7922, it seems that all config reads after FLR return ~0 indefinitely.
> When pci_dev_wait() reads PCI_VENDOR_ID and gets 0xffff, it assumes that's
> a valid Vendor ID and the device is now ready, so it returns with success.
>
> After pci_dev_wait() returns success, we restore config space and continue.
> Since the MT7922 is not actually ready after the FLR, the restore fails and
> the device is unusable.
>
> We considered changing pci_dev_wait() to continue polling if a
> PCI_VENDOR_ID read returns either 0x0001 or 0xffff. This "works" as it did
> before d591f6804e7e, although we have to wait for the timeout and then fall
> back to SBR. But it doesn't work for SR-IOV VFs, which *always* return
> 0xffff as the Vendor ID.
>
> Mark Mediatek MT7922 WiFi devices to avoid the use of FLR completely. This
> will cause fallback to another reset method, such as SBR.
>
> Fixes: d591f6804e7e ("PCI: Wait for device readiness with Configuration RRS")
> Link: https://github.com/QubesOS/qubes-issues/issues/9689#issuecomment-2582927149
> Link: https://lore.kernel.org/r/Z4pHll_6GX7OUBzQ@mail-itl
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

It works, thanks!

Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

> ---
>  drivers/pci/quirks.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b84ff7bade82..82b21e34c545 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5522,7 +5522,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
>   * AMD Matisse USB 3.0 Host Controller 0x149c
>   * Intel 82579LM Gigabit Ethernet Controller 0x1502
>   * Intel 82579V Gigabit Ethernet Controller 0x1503
> - *
> + * Mediatek MT7922 802.11ax PCI Express Wireless Network Adapter
>   */
>  static void quirk_no_flr(struct pci_dev *dev)
>  {
> @@ -5534,6 +5534,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr);
>
>  /* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */
>  static void quirk_no_flr_snet(struct pci_dev *dev)
> --
> 2.34.1

On Wed, Feb 12, 2025 at 01:35:16PM -0600, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> The Mediatek MT7922 WiFi device advertises FLR support, but it apparently
> does not work, and all subsequent config reads return ~0:
> [...]
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Applied with Marek's tested-by to pci/for-linus for v6.14. I also added a
cc: stable tag.
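
The readiness checks described in the commit message can be modelled in a few lines of userspace C. This is only a sketch of the reasoning, not the kernel's pci_dev_wait(): the names wait_policy, looks_ready() and wait_for_device() are hypothetical, and the "reads" are just an array of values a device might return on successive polls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RRS_VENDOR_ID 0x0001u      /* Vendor ID that signals a completion with RRS */
#define ALL_ONES_16   0xffffu      /* failed config read, seen as a 16-bit value */
#define ALL_ONES_32   0xffffffffu  /* failed config read, seen as a 32-bit value */

/* Hypothetical labels for the three behaviours discussed in the commit message. */
enum wait_policy {
    POLL_COMMAND,          /* before d591f6804e7e: wait until the read is not ~0 */
    POLL_VENDOR_ID,        /* after d591f6804e7e with RRS SV: wait until Vendor ID != 0x0001 */
    POLL_VENDOR_ID_STRICT  /* rejected idea: also keep waiting on Vendor ID 0xffff */
};

/* Decide whether one config read value means "device is ready" under a policy. */
static bool looks_ready(enum wait_policy policy, uint32_t value)
{
    uint16_t vendor = (uint16_t)(value & 0xffffu);

    switch (policy) {
    case POLL_COMMAND:
        return value != ALL_ONES_32;
    case POLL_VENDOR_ID:
        return vendor != RRS_VENDOR_ID;
    case POLL_VENDOR_ID_STRICT:
        return vendor != RRS_VENDOR_ID && vendor != ALL_ONES_16;
    }
    return false;
}

/*
 * Walk a list of simulated read results. Returns the index of the first poll
 * that looks ready, or -1 if every poll failed (the "timeout" case, after
 * which the caller would try the next reset method).
 */
static int wait_for_device(enum wait_policy policy, const uint32_t *reads, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        if (looks_ready(policy, reads[i]))
            return i;
    }
    return -1;
}
```

With a device that returns all ones forever, POLL_COMMAND and POLL_VENDOR_ID_STRICT both time out, while POLL_VENDOR_ID reports success on the very first poll, which is the false positive described above. A function that always reads 0xffff as its Vendor ID, as the message says SR-IOV VFs do, would never look ready under POLL_VENDOR_ID_STRICT, which is why that alternative was not taken.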

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index b84ff7bade82..82b21e34c545 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5522,7 +5522,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
  * AMD Matisse USB 3.0 Host Controller 0x149c
  * Intel 82579LM Gigabit Ethernet Controller 0x1502
  * Intel 82579V Gigabit Ethernet Controller 0x1503
- *
+ * Mediatek MT7922 802.11ax PCI Express Wireless Network Adapter
  */
 static void quirk_no_flr(struct pci_dev *dev)
 {
@@ -5534,6 +5534,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x149c, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr);
 
 /* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */
 static void quirk_no_flr_snet(struct pci_dev *dev)
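
For readers unfamiliar with the fixup mechanism, the effect of the added table entry can be approximated with a small standalone model. Everything prefixed model_ or MODEL_ is hypothetical; the body of quirk_no_flr() is not shown in the diff, so the flag below is a stand-in for whatever the real quirk sets, and the two-way choice between FLR and bus reset is a simplification of the kernel's list of reset methods. The vendor and device IDs come from the diff and from the `[14c3:0616]` log line (0x1022 for AMD and 0x8086 for Intel are the usual PCI vendor IDs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_FLAG_NO_FLR (1u << 0)  /* stand-in for the flag the real quirk sets */

enum model_reset { MODEL_RESET_FLR, MODEL_RESET_BUS };

struct model_dev {
    uint16_t vendor;
    uint16_t device;
    unsigned int flags;
    bool advertises_flr;   /* what the device claims in its capabilities */
};

struct model_fixup {
    uint16_t vendor;
    uint16_t device;
    void (*hook)(struct model_dev *dev);
};

/* Mark the device so that FLR is never attempted. */
static void model_quirk_no_flr(struct model_dev *dev)
{
    dev->flags |= MODEL_FLAG_NO_FLR;
}

/* Mirrors the DECLARE_PCI_FIXUP_EARLY() entries visible in the diff. */
static const struct model_fixup model_fixups[] = {
    { 0x1022, 0x149c, model_quirk_no_flr },  /* AMD */
    { 0x1022, 0x7901, model_quirk_no_flr },  /* AMD */
    { 0x8086, 0x1502, model_quirk_no_flr },  /* Intel */
    { 0x8086, 0x1503, model_quirk_no_flr },  /* Intel */
    { 0x14c3, 0x0616, model_quirk_no_flr },  /* Mediatek MT7922, the new entry */
};

/* Run every fixup whose vendor/device pair matches the device. */
static void model_apply_fixups(struct model_dev *dev)
{
    assert(dev != NULL);
    for (size_t i = 0; i < sizeof(model_fixups) / sizeof(model_fixups[0]); i++) {
        const struct model_fixup *f = &model_fixups[i];

        if (f->vendor == dev->vendor && f->device == dev->device)
            f->hook(dev);
    }
}

/* Use FLR only if it is advertised and not vetoed by a quirk; else bus reset. */
static enum model_reset model_pick_reset(const struct model_dev *dev)
{
    if (dev->advertises_flr && !(dev->flags & MODEL_FLAG_NO_FLR))
        return MODEL_RESET_FLR;
    return MODEL_RESET_BUS;
}
```

In this model an MT7922 that advertises FLR ends up with MODEL_RESET_BUS once the fixups have run, while an unlisted device that advertises FLR still gets MODEL_RESET_FLR. That corresponds to the "fallback to another reset method, such as SBR" described in the commit message.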