Message ID | d53b6377-ff2a-3bba-612f-d052ffa81d09@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | PCI: Disable parity checking if broken_parity is set |
On Tue, Jan 05, 2021 at 10:42:31AM +0100, Heiner Kallweit wrote:
> Simplify the quirk by using new PCI core function
> pci_quirk_broken_parity(). In addition make the quirk
> more specific, use device id 0x8169 instead of PCI_ANY_ID.
>
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> ---
>  arch/arm/mach-iop32x/n2100.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm/mach-iop32x/n2100.c b/arch/arm/mach-iop32x/n2100.c
> index 78b9a5ee4..24c3eec46 100644
> --- a/arch/arm/mach-iop32x/n2100.c
> +++ b/arch/arm/mach-iop32x/n2100.c
> @@ -122,12 +122,10 @@ static struct hw_pci n2100_pci __initdata = {
>   */
>  static void n2100_fixup_r8169(struct pci_dev *dev)
>  {
> -	if (dev->bus->number == 0 &&
> -	    (dev->devfn == PCI_DEVFN(1, 0) ||
> -	     dev->devfn == PCI_DEVFN(2, 0)))
> -		dev->broken_parity_status = 1;
> +	if (machine_is_n2100())
> +		pci_quirk_broken_parity(dev);

Whatever "machine_is_n2100()" is (I can't find the definition), it is
surely not equivalent to "00:01.0 || 00:02.0". That change probably
should be a separate patch with some explanation.

If this makes the quirk safe to use in a generic kernel, that sounds
like a good thing.

I guess a parity problem could be the result of a defect in either the
device (e.g., 0x8169), which would be an issue in *all* platforms, or
a platform-specific issue in the way it's wired up. I assume it's the
latter because the quirk is not in drivers/pci/quirks.c.

Why is it safe to restrict this to device ID 0x8169? If this is
platform issue, it might affect any device in the slot.

>  }
> -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, PCI_ANY_ID, n2100_fixup_r8169);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, 0x8169, n2100_fixup_r8169);
>
>  static int __init n2100_pci_init(void)
>  {
> --
> 2.30.0
>
On 06.01.2021 01:28, Bjorn Helgaas wrote:
> On Tue, Jan 05, 2021 at 10:42:31AM +0100, Heiner Kallweit wrote:
>> Simplify the quirk by using new PCI core function
>> pci_quirk_broken_parity(). In addition make the quirk
>> more specific, use device id 0x8169 instead of PCI_ANY_ID.
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
>> ---
>>  arch/arm/mach-iop32x/n2100.c | 8 +++-----
>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/mach-iop32x/n2100.c b/arch/arm/mach-iop32x/n2100.c
>> index 78b9a5ee4..24c3eec46 100644
>> --- a/arch/arm/mach-iop32x/n2100.c
>> +++ b/arch/arm/mach-iop32x/n2100.c
>> @@ -122,12 +122,10 @@ static struct hw_pci n2100_pci __initdata = {
>>   */
>>  static void n2100_fixup_r8169(struct pci_dev *dev)
>>  {
>> -	if (dev->bus->number == 0 &&
>> -	    (dev->devfn == PCI_DEVFN(1, 0) ||
>> -	     dev->devfn == PCI_DEVFN(2, 0)))
>> -		dev->broken_parity_status = 1;
>> +	if (machine_is_n2100())
>> +		pci_quirk_broken_parity(dev);
>
> Whatever "machine_is_n2100()" is (I can't find the definition), it is
> surely not equivalent to "00:01.0 || 00:02.0". That change probably
> should be a separate patch with some explanation.
>
The machine_is_xxx() checks are dynamically created, see
arch/arm/tools/gen-mach-types.

Slots 1 and 2 are the two network cards, both are Realtek RTL8169.
The quirk (after this patch) applies for Realtek RTL8169 devices only,
therefore we don't need the slot checks any longer.
Actually the slot checks haven't been needed even before, because only
in slots 1 and 2 are Realtek devices.

The machine type check is there to protect from (theoretical) cases
where the n2100 code (incl. the RTL8169 quirk) may be compiled in,
but the kernel is used on another machine.

> If this makes the quirk safe to use in a generic kernel, that sounds
> like a good thing.
>
> I guess a parity problem could be the result of a defect in either the
> device (e.g., 0x8169), which would be an issue in *all* platforms, or
> a platform-specific issue in the way it's wired up. I assume it's the
> latter because the quirk is not in drivers/pci/quirks.c.
>
I haven't seen any other report about RTL8169 parity problems.
Therefore I also think it's platform-specific.

> Why is it safe to restrict this to device ID 0x8169? If this is
> platform issue, it might affect any device in the slot.
>
So far the quirk was applied for all Realtek devices. The parity problem
is limited to the two RTL8169 network cards, and there are no other
Realtek PCI devices in the system. Supposedly PCI_ANY_ID was just used
because it was less work than looking for the device id. Functionally
it's the same on this system.

>>  }
>> -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, PCI_ANY_ID, n2100_fixup_r8169);
>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, 0x8169, n2100_fixup_r8169);
>>
>>  static int __init n2100_pci_init(void)
>>  {
>> --
>> 2.30.0
>>
>>
On Tue, Jan 05, 2021 at 06:28:33PM -0600, Bjorn Helgaas wrote:
> On Tue, Jan 05, 2021 at 10:42:31AM +0100, Heiner Kallweit wrote:
> >  {
> > -	if (dev->bus->number == 0 &&
> > -	    (dev->devfn == PCI_DEVFN(1, 0) ||
> > -	     dev->devfn == PCI_DEVFN(2, 0)))
> > -		dev->broken_parity_status = 1;
> > +	if (machine_is_n2100())
> > +		pci_quirk_broken_parity(dev);
>
> Whatever "machine_is_n2100()" is (I can't find the definition), it is
> surely not equivalent to "00:01.0 || 00:02.0". That change probably
> should be a separate patch with some explanation.

It isn't equivalent. It says "if this machine is N2100" which is a
completely different thing from matching the bus/devfn numbers.

You won't find a definition for machine_is_n2100() in the kernel; it
is generated from the machine table by scripts, along with lots of
similar definitions for each machine type:

/* The type of machine we're running on */
extern unsigned int __machine_arch_type;

#ifdef CONFIG_MACH_N2100
# ifdef machine_arch_type
#  undef machine_arch_type
#  define machine_arch_type	__machine_arch_type
# else
#  define machine_arch_type	MACH_TYPE_N2100
# endif
# define machine_is_n2100()	(machine_arch_type == MACH_TYPE_N2100)
#else
# define machine_is_n2100()	(0)
#endif

The upshot of the above is that machine_is_n2100() is constant zero
if the platform is not configured (thereby allowing the compiler to
eliminate the code.) If it is the _only_ platform selected, then it
evaluates to an always-true expression. Otherwise, it becomes a
run-time evaluated conditional.

We may have better ways to do this in modern kernels, but this was
invented decades ago, and works with zero runtime overhead.

> If this makes the quirk safe to use in a generic kernel, that sounds
> like a good thing.
>
> I guess a parity problem could be the result of a defect in either the
> device (e.g., 0x8169), which would be an issue in *all* platforms, or
> a platform-specific issue in the way it's wired up. I assume it's the
> latter because the quirk is not in drivers/pci/quirks.c.
>
> Why is it safe to restrict this to device ID 0x8169? If this is
> platform issue, it might affect any device in the slot.

You assume the platform has multiple PCI slots - it doesn't. It's an
embedded platform (it's sold as a NAS) and it has a single mini-PCI
slot for a WiFi card. Without that populated, lspci -n looks like
this:

00:01.0 0200: 10ec:8169 (rev 10)
00:02.0 0200: 10ec:8169 (rev 10)
00:03.0 0180: 1095:3512 (rev 01)
00:04.0 0c03: 1106:3038 (rev 61)
00:04.1 0c03: 1106:3038 (rev 61)
00:04.2 0c03: 1106:3104 (rev 63)

Where all those devices are soldered to the board.
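For illustration only, the machine_is_n2100() macro pattern quoted in the message above can be tried outside the kernel with a small standalone sketch. It assumes two boards are enabled at once, so both checks become run-time comparisons against __machine_arch_type. The MACH_TYPE_* values, the second board "other", and the helper check_machine_macros() are placeholders invented for this sketch; they are not the real mach-types entries or kernel functions.

```c
#include <assert.h>

/* Placeholder machine IDs, NOT the real values from the machine table. */
#define MACH_TYPE_OTHER  1   /* hypothetical second board */
#define MACH_TYPE_N2100  2

/* Assumption: both boards are enabled in the kernel configuration. */
#define CONFIG_MACH_OTHER
#define CONFIG_MACH_N2100

/* The type of machine we're running on (set during boot in a real kernel). */
unsigned int __machine_arch_type;

#ifdef CONFIG_MACH_OTHER
# ifdef machine_arch_type
#  undef machine_arch_type
#  define machine_arch_type __machine_arch_type
# else
#  define machine_arch_type MACH_TYPE_OTHER   /* first board seen: constant */
# endif
# define machine_is_other() (machine_arch_type == MACH_TYPE_OTHER)
#else
# define machine_is_other() (0)
#endif

#ifdef CONFIG_MACH_N2100
# ifdef machine_arch_type
#  undef machine_arch_type
#  define machine_arch_type __machine_arch_type /* second board: run-time */
# else
#  define machine_arch_type MACH_TYPE_N2100
# endif
# define machine_is_n2100() (machine_arch_type == MACH_TYPE_N2100)
#else
# define machine_is_n2100() (0)
#endif

/* Thin wrappers so the macros can be called like ordinary functions. */
int running_on_n2100(void) { return machine_is_n2100(); }
int running_on_other(void) { return machine_is_other(); }

/* Walk through both boot scenarios of the assumed two-board kernel. */
void check_machine_macros(void)
{
	__machine_arch_type = MACH_TYPE_N2100;
	assert(running_on_n2100() && !running_on_other());

	__machine_arch_type = MACH_TYPE_OTHER;
	assert(!running_on_n2100() && running_on_other());
}
```

Because macros are expanded at the point of use, machine_is_other() also ends up comparing against __machine_arch_type once the second board has redefined machine_arch_type. Removing one of the two CONFIG_ defines turns the remaining check into a compile-time constant and the removed one into (0), which is the zero-overhead behaviour described above.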
On Wed, Jan 06, 2021 at 01:44:03AM +0100, Heiner Kallweit wrote:
> The machine type check is there to protect from (theoretical) cases
> where the n2100 code (incl. the RTL8169 quirk) may be compiled in,
> but the kernel is used on another machine.

That is far from a theoretical case. The ARM port has always supported
multiple machines in a single kernel. They just had to be "compatible"
in other words, the same SoC. All the platforms supported by
arch/arm/mach-iop32x can be built as a single kernel image and run on
any of those platforms.
On 06.01.2021 01:52, Russell King - ARM Linux admin wrote:
> On Wed, Jan 06, 2021 at 01:44:03AM +0100, Heiner Kallweit wrote:
>> The machine type check is there to protect from (theoretical) cases
>> where the n2100 code (incl. the RTL8169 quirk) may be compiled in,
>> but the kernel is used on another machine.
>
> That is far from a theoretical case. The ARM port has always supported
> multiple machines in a single kernel. They just had to be "compatible"
> in other words, the same SoC. All the platforms supported by
> arch/arm/mach-iop32x can be built as a single kernel image and run on
> any of those platforms.
>
Good to know, then we indeed need the machine check.
IOW, based on what you state we could even now have the following
situation: N2100 support is compiled in, and the kernel is used on
another machine that by chance also has Realtek RTL8169 in PCI slots
1 or 2. Then the PCI quirk would be applied, even though the machine
doesn't have the parity issue.
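The scenario described in the message above can be made concrete with a rough standalone model of the two matching rules. This is only a sketch under stated assumptions: struct fake_pci_dev, ANY_ID, id_matches() and the two *_quirk_applies() helpers are hypothetical names made up here, not kernel APIs, and the model ignores everything about the real fixup machinery except vendor/device matching, the slot check and the machine check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VENDOR_REALTEK 0x10ec
#define ANY_ID         0xffff  /* stand-in for the PCI_ANY_ID wildcard */

/* Hypothetical, simplified view of a PCI device. */
struct fake_pci_dev {
	uint8_t  bus, slot, func;
	uint16_t vendor, device;
};

/* A fixup entry matches when the ID is equal or the entry is a wildcard. */
static bool id_matches(uint16_t want, uint16_t have)
{
	return want == ANY_ID || want == have;
}

/* Before the patch: any Realtek device at 00:01.0 or 00:02.0, no machine check. */
bool old_quirk_applies(const struct fake_pci_dev *d)
{
	return id_matches(VENDOR_REALTEK, d->vendor) &&
	       id_matches(ANY_ID, d->device) &&
	       d->bus == 0 && d->func == 0 &&
	       (d->slot == 1 || d->slot == 2);
}

/* After the patch: Realtek 0x8169 only, and only when running on an N2100. */
bool new_quirk_applies(const struct fake_pci_dev *d, bool is_n2100)
{
	return id_matches(VENDOR_REALTEK, d->vendor) &&
	       id_matches(0x8169, d->device) &&
	       is_n2100;
}

/* Exercise the cases discussed in the thread. */
void check_scenarios(void)
{
	/* 00:01.0 10ec:8169 and 00:03.0 1095:3512 from the lspci -n listing */
	struct fake_pci_dev nic  = { 0, 1, 0, 0x10ec, 0x8169 };
	struct fake_pci_dev sata = { 0, 3, 0, 0x1095, 0x3512 };

	/* On the N2100 both rules pick the same device. */
	assert(old_quirk_applies(&nic) && new_quirk_applies(&nic, true));
	assert(!old_quirk_applies(&sata) && !new_quirk_applies(&sata, true));

	/* Other board with an RTL8169 in slot 1: only the old rule fires. */
	assert(old_quirk_applies(&nic) && !new_quirk_applies(&nic, false));
}
```

In this model the two rules agree on the N2100 itself, and they differ only when the same kernel runs on a different board that happens to have an RTL8169 at 00:01.0 or 00:02.0.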
diff --git a/arch/arm/mach-iop32x/n2100.c b/arch/arm/mach-iop32x/n2100.c
index 78b9a5ee4..24c3eec46 100644
--- a/arch/arm/mach-iop32x/n2100.c
+++ b/arch/arm/mach-iop32x/n2100.c
@@ -122,12 +122,10 @@ static struct hw_pci n2100_pci __initdata = {
  */
 static void n2100_fixup_r8169(struct pci_dev *dev)
 {
-	if (dev->bus->number == 0 &&
-	    (dev->devfn == PCI_DEVFN(1, 0) ||
-	     dev->devfn == PCI_DEVFN(2, 0)))
-		dev->broken_parity_status = 1;
+	if (machine_is_n2100())
+		pci_quirk_broken_parity(dev);
 }
-DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, PCI_ANY_ID, n2100_fixup_r8169);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REALTEK, 0x8169, n2100_fixup_r8169);
 
 static int __init n2100_pci_init(void)
 {
Simplify the quirk by using new PCI core function
pci_quirk_broken_parity(). In addition make the quirk
more specific, use device id 0x8169 instead of PCI_ANY_ID.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 arch/arm/mach-iop32x/n2100.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)