Message ID | 20210727023000.1030525-1-art@khadas.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Lorenzo Pieralisi |
Headers | show |
Series | [v2] PCI: DWC: meson: add 256 bytes MRRS quirk | expand |
On Tue, Jul 27, 2021 at 10:30:00AM +0800, Artem Lapkin wrote: > 256 bytes maximum read request size. They can't handle > anything larger than this. So force this limit on > any devices attached under these ports. This needs to say whether this is a functional or a performance issue. If it's a functional issue, i.e., if meson signals an error or abort when it receives a read request for > 256 bytes, we need to explain exactly what happens. If it's a performance issue, we need to explain why MRRS affects performance and that this is an optimization. > Come-from: https://lkml.org/lkml/2021/6/18/160 > Come-from: https://lkml.org/lkml/2021/6/19/19 Please use lore.kernel.org URLs instead. The lore URLs are a little uglier, but are more functional, more likely to continue working, and avoid the ads. These are: https://lore.kernel.org/r/20210618230132.GA3228427@bjorn-Precision-5520 https://lore.kernel.org/r/20210619063952.2008746-1-art@khadas.com > It only affects PCIe in P2P, in non-P2P is will certainly affect > transfers on the internal SoC/Processor/Chip internal bus/fabric. This needs to explain how a field in a PCIe TLP affects transfers on these non-PCIe fabrics. > These quirks are currently implemented in the > controller driver and only applies when the controller has been probed > and to each endpoint detected on this particular controller. > > Continue having separate quirks for each controller if the core > isn't the right place to handle MPS/MRRS. I see similar code in dwc/pci-keystone.c. Does this problem actually affect *all* DesignWare-based controllers? If so, we should put the workaround in the common dwc code, e.g., pcie-designware.c or similar. It also seems to affect pci-loongson.c (not DesignWare-based). Is there some reason it has the same problem, e.g., does loongson contain DesignWare IP, or does it use the same non-PCIe fabric? > >> Neil > > Signed-off-by: Artem Lapkin <art@khadas.com> > --- > drivers/pci/controller/dwc/pci-meson.c | 31 ++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c > index 686ded034..1498950de 100644 > --- a/drivers/pci/controller/dwc/pci-meson.c > +++ b/drivers/pci/controller/dwc/pci-meson.c > @@ -466,6 +466,37 @@ static int meson_pcie_probe(struct platform_device *pdev) > return ret; > } > > +static void meson_mrrs_limit_quirk(struct pci_dev *dev) > +{ > + struct pci_bus *bus = dev->bus; > + int mrrs, mrrs_limit = 256; > + static const struct pci_device_id bridge_devids[] = { > + { PCI_DEVICE(PCI_VENDOR_ID_SYNOPSYS, PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3) }, I don't really believe that PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 is the only device affected here. Is this related to the Meson root port, or is it related to a PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 on a plug-in card? I guess the former, since you're searching upward for a root port. So why is this limited to PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3? > + { 0, }, > + }; > + > + /* look for the matching bridge */ > + while (!pci_is_root_bus(bus)) { > + /* > + * 256 bytes maximum read request size. They can't handle > + * anything larger than this. So force this limit on > + * any devices attached under these ports. > + */ > + if (!pci_match_id(bridge_devids, bus->self)) { > + bus = bus->parent; > + continue; > + } > + > + mrrs = pcie_get_readrq(dev); > + if (mrrs > mrrs_limit) { > + pci_info(dev, "limiting MRRS %d to %d\n", mrrs, mrrs_limit); > + pcie_set_readrq(dev, mrrs_limit); > + } > + break; > + } > +} > +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, meson_mrrs_limit_quirk); > + > static const struct of_device_id meson_pcie_of_match[] = { > { > .compatible = "amlogic,axg-pcie", > -- > 2.25.1 >
Hi, On 27/07/2021 21:43, Bjorn Helgaas wrote: > On Tue, Jul 27, 2021 at 10:30:00AM +0800, Artem Lapkin wrote: >> 256 bytes maximum read request size. They can't handle >> anything larger than this. So force this limit on >> any devices attached under these ports. > > This needs to say whether this is a functional or a performance issue. > > If it's a functional issue, i.e., if meson signals an error or abort > when it receives a read request for > 256 bytes, we need to explain > exactly what happens. > > If it's a performance issue, we need to explain why MRRS affects > performance and that this is an optimization. > >> Come-from: https://lkml.org/lkml/2021/6/18/160 >> Come-from: https://lkml.org/lkml/2021/6/19/19 > > Please use lore.kernel.org URLs instead. The lore URLs are a little > uglier, but are more functional, more likely to continue working, and > avoid the ads. These are: > > https://lore.kernel.org/r/20210618230132.GA3228427@bjorn-Precision-5520 > https://lore.kernel.org/r/20210619063952.2008746-1-art@khadas.com > >> It only affects PCIe in P2P, in non-P2P is will certainly affect >> transfers on the internal SoC/Processor/Chip internal bus/fabric. > > This needs to explain how a field in a PCIe TLP affects transfers on > these non-PCIe fabrics. > >> These quirks are currently implemented in the >> controller driver and only applies when the controller has been probed >> and to each endpoint detected on this particular controller. >> >> Continue having separate quirks for each controller if the core >> isn't the right place to handle MPS/MRRS. > > I see similar code in dwc/pci-keystone.c. Does this problem actually > affect *all* DesignWare-based controllers? > > If so, we should put the workaround in the common dwc code, e.g., > pcie-designware.c or similar. > > It also seems to affect pci-loongson.c (not DesignWare-based). Is > there some reason it has the same problem, e.g., does loongson contain > DesignWare IP, or does it use the same non-PCIe fabric? As my reply on the previous thread, the Synopsys IP can be configured with a maximum TLP packet to AXI transaction size, which is hardcoded AFAIK Amlogic doesn't explicit it. And it doesn't seem we can read the value. This means is a TPL size if higher than this maximum packet size, the IP will do multiple AXI transactions, and this can reduce the system overall performance. The problem is that it affects the P2P transactions aswell, which can support any MPS/MRRS. But honestly, it's not a big deal on a PCIe 2.0 1x system only designed for NVMe and basic PCIe devices. The fun part is that the pci=pcie_bus_perf kerne cmdline solves this already, isn't there any possibility to force pcie_bus_perf for a particular root port ? Neil > >>>> Neil >> >> Signed-off-by: Artem Lapkin <art@khadas.com> >> --- >> drivers/pci/controller/dwc/pci-meson.c | 31 ++++++++++++++++++++++++++ >> 1 file changed, 31 insertions(+) >> >> diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c >> index 686ded034..1498950de 100644 >> --- a/drivers/pci/controller/dwc/pci-meson.c >> +++ b/drivers/pci/controller/dwc/pci-meson.c >> @@ -466,6 +466,37 @@ static int meson_pcie_probe(struct platform_device *pdev) >> return ret; >> } >> >> +static void meson_mrrs_limit_quirk(struct pci_dev *dev) >> +{ >> + struct pci_bus *bus = dev->bus; >> + int mrrs, mrrs_limit = 256; >> + static const struct pci_device_id bridge_devids[] = { >> + { PCI_DEVICE(PCI_VENDOR_ID_SYNOPSYS, PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3) }, > > I don't really believe that PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 is the > only device affected here. Is this related to the Meson root port, or > is it related to a PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 on a plug-in card? > I guess the former, since you're searching upward for a root port. > > So why is this limited to PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3? > >> + { 0, }, >> + }; >> + >> + /* look for the matching bridge */ >> + while (!pci_is_root_bus(bus)) { >> + /* >> + * 256 bytes maximum read request size. They can't handle >> + * anything larger than this. So force this limit on >> + * any devices attached under these ports. >> + */ >> + if (!pci_match_id(bridge_devids, bus->self)) { >> + bus = bus->parent; >> + continue; >> + } >> + >> + mrrs = pcie_get_readrq(dev); >> + if (mrrs > mrrs_limit) { >> + pci_info(dev, "limiting MRRS %d to %d\n", mrrs, mrrs_limit); >> + pcie_set_readrq(dev, mrrs_limit); >> + } >> + break; >> + } >> +} >> +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, meson_mrrs_limit_quirk); >> + >> static const struct of_device_id meson_pcie_of_match[] = { >> { >> .compatible = "amlogic,axg-pcie", >> -- >> 2.25.1 >>
> The fun part is that the pci=pcie_bus_perf kerne cmdline solves this already, yes pci=pcie_bus_perf works only because setup same MRRS=256 in this case same like in this patch pcie_set_readrq(dev, 256 ) some details about pci=pcie_bus_perf [ 4.665327] pci 0000:00:00.0: [16c3:abcd] type 01 class 0x060400 [ 4.671166] pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref] [ 4.677793] pci 0000:00:00.0: supports D1 [ 4.683028] pci 0000:00:00.0: PME# supported from D0 D1 D3hot D3cold [ 4.691667] pci 0000:01:00.0: [2646:2263] type 00 class 0x010802 [ 4.697469] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit] [ 4.704218] pci 0000:01:00.0: pcie_get_mps - 128 [ 4.709607] pci 0000:00:00.0: pcie_get_mps - 256 [ 4.715145] pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0000:00:00.0 (capable of 31.504 Gb/s with 8.0 GT/s PCIe x4 link) [ 4.748582] pci 0000:00:00.0: BAR 14: assigned [mem 0xfc700000-0xfc7fffff] [ 4.755575] pci 0000:00:00.0: BAR 6: assigned [mem 0xfc800000-0xfc80ffff pref] [ 4.762731] pci 0000:01:00.0: BAR 0: assigned [mem 0xfc700000-0xfc703fff 64bit] [ 4.769849] pci 0000:00:00.0: PCI bridge to [bus 01-ff] [ 4.775195] pci 0000:00:00.0: bridge window [mem 0xfc700000-0xfc7fffff] [ 4.781741] pci 0000:00:00.0: pcie_get_mps - 256 [ 4.787118] pci 0000:00:00.0: pcie_set_mps - 256 (32) [ 4.792404] pci 0000:00:00.0: pcie_get_mps - 256 [ 4.797637] pci 0000:00:00.0: pcie_get_mps - 256 [ 4.802674] pci 0000:00:00.0: Max Payload Size set to 256/ 256 (was 256), Max Read Rq 256 [ 4.810602] pci 0000:01:00.0: pcie_get_mps - 128 [ 4.815590] pci 0000:00:00.0: pcie_get_mps - 256 [ 4.820422] pci 0000:01:00.0: pcie_set_mps - 256 (32) [ 4.825274] pci 0000:01:00.0: pcie_get_mps - 256 [ 4.829946] pci 0000:01:00.0: pcie_set_readrq - 256 [ 4.834609] pci 0000:01:00.0: pcie_get_mps - 256 [ 4.839174] pci 0000:01:00.0: pcie_get_mps - 256 [ 4.843582] pci 0000:01:00.0: Max Payload Size set to 256/ 256 (was 128), Max Read Rq 256 and pci=pcie_bus_perf works for whole system and i think its not good idea use or adoptate this option for this case Artem On Wed, Jul 28, 2021 at 11:08 PM Neil Armstrong <narmstrong@baylibre.com> wrote: > > Hi, > > > On 27/07/2021 21:43, Bjorn Helgaas wrote: > > On Tue, Jul 27, 2021 at 10:30:00AM +0800, Artem Lapkin wrote: > >> 256 bytes maximum read request size. They can't handle > >> anything larger than this. So force this limit on > >> any devices attached under these ports. > > > > This needs to say whether this is a functional or a performance issue. > > > > If it's a functional issue, i.e., if meson signals an error or abort > > when it receives a read request for > 256 bytes, we need to explain > > exactly what happens. > > > > If it's a performance issue, we need to explain why MRRS affects > > performance and that this is an optimization. > > > >> Come-from: https://lkml.org/lkml/2021/6/18/160 > >> Come-from: https://lkml.org/lkml/2021/6/19/19 > > > > Please use lore.kernel.org URLs instead. The lore URLs are a little > > uglier, but are more functional, more likely to continue working, and > > avoid the ads. These are: > > > > https://lore.kernel.org/r/20210618230132.GA3228427@bjorn-Precision-5520 > > https://lore.kernel.org/r/20210619063952.2008746-1-art@khadas.com > > > >> It only affects PCIe in P2P, in non-P2P is will certainly affect > >> transfers on the internal SoC/Processor/Chip internal bus/fabric. > > > > This needs to explain how a field in a PCIe TLP affects transfers on > > these non-PCIe fabrics. > > > >> These quirks are currently implemented in the > >> controller driver and only applies when the controller has been probed > >> and to each endpoint detected on this particular controller. > >> > >> Continue having separate quirks for each controller if the core > >> isn't the right place to handle MPS/MRRS. > > > > I see similar code in dwc/pci-keystone.c. Does this problem actually > > affect *all* DesignWare-based controllers? > > > > If so, we should put the workaround in the common dwc code, e.g., > > pcie-designware.c or similar. > > > > It also seems to affect pci-loongson.c (not DesignWare-based). Is > > there some reason it has the same problem, e.g., does loongson contain > > DesignWare IP, or does it use the same non-PCIe fabric? > > As my reply on the previous thread, the Synopsys IP can be configured with a > maximum TLP packet to AXI transaction size, which is hardcoded AFAIK Amlogic > doesn't explicit it. And it doesn't seem we can read the value. > > This means is a TPL size if higher than this maximum packet size, the IP will > do multiple AXI transactions, and this can reduce the system overall performance. > > The problem is that it affects the P2P transactions aswell, which can support any MPS/MRRS. > But honestly, it's not a big deal on a PCIe 2.0 1x system only designed for NVMe and basic > PCIe devices. > > The fun part is that the pci=pcie_bus_perf kerne cmdline solves this already, > isn't there any possibility to force pcie_bus_perf for a particular root port ? > > Neil > > > > >>>> Neil > >> > >> Signed-off-by: Artem Lapkin <art@khadas.com> > >> --- > >> drivers/pci/controller/dwc/pci-meson.c | 31 ++++++++++++++++++++++++++ > >> 1 file changed, 31 insertions(+) > >> > >> diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c > >> index 686ded034..1498950de 100644 > >> --- a/drivers/pci/controller/dwc/pci-meson.c > >> +++ b/drivers/pci/controller/dwc/pci-meson.c > >> @@ -466,6 +466,37 @@ static int meson_pcie_probe(struct platform_device *pdev) > >> return ret; > >> } > >> > >> +static void meson_mrrs_limit_quirk(struct pci_dev *dev) > >> +{ > >> + struct pci_bus *bus = dev->bus; > >> + int mrrs, mrrs_limit = 256; > >> + static const struct pci_device_id bridge_devids[] = { > >> + { PCI_DEVICE(PCI_VENDOR_ID_SYNOPSYS, PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3) }, > > > > I don't really believe that PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 is the > > only device affected here. Is this related to the Meson root port, or > > is it related to a PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 on a plug-in card? > > I guess the former, since you're searching upward for a root port. > > > > So why is this limited to PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3? > > > >> + { 0, }, > >> + }; > >> + > >> + /* look for the matching bridge */ > >> + while (!pci_is_root_bus(bus)) { > >> + /* > >> + * 256 bytes maximum read request size. They can't handle > >> + * anything larger than this. So force this limit on > >> + * any devices attached under these ports. > >> + */ > >> + if (!pci_match_id(bridge_devids, bus->self)) { > >> + bus = bus->parent; > >> + continue; > >> + } > >> + > >> + mrrs = pcie_get_readrq(dev); > >> + if (mrrs > mrrs_limit) { > >> + pci_info(dev, "limiting MRRS %d to %d\n", mrrs, mrrs_limit); > >> + pcie_set_readrq(dev, mrrs_limit); > >> + } > >> + break; > >> + } > >> +} > >> +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, meson_mrrs_limit_quirk); > >> + > >> static const struct of_device_id meson_pcie_of_match[] = { > >> { > >> .compatible = "amlogic,axg-pcie", > >> -- > >> 2.25.1 > >> >
> This needs to say whether this is a functional or a performance issue. Looks like not a performance issue , this is a functional issue. We detect just one problem(may be exist another i dont know ) for writing(reading works fine) any data to NVME devices with MRRS != 256 we have scrambled HDMI display > Does this problem actually affect *all* DesignWare-based controllers? > So why is this limited to PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3? i can say only for VIM3 device which use this controller and have this problem On Wed, Jul 28, 2021 at 3:43 AM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Tue, Jul 27, 2021 at 10:30:00AM +0800, Artem Lapkin wrote: > > 256 bytes maximum read request size. They can't handle > > anything larger than this. So force this limit on > > any devices attached under these ports. > > This needs to say whether this is a functional or a performance issue. > > If it's a functional issue, i.e., if meson signals an error or abort > when it receives a read request for > 256 bytes, we need to explain > exactly what happens. > > If it's a performance issue, we need to explain why MRRS affects > performance and that this is an optimization. > > > Come-from: https://lkml.org/lkml/2021/6/18/160 > > Come-from: https://lkml.org/lkml/2021/6/19/19 > > Please use lore.kernel.org URLs instead. The lore URLs are a little > uglier, but are more functional, more likely to continue working, and > avoid the ads. These are: > > https://lore.kernel.org/r/20210618230132.GA3228427@bjorn-Precision-5520 > https://lore.kernel.org/r/20210619063952.2008746-1-art@khadas.com > > > It only affects PCIe in P2P, in non-P2P is will certainly affect > > transfers on the internal SoC/Processor/Chip internal bus/fabric. > > This needs to explain how a field in a PCIe TLP affects transfers on > these non-PCIe fabrics. > > > These quirks are currently implemented in the > > controller driver and only applies when the controller has been probed > > and to each endpoint detected on this particular controller. > > > > Continue having separate quirks for each controller if the core > > isn't the right place to handle MPS/MRRS. > > I see similar code in dwc/pci-keystone.c. Does this problem actually > affect *all* DesignWare-based controllers? > > If so, we should put the workaround in the common dwc code, e.g., > pcie-designware.c or similar. > > It also seems to affect pci-loongson.c (not DesignWare-based). Is > there some reason it has the same problem, e.g., does loongson contain > DesignWare IP, or does it use the same non-PCIe fabric? > > > >> Neil > > > > Signed-off-by: Artem Lapkin <art@khadas.com> > > --- > > drivers/pci/controller/dwc/pci-meson.c | 31 ++++++++++++++++++++++++++ > > 1 file changed, 31 insertions(+) > > > > diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c > > index 686ded034..1498950de 100644 > > --- a/drivers/pci/controller/dwc/pci-meson.c > > +++ b/drivers/pci/controller/dwc/pci-meson.c > > @@ -466,6 +466,37 @@ static int meson_pcie_probe(struct platform_device *pdev) > > return ret; > > } > > > > +static void meson_mrrs_limit_quirk(struct pci_dev *dev) > > +{ > > + struct pci_bus *bus = dev->bus; > > + int mrrs, mrrs_limit = 256; > > + static const struct pci_device_id bridge_devids[] = { > > + { PCI_DEVICE(PCI_VENDOR_ID_SYNOPSYS, PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3) }, > > I don't really believe that PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 is the > only device affected here. Is this related to the Meson root port, or > is it related to a PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3 on a plug-in card? > I guess the former, since you're searching upward for a root port. > > So why is this limited to PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3? > > > + { 0, }, > > + }; > > + > > + /* look for the matching bridge */ > > + while (!pci_is_root_bus(bus)) { > > + /* > > + * 256 bytes maximum read request size. They can't handle > > + * anything larger than this. So force this limit on > > + * any devices attached under these ports. > > + */ > > + if (!pci_match_id(bridge_devids, bus->self)) { > > + bus = bus->parent; > > + continue; > > + } > > + > > + mrrs = pcie_get_readrq(dev); > > + if (mrrs > mrrs_limit) { > > + pci_info(dev, "limiting MRRS %d to %d\n", mrrs, mrrs_limit); > > + pcie_set_readrq(dev, mrrs_limit); > > + } > > + break; > > + } > > +} > > +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, meson_mrrs_limit_quirk); > > + > > static const struct of_device_id meson_pcie_of_match[] = { > > { > > .compatible = "amlogic,axg-pcie", > > -- > > 2.25.1 > >
diff --git a/drivers/pci/controller/dwc/pci-meson.c b/drivers/pci/controller/dwc/pci-meson.c index 686ded034..1498950de 100644 --- a/drivers/pci/controller/dwc/pci-meson.c +++ b/drivers/pci/controller/dwc/pci-meson.c @@ -466,6 +466,37 @@ static int meson_pcie_probe(struct platform_device *pdev) return ret; } +static void meson_mrrs_limit_quirk(struct pci_dev *dev) +{ + struct pci_bus *bus = dev->bus; + int mrrs, mrrs_limit = 256; + static const struct pci_device_id bridge_devids[] = { + { PCI_DEVICE(PCI_VENDOR_ID_SYNOPSYS, PCI_DEVICE_ID_SYNOPSYS_HAPSUSB3) }, + { 0, }, + }; + + /* look for the matching bridge */ + while (!pci_is_root_bus(bus)) { + /* + * 256 bytes maximum read request size. They can't handle + * anything larger than this. So force this limit on + * any devices attached under these ports. + */ + if (!pci_match_id(bridge_devids, bus->self)) { + bus = bus->parent; + continue; + } + + mrrs = pcie_get_readrq(dev); + if (mrrs > mrrs_limit) { + pci_info(dev, "limiting MRRS %d to %d\n", mrrs, mrrs_limit); + pcie_set_readrq(dev, mrrs_limit); + } + break; + } +} +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, meson_mrrs_limit_quirk); + static const struct of_device_id meson_pcie_of_match[] = { { .compatible = "amlogic,axg-pcie",