diff mbox series

[v2] arm64: Enable PCI write-combine resources under sysfs

Message ID 20200918033312.ddfpibgfylfjpex2@amazon.com (mailing list archive)
State New, archived

Commit Message

Clint Sbisa Sept. 18, 2020, 3:33 a.m. UTC
This change exposes write-combine mappings under sysfs for
prefetchable PCI resources on arm64.

Originally, the usage of "write combine" here was driven by the x86
definition of write combine. This definition is specific to x86 and
does not generalize to other architectures. However, the usage of WC
has mutated to "write combine" semantics, which is implemented
differently on each arch.

Generally, prefetchable BARs are accepted to allow speculative
accesses, write combining, and re-ordering-- from the PCI perspective,
this means there are no read side effects. (This contradicts the PCI
spec which allows prefetchable BARs to have read side effects, but
this definition is ill-advised as it is impossible to meet.) On x86,
prefetchable BARs are mapped as WC as originally defined (with some
conditionals on arch features). On arm64, WC is taken to mean normal
non-cacheable memory.

In practice, write combine semantics are used to minimize write
operations. A common usage of this is minimizing PCI TLPs which can
significantly improve performance with PCI devices. In order to
provide the same benefits to userspace, we need to allow userspace to
map prefetchable BARs with write combine semantics. The resourceX_wc
mapping is used today by userspace programs and libraries.

While this model is flawed as "write combine" is very ill-defined, it
is already used by multiple non-x86 archs to expose write combine
semantics to user space. We enable this on arm64 to give userspace on
arm64 an equivalent mechanism for utilizing write combining with PCI
devices.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Clint Sbisa <csbisa@amazon.com>
---
Changes in v2:
  - Rewrote the commit message.

 arch/arm64/include/asm/pci.h | 1 +
 1 file changed, 1 insertion(+)
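
For reference, the userspace side described above might look roughly like the following. This is a minimal sketch, not code from this patch or from any particular library: map_pci_resource() and fill_u64() are hypothetical helper names, and the path passed in is assumed to be a sysfs file such as resource2_wc under the device's directory in /sys/bus/pci/devices/.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first `len` bytes of a sysfs PCI resource
 * file (e.g. a resourceN_wc file) read/write and shared.
 * Returns MAP_FAILED if the file cannot be opened or mapped.
 */
void *map_pci_resource(const char *path, size_t len)
{
	int fd = open(path, O_RDWR);
	void *p;

	if (fd < 0)
		return MAP_FAILED;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* an established mapping survives closing the fd */
	return p;
}

/*
 * Hypothetical helper: fill `count` 64-bit words with `value`.
 * On a write-combine mapping, the CPU may merge adjacent stores like
 * these into fewer, larger bus writes; whether and how it does so is
 * architecture- and implementation-specific.
 */
void fill_u64(volatile uint64_t *dst, uint64_t value, size_t count)
{
	for (size_t i = 0; i < count; i++)
		dst[i] = value;
}
```

A real user would also need whatever barrier the architecture requires before telling the device that the data is ready; that is deliberately left out of this sketch.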

Comments

Lorenzo Pieralisi Sept. 18, 2020, 9:21 a.m. UTC | #1
On Fri, Sep 18, 2020 at 03:33:12AM +0000, Clint Sbisa wrote:
> This change exposes write-combine mappings under sysfs for
> prefetchable PCI resources on arm64.
> 
> Originally, the usage of "write combine" here was driven by the x86
> definition of write combine. This definition is specific to x86 and
> does not generalize to other architectures. However, the usage of WC
> has mutated to "write combine" semantics, which is implemented
> differently on each arch.
> 
> Generally, prefetchable BARs are accepted to allow speculative
> accesses, write combining, and re-ordering-- from the PCI perspective,
> this means there are no read side effects. (This contradicts the PCI
> spec which allows prefetchable BARs to have read side effects, but
> this definition is ill-advised as it is impossible to meet.) On x86,
> prefetchable BARs are mapped as WC as originally defined (with some
> conditionals on arch features). On arm64, WC is taken to mean normal
> non-cacheable memory.
> 
> In practice, write combine semantics are used to minimize write
> operations. A common usage of this is minimizing PCI TLPs which can
> significantly improve performance with PCI devices. In order to
> provide the same benefits to userspace, we need to allow userspace to
> map prefetchable BARs with write combine semantics. The resourceX_wc
> mapping is used today by userspace programs and libraries.
> 
> While this model is flawed as "write combine" is very ill-defined, it
> is already used by multiple non-x86 archs to expose write combine
> semantics to user space. We enable this on arm64 to give userspace on
> arm64 an equivalent mechanism for utilizing write combining with PCI
> devices.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Signed-off-by: Clint Sbisa <csbisa@amazon.com>
> ---
> Changes in v2:
>   - Rewrote the commit message.
> 
>  arch/arm64/include/asm/pci.h | 1 +
>  1 file changed, 1 insertion(+)

It would be great if we could add a link to the thread (sorry, I forgot
to tell you) for future reference:

Link: https://lore.kernel.org/linux-pci/20200902113207.GA27676@e121166-lin.cambridge.arm.com

With that:

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index 70b323cf8300..b33ca260e3c9 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -17,6 +17,7 @@
>  #define pcibios_assign_all_busses() \
>  	(pci_has_flag(PCI_REASSIGN_ALL_BUS))
>  
> +#define arch_can_pci_mmap_wc() 1
>  #define ARCH_GENERIC_PCI_MMAP_RESOURCE	1
>  
>  extern int isa_dma_bridge_buggy;
> -- 
> 2.23.3
>

Catalin Marinas Sept. 18, 2020, 11:07 a.m. UTC | #2
On Fri, Sep 18, 2020 at 03:33:12AM +0000, Clint Sbisa wrote:
> This change exposes write-combine mappings under sysfs for
> prefetchable PCI resources on arm64.
> 
> Originally, the usage of "write combine" here was driven by the x86
> definition of write combine. This definition is specific to x86 and
> does not generalize to other architectures. However, the usage of WC
> has mutated to "write combine" semantics, which is implemented
> differently on each arch.
> 
> Generally, prefetchable BARs are accepted to allow speculative
> accesses, write combining, and re-ordering-- from the PCI perspective,
> this means there are no read side effects. (This contradicts the PCI
> spec which allows prefetchable BARs to have read side effects, but
> this definition is ill-advised as it is impossible to meet.) On x86,
> prefetchable BARs are mapped as WC as originally defined (with some
> conditionals on arch features). On arm64, WC is taken to mean normal
> non-cacheable memory.
> 
> In practice, write combine semantics are used to minimize write
> operations. A common usage of this is minimizing PCI TLPs which can
> significantly improve performance with PCI devices. In order to
> provide the same benefits to userspace, we need to allow userspace to
> map prefetchable BARs with write combine semantics. The resourceX_wc
> mapping is used today by userspace programs and libraries.
> 
> While this model is flawed as "write combine" is very ill-defined, it
> is already used by multiple non-x86 archs to expose write combine
> semantics to user space. We enable this on arm64 to give userspace on
> arm64 an equivalent mechanism for utilizing write combining with PCI
> devices.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Signed-off-by: Clint Sbisa <csbisa@amazon.com>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Ard Biesheuvel Sept. 18, 2020, 11:56 a.m. UTC | #3
On Fri, 18 Sep 2020 at 14:08, Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Fri, Sep 18, 2020 at 03:33:12AM +0000, Clint Sbisa wrote:
> > This change exposes write-combine mappings under sysfs for
> > prefetchable PCI resources on arm64.
> >
> > Originally, the usage of "write combine" here was driven by the x86
> > definition of write combine. This definition is specific to x86 and
> > does not generalize to other architectures. However, the usage of WC
> > has mutated to "write combine" semantics, which is implemented
> > differently on each arch.
> >
> > Generally, prefetchable BARs are accepted to allow speculative
> > accesses, write combining, and re-ordering-- from the PCI perspective,
> > this means there are no read side effects. (This contradicts the PCI
> > spec which allows prefetchable BARs to have read side effects, but
> > this definition is ill-advised as it is impossible to meet.) On x86,
> > prefetchable BARs are mapped as WC as originally defined (with some
> > conditionals on arch features). On arm64, WC is taken to mean normal
> > non-cacheable memory.
> >
> > In practice, write combine semantics are used to minimize write
> > operations. A common usage of this is minimizing PCI TLPs which can
> > significantly improve performance with PCI devices. In order to
> > provide the same benefits to userspace, we need to allow userspace to
> > map prefetchable BARs with write combine semantics. The resourceX_wc
> > mapping is used today by userspace programs and libraries.
> >
> > While this model is flawed as "write combine" is very ill-defined, it
> > is already used by multiple non-x86 archs to expose write combine
> > semantics to user space. We enable this on arm64 to give userspace on
> > arm64 an equivalent mechanism for utilizing write combining with PCI
> > devices.
> >
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Jason Gunthorpe <jgg@nvidia.com>
> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Signed-off-by: Clint Sbisa <csbisa@amazon.com>
>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
>

Acked-by: Ard Biesheuvel <ardb@kernel.org>

Will Deacon Sept. 18, 2020, 4:17 p.m. UTC | #4
On Fri, 18 Sep 2020 03:33:12 +0000, Clint Sbisa wrote:
> This change exposes write-combine mappings under sysfs for
> prefetchable PCI resources on arm64.
> 
> Originally, the usage of "write combine" here was driven by the x86
> definition of write combine. This definition is specific to x86 and
> does not generalize to other architectures. However, the usage of WC
> has mutated to "write combine" semantics, which is implemented
> differently on each arch.
> 
> [...]

Applied to arm64 (for-next/pci), thanks!

[1/1] arm64: Enable PCI write-combine resources under sysfs
      https://git.kernel.org/arm64/c/5fd39dc22027

Cheers,

Patch

diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index 70b323cf8300..b33ca260e3c9 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -17,6 +17,7 @@ 
 #define pcibios_assign_all_busses() \
 	(pci_has_flag(PCI_REASSIGN_ALL_BUS))
 
+#define arch_can_pci_mmap_wc() 1
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE	1
 
 extern int isa_dma_bridge_buggy;
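
The effect of the one-line define can be modelled in a standalone sketch. It assumes that the generic PCI code falls back to arch_can_pci_mmap_wc() being 0 when an architecture does not define it, and that the sysfs code creates the resourceN_wc file only when the hook is non-zero and the BAR has IORESOURCE_PREFETCH set. bar_gets_wc_file() is a hypothetical name used only for illustration, not a kernel function.

```c
#include <stdbool.h>

/* Flag values as found in include/linux/ioport.h. */
#define IORESOURCE_MEM		0x00000200UL
#define IORESOURCE_PREFETCH	0x00002000UL

/* What this patch adds to arch/arm64/include/asm/pci.h. */
#define arch_can_pci_mmap_wc() 1

/* Assumed generic fallback for architectures without the hook. */
#ifndef arch_can_pci_mmap_wc
#define arch_can_pci_mmap_wc() 0
#endif

/*
 * Hypothetical model of the sysfs decision: a write-combine resource
 * file is offered only if the architecture opts in and the BAR is
 * marked prefetchable.
 */
bool bar_gets_wc_file(unsigned long flags)
{
	return arch_can_pci_mmap_wc() && (flags & IORESOURCE_PREFETCH);
}
```

With the define in place, a prefetchable memory BAR passes the check and a non-prefetchable one does not; without it, the assumed fallback makes the check fail for every BAR.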