mbox series

[v5,0/7] arm64: Default to 32-bit wide ZONE_DMA

Message ID 20201029172550.3523-1-nsaenzjulienne@suse.de (mailing list archive)
Headers show
Series arm64: Default to 32-bit wide ZONE_DMA | expand

Message

Nicolas Saenz Julienne Oct. 29, 2020, 5:25 p.m. UTC
Using two distinct DMA zones turned out to be problematic. Here's an
attempt go back to a saner default.

I tested this on both a RPi4 and QEMU.

---

Changes since v4:
 - Fix of_dma_get_max_cpu_address() so it returns the last addressable
   addres, not the limit

Changes since v3:
 - Drop patch adding define in dma-mapping
 - Address small review changes
 - Update Ard's patch
 - Add new patch removing examples from mmzone.h

Changes since v2:
 - Introduce Ard's patch
 - Improve OF dma-ranges parsing function
 - Add unit test for OF function
 - Address small changes
 - Move crashkernel reservation later in boot process

Changes since v1:
 - Parse dma-ranges instead of using machine compatible string

Ard Biesheuvel (1):
  arm64: mm: Set ZONE_DMA size based on early IORT scan

Nicolas Saenz Julienne (6):
  arm64: mm: Move reserve_crashkernel() into mem_init()
  arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
  of/address: Introduce of_dma_get_max_cpu_address()
  of: unittest: Add test for of_dma_get_max_cpu_address()
  arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
  mm: Remove examples from enum zone_type comment

 arch/arm64/mm/init.c      | 16 ++++++------
 drivers/acpi/arm64/iort.c | 52 +++++++++++++++++++++++++++++++++++++++
 drivers/of/address.c      | 42 +++++++++++++++++++++++++++++++
 drivers/of/unittest.c     | 18 ++++++++++++++
 include/linux/acpi_iort.h |  4 +++
 include/linux/mmzone.h    | 20 ---------------
 include/linux/of.h        |  7 ++++++
 7 files changed, 130 insertions(+), 29 deletions(-)

Comments

Catalin Marinas Oct. 30, 2020, 6:11 p.m. UTC | #1
On Thu, Oct 29, 2020 at 06:25:43PM +0100, Nicolas Saenz Julienne wrote:
> Ard Biesheuvel (1):
>   arm64: mm: Set ZONE_DMA size based on early IORT scan
> 
> Nicolas Saenz Julienne (6):
>   arm64: mm: Move reserve_crashkernel() into mem_init()
>   arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
>   of/address: Introduce of_dma_get_max_cpu_address()
>   of: unittest: Add test for of_dma_get_max_cpu_address()
>   arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
>   mm: Remove examples from enum zone_type comment

Thanks for putting this together. I had a minor comment but the patches
look fine to me. We still need an ack from Rob on the DT patch and I can
queue the series for 5.11.

Could you please also test the patch below on top of this series? It's
the removal of the implied DMA offset in the max_zone_phys()
calculation.

--------------------------8<-----------------------------
From 3ae252d888be4984a612236124f5b099e804c745 Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas@arm.com>
Date: Fri, 30 Oct 2020 18:07:34 +0000
Subject: [PATCH] arm64: Ignore any DMA offsets in the max_zone_phys()
 calculation

Currently, the kernel assumes that if RAM starts above 32-bit (or
zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
such constrained devices have a hardwired DMA offset. In practice, we
haven't noticed any such hardware so let's assume that we can expand
ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
zone_bits.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/mm/init.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 095540667f0f..362160e16fb2 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -175,14 +175,21 @@ static void __init reserve_elfcorehdr(void)
 #endif /* CONFIG_CRASH_DUMP */
 
 /*
- * Return the maximum physical address for a zone with a given address size
- * limit. It currently assumes that for memory starting above 4G, 32-bit
- * devices will use a DMA offset.
+ * Return the maximum physical address for a zone accessible by the given bits
+ * limit. If the DRAM starts above 32-bit, expand the zone to the maximum
+ * available memory, otherwise cap it at 32-bit.
  */
 static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
 {
-	phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, zone_bits);
-	return min(offset + (1ULL << zone_bits), memblock_end_of_DRAM());
+	phys_addr_t zone_mask = (1ULL << zone_bits) - 1;
+	phys_addr_t phys_start = memblock_start_of_DRAM();
+
+	if (!(phys_start & U32_MAX))
+		zone_mask = PHYS_ADDR_MAX;
+	else if (!(phys_start & zone_mask))
+		zone_mask = U32_MAX;
+
+	return min(zone_mask + 1, memblock_end_of_DRAM());
 }
 
 static void __init zone_sizes_init(unsigned long min, unsigned long max)
Nicolas Saenz Julienne Nov. 3, 2020, 5 p.m. UTC | #2
On Fri, 2020-10-30 at 18:11 +0000, Catalin Marinas wrote:
> On Thu, Oct 29, 2020 at 06:25:43PM +0100, Nicolas Saenz Julienne wrote:
> > Ard Biesheuvel (1):
> >   arm64: mm: Set ZONE_DMA size based on early IORT scan
> > 
> > Nicolas Saenz Julienne (6):
> >   arm64: mm: Move reserve_crashkernel() into mem_init()
> >   arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
> >   of/address: Introduce of_dma_get_max_cpu_address()
> >   of: unittest: Add test for of_dma_get_max_cpu_address()
> >   arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
> >   mm: Remove examples from enum zone_type comment
> 
> Thanks for putting this together. I had a minor comment but the patches
> look fine to me. We still need an ack from Rob on the DT patch and I can
> queue the series for 5.11.

I'm preparing a v6 unifying both functions as you suggested.

> Could you please also test the patch below on top of this series? It's
> the removal of the implied DMA offset in the max_zone_phys()
> calculation.

Yes, happily. Comments below.

> --------------------------8<-----------------------------
> From 3ae252d888be4984a612236124f5b099e804c745 Mon Sep 17 00:00:00 2001
> From: Catalin Marinas <catalin.marinas@arm.com>
> Date: Fri, 30 Oct 2020 18:07:34 +0000
> Subject: [PATCH] arm64: Ignore any DMA offsets in the max_zone_phys()
>  calculation
> 
> Currently, the kernel assumes that if RAM starts above 32-bit (or
> zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
> such constrained devices have a hardwired DMA offset. In practice, we
> haven't noticed any such hardware so let's assume that we can expand
> ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
> ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
> zone_bits.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/mm/init.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 095540667f0f..362160e16fb2 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -175,14 +175,21 @@ static void __init reserve_elfcorehdr(void)
>  #endif /* CONFIG_CRASH_DUMP */
>  
>  /*
> - * Return the maximum physical address for a zone with a given address size
> - * limit. It currently assumes that for memory starting above 4G, 32-bit
> - * devices will use a DMA offset.
> + * Return the maximum physical address for a zone accessible by the given bits
> + * limit. If the DRAM starts above 32-bit, expand the zone to the maximum
> + * available memory, otherwise cap it at 32-bit.
>   */
>  static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>  {
> -	phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, zone_bits);
> -	return min(offset + (1ULL << zone_bits), memblock_end_of_DRAM());
> +	phys_addr_t zone_mask = (1ULL << zone_bits) - 1;

Maybe use DMA_BIT_MASK(), instead of the manual calculation?

> +	phys_addr_t phys_start = memblock_start_of_DRAM();
> +
> +	if (!(phys_start & U32_MAX))

I'd suggest using 'bigger than' instead of masks. Just to cover ourselves
against memory starting at odd locations. Also it'll behaves properly when
phys_start is zero (this breaks things on RPi4).

> +		zone_mask = PHYS_ADDR_MAX;
> +	else if (!(phys_start & zone_mask))
> +		zone_mask = U32_MAX;
> +
> +	return min(zone_mask + 1, memblock_end_of_DRAM());

This + 1 isn't going to play well when zone_mask is PHYS_ADDR_MAX.

Regards,
Nicolas
Catalin Marinas Nov. 3, 2020, 6:51 p.m. UTC | #3
On Tue, Nov 03, 2020 at 06:00:33PM +0100, Nicolas Saenz Julienne wrote:
> On Fri, 2020-10-30 at 18:11 +0000, Catalin Marinas wrote:
> > On Thu, Oct 29, 2020 at 06:25:43PM +0100, Nicolas Saenz Julienne wrote:
> > > Ard Biesheuvel (1):
> > >   arm64: mm: Set ZONE_DMA size based on early IORT scan
> > > 
> > > Nicolas Saenz Julienne (6):
> > >   arm64: mm: Move reserve_crashkernel() into mem_init()
> > >   arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
> > >   of/address: Introduce of_dma_get_max_cpu_address()
> > >   of: unittest: Add test for of_dma_get_max_cpu_address()
> > >   arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
> > >   mm: Remove examples from enum zone_type comment
> > 
> > Thanks for putting this together. I had a minor comment but the patches
> > look fine to me. We still need an ack from Rob on the DT patch and I can
> > queue the series for 5.11.
> 
> I'm preparing a v6 unifying both functions as you suggested.
> 
> > Could you please also test the patch below on top of this series? It's
> > the removal of the implied DMA offset in the max_zone_phys()
> > calculation.
> 
> Yes, happily. Comments below.
> 
> > --------------------------8<-----------------------------
> > From 3ae252d888be4984a612236124f5b099e804c745 Mon Sep 17 00:00:00 2001
> > From: Catalin Marinas <catalin.marinas@arm.com>
> > Date: Fri, 30 Oct 2020 18:07:34 +0000
> > Subject: [PATCH] arm64: Ignore any DMA offsets in the max_zone_phys()
> >  calculation
> > 
> > Currently, the kernel assumes that if RAM starts above 32-bit (or
> > zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
> > such constrained devices have a hardwired DMA offset. In practice, we
> > haven't noticed any such hardware so let's assume that we can expand
> > ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
> > ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
> > zone_bits.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > ---
> >  arch/arm64/mm/init.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index 095540667f0f..362160e16fb2 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -175,14 +175,21 @@ static void __init reserve_elfcorehdr(void)
> >  #endif /* CONFIG_CRASH_DUMP */
> >  
> >  /*
> > - * Return the maximum physical address for a zone with a given address size
> > - * limit. It currently assumes that for memory starting above 4G, 32-bit
> > - * devices will use a DMA offset.
> > + * Return the maximum physical address for a zone accessible by the given bits
> > + * limit. If the DRAM starts above 32-bit, expand the zone to the maximum
> > + * available memory, otherwise cap it at 32-bit.
> >   */
> >  static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
> >  {
> > -	phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, zone_bits);
> > -	return min(offset + (1ULL << zone_bits), memblock_end_of_DRAM());
> > +	phys_addr_t zone_mask = (1ULL << zone_bits) - 1;
> 
> Maybe use DMA_BIT_MASK(), instead of the manual calculation?

Yes.

> 
> > +	phys_addr_t phys_start = memblock_start_of_DRAM();
> > +
> > +	if (!(phys_start & U32_MAX))
> 
> I'd suggest using 'bigger than' instead of masks. Just to cover ourselves
> against memory starting at odd locations. Also it'll behaves properly when
> phys_start is zero (this breaks things on RPi4).

Good point.

> > +		zone_mask = PHYS_ADDR_MAX;
> > +	else if (!(phys_start & zone_mask))
> > +		zone_mask = U32_MAX;
> > +
> > +	return min(zone_mask + 1, memblock_end_of_DRAM());
> 
> This + 1 isn't going to play well when zone_mask is PHYS_ADDR_MAX.

You are right on PHYS_ADDR_MAX overflowing but I'd keep the +1 since
memblock_end_of_DRAM() returns the first byte past the accessible range
(so exclusive end).

I'll tweak this function a bit to avoid the overflow or use the
arm64-specific PHYS_MASK (that's never going to be the full 64 bits).

Thanks.