[V3,3/3] PCI: rcar: Recalculate inbound range alignment for each controller entry

Message ID 20190809175741.7066-3-marek.vasut@gmail.com (mailing list archive)
State Superseded, archived
Series [V3,1/3] PCI: rcar: Move the inbound index check

Commit Message

Marek Vasut Aug. 9, 2019, 5:57 p.m. UTC
From: Marek Vasut <marek.vasut+renesas@gmail.com>

Due to hardware constraints, the size of each inbound range entry
programmed into the controller cannot be larger than the alignment
of the entry's start address. Currently, the alignment for each
"dma-ranges" inbound range is calculated only once per range, and
the increment for programming the controller is likewise derived
from it only once. Thus, a "dma-ranges" entry describing a memory
region at 0x48000000 with size 0x38000000 leads to seven controller
entries, each only 0x08000000 long.

This is inefficient, especially since adding an entry's size to its
start address increases the alignment of the resulting address. This
patch moves the alignment calculation into the loop that populates
the controller entries, so the alignment is recalculated for each
controller entry.

Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: linux-renesas-soc@vger.kernel.org
To: linux-pci@vger.kernel.org
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
---
V2: Update on top of 1/3
V3: No change
---
 drivers/pci/controller/pcie-rcar.c | 37 +++++++++++++++---------------
 1 file changed, 19 insertions(+), 18 deletions(-)

Comments

Andrew Murray Oct. 21, 2019, 10:39 a.m. UTC | #1
On Fri, Aug 09, 2019 at 07:57:41PM +0200, marek.vasut@gmail.com wrote:
> From: Marek Vasut <marek.vasut+renesas@gmail.com>
> 
> Due to hardware constraints, the size of each inbound range entry
> programmed into the controller cannot be larger than the alignment
> of the entry's start address. Currently, the alignment for each
> "dma-ranges" inbound range is calculated only once per range, and
> the increment for programming the controller is likewise derived
> from it only once. Thus, a "dma-ranges" entry describing a memory
> region at 0x48000000 with size 0x38000000 leads to seven controller
> entries, each only 0x08000000 long.
> 
> This is inefficient, especially since adding an entry's size to its
> start address increases the alignment of the resulting address. This
> patch moves the alignment calculation into the loop that populates
> the controller entries, so the alignment is recalculated for each
> controller entry.
> 
> Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Wolfram Sang <wsa@the-dreams.de>
> Cc: linux-renesas-soc@vger.kernel.org
> To: linux-pci@vger.kernel.org
> Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
> ---
> V2: Update on top of 1/3
> V3: No change
> ---
>  drivers/pci/controller/pcie-rcar.c | 37 +++++++++++++++---------------
>  1 file changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-rcar.c b/drivers/pci/controller/pcie-rcar.c
> index e2735005ffd3..d820aa64d0b7 100644
> --- a/drivers/pci/controller/pcie-rcar.c
> +++ b/drivers/pci/controller/pcie-rcar.c
> @@ -1029,30 +1029,31 @@ static int rcar_pcie_inbound_ranges(struct rcar_pcie *pcie,
>  	if (restype & IORESOURCE_PREFETCH)
>  		flags |= LAM_PREFETCH;
>  
> -	/*
> -	 * If the size of the range is larger than the alignment of the start
> -	 * address, we have to use multiple entries to perform the mapping.
> -	 */
> -	if (cpu_addr > 0) {
> -		unsigned long nr_zeros = __ffs64(cpu_addr);
> -		u64 alignment = 1ULL << nr_zeros;
> -
> -		size = min(range->size, alignment);
> -	} else {
> -		size = range->size;
> -	}
> -	/* Hardware supports max 4GiB inbound region */
> -	size = min(size, 1ULL << 32);
> -
> -	mask = roundup_pow_of_two(size) - 1;
> -	mask &= ~0xf;
> -
>  	while (cpu_addr < cpu_end) {
>  		if (idx >= MAX_NR_INBOUND_MAPS - 1) {
>  			dev_warn(pcie->dev,
>  				 "Too many inbound regions, not all are mapped.\n");
>  			break;
>  		}
> +		/*
> +		 * If the size of the range is larger than the alignment of
> +		 * the start address, we have to use multiple entries to
> +		 * perform the mapping.
> +		 */
> +		if (cpu_addr > 0) {
> +			unsigned long nr_zeros = __ffs64(cpu_addr);
> +			u64 alignment = 1ULL << nr_zeros;
> +
> +			size = min(range->size, alignment);
> +		} else {
> +			size = range->size;
> +		}
> +		/* Hardware supports max 4GiB inbound region */
> +		size = min(size, 1ULL << 32);
> +
> +		mask = roundup_pow_of_two(size) - 1;
> +		mask &= ~0xf;
> +

Indeed, as the cpu address increases so does the size you can fit in each
window.

Reviewed-by: Andrew Murray <andrew.murray@arm.com>

(Though you'll have to rebase this without patch 2).

>  		/*
>  		 * Set up 64-bit inbound regions as the range parser doesn't
>  		 * distinguish between 32 and 64-bit types.
> -- 
> 2.20.1
>
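
To make the reviewer's observation concrete: each time cpu_addr
advances past a window, its run of trailing zero bits lengthens, so
__ffs64() returns a larger value and the permissible window size
grows. A small stand-alone sketch, using the GCC/Clang builtin
__builtin_ctzll() as a user-space stand-in for the kernel's 0-based
__ffs64(), with the addresses from the commit-message example:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* cpu_addr after each successive window of the example range. */
	uint64_t addrs[] = { 0x48000000, 0x50000000, 0x60000000 };

	for (int i = 0; i < 3; i++) {
		/* Trailing-zero count, as __ffs64() computes in the patch. */
		unsigned long nr_zeros = __builtin_ctzll(addrs[i]);
		uint64_t alignment = 1ULL << nr_zeros;

		printf("cpu_addr 0x%llx -> alignment 0x%llx\n",
		       (unsigned long long)addrs[i],
		       (unsigned long long)alignment);
	}
	return 0;
}

The alignment doubles at each step (0x08000000, 0x10000000,
0x20000000), which is exactly why a single up-front calculation
underestimates the later windows.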

Patch

diff --git a/drivers/pci/controller/pcie-rcar.c b/drivers/pci/controller/pcie-rcar.c
index e2735005ffd3..d820aa64d0b7 100644
--- a/drivers/pci/controller/pcie-rcar.c
+++ b/drivers/pci/controller/pcie-rcar.c
@@ -1029,30 +1029,31 @@ static int rcar_pcie_inbound_ranges(struct rcar_pcie *pcie,
 	if (restype & IORESOURCE_PREFETCH)
 		flags |= LAM_PREFETCH;
 
-	/*
-	 * If the size of the range is larger than the alignment of the start
-	 * address, we have to use multiple entries to perform the mapping.
-	 */
-	if (cpu_addr > 0) {
-		unsigned long nr_zeros = __ffs64(cpu_addr);
-		u64 alignment = 1ULL << nr_zeros;
-
-		size = min(range->size, alignment);
-	} else {
-		size = range->size;
-	}
-	/* Hardware supports max 4GiB inbound region */
-	size = min(size, 1ULL << 32);
-
-	mask = roundup_pow_of_two(size) - 1;
-	mask &= ~0xf;
-
 	while (cpu_addr < cpu_end) {
 		if (idx >= MAX_NR_INBOUND_MAPS - 1) {
 			dev_warn(pcie->dev,
 				 "Too many inbound regions, not all are mapped.\n");
 			break;
 		}
+		/*
+		 * If the size of the range is larger than the alignment of
+		 * the start address, we have to use multiple entries to
+		 * perform the mapping.
+		 */
+		if (cpu_addr > 0) {
+			unsigned long nr_zeros = __ffs64(cpu_addr);
+			u64 alignment = 1ULL << nr_zeros;
+
+			size = min(range->size, alignment);
+		} else {
+			size = range->size;
+		}
+		/* Hardware supports max 4GiB inbound region */
+		size = min(size, 1ULL << 32);
+
+		mask = roundup_pow_of_two(size) - 1;
+		mask &= ~0xf;
+
 		/*
 		 * Set up 64-bit inbound regions as the range parser doesn't
 		 * distinguish between 32 and 64-bit types.
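
A side note on the relocated mask computation: for each window,
roundup_pow_of_two(size) - 1 yields an all-ones mask covering the
window, and the & ~0xf clears the low four bits, which (judging from
the flags set up at the top of this function) carry the mapping
attributes when the region is programmed. A hedged user-space sketch,
with a stand-in for the kernel's roundup_pow_of_two():

#include <stdio.h>
#include <stdint.h>

/* User-space stand-in for the kernel's roundup_pow_of_two(). */
static uint64_t roundup_pow_of_two_u64(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

int main(void)
{
	/* Window sizes produced for the commit-message example. */
	uint64_t sizes[] = { 0x08000000, 0x10000000, 0x20000000 };

	for (int i = 0; i < 3; i++) {
		uint64_t mask = roundup_pow_of_two_u64(sizes[i]) - 1;

		mask &= ~0xfULL;	/* low bits cleared for the flags */
		printf("size 0x%llx -> mask 0x%llx\n",
		       (unsigned long long)sizes[i],
		       (unsigned long long)mask);
	}
	return 0;
}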