diff mbox

Fix IXP4xx coherent allocations

Message ID m3obe9lxil.fsf@intrepid.localdomain (mailing list archive)
State New, archived
Headers show

Commit Message

Krzysztof Halasa March 23, 2013, 7:35 p.m. UTC
ARM core code currently requires coherent DMA mask to be set. Make sure
we limit PCI devices to 64 MiB while allowing on-chip devices to access
the whole 4 GiB address space.

This fixes a v3.7+ regression which broke IXP4xx built-in network devices.

Signed-off-by: Krzysztof Ha?asa <khc@pm.waw.pl>
Cc: <stable@vger.kernel.org>

Comments

David Miller March 23, 2013, 11:57 p.m. UTC | #1
From: Krzysztof Halasa <khc@pm.waw.pl>
Date: Sat, 23 Mar 2013 20:35:46 +0100

> ARM core code currently requires coherent DMA mask to be set. Make sure
> we limit PCI devices to 64 MiB while allowing on-chip devices to access
> the whole 4 GiB address space.
> 
> This fixes a v3.7+ regression which broke IXP4xx built-in network devices.
> 
> Signed-off-by: Krzysztof Ha?asa <khc@pm.waw.pl>

This requirement is not reasonable.

The DMA API documentation clearly states what the default must be,
and what drivers are guarenteed will be the default.
Ben Hutchings March 24, 2013, 7:11 p.m. UTC | #2
On Sat, 2013-03-23 at 19:57 -0400, David Miller wrote:
> From: Krzysztof Halasa <khc@pm.waw.pl>
> Date: Sat, 23 Mar 2013 20:35:46 +0100
> 
> > ARM core code currently requires coherent DMA mask to be set. Make sure
> > we limit PCI devices to 64 MiB while allowing on-chip devices to access
> > the whole 4 GiB address space.
> > 
> > This fixes a v3.7+ regression which broke IXP4xx built-in network devices.
> > 
> > Signed-off-by: Krzysztof Ha?asa <khc@pm.waw.pl>
> 
> This requirement is not reasonable.
> 
> The DMA API documentation clearly states what the default must be,
> and what drivers are guarenteed will be the default.

I'm failing to see where it says the default can't be narrower than 32
bits due to platform limits.  And how do you think DMA mapping is
supposed to work for PCI devices on these platforms, anyway?

Ben.
Krzysztof Halasa March 24, 2013, 9:15 p.m. UTC | #3
Ben Hutchings <bhutchings@solarflare.com> writes:

> I'm failing to see where it says the default can't be narrower than 32
> bits due to platform limits.  And how do you think DMA mapping is
> supposed to work for PCI devices on these platforms, anyway?

The problem on ARM (and probably on powerpc, and on something called
"metag" - grep -r 'coherent DMA mask is unset' arch) is that the default
coherent DMA mask is zero. IOW, coherent DMA allocations are, by
default, disabled. A driver has to dma_set_coherent_mask() or, as many
drivers do, set dev->coherent_dma_mask directly (IMHO
dev->coherent_dma_mask along with dev->dma_mask are private DMA API
stuff and e.g. device drivers have no interest there).

The zero default is IMHO, WRT the actual DMA API, an ARM bug (and
powerpc's etc). Nevertheless, the patch I posted does everything as
required by the API. Specifically, the IXP4xx arch part makes
IXP4xx's dma_set_coherent_mask() compliant with DMA API, and the actual
dma_set_coherent_mask() calls in drivers are both valid and I guess
recommended by the API.

The patch doesn't touch the core ARM issue, that's right.
Krzysztof Halasa March 30, 2013, 1:18 p.m. UTC | #4
David Miller <davem@davemloft.net> writes:

>> ARM core code currently requires coherent DMA mask to be set.

> This requirement is not reasonable.
>
> The DMA API documentation clearly states what the default must be,
> and what drivers are guarenteed will be the default.

ARM people - what do you think?
Russell King - ARM Linux March 30, 2013, 1:29 p.m. UTC | #5
On Sun, Mar 24, 2013 at 10:15:36PM +0100, Krzysztof Halasa wrote:
> The problem on ARM (and probably on powerpc, and on something called
> "metag" - grep -r 'coherent DMA mask is unset' arch) is that the default
> coherent DMA mask is zero. IOW, coherent DMA allocations are, by
> default, disabled. A driver has to dma_set_coherent_mask() or, as many
> drivers do, set dev->coherent_dma_mask directly (IMHO
> dev->coherent_dma_mask along with dev->dma_mask are private DMA API
> stuff and e.g. device drivers have no interest there).
> 
> The zero default is IMHO, WRT the actual DMA API, an ARM bug (and
> powerpc's etc). Nevertheless, the patch I posted does everything as
> required by the API. Specifically, the IXP4xx arch part makes
> IXP4xx's dma_set_coherent_mask() compliant with DMA API, and the actual
> dma_set_coherent_mask() calls in drivers are both valid and I guess
> recommended by the API.
> 
> The patch doesn't touch the core ARM issue, that's right.

I'm having a hard time understanding what is an ARM issue here, what is
an ARM bug, and what the DMA API requires.  The DMA API documentation
is extremely sparse in describing what's required of the DMA masks,
what these functions are supposed to do, and what determines whether
a mask is "possible" or not.

Moreover, I'm also having a hard time understanding what broke in 3.7,
and why this fixes it.

In other words, I'm completely failing to understand everything about
this patch.
Krzysztof Halasa March 30, 2013, 2:22 p.m. UTC | #6
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> I'm having a hard time understanding what is an ARM issue here, what is
> an ARM bug, and what the DMA API requires.  The DMA API documentation
> is extremely sparse in describing what's required of the DMA masks,
> what these functions are supposed to do, and what determines whether
> a mask is "possible" or not.
>
> Moreover, I'm also having a hard time understanding what broke in 3.7,
> and why this fixes it.
>
> In other words, I'm completely failing to understand everything about
> this patch.

The ARM issue here is that the coherent DMA mask, by default, is zero
(i.e. coherent allocations are impossible by default UNLESS the device
pointer supplied is NULL - since DMA masks are bound to devices).
On most other platforms, the default DMA mask is 0xFFFFFFFF = (u32)-1
and this is also required by the DMA API.

In v3.7 (between v3.7-rc6 and v3.7-rc7), Xi Wang noticed that IXP4xx net
drivers call dma_pool_create() with NULL dev argument, and changed them
to use &port->netdev->dev. This broke coherent allocations since now the
device supplied to dma_pool_create() is not NULL and the (zero) mask
prevents coherent allocations. This means the Ethernet and HSS drivers
are since then unusable.


The first part of my patch makes dma_set_coherent_mask (IXP4xx-only
code) actually set the mask. This is needed as on IXP4xx we have (at
least) two different situations:

- PCI devices want "DMA zone" memory (26 bits = 64 MiB)
- built-in devices can use any RAM (32 bits = 4 GiB).

Without the patch, dma_set_coherent_mask only returns 0 or -EIO, it
doesn't change the actual coherent DMA mask and it's then impossible for
coherent allocators to differentiate between the above two cases.

+++ b/arch/arm/mach-ixp4xx/common-pci.c
@@ -454,10 +454,15 @@ int ixp4xx_setup(int nr, struct pci_sys_data *sys)
 
 int dma_set_coherent_mask(struct device *dev, u64 mask)
 {
-	if (mask >= SZ_64M - 1)
-		return 0;
+	if ((mask & DMA_BIT_MASK(26)) != DMA_BIT_MASK(26))
+		return -EIO;
+
+	/* PCI devices are limited to 64 MiB */
+	if (dev_is_pci(dev))
+		mask &= DMA_BIT_MASK(26); /* Use DMA region */
 
-	return -EIO;
+	dev->coherent_dma_mask = mask;
+	return 0;
 }


The second part of my patch sets the coherent DMA masks of the Ethernet
and HSS devices to 4 GiB (the value which should already be the
default). This also seems to be a recommended action.

+++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c
@@ -1423,7 +1423,7 @@ static int eth_init_one(struct platform_device *pdev)
 	dev->ethtool_ops = &ixp4xx_ethtool_ops;
 	dev->tx_queue_len = 100;
-
+	dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32));
 	netif_napi_add(dev, &port->napi, eth_poll, NAPI_WEIGHT);
 
+++ b/drivers/net/wan/ixp4xx_hss.c
@@ -1367,6 +1367,7 @@ static int hss_init_one(struct platform_device *pdev)
 	port->dev = &pdev->dev;
 	port->plat = pdev->dev.platform_data;
+	dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32));
 	netif_napi_add(dev, &port->napi, hss_hdlc_poll, NAPI_WEIGHT);
Russell King - ARM Linux March 30, 2013, 3:31 p.m. UTC | #7
On Sat, Mar 30, 2013 at 03:22:46PM +0100, Krzysztof Halasa wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > I'm having a hard time understanding what is an ARM issue here, what is
> > an ARM bug, and what the DMA API requires.  The DMA API documentation
> > is extremely sparse in describing what's required of the DMA masks,
> > what these functions are supposed to do, and what determines whether
> > a mask is "possible" or not.
> >
> > Moreover, I'm also having a hard time understanding what broke in 3.7,
> > and why this fixes it.
> >
> > In other words, I'm completely failing to understand everything about
> > this patch.
> 
> The ARM issue here is that the coherent DMA mask, by default, is zero
> (i.e. coherent allocations are impossible by default UNLESS the device
> pointer supplied is NULL - since DMA masks are bound to devices).

So why is this the case - and on what devices?

If it's the case for platform devices, and those platform devices are
created by board code, that is the fault of whoever wrote the board
code for those platform devices.

And the reason this will happen is because programmers are human and
lazy.  If something isn't required for something to function, it won't
be thought about, and it seems from your description that the DMA masks
being set correctly wasn't required.

> On most other platforms, the default DMA mask is 0xFFFFFFFF = (u32)-1
> and this is also required by the DMA API.
> 
> In v3.7 (between v3.7-rc6 and v3.7-rc7), Xi Wang noticed that IXP4xx net
> drivers call dma_pool_create() with NULL dev argument, and changed them
> to use &port->netdev->dev. This broke coherent allocations since now the
> device supplied to dma_pool_create() is not NULL and the (zero) mask
> prevents coherent allocations. This means the Ethernet and HSS drivers
> are since then unusable.
> 
> 
> The first part of my patch makes dma_set_coherent_mask (IXP4xx-only
> code) actually set the mask. This is needed as on IXP4xx we have (at
> least) two different situations:
> 
> - PCI devices want "DMA zone" memory (26 bits = 64 MiB)
> - built-in devices can use any RAM (32 bits = 4 GiB).

As I understand it (bear in mind that I have nothing to do with IXP4xx,
so I'm talking from purely what I understand of the platform) the reason
this isn't done is because of this code:

static int ixp4xx_pci_platform_notify(struct device *dev)
{
        if(dev->bus == &pci_bus_type) {
                *dev->dma_mask =  SZ_64M - 1;
                dev->coherent_dma_mask = SZ_64M - 1;
                dmabounce_register_dev(dev, 2048, 4096, ixp4xx_needs_bounce);
        }
        return 0;
}

which pre-sets the DMA and coherent masks for PCI devices.  However,
this doesn't make your patch wrong - I'm just pointing out that the
setting of the mask should already be done for PCI devices at creation
time.

> Without the patch, dma_set_coherent_mask only returns 0 or -EIO, it
> doesn't change the actual coherent DMA mask and it's then impossible for
> coherent allocators to differentiate between the above two cases.
> 
> +++ b/arch/arm/mach-ixp4xx/common-pci.c
> @@ -454,10 +454,15 @@ int ixp4xx_setup(int nr, struct pci_sys_data *sys)
>  
>  int dma_set_coherent_mask(struct device *dev, u64 mask)
>  {
> -	if (mask >= SZ_64M - 1)
> -		return 0;
> +	if ((mask & DMA_BIT_MASK(26)) != DMA_BIT_MASK(26))
> +		return -EIO;
> +
> +	/* PCI devices are limited to 64 MiB */
> +	if (dev_is_pci(dev))
> +		mask &= DMA_BIT_MASK(26); /* Use DMA region */
>  
> -	return -EIO;
> +	dev->coherent_dma_mask = mask;
> +	return 0;
>  }
> 
> 
> The second part of my patch sets the coherent DMA masks of the Ethernet
> and HSS devices to 4 GiB (the value which should already be the
> default). This also seems to be a recommended action.
> 
> +++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c
> @@ -1423,7 +1423,7 @@ static int eth_init_one(struct platform_device *pdev)
>  	dev->ethtool_ops = &ixp4xx_ethtool_ops;
>  	dev->tx_queue_len = 100;
> -
> +	dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32));
>  	netif_napi_add(dev, &port->napi, eth_poll, NAPI_WEIGHT);
>  
> +++ b/drivers/net/wan/ixp4xx_hss.c
> @@ -1367,6 +1367,7 @@ static int hss_init_one(struct platform_device *pdev)
>  	port->dev = &pdev->dev;
>  	port->plat = pdev->dev.platform_data;
> +	dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32));
>  	netif_napi_add(dev, &port->napi, hss_hdlc_poll, NAPI_WEIGHT);
> 

Right, so, the answer is - yes you are talking about platform devices,
and the reason that these aren't already set is because if you grep for
ixp4xx_eth or ixp4xx_hss in arch/arm/mach-ixp4xx, you'll notice that
_none_ of the device declarations set either of the DMA masks at all.
They don't even set the dev->dma_mask pointer.  That is why the masks
are zero.  For a device which does DMA, that is wrong.

If you look at the PCI code, it pre-initializes the DMA mask to be 4GiB:
drivers/pci/probe.c:        dev->dev.coherent_dma_mask = 0xffffffffull;

And that is what is missing from the IXP4xx platform code.

However, avoiding the call to dma_set_coherent_mask() from within the
driver also seems to be questionable as it bypasses the "check if the
mask is possible" part of the DMA API.

So, I think your patch(es) are fine, but I would also suggest adding
the initializations for the coherent DMA mask on those platform device
declarations in arch/arm/mach-ixp4xx, and setting the dma_mask pointer
properly as well - espeically as these drivers use the streaming DMA
API as well.  That will bring IXP4xx into line with stuff like the PCI
layer.
Krzysztof Halasa April 1, 2013, 8:17 p.m. UTC | #8
Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> Right, so, the answer is - yes you are talking about platform devices,
> and the reason that these aren't already set is because if you grep for
> ixp4xx_eth or ixp4xx_hss in arch/arm/mach-ixp4xx, you'll notice that
> _none_ of the device declarations set either of the DMA masks at all.
> They don't even set the dev->dma_mask pointer.  That is why the masks
> are zero.  For a device which does DMA, that is wrong.

Well, that's new to me. Please tell me how it should be done.
Should I simply add in platform code (on all platforms):

+++ b/arch/arm/mach-ixp4xx/XXX.c
@@ -380,10 +380,12 @@ static struct platform_device device_eth_tab[] = {
         .name           = "ixp4xx_eth",
         .id         = IXP4XX_ETH_NPEB,
         .dev.platform_data  = eth_plat,
+        .dev.coherent_dma_mask  = DMA_BIT_MASK(32),
     }, {
         .name           = "ixp4xx_eth",
         .id         = IXP4XX_ETH_NPEC,
         .dev.platform_data  = eth_plat + 1,
+        .dev.coherent_dma_mask  = DMA_BIT_MASK(32),
     }
 };

This alone isn't going to work, the problem is coherent DMA mask in
net_dev->dev and not in platform_device. So what do I do in the network
drivers? Copy the masks from platform_device to the actual newly created
net_dev->dev?

Should I use the parent device (net_dev->dev.parent which is the
platform_device) as the argument to the allocator? This would probably
work though I'm not sure of its correctness.

BTW the platform code shouldn't IMHO need to declare the masks as they
aren't platform-specific. They are driver-specific (defined by CPU
design) and setting them in the driver seems clean to me (unlike the
platform code).

Several other device drivers simply set their masks directly. Is it the
recommended way?

> If you look at the PCI code, it pre-initializes the DMA mask to be 4GiB:
> drivers/pci/probe.c:        dev->dev.coherent_dma_mask = 0xffffffffull;
>
> And that is what is missing from the IXP4xx platform code.
>
> However, avoiding the call to dma_set_coherent_mask() from within the
> driver also seems to be questionable as it bypasses the "check if the
> mask is possible" part of the DMA API.

I thought about this for a second but the situation is the mask
is guaranteed to be valid (these are on-chip devices).
Ben Hutchings April 2, 2013, 12:40 a.m. UTC | #9
On Mon, 2013-04-01 at 22:17 +0200, Krzysztof Halasa wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> 
> > Right, so, the answer is - yes you are talking about platform devices,
> > and the reason that these aren't already set is because if you grep for
> > ixp4xx_eth or ixp4xx_hss in arch/arm/mach-ixp4xx, you'll notice that
> > _none_ of the device declarations set either of the DMA masks at all.
> > They don't even set the dev->dma_mask pointer.  That is why the masks
> > are zero.  For a device which does DMA, that is wrong.
> 
> Well, that's new to me. Please tell me how it should be done.
> Should I simply add in platform code (on all platforms):
> 
> +++ b/arch/arm/mach-ixp4xx/XXX.c
> @@ -380,10 +380,12 @@ static struct platform_device device_eth_tab[] = {
>          .name           = "ixp4xx_eth",
>          .id         = IXP4XX_ETH_NPEB,
>          .dev.platform_data  = eth_plat,
> +        .dev.coherent_dma_mask  = DMA_BIT_MASK(32),
>      }, {
>          .name           = "ixp4xx_eth",
>          .id         = IXP4XX_ETH_NPEC,
>          .dev.platform_data  = eth_plat + 1,
> +        .dev.coherent_dma_mask  = DMA_BIT_MASK(32),
>      }
>  };
> 
> This alone isn't going to work, the problem is coherent DMA mask in
> net_dev->dev and not in platform_device. So what do I do in the network
> drivers? Copy the masks from platform_device to the actual newly created
> net_dev->dev?
>
> Should I use the parent device (net_dev->dev.parent which is the
> platform_device) as the argument to the allocator? This would probably
> work though I'm not sure of its correctness.
[...]

Passing the parent device to DMA API functions seems to be the right
thing to do.  It's what you would do in a PCI device driver.

Ben.
diff mbox

Patch

diff --git a/arch/arm/mach-ixp4xx/common-pci.c b/arch/arm/mach-ixp4xx/common-pci.c
index ea1933d..8629fc9 100644
--- a/arch/arm/mach-ixp4xx/common-pci.c
+++ b/arch/arm/mach-ixp4xx/common-pci.c
@@ -454,10 +454,15 @@  int ixp4xx_setup(int nr, struct pci_sys_data *sys)
 
 int dma_set_coherent_mask(struct device *dev, u64 mask)
 {
-	if (mask >= SZ_64M - 1)
-		return 0;
+	if ((mask & DMA_BIT_MASK(26)) != DMA_BIT_MASK(26))
+		return -EIO;
+
+	/* PCI devices are limited to 64 MiB */
+	if (dev_is_pci(dev))
+		mask &= DMA_BIT_MASK(26); /* Use DMA region */
 
-	return -EIO;
+	dev->coherent_dma_mask = mask;
+	return 0;
 }
 
 EXPORT_SYMBOL(ixp4xx_pci_read);
diff --git a/drivers/net/ethernet/xscale/ixp4xx_eth.c b/drivers/net/ethernet/xscale/ixp4xx_eth.c
index 6958a5e..7c08269 100644
--- a/drivers/net/ethernet/xscale/ixp4xx_eth.c
+++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c
@@ -1423,7 +1423,7 @@  static int eth_init_one(struct platform_device *pdev)
 	dev->netdev_ops = &ixp4xx_netdev_ops;
 	dev->ethtool_ops = &ixp4xx_ethtool_ops;
 	dev->tx_queue_len = 100;
-
+	dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32));
 	netif_napi_add(dev, &port->napi, eth_poll, NAPI_WEIGHT);
 
 	if (!(port->npe = npe_request(NPE_ID(port->id)))) {
diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c
index 95d0451..83b4597 100644
--- a/drivers/net/wan/ixp4xx_hss.c
+++ b/drivers/net/wan/ixp4xx_hss.c
@@ -1367,6 +1367,7 @@  static int hss_init_one(struct platform_device *pdev)
 	port->id = pdev->id;
 	port->dev = &pdev->dev;
 	port->plat = pdev->dev.platform_data;
+	dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32));
 	netif_napi_add(dev, &port->napi, hss_hdlc_poll, NAPI_WEIGHT);
 
 	if ((err = register_hdlc_device(dev)))