Message ID | m3obe9lxil.fsf@intrepid.localdomain (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
From: Krzysztof Halasa <khc@pm.waw.pl> Date: Sat, 23 Mar 2013 20:35:46 +0100 > ARM core code currently requires coherent DMA mask to be set. Make sure > we limit PCI devices to 64 MiB while allowing on-chip devices to access > the whole 4 GiB address space. > > This fixes a v3.7+ regression which broke IXP4xx built-in network devices. > > Signed-off-by: Krzysztof Ha?asa <khc@pm.waw.pl> This requirement is not reasonable. The DMA API documentation clearly states what the default must be, and what drivers are guarenteed will be the default.
On Sat, 2013-03-23 at 19:57 -0400, David Miller wrote: > From: Krzysztof Halasa <khc@pm.waw.pl> > Date: Sat, 23 Mar 2013 20:35:46 +0100 > > > ARM core code currently requires coherent DMA mask to be set. Make sure > > we limit PCI devices to 64 MiB while allowing on-chip devices to access > > the whole 4 GiB address space. > > > > This fixes a v3.7+ regression which broke IXP4xx built-in network devices. > > > > Signed-off-by: Krzysztof Ha?asa <khc@pm.waw.pl> > > This requirement is not reasonable. > > The DMA API documentation clearly states what the default must be, > and what drivers are guarenteed will be the default. I'm failing to see where it says the default can't be narrower than 32 bits due to platform limits. And how do you think DMA mapping is supposed to work for PCI devices on these platforms, anyway? Ben.
Ben Hutchings <bhutchings@solarflare.com> writes: > I'm failing to see where it says the default can't be narrower than 32 > bits due to platform limits. And how do you think DMA mapping is > supposed to work for PCI devices on these platforms, anyway? The problem on ARM (and probably on powerpc, and on something called "metag" - grep -r 'coherent DMA mask is unset' arch) is that the default coherent DMA mask is zero. IOW, coherent DMA allocations are, by default, disabled. A driver has to dma_set_coherent_mask() or, as many drivers do, set dev->coherent_dma_mask directly (IMHO dev->coherent_dma_mask along with dev->dma_mask are private DMA API stuff and e.g. device drivers have no interest there). The zero default is IMHO, WRT the actual DMA API, an ARM bug (and powerpc's etc). Nevertheless, the patch I posted does everything as required by the API. Specifically, the IXP4xx arch part makes IXP4xx's dma_set_coherent_mask() compliant with DMA API, and the actual dma_set_coherent_mask() calls in drivers are both valid and I guess recommended by the API. The patch doesn't touch the core ARM issue, that's right.
David Miller <davem@davemloft.net> writes: >> ARM core code currently requires coherent DMA mask to be set. > This requirement is not reasonable. > > The DMA API documentation clearly states what the default must be, > and what drivers are guarenteed will be the default. ARM people - what do you think?
On Sun, Mar 24, 2013 at 10:15:36PM +0100, Krzysztof Halasa wrote: > The problem on ARM (and probably on powerpc, and on something called > "metag" - grep -r 'coherent DMA mask is unset' arch) is that the default > coherent DMA mask is zero. IOW, coherent DMA allocations are, by > default, disabled. A driver has to dma_set_coherent_mask() or, as many > drivers do, set dev->coherent_dma_mask directly (IMHO > dev->coherent_dma_mask along with dev->dma_mask are private DMA API > stuff and e.g. device drivers have no interest there). > > The zero default is IMHO, WRT the actual DMA API, an ARM bug (and > powerpc's etc). Nevertheless, the patch I posted does everything as > required by the API. Specifically, the IXP4xx arch part makes > IXP4xx's dma_set_coherent_mask() compliant with DMA API, and the actual > dma_set_coherent_mask() calls in drivers are both valid and I guess > recommended by the API. > > The patch doesn't touch the core ARM issue, that's right. I'm having a hard time understanding what is an ARM issue here, what is an ARM bug, and what the DMA API requires. The DMA API documentation is extremely sparse in describing what's required of the DMA masks, what these functions are supposed to do, and what determines whether a mask is "possible" or not. Moreover, I'm also having a hard time understanding what broke in 3.7, and why this fixes it. In other words, I'm completely failing to understand everything about this patch.
Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > I'm having a hard time understanding what is an ARM issue here, what is > an ARM bug, and what the DMA API requires. The DMA API documentation > is extremely sparse in describing what's required of the DMA masks, > what these functions are supposed to do, and what determines whether > a mask is "possible" or not. > > Moreover, I'm also having a hard time understanding what broke in 3.7, > and why this fixes it. > > In other words, I'm completely failing to understand everything about > this patch. The ARM issue here is that the coherent DMA mask, by default, is zero (i.e. coherent allocations are impossible by default UNLESS the device pointer supplied is NULL - since DMA masks are bound to devices). On most other platforms, the default DMA mask is 0xFFFFFFFF = (u32)-1 and this is also required by the DMA API. In v3.7 (between v3.7-rc6 and v3.7-rc7), Xi Wang noticed that IXP4xx net drivers call dma_pool_create() with NULL dev argument, and changed them to use &port->netdev->dev. This broke coherent allocations since now the device supplied to dma_pool_create() is not NULL and the (zero) mask prevents coherent allocations. This means the Ethernet and HSS drivers are since then unusable. The first part of my patch makes dma_set_coherent_mask (IXP4xx-only code) actually set the mask. This is needed as on IXP4xx we have (at least) two different situations: - PCI devices want "DMA zone" memory (26 bits = 64 MiB) - built-in devices can use any RAM (32 bits = 4 GiB). Without the patch, dma_set_coherent_mask only returns 0 or -EIO, it doesn't change the actual coherent DMA mask and it's then impossible for coherent allocators to differentiate between the above two cases. +++ b/arch/arm/mach-ixp4xx/common-pci.c @@ -454,10 +454,15 @@ int ixp4xx_setup(int nr, struct pci_sys_data *sys) int dma_set_coherent_mask(struct device *dev, u64 mask) { - if (mask >= SZ_64M - 1) - return 0; + if ((mask & DMA_BIT_MASK(26)) != DMA_BIT_MASK(26)) + return -EIO; + + /* PCI devices are limited to 64 MiB */ + if (dev_is_pci(dev)) + mask &= DMA_BIT_MASK(26); /* Use DMA region */ - return -EIO; + dev->coherent_dma_mask = mask; + return 0; } The second part of my patch sets the coherent DMA masks of the Ethernet and HSS devices to 4 GiB (the value which should already be the default). This also seems to be a recommended action. +++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c @@ -1423,7 +1423,7 @@ static int eth_init_one(struct platform_device *pdev) dev->ethtool_ops = &ixp4xx_ethtool_ops; dev->tx_queue_len = 100; - + dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32)); netif_napi_add(dev, &port->napi, eth_poll, NAPI_WEIGHT); +++ b/drivers/net/wan/ixp4xx_hss.c @@ -1367,6 +1367,7 @@ static int hss_init_one(struct platform_device *pdev) port->dev = &pdev->dev; port->plat = pdev->dev.platform_data; + dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32)); netif_napi_add(dev, &port->napi, hss_hdlc_poll, NAPI_WEIGHT);
On Sat, Mar 30, 2013 at 03:22:46PM +0100, Krzysztof Halasa wrote: > Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > > > I'm having a hard time understanding what is an ARM issue here, what is > > an ARM bug, and what the DMA API requires. The DMA API documentation > > is extremely sparse in describing what's required of the DMA masks, > > what these functions are supposed to do, and what determines whether > > a mask is "possible" or not. > > > > Moreover, I'm also having a hard time understanding what broke in 3.7, > > and why this fixes it. > > > > In other words, I'm completely failing to understand everything about > > this patch. > > The ARM issue here is that the coherent DMA mask, by default, is zero > (i.e. coherent allocations are impossible by default UNLESS the device > pointer supplied is NULL - since DMA masks are bound to devices). So why is this the case - and on what devices? If it's the case for platform devices, and those platform devices are created by board code, that is the fault of whoever wrote the board code for those platform devices. And the reason this will happen is because programmers are human and lazy. If something isn't required for something to function, it won't be thought about, and it seems from your description that the DMA masks being set correctly wasn't required. > On most other platforms, the default DMA mask is 0xFFFFFFFF = (u32)-1 > and this is also required by the DMA API. > > In v3.7 (between v3.7-rc6 and v3.7-rc7), Xi Wang noticed that IXP4xx net > drivers call dma_pool_create() with NULL dev argument, and changed them > to use &port->netdev->dev. This broke coherent allocations since now the > device supplied to dma_pool_create() is not NULL and the (zero) mask > prevents coherent allocations. This means the Ethernet and HSS drivers > are since then unusable. > > > The first part of my patch makes dma_set_coherent_mask (IXP4xx-only > code) actually set the mask. This is needed as on IXP4xx we have (at > least) two different situations: > > - PCI devices want "DMA zone" memory (26 bits = 64 MiB) > - built-in devices can use any RAM (32 bits = 4 GiB). As I understand it (bear in mind that I have nothing to do with IXP4xx, so I'm talking from purely what I understand of the platform) the reason this isn't done is because of this code: static int ixp4xx_pci_platform_notify(struct device *dev) { if(dev->bus == &pci_bus_type) { *dev->dma_mask = SZ_64M - 1; dev->coherent_dma_mask = SZ_64M - 1; dmabounce_register_dev(dev, 2048, 4096, ixp4xx_needs_bounce); } return 0; } which pre-sets the DMA and coherent masks for PCI devices. However, this doesn't make your patch wrong - I'm just pointing out that the setting of the mask should already be done for PCI devices at creation time. > Without the patch, dma_set_coherent_mask only returns 0 or -EIO, it > doesn't change the actual coherent DMA mask and it's then impossible for > coherent allocators to differentiate between the above two cases. > > +++ b/arch/arm/mach-ixp4xx/common-pci.c > @@ -454,10 +454,15 @@ int ixp4xx_setup(int nr, struct pci_sys_data *sys) > > int dma_set_coherent_mask(struct device *dev, u64 mask) > { > - if (mask >= SZ_64M - 1) > - return 0; > + if ((mask & DMA_BIT_MASK(26)) != DMA_BIT_MASK(26)) > + return -EIO; > + > + /* PCI devices are limited to 64 MiB */ > + if (dev_is_pci(dev)) > + mask &= DMA_BIT_MASK(26); /* Use DMA region */ > > - return -EIO; > + dev->coherent_dma_mask = mask; > + return 0; > } > > > The second part of my patch sets the coherent DMA masks of the Ethernet > and HSS devices to 4 GiB (the value which should already be the > default). This also seems to be a recommended action. > > +++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c > @@ -1423,7 +1423,7 @@ static int eth_init_one(struct platform_device *pdev) > dev->ethtool_ops = &ixp4xx_ethtool_ops; > dev->tx_queue_len = 100; > - > + dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32)); > netif_napi_add(dev, &port->napi, eth_poll, NAPI_WEIGHT); > > +++ b/drivers/net/wan/ixp4xx_hss.c > @@ -1367,6 +1367,7 @@ static int hss_init_one(struct platform_device *pdev) > port->dev = &pdev->dev; > port->plat = pdev->dev.platform_data; > + dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32)); > netif_napi_add(dev, &port->napi, hss_hdlc_poll, NAPI_WEIGHT); > Right, so, the answer is - yes you are talking about platform devices, and the reason that these aren't already set is because if you grep for ixp4xx_eth or ixp4xx_hss in arch/arm/mach-ixp4xx, you'll notice that _none_ of the device declarations set either of the DMA masks at all. They don't even set the dev->dma_mask pointer. That is why the masks are zero. For a device which does DMA, that is wrong. If you look at the PCI code, it pre-initializes the DMA mask to be 4GiB: drivers/pci/probe.c: dev->dev.coherent_dma_mask = 0xffffffffull; And that is what is missing from the IXP4xx platform code. However, avoiding the call to dma_set_coherent_mask() from within the driver also seems to be questionable as it bypasses the "check if the mask is possible" part of the DMA API. So, I think your patch(es) are fine, but I would also suggest adding the initializations for the coherent DMA mask on those platform device declarations in arch/arm/mach-ixp4xx, and setting the dma_mask pointer properly as well - espeically as these drivers use the streaming DMA API as well. That will bring IXP4xx into line with stuff like the PCI layer.
Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > Right, so, the answer is - yes you are talking about platform devices, > and the reason that these aren't already set is because if you grep for > ixp4xx_eth or ixp4xx_hss in arch/arm/mach-ixp4xx, you'll notice that > _none_ of the device declarations set either of the DMA masks at all. > They don't even set the dev->dma_mask pointer. That is why the masks > are zero. For a device which does DMA, that is wrong. Well, that's new to me. Please tell me how it should be done. Should I simply add in platform code (on all platforms): +++ b/arch/arm/mach-ixp4xx/XXX.c @@ -380,10 +380,12 @@ static struct platform_device device_eth_tab[] = { .name = "ixp4xx_eth", .id = IXP4XX_ETH_NPEB, .dev.platform_data = eth_plat, + .dev.coherent_dma_mask = DMA_BIT_MASK(32), }, { .name = "ixp4xx_eth", .id = IXP4XX_ETH_NPEC, .dev.platform_data = eth_plat + 1, + .dev.coherent_dma_mask = DMA_BIT_MASK(32), } }; This alone isn't going to work, the problem is coherent DMA mask in net_dev->dev and not in platform_device. So what do I do in the network drivers? Copy the masks from platform_device to the actual newly created net_dev->dev? Should I use the parent device (net_dev->dev.parent which is the platform_device) as the argument to the allocator? This would probably work though I'm not sure of its correctness. BTW the platform code shouldn't IMHO need to declare the masks as they aren't platform-specific. They are driver-specific (defined by CPU design) and setting them in the driver seems clean to me (unlike the platform code). Several other device drivers simply set their masks directly. Is it the recommended way? > If you look at the PCI code, it pre-initializes the DMA mask to be 4GiB: > drivers/pci/probe.c: dev->dev.coherent_dma_mask = 0xffffffffull; > > And that is what is missing from the IXP4xx platform code. > > However, avoiding the call to dma_set_coherent_mask() from within the > driver also seems to be questionable as it bypasses the "check if the > mask is possible" part of the DMA API. I thought about this for a second but the situation is the mask is guaranteed to be valid (these are on-chip devices).
On Mon, 2013-04-01 at 22:17 +0200, Krzysztof Halasa wrote: > Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > > > Right, so, the answer is - yes you are talking about platform devices, > > and the reason that these aren't already set is because if you grep for > > ixp4xx_eth or ixp4xx_hss in arch/arm/mach-ixp4xx, you'll notice that > > _none_ of the device declarations set either of the DMA masks at all. > > They don't even set the dev->dma_mask pointer. That is why the masks > > are zero. For a device which does DMA, that is wrong. > > Well, that's new to me. Please tell me how it should be done. > Should I simply add in platform code (on all platforms): > > +++ b/arch/arm/mach-ixp4xx/XXX.c > @@ -380,10 +380,12 @@ static struct platform_device device_eth_tab[] = { > .name = "ixp4xx_eth", > .id = IXP4XX_ETH_NPEB, > .dev.platform_data = eth_plat, > + .dev.coherent_dma_mask = DMA_BIT_MASK(32), > }, { > .name = "ixp4xx_eth", > .id = IXP4XX_ETH_NPEC, > .dev.platform_data = eth_plat + 1, > + .dev.coherent_dma_mask = DMA_BIT_MASK(32), > } > }; > > This alone isn't going to work, the problem is coherent DMA mask in > net_dev->dev and not in platform_device. So what do I do in the network > drivers? Copy the masks from platform_device to the actual newly created > net_dev->dev? > > Should I use the parent device (net_dev->dev.parent which is the > platform_device) as the argument to the allocator? This would probably > work though I'm not sure of its correctness. [...] Passing the parent device to DMA API functions seems to be the right thing to do. It's what you would do in a PCI device driver. Ben.
diff --git a/arch/arm/mach-ixp4xx/common-pci.c b/arch/arm/mach-ixp4xx/common-pci.c index ea1933d..8629fc9 100644 --- a/arch/arm/mach-ixp4xx/common-pci.c +++ b/arch/arm/mach-ixp4xx/common-pci.c @@ -454,10 +454,15 @@ int ixp4xx_setup(int nr, struct pci_sys_data *sys) int dma_set_coherent_mask(struct device *dev, u64 mask) { - if (mask >= SZ_64M - 1) - return 0; + if ((mask & DMA_BIT_MASK(26)) != DMA_BIT_MASK(26)) + return -EIO; + + /* PCI devices are limited to 64 MiB */ + if (dev_is_pci(dev)) + mask &= DMA_BIT_MASK(26); /* Use DMA region */ - return -EIO; + dev->coherent_dma_mask = mask; + return 0; } EXPORT_SYMBOL(ixp4xx_pci_read); diff --git a/drivers/net/ethernet/xscale/ixp4xx_eth.c b/drivers/net/ethernet/xscale/ixp4xx_eth.c index 6958a5e..7c08269 100644 --- a/drivers/net/ethernet/xscale/ixp4xx_eth.c +++ b/drivers/net/ethernet/xscale/ixp4xx_eth.c @@ -1423,7 +1423,7 @@ static int eth_init_one(struct platform_device *pdev) dev->netdev_ops = &ixp4xx_netdev_ops; dev->ethtool_ops = &ixp4xx_ethtool_ops; dev->tx_queue_len = 100; - + dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32)); netif_napi_add(dev, &port->napi, eth_poll, NAPI_WEIGHT); if (!(port->npe = npe_request(NPE_ID(port->id)))) { diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c index 95d0451..83b4597 100644 --- a/drivers/net/wan/ixp4xx_hss.c +++ b/drivers/net/wan/ixp4xx_hss.c @@ -1367,6 +1367,7 @@ static int hss_init_one(struct platform_device *pdev) port->id = pdev->id; port->dev = &pdev->dev; port->plat = pdev->dev.platform_data; + dma_set_coherent_mask(&dev->dev, DMA_BIT_MASK(32)); netif_napi_add(dev, &port->napi, hss_hdlc_poll, NAPI_WEIGHT); if ((err = register_hdlc_device(dev)))
ARM core code currently requires coherent DMA mask to be set. Make sure we limit PCI devices to 64 MiB while allowing on-chip devices to access the whole 4 GiB address space. This fixes a v3.7+ regression which broke IXP4xx built-in network devices. Signed-off-by: Krzysztof Ha?asa <khc@pm.waw.pl> Cc: <stable@vger.kernel.org>