Message ID | 20190326230131.16275-2-nicoleotsuka@gmail.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | Save single pages from CMA area | expand |
On Tue, Mar 26, 2019 at 04:01:27PM -0700, Nicolin Chen wrote: > page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN); > + if (!page) > + page = alloc_pages(gfp, order); We have this fallback in most callers already. And with me adding it to the dma-iommu code in one series, and you to arm here I think we really need to take a step back and think of a better way to handle this, and the general mess that dma_alloc_from_contiguous. So what about: (1) change the dma_alloc_from_contiguous prototype to be: struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp); that is: calculate order and count internally, pass the full gfp_t and mask it internally, and drop the pointless from in the name. I'd also use the oppurtunity to forbid a NULL dev argument and opencode those uses. (2) handle the alloc_pages fallback internally. Note that we should use alloc_pages_node as we do in dma-direct. > + if (!dma_release_from_contiguous(dev, page, count)) > + __free_pages(page, get_order(size)); Same for dma_release_from_contiguous - drop the _from, pass the actual size, and handle the free_pages fallback.
On Wed, Apr 24, 2019 at 05:06:38PM +0200, Christoph Hellwig wrote: > I'd also use the oppurtunity to forbid a NULL dev argument and > opencode those uses. Actually, looking at your last patch again the NULL argument might still fit in ok, so maybe we should keep it.
Hi Christoph, On Wed, Apr 24, 2019 at 05:06:38PM +0200, Christoph Hellwig wrote: > On Tue, Mar 26, 2019 at 04:01:27PM -0700, Nicolin Chen wrote: > > page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN); > > + if (!page) > > + page = alloc_pages(gfp, order); > > We have this fallback in most callers already. And with me adding > it to the dma-iommu code in one series, and you to arm here I think > we really need to take a step back and think of a better way > to handle this, and the general mess that dma_alloc_from_contiguous. > > So what about: Thanks for the suggestion! > (1) change the dma_alloc_from_contiguous prototype to be: > > struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp); > > that is: calculate order and count internally, pass the full gfp_t > and mask it internally, and drop the pointless from in the name. > I'd also use the oppurtunity to forbid a NULL dev argument and > opencode those uses. > > (2) handle the alloc_pages fallback internally. Note that we should > use alloc_pages_node as we do in dma-direct. I feel it's similar to my previous set, which did most of these internally except the renaming part. But Catalin had a concern that some platforms might have limits on CMA range [1]. Will it be still okay to do the fallback internally? [1: https://www.spinics.net/lists/arm-kernel/msg714295.html ]
On Wed, Apr 24, 2019 at 11:33:11AM -0700, Nicolin Chen wrote: > I feel it's similar to my previous set, which did most of these > internally except the renaming part. But Catalin had a concern > that some platforms might have limits on CMA range [1]. Will it > be still okay to do the fallback internally? > > [1: https://www.spinics.net/lists/arm-kernel/msg714295.html ] Catalins statement is correct, but I don't see how it applies to your patch. Your patch just ensures that the fallback we have in most callers is uniformly applied everywhere. The non-iommu callers will still need to select a specific zone and/or retry just the page allocator with other flags if the CMA (or fallback) page doesn't match what they need. dma-direct does this correctly and I think the arm32 allocator does as well, although it is a bit hard to follow sometimes.
On Wed, Apr 24, 2019 at 09:26:52PM +0200, Christoph Hellwig wrote: > On Wed, Apr 24, 2019 at 11:33:11AM -0700, Nicolin Chen wrote: > > I feel it's similar to my previous set, which did most of these > > internally except the renaming part. But Catalin had a concern > > that some platforms might have limits on CMA range [1]. Will it > > be still okay to do the fallback internally? > > > > [1: https://www.spinics.net/lists/arm-kernel/msg714295.html ] > > Catalins statement is correct, but I don't see how it applies to > your patch. Your patch just ensures that the fallback we have > in most callers is uniformly applied everywhere. The non-iommu > callers will still need to select a specific zone and/or retry > just the page allocator with other flags if the CMA (or fallback) > page doesn't match what they need. dma-direct does this correctly > and I think the arm32 allocator does as well, although it is a bit > hard to follow sometimes. Okay. I will revise and submit the patches then. I think we can still discuss on this topic once we have the changes. Thanks
On Wed, Apr 24, 2019 at 05:06:38PM +0200, Christoph Hellwig wrote: > > + if (!dma_release_from_contiguous(dev, page, count)) > > + __free_pages(page, get_order(size)); > > Same for dma_release_from_contiguous - drop the _from, pass the > actual size, and handle the free_pages fallback. What do you think of dma_free_contiguous() instead? I feel "free" is a bit more commonly used (in dma-mapping.h) and it's shorter. Thanks
On Fri, Apr 26, 2019 at 01:21:12PM -0700, Nicolin Chen wrote: > What do you think of dma_free_contiguous() instead? I feel "free" > is a bit more commonly used (in dma-mapping.h) and it's shorter. Yeah, that sounds good.
(catching up on email) On Wed, Apr 24, 2019 at 09:26:52PM +0200, Christoph Hellwig wrote: > On Wed, Apr 24, 2019 at 11:33:11AM -0700, Nicolin Chen wrote: > > I feel it's similar to my previous set, which did most of these > > internally except the renaming part. But Catalin had a concern > > that some platforms might have limits on CMA range [1]. Will it > > be still okay to do the fallback internally? > > > > [1: https://www.spinics.net/lists/arm-kernel/msg714295.html ] > > Catalins statement is correct, but I don't see how it applies to > your patch. Your patch just ensures that the fallback we have > in most callers is uniformly applied everywhere. The non-iommu > callers will still need to select a specific zone and/or retry > just the page allocator with other flags if the CMA (or fallback) > page doesn't match what they need. dma-direct does this correctly > and I think the arm32 allocator does as well, although it is a bit > hard to follow sometimes. My reading of the arm32 __dma_alloc() is that if the conditions are right for the CMA allocator (allows blocking) and there is a default CMA area or a per-device one, the call ends up in cma_alloc() without any fallback if such allocation fails. Whether this is on purpose, I'm not entirely sure. There are a couple of arm32 SoCs which call dma_declare_contiguous() or dma_contiguous_reserve_area() and a few DT files describing a specific CMA range (e.g. arch/arm/boot/dts/sun5i.dtsi with a comment that address must be kept in the lower 256MB). If ZONE_DMA is set up correctly so that cma_alloc() is (or can be made) interchangeable with alloc_pages(GFP_DMA) from a device DMA capability perspective , I think it should be fine to have such fallback.
On Tue, Apr 30, 2019 at 04:24:21PM +0100, Catalin Marinas wrote: > My reading of the arm32 __dma_alloc() is that if the conditions are > right for the CMA allocator (allows blocking) and there is a default CMA > area or a per-device one, the call ends up in cma_alloc() without any > fallback if such allocation fails. Whether this is on purpose, I'm not > entirely sure. There are a couple of arm32 SoCs which call > dma_declare_contiguous() or dma_contiguous_reserve_area() and a few DT > files describing a specific CMA range (e.g. arch/arm/boot/dts/sun5i.dtsi > with a comment that address must be kept in the lower 256MB). > > If ZONE_DMA is set up correctly so that cma_alloc() is (or can be made) > interchangeable with alloc_pages(GFP_DMA) from a device DMA capability > perspective , I think it should be fine to have such fallback. Indeed. I missed arm32 being different from everyone else, but we already addresses that in another thread. Sorry for misleading everyone.
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index 8a90f298af96..febaf637a25b 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -589,6 +589,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size, void *ptr = NULL; page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN); + if (!page) + page = alloc_pages(gfp, order); if (!page) return NULL; @@ -600,7 +602,8 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size, if (PageHighMem(page)) { ptr = __dma_alloc_remap(page, size, GFP_KERNEL, prot, caller); if (!ptr) { - dma_release_from_contiguous(dev, page, count); + if (!dma_release_from_contiguous(dev, page, count)) + __free_pages(page, get_order(size)); return NULL; } } else { @@ -622,7 +625,8 @@ static void __free_from_contiguous(struct device *dev, struct page *page, else __dma_remap(page, size, PAGE_KERNEL); } - dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT); + if (!dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT)) + __free_pages(page, get_order(size)); } static inline pgprot_t __get_dma_pgprot(unsigned long attrs, pgprot_t prot) @@ -1295,6 +1299,8 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN); + if (!page) + page = alloc_pages(gfp, order); if (!page) goto error; @@ -1369,7 +1375,8 @@ static int __iommu_free_buffer(struct device *dev, struct page **pages, int i; if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) { - dma_release_from_contiguous(dev, pages[0], count); + if (!dma_release_from_contiguous(dev, pages[0], count)) + __free_pages(page[0], get_order(size)); } else { for (i = 0; i < count; i++) if (pages[i])
The CMA allocation will skip allocations of single pages to save CMA resource. This requires its callers to rebound those page allocations from normal area. So this patch adds fallback routines. Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> --- arch/arm/mm/dma-mapping.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)