[v3,2/2] dma-contiguous: Use fallback alloc_pages for single pages

Message ID 20190524040633.16854-3-nicoleotsuka@gmail.com (mailing list archive)
State New, archived
Series Optimize dma_*_from_contiguous calls

Commit Message

Nicolin Chen May 24, 2019, 4:06 a.m. UTC
The addresses within a single page are always contiguous, so it
is not necessary to allocate single pages from the CMA area.
Since the CMA area has a limited, predefined size, it may run
out of space under heavy use, where quite a lot of CMA pages
end up being allocated for single pages.

However, there is also a concern that a device might care where
a page comes from -- it might expect a page from its CMA area
and act differently if the page is not from there.

This patch uses the fallback alloc_pages path, instead of
one-page allocations from the global CMA area, when a device
does not have its own CMA area. This saves resources in the
global CMA area for larger allocations and also reduces the
CMA fragmentation that results from trivial allocations.

Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
 kernel/dma/contiguous.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
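
To illustrate the behavior change, here is a hypothetical caller
sketch (not part of the patch; it assumes the dma_alloc_contiguous()
and dma_free_contiguous() helpers introduced in patch 1/2 of this
series and a device without its own CMA area):

#include <linux/dma-contiguous.h>
#include <linux/gfp.h>

static int example_alloc(struct device *dev)
{
	struct page *one, *many;

	/* count == 1: now served by the normal-page fallback path,
	 * not the global CMA area */
	one = dma_alloc_contiguous(dev, PAGE_SIZE, GFP_KERNEL);
	if (!one)
		return -ENOMEM;

	/* count > 1: still tries the global CMA area first */
	many = dma_alloc_contiguous(dev, 4 * PAGE_SIZE, GFP_KERNEL);
	if (!many) {
		dma_free_contiguous(dev, one, PAGE_SIZE);
		return -ENOMEM;
	}

	/* ... map and use the buffers ... */

	dma_free_contiguous(dev, many, 4 * PAGE_SIZE);
	dma_free_contiguous(dev, one, PAGE_SIZE);
	return 0;
}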

Comments

Ira Weiny May 24, 2019, 4:16 p.m. UTC | #1
On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> The addresses within a single page are always contiguous, so it
> is not necessary to allocate single pages from the CMA area.
> Since the CMA area has a limited, predefined size, it may run
> out of space under heavy use, where quite a lot of CMA pages
> end up being allocated for single pages.
> 
> However, there is also a concern that a device might care where
> a page comes from -- it might expect a page from its CMA area
> and act differently if the page is not from there.

How does a device know, after this call, if a CMA area was used?  From the
patches I figured a device should not care.

> 
> This patch uses the fallback alloc_pages path, instead of
> one-page allocations from the global CMA area, when a device
> does not have its own CMA area. This saves resources in the
> global CMA area for larger allocations and also reduces the
> CMA fragmentation that results from trivial allocations.
> 
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> ---
>  kernel/dma/contiguous.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 21f39a6cb04f..6914b92d5c88 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
>   * This function allocates contiguous memory buffer for specified device. It
>   * first tries to use device specific contiguous memory area if available or
>   * the default global one, then tries a fallback allocation of normal pages.
> + *
> + * Note that it bypasses one-page allocations from the global area, as
> + * the addresses within one page are always contiguous, so there is no
> + * need to waste CMA pages on them; this also helps reduce fragmentation.
>   */
>  struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
>  {
>  	int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
>  	size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
>  	size_t align = get_order(PAGE_ALIGN(size));
> -	struct cma *cma = dev_get_cma_area(dev);
>  	struct page *page = NULL;
> +	struct cma *cma = NULL;
> +
> +	if (dev && dev->cma_area)
> +		cma = dev->cma_area;
> +	else if (count > 1)
> +		cma = dma_contiguous_default_area;

Doesn't dev_get_cma_area() already do this?

Ira

>  
>  	/* CMA can be used only in the context which permits sleeping */
>  	if (cma && gfpflags_allow_blocking(gfp)) {
> -- 
> 2.17.1
>
Nicolin Chen May 27, 2019, 10:55 a.m. UTC | #2
Hi Ira,

On Fri, May 24, 2019 at 09:16:19AM -0700, Ira Weiny wrote:
> On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> > The addresses within a single page are always contiguous, so it
> > is not necessary to allocate single pages from the CMA area.
> > Since the CMA area has a limited, predefined size, it may run
> > out of space under heavy use, where quite a lot of CMA pages
> > end up being allocated for single pages.
> > 
> > However, there is also a concern that a device might care where
> > a page comes from -- it might expect a page from its CMA area
> > and act differently if the page is not from there.
> 
> How does a device know, after this call, if a CMA area was used?  From the
> patches I figured a device should not care.

A device doesn't know. But that doesn't mean a device won't
care at all. There was a concern from Robin and Christoph
that, as a corner case, a device might act differently if the
memory isn't in its own CMA region. That's why we still let
it use its device-specific CMA area.

> > +	if (dev && dev->cma_area)
> > +		cma = dev->cma_area;
> > +	else if (count > 1)
> > +		cma = dma_contiguous_default_area;
> 
> Doesn't dev_get_cma_area() already do this?

Partially, yes. But unwrapping it makes the program flow
clearer in my opinion. I should also have mentioned that this
approach was suggested by Christoph.

Otherwise, it would need an override like:
	cma = dev_get_cma_area(dev);
	if (count <= 1 && cma == dma_contiguous_default_area)
		cma = NULL;

Which doesn't look that bad, though...

Thanks
Nicolin

Patch

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 21f39a6cb04f..6914b92d5c88 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
  * This function allocates contiguous memory buffer for specified device. It
  * first tries to use device specific contiguous memory area if available or
  * the default global one, then tries a fallback allocation of normal pages.
+ *
+ * Note that it bypasses one-page allocations from the global area, as
+ * the addresses within one page are always contiguous, so there is no
+ * need to waste CMA pages on them; this also helps reduce fragmentation.
  */
 struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
 {
 	int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
 	size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
 	size_t align = get_order(PAGE_ALIGN(size));
-	struct cma *cma = dev_get_cma_area(dev);
 	struct page *page = NULL;
+	struct cma *cma = NULL;
+
+	if (dev && dev->cma_area)
+		cma = dev->cma_area;
+	else if (count > 1)
+		cma = dma_contiguous_default_area;
 
 	/* CMA can be used only in the context which permits sleeping */
 	if (cma && gfpflags_allow_blocking(gfp)) {
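
Since the hunk above ends mid-function, here is roughly what the
complete function looks like with this patch applied. The tail (the
cma_alloc() call and the normal-page fallback) is reconstructed from
patch 1/2 of this series rather than quoted here, so treat it as a
sketch:

struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
{
	int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
	size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
	size_t align = get_order(PAGE_ALIGN(size));
	struct page *page = NULL;
	struct cma *cma = NULL;

	/* Use a device-specific CMA area whenever one exists; otherwise
	 * go to the global area only for multi-page allocations. */
	if (dev && dev->cma_area)
		cma = dev->cma_area;
	else if (count > 1)
		cma = dma_contiguous_default_area;

	/* CMA can be used only in the context which permits sleeping */
	if (cma && gfpflags_allow_blocking(gfp)) {
		size_t cma_align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);

		page = cma_alloc(cma, count, cma_align, gfp & __GFP_NOWARN);
	}

	/* Fallback allocation of normal pages; this is now also the
	 * path taken by single-page allocations without a per-device
	 * CMA area. */
	if (!page)
		page = alloc_pages_node(node, gfp, align);

	return page;
}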