[RFC,v3,09/35] mm: cma: Introduce cma_remove_mem()

Message ID	20240125164256.4147-10-alexandru.elisei@arm.com (mailing list archive)
State	New
Headers	show Return-Path: <owner-linux-mm@kvack.org> From: Alexandru Elisei <alexandru.elisei@arm.com> To: catalin.marinas@arm.com, will@kernel.org, oliver.upton@linux.dev, maz@kernel.org, james.morse@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, arnd@arndb.de, akpm@linux-foundation.org, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com, mhiramat@kernel.org, rppt@kernel.org, hughd@google.com Cc: pcc@google.com, steven.price@arm.com, anshuman.khandual@arm.com, vincenzo.frascino@arm.com, david@redhat.com, eugenis@google.com, kcc@google.com, hyesoo.yu@samsung.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org Subject: [PATCH RFC v3 09/35] mm: cma: Introduce cma_remove_mem() Date: Thu, 25 Jan 2024 16:42:30 +0000 Message-Id: <20240125164256.4147-10-alexandru.elisei@arm.com> In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com> References: <20240125164256.4147-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: owner-linux-mm@kvack.org Precedence: bulk
Series	Add support for arm64 MTE dynamic tag storage reuse \| expand [RFC,v3,00/35] Add support for arm64 MTE dynamic tag storage reuse [RFC,v3,01/35] mm: page_alloc: Add gfp_flags parameter to arch_alloc_page() [RFC,v3,02/35] mm: page_alloc: Add an arch hook early in free_pages_prepare() [RFC,v3,03/35] mm: page_alloc: Add an arch hook to filter MIGRATE_CMA allocations [RFC,v3,04/35] mm: page_alloc: Partially revert "mm: page_alloc: remove stale CMA guard code" [RFC,v3,05/35] mm: cma: Don't append newline when generating CMA area name [RFC,v3,06/35] mm: cma: Make CMA_ALLOC_SUCCESS/FAIL count the number of pages [RFC,v3,07/35] mm: cma: Add CMA_RELEASE_{SUCCESS,FAIL} events [RFC,v3,08/35] mm: cma: Introduce cma_alloc_range() [RFC,v3,09/35] mm: cma: Introduce cma_remove_mem() [RFC,v3,10/35] mm: cma: Fast track allocating memory when the pages are free [RFC,v3,11/35] mm: Allow an arch to hook into folio allocation when VMA is known [RFC,v3,12/35] mm: Call arch_swap_prepare_to_restore() before arch_swap_restore() [RFC,v3,13/35] mm: memory: Introduce fault-on-access mechanism for pages [RFC,v3,14/35] of: fdt: Return the region size in of_flat_dt_translate_address() [RFC,v3,15/35] of: fdt: Add of_flat_read_u32() [RFC,v3,16/35] KVM: arm64: Don't deny VM_PFNMAP VMAs when kvm_has_mte() [RFC,v3,17/35] arm64: mte: Rework naming for tag manipulation functions [RFC,v3,18/35] arm64: mte: Rename __GFP_ZEROTAGS to __GFP_TAGGED [RFC,v3,19/35] arm64: mte: Discover tag storage memory [RFC,v3,20/35] arm64: mte: Add tag storage memory to CMA [RFC,v3,21/35] arm64: mte: Disable dynamic tag storage management if HW KASAN is enabled [RFC,v3,22/35] arm64: mte: Enable tag storage if CMA areas have been activated [RFC,v3,23/35] arm64: mte: Try to reserve tag storage in arch_alloc_page() [RFC,v3,24/35] arm64: mte: Perform CMOs for tag blocks [RFC,v3,25/35] arm64: mte: Reserve tag block for the zero page [RFC,v3,26/35] arm64: mte: Use fault-on-access to reserve missing tag storage [RFC,v3,27/35] arm64: mte: Handle tag storage pages mapped in an MTE VMA [RFC,v3,28/35] arm64: mte: swap: Handle tag restoring when missing tag storage [RFC,v3,29/35] arm64: mte: copypage: Handle tag restoring when missing tag storage [RFC,v3,30/35] arm64: mte: ptrace: Handle pages with missing tag storage [RFC,v3,31/35] khugepaged: arm64: Don't collapse MTE enabled VMAs [RFC,v3,32/35] KVM: arm64: mte: Reserve tag storage for virtual machines with MTE [RFC,v3,33/35] KVM: arm64: mte: Introduce VM_MTE_KVM VMA flag [RFC,v3,34/35] arm64: mte: Enable dynamic tag storage management [RFC,v3,35/35] HACK! arm64: dts: Add fake tag storage to fvp-base-revc.dts

Message ID

20240125164256.4147-10-alexandru.elisei@arm.com (mailing list archive)

State

New

Headers

From: Alexandru Elisei <alexandru.elisei@arm.com>
To: catalin.marinas@arm.com,
	will@kernel.org,
	oliver.upton@linux.dev,
	maz@kernel.org,
	james.morse@arm.com,
	suzuki.poulose@arm.com,
	yuzenghui@huawei.com,
	arnd@arndb.de,
	akpm@linux-foundation.org,
	mingo@redhat.com,
	peterz@infradead.org,
	juri.lelli@redhat.com,
	vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com,
	rostedt@goodmis.org,
	bsegall@google.com,
	mgorman@suse.de,
	bristot@redhat.com,
	vschneid@redhat.com,
	mhiramat@kernel.org,
	rppt@kernel.org,
	hughd@google.com
Cc: pcc@google.com,
	steven.price@arm.com,
	anshuman.khandual@arm.com,
	vincenzo.frascino@arm.com,
	david@redhat.com,
	eugenis@google.com,
	kcc@google.com,
	hyesoo.yu@samsung.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	kvmarm@lists.linux.dev,
	linux-fsdevel@vger.kernel.org,
	linux-arch@vger.kernel.org,
	linux-mm@kvack.org,
	linux-trace-kernel@vger.kernel.org
Subject: [PATCH RFC v3 09/35] mm: cma: Introduce cma_remove_mem()
Date: Thu, 25 Jan 2024 16:42:30 +0000
Message-Id: <20240125164256.4147-10-alexandru.elisei@arm.com>
In-Reply-To: <20240125164256.4147-1-alexandru.elisei@arm.com>
References: <20240125164256.4147-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

Add support for arm64 MTE dynamic tag storage reuse | expand

Commit Message

Alexandru Elisei Jan. 25, 2024, 4:42 p.m. UTC

Memory is added to CMA with cma_declare_contiguous_nid() and
cma_init_reserved_mem(). This memory is then put on the MIGRATE_CMA list in
cma_init_reserved_areas(), where the page allocator can make use of it.

If a device manages multiple CMA areas, and there's an error when one of
the areas is added to CMA, there is no mechanism for the device to prevent
the rest of the areas, which were added before the error occured, from
being later added to the MIGRATE_CMA list.

Add cma_remove_mem() which allows a previously reserved CMA area to be
removed and thus it cannot be used by the page allocator.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---

Changes since rfc v2:

* New patch.

 include/linux/cma.h |  1 +
 mm/cma.c            | 30 +++++++++++++++++++++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

Comments

Anshuman Khandual Jan. 30, 2024, 5:50 a.m. UTC | #1

On 1/25/24 22:12, Alexandru Elisei wrote:
> Memory is added to CMA with cma_declare_contiguous_nid() and
> cma_init_reserved_mem(). This memory is then put on the MIGRATE_CMA list in
> cma_init_reserved_areas(), where the page allocator can make use of it.

cma_declare_contiguous_nid() reserves memory in memblock and marks the
for subsequent CMA usage, where as cma_init_reserved_areas() activates
these memory areas through init_cma_reserved_pageblock(). Standard page
allocator only receives these memory via free_reserved_page() - only if
the page block activation fails.

> 
> If a device manages multiple CMA areas, and there's an error when one of
> the areas is added to CMA, there is no mechanism for the device to prevent

What kind of error ? init_cma_reserved_pageblock() fails ? But that will
not happen until cma_init_reserved_areas().

> the rest of the areas, which were added before the error occured, from
> being later added to the MIGRATE_CMA list.

Why is this mechanism required ? cma_init_reserved_areas() scans over all
CMA areas and try and activate each of them sequentially. Why is not this
sufficient ?

> 
> Add cma_remove_mem() which allows a previously reserved CMA area to be
> removed and thus it cannot be used by the page allocator.

Successfully activated CMA areas do not get used by the buddy allocator.

> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> ---
> 
> Changes since rfc v2:
> 
> * New patch.
> 
>  include/linux/cma.h |  1 +
>  mm/cma.c            | 30 +++++++++++++++++++++++++++++-
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index e32559da6942..787cbec1702e 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -48,6 +48,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>  					unsigned int order_per_bit,
>  					const char *name,
>  					struct cma **res_cma);
> +extern void cma_remove_mem(struct cma **res_cma);
>  extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
>  			      bool no_warn);
>  extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
> diff --git a/mm/cma.c b/mm/cma.c
> index 4a0f68b9443b..2881bab12b01 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -147,8 +147,12 @@ static int __init cma_init_reserved_areas(void)
>  {
>  	int i;
>  
> -	for (i = 0; i < cma_area_count; i++)
> +	for (i = 0; i < cma_area_count; i++) {
> +		/* Region was removed. */
> +		if (!cma_areas[i].count)
> +			continue;

Skip previously added CMA area (now zeroed out) ?

>  		cma_activate_area(&cma_areas[i]);
> +	}
>  
>  	return 0;
>  }

cma_init_reserved_areas() gets called via core_initcall(). Some how
platform/device needs to call cma_remove_mem() before core_initcall()
gets called ? This might be time sensitive.

> @@ -216,6 +220,30 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>  	return 0;
>  }
>  
> +/**
> + * cma_remove_mem() - remove cma area
> + * @res_cma: Pointer to the cma region.
> + *
> + * This function removes a cma region created with cma_init_reserved_mem(). The
> + * ->count is set to 0.
> + */
> +void __init cma_remove_mem(struct cma **res_cma)
> +{
> +	struct cma *cma;
> +
> +	if (WARN_ON_ONCE(!res_cma || !(*res_cma)))
> +		return;
> +
> +	cma = *res_cma;
> +	if (WARN_ON_ONCE(!cma->count))
> +		return;
> +
> +	totalcma_pages -= cma->count;
> +	cma->count = 0;
> +
> +	*res_cma = NULL;
> +}
> +
>  /**
>   * cma_declare_contiguous_nid() - reserve custom contiguous area
>   * @base: Base address of the reserved area optional, use 0 for any

But first please do explain what are the errors device or platform might
see on a previously marked CMA area so that removing them on way becomes
necessary preventing their activation via cma_init_reserved_areas().

Alexandru Elisei Jan. 30, 2024, 11:33 a.m. UTC | #2

Hi,

I really appreciate the feedback you have given me so far. I believe the
commit message isn't clear enough and there has been a confusion.

A CMA user adds a CMA area to the cma_areas array with
cma_declare_contiguous_nid() or cma_init_reserved_mem().
init_cma_reserved_pageblock() then iterates over the array and activates
all cma areas.

The function cma_remove_mem() is intended to be used to remove a cma area
from the cma_areas array **before** the area has been activated.

Usecase: a driver (in this case, the arm64 dynamic tag storage code)
manages several cma areas. The driver successfully adds the first area to
the cma_areas array. When the driver tries to adds the second area, the
function fails. Without cma_remove_mem(), the driver has no way to prevent
the first area from being freed to the page allocator. cma_remove_mem() is
about providing a means to do cleanup in case of error.

Does that make more sense now?

Ok Tue, Jan 30, 2024 at 11:20:56AM +0530, Anshuman Khandual wrote:
> 
> 
> On 1/25/24 22:12, Alexandru Elisei wrote:
> > Memory is added to CMA with cma_declare_contiguous_nid() and
> > cma_init_reserved_mem(). This memory is then put on the MIGRATE_CMA list in
> > cma_init_reserved_areas(), where the page allocator can make use of it.
> 
> cma_declare_contiguous_nid() reserves memory in memblock and marks the

You forgot about about cma_init_reserved_mem() which does the same thing,
but yes, you are right.

> for subsequent CMA usage, where as cma_init_reserved_areas() activates
> these memory areas through init_cma_reserved_pageblock(). Standard page
> allocator only receives these memory via free_reserved_page() - only if

I don't think that's correct. init_cma_reserved_pageblock() clears the
PG_reserved page flag, sets the migratetype to MIGRATE_CMA and then frees
the page. After that, the page is available to the standard page allocator
to use for allocation. Otherwise, what would be the point of the
MIGRATE_CMA migratetype?

> the page block activation fails.

For the sake of having a complete picture, I'll add that that only happens
if cma->reserve_pages_on_error is false. If the CMA user sets the field to
'true' (with cma_reserve_pages_on_error()), then the pages in the CMA
region are kept PG_reserved if activation fails.

> 
> > 
> > If a device manages multiple CMA areas, and there's an error when one of
> > the areas is added to CMA, there is no mechanism for the device to prevent
> 
> What kind of error ? init_cma_reserved_pageblock() fails ? But that will
> not happen until cma_init_reserved_areas().

I think I haven't been clear enough. When I say that "an area is added
to CMA", I mean that the memory region is added to cma_areas array, via
cma_declare_contiguous_nid() or cma_init_reserved_mem(). There are several
ways in which either function can fail.

> 
> > the rest of the areas, which were added before the error occured, from
> > being later added to the MIGRATE_CMA list.
> 
> Why is this mechanism required ? cma_init_reserved_areas() scans over all
> CMA areas and try and activate each of them sequentially. Why is not this
> sufficient ?

This patch is about removing a struct cma from the cma_areas array after it
has been added to the array, with cma_declare_contiguous_nid() or
cma_init_reserved_mem(), to prevent the area from being activated in
cma_init_reserved_areas(). Sorry for the confusion.

I'll add a check in cma_remove_mem() to fail if the cma area has been
activated, and a comment to the function to explain its usage.

> 
> > 
> > Add cma_remove_mem() which allows a previously reserved CMA area to be
> > removed and thus it cannot be used by the page allocator.
> 
> Successfully activated CMA areas do not get used by the buddy allocator.

I don't believe that is correct, see above.

> 
> > 
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > ---
> > 
> > Changes since rfc v2:
> > 
> > * New patch.
> > 
> >  include/linux/cma.h |  1 +
> >  mm/cma.c            | 30 +++++++++++++++++++++++++++++-
> >  2 files changed, 30 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/cma.h b/include/linux/cma.h
> > index e32559da6942..787cbec1702e 100644
> > --- a/include/linux/cma.h
> > +++ b/include/linux/cma.h
> > @@ -48,6 +48,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
> >  					unsigned int order_per_bit,
> >  					const char *name,
> >  					struct cma **res_cma);
> > +extern void cma_remove_mem(struct cma **res_cma);
> >  extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
> >  			      bool no_warn);
> >  extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
> > diff --git a/mm/cma.c b/mm/cma.c
> > index 4a0f68b9443b..2881bab12b01 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -147,8 +147,12 @@ static int __init cma_init_reserved_areas(void)
> >  {
> >  	int i;
> >  
> > -	for (i = 0; i < cma_area_count; i++)
> > +	for (i = 0; i < cma_area_count; i++) {
> > +		/* Region was removed. */
> > +		if (!cma_areas[i].count)
> > +			continue;
> 
> Skip previously added CMA area (now zeroed out) ?

Yes, that's what I meant with the comment "Region was removed". Do you
think I should reword the comment?

> 
> >  		cma_activate_area(&cma_areas[i]);
> > +	}
> >  
> >  	return 0;
> >  }
> 
> cma_init_reserved_areas() gets called via core_initcall(). Some how
> platform/device needs to call cma_remove_mem() before core_initcall()
> gets called ? This might be time sensitive.

I don't understand your point.

> 
> > @@ -216,6 +220,30 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
> >  	return 0;
> >  }
> >  
> > +/**
> > + * cma_remove_mem() - remove cma area
> > + * @res_cma: Pointer to the cma region.
> > + *
> > + * This function removes a cma region created with cma_init_reserved_mem(). The
> > + * ->count is set to 0.
> > + */
> > +void __init cma_remove_mem(struct cma **res_cma)
> > +{
> > +	struct cma *cma;
> > +
> > +	if (WARN_ON_ONCE(!res_cma || !(*res_cma)))
> > +		return;
> > +
> > +	cma = *res_cma;
> > +	if (WARN_ON_ONCE(!cma->count))
> > +		return;
> > +
> > +	totalcma_pages -= cma->count;
> > +	cma->count = 0;
> > +
> > +	*res_cma = NULL;
> > +}
> > +
> >  /**
> >   * cma_declare_contiguous_nid() - reserve custom contiguous area
> >   * @base: Base address of the reserved area optional, use 0 for any
> 
> But first please do explain what are the errors device or platform might

cma_declare_contiguous_nid() and cma_init_reserved_mem() can fail in a
number of ways, the code should be self documenting.

> see on a previously marked CMA area so that removing them on way becomes
> necessary preventing their activation via cma_init_reserved_areas().

I've described how the function is supposed to be used at the top of my
reply.

Thanks,
Alex

Anshuman Khandual Jan. 31, 2024, 1:19 p.m. UTC | #3

On 1/30/24 17:03, Alexandru Elisei wrote:
> Hi,
> 
> I really appreciate the feedback you have given me so far. I believe the
> commit message isn't clear enough and there has been a confusion.
> 
> A CMA user adds a CMA area to the cma_areas array with
> cma_declare_contiguous_nid() or cma_init_reserved_mem().
> init_cma_reserved_pageblock() then iterates over the array and activates
> all cma areas.

Agreed.

> 
> The function cma_remove_mem() is intended to be used to remove a cma area
> from the cma_areas array **before** the area has been activated.

Understood.

> 
> Usecase: a driver (in this case, the arm64 dynamic tag storage code)
> manages several cma areas. The driver successfully adds the first area to
> the cma_areas array. When the driver tries to adds the second area, the
> function fails. Without cma_remove_mem(), the driver has no way to prevent
> the first area from being freed to the page allocator. cma_remove_mem() is
> about providing a means to do cleanup in case of error.
> 
> Does that make more sense now?

How to ensure that cma_remove_mem() should get called by the driver before
core_initcall()---> cma_init_reserved_areas()---> cma_activate_area() chain
happens. Else cma_remove_mem() will miss out to clear cma->count and given
area will proceed to get activated like always.


> 
> Ok Tue, Jan 30, 2024 at 11:20:56AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 1/25/24 22:12, Alexandru Elisei wrote:
>>> Memory is added to CMA with cma_declare_contiguous_nid() and
>>> cma_init_reserved_mem(). This memory is then put on the MIGRATE_CMA list in
>>> cma_init_reserved_areas(), where the page allocator can make use of it.
>>
>> cma_declare_contiguous_nid() reserves memory in memblock and marks the
> 
> You forgot about about cma_init_reserved_mem() which does the same thing,
> but yes, you are right.

Agreed, missed that. There are some direct cma_init_reserved_mem() calls as well.

> 
>> for subsequent CMA usage, where as cma_init_reserved_areas() activates
>> these memory areas through init_cma_reserved_pageblock(). Standard page
>> allocator only receives these memory via free_reserved_page() - only if
> 
> I don't think that's correct. init_cma_reserved_pageblock() clears the
> PG_reserved page flag, sets the migratetype to MIGRATE_CMA and then frees
> the page. After that, the page is available to the standard page allocator
> to use for allocation. Otherwise, what would be the point of the
> MIGRATE_CMA migratetype?

Understood and agreed.

> 
>> the page block activation fails.
> 
> For the sake of having a complete picture, I'll add that that only happens
> if cma->reserve_pages_on_error is false. If the CMA user sets the field to
> 'true' (with cma_reserve_pages_on_error()), then the pages in the CMA
> region are kept PG_reserved if activation fails.

Why cannot you use cma_reserve_pages_on_error() ?

> 
>>
>>>
>>> If a device manages multiple CMA areas, and there's an error when one of
>>> the areas is added to CMA, there is no mechanism for the device to prevent
>>
>> What kind of error ? init_cma_reserved_pageblock() fails ? But that will
>> not happen until cma_init_reserved_areas().
> 
> I think I haven't been clear enough. When I say that "an area is added
> to CMA", I mean that the memory region is added to cma_areas array, via
> cma_declare_contiguous_nid() or cma_init_reserved_mem(). There are several
> ways in which either function can fail.

Okay.

> 
>>
>>> the rest of the areas, which were added before the error occured, from
>>> being later added to the MIGRATE_CMA list.
>>
>> Why is this mechanism required ? cma_init_reserved_areas() scans over all
>> CMA areas and try and activate each of them sequentially. Why is not this
>> sufficient ?
> 
> This patch is about removing a struct cma from the cma_areas array after it
> has been added to the array, with cma_declare_contiguous_nid() or
> cma_init_reserved_mem(), to prevent the area from being activated in
> cma_init_reserved_areas(). Sorry for the confusion.
> 
> I'll add a check in cma_remove_mem() to fail if the cma area has been
> activated, and a comment to the function to explain its usage.

That will be a good check.

> 
>>
>>>
>>> Add cma_remove_mem() which allows a previously reserved CMA area to be
>>> removed and thus it cannot be used by the page allocator.
>>
>> Successfully activated CMA areas do not get used by the buddy allocator.
> 
> I don't believe that is correct, see above.
Apologies, it's my bad.

> 
>>
>>>
>>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>>> ---
>>>
>>> Changes since rfc v2:
>>>
>>> * New patch.
>>>
>>>  include/linux/cma.h |  1 +
>>>  mm/cma.c            | 30 +++++++++++++++++++++++++++++-
>>>  2 files changed, 30 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>>> index e32559da6942..787cbec1702e 100644
>>> --- a/include/linux/cma.h
>>> +++ b/include/linux/cma.h
>>> @@ -48,6 +48,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>>>  					unsigned int order_per_bit,
>>>  					const char *name,
>>>  					struct cma **res_cma);
>>> +extern void cma_remove_mem(struct cma **res_cma);
>>>  extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
>>>  			      bool no_warn);
>>>  extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
>>> diff --git a/mm/cma.c b/mm/cma.c
>>> index 4a0f68b9443b..2881bab12b01 100644
>>> --- a/mm/cma.c
>>> +++ b/mm/cma.c
>>> @@ -147,8 +147,12 @@ static int __init cma_init_reserved_areas(void)
>>>  {
>>>  	int i;
>>>  
>>> -	for (i = 0; i < cma_area_count; i++)
>>> +	for (i = 0; i < cma_area_count; i++) {
>>> +		/* Region was removed. */
>>> +		if (!cma_areas[i].count)
>>> +			continue;
>>
>> Skip previously added CMA area (now zeroed out) ?
> 
> Yes, that's what I meant with the comment "Region was removed". Do you
> think I should reword the comment?
> 
>>
>>>  		cma_activate_area(&cma_areas[i]);
>>> +	}
>>>  
>>>  	return 0;
>>>  }
>>
>> cma_init_reserved_areas() gets called via core_initcall(). Some how
>> platform/device needs to call cma_remove_mem() before core_initcall()
>> gets called ? This might be time sensitive.
> 
> I don't understand your point.
> 
>>
>>> @@ -216,6 +220,30 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>>>  	return 0;
>>>  }
>>>  
>>> +/**
>>> + * cma_remove_mem() - remove cma area
>>> + * @res_cma: Pointer to the cma region.
>>> + *
>>> + * This function removes a cma region created with cma_init_reserved_mem(). The
>>> + * ->count is set to 0.
>>> + */
>>> +void __init cma_remove_mem(struct cma **res_cma)
>>> +{
>>> +	struct cma *cma;
>>> +
>>> +	if (WARN_ON_ONCE(!res_cma || !(*res_cma)))
>>> +		return;
>>> +
>>> +	cma = *res_cma;
>>> +	if (WARN_ON_ONCE(!cma->count))
>>> +		return;
>>> +
>>> +	totalcma_pages -= cma->count;
>>> +	cma->count = 0;
>>> +
>>> +	*res_cma = NULL;
>>> +}
>>> +
>>>  /**
>>>   * cma_declare_contiguous_nid() - reserve custom contiguous area
>>>   * @base: Base address of the reserved area optional, use 0 for any
>>
>> But first please do explain what are the errors device or platform might
> 
> cma_declare_contiguous_nid() and cma_init_reserved_mem() can fail in a
> number of ways, the code should be self documenting.

But when they do fail - would not cma->count be left uninitialized as 0 ?
Hence the proposed check (!cma->count) in cma_init_reserved_areas() should
just do the trick without requiring an explicit cma_remove_mem() call.

> 
>> see on a previously marked CMA area so that removing them on way becomes
>> necessary preventing their activation via cma_init_reserved_areas().
> 
> I've described how the function is supposed to be used at the top of my
> reply.
> 
> Thanks,
> Alex

Alexandru Elisei Jan. 31, 2024, 1:48 p.m. UTC | #4

Hi,

On Wed, Jan 31, 2024 at 06:49:34PM +0530, Anshuman Khandual wrote:
> On 1/30/24 17:03, Alexandru Elisei wrote:
> > Hi,
> > 
> > I really appreciate the feedback you have given me so far. I believe the
> > commit message isn't clear enough and there has been a confusion.
> > 
> > A CMA user adds a CMA area to the cma_areas array with
> > cma_declare_contiguous_nid() or cma_init_reserved_mem().
> > init_cma_reserved_pageblock() then iterates over the array and activates
> > all cma areas.
> 
> Agreed.
> 
> > 
> > The function cma_remove_mem() is intended to be used to remove a cma area
> > from the cma_areas array **before** the area has been activated.
> 
> Understood.
> 
> > 
> > Usecase: a driver (in this case, the arm64 dynamic tag storage code)
> > manages several cma areas. The driver successfully adds the first area to
> > the cma_areas array. When the driver tries to adds the second area, the
> > function fails. Without cma_remove_mem(), the driver has no way to prevent
> > the first area from being freed to the page allocator. cma_remove_mem() is
> > about providing a means to do cleanup in case of error.
> > 
> > Does that make more sense now?
> 
> How to ensure that cma_remove_mem() should get called by the driver before
> core_initcall()---> cma_init_reserved_areas()---> cma_activate_area() chain
> happens. Else cma_remove_mem() will miss out to clear cma->count and given
> area will proceed to get activated like always.

The same way drivers today call cma_declare_contiguous_nid() and
cma_init_reserved_mem() before cma_init_reserved_areas(). For an example,
have a look at kernel/dma/contiguous.c:: rmem_cma_setup().

As for how the series uses cma_remove_mem(), have a look at patch #20
("arm64: mte: Add tag storage memory to CMA") [1], specifically this bit:

        for (i = 0; i < num_tag_regions; i++) {
                region = &tag_regions[i];

		// code removed for clarity

                ret = cma_init_reserved_mem(PFN_PHYS(region->tag_range.start),
                                PFN_PHYS(range_len(&region->tag_range)),
                                order, NULL, &region->cma);
                if (ret) {
                        for (j = 0; j < i; j++)
                                cma_remove_mem(&region->cma);
                        goto out_disabled;
                }
	}

	// code removed for clarity

out_disabled:
        num_tag_regions = 0;
        pr_info("MTE tag storage region management disabled");

I'll try to walk you through it. The driver manages 2 cma regions.

cma_init_reserved_mem() succeeds for the first region.

cma_init_reserved_mem() fails for the second region.

As a result, the first region will be activated (pages will be placed on
the MIGRATE_CMA list), but the second region will not be activated.

The driver can function only when **all** cma regions have been
successfully activated.

Driver removes first region from CMA, so now no regions will be activated,
and probing fails.

In a more general sense, cma_remove_mem() is **not** about removing a
region that failed initialization or activation, it's about removing a cma
area that was added to cma_areas successfully, but the driver doesn't want
to activate anymore for whatever reason (it can be because of a probing
error totally unrelated to CMA).

Does it make more sense now? I hope that this example also answers the rest
of your questions.

[1] https://lore.kernel.org/linux-arm-kernel/20240125164256.4147-21-alexandru.elisei@arm.com/

Thanks,
Alex

> 
> > 
> > Ok Tue, Jan 30, 2024 at 11:20:56AM +0530, Anshuman Khandual wrote:
> >>
> >>
> >> On 1/25/24 22:12, Alexandru Elisei wrote:
> >>> Memory is added to CMA with cma_declare_contiguous_nid() and
> >>> cma_init_reserved_mem(). This memory is then put on the MIGRATE_CMA list in
> >>> cma_init_reserved_areas(), where the page allocator can make use of it.
> >>
> >> cma_declare_contiguous_nid() reserves memory in memblock and marks the
> > 
> > You forgot about about cma_init_reserved_mem() which does the same thing,
> > but yes, you are right.
> 
> Agreed, missed that. There are some direct cma_init_reserved_mem() calls as well.
> 
> > 
> >> for subsequent CMA usage, where as cma_init_reserved_areas() activates
> >> these memory areas through init_cma_reserved_pageblock(). Standard page
> >> allocator only receives these memory via free_reserved_page() - only if
> > 
> > I don't think that's correct. init_cma_reserved_pageblock() clears the
> > PG_reserved page flag, sets the migratetype to MIGRATE_CMA and then frees
> > the page. After that, the page is available to the standard page allocator
> > to use for allocation. Otherwise, what would be the point of the
> > MIGRATE_CMA migratetype?
> 
> Understood and agreed.
> 
> > 
> >> the page block activation fails.
> > 
> > For the sake of having a complete picture, I'll add that that only happens
> > if cma->reserve_pages_on_error is false. If the CMA user sets the field to
> > 'true' (with cma_reserve_pages_on_error()), then the pages in the CMA
> > region are kept PG_reserved if activation fails.
> 
> Why cannot you use cma_reserve_pages_on_error() ?
> 
> > 
> >>
> >>>
> >>> If a device manages multiple CMA areas, and there's an error when one of
> >>> the areas is added to CMA, there is no mechanism for the device to prevent
> >>
> >> What kind of error ? init_cma_reserved_pageblock() fails ? But that will
> >> not happen until cma_init_reserved_areas().
> > 
> > I think I haven't been clear enough. When I say that "an area is added
> > to CMA", I mean that the memory region is added to cma_areas array, via
> > cma_declare_contiguous_nid() or cma_init_reserved_mem(). There are several
> > ways in which either function can fail.
> 
> Okay.
> 
> > 
> >>
> >>> the rest of the areas, which were added before the error occured, from
> >>> being later added to the MIGRATE_CMA list.
> >>
> >> Why is this mechanism required ? cma_init_reserved_areas() scans over all
> >> CMA areas and try and activate each of them sequentially. Why is not this
> >> sufficient ?
> > 
> > This patch is about removing a struct cma from the cma_areas array after it
> > has been added to the array, with cma_declare_contiguous_nid() or
> > cma_init_reserved_mem(), to prevent the area from being activated in
> > cma_init_reserved_areas(). Sorry for the confusion.
> > 
> > I'll add a check in cma_remove_mem() to fail if the cma area has been
> > activated, and a comment to the function to explain its usage.
> 
> That will be a good check.
> 
> > 
> >>
> >>>
> >>> Add cma_remove_mem() which allows a previously reserved CMA area to be
> >>> removed and thus it cannot be used by the page allocator.
> >>
> >> Successfully activated CMA areas do not get used by the buddy allocator.
> > 
> > I don't believe that is correct, see above.
> Apologies, it's my bad.
> 
> > 
> >>
> >>>
> >>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> >>> ---
> >>>
> >>> Changes since rfc v2:
> >>>
> >>> * New patch.
> >>>
> >>>  include/linux/cma.h |  1 +
> >>>  mm/cma.c            | 30 +++++++++++++++++++++++++++++-
> >>>  2 files changed, 30 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/linux/cma.h b/include/linux/cma.h
> >>> index e32559da6942..787cbec1702e 100644
> >>> --- a/include/linux/cma.h
> >>> +++ b/include/linux/cma.h
> >>> @@ -48,6 +48,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
> >>>  					unsigned int order_per_bit,
> >>>  					const char *name,
> >>>  					struct cma **res_cma);
> >>> +extern void cma_remove_mem(struct cma **res_cma);
> >>>  extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
> >>>  			      bool no_warn);
> >>>  extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
> >>> diff --git a/mm/cma.c b/mm/cma.c
> >>> index 4a0f68b9443b..2881bab12b01 100644
> >>> --- a/mm/cma.c
> >>> +++ b/mm/cma.c
> >>> @@ -147,8 +147,12 @@ static int __init cma_init_reserved_areas(void)
> >>>  {
> >>>  	int i;
> >>>  
> >>> -	for (i = 0; i < cma_area_count; i++)
> >>> +	for (i = 0; i < cma_area_count; i++) {
> >>> +		/* Region was removed. */
> >>> +		if (!cma_areas[i].count)
> >>> +			continue;
> >>
> >> Skip previously added CMA area (now zeroed out) ?
> > 
> > Yes, that's what I meant with the comment "Region was removed". Do you
> > think I should reword the comment?
> > 
> >>
> >>>  		cma_activate_area(&cma_areas[i]);
> >>> +	}
> >>>  
> >>>  	return 0;
> >>>  }
> >>
> >> cma_init_reserved_areas() gets called via core_initcall(). Some how
> >> platform/device needs to call cma_remove_mem() before core_initcall()
> >> gets called ? This might be time sensitive.
> > 
> > I don't understand your point.
> > 
> >>
> >>> @@ -216,6 +220,30 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
> >>>  	return 0;
> >>>  }
> >>>  
> >>> +/**
> >>> + * cma_remove_mem() - remove cma area
> >>> + * @res_cma: Pointer to the cma region.
> >>> + *
> >>> + * This function removes a cma region created with cma_init_reserved_mem(). The
> >>> + * ->count is set to 0.
> >>> + */
> >>> +void __init cma_remove_mem(struct cma **res_cma)
> >>> +{
> >>> +	struct cma *cma;
> >>> +
> >>> +	if (WARN_ON_ONCE(!res_cma || !(*res_cma)))
> >>> +		return;
> >>> +
> >>> +	cma = *res_cma;
> >>> +	if (WARN_ON_ONCE(!cma->count))
> >>> +		return;
> >>> +
> >>> +	totalcma_pages -= cma->count;
> >>> +	cma->count = 0;
> >>> +
> >>> +	*res_cma = NULL;
> >>> +}
> >>> +
> >>>  /**
> >>>   * cma_declare_contiguous_nid() - reserve custom contiguous area
> >>>   * @base: Base address of the reserved area optional, use 0 for any
> >>
> >> But first please do explain what are the errors device or platform might
> > 
> > cma_declare_contiguous_nid() and cma_init_reserved_mem() can fail in a
> > number of ways, the code should be self documenting.
> 
> But when they do fail - would not cma->count be left uninitialized as 0 ?
> Hence the proposed check (!cma->count) in cma_init_reserved_areas() should
> just do the trick without requiring an explicit cma_remove_mem() call.
> 
> > 
> >> see on a previously marked CMA area so that removing them on way becomes
> >> necessary preventing their activation via cma_init_reserved_areas().
> > 
> > I've described how the function is supposed to be used at the top of my
> > reply.
> > 
> > Thanks,
> > Alex
>

diff --git a/include/linux/cma.h b/include/linux/cma.h
index e32559da6942..787cbec1702e 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -48,6 +48,7 @@  extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					unsigned int order_per_bit,
 					const char *name,
 					struct cma **res_cma);
+extern void cma_remove_mem(struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align,
 			      bool no_warn);
 extern int cma_alloc_range(struct cma *cma, unsigned long start, unsigned long count,
diff --git a/mm/cma.c b/mm/cma.c
index 4a0f68b9443b..2881bab12b01 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -147,8 +147,12 @@  static int __init cma_init_reserved_areas(void)
 {
 	int i;
 
-	for (i = 0; i < cma_area_count; i++)
+	for (i = 0; i < cma_area_count; i++) {
+		/* Region was removed. */
+		if (!cma_areas[i].count)
+			continue;
 		cma_activate_area(&cma_areas[i]);
+	}
 
 	return 0;
 }
@@ -216,6 +220,30 @@  int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	return 0;
 }
 
+/**
+ * cma_remove_mem() - remove cma area
+ * @res_cma: Pointer to the cma region.
+ *
+ * This function removes a cma region created with cma_init_reserved_mem(). The
+ * ->count is set to 0.
+ */
+void __init cma_remove_mem(struct cma **res_cma)
+{
+	struct cma *cma;
+
+	if (WARN_ON_ONCE(!res_cma || !(*res_cma)))
+		return;
+
+	cma = *res_cma;
+	if (WARN_ON_ONCE(!cma->count))
+		return;
+
+	totalcma_pages -= cma->count;
+	cma->count = 0;
+
+	*res_cma = NULL;
+}
+
 /**
  * cma_declare_contiguous_nid() - reserve custom contiguous area
  * @base: Base address of the reserved area optional, use 0 for any

[RFC,v3,09/35] mm: cma: Introduce cma_remove_mem()

Commit Message

Comments

Patch