
arm64: mte: Use PAGE_KERNEL_TAGGED in arch_add_memory

Message ID 1614745263-27827-1-git-send-email-pdaly@codeaurora.org
State New, archived
Series arm64: mte: Use PAGE_KERNEL_TAGGED in arch_add_memory

Commit Message

Patrick Daly March 3, 2021, 4:21 a.m. UTC
In a system which supports MTE, the linear kernel region must allow
reading/writing allocation tags. For memory present at boot this
is already being done in map_mem(). Add the same in arch_add_memory().

Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
---
 arch/arm64/mm/mmu.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Catalin Marinas March 5, 2021, 11:19 a.m. UTC | #1
Hi Patrick,

Thanks for this. Adding a few more people and linux-mm.

On Tue, Mar 02, 2021 at 08:21:03PM -0800, Patrick Daly wrote:
> In a system which supports MTE, the linear kernel region must allow
> reading/writing allocation tags. For memory present at boot this
> is already being done in map_mem(). Add the same in arch_add_memory().
> 
> Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
> ---
>  arch/arm64/mm/mmu.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 9b25d60b..0fcfe90 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1463,6 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>  		    struct mhp_params *params)
>  {
>  	int ret, flags = 0;
> +	pgprot_t pgprot;
>  
>  	if (!inside_linear_region(start, size)) {
>  		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
> @@ -1477,8 +1478,17 @@ int arch_add_memory(int nid, u64 start, u64 size,
>  	    IS_ENABLED(CONFIG_KFENCE))
>  		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>  
> +	/*
> +	 * The linear map must allow allocation tags reading/writing
> +	 * if MTE is present. Otherwise, it has the same attributes as
> +	 * PAGE_KERNEL.
> +	 */
> +	pgprot = params->pgprot;
> +	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
> +		pgprot = PAGE_KERNEL_TAGGED;
> +
>  	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> -			     size, params->pgprot, __pgd_pgtable_alloc,
> +			     size, pgprot, __pgd_pgtable_alloc,
>  			     flags);

We'll need a similar pattern for vmalloc() once we have the khwasan
support in place. So we could add a pgprot_tagged() function (similar to
pgprot_writecombine() etc.) which does the above check and returns
PAGE_KERNEL_TAGGED, maybe only checking the PTE_ATTRINDX_MASK bits
rather than the whole prot bits.
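
Something like the below, as a rough sketch only (completely untested):

	#define pgprot_tagged(prot)					\
		(((pgprot_val(prot) & PTE_ATTRINDX_MASK) ==		\
		  PTE_ATTRINDX(MT_NORMAL)) ?				\
		 PAGE_KERNEL_TAGGED : (prot))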

However, the bigger problem is that arch_add_memory() is also called for
ZONE_DEVICE mappings and we can't always guarantee that such range
supports tagging (most likely it doesn't, e.g. persistent memory),
leading to potential external aborts.

One option is to expand mhp_params to pass additional information so
that the arch code can make the right decision. Another option is to
make PAGE_KERNEL_TAGGED global which is just PAGE_KERNEL for all the
other architectures and use it in the core code.
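
For the second option, a rough sketch (untested) would be along the
lines of:

	/* include/linux/pgtable.h */
	#ifndef PAGE_KERNEL_TAGGED
	#define PAGE_KERNEL_TAGGED	PAGE_KERNEL
	#endif

with the core hotplug code (add_memory_resource() etc.) passing
PAGE_KERNEL_TAGGED instead of PAGE_KERNEL in mhp_params.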

Yet another option which we haven't fully explored with MTE is to have
PAGE_KERNEL always tagged but add a new PAGE_KERNEL_DEVICE (or
_UNTAGGED) for specific cases like ZONE_DEVICE. We need to make sure
that PAGE_KERNEL doesn't end up in places where the backing memory does
not support tags.
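
On arm64 the latter would mean roughly the following, reusing the
existing PROT_NORMAL* definitions (untested):

	#define PAGE_KERNEL		__pgprot(PROT_NORMAL_TAGGED)
	#define PAGE_KERNEL_UNTAGGED	__pgprot(PROT_NORMAL)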

I'll give the last option a quick try and see if it falls apart (just
changing PAGE_KERNEL to tagged). In terms of tag cache usage, it
probably won't have much of an impact since the whole of the linear map
is tagged already.
Catalin Marinas March 5, 2021, 3:43 p.m. UTC | #2
On Fri, Mar 05, 2021 at 11:19:08AM +0000, Catalin Marinas wrote:
> On Tue, Mar 02, 2021 at 08:21:03PM -0800, Patrick Daly wrote:
> > In a system which supports MTE, the linear kernel region must allow
> > reading/writing allocation tags. For memory present at boot this
> > is already being done in map_mem(). Add the same in arch_add_memory().
> > 
> > Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
> > ---
> >  arch/arm64/mm/mmu.c | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > index 9b25d60b..0fcfe90 100644
> > --- a/arch/arm64/mm/mmu.c
> > +++ b/arch/arm64/mm/mmu.c
> > @@ -1463,6 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >  		    struct mhp_params *params)
> >  {
> >  	int ret, flags = 0;
> > +	pgprot_t pgprot;
> >  
> >  	if (!inside_linear_region(start, size)) {
> >  		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
> > @@ -1477,8 +1478,17 @@ int arch_add_memory(int nid, u64 start, u64 size,
> >  	    IS_ENABLED(CONFIG_KFENCE))
> >  		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
> >  
> > +	/*
> > +	 * The linear map must allow allocation tags reading/writing
> > +	 * if MTE is present. Otherwise, it has the same attributes as
> > +	 * PAGE_KERNEL.
> > +	 */
> > +	pgprot = params->pgprot;
> > +	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
> > +		pgprot = PAGE_KERNEL_TAGGED;
> > +
> >  	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> > -			     size, params->pgprot, __pgd_pgtable_alloc,
> > +			     size, pgprot, __pgd_pgtable_alloc,
> >  			     flags);
> 
> We'll need a similar pattern for vmalloc() once we have the khwasan
> support in place. So we could add a pgprot_tagged() function (similar to
> pgprot_writecombine() etc.) which does the above check and returns
> PAGE_KERNEL_TAGGED, maybe only checking the PTE_ATTRINDX_MASK bits
> rather than the whole prot bits.
> 
> However, the bigger problem is that arch_add_memory() is also called for
> ZONE_DEVICE mappings and we can't always guarantee that such range
> supports tagging (most likely it doesn't, e.g. persistent memory),
> leading to potential external aborts.
> 
> One option is to expand mhp_params to pass additional information so
> that the arch code can make the right decision. Another option is to
> make PAGE_KERNEL_TAGGED global which is just PAGE_KERNEL for all the
> other architectures and use it in the core code.
> 
> Yet another option which we haven't fully explored with MTE is to have
> PAGE_KERNEL always tagged but add a new PAGE_KERNEL_DEVICE (or
> _UNTAGGED) for specific cases like ZONE_DEVICE. We need to make sure
> that PAGE_KERNEL doesn't end up in places where the backing memory does
> not support tags.
> 
> I'll give the last option a quick try and see if it falls apart (just
> changing PAGE_KERNEL to tagged). In terms of tag cache usage, it
> probably won't have much of an impact since the whole of the linear map
> is tagged already.

I played with this a bit and the last option is not really feasible.
There are several places in the kernel where PAGE_KERNEL is used just
because the expectation is for write-back memory. For MTE, we need
tagged memory either because the kernel allocators need it (with
KASAN_HW_TAGS) or because it is shared with the user, potentially mapped
as tagged in user space and such tags need to be preserved by the
kernel. So for the latter, only the linear map needs to be tagged (and
memory hotplug falls into this category).

This leaves us with one of the first two options. I think the easiest
is:

---------------8<-------------------------------
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e17b96d0e4b5..5c78b92d9ec5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -486,6 +486,8 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
 	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
 #define pgprot_device(prot) \
 	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
+#define pgprot_tagged(prot) \
+	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
 /*
  * DMA allocations for non-coherent devices use what the Arm architecture calls
  * "Normal non-cacheable" memory, which permits speculation, unaligned accesses
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index cdfc4e9f253e..f5f5044db2ce 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -904,6 +904,10 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
 #define pgprot_device pgprot_noncached
 #endif

+#ifndef pgprot_tagged
+#define pgprot_tagged(prot)	(prot)
+#endif
+
 #ifdef CONFIG_MMU
 #ifndef pgprot_modify
 #define pgprot_modify pgprot_modify
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5ba51a8bdaeb..4253d80a59ba 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1072,7 +1072,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
  */
 int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 {
-	struct mhp_params params = { .pgprot = PAGE_KERNEL };
+	struct mhp_params params = { .pgprot = pgprot_tagged(PAGE_KERNEL) };
 	u64 start, size;
 	bool new_node = false;
 	int ret;
---------------8<-------------------------------

And we can reuse pgprot_tagged() for vmalloc() with KASAN_HW_TAGS.
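
For vmalloc() that would be something like the below on the allocation
path, e.g. in __vmalloc_node_range() (only an illustration, the exact
hook depends on how the KASAN_HW_TAGS series lands):

	if (IS_ENABLED(CONFIG_KASAN_HW_TAGS))
		prot = pgprot_tagged(prot);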

If we want to keep the memory hotplug change in arch_add_memory(), maybe
something like below without extending mhp_params:

	struct mem_section *ms = __pfn_to_section(PHYS_PFN(start));

	if (!online_device_section(ms))
		pgprot = pgprot_tagged(params->pgprot);

My preference is for the generic code change as per the diff above as
we'd need something similar for vmalloc().
David Hildenbrand March 5, 2021, 3:55 p.m. UTC | #3
On 05.03.21 16:43, Catalin Marinas wrote:
> On Fri, Mar 05, 2021 at 11:19:08AM +0000, Catalin Marinas wrote:
>> On Tue, Mar 02, 2021 at 08:21:03PM -0800, Patrick Daly wrote:
>>> In a system which supports MTE, the linear kernel region must allow
>>> reading/writing allocation tags. For memory present at boot this
>>> is already being done in map_mem(). Add the same in arch_add_memory().
>>>
>>> Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
>>> ---
>>>   arch/arm64/mm/mmu.c | 12 +++++++++++-
>>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 9b25d60b..0fcfe90 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -1463,6 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>>   		    struct mhp_params *params)
>>>   {
>>>   	int ret, flags = 0;
>>> +	pgprot_t pgprot;
>>>   
>>>   	if (!inside_linear_region(start, size)) {
>>>   		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
>>> @@ -1477,8 +1478,17 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>>   	    IS_ENABLED(CONFIG_KFENCE))
>>>   		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>>>   
>>> +	/*
>>> +	 * The linear map must allow allocation tags reading/writing
>>> +	 * if MTE is present. Otherwise, it has the same attributes as
>>> +	 * PAGE_KERNEL.
>>> +	 */
>>> +	pgprot = params->pgprot;
>>> +	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
>>> +		pgprot = PAGE_KERNEL_TAGGED;
>>> +
>>>   	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>>> -			     size, params->pgprot, __pgd_pgtable_alloc,
>>> +			     size, pgprot, __pgd_pgtable_alloc,
>>>   			     flags);
>>
>> We'll need a similar pattern for vmalloc() once we have the khwasan
>> support in place. So we could add a pgprot_tagged() function (similar to
>> pgprot_writecombine() etc.) which does the above check and returns
>> PAGE_KERNEL_TAGGED, maybe only checking the PTE_ATTRINDX_MASK bits
>> rather than the whole prot bits.
>>
>> However, the bigger problem is that arch_add_memory() is also called for
>> ZONE_DEVICE mappings and we can't always guarantee that such range
>> supports tagging (most likely it doesn't, e.g. persistent memory),
>> leading to potential external aborts.
>>
>> One option is to expand mhp_params to pass additional information so
>> that the arch code can make the right decision. Another option is to
>> make PAGE_KERNEL_TAGGED global which is just PAGE_KERNEL for all the
>> other architectures and use it in the core code.
>>
>> Yet another option which we haven't fully explored with MTE is to have
>> PAGE_KERNEL always tagged but add a new PAGE_KERNEL_DEVICE (or
>> _UNTAGGED) for specific cases like ZONE_DEVICE. We need to make sure
>> that PAGE_KERNEL doesn't end up in places where the backing memory does
>> not support tags.
>>
>> I'll give the last option a quick try and see if it falls apart (just
>> changing PAGE_KERNEL to tagged). In terms of tag cache usage, it
>> probably won't have much of an impact since the whole of the linear map
>> is tagged already.
> 
> I played with this a bit and the last option is not really feasible.
> There are several places in the kernel where PAGE_KERNEL is used just
> because the expectation is for write-back memory. For MTE, we need
> tagged memory either because the kernel allocators need it (with
> KASAN_HW_TAGS) or because it is shared with the user, potentially mapped
> as tagged in user space and such tags need to be preserved by the
> kernel. So for the latter, only the linear map needs to be tagged (and
> memory hotplug falls into this category).
> 
> This leaves us with one of the first two options. I think the easiest
> is:
> 
> ---------------8<-------------------------------
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index e17b96d0e4b5..5c78b92d9ec5 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -486,6 +486,8 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>   	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
>   #define pgprot_device(prot) \
>   	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
> +#define pgprot_tagged(prot) \
> +	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
>   /*
>    * DMA allocations for non-coherent devices use what the Arm architecture calls
>    * "Normal non-cacheable" memory, which permits speculation, unaligned accesses
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index cdfc4e9f253e..f5f5044db2ce 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -904,6 +904,10 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
>   #define pgprot_device pgprot_noncached
>   #endif
> 
> +#ifndef pgprot_tagged
> +#define pgprot_tagged(prot)	(prot)
> +#endif
> +
>   #ifdef CONFIG_MMU
>   #ifndef pgprot_modify
>   #define pgprot_modify pgprot_modify
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 5ba51a8bdaeb..4253d80a59ba 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1072,7 +1072,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
>    */
>   int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
>   {
> -	struct mhp_params params = { .pgprot = PAGE_KERNEL };
> +	struct mhp_params params = { .pgprot = pgprot_tagged(PAGE_KERNEL) };

This looks like we're pushing arch specific stuff ("tagged") in here. 
Can't we generalize this to something like

pgprot_mhp_default

(or a better name)

that defaults to PAGE_KERNEL on all architectures except arm64 which 
overwrites this somehow?
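
E.g. (just an idea, untested):

	/* include/linux/pgtable.h */
	#ifndef pgprot_mhp_default
	#define pgprot_mhp_default	PAGE_KERNEL
	#endif

	/* arch/arm64/include/asm/pgtable.h */
	#define pgprot_mhp_default	PAGE_KERNEL_TAGGED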
Catalin Marinas March 5, 2021, 5:44 p.m. UTC | #4
On Fri, Mar 05, 2021 at 04:55:28PM +0100, David Hildenbrand wrote:
> On 05.03.21 16:43, Catalin Marinas wrote:
> > On Fri, Mar 05, 2021 at 11:19:08AM +0000, Catalin Marinas wrote:
> > > On Tue, Mar 02, 2021 at 08:21:03PM -0800, Patrick Daly wrote:
> > > > In a system which supports MTE, the linear kernel region must allow
> > > > reading/writing allocation tags. For memory present at boot this
> > > > is already being done in map_mem(). Add the same in arch_add_memory().
> > > > 
> > > > Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
> > > > ---
> > > >   arch/arm64/mm/mmu.c | 12 +++++++++++-
> > > >   1 file changed, 11 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > > index 9b25d60b..0fcfe90 100644
> > > > --- a/arch/arm64/mm/mmu.c
> > > > +++ b/arch/arm64/mm/mmu.c
> > > > @@ -1463,6 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> > > >   		    struct mhp_params *params)
> > > >   {
> > > >   	int ret, flags = 0;
> > > > +	pgprot_t pgprot;
> > > >   	if (!inside_linear_region(start, size)) {
> > > >   		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
> > > > @@ -1477,8 +1478,17 @@ int arch_add_memory(int nid, u64 start, u64 size,
> > > >   	    IS_ENABLED(CONFIG_KFENCE))
> > > >   		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
> > > > +	/*
> > > > +	 * The linear map must allow allocation tags reading/writing
> > > > +	 * if MTE is present. Otherwise, it has the same attributes as
> > > > +	 * PAGE_KERNEL.
> > > > +	 */
> > > > +	pgprot = params->pgprot;
> > > > +	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
> > > > +		pgprot = PAGE_KERNEL_TAGGED;
> > > > +
> > > >   	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> > > > -			     size, params->pgprot, __pgd_pgtable_alloc,
> > > > +			     size, pgprot, __pgd_pgtable_alloc,
> > > >   			     flags);
[...]
> > ---------------8<-------------------------------
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index e17b96d0e4b5..5c78b92d9ec5 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -486,6 +486,8 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> >   	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
> >   #define pgprot_device(prot) \
> >   	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
> > +#define pgprot_tagged(prot) \
> > +	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
> >   /*
> >    * DMA allocations for non-coherent devices use what the Arm architecture calls
> >    * "Normal non-cacheable" memory, which permits speculation, unaligned accesses
> > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > index cdfc4e9f253e..f5f5044db2ce 100644
> > --- a/include/linux/pgtable.h
> > +++ b/include/linux/pgtable.h
> > @@ -904,6 +904,10 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
> >   #define pgprot_device pgprot_noncached
> >   #endif
> > 
> > +#ifndef pgprot_tagged
> > +#define pgprot_tagged(prot)	(prot)
> > +#endif
> > +
> >   #ifdef CONFIG_MMU
> >   #ifndef pgprot_modify
> >   #define pgprot_modify pgprot_modify
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 5ba51a8bdaeb..4253d80a59ba 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1072,7 +1072,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
> >    */
> >   int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
> >   {
> > -	struct mhp_params params = { .pgprot = PAGE_KERNEL };
> > +	struct mhp_params params = { .pgprot = pgprot_tagged(PAGE_KERNEL) };
> 
> This looks like we're pushing arch specific stuff ("tagged") in here. Can't
> we generalize this to something like
> 
> pgprot_mhp_default
> 
> (or a better name)
> 
> that defaults to PAGE_KERNEL on all architectures except arm64 which
> overwrites this somehow?

It works for me but I prefer the prot modification style similar to
pgprot_writecombine() etc. (i.e. takes a parameter like PAGE_KERNEL and
just changes bits of it).

Thanks.
David Hildenbrand March 8, 2021, 11 a.m. UTC | #5
On 05.03.21 18:44, Catalin Marinas wrote:
> On Fri, Mar 05, 2021 at 04:55:28PM +0100, David Hildenbrand wrote:
>> On 05.03.21 16:43, Catalin Marinas wrote:
>>> On Fri, Mar 05, 2021 at 11:19:08AM +0000, Catalin Marinas wrote:
>>>> On Tue, Mar 02, 2021 at 08:21:03PM -0800, Patrick Daly wrote:
>>>>> In a system which supports MTE, the linear kernel region must allow
>>>>> reading/writing allocation tags. For memory present at boot this
>>>>> is already being done in map_mem(). Add the same in arch_add_memory().
>>>>>
>>>>> Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
>>>>> ---
>>>>>    arch/arm64/mm/mmu.c | 12 +++++++++++-
>>>>>    1 file changed, 11 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>>> index 9b25d60b..0fcfe90 100644
>>>>> --- a/arch/arm64/mm/mmu.c
>>>>> +++ b/arch/arm64/mm/mmu.c
>>>>> @@ -1463,6 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>>>>    		    struct mhp_params *params)
>>>>>    {
>>>>>    	int ret, flags = 0;
>>>>> +	pgprot_t pgprot;
>>>>>    	if (!inside_linear_region(start, size)) {
>>>>>    		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
>>>>> @@ -1477,8 +1478,17 @@ int arch_add_memory(int nid, u64 start, u64 size,
>>>>>    	    IS_ENABLED(CONFIG_KFENCE))
>>>>>    		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>>>>> +	/*
>>>>> +	 * The linear map must allow allocation tags reading/writing
>>>>> +	 * if MTE is present. Otherwise, it has the same attributes as
>>>>> +	 * PAGE_KERNEL.
>>>>> +	 */
>>>>> +	pgprot = params->pgprot;
>>>>> +	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
>>>>> +		pgprot = PAGE_KERNEL_TAGGED;
>>>>> +
>>>>>    	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>>>>> -			     size, params->pgprot, __pgd_pgtable_alloc,
>>>>> +			     size, pgprot, __pgd_pgtable_alloc,
>>>>>    			     flags);
> [...]
>>> ---------------8<-------------------------------
>>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>>> index e17b96d0e4b5..5c78b92d9ec5 100644
>>> --- a/arch/arm64/include/asm/pgtable.h
>>> +++ b/arch/arm64/include/asm/pgtable.h
>>> @@ -486,6 +486,8 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>>>    	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
>>>    #define pgprot_device(prot) \
>>>    	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
>>> +#define pgprot_tagged(prot) \
>>> +	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
>>>    /*
>>>     * DMA allocations for non-coherent devices use what the Arm architecture calls
>>>     * "Normal non-cacheable" memory, which permits speculation, unaligned accesses
>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>> index cdfc4e9f253e..f5f5044db2ce 100644
>>> --- a/include/linux/pgtable.h
>>> +++ b/include/linux/pgtable.h
>>> @@ -904,6 +904,10 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
>>>    #define pgprot_device pgprot_noncached
>>>    #endif
>>>
>>> +#ifndef pgprot_tagged
>>> +#define pgprot_tagged(prot)	(prot)
>>> +#endif
>>> +
>>>    #ifdef CONFIG_MMU
>>>    #ifndef pgprot_modify
>>>    #define pgprot_modify pgprot_modify
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 5ba51a8bdaeb..4253d80a59ba 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -1072,7 +1072,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
>>>     */
>>>    int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
>>>    {
>>> -	struct mhp_params params = { .pgprot = PAGE_KERNEL };
>>> +	struct mhp_params params = { .pgprot = pgprot_tagged(PAGE_KERNEL) };
>>
>> This looks like we're pushing arch specific stuff ("tagged") in here. Can't
>> we generalize this to something like
>>
>> pgprot_mhp_default
>>
>> (or a better name)
>>
>> that defaults to PAGE_KERNEL on all architectures except arm64 which
>> overwrites this somehow?
> 
> It works for me but I prefer the prot modification style similar to
> pgprot_writecombine() etc. (i.e. takes a parameter like PAGE_KERNEL and
> just changes bits of it).

Sure, I'm wondering if we can abstract the "tagged" part here.

pgprot_mhp(PAGE_KERNEL) or whatever.

But maybe we do consider the "tagged" part okay in common code already 
(it felt quite arm64 specific to me at first). I can spot 
untagged_addr() all over the place. So maybe pgprot_tagged() is just fine.
Catalin Marinas March 8, 2021, 11:18 a.m. UTC | #6
On Mon, Mar 08, 2021 at 12:00:17PM +0100, David Hildenbrand wrote:
> On 05.03.21 18:44, Catalin Marinas wrote:
> > On Fri, Mar 05, 2021 at 04:55:28PM +0100, David Hildenbrand wrote:
> > > On 05.03.21 16:43, Catalin Marinas wrote:
> > > > On Fri, Mar 05, 2021 at 11:19:08AM +0000, Catalin Marinas wrote:
> > > > > On Tue, Mar 02, 2021 at 08:21:03PM -0800, Patrick Daly wrote:
> > > > > > In a system which supports MTE, the linear kernel region must allow
> > > > > > reading/writing allocation tags. For memory present at boot this
> > > > > > is already being done in map_mem(). Add the same in arch_add_memory().
> > > > > > 
> > > > > > Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
> > > > > > ---
> > > > > >    arch/arm64/mm/mmu.c | 12 +++++++++++-
> > > > > >    1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > > > > index 9b25d60b..0fcfe90 100644
> > > > > > --- a/arch/arm64/mm/mmu.c
> > > > > > +++ b/arch/arm64/mm/mmu.c
> > > > > > @@ -1463,6 +1463,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> > > > > >    		    struct mhp_params *params)
> > > > > >    {
> > > > > >    	int ret, flags = 0;
> > > > > > +	pgprot_t pgprot;
> > > > > >    	if (!inside_linear_region(start, size)) {
> > > > > >    		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
> > > > > > @@ -1477,8 +1478,17 @@ int arch_add_memory(int nid, u64 start, u64 size,
> > > > > >    	    IS_ENABLED(CONFIG_KFENCE))
> > > > > >    		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
> > > > > > +	/*
> > > > > > +	 * The linear map must allow allocation tags reading/writing
> > > > > > +	 * if MTE is present. Otherwise, it has the same attributes as
> > > > > > +	 * PAGE_KERNEL.
> > > > > > +	 */
> > > > > > +	pgprot = params->pgprot;
> > > > > > +	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
> > > > > > +		pgprot = PAGE_KERNEL_TAGGED;
> > > > > > +
> > > > > >    	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> > > > > > -			     size, params->pgprot, __pgd_pgtable_alloc,
> > > > > > +			     size, pgprot, __pgd_pgtable_alloc,
> > > > > >    			     flags);
> > [...]
> > > > ---------------8<-------------------------------
> > > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > > > index e17b96d0e4b5..5c78b92d9ec5 100644
> > > > --- a/arch/arm64/include/asm/pgtable.h
> > > > +++ b/arch/arm64/include/asm/pgtable.h
> > > > @@ -486,6 +486,8 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> > > >    	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
> > > >    #define pgprot_device(prot) \
> > > >    	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
> > > > +#define pgprot_tagged(prot) \
> > > > +	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
> > > >    /*
> > > >     * DMA allocations for non-coherent devices use what the Arm architecture calls
> > > >     * "Normal non-cacheable" memory, which permits speculation, unaligned accesses
> > > > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > > > index cdfc4e9f253e..f5f5044db2ce 100644
> > > > --- a/include/linux/pgtable.h
> > > > +++ b/include/linux/pgtable.h
> > > > @@ -904,6 +904,10 @@ static inline void ptep_modify_prot_commit(struct vm_area_struct *vma,
> > > >    #define pgprot_device pgprot_noncached
> > > >    #endif
> > > > 
> > > > +#ifndef pgprot_tagged
> > > > +#define pgprot_tagged(prot)	(prot)
> > > > +#endif
> > > > +
> > > >    #ifdef CONFIG_MMU
> > > >    #ifndef pgprot_modify
> > > >    #define pgprot_modify pgprot_modify
> > > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > > index 5ba51a8bdaeb..4253d80a59ba 100644
> > > > --- a/mm/memory_hotplug.c
> > > > +++ b/mm/memory_hotplug.c
> > > > @@ -1072,7 +1072,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
> > > >     */
> > > >    int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
> > > >    {
> > > > -	struct mhp_params params = { .pgprot = PAGE_KERNEL };
> > > > +	struct mhp_params params = { .pgprot = pgprot_tagged(PAGE_KERNEL) };
> > > 
> > > This looks like we're pushing arch specific stuff ("tagged") in here. Can't
> > > we generalize this to something like
> > > 
> > > pgprot_mhp_default
> > > 
> > > (or a better name)
> > > 
> > > that defaults to PAGE_KERNEL on all architectures except arm64 which
> > > overwrites this somehow?
> > 
> > It works for me but I prefer the prot modification style similar to
> > pgprot_writecombine() etc. (i.e. takes a parameter like PAGE_KERNEL and
> > just changes bits of it).
> 
> Sure, I'm wondering if we can abstract the "tagged" part here.
> 
> pgprot_mhp(PAGE_KERNEL) or whatever.
> 
> But maybe we do consider the "tagged" part okay in common code already (it
> felt quite arm64 specific to me at first). I can spot untagged_addr() all
> over the place. So maybe pgprot_tagged() is just fine.

That was my reasoning initially, though 'tagged' in untagged_addr()
refers to some bits in the top part of the address and that will become
more common. 'Tagged' as a mapping attribute is specific to arm64 and
sparc for now.

That said, pgprot_mhp() is probably better as the core code cannot tell
whether the arch implementation wants the hotplugged memory tagged or
not. Similarly, we'll add a pgprot_vmalloc() at some point.
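
i.e. something like this, sketched here untested:

	/* include/linux/pgtable.h */
	#ifndef pgprot_mhp
	#define pgprot_mhp(prot)	(prot)
	#endif

	/* arch/arm64/include/asm/pgtable.h */
	#define pgprot_mhp(prot)	pgprot_tagged(prot)

with add_memory_resource() then using .pgprot = pgprot_mhp(PAGE_KERNEL).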

Patch

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 9b25d60b..0fcfe90 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1463,6 +1463,7 @@  int arch_add_memory(int nid, u64 start, u64 size,
 		    struct mhp_params *params)
 {
 	int ret, flags = 0;
+	pgprot_t pgprot;
 
 	if (!inside_linear_region(start, size)) {
 		pr_err("[%llx %llx] is outside linear mapping region\n", start, start + size);
@@ -1477,8 +1478,17 @@  int arch_add_memory(int nid, u64 start, u64 size,
 	    IS_ENABLED(CONFIG_KFENCE))
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
+	/*
+	 * The linear map must allow allocation tags reading/writing
+	 * if MTE is present. Otherwise, it has the same attributes as
+	 * PAGE_KERNEL.
+	 */
+	pgprot = params->pgprot;
+	if (pgprot_val(pgprot) == pgprot_val(PAGE_KERNEL))
+		pgprot = PAGE_KERNEL_TAGGED;
+
 	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
-			     size, params->pgprot, __pgd_pgtable_alloc,
+			     size, pgprot, __pgd_pgtable_alloc,
 			     flags);
 
 	memblock_clear_nomap(start, size);