diff mbox series

mm/debug: use valid physical memory for pmd/pud tests

Message ID 20230105215025.422635-1-fvdl@google.com (mailing list archive)
State New
Headers show
Series mm/debug: use valid physical memory for pmd/pud tests | expand

Commit Message

Frank van der Linden Jan. 5, 2023, 9:50 p.m. UTC
The page table debug tests need a physical address to validate
low-level page table manipulation with. The memory at this address
is not actually touched, it just encoded in the page table entries
at various levels during the tests only.

Since the memory is not used, the code just picks the physical
address of the start_kernel symbol. This value is then truncated
to get a properly aligned address that is to be used for various
tests. Because of the truncation, the address might not actually
exist, or might not describe a complete huge page. That's not a
problem for most tests, but the arch-specific code may check
for attribute validity and consistency. The x86 version of
{pud,pmd}_set_huge actually validates the MTRRs for the PMD/PUD
range. This may fail with an address derived from start_kernel,
depending on where the kernel was loaded and what the physical
memory layout of the system is. This then leads to false negatives
for the {pud,pmd}_set_huge tests.

Avoid this by finding a properly aligned memory range that exists
and is usable. If such a range is not found, skip the tests that
needed it.

Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 mm/debug_vm_pgtable.c | 70 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 61 insertions(+), 9 deletions(-)

Comments

Anshuman Khandual Jan. 6, 2023, 4:24 a.m. UTC | #1
Hi Frank,

Thanks for the patch, in principle this LGTM. Did a quick run on arm64,
did not find anything problematic. Although I have some comments below.

On 1/6/23 03:20, Frank van der Linden wrote:
> The page table debug tests need a physical address to validate
> low-level page table manipulation with. The memory at this address
> is not actually touched, it just encoded in the page table entries
> at various levels during the tests only.
> 
> Since the memory is not used, the code just picks the physical
> address of the start_kernel symbol. This value is then truncated
> to get a properly aligned address that is to be used for various
> tests. Because of the truncation, the address might not actually
> exist, or might not describe a complete huge page. That's not a
> problem for most tests, but the arch-specific code may check
> for attribute validity and consistency. The x86 version of
> {pud,pmd}_set_huge actually validates the MTRRs for the PMD/PUD
> range. This may fail with an address derived from start_kernel,
> depending on where the kernel was loaded and what the physical
> memory layout of the system is. This then leads to false negatives
> for the {pud,pmd}_set_huge tests.
> 
> Avoid this by finding a properly aligned memory range that exists
> and is usable. If such a range is not found, skip the tests that
> needed it.
> 
> Fixes: 399145f9eb6c ("mm/debug: add tests validating architecture page table helpers")
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: Frank van der Linden <fvdl@google.com>
> ---
>  mm/debug_vm_pgtable.c | 70 +++++++++++++++++++++++++++++++++++++------
>  1 file changed, 61 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index c631ade3f1d2..e9b52600904a 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -15,6 +15,7 @@
>  #include <linux/hugetlb.h>
>  #include <linux/kernel.h>
>  #include <linux/kconfig.h>
> +#include <linux/memblock.h>
>  #include <linux/mm.h>
>  #include <linux/mman.h>
>  #include <linux/mm_types.h>
> @@ -80,6 +81,8 @@ struct pgtable_debug_args {
>  	unsigned long		pmd_pfn;
>  	unsigned long		pte_pfn;
>  
> +	phys_addr_t		fixed_alignment;
> +

This should not be a 'phys_addr_t', as it does not really contain a
physical address. Alignment value can be captured in 'unsigned long'
like other elements.

>  	unsigned long		fixed_pgd_pfn;
>  	unsigned long		fixed_p4d_pfn;
>  	unsigned long		fixed_pud_pfn;
> @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args)
>  {
>  	pmd_t pmd;
>  
> -	if (!arch_vmap_pmd_supported(args->page_prot))
> +	if (!arch_vmap_pmd_supported(args->page_prot) ||
> +	    args->fixed_alignment < PMD_SIZE)
>  		return;

Small nit. Additional line not need for the conditional statement.


>  
>  	pr_debug("Validating PMD huge\n");
> @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args)
>  {
>  	pud_t pud;
>  
> -	if (!arch_vmap_pud_supported(args->page_prot))
> +	if (!arch_vmap_pud_supported(args->page_prot) ||
> +	    args->fixed_alignment < PUD_SIZE)
>  		return;
Small nit. Additional line not needed for the conditional statement.

>  
>  	pr_debug("Validating PUD huge\n");
> @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order)
>  	return page;
>  }
>  
> +/*
> + * Check if a physical memory range described by <pstart, pend> contains
> + * an area that is of size psize, and aligned to the same.
> + *
> + * Don't use address 0, and check for overflow.
> + */
> +static int __init phys_align_check(phys_addr_t pstart,
> +	phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp,
> +	phys_addr_t *alignp)
> +{
> +	phys_addr_t aligned_start, aligned_end;
> +
> +	if (pstart == 0)
> +		pstart = PAGE_SIZE;

Why ?

> +
> +	aligned_start = ALIGN(pstart, psize);
> +	aligned_end = aligned_start + psize;
> +
> +	if (aligned_end > aligned_start && aligned_end <= pend) {
> +		*alignp = psize;
> +		*physp = aligned_start;
> +		return 1;
> +	}
> +
> +	return 0;
> +}

To be more clear, this function should return a 'bool' instead

> +
> +
>  static int __init init_args(struct pgtable_debug_args *args)
>  {
>  	struct page *page = NULL;
>  	phys_addr_t phys;
>  	int ret = 0;
> +	u64 idx;
> +	phys_addr_t pstart, pend;

This declaration can be merged into the previous line containing 'phys'.

>  
>  	/*
>  	 * Initialize the debugging data.
> @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args)
>  	WARN_ON(!args->start_ptep);
>  
>  	/*
> -	 * PFN for mapping at PTE level is determined from a standard kernel
> -	 * text symbol. But pfns for higher page table levels are derived by
> -	 * masking lower bits of this real pfn. These derived pfns might not
> -	 * exist on the platform but that does not really matter as pfn_pxx()
> -	 * helpers will still create appropriate entries for the test. This
> -	 * helps avoid large memory block allocations to be used for mapping
> -	 * at higher page table levels in some of the tests.
> +	 * Find a valid physical range, preferably aligned to PUD_SIZE.
> +	 * Return the address and the alignment. It doesn't need to be
> +	 * allocated, it just needs to exist as usable memory. The memory
> +	 * won't be touched.
> +	 *
> +	 * The alignment is recorded, and can be checked to see if we
> +	 * can run the tests that require and actual valid physical

s/and/an ?

> +	 * address range on some architectures ({pmd,pud}_huge_test
> +	 * on x86).
>  	 */
> +
>  	phys = __pa_symbol(&start_kernel);

This original 'phys' will still be used as fallback, in case the below attempt
does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ?

> +	args->fixed_alignment = PAGE_SIZE;
> +
> +	for_each_mem_range(idx, &pstart, &pend) {
> +		if (phys_align_check(pstart, pend, PUD_SIZE, &phys,
> +				&args->fixed_alignment))
> +			break;
> +
> +		if (args->fixed_alignment >= PMD_SIZE)
> +			continue;
> +
> +		(void)phys_align_check(pstart, pend, PMD_SIZE, &phys,
> +				&args->fixed_alignment);

(void) ? Why not check the return value here ?

> +	}
> +
>  	args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK);
>  	args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK);
>  	args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);

This loops attempts to find a PUD_SIZE aligned address but breaks out in case it
atleast finds a PMD_SIZE aligned address, while looping through available memory
ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should
be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements.

- Anshuman
Frank van der Linden Jan. 6, 2023, 5:54 p.m. UTC | #2
Hi Anshuman, thanks for looking at this.


On Thu, Jan 5, 2023 at 8:24 PM Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
> Hi Frank,
>
> Thanks for the patch, in principle this LGTM. Did a quick run on arm64,
> did not find anything problematic. Although I have some comments below.
>
[...]

> > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> > index c631ade3f1d2..e9b52600904a 100644
> > --- a/mm/debug_vm_pgtable.c
> > +++ b/mm/debug_vm_pgtable.c
> > @@ -15,6 +15,7 @@
> >  #include <linux/hugetlb.h>
> >  #include <linux/kernel.h>
> >  #include <linux/kconfig.h>
> > +#include <linux/memblock.h>
> >  #include <linux/mm.h>
> >  #include <linux/mman.h>
> >  #include <linux/mm_types.h>
> > @@ -80,6 +81,8 @@ struct pgtable_debug_args {
> >       unsigned long           pmd_pfn;
> >       unsigned long           pte_pfn;
> >
> > +     phys_addr_t             fixed_alignment;
> > +
>
> This should not be a 'phys_addr_t', as it does not really contain a
> physical address. Alignment value can be captured in 'unsigned long'
> like other elements.

True, yep.

>
> >       unsigned long           fixed_pgd_pfn;
> >       unsigned long           fixed_p4d_pfn;
> >       unsigned long           fixed_pud_pfn;
> > @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args)
> >  {
> >       pmd_t pmd;
> >
> > -     if (!arch_vmap_pmd_supported(args->page_prot))
> > +     if (!arch_vmap_pmd_supported(args->page_prot) ||
> > +         args->fixed_alignment < PMD_SIZE)
> >               return;
>
> Small nit. Additional line not need for the conditional statement.
>

You mean the line break in the condition? Not breaking it would push
it to 90 characters (if tab=8).

Most of this file, except for a few lines, does stick to 80. I don't
feel particularly strongly about this either way, though :)

>
> >
> >       pr_debug("Validating PMD huge\n");
> > @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args)
> >  {
> >       pud_t pud;
> >
> > -     if (!arch_vmap_pud_supported(args->page_prot))
> > +     if (!arch_vmap_pud_supported(args->page_prot) ||
> > +         args->fixed_alignment < PUD_SIZE)
> >               return;
> Small nit. Additional line not needed for the conditional statement.

See above.

>
> >
> >       pr_debug("Validating PUD huge\n");
> > @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order)
> >       return page;
> >  }
> >
> > +/*
> > + * Check if a physical memory range described by <pstart, pend> contains
> > + * an area that is of size psize, and aligned to the same.
> > + *
> > + * Don't use address 0, and check for overflow.
> > + */
> > +static int __init phys_align_check(phys_addr_t pstart,
> > +     phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp,
> > +     phys_addr_t *alignp)
> > +{
> > +     phys_addr_t aligned_start, aligned_end;
> > +
> > +     if (pstart == 0)
> > +             pstart = PAGE_SIZE;
>
> Why ?

Since the physical address will be used for page table tests, I think
that avoiding 0 is probably a good idea. If e.g. a masking mistake
crept into the code somewhere, using physical address 0 might not find
it. Also, physical address 0 isn't used on x86.
>
> > +
> > +     aligned_start = ALIGN(pstart, psize);
> > +     aligned_end = aligned_start + psize;
> > +
> > +     if (aligned_end > aligned_start && aligned_end <= pend) {
> > +             *alignp = psize;
> > +             *physp = aligned_start;
> > +             return 1;
> > +     }
> > +
> > +     return 0;
> > +}
>
> To be more clear, this function should return a 'bool' instead

That would be better, yes.

>
> > +
> > +
> >  static int __init init_args(struct pgtable_debug_args *args)
> >  {
> >       struct page *page = NULL;
> >       phys_addr_t phys;
> >       int ret = 0;
> > +     u64 idx;
> > +     phys_addr_t pstart, pend;
>
> This declaration can be merged into the previous line containing 'phys'.

Sure, yes.
>
> >
> >       /*
> >        * Initialize the debugging data.
> > @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args)
> >       WARN_ON(!args->start_ptep);
> >
> >       /*
> > -      * PFN for mapping at PTE level is determined from a standard kernel
> > -      * text symbol. But pfns for higher page table levels are derived by
> > -      * masking lower bits of this real pfn. These derived pfns might not
> > -      * exist on the platform but that does not really matter as pfn_pxx()
> > -      * helpers will still create appropriate entries for the test. This
> > -      * helps avoid large memory block allocations to be used for mapping
> > -      * at higher page table levels in some of the tests.
> > +      * Find a valid physical range, preferably aligned to PUD_SIZE.
> > +      * Return the address and the alignment. It doesn't need to be
> > +      * allocated, it just needs to exist as usable memory. The memory
> > +      * won't be touched.
> > +      *
> > +      * The alignment is recorded, and can be checked to see if we
> > +      * can run the tests that require and actual valid physical
>
> s/and/an ?

Indeed, that's a typo.

>
> > +      * address range on some architectures ({pmd,pud}_huge_test
> > +      * on x86).
> >        */
> > +
> >       phys = __pa_symbol(&start_kernel);
>
> This original 'phys' will still be used as fallback, in case the below attempt
> does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ?

Right, the original value (as it is done now) is there as a fallback.

>
> > +     args->fixed_alignment = PAGE_SIZE;
> > +
> > +     for_each_mem_range(idx, &pstart, &pend) {
> > +             if (phys_align_check(pstart, pend, PUD_SIZE, &phys,
> > +                             &args->fixed_alignment))
> > +                     break;
> > +
> > +             if (args->fixed_alignment >= PMD_SIZE)
> > +                     continue;
> > +
> > +             (void)phys_align_check(pstart, pend, PMD_SIZE, &phys,
> > +                             &args->fixed_alignment);
>
> (void) ? Why not check the return value here ?

If you get to that function call, you know that no aligned area has
been found so far, so checking the return value won't change what
you're going to do: you're going to keep going, since even if you get
a PMD_SIZE aligned area, you still want to try to get a PUD_SIZE
aligned area. So there's no point in checking it.

>
> > +     }
> > +
> >       args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK);
> >       args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK);
> >       args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);
>
> This loops attempts to find a PUD_SIZE aligned address but breaks out in case it
> atleast finds a PMD_SIZE aligned address, while looping through available memory
> ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should
> be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements.

The loop keeps going until it either runs out of physical memory
ranges to check, or until it finds a PUD_SIZE-aligned area. It won't
break out for a PMD_SIZE-aligned area.

It could be made in to a separate function, yes, that might look a
little cleaner.
>
> - Anshuman

Thanks again for the comments. I see that this was added to
mm-unstable by now. I can send an mm-unstable follow-up patch (though
there won't be any functional changes).

- Frank
Anshuman Khandual Jan. 9, 2023, 8:48 a.m. UTC | #3
On 1/6/23 23:24, Frank van der Linden wrote:
> Hi Anshuman, thanks for looking at this.
> 
> 
> On Thu, Jan 5, 2023 at 8:24 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>>
>> Hi Frank,
>>
>> Thanks for the patch, in principle this LGTM. Did a quick run on arm64,
>> did not find anything problematic. Although I have some comments below.
>>
> [...]
> 
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index c631ade3f1d2..e9b52600904a 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -15,6 +15,7 @@
>>>  #include <linux/hugetlb.h>
>>>  #include <linux/kernel.h>
>>>  #include <linux/kconfig.h>
>>> +#include <linux/memblock.h>
>>>  #include <linux/mm.h>
>>>  #include <linux/mman.h>
>>>  #include <linux/mm_types.h>
>>> @@ -80,6 +81,8 @@ struct pgtable_debug_args {
>>>       unsigned long           pmd_pfn;
>>>       unsigned long           pte_pfn;
>>>
>>> +     phys_addr_t             fixed_alignment;
>>> +
>>
>> This should not be a 'phys_addr_t', as it does not really contain a
>> physical address. Alignment value can be captured in 'unsigned long'
>> like other elements.
> 
> True, yep.
> 
>>
>>>       unsigned long           fixed_pgd_pfn;
>>>       unsigned long           fixed_p4d_pfn;
>>>       unsigned long           fixed_pud_pfn;
>>> @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args)
>>>  {
>>>       pmd_t pmd;
>>>
>>> -     if (!arch_vmap_pmd_supported(args->page_prot))
>>> +     if (!arch_vmap_pmd_supported(args->page_prot) ||
>>> +         args->fixed_alignment < PMD_SIZE)
>>>               return;
>>
>> Small nit. Additional line not need for the conditional statement.
>>
> 
> You mean the line break in the condition? Not breaking it would push
> it to 90 characters (if tab=8).
> 
> Most of this file, except for a few lines, does stick to 80. I don't
> feel particularly strongly about this either way, though :)

I guess currently the lines could extend up to 100 instead.

> 
>>
>>>
>>>       pr_debug("Validating PMD huge\n");
>>> @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args)
>>>  {
>>>       pud_t pud;
>>>
>>> -     if (!arch_vmap_pud_supported(args->page_prot))
>>> +     if (!arch_vmap_pud_supported(args->page_prot) ||
>>> +         args->fixed_alignment < PUD_SIZE)
>>>               return;
>> Small nit. Additional line not needed for the conditional statement.
> 
> See above.
> 
>>
>>>
>>>       pr_debug("Validating PUD huge\n");
>>> @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order)
>>>       return page;
>>>  }
>>>
>>> +/*
>>> + * Check if a physical memory range described by <pstart, pend> contains
>>> + * an area that is of size psize, and aligned to the same.
>>> + *
>>> + * Don't use address 0, and check for overflow.
>>> + */
>>> +static int __init phys_align_check(phys_addr_t pstart,
>>> +     phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp,
>>> +     phys_addr_t *alignp)
>>> +{
>>> +     phys_addr_t aligned_start, aligned_end;
>>> +
>>> +     if (pstart == 0)
>>> +             pstart = PAGE_SIZE;
>>
>> Why ?
> 
> Since the physical address will be used for page table tests, I think
> that avoiding 0 is probably a good idea. If e.g. a masking mistake
> crept into the code somewhere, using physical address 0 might not find
> it. Also, physical address 0 isn't used on x86.

Make sense, but will need a small comment explaining the same.

>>
>>> +
>>> +     aligned_start = ALIGN(pstart, psize);
>>> +     aligned_end = aligned_start + psize;
>>> +
>>> +     if (aligned_end > aligned_start && aligned_end <= pend) {
>>> +             *alignp = psize;
>>> +             *physp = aligned_start;
>>> +             return 1;
>>> +     }
>>> +
>>> +     return 0;
>>> +}
>>
>> To be more clear, this function should return a 'bool' instead
> 
> That would be better, yes.
> 
>>
>>> +
>>> +
>>>  static int __init init_args(struct pgtable_debug_args *args)
>>>  {
>>>       struct page *page = NULL;
>>>       phys_addr_t phys;
>>>       int ret = 0;
>>> +     u64 idx;
>>> +     phys_addr_t pstart, pend;
>>
>> This declaration can be merged into the previous line containing 'phys'.
> 
> Sure, yes.
>>
>>>
>>>       /*
>>>        * Initialize the debugging data.
>>> @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args)
>>>       WARN_ON(!args->start_ptep);
>>>
>>>       /*
>>> -      * PFN for mapping at PTE level is determined from a standard kernel
>>> -      * text symbol. But pfns for higher page table levels are derived by
>>> -      * masking lower bits of this real pfn. These derived pfns might not
>>> -      * exist on the platform but that does not really matter as pfn_pxx()
>>> -      * helpers will still create appropriate entries for the test. This
>>> -      * helps avoid large memory block allocations to be used for mapping
>>> -      * at higher page table levels in some of the tests.
>>> +      * Find a valid physical range, preferably aligned to PUD_SIZE.
>>> +      * Return the address and the alignment. It doesn't need to be
>>> +      * allocated, it just needs to exist as usable memory. The memory
>>> +      * won't be touched.
>>> +      *
>>> +      * The alignment is recorded, and can be checked to see if we
>>> +      * can run the tests that require and actual valid physical
>>
>> s/and/an ?
> 
> Indeed, that's a typo.
> 
>>
>>> +      * address range on some architectures ({pmd,pud}_huge_test
>>> +      * on x86).
>>>        */
>>> +
>>>       phys = __pa_symbol(&start_kernel);
>>
>> This original 'phys' will still be used as fallback, in case the below attempt
>> does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ?
> 
> Right, the original value (as it is done now) is there as a fallback.
> 
>>
>>> +     args->fixed_alignment = PAGE_SIZE;
>>> +
>>> +     for_each_mem_range(idx, &pstart, &pend) {
>>> +             if (phys_align_check(pstart, pend, PUD_SIZE, &phys,
>>> +                             &args->fixed_alignment))
>>> +                     break;
>>> +
>>> +             if (args->fixed_alignment >= PMD_SIZE)
>>> +                     continue;
>>> +
>>> +             (void)phys_align_check(pstart, pend, PMD_SIZE, &phys,
>>> +                             &args->fixed_alignment);
>>
>> (void) ? Why not check the return value here ?
> 
> If you get to that function call, you know that no aligned area has
> been found so far, so checking the return value won't change what
> you're going to do: you're going to keep going, since even if you get
> a PMD_SIZE aligned area, you still want to try to get a PUD_SIZE
> aligned area. So there's no point in checking it.

Okay but does a void is really necessary here even if the return value
is not checked ?

> 
>>
>>> +     }
>>> +
>>>       args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK);
>>>       args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK);
>>>       args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);
>>
>> This loops attempts to find a PUD_SIZE aligned address but breaks out in case it
>> atleast finds a PMD_SIZE aligned address, while looping through available memory
>> ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should
>> be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements.
> 
> The loop keeps going until it either runs out of physical memory
> ranges to check, or until it finds a PUD_SIZE-aligned area. It won't
> break out for a PMD_SIZE-aligned area.
> 
> It could be made in to a separate function, yes, that might look a
> little cleaner.

Indeed.

>>
>> - Anshuman
> 
> Thanks again for the comments. I see that this was added to
> mm-unstable by now. I can send an mm-unstable follow-up patch (though
> there won't be any functional changes).

I think you could still send an updated version with the suggested changes,
which can be pulled again for mm-unstable. These changes should be part of
a single commit being merged, for future clarity while reading these code.
Frank van der Linden Jan. 9, 2023, 5:47 p.m. UTC | #4
Sure, v2 sent, addressing your comments. I got rid of the return value
of phys_align_check() entirely, instead just checking the recorded
alignment value. It's more consistent. Added more comments, made types
consistent, split off things into a function.

Thanks,

- Frank

On Mon, Jan 9, 2023 at 12:48 AM Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
>
> On 1/6/23 23:24, Frank van der Linden wrote:
> > Hi Anshuman, thanks for looking at this.
> >
> >
> > On Thu, Jan 5, 2023 at 8:24 PM Anshuman Khandual
> > <anshuman.khandual@arm.com> wrote:
> >>
> >> Hi Frank,
> >>
> >> Thanks for the patch, in principle this LGTM. Did a quick run on arm64,
> >> did not find anything problematic. Although I have some comments below.
> >>
> > [...]
> >
> >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> >>> index c631ade3f1d2..e9b52600904a 100644
> >>> --- a/mm/debug_vm_pgtable.c
> >>> +++ b/mm/debug_vm_pgtable.c
> >>> @@ -15,6 +15,7 @@
> >>>  #include <linux/hugetlb.h>
> >>>  #include <linux/kernel.h>
> >>>  #include <linux/kconfig.h>
> >>> +#include <linux/memblock.h>
> >>>  #include <linux/mm.h>
> >>>  #include <linux/mman.h>
> >>>  #include <linux/mm_types.h>
> >>> @@ -80,6 +81,8 @@ struct pgtable_debug_args {
> >>>       unsigned long           pmd_pfn;
> >>>       unsigned long           pte_pfn;
> >>>
> >>> +     phys_addr_t             fixed_alignment;
> >>> +
> >>
> >> This should not be a 'phys_addr_t', as it does not really contain a
> >> physical address. Alignment value can be captured in 'unsigned long'
> >> like other elements.
> >
> > True, yep.
> >
> >>
> >>>       unsigned long           fixed_pgd_pfn;
> >>>       unsigned long           fixed_p4d_pfn;
> >>>       unsigned long           fixed_pud_pfn;
> >>> @@ -430,7 +433,8 @@ static void __init pmd_huge_tests(struct pgtable_debug_args *args)
> >>>  {
> >>>       pmd_t pmd;
> >>>
> >>> -     if (!arch_vmap_pmd_supported(args->page_prot))
> >>> +     if (!arch_vmap_pmd_supported(args->page_prot) ||
> >>> +         args->fixed_alignment < PMD_SIZE)
> >>>               return;
> >>
> >> Small nit. Additional line not need for the conditional statement.
> >>
> >
> > You mean the line break in the condition? Not breaking it would push
> > it to 90 characters (if tab=8).
> >
> > Most of this file, except for a few lines, does stick to 80. I don't
> > feel particularly strongly about this either way, though :)
>
> I guess currently the lines could extend up to 100 instead.
>
> >
> >>
> >>>
> >>>       pr_debug("Validating PMD huge\n");
> >>> @@ -449,7 +453,8 @@ static void __init pud_huge_tests(struct pgtable_debug_args *args)
> >>>  {
> >>>       pud_t pud;
> >>>
> >>> -     if (!arch_vmap_pud_supported(args->page_prot))
> >>> +     if (!arch_vmap_pud_supported(args->page_prot) ||
> >>> +         args->fixed_alignment < PUD_SIZE)
> >>>               return;
> >> Small nit. Additional line not needed for the conditional statement.
> >
> > See above.
> >
> >>
> >>>
> >>>       pr_debug("Validating PUD huge\n");
> >>> @@ -1077,11 +1082,41 @@ debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order)
> >>>       return page;
> >>>  }
> >>>
> >>> +/*
> >>> + * Check if a physical memory range described by <pstart, pend> contains
> >>> + * an area that is of size psize, and aligned to the same.
> >>> + *
> >>> + * Don't use address 0, and check for overflow.
> >>> + */
> >>> +static int __init phys_align_check(phys_addr_t pstart,
> >>> +     phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp,
> >>> +     phys_addr_t *alignp)
> >>> +{
> >>> +     phys_addr_t aligned_start, aligned_end;
> >>> +
> >>> +     if (pstart == 0)
> >>> +             pstart = PAGE_SIZE;
> >>
> >> Why ?
> >
> > Since the physical address will be used for page table tests, I think
> > that avoiding 0 is probably a good idea. If e.g. a masking mistake
> > crept into the code somewhere, using physical address 0 might not find
> > it. Also, physical address 0 isn't used on x86.
>
> Make sense, but will need a small comment explaining the same.
>
> >>
> >>> +
> >>> +     aligned_start = ALIGN(pstart, psize);
> >>> +     aligned_end = aligned_start + psize;
> >>> +
> >>> +     if (aligned_end > aligned_start && aligned_end <= pend) {
> >>> +             *alignp = psize;
> >>> +             *physp = aligned_start;
> >>> +             return 1;
> >>> +     }
> >>> +
> >>> +     return 0;
> >>> +}
> >>
> >> To be more clear, this function should return a 'bool' instead
> >
> > That would be better, yes.
> >
> >>
> >>> +
> >>> +
> >>>  static int __init init_args(struct pgtable_debug_args *args)
> >>>  {
> >>>       struct page *page = NULL;
> >>>       phys_addr_t phys;
> >>>       int ret = 0;
> >>> +     u64 idx;
> >>> +     phys_addr_t pstart, pend;
> >>
> >> This declaration can be merged into the previous line containing 'phys'.
> >
> > Sure, yes.
> >>
> >>>
> >>>       /*
> >>>        * Initialize the debugging data.
> >>> @@ -1161,15 +1196,32 @@ static int __init init_args(struct pgtable_debug_args *args)
> >>>       WARN_ON(!args->start_ptep);
> >>>
> >>>       /*
> >>> -      * PFN for mapping at PTE level is determined from a standard kernel
> >>> -      * text symbol. But pfns for higher page table levels are derived by
> >>> -      * masking lower bits of this real pfn. These derived pfns might not
> >>> -      * exist on the platform but that does not really matter as pfn_pxx()
> >>> -      * helpers will still create appropriate entries for the test. This
> >>> -      * helps avoid large memory block allocations to be used for mapping
> >>> -      * at higher page table levels in some of the tests.
> >>> +      * Find a valid physical range, preferably aligned to PUD_SIZE.
> >>> +      * Return the address and the alignment. It doesn't need to be
> >>> +      * allocated, it just needs to exist as usable memory. The memory
> >>> +      * won't be touched.
> >>> +      *
> >>> +      * The alignment is recorded, and can be checked to see if we
> >>> +      * can run the tests that require and actual valid physical
> >>
> >> s/and/an ?
> >
> > Indeed, that's a typo.
> >
> >>
> >>> +      * address range on some architectures ({pmd,pud}_huge_test
> >>> +      * on x86).
> >>>        */
> >>> +
> >>>       phys = __pa_symbol(&start_kernel);
> >>
> >> This original 'phys' will still be used as fallback, in case the below attempt
> >> does not find a physical address with required alignments i.e [PUD|PMD]_SIZE ?
> >
> > Right, the original value (as it is done now) is there as a fallback.
> >
> >>
> >>> +     args->fixed_alignment = PAGE_SIZE;
> >>> +
> >>> +     for_each_mem_range(idx, &pstart, &pend) {
> >>> +             if (phys_align_check(pstart, pend, PUD_SIZE, &phys,
> >>> +                             &args->fixed_alignment))
> >>> +                     break;
> >>> +
> >>> +             if (args->fixed_alignment >= PMD_SIZE)
> >>> +                     continue;
> >>> +
> >>> +             (void)phys_align_check(pstart, pend, PMD_SIZE, &phys,
> >>> +                             &args->fixed_alignment);
> >>
> >> (void) ? Why not check the return value here ?
> >
> > If you get to that function call, you know that no aligned area has
> > been found so far, so checking the return value won't change what
> > you're going to do: you're going to keep going, since even if you get
> > a PMD_SIZE aligned area, you still want to try to get a PUD_SIZE
> > aligned area. So there's no point in checking it.
>
> Okay but does a void is really necessary here even if the return value
> is not checked ?
>
> >
> >>
> >>> +     }
> >>> +
> >>>       args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK);
> >>>       args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK);
> >>>       args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);
> >>
> >> This loops attempts to find a PUD_SIZE aligned address but breaks out in case it
> >> atleast finds a PMD_SIZE aligned address, while looping through available memory
> >> ranges. The entire process of finding 'phys' and 'args->fixed_alignment' should
> >> be encapsulated inside a helper that also updates 'args->fixed_pxx_pfn' elements.
> >
> > The loop keeps going until it either runs out of physical memory
> > ranges to check, or until it finds a PUD_SIZE-aligned area. It won't
> > break out for a PMD_SIZE-aligned area.
> >
> > It could be made in to a separate function, yes, that might look a
> > little cleaner.
>
> Indeed.
>
> >>
> >> - Anshuman
> >
> > Thanks again for the comments. I see that this was added to
> > mm-unstable by now. I can send an mm-unstable follow-up patch (though
> > there won't be any functional changes).
>
> I think you could still send an updated version with the suggested changes,
> which can be pulled again for mm-unstable. These changes should be part of
> a single commit being merged, for future clarity while reading these code.
diff mbox series

Patch

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index c631ade3f1d2..e9b52600904a 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -15,6 +15,7 @@ 
 #include <linux/hugetlb.h>
 #include <linux/kernel.h>
 #include <linux/kconfig.h>
+#include <linux/memblock.h>
 #include <linux/mm.h>
 #include <linux/mman.h>
 #include <linux/mm_types.h>
@@ -80,6 +81,8 @@  struct pgtable_debug_args {
 	unsigned long		pmd_pfn;
 	unsigned long		pte_pfn;
 
+	phys_addr_t		fixed_alignment;
+
 	unsigned long		fixed_pgd_pfn;
 	unsigned long		fixed_p4d_pfn;
 	unsigned long		fixed_pud_pfn;
@@ -430,7 +433,8 @@  static void __init pmd_huge_tests(struct pgtable_debug_args *args)
 {
 	pmd_t pmd;
 
-	if (!arch_vmap_pmd_supported(args->page_prot))
+	if (!arch_vmap_pmd_supported(args->page_prot) ||
+	    args->fixed_alignment < PMD_SIZE)
 		return;
 
 	pr_debug("Validating PMD huge\n");
@@ -449,7 +453,8 @@  static void __init pud_huge_tests(struct pgtable_debug_args *args)
 {
 	pud_t pud;
 
-	if (!arch_vmap_pud_supported(args->page_prot))
+	if (!arch_vmap_pud_supported(args->page_prot) ||
+	    args->fixed_alignment < PUD_SIZE)
 		return;
 
 	pr_debug("Validating PUD huge\n");
@@ -1077,11 +1082,41 @@  debug_vm_pgtable_alloc_huge_page(struct pgtable_debug_args *args, int order)
 	return page;
 }
 
+/*
+ * Check if a physical memory range described by <pstart, pend> contains
+ * an area that is of size psize, and aligned to the same.
+ *
+ * Don't use address 0, and check for overflow.
+ */
+static int __init phys_align_check(phys_addr_t pstart,
+	phys_addr_t pend, phys_addr_t psize, phys_addr_t *physp,
+	phys_addr_t *alignp)
+{
+	phys_addr_t aligned_start, aligned_end;
+
+	if (pstart == 0)
+		pstart = PAGE_SIZE;
+
+	aligned_start = ALIGN(pstart, psize);
+	aligned_end = aligned_start + psize;
+
+	if (aligned_end > aligned_start && aligned_end <= pend) {
+		*alignp = psize;
+		*physp = aligned_start;
+		return 1;
+	}
+
+	return 0;
+}
+
+
 static int __init init_args(struct pgtable_debug_args *args)
 {
 	struct page *page = NULL;
 	phys_addr_t phys;
 	int ret = 0;
+	u64 idx;
+	phys_addr_t pstart, pend;
 
 	/*
 	 * Initialize the debugging data.
@@ -1161,15 +1196,32 @@  static int __init init_args(struct pgtable_debug_args *args)
 	WARN_ON(!args->start_ptep);
 
 	/*
-	 * PFN for mapping at PTE level is determined from a standard kernel
-	 * text symbol. But pfns for higher page table levels are derived by
-	 * masking lower bits of this real pfn. These derived pfns might not
-	 * exist on the platform but that does not really matter as pfn_pxx()
-	 * helpers will still create appropriate entries for the test. This
-	 * helps avoid large memory block allocations to be used for mapping
-	 * at higher page table levels in some of the tests.
+	 * Find a valid physical range, preferably aligned to PUD_SIZE.
+	 * Return the address and the alignment. It doesn't need to be
+	 * allocated, it just needs to exist as usable memory. The memory
+	 * won't be touched.
+	 *
+	 * The alignment is recorded, and can be checked to see if we
+	 * can run the tests that require and actual valid physical
+	 * address range on some architectures ({pmd,pud}_huge_test
+	 * on x86).
 	 */
+
 	phys = __pa_symbol(&start_kernel);
+	args->fixed_alignment = PAGE_SIZE;
+
+	for_each_mem_range(idx, &pstart, &pend) {
+		if (phys_align_check(pstart, pend, PUD_SIZE, &phys,
+				&args->fixed_alignment))
+			break;
+
+		if (args->fixed_alignment >= PMD_SIZE)
+			continue;
+
+		(void)phys_align_check(pstart, pend, PMD_SIZE, &phys,
+				&args->fixed_alignment);
+	}
+
 	args->fixed_pgd_pfn = __phys_to_pfn(phys & PGDIR_MASK);
 	args->fixed_p4d_pfn = __phys_to_pfn(phys & P4D_MASK);
 	args->fixed_pud_pfn = __phys_to_pfn(phys & PUD_MASK);