[0/9] Merge arm64/riscv hugetlbfs contpte support

Message ID 20240301091455.246686-1-alexghiti@rivosinc.com (mailing list archive)

Message

Alexandre Ghiti March 1, 2024, 9:14 a.m. UTC
This patchset intends to merge the contiguous ptes hugetlbfs implementation
of arm64 and riscv.

Both arm64 and riscv support the use of contiguous ptes to map pages that
are larger than the default page size, respectively called contpte
and svnapot.

The riscv implementation differs from the arm64's in that the LSBs of the
pfn of a svnapot pte are used to store the size of the mapping, allowing
for future sizes to be added (for now only 64KB is supported). That's an
issue for the core mm code which expects to find the *real* pfn a pte points
to. Patch 1 fixes that by always returning svnapot ptes with the real pfn
and restores the size of the mapping when it is written to a page table.
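
The pfn handling described above can be sketched in plain C. This is a toy model under stated assumptions, not the code from patch 1: it only looks at the pfn field, leaves out the NAPOT flag bit of the pte, and the helper names (napot_encode_pfn(), napot_order(), napot_decode_pfn()) are hypothetical. It assumes the 64KB encoding where the low 4 pfn bits hold the pattern 0b1000.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: only the pfn field of a pte, not the kernel's pte_t. */
#define NAPOT_ORDER_64K 4u /* 64KB / 4KB = 16 base pages = 2^4 */

/* Fold the mapping size into the low pfn bits (the value written to the table). */
static uint64_t napot_encode_pfn(uint64_t real_pfn, unsigned int order)
{
	assert(order >= 1 && order < 64);

	uint64_t low_mask = (1ull << order) - 1;

	assert((real_pfn & low_mask) == 0);      /* base pfn must be size-aligned */
	return real_pfn | (1ull << (order - 1)); /* 0b1000 for 64KB */
}

/* Recover the order from the position of the lowest set bit. */
static unsigned int napot_order(uint64_t stored_pfn)
{
	unsigned int order = 1;

	assert(stored_pfn != 0);
	while (!(stored_pfn & 1)) {
		stored_pfn >>= 1;
		order++;
	}
	return order;
}

/* Strip the size marker to get back the pfn that core mm expects to see. */
static uint64_t napot_decode_pfn(uint64_t stored_pfn)
{
	uint64_t low_mask = (1ull << napot_order(stored_pfn)) - 1;

	return stored_pfn & ~low_mask;
}
```

In this model a base pfn of 0x12340 is stored as 0x12348, and decoding 0x12348 gives 0x12340 back. Keeping the encode step on the write path and the decode step on the read path is what lets generic code work with the real pfn only.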

The following patches are just merges of the 2 different implementations
that currently exist in arm64 and riscv which are very similar. It paves
the way to the reuse of the recent contpte THP work by Ryan [1] to avoid
reimplementing the same in riscv.
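
As a rough illustration of why the two implementations are similar enough to share, here is a toy userspace model of a "get" helper for a contiguous mapping. It is loosely modelled on what the arm64 hugetlb code does (fold the young/dirty state of every entry in the contiguous range into the returned pte), but the bit layout, the toy_pte_t type and toy_huge_ptep_get() are all made up for this sketch and are not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define TOY_PTE_PRESENT (1ull << 0)
#define TOY_PTE_YOUNG   (1ull << 1)
#define TOY_PTE_DIRTY   (1ull << 2)
#define TOY_PTE_CONT    (1ull << 3)

typedef uint64_t toy_pte_t;

/*
 * Return the first pte of a contiguous range with the young/dirty bits
 * of all ncontig entries folded in. Non-present or non-contiguous ptes
 * are returned unchanged.
 */
static toy_pte_t toy_huge_ptep_get(const toy_pte_t *ptep, size_t ncontig)
{
	toy_pte_t orig = ptep[0];

	if (!(orig & TOY_PTE_PRESENT) || !(orig & TOY_PTE_CONT))
		return orig;

	for (size_t i = 0; i < ncontig; i++)
		orig |= ptep[i] & (TOY_PTE_YOUNG | TOY_PTE_DIRTY);

	return orig;
}
```

With 16 entries marked present and contiguous, setting the dirty bit on one entry and the young bit on another makes the returned pte both dirty and young. Only the number of entries and the way a contiguous pte is recognised would differ per architecture in such a model.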

This patchset was tested by running the libhugetlbfs testsuite with 64KB
and 2MB pages on both architectures (on a 4KB base page size arm64 kernel).

[1] https://lore.kernel.org/linux-arm-kernel/20240215103205.2607016-1-ryan.roberts@arm.com/

Alexandre Ghiti (9):
  riscv: Restore the pfn in a NAPOT pte when manipulated by core mm code
  riscv: Safely remove huge_pte_offset() when manipulating NAPOT ptes
  mm: Use common huge_ptep_get() function for riscv/arm64
  mm: Use common set_huge_pte_at() function for riscv/arm64
  mm: Use common huge_pte_clear() function for riscv/arm64
  mm: Use common huge_ptep_get_and_clear() function for riscv/arm64
  mm: Use common huge_ptep_set_access_flags() function for riscv/arm64
  mm: Use common huge_ptep_set_wrprotect() function for riscv/arm64
  mm: Use common huge_ptep_clear_flush() function for riscv/arm64

 arch/arm64/Kconfig                  |   1 +
 arch/arm64/include/asm/pgtable.h    |  59 +++++-
 arch/arm64/mm/hugetlbpage.c         | 291 +---------------------------
 arch/riscv/Kconfig                  |   1 +
 arch/riscv/include/asm/hugetlb.h    |   2 +-
 arch/riscv/include/asm/pgtable-64.h |  11 ++
 arch/riscv/include/asm/pgtable.h    | 120 +++++++++++-
 arch/riscv/mm/hugetlbpage.c         | 227 ----------------------
 mm/Kconfig                          |   3 +
 mm/Makefile                         |   1 +
 mm/contpte.c                        | 268 +++++++++++++++++++++++++
 11 files changed, 456 insertions(+), 528 deletions(-)
 create mode 100644 mm/contpte.c

Comments

Ryan Roberts March 1, 2024, 10:45 a.m. UTC | #1
Hi Alexandre,

I confess I haven't looked at the patches yet, but this cover letter raises a
few questions for me. I'll aim to look at the actual patches in due course.

On 01/03/2024 09:14, Alexandre Ghiti wrote:
> This patchset intends to merge the contiguous ptes hugetlbfs implementation
> of arm64 and riscv.
> 
> Both arm64 and riscv support the use of contiguous ptes to map pages that
> are larger than the default page size, respectively called contpte
> and svnapot.
> 
> The riscv implementation differs from the arm64's in that the LSBs of the
> pfn of a svnapot pte are used to store the size of the mapping, allowing
> for future sizes to be added (for now only 64KB is supported). That's an
> issue for the core mm code which expects to find the *real* pfn a pte points
> to. Patch 1 fixes that by always returning svnapot ptes with the real pfn
> and restores the size of the mapping when it is written to a page table.

Yes that makes sense to me. The intention for mTHP (!hugetlb) is to fully
encapsulate PTEs behind set_ptes(), ptep_get() and friends, so what's actually
written to the pgtable is arch-specific and well abstracted.

> 
> The following patches are just merges of the 2 different implementations
> that currently exist in arm64 and riscv which are very similar. It paves
> the way to the reuse of the recent contpte THP work by Ryan [1] to avoid
> reimplementing the same in riscv.

You seem to be talking about both hugetlb (which uses the "huge" pte helpers)
and contpte for THP (i.e. mTHP, which uses the regular pte helpers). They are
pretty separate in my mind, so not sure why you would be modifying them both in
the same series?

Thanks,
Ryan

Alexandre Ghiti March 1, 2024, 11:29 a.m. UTC | #2
Hi Ryan,

On Fri, Mar 1, 2024 at 11:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> Hi Alexandre,
>
> I confess I haven't looked at the patches yet, but this cover letter raises a
> few questions for me. I'll aim to look at the actual patches in due course.
>
> On 01/03/2024 09:14, Alexandre Ghiti wrote:
> > This patchset intends to merge the contiguous ptes hugetlbfs implementation
> > of arm64 and riscv.
> >
> > Both arm64 and riscv support the use of contiguous ptes to map pages that
> > are larger than the default page size, respectively called contpte
> > and svnapot.
> >
> > The riscv implementation differs from the arm64's in that the LSBs of the
> > pfn of a svnapot pte are used to store the size of the mapping, allowing
> > for future sizes to be added (for now only 64KB is supported). That's an
> > issue for the core mm code which expects to find the *real* pfn a pte points
> > to. Patch 1 fixes that by always returning svnapot ptes with the real pfn
> > and restores the size of the mapping when it is written to a page table.
>
> Yes that makes sense to me. The intention for mTHP (!hugetlb) is to fully
> encapsulate PTEs behind set_ptes(), ptep_get() and friends, so what's actually
> written to the pgtable is arch-specific and well abstracted.
>
> >
> > The following patches are just merges of the 2 different implementations
> > that currently exist in arm64 and riscv which are very similar. It paves
> > the way to the reuse of the recent contpte THP work by Ryan [1] to avoid
> > reimplementing the same in riscv.
>
> You seem to be talking about both hugetlb (which uses the "huge" pte helpers)
> and contpte for THP (i.e. mTHP, which uses the regular pte helpers). They are
> pretty separate in my mind, so not sure why you would be modifying them both in
> the same series?

I don't, this patchset only deals with hugetlb, I just meant that this
series was just the beginning as I'm working on moving the contpte for
THP support in the generic code for riscv to use.

Sorry my wording was ambiguous :)

Thanks,

Alex

Ryan Roberts March 1, 2024, 11:38 a.m. UTC | #3
On 01/03/2024 11:29, Alexandre Ghiti wrote:
> Hi Ryan,
> 
> On Fri, Mar 1, 2024 at 11:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> Hi Alexandre,
>>
>> I confess I haven't looked at the patches yet, but this cover letter raises a
>> few questions for me. I'll aim to look at the actual patches in due course.
>>
>> On 01/03/2024 09:14, Alexandre Ghiti wrote:
>>> This patchset intends to merge the contiguous ptes hugetlbfs implementation
>>> of arm64 and riscv.
>>>
>>> Both arm64 and riscv support the use of contiguous ptes to map pages that
>>> are larger than the default page size, respectively called contpte
>>> and svnapot.
>>>
>>> The riscv implementation differs from the arm64's in that the LSBs of the
>>> pfn of a svnapot pte are used to store the size of the mapping, allowing
>>> for future sizes to be added (for now only 64KB is supported). That's an
>>> issue for the core mm code which expects to find the *real* pfn a pte points
>>> to. Patch 1 fixes that by always returning svnapot ptes with the real pfn
>>> and restores the size of the mapping when it is written to a page table.
>>
>> Yes that makes sense to me. The intention for mTHP (!hugetlb) is to fully
>> encapsulate PTEs behind set_ptes(), ptep_get() and friends, so what's actually
>> written to the pgtable is arch-specific and well abstracted.
>>
>>>
>>> The following patches are just merges of the 2 different implementations
>>> that currently exist in arm64 and riscv which are very similar. It paves
>>> the way to the reuse of the recent contpte THP work by Ryan [1] to avoid
>>> reimplementing the same in riscv.
>>
>> You seem to be talking about both hugetlb (which uses the "huge" pte helpers)
>> and contpte for THP (i.e. mTHP, which uses the regular pte helpers). They are
>> pretty separate in my mind, so not sure why you would be modifying them both in
>> the same series?
> 
> I don't, this patchset only deals with hugetlb, I just meant that this
> series was just the beginning as I'm working on moving the contpte for
> THP support in the generic code for riscv to use.

Ahh got it! Thanks for the explanation.

Palmer Dabbelt April 18, 2024, 10:11 p.m. UTC | #4
On Fri, 01 Mar 2024 03:29:18 PST (-0800), alexghiti@rivosinc.com wrote:
> Hi Ryan,
>
> On Fri, Mar 1, 2024 at 11:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> Hi Alexandre,
>>
>> I confess I haven't looked at the patches yet, but this cover letter raises a
>> few questions for me. I'll aim to look at the actual patches in due course.

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>

in case someone wants to pick them up via a generic tree.  I'm happy to 
take them via the RISC-V tree if folk want, no rush on my end I'm just 
scrubbing through old stuff.

Will Deacon April 19, 2024, 11:03 a.m. UTC | #5
On Thu, Apr 18, 2024 at 03:11:56PM -0700, Palmer Dabbelt wrote:
> On Fri, 01 Mar 2024 03:29:18 PST (-0800), alexghiti@rivosinc.com wrote:
> > On Fri, Mar 1, 2024 at 11:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> > > I confess I haven't looked at the patches yet, but this cover letter raises a
> > > few questions for me. I'll aim to look at the actual patches in due course.
> 
> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
> 
> in case someone wants to pick them up via a generic tree.  I'm happy to take
> them via the RISC-V tree if folk want, no rush on my end I'm just scrubbing
> through old stuff.

I'd definitely like to take the arm64 parts via the arm64 tree (on a
shared branch), as it's a non-trivial amount of mm code which may end up
conflicting. I'd also like to see Ryan's Ack on the changes before these
end up in -next.

Will
Alexandre Ghiti April 22, 2024, 8:50 a.m. UTC | #6
Hi,

On 19/04/2024 13:03, Will Deacon wrote:
> On Thu, Apr 18, 2024 at 03:11:56PM -0700, Palmer Dabbelt wrote:
>> On Fri, 01 Mar 2024 03:29:18 PST (-0800), alexghiti@rivosinc.com wrote:
>>> On Fri, Mar 1, 2024 at 11:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>> I confess I haven't looked at the patches yet, but this cover letter raises a
>>>> few questions for me. I'll aim to look at the actual patches in due course.
>> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
>>
>> in case someone wants to pick them up via a generic tree.  I'm happy to take
>> them via the RISC-V tree if folk want, no rush on my end I'm just scrubbing
>> through old stuff.
> I'd definitely like to take the arm64 parts via the arm64 tree (on a
> shared branch), as it's a non-trivial amount of mm code which may end up
> conflicting. I'd also like to see Ryan's Ack on the changes before these
> end up in -next.
>
> Will


The rebase on top of the contpte mTHP support changed quite a few 
things, I have something working and will send it soon, so no need to 
review this patchset.

Thanks,

Alex


Ryan Roberts April 23, 2024, 8:52 a.m. UTC | #7
On 22/04/2024 09:50, Alexandre Ghiti wrote:
> Hi,
> 
> On 19/04/2024 13:03, Will Deacon wrote:
>> On Thu, Apr 18, 2024 at 03:11:56PM -0700, Palmer Dabbelt wrote:
>>> On Fri, 01 Mar 2024 03:29:18 PST (-0800), alexghiti@rivosinc.com wrote:
>>>> On Fri, Mar 1, 2024 at 11:45 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>> I confess I haven't looked at the patches yet, but this cover letter raises a
>>>>> few questions for me. I'll aim to look at the actual patches in due course.
>>> Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
>>>
>>> in case someone wants to pick them up via a generic tree.  I'm happy to take
>>> them via the RISC-V tree if folk want, no rush on my end I'm just scrubbing
>>> through old stuff.
>> I'd definitely like to take the arm64 parts via the arm64 tree (on a
>> shared branch), as it's a non-trivial amount of mm code which may end up
>> conflicting. I'd also like to see Ryan's Ack on the changes before these
>> end up in -next.
>>
>> Will
> 
> 
> The rebase on top of the contpte mTHP support changed quite a few things, I have
> something working and will send it soon, so no need to review this patchset.

Sorry this fell off my desk. CC me on the next version and I'll take a look.
