[v6,00/13] Add support for DAX vmemmap optimization for ppc64

Message ID 20230724190759.483013-1-aneesh.kumar@linux.ibm.com (mailing list archive)

Message

Aneesh Kumar K.V July 24, 2023, 7:07 p.m. UTC
This patch series implements changes required to support DAX vmemmap
optimization for ppc64. The vmemmap optimization is only enabled with radix MMU
translation and 1GB PUD mapping with 64K page size. The series also splits
hugetlb vmemmap optimization into a separate Kconfig variable so that
architectures can enable DAX vmemmap optimization without enabling hugetlb
vmemmap optimization. This should allow architectures like arm64 to enable DAX
vmemmap optimization even though they can't enable hugetlb vmemmap optimization.
More details are in the patch "mm/vmemmap optimization: Split hugetlb and
devdax vmemmap optimization".

Changes from v5:
* rebase to mm-unstable branch

Changes from v4:
* Address review feedback
* Add the Reviewed-by: tags

Changes from v3:
* Rebase to latest linus tree
* Build fix with SPARSEMEM_VMEMMAP disabled
* Add hash_pud_same outside THP Kconfig

Changes from v2:
* Rebase to latest linus tree
* Address review feedback

Changes from v1:
* Fix make htmldocs warning
* Fix vmemmap allocation bugs with different alignment values.
* Correctly check for section validity before we free the vmemmap area



Aneesh Kumar K.V (13):
  mm/hugepage pud: Allow arch-specific helper function to check huge
    page pud support
  mm: Change pudp_huge_get_and_clear_full take vm_area_struct as arg
  mm/vmemmap: Improve vmemmap_can_optimize and allow architectures to
    override
  mm/vmemmap: Allow architectures to override how vmemmap optimization
    works
  mm: Add pud_same similar to __HAVE_ARCH_P4D_SAME
  mm/huge pud: Use transparent huge pud helpers only with
    CONFIG_TRANSPARENT_HUGEPAGE
  mm/vmemmap optimization: Split hugetlb and devdax vmemmap optimization
  powerpc/mm/trace: Convert trace event to trace event class
  powerpc/book3s64/mm: Enable transparent pud hugepage
  powerpc/book3s64/vmemmap: Switch radix to use a different vmemmap
    handling function
  powerpc/book3s64/radix: Add support for vmemmap optimization for radix
  powerpc/book3s64/radix: Remove mmu_vmemmap_psize
  powerpc/book3s64/radix: Add debug message to give more details of
    vmemmap allocation

 Documentation/mm/vmemmap_dedup.rst            |   1 +
 Documentation/powerpc/index.rst               |   1 +
 Documentation/powerpc/vmemmap_dedup.rst       | 101 ++++
 arch/loongarch/Kconfig                        |   2 +-
 arch/powerpc/Kconfig                          |   1 +
 arch/powerpc/include/asm/book3s/64/hash.h     |   9 +
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 155 ++++-
 arch/powerpc/include/asm/book3s/64/radix.h    |  47 ++
 .../include/asm/book3s/64/tlbflush-radix.h    |   2 +
 arch/powerpc/include/asm/book3s/64/tlbflush.h |   8 +
 arch/powerpc/include/asm/pgtable.h            |   6 +
 arch/powerpc/mm/book3s64/hash_pgtable.c       |   2 +-
 arch/powerpc/mm/book3s64/pgtable.c            |  78 +++
 arch/powerpc/mm/book3s64/radix_pgtable.c      | 572 ++++++++++++++++--
 arch/powerpc/mm/book3s64/radix_tlb.c          |   7 +
 arch/powerpc/mm/init_64.c                     |  37 +-
 arch/powerpc/platforms/Kconfig.cputype        |   1 +
 arch/riscv/Kconfig                            |   2 +-
 arch/s390/Kconfig                             |   2 +-
 arch/x86/Kconfig                              |   3 +-
 drivers/nvdimm/pfn_devs.c                     |   2 +-
 fs/Kconfig                                    |   2 +-
 include/linux/mm.h                            |  29 +-
 include/linux/pgtable.h                       |  12 +-
 include/trace/events/thp.h                    |  33 +-
 mm/Kconfig                                    |   5 +-
 mm/debug_vm_pgtable.c                         |   2 +-
 mm/huge_memory.c                              |   2 +-
 mm/mm_init.c                                  |   2 +-
 mm/mremap.c                                   |   2 +-
 mm/sparse-vmemmap.c                           |   3 +
 31 files changed, 1049 insertions(+), 82 deletions(-)
 create mode 100644 Documentation/powerpc/vmemmap_dedup.rst

Comments

Andrew Morton July 25, 2023, 7:29 p.m. UTC | #1
On Tue, 25 Jul 2023 00:37:46 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> This patch series implements changes required to support DAX vmemmap
> optimization for ppc64.

Do we have any measurements to help us understand the magnitude
of this optimization?

And any documentation which helps users understand whether and
why they should enable this feature?

Aneesh Kumar K.V July 26, 2023, 5:29 a.m. UTC | #2
On 7/26/23 12:59 AM, Andrew Morton wrote:
> On Tue, 25 Jul 2023 00:37:46 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
> 
>> This patch series implements changes required to support DAX vmemmap
>> optimization for ppc64.
> 
> Do we have any measurements to help us understand the magnitude
> of this optimization?
> 
> And any documentation which helps users understand whether and
> why they should enable this feature?

This is a memory space optimization that results from the kernel reusing the struct pages of
tail pages. The details of the optimization are documented in patch 11, where we also document
the impact with both 4K and 64K page sizes.

-aneesh

Andrew Morton July 26, 2023, 6:52 p.m. UTC | #3
On Wed, 26 Jul 2023 10:59:32 +0530 Aneesh Kumar K V <aneesh.kumar@linux.ibm.com> wrote:

> On 7/26/23 12:59 AM, Andrew Morton wrote:
> > On Tue, 25 Jul 2023 00:37:46 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
> > 
> >> This patch series implements changes required to support DAX vmemmap
> >> optimization for ppc64.
> > 
> > Do we have any measurements to help us understand the magnitude
> > of this optimization?
> > 
> > And any documentation which helps users understand whether and
> > why they should enable this feature?
> 
> This is a memory space optimization that results from the kernel reusing the struct pages of
> tail pages. The details of the optimization are documented in patch 11, where we also document
> the impact with both 4K and 64K page sizes.

I suppose that with sufficient arithmetic one could use
Documentation/powerpc/vmemmap_dedup.rst to figure out the bottom-line
savings.

I was more expecting a straightforward statement in the [0/N] overview
to help people understand why they're reading this patchset at all.
Like "saves 5% of total memory on my XXX machine".

Aneesh Kumar K.V July 27, 2023, 5:37 a.m. UTC | #4
Andrew Morton <akpm@linux-foundation.org> writes:

> On Wed, 26 Jul 2023 10:59:32 +0530 Aneesh Kumar K V <aneesh.kumar@linux.ibm.com> wrote:
>
>> On 7/26/23 12:59 AM, Andrew Morton wrote:
>> > On Tue, 25 Jul 2023 00:37:46 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
>> > 
>> >> This patch series implements changes required to support DAX vmemmap
>> >> optimization for ppc64.
>> > 
>> > Do we have any measurements to help us understand the magnitude
>> > of this optimization?
>> > 
>> > And any documentation which helps users understand whether and
>> > why they should enable this feature?
>> 
>> This is a memory space optimization that results from the kernel reusing the struct pages of
>> tail pages. The details of the optimization are documented in patch 11, where we also document
>> the impact with both 4K and 64K page sizes.
>
> I suppose that with sufficient arithmetic one could use
> Documentation/powerpc/vmemmap_dedup.rst to figure out the bottom-line
> savings.
>
> I was more expecting a straightforward statement in the [0/N] overview
> to help people understand why they're reading this patchset at all.
> Like "saves 5% of total memory on my XXX machine".

This is specific to devdax usage and also depends on devdax alignment.
The actual saving details are also documented in mm/vmemmap_dedup.rst.
The savings depend on the devdax device memory size and alignment.

With 64K page size, for 16384 pages added (1G), we save 14 pages.
With 4K page size, for 262144 pages added (1G), we save 4094 pages.
With 4K page size, for 512 pages added (2M), we save 6 pages.

-aneesh
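
For readers who want to redo the arithmetic behind the three figures in the
last reply, here is a minimal sketch. It is not taken from the patches. It
assumes a 64-byte struct page and that two vmemmap pages are retained per
aligned mapping; both values are inferred from the quoted numbers rather than
stated in the thread, and the names STRUCT_PAGE_SIZE, RETAINED_VMEMMAP_PAGES
and vmemmap_savings() are hypothetical.

```python
# Sketch only: STRUCT_PAGE_SIZE and RETAINED_VMEMMAP_PAGES are assumptions
# inferred from the figures quoted in the thread, not values from the patches.
STRUCT_PAGE_SIZE = 64        # bytes per struct page (assumed)
RETAINED_VMEMMAP_PAGES = 2   # vmemmap pages kept per aligned mapping (assumed)

KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def vmemmap_savings(page_size, mapping_size):
    """Return (pages_added, vmemmap_pages_saved) for one aligned mapping."""
    pages_added = mapping_size // page_size
    vmemmap_bytes = pages_added * STRUCT_PAGE_SIZE
    # Round up to whole vmemmap pages of the base page size.
    vmemmap_pages = (vmemmap_bytes + page_size - 1) // page_size
    saved = max(vmemmap_pages - RETAINED_VMEMMAP_PAGES, 0)
    return pages_added, saved


if __name__ == "__main__":
    cases = ((64 * KB, GB), (4 * KB, GB), (4 * KB, 2 * MB))
    for page_size, mapping_size in cases:
        added, saved = vmemmap_savings(page_size, mapping_size)
        print(f"page size {page_size // KB}K, {added} pages added: "
              f"save {saved} pages ({saved / added:.2%} of the mapping)")
```

Under these assumptions the sketch reproduces the 14, 4094 and 6 page figures.
Expressed against the size of the mapping itself, that is a small fraction in
each case (roughly 0.09%, 1.56% and 1.17%), and it applies only to memory
added through devdax with the given alignment.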