Message ID | 20190403141627.11664-1-steven.price@arm.com (mailing list archive) |
---|---|
Series | Convert x86 & arm64 to use generic page walk |
Hi all,

Gentle ping: who can take this? Is there anything blocking this series?

Thanks,

Steve

On 03/04/2019 15:16, Steven Price wrote:
> Most architectures currently have a debugfs file for dumping the kernel
> page tables. Currently each architecture has to implement custom
> functions for walking the page tables because the generic
> walk_page_range() function is unable to walk the page tables used by the
> kernel.
>
> This series extends the capabilities of walk_page_range() so that it can
> deal with the page tables of the kernel (which have no VMAs and can
> contain larger huge pages than exist for user space). x86 and arm64 are
> then converted to make use of walk_page_range() removing the custom page
> table walkers.
>
> To enable a generic page table walker to walk the unusual mappings of
> the kernel we need to implement a set of functions which let us know
> when the walker has reached the leaf entry. Since arm, powerpc, s390,
> sparc and x86 all have p?d_large macros, let's standardise on that and
> implement those that are missing.
>
> Potentially future changes could unify the implementations of the
> debugfs walkers further, moving the common functionality into common
> code. This would require a common way of handling the effective
> permissions (currently implemented only for x86) along with a per-arch
> way of formatting the page table information for debugfs. One
> immediate benefit would be getting the KASAN speed up optimisation in
> arm64 (and other arches) which is currently only implemented for x86.
>
> Also available as a git tree:
> git://linux-arm.org/linux-sp.git walk_page_range/v8
>
> Changes since v7:
> https://lore.kernel.org/lkml/20190328152104.23106-1-steven.price@arm.com/T/
>  * Updated commit message in patch 2 to clarify that we rely on the page
>    tables being walked to be the same page size/depth as the kernel's
>    (since this confused me earlier today).
>
> Changes since v6:
> https://lore.kernel.org/lkml/20190326162624.20736-1-steven.price@arm.com/T/
>  * Split the changes for powerpc. pmd_large() is now added in patch 4,
>    and pmd_is_leaf() removed in patch 5.
>
> Changes since v5:
> https://lore.kernel.org/lkml/20190321141953.31960-1-steven.price@arm.com/T/
>  * Updated comment for struct mm_walk based on Mike Rapoport's
>    suggestion
>
> Changes since v4:
> https://lore.kernel.org/lkml/20190306155031.4291-1-steven.price@arm.com/T/
>  * Correctly force result to a boolean in p?d_large for powerpc.
>  * Added Acked-bys
>  * Rebased onto v5.1-rc1
>
> Changes since v3:
> https://lore.kernel.org/lkml/20190227170608.27963-1-steven.price@arm.com/T/
>  * Restored the generic macros, only implement p?d_large() for
>    architectures that have support for large pages. This also means
>    adding dummy #defines for architectures that define p?d_large as
>    static inline to avoid picking up the generic macro.
>  * Drop the 'depth' argument from pte_hole
>  * Because we no longer have the depth for holes, we also drop support
>    in x86 for showing missing pages in debugfs. See discussion below:
>    https://lore.kernel.org/lkml/26df02dd-c54e-ea91-bdd1-0a4aad3a30ac@arm.com/
>  * mips: only define p?d_large when _PAGE_HUGE is defined.
>
> Changes since v2:
> https://lore.kernel.org/lkml/20190221113502.54153-1-steven.price@arm.com/T/
>  * Rather than attempting to provide generic macros, actually implement
>    p?d_large() for each architecture.
>
> Changes since v1:
> https://lore.kernel.org/lkml/20190215170235.23360-1-steven.price@arm.com/T/
>  * Added p4d_large() macro
>  * Comments to explain p?d_large() macro semantics
>  * Expanded comment for pte_hole() callback to explain mapping between
>    depth and P?D
>  * Handle folded page tables at all levels, so depth from pte_hole()
>    ignores folding at any level (see real_depth() function in
>    mm/pagewalk.c)
>
> Steven Price (20):
>   arc: mm: Add p?d_large() definitions
>   arm64: mm: Add p?d_large() definitions
>   mips: mm: Add p?d_large() definitions
>   powerpc: mm: Add p?d_large() definitions
>   KVM: PPC: Book3S HV: Remove pmd_is_leaf()
>   riscv: mm: Add p?d_large() definitions
>   s390: mm: Add p?d_large() definitions
>   sparc: mm: Add p?d_large() definitions
>   x86: mm: Add p?d_large() definitions
>   mm: Add generic p?d_large() macros
>   mm: pagewalk: Add p4d_entry() and pgd_entry()
>   mm: pagewalk: Allow walking without vma
>   mm: pagewalk: Add test_p?d callbacks
>   arm64: mm: Convert mm/dump.c to use walk_page_range()
>   x86: mm: Don't display pages which aren't present in debugfs
>   x86: mm: Point to struct seq_file from struct pg_state
>   x86: mm+efi: Convert ptdump_walk_pgd_level() to take a mm_struct
>   x86: mm: Convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
>   x86: mm: Convert ptdump_walk_pgd_level_core() to take an mm_struct
>   x86: mm: Convert dump_pagetables to use walk_page_range
>
>  arch/arc/include/asm/pgtable.h               |   1 +
>  arch/arm64/include/asm/pgtable.h             |   2 +
>  arch/arm64/mm/dump.c                         | 117 +++----
>  arch/mips/include/asm/pgtable-64.h           |   8 +
>  arch/powerpc/include/asm/book3s/64/pgtable.h |  30 +-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c       |  12 +-
>  arch/riscv/include/asm/pgtable-64.h          |   7 +
>  arch/riscv/include/asm/pgtable.h             |   7 +
>  arch/s390/include/asm/pgtable.h              |   2 +
>  arch/sparc/include/asm/pgtable_64.h          |   2 +
>  arch/x86/include/asm/pgtable.h               |  10 +-
>  arch/x86/mm/debug_pagetables.c               |   8 +-
>  arch/x86/mm/dump_pagetables.c                | 347 ++++++++++---------
>  arch/x86/platform/efi/efi_32.c               |   2 +-
>  arch/x86/platform/efi/efi_64.c               |   4 +-
>  include/asm-generic/pgtable.h                |  19 +
>  include/linux/mm.h                           |  26 +-
>  mm/pagewalk.c                                |  76 +++-
>  18 files changed, 407 insertions(+), 273 deletions(-)
>
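
The leaf-detection idea from the cover letter (a walker with no VMAs that must be told when an upper-level entry is itself the mapping) can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: it is not code from the series and does not use the kernel's types or the real walk_page_range() interface. Every `toy_*` / `TOY_*` name, the flag values, and the two-level layout are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_PRESENT 0x1u  /* entry is valid */
#define TOY_LARGE   0x2u  /* upper-level entry maps memory directly */
#define TOY_PTRS    4     /* entries per lower-level table */

typedef struct { uint32_t flags; } toy_pte_t;
typedef struct { uint32_t flags; toy_pte_t *table; } toy_pmd_t;

struct toy_stats { unsigned large_leaves, small_leaves, holes; };

static bool toy_pmd_present(toy_pmd_t pmd)
{
    return (pmd.flags & TOY_PRESENT) != 0;
}

/* Analogue of the p?d_large() idea: is this upper-level entry a leaf? */
static bool toy_pmd_large(toy_pmd_t pmd)
{
    return toy_pmd_present(pmd) && (pmd.flags & TOY_LARGE) != 0;
}

/* Walk n upper-level entries; no VMA-like structure is consulted. */
static void toy_walk(const toy_pmd_t *pmd, size_t n, struct toy_stats *st)
{
    assert(pmd != NULL && st != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!toy_pmd_present(pmd[i])) {
            st->holes++;            /* nothing mapped here */
            continue;
        }
        if (toy_pmd_large(pmd[i])) {
            st->large_leaves++;     /* leaf reached: do not descend */
            continue;
        }
        for (size_t j = 0; j < TOY_PTRS; j++) {
            if (pmd[i].table[j].flags & TOY_PRESENT)
                st->small_leaves++;
            else
                st->holes++;
        }
    }
}

/* Build a small table (one large leaf, one hole, one lower-level table). */
static struct toy_stats toy_demo(void)
{
    static toy_pte_t ptes[TOY_PTRS] = {
        { TOY_PRESENT }, { 0 }, { TOY_PRESENT }, { 0 },
    };
    const toy_pmd_t pmds[3] = {
        { TOY_PRESENT | TOY_LARGE, NULL },
        { 0, NULL },
        { TOY_PRESENT, ptes },
    };
    struct toy_stats st = { 0, 0, 0 };

    toy_walk(pmds, 3, &st);
    return st;
}
```

Without the `toy_pmd_large()` check the walker would try to follow the `NULL` table pointer of the large entry, which is the kind of pitfall a shared leaf predicate is meant to rule out.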
On 4/10/19 7:56 AM, Steven Price wrote:
> Gentle ping: who can take this? Is there anything blocking this series?
First of all, I really appreciate that you tried this. Every open-coded
page walk has a set of common pitfalls, but is pretty unbounded in what
kinds of bugs it can contain. I think this at least gets us to the
point where some of those pitfalls won't happen. That's cool, but I'm
worried that it hasn't gotten easier in the end.
Linus also had some strong opinions in the past on how page walks should
be written. He needs to have a look before we go much further.
On 12/04/2019 15:44, Dave Hansen wrote:
> On 4/10/19 7:56 AM, Steven Price wrote:
>> Gentle ping: who can take this? Is there anything blocking this series?
>
> First of all, I really appreciate that you tried this. Every open-coded
> page walk has a set of common pitfalls, but is pretty unbounded in what
> kinds of bugs it can contain. I think this at least gets us to the
> point where some of those pitfalls won't happen. That's cool, but I'm
> worried that it hasn't gotten easier in the end.

My plan was to implement the generic infrastructure and then work to
remove the per-arch code for ptdump debugfs where possible. This patch
series doesn't actually get that far because I wanted to get some
confidence that the general approach would be accepted.

> Linus also had some strong opinions in the past on how page walks should
> be written. He needs to have a look before we go much further.

Fair enough. I'll post the initial work I've done on unifying the
x86/arm64 ptdump code - the diffstat is a bit nicer on that - but
there's still work to be done so I'm posting just as an RFC.

Thanks,

Steve