[v4,0/5] ptdump: add intermediate directory support

Message ID aw675dhrbplkitj3szjut2vyidsxokogkjj3vi76wl2x4wybtg@5rhk5ca5zpmv (mailing list archive)

Message

Maxwell Bland June 18, 2024, 2:37 p.m. UTC
Makes several improvements to (arm64) ptdump debugging, including:

- support note_page on intermediate table entries
- (arm64) print intermediate entries and add an array for their specific
  attributes
- (arm64) adjust the entry ranges to remove the implicit exclusive upper
  bound
- (arm64) indent page table by level while maintaining attribute
  alignment
- (arm64) improve documentation clarity, detail, and precision

Thank you again to the maintainers for their review of this patch.

A comparison of the differences in output is provided here:
github.com/maxwell-bland/linux-patch-data/tree/main/ptdump-non-leaf

New in v4:
- Inclusive upper bounds on range specifications
- Splits commit into multiple smaller commits and separates cosmetic,
  documentation, and logic changes
- Updates documentation more sensibly
- Fixes bug in size computation and handles ULONG_MAX bound overflow
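
As a rough illustration of the last item (a sketch, not the code from the
series): when a fixed level size is added to a start address near the top of
the address space, the sum can wrap past ULONG_MAX. The helper names
range_last() and range_size() are hypothetical, and __builtin_add_overflow is
the GCC/Clang builtin that the kernel's check_add_overflow() wraps.

```c
#include <limits.h>

/*
 * Hypothetical helper: last address (inclusive) of the range that starts
 * at addr and spans size bytes. size must be nonzero.
 */
static unsigned long range_last(unsigned long addr, unsigned long size)
{
	unsigned long end;

	/* If addr + size does not fit, the range runs to the top of memory. */
	if (__builtin_add_overflow(addr, size, &end))
		return ULONG_MAX;

	/* Inclusive bound: one below the exclusive end. */
	return end - 1;
}

/*
 * Hypothetical helper: number of bytes actually covered, clamped so the
 * range never extends past ULONG_MAX.
 */
static unsigned long range_size(unsigned long addr, unsigned long size)
{
	unsigned long end;

	if (__builtin_add_overflow(addr, size, &end))
		return ULONG_MAX - addr + 1;

	return size;
}
```

For example, range_last(0x1000, 0x1000) gives 0x1fff, while a 2M entry that
starts 4K below the top of the address space is reported as ending at
ULONG_MAX with a size of 4K.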

v3:
https://lore.kernel.org/all/fik5ys53dbkpkl22o4s7sw7cxi6dqjcpm2f3kno5tyms73jm5y@buo4jsktsnrt/
- Added tabulation to delineate entries
- Fixed formatting issues with mailer and rebased to mm/linus

v2:
https://lore.kernel.org/r/20240423142307.495726312-1-mbland@motorola.com
- Rebased onto linux-next/akpm (the incorrect branch)

v1:
https://lore.kernel.org/all/20240423121820.874441838-1-mbland@motorola.com/


Maxwell Bland (5):
  mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP
  arm64: non leaf ptdump support
  arm64: indent ptdump by level, aligning attributes
  arm64: exclusive upper bound for ptdump entries
  arm64: add attrs and format to ptdump document

 Documentation/arch/arm64/ptdump.rst | 126 ++++++++++++-----------
 arch/arm64/Kconfig                  |   1 +
 arch/arm64/mm/ptdump.c              | 149 +++++++++++++++++++++++++---
 mm/Kconfig.debug                    |   9 ++
 mm/ptdump.c                         |  21 ++--
 5 files changed, 217 insertions(+), 89 deletions(-)

Comments

Ard Biesheuvel June 18, 2024, 2:59 p.m. UTC | #1
On Tue, 18 Jun 2024 at 16:40, Maxwell Bland <mbland@motorola.com> wrote:
>
> Separate the pte_bits used in ptdump from pxd_bits used by pmd, p4d,
> pud, and pgd descriptors, thereby adding support for printing key
> intermediate directory protection bits, such as PXNTable, and enable the
> associated support Kconfig option.
>
> Signed-off-by: Maxwell Bland <mbland@motorola.com>
> ---
>  arch/arm64/Kconfig     |   1 +
>  arch/arm64/mm/ptdump.c | 140 ++++++++++++++++++++++++++++++++++++-----
>  2 files changed, 125 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5d91259ee7b5..f4c3290160db 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -98,6 +98,7 @@ config ARM64
>         select ARCH_SUPPORTS_NUMA_BALANCING
>         select ARCH_SUPPORTS_PAGE_TABLE_CHECK
>         select ARCH_SUPPORTS_PER_VMA_LOCK
> +       select ARCH_SUPPORTS_NON_LEAF_PTDUMP
>         select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
>         select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
>         select ARCH_WANT_DEFAULT_BPF_JIT
> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index 6986827e0d64..8f0b459c13ed 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -24,6 +24,7 @@
>  #include <asm/memory.h>
>  #include <asm/pgtable-hwdef.h>
>  #include <asm/ptdump.h>
> +#include <asm/pgalloc.h>
>
>
>  #define pt_dump_seq_printf(m, fmt, args...)    \
> @@ -105,11 +106,6 @@ static const struct prot_bits pte_bits[] = {
>                 .val    = PTE_CONT,
>                 .set    = "CON",
>                 .clear  = "   ",
> -       }, {
> -               .mask   = PTE_TABLE_BIT,
> -               .val    = PTE_TABLE_BIT,
> -               .set    = "   ",
> -               .clear  = "BLK",
>         }, {
>                 .mask   = PTE_UXN,
>                 .val    = PTE_UXN,
> @@ -143,34 +139,129 @@ static const struct prot_bits pte_bits[] = {
>         }
>  };
>
> +static const struct prot_bits pxd_bits[] = {

This table will need to distinguish between table and block entries.
In your sample output, I see

2M PMD   TBL     RW               x            UXNTbl    MEM/NORMAL

for a table entry, which includes a memory type and access permissions
based on descriptor fields that are not used for table descriptors.

Some other attributes listed below are equally inapplicable to table
entries. They happen to be 0x0, so they don't appear in the output, but
they would if the IGNORED bit in the descriptor happened to be set.

So I suspect that the distinction pte_bits <-> pxd_bits is not so
useful here. It would be better to have tbl_bits[], with pointers to
it in the pg_level array, where the PTE level one is set to NULL.
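
A minimal userspace sketch of that layout, assuming a second pointer in
struct pg_level (the tbl_bits and tbl_num fields, the DESC_* and TABLE_*
macros, and format_attrs() are illustrative names, not the kernel's):
table descriptors are decoded with tbl_bits[], everything else with the
existing per-level array, and the PTE level leaves tbl_bits as NULL
because bit 1 there marks a page, not a table.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stand-ins for the arm64 descriptor bits. */
#define DESC_VALID	(UINT64_C(1) << 0)
#define DESC_TABLE_BIT	(UINT64_C(1) << 1)
#define DESC_RDONLY	(UINT64_C(1) << 7)
#define TABLE_PXN	(UINT64_C(1) << 59)
#define TABLE_UXN	(UINT64_C(1) << 60)

struct prot_bits {
	uint64_t mask;
	uint64_t val;
	const char *set;
	const char *clear;
};

/* Attributes that only make sense for block and page descriptors. */
static const struct prot_bits blk_bits[] = {
	{ DESC_RDONLY, DESC_RDONLY, "ro", "RW" },
};

/* Attributes that only make sense for table descriptors. */
static const struct prot_bits tbl_bits[] = {
	{ TABLE_PXN, TABLE_PXN, "NXTbl", "     " },
	{ TABLE_UXN, TABLE_UXN, "UXNTbl", "      " },
};

struct pg_level {
	const struct prot_bits *bits;		/* block/page attributes */
	size_t num;
	const struct prot_bits *tbl_bits;	/* NULL at the PTE level */
	size_t tbl_num;
	char name[4];
};

static const struct pg_level levels[] = {
	{ blk_bits, 1, tbl_bits, 2, "PMD" },
	{ blk_bits, 1, NULL, 0, "PTE" },
};

/* Decode desc into buf using the array that matches the descriptor type. */
static void format_attrs(const struct pg_level *lvl, uint64_t desc,
			 char *buf, size_t len)
{
	const struct prot_bits *bits = lvl->bits;
	size_t num = lvl->num;
	size_t used = 0;

	if (lvl->tbl_bits && (desc & DESC_TABLE_BIT)) {
		bits = lvl->tbl_bits;
		num = lvl->tbl_num;
	}

	if (len == 0)
		return;
	buf[0] = '\0';

	for (size_t i = 0; i < num; i++) {
		const char *s = (desc & bits[i].mask) == bits[i].val ?
				bits[i].set : bits[i].clear;
		int n;

		if (!s)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", s);
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
	}
}
```

With this split, a PMD table descriptor prints only the table attributes
(for example UXNTbl) and no access permissions or memory type, while a
PMD block or a PTE still prints RW/ro from the block/page array.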


> +       {
> +               .mask   = PMD_SECT_VALID,
> +               .val    = PMD_SECT_VALID,
> +               .set    = " ",
> +               .clear  = "F",
> +       }, {
> +               .mask   = PMD_TABLE_BIT,
> +               .val    = PMD_TABLE_BIT,
> +               .set    = "TBL",
> +               .clear  = "BLK",
> +       }, {
> +               .mask   = PMD_SECT_USER,
> +               .val    = PMD_SECT_USER,
> +               .set    = "USR",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_SECT_RDONLY,
> +               .val    = PMD_SECT_RDONLY,
> +               .set    = "ro",
> +               .clear  = "RW",
> +       }, {
> +               .mask   = PMD_SECT_S,
> +               .val    = PMD_SECT_S,
> +               .set    = "SHD",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_SECT_AF,
> +               .val    = PMD_SECT_AF,
> +               .set    = "AF",
> +               .clear  = "  ",
> +       }, {
> +               .mask   = PMD_SECT_NG,
> +               .val    = PMD_SECT_NG,
> +               .set    = "NG",
> +               .clear  = "  ",
> +       }, {
> +               .mask   = PMD_SECT_CONT,
> +               .val    = PMD_SECT_CONT,
> +               .set    = "CON",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_SECT_PXN,
> +               .val    = PMD_SECT_PXN,
> +               .set    = "NX",
> +               .clear  = "x ",
> +       }, {
> +               .mask   = PMD_SECT_UXN,
> +               .val    = PMD_SECT_UXN,
> +               .set    = "UXN",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_TABLE_PXN,
> +               .val    = PMD_TABLE_PXN,
> +               .set    = "NXTbl",
> +               .clear  = "     ",
> +       }, {
> +               .mask   = PMD_TABLE_UXN,
> +               .val    = PMD_TABLE_UXN,
> +               .set    = "UXNTbl",
> +               .clear  = "      ",
> +       }, {
> +               .mask   = PTE_GP,
> +               .val    = PTE_GP,
> +               .set    = "GP",
> +               .clear  = "  ",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_DEVICE_nGnRnE),
> +               .set    = "DEVICE/nGnRnE",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_DEVICE_nGnRE),
> +               .set    = "DEVICE/nGnRE",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_NORMAL_NC),
> +               .set    = "MEM/NORMAL-NC",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_NORMAL),
> +               .set    = "MEM/NORMAL",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_NORMAL_TAGGED),
> +               .set    = "MEM/NORMAL-TAGGED",
> +       }
> +};
> +
>  struct pg_level {
>         const struct prot_bits *bits;
>         char name[4];
>         int num;
>         u64 mask;
> +       unsigned long size;
>  };
>
>  static struct pg_level pg_level[] __ro_after_init = {
>         { /* pgd */
>                 .name   = "PGD",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = PGDIR_SIZE,
>         }, { /* p4d */
>                 .name   = "P4D",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = P4D_SIZE,
>         }, { /* pud */
>                 .name   = "PUD",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = PUD_SIZE,
>         }, { /* pmd */
>                 .name   = "PMD",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = PMD_SIZE,
>         }, { /* pte */
>                 .name   = "PTE",
>                 .bits   = pte_bits,
>                 .num    = ARRAY_SIZE(pte_bits),
> +               .size   = PAGE_SIZE
>         },
>  };
>
> @@ -251,10 +342,27 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
>                         note_prot_wx(st, addr);
>                 }
>
> -               pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> -                                  st->start_address, addr);
> +               /*
> +                * Non-leaf entries use a fixed size for their range
> +                * specification, whereas leaf entries are grouped by
> +                * attributes and may not have a range larger than the type
> +                * specifier.
> +                */
> +               if (st->start_address == addr) {
> +                       if (check_add_overflow(addr, pg_level[st->level].size,
> +                                              &delta))
> +                               delta = ULONG_MAX - addr + 1;
> +                       else
> +                               delta = pg_level[st->level].size;
> +                       pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> +                                          addr, addr + delta);
> +               } else {
> +                       delta = (addr - st->start_address);
> +                       pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> +                                          st->start_address, addr);
> +               }
>
> -               delta = (addr - st->start_address) >> 10;
> +               delta >>= 10;
>                 while (!(delta & 1023) && unit[1]) {
>                         delta >>= 10;
>                         unit++;
> --
> 2.39.2
>
>
Maxwell Bland June 18, 2024, 4:55 p.m. UTC | #2
On Tue, Jun 18, 2024 at 04:59:22PM GMT, Ard Biesheuvel wrote:
> On Tue, 18 Jun 2024 at 16:40, Maxwell Bland <mbland@motorola.com> wrote:
> > @@ -105,11 +106,6 @@ static const struct prot_bits pte_bits[] = {
> >                 .val    = PTE_CONT,
> >                 .set    = "CON",
> >                 .clear  = "   ",
> > -       }, {
> > -               .mask   = PTE_TABLE_BIT,
> > -               .val    = PTE_TABLE_BIT,
> > -               .set    = "   ",
> > -               .clear  = "BLK",
> >         }, {
> >                 .mask   = PTE_UXN,
> >                 .val    = PTE_UXN,
> This table will need to distinguish between table and block entries.
> 
> I suspect that the distinction pte_bits <-> pxd_bits is not so useful
> here. It would be better to have tbl_bits[], with pointers to it in
> the pg_level array, where the PTE level one is set to NULL.

Nice, thanks! Adding now. I'll slate a v5 release for next Monday.

Maxwell