[v3] ptdump: add intermediate directory support

Message ID fik5ys53dbkpkl22o4s7sw7cxi6dqjcpm2f3kno5tyms73jm5y@buo4jsktsnrt (mailing list archive)
State New
Series [v3] ptdump: add intermediate directory support

Commit Message

Maxwell Bland April 30, 2024, 4:05 p.m. UTC
Add an optional note_non_leaf parameter to ptdump, causing note_page to
be called on non-leaf descriptors. Implement this functionality on arm64
by printing table descriptors along with table-specific permission sets.

For arm64, break (1) the uniform number of columns for each descriptor,
and (2) the coalescing of large PTE regions, which are now split up by
PMD. This is a "good" thing since it makes the behavior and protection
bits set on page tables, such as PXNTable, more explicit.

Before:
0xffff008440210000-0xffff008440400000 1984K PTE ro NX SHD AF NG UXN M...
0xffff008440400000-0xffff008441c00000 24M PMD ro NX SHD AF NG BLK UXN...
0xffff008441c00000-0xffff008441dc0000 1792K PTE ro NX SHD AF NG UXN M...
0xffff008441dc0000-0xffff00844317b000 20204K PTE RW NX SHD AF NG UXN ...

After (tabulation omitted and spaces condensed):
0xffff0fb640200000-0xffff0fb640400000 2M PMD TBL RW x NXTbl UXNTbl ME...
0xffff0fb640200000-0xffff0fb640210000 64K PTE RW NX SHD AF NG UXN MEM...
0xffff0fb640210000-0xffff0fb640400000 1984K PTE ro NX SHD AF NG UXN M...
0xffff0fb640400000-0xffff0fb641c00000 24M PMD BLK ro SHD AF NG NX UXN...
0xffff0fb641c00000-0xffff0fb641e00000 2M PMD TBL RW x NXTbl UXNTbl ME...
0xffff0fb641c00000-0xffff0fb641dc0000 1792K PTE ro NX SHD AF NG UXN M...
0xffff0fb641dc0000-0xffff0fb641e00000 256K PTE RW NX SHD AF NG UXN ME...
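
To illustrate why the "After" listing has more lines than the "Before"
one, here is a minimal user-space sketch of the coalescing rule. It is a
toy model, not the kernel code: the toy_* names, the two-level enum and
the opaque prot values are assumptions made purely for illustration. A
run of entries is folded into one output line until the level or the
attributes change, so noting each PMD table descriptor interrupts what
used to be one long PTE run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; these are not the kernel's structures. */
enum toy_level { TOY_PMD, TOY_PTE };

struct toy_entry {
	uint64_t addr;          /* start of the range the entry covers */
	enum toy_level level;
	uint64_t prot;          /* opaque attribute bits */
	int is_table;           /* non-zero for a table (non-leaf) descriptor */
};

struct toy_state {
	int started;
	enum toy_level level;
	uint64_t prot;
	size_t lines;           /* output lines closed so far */
};

/*
 * A new output line starts whenever the level or the attributes
 * differ from the run currently being accumulated.
 */
static void toy_note(struct toy_state *st, const struct toy_entry *e)
{
	if (st->started && e->level == st->level && e->prot == st->prot)
		return;             /* same run: coalesce */
	if (st->started)
		st->lines++;        /* close the previous run */
	st->started = 1;
	st->level = e->level;
	st->prot = e->prot;
}

/* Count the lines a walk over e[0..n) would print. */
size_t toy_count_lines(const struct toy_entry *e, size_t n, int note_non_leaf)
{
	struct toy_state st = { 0 };
	size_t i;

	assert(e != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (e[i].is_table && !note_non_leaf)
			continue;       /* table descriptors are not reported */
		toy_note(&st, &e[i]);
	}
	return st.lines + (st.started ? 1 : 0); /* flush the last run */
}
```

With table descriptors skipped, PTE entries with identical attributes
under two adjacent PMDs collapse into a single line; with them noted,
the same walk yields one line per table plus one per PTE run beneath it.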

v3:
  - Added tabulation to delineate entries
  - Fixed formatting issues with mailer and rebased to mm/linus

v2:
  - Rebased onto linux-next/akpm (the incorrect branch)

Signed-off-by: Maxwell Bland <mbland@motorola.com>
---
Thank you again to the maintainers for your review of this patch.

To Andrew Morton, I apologize for the malformatted patches last week. It
will hopefully never happen again. I have tested mailing this patch to
myself and have confirmed it cleanly merges to mm/linus.

 Documentation/arch/arm64/ptdump.rst | 184 +++++++++++++---------
 arch/arm64/mm/ptdump.c              | 230 +++++++++++++++++++++++++---
 include/linux/ptdump.h              |   1 +
 mm/ptdump.c                         |  13 ++
 4 files changed, 332 insertions(+), 96 deletions(-)


base-commit: a93289b830ce783955b22fbe5d1274a464c05acf

Comments

Maxwell Bland April 30, 2024, 4:13 p.m. UTC | #1
On Tue, Apr 30, 2024 at 11:05:01AM GMT, Maxwell Bland wrote:
> v3:
>   - Added tabulation to delineate entries
>   - Fixed formatting issues with mailer and rebased to mm/linus
> 
> v2:
>   - Rebased onto linux-next/akpm (the incorrect branch)

Note that I am referring to
20240423121820.874441838-1-mbland@motorola.com
as the v1/v2 here.

The v1/v2 mailer malformatting will hopefully never happen again. I have
tested mailing this patch to myself and have confirmed it cleanly merges
to mm/linus. I ended up needing to compile mutt from scratch. ):

Also, I made these changes in order to complement testing
20240220203256.31153-1-mbland@motorola.com
and
20240423095843.446565600-1-mbland@motorola.com
and figured they might be useful to include/merge before attempting to
merge the above more impactful changes.

Catalin Marinas May 1, 2024, 12:07 p.m. UTC | #2
On Tue, Apr 30, 2024 at 11:05:01AM -0500, Maxwell Bland wrote:
> Add an optional note_non_leaf parameter to ptdump, causing note_page to
> be called on non-leaf descriptors. Implement this functionality on arm64
> by printing table descriptors along with table-specific permission sets.
> 
> For arm64, break (1) the uniform number of columns for each descriptor,
> and (2) the coalescing of large PTE regions, which are now split up by
> PMD. This is a "good" thing since it makes the behavior and protection
> bits set on page tables, such as PXNTable, more explicit.
> 
> Before:
> 0xffff008440210000-0xffff008440400000 1984K PTE ro NX SHD AF NG UXN M...
> 0xffff008440400000-0xffff008441c00000 24M PMD ro NX SHD AF NG BLK UXN...
> 0xffff008441c00000-0xffff008441dc0000 1792K PTE ro NX SHD AF NG UXN M...
> 0xffff008441dc0000-0xffff00844317b000 20204K PTE RW NX SHD AF NG UXN ...
> 
> After (tabulation omitted and spaces condensed):
> 0xffff0fb640200000-0xffff0fb640400000 2M PMD TBL RW x NXTbl UXNTbl ME...
> 0xffff0fb640200000-0xffff0fb640210000 64K PTE RW NX SHD AF NG UXN MEM...
> 0xffff0fb640210000-0xffff0fb640400000 1984K PTE ro NX SHD AF NG UXN M...
> 0xffff0fb640400000-0xffff0fb641c00000 24M PMD BLK ro SHD AF NG NX UXN...
> 0xffff0fb641c00000-0xffff0fb641e00000 2M PMD TBL RW x NXTbl UXNTbl ME...
> 0xffff0fb641c00000-0xffff0fb641dc0000 1792K PTE ro NX SHD AF NG UXN M...
> 0xffff0fb641dc0000-0xffff0fb641e00000 256K PTE RW NX SHD AF NG UXN ME...
> 
> v3:
>   - Added tabulation to delineate entries
>   - Fixed formatting issues with mailer and rebased to mm/linus
> 
> v2:
>   - Rebased onto linux-next/akpm (the incorrect branch)
> 
> Signed-off-by: Maxwell Bland <mbland@motorola.com>
> ---
> Thank you again to the maintainers for your review of this patch.
> 
> To Andrew Morton, I apologize for the malformatted patches last week.It
> will hopefully never happen again. I have tested mailing this patch to
> myself and have confirmed it cleanly merges to mm/linus.
> 
>  Documentation/arch/arm64/ptdump.rst | 184 +++++++++++++---------
>  arch/arm64/mm/ptdump.c              | 230 +++++++++++++++++++++++++---
>  include/linux/ptdump.h              |   1 +
>  mm/ptdump.c                         |  13 ++
>  4 files changed, 332 insertions(+), 96 deletions(-)

Is this v3 replacing v2 here:

https://lore.kernel.org/r/20240423142307.495726312-1-mbland@motorola.com

or does it go on top? The patch versioning and subject change confuse me.

Maxwell Bland May 1, 2024, 2:32 p.m. UTC | #3
On Wed, May 01, 2024 at 01:07:36PM GMT, Catalin Marinas wrote:
> Is this v3 replacing v2 here:
> https://lore.kernel.org/r/20240423142307.495726312-1-mbland@motorola.com
> or it goes on top?

Replacing. Sorry for the confusion---my mailer broke the previous
versions' formatting.

I am new to linux kernel commits and our SMTP/IT was not set up for
patch submission, but this new workflow looks OK. (-:

Maxwell

Catalin Marinas May 8, 2024, 11:20 a.m. UTC | #4
On Tue, Apr 30, 2024 at 11:05:01AM -0500, Maxwell Bland wrote:
> diff --git a/Documentation/arch/arm64/ptdump.rst b/Documentation/arch/arm64/ptdump.rst
> index 5dcfc5d7cddf..350eea06300e 100644
> --- a/Documentation/arch/arm64/ptdump.rst
> +++ b/Documentation/arch/arm64/ptdump.rst
> @@ -2,25 +2,24 @@
>  Kernel page table dump
>  ======================
>  
> -ptdump is a debugfs interface that provides a detailed dump of the
> -kernel page tables. It offers a comprehensive overview of the kernel
> -virtual memory layout as well as the attributes associated with the
> -various regions in a human-readable format. It is useful to dump the
> -kernel page tables to verify permissions and memory types. Examining the
> -page table entries and permissions helps identify potential security
> -vulnerabilities such as mappings with overly permissive access rights or
> -improper memory protections.
> +ptdump is a debugfs interface that provides a detailed dump of the kernel page
> +tables. It offers a comprehensive overview of the kernel virtual memory layout
> +as well as the attributes associated with the various regions in a
> +human-readable format. It is useful to dump the kernel page tables to verify
> +permissions and memory types. Examining the page table entries and permissions
> +helps identify potential security vulnerabilities such as mappings with overly
> +permissive access rights or improper memory protections.

Please don't re-wrap existing text unless it's a separate patch with no
content change (and with a good justification). It is hard to see
whether anything has changed in this paragraph (or the next ones).

Also, many (most?) text files are wrapped around 72 characters, no need
to re-wrap them at 80, just keep the formatting in that file when adding
new text.

> @@ -29,68 +28,101 @@ configurations and mount debugfs::
>   mount -t debugfs nodev /sys/kernel/debug
>   cat /sys/kernel/debug/kernel_page_tables
>  
> -On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables``
> -one can derive information about the virtual address range of the entry,
> -followed by size of the memory region covered by this entry, the
> -hierarchical structure of the page tables and finally the attributes
> -associated with each page. The page attributes provide information about
> -access permissions, execution capability, type of mapping such as leaf
> -level PTE or block level PGD, PMD and PUD, and access status of a page
> -within the kernel memory. Assessing these attributes can assist in
> -understanding the memory layout, access patterns and security
> -characteristics of the kernel pages.
> +On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables`` one can
> +derive information about the virtual address range of a contiguous group of
> +page table entries, followed by size of the memory region covered by this
> +group, the hierarchical structure of the page tables and finally the attributes
> +associated with each page in the group. Groups are broken up either according
> +to a change in attributes or by parent descriptor, such as a PMD. Note that the
> +set of attributes, and therefore formatting, is not equivalent between entry
> +types. For example, PMD entries have a separate set of attributes from leaf
> +level PTE entries, because they support both the UXNTable and PXNTable
> +permission bits.
> +
> +The page attributes provide information about access permissions, execution
> +capability, type of mapping such as leaf level PTE or block level PGD, PMD and
> +PUD, and access status of a page within the kernel memory. Non-PTE block or
> +page level entries are denoted with either "BLK" or "TBL", respectively.
> +Assessing these attributes can assist in understanding the memory layout,
> +access patterns and security characteristics of the kernel pages.

I presume there's some new text here.

> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index 6986827e0d64..bd4f1df0c444 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -24,6 +24,7 @@
>  #include <asm/memory.h>
>  #include <asm/pgtable-hwdef.h>
>  #include <asm/ptdump.h>
> +#include <asm/pgalloc.h>
>  
>  
>  #define pt_dump_seq_printf(m, fmt, args...)	\
> @@ -70,6 +71,11 @@ static const struct prot_bits pte_bits[] = {
>  		.val	= PTE_VALID,
>  		.set	= " ",
>  		.clear	= "F",
> +	}, {
> +		.mask	= PTE_TABLE_BIT,
> +		.val	= PTE_TABLE_BIT,
> +		.set	= "   ",
> +		.clear	= "BLK",
>  	}, {
>  		.mask	= PTE_USER,
>  		.val	= PTE_USER,
> @@ -105,11 +111,6 @@ static const struct prot_bits pte_bits[] = {
>  		.val	= PTE_CONT,
>  		.set	= "CON",
>  		.clear	= "   ",
> -	}, {
> -		.mask	= PTE_TABLE_BIT,
> -		.val	= PTE_TABLE_BIT,
> -		.set	= "   ",
> -		.clear	= "BLK",
>  	}, {
>  		.mask	= PTE_UXN,
>  		.val	= PTE_UXN,

Since you are adding a separate pmd_bits[] array, I think we could get
rid of the PTE_TABLE_BIT entry. It doesn't make sense for ptes anyway.
Why it works currently is that for ptes it won't show anything since we
have the bit set while for p*d entries it should have shown TBL when set
but it's not called on non-leaf entries.
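
The behaviour described above follows from how each prot_bits entry is
decoded. The sketch below is a user-space approximation only:
prot_bit_str(), TOY_TABLE_BIT and toy_table_entry are hypothetical
names, and placing the table bit at bit 1 of the descriptor is an
assumption about the arm64 descriptor format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the prot_bits entries quoted in the diff. */
struct prot_bits {
	uint64_t mask;
	uint64_t val;
	const char *set;
	const char *clear;
};

/* Assumption: the table bit is bit 1 of the descriptor. */
#define TOY_TABLE_BIT (UINT64_C(1) << 1)

/* The entry under discussion, copied from the quoted pte_bits[]. */
const struct prot_bits toy_table_entry = {
	.mask  = TOY_TABLE_BIT,
	.val   = TOY_TABLE_BIT,
	.set   = "   ",
	.clear = "BLK",
};

/*
 * Pick the string for one attribute: "set" when the masked
 * descriptor equals val, "clear" otherwise.
 */
const char *prot_bit_str(const struct prot_bits *b, uint64_t desc)
{
	assert(b != NULL);
	return (desc & b->mask) == b->val ? b->set : b->clear;
}
```

A page descriptor has the bit set and prints three spaces, while a block
descriptor has it clear and prints "BLK". A table descriptor also has
the bit set, so it would print nothing with this entry, which is why a
separate array is needed to show "TBL".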

> @@ -143,34 +144,208 @@ static const struct prot_bits pte_bits[] = {
>  	}
>  };
>  
> +static const struct prot_bits pmd_bits[] = {
[...]
> +};
> +
> +static const struct prot_bits pud_bits[] = {
[...]
> +};

Do we need pud_bits[] as well? Can we not just use pmd_bits[]? Call it
pxd_bits if you want, the format is the same for all p*d entries.

> +
>  struct pg_level {
>  	const struct prot_bits *bits;
>  	char name[4];
>  	int num;
>  	u64 mask;
> +	unsigned long size;
>  };
>  
>  static struct pg_level pg_level[] __ro_after_init = {
>  	{ /* pgd */
>  		.name	= "PGD",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pud_bits,
> +		.num	= ARRAY_SIZE(pud_bits),
> +		.size	= PGD_SIZE
>  	}, { /* p4d */
>  		.name	= "P4D",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pud_bits,
> +		.num	= ARRAY_SIZE(pud_bits),
> +		.size	= P4D_SIZE
>  	}, { /* pud */
>  		.name	= "PUD",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pud_bits,
> +		.num	= ARRAY_SIZE(pud_bits),
> +		.size	= PUD_SIZE
>  	}, { /* pmd */
>  		.name	= "PMD",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pmd_bits,
> +		.num	= ARRAY_SIZE(pmd_bits),
> +		.size	= PMD_SIZE
>  	}, { /* pte */
>  		.name	= "PTE",
>  		.bits	= pte_bits,
>  		.num	= ARRAY_SIZE(pte_bits),
> +		.size	= PAGE_SIZE
>  	},
>  };
>  
> @@ -225,8 +400,9 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
>  		      u64 val)
>  {
>  	struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
> -	static const char units[] = "KMGTPE";
> +	static const char units[] = "BKMGTPE";
>  	u64 prot = 0;
> +	int i = 0;
>  
>  	/* check if the current level has been folded dynamically */
>  	if ((level == 1 && mm_p4d_folded(st->mm)) ||
> @@ -241,20 +417,33 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
>  		st->current_prot = prot;
>  		st->start_address = addr;
>  		pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
> -	} else if (prot != st->current_prot || level != st->level ||
> -		   addr >= st->marker[1].start_address) {
> +	} else if ((prot != st->current_prot || level != st->level ||
> +		   addr >= st->marker[1].start_address)) {
>  		const char *unit = units;
>  		unsigned long delta;
>  
> +		for (i = 0; i < st->level; i++)
> +			pt_dump_seq_printf(st->seq, "  ");

Please separate the alignment changes into a different patch, it makes
it easier to review what's new functionality, what's cosmetic. I'm also
not particularly keen on the new alignment. It's fine to have the
sub-ranges indented but I'd keep the bits/permissions/size/etc. aligned.

> +
>  		if (st->current_prot) {
>  			note_prot_uxn(st, addr);
>  			note_prot_wx(st, addr);
>  		}
>  
> -		pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> -				   st->start_address, addr);
> +		/*
> +		 * Entries are coalesced into a single line, so non-leaf
> +		 * entries have no size relative to start_address
> +		 */
> +		if (st->start_address != addr) {
> +			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> +					   st->start_address, addr);
> +			delta = (addr - st->start_address);

What's this supposed to show? In your example, it's strange that the PGD
is shown as 128 bytes:

+ 0xffff020000000000-0xffff020000000080         128B PGD   TBL     RW               NXTbl UXNTbl    MEM/NORMAL
+     0xffff020000000000-0xffff023080000000         194G PUD
+     0xffff023080000000-0xffff0230c0000000           1G PUD   TBL     RW               NXTbl UXNTbl    MEM/NORMAL

The table pgd entries should cover full pud ranges below. I don't know
how it ended up with 0x...80 as the end of the range for the pgd. Should
it show a PGD_SIZE * number_of_entries instead?
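
For reference, the "128B" comes straight from the unit scaling once the
units string starts at "B" and delta is no longer pre-shifted by 10. A
minimal user-space sketch of that loop, where toy_scale_size() is a
hypothetical helper and not a function in the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Divide by 1024 while the value is a multiple of 1024 and a larger
 * unit exists. Returns the unit character and stores the scaled value.
 */
char toy_scale_size(uint64_t bytes, uint64_t *scaled)
{
	static const char units[] = "BKMGTPE";
	const char *unit = units;

	assert(scaled != NULL);
	while (!(bytes & 1023) && unit[1]) {
		bytes >>= 10;
		unit++;
	}
	*scaled = bytes;
	return *unit;
}
```

A delta of 0x80 is not a multiple of 1024, so it stays at 128 with unit
'B', whereas the PUD range of 0x3080000000 bytes in the same example
scales three times and comes out as 194 with unit 'G'.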

> +		} else {
> +			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ", addr,
> +					   addr + pg_level[st->level].size);
> +			delta = (pg_level[st->level].size);
> +		}
>  
> -		delta = (addr - st->start_address) >> 10;
>  		while (!(delta & 1023) && unit[1]) {
>  			delta >>= 10;
>  			unit++;
> @@ -301,7 +490,8 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
>  			.range = (struct ptdump_range[]){
>  				{info->base_addr, end},
>  				{0, 0}
> -			}
> +			},
> +			.note_non_leaf = true
>  		}
>  	};
>  
> diff --git a/include/linux/ptdump.h b/include/linux/ptdump.h
> index 8dbd51ea8626..b3e793a5c77f 100644
> --- a/include/linux/ptdump.h
> +++ b/include/linux/ptdump.h
> @@ -16,6 +16,7 @@ struct ptdump_state {
>  			  int level, u64 val);
>  	void (*effective_prot)(struct ptdump_state *st, int level, u64 val);
>  	const struct ptdump_range *range;
> +	bool note_non_leaf;
>  };
>  
>  bool ptdump_walk_pgd_level_core(struct seq_file *m,
> diff --git a/mm/ptdump.c b/mm/ptdump.c
> index 106e1d66e9f9..97da7a765b22 100644
> --- a/mm/ptdump.c
> +++ b/mm/ptdump.c
> @@ -41,6 +41,9 @@ static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,
>  	if (st->effective_prot)
>  		st->effective_prot(st, 0, pgd_val(val));
>  
> +	if (st->note_non_leaf && !pgd_leaf(val))
> +		st->note_page(st, addr, 0, pgd_val(val));
> +
>  	if (pgd_leaf(val)) {
>  		st->note_page(st, addr, 0, pgd_val(val));
>  		walk->action = ACTION_CONTINUE;

Is the difference between leaf and non-leaf calls only the walk->action?
We could have a single call to st->note_page() and keep the walk->action
setting separately. Do we also need to set ACTION_SUBTREE in case the
entry is a table entry? Or is it done in the caller somewhere? I could
not figure out.

An alternative would be to have an ARCH_WANT_NON_LEAF_PTDUMP Kconfig
option instead of a bool note_non_leaf in struct ptdump_state. This
option seems to be entirely static, not sure it's worth a struct member
for it. You'd use IS_ENABLED() above instead of st->note_non_leaf.

Maxwell Bland May 8, 2024, 3:29 p.m. UTC | #5
On Wed, May 08, 2024 at 12:20:41PM GMT, Catalin Marinas wrote:
> On Tue, Apr 30, 2024 at 11:05:01AM -0500, Maxwell Bland wrote:

> > +		if (st->start_address != addr) {
> > +			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> > +					   st->start_address, addr);
> > +			delta = (addr - st->start_address);

> Should it show a PGD_SIZE * number_of_entries instead?

It should show the full range of memory covered by the PGD's table.
Will fix, thanks!

> > +	if (st->note_non_leaf && !pgd_leaf(val))
> > +		st->note_page(st, addr, 0, pgd_val(val));
> > +
> >  	if (pgd_leaf(val)) {
> >  		st->note_page(st, addr, 0, pgd_val(val));
> >  		walk->action = ACTION_CONTINUE;

> Is the difference between leaf and non-leaf calls only the walk->action?
> We could have a single call to st->note_page() and keep the walk->action
> setting separately. Do we also need to set ACTION_SUBTREE in case the
> entry is a table entry? Or is it done in the caller somewhere? I could
> not figure out.
>
> An alternative would be to have an ARCH_WANT_NON_LEAF_PTDUMP Kconfig
> option instead of a bool note_non_leaf in struct ptdump_state. This
> option seems to be entirely static, not sure it's worth a struct member
> for it. You'd use IS_ENABLED() above instead of st->note_non_leaf.


ACTION_SUBTREE seems right, I will look into it. Something like (though
I'll check to see if it is correct and polish):

  walk_action = (!pgd_leaf) ? ACTION_SUBTREE : ACTION_CONTINUE;
  if ((IS_ENABLED(...) && !pgd_leaf()) || pgd_leaf())
  	st->note_page ...
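
Spelled out as a self-contained user-space sketch, the decision above
looks roughly like this. The toy_* names are hypothetical, the enum
values merely stand in for the walker's ACTION_SUBTREE and
ACTION_CONTINUE, and want_non_leaf stands in for whichever switch
(Kconfig option or struct member) ends up being used:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the page walker's action values. */
enum toy_action { TOY_ACTION_SUBTREE, TOY_ACTION_CONTINUE };

struct toy_decision {
	bool note;              /* call note_page() for this entry? */
	enum toy_action action; /* what the walker should do next */
};

struct toy_decision toy_decide(bool leaf, bool want_non_leaf)
{
	struct toy_decision d;

	/* (want_non_leaf && !leaf) || leaf reduces to this. */
	d.note = leaf || want_non_leaf;
	assert(d.note == ((want_non_leaf && !leaf) || leaf));

	/* Leaves have nothing below them; tables are descended into. */
	d.action = leaf ? TOY_ACTION_CONTINUE : TOY_ACTION_SUBTREE;
	return d;
}
```

This keeps a single note_page() call site and leaves the walk action
dependent only on whether the entry is a leaf.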

> -- 
> Catalin

Nice! Thank you for your feedback. I will iterate and also fix up the
other minor things, e.g. 72 character wrap in doc files.

Jesse Taube May 21, 2024, 4:54 p.m. UTC | #6
On 4/30/24 12:05, Maxwell Bland wrote:
> Add an optional note_non_leaf parameter to ptdump, causing note_page to
> be called on non-leaf descriptors. Implement this functionality on arm64
> by printing table descriptors along with table-specific permission sets.
> 
> For arm64, break (1) the uniform number of columns for each descriptor,
> and (2) the coalescing of large PTE regions, which are now split up by
> PMD. This is a "good" thing since it makes the behavior and protection
> bits set on page tables, such as PXNTable, more explicit.
> 
> Before:
> 0xffff008440210000-0xffff008440400000 1984K PTE ro NX SHD AF NG UXN M...
> 0xffff008440400000-0xffff008441c00000 24M PMD ro NX SHD AF NG BLK UXN...
> 0xffff008441c00000-0xffff008441dc0000 1792K PTE ro NX SHD AF NG UXN M...
> 0xffff008441dc0000-0xffff00844317b000 20204K PTE RW NX SHD AF NG UXN ...
> 
> After (tabulation omitted and spaces condensed):
> 0xffff0fb640200000-0xffff0fb640400000 2M PMD TBL RW x NXTbl UXNTbl ME...
> 0xffff0fb640200000-0xffff0fb640210000 64K PTE RW NX SHD AF NG UXN MEM...
> 0xffff0fb640210000-0xffff0fb640400000 1984K PTE ro NX SHD AF NG UXN M...
> 0xffff0fb640400000-0xffff0fb641c00000 24M PMD BLK ro SHD AF NG NX UXN...
> 0xffff0fb641c00000-0xffff0fb641e00000 2M PMD TBL RW x NXTbl UXNTbl ME...
> 0xffff0fb641c00000-0xffff0fb641dc0000 1792K PTE ro NX SHD AF NG UXN M...
> 0xffff0fb641dc0000-0xffff0fb641e00000 256K PTE RW NX SHD AF NG UXN ME...
> 
> v3:
>    - Added tabulation to delineate entries
>    - Fixed formatting issues with mailer and rebased to mm/linus
> 
> v2:
>    - Rebased onto linux-next/akpm (the incorrect branch)

Typically the patch versions go in the additional comments section under 
the ---
https://www.kernel.org/doc/html/v4.13/process/submitting-patches.html#the-canonical-patch-format

> 
> Signed-off-by: Maxwell Bland <mbland@motorola.com>

When you reply, the email client (and git send-email???) seems to send it
twice?
Very odd, I'm surprised the server didn't filter the duplicate.
https://lore.kernel.org/all/ZjIwiFa3CMxxtAZ1@arm.com/

> ---
> Thank you again to the maintainers for your review of this patch.
> 
> To Andrew Morton, I apologize for the malformatted patches last week.It
> will hopefully never happen again. I have tested mailing this patch to
> myself and have confirmed it cleanly merges to mm/linus.
> 
>   Documentation/arch/arm64/ptdump.rst | 184 +++++++++++++---------

Typically docs are separated into a separate commit and sent as a set.

>   arch/arm64/mm/ptdump.c              | 230 +++++++++++++++++++++++++---

As said by Catalin, anything that can be separated into smaller patches
should be.

>   include/linux/ptdump.h              |   1 +
>   mm/ptdump.c                         |  13 ++
>   4 files changed, 332 insertions(+), 96 deletions(-)
> 
> diff --git a/Documentation/arch/arm64/ptdump.rst b/Documentation/arch/arm64/ptdump.rst
> index 5dcfc5d7cddf..350eea06300e 100644
> --- a/Documentation/arch/arm64/ptdump.rst
> +++ b/Documentation/arch/arm64/ptdump.rst
> @@ -2,25 +2,24 @@
>   Kernel page table dump
>   ======================
>   
> -ptdump is a debugfs interface that provides a detailed dump of the
> -kernel page tables. It offers a comprehensive overview of the kernel
> -virtual memory layout as well as the attributes associated with the
> -various regions in a human-readable format. It is useful to dump the
> -kernel page tables to verify permissions and memory types. Examining the
> -page table entries and permissions helps identify potential security
> -vulnerabilities such as mappings with overly permissive access rights or
> -improper memory protections.
> +ptdump is a debugfs interface that provides a detailed dump of the kernel page
> +tables. It offers a comprehensive overview of the kernel virtual memory layout
> +as well as the attributes associated with the various regions in a
> +human-readable format. It is useful to dump the kernel page tables to verify
> +permissions and memory types. Examining the page table entries and permissions
> +helps identify potential security vulnerabilities such as mappings with overly
> +permissive access rights or improper memory protections.
>   
> -Memory hotplug allows dynamic expansion or contraction of available
> -memory without requiring a system reboot. To maintain the consistency
> -and integrity of the memory management data structures, arm64 makes use
> -of the ``mem_hotplug_lock`` semaphore in write mode. Additionally, in
> -read mode, ``mem_hotplug_lock`` supports an efficient implementation of
> -``get_online_mems()`` and ``put_online_mems()``. These protect the
> -offlining of memory being accessed by the ptdump code.
> +Memory hotplug allows dynamic expansion or contraction of available memory
> +without requiring a system reboot. To maintain the consistency and integrity of
> +the memory management data structures, arm64 makes use of the
> +``mem_hotplug_lock`` semaphore in write mode. Additionally, in read mode,
> +``mem_hotplug_lock`` supports an efficient implementation of
> +``get_online_mems()`` and ``put_online_mems()``. These protect the offlining of
> +memory being accessed by the ptdump code.
>   
> -In order to dump the kernel page tables, enable the following
> -configurations and mount debugfs::
> +In order to dump the kernel page tables, enable the following configurations
> +and mount debugfs::
>   
>    CONFIG_GENERIC_PTDUMP=y
>    CONFIG_PTDUMP_CORE=y
> @@ -29,68 +28,101 @@ configurations and mount debugfs::
>    mount -t debugfs nodev /sys/kernel/debug
>    cat /sys/kernel/debug/kernel_page_tables
>   
> -On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables``
> -one can derive information about the virtual address range of the entry,
> -followed by size of the memory region covered by this entry, the
> -hierarchical structure of the page tables and finally the attributes
> -associated with each page. The page attributes provide information about
> -access permissions, execution capability, type of mapping such as leaf
> -level PTE or block level PGD, PMD and PUD, and access status of a page
> -within the kernel memory. Assessing these attributes can assist in
> -understanding the memory layout, access patterns and security
> -characteristics of the kernel pages.
> +On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables`` one can
> +derive information about the virtual address range of a contiguous group of
> +page table entries, followed by size of the memory region covered by this
> +group, the hierarchical structure of the page tables and finally the attributes
> +associated with each page in the group. Groups are broken up either according
> +to a change in attributes or by parent descriptor, such as a PMD. Note that the
> +set of attributes, and therefore formatting, is not equivalent between entry
> +types. For example, PMD entries have a separate set of attributes from leaf
> +level PTE entries, because they support both the UXNTable and PXNTable
> +permission bits.
> +
> +The page attributes provide information about access permissions, execution
> +capability, type of mapping such as leaf level PTE or block level PGD, PMD and
> +PUD, and access status of a page within the kernel memory. Non-PTE block or
> +page level entries are denoted with either "BLK" or "TBL", respectively.
> +Assessing these attributes can assist in understanding the memory layout,
> +access patterns and security characteristics of the kernel pages.
>   
>   Kernel virtual memory layout example::
>   
> - start address        end address         size             attributes
> - +---------------------------------------------------------------------------------------+
> - | ---[ Linear Mapping start ]---------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xfff0000000000000-0xfff0000000210000  2112K PTE RW NX SHD AF  UXN  MEM/NORMAL-TAGGED |
> - | 0xfff0000000210000-0xfff0000001c00000 26560K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
> - | ..................                                                                    |
> - | ---[ Linear Mapping end ]------------------------------------------------------------ |
> - +---------------------------------------------------------------------------------------+
> - | ---[ Modules start ]----------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xffff800000000000-0xffff800008000000   128M PTE                                      |
> - | ..................                                                                    |
> - | ---[ Modules end ]------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ vmalloc() area ]---------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xffff800008010000-0xffff800008200000  1984K PTE ro x  SHD AF       UXN  MEM/NORMAL   |
> - | 0xffff800008200000-0xffff800008e00000    12M PTE ro x  SHD AF  CON  UXN  MEM/NORMAL   |
> - | ..................                                                                    |
> - | ---[ vmalloc() end ]----------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ Fixmap start ]------------------------------------------------------------------ |
> - | ..................                                                                    |
> - | 0xfffffbfffdb80000-0xfffffbfffdb90000    64K PTE ro x  SHD AF  UXN  MEM/NORMAL        |
> - | 0xfffffbfffdb90000-0xfffffbfffdba0000    64K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
> - | ..................                                                                    |
> - | ---[ Fixmap end ]-------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ PCI I/O start ]----------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xfffffbfffe800000-0xfffffbffff800000    16M PTE                                      |
> - | ..................                                                                    |
> - | ---[ PCI I/O end ]------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ vmemmap start ]----------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xfffffc0002000000-0xfffffc0002200000     2M PTE RW NX SHD AF  UXN  MEM/NORMAL        |
> - | 0xfffffc0002200000-0xfffffc0020000000   478M PTE                                      |
> - | ..................                                                                    |
> - | ---[ vmemmap end ]------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> + start address        end address         size type  leaf    attributes
> + +-----------------------------------------------------------------------------------------------------------------+
> + | ---[ Linear Mapping start ]---                                                                                  |
> + | ...                                                                                                             |
> + | 0xffff0d02c3200000-0xffff0d02c3400000    2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL    |
> + | 0xffff0d02c3200000-0xffff0d02c3218000   96K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
> + | 0xffff0d02c3218000-0xffff0d02c3250000  224K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
> + | 0xffff0d02c3250000-0xffff0d02c33b3000 1420K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
> + | 0xffff0d02c33b3000-0xffff0d02c3400000  308K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
> + | 0xffff0d02c3400000-0xffff0d02c3600000    2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL    |
> + | 0xffff0d02c3400000-0xffff0d02c3600000    2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
> + | ...                                                                                                             |
> + | 0xffff0d02c3200000-0xffff0d02c3400000    2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL    |
> + | ...                                                                                                             |
> + | ---[ Linear Mapping end ]---                                                                                    |
> + +-----------------------------------------------------------------------------------------------------------------+
> + | ---[ Modules start ]---                                                                                         |
> + | ...                                                                                                             |
> + | 0xffff800000000000-0xffff800000000080 128B PGD   TBL     RW               x     UXNTbl    MEM/NORMAL            |
> + | 0xffff800000000000-0xffff800080000000   2G PUD F BLK     RW               x               MEM/NORMAL            |
> + | ...                                                                                                             |
> + | ---[ Modules end ]---                                                                                           |
> + +-----------------------------------------------------------------------------------------------------------------+
> + | ---[ vmalloc() area ]---                                                                                        |
> + | ...                                                                                                             |
> + | 0xffff800080000000-0xffff8000c0000000   1G PUD   TBL     RW               x     UXNTbl    MEM/NORMAL            |
> + | ...                                                                                                             |
> + | 0xffff800080200000-0xffff800080400000   2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL     |
> + | 0xffff800080200000-0xffff80008022f000 188K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL                  |

It would probably be good to add another space between F and BLK to show
that F is related to the type. Also, maybe add docs on what it means, though
I may just be missing something that is obvious to others.

> + | 0xffff80008022f000-0xffff800080230000   4K PTE F BLK     RW x                       MEM/NORMAL                  |
> + | 0xffff800080230000-0xffff800080233000  12K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL                  |
> + | 0xffff800080233000-0xffff800080234000   4K PTE F BLK     RW x                       MEM/NORMAL                  |
> + | 0xffff800080234000-0xffff800080237000  12K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL                  |
> + | ...                                                                                                             |
> + | 0xffff800080400000-0xffff800084000000  60M PMD F BLK     RW               x      x     x         MEM/NORMAL     |
> + | ...                                                                                                             |
> + | ---[ vmalloc() end ]---                                                                                         |
> + +-----------------------------------------------------------------------------------------------------------------+
> + | ---[ vmemmap start ]---                                                                                         |
> + | ...                                                                                                             |
> + | 0xfffffe33cb000000-0xfffffe33cc000000  16M PMD   BLK     RW SHD AF NG     NX UXN x     x         MEM/NORMAL     |
> + | 0xfffffe33cc000000-0xfffffe3400000000 832M PMD F BLK     RW               x      x     x         MEM/NORMAL     |
> + | ...                                                                                                             |
> + | ---[ vmemmap end ]---                                                                                           |
> + +-----------------------------------------------------------------------------------------------------------------+
> + | ---[ PCI I/O start ]---                                                                                         |
> + | ...                                                                                                             |
> + | 0xffffffffc0800000-0xffffffffc0810000 64K PTE           RW NX SHD AF NG     UXN    DEVICE/nGnRE                 |
> + | ...                                                                                                             |
> + | ---[ PCI I/O end ]---                                                                                           |
> + +-----------------------------------------------------------------------------------------------------------------+
> + | ---[ Fixmap start ]---                                                                                          |
> + | ...                                                                                                             |
> + | 0xffffffffff5f6000-0xffffffffff5f9000 12K PTE           ro x  SHD AF        UXN    MEM/NORMAL                   |
> + | 0xffffffffff5f9000-0xffffffffff5fa000  4K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL                   |
> + | ...                                                                                                             |
> + | ---[ Fixmap end ]---                                                                                            |
> + +-----------------------------------------------------------------------------------------------------------------+
>   
>   ``cat /sys/kernel/debug/kernel_page_tables`` output::
>   
> - 0xfff0000001c00000-0xfff0000080000000     2020M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000080000000-0xfff0000800000000       30G PMD
> - 0xfff0000800000000-0xfff0000800700000        7M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000800700000-0xfff0000800710000       64K PTE  ro NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000800710000-0xfff0000880000000  2089920K PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000880000000-0xfff0040000000000     4062G PMD
> - 0xfff0040000000000-0xffff800000000000     3964T PGD
> + 0xffff000000000000-0xffff020000000000           2T PGD
> + 0xffff020000000000-0xffff020000000080         128B PGD   TBL     RW               NXTbl UXNTbl    MEM/NORMAL
> +     0xffff020000000000-0xffff023080000000         194G PUD
> +     0xffff023080000000-0xffff0230c0000000           1G PUD   TBL     RW               NXTbl UXNTbl    MEM/NORMAL
> +       0xffff023080000000-0xffff023080200000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +         0xffff023080000000-0xffff023080200000           2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> +       0xffff023080200000-0xffff023080400000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +         0xffff023080200000-0xffff023080210000          64K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> +         0xffff023080210000-0xffff023080400000        1984K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL
> +       0xffff023080400000-0xffff023081c00000          24M PMD   BLK     ro SHD AF NG     NX UXN x     x         MEM/NORMAL
> +       0xffff023081c00000-0xffff023081e00000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +         0xffff023081c00000-0xffff023081dd0000        1856K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL
> +         0xffff023081dd0000-0xffff023081e00000         192K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> +       0xffff023081e00000-0xffff023082000000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +         0xffff023081e00000-0xffff023082000000           2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> +       0xffff023082000000-0xffff023082200000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +         0xffff023082000000-0xffff023082200000           2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index 6986827e0d64..bd4f1df0c444 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -24,6 +24,7 @@
>   #include <asm/memory.h>
>   #include <asm/pgtable-hwdef.h>
>   #include <asm/ptdump.h>
> +#include <asm/pgalloc.h>
>   
>   
>   #define pt_dump_seq_printf(m, fmt, args...)	\
> @@ -70,6 +71,11 @@ static const struct prot_bits pte_bits[] = {
>   		.val	= PTE_VALID,
>   		.set	= " ",
>   		.clear	= "F",
> +	}, {
> +		.mask	= PTE_TABLE_BIT,
> +		.val	= PTE_TABLE_BIT,
> +		.set	= "   ",
> +		.clear	= "BLK",
>   	}, {
>   		.mask	= PTE_USER,
>   		.val	= PTE_USER,
> @@ -105,11 +111,6 @@ static const struct prot_bits pte_bits[] = {
>   		.val	= PTE_CONT,
>   		.set	= "CON",
>   		.clear	= "   ",
> -	}, {
> -		.mask	= PTE_TABLE_BIT,
> -		.val	= PTE_TABLE_BIT,
> -		.set	= "   ",
> -		.clear	= "BLK",
>   	}, {
>   		.mask	= PTE_UXN,
>   		.val	= PTE_UXN,
> @@ -143,34 +144,208 @@ static const struct prot_bits pte_bits[] = {
>   	}
>   };
>   
> +static const struct prot_bits pmd_bits[] = {
> +	{
> +		.mask	= PMD_SECT_VALID,
> +		.val	= PMD_SECT_VALID,
> +		.set	= " ",
> +		.clear	= "F",
> +	}, {
> +		.mask	= PMD_TABLE_BIT,
> +		.val	= PMD_TABLE_BIT,
> +		.set	= "TBL",
> +		.clear	= "BLK",
> +	}, {
> +		.mask	= PMD_SECT_USER,
> +		.val	= PMD_SECT_USER,
> +		.set	= "USR",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PMD_SECT_RDONLY,
> +		.val	= PMD_SECT_RDONLY,
> +		.set	= "ro",
> +		.clear	= "RW",
> +	}, {
> +		.mask	= PMD_SECT_S,
> +		.val	= PMD_SECT_S,
> +		.set	= "SHD",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PMD_SECT_AF,
> +		.val	= PMD_SECT_AF,
> +		.set	= "AF",
> +		.clear	= "  ",
> +	}, {
> +		.mask	= PMD_SECT_NG,
> +		.val	= PMD_SECT_NG,
> +		.set	= "NG",
> +		.clear	= "  ",
> +	}, {
> +		.mask	= PMD_SECT_CONT,
> +		.val	= PMD_SECT_CONT,
> +		.set	= "CON",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PMD_SECT_PXN,
> +		.val	= PMD_SECT_PXN,
> +		.set	= "NX",
> +		.clear	= "x ",
> +	}, {
> +		.mask	= PMD_SECT_UXN,
> +		.val	= PMD_SECT_UXN,
> +		.set	= "UXN",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PMD_TABLE_PXN,
> +		.val	= PMD_TABLE_PXN,
> +		.set	= "NXTbl",
> +		.clear	= "x    ",
> +	}, {
> +		.mask	= PMD_TABLE_UXN,
> +		.val	= PMD_TABLE_UXN,
> +		.set	= "UXNTbl",
> +		.clear	= "x     ",
> +	}, {
> +		.mask	= PTE_GP,
> +		.val	= PTE_GP,
> +		.set	= "GP",
> +		.clear	= "  ",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRnE),
> +		.set	= "DEVICE/nGnRnE",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRE),
> +		.set	= "DEVICE/nGnRE",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_NORMAL_NC),
> +		.set	= "MEM/NORMAL-NC",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_NORMAL),
> +		.set	= "MEM/NORMAL",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_NORMAL_TAGGED),
> +		.set	= "MEM/NORMAL-TAGGED",
> +	}
> +};
> +
> +static const struct prot_bits pud_bits[] = {
> +	{
> +		.mask	= PUD_TYPE_SECT,
> +		.val	= PUD_TYPE_SECT,
> +		.set	= " ",
> +		.clear	= "F",
> +	}, {
> +		.mask	= PUD_TABLE_BIT,
> +		.val	= PUD_TABLE_BIT,
> +		.set	= "TBL",
> +		.clear	= "BLK",
> +	}, {
> +		.mask	= PTE_USER,
> +		.val	= PTE_USER,
> +		.set	= "USR",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PUD_SECT_RDONLY,
> +		.val	= PUD_SECT_RDONLY,
> +		.set	= "ro",
> +		.clear	= "RW",
> +	}, {
> +		.mask	= PTE_SHARED,
> +		.val	= PTE_SHARED,
> +		.set	= "SHD",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PTE_AF,
> +		.val	= PTE_AF,
> +		.set	= "AF",
> +		.clear	= "  ",
> +	}, {
> +		.mask	= PTE_NG,
> +		.val	= PTE_NG,
> +		.set	= "NG",
> +		.clear	= "  ",
> +	}, {
> +		.mask	= PTE_CONT,
> +		.val	= PTE_CONT,
> +		.set	= "CON",
> +		.clear	= "   ",
> +	}, {
> +		.mask	= PUD_TABLE_PXN,
> +		.val	= PUD_TABLE_PXN,
> +		.set	= "NXTbl",
> +		.clear	= "x    ",
> +	}, {
> +		.mask	= PUD_TABLE_UXN,
> +		.val	= PUD_TABLE_UXN,
> +		.set	= "UXNTbl",
> +		.clear	= "      ",
> +	}, {
> +		.mask	= PTE_GP,
> +		.val	= PTE_GP,
> +		.set	= "GP",
> +		.clear	= "  ",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRnE),
> +		.set	= "DEVICE/nGnRnE",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRE),
> +		.set	= "DEVICE/nGnRE",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_NORMAL_NC),
> +		.set	= "MEM/NORMAL-NC",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_NORMAL),
> +		.set	= "MEM/NORMAL",
> +	}, {
> +		.mask	= PMD_ATTRINDX_MASK,
> +		.val	= PMD_ATTRINDX(MT_NORMAL_TAGGED),
> +		.set	= "MEM/NORMAL-TAGGED",
> +	}
> +};
> +
>   struct pg_level {
>   	const struct prot_bits *bits;
>   	char name[4];
>   	int num;
>   	u64 mask;
> +	unsigned long size;
>   };
>   
>   static struct pg_level pg_level[] __ro_after_init = {
>   	{ /* pgd */
>   		.name	= "PGD",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pud_bits,
> +		.num	= ARRAY_SIZE(pud_bits),
> +		.size	= PGD_SIZE
>   	}, { /* p4d */
>   		.name	= "P4D",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pud_bits,
> +		.num	= ARRAY_SIZE(pud_bits),
> +		.size	= P4D_SIZE
>   	}, { /* pud */
>   		.name	= "PUD",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pud_bits,
> +		.num	= ARRAY_SIZE(pud_bits),
> +		.size	= PUD_SIZE
>   	}, { /* pmd */
>   		.name	= "PMD",
> -		.bits	= pte_bits,
> -		.num	= ARRAY_SIZE(pte_bits),
> +		.bits	= pmd_bits,
> +		.num	= ARRAY_SIZE(pmd_bits),
> +		.size	= PMD_SIZE
>   	}, { /* pte */
>   		.name	= "PTE",
>   		.bits	= pte_bits,
>   		.num	= ARRAY_SIZE(pte_bits),
> +		.size	= PAGE_SIZE
>   	},
>   };
>   
> @@ -225,8 +400,9 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
>   		      u64 val)
>   {
>   	struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
> -	static const char units[] = "KMGTPE";
> +	static const char units[] = "BKMGTPE";

This doesn't seem to be related to your changes, is it?

>   	u64 prot = 0;
> +	int i = 0;
>   
>   	/* check if the current level has been folded dynamically */
>   	if ((level == 1 && mm_p4d_folded(st->mm)) ||
> @@ -241,20 +417,33 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
>   		st->current_prot = prot;
>   		st->start_address = addr;
>   		pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
> -	} else if (prot != st->current_prot || level != st->level ||
> -		   addr >= st->marker[1].start_address) {
> +	} else if ((prot != st->current_prot || level != st->level ||
> +		   addr >= st->marker[1].start_address)) {
>   		const char *unit = units;
>   		unsigned long delta;
>   
> +		for (i = 0; i < st->level; i++)
> +			pt_dump_seq_printf(st->seq, "  ");
> +
>   		if (st->current_prot) {
>   			note_prot_uxn(st, addr);
>   			note_prot_wx(st, addr);
>   		}
>   
> -		pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> -				   st->start_address, addr);
> +		/*
> +		 * Entries are coalesced into a single line, so non-leaf
> +		 * entries have no size relative to start_address
> +		 */
> +		if (st->start_address != addr) {
> +			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> +					   st->start_address, addr);
> +			delta = (addr - st->start_address);
> +		} else {
> +			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ", addr,
> +					   addr + pg_level[st->level].size);
> +			delta = (pg_level[st->level].size);
> +		}
>   
> -		delta = (addr - st->start_address) >> 10;
>   		while (!(delta & 1023) && unit[1]) {
>   			delta >>= 10;
>   			unit++;
> @@ -301,7 +490,8 @@ void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
>   			.range = (struct ptdump_range[]){
>   				{info->base_addr, end},
>   				{0, 0}
> -			}
> +			},
> +			.note_non_leaf = true
>   		}
>   	};
>   
> diff --git a/include/linux/ptdump.h b/include/linux/ptdump.h
> index 8dbd51ea8626..b3e793a5c77f 100644
> --- a/include/linux/ptdump.h
> +++ b/include/linux/ptdump.h
> @@ -16,6 +16,7 @@ struct ptdump_state {
>   			  int level, u64 val);
>   	void (*effective_prot)(struct ptdump_state *st, int level, u64 val);
>   	const struct ptdump_range *range;
> +	bool note_non_leaf;
>   };
>   
>   bool ptdump_walk_pgd_level_core(struct seq_file *m,
> diff --git a/mm/ptdump.c b/mm/ptdump.c
> index 106e1d66e9f9..97da7a765b22 100644
> --- a/mm/ptdump.c
> +++ b/mm/ptdump.c
> @@ -41,6 +41,9 @@ static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,
>   	if (st->effective_prot)
>   		st->effective_prot(st, 0, pgd_val(val));
>   
> +	if (st->note_non_leaf && !pgd_leaf(val))
> +		st->note_page(st, addr, 0, pgd_val(val));
> +
>   	if (pgd_leaf(val)) {
>   		st->note_page(st, addr, 0, pgd_val(val));
>   		walk->action = ACTION_CONTINUE;
> @@ -64,6 +67,9 @@ static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
>   	if (st->effective_prot)
>   		st->effective_prot(st, 1, p4d_val(val));
>   
> +	if (st->note_non_leaf && !p4d_leaf(val))
> +		st->note_page(st, addr, 1, p4d_val(val));
> +
>   	if (p4d_leaf(val)) {
>   		st->note_page(st, addr, 1, p4d_val(val));
>   		walk->action = ACTION_CONTINUE;
> @@ -87,6 +93,9 @@ static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
>   	if (st->effective_prot)
>   		st->effective_prot(st, 2, pud_val(val));
>   
> +	if (st->note_non_leaf && !pud_leaf(val))
> +		st->note_page(st, addr, 2, pud_val(val));
> +
>   	if (pud_leaf(val)) {
>   		st->note_page(st, addr, 2, pud_val(val));
>   		walk->action = ACTION_CONTINUE;
> @@ -108,6 +117,10 @@ static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr,
>   
>   	if (st->effective_prot)
>   		st->effective_prot(st, 3, pmd_val(val));
> +
> +	if (st->note_non_leaf && !pmd_leaf(val))
> +		st->note_page(st, addr, 3, pmd_val(val));
> +
>   	if (pmd_leaf(val)) {
>   		st->note_page(st, addr, 3, pmd_val(val));
>   		walk->action = ACTION_CONTINUE;
> 
> base-commit: a93289b830ce783955b22fbe5d1274a464c05acf
Maxwell Bland June 18, 2024, 3:03 p.m. UTC | #7
On Wed, May 08, 2024 at 12:20:41PM GMT, Catalin Marinas wrote:
> On Tue, Apr 30, 2024 at 11:05:01AM -0500, Maxwell Bland wrote:
> > -ptdump is a debugfs interface that provides a detailed dump of the

Hi Catalin! Apologies for the delayed response to this review, life got
in the way. A version 4 that addresses your comments is available here:

https://lore.kernel.org/all/aw675dhrbplkitj3szjut2vyidsxokogkjj3vi76wl2x4wybtg@5rhk5ca5zpmv/

> > +Assessing these attributes can assist in understanding the memory layout,
> > +access patterns and security characteristics of the kernel pages.
> 
> I presume there's some new text here.

Yes. Though after having a bit of time to think on it, I just reworked
the presentation altogether for version 4.

> 
> >  	}, {
> >  		.mask	= PTE_UXN,
> >  		.val	= PTE_UXN,
> 
> Since you are adding a separate pmd_bits[] array, I think we could get
> rid of the PTE_TABLE_BIT entry. It doesn't make sense for ptes anyway.

Done! Sweet.

> > +static const struct prot_bits pud_bits[] = {
> [...]
> > +};
> 
> Do we need pud_bits[] as well? Can we not just use pmd_bits[]? Call it
> pxd_bits if you want, the format is the same for all p*d entries.

Thanks, done!
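
For readers following along, here is a minimal userspace sketch of how one
shared table of mask/val/set/clear entries can decode any p*d descriptor. The
struct layout and the match rule mirror the quoted patch; the DEMO_* names,
demo_pxd_bits[] and demo_decode() are hypothetical, and the bit positions are
assumptions chosen for illustration, not taken from the arm64 headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the prot_bits entries in the quoted patch. */
struct prot_bits {
	uint64_t mask;
	uint64_t val;
	const char *set;
	const char *clear;
};

/* Illustrative bit positions only (assumption, not the kernel's macros). */
#define DEMO_VALID	(1ULL << 0)
#define DEMO_TABLE	(1ULL << 1)
#define DEMO_RDONLY	(1ULL << 7)

/* One table shared by all non-PTE levels, as suggested in the review. */
static const struct prot_bits demo_pxd_bits[] = {
	{ DEMO_VALID,  DEMO_VALID,  " ",   "F"   },
	{ DEMO_TABLE,  DEMO_TABLE,  "TBL", "BLK" },
	{ DEMO_RDONLY, DEMO_RDONLY, "ro",  "RW"  },
};

/* Append " <label>" for each entry: .set on a match, .clear otherwise. */
static void demo_decode(uint64_t desc, char *buf, size_t len)
{
	size_t n = sizeof(demo_pxd_bits) / sizeof(demo_pxd_bits[0]);
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		const struct prot_bits *b = &demo_pxd_bits[i];
		const char *s = ((desc & b->mask) == b->val) ? b->set : b->clear;

		if (s) {
			strncat(buf, " ", len - strlen(buf) - 1);
			strncat(buf, s, len - strlen(buf) - 1);
		}
	}
}
```

Because every level is decoded by the same loop, only the table passed in
changes, which is why a single pxd_bits[] array is enough for PGD, P4D, PUD
and PMD entries.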

> 
> Please separate the alignment changes into a different patch

Done! 

> > +			delta = (addr - st->start_address);
> 
> What's this supposed to show? In your example, it's strange that the PGD
> is shown as 128 bytes:

This was a bug due to my misunderstanding of what we were going for
here. Thank you for pointing it out, as it made it easy to notice and
patch.
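
As a reference for the size column, below is a small userspace sketch of the
unit-scaling loop from the quoted note_page() hunk, using the "BKMGTPE" unit
string and a delta that is no longer pre-shifted by 10. demo_format_size() is
a hypothetical helper name for illustration and is not part of the kernel.

```c
#include <stdio.h>

/*
 * Scale a byte count down by 1024 while it divides evenly, advancing
 * through the unit letters, then print the remaining value and unit.
 */
static void demo_format_size(unsigned long long delta, char *buf, size_t len)
{
	static const char units[] = "BKMGTPE";
	const char *unit = units;

	/* Stop at the first non-multiple of 1024 or at the last unit. */
	while (!(delta & 1023) && unit[1]) {
		delta >>= 10;
		unit++;
	}

	snprintf(buf, len, "%llu%c", delta, *unit);
}
```

With this loop a value of 128 is never scaled and is printed with the "B"
unit, which is how the table size of a PGD ended up in the size column of the
v3 example instead of the address range it covers.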

> >  	if (pgd_leaf(val)) {
> >  		st->note_page(st, addr, 0, pgd_val(val));
> >  		walk->action = ACTION_CONTINUE;
> 
> Is the difference between leaf and non-leaf calls only the walk->action?
> We could have a single call to st->note_page() and keep the walk->action
> setting separately. Do we also need to set ACTION_SUBTREE in case the
> entry is a table entry? Or is it done in the caller somewhere? I could
> not figure out.

ACTION_SUBTREE is the default walk action, so it is implicitly set for
table descriptors.

> 
> An alternative would be to have an ARCH_WANT_NON_LEAF_PTDUMP Kconfig
> option instead of a bool note_non_leaf in struct ptdump_state. This
> option seems to be entirely static, not sure it's worth a struct member
> for it. You'd use IS_ENABLED() above instead of st->note_non_leaf.

This was an excellent idea, thank you. Incorporated.
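
To illustrate the two points above (one note_page() call, and the walk action
defaulting to descending into the subtree), here is a rough userspace sketch.
It is not the v4 code: DEMO_WANT_NON_LEAF stands in for an
IS_ENABLED(CONFIG_ARCH_WANT_NON_LEAF_PTDUMP) check, and the demo_* types and
functions are hypothetical stand-ins for the kernel's ptdump_state and
page-walk action.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a compile-time Kconfig check (assumption). */
#ifndef DEMO_WANT_NON_LEAF
#define DEMO_WANT_NON_LEAF 1
#endif

/* Mirrors the idea of a default "descend" action and a "skip" action. */
enum demo_action { DEMO_ACTION_SUBTREE, DEMO_ACTION_CONTINUE };

struct demo_state {
	void (*note_page)(struct demo_state *st, unsigned long addr,
			  int level, uint64_t val);
	int notes;	/* number of entries reported so far */
};

/* Example callback: just count how many entries were noted. */
static void demo_note_page(struct demo_state *st, unsigned long addr,
			   int level, uint64_t val)
{
	(void)addr;
	(void)level;
	(void)val;
	st->notes++;
}

/* Single note_page() call site; the action is decided separately. */
static enum demo_action demo_visit_entry(struct demo_state *st,
					 unsigned long addr, int level,
					 uint64_t val, bool leaf)
{
	enum demo_action action = DEMO_ACTION_SUBTREE;	/* default: descend */

	if (leaf || DEMO_WANT_NON_LEAF)
		st->note_page(st, addr, level, val);

	if (leaf)
		action = DEMO_ACTION_CONTINUE;	/* nothing below a leaf */

	return action;
}
```

Building with -DDEMO_WANT_NON_LEAF=0 drops the reporting of table entries at
compile time, without carrying a bool in the state structure.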

BRs and thanks again for your help on this,
Maxwell Bland

Patch

diff --git a/Documentation/arch/arm64/ptdump.rst b/Documentation/arch/arm64/ptdump.rst
index 5dcfc5d7cddf..350eea06300e 100644
--- a/Documentation/arch/arm64/ptdump.rst
+++ b/Documentation/arch/arm64/ptdump.rst
@@ -2,25 +2,24 @@ 
 Kernel page table dump
 ======================
 
-ptdump is a debugfs interface that provides a detailed dump of the
-kernel page tables. It offers a comprehensive overview of the kernel
-virtual memory layout as well as the attributes associated with the
-various regions in a human-readable format. It is useful to dump the
-kernel page tables to verify permissions and memory types. Examining the
-page table entries and permissions helps identify potential security
-vulnerabilities such as mappings with overly permissive access rights or
-improper memory protections.
+ptdump is a debugfs interface that provides a detailed dump of the kernel page
+tables. It offers a comprehensive overview of the kernel virtual memory layout
+as well as the attributes associated with the various regions in a
+human-readable format. It is useful to dump the kernel page tables to verify
+permissions and memory types. Examining the page table entries and permissions
+helps identify potential security vulnerabilities such as mappings with overly
+permissive access rights or improper memory protections.
 
-Memory hotplug allows dynamic expansion or contraction of available
-memory without requiring a system reboot. To maintain the consistency
-and integrity of the memory management data structures, arm64 makes use
-of the ``mem_hotplug_lock`` semaphore in write mode. Additionally, in
-read mode, ``mem_hotplug_lock`` supports an efficient implementation of
-``get_online_mems()`` and ``put_online_mems()``. These protect the
-offlining of memory being accessed by the ptdump code.
+Memory hotplug allows dynamic expansion or contraction of available memory
+without requiring a system reboot. To maintain the consistency and integrity of
+the memory management data structures, arm64 makes use of the
+``mem_hotplug_lock`` semaphore in write mode. Additionally, in read mode,
+``mem_hotplug_lock`` supports an efficient implementation of
+``get_online_mems()`` and ``put_online_mems()``. These protect the offlining of
+memory being accessed by the ptdump code.
 
-In order to dump the kernel page tables, enable the following
-configurations and mount debugfs::
+In order to dump the kernel page tables, enable the following configurations
+and mount debugfs::
 
  CONFIG_GENERIC_PTDUMP=y
  CONFIG_PTDUMP_CORE=y
@@ -29,68 +28,101 @@  configurations and mount debugfs::
  mount -t debugfs nodev /sys/kernel/debug
  cat /sys/kernel/debug/kernel_page_tables
 
-On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables``
-one can derive information about the virtual address range of the entry,
-followed by size of the memory region covered by this entry, the
-hierarchical structure of the page tables and finally the attributes
-associated with each page. The page attributes provide information about
-access permissions, execution capability, type of mapping such as leaf
-level PTE or block level PGD, PMD and PUD, and access status of a page
-within the kernel memory. Assessing these attributes can assist in
-understanding the memory layout, access patterns and security
-characteristics of the kernel pages.
+On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables`` one can
+derive information about the virtual address range of a contiguous group of
+page table entries, followed by size of the memory region covered by this
+group, the hierarchical structure of the page tables and finally the attributes
+associated with each page in the group. Groups are broken up either according
+to a change in attributes or by parent descriptor, such as a PMD. Note that the
+set of attributes, and therefore formatting, is not equivalent between entry
+types. For example, PMD entries have a separate set of attributes from leaf
+level PTE entries, because they support both the UXNTable and PXNTable
+permission bits.
+
+The page attributes provide information about access permissions, execution
+capability, type of mapping such as leaf level PTE or block level PGD, PMD and
+PUD, and access status of a page within the kernel memory. Non-PTE block or
+page level entries are denoted with either "BLK" or "TBL", respectively.
+Assessing these attributes can assist in understanding the memory layout,
+access patterns and security characteristics of the kernel pages.
 
 Kernel virtual memory layout example::
 
- start address        end address         size             attributes
- +---------------------------------------------------------------------------------------+
- | ---[ Linear Mapping start ]---------------------------------------------------------- |
- | ..................                                                                    |
- | 0xfff0000000000000-0xfff0000000210000  2112K PTE RW NX SHD AF  UXN  MEM/NORMAL-TAGGED |
- | 0xfff0000000210000-0xfff0000001c00000 26560K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
- | ..................                                                                    |
- | ---[ Linear Mapping end ]------------------------------------------------------------ |
- +---------------------------------------------------------------------------------------+
- | ---[ Modules start ]----------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xffff800000000000-0xffff800008000000   128M PTE                                      |
- | ..................                                                                    |
- | ---[ Modules end ]------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ vmalloc() area ]---------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xffff800008010000-0xffff800008200000  1984K PTE ro x  SHD AF       UXN  MEM/NORMAL   |
- | 0xffff800008200000-0xffff800008e00000    12M PTE ro x  SHD AF  CON  UXN  MEM/NORMAL   |
- | ..................                                                                    |
- | ---[ vmalloc() end ]----------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ Fixmap start ]------------------------------------------------------------------ |
- | ..................                                                                    |
- | 0xfffffbfffdb80000-0xfffffbfffdb90000    64K PTE ro x  SHD AF  UXN  MEM/NORMAL        |
- | 0xfffffbfffdb90000-0xfffffbfffdba0000    64K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
- | ..................                                                                    |
- | ---[ Fixmap end ]-------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ PCI I/O start ]----------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xfffffbfffe800000-0xfffffbffff800000    16M PTE                                      |
- | ..................                                                                    |
- | ---[ PCI I/O end ]------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ vmemmap start ]----------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xfffffc0002000000-0xfffffc0002200000     2M PTE RW NX SHD AF  UXN  MEM/NORMAL        |
- | 0xfffffc0002200000-0xfffffc0020000000   478M PTE                                      |
- | ..................                                                                    |
- | ---[ vmemmap end ]------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
+ start address        end address         size type  leaf    attributes
+ +-----------------------------------------------------------------------------------------------------------------+
+ | ---[ Linear Mapping start ]---                                                                                  |
+ | ...                                                                                                             |
+ | 0xffff0d02c3200000-0xffff0d02c3400000    2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL    |
+ | 0xffff0d02c3200000-0xffff0d02c3218000   96K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
+ | 0xffff0d02c3218000-0xffff0d02c3250000  224K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
+ | 0xffff0d02c3250000-0xffff0d02c33b3000 1420K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
+ | 0xffff0d02c33b3000-0xffff0d02c3400000  308K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
+ | 0xffff0d02c3400000-0xffff0d02c3600000    2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL    |
+ | 0xffff0d02c3400000-0xffff0d02c3600000    2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED          |
+ | ...                                                                                                             |
+ | 0xffff0d02c3200000-0xffff0d02c3400000    2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL    |
+ | ...                                                                                                             |
+ | ---[ Linear Mapping end ]---                                                                                    |
+ +-----------------------------------------------------------------------------------------------------------------+
+ | ---[ Modules start ]---                                                                                         |
+ | ...                                                                                                             |
+ | 0xffff800000000000-0xffff800000000080 128B PGD   TBL     RW               x     UXNTbl    MEM/NORMAL            |
+ | 0xffff800000000000-0xffff800080000000   2G PUD F BLK     RW               x               MEM/NORMAL            |
+ | ...                                                                                                             |
+ | ---[ Modules end ]---                                                                                           |
+ +-----------------------------------------------------------------------------------------------------------------+
+ | ---[ vmalloc() area ]---                                                                                        |
+ | ...                                                                                                             |
+ | 0xffff800080000000-0xffff8000c0000000   1G PUD   TBL     RW               x     UXNTbl    MEM/NORMAL            |
+ | ...                                                                                                             |
+ | 0xffff800080200000-0xffff800080400000   2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL     |
+ | 0xffff800080200000-0xffff80008022f000 188K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL                  |
+ | 0xffff80008022f000-0xffff800080230000   4K PTE F BLK     RW x                       MEM/NORMAL                  |
+ | 0xffff800080230000-0xffff800080233000  12K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL                  |
+ | 0xffff800080233000-0xffff800080234000   4K PTE F BLK     RW x                       MEM/NORMAL                  |
+ | 0xffff800080234000-0xffff800080237000  12K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL                  |
+ | ...                                                                                                             |
+ | 0xffff800080400000-0xffff800084000000  60M PMD F BLK     RW               x      x     x         MEM/NORMAL     |
+ | ...                                                                                                             |
+ | ---[ vmalloc() end ]---                                                                                         |
+ +-----------------------------------------------------------------------------------------------------------------+
+ | ---[ vmemmap start ]---                                                                                         |
+ | ...                                                                                                             |
+ | 0xfffffe33cb000000-0xfffffe33cc000000  16M PMD   BLK     RW SHD AF NG     NX UXN x     x         MEM/NORMAL     |
+ | 0xfffffe33cc000000-0xfffffe3400000000 832M PMD F BLK     RW               x      x     x         MEM/NORMAL     |
+ | ...                                                                                                             |
+ | ---[ vmemmap end ]---                                                                                           |
+ +-----------------------------------------------------------------------------------------------------------------+
+ | ---[ PCI I/O start ]---                                                                                         |
+ | ...                                                                                                             |
+ | 0xffffffffc0800000-0xffffffffc0810000 64K PTE           RW NX SHD AF NG     UXN    DEVICE/nGnRE                 |
+ | ...                                                                                                             |
+ | ---[ PCI I/O end ]---                                                                                           |
+ +-----------------------------------------------------------------------------------------------------------------+
+ | ---[ Fixmap start ]---                                                                                          |
+ | ...                                                                                                             |
+ | 0xffffffffff5f6000-0xffffffffff5f9000 12K PTE           ro x  SHD AF        UXN    MEM/NORMAL                   |
+ | 0xffffffffff5f9000-0xffffffffff5fa000  4K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL                   |
+ | ...                                                                                                             |
+ | ---[ Fixmap end ]---                                                                                            |
+ +-----------------------------------------------------------------------------------------------------------------+
 
 ``cat /sys/kernel/debug/kernel_page_tables`` output::
 
- 0xfff0000001c00000-0xfff0000080000000     2020M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000080000000-0xfff0000800000000       30G PMD
- 0xfff0000800000000-0xfff0000800700000        7M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000800700000-0xfff0000800710000       64K PTE  ro NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000800710000-0xfff0000880000000  2089920K PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000880000000-0xfff0040000000000     4062G PMD
- 0xfff0040000000000-0xffff800000000000     3964T PGD
+ 0xffff000000000000-0xffff020000000000           2T PGD
+ 0xffff020000000000-0xffff020000000080         128B PGD   TBL     RW               NXTbl UXNTbl    MEM/NORMAL
+     0xffff020000000000-0xffff023080000000         194G PUD
+     0xffff023080000000-0xffff0230c0000000           1G PUD   TBL     RW               NXTbl UXNTbl    MEM/NORMAL
+       0xffff023080000000-0xffff023080200000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+         0xffff023080000000-0xffff023080200000           2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
+       0xffff023080200000-0xffff023080400000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+         0xffff023080200000-0xffff023080210000          64K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
+         0xffff023080210000-0xffff023080400000        1984K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL
+       0xffff023080400000-0xffff023081c00000          24M PMD   BLK     ro SHD AF NG     NX UXN x     x         MEM/NORMAL
+       0xffff023081c00000-0xffff023081e00000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+         0xffff023081c00000-0xffff023081dd0000        1856K PTE           ro NX SHD AF NG     UXN    MEM/NORMAL
+         0xffff023081dd0000-0xffff023081e00000         192K PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
+       0xffff023081e00000-0xffff023082000000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+         0xffff023081e00000-0xffff023082000000           2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
+       0xffff023082000000-0xffff023082200000           2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+         0xffff023082000000-0xffff023082200000           2M PTE           RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 6986827e0d64..bd4f1df0c444 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -24,6 +24,7 @@ 
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptdump.h>
+#include <asm/pgalloc.h>
 
 
 #define pt_dump_seq_printf(m, fmt, args...)	\
@@ -70,6 +71,11 @@  static const struct prot_bits pte_bits[] = {
 		.val	= PTE_VALID,
 		.set	= " ",
 		.clear	= "F",
+	}, {
+		.mask	= PTE_TABLE_BIT,
+		.val	= PTE_TABLE_BIT,
+		.set	= "   ",
+		.clear	= "BLK",
 	}, {
 		.mask	= PTE_USER,
 		.val	= PTE_USER,
@@ -105,11 +111,6 @@  static const struct prot_bits pte_bits[] = {
 		.val	= PTE_CONT,
 		.set	= "CON",
 		.clear	= "   ",
-	}, {
-		.mask	= PTE_TABLE_BIT,
-		.val	= PTE_TABLE_BIT,
-		.set	= "   ",
-		.clear	= "BLK",
 	}, {
 		.mask	= PTE_UXN,
 		.val	= PTE_UXN,
@@ -143,34 +144,208 @@  static const struct prot_bits pte_bits[] = {
 	}
 };
 
+static const struct prot_bits pmd_bits[] = {
+	{
+		.mask	= PMD_SECT_VALID,
+		.val	= PMD_SECT_VALID,
+		.set	= " ",
+		.clear	= "F",
+	}, {
+		.mask	= PMD_TABLE_BIT,
+		.val	= PMD_TABLE_BIT,
+		.set	= "TBL",
+		.clear	= "BLK",
+	}, {
+		.mask	= PMD_SECT_USER,
+		.val	= PMD_SECT_USER,
+		.set	= "USR",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_SECT_RDONLY,
+		.val	= PMD_SECT_RDONLY,
+		.set	= "ro",
+		.clear	= "RW",
+	}, {
+		.mask	= PMD_SECT_S,
+		.val	= PMD_SECT_S,
+		.set	= "SHD",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_SECT_AF,
+		.val	= PMD_SECT_AF,
+		.set	= "AF",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_SECT_NG,
+		.val	= PMD_SECT_NG,
+		.set	= "NG",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_SECT_CONT,
+		.val	= PMD_SECT_CONT,
+		.set	= "CON",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_SECT_PXN,
+		.val	= PMD_SECT_PXN,
+		.set	= "NX",
+		.clear	= "x ",
+	}, {
+		.mask	= PMD_SECT_UXN,
+		.val	= PMD_SECT_UXN,
+		.set	= "UXN",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_TABLE_PXN,
+		.val	= PMD_TABLE_PXN,
+		.set	= "NXTbl",
+		.clear	= "x    ",
+	}, {
+		.mask	= PMD_TABLE_UXN,
+		.val	= PMD_TABLE_UXN,
+		.set	= "UXNTbl",
+		.clear	= "x     ",
+	}, {
+		.mask	= PTE_GP,
+		.val	= PTE_GP,
+		.set	= "GP",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRnE),
+		.set	= "DEVICE/nGnRnE",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRE),
+		.set	= "DEVICE/nGnRE",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL_NC),
+		.set	= "MEM/NORMAL-NC",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL),
+		.set	= "MEM/NORMAL",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL_TAGGED),
+		.set	= "MEM/NORMAL-TAGGED",
+	}
+};
+
+static const struct prot_bits pud_bits[] = {
+	{
+		.mask	= PUD_TYPE_SECT,
+		.val	= PUD_TYPE_SECT,
+		.set	= " ",
+		.clear	= "F",
+	}, {
+		.mask	= PUD_TABLE_BIT,
+		.val	= PUD_TABLE_BIT,
+		.set	= "TBL",
+		.clear	= "BLK",
+	}, {
+		.mask	= PTE_USER,
+		.val	= PTE_USER,
+		.set	= "USR",
+		.clear	= "   ",
+	}, {
+		.mask	= PUD_SECT_RDONLY,
+		.val	= PUD_SECT_RDONLY,
+		.set	= "ro",
+		.clear	= "RW",
+	}, {
+		.mask	= PTE_SHARED,
+		.val	= PTE_SHARED,
+		.set	= "SHD",
+		.clear	= "   ",
+	}, {
+		.mask	= PTE_AF,
+		.val	= PTE_AF,
+		.set	= "AF",
+		.clear	= "  ",
+	}, {
+		.mask	= PTE_NG,
+		.val	= PTE_NG,
+		.set	= "NG",
+		.clear	= "  ",
+	}, {
+		.mask	= PTE_CONT,
+		.val	= PTE_CONT,
+		.set	= "CON",
+		.clear	= "   ",
+	}, {
+		.mask	= PUD_TABLE_PXN,
+		.val	= PUD_TABLE_PXN,
+		.set	= "NXTbl",
+		.clear	= "x    ",
+	}, {
+		.mask	= PUD_TABLE_UXN,
+		.val	= PUD_TABLE_UXN,
+		.set	= "UXNTbl",
+		.clear	= "      ",
+	}, {
+		.mask	= PTE_GP,
+		.val	= PTE_GP,
+		.set	= "GP",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRnE),
+		.set	= "DEVICE/nGnRnE",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRE),
+		.set	= "DEVICE/nGnRE",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL_NC),
+		.set	= "MEM/NORMAL-NC",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL),
+		.set	= "MEM/NORMAL",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL_TAGGED),
+		.set	= "MEM/NORMAL-TAGGED",
+	}
+};
+
 struct pg_level {
 	const struct prot_bits *bits;
 	char name[4];
 	int num;
 	u64 mask;
+	unsigned long size;
 };
 
 static struct pg_level pg_level[] __ro_after_init = {
 	{ /* pgd */
 		.name	= "PGD",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pud_bits,
+		.num	= ARRAY_SIZE(pud_bits),
+		.size	= PGD_SIZE,
 	}, { /* p4d */
 		.name	= "P4D",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pud_bits,
+		.num	= ARRAY_SIZE(pud_bits),
+		.size	= P4D_SIZE,
 	}, { /* pud */
 		.name	= "PUD",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pud_bits,
+		.num	= ARRAY_SIZE(pud_bits),
+		.size	= PUD_SIZE,
 	}, { /* pmd */
 		.name	= "PMD",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pmd_bits,
+		.num	= ARRAY_SIZE(pmd_bits),
+		.size	= PMD_SIZE,
 	}, { /* pte */
 		.name	= "PTE",
 		.bits	= pte_bits,
 		.num	= ARRAY_SIZE(pte_bits),
+		.size	= PAGE_SIZE,
 	},
 };
 
@@ -225,8 +400,9 @@  static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 		      u64 val)
 {
 	struct pg_state *st = container_of(pt_st, struct pg_state, ptdump);
-	static const char units[] = "KMGTPE";
+	static const char units[] = "BKMGTPE";
 	u64 prot = 0;
+	int i;
 
 	/* check if the current level has been folded dynamically */
 	if ((level == 1 && mm_p4d_folded(st->mm)) ||
@@ -241,20 +417,33 @@  static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 		st->current_prot = prot;
 		st->start_address = addr;
 		pt_dump_seq_printf(st->seq, "---[ %s ]---\n", st->marker->name);
 	} else if (prot != st->current_prot || level != st->level ||
 		   addr >= st->marker[1].start_address) {
 		const char *unit = units;
 		unsigned long delta;
 
+		for (i = 0; i < st->level; i++)
+			pt_dump_seq_printf(st->seq, "  ");
+
 		if (st->current_prot) {
 			note_prot_uxn(st, addr);
 			note_prot_wx(st, addr);
 		}
 
-		pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
-				   st->start_address, addr);
+		/*
+		 * A table entry is followed by its first child at the same
+		 * address, so its range is empty; use the level's size.
+		 */
+		if (st->start_address != addr) {
+			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
+					   st->start_address, addr);
+			delta = addr - st->start_address;
+		} else {
+			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ", addr,
+					   addr + pg_level[st->level].size);
+			delta = pg_level[st->level].size;
+		}
 
-		delta = (addr - st->start_address) >> 10;
 		while (!(delta & 1023) && unit[1]) {
 			delta >>= 10;
 			unit++;
@@ -301,7 +490,8 @@  void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
 			.range = (struct ptdump_range[]){
 				{info->base_addr, end},
 				{0, 0}
-			}
+			},
+			.note_non_leaf = true,
 		}
 	};
 
diff --git a/include/linux/ptdump.h b/include/linux/ptdump.h
index 8dbd51ea8626..b3e793a5c77f 100644
--- a/include/linux/ptdump.h
+++ b/include/linux/ptdump.h
@@ -16,6 +16,7 @@  struct ptdump_state {
 			  int level, u64 val);
 	void (*effective_prot)(struct ptdump_state *st, int level, u64 val);
 	const struct ptdump_range *range;
+	bool note_non_leaf;
 };
 
 bool ptdump_walk_pgd_level_core(struct seq_file *m,
diff --git a/mm/ptdump.c b/mm/ptdump.c
index 106e1d66e9f9..97da7a765b22 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -41,6 +41,9 @@  static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,
 	if (st->effective_prot)
 		st->effective_prot(st, 0, pgd_val(val));
 
+	if (st->note_non_leaf && !pgd_leaf(val))
+		st->note_page(st, addr, 0, pgd_val(val));
+
 	if (pgd_leaf(val)) {
 		st->note_page(st, addr, 0, pgd_val(val));
 		walk->action = ACTION_CONTINUE;
@@ -64,6 +67,9 @@  static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
 	if (st->effective_prot)
 		st->effective_prot(st, 1, p4d_val(val));
 
+	if (st->note_non_leaf && !p4d_leaf(val))
+		st->note_page(st, addr, 1, p4d_val(val));
+
 	if (p4d_leaf(val)) {
 		st->note_page(st, addr, 1, p4d_val(val));
 		walk->action = ACTION_CONTINUE;
@@ -87,6 +93,9 @@  static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
 	if (st->effective_prot)
 		st->effective_prot(st, 2, pud_val(val));
 
+	if (st->note_non_leaf && !pud_leaf(val))
+		st->note_page(st, addr, 2, pud_val(val));
+
 	if (pud_leaf(val)) {
 		st->note_page(st, addr, 2, pud_val(val));
 		walk->action = ACTION_CONTINUE;
@@ -108,6 +117,10 @@  static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr,
 
 	if (st->effective_prot)
 		st->effective_prot(st, 3, pmd_val(val));
+
+	if (st->note_non_leaf && !pmd_leaf(val))
+		st->note_page(st, addr, 3, pmd_val(val));
+
 	if (pmd_leaf(val)) {
 		st->note_page(st, addr, 3, pmd_val(val));
 		walk->action = ACTION_CONTINUE;
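
For readers following the size column in the new output, here is a minimal userspace sketch of the range and unit selection that the arm64 note_page() hunk performs. It is not kernel code: span_size() and format_size() are hypothetical helper names introduced only for illustration, and the sketch assumes the byte count is nonzero, as it is on the path taken in note_page().

```c
#include <stdio.h>

/*
 * Hypothetical helper: pick the size to report for one output line.
 * A table (non-leaf) entry is noted at the same address as its first
 * child, so start == addr and the size of one entry at that level is
 * used instead of the empty coalesced range.
 */
static unsigned long long span_size(unsigned long long start,
				    unsigned long long addr,
				    unsigned long long level_size)
{
	return start != addr ? addr - start : level_size;
}

/*
 * Hypothetical helper: render a byte count the way the patched loop
 * does, using the extended unit string "BKMGTPE". The value is scaled
 * down by 1024 while it divides evenly and a larger unit remains.
 * Assumes delta != 0; a zero value would run through to the last unit.
 */
static void format_size(unsigned long long delta, char *buf, size_t len)
{
	static const char units[] = "BKMGTPE";
	const char *unit = units;

	while (!(delta & 1023) && unit[1]) {
		delta >>= 10;
		unit++;
	}
	snprintf(buf, len, "%llu%c", delta, *unit);
}
```

Called with the PMD table entry from the commit message (start and addr both 0xffff0fb640200000, level size 2 MiB), the helpers yield "2M"; the coalesced PTE run 0xffff0fb640210000-0xffff0fb640400000 yields "1984K". Because the count now starts in bytes rather than being pre-shifted by 10, the "B" unit is needed for values that are not a multiple of 1024.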