[v3,01/12] mm: dump_page(): better diagnostics for compound pages

Message ID 20200201034029.4063170-2-jhubbard@nvidia.com (mailing list archive)
State New, archived
Series mm/gup: track FOLL_PIN pages

Commit Message

John Hubbard Feb. 1, 2020, 3:40 a.m. UTC
A compound page collects the refcount in the head page, while leaving
the refcount of each tail page at zero. Therefore, when debugging a
problem that involves compound pages, it's best to have diagnostics that
reflect that situation. However, dump_page() is oblivious to these
points.

Change dump_page() as follows:

1) For tail pages, print relevant head page information: refcount, in
   particular. But only do this if the page is not corrupted so badly
   that the pointer to the head page is all wrong.

2) Do a separate check to catch any (rare) cases of the tail page's
   refcount being non-zero, and issue a separate, clear pr_warn() if
   that ever happens.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 mm/debug.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

Comments

Kirill A. Shutemov Feb. 3, 2020, 1:16 p.m. UTC | #1
On Fri, Jan 31, 2020 at 07:40:18PM -0800, John Hubbard wrote:
> A compound page collects the refcount in the head page, while leaving
> the refcount of each tail page at zero. Therefore, when debugging a
> problem that involves compound pages, it's best to have diagnostics that
> reflect that situation. However, dump_page() is oblivious to these
> points.
> 
> Change dump_page() as follows:
> 
> 1) For tail pages, print relevant head page information: refcount, in
>    particular. But only do this if the page is not corrupted so badly
>    that the pointer to the head page is all wrong.
> 
> 2) Do a separate check to catch any (rare) cases of the tail page's
>    refcount being non-zero, and issue a separate, clear pr_warn() if
>    that ever happens.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

A few nit-picks below.

> ---
>  mm/debug.c | 34 ++++++++++++++++++++++++++++------
>  1 file changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/debug.c b/mm/debug.c
> index ecccd9f17801..beb1c59d784b 100644
> --- a/mm/debug.c
> +++ b/mm/debug.c
> @@ -42,6 +42,32 @@ const struct trace_print_flags vmaflag_names[] = {
>  	{0, NULL}
>  };
>  
> +static void __dump_tail_page(struct page *page, int mapcount)
> +{
> +	struct page *head = compound_head(page);
> +
> +	if ((page < head) || (page >= head + MAX_ORDER_NR_PAGES)) {

I'm not sure whether we want to use compound_nr() here instead of
MAX_ORDER_NR_PAGES. Do you have any reasoning behind the choice?

> +		/*
> +		 * Page is hopelessly corrupted, so limit any reporting to
> +		 * information about the page itself. Do not attempt to look at
> +		 * the head page.
> +		 */
> +		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
> +			"index:%#lx (corrupted tail page case)\n",
> +			page, page_ref_count(page), mapcount, page->mapping,
> +			page_to_pgoff(page));
> +	} else {
> +		pr_warn("page:%px compound refcount:%d mapcount:%d mapping:%px "
> +			"index:%#lx compound_mapcount:%d\n",
> +			page, page_ref_count(head), mapcount, head->mapping,
> +			page_to_pgoff(head), compound_mapcount(page));
> +	}
> +
> +	if (page_ref_count(page) != 0)
> +		pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on this "
> +			"tail page\n", page, page_ref_count(page));

Wrap into {}, please.

> +}
> +
>  void __dump_page(struct page *page, const char *reason)
>  {
>  	struct address_space *mapping;
> @@ -75,12 +101,8 @@ void __dump_page(struct page *page, const char *reason)
>  	 */
>  	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
>  
> -	if (PageCompound(page))
> -		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
> -			"index:%#lx compound_mapcount: %d\n",
> -			page, page_ref_count(page), mapcount,
> -			page->mapping, page_to_pgoff(page),
> -			compound_mapcount(page));
> +	if (PageTail(page))
> +		__dump_tail_page(page, mapcount);
>  	else
>  		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx\n",
>  			page, page_ref_count(page), mapcount,
> -- 
> 2.25.0
>
John Hubbard Feb. 3, 2020, 7:51 p.m. UTC | #2
On 2/3/20 5:16 AM, Kirill A. Shutemov wrote:
> On Fri, Jan 31, 2020 at 07:40:18PM -0800, John Hubbard wrote:
>> A compound page collects the refcount in the head page, while leaving
>> the refcount of each tail page at zero. Therefore, when debugging a
>> problem that involves compound pages, it's best to have diagnostics that
>> reflect that situation. However, dump_page() is oblivious to these
>> points.
>>
>> Change dump_page() as follows:
>>
>> 1) For tail pages, print relevant head page information: refcount, in
>>    particular. But only do this if the page is not corrupted so badly
>>    that the pointer to the head page is all wrong.
>>
>> 2) Do a separate check to catch any (rare) cases of the tail page's
>>    refcount being non-zero, and issue a separate, clear pr_warn() if
>>    that ever happens.
>>
>> Suggested-by: Matthew Wilcox <willy@infradead.org>
>> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> 
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Thanks for looking through all of these!

> 
> A few nit-picks below.
> 
>> ---
>>  mm/debug.c | 34 ++++++++++++++++++++++++++++------
>>  1 file changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/debug.c b/mm/debug.c
>> index ecccd9f17801..beb1c59d784b 100644
>> --- a/mm/debug.c
>> +++ b/mm/debug.c
>> @@ -42,6 +42,32 @@ const struct trace_print_flags vmaflag_names[] = {
>>  	{0, NULL}
>>  };
>>  
>> +static void __dump_tail_page(struct page *page, int mapcount)
>> +{
>> +	struct page *head = compound_head(page);
>> +
>> +	if ((page < head) || (page >= head + MAX_ORDER_NR_PAGES)) {
> 
> I'm not sure whether we want to use compound_nr() here instead of
> MAX_ORDER_NR_PAGES. Do you have any reasoning behind the choice?


Yes: compound_nr(page) reads from the struct page, whereas MAX_ORDER_NR_PAGES
is an independent, immutable limit. When checking a struct page for corruption,
it's ideal to avoid relying on data within the struct page, as compound_nr()
would have to do.
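
To make the trade-off concrete, here is a minimal userspace sketch (not kernel
code) of the two possible bounds checks. Everything in it is hypothetical:
pages are modeled as plain indices, SKETCH_MAX_ORDER_NR_PAGES is an assumed
stand-in value for MAX_ORDER_NR_PAGES, and nr_from_page stands in for whatever
compound_nr() would read out of the struct page.

```c
#include <stdbool.h>

/* Assumed stand-in for MAX_ORDER_NR_PAGES; the real value is config-dependent. */
#define SKETCH_MAX_ORDER_NR_PAGES 1024UL

/*
 * Check against a fixed limit: the result depends only on the two
 * indices, not on any data stored in the (possibly corrupted) page.
 */
static inline bool head_plausible_fixed(unsigned long page_idx,
					unsigned long head_idx)
{
	return page_idx >= head_idx &&
	       page_idx - head_idx < SKETCH_MAX_ORDER_NR_PAGES;
}

/*
 * Check against a size read from the page itself: if that field is
 * garbage, the check can accept a head that is nowhere near the page.
 */
static inline bool head_plausible_from_page(unsigned long page_idx,
					    unsigned long head_idx,
					    unsigned long nr_from_page)
{
	return page_idx >= head_idx &&
	       page_idx - head_idx < nr_from_page;
}
```

With a corrupted size such as ~0UL, head_plausible_from_page(5000, 8, ~0UL)
accepts the head, while head_plausible_fixed(5000, 8) rejects it. The fixed
limit is looser for small compound pages, but it cannot be talked into
trusting a bad head pointer by the same corruption it is trying to detect.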


> 
>> +		/*
>> +		 * Page is hopelessly corrupted, so limit any reporting to
>> +		 * information about the page itself. Do not attempt to look at
>> +		 * the head page.
>> +		 */
>> +		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
>> +			"index:%#lx (corrupted tail page case)\n",
>> +			page, page_ref_count(page), mapcount, page->mapping,
>> +			page_to_pgoff(page));
>> +	} else {
>> +		pr_warn("page:%px compound refcount:%d mapcount:%d mapping:%px "
>> +			"index:%#lx compound_mapcount:%d\n",
>> +			page, page_ref_count(head), mapcount, head->mapping,
>> +			page_to_pgoff(head), compound_mapcount(page));
>> +	}
>> +
>> +	if (page_ref_count(page) != 0)
>> +		pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on this "
>> +			"tail page\n", page, page_ref_count(page));
> 
> Wrap into {}, please.


Fixed, thanks.
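
For reference, the braced form would presumably look something like the
userspace sketch below. It is an illustration only, not the text of the next
revision: pr_warn() is a hypothetical fprintf() stand-in, %p replaces the
kernel-only %px, and warn_if_tail_has_refs() is a made-up helper that takes
the refcount as a plain int so that it can be compiled outside the kernel.

```c
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's pr_warn(). */
#define pr_warn(...) fprintf(stderr, __VA_ARGS__)

/*
 * Warn when a tail page carries its own references.
 * Returns 1 if a warning was printed, 0 otherwise.
 */
static inline int warn_if_tail_has_refs(void *page, int refcount)
{
	/* The body spans two lines, hence the braces the reviewer asked for. */
	if (refcount != 0) {
		pr_warn("page:%p PROBLEM: non-zero refcount (==%d) on this "
			"tail page\n", page, refcount);
		return 1;
	}

	return 0;
}
```

Returning a flag is a choice made only for this sketch, so that the behavior
can be checked from a test; the function in the patch returns void.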


thanks,

Patch

diff --git a/mm/debug.c b/mm/debug.c
index ecccd9f17801..beb1c59d784b 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -42,6 +42,32 @@  const struct trace_print_flags vmaflag_names[] = {
 	{0, NULL}
 };
 
+static void __dump_tail_page(struct page *page, int mapcount)
+{
+	struct page *head = compound_head(page);
+
+	if ((page < head) || (page >= head + MAX_ORDER_NR_PAGES)) {
+		/*
+		 * Page is hopelessly corrupted, so limit any reporting to
+		 * information about the page itself. Do not attempt to look at
+		 * the head page.
+		 */
+		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
+			"index:%#lx (corrupted tail page case)\n",
+			page, page_ref_count(page), mapcount, page->mapping,
+			page_to_pgoff(page));
+	} else {
+		pr_warn("page:%px compound refcount:%d mapcount:%d mapping:%px "
+			"index:%#lx compound_mapcount:%d\n",
+			page, page_ref_count(head), mapcount, head->mapping,
+			page_to_pgoff(head), compound_mapcount(page));
+	}
+
+	if (page_ref_count(page) != 0)
+		pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on this "
+			"tail page\n", page, page_ref_count(page));
+}
+
 void __dump_page(struct page *page, const char *reason)
 {
 	struct address_space *mapping;
@@ -75,12 +101,8 @@  void __dump_page(struct page *page, const char *reason)
 	 */
 	mapcount = PageSlab(page) ? 0 : page_mapcount(page);
 
-	if (PageCompound(page))
-		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
-			"index:%#lx compound_mapcount: %d\n",
-			page, page_ref_count(page), mapcount,
-			page->mapping, page_to_pgoff(page),
-			compound_mapcount(page));
+	if (PageTail(page))
+		__dump_tail_page(page, mapcount);
 	else
 		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px index:%#lx\n",
 			page, page_ref_count(page), mapcount,