diff mbox series

[RFC] mm/page_alloc: Enumerate bad page reasons

Message ID 1585551097-27283-1-git-send-email-anshuman.khandual@arm.com (mailing list archive)
State New, archived
Headers show
Series [RFC] mm/page_alloc: Enumerate bad page reasons | expand

Commit Message

Anshuman Khandual March 30, 2020, 6:51 a.m. UTC
Enumerate all existing bad page reasons which can be used in bad_page() for
reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
__dump_page() is called from dump_page() that accepts a raw string and is
also an exported symbol that is currently being used from various generic
memory functions and other drivers. This reduces code duplication while
reporting bad pages.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
Boot tested on arm64 and x86 platforms (next-20200327) but has been built
tested on many other platforms.

 include/linux/memory.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c        | 40 ++++++++++++++++++++--------------------
 2 files changed, 56 insertions(+), 20 deletions(-)

Comments

Michal Hocko March 30, 2020, 8:43 a.m. UTC | #1
On Mon 30-03-20 12:21:37, Anshuman Khandual wrote:
> Enumerate all existing bad page reasons which can be used in bad_page() for
> reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
> __dump_page() is called from dump_page() that accepts a raw string and is
> also an exported symbol that is currently being used from various generic
> memory functions and other drivers. This reduces code duplication while
> reporting bad pages.

I dunno. It sounds like over engineering something that is an internal
stuff. Besides that I consider string reasons kinda obvious and I am
pretty sure I would have to check them for each numeric alias when want
to read the code. Yeah, yeah, nothing really hard but still...

So I am not really sure this is all worth the code churn. Besides
that I stongly suspect you wanted ...

> -static void bad_page(struct page *page, const char *reason,
> +static void bad_page(struct page *page, int reason,
>  		unsigned long bad_flags)

... enum page_bad_reason reason here, right? What is the point of declaring
an enum when you are not using it?
David Hildenbrand March 30, 2020, 8:45 a.m. UTC | #2
On 30.03.20 08:51, Anshuman Khandual wrote:
> Enumerate all existing bad page reasons which can be used in bad_page() for
> reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
> __dump_page() is called from dump_page() that accepts a raw string and is
> also an exported symbol that is currently being used from various generic
> memory functions and other drivers. This reduces code duplication while
> reporting bad pages.

Yeah sounds nice, but "56 insertions(+), 20 deletions(-)" does not sound
so nice ... and the "code duplication" is actually "repeating strings".

... I don't think we want/need this.
Anshuman Khandual March 30, 2020, 12:25 p.m. UTC | #3
On 03/30/2020 02:13 PM, Michal Hocko wrote:
> On Mon 30-03-20 12:21:37, Anshuman Khandual wrote:
>> Enumerate all existing bad page reasons which can be used in bad_page() for
>> reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
>> __dump_page() is called from dump_page() that accepts a raw string and is
>> also an exported symbol that is currently being used from various generic
>> memory functions and other drivers. This reduces code duplication while
>> reporting bad pages.
> 
> I dunno. It sounds like over engineering something that is an internal
> stuff. Besides that I consider string reasons kinda obvious and I am
> pretty sure I would have to check them for each numeric alias when want
> to read the code. Yeah, yeah, nothing really hard but still...

Right these are very much self explanatory. Would moving these aliases into
mm/page_alloc.c itself, make it any better for quicker access ?

> 
> So I am not really sure this is all worth the code churn. Besides

I understand but is not just repeating the same strings in similar functions
bit suboptimal as well.

> that I stongly suspect you wanted ...
> 
>> -static void bad_page(struct page *page, const char *reason,
>> +static void bad_page(struct page *page, int reason,
>>  		unsigned long bad_flags)
> 
> ... enum page_bad_reason reason here, right? What is the point of declaring
> an enum when you are not using it?

Sure, will replace here and other local reason variables which are 'int'.
Anshuman Khandual March 30, 2020, 12:28 p.m. UTC | #4
On 03/30/2020 02:15 PM, David Hildenbrand wrote:
> On 30.03.20 08:51, Anshuman Khandual wrote:
>> Enumerate all existing bad page reasons which can be used in bad_page() for
>> reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
>> __dump_page() is called from dump_page() that accepts a raw string and is
>> also an exported symbol that is currently being used from various generic
>> memory functions and other drivers. This reduces code duplication while
>> reporting bad pages.
> 
> Yeah sounds nice, but "56 insertions(+), 20 deletions(-)" does not sound
> so nice ... and the "code duplication" is actually "repeating strings".

But repeating strings in very similar functions dealing with bad page state
is not bit suboptimal ?

> 
> ... I don't think we want/need this.
>
Michal Hocko March 30, 2020, 12:34 p.m. UTC | #5
On Mon 30-03-20 17:55:14, Anshuman Khandual wrote:
> 
> 
> On 03/30/2020 02:13 PM, Michal Hocko wrote:
> > On Mon 30-03-20 12:21:37, Anshuman Khandual wrote:
> >> Enumerate all existing bad page reasons which can be used in bad_page() for
> >> reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
> >> __dump_page() is called from dump_page() that accepts a raw string and is
> >> also an exported symbol that is currently being used from various generic
> >> memory functions and other drivers. This reduces code duplication while
> >> reporting bad pages.
> > 
> > I dunno. It sounds like over engineering something that is an internal
> > stuff. Besides that I consider string reasons kinda obvious and I am
> > pretty sure I would have to check them for each numeric alias when want
> > to read the code. Yeah, yeah, nothing really hard but still...
> 
> Right these are very much self explanatory. Would moving these aliases into
> mm/page_alloc.c itself, make it any better for quicker access ?

Not really. Cscopes doesn't really care where it is. It is the fact that
the constant makes to have a look is what makes this not an improvement
from my POV.
 
> > So I am not really sure this is all worth the code churn. Besides
> 
> I understand but is not just repeating the same strings in similar functions
> bit suboptimal as well.

I do not really see why that would be suboptimal.
David Hildenbrand March 30, 2020, 12:38 p.m. UTC | #6
On 30.03.20 14:28, Anshuman Khandual wrote:
> 
> 
> On 03/30/2020 02:15 PM, David Hildenbrand wrote:
>> On 30.03.20 08:51, Anshuman Khandual wrote:
>>> Enumerate all existing bad page reasons which can be used in bad_page() for
>>> reporting via __dump_page(). Unfortunately __dump_page() cannot be changed.
>>> __dump_page() is called from dump_page() that accepts a raw string and is
>>> also an exported symbol that is currently being used from various generic
>>> memory functions and other drivers. This reduces code duplication while
>>> reporting bad pages.
>>
>> Yeah sounds nice, but "56 insertions(+), 20 deletions(-)" does not sound
>> so nice ... and the "code duplication" is actually "repeating strings".
> 
> But repeating strings in very similar functions dealing with bad page state
> is not bit suboptimal ?

Who cares? This is debug output.
diff mbox series

Patch

diff --git a/include/linux/memory.h b/include/linux/memory.h
index 7914b0dbd4bb..b2fb90fda537 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -82,6 +82,42 @@  struct memory_notify {
 struct notifier_block;
 struct mem_section;
 
+/*
+ * Bad page reasons
+ */
+enum page_bad_reason {
+	PAGE_BAD_NONZERO_MAPCOUNT,
+	PAGE_BAD_NONZERO_REFCOUNT,
+	PAGE_BAD_NONNULL_MAPPING,
+	PAGE_BAD_FREE_FLAGS,
+	PAGE_BAD_PREP_FLAGS,
+	PAGE_BAD_CGROUP_CHARGED,
+	PAGE_BAD_HWPOISON,
+	PAGE_BAD_TAIL_CORRUPTED,
+	PAGE_BAD_TAIL_NOT_SET,
+	PAGE_BAD_COMPOUND_HEAD,
+	PAGE_BAD_COMPOUND_MAPCOUNT,
+};
+
+static const char *const page_bad_names[] = {
+	[PAGE_BAD_NONZERO_MAPCOUNT]  = "non-zero mapcout",
+	[PAGE_BAD_NONZERO_REFCOUNT]  = "non-zero _refcount",
+	[PAGE_BAD_NONNULL_MAPPING]   = "non-NULL mapping",
+	[PAGE_BAD_FREE_FLAGS]        = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set",
+	[PAGE_BAD_PREP_FLAGS]        = "PAGE_FLAGS_CHECK_AT_PREP flag(s) set",
+	[PAGE_BAD_CGROUP_CHARGED]    = "page still charged to cgroup",
+	[PAGE_BAD_HWPOISON]          = "HWPoisoned (hardware-corrupted)",
+	[PAGE_BAD_TAIL_CORRUPTED]    = "corrupted mapping in tail page",
+	[PAGE_BAD_TAIL_NOT_SET]      = "PageTail not set",
+	[PAGE_BAD_COMPOUND_HEAD]     = "compound_head not consistent",
+	[PAGE_BAD_COMPOUND_MAPCOUNT] = "non-zero compound_mapcount",
+};
+
+static inline const char *get_page_bad(int reason)
+{
+	return page_bad_names[reason];
+}
+
 /*
  * Priorities for the hotplug memory callback routines (stored in decreasing
  * order in the callback chain)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ca1453204e66..0f05da0fdc9a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -68,6 +68,7 @@ 
 #include <linux/lockdep.h>
 #include <linux/nmi.h>
 #include <linux/psi.h>
+#include <linux/memory.h>
 
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
@@ -609,7 +610,7 @@  static inline int __maybe_unused bad_range(struct zone *zone, struct page *page)
 }
 #endif
 
-static void bad_page(struct page *page, const char *reason,
+static void bad_page(struct page *page, int reason,
 		unsigned long bad_flags)
 {
 	static unsigned long resume;
@@ -638,7 +639,7 @@  static void bad_page(struct page *page, const char *reason,
 
 	pr_alert("BUG: Bad page state in process %s  pfn:%05lx\n",
 		current->comm, page_to_pfn(page));
-	__dump_page(page, reason);
+	__dump_page(page, get_page_bad(reason));
 	bad_flags &= page->flags;
 	if (bad_flags)
 		pr_alert("bad because of flags: %#lx(%pGp)\n",
@@ -1079,25 +1080,24 @@  static inline bool page_expected_state(struct page *page,
 
 static void free_pages_check_bad(struct page *page)
 {
-	const char *bad_reason;
+	int bad_reason;
 	unsigned long bad_flags;
 
-	bad_reason = NULL;
 	bad_flags = 0;
 
 	if (unlikely(atomic_read(&page->_mapcount) != -1))
-		bad_reason = "nonzero mapcount";
+		bad_reason = PAGE_BAD_NONZERO_MAPCOUNT;
 	if (unlikely(page->mapping != NULL))
-		bad_reason = "non-NULL mapping";
+		bad_reason = PAGE_BAD_NONNULL_MAPPING;
 	if (unlikely(page_ref_count(page) != 0))
-		bad_reason = "nonzero _refcount";
+		bad_reason = PAGE_BAD_NONZERO_REFCOUNT;
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_FREE)) {
-		bad_reason = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set";
+		bad_reason = PAGE_BAD_FREE_FLAGS;
 		bad_flags = PAGE_FLAGS_CHECK_AT_FREE;
 	}
 #ifdef CONFIG_MEMCG
 	if (unlikely(page->mem_cgroup))
-		bad_reason = "page still charged to cgroup";
+		bad_reason = PAGE_BAD_CGROUP_CHARGED;
 #endif
 	bad_page(page, bad_reason, bad_flags);
 }
@@ -1130,7 +1130,7 @@  static int free_tail_pages_check(struct page *head_page, struct page *page)
 	case 1:
 		/* the first tail page: ->mapping may be compound_mapcount() */
 		if (unlikely(compound_mapcount(page))) {
-			bad_page(page, "nonzero compound_mapcount", 0);
+			bad_page(page, PAGE_BAD_COMPOUND_MAPCOUNT, 0);
 			goto out;
 		}
 		break;
@@ -1142,17 +1142,17 @@  static int free_tail_pages_check(struct page *head_page, struct page *page)
 		break;
 	default:
 		if (page->mapping != TAIL_MAPPING) {
-			bad_page(page, "corrupted mapping in tail page", 0);
+			bad_page(page, PAGE_BAD_TAIL_CORRUPTED, 0);
 			goto out;
 		}
 		break;
 	}
 	if (unlikely(!PageTail(page))) {
-		bad_page(page, "PageTail not set", 0);
+		bad_page(page, PAGE_BAD_TAIL_NOT_SET, 0);
 		goto out;
 	}
 	if (unlikely(compound_head(page) != head_page)) {
-		bad_page(page, "compound_head not consistent", 0);
+		bad_page(page, PAGE_BAD_COMPOUND_HEAD, 0);
 		goto out;
 	}
 	ret = 0;
@@ -2110,29 +2110,29 @@  static inline void expand(struct zone *zone, struct page *page,
 
 static void check_new_page_bad(struct page *page)
 {
-	const char *bad_reason = NULL;
+	int bad_reason;
 	unsigned long bad_flags = 0;
 
 	if (unlikely(atomic_read(&page->_mapcount) != -1))
-		bad_reason = "nonzero mapcount";
+		bad_reason = PAGE_BAD_NONZERO_MAPCOUNT;
 	if (unlikely(page->mapping != NULL))
-		bad_reason = "non-NULL mapping";
+		bad_reason = PAGE_BAD_NONNULL_MAPPING;
 	if (unlikely(page_ref_count(page) != 0))
-		bad_reason = "nonzero _refcount";
+		bad_reason = PAGE_BAD_NONZERO_REFCOUNT;
 	if (unlikely(page->flags & __PG_HWPOISON)) {
-		bad_reason = "HWPoisoned (hardware-corrupted)";
+		bad_reason = PAGE_BAD_HWPOISON;
 		bad_flags = __PG_HWPOISON;
 		/* Don't complain about hwpoisoned pages */
 		page_mapcount_reset(page); /* remove PageBuddy */
 		return;
 	}
 	if (unlikely(page->flags & PAGE_FLAGS_CHECK_AT_PREP)) {
-		bad_reason = "PAGE_FLAGS_CHECK_AT_PREP flag set";
+		bad_reason = PAGE_BAD_PREP_FLAGS;
 		bad_flags = PAGE_FLAGS_CHECK_AT_PREP;
 	}
 #ifdef CONFIG_MEMCG
 	if (unlikely(page->mem_cgroup))
-		bad_reason = "page still charged to cgroup";
+		bad_reason = PAGE_BAD_CGROUP_CHARGED;
 #endif
 	bad_page(page, bad_reason, bad_flags);
 }