mm/slub: use WARN_ON() for some slab errors

Message ID 1548063490-545-1-git-send-email-miles.chen@mediatek.com (mailing list archive)
State New, archived

Commit Message

Miles Chen Jan. 21, 2019, 9:38 a.m. UTC
From: Miles Chen <miles.chen@mediatek.com>

When debugging with slub.c, sometimes we have to trigger a panic in
order to get the coredump file. To do that, we have to modify slub.c and
rebuild the kernel. To make debugging easier, use WARN_ON() for these slab
errors so we can dump a stack trace by default or set panic_on_warn to
trigger a panic.

Signed-off-by: Miles Chen <miles.chen@mediatek.com>
---
 mm/slub.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
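
The behavioural difference the commit message relies on can be sketched in
user space. This is only a rough model under stated assumptions: the names
model_dump_stack(), model_warn_on() and model_panic_on_warn are hypothetical
stand-ins, not kernel code, and the "panic" is reduced to a return value so
the example can run as an ordinary program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * User-space model only. The names below are hypothetical stand-ins that
 * mirror the kernel concepts; they are not the kernel implementation.
 */
enum report_result { REPORT_CONTINUED, REPORT_PANICKED };

/* Stands in for the panic_on_warn setting. */
static bool model_panic_on_warn;

/* Models dump_stack(): print a trace and always continue. */
static enum report_result model_dump_stack(void)
{
	fprintf(stderr, "Call trace: ...\n");
	return REPORT_CONTINUED;
}

/* Models WARN_ON(cond): report, and escalate only if panic_on_warn is set. */
static enum report_result model_warn_on(bool cond)
{
	if (!cond)
		return REPORT_CONTINUED;
	fprintf(stderr, "WARNING: ...\n");
	model_dump_stack();
	return model_panic_on_warn ? REPORT_PANICKED : REPORT_CONTINUED;
}

/* Example: the same slab error report under both settings. */
static void model_example(void)
{
	model_panic_on_warn = false;
	assert(model_warn_on(true) == REPORT_CONTINUED);  /* trace only */

	model_panic_on_warn = true;
	assert(model_warn_on(true) == REPORT_PANICKED);   /* would panic */
	assert(model_dump_stack() == REPORT_CONTINUED);   /* never escalates */
}
```

In this model the default path (panic_on_warn unset) still continues after
the report, which is the property the patch depends on.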

Comments

Christoph Lameter (Ampere) Jan. 21, 2019, 10:02 p.m. UTC | #1
On Mon, 21 Jan 2019, miles.chen@mediatek.com wrote:

> From: Miles Chen <miles.chen@mediatek.com>
>
> When debugging with slub.c, sometimes we have to trigger a panic in
> order to get the coredump file. To do that, we have to modify slub.c and
> rebuild the kernel. To make debugging easier, use WARN_ON() for these slab
> errors so we can dump a stack trace by default or set panic_on_warn to
> trigger a panic.

These locations really should dump stack and not terminate. There is
subsequent processing that should be done.

Slub terminates by default. The messages you are modifying are only
enabled if the user specified that special debugging should be done
(typically via the kernel parameter slub_debug).

It does not make sense to terminate the process here.

Miles Chen Jan. 22, 2019, 4:14 a.m. UTC | #2
On Mon, 2019-01-21 at 22:02 +0000, Christoph Lameter wrote:
> On Mon, 21 Jan 2019, miles.chen@mediatek.com wrote:
> 
> > From: Miles Chen <miles.chen@mediatek.com>
> >
> > When debugging with slub.c, sometimes we have to trigger a panic in
> > order to get the coredump file. To do that, we have to modify slub.c and
> > rebuild the kernel. To make debugging easier, use WARN_ON() for these slab
> > errors so we can dump a stack trace by default or set panic_on_warn to
> > trigger a panic.
> 
> These locations really should dump stack and not terminate. There is
> subsequent processing that should be done.

Understood. We should not terminate the process in the normal case. The
change only terminates the process when panic_on_warn is set.

> Slub terminates by default. The messages you are modifying are only
> enabled if the user specified that special debugging should be done
> (typically via the kernel parameter slub_debug).

I'm a little bit confused about this: Do you mean that I should use the
following approach?

1. Add a special debugging flag (say SLAB_PANIC_ON_ERROR) and call
panic() by:

if (s->flags & SLAB_PANIC_ON_ERROR)
     panic("slab error");

2. SLAB_PANIC_ON_ERROR should be set via the slub_debug parameter.

> It does not make sense to terminate the process here.


Thanks for your comment. Sometimes it's useful to trigger a panic and get
its coredump file before any restore/reset processing, because we can then
examine the unmodified data in the coredump file.

I have used BUG() for the slab errors in internal branches for a few years,
and it does help with both software issues and bit-flipping issues. It's
quite useful in the development stage.

cheers,
Miles
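
The alternative raised in the reply above, a per-cache opt-in flag checked at
the error site, might look roughly like the following user-space sketch. It
is an illustration under assumptions: SLAB_PANIC_ON_ERROR exists only as a
proposal in this thread, so the MODEL_SLAB_PANIC_ON_ERROR bit value, struct
model_cache and model_error_should_panic() are all hypothetical, and the
decision is returned as a boolean rather than calling panic().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical flag taken from the proposal in this thread. The bit value
 * is an arbitrary choice for this model, not a real kernel definition.
 */
#define MODEL_SLAB_PANIC_ON_ERROR (1UL << 0)

/* Minimal stand-in for struct kmem_cache: only what the check needs. */
struct model_cache {
	const char *name;
	unsigned long flags;
};

/* True when an error in this cache should escalate to a panic. */
static bool model_error_should_panic(const struct model_cache *s)
{
	return s != NULL && (s->flags & MODEL_SLAB_PANIC_ON_ERROR) != 0;
}

/* Example: only the cache that opted in escalates. */
static void model_flag_example(void)
{
	struct model_cache plain = { .name = "plain", .flags = 0 };
	struct model_cache strict = {
		.name = "strict",
		.flags = MODEL_SLAB_PANIC_ON_ERROR,
	};

	assert(!model_error_should_panic(&plain));
	assert(model_error_should_panic(&strict));
}
```

Compared with panic_on_warn, a flag like this would limit the escalation to
caches that were explicitly put into that mode, at the cost of a new flag.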

Patch

diff --git a/mm/slub.c b/mm/slub.c
index 1e3d0ec4e200..e48c3bb30c93 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -684,7 +684,7 @@  static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
 		print_section(KERN_ERR, "Padding ", p + off,
 			      size_from_object(s) - off);
 
-	dump_stack();
+	WARN_ON(1);
 }
 
 void object_err(struct kmem_cache *s, struct page *page,
@@ -705,7 +705,7 @@  static __printf(3, 4) void slab_err(struct kmem_cache *s, struct page *page,
 	va_end(args);
 	slab_bug(s, "%s", buf);
 	print_page_info(page);
-	dump_stack();
+	WARN_ON(1);
 }
 
 static void init_object(struct kmem_cache *s, void *object, u8 val)
@@ -1690,7 +1690,7 @@  static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
 		flags &= ~GFP_SLAB_BUG_MASK;
 		pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
 				invalid_mask, &invalid_mask, flags, &flags);
-		dump_stack();
+		WARN_ON(1);
 	}
 
 	return allocate_slab(s,