diff mbox

[1/2] mm/slab_common: add SLAB_NO_MERGE flag for use when creating slabs

Message ID 1441129890-25585-1-git-send-email-snitzer@redhat.com (mailing list archive)
State Rejected, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

Mike Snitzer Sept. 1, 2015, 5:51 p.m. UTC
The slab aliasing/merging by default transition went unnoticed (at least
to the DM subsystem).  Add a new SLAB_NO_MERGE flag that allows
individual slabs to be created without slab merging.  This beats forcing
all slabs to be created in this fashion by specifying sl[au]b_nomerge on
the kernel commandline.

DM has historically taken care to have separate named slabs that each
devices' mempool_t are backed by.  These separate slabs are useful --
even if only to aid inspection of DM's memory use (via /proc/slabinfo)
on production systems.

I stumbled onto slab merging as a side-effect of a leak in dm-cache
being attributed to 'kmalloc-96' rather than the expected
'dm_bio_prison_cell' named slab.  Moving forward DM will disable slab
merging for all of DM's slabs by using SLAB_NO_MERGE.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 include/linux/slab.h | 2 ++
 mm/slab.h            | 2 +-
 mm/slab_common.c     | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

Comments

Dave Chinner Sept. 2, 2015, 1:15 a.m. UTC | #1
On Tue, Sep 01, 2015 at 01:51:29PM -0400, Mike Snitzer wrote:
> The slab aliasing/merging by default transition went unnoticed (at least
> to the DM subsystem).  Add a new SLAB_NO_MERGE flag that allows
> individual slabs to be created without slab merging.  This beats forcing
> all slabs to be created in this fashion by specifying sl[au]b_nomerge on
> the kernel commandline.

I didn't realise that this merging had also been applied to SLAB - I
thought it was just SLUB that did this.  Indeed, one of the prime
reasons for using SLAB over SLUB was that it didn't merge caches and
so still gave excellent visibility of runtime slab memory usage on
production systems.

I had no idea that commit 12220de ("mm/slab: support slab merge")
had made SLAB do this as well as it was not cc'd to any of the
people/subsystems that maintain code that uses slab caches.  IMNSHO
the commit message gives pretty flimsy justification for such a
change, especially considering we need to deal with slab caches that
individually grow to contain hundreds of millions of objects.

> DM has historically taken care to have separate named slabs that each
> devices' mempool_t are backed by.  These separate slabs are useful --
> even if only to aid inspection of DM's memory use (via /proc/slabinfo)
> on production systems.

Yup, that's one of the reasons XFS has 17 separate slab caches. The
other main reason is that slab separation also helps keep memory
corruption and use-after free issues contained; if you have a
problem with the objects from one slab cache, the isolation of the
slab makes it less likely that the problem propagates to other
subsystems. Hence failures also tend to be isolated to the code that
has the leak/corruption problem, making them easier to triage and
debug on production systems...

> I stumbled onto slab merging as a side-effect of a leak in dm-cache
> being attributed to 'kmalloc-96' rather than the expected
> 'dm_bio_prison_cell' named slab.  Moving forward DM will disable slab
> merging for all of DM's slabs by using SLAB_NO_MERGE.

Seems like a fine idea to me. I can apply it to various slabs in XFS
once it's merged....

Acked-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.
diff mbox

Patch

diff --git a/include/linux/slab.h b/include/linux/slab.h
index a99f0e5..d007407 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -87,6 +87,8 @@ 
 # define SLAB_FAILSLAB		0x00000000UL
 #endif
 
+#define SLAB_NO_MERGE		0x04000000UL	/* Do not merge with existing slab */
+
 /* The following flags affect the page allocator grouping pages by mobility */
 #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
 #define SLAB_TEMPORARY		SLAB_RECLAIM_ACCOUNT	/* Objects are short-lived */
diff --git a/mm/slab.h b/mm/slab.h
index 8da63e4..35eb6f4 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -115,7 +115,7 @@  static inline unsigned long kmem_cache_flags(unsigned long object_size,
 
 /* Legal flag mask for kmem_cache_create(), for various configurations */
 #define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
-			 SLAB_DESTROY_BY_RCU | SLAB_DEBUG_OBJECTS )
+			 SLAB_DESTROY_BY_RCU | SLAB_DEBUG_OBJECTS | SLAB_NO_MERGE)
 
 #if defined(CONFIG_DEBUG_SLAB)
 #define SLAB_DEBUG_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8683110..3a5a8ed 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -35,7 +35,7 @@  struct kmem_cache *kmem_cache;
  */
 #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
 		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
-		SLAB_FAILSLAB)
+		SLAB_FAILSLAB | SLAB_NO_MERGE)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | SLAB_NOTRACK)