diff mbox

slub: Fix sysfs duplicate filename creation when slub_debug=O

Message ID 1510023934-17517-1-git-send-email-miles.chen@mediatek.com (mailing list archive)
State New, archived
Headers show

Commit Message

Miles Chen Nov. 7, 2017, 3:05 a.m. UTC
From: Miles Chen <miles.chen@mediatek.com>

When slub_debug=O is set. It is possible to clear debug flags
for an "unmergeable" slab cache in kmem_cache_open().
It makes the "unmergeable" cache became "mergeable" in sysfs_slab_add().

These caches will generate their "unique IDs" by create_unique_id(),
but it is possible to create identical unique IDs. In my experiment,
sgpool-128, names_cache, biovec-256 generate the same ID ":Ft-0004096"
and the kernel reports "sysfs: cannot create duplicate filename
'/kernel/slab/:Ft-0004096'".

To repeat my experiment, set disable_higher_order_debug=1,
CONFIG_SLUB_DEBUG_ON=y in kernel-4.14.

Fix this issue by setting unmergeable=1 if slub_debug=O and the
the default slub_debug contains any no-merge flags.

call path:
kmem_cache_create()
  __kmem_cache_alias()	-> we set SLAB_NEVER_MERGE flags here
  create_cache()
    __kmem_cache_create()
      kmem_cache_open()	-> clear DEBUG_METADATA_FLAGS
      sysfs_slab_add()	-> the slab cache is mergeable now

[    0.674272] sysfs: cannot create duplicate filename '/kernel/slab/:Ft-0004096'
[    0.674473] ------------[ cut here ]------------
[    0.674653] WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c
[    0.674847] Modules linked in:
[    0.674969] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.14.0-rc7ajb-00131-gd4c2e9f-dirty #123
[    0.675211] Hardware name: linux,dummy-virt (DT)
[    0.675342] task: ffffffc07d4e0080 task.stack: ffffff8008008000
[    0.675505] PC is at sysfs_warn_dup+0x60/0x7c
[    0.675633] LR is at sysfs_warn_dup+0x60/0x7c
[    0.675759] pc : [<ffffff8008235808>] lr : [<ffffff8008235808>] pstate: 60000145
[    0.675948] sp : ffffff800800bb40
[    0.676048] x29: ffffff800800bb40 x28: 0000000000000040
[    0.676209] x27: ffffffc07c52a380 x26: 0000000000000000
[    0.676369] x25: ffffff8008af4ad0 x24: ffffff8008af4000
[    0.676528] x23: ffffffc07c532580 x22: ffffffc07cf04598
[    0.676695] x21: ffffffc07cf26578 x20: ffffffc07c533700
[    0.676857] x19: ffffffc07ce67000 x18: 0000000000000002
[    0.677017] x17: 0000000000007ffe x16: 0000000000000007
[    0.677176] x15: 0000000000000001 x14: 0000000000007fff
[    0.677335] x13: 0000000000000394 x12: 0000000000000000
[    0.677492] x11: 00000000000001ab x10: 0000000000000007
[    0.677651] x9 : 00000000000001ac x8 : ffffff800835d114
[    0.677809] x7 : 656b2f2720656d61 x6 : 0000000000000017
[    0.677967] x5 : ffffffc07ffdb9a8 x4 : 0000000000000000
[    0.678124] x3 : 0000000000000000 x2 : ffffffffffffffff
[    0.678282] x1 : ffffff8008a4e878 x0 : 0000000000000042
[    0.678442] Call trace:
[    0.678528] Exception stack(0xffffff800800ba00 to 0xffffff800800bb40)
[    0.678706] ba00: 0000000000000042 ffffff8008a4e878 ffffffffffffffff 0000000000000000
[    0.678914] ba20: 0000000000000000 ffffffc07ffdb9a8 0000000000000017 656b2f2720656d61
[    0.679121] ba40: ffffff800835d114 00000000000001ac 0000000000000007 00000000000001ab
[    0.679326] ba60: 0000000000000000 0000000000000394 0000000000007fff 0000000000000001
[    0.679532] ba80: 0000000000000007 0000000000007ffe 0000000000000002 ffffffc07ce67000
[    0.679739] baa0: ffffffc07c533700 ffffffc07cf26578 ffffffc07cf04598 ffffffc07c532580
[    0.679944] bac0: ffffff8008af4000 ffffff8008af4ad0 0000000000000000 ffffffc07c52a380
[    0.680149] bae0: 0000000000000040 ffffff800800bb40 ffffff8008235808 ffffff800800bb40
[    0.680354] bb00: ffffff8008235808 0000000060000145 ffffffc07c533700 0000000062616c73
[    0.680560] bb20: ffffffffffffffff 0000000000000000 ffffff800800bb40 ffffff8008235808
[    0.680774] [<ffffff8008235808>] sysfs_warn_dup+0x60/0x7c
[    0.680928] [<ffffff8008235920>] sysfs_create_dir_ns+0x98/0xa0
[    0.681095] [<ffffff8008539274>] kobject_add_internal+0xa0/0x294
[    0.681267] [<ffffff80085394f8>] kobject_init_and_add+0x90/0xb4
[    0.681435] [<ffffff80081b524c>] sysfs_slab_add+0x90/0x200
[    0.681592] [<ffffff80081b62a0>] __kmem_cache_create+0x26c/0x438
[    0.681769] [<ffffff80081858a4>] kmem_cache_create+0x164/0x1f4
[    0.681940] [<ffffff80086caa98>] sg_pool_init+0x60/0x100
[    0.682094] [<ffffff8008084144>] do_one_initcall+0x38/0x12c
[    0.682254] [<ffffff80086a0d10>] kernel_init_freeable+0x138/0x1d4
[    0.682423] [<ffffff8008547388>] kernel_init+0x10/0xfc
[    0.682571] [<ffffff80080851e0>] ret_from_fork+0x10/0x18

Signed-off-by: Miles Chen <miles.chen@mediatek.com>
---
 mm/slub.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Christoph Lameter (Ampere) Nov. 7, 2017, 3:22 p.m. UTC | #1
On Tue, 7 Nov 2017, miles.chen@mediatek.com wrote:

> When slub_debug=O is set. It is possible to clear debug flags
> for an "unmergeable" slab cache in kmem_cache_open().
> It makes the "unmergeable" cache became "mergeable" in sysfs_slab_add().

Right but that is only if disable_higher_order_debug is set.

> These caches will generate their "unique IDs" by create_unique_id(),
> but it is possible to create identical unique IDs. In my experiment,
> sgpool-128, names_cache, biovec-256 generate the same ID ":Ft-0004096"
> and the kernel reports "sysfs: cannot create duplicate filename
> '/kernel/slab/:Ft-0004096'".

Ok then the aliasing failed for some reason. The creation of the unique id
and the alias detection needs to be in sync otherwise duplicate filenames
are created. What is the difference there?

The clearing of the DEBUG_METADATA_FLAGS looks ok to me. kmem_cache_alias
should do the same right?
kernel test robot Nov. 8, 2017, 3:05 a.m. UTC | #2
Hi Miles,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.14-rc8 next-20171107]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/miles-chen-mediatek-com/slub-Fix-sysfs-duplicate-filename-creation-when-slub_debug-O/20171108-100701
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: i386-randconfig-x001-201745 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   mm/slub.c: In function 'sysfs_slab_add':
>> mm/slub.c:5713:18: error: 'SLAB_NEVER_MERGE' undeclared (first use in this function)
       (slub_debug & SLAB_NEVER_MERGE))
                     ^~~~~~~~~~~~~~~~
   mm/slub.c:5713:18: note: each undeclared identifier is reported only once for each function it appears in

vim +/SLAB_NEVER_MERGE +5713 mm/slub.c

  5697	
  5698	static int sysfs_slab_add(struct kmem_cache *s)
  5699	{
  5700		int err;
  5701		const char *name;
  5702		struct kset *kset = cache_kset(s);
  5703		int unmergeable = slab_unmergeable(s);
  5704	
  5705		INIT_WORK(&s->kobj_remove_work, sysfs_slab_remove_workfn);
  5706	
  5707		if (!kset) {
  5708			kobject_init(&s->kobj, &slab_ktype);
  5709			return 0;
  5710		}
  5711	
  5712		if (!unmergeable && disable_higher_order_debug &&
> 5713				(slub_debug & SLAB_NEVER_MERGE))
  5714			unmergeable = 1;
  5715	
  5716		if (unmergeable) {
  5717			/*
  5718			 * Slabcache can never be merged so we can use the name proper.
  5719			 * This is typically the case for debug situations. In that
  5720			 * case we can catch duplicate names easily.
  5721			 */
  5722			sysfs_remove_link(&slab_kset->kobj, s->name);
  5723			name = s->name;
  5724		} else {
  5725			/*
  5726			 * Create a unique name for the slab as a target
  5727			 * for the symlinks.
  5728			 */
  5729			name = create_unique_id(s);
  5730		}
  5731	
  5732		s->kobj.kset = kset;
  5733		err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name);
  5734		if (err)
  5735			goto out;
  5736	
  5737		err = sysfs_create_group(&s->kobj, &slab_attr_group);
  5738		if (err)
  5739			goto out_del_kobj;
  5740	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Miles Chen Nov. 8, 2017, 5:32 a.m. UTC | #3
On Tue, 2017-11-07 at 09:22 -0600, Christopher Lameter wrote:
> On Tue, 7 Nov 2017, miles.chen@mediatek.com wrote:
> 
> > When slub_debug=O is set. It is possible to clear debug flags
> > for an "unmergeable" slab cache in kmem_cache_open().
> > It makes the "unmergeable" cache became "mergeable" in sysfs_slab_add().
> 
> Right but that is only if disable_higher_order_debug is set.

yes

> 
> > These caches will generate their "unique IDs" by create_unique_id(),
> > but it is possible to create identical unique IDs. In my experiment,
> > sgpool-128, names_cache, biovec-256 generate the same ID ":Ft-0004096"
> > and the kernel reports "sysfs: cannot create duplicate filename
> > '/kernel/slab/:Ft-0004096'".
> 
> Ok then the aliasing failed for some reason. The creation of the unique id
> and the alias detection needs to be in sync otherwise duplicate filenames
> are created. What is the difference there?

The aliasing failed because find_mergeable() returns if (flags &
SLAB_NEVER_MERGE) is true. So we do not go to search for alias caches.

__kmem_cache_alias()
  find_mergeable()
    kmem_cache_flags()  --> setup flag by the slub_debug
    if (flags & SLAB_NEVER_MERGE) return NULL;
    ...
    search alias logic...
    

The flags maybe changed if disable_higher_order_debug=1. So the
unmergeable cache becomes mergeable later.

> 
> The clearing of the DEBUG_METADATA_FLAGS looks ok to me. kmem_cache_alias
> should do the same right?
> 
Yes, I think clearing DEBUG_METADATA flags in kmem_cache_alias is
another solution for this issue.

We will need to do calculate_sizes() by using original flags and compare
the order of s->size and s->object_size when
disable_higher_order_debug=1.
Christoph Lameter (Ampere) Nov. 8, 2017, 3:05 p.m. UTC | #4
On Wed, 8 Nov 2017, Miles Chen wrote:

> > Ok then the aliasing failed for some reason. The creation of the unique id
> > and the alias detection needs to be in sync otherwise duplicate filenames
> > are created. What is the difference there?
>
> The aliasing failed because find_mergeable() returns if (flags &
> SLAB_NEVER_MERGE) is true. So we do not go to search for alias caches.
>
> __kmem_cache_alias()
>   find_mergeable()
>     kmem_cache_flags()  --> setup flag by the slub_debug
>     if (flags & SLAB_NEVER_MERGE) return NULL;
>     ...
>     search alias logic...
>
>
> The flags maybe changed if disable_higher_order_debug=1. So the
> unmergeable cache becomes mergeable later.

Ok so make sure taht the aliasing logic also clears those flags before
checking for SLAB_NEVER_MERGE.

> > The clearing of the DEBUG_METADATA_FLAGS looks ok to me. kmem_cache_alias
> > should do the same right?
> >
> Yes, I think clearing DEBUG_METADATA flags in kmem_cache_alias is
> another solution for this issue.
>
> We will need to do calculate_sizes() by using original flags and compare
> the order of s->size and s->object_size when
> disable_higher_order_debug=1.

Hmmm... Or move the aliasing check to a point where we know the size of
the slab objects?
Miles Chen Nov. 9, 2017, 8:52 a.m. UTC | #5
On Wed, 2017-11-08 at 09:05 -0600, Christopher Lameter wrote:
> On Wed, 8 Nov 2017, Miles Chen wrote:
> 
> > > Ok then the aliasing failed for some reason. The creation of the unique id
> > > and the alias detection needs to be in sync otherwise duplicate filenames
> > > are created. What is the difference there?
> >
> > The aliasing failed because find_mergeable() returns if (flags &
> > SLAB_NEVER_MERGE) is true. So we do not go to search for alias caches.
> >
> > __kmem_cache_alias()
> >   find_mergeable()
> >     kmem_cache_flags()  --> setup flag by the slub_debug
> >     if (flags & SLAB_NEVER_MERGE) return NULL;
> >     ...
> >     search alias logic...
> >
> >
> > The flags maybe changed if disable_higher_order_debug=1. So the
> > unmergeable cache becomes mergeable later.
> 
> Ok so make sure taht the aliasing logic also clears those flags before
> checking for SLAB_NEVER_MERGE.
> 
> > > The clearing of the DEBUG_METADATA_FLAGS looks ok to me. kmem_cache_alias
> > > should do the same right?
> > >
> > Yes, I think clearing DEBUG_METADATA flags in kmem_cache_alias is
> > another solution for this issue.
> >
> > We will need to do calculate_sizes() by using original flags and compare
> > the order of s->size and s->object_size when
> > disable_higher_order_debug=1.
> 
> Hmmm... Or move the aliasing check to a point where we know the size of
> the slab objects?

The biggest concern is that we may have some merged caches even if we
enable CONFIG_SLUB_DEBUG_ON and slub_debug=O. So a developer cannot say
"I set CONFIG_SLUB_DEBUG_ON=y to stop all slab merging". 
(https://www.spinics.net/lists/linux-mm/msg77919.html)

In this fix patch, it disables slab merging if SLUB_DEBUG=O and
CONFIG_SLUB_DEBUG_ON=y but the debug features are disabled by the
disable_higher_order_debug logic and it holds the "slab merging is off
if any debug features are enabled" behavior.
Christoph Lameter (Ampere) Nov. 9, 2017, 3:49 p.m. UTC | #6
On Thu, 9 Nov 2017, Miles Chen wrote:

> In this fix patch, it disables slab merging if SLUB_DEBUG=O and
> CONFIG_SLUB_DEBUG_ON=y but the debug features are disabled by the
> disable_higher_order_debug logic and it holds the "slab merging is off
> if any debug features are enabled" behavior.

Sounds good. Where is the patch?
diff mbox

Patch

diff --git a/mm/slub.c b/mm/slub.c
index 1efbb812..8cbf9f7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5704,6 +5704,10 @@  static int sysfs_slab_add(struct kmem_cache *s)
 		return 0;
 	}
 
+	if (!unmergeable && disable_higher_order_debug &&
+			(slub_debug & SLAB_NEVER_MERGE))
+		unmergeable = 1;
+
 	if (unmergeable) {
 		/*
 		 * Slabcache can never be merged so we can use the name proper.