Message ID | 20240804212839.685795-1-pedro.falcato@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | slab: Error out on duplicate cache names when DEBUG_VM=y | expand |
On Sun, Aug 4, 2024 at 10:28 PM Pedro Falcato <pedro.falcato@gmail.com> wrote: > > Duplicate slab cache names can create havoc for userspace tooling that > expects slab cache names to be unique. This is a reasonable expectation. For completeness, I just had a look at duplicate cache names around the kernel using git grep -Eoh "kmem_cache_create.*\"," | grep -Eo \".*\" | uniq -d (which seems to be correct) which results in the following patch (on top of torvalds/linux.git master, so file_lock_cache hasn't been fixed yet) This patch being so small is what leads me to believe that erroring out here is safe. Of course, no one knows what the out-of-tree modules do. diff --git a/drivers/scsi/snic/snic_main.c b/drivers/scsi/snic/snic_main.c index cc824dcfe7d..abc78320c66 100644 --- a/drivers/scsi/snic/snic_main.c +++ b/drivers/scsi/snic/snic_main.c @@ -873,7 +873,7 @@ snic_global_data_init(void) snic_glob->req_cache[SNIC_REQ_CACHE_MAX_SGL] = cachep; len = sizeof(struct snic_host_req); - cachep = kmem_cache_create("snic_req_maxsgl", len, SNIC_SG_DESC_ALIGN, + cachep = kmem_cache_create("snic_req_tm", len, SNIC_SG_DESC_ALIGN, SLAB_HWCACHE_ALIGN, NULL); if (!cachep) { SNIC_ERR("Failed to create snic tm req slab\n"); diff --git a/fs/locks.c b/fs/locks.c index 9afb16e0683..e45cad40f8b 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -2984,7 +2984,7 @@ static int __init filelock_init(void) filelock_cache = kmem_cache_create("file_lock_cache", sizeof(struct file_lock), 0, SLAB_PANIC, NULL); - filelease_cache = kmem_cache_create("file_lock_cache", + filelease_cache = kmem_cache_create("file_lease_cache", sizeof(struct file_lease), 0, SLAB_PANIC, NULL); for_each_possible_cpu(i) {
On 8/5/24 12:24, Pedro Falcato wrote: > On Sun, Aug 4, 2024 at 10:28 PM Pedro Falcato <pedro.falcato@gmail.com> wrote: >> >> Duplicate slab cache names can create havoc for userspace tooling that >> expects slab cache names to be unique. This is a reasonable expectation. > > For completeness, I just had a look at duplicate cache names around > the kernel using > git grep -Eoh "kmem_cache_create.*\"," | grep -Eo \".*\" | uniq -d > (which seems to be correct) > > which results in the following patch (on top of torvalds/linux.git > master, so file_lock_cache hasn't been fixed yet) > > This patch being so small is what leads me to believe that erroring > out here is safe. Of course, no one knows what the out-of-tree modules > do. What about module unload/reload with a SLAB_TYPESAFE_BY_RCU cache that will delay its freeing. Soon also if there are kfree_rcu()'s in flight. And the zombie cache can stay also permamently around if it fails to be destroy because some objects were not freed. In all these cases the cache's refcount should be 0 at that point, so minimally the check should ignore those. But I would also rather make it be a warning anyway, at least for a few releases. > diff --git a/drivers/scsi/snic/snic_main.c b/drivers/scsi/snic/snic_main.c > index cc824dcfe7d..abc78320c66 100644 > --- a/drivers/scsi/snic/snic_main.c > +++ b/drivers/scsi/snic/snic_main.c > @@ -873,7 +873,7 @@ snic_global_data_init(void) > snic_glob->req_cache[SNIC_REQ_CACHE_MAX_SGL] = cachep; > > len = sizeof(struct snic_host_req); > - cachep = kmem_cache_create("snic_req_maxsgl", len, SNIC_SG_DESC_ALIGN, > + cachep = kmem_cache_create("snic_req_tm", len, SNIC_SG_DESC_ALIGN, > SLAB_HWCACHE_ALIGN, NULL); > if (!cachep) { > SNIC_ERR("Failed to create snic tm req slab\n"); > diff --git a/fs/locks.c b/fs/locks.c > index 9afb16e0683..e45cad40f8b 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -2984,7 +2984,7 @@ static int __init filelock_init(void) > filelock_cache = kmem_cache_create("file_lock_cache", > sizeof(struct file_lock), 0, SLAB_PANIC, NULL); > > - filelease_cache = kmem_cache_create("file_lock_cache", > + filelease_cache = kmem_cache_create("file_lease_cache", > sizeof(struct file_lease), 0, SLAB_PANIC, NULL); > > for_each_possible_cpu(i) {
On Mon, Aug 05, 2024 at 12:38:29PM +0200, Vlastimil Babka wrote: > What about module unload/reload with a SLAB_TYPESAFE_BY_RCU cache that will > delay its freeing. Soon also if there are kfree_rcu()'s in flight. And the > zombie cache can stay also permamently around if it fails to be destroy > because some objects were not freed. > It should be an invariant that the cache is fully whacked by the time kmem_cache_destroy returns, at worst with the exception of when leaked items are encountered (but even then it should be renamed to something indicating it is defunct). Suppose a cache could not have been destroyed and was left as is, then the offending module was loaded again -- now you got two with the same name, which is not that great either. I find myself quite surprised that kmem_cache_destroy can return even if cache destruction is still pending. This was added in 657dc2f97220 ("slab: remove synchronous rcu_barrier() call in memcg cache release path"), citing batching benefits for frequent kmem cache creation/destruction. I believe the very notion of doing that *frequently* is b0rked and any code doing it should be patched to stop regardless. Even so, if there are any benefits to the committed patch, it perhaps can be augmented so that the kmem_cache_destroy caller can wait for the entire thing to finish (e.g., with a completion).
diff --git a/mm/slab_common.c b/mm/slab_common.c index 40b582a014b..c4f31c887f9 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -88,6 +88,19 @@ unsigned int kmem_cache_size(struct kmem_cache *s) EXPORT_SYMBOL(kmem_cache_size); #ifdef CONFIG_DEBUG_VM + +static bool kmem_cache_is_duplicate_name(const char *name) +{ + struct kmem_cache *s; + + list_for_each_entry(s, &slab_caches, list) { + if (!strcmp(s->name, name)) + return true; + } + + return false; +} + static int kmem_cache_sanity_check(const char *name, unsigned int size) { if (!name || in_interrupt() || size > KMALLOC_MAX_SIZE) { @@ -95,6 +108,12 @@ static int kmem_cache_sanity_check(const char *name, unsigned int size) return -EINVAL; } + if (kmem_cache_is_duplicate_name(name)) { + /* Duplicate names will confuse slabtop, et al */ + pr_err("%s: name %s already exists as a cache\n", __func__, name); + return -EINVAL; + } + WARN_ON(strchr(name, ' ')); /* It confuses parsers */ return 0; }
Duplicate slab cache names can create havoc for userspace tooling that expects slab cache names to be unique. This is a reasonable expectation. Link: https://lore.kernel.org/linux-fsdevel/2d1d053da1cafb3e7940c4f25952da4f0af34e38.1722293276.git.osandov@fb.com/ Signed-off-by: Pedro Falcato <pedro.falcato@gmail.com> --- mm/slab_common.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)