From patchwork Mon Oct 21 09:14:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hyeonggon Yoo <42.hyeyoo@gmail.com> X-Patchwork-Id: 13843745 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0784D0E6D1 for ; Mon, 21 Oct 2024 09:14:33 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2DE9B6B0089; Mon, 21 Oct 2024 05:14:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 28F7F6B008A; Mon, 21 Oct 2024 05:14:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1569C6B008C; Mon, 21 Oct 2024 05:14:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id E9C0C6B0089 for ; Mon, 21 Oct 2024 05:14:32 -0400 (EDT) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id DE76BA17A6 for ; Mon, 21 Oct 2024 09:14:04 +0000 (UTC) X-FDA: 82697048112.13.6679336 Received: from mail-pj1-f51.google.com (mail-pj1-f51.google.com [209.85.216.51]) by imf20.hostedemail.com (Postfix) with ESMTP id D118F1C0012 for ; Mon, 21 Oct 2024 09:14:12 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=joqxzmVW; spf=pass (imf20.hostedemail.com: domain of 42.hyeyoo@gmail.com designates 209.85.216.51 as permitted sender) smtp.mailfrom=42.hyeyoo@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729501922; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:references:dkim-signature; bh=oNXF8iO7rrb814cpiWg8zr4M7B5ll1DXx6/IM3ktaEM=; b=BnK1haGybVvGenbWZP2e4C0Ls00cIkU9rjSpgN9R118ugI7fquqqpATE5BhVVyTHRizdFU Sea5nb+iipkpYcJ0Bw3IKa58IQ32XRylWCaIg1iKVawduFPLFqylo+oK8vtobydoj8fvGx X0vSR1R5azVQagOQGVONHISSQ/Ai4no= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729501922; a=rsa-sha256; cv=none; b=JsViZraMDuTVRxXA14YPboeXRiVugCe/1/R1sLhyIv1rRviPEEFA3BHwcqWtWfAp21QGTJ sHfMf9DQ1ZKwlVFD7e48pbP3YTxu2AFAwIXvxBtL/tw7peX0qwkfDn5T3z1F5qOKurNVqV K6+aDoikLjxUDOBbzm2gCF3dAL/EtVY= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=joqxzmVW; spf=pass (imf20.hostedemail.com: domain of 42.hyeyoo@gmail.com designates 209.85.216.51 as permitted sender) smtp.mailfrom=42.hyeyoo@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-pj1-f51.google.com with SMTP id 98e67ed59e1d1-2e2bd347124so3064355a91.1 for ; Mon, 21 Oct 2024 02:14:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1729502069; x=1730106869; darn=kvack.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=oNXF8iO7rrb814cpiWg8zr4M7B5ll1DXx6/IM3ktaEM=; b=joqxzmVW6tb0/5lwA6YLSVQSVhIWQL5mjSceJT0DoCqJ7KFKvge8krTCLkLl0m76xa ZmACcx/l8Zo1AA3cViW/gB8elayWnp++CIoz94s51QSNyaFaUaIxV0USdl3K12VWKD91 ys5n4la3ow9ENYcunx52MQ4ScHxT9tcggUz15Q5Zwx1PHNhaXVW2P12enbYKZnV4vzZs VLr4A2311xQPVxiw3QY5WTr0X1CzPwsE0g4X+s3bG1l/eSvbGuVD4hJJ15vP8wnULqa7 0O1qzmRZMExf4Oxfmi5fVMojOMXGVdq49jVdRqTAti7Z2txK2uKvX1MnT4NMlxaqsCC2 GU1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729502069; x=1730106869; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=oNXF8iO7rrb814cpiWg8zr4M7B5ll1DXx6/IM3ktaEM=; b=qrYwGBOTXkEZF8lqn28TkSsdub8VjdZOvWKpWJLRQWyaP7m5PGv3MSEShoBL0ntmom WPLYiyUz0QYcCUpYqMgumK8Gx2CxPMJUqktLM9DbF6LPu2linFiJnwVSKUMYCgTsIBRs lfBb4Q7XQddOa1Y908/zkRirBIITTZ9cEO9euJzQztpaGQzggM63f5G99i0ZDk29sUuV TEZYTNkRWJKzfx0tWNq8gQeeJIZqYN7CHA/rkTcBF8yFiCbcNcwLnnw4aiXVi+Tx+dT4 AKaqqs/gZDPHDiNGOi5Liti9oEbk1sSw+2VMETQ8+3K0LLkI0QjHee/yRRm0dg7FfiCf dxkw== X-Forwarded-Encrypted: i=1; AJvYcCXBW9G2sSIP0mKXUi0Q2t1onPNUG47jvODT0fX7jeU+IyobBLuolREXu52xe9wHJmEGowMit4foYA==@kvack.org X-Gm-Message-State: AOJu0YzG9IeAO+s45bzTKkALJHsZcvsnLxt8Hjr8hcN5pTMldzQcQAV9 IBZd7DaDi9hzjiQIzbNev2sKYDHTCMhfGuBj6QgTFxLFCPatJeg+ X-Google-Smtp-Source: AGHT+IFux1qohWhbH0Z3qON7yq//Imc+lXrs9R/yd1Ue7DKdBqwoHI/d2B3IHEkLKScvgKSCDrYE+A== X-Received: by 2002:a05:6a21:6da6:b0:1d9:386d:a0ec with SMTP id adf61e73a8af0-1d9386da0femr13481485637.2.1729502069053; Mon, 21 Oct 2024 02:14:29 -0700 (PDT) Received: from fedora.. ([1.245.180.67]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-71ec13ea1f3sm2466327b3a.147.2024.10.21.02.14.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Oct 2024 02:14:28 -0700 (PDT) From: Hyeonggon Yoo <42.hyeyoo@gmail.com> To: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Vlastimil Babka , Roman Gushchin Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org Subject: [RFC PATCH] mm/slab: fix a memory leak on kobject_init_and_add() failure Date: Mon, 21 Oct 2024 18:14:13 +0900 Message-ID: <20241021091413.154775-1-42.hyeyoo@gmail.com> X-Mailer: git-send-email 2.45.0 MIME-Version: 1.0 X-Rspamd-Queue-Id: D118F1C0012 X-Stat-Signature: bgbnng1ht4d8mafppahybhh65ji1ihaz X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1729502052-764917 X-HE-Meta: U2FsdGVkX1/+ELI4Q1oAM6j56FcnXEnaHJ/2heLgPMbyfM3t6CRwe+7wmRPYeXULu/iTbB64n83ZwZ1kvSbUEJGZlClS/UkvlDMUO0gL6obgFnQdKVOr+zrcr/nrEYZLCN1fNuDxbUnOpXik+36Mg6gMSYOdGOnyQp/2cWeNEPjnGKLncAg/Z03M4hTygPsOXllgildgJEWrw+hVAE9xg99u/rlMJLsD137s3H5L2CnZxgvinvUf5uvckEM5NJAU7w0yTsKD3HkITl2Jkh5NFwmHq0tqpiFsbOTgtkhOoGEFSy+hob7HVnI3SophPWzZHjQZbbmMx0iBsZUhaCbhbyMiFVG8GigaxlIZz0A0F3Mag9WjwqDWtEsA4EiG/VOrOpB1on6a+flDnB/2s7LRnxuRG+wPw3I9xRYN/3aHRim6rohpgFv8fWfaxc0MV1KAgLgTB9+kLrnFZwPCg7cTYCSe+sp2kb3bWzg/emteApUplY7EZJZ0Cygw8ji2oSceHU3y2gjHqIk5CxGocjeijSPjb0RnnrV3VzoMRff0Zxs/m0/xxeDQsy469FN28w3GYKEUcrc3UFdycCOEHPR0QVhmBhDQzDjAEMvPw87sxuBeB7JeTiClbtwrlGnW9PCqrnwo3Qz+usnxJa39S3bKw/gqfpgQEzuPFh6Iig6mYspHvdiKO2QMH6/Mvd73jeXTwpOTYARIAjnNjn2J52WKArXKdyyO5uR9DzTIN0rFUJrYLaypSvUMJqoRVHG55cKvlBu/Tv8yBLd4BEZYqOaIKF3fmO4eDQXwE4TmgWLGOGoLaAb3J+2aUbwZ4Btt9Cv3Ni4aojMYyzqiJHIw78qiB7FbM0ZF7EOkNcHKJGntWyNtqNAcUMpFFOSLX8I8Qz4WTbB4thK6/0dNr/1awZEfoEBxPS0ZEt0yUAYIFglcDZ0W1LMkSvLHfam9H5ydGVNlSnO7ylKsNnO1bBC8ziz Z/3eUhxZ obzhXLpy+Fxtb+ui2ooJ8NI18lQnvRgljHXY8W1o8GHtxsxQNs7Ovjb1iTCmvchEv1s+fUOWizs2KJvP1icppakN1q/6NGxCp47nteEwkThx5skmb01vTOYz9TkkEmQcIKLwOwyTz1io+CMnjvmWZBK1tr7UL1fCQxYPY1wEgImmOBt+O+87nJFUyuX4ApiOt0FnkoHJKeYJxDvAD7FU+ysxrowEIkSBK26OEHXAHWLXc49ris8ftmx/oIRieC6drepSIViwwEsT6mkmmtwASNZkDKWZukB7UqhwSstBpsLQBiK9Zm9nPIXv8oaMFKCeTdjvkw8MbPBqbQd5GlUJOrziX4vxovi5ySk9bbNWzAgodyUxboPO7HvkRyyhfy3cNSiKUpL9/dRIAG2zOugY7cvwCiqNMP1TeZs4SVuvGDQWH1Q+UQBrxj8s5BPbuB4m9GHR9D31sD4VySobyfWI83sih+gvZIkAkGJCq5KeSLzfS8gesejBRmXVvS9BzcODsP8kgnjvp/p4zjy+linqa//X0zw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: When kobject_init_and_add() fails during cache creation, kobj->name can be leaked because SLUB does not call kobject_put(), which should have been invoked per the kobject API documentation. It has a bit of historical context, though; SLUB does not call kobject_put() to avoid double-free for struct kmem_cache because 1) simply calling it would free all resources related to the cache, and 2) struct kmem_cache descriptor is always freed by cache_cache()'s error handling path, causing struct kmem_cache to be freed twice. This issue can be reproduced by creating new slab caches while applying failslab for kernfs_node_cache. This makes kobject_add_varg() succeed, but causes kobject_add_internal() to fail in kobject_init_and_add() during cache creation. Historically, this issue has attracted developers' attention several times. Each time a fix addressed either the leak or the double-free, it caused the other issue. Let's summarize a bit of history here: The leak has existed since the early days of SLUB. Commit 54b6a731025f ("slub: fix leak of 'name' in sysfs_slab_add") introduced a double-free bug while fixing the leak. Commit 80da026a8e5d ("mm/slub: fix slab double-free in case of duplicate sysfs filename") re-introduced the leak while fixing the double-free error. Commit dde3c6b72a16 ("mm/slub: fix a memory leak in sysfs_slab_add()") fixed the memory leak, but it was later reverted by commit 757fed1d0898 ("Revert "mm/slub: fix a memory leak in sysfs_slab_add()"") to avoid the double-free error. This is where we are now: we've chosen a memory leak over a double-free. So, how can this be fixed? Let's change the logic this way: 1. Move cache name allocation from __kmem_cache_create_args() to do_kmem_cache_create() and free all resources associated with the cache on error in do_kmem_cache_create() to simplify error handling. 2. In do_kmem_cache_create(), assume all resources are already freed upon sysfs_slab_add() failure and do not take the error handling path. One more thing to address when calling kobject_put() in sysfs_slab_add() on kobject_init_and_add() failure is that it shouldn't destroy caches that were already created during the boot process just because it failed to create sysfs files. To avoid that, use a no-op function as the release function if SLUB is trying to create sysfs files for those early caches. This ensures that resources allocated by the kobject infrastructure are properly freed on error while keeping the early caches active. Note that using a no-op function as the release function is not recommended by the kobject API document, but I can't come up with a better approach here. Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> --- mm/slab_common.c | 17 +++------------- mm/slub.c | 50 ++++++++++++++++++++++++++++++++++++++---------- 2 files changed, 43 insertions(+), 24 deletions(-) diff --git a/mm/slab_common.c b/mm/slab_common.c index 62878edb0a81..d4ccd10f3098 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -236,14 +236,12 @@ static struct kmem_cache *create_cache(const char *name, goto out; err = do_kmem_cache_create(s, name, object_size, args, flags); if (err) - goto out_free_cache; + goto out; s->refcount = 1; list_add(&s->list, &slab_caches); return s; -out_free_cache: - kmem_cache_free(kmem_cache, s); out: return ERR_PTR(err); } @@ -281,7 +279,6 @@ struct kmem_cache *__kmem_cache_create_args(const char *name, slab_flags_t flags) { struct kmem_cache *s = NULL; - const char *cache_name; int err; #ifdef CONFIG_SLUB_DEBUG @@ -332,18 +329,10 @@ struct kmem_cache *__kmem_cache_create_args(const char *name, if (s) goto out_unlock; - cache_name = kstrdup_const(name, GFP_KERNEL); - if (!cache_name) { - err = -ENOMEM; - goto out_unlock; - } - args->align = calculate_alignment(flags, args->align, object_size); - s = create_cache(cache_name, object_size, args, flags); - if (IS_ERR(s)) { + s = create_cache(name, object_size, args, flags); + if (IS_ERR(s)) err = PTR_ERR(s); - kfree_const(cache_name); - } out_unlock: mutex_unlock(&slab_mutex); diff --git a/mm/slub.c b/mm/slub.c index c95ac26dad07..1daf4de8dc33 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -330,10 +330,11 @@ struct track { enum track_item { TRACK_ALLOC, TRACK_FREE }; #ifdef SLAB_SUPPORTS_SYSFS -static int sysfs_slab_add(struct kmem_cache *); +static int sysfs_slab_add(struct kmem_cache *, bool); static int sysfs_slab_alias(struct kmem_cache *, const char *); #else -static inline int sysfs_slab_add(struct kmem_cache *s) { return 0; } +static inline int sysfs_slab_add(struct kmem_cache *s, bool sysfs_lazy) + { return 0; } static inline int sysfs_slab_alias(struct kmem_cache *s, const char *p) { return 0; } #endif @@ -6137,7 +6138,12 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name, { int err = -EINVAL; - s->name = name; + s->name = kstrdup_const(name, GFP_KERNEL); + if (!s->name) { + err = -ENOMEM; + goto out; + } + s->size = s->object_size = size; s->flags = kmem_cache_flags(flags, s->name); @@ -6204,16 +6210,20 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name, goto out; } - err = sysfs_slab_add(s); + err = sysfs_slab_add(s, false); if (err) - goto out; + /* Resources are already freed on sysfs_slab_add() failure */ + return err; if (s->flags & SLAB_STORE_USER) debugfs_slab_add(s); out: - if (err) + if (err) { __kmem_cache_release(s); + kfree_const(s->name); + kmem_cache_free(kmem_cache, s); + } return err; } @@ -7162,11 +7172,22 @@ static void kmem_cache_release(struct kobject *k) slab_kmem_cache_release(to_slab(k)); } +static void kmem_cache_sysfs_lazy_release(struct kobject *k) { } + static const struct sysfs_ops slab_sysfs_ops = { .show = slab_attr_show, .store = slab_attr_store, }; +/* + * Early slab caches shouldn't be destroyed just because SLUB failed to create + * sysfs files. use a no-op function as .release function. + */ +static const struct kobj_type slab_sysfs_lazy_ktype = { + .sysfs_ops = &slab_sysfs_ops, + .release = kmem_cache_sysfs_lazy_release, +}; + static const struct kobj_type slab_ktype = { .sysfs_ops = &slab_sysfs_ops, .release = kmem_cache_release, @@ -7223,12 +7244,13 @@ static char *create_unique_id(struct kmem_cache *s) return name; } -static int sysfs_slab_add(struct kmem_cache *s) +static int sysfs_slab_add(struct kmem_cache *s, bool sysfs_lazy) { int err; const char *name; struct kset *kset = cache_kset(s); int unmergeable = slab_unmergeable(s); + const struct kobj_type *ktype; if (!unmergeable && disable_higher_order_debug && (slub_debug & DEBUG_METADATA_FLAGS)) @@ -7253,9 +7275,14 @@ static int sysfs_slab_add(struct kmem_cache *s) } s->kobj.kset = kset; - err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name); + if (sysfs_lazy) + ktype = &slab_sysfs_lazy_ktype; + else + ktype = &slab_ktype; + err = kobject_init_and_add(&s->kobj, ktype, NULL, "%s", name); + if (err) - goto out; + goto out_put_kobj; err = sysfs_create_group(&s->kobj, &slab_attr_group); if (err) @@ -7269,8 +7296,11 @@ static int sysfs_slab_add(struct kmem_cache *s) if (!unmergeable) kfree(name); return err; + out_del_kobj: kobject_del(&s->kobj); +out_put_kobj: + kobject_put(&s->kobj); goto out; } @@ -7337,7 +7367,7 @@ static int __init slab_sysfs_init(void) slab_state = FULL; list_for_each_entry(s, &slab_caches, list) { - err = sysfs_slab_add(s); + err = sysfs_slab_add(s, true); if (err) pr_err("SLUB: Unable to add boot slab %s to sysfs\n", s->name);