From patchwork Wed Aug 7 10:31:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vlastimil Babka X-Patchwork-Id: 13756118 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 797D91DF680; Wed, 7 Aug 2024 10:31:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723026707; cv=none; b=Xrs6/j3nzUZU24kYCNVsZ7ZlG9MZMqNjBMW7mQvJOrUam/WpLm/tCJOj59mR1p1ELJWeNrfuHcsayGeO6HFEl5yGvG909oEmxyHf1W7sZTcDRl/nrBfcMq29dU6HOAw95bcFcKXi6Ft9KVyj7YcvzcYgBE5ZTC/CZPfvhocvdvE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723026707; c=relaxed/simple; bh=qqSyR3F36zg6ItnZnkJHnykEHqCUkuxH66SdYCyKjmM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=td4+7ei5JP0wYca84cF9p0YDI+TfUjYAxt/MF5rTm+n4AZpJU4gsbdI9rVl5GNOrSRZYqmqgj72kzRdttDDX6/3+MlmwvMdttPN8FhH+rlfr9MXkOTjDAV+NpfrJPviwqutIbmbEcXqKOzQoTPMumqCVv9erlZqBNd+4IeK4D1o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz; spf=pass smtp.mailfrom=suse.cz; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=jpwuLCpP; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=FR81OezO; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b=jpwuLCpP; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b=FR81OezO; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.cz Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="jpwuLCpP"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="FR81OezO"; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.b="jpwuLCpP"; dkim=permerror (0-bit key) header.d=suse.cz header.i=@suse.cz header.b="FR81OezO" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 6A06C21CF4; Wed, 7 Aug 2024 10:31:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1723026694; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j2GOiA9T7Jb2ihHQdRICUWrhL/KNdHwLnom7wJXeQZ0=; b=jpwuLCpPuvSyiqyVN47EiRV73g9FPVh5mrTmQ+obDDW+us1CYub59oQma/q10TbxCfFVXc TGxKtRPpf5FiiLkhBYov+vP8ijkpHtOvsxJaqq/MT3gRodsm9+vm6fCa+vEhEBOL2k5Qa5 Xv2h7HzZgIDk7h1nE3KCRiNdfdGZ31E= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1723026694; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j2GOiA9T7Jb2ihHQdRICUWrhL/KNdHwLnom7wJXeQZ0=; b=FR81OezOYQfW7NBMp5RYhI9AcpsQcoUxO6KI+QsVLzVsKMxv/hh8vLbnigIQJHnif1mmXK QemlZIN1J5GkqrDA== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=jpwuLCpP; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=FR81OezO DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1723026694; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j2GOiA9T7Jb2ihHQdRICUWrhL/KNdHwLnom7wJXeQZ0=; b=jpwuLCpPuvSyiqyVN47EiRV73g9FPVh5mrTmQ+obDDW+us1CYub59oQma/q10TbxCfFVXc TGxKtRPpf5FiiLkhBYov+vP8ijkpHtOvsxJaqq/MT3gRodsm9+vm6fCa+vEhEBOL2k5Qa5 Xv2h7HzZgIDk7h1nE3KCRiNdfdGZ31E= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1723026694; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j2GOiA9T7Jb2ihHQdRICUWrhL/KNdHwLnom7wJXeQZ0=; b=FR81OezOYQfW7NBMp5RYhI9AcpsQcoUxO6KI+QsVLzVsKMxv/hh8vLbnigIQJHnif1mmXK QemlZIN1J5GkqrDA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 3CBD413B08; Wed, 7 Aug 2024 10:31:34 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id IOqMDgZNs2YsHwAAD6G6ig (envelope-from ); Wed, 07 Aug 2024 10:31:34 +0000 From: Vlastimil Babka Date: Wed, 07 Aug 2024 12:31:18 +0200 Subject: [PATCH v2 5/7] rcu/kvfree: Add kvfree_rcu_barrier() API Precedence: bulk X-Mailing-List: rcu@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240807-b4-slab-kfree_rcu-destroy-v2-5-ea79102f428c@suse.cz> References: <20240807-b4-slab-kfree_rcu-destroy-v2-0-ea79102f428c@suse.cz> In-Reply-To: <20240807-b4-slab-kfree_rcu-destroy-v2-0-ea79102f428c@suse.cz> To: "Paul E. McKenney" , Joel Fernandes , Josh Triplett , Boqun Feng , Christoph Lameter , David Rientjes Cc: Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Zqiang , Julia Lawall , Jakub Kicinski , "Jason A. Donenfeld" , "Uladzislau Rezki (Sony)" , Andrew Morton , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org, Alexander Potapenko , Marco Elver , Dmitry Vyukov , kasan-dev@googlegroups.com, Jann Horn , Mateusz Guzik , Vlastimil Babka X-Mailer: b4 0.14.1 X-Spam-Level: X-Rspamd-Action: no action X-Spam-Score: -3.01 X-Spam-Flag: NO X-Rspamd-Queue-Id: 6A06C21CF4 X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; SUSPICIOUS_RECIPS(1.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FREEMAIL_ENVRCPT(0.00)[gmail.com]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FREEMAIL_TO(0.00)[kernel.org,joelfernandes.org,joshtriplett.org,gmail.com,linux.com,google.com]; FUZZY_BLOCKED(0.00)[rspamd.com]; MIME_TRACE(0.00)[0:+]; ARC_NA(0.00)[]; FREEMAIL_CC(0.00)[goodmis.org,efficios.com,gmail.com,inria.fr,kernel.org,zx2c4.com,linux-foundation.org,linux.dev,kvack.org,vger.kernel.org,google.com,googlegroups.com,suse.cz]; RCVD_TLS_ALL(0.00)[]; DKIM_TRACE(0.00)[suse.cz:+]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_TWELVE(0.00)[27]; DNSWL_BLOCKED(0.00)[2a07:de40:b281:104:10:150:64:97:from]; TAGGED_RCPT(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; R_RATELIMIT(0.00)[to_ip_from(RLsm9p66qmnckghmjmpccdnq6s)]; RCVD_VIA_SMTP_AUTH(0.00)[]; TO_DN_SOME(0.00)[] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org From: "Uladzislau Rezki (Sony)" Add a kvfree_rcu_barrier() function. It waits until all in-flight pointers are freed over RCU machinery. It does not wait any GP completion and it is within its right to return immediately if there are no outstanding pointers. This function is useful when there is a need to guarantee that a memory is fully freed before destroying memory caches. For example, during unloading a kernel module. Signed-off-by: Uladzislau Rezki (Sony) Signed-off-by: Vlastimil Babka --- include/linux/rcutiny.h | 5 +++ include/linux/rcutree.h | 1 + kernel/rcu/tree.c | 103 ++++++++++++++++++++++++++++++++++++++++++++---- 3 files changed, 101 insertions(+), 8 deletions(-) diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index d9ac7b136aea..522123050ff8 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -111,6 +111,11 @@ static inline void __kvfree_call_rcu(struct rcu_head *head, void *ptr) kvfree(ptr); } +static inline void kvfree_rcu_barrier(void) +{ + rcu_barrier(); +} + #ifdef CONFIG_KASAN_GENERIC void kvfree_call_rcu(struct rcu_head *head, void *ptr); #else diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 254244202ea9..58e7db80f3a8 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -35,6 +35,7 @@ static inline void rcu_virt_note_context_switch(void) void synchronize_rcu_expedited(void); void kvfree_call_rcu(struct rcu_head *head, void *ptr); +void kvfree_rcu_barrier(void); void rcu_barrier(void); void rcu_momentary_dyntick_idle(void); diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e641cc681901..ebcfed9b570e 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -3584,18 +3584,15 @@ kvfree_rcu_drain_ready(struct kfree_rcu_cpu *krcp) } /* - * This function is invoked after the KFREE_DRAIN_JIFFIES timeout. + * Return: %true if a work is queued, %false otherwise. */ -static void kfree_rcu_monitor(struct work_struct *work) +static bool +kvfree_rcu_queue_batch(struct kfree_rcu_cpu *krcp) { - struct kfree_rcu_cpu *krcp = container_of(work, - struct kfree_rcu_cpu, monitor_work.work); unsigned long flags; + bool queued = false; int i, j; - // Drain ready for reclaim. - kvfree_rcu_drain_ready(krcp); - raw_spin_lock_irqsave(&krcp->lock, flags); // Attempt to start a new batch. @@ -3634,11 +3631,27 @@ static void kfree_rcu_monitor(struct work_struct *work) // be that the work is in the pending state when // channels have been detached following by each // other. - queue_rcu_work(system_wq, &krwp->rcu_work); + queued = queue_rcu_work(system_wq, &krwp->rcu_work); } } raw_spin_unlock_irqrestore(&krcp->lock, flags); + return queued; +} + +/* + * This function is invoked after the KFREE_DRAIN_JIFFIES timeout. + */ +static void kfree_rcu_monitor(struct work_struct *work) +{ + struct kfree_rcu_cpu *krcp = container_of(work, + struct kfree_rcu_cpu, monitor_work.work); + + // Drain ready for reclaim. + kvfree_rcu_drain_ready(krcp); + + // Queue a batch for a rest. + kvfree_rcu_queue_batch(krcp); // If there is nothing to detach, it means that our job is // successfully done here. In case of having at least one @@ -3859,6 +3872,80 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr) } EXPORT_SYMBOL_GPL(kvfree_call_rcu); +/** + * kvfree_rcu_barrier - Wait until all in-flight kvfree_rcu() complete. + * + * Note that a single argument of kvfree_rcu() call has a slow path that + * triggers synchronize_rcu() following by freeing a pointer. It is done + * before the return from the function. Therefore for any single-argument + * call that will result in a kfree() to a cache that is to be destroyed + * during module exit, it is developer's responsibility to ensure that all + * such calls have returned before the call to kmem_cache_destroy(). + */ +void kvfree_rcu_barrier(void) +{ + struct kfree_rcu_cpu_work *krwp; + struct kfree_rcu_cpu *krcp; + bool queued; + int i, cpu; + + /* + * Firstly we detach objects and queue them over an RCU-batch + * for all CPUs. Finally queued works are flushed for each CPU. + * + * Please note. If there are outstanding batches for a particular + * CPU, those have to be finished first following by queuing a new. + */ + for_each_possible_cpu(cpu) { + krcp = per_cpu_ptr(&krc, cpu); + + /* + * Check if this CPU has any objects which have been queued for a + * new GP completion. If not(means nothing to detach), we are done + * with it. If any batch is pending/running for this "krcp", below + * per-cpu flush_rcu_work() waits its completion(see last step). + */ + if (!need_offload_krc(krcp)) + continue; + + while (1) { + /* + * If we are not able to queue a new RCU work it means: + * - batches for this CPU are still in flight which should + * be flushed first and then repeat; + * - no objects to detach, because of concurrency. + */ + queued = kvfree_rcu_queue_batch(krcp); + + /* + * Bail out, if there is no need to offload this "krcp" + * anymore. As noted earlier it can run concurrently. + */ + if (queued || !need_offload_krc(krcp)) + break; + + /* There are ongoing batches. */ + for (i = 0; i < KFREE_N_BATCHES; i++) { + krwp = &(krcp->krw_arr[i]); + flush_rcu_work(&krwp->rcu_work); + } + } + } + + /* + * Now we guarantee that all objects are flushed. + */ + for_each_possible_cpu(cpu) { + krcp = per_cpu_ptr(&krc, cpu); + + for (i = 0; i < KFREE_N_BATCHES; i++) { + krwp = &(krcp->krw_arr[i]); + flush_rcu_work(&krwp->rcu_work); + } + } +} +EXPORT_SYMBOL_GPL(kvfree_rcu_barrier); + static unsigned long kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) {