From patchwork Thu Aug 24 23:21:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13364858 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E2CBEE49A5 for ; Thu, 24 Aug 2023 23:22:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238161AbjHXXVo (ORCPT ); Thu, 24 Aug 2023 19:21:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36910 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241313AbjHXXVY (ORCPT ); Thu, 24 Aug 2023 19:21:24 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8881E51 for ; Thu, 24 Aug 2023 16:21:21 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4E4C764553 for ; Thu, 24 Aug 2023 23:21:21 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9FD3C433CA; Thu, 24 Aug 2023 23:21:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1692919280; bh=JmMz0JJ+IcRHTtCZglX4Ie4ra2fCEx20OW7Dog+N/sE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=SWNQN7QgQGFhj0s6XgQTiz1XVUS0aoRulf10cucCE9r3VoZbKHZEkmE+M/eXm8vu3 zI5Aaqc5XUrZawsjLd59dwNxVyBnsEU2KuSkKsl6Kx0fkXSOAYf1cjyhZSwnwQyE+Z J6NBB+cmra7Ro1iku1yOmb5CzxglCDltTC5XdcmQe/e36piXUMKJn8jZS5wFMvNNWN gYUf0CkiDXflNFQZXgbf5Or+SiEAJbPE/F3SSE5OFK/IHZuJZoQ3b+hF+PnyZpwm0z y96MyZorBl/YdB2BCF0g1EdjYQUuitxhJTq/8Y3nbqZ/PhdA8sgDi6UlBB9qSvq+im aNHE+c2ZCeCyw== Subject: [PATCH 1/3] xfs: fix per-cpu CIL structure aggregation racing with dying cpus From: "Darrick J. Wong" To: chandan.babu@gmail.com, djwong@kernel.org Cc: tglx@linutronix.de, peterz@infradead.org, ritesh.list@gmail.com, sandeen@sandeen.net, linux-xfs@vger.kernel.org, david@fromorbit.com, ritesh.list@gmail.com, sandeen@sandeen.net Date: Thu, 24 Aug 2023 16:21:20 -0700 Message-ID: <169291928016.219974.17814488726880866494.stgit@frogsfrogsfrogs> In-Reply-To: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> References: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong In commit 7c8ade2121200 ("xfs: implement percpu cil space used calculation"), the XFS committed (log) item list code was converted to use per-cpu lists and space tracking to reduce cpu contention when multiple threads are modifying different parts of the filesystem and hence end up contending on the log structures during transaction commit. Each CPU tracks its own commit items and space usage, and these do not have to be merged into the main CIL until either someone wants to push the CIL items, or we run over a soft threshold and switch to slower (but more accurate) accounting with atomics. Unfortunately, the for_each_cpu iteration suffers from the same race with cpu dying problem that was identified in commit 8b57b11cca88f ("pcpcntrs: fix dying cpu summation race") -- CPUs are removed from cpu_online_mask before the CPUHP_XFS_DEAD callback gets called. As a result, both CIL percpu structure aggregation functions fail to collect the items and accounted space usage at the correct point in time. If we're lucky, the items that are collected from the online cpus exceed the space given to those cpus, and the log immediately shuts down in xlog_cil_insert_items due to the (apparent) log reservation overrun. This happens periodically with generic/650, which exercises cpu hotplug vs. the filesystem code: smpboot: CPU 3 is now offline XFS (sda3): ctx ticket reservation ran out. Need to up reservation XFS (sda3): ticket reservation summary: XFS (sda3): unit res = 9268 bytes XFS (sda3): current res = -40 bytes XFS (sda3): original count = 1 XFS (sda3): remaining count = 1 XFS (sda3): Filesystem has been shut down due to log error (0x2). Applying the same sort of fix from 8b57b11cca88f to the CIL code seems to make the generic/650 problem go away, but I've been told that tglx was not happy when he saw: "...the only thing we actually need to care about is that percpu_counter_sum() iterates dying CPUs. That's trivial to do, and when there are no CPUs dying, it has no addition overhead except for a cpumask_or() operation." The CPU hotplug code is rather complex and difficult to understand and I don't want to try to understand the cpu hotplug locking well enough to use cpu_dying mask. Furthermore, there's a performance improvement that could be had here. Attach a private cpu mask to the CIL structure so that we can track exactly which cpus have accessed the percpu data at all. It doesn't matter if the cpu has since gone offline; log item aggregation will still find the items. Better yet, we skip cpus that have not recently logged anything. Worse yet, Ritesh Harjani and Eric Sandeen both reported today that CPU hot remove racing with an xfs mount can crash if the cpu_dead notifier tries to access the log but the mount hasn't yet set up the log. Link: https://lore.kernel.org/linux-xfs/ZOLzgBOuyWHapOyZ@dread.disaster.area/T/ Link: https://lore.kernel.org/lkml/877cuj1mt1.ffs@tglx/ Link: https://lore.kernel.org/lkml/20230414162755.281993820@linutronix.de/ Link: https://lore.kernel.org/linux-xfs/ZOVkjxWZq0YmjrJu@dread.disaster.area/T/ Cc: tglx@linutronix.de Cc: peterz@infradead.org Reported-by: ritesh.list@gmail.com Reported-by: sandeen@sandeen.net Fixes: af1c2146a50b ("xfs: introduce per-cpu CIL tracking structure") Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_log_cil.c | 50 +++++++++++++++---------------------------------- fs/xfs/xfs_log_priv.h | 14 ++++++-------- fs/xfs/xfs_super.c | 1 - 3 files changed, 21 insertions(+), 44 deletions(-) diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c index eccbfb99e894..6eb767a49188 100644 --- a/fs/xfs/xfs_log_cil.c +++ b/fs/xfs/xfs_log_cil.c @@ -124,7 +124,7 @@ xlog_cil_push_pcp_aggregate( struct xlog_cil_pcp *cilpcp; int cpu; - for_each_online_cpu(cpu) { + for_each_cpu(cpu, &ctx->cil_pcpmask) { cilpcp = per_cpu_ptr(cil->xc_pcp, cpu); ctx->ticket->t_curr_res += cilpcp->space_reserved; @@ -165,7 +165,13 @@ xlog_cil_insert_pcp_aggregate( if (!test_and_clear_bit(XLOG_CIL_PCP_SPACE, &cil->xc_flags)) return; - for_each_online_cpu(cpu) { + /* + * We can race with other cpus setting cil_pcpmask. However, we've + * atomically cleared PCP_SPACE which forces other threads to add to + * the global space used count. cil_pcpmask is a superset of cilpcp + * structures that could have a nonzero space_used. + */ + for_each_cpu(cpu, &ctx->cil_pcpmask) { int old, prev; cilpcp = per_cpu_ptr(cil->xc_pcp, cpu); @@ -554,6 +560,7 @@ xlog_cil_insert_items( int iovhdr_res = 0, split_res = 0, ctx_res = 0; int space_used; int order; + unsigned int cpu_nr; struct xlog_cil_pcp *cilpcp; ASSERT(tp); @@ -577,7 +584,12 @@ xlog_cil_insert_items( * can't be scheduled away between split sample/update operations that * are done without outside locking to serialise them. */ - cilpcp = get_cpu_ptr(cil->xc_pcp); + cpu_nr = get_cpu(); + cilpcp = this_cpu_ptr(cil->xc_pcp); + + /* Tell the future push that there was work added by this CPU. */ + if (!cpumask_test_cpu(cpu_nr, &ctx->cil_pcpmask)) + cpumask_test_and_set_cpu(cpu_nr, &ctx->cil_pcpmask); /* * We need to take the CIL checkpoint unit reservation on the first @@ -1790,38 +1802,6 @@ xlog_cil_force_seq( return 0; } -/* - * Move dead percpu state to the relevant CIL context structures. - * - * We have to lock the CIL context here to ensure that nothing is modifying - * the percpu state, either addition or removal. Both of these are done under - * the CIL context lock, so grabbing that exclusively here will ensure we can - * safely drain the cilpcp for the CPU that is dying. - */ -void -xlog_cil_pcp_dead( - struct xlog *log, - unsigned int cpu) -{ - struct xfs_cil *cil = log->l_cilp; - struct xlog_cil_pcp *cilpcp = per_cpu_ptr(cil->xc_pcp, cpu); - struct xfs_cil_ctx *ctx; - - down_write(&cil->xc_ctx_lock); - ctx = cil->xc_ctx; - if (ctx->ticket) - ctx->ticket->t_curr_res += cilpcp->space_reserved; - cilpcp->space_reserved = 0; - - if (!list_empty(&cilpcp->log_items)) - list_splice_init(&cilpcp->log_items, &ctx->log_items); - if (!list_empty(&cilpcp->busy_extents)) - list_splice_init(&cilpcp->busy_extents, &ctx->busy_extents); - atomic_add(cilpcp->space_used, &ctx->space_used); - cilpcp->space_used = 0; - up_write(&cil->xc_ctx_lock); -} - /* * Perform initial CIL structure initialisation. */ diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h index 1bd2963e8fbd..af87648331d5 100644 --- a/fs/xfs/xfs_log_priv.h +++ b/fs/xfs/xfs_log_priv.h @@ -231,6 +231,12 @@ struct xfs_cil_ctx { struct work_struct discard_endio_work; struct work_struct push_work; atomic_t order_id; + + /* + * CPUs that could have added items to the percpu CIL data. Access is + * coordinated with xc_ctx_lock. + */ + struct cpumask cil_pcpmask; }; /* @@ -278,9 +284,6 @@ struct xfs_cil { wait_queue_head_t xc_push_wait; /* background push throttle */ void __percpu *xc_pcp; /* percpu CIL structures */ -#ifdef CONFIG_HOTPLUG_CPU - struct list_head xc_pcp_list; -#endif } ____cacheline_aligned_in_smp; /* xc_flags bit values */ @@ -705,9 +708,4 @@ xlog_kvmalloc( return p; } -/* - * CIL CPU dead notifier - */ -void xlog_cil_pcp_dead(struct xlog *log, unsigned int cpu); - #endif /* __XFS_LOG_PRIV_H__ */ diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 09638e8fb4ee..ef7775657ce3 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -2313,7 +2313,6 @@ xfs_cpu_dead( list_for_each_entry_safe(mp, n, &xfs_mount_list, m_mount_list) { spin_unlock(&xfs_mount_list_lock); xfs_inodegc_cpu_dead(mp, cpu); - xlog_cil_pcp_dead(mp->m_log, cpu); spin_lock(&xfs_mount_list_lock); } spin_unlock(&xfs_mount_list_lock); From patchwork Thu Aug 24 23:21:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13364857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2EE3C88CB2 for ; Thu, 24 Aug 2023 23:22:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244024AbjHXXVr (ORCPT ); Thu, 24 Aug 2023 19:21:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244039AbjHXXV3 (ORCPT ); Thu, 24 Aug 2023 19:21:29 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6E50DE59 for ; Thu, 24 Aug 2023 16:21:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0D4FB64B19 for ; Thu, 24 Aug 2023 23:21:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6393CC433C9; Thu, 24 Aug 2023 23:21:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1692919286; bh=LxVktdgTRhetM6D25+6aFGSUb4HeBBWzyEUbAWMLKs4=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=MMV7iLlRhoJgxcb5wQ6fNZ5Xnqx2JwqxB7sfC2GmuK15+sdRmQrZ/u+H68Pg6ihSq ndttJ5GCQKeB/iAyI087yaeZUBWTVviFjOZOySMjA32KxKvZZY0tdA5qv2gGy6ZxdW nWL52UeCr7JGGBTBjD0DPI/XWd6mxLnDuSI8bjQwSa84uCIFN040XNGquSTL9CTzKr 2GNToIUUyg+dxUhvhERmEStOqpcfdpQfHUEPTgckr+PdrAFq4XCVo4mO9Il7CzYlty aUus35GH4u1FZcYLq90x+zb3zzBQ2fypuSxr5WYAtC6sdsbPyKpcAfew/SqjpowTQS PMxc08lWT0lyA== Subject: [PATCH 2/3] xfs: use per-mount cpumask to track nonempty percpu inodegc lists From: "Darrick J. Wong" To: chandan.babu@gmail.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, david@fromorbit.com, ritesh.list@gmail.com, sandeen@sandeen.net Date: Thu, 24 Aug 2023 16:21:25 -0700 Message-ID: <169291928586.219974.10915745531517859853.stgit@frogsfrogsfrogs> In-Reply-To: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> References: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Directly track which CPUs have contributed to the inodegc percpu lists instead of trusting the cpu online mask. This eliminates a theoretical problem where the inodegc flush functions might fail to flush a CPU's inodes if that CPU happened to be dying at exactly the same time. Most likely nobody's noticed this because the CPU dead hook moves the percpu inodegc list to another CPU and schedules that worker immediately. But it's quite possible that this is a subtle race leading to UAF if the inodegc flush were part of an unmount. Further benefits: This reduces the overhead of the inodegc flush code slightly by allowing us to ignore CPUs that have empty lists. Better yet, it reduces our dependence on the cpu online masks, which have been the cause of confusion and drama lately. Fixes: ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues") Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_icache.c | 60 +++++++++++---------------------------------------- fs/xfs/xfs_icache.h | 1 - fs/xfs/xfs_mount.h | 6 +++-- fs/xfs/xfs_super.c | 4 +-- 4 files changed, 18 insertions(+), 53 deletions(-) diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index e541f5c0bc25..7fd876e94ecb 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -443,7 +443,7 @@ xfs_inodegc_queue_all( int cpu; bool ret = false; - for_each_online_cpu(cpu) { + for_each_cpu(cpu, &mp->m_inodegc_cpumask) { gc = per_cpu_ptr(mp->m_inodegc, cpu); if (!llist_empty(&gc->list)) { mod_delayed_work_on(cpu, mp->m_inodegc_wq, &gc->work, 0); @@ -463,7 +463,7 @@ xfs_inodegc_wait_all( int error = 0; flush_workqueue(mp->m_inodegc_wq); - for_each_online_cpu(cpu) { + for_each_cpu(cpu, &mp->m_inodegc_cpumask) { struct xfs_inodegc *gc; gc = per_cpu_ptr(mp->m_inodegc, cpu); @@ -1845,10 +1845,12 @@ xfs_inodegc_worker( struct xfs_inodegc, work); struct llist_node *node = llist_del_all(&gc->list); struct xfs_inode *ip, *n; + struct xfs_mount *mp = gc->mp; unsigned int nofs_flag; ASSERT(gc->cpu == smp_processor_id()); + cpumask_test_and_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask); WRITE_ONCE(gc->items, 0); if (!node) @@ -1862,7 +1864,7 @@ xfs_inodegc_worker( nofs_flag = memalloc_nofs_save(); ip = llist_entry(node, struct xfs_inode, i_gclist); - trace_xfs_inodegc_worker(ip->i_mount, READ_ONCE(gc->shrinker_hits)); + trace_xfs_inodegc_worker(mp, READ_ONCE(gc->shrinker_hits)); WRITE_ONCE(gc->shrinker_hits, 0); llist_for_each_entry_safe(ip, n, node, i_gclist) { @@ -2057,6 +2059,7 @@ xfs_inodegc_queue( struct xfs_inodegc *gc; int items; unsigned int shrinker_hits; + unsigned int cpu_nr; unsigned long queue_delay = 1; trace_xfs_inode_set_need_inactive(ip); @@ -2064,12 +2067,16 @@ xfs_inodegc_queue( ip->i_flags |= XFS_NEED_INACTIVE; spin_unlock(&ip->i_flags_lock); - gc = get_cpu_ptr(mp->m_inodegc); + cpu_nr = get_cpu(); + gc = this_cpu_ptr(mp->m_inodegc); llist_add(&ip->i_gclist, &gc->list); items = READ_ONCE(gc->items); WRITE_ONCE(gc->items, items + 1); shrinker_hits = READ_ONCE(gc->shrinker_hits); + if (!cpumask_test_cpu(cpu_nr, &mp->m_inodegc_cpumask)) + cpumask_test_and_set_cpu(cpu_nr, &mp->m_inodegc_cpumask); + /* * We queue the work while holding the current CPU so that the work * is scheduled to run on this CPU. @@ -2093,47 +2100,6 @@ xfs_inodegc_queue( } } -/* - * Fold the dead CPU inodegc queue into the current CPUs queue. - */ -void -xfs_inodegc_cpu_dead( - struct xfs_mount *mp, - unsigned int dead_cpu) -{ - struct xfs_inodegc *dead_gc, *gc; - struct llist_node *first, *last; - unsigned int count = 0; - - dead_gc = per_cpu_ptr(mp->m_inodegc, dead_cpu); - cancel_delayed_work_sync(&dead_gc->work); - - if (llist_empty(&dead_gc->list)) - return; - - first = dead_gc->list.first; - last = first; - while (last->next) { - last = last->next; - count++; - } - dead_gc->list.first = NULL; - dead_gc->items = 0; - - /* Add pending work to current CPU */ - gc = get_cpu_ptr(mp->m_inodegc); - llist_add_batch(first, last, &gc->list); - count += READ_ONCE(gc->items); - WRITE_ONCE(gc->items, count); - - if (xfs_is_inodegc_enabled(mp)) { - trace_xfs_inodegc_queue(mp, __return_address); - mod_delayed_work_on(current_cpu(), mp->m_inodegc_wq, &gc->work, - 0); - } - put_cpu_ptr(gc); -} - /* * We set the inode flag atomically with the radix tree tag. Once we get tag * lookups on the radix tree, this inode flag can go away. @@ -2195,7 +2161,7 @@ xfs_inodegc_shrinker_count( if (!xfs_is_inodegc_enabled(mp)) return 0; - for_each_online_cpu(cpu) { + for_each_cpu(cpu, &mp->m_inodegc_cpumask) { gc = per_cpu_ptr(mp->m_inodegc, cpu); if (!llist_empty(&gc->list)) return XFS_INODEGC_SHRINKER_COUNT; @@ -2220,7 +2186,7 @@ xfs_inodegc_shrinker_scan( trace_xfs_inodegc_shrinker_scan(mp, sc, __return_address); - for_each_online_cpu(cpu) { + for_each_cpu(cpu, &mp->m_inodegc_cpumask) { gc = per_cpu_ptr(mp->m_inodegc, cpu); if (!llist_empty(&gc->list)) { unsigned int h = READ_ONCE(gc->shrinker_hits); diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h index 2fa6f2e09d07..905944dafbe5 100644 --- a/fs/xfs/xfs_icache.h +++ b/fs/xfs/xfs_icache.h @@ -79,7 +79,6 @@ void xfs_inodegc_push(struct xfs_mount *mp); int xfs_inodegc_flush(struct xfs_mount *mp); void xfs_inodegc_stop(struct xfs_mount *mp); void xfs_inodegc_start(struct xfs_mount *mp); -void xfs_inodegc_cpu_dead(struct xfs_mount *mp, unsigned int cpu); int xfs_inodegc_register_shrinker(struct xfs_mount *mp); #endif diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index a25eece3be2b..f4a8879ba0e9 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -60,6 +60,7 @@ struct xfs_error_cfg { * Per-cpu deferred inode inactivation GC lists. */ struct xfs_inodegc { + struct xfs_mount *mp; struct llist_head list; struct delayed_work work; int error; @@ -67,9 +68,7 @@ struct xfs_inodegc { /* approximate count of inodes in the list */ unsigned int items; unsigned int shrinker_hits; -#if defined(DEBUG) || defined(XFS_WARN) unsigned int cpu; -#endif }; /* @@ -249,6 +248,9 @@ typedef struct xfs_mount { unsigned int *m_errortag; struct xfs_kobj m_errortag_kobj; #endif + + /* cpus that have inodes queued for inactivation */ + struct cpumask m_inodegc_cpumask; } xfs_mount_t; #define M_IGEO(mp) (&(mp)->m_ino_geo) diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index ef7775657ce3..6a4f8b2f6159 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1110,9 +1110,8 @@ xfs_inodegc_init_percpu( for_each_possible_cpu(cpu) { gc = per_cpu_ptr(mp->m_inodegc, cpu); -#if defined(DEBUG) || defined(XFS_WARN) gc->cpu = cpu; -#endif + gc->mp = mp; init_llist_head(&gc->list); gc->items = 0; gc->error = 0; @@ -2312,7 +2311,6 @@ xfs_cpu_dead( spin_lock(&xfs_mount_list_lock); list_for_each_entry_safe(mp, n, &xfs_mount_list, m_mount_list) { spin_unlock(&xfs_mount_list_lock); - xfs_inodegc_cpu_dead(mp, cpu); spin_lock(&xfs_mount_list_lock); } spin_unlock(&xfs_mount_list_lock); From patchwork Thu Aug 24 23:21:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13364859 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56B4BEE49B2 for ; Thu, 24 Aug 2023 23:22:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244032AbjHXXVs (ORCPT ); Thu, 24 Aug 2023 19:21:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34596 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244047AbjHXXVf (ORCPT ); Thu, 24 Aug 2023 19:21:35 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 278A0A1 for ; Thu, 24 Aug 2023 16:21:33 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B896665314 for ; Thu, 24 Aug 2023 23:21:32 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1A2A7C433CA; Thu, 24 Aug 2023 23:21:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1692919292; bh=NunsRwNBvtOwveTZLVwp4JEX6aTnUcPmwM0/kW/V7g8=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=idqWvC0sOnXCrJdNrauquYt6rig17hAvnqKlnp6VqxvWbtt6i9IDP4lR614VFK5El /1xxkNkkfArho9Cmj5wJAJ5/OyZmeUIjgvh4lUBBb9GRs+9akyT0+H+7u2qKgdqyj6 BWsxNHPPHkINaR3BpE+uBLOxPANruUfYIsjZO1k3/eCO1i97P3T2m67kVjlHqt2xBC f0hhhhX+AQ8qfrD5rhChf+97chSZEHtqjnG21NL12wVc9jrpUCrLHdurGrYhXgTF+x DHubUg8m0X6RpwcQ+TljzFzVNRNgq/VvWq2eOKR0y/e7ABF7b5OyJhAsGDamT9oPIV RXSA8NmcdKBng== Subject: [PATCH 3/3] xfs: remove cpu hotplug hooks From: "Darrick J. Wong" To: chandan.babu@gmail.com, djwong@kernel.org Cc: linux-xfs@vger.kernel.org, david@fromorbit.com, ritesh.list@gmail.com, sandeen@sandeen.net Date: Thu, 24 Aug 2023 16:21:31 -0700 Message-ID: <169291929160.219974.11113805270455831702.stgit@frogsfrogsfrogs> In-Reply-To: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> References: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong There are no users of the cpu hotplug hooks in xfs now, so remove the infrastructure. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner --- fs/xfs/xfs_super.c | 50 +------------------------------------------- include/linux/cpuhotplug.h | 1 - 2 files changed, 1 insertion(+), 50 deletions(-) diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 6a4f8b2f6159..1403aace4fe3 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -2301,47 +2301,6 @@ xfs_destroy_workqueues(void) destroy_workqueue(xfs_alloc_wq); } -#ifdef CONFIG_HOTPLUG_CPU -static int -xfs_cpu_dead( - unsigned int cpu) -{ - struct xfs_mount *mp, *n; - - spin_lock(&xfs_mount_list_lock); - list_for_each_entry_safe(mp, n, &xfs_mount_list, m_mount_list) { - spin_unlock(&xfs_mount_list_lock); - spin_lock(&xfs_mount_list_lock); - } - spin_unlock(&xfs_mount_list_lock); - return 0; -} - -static int __init -xfs_cpu_hotplug_init(void) -{ - int error; - - error = cpuhp_setup_state_nocalls(CPUHP_XFS_DEAD, "xfs:dead", NULL, - xfs_cpu_dead); - if (error < 0) - xfs_alert(NULL, -"Failed to initialise CPU hotplug, error %d. XFS is non-functional.", - error); - return error; -} - -static void -xfs_cpu_hotplug_destroy(void) -{ - cpuhp_remove_state_nocalls(CPUHP_XFS_DEAD); -} - -#else /* !CONFIG_HOTPLUG_CPU */ -static inline int xfs_cpu_hotplug_init(void) { return 0; } -static inline void xfs_cpu_hotplug_destroy(void) {} -#endif - STATIC int __init init_xfs_fs(void) { @@ -2358,13 +2317,9 @@ init_xfs_fs(void) xfs_dir_startup(); - error = xfs_cpu_hotplug_init(); - if (error) - goto out; - error = xfs_init_caches(); if (error) - goto out_destroy_hp; + goto out; error = xfs_init_workqueues(); if (error) @@ -2448,8 +2403,6 @@ init_xfs_fs(void) xfs_destroy_workqueues(); out_destroy_caches: xfs_destroy_caches(); - out_destroy_hp: - xfs_cpu_hotplug_destroy(); out: return error; } @@ -2473,7 +2426,6 @@ exit_xfs_fs(void) xfs_destroy_workqueues(); xfs_destroy_caches(); xfs_uuid_table_free(); - xfs_cpu_hotplug_destroy(); } module_init(init_xfs_fs); diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 25b6e6e6ba6b..a93e539efd85 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -90,7 +90,6 @@ enum cpuhp_state { CPUHP_FS_BUFF_DEAD, CPUHP_PRINTK_DEAD, CPUHP_MM_MEMCQ_DEAD, - CPUHP_XFS_DEAD, CPUHP_PERCPU_CNT_DEAD, CPUHP_RADIX_DEAD, CPUHP_PAGE_ALLOC, From patchwork Thu Aug 24 23:45:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13364876 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B6ACC88CB2 for ; Thu, 24 Aug 2023 23:46:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232135AbjHXXpy (ORCPT ); Thu, 24 Aug 2023 19:45:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239873AbjHXXpr (ORCPT ); Thu, 24 Aug 2023 19:45:47 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3515119AE for ; Thu, 24 Aug 2023 16:45:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B648363216 for ; Thu, 24 Aug 2023 23:45:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 19889C433C7; Thu, 24 Aug 2023 23:45:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1692920741; bh=xpZB8nS06qwCFIv1KqBF1XSy14wOgaAWMDH+ZadWZrc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=JDiwjUH1/6pJl9HI8CacpIVRgsuLsdwlx9cK6Z1rL7cFOCGkueL788IZs0YoIIChp l9CLCuRymHXOEvHU2XqgZDPAG8rrZs5l+bU4TzyNfeyJEYRhOhfLrBlznDdPl6VAEL B3eKDObnRGKSdwbE6o9W2JEGjqscj9b4UnYpRgkJrAg3b377DVHZupqQ3GvUmsHg5g itLg8SGDgMCoGNJJlQkUXEDOmXwotqLF+Zg2dbX/LEbUkqtFypE5/+jcIZkADOUh/N YT2b1tFHsH9n9CArSfxUOgr6UerAZsNzYT9YcuncW5NXTD+zNxgB1P4oIpY/DkDLmF 0/p7zETwSbwig== Date: Thu, 24 Aug 2023 16:45:40 -0700 From: "Darrick J. Wong" To: chandan.babu@gmail.com Cc: peterz@infradead.org, ritesh.list@gmail.com, sandeen@sandeen.net, tglx@linutronix.de, linux-xfs@vger.kernel.org, david@fromorbit.com Subject: [PATCH 4/3] generic/650: race mount and unmount with cpu hotplug too Message-ID: <20230824234540.GD17912@frogsfrogsfrogs> References: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <169291927442.219974.9654062191833512358.stgit@frogsfrogsfrogs> Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Ritesh Harjani reported that mount and unmount can race with the xfs cpu hotplug notifier hooks and crash the kernel. Extend this test to include that. Signed-off-by: Darrick J. Wong --- tests/generic/650 | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/tests/generic/650 b/tests/generic/650 index 05c939b84f..773f93c7cb 100755 --- a/tests/generic/650 +++ b/tests/generic/650 @@ -67,11 +67,18 @@ fsstress_args=(-w -d $stress_dir) nr_cpus=$((LOAD_FACTOR * nr_hotplug_cpus)) test "$nr_cpus" -gt 1024 && nr_cpus="$nr_hotplug_cpus" fsstress_args+=(-p $nr_cpus) -test -n "$SOAK_DURATION" && fsstress_args+=(--duration="$SOAK_DURATION") +if [ -n "$SOAK_DURATION" ]; then + test "$SOAK_DURATION" -lt 10 && SOAK_DURATION=10 + fsstress_args+=(--duration="$((SOAK_DURATION / 10))") +fi -nr_ops=$((25000 * TIME_FACTOR)) +nr_ops=$((2500 * TIME_FACTOR)) fsstress_args+=(-n $nr_ops) -$FSSTRESS_PROG $FSSTRESS_AVOID -w "${fsstress_args[@]}" >> $seqres.full +for ((i = 0; i < 10; i++)); do + $FSSTRESS_PROG $FSSTRESS_AVOID -w "${fsstress_args[@]}" >> $seqres.full + _test_cycle_mount +done + rm -f $sentinel_file # success, all done