From patchwork Tue Nov 12 22:05:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13872991 Received: from mail-oo1-f49.google.com (mail-oo1-f49.google.com [209.85.161.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 99E772185A0 for ; Tue, 12 Nov 2024 22:19:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.161.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731449968; cv=none; b=kZVerTTJbXO/29zO9aKyiy4q8GP+eFqV6fDxEvFPeoNnt21ygnjqZ9Tw5jRlHsuPC8VAAAgc0yEtz1UicV3Eu7Dqy0vgzveowFUJCCEyF/NsgwgCNuauoqXTC34I4BvLhHs6FtJHhfyFKpbARMMDsUnbC+VDwgMNG/zbODpib60= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731449968; c=relaxed/simple; bh=7p9qOVab9mUqKiCrMpPfasqlhXIZO9XRO05iSOCNdPE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Z+JVvTjnVZb3PraTu6754Fg2ntZq1BcKV1QSFXBNomAwfAZaB/rqXkwGyDtQknFHyHEhXPgG/DNxt5Ap2w70g6av2iWyJ1MxwsyhlivfL/vkKIDwmcG3KtIdHP/ni9pbrxaTMQO/eFmxbjlnU/R7FbmbA3d7AvuITOc0vFG9iZA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com; spf=pass smtp.mailfrom=fromorbit.com; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b=tzkV9QhI; arc=none smtp.client-ip=209.85.161.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b="tzkV9QhI" Received: by mail-oo1-f49.google.com with SMTP id 006d021491bc7-5eb9ee4f14cso2658975eaf.1 for ; Tue, 12 Nov 2024 14:19:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20230601.gappssmtp.com; s=20230601; t=1731449965; x=1732054765; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Pv876Dy8Ajy7g5mfHuh133EW6I1k95G47jQYc8OWOCQ=; b=tzkV9QhI8vdQp1EOI7CWCMsjiOGn1iQyZPP38P6zL6jZTAXhRUcZzjcDuxxWGlewHE GuvTZkwrnEPvOmJhJzHDKxu8GEP4bC9e7LSmCuEdNctWSawATq2dIwYpF80lqXeIpLni BU5npOj06V72iTFFN3LvO0rW+ZYW2GKiEi5qmAfU5QeHw4R6HcWfkbouGrpuRQCg08tw df1VjrWFf90mr+h6YVTgdVhxRIUKwIbWLMTOPMALDKuhXHMgICeuVg5OWO3/cZUL/QF6 NaIlCGAqiFDnyGPxKVdgHd9kmSMcYFICH+YdXE0HVBxkHxIinKesobG/owP77/jgbBIS 54jA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731449965; x=1732054765; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Pv876Dy8Ajy7g5mfHuh133EW6I1k95G47jQYc8OWOCQ=; b=fbecyMr56ueSirDLBFC54af09wFIfWJ03ln/uy+CdGWKGPYOafWhwmwrPbvNzyyXp0 IHKSTadrwDw4J/KplL8AmQDXpPP1dwQ2D0Z7QwoW6SqzHZTnATgE5zVxiigqU0vZDjMb LzinuY9JaJ66B1/BVgS4D6Gno+oIatOencKvNFb5TOY2qhkyvzfNgKPP+eC5XrcHfRpi G39Cn54dSWoJJfXHPs/I2rbUuTbXuB2X+MAJc0lZLUfFL6opU2U3cTP/MU4iy6M5AeNg 3wi6XNN3Z8KvN1Lwc4aU/nrwintF/K6Q2gM+8G33aTOTcAIdIKuNxQN/cLpwtL8/zKG7 7J2w== X-Gm-Message-State: AOJu0YzvK2ivIMWANIMnnqv4DuYKnQp1+uJmgsPjbGS1sWpRbF3vSGzg u7UspQxJwtWCFXU/KgLY7z2E1UR8nZ6eXO2a3c9LkBLT/Bnvd5blEBdbKGx+cOG6pgFn6bBJyzs M X-Google-Smtp-Source: AGHT+IH3foCqINOA6eT/AhAzu6OOhECiEIX9MYzdSIPp9jrN9MstpjLWeRjd+tPurZPSTUS8nueMsw== X-Received: by 2002:a05:6830:620f:b0:710:ea11:3d35 with SMTP id 46e09a7af769-71a1c221985mr16850663a34.15.1731449965479; Tue, 12 Nov 2024 14:19:25 -0800 (PST) Received: from dread.disaster.area (pa49-186-86-168.pa.vic.optusnet.com.au. [49.186.86.168]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-7f41f5bab50sm9480064a12.25.2024.11.12.14.19.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Nov 2024 14:19:24 -0800 (PST) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.96) (envelope-from ) id 1tAzEH-00Doos-0e; Wed, 13 Nov 2024 09:19:21 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.98) (envelope-from ) id 1tAzEH-00000004e7y-2gvQ; Wed, 13 Nov 2024 09:19:21 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Cc: cem@kernel.org Subject: [PATCH 1/3] xfs: fix sparse inode limits on runt AG Date: Wed, 13 Nov 2024 09:05:14 +1100 Message-ID: <20241112221920.1105007-2-david@fromorbit.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241112221920.1105007-1-david@fromorbit.com> References: <20241112221920.1105007-1-david@fromorbit.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Dave Chinner The runt AG at the end of a filesystem is almost always smaller than the mp->m_sb.sb_agblocks. Unfortunately, when setting the max_agbno limit for the inode chunk allocation, we do not take this into account. This means we can allocate a sparse inode chunk that overlaps beyond the end of an AG. When we go to allocate an inode from that sparse chunk, the irec fails validation because the agbno of the start of the irec is beyond valid limits for the runt AG. Prevent this from happening by taking into account the size of the runt AG when allocating inode chunks. Also convert the various checks for valid inode chunk agbnos to use xfs_ag_block_count() so that they will also catch such issues in the future. Fixes: 56d1115c9bc7 ("xfs: allocate sparse inode chunks on full chunk allocation failure") Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_ialloc.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index 271855227514..6258527315f2 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -855,7 +855,8 @@ xfs_ialloc_ag_alloc( * the end of the AG. */ args.min_agbno = args.mp->m_sb.sb_inoalignmt; - args.max_agbno = round_down(args.mp->m_sb.sb_agblocks, + args.max_agbno = round_down(xfs_ag_block_count(args.mp, + pag->pag_agno), args.mp->m_sb.sb_inoalignmt) - igeo->ialloc_blks; @@ -2332,9 +2333,9 @@ xfs_difree( return -EINVAL; } agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno >= mp->m_sb.sb_agblocks) { - xfs_warn(mp, "%s: agbno >= mp->m_sb.sb_agblocks (%d >= %d).", - __func__, agbno, mp->m_sb.sb_agblocks); + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno)) { + xfs_warn(mp, "%s: agbno >= xfs_ag_block_count (%d >= %d).", + __func__, agbno, xfs_ag_block_count(mp, pag->pag_agno)); ASSERT(0); return -EINVAL; } @@ -2457,7 +2458,7 @@ xfs_imap( */ agino = XFS_INO_TO_AGINO(mp, ino); agbno = XFS_AGINO_TO_AGBNO(mp, agino); - if (agbno >= mp->m_sb.sb_agblocks || + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno) || ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { error = -EINVAL; #ifdef DEBUG @@ -2467,11 +2468,12 @@ xfs_imap( */ if (flags & XFS_IGET_UNTRUSTED) return error; - if (agbno >= mp->m_sb.sb_agblocks) { + if (agbno >= xfs_ag_block_count(mp, pag->pag_agno)) { xfs_alert(mp, "%s: agbno (0x%llx) >= mp->m_sb.sb_agblocks (0x%lx)", __func__, (unsigned long long)agbno, - (unsigned long)mp->m_sb.sb_agblocks); + (unsigned long)xfs_ag_block_count(mp, + pag->pag_agno)); } if (ino != XFS_AGINO_TO_INO(mp, pag->pag_agno, agino)) { xfs_alert(mp, From patchwork Tue Nov 12 22:05:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13872994 Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22B5F1292CE for ; Tue, 12 Nov 2024 22:19:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731449970; cv=none; b=NCY0xu4vTEpjqVaVsNlXg0+UWAeZEE0TTSdAA6iKf+2C4TbAFaoASu0gOcKYA9PiEuSaN8LTtKFdqbpNArWJPbwWr8k6zWh0cEflxdpSc4P8qhwTMc38fS4UnK1BrLRpOlHnmYSbaXPx1gfMqdOV45wWVCmF23KsyVNHu1nZU7M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731449970; c=relaxed/simple; bh=UahsPdezhKfe1FeRYKBw9nkhAEVrxSDhaIHvglDUEuM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pgcbTmpQQYCLZRuLg/1CbY8tbAAHgLMWJ6QOK0AcMkvXD7gVkhYDeC8CvlS5/PXRGdo+O6VtciiYVZX2ehvn4YZKWXyGO+aRwhUqT+26kakb0X2wH4UYmk3LxKaWpnjs4sE6CAqKX7Ep1mhZ2ZHbZxk0sZYsC47ztk9S/j7MtZ0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com; spf=pass smtp.mailfrom=fromorbit.com; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b=kfdcOoXE; arc=none smtp.client-ip=209.85.210.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b="kfdcOoXE" Received: by mail-pf1-f172.google.com with SMTP id d2e1a72fcca58-720be27db74so89007b3a.1 for ; Tue, 12 Nov 2024 14:19:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20230601.gappssmtp.com; s=20230601; t=1731449965; x=1732054765; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uLQEA5U2rjQCoFu4hmUmY2uxE5Fxmi3WuoV1ejBhXa8=; b=kfdcOoXEDST5fJsd9oPYaF3hFVxVEu9lXriR+sJbINVBYhwF4NDhJC3WY5ZyRsyOOP lIhP6X7tAbvhfb6AtQYGC+uQNJTnsi7Kgg2XpMDkRplwtzGPV/MUzjKv/GKCQokQtjk4 Ub9+Lzsc4J4HNYZuwLABvS2FJF0f8mCYAY3K+dtap0BoLgwpMzphvzLwnzzgefSD2a+V OKwDsqvxKNe3gzVmqkFEwcLmVd5H8jaMlGiheqcAoxdtdgWKKuWnUfPvNCw1HTO4m13f oSxSV5tAfSmZwJgcW1b/Q3dRmvbAx/Xs/kxQ39iJneKwooU3T84Cmm3bDELDu3ULlNFi lwcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731449965; x=1732054765; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uLQEA5U2rjQCoFu4hmUmY2uxE5Fxmi3WuoV1ejBhXa8=; b=SXQjT9PIc4WlaryWi5Iy4LVAHmZmIbeNMPc/eCg+xXZ04pw4s3Ix8xXt43Qja+9dFK Vr958grSxF3TA6RKRwjv286j/rQ/0gbxfcJ08sxEpSW4QMbKPqBR4cxxJ1ZfqiZZHS39 O0NZREYQ/hsVgAwgZYTDLLs4B4i3JoFH/xqR7DnBdS5fct8Crz+NoCYiE+p8kS1nN8e5 E6v0+nA8QV83dgTRjkwPg6tWk8aXz5FcXKHjmx+fpM7ZNh2R+MhpBdy0kRY15IIyPMBc jpS6HRGCEJBhs5GZ2oN8yDb8Ohqr67EpRp271hS6t9HMRrI7oAPL1hXcTMW5MiGlDPY0 H8Yg== X-Gm-Message-State: AOJu0YzE9ob9M2+FiBK3cvjXiJBwiW3FG8R9YVyMC/SGWeOXaMp2fjJZ I7I6+hx38IBPOoqA6zTdxwqKkCvr4LavrMshHE7kbt8lFpZE+snfeVDoHU+btVbSluBEjlN3XK2 9 X-Google-Smtp-Source: AGHT+IGxL1cQaPbiFcFVXjYr/PEXFHJt2qgqsMSy6YXL5rnWxAhTn+LmG95UCuNNR/fMXSAGbMBD6w== X-Received: by 2002:a05:6a00:22ce:b0:71e:76dc:10f7 with SMTP id d2e1a72fcca58-72413f7c7c3mr27948697b3a.4.1731449965053; Tue, 12 Nov 2024 14:19:25 -0800 (PST) Received: from dread.disaster.area (pa49-186-86-168.pa.vic.optusnet.com.au. [49.186.86.168]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-72407a571a8sm11732967b3a.195.2024.11.12.14.19.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Nov 2024 14:19:24 -0800 (PST) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.96) (envelope-from ) id 1tAzEH-00Doot-0h; Wed, 13 Nov 2024 09:19:21 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.98) (envelope-from ) id 1tAzEH-00000004e82-2os8; Wed, 13 Nov 2024 09:19:21 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Cc: cem@kernel.org Subject: [PATCH 2/3] xfs: delalloc and quota softlimit timers are incoherent Date: Wed, 13 Nov 2024 09:05:15 +1100 Message-ID: <20241112221920.1105007-3-david@fromorbit.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241112221920.1105007-1-david@fromorbit.com> References: <20241112221920.1105007-1-david@fromorbit.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Supriya Wickrematillake I've been seeing this failure on during xfs/050 recently: XFS: Assertion failed: dst->d_spc_timer != 0, file: fs/xfs/xfs_qm_syscalls.c, line: 435 .... Call Trace: xfs_qm_scall_getquota_fill_qc+0x2a2/0x2b0 xfs_qm_scall_getquota_next+0x69/0xa0 xfs_fs_get_nextdqblk+0x62/0xf0 quota_getnextxquota+0xbf/0x320 do_quotactl+0x1a1/0x410 __se_sys_quotactl+0x126/0x310 __x64_sys_quotactl+0x21/0x30 x64_sys_call+0x2819/0x2ee0 do_syscall_64+0x68/0x130 entry_SYSCALL_64_after_hwframe+0x76/0x7e It turns out that the _qmount call has silently been failing to unmount and mount the filesystem, so when the softlimit is pushed past with a buffered write, it is not getting synced to disk before the next quota report is being run. Hence when the quota report runs, we have 300 blocks of delalloc data on an inode, with a soft limit of 200 blocks. XFS dquots account delalloc reservations as used space, hence the dquot is over the soft limit. However, we don't update the soft limit timers until we do a transactional update of the dquot. That is, the dquot sits over the soft limit without a softlimit timer being started until writeback occurs and the allocation modifies the dquot and we call xfs_qm_adjust_dqtimers() from xfs_trans_apply_dquot_deltas() in xfs_trans_commit() context. This isn't really a problem, except for this debug code in xfs_qm_scall_getquota_fill_qc(): #ifdef DEBUG if (xfs_dquot_is_enforced(dqp) && dqp->q_id != 0) { if ((dst->d_space > dst->d_spc_softlimit) && (dst->d_spc_softlimit > 0)) { ASSERT(dst->d_spc_timer != 0); } .... It asserts taht if the used block count is over the soft limit, it *must* have a soft limit timer running. This is clearly not the case, because we haven't committed the delalloc space to disk yet. Hence the soft limit is only exceeded temporarily in memory (which isn't an issue) and we start the timer the moment we exceed the soft limit in journalled metadata. This debug was introduced in: commit 0d5ad8383061fbc0a9804fbb98218750000fe032 Author: Supriya Wickrematillake Date: Wed May 15 22:44:44 1996 +0000 initial checkin quotactl syscall functions. The very first quota support commit back in 1996. This is zero-day debug for Irix and, as it turns out, a zero-day bug in the debug code because the delalloc code on Irix didn't update the softlimit timers, either. IOWs, this issue has been in the code for 28 years. We obviously don't care if soft limit timers are a bit rubbery when we have delalloc reservations in memory. Production systems running quota reports have been exposed to this situation for 28 years and nobody has noticed it, so the debug code is essentially worthless at this point in time. We also have the on-disk dquot verifiers checking that the soft limit timer is running whenever the dquot is over the soft limit before we write it to disk and after we read it from disk. These aren't firing, so it is clear the issue is purely a temporary in-memory incoherency that I never would have noticed had the test not silently failed to unmount the filesystem. Hence I'm simply going to trash this runtime debug because it isn't useful in the slightest for catching quota bugs. Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_qm_syscalls.c | 13 ------------- 1 file changed, 13 deletions(-) diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c index 4eda50ae2d1c..0c78f30fa4a3 100644 --- a/fs/xfs/xfs_qm_syscalls.c +++ b/fs/xfs/xfs_qm_syscalls.c @@ -427,19 +427,6 @@ xfs_qm_scall_getquota_fill_qc( dst->d_ino_timer = 0; dst->d_rt_spc_timer = 0; } - -#ifdef DEBUG - if (xfs_dquot_is_enforced(dqp) && dqp->q_id != 0) { - if ((dst->d_space > dst->d_spc_softlimit) && - (dst->d_spc_softlimit > 0)) { - ASSERT(dst->d_spc_timer != 0); - } - if ((dst->d_ino_count > dqp->q_ino.softlimit) && - (dqp->q_ino.softlimit > 0)) { - ASSERT(dst->d_ino_timer != 0); - } - } -#endif } /* Return the quota information for the dquot matching id. */ From patchwork Tue Nov 12 22:05:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 13872992 Received: from mail-pf1-f177.google.com (mail-pf1-f177.google.com [209.85.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9809121745E for ; Tue, 12 Nov 2024 22:19:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731449968; cv=none; b=Big5J63ugQwzZTpub2z+RBgJN+uRy73RjOtlWoLyzytOittXgX+9dbero3iLtyprasifb0yrLM2ZQWhdd3t56QmjXTA2DE70VkrqLSqMyTbvTw0XEIyx4HMULkVXYmktEsNQDlFg/su6dxLqdGpjJFEgHVdYLtvlz/yHnNcHtN0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731449968; c=relaxed/simple; bh=kwD3/W5ggXA0vWwxSJ2QDeURLk92O7XRlo4/TAU/o8c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=G5e8Dvnk/tw2VVFHOv6m+PZm32/ERERi+VjRW0Z4x8KeISa9QwjWRWwsa3RehMnsA9VW9ZwbDU5iHOlYAm50n+4xw3Bb7C/CVEYjj3A64obHj1iFtl/uIuvbWGTtFLQRekvkbLSnuNrZP/vwB6aMIL+sHWwhSoER6SY0bEZYNDg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com; spf=pass smtp.mailfrom=fromorbit.com; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b=OEZZtlwf; arc=none smtp.client-ip=209.85.210.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=fromorbit.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=fromorbit-com.20230601.gappssmtp.com header.i=@fromorbit-com.20230601.gappssmtp.com header.b="OEZZtlwf" Received: by mail-pf1-f177.google.com with SMTP id d2e1a72fcca58-720cb6ac25aso5695962b3a.3 for ; Tue, 12 Nov 2024 14:19:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20230601.gappssmtp.com; s=20230601; t=1731449966; x=1732054766; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NCrzX8DC94RIQ/x55wiqFNazWZKZh5c+Fo3m/s1Otro=; b=OEZZtlwfxxZKwFhQJxZeDGCBE3/tJt72PhL9YAO4CjWBG4jAKYmZEoI/BqoByClWtn IkAKeMMebbMohNFIAYOzgSA+LRdqeynjpCczQ93g4KlpXQf9rtk2wKQWkl8FCAw9+klH kpV2GcUeWADMf5G+cx2XN/slW++4v0nGGHtZZ68sx0S8PJvPw9VyXyeQjvyY4cw/FTSs 6PBy3GGHNYnZoi3yRQTwv0JshwlHpzWNGpq5dMnjXMbPiRtbLtujcUPVSxh1Fl2Z6iz1 +m7fI69qN8xIisLJSDO8+fLcbvgwNBaYPC/cuPkmEMX9SXyon09tIc8hnz6mNALWZGOj viPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731449966; x=1732054766; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NCrzX8DC94RIQ/x55wiqFNazWZKZh5c+Fo3m/s1Otro=; b=u2tfLskNWd3aHe3bBw8xTnmo5XF6QKRlfSjKRmNEza/TZucHQw79nhGTykVgp8vg0U BR7vgz3HLxioRcUtTQgwR6emeRPUFfF4WzkWjrvqVQh09Dv3K5SuIrQBj+Ip8ZHJxOdc IQTcGV81IhruGjYDf5Gq937UlnkMojQRGxbwf2//R0Gc7Fkr4+1YRCf6j7okyh6EN1KH 3RvpzduDLQXd489blaQVNED1ssMJfXI1YD+q9trc7cGfzqOxjCPdTbtFlddjbJUtKBfi /Cd7o22mYQSRsK96VOThLX8sfEExga+uo9+hn9AoeW9FnqIrC7VFgVB9OEgLJGZgSj3v +Gqg== X-Gm-Message-State: AOJu0YzMJI0E3Ne03bDXIRsMVDlvhOL8429e/U+ajN4ny7HHSuB1ucue WRlNrMtwmBfJE9O8qVvLU8Dr4sx+QlCDcpz4GAEXXK9lKXw2IXi7WKo/KU2/lJROiYaolzvDCP+ D X-Google-Smtp-Source: AGHT+IHEeUzo9nKt/5sR0cdYOUrFcqbWpz3gHLqetTv5TASsV41P3ElhEzhLprJt5JjVORNdjOv48A== X-Received: by 2002:a05:6a20:3948:b0:1d8:a29b:8f6f with SMTP id adf61e73a8af0-1dc229440aemr28246599637.16.1731449965918; Tue, 12 Nov 2024 14:19:25 -0800 (PST) Received: from dread.disaster.area (pa49-186-86-168.pa.vic.optusnet.com.au. [49.186.86.168]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-7f41f5b3f8fsm10974624a12.19.2024.11.12.14.19.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Nov 2024 14:19:25 -0800 (PST) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.96) (envelope-from ) id 1tAzEH-00Doox-0p; Wed, 13 Nov 2024 09:19:21 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.98) (envelope-from ) id 1tAzEH-00000004e87-2xWL; Wed, 13 Nov 2024 09:19:21 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Cc: cem@kernel.org Subject: [PATCH 3/3] xfs: prevent mount and log shutdown race Date: Wed, 13 Nov 2024 09:05:16 +1100 Message-ID: <20241112221920.1105007-4-david@fromorbit.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241112221920.1105007-1-david@fromorbit.com> References: <20241112221920.1105007-1-david@fromorbit.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Dave Chinner I recently had an fstests hang where there were two internal tasks stuck like so: [ 6559.010870] task:kworker/24:45 state:D stack:12152 pid:631308 tgid:631308 ppid:2 flags:0x00004000 [ 6559.016984] Workqueue: xfs-buf/dm-2 xfs_buf_ioend_work [ 6559.020349] Call Trace: [ 6559.022002] [ 6559.023426] __schedule+0x650/0xb10 [ 6559.025734] schedule+0x6d/0xf0 [ 6559.027835] schedule_timeout+0x31/0x180 [ 6559.030582] wait_for_common+0x10c/0x1e0 [ 6559.033495] wait_for_completion+0x1d/0x30 [ 6559.036463] __flush_workqueue+0xeb/0x490 [ 6559.039479] ? mempool_alloc_slab+0x15/0x20 [ 6559.042537] xlog_cil_force_seq+0xa1/0x2f0 [ 6559.045498] ? bio_alloc_bioset+0x1d8/0x510 [ 6559.048578] ? submit_bio_noacct+0x2f2/0x380 [ 6559.051665] ? xlog_force_shutdown+0x3b/0x170 [ 6559.054819] xfs_log_force+0x77/0x230 [ 6559.057455] xlog_force_shutdown+0x3b/0x170 [ 6559.060507] xfs_do_force_shutdown+0xd4/0x200 [ 6559.063798] ? xfs_buf_rele+0x1bd/0x580 [ 6559.066541] xfs_buf_ioend_handle_error+0x163/0x2e0 [ 6559.070099] xfs_buf_ioend+0x61/0x200 [ 6559.072728] xfs_buf_ioend_work+0x15/0x20 [ 6559.075706] process_scheduled_works+0x1d4/0x400 [ 6559.078814] worker_thread+0x234/0x2e0 [ 6559.081300] kthread+0x147/0x170 [ 6559.083462] ? __pfx_worker_thread+0x10/0x10 [ 6559.086295] ? __pfx_kthread+0x10/0x10 [ 6559.088771] ret_from_fork+0x3e/0x50 [ 6559.091153] ? __pfx_kthread+0x10/0x10 [ 6559.093624] ret_from_fork_asm+0x1a/0x30 [ 6559.096227] [ 6559.109304] Workqueue: xfs-cil/dm-2 xlog_cil_push_work [ 6559.112673] Call Trace: [ 6559.114333] [ 6559.115760] __schedule+0x650/0xb10 [ 6559.118084] schedule+0x6d/0xf0 [ 6559.120175] schedule_timeout+0x31/0x180 [ 6559.122776] ? call_rcu+0xee/0x2f0 [ 6559.125034] __down_common+0xbe/0x1f0 [ 6559.127470] __down+0x1d/0x30 [ 6559.129458] down+0x48/0x50 [ 6559.131343] ? xfs_buf_item_unpin+0x8d/0x380 [ 6559.134213] xfs_buf_lock+0x3d/0xe0 [ 6559.136544] xfs_buf_item_unpin+0x8d/0x380 [ 6559.139253] xlog_cil_committed+0x287/0x520 [ 6559.142019] ? sched_clock+0x10/0x30 [ 6559.144384] ? sched_clock_cpu+0x10/0x190 [ 6559.147039] ? psi_group_change+0x48/0x310 [ 6559.149735] ? _raw_spin_unlock+0xe/0x30 [ 6559.152340] ? finish_task_switch+0xbc/0x310 [ 6559.155163] xlog_cil_process_committed+0x6d/0x90 [ 6559.158265] xlog_state_shutdown_callbacks+0x53/0x110 [ 6559.161564] ? xlog_cil_push_work+0xa70/0xaf0 [ 6559.164441] xlog_state_release_iclog+0xba/0x1b0 [ 6559.167483] xlog_cil_push_work+0xa70/0xaf0 [ 6559.170260] process_scheduled_works+0x1d4/0x400 [ 6559.173286] worker_thread+0x234/0x2e0 [ 6559.175779] kthread+0x147/0x170 [ 6559.177933] ? __pfx_worker_thread+0x10/0x10 [ 6559.180748] ? __pfx_kthread+0x10/0x10 [ 6559.183231] ret_from_fork+0x3e/0x50 [ 6559.185601] ? __pfx_kthread+0x10/0x10 [ 6559.188092] ret_from_fork_asm+0x1a/0x30 [ 6559.190692] This is an ABBA deadlock where buffer IO completion is triggering a forced shutdown with the buffer lock held. It is waiting for the CIL to flush as part of the log force. The CIL flush is blocked doing shutdown processing of all it's objects, trying to unpin a buffer item. That requires taking the buffer lock.... For the CIL to be doing shutdown processing, the log must be marked with XLOG_IO_ERROR, but that doesn't happen until after the log force is issued. Hence for xfs_do_force_shutdown() to be forcing the log on a shut down log, we must have had a racing xlog_force_shutdown and xfs_force_shutdown like so: p0 p1 CIL push xlog_force_shutdown xfs_log_force test_and_set_bit(XLOG_IO_ERROR) xlog_state_release_iclog() sees XLOG_IO_ERROR xlog_state_shutdown_callbacks .... xfs_buf_item_unpin xfs_buf_lock xfs_force_shutdown xfs_set_shutdown(mp) wins xlog_force_shutdown xfs_log_force xfs_set_shutdown(mp) fails The deadlock can be mitigated by avoiding the log force on the second pass through xlog_force_shutdown. Do this by adding another atomic state bit (XLOG_OP_PENDING_SHUTDOWN) that is set on entry to xlog_force_shutdown() but doesn't mark the log as shutdown. Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_log.c | 11 +++++++++++ fs/xfs/xfs_log_priv.h | 1 + 2 files changed, 12 insertions(+) diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index 26b2f5887b88..05daad8a8d34 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -3455,6 +3455,16 @@ xlog_force_shutdown( if (!log) return false; + /* + * Ensure that there is only ever one log shutdown being processed. + * If we allow the log force below on a second pass after shutting + * down the log, we risk deadlocking the CIL push as it may require + * locks on objects the current shutdown context holds (e.g. taking + * buffer locks to abort buffers on last unpin of buf log items). + */ + if (test_and_set_bit(XLOG_SHUTDOWN_STARTED, &log->l_opstate)) + return false; + /* * Flush all the completed transactions to disk before marking the log * being shut down. We need to do this first as shutting down the log @@ -3487,6 +3497,7 @@ xlog_force_shutdown( spin_lock(&log->l_icloglock); if (test_and_set_bit(XLOG_IO_ERROR, &log->l_opstate)) { spin_unlock(&log->l_icloglock); + ASSERT(0); return false; } spin_unlock(&log->l_icloglock); diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h index b8778a4fd6b6..f3d78869e5e5 100644 --- a/fs/xfs/xfs_log_priv.h +++ b/fs/xfs/xfs_log_priv.h @@ -458,6 +458,7 @@ struct xlog { #define XLOG_IO_ERROR 2 /* log hit an I/O error, and being shutdown */ #define XLOG_TAIL_WARN 3 /* log tail verify warning issued */ +#define XLOG_SHUTDOWN_STARTED 4 /* xlog_force_shutdown() exclusion */ static inline bool xlog_recovery_needed(struct xlog *log)