From patchwork Sun Dec 31 22:35:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13507910
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1B1D7C13B
	for ; Sun, 31 Dec 2023 22:35:40 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="E3moGxEN"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id E01A8C433C7;
	Sun, 31 Dec 2023 22:35:39 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1704062139;
	bh=FufdpvNb4JVmLp41eQlzQFvV+Ql8h9yNw1ERx+/GC0I=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=E3moGxENyUcUF8LoMJp8JIb3xZ3eXIwJgotwo+BvpmAMP4n27RxPh/++nKyI2mIw3
	 q5xtm0GgjCv4l/fRaatr6jPO6N0Md4Ut+VI+opy2YuwL7btUQ2q+ZZTnRwKAiB2H13
	 T/i0wuhKgEK18bpNIF+o9Is5n3ZAA2Fif1kBi7dCvOERYM4QqN2cBgp8aKwxEL7/9d
	 LKip0J4gzvvry2LaDuXji7VupOALfRzm38ODUxoEwOIrwbt8FoYUZrx8837dfQVT/V
	 V+49pGsqyerGFbnwVFFLWnViLqWrPUkNsQjNxChvQjaOpfWqSBqwbWf97FkWJQOTgO
	 brNlALCx3JxEA==
Date: Sun, 31 Dec 2023 14:35:39 -0800
Subject: [PATCH 1/4] xfs: check unused nlink fields in the ondisk inode
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: linux-xfs@vger.kernel.org
Message-ID: <170404998305.1797172.6971280830194810433.stgit@frogsfrogsfrogs>
In-Reply-To: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
References: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

v2/v3 inodes use di_nlink and not di_onlink; and v1 inodes use di_onlink
and not di_nlink.
Whichever field is not in use, make sure its contents are zero, and
teach xfs_scrub to fix that if it is.

This clears a bunch of missing scrub failure errors in xfs/385 for
core.onlink.

Signed-off-by: Darrick J. Wong
---
 libxfs/xfs_inode_buf.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libxfs/xfs_inode_buf.c b/libxfs/xfs_inode_buf.c
index 82cf64db938..aee581d53c8 100644
--- a/libxfs/xfs_inode_buf.c
+++ b/libxfs/xfs_inode_buf.c
@@ -488,6 +488,14 @@ xfs_dinode_verify(
 		return __this_address;
 	}
 
+	if (dip->di_version > 1) {
+		if (dip->di_onlink)
+			return __this_address;
+	} else {
+		if (dip->di_nlink)
+			return __this_address;
+	}
+
 	/* don't allow invalid i_size */
 	di_size = be64_to_cpu(dip->di_size);
 	if (di_size & (1ULL << 63))

From patchwork Sun Dec 31 22:35:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13507911
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id EECFFC13B
	for ; Sun, 31 Dec 2023 22:35:55 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="KEgt2DFl"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 88DE3C433C7;
	Sun, 31 Dec 2023 22:35:55 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1704062155;
	bh=JmzvmOgsePx4ZkN1JXcbHLd9xvilKJjvoQ8UHPBSX1E=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=KEgt2DFl5Jchamlyk5+B1JBJjbEezCT0ePgBX2FQ5NdkUNjj0HOMovG9DN6Y8JfpI
	 gxu8sHKE043/WpAkkgiuzgLpKCNV1DWhmapee/WB2YN8CNeP07GcX5dfYgQ4bF5Waa
	 GrWr7xZedVuCAr/mOviPyNqLRTyQhjaD/wbQf5fEOpbb3ii+jlQRGblw8K1TDJ7Ifq
	 xuEFSOFE6uFiVkJPRn4mKOe1K83Z0PmIF0i1TmfUVln745I+mnfqQMwdpMzbejGTjW
	 kWdS4ZBGfrBFGC0MV0n6rd+5NPflqFoyOdYvYd90z4m/2Y7tyTsx0wYYsxGiretJwf
	 ELL2HXmzzA2wg==
Date: Sun, 31 Dec 2023 14:35:55 -0800
Subject: [PATCH 2/4] xfs: try to avoid allocating from sick inode clusters
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: linux-xfs@vger.kernel.org
Message-ID: <170404998319.1797172.10634924197446175862.stgit@frogsfrogsfrogs>
In-Reply-To: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
References: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

I noticed that xfs/413 and xfs/375 occasionally failed while fuzzing
core.mode of an inode.  The root cause of these problems is that the
field we fuzzed (core.mode or core.magic, typically) causes the entire
inode cluster buffer verification to fail, which affects several inodes
at once.  The repair process tries to create either a /lost+found or a
temporary repair file, but regrettably it picks the same inode cluster
that we just corrupted, with the result that repair triggers the demise
of the filesystem.

Try to avoid this by making the inode allocation path detect when the
perag health status indicates that someone has found bad inode cluster
buffers, and try to read the inode cluster buffer.  If the cluster
buffer fails the verifiers, try another AG.  This isn't foolproof and
can result in premature ENOSPC, but that might be better than shutting
down.

Signed-off-by: Darrick J. Wong
---
 libxfs/util.c | 6 ++++++
 libxfs/xfs_ialloc.c | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/libxfs/util.c b/libxfs/util.c
index 097362d488d..c1ddaf92c8a 100644
--- a/libxfs/util.c
+++ b/libxfs/util.c
@@ -732,6 +732,12 @@ void xfs_fs_mark_sick(struct xfs_mount *mp, unsigned int mask) { }
 void xfs_agno_mark_sick(struct xfs_mount *mp, xfs_agnumber_t agno,
 		unsigned int mask) { }
 void xfs_ag_mark_sick(struct xfs_perag *pag, unsigned int mask) { }
+void xfs_ag_measure_sickness(struct xfs_perag *pag, unsigned int *sick,
+		unsigned int *checked)
+{
+	*sick = 0;
+	*checked = 0;
+}
 void xfs_bmap_mark_sick(struct xfs_inode *ip, int whichfork) { }
 void xfs_btree_mark_sick(struct xfs_btree_cur *cur) { }
 void xfs_dirattr_mark_sick(struct xfs_inode *ip, int whichfork) { }
diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 21577a50f65..46d4515baba 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -1007,6 +1007,33 @@ xfs_inobt_first_free_inode(
 	return xfs_lowbit64(realfree);
 }
 
+/*
+ * If this AG has corrupt inodes, check if allocating this inode would fail
+ * with corruption errors.  Returns 0 if we're clear, or EAGAIN to try again
+ * somewhere else.
+ */
+static int
+xfs_dialloc_check_ino(
+	struct xfs_perag	*pag,
+	struct xfs_trans	*tp,
+	xfs_ino_t		ino)
+{
+	struct xfs_imap		imap;
+	struct xfs_buf		*bp;
+	int			error;
+
+	error = xfs_imap(pag, tp, ino, &imap, 0);
+	if (error)
+		return -EAGAIN;
+
+	error = xfs_imap_to_bp(pag->pag_mount, tp, &imap, &bp);
+	if (error)
+		return -EAGAIN;
+
+	xfs_trans_brelse(tp, bp);
+	return 0;
+}
+
 /*
  * Allocate an inode using the inobt-only algorithm.
  */
@@ -1259,6 +1286,13 @@ xfs_dialloc_ag_inobt(
 	ASSERT((XFS_AGINO_TO_OFFSET(mp, rec.ir_startino) %
 				   XFS_INODES_PER_CHUNK) == 0);
 	ino = XFS_AGINO_TO_INO(mp, pag->pag_agno, rec.ir_startino + offset);
+
+	if (xfs_ag_has_sickness(pag, XFS_SICK_AG_INODES)) {
+		error = xfs_dialloc_check_ino(pag, tp, ino);
+		if (error)
+			goto error0;
+	}
+
 	rec.ir_free &= ~XFS_INOBT_MASK(offset);
 	rec.ir_freecount--;
 	error = xfs_inobt_update(cur, &rec);
@@ -1534,6 +1568,12 @@ xfs_dialloc_ag(
 				   XFS_INODES_PER_CHUNK) == 0);
 	ino = XFS_AGINO_TO_INO(mp, pag->pag_agno, rec.ir_startino + offset);
 
+	if (xfs_ag_has_sickness(pag, XFS_SICK_AG_INODES)) {
+		error = xfs_dialloc_check_ino(pag, tp, ino);
+		if (error)
+			goto error_cur;
+	}
+
 	/*
 	 * Modify or remove the finobt record.
 	 */

From patchwork Sun Dec 31 22:36:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13507912
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 649BFC13B
	for ; Sun, 31 Dec 2023 22:36:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YbwBLpKF"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D1A4C433C7;
	Sun, 31 Dec 2023 22:36:11 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1704062171;
	bh=yjGx4Q2QpJo23s0rUwxRiGWEO2YAPwy5CaAXIn6f5p8=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=YbwBLpKFKNz1rpoqV/qpGRgb8XBNRXCfGeZtH28Xgbh65djdGynZ2sbD0/J6HiZgv
	 0rIY3h40/IGigpmTZMZEyIcsYFY0n73oGnDDs6HjjwcReESvf4boZwxqJtdrU81IW0
	 OmPTABgwH59t8k7j8SSoIe358KNHpIjJTYkxY5tsfNQmPoOc2daKEvYnn4Zcx1VcFW
	 x2hNLjRx5CySoXjBmQUFyOOpOe3FI+f/6bOFFzNp3ZVogxCGsfKoeBGR5JdFZphu7a
	 CYEPhgbNg8ce1b6yuPTQfmefKNKAxBw6PyxFkWF1gxXSmQIaijMnJsz7dvI1RtcT/r
	 oZxp5T6bueTXw==
Date: Sun, 31 Dec 2023 14:36:10 -0800
Subject: [PATCH 3/4] libxfs: port the bumplink function from the kernel
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: linux-xfs@vger.kernel.org
Message-ID: <170404998332.1797172.7585897571914908972.stgit@frogsfrogsfrogs>
In-Reply-To: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
References: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

Port the xfs_bumplink function from the kernel and use it to replace raw
calls to inc_nlink.  The next patch will need this common function to
prevent integer overflows in the link count.

Signed-off-by: Darrick J. Wong
---
 include/xfs_inode.h | 2 ++
 libxfs/util.c | 17 +++++++++++++++++
 mkfs/proto.c | 4 ++--
 repair/phase6.c | 10 +++++-----
 4 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/include/xfs_inode.h b/include/xfs_inode.h
index 302df4c6f7e..47959314811 100644
--- a/include/xfs_inode.h
+++ b/include/xfs_inode.h
@@ -348,6 +348,8 @@ extern void	libxfs_trans_ichgtime(struct xfs_trans *,
 				struct xfs_inode *, int);
 extern int	libxfs_iflush_int (struct xfs_inode *, struct xfs_buf *);
 
+void libxfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip);
+
 /* Inode Cache Interfaces */
 extern int	libxfs_iget(struct xfs_mount *, struct xfs_trans *, xfs_ino_t,
 				uint, struct xfs_inode **);
diff --git a/libxfs/util.c b/libxfs/util.c
index c1ddaf92c8a..11978529ed6 100644
--- a/libxfs/util.c
+++ b/libxfs/util.c
@@ -240,6 +240,23 @@ xfs_inode_propagate_flags(
 	ip->i_diflags |= di_flags;
 }
 
+/*
+ * Increment the link count on an inode & log the change.
+ */
+void
+libxfs_bumplink(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip)
+{
+	struct inode		*inode = VFS_I(ip);
+
+	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
+
+	inc_nlink(inode);
+
+	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+}
+
 /*
  * Initialise a newly allocated inode and return the in-core inode to the
  * caller locked exclusively.
diff --git a/mkfs/proto.c b/mkfs/proto.c
index 0f2facbc32e..457899ac178 100644
--- a/mkfs/proto.c
+++ b/mkfs/proto.c
@@ -590,7 +590,7 @@ parseproto(
 				&creds, fsxp, &ip);
 		if (error)
 			fail(_("Inode allocation failed"), error);
-		inc_nlink(VFS_I(ip));		/* account for . */
+		libxfs_bumplink(tp, ip);		/* account for . */
 		if (!pip) {
 			pip = ip;
 			mp->m_sb.sb_rootino = ip->i_ino;
@@ -600,7 +600,7 @@ parseproto(
 			libxfs_trans_ijoin(tp, pip, 0);
 			xname.type = XFS_DIR3_FT_DIR;
 			newdirent(mp, tp, pip, &xname, ip->i_ino);
-			inc_nlink(VFS_I(pip));
+			libxfs_bumplink(tp, pip);
 			libxfs_trans_log_inode(tp, pip, XFS_ILOG_CORE);
 		}
 		newdirectory(mp, tp, ip, pip);
diff --git a/repair/phase6.c b/repair/phase6.c
index ac037cf80ad..75391378291 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -944,7 +944,7 @@ mk_orphanage(xfs_mount_t *mp)
 		do_error(_("%s inode allocation failed %d\n"),
 			ORPHANAGE, error);
 	}
-	inc_nlink(VFS_I(ip));		/* account for . */
+	libxfs_bumplink(tp, ip);		/* account for . */
 	ino = ip->i_ino;
 
 	irec = find_inode_rec(mp,
@@ -996,7 +996,7 @@ mk_orphanage(xfs_mount_t *mp)
 	 * for .. in the new directory, and update the irec copy of the
 	 * on-disk nlink so we don't fail the link count check later.
 	 */
-	inc_nlink(VFS_I(pip));
+	libxfs_bumplink(tp, pip);
 	irec = find_inode_rec(mp, XFS_INO_TO_AGNO(mp, mp->m_sb.sb_rootino),
 				  XFS_INO_TO_AGINO(mp, mp->m_sb.sb_rootino));
 	add_inode_ref(irec, 0);
@@ -1090,7 +1090,7 @@ mv_orphanage(
 			if (irec)
 				add_inode_ref(irec, ino_offset);
 			else
-				inc_nlink(VFS_I(orphanage_ip));
+				libxfs_bumplink(tp, orphanage_ip);
 			libxfs_trans_log_inode(tp, orphanage_ip, XFS_ILOG_CORE);
 
 			err = -libxfs_dir_createname(tp, ino_p, &xfs_name_dotdot,
@@ -1099,7 +1099,7 @@ mv_orphanage(
 				do_error(
 	_("creation of .. entry failed (%d)\n"), err);
 
-			inc_nlink(VFS_I(ino_p));
+			libxfs_bumplink(tp, ino_p);
 			libxfs_trans_log_inode(tp, ino_p, XFS_ILOG_CORE);
 			err = -libxfs_trans_commit(tp);
 			if (err)
@@ -1124,7 +1124,7 @@ mv_orphanage(
 			if (irec)
 				add_inode_ref(irec, ino_offset);
 			else
-				inc_nlink(VFS_I(orphanage_ip));
+				libxfs_bumplink(tp, orphanage_ip);
 			libxfs_trans_log_inode(tp, orphanage_ip, XFS_ILOG_CORE);
 
 			/*

From patchwork Sun Dec 31 22:36:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13507913
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 34787C147
	for ; Sun, 31 Dec 2023 22:36:26 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nfOw874l"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id B99B2C433C8;
	Sun, 31 Dec 2023 22:36:26 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1704062186;
	bh=Cz/CmfUNHSHRdaxYFhDnP6yMI3jv2z9NR2h+F9MmWSw=;
	h=Date:Subject:From:To:Cc:In-Reply-To:References:From;
	b=nfOw874lZbKjXuQS7D2TPhjMDGBnJp9im1xXg17fMqRtC6dAuC9vK77AZrF4p+BST
	 +//b6WVJbLg6HdPEhlpPL8k5IE2gtrQcg6hxze7KQ/ul7cYf33Ey8jIYvUNCEhTm6r
	 8TKMsIOwWHmjIIBiTONN+t2Qh/RDFidLiapCmwP0v654ffXEo4tnhmQiRVIEL7mewJ
	 sGKvYSJLlHdaNj8wMSPE8fSN0WHOhG3PDG2LVm1TdTfJ3KRp/vfJKS0EvwiTW8KVsH
	 BODPXm3mflU8JddG9yuf7YdaeLQpF/NyC9EmOV4jA2bWWvngMC5wDaEQJwpZzpgE4I
	 yGxA/DaIsKjCw==
Date: Sun, 31 Dec 2023 14:36:26 -0800
Subject: [PATCH 4/4] xfs: pin inodes that would otherwise overflow link count
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: linux-xfs@vger.kernel.org
Message-ID: <170404998345.1797172.12807191836564310136.stgit@frogsfrogsfrogs>
In-Reply-To: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
References: <170404998289.1797172.11188208357520292150.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Darrick J. Wong

The VFS inc_nlink function does not explicitly check for integer
overflows in the i_nlink field.  Instead, it checks the link count
against s_max_links in the vfs_{link,create,rename} functions.
XFS sets the maximum link count to 2.1 billion, so integer overflows
should not be a problem.

However, it's possible that online repair could find that a file has
more than four billion links, particularly if the link count got
corrupted while creating hardlinks to the file.  The di_nlinkv2 field is
not large enough to store a value larger than 2^32 - 1, so we ought to
define a magic pin value of ~0U which means that the inode never gets
deleted.  This will prevent a use-after-free (UAF) error if the repair
finds this situation and users begin deleting links to the file.

Signed-off-by: Darrick J. Wong
---
 libxfs/util.c | 3 ++-
 libxfs/xfs_format.h | 6 ++++++
 repair/incore_ino.c | 3 ++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/libxfs/util.c b/libxfs/util.c
index 11978529ed6..03191ebcd08 100644
--- a/libxfs/util.c
+++ b/libxfs/util.c
@@ -252,7 +252,8 @@ libxfs_bumplink(
 
 	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
 
-	inc_nlink(inode);
+	if (inode->i_nlink != XFS_NLINK_PINNED)
+		inc_nlink(inode);
 
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 }
diff --git a/libxfs/xfs_format.h b/libxfs/xfs_format.h
index 7861539ab8b..ec25010b577 100644
--- a/libxfs/xfs_format.h
+++ b/libxfs/xfs_format.h
@@ -912,6 +912,12 @@ static inline uint xfs_dinode_size(int version)
  */
 #define	XFS_MAXLINK		((1U << 31) - 1U)
 
+/*
+ * Any file that hits the maximum ondisk link count should be pinned to avoid
+ * a use-after-free situation.
+ */
+#define XFS_NLINK_PINNED	(~0U)
+
 /*
  * Values for di_format
  *
diff --git a/repair/incore_ino.c b/repair/incore_ino.c
index 0dd7a2f060f..b0b41a2cc5c 100644
--- a/repair/incore_ino.c
+++ b/repair/incore_ino.c
@@ -108,7 +108,8 @@ void add_inode_ref(struct ino_tree_node *irec, int ino_offset)
 		nlink_grow_16_to_32(irec);
 		/*FALLTHRU*/
 	case sizeof(uint32_t):
-		irec->ino_un.ex_data->counted_nlinks.un32[ino_offset]++;
+		if (irec->ino_un.ex_data->counted_nlinks.un32[ino_offset] != XFS_NLINK_PINNED)
+			irec->ino_un.ex_data->counted_nlinks.un32[ino_offset]++;
 		break;
 	default:
 		ASSERT(0);
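
As a rough illustration of the pinning rule in patch 4/4, the standalone C
sketch below models a 32-bit link count that saturates at the pin value
instead of wrapping to zero.  NLINK_PINNED mirrors XFS_NLINK_PINNED from
the patch.  nlink_bump() and nlink_drop() are hypothetical helpers written
for this note and are not part of libxfs.  The drop side in particular is
an assumption drawn from the commit message (a pinned inode "never gets
deleted"), since this series only changes the increment paths.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors XFS_NLINK_PINNED in the patch: every bit set in a 32-bit count. */
#define NLINK_PINNED	((uint32_t)~0U)

/*
 * Hypothetical helper, not a libxfs function: bump a link count the way
 * the patched libxfs_bumplink() and add_inode_ref() do.  Once the count
 * reaches the pin value it stays there rather than wrapping to zero.
 */
static inline uint32_t nlink_bump(uint32_t nlink)
{
	if (nlink != NLINK_PINNED)
		nlink++;
	return nlink;
}

/*
 * Hypothetical helper for the decrement side.  This is an assumption
 * based on the commit message: a pinned count never drops, so the inode
 * can never reach zero links and be freed while links still exist.
 */
static inline uint32_t nlink_drop(uint32_t nlink)
{
	if (nlink == NLINK_PINNED || nlink == 0)
		return nlink;
	return nlink - 1;
}
```

With these helpers, bumping a count of 0xFFFFFFFE yields the pin value, and
any further bump or drop leaves it unchanged, which is the behaviour the
commit message relies on to avoid the use-after-free.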