xfs_db: do some checks in init to prevent corruption

Message ID	20240820015654.17418-1-liuhuan01@kylinos.cn (mailing list archive)
State	Accepted, archived
Headers	show Received: from mailgw.kylinos.cn (mailgw.kylinos.cn [124.126.103.232]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 27ABF17FE for <linux-xfs@vger.kernel.org>; Tue, 20 Aug 2024 02:18:28 +0000 (UTC) From: liuhuan01@kylinos.cn To: linux-xfs@vger.kernel.org Cc: djwong@kernel.org, cmaiolino@redhat.com, liuh <liuhuan01@kylinos.cn> Subject: [PATCH] xfs_db: do some checks in init to prevent corruption Date: Tue, 20 Aug 2024 09:56:54 +0800 Message-Id: <20240820015654.17418-1-liuhuan01@kylinos.cn> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	xfs_db: do some checks in init to prevent corruption \| expand xfs_db: do some checks in init to prevent corruption

Message ID

20240820015654.17418-1-liuhuan01@kylinos.cn (mailing list archive)

State

Accepted, archived

Headers

From: liuhuan01@kylinos.cn
To: linux-xfs@vger.kernel.org
Cc: djwong@kernel.org,
	cmaiolino@redhat.com,
	liuh <liuhuan01@kylinos.cn>
Subject: [PATCH] xfs_db: do some checks in init to prevent corruption
Date: Tue, 20 Aug 2024 09:56:54 +0800
Message-Id: <20240820015654.17418-1-liuhuan01@kylinos.cn>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

xfs_db: do some checks in init to prevent corruption | expand

Commit Message

liuh Aug. 20, 2024, 1:56 a.m. UTC

From: liuh <liuhuan01@kylinos.cn>

Recently, I was testing xfstests. When I run xfs/350 case, it always generate coredumps during the process.
Total two types of coredump:
	(a) xfs_db -c "sb 0" -c "print sectlog" /dev/loop1
	(b) xfs_db -c "sb 0" -c "print agblock" /dev/loop1

For coredump (a) system will generate signal SIGSEGV corrupt the process. And the stack as follow:
corrupt at: q = *++b; in function crc32_body
	#0  crc32_body
	#1  crc32_le_generic
	#2  crc32c_le
	#3  xfs_start_cksum_safe
	#4  libxfs_verify_cksum
	#5  xfs_buf_verify_cksum
	#6  xfs_agf_read_verify
	#7  libxfs_readbuf_verify
	#8  libxfs_buf_read_map
	#9  libxfs_trans_read_buf_map
	#10 libxfs_trans_read_buf
	#11 libxfs_read_agf
	#12 libxfs_alloc_read_agf
	#13 libxfs_initialize_perag_data
	#14 init
	#15 main

For coredump (b) system will generate signal SIGFPE corrupt the process. And the stack as follow:
corrupt at: (*bpp)->b_pag = xfs_perag_get(btp->bt_mount, xfs_daddr_to_agno(btp->bt_mount, blkno)); in function libxfs_getbuf_flags
	#0  libxfs_getbuf_flags
	#1  libxfs_getbuf_flags
	#2  libxfs_buf_read_map
	#3  libxfs_buf_read
	#4  libxfs_mount
	#5  init
	#6  main

Analyze the above two issues separately:
	coredump (a) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_sectlog, it was 128;
	coredump (b) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_agblocks, it was 0;

Current, xfs_db doesn't validate the superblock, it goes to corruption if superblock is damaged, theoretically.

So do some check in xfs_db init function to prevent corruption and leave some hints.

Signed-off-by: liuh <liuhuan01@kylinos.cn>
---
 db/init.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Comments

Darrick J. Wong Aug. 21, 2024, 12:31 a.m. UTC | #1

On Tue, Aug 20, 2024 at 09:56:54AM +0800, liuhuan01@kylinos.cn wrote:
> From: liuh <liuhuan01@kylinos.cn>
> 
> Recently, I was testing xfstests. When I run xfs/350 case, it always generate coredumps during the process.
> Total two types of coredump:
> 	(a) xfs_db -c "sb 0" -c "print sectlog" /dev/loop1
> 	(b) xfs_db -c "sb 0" -c "print agblock" /dev/loop1
> 
> For coredump (a) system will generate signal SIGSEGV corrupt the process. And the stack as follow:
> corrupt at: q = *++b; in function crc32_body
> 	#0  crc32_body
> 	#1  crc32_le_generic
> 	#2  crc32c_le
> 	#3  xfs_start_cksum_safe
> 	#4  libxfs_verify_cksum
> 	#5  xfs_buf_verify_cksum
> 	#6  xfs_agf_read_verify
> 	#7  libxfs_readbuf_verify
> 	#8  libxfs_buf_read_map
> 	#9  libxfs_trans_read_buf_map
> 	#10 libxfs_trans_read_buf
> 	#11 libxfs_read_agf
> 	#12 libxfs_alloc_read_agf
> 	#13 libxfs_initialize_perag_data
> 	#14 init
> 	#15 main
> 
> For coredump (b) system will generate signal SIGFPE corrupt the process. And the stack as follow:
> corrupt at: (*bpp)->b_pag = xfs_perag_get(btp->bt_mount, xfs_daddr_to_agno(btp->bt_mount, blkno)); in function libxfs_getbuf_flags
> 	#0  libxfs_getbuf_flags
> 	#1  libxfs_getbuf_flags
> 	#2  libxfs_buf_read_map
> 	#3  libxfs_buf_read
> 	#4  libxfs_mount
> 	#5  init
> 	#6  main
> 
> Analyze the above two issues separately:
> 	coredump (a) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_sectlog, it was 128;
> 	coredump (b) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_agblocks, it was 0;

One patch per bugfix, please.

> Current, xfs_db doesn't validate the superblock, it goes to corruption if superblock is damaged, theoretically.

libxfs does, but xfs_db doesn't quite use the verifiers, being a
debugger and all.

> So do some check in xfs_db init function to prevent corruption and leave some hints.
> 
> Signed-off-by: liuh <liuhuan01@kylinos.cn>
> ---
>  db/init.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/db/init.c b/db/init.c
> index cea25ae5..4402f85f 100644
> --- a/db/init.c
> +++ b/db/init.c
> @@ -129,6 +129,13 @@ init(
>  		}
>  	}
>  
> +	if (unlikely(sbp->sb_agblocks == 0)) {
> +		fprintf(stderr,
> +			_("%s: device %s agblocks unexpected\n"),
> +			progname, x.data.name);
> +		exit(1);
> +	}
> +
>  	agcount = sbp->sb_agcount;
>  	mp = libxfs_mount(&xmount, sbp, &x, LIBXFS_MOUNT_DEBUGGER);
>  	if (!mp) {
> @@ -140,6 +147,13 @@ init(
>  	mp->m_log = &xlog;
>  	blkbb = 1 << mp->m_blkbb_log;
>  
> +	if (unlikely(sbp->sb_sectlog < XFS_MIN_SECTORSIZE_LOG || sbp->sb_sectlog > XFS_MAX_SECTORSIZE_LOG)) {

Please fix the long lines.

> +		fprintf(stderr,
> +			_("%s: device %s sectlog(%u) unexpected\n"),
> +			progname, x.data.name, sbp->sb_sectlog);
> +		exit(1);

If xfs_db is being run in expert mode, could we try to install
reasonable values here?  Such as the mkfs defaults?  Or let the user
specify an override via extended cli option?

--D

> +	}
> +
>  	/* Did we limit a broken agcount in libxfs_mount? */
>  	if (sbp->sb_agcount != agcount)
>  		exitcode = 1;
> -- 
> 2.43.0
> 
>

diff --git a/db/init.c b/db/init.c
index cea25ae5..4402f85f 100644
--- a/db/init.c
+++ b/db/init.c
@@ -129,6 +129,13 @@  init(
 		}
 	}
 
+	if (unlikely(sbp->sb_agblocks == 0)) {
+		fprintf(stderr,
+			_("%s: device %s agblocks unexpected\n"),
+			progname, x.data.name);
+		exit(1);
+	}
+
 	agcount = sbp->sb_agcount;
 	mp = libxfs_mount(&xmount, sbp, &x, LIBXFS_MOUNT_DEBUGGER);
 	if (!mp) {
@@ -140,6 +147,13 @@  init(
 	mp->m_log = &xlog;
 	blkbb = 1 << mp->m_blkbb_log;
 
+	if (unlikely(sbp->sb_sectlog < XFS_MIN_SECTORSIZE_LOG || sbp->sb_sectlog > XFS_MAX_SECTORSIZE_LOG)) {
+		fprintf(stderr,
+			_("%s: device %s sectlog(%u) unexpected\n"),
+			progname, x.data.name, sbp->sb_sectlog);
+		exit(1);
+	}
+
 	/* Did we limit a broken agcount in libxfs_mount? */
 	if (sbp->sb_agcount != agcount)
 		exitcode = 1;

xfs_db: do some checks in init to prevent corruption

Commit Message

Comments

Patch