diff mbox series

xfs_db: do some checks in init to prevent corruption

Message ID 20240820015654.17418-1-liuhuan01@kylinos.cn (mailing list archive)
State New
Headers show
Series xfs_db: do some checks in init to prevent corruption | expand

Commit Message

liuh Aug. 20, 2024, 1:56 a.m. UTC
From: liuh <liuhuan01@kylinos.cn>

Recently, I was testing xfstests. When I run xfs/350 case, it always generate coredumps during the process.
Total two types of coredump:
	(a) xfs_db -c "sb 0" -c "print sectlog" /dev/loop1
	(b) xfs_db -c "sb 0" -c "print agblock" /dev/loop1

For coredump (a) system will generate signal SIGSEGV corrupt the process. And the stack as follow:
corrupt at: q = *++b; in function crc32_body
	#0  crc32_body
	#1  crc32_le_generic
	#2  crc32c_le
	#3  xfs_start_cksum_safe
	#4  libxfs_verify_cksum
	#5  xfs_buf_verify_cksum
	#6  xfs_agf_read_verify
	#7  libxfs_readbuf_verify
	#8  libxfs_buf_read_map
	#9  libxfs_trans_read_buf_map
	#10 libxfs_trans_read_buf
	#11 libxfs_read_agf
	#12 libxfs_alloc_read_agf
	#13 libxfs_initialize_perag_data
	#14 init
	#15 main

For coredump (b) system will generate signal SIGFPE corrupt the process. And the stack as follow:
corrupt at: (*bpp)->b_pag = xfs_perag_get(btp->bt_mount, xfs_daddr_to_agno(btp->bt_mount, blkno)); in function libxfs_getbuf_flags
	#0  libxfs_getbuf_flags
	#1  libxfs_getbuf_flags
	#2  libxfs_buf_read_map
	#3  libxfs_buf_read
	#4  libxfs_mount
	#5  init
	#6  main

Analyze the above two issues separately:
	coredump (a) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_sectlog, it was 128;
	coredump (b) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_agblocks, it was 0;

Current, xfs_db doesn't validate the superblock, it goes to corruption if superblock is damaged, theoretically.

So do some check in xfs_db init function to prevent corruption and leave some hints.

Signed-off-by: liuh <liuhuan01@kylinos.cn>
---
 db/init.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Comments

Darrick J. Wong Aug. 21, 2024, 12:31 a.m. UTC | #1
On Tue, Aug 20, 2024 at 09:56:54AM +0800, liuhuan01@kylinos.cn wrote:
> From: liuh <liuhuan01@kylinos.cn>
> 
> Recently, I was testing xfstests. When I run xfs/350 case, it always generate coredumps during the process.
> Total two types of coredump:
> 	(a) xfs_db -c "sb 0" -c "print sectlog" /dev/loop1
> 	(b) xfs_db -c "sb 0" -c "print agblock" /dev/loop1
> 
> For coredump (a) system will generate signal SIGSEGV corrupt the process. And the stack as follow:
> corrupt at: q = *++b; in function crc32_body
> 	#0  crc32_body
> 	#1  crc32_le_generic
> 	#2  crc32c_le
> 	#3  xfs_start_cksum_safe
> 	#4  libxfs_verify_cksum
> 	#5  xfs_buf_verify_cksum
> 	#6  xfs_agf_read_verify
> 	#7  libxfs_readbuf_verify
> 	#8  libxfs_buf_read_map
> 	#9  libxfs_trans_read_buf_map
> 	#10 libxfs_trans_read_buf
> 	#11 libxfs_read_agf
> 	#12 libxfs_alloc_read_agf
> 	#13 libxfs_initialize_perag_data
> 	#14 init
> 	#15 main
> 
> For coredump (b) system will generate signal SIGFPE corrupt the process. And the stack as follow:
> corrupt at: (*bpp)->b_pag = xfs_perag_get(btp->bt_mount, xfs_daddr_to_agno(btp->bt_mount, blkno)); in function libxfs_getbuf_flags
> 	#0  libxfs_getbuf_flags
> 	#1  libxfs_getbuf_flags
> 	#2  libxfs_buf_read_map
> 	#3  libxfs_buf_read
> 	#4  libxfs_mount
> 	#5  init
> 	#6  main
> 
> Analyze the above two issues separately:
> 	coredump (a) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_sectlog, it was 128;
> 	coredump (b) was caused by the corrupt superblock metadata: (mp)->m_sb.sb_agblocks, it was 0;

One patch per bugfix, please.

> Current, xfs_db doesn't validate the superblock, it goes to corruption if superblock is damaged, theoretically.

libxfs does, but xfs_db doesn't quite use the verifiers, being a
debugger and all.

> So do some check in xfs_db init function to prevent corruption and leave some hints.
> 
> Signed-off-by: liuh <liuhuan01@kylinos.cn>
> ---
>  db/init.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/db/init.c b/db/init.c
> index cea25ae5..4402f85f 100644
> --- a/db/init.c
> +++ b/db/init.c
> @@ -129,6 +129,13 @@ init(
>  		}
>  	}
>  
> +	if (unlikely(sbp->sb_agblocks == 0)) {
> +		fprintf(stderr,
> +			_("%s: device %s agblocks unexpected\n"),
> +			progname, x.data.name);
> +		exit(1);
> +	}
> +
>  	agcount = sbp->sb_agcount;
>  	mp = libxfs_mount(&xmount, sbp, &x, LIBXFS_MOUNT_DEBUGGER);
>  	if (!mp) {
> @@ -140,6 +147,13 @@ init(
>  	mp->m_log = &xlog;
>  	blkbb = 1 << mp->m_blkbb_log;
>  
> +	if (unlikely(sbp->sb_sectlog < XFS_MIN_SECTORSIZE_LOG || sbp->sb_sectlog > XFS_MAX_SECTORSIZE_LOG)) {

Please fix the long lines.

> +		fprintf(stderr,
> +			_("%s: device %s sectlog(%u) unexpected\n"),
> +			progname, x.data.name, sbp->sb_sectlog);
> +		exit(1);

If xfs_db is being run in expert mode, could we try to install
reasonable values here?  Such as the mkfs defaults?  Or let the user
specify an override via extended cli option?

--D

> +	}
> +
>  	/* Did we limit a broken agcount in libxfs_mount? */
>  	if (sbp->sb_agcount != agcount)
>  		exitcode = 1;
> -- 
> 2.43.0
> 
>
diff mbox series

Patch

diff --git a/db/init.c b/db/init.c
index cea25ae5..4402f85f 100644
--- a/db/init.c
+++ b/db/init.c
@@ -129,6 +129,13 @@  init(
 		}
 	}
 
+	if (unlikely(sbp->sb_agblocks == 0)) {
+		fprintf(stderr,
+			_("%s: device %s agblocks unexpected\n"),
+			progname, x.data.name);
+		exit(1);
+	}
+
 	agcount = sbp->sb_agcount;
 	mp = libxfs_mount(&xmount, sbp, &x, LIBXFS_MOUNT_DEBUGGER);
 	if (!mp) {
@@ -140,6 +147,13 @@  init(
 	mp->m_log = &xlog;
 	blkbb = 1 << mp->m_blkbb_log;
 
+	if (unlikely(sbp->sb_sectlog < XFS_MIN_SECTORSIZE_LOG || sbp->sb_sectlog > XFS_MAX_SECTORSIZE_LOG)) {
+		fprintf(stderr,
+			_("%s: device %s sectlog(%u) unexpected\n"),
+			progname, x.data.name, sbp->sb_sectlog);
+		exit(1);
+	}
+
 	/* Did we limit a broken agcount in libxfs_mount? */
 	if (sbp->sb_agcount != agcount)
 		exitcode = 1;