[v2] xfs: avoid LR buffer overrun due to crafted h_len

Message ID 20200902141923.26422-1-hsiangkao@redhat.com (mailing list archive)
State Superseded

Commit Message

Gao Xiang Sept. 2, 2020, 2:19 p.m. UTC
Currently, crafted h_len has been blocked for the log
header of the tail block in commit a70f9fe52daa ("xfs:
detect and handle invalid iclog size set by mkfs").

However, each log record could still have crafted
h_len and cause log record buffer overrun. So let's
check h_len for each log record as well instead.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
v2: fix a misjudgement "unlikely(hlen >= hsize)"

 fs/xfs/xfs_log_recover.c | 70 +++++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 33 deletions(-)

Comments

Brian Foster Sept. 2, 2020, 5:38 p.m. UTC | #1
On Wed, Sep 02, 2020 at 10:19:23PM +0800, Gao Xiang wrote:
> Currently, crafted h_len has been blocked for the log
> header of the tail block in commit a70f9fe52daa ("xfs:
> detect and handle invalid iclog size set by mkfs").
> 

Ok, so according to that commit log the original purpose of this code
was to work around a quirky mkfs condition where record length of an
unmount record was enlarged but the iclog buffer size remained at 32k.
The fix is to simply increase the size of iclog buf.

> However, each log record could still have crafted
> h_len and cause log record buffer overrun. So let's
> check h_len for each log record as well instead.
> 

Is this something you've observed or attempted to reproduce, or is this
based on code inspection?

> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> ---
> v2: fix a misjudgement "unlikely(hlen >= hsize)"
> 
>  fs/xfs/xfs_log_recover.c | 70 +++++++++++++++++++++-------------------
>  1 file changed, 37 insertions(+), 33 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index e2ec91b2d0f4..2d9195fb9367 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2904,7 +2904,8 @@ STATIC int
>  xlog_valid_rec_header(
>  	struct xlog		*log,
>  	struct xlog_rec_header	*rhead,
> -	xfs_daddr_t		blkno)
> +	xfs_daddr_t		blkno,
> +	int			hsize)
>  {
>  	int			hlen;
>  
> @@ -2920,10 +2921,39 @@ xlog_valid_rec_header(
>  		return -EFSCORRUPTED;
>  	}
>  
> -	/* LR body must have data or it wouldn't have been written */
> +	/*
> +	 * LR body must have data (or it wouldn't have been written) and
> +	 * h_len must not be greater than h_size with one exception.
> +	 *
> +	 * That is that xfsprogs has a bug where record length is based on
> +	 * lsunit but h_size (iclog size) is hardcoded to 32k. This means
> +	 * the log buffer allocated can be too small for the record to
> +	 * cause an overrun.
> +	 *
> +	 * Detect this condition here. Use lsunit for the buffer size as
> +	 * long as this looks like the mkfs case. Otherwise, return an
> +	 * error to avoid a buffer overrun.
> +	 */
>  	hlen = be32_to_cpu(rhead->h_len);
> -	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX))
> +	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0))

Why is the second part of the check removed?

>  		return -EFSCORRUPTED;
> +
> +	if (hsize && XFS_IS_CORRUPT(log->l_mp,
> +				    hsize < be32_to_cpu(rhead->h_size)))
> +		return -EFSCORRUPTED;
> +	hsize = be32_to_cpu(rhead->h_size);

I'm a little confused why we take hsize as a parameter as well as read
it from the record header. If we're validating a particular record,
shouldn't we use the size as specified by that record?

Also FWIW I think pulling bits of logic out of the XFS_IS_CORRUPT()
check makes this a little harder to read than just putting the entire
logic statement within the macro.

> +
> +	if (unlikely(hlen > hsize)) {

I think we've made a point to avoid the [un]likely() modifiers in XFS as
they don't usually have a noticeable impact. I certainly wouldn't expect
it to in log recovery.

> +		if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize ||
> +				   rhead->h_num_logops != cpu_to_be32(1)))
> +			return -EFSCORRUPTED;
> +
> +		xfs_warn(log->l_mp,
> +		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> +			 hsize, log->l_mp->m_logbsize);
> +		rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize);

I also find updating the header structure as such down in a "validation
helper" a bit obscured.

> +	}
> +
>  	if (XFS_IS_CORRUPT(log->l_mp,
>  			   blkno > log->l_logBBsize || blkno > INT_MAX))
>  		return -EFSCORRUPTED;
...
> @@ -3096,7 +3100,7 @@ xlog_do_recovery_pass(
>  			}
>  			rhead = (xlog_rec_header_t *)offset;
>  			error = xlog_valid_rec_header(log, rhead,
> -						split_hblks ? blk_no : 0);
> +					split_hblks ? blk_no : 0, h_size);
>  			if (error)
>  				goto bread_err2;
>  
> @@ -3177,7 +3181,7 @@ xlog_do_recovery_pass(
>  			goto bread_err2;
>  
>  		rhead = (xlog_rec_header_t *)offset;
> -		error = xlog_valid_rec_header(log, rhead, blk_no);
> +		error = xlog_valid_rec_header(log, rhead, blk_no, h_size);
>  		if (error)
>  			goto bread_err2;

In these two cases we've already allocated the record header and data
buffers and we're walking through the log records doing recovery. Given
that, it seems like the purpose of the parameter is more to check the
subsequent records against the size of the current record buffer. That
seems like a reasonable check to incorporate, but I think the mkfs
workaround logic is misplaced in a generic record validation helper.
IIUC that is a very special case that should only apply to the first
record in the log and only impacts the size of the buffer we allocate to
read in the remaining records.

Can we rework this to leave the mkfs workaround logic as is and update
the validation helper to check that each record length fits in the size
of the buffer we've decided to allocate? I'd also suggest to rename the
new parameter to something like 'bufsize' instead of 'h_size' to clarify
what it actually means in the context of xlog_valid_rec_header().

Brian

>  
> -- 
> 2.18.1
>
Gao Xiang Sept. 2, 2020, 10:47 p.m. UTC | #2
Hi Brian,

On Wed, Sep 02, 2020 at 01:38:59PM -0400, Brian Foster wrote:
> On Wed, Sep 02, 2020 at 10:19:23PM +0800, Gao Xiang wrote:

...

> > However, each log record could still have crafted
> > h_len and cause log record buffer overrun. So let's
> > check h_len for each log record as well instead.
> > 
> 
> Is this something you've observed or attempted to reproduce, or is this
> based on code inspection?

Thanks for your review.

It's based on code inspection; the logic seems straightforward:

in xlog_do_recovery_pass()
	...

	dbp = xlog_alloc_buffer(log, BTOBB(h_size));
					^ here uses h_size from the tail block
	if (!dbp) {
		kmem_free(hbp);
		return -ENOMEM;
	}

	if (tail_blk > head_blk) {
		while (blk_no < log->l_logBBsize) {
			xlog_bread
			xlog_valid_rec_header
			xlog_recover_process
		}
	}

	while (blk_no < head_blk) {
		xlog_bread
		xlog_valid_rec_header
		xlog_recover_process
	}


in xlog_recover_process()
	crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len));
							^here
	...

and also xlog_recover_process_data()
	end = dp + be32_to_cpu(rhead->h_len);
	...
	while ((dp < end) && num_logops) {
		ohead = (struct xlog_op_header *)dp;
		(all accesses through dp/ohead are affected if num_logops is crafted as well)
		...
	}
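
To make the concern above concrete, here is a minimal userspace sketch, not kernel code. The struct and function names (demo_rec_header, demo_overrun_bytes) are hypothetical and the fields are assumed to be already converted to CPU byte order. It only computes how far a walk over [dp, dp + h_len) would run past a data buffer that was sized from the first record's h_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the on-disk log record header. */
struct demo_rec_header {
	uint32_t h_len;		/* record payload length in bytes */
	uint32_t h_size;	/* iclog size claimed by this record */
};

/*
 * Number of bytes a walk over [dp, dp + h_len) would touch beyond a data
 * buffer of bufsize bytes. Zero means the record fits in the buffer.
 */
static size_t demo_overrun_bytes(const struct demo_rec_header *rhead,
				 size_t bufsize)
{
	size_t len;

	assert(rhead != NULL);
	len = rhead->h_len;
	return len > bufsize ? len - bufsize : 0;
}
```

With a 32k buffer, a later record claiming h_len = 65536 would walk 32768 bytes past the end, which is the situation a per-record check is meant to reject.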


> 
> > -	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX))
> > +	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0))
> 
> Why is the second part of the check removed?

If hlen <= hsize (with hsize > 0), then hlen is necessarily <= INT_MAX.
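
A small sketch of that argument, under the assumption that the raw on-disk length is treated as an unsigned 32-bit value and the buffer size is a positive int. The function name demo_len_bounded is made up for illustration and this is not the kernel helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when raw_len is non-zero and fits in bufsize. Because
 * bufsize is a positive int, a true result also implies
 * raw_len <= INT_MAX, so no separate INT_MAX comparison is needed.
 */
static bool demo_len_bounded(uint32_t raw_len, int bufsize)
{
	assert(bufsize > 0);	/* caller must pass a real buffer size */
	return raw_len != 0 && raw_len <= (uint32_t)bufsize;
}
```

Note that this only holds when a buffer size is actually supplied; a call that skips the size comparison gets no upper bound from it.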

> 
> >  		return -EFSCORRUPTED;
> > +
> > +	if (hsize && XFS_IS_CORRUPT(log->l_mp,
> > +				    hsize < be32_to_cpu(rhead->h_size)))
> > +		return -EFSCORRUPTED;
> > +	hsize = be32_to_cpu(rhead->h_size);
> 
> I'm a little confused why we take hsize as a parameter as well as read
> it from the record header. If we're validating a particular record,
> shouldn't we use the size as specified by that record?
> 
> Also FWIW I think pulling bits of logic out of the XFS_IS_CORRUPT()
> check makes this a little harder to read than just putting the entire
> logic statement within the macro.

This seems to be partially answered in the last part of your email,
so I'll respond at the end of this email...

> 
> > +
> > +	if (unlikely(hlen > hsize)) {
> 
> I think we've made a point to avoid the [un]likely() modifiers in XFS as
> they don't usually have a noticeable impact. I certainly wouldn't expect
> it to in log recovery.

Honestly, I really don't want to get into a debate about [un]likely;
I had a long discussion with Dan Carpenter and a couple of other people
last year, but *shrug*

I used it here simply because XFS_IS_CORRUPT() has this annotation,
and it seems the xlog_valid_rec_header() logic will change in v3 anyway
if we leave the mkfs workaround logic as is.

> 
> > +		if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize ||
> > +				   rhead->h_num_logops != cpu_to_be32(1)))
> > +			return -EFSCORRUPTED;
> > +
> > +		xfs_warn(log->l_mp,
> > +		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> > +			 hsize, log->l_mp->m_logbsize);
> > +		rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize);
> 
> I also find updating the header structure as such down in a "validation
> helper" a bit obscured.

Same here; see my response at the end of the email...

> 
> > +	}
> > +
> >  	if (XFS_IS_CORRUPT(log->l_mp,
> >  			   blkno > log->l_logBBsize || blkno > INT_MAX))
> >  		return -EFSCORRUPTED;
> ...
> > @@ -3096,7 +3100,7 @@ xlog_do_recovery_pass(
> >  			}
> >  			rhead = (xlog_rec_header_t *)offset;
> >  			error = xlog_valid_rec_header(log, rhead,
> > -						split_hblks ? blk_no : 0);
> > +					split_hblks ? blk_no : 0, h_size);
> >  			if (error)
> >  				goto bread_err2;
> >  
> > @@ -3177,7 +3181,7 @@ xlog_do_recovery_pass(
> >  			goto bread_err2;
> >  
> >  		rhead = (xlog_rec_header_t *)offset;
> > -		error = xlog_valid_rec_header(log, rhead, blk_no);
> > +		error = xlog_valid_rec_header(log, rhead, blk_no, h_size);
> >  		if (error)
> >  			goto bread_err2;
> 
> In these two cases we've already allocated the record header and data
> buffers and we're walking through the log records doing recovery. Given
> that, it seems like the purpose of the parameter is more to check the
> subsequent records against the size of the current record buffer. That
> seems like a reasonable check to incorporate, but I think the mkfs

Yes

> workaround logic is misplaced in a generic record validation helper.
> IIUC that is a very special case that should only apply to the first
> record in the log and only impacts the size of the buffer we allocate to
> read in the remaining records.
> 
> Can we rework this to leave the mkfs workaround logic as is and update
> the validation helper to check that each record length fits in the size
> of the buffer we've decided to allocate? I'd also suggest to rename the
> new parameter to something like 'bufsize' instead of 'h_size' to clarify
> what it actually means in the context of xlog_valid_rec_header().

Ok, that is fine. I will leave the mkfs workaround logic as is and rename
the new parameter to bufsize.
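
For illustration only, the agreed direction could look roughly like the following userspace sketch. It is not the actual v3 patch: demo_valid_rec_len, demo_be32_to_cpu and DEMO_EFSCORRUPTED are hypothetical names, and treating bufsize == 0 as "no buffer allocated yet" is an assumption carried over from the v2 call on the tail block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_EFSCORRUPTED = 117 };	/* placeholder error value for this sketch */

/* Decode a big-endian 32-bit field, similar in spirit to be32_to_cpu(). */
static uint32_t demo_be32_to_cpu(const unsigned char b[4])
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/*
 * Per-record length check. bufsize == 0 means the data buffer has not
 * been sized yet (first record), so only the basic range check applies.
 * Otherwise the record length must fit in the buffer already allocated.
 */
static int demo_valid_rec_len(const unsigned char h_len_be[4],
			      uint32_t bufsize)
{
	uint32_t hlen;

	assert(h_len_be != NULL);
	hlen = demo_be32_to_cpu(h_len_be);

	if (hlen == 0 || hlen > INT32_MAX)
		return -DEMO_EFSCORRUPTED;
	if (bufsize && hlen > bufsize)
		return -DEMO_EFSCORRUPTED;
	return 0;
}
```

The mkfs workaround would stay in the caller, which decides the buffer size once and then passes it down for every subsequent record.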

Thanks,
Gao Xiang


> 
> Brian
> 
> >  
> > -- 
> > 2.18.1
> > 
>
Patch

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index e2ec91b2d0f4..2d9195fb9367 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2904,7 +2904,8 @@  STATIC int
 xlog_valid_rec_header(
 	struct xlog		*log,
 	struct xlog_rec_header	*rhead,
-	xfs_daddr_t		blkno)
+	xfs_daddr_t		blkno,
+	int			hsize)
 {
 	int			hlen;
 
@@ -2920,10 +2921,39 @@  xlog_valid_rec_header(
 		return -EFSCORRUPTED;
 	}
 
-	/* LR body must have data or it wouldn't have been written */
+	/*
+	 * LR body must have data (or it wouldn't have been written) and
+	 * h_len must not be greater than h_size with one exception.
+	 *
+	 * That is that xfsprogs has a bug where record length is based on
+	 * lsunit but h_size (iclog size) is hardcoded to 32k. This means
+	 * the log buffer allocated can be too small for the record to
+	 * cause an overrun.
+	 *
+	 * Detect this condition here. Use lsunit for the buffer size as
+	 * long as this looks like the mkfs case. Otherwise, return an
+	 * error to avoid a buffer overrun.
+	 */
 	hlen = be32_to_cpu(rhead->h_len);
-	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX))
+	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0))
 		return -EFSCORRUPTED;
+
+	if (hsize && XFS_IS_CORRUPT(log->l_mp,
+				    hsize < be32_to_cpu(rhead->h_size)))
+		return -EFSCORRUPTED;
+	hsize = be32_to_cpu(rhead->h_size);
+
+	if (unlikely(hlen > hsize)) {
+		if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize ||
+				   rhead->h_num_logops != cpu_to_be32(1)))
+			return -EFSCORRUPTED;
+
+		xfs_warn(log->l_mp,
+		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
+			 hsize, log->l_mp->m_logbsize);
+		rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize);
+	}
+
 	if (XFS_IS_CORRUPT(log->l_mp,
 			   blkno > log->l_logBBsize || blkno > INT_MAX))
 		return -EFSCORRUPTED;
@@ -2951,7 +2981,7 @@  xlog_do_recovery_pass(
 	xfs_daddr_t		rhead_blk;
 	char			*offset;
 	char			*hbp, *dbp;
-	int			error = 0, h_size, h_len;
+	int			error = 0, h_size;
 	int			error2 = 0;
 	int			bblks, split_bblks;
 	int			hblks, split_hblks, wrapped_hblks;
@@ -2984,37 +3014,11 @@  xlog_do_recovery_pass(
 			goto bread_err1;
 
 		rhead = (xlog_rec_header_t *)offset;
-		error = xlog_valid_rec_header(log, rhead, tail_blk);
+		error = xlog_valid_rec_header(log, rhead, tail_blk, 0);
 		if (error)
 			goto bread_err1;
 
-		/*
-		 * xfsprogs has a bug where record length is based on lsunit but
-		 * h_size (iclog size) is hardcoded to 32k. Now that we
-		 * unconditionally CRC verify the unmount record, this means the
-		 * log buffer can be too small for the record and cause an
-		 * overrun.
-		 *
-		 * Detect this condition here. Use lsunit for the buffer size as
-		 * long as this looks like the mkfs case. Otherwise, return an
-		 * error to avoid a buffer overrun.
-		 */
 		h_size = be32_to_cpu(rhead->h_size);
-		h_len = be32_to_cpu(rhead->h_len);
-		if (h_len > h_size) {
-			if (h_len <= log->l_mp->m_logbsize &&
-			    be32_to_cpu(rhead->h_num_logops) == 1) {
-				xfs_warn(log->l_mp,
-		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
-					 h_size, log->l_mp->m_logbsize);
-				h_size = log->l_mp->m_logbsize;
-			} else {
-				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
-						log->l_mp);
-				error = -EFSCORRUPTED;
-				goto bread_err1;
-			}
-		}
 
 		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
 		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
@@ -3096,7 +3100,7 @@  xlog_do_recovery_pass(
 			}
 			rhead = (xlog_rec_header_t *)offset;
 			error = xlog_valid_rec_header(log, rhead,
-						split_hblks ? blk_no : 0);
+					split_hblks ? blk_no : 0, h_size);
 			if (error)
 				goto bread_err2;
 
@@ -3177,7 +3181,7 @@  xlog_do_recovery_pass(
 			goto bread_err2;
 
 		rhead = (xlog_rec_header_t *)offset;
-		error = xlog_valid_rec_header(log, rhead, blk_no);
+		error = xlog_valid_rec_header(log, rhead, blk_no, h_size);
 		if (error)
 			goto bread_err2;