Message ID | 20200902141923.26422-1-hsiangkao@redhat.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | [v2] xfs: avoid LR buffer overrun due to crafted h_len | expand |
On Wed, Sep 02, 2020 at 10:19:23PM +0800, Gao Xiang wrote: > Currently, crafted h_len has been blocked for the log > header of the tail block in commit a70f9fe52daa ("xfs: > detect and handle invalid iclog size set by mkfs"). > Ok, so according to that commit log the original purpose of this code was to work around a quirky mkfs condition where record length of an unmount record was enlarged but the iclog buffer size remained at 32k. The fix is to simply increase the size of iclog buf. > However, each log record could still have crafted > h_len and cause log record buffer overrun. So let's > check h_len for each log record as well instead. > Is this something you've observed or attempted to reproduce, or is this based on code inspection? > Signed-off-by: Gao Xiang <hsiangkao@redhat.com> > --- > v2: fix a misjudgement "unlikely(hlen >= hsize)" > > fs/xfs/xfs_log_recover.c | 70 +++++++++++++++++++++------------------- > 1 file changed, 37 insertions(+), 33 deletions(-) > > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c > index e2ec91b2d0f4..2d9195fb9367 100644 > --- a/fs/xfs/xfs_log_recover.c > +++ b/fs/xfs/xfs_log_recover.c > @@ -2904,7 +2904,8 @@ STATIC int > xlog_valid_rec_header( > struct xlog *log, > struct xlog_rec_header *rhead, > - xfs_daddr_t blkno) > + xfs_daddr_t blkno, > + int hsize) > { > int hlen; > > @@ -2920,10 +2921,39 @@ xlog_valid_rec_header( > return -EFSCORRUPTED; > } > > - /* LR body must have data or it wouldn't have been written */ > + /* > + * LR body must have data (or it wouldn't have been written) and > + * h_len must not be greater than h_size with one exception. > + * > + * That is that xfsprogs has a bug where record length is based on > + * lsunit but h_size (iclog size) is hardcoded to 32k. This means > + * the log buffer allocated can be too small for the record to > + * cause an overrun. > + * > + * Detect this condition here. Use lsunit for the buffer size as > + * long as this looks like the mkfs case. Otherwise, return an > + * error to avoid a buffer overrun. > + */ > hlen = be32_to_cpu(rhead->h_len); > - if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX)) > + if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0)) Why is the second part of the check removed? > return -EFSCORRUPTED; > + > + if (hsize && XFS_IS_CORRUPT(log->l_mp, > + hsize < be32_to_cpu(rhead->h_size))) > + return -EFSCORRUPTED; > + hsize = be32_to_cpu(rhead->h_size); I'm a little confused why we take hsize as a parameter as well as read it from the record header. If we're validating a particular record, shouldn't we use the size as specified by that record? Also FWIW I think pulling bits of logic out of the XFS_IS_CORRUPT() check makes this a little harder to read than just putting the entire logic statement within the macro. > + > + if (unlikely(hlen > hsize)) { I think we've made a point to avoid the [un]likely() modifiers in XFS as they don't usually have a noticeable impact. I certainly wouldn't expect it to in log recovery. > + if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize || > + rhead->h_num_logops != cpu_to_be32(1))) > + return -EFSCORRUPTED; > + > + xfs_warn(log->l_mp, > + "invalid iclog size (%d bytes), using lsunit (%d bytes)", > + hsize, log->l_mp->m_logbsize); > + rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize); I also find updating the header structure as such down in a "validation helper" a bit obscured. > + } > + > if (XFS_IS_CORRUPT(log->l_mp, > blkno > log->l_logBBsize || blkno > INT_MAX)) > return -EFSCORRUPTED; ... > @@ -3096,7 +3100,7 @@ xlog_do_recovery_pass( > } > rhead = (xlog_rec_header_t *)offset; > error = xlog_valid_rec_header(log, rhead, > - split_hblks ? blk_no : 0); > + split_hblks ? blk_no : 0, h_size); > if (error) > goto bread_err2; > > @@ -3177,7 +3181,7 @@ xlog_do_recovery_pass( > goto bread_err2; > > rhead = (xlog_rec_header_t *)offset; > - error = xlog_valid_rec_header(log, rhead, blk_no); > + error = xlog_valid_rec_header(log, rhead, blk_no, h_size); > if (error) > goto bread_err2; In these two cases we've already allocated the record header and data buffers and we're walking through the log records doing recovery. Given that, it seems like the purpose of the parameter is more to check the subsequent records against the size of the current record buffer. That seems like a reasonable check to incorporate, but I think the mkfs workaround logic is misplaced in a generic record validation helper. IIUC that is a very special case that should only apply to the first record in the log and only impacts the size of the buffer we allocate to read in the remaining records. Can we rework this to leave the mkfs workaround logic as is and update the validation helper to check that each record length fits in the size of the buffer we've decided to allocate? I'd also suggest to rename the new parameter to something like 'bufsize' instead of 'h_size' to clarify what it actually means in the context of xlog_valid_rec_header(). Brian > > -- > 2.18.1 >
Hi Brian, On Wed, Sep 02, 2020 at 01:38:59PM -0400, Brian Foster wrote: > On Wed, Sep 02, 2020 at 10:19:23PM +0800, Gao Xiang wrote: ... > > However, each log record could still have crafted > > h_len and cause log record buffer overrun. So let's > > check h_len for each log record as well instead. > > > > Is this something you've observed or attempted to reproduce, or is this > based on code inspection? Thanks for your review. based on code inspection, the logic seems straight-forward in xlog_do_recovery_pass() ... dbp = xlog_alloc_buffer(log, BTOBB(h_size)); ^ here uses h_size from the tail block if (!dbp) { kmem_free(hbp); return -ENOMEM; } if (tail_blk > head_blk) { while (blk_no < log->l_logBBsize) { xlog_bread xlog_valid_rec_header xlog_recover_process } } while (blk_no < head_blk) { xlog_bread xlog_valid_rec_header xlog_recover_process } in xlog_recover_process() crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len)); ^here ... and also xlog_recover_process_data() end = dp + be32_to_cpu(rhead->h_len); ... while ((dp < end) && num_logops) { ohead = (struct xlog_op_header *)dp; (all things around dp/ohead if num_logops is crafted as well. ... } > > > - if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX)) > > + if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0)) > > Why is the second part of the check removed? if hlen <= hsize (hsize > 0) then hlen will be <= INT_MAX > > > return -EFSCORRUPTED; > > + > > + if (hsize && XFS_IS_CORRUPT(log->l_mp, > > + hsize < be32_to_cpu(rhead->h_size))) > > + return -EFSCORRUPTED; > > + hsize = be32_to_cpu(rhead->h_size); > > I'm a little confused why we take hsize as a parameter as well as read > it from the record header. If we're validating a particular record, > shouldn't we use the size as specified by that record? > > Also FWIW I think pulling bits of logic out of the XFS_IS_CORRUPT() > check makes this a little harder to read than just putting the entire > logic statement within the macro. It seems that is partially self-answered in the last part of the email. So move the response to the last of the email... > > > + > > + if (unlikely(hlen > hsize)) { > > I think we've made a point to avoid the [un]likely() modifiers in XFS as > they don't usually have a noticeable impact. I certainly wouldn't expect > it to in log recovery. Honestly, I really don't want to work on some topic about [un]likely, I did a long discussion with Dan Carpenter and a couple of other people last year, but *shrug* For this case just simply bacause XFS_IS_CORRUPT() has this annotation, and it seems xlog_valid_rec_header() logic will be changed in v3 if we leave the mkfs workaround logic as is. > > > + if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize || > > + rhead->h_num_logops != cpu_to_be32(1))) > > + return -EFSCORRUPTED; > > + > > + xfs_warn(log->l_mp, > > + "invalid iclog size (%d bytes), using lsunit (%d bytes)", > > + hsize, log->l_mp->m_logbsize); > > + rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize); > > I also find updating the header structure as such down in a "validation > helper" a bit obscured. also the same words at the last of the email... > > > + } > > + > > if (XFS_IS_CORRUPT(log->l_mp, > > blkno > log->l_logBBsize || blkno > INT_MAX)) > > return -EFSCORRUPTED; > ... > > @@ -3096,7 +3100,7 @@ xlog_do_recovery_pass( > > } > > rhead = (xlog_rec_header_t *)offset; > > error = xlog_valid_rec_header(log, rhead, > > - split_hblks ? blk_no : 0); > > + split_hblks ? blk_no : 0, h_size); > > if (error) > > goto bread_err2; > > > > @@ -3177,7 +3181,7 @@ xlog_do_recovery_pass( > > goto bread_err2; > > > > rhead = (xlog_rec_header_t *)offset; > > - error = xlog_valid_rec_header(log, rhead, blk_no); > > + error = xlog_valid_rec_header(log, rhead, blk_no, h_size); > > if (error) > > goto bread_err2; > > In these two cases we've already allocated the record header and data > buffers and we're walking through the log records doing recovery. Given > that, it seems like the purpose of the parameter is more to check the > subsequent records against the size of the current record buffer. That > seems like a reasonable check to incorporate, but I think the mkfs Yes > workaround logic is misplaced in a generic record validation helper. > IIUC that is a very special case that should only apply to the first > record in the log and only impacts the size of the buffer we allocate to > read in the remaining records. > > Can we rework this to leave the mkfs workaround logic as is and update > the validation helper to check that each record length fits in the size > of the buffer we've decided to allocate? I'd also suggest to rename the > new parameter to something like 'bufsize' instead of 'h_size' to clarify > what it actually means in the context of xlog_valid_rec_header(). Ok, that is fine. I will leave the mkfs workaround logic as is and rename to bufsize. Thanks, Gao Xiang > > Brian > > > > > -- > > 2.18.1 > > >
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index e2ec91b2d0f4..2d9195fb9367 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2904,7 +2904,8 @@ STATIC int xlog_valid_rec_header( struct xlog *log, struct xlog_rec_header *rhead, - xfs_daddr_t blkno) + xfs_daddr_t blkno, + int hsize) { int hlen; @@ -2920,10 +2921,39 @@ xlog_valid_rec_header( return -EFSCORRUPTED; } - /* LR body must have data or it wouldn't have been written */ + /* + * LR body must have data (or it wouldn't have been written) and + * h_len must not be greater than h_size with one exception. + * + * That is that xfsprogs has a bug where record length is based on + * lsunit but h_size (iclog size) is hardcoded to 32k. This means + * the log buffer allocated can be too small for the record to + * cause an overrun. + * + * Detect this condition here. Use lsunit for the buffer size as + * long as this looks like the mkfs case. Otherwise, return an + * error to avoid a buffer overrun. + */ hlen = be32_to_cpu(rhead->h_len); - if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX)) + if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0)) return -EFSCORRUPTED; + + if (hsize && XFS_IS_CORRUPT(log->l_mp, + hsize < be32_to_cpu(rhead->h_size))) + return -EFSCORRUPTED; + hsize = be32_to_cpu(rhead->h_size); + + if (unlikely(hlen > hsize)) { + if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize || + rhead->h_num_logops != cpu_to_be32(1))) + return -EFSCORRUPTED; + + xfs_warn(log->l_mp, + "invalid iclog size (%d bytes), using lsunit (%d bytes)", + hsize, log->l_mp->m_logbsize); + rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize); + } + if (XFS_IS_CORRUPT(log->l_mp, blkno > log->l_logBBsize || blkno > INT_MAX)) return -EFSCORRUPTED; @@ -2951,7 +2981,7 @@ xlog_do_recovery_pass( xfs_daddr_t rhead_blk; char *offset; char *hbp, *dbp; - int error = 0, h_size, h_len; + int error = 0, h_size; int error2 = 0; int bblks, split_bblks; int hblks, split_hblks, wrapped_hblks; @@ -2984,37 +3014,11 @@ xlog_do_recovery_pass( goto bread_err1; rhead = (xlog_rec_header_t *)offset; - error = xlog_valid_rec_header(log, rhead, tail_blk); + error = xlog_valid_rec_header(log, rhead, tail_blk, 0); if (error) goto bread_err1; - /* - * xfsprogs has a bug where record length is based on lsunit but - * h_size (iclog size) is hardcoded to 32k. Now that we - * unconditionally CRC verify the unmount record, this means the - * log buffer can be too small for the record and cause an - * overrun. - * - * Detect this condition here. Use lsunit for the buffer size as - * long as this looks like the mkfs case. Otherwise, return an - * error to avoid a buffer overrun. - */ h_size = be32_to_cpu(rhead->h_size); - h_len = be32_to_cpu(rhead->h_len); - if (h_len > h_size) { - if (h_len <= log->l_mp->m_logbsize && - be32_to_cpu(rhead->h_num_logops) == 1) { - xfs_warn(log->l_mp, - "invalid iclog size (%d bytes), using lsunit (%d bytes)", - h_size, log->l_mp->m_logbsize); - h_size = log->l_mp->m_logbsize; - } else { - XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, - log->l_mp); - error = -EFSCORRUPTED; - goto bread_err1; - } - } if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) && (h_size > XLOG_HEADER_CYCLE_SIZE)) { @@ -3096,7 +3100,7 @@ xlog_do_recovery_pass( } rhead = (xlog_rec_header_t *)offset; error = xlog_valid_rec_header(log, rhead, - split_hblks ? blk_no : 0); + split_hblks ? blk_no : 0, h_size); if (error) goto bread_err2; @@ -3177,7 +3181,7 @@ xlog_do_recovery_pass( goto bread_err2; rhead = (xlog_rec_header_t *)offset; - error = xlog_valid_rec_header(log, rhead, blk_no); + error = xlog_valid_rec_header(log, rhead, blk_no, h_size); if (error) goto bread_err2;
Currently, crafted h_len has been blocked for the log header of the tail block in commit a70f9fe52daa ("xfs: detect and handle invalid iclog size set by mkfs"). However, each log record could still have crafted h_len and cause log record buffer overrun. So let's check h_len for each log record as well instead. Signed-off-by: Gao Xiang <hsiangkao@redhat.com> --- v2: fix a misjudgement "unlikely(hlen >= hsize)" fs/xfs/xfs_log_recover.c | 70 +++++++++++++++++++++------------------- 1 file changed, 37 insertions(+), 33 deletions(-)