diff mbox series

xfs: don't trip over uninitialized buffer on extent read of corrupted inode

Message ID 20190719193032.11096-1-mcgrof@kernel.org (mailing list archive)
State Superseded, archived
Headers show
Series xfs: don't trip over uninitialized buffer on extent read of corrupted inode | expand

Commit Message

Luis R. Rodriguez July 19, 2019, 7:30 p.m. UTC
From: Brian Foster <bfoster@redhat.com>

commit 6958d11f77d45db80f7e22a21a74d4d5f44dc667 upstream.

We've had rather rare reports of bmap btree block corruption where
the bmap root block has a level count of zero. The root cause of the
corruption is so far unknown. We do have verifier checks to detect
this form of on-disk corruption, but this doesn't cover a memory
corruption variant of the problem. The latter is a reasonable
possibility because the root block is part of the inode fork and can
reside in-core for some time before inode extents are read.

If this occurs, it leads to a system crash such as the following:

 BUG: unable to handle kernel paging request at ffffffff00000221
 PF error: [normal kernel read fault]
 ...
 RIP: 0010:xfs_trans_brelse+0xf/0x200 [xfs]
 ...
 Call Trace:
  xfs_iread_extents+0x379/0x540 [xfs]
  xfs_file_iomap_begin_delay+0x11a/0xb40 [xfs]
  ? xfs_attr_get+0xd1/0x120 [xfs]
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  xfs_file_iomap_begin+0x4c4/0x6d0 [xfs]
  ? __vfs_getxattr+0x53/0x70
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  iomap_apply+0x63/0x130
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  iomap_file_buffered_write+0x62/0x90
  ? iomap_write_begin.constprop.40+0x2d0/0x2d0
  xfs_file_buffered_aio_write+0xe4/0x3b0 [xfs]
  __vfs_write+0x150/0x1b0
  vfs_write+0xba/0x1c0
  ksys_pwrite64+0x64/0xa0
  do_syscall_64+0x5a/0x1d0
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The crash occurs because xfs_iread_extents() attempts to release an
uninitialized buffer pointer as the level == 0 value prevented the
buffer from ever being allocated or read. Change the level > 0
assert to an explicit error check in xfs_iread_extents() to avoid
crashing the kernel in the event of localized, in-core inode
corruption.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[mcgrof: fixes kz#204223 ]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Luis R. Rodriguez July 19, 2019, 9:23 p.m. UTC | #1
On Fri, Jul 19, 2019 at 07:30:32PM +0000, Luis Chamberlain wrote:
> From: Brian Foster <bfoster@redhat.com>
> [mcgrof: fixes kz#204223 ]

Sorry, spoke too soon, although it helps... it actually still does not
fix that exact issue. Fixing this will require a bit more work. You can
ignore this patch for stable for now.

  Luis
Luis R. Rodriguez July 19, 2019, 11:07 p.m. UTC | #2
On Fri, Jul 19, 2019 at 07:30:32PM +0000, Luis Chamberlain wrote:
> [mcgrof: fixes kz#204223 ]

This patch can be ingored for now for stable. It does not actually
fix the issue, just delays it a bit. Once I stress test over 1000
runs with some other fixes I have I'll send a new set of stable
fixes.

  Luis
Dave Chinner July 21, 2019, 9:42 p.m. UTC | #3
On Fri, Jul 19, 2019 at 11:07:29PM +0000, Luis Chamberlain wrote:
> On Fri, Jul 19, 2019 at 07:30:32PM +0000, Luis Chamberlain wrote:
> > [mcgrof: fixes kz#204223 ]
> 
> This patch can be ingored for now for stable. It does not actually
> fix the issue, just delays it a bit. Once I stress test over 1000
> runs with some other fixes I have I'll send a new set of stable
> fixes.

generic/388 is one of the tests we expect to uncover interesting
failures over time.  i.e. every time we fix a problem in these
tests, it will expose another issue that we haven't been able to
exercise until easier-to-hit failures have been fixed.

The best you can do right now is minimise the occurence of failures
by backporting fixes - this test (like generic/475) will continue to
uncover new shutdown and recovery issues as they are exposed by
new fixes. Expecting it to pass 1000 times without failure on an
older stable kernel is, IMO, a somewhat unrealistic expectation...

Cheers,

Dave.
Sasha Levin July 22, 2019, 2:35 p.m. UTC | #4
On Fri, Jul 19, 2019 at 09:23:00PM +0000, Luis Chamberlain wrote:
>On Fri, Jul 19, 2019 at 07:30:32PM +0000, Luis Chamberlain wrote:
>> From: Brian Foster <bfoster@redhat.com>
>> [mcgrof: fixes kz#204223 ]
>
>Sorry, spoke too soon, although it helps... it actually still does not
>fix that exact issue. Fixing this will require a bit more work. You can
>ignore this patch for stable for now.

What about the other 9 patch series?

--
Thanks,
Sasha
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 3a496ffe6551..ab2465bc413a 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1178,7 +1178,10 @@  xfs_iread_extents(
 	 * Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
 	 */
 	level = be16_to_cpu(block->bb_level);
-	ASSERT(level > 0);
+	if (unlikely(level == 0)) {
+		XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, mp);
+		return -EFSCORRUPTED;
+	}
 	pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
 	bno = be64_to_cpu(*pp);