diff mbox series

[3/8] xfs: remove magic handling of unwritten extents in xfs_bmapi_allocate

Message ID 20181002174207.25275-4-hch@lst.de (mailing list archive)
State New, archived
Headers show
Series [1/8] xfs: remove XFS_IO_INVALID | expand

Commit Message

Christoph Hellwig Oct. 2, 2018, 5:42 p.m. UTC
There is no real need to treat unwritten delalloc extent special in
any way here, so remove the special casing and related comment.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

Comments

Brian Foster Oct. 3, 2018, 12:15 p.m. UTC | #1
On Tue, Oct 02, 2018 at 10:42:02AM -0700, Christoph Hellwig wrote:
> There is no real need to treat unwritten delalloc extent special in
> any way here, so remove the special casing and related comment.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_bmap.c | 16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index da6b768664e3..3bb250ee6c7c 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -4069,20 +4069,10 @@ xfs_bmapi_allocate(
>  	bma->got.br_startoff = bma->offset;
>  	bma->got.br_startblock = bma->blkno;
>  	bma->got.br_blockcount = bma->length;
> -	bma->got.br_state = XFS_EXT_NORM;
> -
> -	/*
> -	 * In the data fork, a wasdelay extent has been initialized, so
> -	 * shouldn't be flagged as unwritten.
> -	 *
> -	 * For the cow fork, however, we convert delalloc reservations
> -	 * (extents allocated for speculative preallocation) to
> -	 * allocated unwritten extents, and only convert the unwritten
> -	 * extents to real extents when we're about to write the data.
> -	 */
> -	if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
> -	    (bma->flags & XFS_BMAPI_PREALLOC))
> +	if (bma->flags & XFS_BMAPI_PREALLOC)
>  		bma->got.br_state = XFS_EXT_UNWRITTEN;
> +	else
> +		bma->got.br_state = XFS_EXT_NORM;
>  
>  	if (bma->wasdel)
>  		error = xfs_bmap_add_extent_delay_real(bma, whichfork);
> -- 
> 2.19.0
>
Dave Chinner Oct. 6, 2018, 9:34 a.m. UTC | #2
On Tue, Oct 02, 2018 at 10:42:02AM -0700, Christoph Hellwig wrote:
> There is no real need to treat unwritten delalloc extent special in
> any way here, so remove the special casing and related comment.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 16 +++-------------
>  1 file changed, 3 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index da6b768664e3..3bb250ee6c7c 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -4069,20 +4069,10 @@ xfs_bmapi_allocate(
>  	bma->got.br_startoff = bma->offset;
>  	bma->got.br_startblock = bma->blkno;
>  	bma->got.br_blockcount = bma->length;
> -	bma->got.br_state = XFS_EXT_NORM;
> -
> -	/*
> -	 * In the data fork, a wasdelay extent has been initialized, so
> -	 * shouldn't be flagged as unwritten.
> -	 *
> -	 * For the cow fork, however, we convert delalloc reservations
> -	 * (extents allocated for speculative preallocation) to
> -	 * allocated unwritten extents, and only convert the unwritten
> -	 * extents to real extents when we're about to write the data.
> -	 */
> -	if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
> -	    (bma->flags & XFS_BMAPI_PREALLOC))
> +	if (bma->flags & XFS_BMAPI_PREALLOC)
>  		bma->got.br_state = XFS_EXT_UNWRITTEN;
> +	else
> +		bma->got.br_state = XFS_EXT_NORM;

I bisected the generic/127 rmap corruption down to this patch:

[   36.523002] run fstests generic/127 at 2018-10-06 19:25:42
[   48.642735] XFS: Assertion failed: fs_is_ok, file: fs/xfs/libxfs/xfs_rmap.c, line: 418
[   48.647380] ------------ [ cut here ]------------
[   48.648322] kernel BUG at fs/xfs/xfs_message.c:102!
[   48.649347] invalid opcode: 0000
[#1] PREEMPT SMP
[   48.650303] CPU: 9 PID: 4468 Comm: fsx Not tainted 4.19.0-rc6-dgc+ #681
[   48.651874] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
[   48.653562] RIP: 0010:assfail+0x28/0x30
[   48.654332] Code: c3 90 66 66 66 66 90 48 89 f1 41 89 d0 48 c7 c6 70 56 2e 82 48 89 fa 31 ff e8 64 f9 ff ff 80 3d 75 9f 0a 01 00 75 03 0f 0b c3 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 63 f6 49 89a
[   48.657980] RSP: 0018:ffffc900049a7b18 EFLAGS: 00010202
[   48.659015] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   48.660423] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff8227ab34
[   48.661817] RBP: ffffc900049a7c38 R08: 0000000000000000 R09: 0000000000000000
[   48.663209] R10: 00000000000000c0 R11: f000000000000000 R12: 00000000000000c0
[   48.664634] R13: 0000000000000004 R14: ffff880239789d20 R15: 0000000000000000
[   48.666033] FS:  00007fb5eed1db80(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
[   48.667608] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   48.668752] CR2: 00007f5ce9a0e8e0 CR3: 0000000238f5c001 CR4: 0000000000060ee0
[   48.670168] Call Trace:
[   48.671250]  xfs_rmap_unmap+0x633/0x980
[   48.672037]  ? kmem_zone_alloc+0x61/0xe0
[   48.672842]  xfs_rmap_finish_one+0x2d0/0x340
[   48.673692]  xfs_trans_log_finish_rmap_update+0x2f/0x40
[   48.674737]  xfs_rmap_update_finish_item+0x2c/0x40
[   48.675710]  xfs_defer_finish_noroll+0x184/0x520
[   48.676614]  ? xfs_rmap_update_cancel_item+0x10/0x10
[   48.677615]  ? xfs_free_file_space+0x355/0x390
[   48.678512]  __xfs_trans_commit+0x189/0x370
[   48.679360]  xfs_free_file_space+0x355/0x390
[   48.680227]  xfs_file_fallocate+0x1b3/0x330
[   48.681077]  ? __sb_start_write+0x8d/0xc0
[   48.681889]  vfs_fallocate+0x13d/0x270
[   48.682637]  ksys_fallocate+0x3c/0x70
[   48.683382]  __x64_sys_fallocate+0x1a/0x20
[   48.684208]  do_syscall_64+0x5a/0x180
[   48.684953]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The corruption check that is failing is this:

        /* Make sure the unwritten flag matches. */
	XFS_WANT_CORRUPTED_GOTO(mp, (flags & XFS_RMAP_UNWRITTEN) ==
			(rec->rm_flags & XFS_RMAP_UNWRITTEN), out);

So this patch does not appear to be doing the right thing with
unwritten extent flagging in some case.

Cheers,

Dave.
Christoph Hellwig Oct. 6, 2018, 9:43 a.m. UTC | #3
On Sat, Oct 06, 2018 at 07:34:34PM +1000, Dave Chinner wrote:
> I bisected the generic/127 rmap corruption down to this patch:

Heh, I just did so too this morning.  

> The corruption check that is failing is this:
> 
>         /* Make sure the unwritten flag matches. */
> 	XFS_WANT_CORRUPTED_GOTO(mp, (flags & XFS_RMAP_UNWRITTEN) ==
> 			(rec->rm_flags & XFS_RMAP_UNWRITTEN), out);
> 
> So this patch does not appear to be doing the right thing with
> unwritten extent flagging in some case.

I suspect the asserts actually are what is incorrect.  But given
how late we are in the cycle I've just dropped the patch and kicked
off xfstests runs (now including rmap, sigh..).
Christoph Hellwig Oct. 7, 2018, 10:13 a.m. UTC | #4
On Sat, Oct 06, 2018 at 11:43:31AM +0200, Christoph Hellwig wrote:
> I suspect the asserts actually are what is incorrect.  But given
> how late we are in the cycle I've just dropped the patch and kicked
> off xfstests runs (now including rmap, sigh..).

Without this patch test runs including rmap succeed.  Do you want
me to resend with the patch dropped (thing should just apply without
it as-is).
Dave Chinner Oct. 7, 2018, 10:02 p.m. UTC | #5
On Sun, Oct 07, 2018 at 12:13:19PM +0200, Christoph Hellwig wrote:
> On Sat, Oct 06, 2018 at 11:43:31AM +0200, Christoph Hellwig wrote:
> > I suspect the asserts actually are what is incorrect.  But given
> > how late we are in the cycle I've just dropped the patch and kicked
> > off xfstests runs (now including rmap, sigh..).
> 
> Without this patch test runs including rmap succeed.  Do you want
> me to resend with the patch dropped (thing should just apply without
> it as-is).

I'll drop it and see what happens...

Cheers,

Dave.
Dave Chinner Oct. 8, 2018, 2:24 a.m. UTC | #6
On Mon, Oct 08, 2018 at 09:02:58AM +1100, Dave Chinner wrote:
> On Sun, Oct 07, 2018 at 12:13:19PM +0200, Christoph Hellwig wrote:
> > On Sat, Oct 06, 2018 at 11:43:31AM +0200, Christoph Hellwig wrote:
> > > I suspect the asserts actually are what is incorrect.  But given
> > > how late we are in the cycle I've just dropped the patch and kicked
> > > off xfstests runs (now including rmap, sigh..).
> > 
> > Without this patch test runs including rmap succeed.  Do you want
> > me to resend with the patch dropped (thing should just apply without
> > it as-is).
> 
> I'll drop it and see what happens...

Different problems after dropping that patch and re-instating the
rest of the series. The shutdown/io error stress tests now hang
randomly in unmount reclaiming inodes. e.g. generic/388:

.....
[ 4028.393874] XFS (pmem1): Mounting V5 Filesystem
[ 4028.397627] XFS (pmem1): Starting recovery (logdev: internal)
[ 4028.400436] XFS (pmem1): Ending recovery (logdev: internal)
[ 4028.406885] XFS (pmem1): xfs_do_force_shutdown(0x8) called from line 435 of file fs/xfs/libxfs/xfs_defer.c.  Return address = ffffffff814fc998
[ 4028.406966] XFS (pmem1): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -5, agno 0
[ 4028.500258] XFS (pmem1): Unmounting Filesystem
[ 4230.446548] INFO: task umount:16507 blocked for more than 120 seconds.
[ 4230.449820]       Not tainted 4.19.0-rc7-dgc+ #682
[ 4230.452695] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4230.456142] umount          D14552 16507  14540 0x00000000
[ 4230.458206] Call Trace:
[ 4230.458980]  ? __schedule+0x2bf/0x8a0
[ 4230.459852]  ? rwsem_down_write_failed+0x235/0x440
[ 4230.460815]  ? rwsem_down_write_failed+0x230/0x440
[ 4230.461777]  schedule+0x45/0x110
[ 4230.462597]  rwsem_down_write_failed+0x235/0x440
[ 4230.463685]  ? xfs_reclaim_inode+0x107/0x430
[ 4230.464549]  ? call_rwsem_down_write_failed+0x13/0x20
[ 4230.465564]  call_rwsem_down_write_failed+0x13/0x20
[ 4230.466599]  down_write+0x37/0x50
[ 4230.467438]  xfs_ilock+0x119/0x200
[ 4230.468133]  xfs_reclaim_inode+0x107/0x430
[ 4230.468965]  xfs_reclaim_inodes_ag+0x1bd/0x330
[ 4230.469870]  xfs_reclaim_inodes+0x2b/0x50
[ 4230.470736]  xfs_unmountfs+0x92/0x180
[ 4230.471682]  xfs_fs_put_super+0x39/0xb0
[ 4230.472466]  generic_shutdown_super+0x64/0x110
[ 4230.473368]  kill_block_super+0x21/0x50
[ 4230.474149]  deactivate_locked_super+0x39/0x70
[ 4230.475100]  cleanup_mnt+0x3b/0x80
[ 4230.476004]  task_work_run+0x82/0xb0
[ 4230.476740]  exit_to_usermode_loop+0xd3/0xe0
[ 4230.477608]  do_syscall_64+0x170/0x180
[ 4230.478421]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

And on a different machine, generic/475:

[ 8001.777349] XFS (dm-0): Unmounting Filesystem
[ 8001.850325] XFS (dm-0): Mounting V5 Filesystem
[ 8002.234755] XFS (dm-0): Starting recovery (logdev: internal)
[ 8002.410851] XFS (dm-0): Ending recovery (logdev: internal)
[ 8002.438481] XFS (dm-0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0xd60 len 8 error 5
[ 8002.438542] XFS (dm-0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0xa00728 len 8 error 5
[ 8002.441590] XFS (dm-0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x1406f80 len 32 error 5
[ 8002.441597] XFS (dm-0): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[ 8002.442325] Buffer I/O error on dev dm-0, logical block 5242864, async page read
[ 8002.442377] XFS (dm-0): xfs_do_force_shutdown(0x1) called from line 327 of file fs/xfs/xfs_trans_buf.c.  Return address = ffffffff8156e5e4
[ 8002.442422] XFS (dm-0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x1e00080 len 32 error 5
[ 8002.442427] XFS (dm-0): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[ 8002.442443] XFS (dm-0): metadata I/O error in "xlog_iodone" at daddr 0x1402790 len 64 error 5
[ 8002.442448] XFS (dm-0): xfs_do_force_shutdown(0x2) called from line 1271 of file fs/xfs/xfs_log.c.  Return address = ffffffff81558d45
[ 8002.442459] XFS (dm-0): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[ 8002.442480] XFS (dm-0): Log I/O Error Detected.  Shutting down filesystem
[ 8002.442481] XFS (dm-0): Please umount the filesystem and rectify the problem(s)
[ 8002.525313] XFS (dm-0): Unmounting Filesystem
[ 8217.616288] INFO: task umount:11278 blocked for more than 120 seconds.
[ 8217.621420]       Tainted: G        W         4.19.0-rc7-dgc+ #682
[ 8217.622924] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8217.625024] umount          D14640 11278   9205 0x00000080
[ 8217.626512] Call Trace:
[ 8217.627192]  ? __schedule+0x2bf/0x8a0
[ 8217.628213]  ? rwsem_down_write_failed+0x235/0x440
[ 8217.629512]  ? rwsem_down_write_failed+0x230/0x440
[ 8217.630804]  schedule+0x45/0x110
[ 8217.631673]  rwsem_down_write_failed+0x235/0x440
[ 8217.632593]  ? xfs_reclaim_inode+0x107/0x430
[ 8217.633429]  ? call_rwsem_down_write_failed+0x13/0x20
[ 8217.634419]  call_rwsem_down_write_failed+0x13/0x20
[ 8217.635361]  down_write+0x37/0x50
[ 8217.636010]  xfs_ilock+0x119/0x200
[ 8217.636702]  xfs_reclaim_inode+0x107/0x430
[ 8217.637510]  xfs_reclaim_inodes_ag+0x1bd/0x330
[ 8217.638391]  ? _raw_spin_lock_irqsave+0x32/0x40
[ 8217.639276]  ? __flush_work+0x194/0x1e0
[ 8217.640027]  ? preempt_count_sub+0x43/0x50
[ 8217.640847]  ? _raw_spin_unlock_irqrestore+0x2c/0x60
[ 8217.641815]  ? del_timer+0x53/0x80
[ 8217.642490]  ? __cancel_work_timer+0x13c/0x1e0
[ 8217.643354]  ? xfs_ail_push_all_sync+0xcc/0xf0
[ 8217.644238]  xfs_reclaim_inodes+0x2b/0x50
[ 8217.645025]  xfs_unmountfs+0x92/0x180
[ 8217.645742]  xfs_fs_put_super+0x39/0xb0
[ 8217.646490]  generic_shutdown_super+0x64/0x110
[ 8217.647360]  kill_block_super+0x21/0x50
[ 8217.648131]  deactivate_locked_super+0x39/0x70
[ 8217.649006]  cleanup_mnt+0x3b/0x80
[ 8217.649675]  task_work_run+0x82/0xb0
[ 8217.650389]  exit_to_usermode_loop+0xd3/0xe0
[ 8217.651223]  do_syscall_64+0x170/0x180

Cheers,

Dave.
Dave Chinner Oct. 8, 2018, 6:07 a.m. UTC | #7
On Mon, Oct 08, 2018 at 01:24:55PM +1100, Dave Chinner wrote:
> On Mon, Oct 08, 2018 at 09:02:58AM +1100, Dave Chinner wrote:
> > On Sun, Oct 07, 2018 at 12:13:19PM +0200, Christoph Hellwig wrote:
> > > On Sat, Oct 06, 2018 at 11:43:31AM +0200, Christoph Hellwig wrote:
> > > > I suspect the asserts actually are what is incorrect.  But given
> > > > how late we are in the cycle I've just dropped the patch and kicked
> > > > off xfstests runs (now including rmap, sigh..).
> > > 
> > > Without this patch test runs including rmap succeed.  Do you want
> > > me to resend with the patch dropped (thing should just apply without
> > > it as-is).
> > 
> > I'll drop it and see what happens...
> 
> Different problems after dropping that patch and re-instating the
> rest of the series. The shutdown/io error stress tests now hang
> randomly in unmount reclaiming inodes. e.g. generic/388:

Ah, my fault. False alarm.

I had a patch that I knew was broken at the end of the series that I
hadn't commented out. All prior testing was without it, but when I
reset branch without this patch I simply popped everything back in,
including the broken patch at the end. Retesting now.

-Dave.
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index da6b768664e3..3bb250ee6c7c 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4069,20 +4069,10 @@  xfs_bmapi_allocate(
 	bma->got.br_startoff = bma->offset;
 	bma->got.br_startblock = bma->blkno;
 	bma->got.br_blockcount = bma->length;
-	bma->got.br_state = XFS_EXT_NORM;
-
-	/*
-	 * In the data fork, a wasdelay extent has been initialized, so
-	 * shouldn't be flagged as unwritten.
-	 *
-	 * For the cow fork, however, we convert delalloc reservations
-	 * (extents allocated for speculative preallocation) to
-	 * allocated unwritten extents, and only convert the unwritten
-	 * extents to real extents when we're about to write the data.
-	 */
-	if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
-	    (bma->flags & XFS_BMAPI_PREALLOC))
+	if (bma->flags & XFS_BMAPI_PREALLOC)
 		bma->got.br_state = XFS_EXT_UNWRITTEN;
+	else
+		bma->got.br_state = XFS_EXT_NORM;
 
 	if (bma->wasdel)
 		error = xfs_bmap_add_extent_delay_real(bma, whichfork);