diff mbox series

btrfs: fix outstanding extents calculation

Message ID 06cc127b5d3c535917e90fbdce0534742dcde478.1648470587.git.naohiro.aota@wdc.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix outstanding extents calculation | expand

Commit Message

Naohiro Aota March 28, 2022, 12:32 p.m. UTC
Running generic/406 causes the following WARNING in btrfs_destroy_inode()
which tells there is outstanding extents left.

In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
extents with btrfs_delalloc_reserve_metadata() (or indirectly from
btrfs_delalloc_reserve_space(()). We then release the outstanding extents
with btrfs_delalloc_release_extents(). However, the "len" can be modified
in the CoW case, which releasing fewer outstanding extents than expected.

Fix it by calling btrfs_delalloc_release_extents() for the original length.

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
    Modules linked in: btrfs blake2b_generic xor lzo_compress
    lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
    zsmalloc
    CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
    RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
    Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
    65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
    83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
    RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
    RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
    RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
    RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
    R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
    R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
    FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     destroy_inode+0x33/0x70
     dispose_list+0x43/0x60
     evict_inodes+0x161/0x1b0
     generic_shutdown_super+0x2d/0x110
     kill_anon_super+0xf/0x20
     btrfs_kill_super+0xd/0x20 [btrfs]
     deactivate_locked_super+0x27/0x90
     cleanup_mnt+0x12c/0x180
     task_work_run+0x54/0x80
     exit_to_user_mode_prepare+0x152/0x160
     syscall_exit_to_user_mode+0x12/0x30
     do_syscall_64+0x42/0x80
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f854a000fb7

Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
CC: stable@vger.kernel.org # 5.17
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 fs/btrfs/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Johannes Thumshirn March 28, 2022, 12:58 p.m. UTC | #1
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Filipe Manana March 28, 2022, 1:08 p.m. UTC | #2
On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> which tells there is outstanding extents left.

I can't trigger the warning with generic/406.
Any special setup or config to trigger it?

The change looks fine to me, however I'm curious why this isn't triggered
with generic/406, nor anyone else reported it before, since the test is
fully deterministic.

Thanks.

> 
> In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> with btrfs_delalloc_release_extents(). However, the "len" can be modified
> in the CoW case, which releasing fewer outstanding extents than expected.
> 
> Fix it by calling btrfs_delalloc_release_extents() for the original length.
> 
>     ------------[ cut here ]------------
>     WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
>     Modules linked in: btrfs blake2b_generic xor lzo_compress
>     lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
>     zsmalloc
>     CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
>     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
>     RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
>     Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
>     65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
>     83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
>     RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
>     RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
>     RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
>     RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
>     R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
>     R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
>     FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
>     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
>     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>     Call Trace:
>      <TASK>
>      destroy_inode+0x33/0x70
>      dispose_list+0x43/0x60
>      evict_inodes+0x161/0x1b0
>      generic_shutdown_super+0x2d/0x110
>      kill_anon_super+0xf/0x20
>      btrfs_kill_super+0xd/0x20 [btrfs]
>      deactivate_locked_super+0x27/0x90
>      cleanup_mnt+0x12c/0x180
>      task_work_run+0x54/0x80
>      exit_to_user_mode_prepare+0x152/0x160
>      syscall_exit_to_user_mode+0x12/0x30
>      do_syscall_64+0x42/0x80
>      entry_SYSCALL_64_after_hwframe+0x44/0xae
>     RIP: 0033:0x7f854a000fb7
> 
> Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> CC: stable@vger.kernel.org # 5.17
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
>  fs/btrfs/inode.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index c7b15634fe70..5c0c54057921 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
>  	u64 block_start, orig_start, orig_block_len, ram_bytes;
>  	bool can_nocow = false;
>  	bool space_reserved = false;
> +	u64 prev_len;
>  	int ret = 0;
>  
>  	/*
> @@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
>  			can_nocow = true;
>  	}
>  
> +	prev_len = len;
>  	if (can_nocow) {
>  		struct extent_map *em2;
>  
> @@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
>  			goto out;
>  		}
>  	} else {
> -		const u64 prev_len = len;
> -
>  		/* Our caller expects us to free the input extent map. */
>  		free_extent_map(em);
>  		*map = NULL;
> @@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
>  	 * We have created our ordered extent, so we can now release our reservation
>  	 * for an outstanding extent.
>  	 */
> -	btrfs_delalloc_release_extents(BTRFS_I(inode), len);
> +	btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
>  
>  	/*
>  	 * Need to update the i_size under the extent lock so buffered
> -- 
> 2.35.1
>
Johannes Thumshirn March 28, 2022, 1:14 p.m. UTC | #3
On 28/03/2022 15:09, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
>> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
>> which tells there is outstanding extents left.
> 
> I can't trigger the warning with generic/406.
> Any special setup or config to trigger it?
> 
> The change looks fine to me, however I'm curious why this isn't triggered
> with generic/406, nor anyone else reported it before, since the test is
> fully deterministic.
> 

I am able to trigger the WARN() with a different test (which is for a different,
not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
Filipe Manana March 28, 2022, 1:17 p.m. UTC | #4
On Mon, Mar 28, 2022 at 02:08:30PM +0100, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > which tells there is outstanding extents left.
> 
> I can't trigger the warning with generic/406.
> Any special setup or config to trigger it?
> 
> The change looks fine to me, however I'm curious why this isn't triggered
> with generic/406, nor anyone else reported it before, since the test is
> fully deterministic.

Also the subject "btrfs: fix outstanding extents calculation" is confusing,
as the problem is not in the outstanding extents calculation code, but
rather not releasing delalloc by the correct amount in a specific code path.

So something like "btrfs: release correct delalloc amount in direct IO write path"
would be more correct and specific.

> 
> Thanks.
> 
> > 
> > In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> > extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> > btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> > with btrfs_delalloc_release_extents(). However, the "len" can be modified
> > in the CoW case, which releasing fewer outstanding extents than expected.
> > 
> > Fix it by calling btrfs_delalloc_release_extents() for the original length.
> > 
> >     ------------[ cut here ]------------
> >     WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> >     Modules linked in: btrfs blake2b_generic xor lzo_compress
> >     lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
> >     zsmalloc
> >     CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
> >     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> >     RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> >     Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
> >     65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
> >     83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
> >     RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
> >     RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
> >     RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
> >     RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
> >     R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
> >     R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
> >     FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
> >     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >     CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
> >     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >     Call Trace:
> >      <TASK>
> >      destroy_inode+0x33/0x70
> >      dispose_list+0x43/0x60
> >      evict_inodes+0x161/0x1b0
> >      generic_shutdown_super+0x2d/0x110
> >      kill_anon_super+0xf/0x20
> >      btrfs_kill_super+0xd/0x20 [btrfs]
> >      deactivate_locked_super+0x27/0x90
> >      cleanup_mnt+0x12c/0x180
> >      task_work_run+0x54/0x80
> >      exit_to_user_mode_prepare+0x152/0x160
> >      syscall_exit_to_user_mode+0x12/0x30
> >      do_syscall_64+0x42/0x80
> >      entry_SYSCALL_64_after_hwframe+0x44/0xae
> >     RIP: 0033:0x7f854a000fb7
> > 
> > Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> > CC: stable@vger.kernel.org # 5.17
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> >  fs/btrfs/inode.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index c7b15634fe70..5c0c54057921 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> >  	u64 block_start, orig_start, orig_block_len, ram_bytes;
> >  	bool can_nocow = false;
> >  	bool space_reserved = false;
> > +	u64 prev_len;
> >  	int ret = 0;
> >  
> >  	/*
> > @@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> >  			can_nocow = true;
> >  	}
> >  
> > +	prev_len = len;
> >  	if (can_nocow) {
> >  		struct extent_map *em2;
> >  
> > @@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> >  			goto out;
> >  		}
> >  	} else {
> > -		const u64 prev_len = len;
> > -
> >  		/* Our caller expects us to free the input extent map. */
> >  		free_extent_map(em);
> >  		*map = NULL;
> > @@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> >  	 * We have created our ordered extent, so we can now release our reservation
> >  	 * for an outstanding extent.
> >  	 */
> > -	btrfs_delalloc_release_extents(BTRFS_I(inode), len);
> > +	btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
> >  
> >  	/*
> >  	 * Need to update the i_size under the extent lock so buffered
> > -- 
> > 2.35.1
> >
Filipe Manana March 28, 2022, 1:21 p.m. UTC | #5
On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
> On 28/03/2022 15:09, Filipe Manana wrote:
> > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> >> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> >> which tells there is outstanding extents left.
> > 
> > I can't trigger the warning with generic/406.
> > Any special setup or config to trigger it?
> > 
> > The change looks fine to me, however I'm curious why this isn't triggered
> > with generic/406, nor anyone else reported it before, since the test is
> > fully deterministic.
> > 
> 
> I am able to trigger the WARN() with a different test (which is for a different,
> not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.

I have no doubts about the fix being correct.
I'm just puzzled why I can't trigger it with generic/406, given that it's a very
deterministic test.

If there's any special config or setup (mount options, zoned fs, etc), I would
like to have it explicitly mentioned in the changelog.
Johannes Thumshirn March 28, 2022, 1:22 p.m. UTC | #6
On 28/03/2022 15:21, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
>> On 28/03/2022 15:09, Filipe Manana wrote:
>>> On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
>>>> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
>>>> which tells there is outstanding extents left.
>>>
>>> I can't trigger the warning with generic/406.
>>> Any special setup or config to trigger it?
>>>
>>> The change looks fine to me, however I'm curious why this isn't triggered
>>> with generic/406, nor anyone else reported it before, since the test is
>>> fully deterministic.
>>>
>>
>> I am able to trigger the WARN() with a different test (which is for a different,
>> not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
> 
> I have no doubts about the fix being correct.
> I'm just puzzled why I can't trigger it with generic/406, given that it's a very
> deterministic test.
> 
> If there's any special config or setup (mount options, zoned fs, etc), I would
> like to have it explicitly mentioned in the changelog.
> 

This I have to deferre to Naohiro, he was the one who was able to reproduce it
with generic/406 (on a non-zoned fs IIRC)
Naohiro Aota March 28, 2022, 1:36 p.m. UTC | #7
On Mon, Mar 28, 2022 at 02:21:28PM +0100, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
> > On 28/03/2022 15:09, Filipe Manana wrote:
> > > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > >> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > >> which tells there is outstanding extents left.
> > > 
> > > I can't trigger the warning with generic/406.
> > > Any special setup or config to trigger it?
> > > 
> > > The change looks fine to me, however I'm curious why this isn't triggered
> > > with generic/406, nor anyone else reported it before, since the test is
> > > fully deterministic.
> > > 
> > 
> > I am able to trigger the WARN() with a different test (which is for a different,
> > not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
> 
> I have no doubts about the fix being correct.
> I'm just puzzled why I can't trigger it with generic/406, given that it's a very
> deterministic test.
>
> If there's any special config or setup (mount options, zoned fs, etc), I would
> like to have it explicitly mentioned in the changelog.

I don't think it's super special. I can always reproduce it on 1GB
zram device. Here is the mkfs setup, and no mount options are
specified.

++ mkfs.btrfs -f -d single -m single /dev/zram0
btrfs-progs v5.16.2
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM /dev/zram0 (1.00GiB) ...
NOTE: several default settings have changed in version 5.15, please make sure
      this does not affect your deployments:
      - DUP for metadata (-m dup)
      - enabled no-holes (-O no-holes)
      - enabled free-space-tree (-R free-space-tree)

Label:              (null)
UUID:               b7260fb1-fa0e-4acd-8c3d-0530799a9fd3
Node size:          16384
Sector size:        4096
Filesystem size:    1.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         single            8.00MiB
  System:           single            4.00MiB
SSD detected:       yes
Zoned device:       no
Incompat features:  extref, skinny-metadata, no-holes
Runtime features:   free-space-tree
Checksum:           crc32c
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     1.00GiB  /dev/zram0
Naohiro Aota March 28, 2022, 1:40 p.m. UTC | #8
On Mon, Mar 28, 2022 at 02:17:49PM +0100, Filipe Manana wrote:
> On Mon, Mar 28, 2022 at 02:08:30PM +0100, Filipe Manana wrote:
> > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > > Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > > which tells there is outstanding extents left.
> > 
> > I can't trigger the warning with generic/406.
> > Any special setup or config to trigger it?
> > 
> > The change looks fine to me, however I'm curious why this isn't triggered
> > with generic/406, nor anyone else reported it before, since the test is
> > fully deterministic.
> 
> Also the subject "btrfs: fix outstanding extents calculation" is confusing,
> as the problem is not in the outstanding extents calculation code, but
> rather not releasing delalloc by the correct amount in a specific code path.
> 
> So something like "btrfs: release correct delalloc amount in direct IO write path"
> would be more correct and specific.

Thank you. That is much more informative. I'll fix the subject to it.

> > 
> > Thanks.
> > 
> > > 
> > > In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> > > extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> > > btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> > > with btrfs_delalloc_release_extents(). However, the "len" can be modified
> > > in the CoW case, which releasing fewer outstanding extents than expected.
> > > 
> > > Fix it by calling btrfs_delalloc_release_extents() for the original length.
> > > 
> > >     ------------[ cut here ]------------
> > >     WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> > >     Modules linked in: btrfs blake2b_generic xor lzo_compress
> > >     lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
> > >     zsmalloc
> > >     CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
> > >     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
> > >     RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
> > >     Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
> > >     65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
> > >     83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
> > >     RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
> > >     RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
> > >     RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
> > >     RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
> > >     R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
> > >     R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
> > >     FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
> > >     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >     CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
> > >     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > >     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > >     Call Trace:
> > >      <TASK>
> > >      destroy_inode+0x33/0x70
> > >      dispose_list+0x43/0x60
> > >      evict_inodes+0x161/0x1b0
> > >      generic_shutdown_super+0x2d/0x110
> > >      kill_anon_super+0xf/0x20
> > >      btrfs_kill_super+0xd/0x20 [btrfs]
> > >      deactivate_locked_super+0x27/0x90
> > >      cleanup_mnt+0x12c/0x180
> > >      task_work_run+0x54/0x80
> > >      exit_to_user_mode_prepare+0x152/0x160
> > >      syscall_exit_to_user_mode+0x12/0x30
> > >      do_syscall_64+0x42/0x80
> > >      entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >     RIP: 0033:0x7f854a000fb7
> > > 
> > > Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> > > CC: stable@vger.kernel.org # 5.17
> > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > > ---
> > >  fs/btrfs/inode.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > index c7b15634fe70..5c0c54057921 100644
> > > --- a/fs/btrfs/inode.c
> > > +++ b/fs/btrfs/inode.c
> > > @@ -7409,6 +7409,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > >  	u64 block_start, orig_start, orig_block_len, ram_bytes;
> > >  	bool can_nocow = false;
> > >  	bool space_reserved = false;
> > > +	u64 prev_len;
> > >  	int ret = 0;
> > >  
> > >  	/*
> > > @@ -7436,6 +7437,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > >  			can_nocow = true;
> > >  	}
> > >  
> > > +	prev_len = len;
> > >  	if (can_nocow) {
> > >  		struct extent_map *em2;
> > >  
> > > @@ -7465,8 +7467,6 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > >  			goto out;
> > >  		}
> > >  	} else {
> > > -		const u64 prev_len = len;
> > > -
> > >  		/* Our caller expects us to free the input extent map. */
> > >  		free_extent_map(em);
> > >  		*map = NULL;
> > > @@ -7497,7 +7497,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
> > >  	 * We have created our ordered extent, so we can now release our reservation
> > >  	 * for an outstanding extent.
> > >  	 */
> > > -	btrfs_delalloc_release_extents(BTRFS_I(inode), len);
> > > +	btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
> > >  
> > >  	/*
> > >  	 * Need to update the i_size under the extent lock so buffered
> > > -- 
> > > 2.35.1
> > >
Filipe Manana March 28, 2022, 1:45 p.m. UTC | #9
On Mon, Mar 28, 2022 at 01:36:36PM +0000, Naohiro Aota wrote:
> On Mon, Mar 28, 2022 at 02:21:28PM +0100, Filipe Manana wrote:
> > On Mon, Mar 28, 2022 at 01:14:02PM +0000, Johannes Thumshirn wrote:
> > > On 28/03/2022 15:09, Filipe Manana wrote:
> > > > On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> > > >> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> > > >> which tells there is outstanding extents left.
> > > > 
> > > > I can't trigger the warning with generic/406.
> > > > Any special setup or config to trigger it?
> > > > 
> > > > The change looks fine to me, however I'm curious why this isn't triggered
> > > > with generic/406, nor anyone else reported it before, since the test is
> > > > fully deterministic.
> > > > 
> > > 
> > > I am able to trigger the WARN() with a different test (which is for a different,
> > > not yet solved problem) on my zoned setup. With this patch, the WARN() is gone.
> > 
> > I have no doubts about the fix being correct.
> > I'm just puzzled why I can't trigger it with generic/406, given that it's a very
> > deterministic test.
> >
> > If there's any special config or setup (mount options, zoned fs, etc), I would
> > like to have it explicitly mentioned in the changelog.
> 
> I don't think it's super special. I can always reproduce it on 1GB
> zram device. Here is the mkfs setup, and no mount options are
> specified.

Ok, that, the 1G device, explains it.

It's trigfering a short-write, due to not being able to allocate a large extent and
instead allocating a smaller one. And the test does not fail if we write less than
what we requested, as it redirects xfs_io's stdout to the .full file and does not
check that we wrote the exact amount we asked to write (which is a rather bad test
IMO, should also check we are able to read what we wrote before, etc).

The change looks good to me.

With an updated subject, you can add:

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Thanks.

> 
> ++ mkfs.btrfs -f -d single -m single /dev/zram0
> btrfs-progs v5.16.2
> See http://btrfs.wiki.kernel.org for more information.
> 
> Performing full device TRIM /dev/zram0 (1.00GiB) ...
> NOTE: several default settings have changed in version 5.15, please make sure
>       this does not affect your deployments:
>       - DUP for metadata (-m dup)
>       - enabled no-holes (-O no-holes)
>       - enabled free-space-tree (-R free-space-tree)
> 
> Label:              (null)
> UUID:               b7260fb1-fa0e-4acd-8c3d-0530799a9fd3
> Node size:          16384
> Sector size:        4096
> Filesystem size:    1.00GiB
> Block group profiles:
>   Data:             single            8.00MiB
>   Metadata:         single            8.00MiB
>   System:           single            4.00MiB
> SSD detected:       yes
> Zoned device:       no
> Incompat features:  extref, skinny-metadata, no-holes
> Runtime features:   free-space-tree
> Checksum:           crc32c
> Number of devices:  1
> Devices:
>    ID        SIZE  PATH
>     1     1.00GiB  /dev/zram0
David Sterba March 28, 2022, 7:26 p.m. UTC | #10
On Mon, Mar 28, 2022 at 09:32:05PM +0900, Naohiro Aota wrote:
> Running generic/406 causes the following WARNING in btrfs_destroy_inode()
> which tells there is outstanding extents left.
> 
> In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding
> extents with btrfs_delalloc_reserve_metadata() (or indirectly from
> btrfs_delalloc_reserve_space(()). We then release the outstanding extents
> with btrfs_delalloc_release_extents(). However, the "len" can be modified
> in the CoW case, which releasing fewer outstanding extents than expected.
> 
> Fix it by calling btrfs_delalloc_release_extents() for the original length.
> 
>     ------------[ cut here ]------------
>     WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs]
>     Modules linked in: btrfs blake2b_generic xor lzo_compress
>     lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram
>     zsmalloc
>     CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101
>     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
>     RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs]
>     Code: fe ff ff 0f 0b e9 d6 fe ff ff 0f 0b 48 83 bb e0 01 00 00 00 0f 84
>     65 fe ff ff 0f 0b 48 83 bb 78 ff ff ff 00 0f 84 63 fe ff ff <0f> 0b 48
>     83 bb 70 ff ff ff 00 0f 84 61 fe ff ff 0f 0b e9 5a fe ff
>     RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206
>     RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000
>     RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78
>     RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8
>     R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800
>     R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68
>     FS:  00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000
>     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0
>     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>     Call Trace:
>      <TASK>
>      destroy_inode+0x33/0x70
>      dispose_list+0x43/0x60
>      evict_inodes+0x161/0x1b0
>      generic_shutdown_super+0x2d/0x110
>      kill_anon_super+0xf/0x20
>      btrfs_kill_super+0xd/0x20 [btrfs]
>      deactivate_locked_super+0x27/0x90
>      cleanup_mnt+0x12c/0x180
>      task_work_run+0x54/0x80
>      exit_to_user_mode_prepare+0x152/0x160
>      syscall_exit_to_user_mode+0x12/0x30
>      do_syscall_64+0x42/0x80
>      entry_SYSCALL_64_after_hwframe+0x44/0xae
>     RIP: 0033:0x7f854a000fb7
> 
> Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range")
> CC: stable@vger.kernel.org # 5.17
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

Added to misc-next, with updated subject and changelog, thanks.
diff mbox series

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c7b15634fe70..5c0c54057921 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7409,6 +7409,7 @@  static int btrfs_get_blocks_direct_write(struct extent_map **map,
 	u64 block_start, orig_start, orig_block_len, ram_bytes;
 	bool can_nocow = false;
 	bool space_reserved = false;
+	u64 prev_len;
 	int ret = 0;
 
 	/*
@@ -7436,6 +7437,7 @@  static int btrfs_get_blocks_direct_write(struct extent_map **map,
 			can_nocow = true;
 	}
 
+	prev_len = len;
 	if (can_nocow) {
 		struct extent_map *em2;
 
@@ -7465,8 +7467,6 @@  static int btrfs_get_blocks_direct_write(struct extent_map **map,
 			goto out;
 		}
 	} else {
-		const u64 prev_len = len;
-
 		/* Our caller expects us to free the input extent map. */
 		free_extent_map(em);
 		*map = NULL;
@@ -7497,7 +7497,7 @@  static int btrfs_get_blocks_direct_write(struct extent_map **map,
 	 * We have created our ordered extent, so we can now release our reservation
 	 * for an outstanding extent.
 	 */
-	btrfs_delalloc_release_extents(BTRFS_I(inode), len);
+	btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
 
 	/*
 	 * Need to update the i_size under the extent lock so buffered