diff mbox series

[v2] btrfs: fix __folio_put with folio refcount held

Message ID 511fdb3b7fc4293159d4af8e62775c2c1eb95250.1719874991.git.boris@bur.io (mailing list archive)
State New
Headers show
Series [v2] btrfs: fix __folio_put with folio refcount held | expand

Commit Message

Boris Burkov July 1, 2024, 11:04 p.m. UTC
The conversion to folios switched __free_page to __folio_put in the
error path in btrfs_do_encoded_write and __alloc_dummy_extent_buffer.

However, this gets the page refcounting wrong. If we do hit that error
path (I reproduced by modifying btrfs_do_encoded_write to pretend to
always fail in a way that jumps to out_folios and running the xfstest
btrfs/281), then we always hit the following BUG freeing the folio:

BUG: Bad page state in process btrfs  pfn:40ab0b
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
 flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: nonzero _refcount
Call Trace:
<TASK>
dump_stack_lvl+0x3d/0xe0
bad_page+0xea/0xf0
free_unref_page+0x8e1/0x900
? __mem_cgroup_uncharge+0x69/0x90
__folio_put+0xe6/0x190
btrfs_do_encoded_write+0x445/0x780
? current_time+0x25/0xd0
btrfs_do_write_iter+0x2cc/0x4b0
btrfs_ioctl_encoded_write+0x2b6/0x340

It turns out __free_page dereferenced the page while __folio_put does
not. Switch __folio_put to folio_put which does dereference the folio
first.

Fixes: 400b172b8cdc ("btrfs: compression: migrate compression/decompression paths to folios")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
Changelog
v2:
- Fixed __folio_put in __alloc_dummy_extent_buffer as well

 fs/btrfs/extent_io.c | 2 +-
 fs/btrfs/inode.c     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Filipe Manana July 2, 2024, 9:50 a.m. UTC | #1
On Tue, Jul 2, 2024 at 12:05 AM Boris Burkov <boris@bur.io> wrote:
>
> The conversion to folios switched __free_page to __folio_put in the
> error path in btrfs_do_encoded_write and __alloc_dummy_extent_buffer.
>
> However, this gets the page refcounting wrong. If we do hit that error
> path (I reproduced by modifying btrfs_do_encoded_write to pretend to
> always fail in a way that jumps to out_folios and running the xfstest
> btrfs/281), then we always hit the following BUG freeing the folio:
>
> BUG: Bad page state in process btrfs  pfn:40ab0b
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
>  flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
> raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
> raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: nonzero _refcount
> Call Trace:
> <TASK>
> dump_stack_lvl+0x3d/0xe0
> bad_page+0xea/0xf0
> free_unref_page+0x8e1/0x900
> ? __mem_cgroup_uncharge+0x69/0x90
> __folio_put+0xe6/0x190
> btrfs_do_encoded_write+0x445/0x780
> ? current_time+0x25/0xd0
> btrfs_do_write_iter+0x2cc/0x4b0
> btrfs_ioctl_encoded_write+0x2b6/0x340
>
> It turns out __free_page dereferenced the page while __folio_put does
> not. Switch __folio_put to folio_put which does dereference the folio
> first.
>
> Fixes: 400b172b8cdc ("btrfs: compression: migrate compression/decompression paths to folios")

So this landed in 6.10-rc1, and corresponds to one change in this
patch, the one for inode.c.

> Reviewed-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
> Changelog
> v2:
> - Fixed __folio_put in __alloc_dummy_extent_buffer as well
>
>  fs/btrfs/extent_io.c | 2 +-
>  fs/btrfs/inode.c     | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index d3ce07ab9692..cb315779af30 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2775,7 +2775,7 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
>         for (int i = 0; i < num_folios; i++) {
>                 if (eb->folios[i]) {
>                         detach_extent_buffer_folio(eb, eb->folios[i]);
> -                       __folio_put(eb->folios[i]);
> +                       folio_put(eb->folios[i]);

Now this bug was introduced in commit 13df3775efca ("btrfs: cleanup
metadata page pointer usage"), which was added in kernel 6.8-rc1.

Why only one Fixes tag in the changelog?
And how do you expect this to be backported?

Two separate patches will be less confusing when it comes to
backporting, given the different kernel versions.



>                 }
>         }
>         __free_extent_buffer(eb);
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 0a11d309ee89..12fb7e8056a1 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9558,7 +9558,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
>  out_folios:
>         for (i = 0; i < nr_folios; i++) {
>                 if (folios[i])
> -                       __folio_put(folios[i]);
> +                       folio_put(folios[i]);
>         }
>         kvfree(folios);
>  out:
> --
> 2.45.2
>
>
diff mbox series

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d3ce07ab9692..cb315779af30 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2775,7 +2775,7 @@  struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info,
 	for (int i = 0; i < num_folios; i++) {
 		if (eb->folios[i]) {
 			detach_extent_buffer_folio(eb, eb->folios[i]);
-			__folio_put(eb->folios[i]);
+			folio_put(eb->folios[i]);
 		}
 	}
 	__free_extent_buffer(eb);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0a11d309ee89..12fb7e8056a1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9558,7 +9558,7 @@  ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
 out_folios:
 	for (i = 0; i < nr_folios; i++) {
 		if (folios[i])
-			__folio_put(folios[i]);
+			folio_put(folios[i]);
 	}
 	kvfree(folios);
 out: