diff mbox series

[v3,1/2] btrfs: fix __folio_put refcount in btrfs_do_encoded_write

Message ID 9e23e32fb945c3e1c43f8c0f8ca20552c48d5b65.1719930430.git.boris@bur.io (mailing list archive)
State New
Headers show
Series btrfs: fix __folio_put refcount errors | expand

Commit Message

Boris Burkov July 2, 2024, 2:31 p.m. UTC
The conversion to folios switched __free_page to __folio_put in the
error path in btrfs_do_encoded_write.

However, this gets the page refcounting wrong. If we do hit that error
path (I reproduced by modifying btrfs_do_encoded_write to pretend to
always fail in a way that jumps to out_folios and running the xfstest
btrfs/281), then we always hit the following BUG freeing the folio:

BUG: Bad page state in process btrfs  pfn:40ab0b
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
 flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: nonzero _refcount
Call Trace:
<TASK>
dump_stack_lvl+0x3d/0xe0
bad_page+0xea/0xf0
free_unref_page+0x8e1/0x900
? __mem_cgroup_uncharge+0x69/0x90
__folio_put+0xe6/0x190
btrfs_do_encoded_write+0x445/0x780
? current_time+0x25/0xd0
btrfs_do_write_iter+0x2cc/0x4b0
btrfs_ioctl_encoded_write+0x2b6/0x340

It turns out __free_page dereferenced the page while __folio_put does
not. Switch __folio_put to folio_put which does dereference the folio
first.

Fixes: 400b172b8cdc ("btrfs: compression: migrate compression/decompression paths to folios")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

David Sterba July 2, 2024, 4:19 p.m. UTC | #1
On Tue, Jul 02, 2024 at 07:31:13AM -0700, Boris Burkov wrote:
> The conversion to folios switched __free_page to __folio_put in the
> error path in btrfs_do_encoded_write.
> 
> However, this gets the page refcounting wrong. If we do hit that error
> path (I reproduced by modifying btrfs_do_encoded_write to pretend to
> always fail in a way that jumps to out_folios and running the xfstest
> btrfs/281), then we always hit the following BUG freeing the folio:
> 
> BUG: Bad page state in process btrfs  pfn:40ab0b
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
>  flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
> raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
> raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: nonzero _refcount
> Call Trace:
> <TASK>
> dump_stack_lvl+0x3d/0xe0
> bad_page+0xea/0xf0
> free_unref_page+0x8e1/0x900
> ? __mem_cgroup_uncharge+0x69/0x90
> __folio_put+0xe6/0x190
> btrfs_do_encoded_write+0x445/0x780
> ? current_time+0x25/0xd0
> btrfs_do_write_iter+0x2cc/0x4b0
> btrfs_ioctl_encoded_write+0x2b6/0x340
> 
> It turns out __free_page dereferenced the page while __folio_put does
> not. Switch __folio_put to folio_put which does dereference the folio
> first.

By 'dereferenced' you mean to decrease the reference count? Because
dereference is usually said about pointers, it's confusing in this
context.
Boris Burkov July 2, 2024, 4:51 p.m. UTC | #2
On Tue, Jul 02, 2024 at 06:19:16PM +0200, David Sterba wrote:
> On Tue, Jul 02, 2024 at 07:31:13AM -0700, Boris Burkov wrote:
> > The conversion to folios switched __free_page to __folio_put in the
> > error path in btrfs_do_encoded_write.
> > 
> > However, this gets the page refcounting wrong. If we do hit that error
> > path (I reproduced by modifying btrfs_do_encoded_write to pretend to
> > always fail in a way that jumps to out_folios and running the xfstest
> > btrfs/281), then we always hit the following BUG freeing the folio:
> > 
> > BUG: Bad page state in process btrfs  pfn:40ab0b
> > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b
> >  flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
> > raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000
> > raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000
> > page dumped because: nonzero _refcount
> > Call Trace:
> > <TASK>
> > dump_stack_lvl+0x3d/0xe0
> > bad_page+0xea/0xf0
> > free_unref_page+0x8e1/0x900
> > ? __mem_cgroup_uncharge+0x69/0x90
> > __folio_put+0xe6/0x190
> > btrfs_do_encoded_write+0x445/0x780
> > ? current_time+0x25/0xd0
> > btrfs_do_write_iter+0x2cc/0x4b0
> > btrfs_ioctl_encoded_write+0x2b6/0x340
> > 
> > It turns out __free_page dereferenced the page while __folio_put does
> > not. Switch __folio_put to folio_put which does dereference the folio
> > first.
> 
> By 'dereferenced' you mean to decrease the reference count? Because
> dereference is usually said about pointers, it's confusing in this
> context.

Yes, you're right. It is "removing a reference" but not the best use of
"de-reference" in that case :) "decrement the refcount" is certainly
much clearer.
diff mbox series

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0a11d309ee89..12fb7e8056a1 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9558,7 +9558,7 @@  ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
 out_folios:
 	for (i = 0; i < nr_folios; i++) {
 		if (folios[i])
-			__folio_put(folios[i]);
+			folio_put(folios[i]);
 	}
 	kvfree(folios);
 out: