Message ID | 9e23e32fb945c3e1c43f8c0f8ca20552c48d5b65.1719930430.git.boris@bur.io (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix __folio_put refcount errors | expand |
On Tue, Jul 02, 2024 at 07:31:13AM -0700, Boris Burkov wrote: > The conversion to folios switched __free_page to __folio_put in the > error path in btrfs_do_encoded_write. > > However, this gets the page refcounting wrong. If we do hit that error > path (I reproduced by modifying btrfs_do_encoded_write to pretend to > always fail in a way that jumps to out_folios and running the xfstest > btrfs/281), then we always hit the following BUG freeing the folio: > > BUG: Bad page state in process btrfs pfn:40ab0b > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b > flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff) > raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000 > raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000 > page dumped because: nonzero _refcount > Call Trace: > <TASK> > dump_stack_lvl+0x3d/0xe0 > bad_page+0xea/0xf0 > free_unref_page+0x8e1/0x900 > ? __mem_cgroup_uncharge+0x69/0x90 > __folio_put+0xe6/0x190 > btrfs_do_encoded_write+0x445/0x780 > ? current_time+0x25/0xd0 > btrfs_do_write_iter+0x2cc/0x4b0 > btrfs_ioctl_encoded_write+0x2b6/0x340 > > It turns out __free_page dereferenced the page while __folio_put does > not. Switch __folio_put to folio_put which does dereference the folio > first. By 'dereferenced' you mean to decrease the reference count? Because dereference is usually said about pointers, it's confusing in this context.
On Tue, Jul 02, 2024 at 06:19:16PM +0200, David Sterba wrote: > On Tue, Jul 02, 2024 at 07:31:13AM -0700, Boris Burkov wrote: > > The conversion to folios switched __free_page to __folio_put in the > > error path in btrfs_do_encoded_write. > > > > However, this gets the page refcounting wrong. If we do hit that error > > path (I reproduced by modifying btrfs_do_encoded_write to pretend to > > always fail in a way that jumps to out_folios and running the xfstest > > btrfs/281), then we always hit the following BUG freeing the folio: > > > > BUG: Bad page state in process btrfs pfn:40ab0b > > page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x61be5 pfn:0x40ab0b > > flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff) > > raw: 05ffff0000000000 0000000000000000 dead000000000122 0000000000000000 > > raw: 0000000000061be5 0000000000000000 00000001ffffffff 0000000000000000 > > page dumped because: nonzero _refcount > > Call Trace: > > <TASK> > > dump_stack_lvl+0x3d/0xe0 > > bad_page+0xea/0xf0 > > free_unref_page+0x8e1/0x900 > > ? __mem_cgroup_uncharge+0x69/0x90 > > __folio_put+0xe6/0x190 > > btrfs_do_encoded_write+0x445/0x780 > > ? current_time+0x25/0xd0 > > btrfs_do_write_iter+0x2cc/0x4b0 > > btrfs_ioctl_encoded_write+0x2b6/0x340 > > > > It turns out __free_page dereferenced the page while __folio_put does > > not. Switch __folio_put to folio_put which does dereference the folio > > first. > > By 'dereferenced' you mean to decrease the reference count? Because > dereference is usually said about pointers, it's confusing in this > context. Yes, you're right. It is "removing a reference" but not the best use of "de-reference" in that case :) "decrement the refcount" is certainly much clearer.
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0a11d309ee89..12fb7e8056a1 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9558,7 +9558,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from, out_folios: for (i = 0; i < nr_folios; i++) { if (folios[i]) - __folio_put(folios[i]); + folio_put(folios[i]); } kvfree(folios); out: