Message ID | 20211014071634.29738-1-wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix a triggered ASSERT() for generic/029 when executed with compression | expand |
On Thu, Oct 14, 2021 at 03:16:34PM +0800, Qu Wenruo wrote: > [BUG] > When running generic/029 test with "-o compress" mount option, it will > crash with the following ASSERT() triggered: > > assertion failed: PageLocked(page), in fs/btrfs/extent_io.c:5150 > ------------[ cut here ]------------ > kernel BUG at fs/btrfs/ctree.h:3495! > RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs] > Call Trace: > extent_write_locked_range.cold+0x11/0x44 [btrfs] > submit_compressed_extents+0x42f/0x440 [btrfs] > async_cow_submit+0x37/0x90 [btrfs] > btrfs_work_helper+0x132/0x3e0 [btrfs] > process_one_work+0x294/0x5d0 > worker_thread+0x55/0x3c0 > kthread+0x140/0x170 > ret_from_fork+0x22/0x30 > ---[ end trace 32cf9a3b0c96a939 ]--- > > [CAUSE] > Function extent_write_locked_range() expects all its pages being locked > since btrfs_run_delalloc_range(). > Thus the ASSERT() in it is doing the correct check. > > The problem is in submit_uncompressed_range(), which calls > cow_file_range() to create ordered extent for it. > > But since cow_file_range() can inline the data, callers must check if > cow_file_range() created an inline extent, and handle it properly. > > Unfortunately commit c12036efa894 ("btrfs: factor uncompressed async > extent submission code into a new helper") did a wrong refactor, which > uses "ret > 0" to check if cow_file_range() created an inline extent. > > The correct check should be "page_start". > > This causes submit_compressed_extents() always call > extent_write_locked_range() even we have created an inline extent. > > And when an inline extent is created, the page is unlocked and dirty bit > cleared, which will trigger the new ASSERT() in > extent_write_locked_range(). > > [FIX] > Use the proper condition to check if cow_file_range() created an inline > extent. > > Reported-by: Josef Bacik <josef@toxicpanda.com> > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > > This commit can be fixed into the offending refactor. > It looks like I can't escape the recent destiny that refactors are > introducing more bugs. This is what our continual testing is meant to catch, be happy the process is working! I do hate the return value here makes it a bit tricky to figure out what's going on, looks like we just eat the ret == 1 value in cow_file_range(), so we're one unfortunate refactor away from accidentally returning 1 from cow_file_range. We should probably rework that so it's less confusing and less error prone. But that's for another day, you can add Reviewed-by: Josef Bacik <josef@toxicpanda.com> to this, thanks, Josef
On Thu, Oct 14, 2021 at 10:41:17AM -0400, Josef Bacik wrote: > > This commit can be fixed into the offending refactor. > > It looks like I can't escape the recent destiny that refactors are > > introducing more bugs. Folded to the patch, thanks. > This is what our continual testing is meant to catch, be happy the process is > working! Yep, I did not see the crash as I don't run the default setup with -o compress.
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index b9c70a073a24..cb802a55391c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -899,7 +899,7 @@ static int submit_uncompressed_range(struct btrfs_inode *inode, ret = cow_file_range(inode, locked_page, start, end, &page_started, &nr_written, 0); /* Inline extent inserted, page gets unlocked and everything is done */ - if (ret > 0) { + if (page_started) { ret = 0; goto out; }
[BUG] When running generic/029 test with "-o compress" mount option, it will crash with the following ASSERT() triggered: assertion failed: PageLocked(page), in fs/btrfs/extent_io.c:5150 ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.h:3495! RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs] Call Trace: extent_write_locked_range.cold+0x11/0x44 [btrfs] submit_compressed_extents+0x42f/0x440 [btrfs] async_cow_submit+0x37/0x90 [btrfs] btrfs_work_helper+0x132/0x3e0 [btrfs] process_one_work+0x294/0x5d0 worker_thread+0x55/0x3c0 kthread+0x140/0x170 ret_from_fork+0x22/0x30 ---[ end trace 32cf9a3b0c96a939 ]--- [CAUSE] Function extent_write_locked_range() expects all its pages being locked since btrfs_run_delalloc_range(). Thus the ASSERT() in it is doing the correct check. The problem is in submit_uncompressed_range(), which calls cow_file_range() to create ordered extent for it. But since cow_file_range() can inline the data, callers must check if cow_file_range() created an inline extent, and handle it properly. Unfortunately commit c12036efa894 ("btrfs: factor uncompressed async extent submission code into a new helper") did a wrong refactor, which uses "ret > 0" to check if cow_file_range() created an inline extent. The correct check should be "page_start". This causes submit_compressed_extents() always call extent_write_locked_range() even we have created an inline extent. And when an inline extent is created, the page is unlocked and dirty bit cleared, which will trigger the new ASSERT() in extent_write_locked_range(). [FIX] Use the proper condition to check if cow_file_range() created an inline extent. Reported-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> --- This commit can be fixed into the offending refactor. It looks like I can't escape the recent destiny that refactors are introducing more bugs. --- fs/btrfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)