diff mbox series

btrfs: fix a triggered ASSERT() for generic/029 when executed with compression

Message ID 20211014071634.29738-1-wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix a triggered ASSERT() for generic/029 when executed with compression | expand

Commit Message

Qu Wenruo Oct. 14, 2021, 7:16 a.m. UTC
[BUG]
When running generic/029 test with "-o compress" mount option, it will
crash with the following ASSERT() triggered:

  assertion failed: PageLocked(page), in fs/btrfs/extent_io.c:5150
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/ctree.h:3495!
  RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
  Call Trace:
   extent_write_locked_range.cold+0x11/0x44 [btrfs]
   submit_compressed_extents+0x42f/0x440 [btrfs]
   async_cow_submit+0x37/0x90 [btrfs]
   btrfs_work_helper+0x132/0x3e0 [btrfs]
   process_one_work+0x294/0x5d0
   worker_thread+0x55/0x3c0
   kthread+0x140/0x170
   ret_from_fork+0x22/0x30
  ---[ end trace 32cf9a3b0c96a939 ]---

[CAUSE]
Function extent_write_locked_range() expects all its pages being locked
since btrfs_run_delalloc_range().
Thus the ASSERT() in it is doing the correct check.

The problem is in submit_uncompressed_range(), which calls
cow_file_range() to create ordered extent for it.

But since cow_file_range() can inline the data, callers must check if
cow_file_range() created an inline extent, and handle it properly.

Unfortunately commit c12036efa894 ("btrfs: factor uncompressed async
extent submission code into a new helper") did a wrong refactor, which
uses "ret > 0" to check if cow_file_range() created an inline extent.

The correct check should be "page_start".

This causes submit_compressed_extents() always call
extent_write_locked_range() even we have created an inline extent.

And when an inline extent is created, the page is unlocked and dirty bit
cleared, which will trigger the new ASSERT() in
extent_write_locked_range().

[FIX]
Use the proper condition to check if cow_file_range() created an inline
extent.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---

This commit can be fixed into the offending refactor.
It looks like I can't escape the recent destiny that refactors are
introducing more bugs.
---
 fs/btrfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Josef Bacik Oct. 14, 2021, 2:41 p.m. UTC | #1
On Thu, Oct 14, 2021 at 03:16:34PM +0800, Qu Wenruo wrote:
> [BUG]
> When running generic/029 test with "-o compress" mount option, it will
> crash with the following ASSERT() triggered:
> 
>   assertion failed: PageLocked(page), in fs/btrfs/extent_io.c:5150
>   ------------[ cut here ]------------
>   kernel BUG at fs/btrfs/ctree.h:3495!
>   RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
>   Call Trace:
>    extent_write_locked_range.cold+0x11/0x44 [btrfs]
>    submit_compressed_extents+0x42f/0x440 [btrfs]
>    async_cow_submit+0x37/0x90 [btrfs]
>    btrfs_work_helper+0x132/0x3e0 [btrfs]
>    process_one_work+0x294/0x5d0
>    worker_thread+0x55/0x3c0
>    kthread+0x140/0x170
>    ret_from_fork+0x22/0x30
>   ---[ end trace 32cf9a3b0c96a939 ]---
> 
> [CAUSE]
> Function extent_write_locked_range() expects all its pages being locked
> since btrfs_run_delalloc_range().
> Thus the ASSERT() in it is doing the correct check.
> 
> The problem is in submit_uncompressed_range(), which calls
> cow_file_range() to create ordered extent for it.
> 
> But since cow_file_range() can inline the data, callers must check if
> cow_file_range() created an inline extent, and handle it properly.
> 
> Unfortunately commit c12036efa894 ("btrfs: factor uncompressed async
> extent submission code into a new helper") did a wrong refactor, which
> uses "ret > 0" to check if cow_file_range() created an inline extent.
> 
> The correct check should be "page_start".
> 
> This causes submit_compressed_extents() always call
> extent_write_locked_range() even we have created an inline extent.
> 
> And when an inline extent is created, the page is unlocked and dirty bit
> cleared, which will trigger the new ASSERT() in
> extent_write_locked_range().
> 
> [FIX]
> Use the proper condition to check if cow_file_range() created an inline
> extent.
> 
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> 
> This commit can be fixed into the offending refactor.
> It looks like I can't escape the recent destiny that refactors are
> introducing more bugs.

This is what our continual testing is meant to catch, be happy the process is
working!

I do hate the return value here makes it a bit tricky to figure out what's going
on, looks like we just eat the ret == 1 value in cow_file_range(), so we're one
unfortunate refactor away from accidentally returning 1 from cow_file_range.  We
should probably rework that so it's less confusing and less error prone.

But that's for another day, you can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

to this, thanks,

Josef
David Sterba Oct. 14, 2021, 3:06 p.m. UTC | #2
On Thu, Oct 14, 2021 at 10:41:17AM -0400, Josef Bacik wrote:
> > This commit can be fixed into the offending refactor.
> > It looks like I can't escape the recent destiny that refactors are
> > introducing more bugs.

Folded to the patch, thanks.

> This is what our continual testing is meant to catch, be happy the process is
> working!

Yep, I did not see the crash as I don't run the default setup with
-o compress.
diff mbox series

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b9c70a073a24..cb802a55391c 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -899,7 +899,7 @@  static int submit_uncompressed_range(struct btrfs_inode *inode,
 	ret = cow_file_range(inode, locked_page, start, end, &page_started,
 			     &nr_written, 0);
 	/* Inline extent inserted, page gets unlocked and everything is done */
-	if (ret > 0) {
+	if (page_started) {
 		ret = 0;
 		goto out;
 	}