diff mbox

[v2] Btrfs: add another missing end_page_writeback on submit_extent_page failure

Message ID 1486628673-30631-1-git-send-email-takafumi.kubota1012@sslab.ics.keio.ac.jp (mailing list archive)
State Accepted
Headers show

Commit Message

Takafumi Kubota Feb. 9, 2017, 8:24 a.m. UTC
If btrfs_bio_alloc fails in submit_extent_page, submit_extent_page returns
without clearing the writeback bit of the failed page.

__extent_writepage_io, that is a caller of submit_extent_page,
does not clear the remaining writeback bit anywhere.
As a result, this will cause the hang at filemap_fdatawait_range,
because it waits the writeback bit to be cleared from the failed page.
So, we have to call end_page_writeback to clear the writeback bit.

For reproducing the hang, we inject a fault like

   if (should_failtest()) { // I define should_failtest()
        bio = NULL;
   }
   else {
        bio = btrfs_bio_alloc(...);
   }

in submit_extent_page.

We should also check whether page has the bit before end_page_writeback,
to avoid the conflict against the other end_page_writeback in bio_endio.
Thus, we add PageWriteback checks not only in __extent_writepage_io,
but also in write_one_eb too, because it misses the check.

Signed-off-by: Takafumi Kubota <takafumi.kubota1012@sslab.ics.keio.ac.jp>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Cc: David Sterba <dsterba@suse.cz>
---
 fs/btrfs/extent_io.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

David Sterba Feb. 10, 2017, 4:02 p.m. UTC | #1
On Thu, Feb 09, 2017 at 05:24:33PM +0900, Takafumi Kubota wrote:
> If btrfs_bio_alloc fails in submit_extent_page, submit_extent_page returns
> without clearing the writeback bit of the failed page.
> 
> __extent_writepage_io, that is a caller of submit_extent_page,
> does not clear the remaining writeback bit anywhere.
> As a result, this will cause the hang at filemap_fdatawait_range,
> because it waits the writeback bit to be cleared from the failed page.
> So, we have to call end_page_writeback to clear the writeback bit.
> 
> For reproducing the hang, we inject a fault like
> 
>    if (should_failtest()) { // I define should_failtest()
>         bio = NULL;
>    }
>    else {
>         bio = btrfs_bio_alloc(...);
>    }
> 
> in submit_extent_page.
> 
> We should also check whether page has the bit before end_page_writeback,
> to avoid the conflict against the other end_page_writeback in bio_endio.
> Thus, we add PageWriteback checks not only in __extent_writepage_io,
> but also in write_one_eb too, because it misses the check.
> 
> Signed-off-by: Takafumi Kubota <takafumi.kubota1012@sslab.ics.keio.ac.jp>
> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
> Cc: David Sterba <dsterba@suse.cz>

Added to 4.11 queue, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 4ac383a..aa1908a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3445,8 +3445,11 @@  static noinline_for_stack int __extent_writepage_io(struct inode *inode,
 					 bdev, &epd->bio, max_nr,
 					 end_bio_extent_writepage,
 					 0, 0, 0, false);
-		if (ret)
+		if (ret) {
 			SetPageError(page);
+			if (PageWriteback(page))
+				end_page_writeback(page);
+		}
 
 		cur = cur + iosize;
 		pg_offset += iosize;
@@ -3767,7 +3770,8 @@  static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
 		epd->bio_flags = bio_flags;
 		if (ret) {
 			set_btree_ioerr(p);
-			end_page_writeback(p);
+			if (PageWriteback(p))
+				end_page_writeback(p);
 			if (atomic_sub_and_test(num_pages - i, &eb->io_pages))
 				end_extent_buffer_writeback(eb);
 			ret = -EIO;