diff mbox series

[v2,3/4] btrfs: fix error handling of fallbacked uncompress write

Message ID 7347f1de449c3a3f36690b816c2ded133508c5c2.1655791781.git.naohiro.aota@wdc.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix error handling of cow_file_range(unlock = 0) | expand

Commit Message

Naohiro Aota June 21, 2022, 6:41 a.m. UTC
When cow_file_range() fails in the middle of the allocation loop, it
unlocks the pages but leaves the ordered extents intact. Thus, we need to
call btrfs_cleanup_ordered_extents() to finish the created ordered extents.

Also, we need to call end_extent_writepage() if locked_page is available
because btrfs_cleanup_ordered_extents() never process the region on the
locked_page.

Furthermore, we need to set the mapping as error if locked_page is
unavailable before unlocking the pages, so that the errno is properly
propagated to the userland.

CC: stable@vger.kernel.org # 5.18+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
I choose 5.18+ as the target as they are after refactoring and we can apply
the series cleanly. Technically, older versions potentially have the same
issue, but it might not happen actually. So, let's choose easy targets to
apply.
---
 fs/btrfs/inode.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

Comments

Filipe Manana June 21, 2022, 3:04 p.m. UTC | #1
On Tue, Jun 21, 2022 at 03:41:01PM +0900, Naohiro Aota wrote:
> When cow_file_range() fails in the middle of the allocation loop, it
> unlocks the pages but leaves the ordered extents intact. Thus, we need to
> call btrfs_cleanup_ordered_extents() to finish the created ordered extents.
> 
> Also, we need to call end_extent_writepage() if locked_page is available
> because btrfs_cleanup_ordered_extents() never process the region on the
> locked_page.
> 
> Furthermore, we need to set the mapping as error if locked_page is
> unavailable before unlocking the pages, so that the errno is properly
> propagated to the userland.
> 
> CC: stable@vger.kernel.org # 5.18+
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> ---
> I choose 5.18+ as the target as they are after refactoring and we can apply
> the series cleanly. Technically, older versions potentially have the same
> issue, but it might not happen actually. So, let's choose easy targets to
> apply.
> ---
>  fs/btrfs/inode.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 326150552e57..38d8e6d78e77 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -933,8 +933,18 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
>  		goto out;
>  	}
>  	if (ret < 0) {
> -		if (locked_page)
> +		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
> +		if (locked_page) {
> +			u64 page_start = page_offset(locked_page);
> +			u64 page_end = page_start + PAGE_SIZE - 1;
> +
> +			btrfs_page_set_error(inode->root->fs_info, locked_page,
> +					     page_start, PAGE_SIZE);
> +			set_page_writeback(locked_page);
> +			end_page_writeback(locked_page);
> +			end_extent_writepage(locked_page, ret, page_start, page_end);
>  			unlock_page(locked_page);
> +		}
>  		goto out;
>  	}
>  
> @@ -1383,9 +1393,12 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
>  	 * However, in case of unlock == 0, we still need to unlock the pages
>  	 * (except @locked_page) to ensure all the pages are unlocked.
>  	 */
> -	if (!unlock && orig_start < start)
> +	if (!unlock && orig_start < start) {
> +		if (!locked_page)
> +			mapping_set_error(inode->vfs_inode.i_mapping, ret);

Instead of this we can pass PAGE_SET_ERROR in page_ops, which will result in
setting the error in the inode's mapping.

In fact we currently only mark the locked page with error (at writepage_delalloc()).
However all pages before and after it are still locked and we don't set the error on
them, I think we should - I don't see why not.

Did I miss something (again)?

Sorry I only noticed this now.

Thanks.

>  		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
>  					     locked_page, 0, page_ops);
> +	}
>  
>  	/*
>  	 * For the range (2). If we reserved an extent for our delalloc range
> -- 
> 2.35.1
>
Naohiro Aota June 22, 2022, 12:56 a.m. UTC | #2
On Tue, Jun 21, 2022 at 04:04:14PM +0100, Filipe Manana wrote:
> On Tue, Jun 21, 2022 at 03:41:01PM +0900, Naohiro Aota wrote:
> > When cow_file_range() fails in the middle of the allocation loop, it
> > unlocks the pages but leaves the ordered extents intact. Thus, we need to
> > call btrfs_cleanup_ordered_extents() to finish the created ordered extents.
> > 
> > Also, we need to call end_extent_writepage() if locked_page is available
> > because btrfs_cleanup_ordered_extents() never process the region on the
> > locked_page.
> > 
> > Furthermore, we need to set the mapping as error if locked_page is
> > unavailable before unlocking the pages, so that the errno is properly
> > propagated to the userland.
> > 
> > CC: stable@vger.kernel.org # 5.18+
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > ---
> > I choose 5.18+ as the target as they are after refactoring and we can apply
> > the series cleanly. Technically, older versions potentially have the same
> > issue, but it might not happen actually. So, let's choose easy targets to
> > apply.
> > ---
> >  fs/btrfs/inode.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index 326150552e57..38d8e6d78e77 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -933,8 +933,18 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
> >  		goto out;
> >  	}
> >  	if (ret < 0) {
> > -		if (locked_page)
> > +		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
> > +		if (locked_page) {
> > +			u64 page_start = page_offset(locked_page);
> > +			u64 page_end = page_start + PAGE_SIZE - 1;
> > +
> > +			btrfs_page_set_error(inode->root->fs_info, locked_page,
> > +					     page_start, PAGE_SIZE);
> > +			set_page_writeback(locked_page);
> > +			end_page_writeback(locked_page);
> > +			end_extent_writepage(locked_page, ret, page_start, page_end);
> >  			unlock_page(locked_page);
> > +		}
> >  		goto out;
> >  	}
> >  
> > @@ -1383,9 +1393,12 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
> >  	 * However, in case of unlock == 0, we still need to unlock the pages
> >  	 * (except @locked_page) to ensure all the pages are unlocked.
> >  	 */
> > -	if (!unlock && orig_start < start)
> > +	if (!unlock && orig_start < start) {
> > +		if (!locked_page)
> > +			mapping_set_error(inode->vfs_inode.i_mapping, ret);
> 
> Instead of this we can pass PAGE_SET_ERROR in page_ops, which will result in
> setting the error in the inode's mapping.

PAGE_SET_ERROR always set the error as EIO, so it cannot propagate other
errors returned by cow_file_range() to the userland. Thus, I choose
mapping_set_error() here.

> 
> In fact we currently only mark the locked page with error (at writepage_delalloc()).

Yes and at submit_uncompressed_range() with this series applied.

> However all pages before and after it are still locked and we don't set the error on
> them, I think we should - I don't see why not.

No, we unlock all pages except locked_pages in the error case, and yes, we
set the error only on the locked_page. So, that will make the other pages
to be writebacked later again (and make tons of the
btrfs_writepage_fixup_work). That is common among unlock = 0 case and
unlock = 1 case. Should we change that?

Setting all the pages as error would be good to avoid the tons of fixup in
this case.

> Did I miss something (again)?
> 
> Sorry I only noticed this now.
> 
> Thanks.
> 
> >  		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
> >  					     locked_page, 0, page_ops);
> > +	}
> >  
> >  	/*
> >  	 * For the range (2). If we reserved an extent for our delalloc range
> > -- 
> > 2.35.1
> >
Filipe Manana June 22, 2022, 9:13 a.m. UTC | #3
On Wed, Jun 22, 2022 at 12:56:25AM +0000, Naohiro Aota wrote:
> On Tue, Jun 21, 2022 at 04:04:14PM +0100, Filipe Manana wrote:
> > On Tue, Jun 21, 2022 at 03:41:01PM +0900, Naohiro Aota wrote:
> > > When cow_file_range() fails in the middle of the allocation loop, it
> > > unlocks the pages but leaves the ordered extents intact. Thus, we need to
> > > call btrfs_cleanup_ordered_extents() to finish the created ordered extents.
> > > 
> > > Also, we need to call end_extent_writepage() if locked_page is available
> > > because btrfs_cleanup_ordered_extents() never process the region on the
> > > locked_page.
> > > 
> > > Furthermore, we need to set the mapping as error if locked_page is
> > > unavailable before unlocking the pages, so that the errno is properly
> > > propagated to the userland.
> > > 
> > > CC: stable@vger.kernel.org # 5.18+
> > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> > > ---
> > > I choose 5.18+ as the target as they are after refactoring and we can apply
> > > the series cleanly. Technically, older versions potentially have the same
> > > issue, but it might not happen actually. So, let's choose easy targets to
> > > apply.
> > > ---
> > >  fs/btrfs/inode.c | 17 +++++++++++++++--
> > >  1 file changed, 15 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > index 326150552e57..38d8e6d78e77 100644
> > > --- a/fs/btrfs/inode.c
> > > +++ b/fs/btrfs/inode.c
> > > @@ -933,8 +933,18 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
> > >  		goto out;
> > >  	}
> > >  	if (ret < 0) {
> > > -		if (locked_page)
> > > +		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
> > > +		if (locked_page) {
> > > +			u64 page_start = page_offset(locked_page);
> > > +			u64 page_end = page_start + PAGE_SIZE - 1;
> > > +
> > > +			btrfs_page_set_error(inode->root->fs_info, locked_page,
> > > +					     page_start, PAGE_SIZE);
> > > +			set_page_writeback(locked_page);
> > > +			end_page_writeback(locked_page);
> > > +			end_extent_writepage(locked_page, ret, page_start, page_end);
> > >  			unlock_page(locked_page);
> > > +		}
> > >  		goto out;
> > >  	}
> > >  
> > > @@ -1383,9 +1393,12 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
> > >  	 * However, in case of unlock == 0, we still need to unlock the pages
> > >  	 * (except @locked_page) to ensure all the pages are unlocked.
> > >  	 */
> > > -	if (!unlock && orig_start < start)
> > > +	if (!unlock && orig_start < start) {
> > > +		if (!locked_page)
> > > +			mapping_set_error(inode->vfs_inode.i_mapping, ret);
> > 
> > Instead of this we can pass PAGE_SET_ERROR in page_ops, which will result in
> > setting the error in the inode's mapping.
> 
> PAGE_SET_ERROR always set the error as EIO, so it cannot propagate other
> errors returned by cow_file_range() to the userland. Thus, I choose
> mapping_set_error() here.

Ok, actually only EIO and ENOSPC are used as mapping errors. mapping_set_error()
converts any error other than ENOSPC to AS_EIO. And it's been like for many years
as far as I remember.

> 
> > 
> > In fact we currently only mark the locked page with error (at writepage_delalloc()).
> 
> Yes and at submit_uncompressed_range() with this series applied.
> 
> > However all pages before and after it are still locked and we don't set the error on
> > them, I think we should - I don't see why not.
> 
> No, we unlock all pages except locked_pages in the error case, and yes, we

Yes, but by passing PAGE_SET_ERROR, we will end up setting the error bit on the
pages before unlocking them.

> set the error only on the locked_page. So, that will make the other pages
> to be writebacked later again (and make tons of the
> btrfs_writepage_fixup_work). That is common among unlock = 0 case and
> unlock = 1 case. Should we change that?

I don't see why not. But that can be done later in a separate patch, no worries.
Just something I think I didn't notice before.

Ok, the whole patchset looks good to me, thanks.

> 
> Setting all the pages as error would be good to avoid the tons of fixup in
> this case.
> 
> > Did I miss something (again)?
> > 
> > Sorry I only noticed this now.
> > 
> > Thanks.
> > 
> > >  		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
> > >  					     locked_page, 0, page_ops);
> > > +	}
> > >  
> > >  	/*
> > >  	 * For the range (2). If we reserved an extent for our delalloc range
> > > -- 
> > > 2.35.1
> > >
diff mbox series

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 326150552e57..38d8e6d78e77 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -933,8 +933,18 @@  static int submit_uncompressed_range(struct btrfs_inode *inode,
 		goto out;
 	}
 	if (ret < 0) {
-		if (locked_page)
+		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
+		if (locked_page) {
+			u64 page_start = page_offset(locked_page);
+			u64 page_end = page_start + PAGE_SIZE - 1;
+
+			btrfs_page_set_error(inode->root->fs_info, locked_page,
+					     page_start, PAGE_SIZE);
+			set_page_writeback(locked_page);
+			end_page_writeback(locked_page);
+			end_extent_writepage(locked_page, ret, page_start, page_end);
 			unlock_page(locked_page);
+		}
 		goto out;
 	}
 
@@ -1383,9 +1393,12 @@  static noinline int cow_file_range(struct btrfs_inode *inode,
 	 * However, in case of unlock == 0, we still need to unlock the pages
 	 * (except @locked_page) to ensure all the pages are unlocked.
 	 */
-	if (!unlock && orig_start < start)
+	if (!unlock && orig_start < start) {
+		if (!locked_page)
+			mapping_set_error(inode->vfs_inode.i_mapping, ret);
 		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
 					     locked_page, 0, page_ops);
+	}
 
 	/*
 	 * For the range (2). If we reserved an extent for our delalloc range