diff mbox

[5/7] Btrfs: replace BUG_ON with WARN in merge_bio

Message ID 1463184422-13584-5-git-send-email-bo.li.liu@oracle.com (mailing list archive)
State Superseded
Headers show

Commit Message

Liu Bo May 14, 2016, 12:07 a.m. UTC
We have two BUG_ON in merge_bio, but since it can gracefully return errors
to callers, use WARN_ONCE to give error information and don't leave a
possible panic.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/extent_io.c | 1 -
 fs/btrfs/inode.c     | 6 ++++--
 2 files changed, 4 insertions(+), 3 deletions(-)

Comments

David Sterba May 16, 2016, 8:44 a.m. UTC | #1
On Fri, May 13, 2016 at 05:07:00PM -0700, Liu Bo wrote:
> We have two BUG_ON in merge_bio, but since it can gracefully return errors
> to callers, use WARN_ONCE to give error information and don't leave a
> possible panic.
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/extent_io.c | 1 -
>  fs/btrfs/inode.c     | 6 ++++--
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index e601e0f..99286d1 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -2746,7 +2746,6 @@ static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
>  	if (tree->ops && tree->ops->merge_bio_hook)
>  		ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
>  						bio_flags);
> -	BUG_ON(ret < 0);
>  	return ret;
>  
>  }
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 5874562..3a989e3 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -1827,8 +1827,10 @@ int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
>  	map_length = length;
>  	ret = btrfs_map_block(root->fs_info, rw, logical,
>  			      &map_length, NULL, 0);
> -	/* Will always return 0 with map_multi == NULL */
> -	BUG_ON(ret < 0);
> +	if (ret < 0) {
> +		WARN_ONCE(ret < 0, KERN_ERR "ret = %d\n", ret);

btrfs_map_block is quite verbose about the errors so it's not needed to
print it here.

Otherwise I'm not sure if all paths that go through the merge hook
handle errors, eg. in btrfs_submit_compressed_read or _write. Some code
is skipped if merge hook returns nonzero. But, the code expects either 0
or 1, and when the BUG_ON(ret < 0) is removed suddenly the 'ret < 0' can
be returned. Unexpected.

It's IMO better to push up the BUG_ON error handling only one caller at
a time. That way it's easier to review the callgraph and call paths.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo May 16, 2016, 5:24 p.m. UTC | #2
On Mon, May 16, 2016 at 10:44:38AM +0200, David Sterba wrote:
> On Fri, May 13, 2016 at 05:07:00PM -0700, Liu Bo wrote:
> > We have two BUG_ON in merge_bio, but since it can gracefully return errors
> > to callers, use WARN_ONCE to give error information and don't leave a
> > possible panic.
> > 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/extent_io.c | 1 -
> >  fs/btrfs/inode.c     | 6 ++++--
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index e601e0f..99286d1 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -2746,7 +2746,6 @@ static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
> >  	if (tree->ops && tree->ops->merge_bio_hook)
> >  		ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
> >  						bio_flags);
> > -	BUG_ON(ret < 0);
> >  	return ret;
> >  
> >  }
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index 5874562..3a989e3 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -1827,8 +1827,10 @@ int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
> >  	map_length = length;
> >  	ret = btrfs_map_block(root->fs_info, rw, logical,
> >  			      &map_length, NULL, 0);
> > -	/* Will always return 0 with map_multi == NULL */
> > -	BUG_ON(ret < 0);
> > +	if (ret < 0) {
> > +		WARN_ONCE(ret < 0, KERN_ERR "ret = %d\n", ret);
> 
> btrfs_map_block is quite verbose about the errors so it's not needed to
> print it here.

OK.

> 
> Otherwise I'm not sure if all paths that go through the merge hook
> handle errors, eg. in btrfs_submit_compressed_read or _write. Some code
> is skipped if merge hook returns nonzero. But, the code expects either 0
> or 1, and when the BUG_ON(ret < 0) is removed suddenly the 'ret < 0' can
> be returned. Unexpected.

Right now btrfs_merge_bio_hook() only returns 1 or 0 and marks (ret < 0) as
 BUG_ON.

But compress code is ready to handle the error,

btrfs_submit_compressed_read/write() {
	...
	ret = merge_bio_hook()
	if (ret || bio_add_page()) {
		ret = btrfs_map_bio();
		BUG_ON(ret);
		...
	}
	...
}

So ret < 0 is handled, if there's any errors from merge_bio_hook(), it submits
 the current bio, so the way is sane to me.

The other consumer of merge_bio is submit_extent_page(), where it's OK
to return errors and callers are ready to handle them.

> 
> It's IMO better to push up the BUG_ON error handling only one caller at
> a time. That way it's easier to review the callgraph and call paths.

merge_bio() is a wrapper for tree->ops->merge_bio_hook(), which is the
same thing with btrfs_merge_bio_hook().  It makes no sense if we just
look at BUG_ON in btrfs_merge_bio_hook but keep BUG_ON() in merge_bio().

Thanks,

-liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba May 17, 2016, 9:55 a.m. UTC | #3
On Mon, May 16, 2016 at 10:24:01AM -0700, Liu Bo wrote:
> On Mon, May 16, 2016 at 10:44:38AM +0200, David Sterba wrote:
> > On Fri, May 13, 2016 at 05:07:00PM -0700, Liu Bo wrote:
> > > We have two BUG_ON in merge_bio, but since it can gracefully return errors
> > > to callers, use WARN_ONCE to give error information and don't leave a
> > > possible panic.
> > > 
> > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > > ---
> > >  fs/btrfs/extent_io.c | 1 -
> > >  fs/btrfs/inode.c     | 6 ++++--
> > >  2 files changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > > index e601e0f..99286d1 100644
> > > --- a/fs/btrfs/extent_io.c
> > > +++ b/fs/btrfs/extent_io.c
> > > @@ -2746,7 +2746,6 @@ static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
> > >  	if (tree->ops && tree->ops->merge_bio_hook)
> > >  		ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
> > >  						bio_flags);
> > > -	BUG_ON(ret < 0);
> > >  	return ret;
> > >  
> > >  }
> > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > index 5874562..3a989e3 100644
> > > --- a/fs/btrfs/inode.c
> > > +++ b/fs/btrfs/inode.c
> > > @@ -1827,8 +1827,10 @@ int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
> > >  	map_length = length;
> > >  	ret = btrfs_map_block(root->fs_info, rw, logical,
> > >  			      &map_length, NULL, 0);
> > > -	/* Will always return 0 with map_multi == NULL */
> > > -	BUG_ON(ret < 0);
> > > +	if (ret < 0) {
> > > +		WARN_ONCE(ret < 0, KERN_ERR "ret = %d\n", ret);
> > 
> > btrfs_map_block is quite verbose about the errors so it's not needed to
> > print it here.
> 
> OK.
> 
> > 
> > Otherwise I'm not sure if all paths that go through the merge hook
> > handle errors, eg. in btrfs_submit_compressed_read or _write. Some code
> > is skipped if merge hook returns nonzero. But, the code expects either 0
> > or 1, and when the BUG_ON(ret < 0) is removed suddenly the 'ret < 0' can
> > be returned. Unexpected.
> 
> Right now btrfs_merge_bio_hook() only returns 1 or 0 and marks (ret < 0) as
>  BUG_ON.

IOW, "ret < 0" never leaves btrfs_merge_bio_hook in current code.

> But compress code is ready to handle the error,
> 
> btrfs_submit_compressed_read/write() {
> 	...
> 	ret = merge_bio_hook()
> 	if (ret || bio_add_page()) {

This code expects 0 or 1, < 0 is not expected here and has to be
considered after the change in merge_bio_hook.

> 		ret = btrfs_map_bio();
> 		BUG_ON(ret);
> 		...
> 	}
> 	...
> }
> 
> So ret < 0 is handled, if there's any errors from merge_bio_hook(), it submits
>  the current bio, so the way is sane to me.

Well, that's what I don't call 'handled'. 'ret < 0' will get to the 'if'
but the question is whether the code really expects to see < 0 there and
if it does the right thing then.

> The other consumer of merge_bio is submit_extent_page(), where it's OK
> to return errors and callers are ready to handle them.

Same argument.

'if (... || merge_bio() || ...)'

returns 1 "we can merge, continue with submit_bio" 
returns <0 "there was an error, we cannot continue" ... yet will still
try to do submit_bio.

> > It's IMO better to push up the BUG_ON error handling only one caller at
> > a time. That way it's easier to review the callgraph and call paths.
> 
> merge_bio() is a wrapper for tree->ops->merge_bio_hook(), which is the
> same thing with btrfs_merge_bio_hook().  It makes no sense if we just
> look at BUG_ON in btrfs_merge_bio_hook but keep BUG_ON() in merge_bio().

So, looking at the code again: the BUG_ON in btrfs_merge_bio_hook can be
replaced by return ret (no warning needed IMO).

If merge_bio gets rid of the BUG_ON, the calles must explicitly handle
'ret < 0' unless it's provably not a problem.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo May 17, 2016, 5:30 p.m. UTC | #4
On Tue, May 17, 2016 at 11:55:19AM +0200, David Sterba wrote:
> On Mon, May 16, 2016 at 10:24:01AM -0700, Liu Bo wrote:
> > On Mon, May 16, 2016 at 10:44:38AM +0200, David Sterba wrote:
> > > On Fri, May 13, 2016 at 05:07:00PM -0700, Liu Bo wrote:
> > > > We have two BUG_ON in merge_bio, but since it can gracefully return errors
> > > > to callers, use WARN_ONCE to give error information and don't leave a
> > > > possible panic.
> > > > 
> > > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > > > ---
> > > >  fs/btrfs/extent_io.c | 1 -
> > > >  fs/btrfs/inode.c     | 6 ++++--
> > > >  2 files changed, 4 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > > > index e601e0f..99286d1 100644
> > > > --- a/fs/btrfs/extent_io.c
> > > > +++ b/fs/btrfs/extent_io.c
> > > > @@ -2746,7 +2746,6 @@ static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
> > > >  	if (tree->ops && tree->ops->merge_bio_hook)
> > > >  		ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
> > > >  						bio_flags);
> > > > -	BUG_ON(ret < 0);
> > > >  	return ret;
> > > >  
> > > >  }
> > > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > > index 5874562..3a989e3 100644
> > > > --- a/fs/btrfs/inode.c
> > > > +++ b/fs/btrfs/inode.c
> > > > @@ -1827,8 +1827,10 @@ int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
> > > >  	map_length = length;
> > > >  	ret = btrfs_map_block(root->fs_info, rw, logical,
> > > >  			      &map_length, NULL, 0);
> > > > -	/* Will always return 0 with map_multi == NULL */
> > > > -	BUG_ON(ret < 0);
> > > > +	if (ret < 0) {
> > > > +		WARN_ONCE(ret < 0, KERN_ERR "ret = %d\n", ret);
> > > 
> > > btrfs_map_block is quite verbose about the errors so it's not needed to
> > > print it here.
> > 
> > OK.
> > 
> > > 
> > > Otherwise I'm not sure if all paths that go through the merge hook
> > > handle errors, eg. in btrfs_submit_compressed_read or _write. Some code
> > > is skipped if merge hook returns nonzero. But, the code expects either 0
> > > or 1, and when the BUG_ON(ret < 0) is removed suddenly the 'ret < 0' can
> > > be returned. Unexpected.
> > 
> > Right now btrfs_merge_bio_hook() only returns 1 or 0 and marks (ret < 0) as
> >  BUG_ON.
> 
> IOW, "ret < 0" never leaves btrfs_merge_bio_hook in current code.

OK.

> 
> > But compress code is ready to handle the error,
> > 
> > btrfs_submit_compressed_read/write() {
> > 	...
> > 	ret = merge_bio_hook()
> > 	if (ret || bio_add_page()) {
> 
> This code expects 0 or 1, < 0 is not expected here and has to be
> considered after the change in merge_bio_hook.

OK.

> 
> > 		ret = btrfs_map_bio();
> > 		BUG_ON(ret);
> > 		...
> > 	}
> > 	...
> > }
> > 
> > So ret < 0 is handled, if there's any errors from merge_bio_hook(), it submits
> >  the current bio, so the way is sane to me.
> 
> Well, that's what I don't call 'handled'. 'ret < 0' will get to the 'if'
> but the question is whether the code really expects to see < 0 there and
> if it does the right thing then.

OK.

> 
> > The other consumer of merge_bio is submit_extent_page(), where it's OK
> > to return errors and callers are ready to handle them.
> 
> Same argument.
> 
> 'if (... || merge_bio() || ...)'
> 
> returns 1 "we can merge, continue with submit_bio" 

return 1 means "we _cannot_ merge" ;-)

> returns <0 "there was an error, we cannot continue" ... yet will still
> try to do submit_bio.
> 
> > > It's IMO better to push up the BUG_ON error handling only one caller at
> > > a time. That way it's easier to review the callgraph and call paths.
> > 
> > merge_bio() is a wrapper for tree->ops->merge_bio_hook(), which is the
> > same thing with btrfs_merge_bio_hook().  It makes no sense if we just
> > look at BUG_ON in btrfs_merge_bio_hook but keep BUG_ON() in merge_bio().
> 
> So, looking at the code again: the BUG_ON in btrfs_merge_bio_hook can be
> replaced by return ret (no warning needed IMO).

OK.

> 
> If merge_bio gets rid of the BUG_ON, the calles must explicitly handle
> 'ret < 0' unless it's provably not a problem.

If merge_bio() returns < 0, then it must be __btrfs_map_block() returns < 0,
so even if we continue with submiting this bio, it'd fail at
__btrfs_map_block() again because merge_bio and submit_bio are using the
same bio.  Because of that we don't bother to submit it, instead we can
just return in case of (ret < 0), this applies to both
submit_extent_page() and btrfs_submit_compressed_read/write().

What do you think?

Thanks,

-liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba May 18, 2016, 1:54 p.m. UTC | #5
On Tue, May 17, 2016 at 10:30:47AM -0700, Liu Bo wrote:
> > If merge_bio gets rid of the BUG_ON, the calles must explicitly handle
> > 'ret < 0' unless it's provably not a problem.
> 
> If merge_bio() returns < 0, then it must be __btrfs_map_block() returns < 0,
> so even if we continue with submiting this bio, it'd fail at
> __btrfs_map_block() again because merge_bio and submit_bio are using the
> same bio.  Because of that we don't bother to submit it, instead we can
> just return in case of (ret < 0), this applies to both
> submit_extent_page() and btrfs_submit_compressed_read/write().
> 
> What do you think?

Makes sense to me. All the partial processing (allocations and bios)
need to be cleaned up, which does not seem trivial from the first look.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index e601e0f..99286d1 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2746,7 +2746,6 @@  static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
 	if (tree->ops && tree->ops->merge_bio_hook)
 		ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
 						bio_flags);
-	BUG_ON(ret < 0);
 	return ret;
 
 }
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5874562..3a989e3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1827,8 +1827,10 @@  int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
 	map_length = length;
 	ret = btrfs_map_block(root->fs_info, rw, logical,
 			      &map_length, NULL, 0);
-	/* Will always return 0 with map_multi == NULL */
-	BUG_ON(ret < 0);
+	if (ret < 0) {
+		WARN_ONCE(ret < 0, KERN_ERR "ret = %d\n", ret);
+		return ret;
+	}
 	if (map_length < length + size)
 		return 1;
 	return 0;