diff mbox series

[05/20] btrfs: handle invalid root reference found in btrfs_find_root()

Message ID 0011782bc0af988fc393ae8cee8b2d761def05d4.1706130791.git.dsterba@suse.com (mailing list archive)
State New, archived
Headers show
Series Error handling fixes | expand

Commit Message

David Sterba Jan. 24, 2024, 9:18 p.m. UTC
The btrfs_find_root() looks up a root by a key, allowing to do an
inexact search when key->offset is -1.  It's never expected to find such
item, as it would break allowed the range of a root id.

Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/root-tree.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Qu Wenruo Jan. 25, 2024, 4:03 a.m. UTC | #1
On 2024/1/25 07:48, David Sterba wrote:
> The btrfs_find_root() looks up a root by a key, allowing to do an
> inexact search when key->offset is -1.  It's never expected to find such
> item, as it would break allowed the range of a root id.
>
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>   fs/btrfs/root-tree.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index ba7e2181ff4e..326cd0d03287 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
>   		if (ret > 0)
>   			goto out;
>   	} else {
> -		BUG_ON(ret == 0);		/* Logical error */
> +		/*
> +		 * Key with offset -1 found, there would have to exist a root
> +		 * with such id, but this is out of the valid range.
> +		 */
> +		if (ret == 0) {
> +			ret = -EUCLEAN;
> +			goto out;
> +		}

Considering all root items would already be checked by tree-checker,
I'd prefer to add an extra "key.offset == (u64)-1" check, and convert
this to ASSERT(ret == 0), so that we won't need to bother the impossible
case (nor its error messages).

Thanks,
Qu

>   		if (path->slots[0] == 0)
>   			goto out;
>   		path->slots[0]--;
David Sterba Jan. 25, 2024, 4:28 p.m. UTC | #2
On Thu, Jan 25, 2024 at 02:33:53PM +1030, Qu Wenruo wrote:
> 
> 
> On 2024/1/25 07:48, David Sterba wrote:
> > The btrfs_find_root() looks up a root by a key, allowing to do an
> > inexact search when key->offset is -1.  It's never expected to find such
> > item, as it would break allowed the range of a root id.
> >
> > Signed-off-by: David Sterba <dsterba@suse.com>
> > ---
> >   fs/btrfs/root-tree.c | 9 ++++++++-
> >   1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> > index ba7e2181ff4e..326cd0d03287 100644
> > --- a/fs/btrfs/root-tree.c
> > +++ b/fs/btrfs/root-tree.c
> > @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
> >   		if (ret > 0)
> >   			goto out;
> >   	} else {
> > -		BUG_ON(ret == 0);		/* Logical error */
> > +		/*
> > +		 * Key with offset -1 found, there would have to exist a root
> > +		 * with such id, but this is out of the valid range.
> > +		 */
> > +		if (ret == 0) {
> > +			ret = -EUCLEAN;
> > +			goto out;
> > +		}
> 
> Considering all root items would already be checked by tree-checker,
> I'd prefer to add an extra "key.offset == (u64)-1" check, and convert
> this to ASSERT(ret == 0), so that we won't need to bother the impossible
> case (nor its error messages).

I did not think about tree-checker in this context and it actually does
verify offsets of the item keys so it's already prevented, assuming such
corrupted issue would come from the outside.

Assertions are not very popular in release builds and distros turn them
off. A real error handling prevents propagating an error further to the
code, so this is for catching bugs that could happen after tree-checker
lets it pass with valid data but something sets wrong value to offset.

The reasoning I'm using is to have full coverage of the error values as
real handling with worst case throwing an EUCLEAN. Assertions are not
popular in release builds and distros turn them off so it's effectively
removing error handling and allowing silent errors.
Qu Wenruo Jan. 25, 2024, 11:14 p.m. UTC | #3
On 2024/1/26 02:58, David Sterba wrote:
> On Thu, Jan 25, 2024 at 02:33:53PM +1030, Qu Wenruo wrote:
>>
>>
>> On 2024/1/25 07:48, David Sterba wrote:
>>> The btrfs_find_root() looks up a root by a key, allowing to do an
>>> inexact search when key->offset is -1.  It's never expected to find such
>>> item, as it would break allowed the range of a root id.
>>>
>>> Signed-off-by: David Sterba <dsterba@suse.com>
>>> ---
>>>    fs/btrfs/root-tree.c | 9 ++++++++-
>>>    1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
>>> index ba7e2181ff4e..326cd0d03287 100644
>>> --- a/fs/btrfs/root-tree.c
>>> +++ b/fs/btrfs/root-tree.c
>>> @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
>>>    		if (ret > 0)
>>>    			goto out;
>>>    	} else {
>>> -		BUG_ON(ret == 0);		/* Logical error */
>>> +		/*
>>> +		 * Key with offset -1 found, there would have to exist a root
>>> +		 * with such id, but this is out of the valid range.
>>> +		 */
>>> +		if (ret == 0) {
>>> +			ret = -EUCLEAN;
>>> +			goto out;
>>> +		}
>>
>> Considering all root items would already be checked by tree-checker,
>> I'd prefer to add an extra "key.offset == (u64)-1" check, and convert
>> this to ASSERT(ret == 0), so that we won't need to bother the impossible
>> case (nor its error messages).
>
> I did not think about tree-checker in this context and it actually does
> verify offsets of the item keys so it's already prevented, assuming such
> corrupted issue would come from the outside.

If the root item is fine, but it's some runtime memory bitflip, then I'd
say if hitting an ASSERT() is really the last time we need to consider.

Furthermore, we have further safenet at metadata write time, which would
definitely lead to a transaction abort.

>
> Assertions are not very popular in release builds and distros turn them
> off. A real error handling prevents propagating an error further to the
> code, so this is for catching bugs that could happen after tree-checker
> lets it pass with valid data but something sets wrong value to offset.
>
> The reasoning I'm using is to have full coverage of the error values as
> real handling with worst case throwing an EUCLEAN. Assertions are not
> popular in release builds and distros turn them off so it's effectively
> removing error handling and allowing silent errors.
>
Yep, I know ASSERT() is not popular in release builds.

But in this case, if tree-checker didn't catch such corruption, but some
runtime memory biflip (well, no single bitflip can result to -1) or
memory corruption created such -1 key offset, and ASSERT() is ignoring it.

That would still be fine, as our final write time tree-checker would
catch and abort current transaction.

So yes, I want to use ASSERT() intentionally here, because we're still
fine even it's not properly caught.

And again back to the old discussion on EUCLEAN, I really want it to be
noisy enough immediately, not waiting for the caller layers up the call
chain to do their error handling.

Thanks,
Qu
Anand Jain Jan. 26, 2024, 12:06 p.m. UTC | #4
On 1/25/24 05:18, David Sterba wrote:
> The btrfs_find_root() looks up a root by a key, allowing to do an
> inexact search when key->offset is -1.  It's never expected to find such
> item, as it would break allowed the range of a root id.
> 
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>   fs/btrfs/root-tree.c | 9 ++++++++-
>   1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> index ba7e2181ff4e..326cd0d03287 100644
> --- a/fs/btrfs/root-tree.c
> +++ b/fs/btrfs/root-tree.c
> @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
>   		if (ret > 0)
>   			goto out;
>   	} else {
> -		BUG_ON(ret == 0);		/* Logical error */
> +		/*
> +		 * Key with offset -1 found, there would have to exist a root
> +		 * with such id, but this is out of the valid range.
> +		 */
> +		if (ret == 0) {
> +			ret = -EUCLEAN;
> +			goto out;
> +		}
>   		if (path->slots[0] == 0)
>   			goto out;
>   		path->slots[0]--;


While here, why not also add an error message, especially for calls
from btrfs_read_roots() when the IGNOREBADROOTS is set, we ignore
the error and continue without the abort(). Including an error
message will provide more information about the bad root.

btrfs_read_roots()
::
  btrfs_read_tree_root() | btrfs_get_fs_root_commit_root() | 
load_global_roots_objectid()
  read_tree_root_path()
   btrfs_find_root()


Thanks, Anand
Josef Bacik Jan. 26, 2024, 2:19 p.m. UTC | #5
On Fri, Jan 26, 2024 at 09:44:52AM +1030, Qu Wenruo wrote:
> 
> 
> On 2024/1/26 02:58, David Sterba wrote:
> > On Thu, Jan 25, 2024 at 02:33:53PM +1030, Qu Wenruo wrote:
> > > 
> > > 
> > > On 2024/1/25 07:48, David Sterba wrote:
> > > > The btrfs_find_root() looks up a root by a key, allowing to do an
> > > > inexact search when key->offset is -1.  It's never expected to find such
> > > > item, as it would break allowed the range of a root id.
> > > > 
> > > > Signed-off-by: David Sterba <dsterba@suse.com>
> > > > ---
> > > >    fs/btrfs/root-tree.c | 9 ++++++++-
> > > >    1 file changed, 8 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> > > > index ba7e2181ff4e..326cd0d03287 100644
> > > > --- a/fs/btrfs/root-tree.c
> > > > +++ b/fs/btrfs/root-tree.c
> > > > @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
> > > >    		if (ret > 0)
> > > >    			goto out;
> > > >    	} else {
> > > > -		BUG_ON(ret == 0);		/* Logical error */
> > > > +		/*
> > > > +		 * Key with offset -1 found, there would have to exist a root
> > > > +		 * with such id, but this is out of the valid range.
> > > > +		 */
> > > > +		if (ret == 0) {
> > > > +			ret = -EUCLEAN;
> > > > +			goto out;
> > > > +		}
> > > 
> > > Considering all root items would already be checked by tree-checker,
> > > I'd prefer to add an extra "key.offset == (u64)-1" check, and convert
> > > this to ASSERT(ret == 0), so that we won't need to bother the impossible
> > > case (nor its error messages).
> > 
> > I did not think about tree-checker in this context and it actually does
> > verify offsets of the item keys so it's already prevented, assuming such
> > corrupted issue would come from the outside.
> 
> If the root item is fine, but it's some runtime memory bitflip, then I'd
> say if hitting an ASSERT() is really the last time we need to consider.
> 
> Furthermore, we have further safenet at metadata write time, which would
> definitely lead to a transaction abort.
> 
> > 
> > Assertions are not very popular in release builds and distros turn them
> > off. A real error handling prevents propagating an error further to the
> > code, so this is for catching bugs that could happen after tree-checker
> > lets it pass with valid data but something sets wrong value to offset.
> > 
> > The reasoning I'm using is to have full coverage of the error values as
> > real handling with worst case throwing an EUCLEAN. Assertions are not
> > popular in release builds and distros turn them off so it's effectively
> > removing error handling and allowing silent errors.
> > 
> Yep, I know ASSERT() is not popular in release builds.
> 
> But in this case, if tree-checker didn't catch such corruption, but some
> runtime memory biflip (well, no single bitflip can result to -1) or
> memory corruption created such -1 key offset, and ASSERT() is ignoring it.
> 
> That would still be fine, as our final write time tree-checker would
> catch and abort current transaction.
> 
> So yes, I want to use ASSERT() intentionally here, because we're still
> fine even it's not properly caught.
> 
> And again back to the old discussion on EUCLEAN, I really want it to be
> noisy enough immediately, not waiting for the caller layers up the call
> chain to do their error handling.

We can have both.

This patch series is removing BUG_ON()'s, and this is the correct change for
that.

If we need followups to harden the tree checker that's valuable work too, but
that's a followup and doesn't mean this series needs changing.

With CONFIG_BTRFS_DEBUG on we're going to get a WARN_ON when this thing trips
and we'll see it in fstests.  Thanks,

Josef
David Sterba Jan. 29, 2024, 3:55 p.m. UTC | #6
On Fri, Jan 26, 2024 at 09:19:27AM -0500, Josef Bacik wrote:
> On Fri, Jan 26, 2024 at 09:44:52AM +1030, Qu Wenruo wrote:
> > 
> > 
> > On 2024/1/26 02:58, David Sterba wrote:
> > > On Thu, Jan 25, 2024 at 02:33:53PM +1030, Qu Wenruo wrote:
> > > > 
> > > > 
> > > > On 2024/1/25 07:48, David Sterba wrote:
> > > > > The btrfs_find_root() looks up a root by a key, allowing to do an
> > > > > inexact search when key->offset is -1.  It's never expected to find such
> > > > > item, as it would break allowed the range of a root id.
> > > > > 
> > > > > Signed-off-by: David Sterba <dsterba@suse.com>
> > > > > ---
> > > > >    fs/btrfs/root-tree.c | 9 ++++++++-
> > > > >    1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> > > > > index ba7e2181ff4e..326cd0d03287 100644
> > > > > --- a/fs/btrfs/root-tree.c
> > > > > +++ b/fs/btrfs/root-tree.c
> > > > > @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
> > > > >    		if (ret > 0)
> > > > >    			goto out;
> > > > >    	} else {
> > > > > -		BUG_ON(ret == 0);		/* Logical error */
> > > > > +		/*
> > > > > +		 * Key with offset -1 found, there would have to exist a root
> > > > > +		 * with such id, but this is out of the valid range.
> > > > > +		 */
> > > > > +		if (ret == 0) {
> > > > > +			ret = -EUCLEAN;
> > > > > +			goto out;
> > > > > +		}
> > > > 
> > > > Considering all root items would already be checked by tree-checker,
> > > > I'd prefer to add an extra "key.offset == (u64)-1" check, and convert
> > > > this to ASSERT(ret == 0), so that we won't need to bother the impossible
> > > > case (nor its error messages).
> > > 
> > > I did not think about tree-checker in this context and it actually does
> > > verify offsets of the item keys so it's already prevented, assuming such
> > > corrupted issue would come from the outside.
> > 
> > If the root item is fine, but it's some runtime memory bitflip, then I'd
> > say if hitting an ASSERT() is really the last time we need to consider.
> > 
> > Furthermore, we have further safenet at metadata write time, which would
> > definitely lead to a transaction abort.
> > 
> > > 
> > > Assertions are not very popular in release builds and distros turn them
> > > off. A real error handling prevents propagating an error further to the
> > > code, so this is for catching bugs that could happen after tree-checker
> > > lets it pass with valid data but something sets wrong value to offset.
> > > 
> > > The reasoning I'm using is to have full coverage of the error values as
> > > real handling with worst case throwing an EUCLEAN. Assertions are not
> > > popular in release builds and distros turn them off so it's effectively
> > > removing error handling and allowing silent errors.
> > > 
> > Yep, I know ASSERT() is not popular in release builds.
> > 
> > But in this case, if tree-checker didn't catch such corruption, but some
> > runtime memory biflip (well, no single bitflip can result to -1) or
> > memory corruption created such -1 key offset, and ASSERT() is ignoring it.
> > 
> > That would still be fine, as our final write time tree-checker would
> > catch and abort current transaction.
> > 
> > So yes, I want to use ASSERT() intentionally here, because we're still
> > fine even it's not properly caught.
> > 
> > And again back to the old discussion on EUCLEAN, I really want it to be
> > noisy enough immediately, not waiting for the caller layers up the call
> > chain to do their error handling.
> 
> We can have both.
> 
> This patch series is removing BUG_ON()'s, and this is the correct change for
> that.
> 
> If we need followups to harden the tree checker that's valuable work too, but
> that's a followup and doesn't mean this series needs changing.
> 
> With CONFIG_BTRFS_DEBUG on we're going to get a WARN_ON when this thing trips
> and we'll see it in fstests.  Thanks,

I'd propose something different than an ASSERT for handling the EUCLEAN
cases. To make it print a WARN_ON under debug and with a message (all
builds).

The first step is to handle all the errors so I'd like not to mix it
with series. There are about 255 EUCLEAN cases (and possibly more
missing), that's a lot so we need to have a common way to handle them
instead of randomly adding ASSERT(0), duplicating an error condition
in the assert or doing thinsg like WARN(IS_ENABLED(CONFIG_BTRFS_DEBUG).
David Sterba Jan. 29, 2024, 3:56 p.m. UTC | #7
On Fri, Jan 26, 2024 at 08:06:06PM +0800, Anand Jain wrote:
> On 1/25/24 05:18, David Sterba wrote:
> > The btrfs_find_root() looks up a root by a key, allowing to do an
> > inexact search when key->offset is -1.  It's never expected to find such
> > item, as it would break allowed the range of a root id.
> > 
> > Signed-off-by: David Sterba <dsterba@suse.com>
> > ---
> >   fs/btrfs/root-tree.c | 9 ++++++++-
> >   1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
> > index ba7e2181ff4e..326cd0d03287 100644
> > --- a/fs/btrfs/root-tree.c
> > +++ b/fs/btrfs/root-tree.c
> > @@ -82,7 +82,14 @@ int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
> >   		if (ret > 0)
> >   			goto out;
> >   	} else {
> > -		BUG_ON(ret == 0);		/* Logical error */
> > +		/*
> > +		 * Key with offset -1 found, there would have to exist a root
> > +		 * with such id, but this is out of the valid range.
> > +		 */
> > +		if (ret == 0) {
> > +			ret = -EUCLEAN;
> > +			goto out;
> > +		}
> >   		if (path->slots[0] == 0)
> >   			goto out;
> >   		path->slots[0]--;
> 
> 
> While here, why not also add an error message, especially for calls
> from btrfs_read_roots() when the IGNOREBADROOTS is set, we ignore
> the error and continue without the abort(). Including an error
> message will provide more information about the bad root.

Yeah, after some thoughs i agree that the messages would be good, self
documenting instead of the comments. I'll do that in another series as
we first need some common way of handling it and wrappers.
diff mbox series

Patch

diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index ba7e2181ff4e..326cd0d03287 100644
--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -82,7 +82,14 @@  int btrfs_find_root(struct btrfs_root *root, const struct btrfs_key *search_key,
 		if (ret > 0)
 			goto out;
 	} else {
-		BUG_ON(ret == 0);		/* Logical error */
+		/*
+		 * Key with offset -1 found, there would have to exist a root
+		 * with such id, but this is out of the valid range.
+		 */
+		if (ret == 0) {
+			ret = -EUCLEAN;
+			goto out;
+		}
 		if (path->slots[0] == 0)
 			goto out;
 		path->slots[0]--;