Message ID | aaaf2cadf66d9e573e2dbcc3e8fab7984ce42f99.1629322156.git.josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs-progs: make check handle invalid bg items | expand |
On 2021/8/19 上午5:33, Josef Bacik wrote: > By enabling the lowmem checks properly I uncovered the case where test > 007 will infinite loop at the detection stage. This is because when > checking the inode item we will just btrfs_next_item(), and because we > ignore check tree block failures at read time we don't get an -EIO from > btrfs_next_leaf. Generally what check usually does is validate the > leaves/nodes as we hit them, but in this case we're not doing that. Fix > this by checking the leaf if we move to the next one and if it fails > bail. This allows us to pass the 007 test with lowmem. Doesn't this mean btrfs_next_item() is not doing what it should do? Normally we would expect btrfs_next_item() to return -EIO other than manually checking the returned leaf. Thanks, Qu > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > check/mode-lowmem.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c > index 323e66bc..7fc7d467 100644 > --- a/check/mode-lowmem.c > +++ b/check/mode-lowmem.c > @@ -2675,6 +2675,15 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path) > while (1) { > btrfs_item_key_to_cpu(path->nodes[0], &last_key, path->slots[0]); > ret = btrfs_next_item(root, path); > + > + /* > + * New leaf, we need to check it and see if it's valid, if not > + * we need to bail otherwise we could end up stuck. > + */ > + if (path->slots[0] == 0 && > + btrfs_check_leaf(gfs_info, NULL, path->nodes[0])) > + ret = -EIO; > + > if (ret < 0) { > /* out will fill 'err' rusing current statistics */ > goto out; >
On Thu, Aug 19, 2021 at 01:42:39PM +0800, Qu Wenruo wrote: > > > On 2021/8/19 上午5:33, Josef Bacik wrote: > > By enabling the lowmem checks properly I uncovered the case where test > > 007 will infinite loop at the detection stage. This is because when > > checking the inode item we will just btrfs_next_item(), and because we > > ignore check tree block failures at read time we don't get an -EIO from > > btrfs_next_leaf. Generally what check usually does is validate the > > leaves/nodes as we hit them, but in this case we're not doing that. Fix > > this by checking the leaf if we move to the next one and if it fails > > bail. This allows us to pass the 007 test with lowmem. > > Doesn't this mean btrfs_next_item() is not doing what it should do? > > Normally we would expect btrfs_next_item() to return -EIO other than > manually checking the returned leaf. That's an interesting point, I think we rely on that behaviour elsewhere too.
On 8/23/21 11:04 AM, David Sterba wrote: > On Thu, Aug 19, 2021 at 01:42:39PM +0800, Qu Wenruo wrote: >> >> >> On 2021/8/19 上午5:33, Josef Bacik wrote: >>> By enabling the lowmem checks properly I uncovered the case where test >>> 007 will infinite loop at the detection stage. This is because when >>> checking the inode item we will just btrfs_next_item(), and because we >>> ignore check tree block failures at read time we don't get an -EIO from >>> btrfs_next_leaf. Generally what check usually does is validate the >>> leaves/nodes as we hit them, but in this case we're not doing that. Fix >>> this by checking the leaf if we move to the next one and if it fails >>> bail. This allows us to pass the 007 test with lowmem. >> >> Doesn't this mean btrfs_next_item() is not doing what it should do? >> >> Normally we would expect btrfs_next_item() to return -EIO other than >> manually checking the returned leaf. > > That's an interesting point, I think we rely on that behaviour > elsewhere too. > This is the result of how we deal with corrupt blocks. We will happily read corrupt blocks with check, because we expect check to do it's own btrfs_check_node/btrfs_check_leaf(). The problem here is that if the block is corrupt it's still in cache, and so btrfs_next_leaf() will return it because the buffer is marked uptodate. It looks like search does the extra check_block() specifically to catch this case, so I'll fix next_leaf to do the same thing as well. Thanks, Josef
On 2021/8/24 上午2:44, Josef Bacik wrote: > On 8/23/21 11:04 AM, David Sterba wrote: >> On Thu, Aug 19, 2021 at 01:42:39PM +0800, Qu Wenruo wrote: >>> >>> >>> On 2021/8/19 上午5:33, Josef Bacik wrote: >>>> By enabling the lowmem checks properly I uncovered the case where test >>>> 007 will infinite loop at the detection stage. This is because when >>>> checking the inode item we will just btrfs_next_item(), and because we >>>> ignore check tree block failures at read time we don't get an -EIO from >>>> btrfs_next_leaf. Generally what check usually does is validate the >>>> leaves/nodes as we hit them, but in this case we're not doing that. >>>> Fix >>>> this by checking the leaf if we move to the next one and if it fails >>>> bail. This allows us to pass the 007 test with lowmem. >>> >>> Doesn't this mean btrfs_next_item() is not doing what it should do? >>> >>> Normally we would expect btrfs_next_item() to return -EIO other than >>> manually checking the returned leaf. >> >> That's an interesting point, I think we rely on that behaviour >> elsewhere too. >> > > This is the result of how we deal with corrupt blocks. We will happily > read corrupt blocks with check, because we expect check to do it's own > btrfs_check_node/btrfs_check_leaf(). The problem here is that if the > block is corrupt it's still in cache, and so btrfs_next_leaf() will > return it because the buffer is marked uptodate. Shouldn't we prevent the corrupted block from entering the cache? > > It looks like search does the extra check_block() specifically to catch > this case, so I'll fix next_leaf to do the same thing as well. Thanks, OK for now I think it's fine to have the extra check. It won't cause any harm even if we solved the cache problem in the future. Thanks, Qu > > Josef
diff --git a/check/mode-lowmem.c b/check/mode-lowmem.c index 323e66bc..7fc7d467 100644 --- a/check/mode-lowmem.c +++ b/check/mode-lowmem.c @@ -2675,6 +2675,15 @@ static int check_inode_item(struct btrfs_root *root, struct btrfs_path *path) while (1) { btrfs_item_key_to_cpu(path->nodes[0], &last_key, path->slots[0]); ret = btrfs_next_item(root, path); + + /* + * New leaf, we need to check it and see if it's valid, if not + * we need to bail otherwise we could end up stuck. + */ + if (path->slots[0] == 0 && + btrfs_check_leaf(gfs_info, NULL, path->nodes[0])) + ret = -EIO; + if (ret < 0) { /* out will fill 'err' rusing current statistics */ goto out;
By enabling the lowmem checks properly I uncovered the case where test 007 will infinite loop at the detection stage. This is because when checking the inode item we will just btrfs_next_item(), and because we ignore check tree block failures at read time we don't get an -EIO from btrfs_next_leaf. Generally what check usually does is validate the leaves/nodes as we hit them, but in this case we're not doing that. Fix this by checking the leaf if we move to the next one and if it fails bail. This allows us to pass the 007 test with lowmem. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- check/mode-lowmem.c | 9 +++++++++ 1 file changed, 9 insertions(+)