Message ID | 4beedde9b4f6adf4a7054707617f8784e5ee8b35.1689883754.git.josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: fix generic/475 hang | expand |
On Thu, Jul 20, 2023 at 04:12:16PM -0400, Josef Bacik wrote: > Instead of looping through the RAID indices before advancing the FFE > loop lets move this until after we've exhausted the entire FFE loop in > order to give us the highest chance of success in satisfying the > allocation based on its flags. Doesn't this get screwed by the find_free_extent_update_loop setting index to 0? i.e., let's say we fail on the first pass with the correct raid flag. then we go into find_free_extent_update_loop and intelligently don't do the pointless raid loops. But then we set index to 0 and start doing an even worse meta loop of doing every step (including allocating chunks) with every raid index, most of which are doomed to fail by definition. Not setting it to 0, OTOH, breaks the logic for setting "full_search", but I do think that could be fixed one way or another. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > fs/btrfs/extent-tree.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index 8db2673948c9..ca4277ec1b19 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -3953,10 +3953,6 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info, > return 0; > } > > - ffe_ctl->index++; > - if (ffe_ctl->index < BTRFS_NR_RAID_TYPES) > - return 1; > - > /* > * See the comment for btrfs_loop_type for an explanation of the phases. > */ > @@ -4026,6 +4022,11 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info, > } > return 1; > } > + > + ffe_ctl->index++; > + if (ffe_ctl->index < BTRFS_NR_RAID_TYPES) > + return 1; > + > return -ENOSPC; > } > > -- > 2.41.0 >
On Thu, Jul 20, 2023 at 03:28:17PM -0700, Boris Burkov wrote: > On Thu, Jul 20, 2023 at 04:12:16PM -0400, Josef Bacik wrote: > > Instead of looping through the RAID indices before advancing the FFE > > loop lets move this until after we've exhausted the entire FFE loop in > > order to give us the highest chance of success in satisfying the > > allocation based on its flags. > > Doesn't this get screwed by the find_free_extent_update_loop setting > index to 0? > > i.e., let's say we fail on the first pass with the correct raid flag. > then we go into find_free_extent_update_loop and intelligently don't do > the pointless raid loops. But then we set index to 0 and start doing an > even worse meta loop of doing every step (including allocating chunks) > with every raid index, most of which are doomed to fail by definition. > > Not setting it to 0, OTOH, breaks the logic for setting "full_search", > but I do think that could be fixed one way or another. > Yeah lets drop this, a bunch of tests failed, I need to drop this and re-run and see what happens. I hate this code. Thanks, Josef
Hello, kernel test robot noticed "xfstests.btrfs.288.fail" on: commit: 68c33cafcef309e75b25df860af930d8b44236f1 ("[PATCH 3/3] btrfs: cycle through the RAID profiles as a last resort") url: https://github.com/intel-lab-lkp/linux/commits/Josef-Bacik/btrfs-wait-for-block-groups-to-finish-caching-during-allocation/20230721-041418 base: https://git.kernel.org/cgit/linux/kernel/git/kdave/linux.git for-next patch link: https://lore.kernel.org/all/4beedde9b4f6adf4a7054707617f8784e5ee8b35.1689883754.git.josef@toxicpanda.com/ patch subject: [PATCH 3/3] btrfs: cycle through the RAID profiles as a last resort in testcase: xfstests version: xfstests-x86_64-bb8af9c-1_20230724 with following parameters: disk: 6HDD fs: btrfs test: btrfs-288 compiler: gcc-12 test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory (please refer to attached dmesg/kmsg for entire log/backtrace) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <oliver.sang@intel.com> | Closes: https://lore.kernel.org/oe-lkp/202307252242.e00b11dc-oliver.sang@intel.com 2023-07-24 12:52:16 export TEST_DIR=/fs/sdb1 2023-07-24 12:52:16 export TEST_DEV=/dev/sdb1 2023-07-24 12:52:16 export FSTYP=btrfs 2023-07-24 12:52:16 export SCRATCH_MNT=/fs/scratch 2023-07-24 12:52:16 mkdir /fs/scratch -p 2023-07-24 12:52:16 export SCRATCH_DEV_POOL="/dev/sdb2 /dev/sdb3 /dev/sdb4 /dev/sdb5 /dev/sdb6" 2023-07-24 12:52:16 echo btrfs/288 2023-07-24 12:52:16 ./check btrfs/288 FSTYP -- btrfs PLATFORM -- Linux/x86_64 lkp-hsw-d01 6.5.0-rc2-00111-g68c33cafcef3 #1 SMP PREEMPT_DYNAMIC Mon Jul 24 16:31:26 CST 2023 MKFS_OPTIONS -- /dev/sdb2 MOUNT_OPTIONS -- /dev/sdb2 /fs/scratch btrfs/288 - output mismatch (see /lkp/benchmarks/xfstests/results//btrfs/288.out.bad) --- tests/btrfs/288.out 2023-07-24 01:11:46.000000000 +0000 +++ /lkp/benchmarks/xfstests/results//btrfs/288.out.bad 2023-07-24 12:52:42.119031191 +0000 @@ -1,9 +1,9 @@ QA output created by 288 step 1......mkfs.btrfs -wrote 131072/131072 bytes at offset 0 -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) +pwrite: No space left on device step 2......corrupt one mirror +Failed to find any extent at [1,16385) ... (Run 'diff -u /lkp/benchmarks/xfstests/tests/btrfs/288.out /lkp/benchmarks/xfstests/results//btrfs/288.out.bad' to see the entire diff) HINT: You _MAY_ be missing kernel fix: xxxxxxxxxxxx btrfs: scrub: respect the read-only flag during repair Ran: btrfs/288 Failures: btrfs/288 Failed 1 of 1 tests To reproduce: git clone https://github.com/intel/lkp-tests.git cd lkp-tests sudo bin/lkp install job.yaml # job file is attached in this email bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run sudo bin/lkp run generated-yaml-file # if come across any failure that blocks the test, # please remove ~/.lkp and /lkp dir to run from a clean state.
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 8db2673948c9..ca4277ec1b19 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3953,10 +3953,6 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info, return 0; } - ffe_ctl->index++; - if (ffe_ctl->index < BTRFS_NR_RAID_TYPES) - return 1; - /* * See the comment for btrfs_loop_type for an explanation of the phases. */ @@ -4026,6 +4022,11 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info, } return 1; } + + ffe_ctl->index++; + if (ffe_ctl->index < BTRFS_NR_RAID_TYPES) + return 1; + return -ENOSPC; }
Instead of looping through the RAID indices before advancing the FFE loop lets move this until after we've exhausted the entire FFE loop in order to give us the highest chance of success in satisfying the allocation based on its flags. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/extent-tree.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)