Message ID: 20190124143143.8838-1-josef@toxicpanda.com (mailing list archive)
State: New, archived
Series: btrfs: don't end the transaction for delayed refs in throttle
On Thu, Jan 24, 2019 at 09:31:43AM -0500, Josef Bacik wrote:
> Previously callers to btrfs_end_transaction_throttle() would commit the
> transaction if there wasn't enough delayed refs space. This happens in
> relocation, and if the fs is relatively empty we'll run out of delayed
> refs space basically immediately, so we'll just be stuck in this loop of
> committing the transaction over and over again.
>
> This code existed because we didn't have a good feedback mechanism for
> running delayed refs, but with the delayed refs rsv we do now. Delete
> this throttling code and let the btrfs_start_transaction() in relocation
> deal with putting pressure on the delayed refs infrastructure. With
> this patch we no longer take 5 minutes to balance a metadata only fs.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

For the record, this has been merged to 5.0-rc5
On 2019/2/13 12:03 AM, David Sterba wrote:
> On Thu, Jan 24, 2019 at 09:31:43AM -0500, Josef Bacik wrote:
>> [...]
>
> For the record, this has been merged to 5.0-rc5

Bisecting leads me to this patch for strange balance ENOSPC.

It can be reproduced by btrfs/156, or the following small script:
------
#!/bin/bash
dev="/dev/test/test"
mnt="/mnt/btrfs"

_fail()
{
	echo "!!! FAILED: $@ !!!"
	exit 1
}

do_work()
{
	umount $dev &> /dev/null
	umount $mnt &> /dev/null

	mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null

	mount $dev $mnt

	for i in $(seq -w 0 511); do
		# xfs_io -f -c "falloc 0 1m" $mnt/file_$i > /dev/null
		xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
	done
	sync

	btrfs balance start --full $mnt || return 1
	sync

	btrfs balance start --full $mnt || return 1
	umount $mnt
}

failed=0
for i in $(seq -w 0 24); do
	echo "=== run $i ==="
	do_work
	if [ $? -eq 1 ]; then
		failed=$(($failed + 1))
	fi
done
if [ $failed -ne 0 ]; then
	echo "!!! failed $failed/25 !!!"
else
	echo "=== all passes ==="
fi
------

For v4.20, it fails at a rate of around 0/25 ~ 2/25 (very rare).
But at that patch (upstream commit
302167c50b32e7fccc98994a91d40ddbbab04e52), the failure rate rises to 25/25.

Any idea for that ENOSPC problem?
It looks really weird for the 2nd full balance to fail even though we have
enough unallocated space.

Thanks,
Qu
On Mon, Jun 03, 2019 at 02:53:00PM +0800, Qu Wenruo wrote:
> On 2019/2/13 12:03 AM, David Sterba wrote:
>> [...]
>>
>> For the record, this has been merged to 5.0-rc5
>
> Bisecting leads me to this patch for strange balance ENOSPC.
>
> It can be reproduced by btrfs/156, or the following small script:
> [...]
>
> For v4.20, it fails at a rate of around 0/25 ~ 2/25 (very rare).
> But at that patch (upstream commit
> 302167c50b32e7fccc98994a91d40ddbbab04e52), the failure rate rises to 25/25.
>
> Any idea for that ENOSPC problem?
> It looks really weird for the 2nd full balance to fail even though we have
> enough unallocated space.

I've been running this all morning on kdave's misc-next and not had a single
failure. I ran it a few times on spinning rust and a few times on my nvme
drive. I wouldn't doubt that it's failing for you, but I can't reproduce. It
would be helpful to know where the ENOSPC was coming from so I can think of
where the problem might be. Thanks,

Josef
On Mon, Jun 03, 2019 at 01:36:11PM -0400, Josef Bacik wrote:
> > For v4.20, it fails at a rate of around 0/25 ~ 2/25 (very rare).
> > But at that patch (upstream commit
> > 302167c50b32e7fccc98994a91d40ddbbab04e52), the failure rate rises to 25/25.
> >
> > Any idea for that ENOSPC problem?
> > It looks really weird for the 2nd full balance to fail even though we
> > have enough unallocated space.
>
> I've been running this all morning on kdave's misc-next and not had a single
> failure. I ran it a few times on spinning rust and a few times on my nvme
> drive. I wouldn't doubt that it's failing for you, but I can't reproduce. It
> would be helpful to know where the ENOSPC was coming from so I can think of
> where the problem might be. Thanks,

That's interesting, because the test btrfs/156 hasn't successfully finished a
single run on my VM testing setup since qemu 4.0 implemented discard on virtio
devices several weeks ago. It's 4 CPU / 2 GiB, file-backed virtio-scsi
devices; the files are on a spinning disk.
On 2019/6/4 1:36 AM, Josef Bacik wrote:
> On Mon, Jun 03, 2019 at 02:53:00PM +0800, Qu Wenruo wrote:
>> [...]
>
> I've been running this all morning on kdave's misc-next and not had a single
> failure. I ran it a few times on spinning rust and a few times on my nvme
> drive. I wouldn't doubt that it's failing for you, but I can't reproduce. It
> would be helpful to know where the ENOSPC was coming from so I can think of
> where the problem might be. Thanks,
>
> Josef

Since v5.2-rc2 has a lot of enospc debug output merged, here is the debug
info with just enospc_debug:

BTRFS: device fsid defe70f2-d083-41f0-a4fd-28a0cc03dce7 devid 1 transid 5 /dev/test/test
BTRFS info (device dm-3): disk space caching is enabled
BTRFS info (device dm-3): has skinny extents
BTRFS info (device dm-3): flagging fs with big metadata feature
BTRFS info (device dm-3): checking UUID tree
BTRFS info (device dm-3): balance: start -d -m -s
BTRFS info (device dm-3): relocating block group 726663168 flags metadata
BTRFS info (device dm-3): relocating block group 609222656 flags data
BTRFS info (device dm-3): found 57 extents
BTRFS info (device dm-3): found 57 extents
BTRFS info (device dm-3): relocating block group 491782144 flags data
BTRFS info (device dm-3): found 112 extents
BTRFS info (device dm-3): found 112 extents
BTRFS info (device dm-3): relocating block group 374341632 flags data
BTRFS info (device dm-3): found 115 extents
BTRFS info (device dm-3): found 114 extents
BTRFS info (device dm-3): relocating block group 256901120 flags data
BTRFS info (device dm-3): found 112 extents
BTRFS info (device dm-3): found 112 extents
BTRFS info (device dm-3): relocating block group 139460608 flags data
BTRFS info (device dm-3): found 112 extents
BTRFS info (device dm-3): found 112 extents
BTRFS info (device dm-3): unable to make block group 22020096 ro
BTRFS info (device dm-3): sinfo_used=42909696 bg_num_bytes=116293632 min_allocable=1048576
BTRFS info (device dm-3): space_info 4 has 82919424 free, is not full
BTRFS info (device dm-3): space_info total=125829120, used=1015808, pinned=0, reserved=81920, may_use=41746432, readonly=65536
BTRFS info (device dm-3): global_block_rsv: size 16777216 reserved 16744448
BTRFS info (device dm-3): trans_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): chunk_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_refs_rsv: size 61865984 reserved 25001984
BTRFS info (device dm-3): relocating block group 22020096 flags metadata
BTRFS info (device dm-3): found 54 extents
BTRFS info (device dm-3): relocating block group 13631488 flags data
BTRFS info (device dm-3): found 8 extents
BTRFS info (device dm-3): found 8 extents
BTRFS info (device dm-3): relocating block group 5242880 flags metadata
BTRFS info (device dm-3): found 56 extents
BTRFS info (device dm-3): unable to make block group 1048576 ro
BTRFS info (device dm-3): sinfo_used=32768 bg_num_bytes=4161536 min_allocable=1048576
BTRFS info (device dm-3): space_info 2 has 4161536 free, is not full
BTRFS info (device dm-3): space_info total=4194304, used=16384, pinned=0, reserved=16384, may_use=0, readonly=0
BTRFS info (device dm-3): global_block_rsv: size 16777216 reserved 16744448
BTRFS info (device dm-3): trans_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): chunk_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_refs_rsv: size 3145728 reserved 1540096
BTRFS info (device dm-3): relocating block group 1048576 flags system
BTRFS info (device dm-3): found 1 extents
BTRFS info (device dm-3): balance: ended with status: 0
BTRFS info (device dm-3): balance: start -d -m -s
BTRFS info (device dm-3): unable to make block group 1431306240 ro
BTRFS info (device dm-3): sinfo_used=16384 bg_num_bytes=33538048 min_allocable=1048576
BTRFS info (device dm-3): space_info 2 has 33538048 free, is not full
BTRFS info (device dm-3): space_info total=33554432, used=16384, pinned=0, reserved=0, may_use=0, readonly=0
BTRFS info (device dm-3): global_block_rsv: size 16777216 reserved 16777216
BTRFS info (device dm-3): trans_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): chunk_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_refs_rsv: size 0 reserved 0
BTRFS info (device dm-3): relocating block group 1431306240 flags system
BTRFS info (device dm-3): unable to make block group 1313865728 ro
BTRFS info (device dm-3): sinfo_used=19382272 bg_num_bytes=116342784 min_allocable=1048576
BTRFS info (device dm-3): space_info 4 has 98058240 free, is not full
BTRFS info (device dm-3): space_info total=117440512, used=1015808, pinned=0, reserved=81920, may_use=18284544, readonly=0
BTRFS info (device dm-3): global_block_rsv: size 16777216 reserved 16744448
BTRFS info (device dm-3): trans_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): chunk_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_block_rsv: size 0 reserved 0
BTRFS info (device dm-3): delayed_refs_rsv: size 3145728 reserved 1540096
BTRFS info (device dm-3): relocating block group 1313865728 flags metadata
BTRFS info (device dm-3): found 55 extents
BTRFS info (device dm-3): relocating block group 1196425216 flags data
BTRFS info (device dm-3): found 65 extents
BTRFS info (device dm-3): found 65 extents
BTRFS warning (device dm-3): no space to allocate a new chunk for block group 1078984704
BTRFS warning (device dm-3): no space to allocate a new chunk for block group 961544192
BTRFS warning (device dm-3): no space to allocate a new chunk for block group 844103680
BTRFS warning (device dm-3): no space to allocate a new chunk for block group 726663168
BTRFS info (device dm-3): 4 enospc errors during balance
BTRFS info (device dm-3): balance: ended with status: -28

The ENOSPC is still from inc_block_group_ro().

For the first data block group failure, it looks like something in
bytes_may_use isn't correct.

For the system block group it's another story: it has no
reserved/pinned/may_use bytes, so it's the min_allocable check that fails.

I can't remember if this is some bug I reported before, but it looks a
little similar.

Thanks,
Qu
On Tue, Jun 04, 2019 at 08:31:23AM +0800, Qu Wenruo wrote:
> [...]
>
> Since v5.2-rc2 has a lot of enospc debug output merged, here is the debug
> info with just enospc_debug:
> [...]

Ah ok, sorry, I'm travelling so I can't easily test a patch right now, but
change the btrfs_join_transaction() in btrfs_inc_block_group_ro() to
btrfs_start_transaction(root, 0). This will trigger the delayed ref flushing
if we need it and likely will fix the problem. There's so much random cruft
built into the relocation enospc stuff that we're likely to keep finding
problems like this; we just need to rework it so it goes through the normal
reservation path. Thanks,

Josef
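Josef's suggested experiment, expressed as a patch, would look something like the following. This is an untested, hypothetical sketch (the exact file, context, and signature in the 5.2-era source may differ); the idea is that btrfs_start_transaction() makes a metadata reservation and therefore can flush the delayed refs rsv, while btrfs_join_transaction() reserves nothing:

```diff
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ btrfs_inc_block_group_ro()
-	trans = btrfs_join_transaction(fs_info->extent_root);
+	/* Reserve 0 items, but still go through the reserve/flush path. */
+	trans = btrfs_start_transaction(fs_info->extent_root, 0);
```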
On 2019/6/5 1:43 AM, Josef Bacik wrote:
> On Tue, Jun 04, 2019 at 08:31:23AM +0800, Qu Wenruo wrote:
>> [...]
>
> Ah ok, sorry, I'm travelling so I can't easily test a patch right now, but
> change the btrfs_join_transaction() in btrfs_inc_block_group_ro() to
> btrfs_start_transaction(root, 0). This will trigger the delayed ref
> flushing if we need it and likely will fix the problem.

Unfortunately, it doesn't work as expected. False ENOSPC still gets
triggered.

The same problem remains for the system chunk: min_allocable is causing the
failure, as pinned/reserved/may_use are all zero. Just the min_allocable of
1M makes the check fail.

While for metadata, btrfs_start_transaction(root, 0) doesn't solve the
problem, as reserved and may_use are still relatively high.

I'll dig further to find a different way to solve it.

Thanks,
Qu

> There's so much random cruft built into the relocation enospc stuff that
> we're likely to keep finding problems like this; we just need to rework it
> so it goes through the normal reservation path. Thanks,
>
> Josef
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 22b0dacae003..ef98b3cd30db 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -851,14 +851,6 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 	btrfs_trans_release_chunk_metadata(trans);
 
-	if (lock && should_end_transaction(trans) &&
-	    READ_ONCE(cur_trans->state) == TRANS_STATE_RUNNING) {
-		spin_lock(&info->trans_lock);
-		if (cur_trans->state == TRANS_STATE_RUNNING)
-			cur_trans->state = TRANS_STATE_BLOCKED;
-		spin_unlock(&info->trans_lock);
-	}
-
 	if (lock && READ_ONCE(cur_trans->state) == TRANS_STATE_BLOCKED) {
 		if (throttle)
 			return btrfs_commit_transaction(trans);
Previously callers to btrfs_end_transaction_throttle() would commit the
transaction if there wasn't enough delayed refs space. This happens in
relocation, and if the fs is relatively empty we'll run out of delayed refs
space basically immediately, so we'll just be stuck in this loop of
committing the transaction over and over again.

This code existed because we didn't have a good feedback mechanism for
running delayed refs, but with the delayed refs rsv we do now. Delete this
throttling code and let the btrfs_start_transaction() in relocation deal
with putting pressure on the delayed refs infrastructure. With this patch
we no longer take 5 minutes to balance a metadata only fs.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/transaction.c | 8 --------
 1 file changed, 8 deletions(-)