Message ID | cover.1648543951.git.johannes.thumshirn@wdc.com (mailing list archive) |
---|---|
Series | btrfs: rework background block group relocation |
On Tue, Mar 29, 2022 at 01:56:05AM -0700, Johannes Thumshirn wrote:
> This is a combination of Josef's series titled "btrfs: rework background
> block group relocation" and my patch titled "btrfs: zoned: make auto-reclaim
> less aggressive" plus another preparation patch to address Josef's comments.
>
> I've opted for rebasing my patch onto Josef's series to avoid and fix
> conflicts, as we're both touching the same code.
>
> Here's the original cover letter from Josef:
>
> Currently the background block group relocation code only works for zoned
> devices, as it prevents the file system from becoming unusable because of block
> group fragmentation.
>
> However inside Facebook our common workload is to download tens of gigabytes
> worth of send files or package files, and it does this by fallocate()'ing the
> entire package, writing into it, and then free'ing it up afterwards.
> Unfortunately this leads to a similar problem as zoned, we get fragmented data
> block groups, and this trends towards filling the entire disk up with partly
> used data block groups, which then leads to ENOSPC because of the lack of
> metadata space.
>
> Because of this we have been running balance internally forever, but this was
> triggered based on different size usage heuristics and still gave us a high
> enough failure rate that it was annoying (figure 10-20 machines needing to be
> reprovisioned per week).
>
> So I modified the existing bg_reclaim_threshold code to also apply in the !zoned
> case, and I also made it only apply to DATA block groups. This has completely
> eliminated these random failure cases, and we're no longer reprovisioning
> machines that get stuck with 0 metadata space.
>
> However my internal patch is kind of janky as it hard codes the DATA check.
> What I've done here is made the bg_reclaim_threshold per-space_info, this way
> a user can target all block group types or just the ones they care about. This
> won't break any current users because this only applied in the zoned case
> before.
>
> Additionally I've added the code to allow this to work in the !zoned case, and
> loosened the restriction on the threshold from 50-100 to 0-100.
>
> I tested this on my vm by writing 500m files and then removing half of them and
> validating that the block groups were automatically reclaimed.
>
> https://lore.kernel.org/linux-btrfs/cover.1646934721.git.josef@toxicpanda.com/
>
> Changes to v1:
> * Fix zoned threshold calculation (Pankaj)
> * Drop unneeded patch
>
> Johannes Thumshirn (1):
>   btrfs: zoned: make auto-reclaim less aggressive
>
> Josef Bacik (3):
>   btrfs: make the bg_reclaim_threshold per-space info
>   btrfs: allow block group background reclaim for !zoned fs'es
>   btrfs: change the bg_reclaim_threshold valid region from 0 to 100

Added to misc-next, thanks.
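
For readers following the cover letter, here is a rough sketch of how a percentage threshold in the 0-100 range could be checked against a single block group. It is not the code from the series: the struct, the function name, and the exact comparison are hypothetical, and treating 0 as "reclaim disabled" is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block group (not the kernel struct). */
struct bg_sketch {
	uint64_t length; /* total size of the block group, in bytes */
	uint64_t used;   /* bytes still in use after the most recent free */
};

/*
 * Illustrative check: treat the block group as a reclaim candidate once its
 * usage has dropped below threshold_pct percent of its length.
 *
 * Assumptions (not taken from the patches): 0 disables reclaim entirely, and
 * values above 100 are rejected as invalid.
 */
bool should_reclaim_sketch(const struct bg_sketch *bg, unsigned int threshold_pct)
{
	uint64_t thresh_bytes;

	if (threshold_pct == 0 || threshold_pct > 100)
		return false;

	/* Convert the percentage into an absolute byte count. */
	thresh_bytes = bg->length * threshold_pct / 100;

	return bg->used < thresh_bytes;
}
```

With a 1 GiB block group that has 256 MiB in use, this sketch reports a candidate at a threshold of 50 but not at 25, since usage has to fall strictly below the threshold. Making the threshold per-space_info, as the cover letter describes, would amount to looking up threshold_pct from the block group's own space_info rather than from one filesystem-wide value.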