Message ID | CANaSA1wZfn5Gxg_dU33WbamchVtWDU4GpXazn8ep-NJKGNaetA@mail.gmail.com (mailing list archive)
State | New, archived
Series | Different sized disks, allocation strategy
iker vagyok wrote:
> Hello!
>
> I am curious if the multiple disk allocation strategy for btrfs could
> be improved for my use case. The current situation is:
>
> # btrfs filesystem usage -T /
> Overall:
>     Device size:          32.74TiB
>     Device allocated:     15.66TiB
>     Device unallocated:   17.08TiB
>     Device missing:           0.00B
>     Device slack:             0.00B
>     Used:                 15.57TiB
>     Free (estimated):      8.58TiB   (min: 4.31TiB)
>     Free (statfs, df):     6.12TiB
>     Data ratio:               2.00
>     Metadata ratio:           4.00
>     Global reserve:      512.00MiB   (used: 0.00B)
>     Multiple profiles:          no
>
>                Data     Metadata  System
> Id Path        RAID1    RAID1C4   RAID1C4   Unallocated  Total     Slack
> -- ---------   -------  --------  --------  -----------  --------  -----
>  2 /dev/sdc2   4.18TiB  13.00GiB  32.00MiB      4.91TiB   9.09TiB      -
>  3 /dev/sda2         -   7.00GiB         -      2.72TiB   2.72TiB      -
>  6 /dev/sde2         -   6.00GiB  32.00MiB      2.72TiB   2.72TiB      -
>  8 /dev/sdb2   5.72TiB  13.00GiB  32.00MiB      3.37TiB   9.09TiB      -
>  9 /dev/sdd2   5.71TiB  13.00GiB  32.00MiB      3.37TiB   9.09TiB      -
> -- ---------   -------  --------  --------  -----------  --------  -----
>    Total       7.80TiB  13.00GiB  32.00MiB     17.08TiB  32.74TiB  0.00B
>    Used        7.77TiB   9.95GiB   1.19MiB
>
> As you can see, my server has 2*3TB and 3*10TB HDDs and uses RAID1 for
> data and RAID1C4 for metadata. This works fine and the smaller devices
> are even used for RAID1C4 (metadata), as there are not enough big
> drives to handle 4 copies. But the smaller drives are not used for
> RAID1 (data) until all bigger disks are filled (with <3TB remaining
> free). Only then will all the disks take part in the raid setup.

First of all, I am not a btrfs dev, just a regular user.

This "issue" bugs me a bit as well, and I remember a proposal from quite
some time ago where someone wanted to use the percentage of free space as
the allocation trigger. I believe that would have been a quite useful
change.

On modern kernels (>= 5.15) the RAID10 profile "degrades" to RAID1 (which
is perfectly fine from a reliability perspective), so with RAID10 you
would spread the data over all your disks a bit better while the
filesystem is "fresh". The opposite becomes more likely as the filesystem
nears a full state; at that point the three largest devices will be the
ones used for writes.

So, assuming regular "home use" (fill up with data once, read often), you
roughly have the choice between:

A: RAID1  - use your largest devices first, then spread over all devices.
B: RAID10 - use all devices first, then use your largest devices last.
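For context, the behaviour described above comes from how the chunk
allocator orders candidate devices: it sorts them by the largest available
free space (max_avail) and allocates stripes from the top of that list, so
under RAID1 the 10TB drives keep winning until their free space drops to
roughly that of the 3TB drives. A rough paraphrase of the stock comparator
in fs/btrfs/volumes.c (exact code differs between kernel versions):

/*
 * Paraphrase of the stock device comparator: order devices by largest
 * available space first, then by total free space. This is why chunks
 * land on the biggest drives until they are as empty as the small ones.
 */
static int btrfs_cmp_device_info(const void *a, const void *b)
{
	const struct btrfs_device_info *di_a = a;
	const struct btrfs_device_info *di_b = b;

	if (di_a->max_avail > di_b->max_avail)
		return -1;
	if (di_a->max_avail < di_b->max_avail)
		return 1;
	if (di_a->total_avail > di_b->total_avail)
		return -1;
	if (di_a->total_avail < di_b->total_avail)
		return 1;
	return 0;
}

The patch below changes that ordering so that the ratio of available to
total free space is compared before the absolute amounts.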
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 841e799dece5..db36c01a62be 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5049,6 +5049,14 @@ static int btrfs_cmp_device_info(const void *a, const void *b)
 	const struct btrfs_device_info *di_a = a;
 	const struct btrfs_device_info *di_b = b;
 
+	if (di_a->total_avail > 0 && di_b->total_avail > 0)
+	{
+		if (di_a->max_avail / di_a->total_avail > di_b->max_avail / di_b->total_avail)
+			return -1;
+		if (di_a->max_avail / di_a->total_avail < di_b->max_avail / di_b->total_avail)
+			return 1;
+	};
+
 	if (di_a->max_avail > di_b->max_avail)
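One caveat with the hunk as posted: if I read it right, max_avail and
total_avail are u64 byte counts and max_avail cannot exceed total_avail,
so the integer division max_avail / total_avail truncates to 0 (or 1 when
the two are equal) and the added comparisons will rarely distinguish
devices. Below is a minimal sketch of the same ratio idea with the
division scaled first. The names avail_ratio and
btrfs_cmp_device_info_by_ratio are illustrative, div64_u64() is the
64-bit division helper from <linux/math64.h>, and the scaling assumes
per-device free space stays well below ~1 PiB so the multiplication
cannot overflow:

/* Needs <linux/math64.h> for div64_u64(). */

/* Parts per 10000 of the device's free space that is available for a chunk. */
static u64 avail_ratio(const struct btrfs_device_info *di)
{
	if (!di->total_avail)
		return 0;
	return div64_u64(di->max_avail * 10000ULL, di->total_avail);
}

static int btrfs_cmp_device_info_by_ratio(const void *a, const void *b)
{
	const struct btrfs_device_info *di_a = a;
	const struct btrfs_device_info *di_b = b;
	u64 ratio_a = avail_ratio(di_a);
	u64 ratio_b = avail_ratio(di_b);

	/* Emptier devices (relatively) sort first. */
	if (ratio_a > ratio_b)
		return -1;
	if (ratio_a < ratio_b)
		return 1;

	/* Fall back to the existing ordering by absolute free space. */
	if (di_a->max_avail > di_b->max_avail)
		return -1;
	if (di_a->max_avail < di_b->max_avail)
		return 1;
	return 0;
}

A ratio based on free space versus the device's full size might match the
older "percentage of free space" proposal more closely, but the sketch
sticks to the two fields the posted hunk already uses.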