Message ID | 20211119061933.13560-1-wqu@suse.com (mailing list archive)
---|---
State | New, archived
Series | btrfs: don't bother stripe length if the profile is not stripe based
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
LGTM.

Reviewed-by: Anand Jain <anand.jain@oracle.com>

Nit:

> +#define BTRFS_BLOCK_GROUP_STRIPE_MASK	(BTRFS_BLOCK_GROUP_RAID0 |\
> +					 BTRFS_BLOCK_GROUP_RAID10 |\
> +					 BTRFS_BLOCK_GROUP_RAID56_MASK)

How about moving this define to btrfs_tree.h at line number 911?
We have other BG MASKs there.

Thanks, Anand
On 2021/11/19 18:22, Anand Jain wrote:
> LGTM.
> Reviewed-by: Anand Jain <anand.jain@oracle.com>
>
> Nit:
>
>> +#define BTRFS_BLOCK_GROUP_STRIPE_MASK	(BTRFS_BLOCK_GROUP_RAID0 |\
>> +					 BTRFS_BLOCK_GROUP_RAID10 |\
>> +					 BTRFS_BLOCK_GROUP_RAID56_MASK)
>
> How about moving this define to btrfs_tree.h at line number 911?
> We have other BG MASKs there.

I don't think we need it in the long run.

There is already a plan to remove the stripe_len requirement completely
(all the bio splitting will happen at btrfs_map_bio() time instead).

Even this definition would be gone in the future.

Thanks,
Qu

>
> Thanks, Anand
On Fri, Nov 19, 2021 at 02:19:33PM +0800, Qu Wenruo wrote:
> [BUG]
> When debugging calc_bio_boundaries(), I found that even for RAID1
> metadata, we're following stripe length to calculate the stripe boundary.
>
>   # mkfs.btrfs -m raid1 -d raid1 /dev/test/scratch[12]
>   # mount /dev/test/scratch /mnt/btrfs
>   # xfs_io -f -c "pwrite 0 64K" /mnt/btrfs/file
>   # umount
>
> The above very basic operations will make calc_bio_boundaries() report
> the following result:
>
>   submit_extent_page: r/i=1/1 file_offset=22036480 len_to_stripe_boundary=49152
>   submit_extent_page: r/i=1/1 file_offset=30474240 len_to_stripe_boundary=65536
>   ...
>   submit_extent_page: r/i=1/1 file_offset=30523392 len_to_stripe_boundary=16384
>   submit_extent_page: r/i=1/1 file_offset=30457856 len_to_stripe_boundary=16384
>   submit_extent_page: r/i=5/257 file_offset=0 len_to_stripe_boundary=65536
>   submit_extent_page: r/i=5/257 file_offset=65536 len_to_stripe_boundary=65536
>   submit_extent_page: r/i=1/1 file_offset=30490624 len_to_stripe_boundary=49152
>   submit_extent_page: r/i=1/1 file_offset=30507008 len_to_stripe_boundary=32768
>
> Where "r/i" is the rootid and inode number; 1/1 means it's metadata.
> The remaining names match the members used in the kernel.
>
> Even though all data/metadata are using RAID1, we're still following
> stripe length.
>
> [CAUSE]
> This behavior is caused by a wrong condition in btrfs_get_io_geometry():
>
> 	if (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
> 		/* Fill using stripe_len */
> 		len = min_t(u64, em->len - offset, max_len);
> 	} else {
> 		len = em->len - offset;
> 	}
>
> This means only for SINGLE will we not follow stripe_len.
>
> However, profiles like RAID1* and DUP don't need to bother with
> stripe_len at all.
>
> This can lead to unnecessary bio splits for RAID1*/DUP profiles, and can
> even be a blockage for future zoned RAID support.
>
> [FIX]
> Introduce one single-use macro, BTRFS_BLOCK_GROUP_STRIPE_MASK, and
> change the condition to only calculate the length using stripe length
> for stripe based profiles.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Added to misc-next, thanks.

> @@ -6296,6 +6296,10 @@ static bool need_full_stripe(enum btrfs_map_op op)
>  	return (op == BTRFS_MAP_WRITE || op == BTRFS_MAP_GET_READ_MIRRORS);
>  }
>
> +
> +#define BTRFS_BLOCK_GROUP_STRIPE_MASK	(BTRFS_BLOCK_GROUP_RAID0 |\
> +					 BTRFS_BLOCK_GROUP_RAID10 |\
> +					 BTRFS_BLOCK_GROUP_RAID56_MASK)

I've moved the definition to the beginning of the file. Even if it's for
a single use, such definitions should go somewhere we can find them
easily, and not 6000 lines down the file. The nature of the bit mask is
generic and we could need to use it again for other purposes.
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index ae1a4f2a8af6..3dc1de376966 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6296,6 +6296,10 @@ static bool need_full_stripe(enum btrfs_map_op op)
 	return (op == BTRFS_MAP_WRITE || op == BTRFS_MAP_GET_READ_MIRRORS);
 }
 
+
+#define BTRFS_BLOCK_GROUP_STRIPE_MASK	(BTRFS_BLOCK_GROUP_RAID0 |\
+					 BTRFS_BLOCK_GROUP_RAID10 |\
+					 BTRFS_BLOCK_GROUP_RAID56_MASK)
 /*
  * Calculate the geometry of a particular (address, len) tuple. This
  * information is used to calculate how big a particular bio can get before it
@@ -6345,7 +6349,8 @@ int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *em,
 	stripe_offset = offset - stripe_offset;
 	data_stripes = nr_data_stripes(map);
 
-	if (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
+	/* Only stripe based profiles needs to check against stripe length. */
+	if (map->type & BTRFS_BLOCK_GROUP_STRIPE_MASK) {
 		u64 max_len = stripe_len - stripe_offset;
 
 		/*
[BUG]
When debugging calc_bio_boundaries(), I found that even for RAID1
metadata, we're following stripe length to calculate the stripe boundary.

  # mkfs.btrfs -m raid1 -d raid1 /dev/test/scratch[12]
  # mount /dev/test/scratch /mnt/btrfs
  # xfs_io -f -c "pwrite 0 64K" /mnt/btrfs/file
  # umount

The above very basic operations will make calc_bio_boundaries() report
the following result:

  submit_extent_page: r/i=1/1 file_offset=22036480 len_to_stripe_boundary=49152
  submit_extent_page: r/i=1/1 file_offset=30474240 len_to_stripe_boundary=65536
  ...
  submit_extent_page: r/i=1/1 file_offset=30523392 len_to_stripe_boundary=16384
  submit_extent_page: r/i=1/1 file_offset=30457856 len_to_stripe_boundary=16384
  submit_extent_page: r/i=5/257 file_offset=0 len_to_stripe_boundary=65536
  submit_extent_page: r/i=5/257 file_offset=65536 len_to_stripe_boundary=65536
  submit_extent_page: r/i=1/1 file_offset=30490624 len_to_stripe_boundary=49152
  submit_extent_page: r/i=1/1 file_offset=30507008 len_to_stripe_boundary=32768

Where "r/i" is the rootid and inode number; 1/1 means it's metadata.
The remaining names match the members used in the kernel.

Even though all data/metadata are using RAID1, we're still following
stripe length.

[CAUSE]
This behavior is caused by a wrong condition in btrfs_get_io_geometry():

	if (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
		/* Fill using stripe_len */
		len = min_t(u64, em->len - offset, max_len);
	} else {
		len = em->len - offset;
	}

This means only for SINGLE will we not follow stripe_len.

However, profiles like RAID1* and DUP don't need to bother with
stripe_len at all.

This can lead to unnecessary bio splits for RAID1*/DUP profiles, and can
even be a blockage for future zoned RAID support.

[FIX]
Introduce one single-use macro, BTRFS_BLOCK_GROUP_STRIPE_MASK, and
change the condition to only calculate the length using stripe length
for stripe based profiles.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/volumes.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)