Message ID | b61263cba690fd894e21d75442556ae2f150f095.1679278088.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: scrub: use a more reader friendly code to implement scrub_simple_mirror() | expand |
> @@ -39,6 +39,8 @@ void btrfs_bio_init(struct btrfs_bio *bbio, struct btrfs_inode *inode, > bbio->end_io = end_io; > bbio->private = private; > atomic_set(&bbio->pending_ios, 1); > + if (inode) > + bbio->fs_info = inode->root->fs_info; > } Please split modifications to the btrfs_bio infrastructure into separate patches. And we'll need a proper helper to allocate these non-bio inodes that ensures fs_info is always set. > struct btrfs_bio *bbio = btrfs_bio(bio); > struct btrfs_device *dev = bio->bi_private; > - struct btrfs_fs_info *fs_info = bbio->inode->root->fs_info; > + struct btrfs_fs_info *fs_info = bbio->inode ? > + bbio->inode->root->fs_info : bbio->fs_info; .. and all places that just need the fs_info need to just use bbio->fs_info instead of these complicated constructs.
On 2023/3/21 20:02, Christoph Hellwig wrote: >> @@ -39,6 +39,8 @@ void btrfs_bio_init(struct btrfs_bio *bbio, struct btrfs_inode *inode, >> bbio->end_io = end_io; >> bbio->private = private; >> atomic_set(&bbio->pending_ios, 1); >> + if (inode) >> + bbio->fs_info = inode->root->fs_info; >> } > > Please split modifications to the btrfs_bio infrastructure into > separate patches. And we'll need a proper helper to allocate these > non-bio inodes that ensures fs_info is always set. > >> struct btrfs_bio *bbio = btrfs_bio(bio); >> struct btrfs_device *dev = bio->bi_private; >> - struct btrfs_fs_info *fs_info = bbio->inode->root->fs_info; >> + struct btrfs_fs_info *fs_info = bbio->inode ? >> + bbio->inode->root->fs_info : bbio->fs_info; > > .. and all places that just need the fs_info need to just use > bbio->fs_info instead of these complicated constructs. > I have updated the branch, with a dedicated btrfs_bio::fs_info, btrfs_scrub_bio_alloc() and always populate that new fs_info. Mind to have a pre-check? https://github.com/adam900710/linux/commit/0f6a419f035787546eec6f51b0430a05a345d4c5 Although the next two patches are also slightly updated to take the advantage of that always populated fs_info: Since the modification, I'm a little more dependent on that always populated fs_info now, thus not sure if going back to a union and conditionally grabbing that fs_info is a good idea. Although I'm more interested why there is a super big gap between work and bio (32 bytes), while the compiler still refuses to put the pointer into that hole... Thanks, Qu
On Fri, Mar 24, 2023 at 05:58:57PM +0800, Qu Wenruo wrote: > I have updated the branch, with a dedicated btrfs_bio::fs_info, > btrfs_scrub_bio_alloc() and always populate that new fs_info. > > Mind to have a pre-check? > > https://github.com/adam900710/linux/commit/0f6a419f035787546eec6f51b0430a05a345d4c5 I'd prefer to just pass a fs_info to btrfs_bio_init and btrfs_bio_alloc instead of duplicating the allocator. With my "simplify extent_buffer reading and writing" series and another tiny change we'll only need the inode for data inodes. But given that we've not made too much progress on getting that series merged I guess we'll have to add the duplicate allocator for now and just revert it later. I'd drop the ASSERT in btrfs_submit_bio - all allocators initialize the field, no need for the extra check. > Although the next two patches are also slightly updated to take the > advantage of that always populated fs_info: I think you wanted to add links here, but they are missing. > Since the modification, I'm a little more dependent on that always populated > fs_info now, thus not sure if going back to a union and conditionally > grabbing that fs_info is a good idea. I don't think the union is a good idea at all. Maybe in the future we can move the inode into a union, but right now the csum for data read is the upper bound on the btrfs_bio, so I'm not sure we'll ever get to a point where that would help. > Although I'm more interested why there is a super big gap between work and > bio (32 bytes), while the compiler still refuses to put the pointer into > that hole... I don't think there is one. pahole just reports structs strangely sometimes.
On 2023/3/25 16:09, Christoph Hellwig wrote: > On Fri, Mar 24, 2023 at 05:58:57PM +0800, Qu Wenruo wrote: >> I have updated the branch, with a dedicated btrfs_bio::fs_info, >> btrfs_scrub_bio_alloc() and always populate that new fs_info. >> >> Mind to have a pre-check? >> >> https://github.com/adam900710/linux/commit/0f6a419f035787546eec6f51b0430a05a345d4c5 > > I'd prefer to just pass a fs_info to btrfs_bio_init and btrfs_bio_alloc > instead of duplicating the allocator. With my "simplify extent_buffer > reading and writing" series and another tiny change we'll only need the > inode for data inodes. But given that we've not made too much > progress on getting that series merged I guess we'll have to add > the duplicate allocator for now and just revert it later. No big deal, I can go with the extra fs_info parameter for btrfs_bio_init() and btrfs_bio_alloc(). The main reason I go with the duplicated allocate is to remove the need for nr_vecs, but that's pretty minor, thus not a critical one. > > I'd drop the ASSERT in btrfs_submit_bio - all allocators initialize > the field, no need for the extra check. OK, sounds reasonable. > >> Although the next two patches are also slightly updated to take the >> advantage of that always populated fs_info: > > I think you wanted to add links here, but they are missing. > >> Since the modification, I'm a little more dependent on that always populated >> fs_info now, thus not sure if going back to a union and conditionally >> grabbing that fs_info is a good idea. > > I don't think the union is a good idea at all. Maybe in the future > we can move the inode into a union, but right now the csum for data > read is the upper bound on the btrfs_bio, so I'm not sure we'll > ever get to a point where that would help. Yeah, let's avoid union and let fs_info always populated. Thanks, Qu > >> Although I'm more interested why there is a super big gap between work and >> bio (32 bytes), while the compiler still refuses to put the pointer into >> that hole... > > I don't think there is one. pahole just reports structs strangely > sometimes.
On Sat, Mar 25, 2023 at 04:21:41PM +0800, Qu Wenruo wrote: > No big deal, I can go with the extra fs_info parameter for btrfs_bio_init() > and btrfs_bio_alloc(). I'd be tempted to just initialize it in the callers given that it's an optional field now. I wonder if the same might also make sense for ->file_offset. > The main reason I go with the duplicated allocate is to remove the need for > nr_vecs, but that's pretty minor, thus not a critical one. Passing nr_vecs might actually make sense. There is no need to always allocated all of them for existing callers either, so we can do additional optimizations based on that later.
On 2023/3/25 16:31, Christoph Hellwig wrote: > On Sat, Mar 25, 2023 at 04:21:41PM +0800, Qu Wenruo wrote: >> No big deal, I can go with the extra fs_info parameter for btrfs_bio_init() >> and btrfs_bio_alloc(). > > I'd be tempted to just initialize it in the callers given that it's > an optional field now. I wonder if the same might also make sense > for ->file_offset. In that case, I'm not sure if adding more and more arguments while almost all of them can be optional is a good idea... Shouldn't we only pass mandatory arguments, then let the caller to populate the optional ones? Anyway, for scrub usage, fs_info can be initialized at the submission timing if neexed, thus I'm fine either way. > >> The main reason I go with the duplicated allocate is to remove the need for >> nr_vecs, but that's pretty minor, thus not a critical one. > > Passing nr_vecs might actually make sense. There is no need to always > allocated all of them for existing callers either, so we can do > additional optimizations based on that later. Or we can just go with the github version and refine later? Considering scrub would be the final major code migrating to use btrfs_bio, there shouldn't be any major use case change to btrfs_bio for a while. Thanks, Qu
On Sat, Mar 25, 2023 at 04:48:40PM +0800, Qu Wenruo wrote: > > I'd be tempted to just initialize it in the callers given that it's > > an optional field now. I wonder if the same might also make sense > > for ->file_offset. > > In that case, I'm not sure if adding more and more arguments while almost > all of them can be optional is a good idea... > > Shouldn't we only pass mandatory arguments, then let the caller to populate > the optional ones? Yes, that's what I tried to suggest above. Right now inode and file_offset are mandatory, but with your changes they stop being so, so we should stop passing them to the allocation helpers.
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c index cf09c6271edb..b96f40160b08 100644 --- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -39,6 +39,8 @@ void btrfs_bio_init(struct btrfs_bio *bbio, struct btrfs_inode *inode, bbio->end_io = end_io; bbio->private = private; atomic_set(&bbio->pending_ios, 1); + if (inode) + bbio->fs_info = inode->root->fs_info; } /* @@ -308,8 +310,8 @@ static void btrfs_end_bio_work(struct work_struct *work) { struct btrfs_bio *bbio = container_of(work, struct btrfs_bio, end_io_work); - /* Metadata reads are checked and repaired by the submitter. */ - if (bbio->bio.bi_opf & REQ_META) + /* Metadata or scrub reads are checked and repaired by the submitter. */ + if (bbio->bio.bi_opf & REQ_META || !bbio->inode) bbio->end_io(bbio); else btrfs_check_read_bio(bbio, bbio->bio.bi_private); @@ -319,7 +321,8 @@ static void btrfs_simple_end_io(struct bio *bio) { struct btrfs_bio *bbio = btrfs_bio(bio); struct btrfs_device *dev = bio->bi_private; - struct btrfs_fs_info *fs_info = bbio->inode->root->fs_info; + struct btrfs_fs_info *fs_info = bbio->inode ? + bbio->inode->root->fs_info : bbio->fs_info; btrfs_bio_counter_dec(fs_info); @@ -343,7 +346,8 @@ static void btrfs_raid56_end_io(struct bio *bio) btrfs_bio_counter_dec(bioc->fs_info); bbio->mirror_num = bioc->mirror_num; - if (bio_op(bio) == REQ_OP_READ && !(bbio->bio.bi_opf & REQ_META)) + if (bio_op(bio) == REQ_OP_READ && bbio->inode && + !(bbio->bio.bi_opf & REQ_META)) btrfs_check_read_bio(bbio, NULL); else btrfs_orig_bbio_end_io(bbio); @@ -689,6 +693,46 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num) return true; } +/* + * Scrub read special version, with extra limits: + * + * - Only support read for scrub usage + * - @mirror_num must be >0 + * - No read-time repair nor checksum verification. + * - The @bbio must not cross stripe boundary. + */ +void btrfs_submit_scrub_read(struct btrfs_fs_info *fs_info, + struct btrfs_bio *bbio, int mirror_num) +{ + struct btrfs_bio *orig_bbio = bbio; + u64 logical = bbio->bio.bi_iter.bi_sector << SECTOR_SHIFT; + u64 length = bbio->bio.bi_iter.bi_size; + u64 map_length = length; + struct btrfs_io_context *bioc = NULL; + struct btrfs_io_stripe smap; + int ret; + + ASSERT(mirror_num > 0); + ASSERT(btrfs_op(&bbio->bio) == BTRFS_MAP_READ); + ASSERT(!bbio->inode); + + bbio->fs_info = fs_info; + btrfs_bio_counter_inc_blocked(fs_info); + ret = __btrfs_map_block(fs_info, btrfs_op(&bbio->bio), logical, + &map_length, &bioc, &smap, &mirror_num, 1); + if (ret) + goto fail; + + /* Caller should ensure the @bbio doesn't cross stripe boundary. */ + ASSERT(map_length >= length); + __btrfs_submit_bio(&bbio->bio, bioc, &smap, mirror_num); + return; + +fail: + btrfs_bio_counter_dec(fs_info); + btrfs_bio_end_io(orig_bbio, ret); +} + void btrfs_submit_bio(struct btrfs_bio *bbio, int mirror_num) { while (!btrfs_submit_chunk(bbio, mirror_num)) diff --git a/fs/btrfs/bio.h b/fs/btrfs/bio.h index dbf125f6fa33..073df13365e4 100644 --- a/fs/btrfs/bio.h +++ b/fs/btrfs/bio.h @@ -30,7 +30,13 @@ typedef void (*btrfs_bio_end_io_t)(struct btrfs_bio *bbio); * passed to btrfs_submit_bio for mapping to the physical devices. */ struct btrfs_bio { - /* Inode and offset into it that this I/O operates on. */ + /* + * Inode and offset into it that this I/O operates on. + * + * @inode can be NULL for callers who don't want any advanced features + * like read-time repair. + * In that case, @fs_info must be properly initialized. + */ struct btrfs_inode *inode; u64 file_offset; @@ -58,6 +64,13 @@ struct btrfs_bio { atomic_t pending_ios; struct work_struct end_io_work; + /* + * For cases where callers only want to read/write from a logical + * bytenr, in that case @inode can be NULL, and we need such + * @fs_info pointer to grab the corresponding fs_info. + */ + struct btrfs_fs_info *fs_info; + /* * This member must come last, bio_alloc_bioset will allocate enough * bytes for entire btrfs_bio but relies on bio being last. @@ -89,6 +102,8 @@ static inline void btrfs_bio_end_io(struct btrfs_bio *bbio, blk_status_t status) #define REQ_BTRFS_ONE_ORDERED REQ_DRV void btrfs_submit_bio(struct btrfs_bio *bbio, int mirror_num); +void btrfs_submit_scrub_read(struct btrfs_fs_info *fs_info, + struct btrfs_bio *bbio, int mirror_num); int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start, u64 length, u64 logical, struct page *page, unsigned int pg_offset, int mirror_num);
The new helper, btrfs_submit_scrub_read(), would be mostly a subset of btrfs_submit_bio(), with the following limitations: - Only supports read - @mirror_num must be > 0 - No read-time repair nor checksum verification - The @bbio must not cross stripe boundary This would provide the basis for unified read repair for scrub, as we no longer needs to handle RAID56 recovery all by scrub, and RAID56 data stripes scrub can share the same code of read and repair. The repair part would be the same as non-RAID56, as we only need to try the next mirror. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/bio.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++---- fs/btrfs/bio.h | 17 ++++++++++++++++- 2 files changed, 64 insertions(+), 5 deletions(-)