Message ID | 1471273882-3938-1-git-send-email-ming.lei@canonical.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Aug 15, 2016 at 11:11:22PM +0800, Ming Lei wrote: > After arbitrary bio size is supported, the incoming bio may > be very big. We have to split the bio into small bios so that > each holds at most BIO_MAX_PAGES bvecs for safety reason, such > as bio_clone(). I still think working around a rough driver submitting too large I/O is a bad thing until we've done a full audit of all consuming bios through ->make_request, and we've enabled it for the common path as well. > bool do_split = true; > struct bio *new = NULL; > const unsigned max_sectors = get_max_io_size(q, bio); > + unsigned bvecs = 0; > + > + *no_merge = true; > > bio_for_each_segment(bv, bio, iter) { > /* > + * With arbitrary bio size, the incoming bio may be very > + * big. We have to split the bio into small bios so that > + * each holds at most BIO_MAX_PAGES bvecs because > + * bio_clone() can fail to allocate big bvecs. > + * > + * It should have been better to apply the limit per > + * request queue in which bio_clone() is involved, > + * instead of globally. The biggest blocker is > + * bio_clone() in bio bounce. > + * > + * If bio is splitted by this reason, we should allow > + * to continue bios merging. > + * > + * TODO: deal with bio bounce's bio_clone() gracefully > + * and convert the global limit into per-queue limit. > + */ > + if (bvecs++ >= BIO_MAX_PAGES) { > + *no_merge = false; > + goto split; > + } That being said this simple if check here is simple enough that it's probably fine. But I see no need to uglify the whole code path with that no_merge flag. Please drop if for now, and if we start caring for this path in common code we should just move the REQ_NOMERGE setting into the actual blk_bio_*_split helpers. -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 15, 2016 at 11:23:28AM -0700, Christoph Hellwig wrote: > On Mon, Aug 15, 2016 at 11:11:22PM +0800, Ming Lei wrote: > > After arbitrary bio size is supported, the incoming bio may > > be very big. We have to split the bio into small bios so that > > each holds at most BIO_MAX_PAGES bvecs for safety reason, such > > as bio_clone(). > > I still think working around a rough driver submitting too large > I/O is a bad thing until we've done a full audit of all consuming > bios through ->make_request, and we've enabled it for the common > path as well. bcache originally had workaround code to split too-large bios when it first went upstream - that was dropped only after the patches to make generic_make_request() handle arbitrary size bios went in. So to do what you're suggesting would mean reverting that bcache patch and bringing that code back, which from my perspective would be a step in the wrong direction. I just want to get this over and done with. re: interactions with other drivers - bio_clone() has already been changed to only clone biovecs that are live for current bi_iter, so there shouldn't be any safety issues. A driver would have to be intentionally doing its own open coded bio cloning that clones all of bi_io_vec, not just the active ones - but if they're doing that, they're already broken because a driver isn't allowed to look at bi_vcnt if it isn't a bio that it owns - bi_vcnt is 0 on bios that don't own their biovec (i.e. that were created by bio_clone_fast). And the cloning and bi_vcnt usage stuff I audited very thoroughly back when I was working on immutable biovecs and such back in the day, and I had to do a fair amount of cleanup/refactoring before that stuff could go in. > > > bool do_split = true; > > struct bio *new = NULL; > > const unsigned max_sectors = get_max_io_size(q, bio); > > + unsigned bvecs = 0; > > + > > + *no_merge = true; > > > > bio_for_each_segment(bv, bio, iter) { > > /* > > + * With arbitrary bio size, the incoming bio may be very > > + * big. We have to split the bio into small bios so that > > + * each holds at most BIO_MAX_PAGES bvecs because > > + * bio_clone() can fail to allocate big bvecs. > > + * > > + * It should have been better to apply the limit per > > + * request queue in which bio_clone() is involved, > > + * instead of globally. The biggest blocker is > > + * bio_clone() in bio bounce. > > + * > > + * If bio is splitted by this reason, we should allow > > + * to continue bios merging. > > + * > > + * TODO: deal with bio bounce's bio_clone() gracefully > > + * and convert the global limit into per-queue limit. > > + */ > > + if (bvecs++ >= BIO_MAX_PAGES) { > > + *no_merge = false; > > + goto split; > > + } > > That being said this simple if check here is simple enough that it's > probably fine. But I see no need to uglify the whole code path > with that no_merge flag. Please drop if for now, and if we start > caring for this path in common code we should just move the > REQ_NOMERGE setting into the actual blk_bio_*_split helpers. Agreed about the no_merge thing. -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Mon, Aug 15, 2016 at 11:23:28AM -0700, Christoph Hellwig wrote: > > On Mon, Aug 15, 2016 at 11:11:22PM +0800, Ming Lei wrote: > > > After arbitrary bio size is supported, the incoming bio may > > > be very big. We have to split the bio into small bios so that > > > each holds at most BIO_MAX_PAGES bvecs for safety reason, such > > > as bio_clone(). > > > > I still think working around a rough driver submitting too large > > I/O is a bad thing until we've done a full audit of all consuming > > bios through ->make_request, and we've enabled it for the common > > path as well. > > bcache originally had workaround code to split too-large bios when it > first went upstream - that was dropped only after the patches to make > generic_make_request() handle arbitrary size bios went in. So to do what > you're suggesting would mean reverting that bcache patch and bringing > that code back, which from my perspective would be a step in the wrong > direction. I just want to get this over and done with. > > > > > > bool do_split = true; > > > struct bio *new = NULL; > > > const unsigned max_sectors = get_max_io_size(q, bio); > > > + unsigned bvecs = 0; > > > + > > > + *no_merge = true; > > > > > > bio_for_each_segment(bv, bio, iter) { > > > /* > > > + * With arbitrary bio size, the incoming bio may be very > > > + * big. We have to split the bio into small bios so that > > > + * each holds at most BIO_MAX_PAGES bvecs because > > > + * bio_clone() can fail to allocate big bvecs. > > > + * > > > + * It should have been better to apply the limit per > > > + * request queue in which bio_clone() is involved, > > > + * instead of globally. The biggest blocker is > > > + * bio_clone() in bio bounce. > > > + * > > > + * If bio is splitted by this reason, we should allow > > > + * to continue bios merging. > > > + * > > > + * TODO: deal with bio bounce's bio_clone() gracefully > > > + * and convert the global limit into per-queue limit. > > > + */ > > > + if (bvecs++ >= BIO_MAX_PAGES) { > > > + *no_merge = false; > > > + goto split; > > > + } > > > > That being said this simple if check here is simple enough that it's > > probably fine. But I see no need to uglify the whole code path > > with that no_merge flag. Please drop if for now, and if we start > > caring for this path in common code we should just move the > > REQ_NOMERGE setting into the actual blk_bio_*_split helpers. > > Agreed about the no_merge thing. By removing `no_merge` this patch should cherry-peck into stable v4.3+ without merge issues by avoiding bi_rw refactor interference, too. Ming, can you send out a V4 without `no_merge` ? -- Eric Wheeler -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Aug 19, 2016 at 8:41 AM, Eric Wheeler <bcache@lists.ewheeler.net> wrote: >> On Mon, Aug 15, 2016 at 11:23:28AM -0700, Christoph Hellwig wrote: >> > On Mon, Aug 15, 2016 at 11:11:22PM +0800, Ming Lei wrote: >> > > After arbitrary bio size is supported, the incoming bio may >> > > be very big. We have to split the bio into small bios so that >> > > each holds at most BIO_MAX_PAGES bvecs for safety reason, such >> > > as bio_clone(). >> > >> > I still think working around a rough driver submitting too large >> > I/O is a bad thing until we've done a full audit of all consuming >> > bios through ->make_request, and we've enabled it for the common >> > path as well. >> >> bcache originally had workaround code to split too-large bios when it >> first went upstream - that was dropped only after the patches to make >> generic_make_request() handle arbitrary size bios went in. So to do what >> you're suggesting would mean reverting that bcache patch and bringing >> that code back, which from my perspective would be a step in the wrong >> direction. I just want to get this over and done with. >> >> > >> > > bool do_split = true; >> > > struct bio *new = NULL; >> > > const unsigned max_sectors = get_max_io_size(q, bio); >> > > + unsigned bvecs = 0; >> > > + >> > > + *no_merge = true; >> > > >> > > bio_for_each_segment(bv, bio, iter) { >> > > /* >> > > + * With arbitrary bio size, the incoming bio may be very >> > > + * big. We have to split the bio into small bios so that >> > > + * each holds at most BIO_MAX_PAGES bvecs because >> > > + * bio_clone() can fail to allocate big bvecs. >> > > + * >> > > + * It should have been better to apply the limit per >> > > + * request queue in which bio_clone() is involved, >> > > + * instead of globally. The biggest blocker is >> > > + * bio_clone() in bio bounce. >> > > + * >> > > + * If bio is splitted by this reason, we should allow >> > > + * to continue bios merging. >> > > + * >> > > + * TODO: deal with bio bounce's bio_clone() gracefully >> > > + * and convert the global limit into per-queue limit. >> > > + */ >> > > + if (bvecs++ >= BIO_MAX_PAGES) { >> > > + *no_merge = false; >> > > + goto split; >> > > + } >> > >> > That being said this simple if check here is simple enough that it's >> > probably fine. But I see no need to uglify the whole code path >> > with that no_merge flag. Please drop if for now, and if we start >> > caring for this path in common code we should just move the >> > REQ_NOMERGE setting into the actual blk_bio_*_split helpers. >> >> Agreed about the no_merge thing. > > By removing `no_merge` this patch should cherry-peck into stable v4.3+ > without merge issues by avoiding bi_rw refactor interference, too. > > Ming, can you send out a V4 without `no_merge` ? This patch is definitely correct, and I don't see dealing with 'no_merge' should be removed. In this case, the bio is still possible to merge with others because it doesn't violate any limit of the queue because it just can't be held in 256 bvecs, that means it is correct to clear no_merge for this situation. Also I don't think it is complicated or ugly to deal with the flag. Jens, could you merge this fix? Thanks, Ming > > -- > Eric Wheeler > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-block" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Aug 21, 2016 at 05:31:33PM +0800, Ming Lei wrote: > This patch is definitely correct, and I don't see dealing with 'no_merge' > should be removed. > > In this case, the bio is still possible to merge with others because > it doesn't violate any limit of the queue because it just can't be held in > 256 bvecs, that means it is correct to clear no_merge for this situation. > > Also I don't think it is complicated or ugly to deal with the flag. It's extra unnecessary code. All other things being equal, less code is always better than more code. It works just fine without it, what's the justification for adding the flag? Don't be so dismissive of other maintainer's concerns, we've got to deal with this code too. Especially when I and Christoph are agreeing on something - how often does that happen? -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 22, 2016 at 1:58 AM, Kent Overstreet <kent.overstreet@gmail.com> wrote: > On Sun, Aug 21, 2016 at 05:31:33PM +0800, Ming Lei wrote: >> This patch is definitely correct, and I don't see dealing with 'no_merge' >> should be removed. >> >> In this case, the bio is still possible to merge with others because >> it doesn't violate any limit of the queue because it just can't be held in >> 256 bvecs, that means it is correct to clear no_merge for this situation. >> >> Also I don't think it is complicated or ugly to deal with the flag. > > It's extra unnecessary code. All other things being equal, less code is always > better than more code. It works just fine without it, what's the justification > for adding the flag? From rationality view, the 'no_merge' flag need to be cleared because it doesn't violate any queue's limit and should be allowed to merge, and actually it is possible to submit all these splitted bios into hardware via one request, that is different with other splitting cases. Given you introduced support of arbitrarily sized bios in commit 54efd50bf (block:make generic_make_request handle arbitrarily sized bios), other drivers/fs might use this to simplify bio submission too in the future. > > Don't be so dismissive of other maintainer's concerns, we've got to deal with > this code too. Especially when I and Christoph are agreeing on something - how > often does that happen? OK, I will submit a V4 and comment on not merging just for simplicity resaon. Thanks, Ming -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/block/blk-merge.c b/block/blk-merge.c index 3eec75a..c44d3e9 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -85,7 +85,8 @@ static inline unsigned get_max_io_size(struct request_queue *q, static struct bio *blk_bio_segment_split(struct request_queue *q, struct bio *bio, struct bio_set *bs, - unsigned *segs) + unsigned *segs, + bool *no_merge) { struct bio_vec bv, bvprv, *bvprvp = NULL; struct bvec_iter iter; @@ -94,9 +95,34 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, bool do_split = true; struct bio *new = NULL; const unsigned max_sectors = get_max_io_size(q, bio); + unsigned bvecs = 0; + + *no_merge = true; bio_for_each_segment(bv, bio, iter) { /* + * With arbitrary bio size, the incoming bio may be very + * big. We have to split the bio into small bios so that + * each holds at most BIO_MAX_PAGES bvecs because + * bio_clone() can fail to allocate big bvecs. + * + * It should have been better to apply the limit per + * request queue in which bio_clone() is involved, + * instead of globally. The biggest blocker is + * bio_clone() in bio bounce. + * + * If bio is splitted by this reason, we should allow + * to continue bios merging. + * + * TODO: deal with bio bounce's bio_clone() gracefully + * and convert the global limit into per-queue limit. + */ + if (bvecs++ >= BIO_MAX_PAGES) { + *no_merge = false; + goto split; + } + + /* * If the queue doesn't support SG gaps and adding this * offset would create a gap, disallow it. */ @@ -171,13 +197,15 @@ void blk_queue_split(struct request_queue *q, struct bio **bio, { struct bio *split, *res; unsigned nsegs; + bool no_merge_for_split = true; if (bio_op(*bio) == REQ_OP_DISCARD) split = blk_bio_discard_split(q, *bio, bs, &nsegs); else if (bio_op(*bio) == REQ_OP_WRITE_SAME) split = blk_bio_write_same_split(q, *bio, bs, &nsegs); else - split = blk_bio_segment_split(q, *bio, q->bio_split, &nsegs); + split = blk_bio_segment_split(q, *bio, q->bio_split, &nsegs, + &no_merge_for_split); /* physical segments can be figured out during splitting */ res = split ? split : *bio; @@ -186,7 +214,8 @@ void blk_queue_split(struct request_queue *q, struct bio **bio, if (split) { /* there isn't chance to merge the splitted bio */ - split->bi_opf |= REQ_NOMERGE; + if (no_merge_for_split) + split->bi_opf |= REQ_NOMERGE; bio_chain(split, *bio); trace_block_split(q, split, (*bio)->bi_iter.bi_sector);