block: fix bio splitting on max sectors
diff mbox

Message ID 1453487372-8391-1-git-send-email-tom.leiming@gmail.com
State New
Headers show

Commit Message

Ming Lei Jan. 22, 2016, 6:29 p.m. UTC
After commit e36f62042880(block: split bios to maxpossible length),
bio can be splitted in the middle of a vector entry, then it
is easy to split out one bio which size isn't aligned with block
size, especially when the block size is bigger than 512.

This patch fixes the issue by making the max io size aligned
to logical block size.

Fixes: e36f62042880(block: split bios to maxpossible length)
Reported-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 block/blk-merge.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

Comments

Linus Torvalds Jan. 22, 2016, 6:43 p.m. UTC | #1
On Fri, Jan 22, 2016 at 10:29 AM, Ming Lei <tom.leiming@gmail.com> wrote:
>
> This patch fixes the issue by making the max io size aligned
> to logical block size.

Looks better, thanks.

I'd suggest also moving the "max_sectors" variable into the
bio_for_each_segment() loop too just to keep variables with minimal
scope, but at least this is fairly legible.

Also:

> +static inline unsigned get_max_io_size(struct request_queue *q,
> +                                      struct bio *bio)
> +{
> +       unsigned sectors = blk_max_size_offset(q, bio->bi_iter.bi_sector);
> +       unsigned mask = ~(queue_logical_block_size(q) - 1);
> +
> +       /* aligned to logical block size */
> +       sectors = ((sectors << 9) & mask) >> 9;

this could be written as

        unsigned mask = queue_logical_block_size(q) - 1;

        sectors = sectors & ~(mask >> 9);

avoiding the extra shift. That also avoids the possible overflow that
that extra left-shift introduces. Hmm?

                 Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch Jan. 22, 2016, 8:11 p.m. UTC | #2
On Fri, Jan 22, 2016 at 10:43:59AM -0800, Linus Torvalds wrote:
> On Fri, Jan 22, 2016 at 10:29 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> >
> > This patch fixes the issue by making the max io size aligned
> > to logical block size.
> 
> Looks better, thanks.
> 
> I'd suggest also moving the "max_sectors" variable into the
> bio_for_each_segment() loop too just to keep variables with minimal
> scope, but at least this is fairly legible.

The value of max_sectors doesn't change in this function, so its
calculation could be done just once instead of with each segment
iteration.

Other than that, thanks Ming, the fix looks good to me. Third try's
a charm?
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds Jan. 22, 2016, 8:32 p.m. UTC | #3
[ Grr. re-sending due to stupid html from android mobile app. ]

On Jan 22, 2016 12:11 PM, "Keith Busch" <keith.busch@intel.com> wrote:
>
> The value of max_sectors doesn't change in this function

Why do you say that? It depends on the offset, so it clearly *does* change.

Now, if the patch did what I suggested and made max_sector and the cluster
size thing be separate, then *those* would indeed be constant over the whole
function.

      Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Keith Busch Jan. 22, 2016, 8:33 p.m. UTC | #4
On Fri, Jan 22, 2016 at 12:22:18PM -0800, Linus Torvalds wrote:
> 
> On Jan 22, 2016 12:11 PM, "Keith Busch" <keith.busch@intel.com> wrote:
> >
> > The value of max_sectors doesn't change in this function
> 
> Why do you say that? It depends on the offset, so it clearly *does* change.

The only "offset" used is the bio's start sector, which doesn't change.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch
diff mbox

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 1699df5..ad22153 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -70,6 +70,18 @@  static struct bio *blk_bio_write_same_split(struct request_queue *q,
 	return bio_split(bio, q->limits.max_write_same_sectors, GFP_NOIO, bs);
 }
 
+static inline unsigned get_max_io_size(struct request_queue *q,
+				       struct bio *bio)
+{
+	unsigned sectors = blk_max_size_offset(q, bio->bi_iter.bi_sector);
+	unsigned mask = ~(queue_logical_block_size(q) - 1);
+
+	/* aligned to logical block size */
+	sectors = ((sectors << 9) & mask) >> 9;
+
+	return sectors;
+}
+
 static struct bio *blk_bio_segment_split(struct request_queue *q,
 					 struct bio *bio,
 					 struct bio_set *bs,
@@ -81,6 +93,7 @@  static struct bio *blk_bio_segment_split(struct request_queue *q,
 	unsigned front_seg_size = bio->bi_seg_front_size;
 	bool do_split = true;
 	struct bio *new = NULL;
+	unsigned max_sectors;
 
 	bio_for_each_segment(bv, bio, iter) {
 		/*
@@ -90,20 +103,20 @@  static struct bio *blk_bio_segment_split(struct request_queue *q,
 		if (bvprvp && bvec_gap_to_prev(q, bvprvp, bv.bv_offset))
 			goto split;
 
-		if (sectors + (bv.bv_len >> 9) >
-				blk_max_size_offset(q, bio->bi_iter.bi_sector)) {
+		max_sectors = get_max_io_size(q, bio);
+		if (sectors + (bv.bv_len >> 9) > max_sectors) {
 			/*
 			 * Consider this a new segment if we're splitting in
 			 * the middle of this vector.
 			 */
 			if (nsegs < queue_max_segments(q) &&
-			    sectors < blk_max_size_offset(q,
-						bio->bi_iter.bi_sector)) {
+			    sectors < max_sectors) {
 				nsegs++;
-				sectors = blk_max_size_offset(q,
-						bio->bi_iter.bi_sector);
+				sectors = max_sectors;
 			}
-			goto split;
+			if (sectors)
+				goto split;
+			/* Make this single bvec as the 1st segment */
 		}
 
 		if (bvprvp && blk_queue_cluster(q)) {