diff mbox

block: bios with an offset are always gappy

Message ID 20170413144505.GB10008@ming.t460p (mailing list archive)
State New, archived
Headers show

Commit Message

Ming Lei April 13, 2017, 2:45 p.m. UTC
On Thu, Apr 13, 2017 at 01:53:28PM +0200, Johannes Thumshirn wrote:
> On Thu, Apr 13, 2017 at 06:02:21PM +0800, Ming Lei wrote:
> > On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> > > Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
> > > in nvme_setup_prps() because the dma_len will drop below zero but the
> > > length not.
> > 
> > Looks I can't reproduce the issue in QEMU(32G nvme, either partitioned
> > or not, just use 'mkfs.btrfs /dev/nvme0n1p1'), could you share the exact
> > mkfs command line and size of your emulated NVMe?
> 
> the exact cmdline is mkfs.btrfs -f /dev/nvme0n1p1 (-f because there was a
> existing btrfs on the image). The image is 17179869184 (a.k.a 16G) bytes.
> 
> [...]
> 
> > Could you try the following patch to see if it fixes your issue?
> 
> It's back to the old, erratic behaviour, see log below.

Johannes, could you test the following patch?

Thanks
Ming

---

Comments

Johannes Thumshirn April 13, 2017, 2:50 p.m. UTC | #1
On Thu, Apr 13, 2017 at 10:45:10PM +0800, Ming Lei wrote:
> On Thu, Apr 13, 2017 at 01:53:28PM +0200, Johannes Thumshirn wrote:
> > On Thu, Apr 13, 2017 at 06:02:21PM +0800, Ming Lei wrote:
> > > On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> > > > Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
> > > > in nvme_setup_prps() because the dma_len will drop below zero but the
> > > > length not.
> > > 
> > > Looks I can't reproduce the issue in QEMU(32G nvme, either partitioned
> > > or not, just use 'mkfs.btrfs /dev/nvme0n1p1'), could you share the exact
> > > mkfs command line and size of your emulated NVMe?
> > 
> > the exact cmdline is mkfs.btrfs -f /dev/nvme0n1p1 (-f because there was a
> > existing btrfs on the image). The image is 17179869184 (a.k.a 16G) bytes.
> > 
> > [...]
> > 
> > > Could you try the following patch to see if it fixes your issue?
> > 
> > It's back to the old, erratic behaviour, see log below.
> 
> Johannes, could you test the following patch?
> 
> Thanks
> Ming

Works, awesome thanks!

Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Andreas Mohr April 13, 2017, 8:35 p.m. UTC | #2
On Thu, Apr 13, 2017 at 10:45:10PM +0800, Ming Lei wrote:
> +		/*
> +		 * don't merge if the 1st bio starts with non-zero
> +		 * offset, otherwise it is quite difficult to respect
> +		 * sg gap limit. We work hard to merge huge of small
> +		 * bios in case of mkfs.

"huge of small bios in case" - -ENOPARSE.

"huge numbers of"?
"huge or small"?

HTH,

Andreas Mohr
Ming Lei April 14, 2017, 1:15 a.m. UTC | #3
On Thu, Apr 13, 2017 at 10:35:17PM +0200, Andreas Mohr wrote:
> On Thu, Apr 13, 2017 at 10:45:10PM +0800, Ming Lei wrote:
> > +		/*
> > +		 * don't merge if the 1st bio starts with non-zero
> > +		 * offset, otherwise it is quite difficult to respect
> > +		 * sg gap limit. We work hard to merge huge of small
> > +		 * bios in case of mkfs.
> 
> "huge of small bios in case" - -ENOPARSE.
> 
> "huge numbers of"?
> "huge or small"?

OK, it should be tons of or sort of description. As you can see
in the trace, there are 2560 bios merged in one request.

Thanks,
Ming
diff mbox

Patch

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b75d6fe5a1b9..cc68dfaef528 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1672,12 +1672,27 @@  static inline bool bios_segs_mergeable(struct request_queue *q,
 	return true;
 }
 
-static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
-			 struct bio *next)
+static inline bool bio_will_gap(struct request_queue *q,
+				struct request *prev_rq,
+				struct bio *prev,
+				struct bio *next)
 {
 	if (bio_has_data(prev) && queue_virt_boundary(q)) {
 		struct bio_vec pb, nb;
 
+		/*
+		 * don't merge if the 1st bio starts with non-zero
+		 * offset, otherwise it is quite difficult to respect
+		 * sg gap limit. We work hard to merge huge of small
+		 * bios in case of mkfs.
+		 */
+		if (prev_rq)
+			bio_get_first_bvec(prev_rq->bio, &pb);
+		else
+			bio_get_first_bvec(prev, &pb);
+		if (pb.bv_offset)
+			return true;
+
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
@@ -1690,12 +1705,12 @@  static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 
 static inline bool req_gap_back_merge(struct request *req, struct bio *bio)
 {
-	return bio_will_gap(req->q, req->biotail, bio);
+	return bio_will_gap(req->q, req, req->biotail, bio);
 }
 
 static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
 {
-	return bio_will_gap(req->q, bio, req->bio);
+	return bio_will_gap(req->q, NULL, bio, req->bio);
 }
 
 static inline void blk_dump_rq(const struct request *req, const char *msg)