[8/8] block: stop calling blk_queue_bounce for passthrough requests

Message ID 20210331073001.46776-9-hch@lst.de (mailing list archive)
State New, archived
Series [1/8] aha1542: use a local bounce buffer

Commit Message

Christoph Hellwig March 31, 2021, 7:30 a.m. UTC
Instead of overloading the passthrough fast path with the deprecated
block layer bounce buffering, let the users who combine an old,
under-maintained driver with a highmem system pay the price by always
falling back to copies in that case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 block/blk-map.c                    | 116 ++++++++---------------------
 block/bounce.c                     |  11 +--
 drivers/nvme/host/lightnvm.c       |   2 +-
 drivers/target/target_core_pscsi.c |   4 +-
 include/linux/blkdev.h             |   2 +-
 5 files changed, 36 insertions(+), 99 deletions(-)
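
The effect on the mapping helpers, condensed from the hunks below (my
reading; blk_queue_may_bounce() comes from the refactoring patch earlier
in this series): instead of bouncing an already-mapped bio afterwards,
blk_rq_map_user_iov() now decides up front that a copy is needed, so
blk_rq_append_bio() never sees a bio that still has to be bounced.

	/* Condensed from the blk_rq_map_user_iov() hunk below. */
	bool copy = false;

	if (map_data)				/* caller-supplied pages */
		copy = true;
	else if (blk_queue_may_bounce(q))	/* queue may need bounce buffering */
		copy = true;
	else if (iov_iter_alignment(iter) & align)	/* misaligned buffer */
		copy = true;
	else if (queue_virt_boundary(q))	/* virt boundary constraint */
		copy = true;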

Comments

Guenter Roeck April 8, 2021, 9:45 p.m. UTC | #1
On Wed, Mar 31, 2021 at 09:30:01AM +0200, Christoph Hellwig wrote:
> Instead of overloading the passthrough fast path with the deprecated
> block layer bounce buffering, let the users who combine an old,
> under-maintained driver with a highmem system pay the price by always
> falling back to copies in that case.
> 

Hmm, that price is pretty high. When trying to boot sh images from USB,
it results in a traceback, followed by an I/O error, and the root device
fails to open.

WARNING: CPU: 0 PID: 5 at block/blk-map.c:488 blk_rq_append_bio+0xdc/0xf4
Modules linked in:

CPU: 0 PID: 5 Comm: kworker/u2:0 Not tainted 5.12.0-rc6-next-20210408 #1
Workqueue: events_unbound async_run_entry_fn
PC is at blk_rq_append_bio+0xdc/0xf4
PR is at blk_rq_map_kern+0x154/0x2d4
PC  : 8c16b7bc SP  : 8c82bcd8 SR  : 40008000 TEA : c00d9fe0
R0  : 000002bc R1  : 00000001 R2  : 00000020 R3  : 00000020
R4  : 8ca58000 R5  : 8ca3bc80 R6  : 8ff93c20 R7  : 00000000
R8  : 00000000 R9  : 8ca5c800 R10 : 8ca5c808 R11 : 8ca3bc80
R12 : 8ca1ae78 R13 : 00000008 R14 : 00000c00
MACH: 0000aac5 MACL: 2fff92b5 GBR : 00000000 PR  : 8c16b9b0

Call trace:
 [<8c22d658>] __scsi_execute+0x54/0x164
 [<8c26508e>] read_capacity_10+0x7a/0x190
 [<8c1aacb4>] memset+0x0/0x8c
 [<8c22d6e4>] __scsi_execute+0xe0/0x164
 [<8c26625a>] sd_revalidate_disk+0xcc2/0x1bf0
 [<8c13819c>] internal_create_group+0x0/0x2a0
 [<8c21aa10>] device_add+0xec/0x774
 [<8c1ad8ec>] kobject_put+0x0/0xf4
 [<8c138598>] sysfs_create_groups+0x0/0x14
 [<8c1ae0a4>] kobject_set_name_vargs+0x20/0x90
 [<8c267636>] sd_probe+0x296/0x384
 [<8c265598>] sd_revalidate_disk+0x0/0x1bf0
 [<8c21e786>] really_probe+0xc2/0x320
 [<8c011ab8>] arch_local_irq_restore+0x0/0x24
 [<8c21cdac>] bus_for_each_drv+0x50/0xa0
 [<8c011ab8>] arch_local_irq_restore+0x0/0x24
 [<8c1ad4ec>] klist_next+0x0/0xf4
 [<8c21ebf8>] __device_attach_driver+0x0/0x90
 [<8c21e104>] __device_attach_async_helper+0x6c/0x94
 [<8c039cfa>] async_run_entry_fn+0x22/0xa8
 [<8c03190e>] process_one_work+0x13e/0x2b4
 [<8c0321f6>] worker_thread+0x10a/0x388
 [<8c011ab8>] arch_local_irq_restore+0x0/0x24
 [<8c0317d0>] process_one_work+0x0/0x2b4
 [<8c036e4e>] kthread+0xce/0x108
 [<8c042d30>] __init_swait_queue_head+0x0/0x8
 [<8c0320ec>] worker_thread+0x0/0x388
 [<8c0151c4>] ret_from_kernel_thread+0xc/0x14
 [<8c011ab0>] arch_local_save_flags+0x0/0x8
 [<8c011ab8>] arch_local_irq_restore+0x0/0x24
 [<8c03ceac>] schedule_tail+0x0/0x5c
 [<8c036d80>] kthread+0x0/0x108

...
blk_update_request: I/O error, dev sda, sector 2 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
EXT2-fs (sda): error: unable to read superblock
sd 1:0:0:0: [sda] tag#0 access beyond end of device
blk_update_request: I/O error, dev sda, sector 2 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
MINIX-fs: unable to read superblock
sd 1:0:0:0: [sda] tag#0 access beyond end of device
blk_update_request: I/O error, dev sda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
FAT-fs (sda): unable to read boot sector
VFS: Cannot open root device "sda" or unknown-block(8,0): error -5

Reverting this patch fixes the problem.

Guenter

---
# bad: [6145d80cfc62e3ed8f16ff584d6287e6d88b82b9] Add linux-next specific files for 20210408
# good: [e49d033bddf5b565044e2abe4241353959bc9120] Linux 5.12-rc6
git bisect start 'HEAD' 'v5.12-rc6'
# good: [109c72558f12be32f904747573e1f221537c5f17] Merge remote-tracking branch 'crypto/master'
git bisect good 109c72558f12be32f904747573e1f221537c5f17
# bad: [379e1b0ffc22f8e0af810558ae87f742a01169c6] Merge remote-tracking branch 'ftrace/for-next'
git bisect bad 379e1b0ffc22f8e0af810558ae87f742a01169c6
# good: [8bc74be758b51ff45786c1840fd8cf405773e0c6] Merge remote-tracking branch 'sound/for-next'
git bisect good 8bc74be758b51ff45786c1840fd8cf405773e0c6
# bad: [58fea702e23f66773e1de827de44fea3087c453c] Merge remote-tracking branch 'mmc/next'
git bisect bad 58fea702e23f66773e1de827de44fea3087c453c
# good: [73935e931c945b019bde312a99f1b43a0a783fca] Merge series "ASoC: soc-core: tidyup error handling for rtd" from Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>:
git bisect good 73935e931c945b019bde312a99f1b43a0a783fca
# good: [e0956194697c0b319a71406e6c7452555820eeb5] Merge branch 'for-5.13/drivers' into for-next
git bisect good e0956194697c0b319a71406e6c7452555820eeb5
# bad: [8723f253faed6dafc718d0a0b780567f09446a8b] Merge remote-tracking branch 'block/for-next'
git bisect bad 8723f253faed6dafc718d0a0b780567f09446a8b
# good: [73e7f1732e800a88cafab31d75548c6fcfdd8c47] Input: imx_keypad - convert to a DT-only driver
git bisect good 73e7f1732e800a88cafab31d75548c6fcfdd8c47
# good: [8361c6da77b7d267707da9ff3b94458e018dd3da] Merge series "Adds SPI support" from Jiri Prchal <jiri.prchal@aksignal.cz>:
git bisect good 8361c6da77b7d267707da9ff3b94458e018dd3da
# bad: [c8872394ac388b38952dbe89da91bf2b108ce5e6] Merge branch 'for-5.13/libata' into for-next
git bisect bad c8872394ac388b38952dbe89da91bf2b108ce5e6
# good: [ce288e0535688cc3475a3c3d4d96624514c3550c] block: remove BLK_BOUNCE_ISA support
git bisect good ce288e0535688cc3475a3c3d4d96624514c3550c
# good: [7d33004d24dafeedb95b85a271a37aa33678ac0b] pata_legacy: Add `probe_mask' parameter like with ide-generic
git bisect good 7d33004d24dafeedb95b85a271a37aa33678ac0b
# bad: [393bb12e00580aaa23356504eed38d8f5571153a] block: stop calling blk_queue_bounce for passthrough requests
git bisect bad 393bb12e00580aaa23356504eed38d8f5571153a
# good: [9bb33f24abbd0fa2fadad01ec75438d7cc239189] block: refactor the bounce buffering code
git bisect good 9bb33f24abbd0fa2fadad01ec75438d7cc239189
# first bad commit: [393bb12e00580aaa23356504eed38d8f5571153a] block: stop calling blk_queue_bounce for passthrough requests
Christoph Hellwig April 9, 2021, 7:40 a.m. UTC | #2
On Thu, Apr 08, 2021 at 02:45:06PM -0700, Guenter Roeck wrote:
> On Wed, Mar 31, 2021 at 09:30:01AM +0200, Christoph Hellwig wrote:
> > Instead of overloading the passthrough fast path with the deprecated
> > block layer bounce buffering, let the users who combine an old,
> > under-maintained driver with a highmem system pay the price by always
> > falling back to copies in that case.
> > 
> 
> Hmm, that price is pretty high. When trying to boot sh images from USB,
> it results in a traceback, followed by an I/O error, and the root device
> fails to open.

That's just because this warning is completely bogus, sorry.

Does this patch fix the boot for you?

diff --git a/block/blk-map.c b/block/blk-map.c
index dac78376acc899..3743158ddaeb76 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -485,9 +485,6 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
 	struct bio_vec bv;
 	unsigned int nr_segs = 0;
 
-	if (WARN_ON_ONCE(rq->q->limits.bounce != BLK_BOUNCE_NONE))
-		return -EINVAL;
-
 	bio_for_each_bvec(bv, bio, iter)
 		nr_segs++;
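
(For context, my reading of why the warning misfires, not spelled out in
the thread: on highmem configurations the storage driver still declares a
bounce limit for the whole queue, while the mapping helpers already fall
back to copying in exactly that case, so the bio that reaches
blk_rq_append_bio() is bounce-safe even though q->limits.bounce is not
BLK_BOUNCE_NONE. Roughly:

	/* blk_rq_map_kern() when blk_queue_may_bounce(q) is true: */
	bio = bio_copy_kern(q, kbuf, len, gfp_mask, reading);
	/* the data has already been copied; nothing is left to bounce */
	ret = blk_rq_append_bio(rq, bio);
	/* ...but the WARN_ON_ONCE fired anyway, because it tested the
	 * queue-wide limit rather than the bio itself */

Dropping the check, as the patch above does, lets these already-copied
bios through.)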
Guenter Roeck April 9, 2021, 1:04 p.m. UTC | #3
On 4/9/21 12:40 AM, Christoph Hellwig wrote:
> On Thu, Apr 08, 2021 at 02:45:06PM -0700, Guenter Roeck wrote:
>> On Wed, Mar 31, 2021 at 09:30:01AM +0200, Christoph Hellwig wrote:
>>> Instead of overloading the passthrough fast path with the deprecated
>>> block layer bounce buffering, let the users who combine an old,
>>> under-maintained driver with a highmem system pay the price by always
>>> falling back to copies in that case.
>>>
>>
>> Hmm, that price is pretty high. When trying to boot sh images from USB,
>> it results in a traceback, followed by an I/O error, and the root device
>> fails to open.
> 
> That's just because this warning is completely bogus, sorry.
> 
> Does this patch fix the boot for you?
> 

Yes it does.

Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks,
Guenter

> diff --git a/block/blk-map.c b/block/blk-map.c
> index dac78376acc899..3743158ddaeb76 100644
> --- a/block/blk-map.c
> +++ b/block/blk-map.c
> @@ -485,9 +485,6 @@ int blk_rq_append_bio(struct request *rq, struct bio *bio)
>  	struct bio_vec bv;
>  	unsigned int nr_segs = 0;
>  
> -	if (WARN_ON_ONCE(rq->q->limits.bounce != BLK_BOUNCE_NONE))
> -		return -EINVAL;
> -
>  	bio_for_each_bvec(bv, bio, iter)
>  		nr_segs++;
>  
>
Patch

diff --git a/block/blk-map.c b/block/blk-map.c
index b62b52dcb61d97..dac78376acc899 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -123,7 +123,6 @@  static int bio_uncopy_user(struct bio *bio)
 			bio_free_pages(bio);
 	}
 	kfree(bmd);
-	bio_put(bio);
 	return ret;
 }
 
@@ -132,7 +131,7 @@  static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data,
 {
 	struct bio_map_data *bmd;
 	struct page *page;
-	struct bio *bio, *bounce_bio;
+	struct bio *bio;
 	int i = 0, ret;
 	int nr_pages;
 	unsigned int len = iter->count;
@@ -218,16 +217,9 @@  static int bio_copy_user_iov(struct request *rq, struct rq_map_data *map_data,
 
 	bio->bi_private = bmd;
 
-	bounce_bio = bio;
-	ret = blk_rq_append_bio(rq, &bounce_bio);
+	ret = blk_rq_append_bio(rq, bio);
 	if (ret)
 		goto cleanup;
-
-	/*
-	 * We link the bounce buffer in and could have to traverse it later, so
-	 * we have to get a ref to prevent it from being freed
-	 */
-	bio_get(bounce_bio);
 	return 0;
 cleanup:
 	if (!map_data)
@@ -242,7 +234,7 @@  static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 		gfp_t gfp_mask)
 {
 	unsigned int max_sectors = queue_max_hw_sectors(rq->q);
-	struct bio *bio, *bounce_bio;
+	struct bio *bio;
 	int ret;
 	int j;
 
@@ -304,49 +296,17 @@  static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
 			break;
 	}
 
-	/*
-	 * Subtle: if we end up needing to bounce a bio, it would normally
-	 * disappear when its bi_end_io is run.  However, we need the original
-	 * bio for the unmap, so grab an extra reference to it
-	 */
-	bio_get(bio);
-
-	bounce_bio = bio;
-	ret = blk_rq_append_bio(rq, &bounce_bio);
+	ret = blk_rq_append_bio(rq, bio);
 	if (ret)
-		goto out_put_orig;
-
-	/*
-	 * We link the bounce buffer in and could have to traverse it
-	 * later, so we have to get a ref to prevent it from being freed
-	 */
-	bio_get(bounce_bio);
+		goto out_unmap;
 	return 0;
 
- out_put_orig:
-	bio_put(bio);
  out_unmap:
 	bio_release_pages(bio, false);
 	bio_put(bio);
 	return ret;
 }
 
-/**
- *	bio_unmap_user	-	unmap a bio
- *	@bio:		the bio being unmapped
- *
- *	Unmap a bio previously mapped by bio_map_user_iov(). Must be called from
- *	process context.
- *
- *	bio_unmap_user() may sleep.
- */
-static void bio_unmap_user(struct bio *bio)
-{
-	bio_release_pages(bio, bio_data_dir(bio) == READ);
-	bio_put(bio);
-	bio_put(bio);
-}
-
 static void bio_invalidate_vmalloc_pages(struct bio *bio)
 {
 #ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
@@ -519,33 +479,27 @@  static struct bio *bio_copy_kern(struct request_queue *q, void *data,
  * Append a bio to a passthrough request.  Only works if the bio can be merged
  * into the request based on the driver constraints.
  */
-int blk_rq_append_bio(struct request *rq, struct bio **bio)
+int blk_rq_append_bio(struct request *rq, struct bio *bio)
 {
-	struct bio *orig_bio = *bio;
 	struct bvec_iter iter;
 	struct bio_vec bv;
 	unsigned int nr_segs = 0;
 
-	blk_queue_bounce(rq->q, bio);
+	if (WARN_ON_ONCE(rq->q->limits.bounce != BLK_BOUNCE_NONE))
+		return -EINVAL;
 
-	bio_for_each_bvec(bv, *bio, iter)
+	bio_for_each_bvec(bv, bio, iter)
 		nr_segs++;
 
 	if (!rq->bio) {
-		blk_rq_bio_prep(rq, *bio, nr_segs);
+		blk_rq_bio_prep(rq, bio, nr_segs);
 	} else {
-		if (!ll_back_merge_fn(rq, *bio, nr_segs)) {
-			if (orig_bio != *bio) {
-				bio_put(*bio);
-				*bio = orig_bio;
-			}
+		if (!ll_back_merge_fn(rq, bio, nr_segs))
 			return -EINVAL;
-		}
-
-		rq->biotail->bi_next = *bio;
-		rq->biotail = *bio;
-		rq->__data_len += (*bio)->bi_iter.bi_size;
-		bio_crypt_free_ctx(*bio);
+		rq->biotail->bi_next = bio;
+		rq->biotail = bio;
+		rq->__data_len += (bio)->bi_iter.bi_size;
+		bio_crypt_free_ctx(bio);
 	}
 
 	return 0;
@@ -566,12 +520,6 @@  EXPORT_SYMBOL(blk_rq_append_bio);
  *
  *    A matching blk_rq_unmap_user() must be issued at the end of I/O, while
  *    still in process context.
- *
- *    Note: The mapped bio may need to be bounced through blk_queue_bounce()
- *    before being submitted to the device, as pages mapped may be out of
- *    reach. It's the callers responsibility to make sure this happens. The
- *    original bio must be passed back in to blk_rq_unmap_user() for proper
- *    unmapping.
  */
 int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 			struct rq_map_data *map_data,
@@ -588,6 +536,8 @@  int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 
 	if (map_data)
 		copy = true;
+	else if (blk_queue_may_bounce(q))
+		copy = true;
 	else if (iov_iter_alignment(iter) & align)
 		copy = true;
 	else if (queue_virt_boundary(q))
@@ -641,25 +591,21 @@  EXPORT_SYMBOL(blk_rq_map_user);
  */
 int blk_rq_unmap_user(struct bio *bio)
 {
-	struct bio *mapped_bio;
+	struct bio *next_bio;
 	int ret = 0, ret2;
 
 	while (bio) {
-		mapped_bio = bio;
-		if (unlikely(bio_flagged(bio, BIO_BOUNCED)))
-			mapped_bio = bio->bi_private;
-
 		if (bio->bi_private) {
-			ret2 = bio_uncopy_user(mapped_bio);
+			ret2 = bio_uncopy_user(bio);
 			if (ret2 && !ret)
 				ret = ret2;
 		} else {
-			bio_unmap_user(mapped_bio);
+			bio_release_pages(bio, bio_data_dir(bio) == READ);
 		}
 
-		mapped_bio = bio;
+		next_bio = bio;
 		bio = bio->bi_next;
-		bio_put(mapped_bio);
+		bio_put(next_bio);
 	}
 
 	return ret;
@@ -684,7 +630,7 @@  int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 {
 	int reading = rq_data_dir(rq) == READ;
 	unsigned long addr = (unsigned long) kbuf;
-	struct bio *bio, *orig_bio;
+	struct bio *bio;
 	int ret;
 
 	if (len > (queue_max_hw_sectors(q) << 9))
@@ -692,7 +638,8 @@  int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 	if (!len || !kbuf)
 		return -EINVAL;
 
-	if (!blk_rq_aligned(q, addr, len) || object_is_on_stack(kbuf))
+	if (!blk_rq_aligned(q, addr, len) || object_is_on_stack(kbuf) ||
+	    blk_queue_may_bounce(q))
 		bio = bio_copy_kern(q, kbuf, len, gfp_mask, reading);
 	else
 		bio = bio_map_kern(q, kbuf, len, gfp_mask);
@@ -703,14 +650,9 @@  int blk_rq_map_kern(struct request_queue *q, struct request *rq, void *kbuf,
 	bio->bi_opf &= ~REQ_OP_MASK;
 	bio->bi_opf |= req_op(rq);
 
-	orig_bio = bio;
-	ret = blk_rq_append_bio(rq, &bio);
-	if (unlikely(ret)) {
-		/* request is too big */
-		bio_put(orig_bio);
-		return ret;
-	}
-
-	return 0;
+	ret = blk_rq_append_bio(rq, bio);
+	if (unlikely(ret))
+		bio_put(bio);
+	return ret;
 }
 EXPORT_SYMBOL(blk_rq_map_kern);
diff --git a/block/bounce.c b/block/bounce.c
index 6bafc0d1f867a1..94081e013c58cc 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -180,12 +180,8 @@  static struct bio *bounce_clone_bio(struct bio *bio_src)
 	 *    asking for trouble and would force extra work on
 	 *    __bio_clone_fast() anyways.
 	 */
-	if (bio_is_passthrough(bio_src))
-		bio = bio_kmalloc(GFP_NOIO | __GFP_NOFAIL,
-				  bio_segments(bio_src));
-	else
-		bio = bio_alloc_bioset(GFP_NOIO, bio_segments(bio_src),
-				       &bounce_bio_set);
+	bio = bio_alloc_bioset(GFP_NOIO, bio_segments(bio_src),
+			       &bounce_bio_set);
 	bio->bi_bdev		= bio_src->bi_bdev;
 	if (bio_flagged(bio_src, BIO_REMAPPED))
 		bio_set_flag(bio, BIO_REMAPPED);
@@ -245,8 +241,7 @@  void __blk_queue_bounce(struct request_queue *q, struct bio **bio_orig)
 	if (!bounce)
 		return;
 
-	if (!bio_is_passthrough(*bio_orig) &&
-	    sectors < bio_sectors(*bio_orig)) {
+	if (sectors < bio_sectors(*bio_orig)) {
 		bio = bio_split(*bio_orig, sectors, GFP_NOIO, &bounce_bio_split);
 		bio_chain(bio, *bio_orig);
 		submit_bio_noacct(*bio_orig);
diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c
index b705988629f224..f6ca2fbb711e98 100644
--- a/drivers/nvme/host/lightnvm.c
+++ b/drivers/nvme/host/lightnvm.c
@@ -660,7 +660,7 @@  static struct request *nvme_nvm_alloc_request(struct request_queue *q,
 	rq->cmd_flags &= ~REQ_FAILFAST_DRIVER;
 
 	if (rqd->bio)
-		blk_rq_append_bio(rq, &rqd->bio);
+		blk_rq_append_bio(rq, rqd->bio);
 	else
 		rq->ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
 
diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 3cbc074992bc86..7df4a9c9c7ffaa 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -911,7 +911,7 @@  pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 					" %d i: %d bio: %p, allocating another"
 					" bio\n", bio->bi_vcnt, i, bio);
 
-				rc = blk_rq_append_bio(req, &bio);
+				rc = blk_rq_append_bio(req, bio);
 				if (rc) {
 					pr_err("pSCSI: failed to append bio\n");
 					goto fail;
@@ -930,7 +930,7 @@  pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 	}
 
 	if (bio) {
-		rc = blk_rq_append_bio(req, &bio);
+		rc = blk_rq_append_bio(req, bio);
 		if (rc) {
 			pr_err("pSCSI: failed to append bio\n");
 			goto fail;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 55cc8b96c84427..d5d320da51f8bf 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -909,7 +909,7 @@  extern int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 extern void blk_rq_unprep_clone(struct request *rq);
 extern blk_status_t blk_insert_cloned_request(struct request_queue *q,
 				     struct request *rq);
-extern int blk_rq_append_bio(struct request *rq, struct bio **bio);
+int blk_rq_append_bio(struct request *rq, struct bio *bio);
 extern void blk_queue_split(struct bio **);
 extern int scsi_verify_blk_ioctl(struct block_device *, unsigned int);
 extern int scsi_cmd_blk_ioctl(struct block_device *, fmode_t,
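
Taken together, the signature change means blk_rq_append_bio() no longer
substitutes a bounce bio behind the caller's back: the caller passes the
bio directly and keeps ownership, putting it if the append fails. A
minimal sketch of the new calling convention, following the
blk_rq_map_kern() hunk above (setup and validation elided):

	struct bio *bio;
	int ret;

	bio = bio_map_kern(q, kbuf, len, gfp_mask);
	if (IS_ERR(bio))
		return PTR_ERR(bio);

	ret = blk_rq_append_bio(rq, bio);	/* takes struct bio *, not struct bio ** */
	if (unlikely(ret))
		bio_put(bio);			/* on failure the bio is still the caller's */
	return ret;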