diff mbox series

dm rq: clear cloned bio ->bi_bdev to fix I/O accounting

Message ID 1642183128-17875-1-git-send-email-bmarzins@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Mike Snitzer
Headers show
Series dm rq: clear cloned bio ->bi_bdev to fix I/O accounting | expand

Commit Message

Benjamin Marzinski Jan. 14, 2022, 5:58 p.m. UTC
bio_clone_fast() sets the cloned bio to have the same ->bi_bdev as the
source bio. This means that when request-based dm called setup_clone(),
the cloned bio had its ->bi_bdev pointing to the dm device. After Commit
0b6e522cdc4a ("blk-mq: use ->bi_bdev for I/O accounting")
__blk_account_io_start() started using the request's ->bio->bi_bdev for
I/O accounting, if it was set. This caused IO going to the underlying
devices to use the dm device for their I/O accounting.

Request-based dm can't be used on top of partitions, so
dm_rq_bio_constructor() can just clear the cloned bio's ->bi_bdev and
have __blk_account_io_start() fall back to using rq->rq_disk->part0 for
the I/O accounting.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
---
 drivers/md/dm-rq.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Ming Lei Jan. 16, 2022, 5:52 a.m. UTC | #1
On Sat, Jan 15, 2022 at 1:59 AM Benjamin Marzinski <bmarzins@redhat.com> wrote:
>
> bio_clone_fast() sets the cloned bio to have the same ->bi_bdev as the
> source bio. This means that when request-based dm called setup_clone(),
> the cloned bio had its ->bi_bdev pointing to the dm device. After Commit
> 0b6e522cdc4a ("blk-mq: use ->bi_bdev for I/O accounting")
> __blk_account_io_start() started using the request's ->bio->bi_bdev for
> I/O accounting, if it was set. This caused IO going to the underlying
> devices to use the dm device for their I/O accounting.
>
> Request-based dm can't be used on top of partitions, so
> dm_rq_bio_constructor() can just clear the cloned bio's ->bi_bdev and
> have __blk_account_io_start() fall back to using rq->rq_disk->part0 for
> the I/O accounting.
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  drivers/md/dm-rq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
> index 579ab6183d4d..42099dc76e3c 100644
> --- a/drivers/md/dm-rq.c
> +++ b/drivers/md/dm-rq.c
> @@ -328,6 +328,7 @@ static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
>         info->orig = bio_orig;
>         info->tio = tio;
>         bio->bi_end_io = end_clone_bio;
> +       bio->bi_bdev = NULL;

Reviewed-by: Ming Lei <ming.lei@redhat.com>

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Christoph Hellwig Jan. 17, 2022, 7:51 a.m. UTC | #2
So I actually noticed this during code inspection a while ago, but I
think we need to use the actual underlying device instead of NULL here
to keep our block layer gurantees.  See the patch in my queue below.

---
>From 1e330b8e57fc0d6c6fb07c0ec2915dca5d7a585a Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Thu, 13 Jan 2022 10:53:59 +0100
Subject: block: assign bi_bdev for cloned bios in blk_rq_prep_clone

The cloned bios for the cloned request in blk_rq_prep_clone currently
still point to the original bi_bdev.  This is harmless because dm-mpath
doesn't look at bi_bdev but violates the invariants of always having
the correct bi_bdev.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a6d4780580fcd..b5e35e63adad4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2976,6 +2976,7 @@ int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
 		bio = bio_clone_fast(bio_src, gfp_mask, bs);
 		if (!bio)
 			goto free_and_out;
+		bio->bi_bdev = rq->q->disk->part0;
 
 		if (bio_ctr && bio_ctr(bio, bio_src, data))
 			goto free_and_out;
Benjamin Marzinski Jan. 17, 2022, 4:27 p.m. UTC | #3
On Sun, Jan 16, 2022 at 11:51:17PM -0800, Christoph Hellwig wrote:
> So I actually noticed this during code inspection a while ago, but I
> think we need to use the actual underlying device instead of NULL here
> to keep our block layer gurantees.  See the patch in my queue below.

Makes sense.

> ---
> >From 1e330b8e57fc0d6c6fb07c0ec2915dca5d7a585a Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Thu, 13 Jan 2022 10:53:59 +0100
> Subject: block: assign bi_bdev for cloned bios in blk_rq_prep_clone
> 
> The cloned bios for the cloned request in blk_rq_prep_clone currently
> still point to the original bi_bdev.  This is harmless because dm-mpath
> doesn't look at bi_bdev but violates the invariants of always having
> the correct bi_bdev.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-mq.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index a6d4780580fcd..b5e35e63adad4 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2976,6 +2976,7 @@ int blk_rq_prep_clone(struct request *rq, struct request *rq_src,
>  		bio = bio_clone_fast(bio_src, gfp_mask, bs);
>  		if (!bio)
>  			goto free_and_out;
> +		bio->bi_bdev = rq->q->disk->part0;
>  
>  		if (bio_ctr && bio_ctr(bio, bio_src, data))
>  			goto free_and_out;
> -- 
> 2.30.2

Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Ming Lei Jan. 18, 2022, 12:30 a.m. UTC | #4
On Sun, Jan 16, 2022 at 11:51:17PM -0800, Christoph Hellwig wrote:
> So I actually noticed this during code inspection a while ago, but I
> think we need to use the actual underlying device instead of NULL here
> to keep our block layer gurantees.  See the patch in my queue below.
> 
> ---
> From 1e330b8e57fc0d6c6fb07c0ec2915dca5d7a585a Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Thu, 13 Jan 2022 10:53:59 +0100
> Subject: block: assign bi_bdev for cloned bios in blk_rq_prep_clone
> 
> The cloned bios for the cloned request in blk_rq_prep_clone currently
> still point to the original bi_bdev.  This is harmless because dm-mpath

It breaks io accounting, so not harmless, but the code in this patch is
fine:

Reviewed-by: Ming Lei <ming.lei@redhat.com>


Thanks,
Ming

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Christoph Hellwig Jan. 18, 2022, 7 a.m. UTC | #5
On Tue, Jan 18, 2022 at 08:30:16AM +0800, Ming Lei wrote:
> On Sun, Jan 16, 2022 at 11:51:17PM -0800, Christoph Hellwig wrote:
> > So I actually noticed this during code inspection a while ago, but I
> > think we need to use the actual underlying device instead of NULL here
> > to keep our block layer gurantees.  See the patch in my queue below.
> > 
> > ---
> > From 1e330b8e57fc0d6c6fb07c0ec2915dca5d7a585a Mon Sep 17 00:00:00 2001
> > From: Christoph Hellwig <hch@lst.de>
> > Date: Thu, 13 Jan 2022 10:53:59 +0100
> > Subject: block: assign bi_bdev for cloned bios in blk_rq_prep_clone
> > 
> > The cloned bios for the cloned request in blk_rq_prep_clone currently
> > still point to the original bi_bdev.  This is harmless because dm-mpath
> 
> It breaks io accounting, so not harmless, but the code in this patch is
> fine:

That's what I thought at the time.  Based on the patch from
Benjamin that is obviously not true, so I'll reword the commit message
and add a Fixes tag.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
diff mbox series

Patch

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 579ab6183d4d..42099dc76e3c 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -328,6 +328,7 @@  static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
 	info->orig = bio_orig;
 	info->tio = tio;
 	bio->bi_end_io = end_clone_bio;
+	bio->bi_bdev = NULL;
 
 	return 0;
 }