diff mbox series

[RESEND] md: raid0: account for split bio in iostat accounting

Message ID 20230814205347.17891-1-djeffery@redhat.com (mailing list archive)
State New, archived
Headers show
Series [RESEND] md: raid0: account for split bio in iostat accounting | expand

Commit Message

David Jeffery Aug. 14, 2023, 8:53 p.m. UTC
When a bio is split by md raid0, the newly created bio will not be tracked
by md for I/O accounting. Only the portion of I/O still assigned to the
original bio which was reduced by the split will be accounted for. This
results in md iostat data sometimes showing I/O values far below the actual
amount of data being sent through md.

md_account_bio() needs to be called for all bio generated by the bio split.

Fixes: 10764815ff47 ("md: add io accounting for raid0 and raid5")
Signed-off-by: David Jeffery <djeffery@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---

No change to the patch itself. Added Fixes and Reviewed-by lines.

A simple example of the issue was generated using a raid0 device on partitions
to the same device. Since all raid0 I/O then goes to one device, it makes it
easy to see a gap between the md device and its sd storage. Reading an lvm
device on top of the md device, the iostat output (some 0 columns and extra
devices removed to make the data more compact) was:

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read
md2               0.00         0.00         0.00         0.00          0
sde               0.00         0.00         0.00         0.00          0
md2            1364.00    411496.00         0.00         0.00     411496
sde            1734.00    646144.00         0.00         0.00     646144
md2            1699.00    510680.00         0.00         0.00     510680
sde            2155.00    802784.00         0.00         0.00     802784
md2             803.00    241480.00         0.00         0.00     241480
sde            1016.00    377888.00         0.00         0.00     377888
md2               0.00         0.00         0.00         0.00          0
sde               0.00         0.00         0.00         0.00          0

I/O was generated doing large direct I/O reads (12M) with dd to a linear
lvm volume on top of the 4 leg raid0 device.

The md2 reads were showing as roughly 2/3 of the reads to the sde device
containing all of md2's raid partitions. The sum of reads to sde was
1826816 kB, which was the expected amount as it was the amount read by
dd. With the patch, the total reads from md will match the reads from
sde and be consistent with the amount of I/O generated.

 drivers/md/raid0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Song Liu Aug. 15, 2023, 8:08 a.m. UTC | #1
On Tue, Aug 15, 2023 at 4:54 AM David Jeffery <djeffery@redhat.com> wrote:
>
> When a bio is split by md raid0, the newly created bio will not be tracked
> by md for I/O accounting. Only the portion of I/O still assigned to the
> original bio which was reduced by the split will be accounted for. This
> results in md iostat data sometimes showing I/O values far below the actual
> amount of data being sent through md.
>
> md_account_bio() needs to be called for all bio generated by the bio split.
>
> Fixes: 10764815ff47 ("md: add io accounting for raid0 and raid5")
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> Tested-by: Laurence Oberman <loberman@redhat.com>
> Reviewed-by: Laurence Oberman <loberman@redhat.com>
> Reviewed-by: Yu Kuai <yukuai3@huawei.com>

It appears this patch conflicts with Jan's set:

https://lore.kernel.org/linux-raid/20230814091452.9670-1-jack@suse.cz/

Please rebase on top of md-next and resubmit the patch:

https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/log/?h=md-next

> ---
>
> No change to the patch itself. Added Fixes and Reviewed-by lines.
>
> A simple example of the issue was generated using a raid0 device on partitions
> to the same device. Since all raid0 I/O then goes to one device, it makes it
> easy to see a gap between the md device and its sd storage. Reading an lvm
> device on top of the md device, the iostat output (some 0 columns and extra
> devices removed to make the data more compact) was:
>
> Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read
> md2               0.00         0.00         0.00         0.00          0
> sde               0.00         0.00         0.00         0.00          0
> md2            1364.00    411496.00         0.00         0.00     411496
> sde            1734.00    646144.00         0.00         0.00     646144
> md2            1699.00    510680.00         0.00         0.00     510680
> sde            2155.00    802784.00         0.00         0.00     802784
> md2             803.00    241480.00         0.00         0.00     241480
> sde            1016.00    377888.00         0.00         0.00     377888
> md2               0.00         0.00         0.00         0.00          0
> sde               0.00         0.00         0.00         0.00          0
>
> I/O was generated doing large direct I/O reads (12M) with dd to a linear
> lvm volume on top of the 4 leg raid0 device.
>
> The md2 reads were showing as roughly 2/3 of the reads to the sde device
> containing all of md2's raid partitions. The sum of reads to sde was
> 1826816 kB, which was the expected amount as it was the amount read by
> dd. With the patch, the total reads from md will match the reads from
> sde and be consistent with the amount of I/O generated.

We can include this part in the commit log to keep more context for future
reference.

Thanks,
Song
diff mbox series

Patch

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index d1ac73fcd852..1fd559ac8c68 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -597,8 +597,7 @@  static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
 		bio = split;
 	}
 
-	if (bio->bi_pool != &mddev->bio_set)
-		md_account_bio(mddev, &bio);
+	md_account_bio(mddev, &bio);
 
 	orig_sector = sector;
 	zone = find_zone(mddev->private, &sector);