[0/3] Optimize wait_for_overlap

Message ID 20240827153536.6743-1-artur.paszkiewicz@intel.com

Message

Artur Paszkiewicz Aug. 27, 2024, 3:35 p.m. UTC
The wait_for_overlap wait queue is currently used in two cases, which
are not really related:
 - waiting for actual overlapping bios, which uses R5_Overlap bit,
 - waiting for events related to reshape.

Handling every write request in raid5_make_request() involves adding to
and removing from this wait queue, which uses a spinlock. With fast
storage and multiple submitting threads the contention on this lock is
noticeable.
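
For context, the hot path before this series looks roughly like the sketch
below. This is a simplified illustration rather than the actual raid5.c code:
stripe_can_add_bio() and add_bio_to_stripe() are hypothetical stand-ins for
the real overlap check and bio attachment. The point is that prepare_to_wait()
and finish_wait() each take the wait queue head's internal spinlock, even when
no overlap ever occurs.

/*
 * Simplified sketch of the pre-series pattern (illustrative only;
 * stripe_can_add_bio() and add_bio_to_stripe() are hypothetical helpers).
 */
#include <linux/sched.h>
#include <linux/wait.h>
#include "raid5.h"

static bool stripe_can_add_bio(struct stripe_head *sh, struct bio *bi);
static void add_bio_to_stripe(struct stripe_head *sh, struct bio *bi);

static void submit_one_stripe(struct r5conf *conf, struct stripe_head *sh,
			      struct bio *bi)
{
	DEFINE_WAIT(w);

	for (;;) {
		/* Takes the wait queue spinlock on every write, overlap or not. */
		prepare_to_wait(&conf->wait_for_overlap, &w,
				TASK_UNINTERRUPTIBLE);
		if (stripe_can_add_bio(sh, bi))
			break;
		schedule();	/* a real overlap: sleep until woken */
	}
	add_bio_to_stripe(sh, bi);
	/* Takes the same spinlock again on the way out. */
	finish_wait(&conf->wait_for_overlap, &w);
}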

This patch series aims to resolve this by separating the two cases
mentioned above and using this wait queue only when reshape is in
progress. 
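
A minimal sketch of that direction is shown below (illustrative, not the
literal patches): overlap waiters sleep on the per-device R5_Overlap bit via
wait_on_bit(), the waker clears the bit and calls wake_up_bit(), and the
shared wait queue is only used while a reshape is running. The
reshape_wait_needed() helper is hypothetical; the real condition is checked
in raid5_make_request().

/* Illustrative sketch only, not the literal patch contents. */
#include <linux/sched.h>
#include <linux/wait_bit.h>
#include "raid5.h"

/* Waiter: sleep on the per-device overlap bit, no shared wait queue. */
static void wait_for_overlap_bit(struct r5dev *dev)
{
	wait_on_bit(&dev->flags, R5_Overlap, TASK_UNINTERRUPTIBLE);
}

/* Waker: clear the bit, then wake only tasks waiting on that bit. */
static void clear_overlap_and_wake(struct r5dev *dev)
{
	clear_bit(R5_Overlap, &dev->flags);
	smp_mb__after_atomic();	/* order the clear before the wake-up */
	wake_up_bit(&dev->flags, R5_Overlap);
}

/* Hypothetical helper: touch the shared queue only during a reshape. */
static bool reshape_wait_needed(struct r5conf *conf)
{
	return READ_ONCE(conf->reshape_progress) != MaxSector;
}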

The results when testing 4k random writes on raid5 with null_blk
(8 jobs, qd=64, group_thread_cnt=8):
before: 463k IOPS
after:  523k IOPS

The improvement from this series alone is not huge, as this is just one of
the bottlenecks. When applied on top of some other changes I'm working on, it
allowed going from 845k IOPS to 975k IOPS in the same test.

Artur Paszkiewicz (3):
  md/raid5: use wait_on_bit() for R5_Overlap
  md/raid5: only add to wait queue if reshape is in progress
  md/raid5: rename wait_for_overlap to wait_for_reshape

 drivers/md/raid5-cache.c |  6 +--
 drivers/md/raid5.c       | 95 +++++++++++++++++++++-------------------
 drivers/md/raid5.h       |  2 +-
 3 files changed, 52 insertions(+), 51 deletions(-)

Comments

Song Liu Aug. 29, 2024, 6:24 p.m. UTC | #1
On Tue, Aug 27, 2024 at 8:35 AM Artur Paszkiewicz
<artur.paszkiewicz@intel.com> wrote:
>
> The wait_for_overlap wait queue is currently used in two cases, which
> are not really related:
>  - waiting for actual overlapping bios, which uses R5_Overlap bit,
>  - waiting for events related to reshape.
>
> Handling every write request in raid5_make_request() involves adding to
> and removing from this wait queue, which uses a spinlock. With fast
> storage and multiple submitting threads the contention on this lock is
> noticeable.
>
> This patch series aims to resolve this by separating the two cases
> mentioned above and using this wait queue only when reshape is in
> progress.
>
> The results when testing 4k random writes on raid5 with null_blk
> (8 jobs, qd=64, group_thread_cnt=8):
> before: 463k IOPS
> after:  523k IOPS
>
> The improvement from this series alone is not huge, as this is just one of
> the bottlenecks. When applied on top of some other changes I'm working on, it
> allowed going from 845k IOPS to 975k IOPS in the same test.
>
> Artur Paszkiewicz (3):
>   md/raid5: use wait_on_bit() for R5_Overlap
>   md/raid5: only add to wait queue if reshape is in progress
>   md/raid5: rename wait_for_overlap to wait_for_reshape

Thanks for the optimization! Applied the set to md-6.12 branch.

Song