Message ID: 20240222075806.1816400-1-yukuai1@huaweicloud.com (mailing list archive)
Series: md/raid1: refactor read_balance() and some minor fix
Dear Kuai, dear Paul,

Thank you for your work. Some nits and a request for benchmarks below.

On 22.02.24 at 08:57, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> The orignial idea is that Paul want to optimize raid1 read

original

> performance([1]), however, we think that the orignial code for

original

> read_balance() is quite complex, and we don't want to add more
> complexity. Hence we decide to refactor read_balance() first, to make
> code cleaner and easier for follow up.
>
> Before this patchset, read_balance() has many local variables and many
> braches, it want to consider all the scenarios in one iteration. The

branches

> idea of this patch is to devide them into 4 different steps:

divide

>
> 1) If resync is in progress, find the first usable disk, patch 5;
> Otherwise:
> 2) Loop through all disks and skipping slow disks and disks with bad
> blocks, choose the best disk, patch 10. If no disk is found:
> 3) Look for disks with bad blocks and choose the one with most number of
> sectors, patch 8. If no disk is found:
> 4) Choose first found slow disk with no bad blocks, or slow disk with
> most number of sectors, patch 7.
>
> Note that step 3) and step 4) are super code path, and performance
> should not be considered.
>
> And after this patchset, we'll continue to optimize read_balance for
> step 2), specifically how to choose the best rdev to read.

Is there a change in performance with the current patch set? Is raid1
well enough covered by the test suite?
Kind regards,

Paul

> [1] https://lore.kernel.org/all/20240102125115.129261-1-paul.e.luse@linux.intel.com/
>
> Yu Kuai (10):
>   md: add a new helper rdev_has_badblock()
>   md: record nonrot rdevs while adding/removing rdevs to conf
>   md/raid1: fix choose next idle in read_balance()
>   md/raid1-10: add a helper raid1_check_read_range()
>   md/raid1-10: factor out a new helper raid1_should_read_first()
>   md/raid1: factor out read_first_rdev() from read_balance()
>   md/raid1: factor out choose_slow_rdev() from read_balance()
>   md/raid1: factor out choose_bb_rdev() from read_balance()
>   md/raid1: factor out the code to manage sequential IO
>   md/raid1: factor out helpers to choose the best rdev from
>     read_balance()
>
>  drivers/md/md.c       |  28 ++-
>  drivers/md/md.h       |  12 ++
>  drivers/md/raid1-10.c |  69 +++++++
>  drivers/md/raid1.c    | 454 ++++++++++++++++++++++++------------------
>  drivers/md/raid10.c   |  66 ++----
>  drivers/md/raid5.c    |  35 ++--
>  6 files changed, 402 insertions(+), 262 deletions(-)
Hi,

On 2024/02/22 16:40, Paul Menzel wrote:
> Is there a change in performance with the current patch set? Is raid1
> well enough covered by the test suite?

Yes, there is no performance degradation, and the mdadm tests passed. And
Paul Luse also ran a fio mixed workload with data integrity checking, and
it passes.

Thanks,
Kuai
> On Feb 22, 2024, at 2:08 AM, Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2024/02/22 16:40, Paul Menzel wrote:
>> Is there a change in performance with the current patch set? Is raid1
>> well enough covered by the test suite?
>
> Yes, there is no performance degradation, and the mdadm tests passed. And
> Paul Luse also ran a fio mixed workload with data integrity checking, and
> it passes.
>
> Thanks,
> Kuai

Kuai is correct. In my original perf improvement patch I included lots of
results. For this set, where we just refactored, I checked performance to
assure we didn't go downhill, but didn't save the results, as the deltas
were in the noise. After this series lands we will look at introducing
performance improvements again, and at that time results from a full
performance sweep will be included.

For data integrity, 1- and 2-disk mirrors were run overnight with fio and a
crc32 check; no issues.

Assuring that other code paths execute as they did before was a little
trickier without a unit test framework, but for those cases I did
modify/un-modify the code several times to follow the various code paths
and assure they're working as expected (i.e. bad blocks, etc.).

-Paul
Dear Paul, dear Kuai,

On 22.02.24 at 14:04, Luse, Paul E wrote:
>> On Feb 22, 2024, at 2:08 AM, Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> On 2024/02/22 16:40, Paul Menzel wrote:
>>> Is there a change in performance with the current patch set? Is
>>> raid1 well enough covered by the test suite?
>>
>> Yes, there is no performance degradation, and the mdadm tests passed.
>> And Paul Luse also ran a fio mixed workload with data integrity
>> checking, and it passes.
>
> Kuai is correct. In my original perf improvement patch I included
> lots of results. For this set, where we just refactored, I checked
> performance to assure we didn't go downhill, but didn't save the
> results, as the deltas were in the noise. After this series lands we
> will look at introducing performance improvements again, and at that
> time results from a full performance sweep will be included.
>
> For data integrity, 1- and 2-disk mirrors were run overnight with fio
> and a crc32 check; no issues.
>
> Assuring that other code paths execute as they did before was a little
> trickier without a unit test framework, but for those cases I did
> modify/un-modify the code several times to follow the various code
> paths and assure they're working as expected (i.e. bad blocks, etc.).

Thank you very much for the elaborate response. In our infrastructure, we
often notice things improve, but we sometimes also have the “feeling” that
things get worse. As IO is so complex, I find it always helpful to note
down the exact test setup and the tests that were run. So thank you for
responding.

Kind regards,

Paul
Hi Kuai and Paul,

On Thu, Feb 22, 2024 at 12:03 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> The original idea is that Paul wanted to optimize raid1 read
> performance ([1]); however, we think that the original code for
> read_balance() is quite complex, and we don't want to add more
> complexity. Hence we decided to refactor read_balance() first, to make
> the code cleaner and easier to follow up on.
>
> Before this patchset, read_balance() had many local variables and many
> branches; it tried to consider all the scenarios in one iteration. The
> idea of this patchset is to divide them into 4 different steps:
>
> 1) If resync is in progress, find the first usable disk, patch 5;
>    Otherwise:
> 2) Loop through all disks, skipping slow disks and disks with bad
>    blocks, and choose the best disk, patch 10. If no disk is found:
> 3) Look for disks with bad blocks and choose the one covering the most
>    sectors, patch 8. If no disk is found:
> 4) Choose the first found slow disk with no bad blocks, or the slow
>    disk covering the most sectors, patch 7.

Thanks for your great work on this set. It looks great. Please address
the feedback from folks and send v2. We can still get this into the 6.9
merge window.

Thanks,
Song
From: Yu Kuai <yukuai3@huawei.com>

The original idea is that Paul wanted to optimize raid1 read
performance ([1]); however, we think that the original code for
read_balance() is quite complex, and we don't want to add more
complexity. Hence we decided to refactor read_balance() first, to make
the code cleaner and easier to follow up on.

Before this patchset, read_balance() had many local variables and many
branches; it tried to consider all the scenarios in one iteration. The
idea of this patchset is to divide them into 4 different steps:

1) If resync is in progress, find the first usable disk, patch 5;
   Otherwise:
2) Loop through all disks, skipping slow disks and disks with bad
   blocks, and choose the best disk, patch 10. If no disk is found:
3) Look for disks with bad blocks and choose the one covering the most
   sectors, patch 8. If no disk is found:
4) Choose the first found slow disk with no bad blocks, or the slow
   disk covering the most sectors, patch 7.

Note that step 3) and step 4) are rarely taken code paths, and
performance is not a concern for them.

And after this patchset, we'll continue to optimize read_balance() for
step 2), specifically how to choose the best rdev to read from.
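The four-step selection order above can be sketched in user-space C as
follows. This is a simplified illustration only, not the kernel code:
`struct fake_rdev`, its fields, and `choose_rdev()` are hypothetical
stand-ins for the actual md structures and helpers, and step 2 omits the
load balancing that the real read_balance() performs:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's rdev state; illustrative only. */
struct fake_rdev {
	bool usable;   /* device present and not faulty */
	bool slow;     /* marked as a slow (e.g. write-mostly) device */
	bool has_bb;   /* read range overlaps known bad blocks */
	int  sectors;  /* readable sectors before the first bad block */
};

/* Returns the index of the chosen disk, or -1 if none is usable. */
static int choose_rdev(const struct fake_rdev *rdevs, int n, bool resync)
{
	int i, best = -1;

	if (resync) {
		for (i = 0; i < n; i++)        /* step 1: first usable disk */
			if (rdevs[i].usable)
				return i;
		return -1;
	}

	for (i = 0; i < n; i++)                /* step 2: best fast, clean disk */
		if (rdevs[i].usable && !rdevs[i].slow && !rdevs[i].has_bb)
			return i;

	for (i = 0; i < n; i++)                /* step 3: bad-block disk covering
						* the most sectors */
		if (rdevs[i].usable && !rdevs[i].slow && rdevs[i].has_bb &&
		    (best < 0 || rdevs[i].sectors > rdevs[best].sectors))
			best = i;
	if (best >= 0)
		return best;

	for (i = 0; i < n; i++)                /* step 4: fall back to a slow disk */
		if (rdevs[i].usable && rdevs[i].slow &&
		    (best < 0 || rdevs[i].sectors > rdevs[best].sectors))
			best = i;
	return best;
}
```

The point of the ordering is that each later step only runs when every
earlier step failed to find a disk, which is why steps 3 and 4 can stay
simple without hurting the common case.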
[1] https://lore.kernel.org/all/20240102125115.129261-1-paul.e.luse@linux.intel.com/

Yu Kuai (10):
  md: add a new helper rdev_has_badblock()
  md: record nonrot rdevs while adding/removing rdevs to conf
  md/raid1: fix choose next idle in read_balance()
  md/raid1-10: add a helper raid1_check_read_range()
  md/raid1-10: factor out a new helper raid1_should_read_first()
  md/raid1: factor out read_first_rdev() from read_balance()
  md/raid1: factor out choose_slow_rdev() from read_balance()
  md/raid1: factor out choose_bb_rdev() from read_balance()
  md/raid1: factor out the code to manage sequential IO
  md/raid1: factor out helpers to choose the best rdev from
    read_balance()

 drivers/md/md.c       |  28 ++-
 drivers/md/md.h       |  12 ++
 drivers/md/raid1-10.c |  69 +++++++
 drivers/md/raid1.c    | 454 ++++++++++++++++++++++++------------------
 drivers/md/raid10.c   |  66 ++----
 drivers/md/raid5.c    |  35 ++--
 6 files changed, 402 insertions(+), 262 deletions(-)