[md-6.12,0/7] md: enhance faulty chekcing for blocked handling


Message ID

20240830072721.2112006-1-yukuai1@huaweicloud.com (mailing list archive)

Headers

From: Yu Kuai <yukuai1@huaweicloud.com>
To: mariusz.tkaczyk@intel.com,
	song@kernel.org
Cc: linux-raid@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	yukuai3@huawei.com,
	yukuai1@huaweicloud.com,
	yi.zhang@huawei.com,
	yangerkun@huawei.com
Subject: [PATCH md-6.12 0/7] md: enhance faulty chekcing for blocked handling
Date: Fri, 30 Aug 2024 15:27:14 +0800
Message-Id: <20240830072721.2112006-1-yukuai1@huaweicloud.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

md: enhance faulty chekcing for blocked handling

Message

Yu Kuai Aug. 30, 2024, 7:27 a.m. UTC

From: Yu Kuai <yukuai3@huawei.com>

The lifetime of badblocks:

- an IO error occurs, the decision is made to record badblocks, and
sb_flags is set;
- a write IO finds that the rdev has badblocks that are not yet
acknowledged, so this IO is blocked;
- the daemon finds that sb_flags is set, updates the superblock and
flushes the badblocks;
- the write IO continues;

The main idea is that badblocks are set in memory first. Until the
badblocks are acknowledged, new write requests must be blocked to prevent
reading old data after a power failure. This behaviour is not necessary
if the rdev is faulty in the first place.
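
The rule described above can be sketched as a small userspace model. This
is only an illustration under stated assumptions, not the code from this
series: struct model_rdev and model_write_must_wait() are hypothetical
names, and the two booleans stand in for the real rdev state.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of one member device.
 * This is NOT the kernel's struct md_rdev; both fields are assumptions
 * made for illustration only.
 */
struct model_rdev {
	bool faulty;            /* device has already been marked faulty */
	bool unacked_badblocks; /* badblocks set in memory, not yet acknowledged */
};

/* Return true if a new write request has to wait on this device. */
static bool model_write_must_wait(const struct model_rdev *rdev)
{
	/*
	 * Per the cover letter, blocking is not necessary if the rdev is
	 * faulty in the first place.
	 */
	if (rdev->faulty)
		return false;

	/* Otherwise wait until the badblocks have been acknowledged. */
	return rdev->unacked_badblocks;
}
```

In this model, a device with unacknowledged badblocks blocks new writes
only while it is not faulty; once it is faulty the write does not wait.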

Yu Kuai (7):
  md: add a new helper rdev_blocked()
  md: don't wait faulty rdev in md_wait_for_blocked_rdev()
  md: don't record new badblocks for faulty rdev
  md/raid1: factor out helper to handle blocked rdev from
    raid1_write_request()
  md/raid1: don't wait for Faulty rdev in wait_blocked_rdev()
  md/raid10: don't wait for Faulty rdev in wait_blocked_rdev()
  md/raid5: don't set Faulty rdev for blocked_rdev

 drivers/md/md.c     |  8 +++--
 drivers/md/md.h     | 24 +++++++++++++++
 drivers/md/raid1.c  | 75 +++++++++++++++++++++++----------------------
 drivers/md/raid10.c | 40 +++++++++++-------------
 drivers/md/raid5.c  | 13 ++++----
 5 files changed, 92 insertions(+), 68 deletions(-)

Comments

Mariusz Tkaczyk Aug. 30, 2024, 11:12 a.m. UTC | #1

Hi Song,
We need to test this with external metadata, so please wait for our green
light before you take this.
I checked the code and it looks safe, but I need to double-check it to avoid
hung tasks.

Thanks,
Mariusz

Mariusz Tkaczyk Oct. 9, 2024, 7:14 a.m. UTC | #2

Hi,
We tested this patchset.

mdmon rework:
https://github.com/md-raid-utilities/mdadm/pull/66 

Kernel build torvalds/linux.git master:
commit e32cde8d2bd7d251a8f9b434143977ddf13dcec6

I applied this patchset on top of that.

My tests showed that:
- If only the mdmon PR is applied, hangs are reproducible.
- If only this patchset is applied, hangs are reproducible.
- If both the kernel patchset and the mdmon rework are applied, hangs are not
  reproducible (at least so far).

It was a tricky topic (I had to deal with weird issues related to shared
descriptors in mdmon).

Most importantly, no regression was detected.

Thanks,
Mariusz

Paul Menzel Oct. 9, 2024, 8:52 a.m. UTC | #3

Dear Kuai,


Thank you for this patch series. Just a note about the typo in che*ck*ing.


Kind regards,

Paul

Yu Kuai Oct. 10, 2024, 12:38 p.m. UTC | #4

Hi,

On 2024/10/09 15:14, Mariusz Tkaczyk wrote:
> Hi,
> We tested this patchset.
> 
> mdmon rework:
> https://github.com/md-raid-utilities/mdadm/pull/66
> 
> Kernel build torvalds/linux.git master:
> commit e32cde8d2bd7d251a8f9b434143977ddf13dcec6
> 
> I applied this patchset on top of that.
> 
> My tests proved that:
> - If only mdmon PR is applied - hangs are reproducible.
> - If only this patchset is applied - hangs are reproducible.
> - If both kernel patchset and mdmon rework are applied- hangs are not
>    reproducible (at least until now).
> 
> It was tricky topic (I needed to deal with weird issues related to shared
> descriptors in mdmon).
> 
> What the most important- there is no regression detected.

Good to hear that, I'll send a v2 then. This set will most likely land in
v6.13, because it doesn't look like a fix in the kernel. :)

Thanks,
Kuai

Yu Kuai Oct. 10, 2024, 12:40 p.m. UTC | #5

Hi,

On 2024/10/09 16:52, Paul Menzel wrote:
> Dear Kuai,
> 
> 
> Thank you for this patch series. Just a note about the typo in che*ck*ing.

Thanks for the notice. :)

Kuai
