
[1/1,V2,md-6.12,1/1] md: add new_level sysfs interface

Message ID 20240829100133.74242-1-xni@redhat.com (mailing list archive)
State Superseded
Series [1/1,V2,md-6.12,1/1] md: add new_level sysfs interface

Checks

Context Check Description
mdraidci/vmtest-md-6_11-PR success PR summary
mdraidci/vmtest-md-6_11-VM_Test-0 success Logs for build-kernel
mdraidci/vmtest-md-6_12-PR success PR summary
mdraidci/vmtest-md-6_12-VM_Test-0 success Logs for per-patch-testing

Commit Message

Xiao Ni Aug. 29, 2024, 10:01 a.m. UTC
This interface is used to update the new level during a reshape. Currently
the new level is not recorded during the reshape, so the systemd service
mdadm-grow-continue can't know the target level when it runs, and the array
can't finally be changed to the new level.

mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
mdadm --wait /dev/md0
mdadm /dev/md0 --grow -l5 --backup=backup
cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]

The mdadm test case 07changelevels triggers this problem and currently fails.
This patch fixes the kernel side; a userspace patch set works together with it
to fully resolve the problem.

Signed-off-by: Xiao Ni <xni@redhat.com>
---
V2: add details about the test information
 drivers/md/md.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
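
For reference, a minimal sketch of how the new attribute is expected to be used
from userspace. The path is an assumption based on the attribute being added to
md_default_attrs, whose entries appear under /sys/block/mdX/md/:

cat /sys/block/md0/md/new_level       # shows mddev->new_level
echo 5 > /sys/block/md0/md/new_level  # stores 5 and persists it via md_update_sb()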

Comments

Yu Kuai Aug. 29, 2024, 12:24 p.m. UTC | #1
Hi,

On 2024/08/29 18:01, Xiao Ni wrote:
> This interface is used to update new level during reshape progress.
> Now it doesn't update new level during reshape. So it can't know the
> new level when systemd service mdadm-grow-continue runs. And it can't
> finally change to new level.

I'm not sure why the new level needs to be updated during reshape. Will this
corrupt data in the array completely?
> 
> mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
> mdadm --wait /dev/md0
> mdadm /dev/md0 --grow -l5 --backup=backup
> cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
> md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]

The problem is that the level is still 6 after mdadm --grow -l5? I don't
understand yet why this is related to updating the new level during reshape.
:)

Thanks,
Kuai

> 
> The case 07changelevels in mdadm can trigger this problem. Now it can't
> run successfully. This patch is used to fix this. There is also a
> userspace patch set that work together with this patch to fix this problem.
> 
> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
> V2: add detail about test information
>   drivers/md/md.c | 29 +++++++++++++++++++++++++++++
>   1 file changed, 29 insertions(+)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index d3a837506a36..3c354e7a7825 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -4141,6 +4141,34 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
>   static struct md_sysfs_entry md_level =
>   __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
>   
> +static ssize_t
> +new_level_show(struct mddev *mddev, char *page)
> +{
> +	return sprintf(page, "%d\n", mddev->new_level);
> +}
> +
> +static ssize_t
> +new_level_store(struct mddev *mddev, const char *buf, size_t len)
> +{
> +	unsigned int n;
> +	int err;
> +
> +	err = kstrtouint(buf, 10, &n);
> +	if (err < 0)
> +		return err;
> +	err = mddev_lock(mddev);
> +	if (err)
> +		return err;
> +
> +	mddev->new_level = n;
> +	md_update_sb(mddev, 1);
> +
> +	mddev_unlock(mddev);
> +	return len;
> +}
> +static struct md_sysfs_entry md_new_level =
> +__ATTR(new_level, 0664, new_level_show, new_level_store);
> +
>   static ssize_t
>   layout_show(struct mddev *mddev, char *page)
>   {
> @@ -5666,6 +5694,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
>   
>   static struct attribute *md_default_attrs[] = {
>   	&md_level.attr,
> +	&md_new_level.attr,
>   	&md_layout.attr,
>   	&md_raid_disks.attr,
>   	&md_uuid.attr,
>
Xiao Ni Aug. 29, 2024, 1:13 p.m. UTC | #2
On Thu, Aug 29, 2024 at 8:24 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2024/08/29 18:01, Xiao Ni wrote:
> > This interface is used to update new level during reshape progress.
> > Now it doesn't update new level during reshape. So it can't know the
> > new level when systemd service mdadm-grow-continue runs. And it can't
> > finally change to new level.
>
Hi Kuai

> I'm not sure why updaing new level during reshape. Will this corrupt
> data in the array completely?

There is no data corruption.


> >
> > mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
> > mdadm --wait /dev/md0
> > mdadm /dev/md0 --grow -l5 --backup=backup
> > cat /proc/mdstat
> > Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
> > md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]
>
> The problem is that level is still 6 after mddev --grow -l5? I don't
> understand yet why this is related to update new level during reshape.
> :)

When a backup file is specified, the mdadm --grow command returns as soon as
the reshape starts, and the mdadm-grow-continue service monitors the rest of
the reshape progress. The new level isn't recorded when `mdadm --grow` runs,
so mdadm-grow-continue doesn't know the new level and can't change the array
to it after the reshape finishes. This needs a userspace patch set to work
together with it:
https://www.spinics.net/lists/raid/msg78081.html
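
Roughly, the idea is something like the following sketch (the exact sysfs path
and the point at which mdadm writes the attribute are assumed here):

mdadm /dev/md0 --grow -l5 --backup=backup   # returns once the reshape has started
echo 5 > /sys/block/md0/md/new_level        # the patched mdadm records the target level
cat /sys/block/md0/md/new_level             # mdadm-grow-continue can read it back later
# after the reshape finishes, the service switches the array to raid5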

Best Regards

Xiao
>
> Thanks,
> Kuai
>
> >
> > The case 07changelevels in mdadm can trigger this problem. Now it can't
> > run successfully. This patch is used to fix this. There is also a
> > userspace patch set that work together with this patch to fix this problem.
> >
> > Signed-off-by: Xiao Ni <xni@redhat.com>
> > ---
> > V2: add detail about test information
> >   drivers/md/md.c | 29 +++++++++++++++++++++++++++++
> >   1 file changed, 29 insertions(+)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index d3a837506a36..3c354e7a7825 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -4141,6 +4141,34 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
> >   static struct md_sysfs_entry md_level =
> >   __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
> >
> > +static ssize_t
> > +new_level_show(struct mddev *mddev, char *page)
> > +{
> > +     return sprintf(page, "%d\n", mddev->new_level);
> > +}
> > +
> > +static ssize_t
> > +new_level_store(struct mddev *mddev, const char *buf, size_t len)
> > +{
> > +     unsigned int n;
> > +     int err;
> > +
> > +     err = kstrtouint(buf, 10, &n);
> > +     if (err < 0)
> > +             return err;
> > +     err = mddev_lock(mddev);
> > +     if (err)
> > +             return err;
> > +
> > +     mddev->new_level = n;
> > +     md_update_sb(mddev, 1);
> > +
> > +     mddev_unlock(mddev);
> > +     return len;
> > +}
> > +static struct md_sysfs_entry md_new_level =
> > +__ATTR(new_level, 0664, new_level_show, new_level_store);
> > +
> >   static ssize_t
> >   layout_show(struct mddev *mddev, char *page)
> >   {
> > @@ -5666,6 +5694,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
> >
> >   static struct attribute *md_default_attrs[] = {
> >       &md_level.attr,
> > +     &md_new_level.attr,
> >       &md_layout.attr,
> >       &md_raid_disks.attr,
> >       &md_uuid.attr,
> >
>
Yu Kuai Aug. 29, 2024, 1:34 p.m. UTC | #3
Hi,

On 2024/08/29 21:13, Xiao Ni wrote:
> On Thu, Aug 29, 2024 at 8:24 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> On 2024/08/29 18:01, Xiao Ni wrote:
>>> This interface is used to update new level during reshape progress.
>>> Now it doesn't update new level during reshape. So it can't know the
>>> new level when systemd service mdadm-grow-continue runs. And it can't
>>> finally change to new level.
>>
> Hi Kuai
> 
>> I'm not sure why updaing new level during reshape. Will this corrupt
>> data in the array completely?
> 
> There is no data corruption.
> 
> 
>>>
>>> mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
>>> mdadm --wait /dev/md0
>>> mdadm /dev/md0 --grow -l5 --backup=backup
>>> cat /proc/mdstat
>>> Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
>>> md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]
>>
>> The problem is that level is still 6 after mddev --grow -l5? I don't
>> understand yet why this is related to update new level during reshape.
>> :)
> 
> mdadm --grow command returns once reshape starts when specifying
> backup file. And it needs mdadm-grow-continue service to monitor the
> reshape progress. It doesn't record the new level when running `mdadm
> --grow`. So mdadm-grow-continue doesn't know the new level and it
> can't change to new level after reshape finishes. This needs userspace
> patch to work together.
> https://www.spinics.net/lists/raid/msg78081.html

Hi,

Got it. Then I just wonder: what kind of content is stored in the backup
file, and why not store the 'new level' in it as well? After the reshape
finishes, can mdadm use the level API to do this?

Thanks,
Kuai

> 
> Best Regards
> 
> Xiao
>>
>> Thanks,
>> Kuai
>>
>>>
>>> The case 07changelevels in mdadm can trigger this problem. Now it can't
>>> run successfully. This patch is used to fix this. There is also a
>>> userspace patch set that work together with this patch to fix this problem.
>>>
>>> Signed-off-by: Xiao Ni <xni@redhat.com>
>>> ---
>>> V2: add detail about test information
>>>    drivers/md/md.c | 29 +++++++++++++++++++++++++++++
>>>    1 file changed, 29 insertions(+)
>>>
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index d3a837506a36..3c354e7a7825 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -4141,6 +4141,34 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
>>>    static struct md_sysfs_entry md_level =
>>>    __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
>>>
>>> +static ssize_t
>>> +new_level_show(struct mddev *mddev, char *page)
>>> +{
>>> +     return sprintf(page, "%d\n", mddev->new_level);
>>> +}
>>> +
>>> +static ssize_t
>>> +new_level_store(struct mddev *mddev, const char *buf, size_t len)
>>> +{
>>> +     unsigned int n;
>>> +     int err;
>>> +
>>> +     err = kstrtouint(buf, 10, &n);
>>> +     if (err < 0)
>>> +             return err;
>>> +     err = mddev_lock(mddev);
>>> +     if (err)
>>> +             return err;
>>> +
>>> +     mddev->new_level = n;
>>> +     md_update_sb(mddev, 1);
>>> +
>>> +     mddev_unlock(mddev);
>>> +     return len;
>>> +}
>>> +static struct md_sysfs_entry md_new_level =
>>> +__ATTR(new_level, 0664, new_level_show, new_level_store);
>>> +
>>>    static ssize_t
>>>    layout_show(struct mddev *mddev, char *page)
>>>    {
>>> @@ -5666,6 +5694,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
>>>
>>>    static struct attribute *md_default_attrs[] = {
>>>        &md_level.attr,
>>> +     &md_new_level.attr,
>>>        &md_layout.attr,
>>>        &md_raid_disks.attr,
>>>        &md_uuid.attr,
>>>
>>
> 
> 
> 
> .
>
Xiao Ni Aug. 29, 2024, 1:48 p.m. UTC | #4
On Thu, Aug 29, 2024 at 9:34 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2024/08/29 21:13, Xiao Ni wrote:
> > On Thu, Aug 29, 2024 at 8:24 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> Hi,
> >>
>> On 2024/08/29 18:01, Xiao Ni wrote:
> >>> This interface is used to update new level during reshape progress.
> >>> Now it doesn't update new level during reshape. So it can't know the
> >>> new level when systemd service mdadm-grow-continue runs. And it can't
> >>> finally change to new level.
> >>
> > Hi Kuai
> >
> >> I'm not sure why updaing new level during reshape. Will this corrupt
> >> data in the array completely?
> >
> > There is no data corruption.
> >
> >
> >>>
> >>> mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
> >>> mdadm --wait /dev/md0
> >>> mdadm /dev/md0 --grow -l5 --backup=backup
> >>> cat /proc/mdstat
> >>> Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
> >>> md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]
> >>
> >> The problem is that level is still 6 after mddev --grow -l5? I don't
> >> understand yet why this is related to update new level during reshape.
> >> :)
> >
> > mdadm --grow command returns once reshape starts when specifying
> > backup file. And it needs mdadm-grow-continue service to monitor the
> > reshape progress. It doesn't record the new level when running `mdadm
> > --grow`. So mdadm-grow-continue doesn't know the new level and it
> > can't change to new level after reshape finishes. This needs userspace
> > patch to work together.
> > https://www.spinics.net/lists/raid/msg78081.html
>
> Hi,
>
> Got it. Then I just wonder, what kind of content are stored in the
> backup file, why don't store the 'new level' in it as well, after
> reshape finishes, can mdadm use the level api to do this?

The backup file has a superblock too, and its content is data copied from the
array. The service monitors and controls the reshape progress: it copies data
from the array to the backup file and then lets the reshape proceed. Nowadays
we usually don't need it, because the data offset can change and there is free
space for that. But when the data offset can't be changed, the reshape has to
write data back into the region it was read from, so if there is a power loss
or something similar, the backup file avoids data corruption.

It should be right to update the new level in md during the reshape. If not,
the recorded new level is wrong.

Best Regards
Xiao
>
> Thanks,
> Kuai
>
> >
> > Best Regards
> >
> > Xiao
> >>
> >> Thanks,
> >> Kuai
> >>
> >>>
> >>> The case 07changelevels in mdadm can trigger this problem. Now it can't
> >>> run successfully. This patch is used to fix this. There is also a
> >>> userspace patch set that work together with this patch to fix this problem.
> >>>
> >>> Signed-off-by: Xiao Ni <xni@redhat.com>
> >>> ---
> >>> V2: add detail about test information
> >>>    drivers/md/md.c | 29 +++++++++++++++++++++++++++++
> >>>    1 file changed, 29 insertions(+)
> >>>
> >>> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >>> index d3a837506a36..3c354e7a7825 100644
> >>> --- a/drivers/md/md.c
> >>> +++ b/drivers/md/md.c
> >>> @@ -4141,6 +4141,34 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
> >>>    static struct md_sysfs_entry md_level =
> >>>    __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
> >>>
> >>> +static ssize_t
> >>> +new_level_show(struct mddev *mddev, char *page)
> >>> +{
> >>> +     return sprintf(page, "%d\n", mddev->new_level);
> >>> +}
> >>> +
> >>> +static ssize_t
> >>> +new_level_store(struct mddev *mddev, const char *buf, size_t len)
> >>> +{
> >>> +     unsigned int n;
> >>> +     int err;
> >>> +
> >>> +     err = kstrtouint(buf, 10, &n);
> >>> +     if (err < 0)
> >>> +             return err;
> >>> +     err = mddev_lock(mddev);
> >>> +     if (err)
> >>> +             return err;
> >>> +
> >>> +     mddev->new_level = n;
> >>> +     md_update_sb(mddev, 1);
> >>> +
> >>> +     mddev_unlock(mddev);
> >>> +     return len;
> >>> +}
> >>> +static struct md_sysfs_entry md_new_level =
> >>> +__ATTR(new_level, 0664, new_level_show, new_level_store);
> >>> +
> >>>    static ssize_t
> >>>    layout_show(struct mddev *mddev, char *page)
> >>>    {
> >>> @@ -5666,6 +5694,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
> >>>
> >>>    static struct attribute *md_default_attrs[] = {
> >>>        &md_level.attr,
> >>> +     &md_new_level.attr,
> >>>        &md_layout.attr,
> >>>        &md_raid_disks.attr,
> >>>        &md_uuid.attr,
> >>>
> >>
> >
> >
> >
> > .
> >
>
Yu Kuai Aug. 30, 2024, 7:36 a.m. UTC | #5
Hi,

On 2024/08/29 21:48, Xiao Ni wrote:
> On Thu, Aug 29, 2024 at 9:34 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> On 2024/08/29 21:13, Xiao Ni wrote:
>>> On Thu, Aug 29, 2024 at 8:24 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 2024/08/29 18:01, Xiao Ni wrote:
>>>>> This interface is used to update new level during reshape progress.
>>>>> Now it doesn't update new level during reshape. So it can't know the
>>>>> new level when systemd service mdadm-grow-continue runs. And it can't
>>>>> finally change to new level.
>>>>
>>> Hi Kuai
>>>
>>>> I'm not sure why updaing new level during reshape. Will this corrupt
>>>> data in the array completely?
>>>
>>> There is no data corruption.
>>>
>>>
>>>>>
>>>>> mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
>>>>> mdadm --wait /dev/md0
>>>>> mdadm /dev/md0 --grow -l5 --backup=backup
>>>>> cat /proc/mdstat
>>>>> Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
>>>>> md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]
>>>>
>>>> The problem is that level is still 6 after mddev --grow -l5? I don't
>>>> understand yet why this is related to update new level during reshape.
>>>> :)
>>>
>>> mdadm --grow command returns once reshape starts when specifying
>>> backup file. And it needs mdadm-grow-continue service to monitor the
>>> reshape progress. It doesn't record the new level when running `mdadm
>>> --grow`. So mdadm-grow-continue doesn't know the new level and it
>>> can't change to new level after reshape finishes. This needs userspace
>>> patch to work together.
>>> https://www.spinics.net/lists/raid/msg78081.html
>>
>> Hi,
>>
>> Got it. Then I just wonder, what kind of content are stored in the
>> backup file, why don't store the 'new level' in it as well, after
>> reshape finishes, can mdadm use the level api to do this?
> 
> The backup file has a superblock too and the content is the data from
> the raid. The service monitors reshape progress and controls it. It
> copies the data from raid to backup file, and then it reshapes. Now we
> usually don't care about it because the data offset can change and it
> has free space. For situations where the data offset can't be changed,
> it needs to write the data to the region where it is read from. If
> there is a power down or something else, the backup file can avoid
> data corruption.
> 
> It should be right to update the new level in md during reshape. If
> not, the new level is wrong.

Thanks for the explanation. I still need to investigate more of the details.
I won't object to the new sysfs API; however, I'd suggest considering whether
this should be the final choice.

Thanks,
Kuai
> 
> Best Regards
> Xiao
>>
>> Thanks,
>> Kuai
>>
>>>
>>> Best Regards
>>>
>>> Xiao
>>>>
>>>> Thanks,
>>>> Kuai
>>>>
>>>>>
>>>>> The case 07changelevels in mdadm can trigger this problem. Now it can't
>>>>> run successfully. This patch is used to fix this. There is also a
>>>>> userspace patch set that work together with this patch to fix this problem.
>>>>>
>>>>> Signed-off-by: Xiao Ni <xni@redhat.com>
>>>>> ---
>>>>> V2: add detail about test information
>>>>>     drivers/md/md.c | 29 +++++++++++++++++++++++++++++
>>>>>     1 file changed, 29 insertions(+)
>>>>>
>>>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>>>> index d3a837506a36..3c354e7a7825 100644
>>>>> --- a/drivers/md/md.c
>>>>> +++ b/drivers/md/md.c
>>>>> @@ -4141,6 +4141,34 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
>>>>>     static struct md_sysfs_entry md_level =
>>>>>     __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
>>>>>
>>>>> +static ssize_t
>>>>> +new_level_show(struct mddev *mddev, char *page)
>>>>> +{
>>>>> +     return sprintf(page, "%d\n", mddev->new_level);
>>>>> +}
>>>>> +
>>>>> +static ssize_t
>>>>> +new_level_store(struct mddev *mddev, const char *buf, size_t len)
>>>>> +{
>>>>> +     unsigned int n;
>>>>> +     int err;
>>>>> +
>>>>> +     err = kstrtouint(buf, 10, &n);
>>>>> +     if (err < 0)
>>>>> +             return err;
>>>>> +     err = mddev_lock(mddev);
>>>>> +     if (err)
>>>>> +             return err;
>>>>> +
>>>>> +     mddev->new_level = n;
>>>>> +     md_update_sb(mddev, 1);
>>>>> +
>>>>> +     mddev_unlock(mddev);
>>>>> +     return len;
>>>>> +}
>>>>> +static struct md_sysfs_entry md_new_level =
>>>>> +__ATTR(new_level, 0664, new_level_show, new_level_store);
>>>>> +
>>>>>     static ssize_t
>>>>>     layout_show(struct mddev *mddev, char *page)
>>>>>     {
>>>>> @@ -5666,6 +5694,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
>>>>>
>>>>>     static struct attribute *md_default_attrs[] = {
>>>>>         &md_level.attr,
>>>>> +     &md_new_level.attr,
>>>>>         &md_layout.attr,
>>>>>         &md_raid_disks.attr,
>>>>>         &md_uuid.attr,
>>>>>
>>>>
>>>
>>>
>>>
>>> .
>>>
>>
> 
> 
> .
>
Xiao Ni Aug. 30, 2024, 8:51 a.m. UTC | #6
On Fri, Aug 30, 2024 at 3:37 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2024/08/29 21:48, Xiao Ni wrote:
> > On Thu, Aug 29, 2024 at 9:34 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> Hi,
> >>
>> On 2024/08/29 21:13, Xiao Ni wrote:
> >>> On Thu, Aug 29, 2024 at 8:24 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
>>>> On 2024/08/29 18:01, Xiao Ni wrote:
> >>>>> This interface is used to update new level during reshape progress.
> >>>>> Now it doesn't update new level during reshape. So it can't know the
> >>>>> new level when systemd service mdadm-grow-continue runs. And it can't
> >>>>> finally change to new level.
> >>>>
> >>> Hi Kuai
> >>>
> >>>> I'm not sure why updaing new level during reshape. Will this corrupt
> >>>> data in the array completely?
> >>>
> >>> There is no data corruption.
> >>>
> >>>
> >>>>>
> >>>>> mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
> >>>>> mdadm --wait /dev/md0
> >>>>> mdadm /dev/md0 --grow -l5 --backup=backup
> >>>>> cat /proc/mdstat
> >>>>> Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10]
> >>>>> md0 : active raid6 loop3[3] loop2[2] loop1[1] loop0[0]
> >>>>
> >>>> The problem is that level is still 6 after mddev --grow -l5? I don't
> >>>> understand yet why this is related to update new level during reshape.
> >>>> :)
> >>>
> >>> mdadm --grow command returns once reshape starts when specifying
> >>> backup file. And it needs mdadm-grow-continue service to monitor the
> >>> reshape progress. It doesn't record the new level when running `mdadm
> >>> --grow`. So mdadm-grow-continue doesn't know the new level and it
> >>> can't change to new level after reshape finishes. This needs userspace
> >>> patch to work together.
> >>> https://www.spinics.net/lists/raid/msg78081.html
> >>
> >> Hi,
> >>
> >> Got it. Then I just wonder, what kind of content are stored in the
> >> backup file, why don't store the 'new level' in it as well, after
> >> reshape finishes, can mdadm use the level api to do this?
> >
> > The backup file has a superblock too and the content is the data from
> > the raid. The service monitors reshape progress and controls it. It
> > copies the data from raid to backup file, and then it reshapes. Now we
> > usually don't care about it because the data offset can change and it
> > has free space. For situations where the data offset can't be changed,
> > it needs to write the data to the region where it is read from. If
> > there is a power down or something else, the backup file can avoid
> > data corruption.
> >
> > It should be right to update the new level in md during reshape. If
> > not, the new level is wrong.
>
> Thanks for the explanation, I still need to investigate more about
> detals, I won't object to the new sysfs api, however, I'll suggest
> consider this as the final choice.

Hi Kuai

The userspace reshape_array() is the key function; feel free to ask any
questions. I didn't know much about it either before fixing the mdadm
regression test cases. For now, I think this is the easiest way to fix the
problem: the superblock needs to be updated with the right information.

By the way, the --backup design doesn't work as expected; there is a data
corruption risk when reshaping with a backup file. I didn't fix that here
because the focus this time is on fixing the regression cases.

Best Regards
Xiao
>
> Thanks,
> Kuai
> >
> > Best Regards
> > Xiao
> >>
> >> Thanks,
> >> Kuai
> >>
> >>>
> >>> Best Regards
> >>>
> >>> Xiao
> >>>>
> >>>> Thanks,
> >>>> Kuai
> >>>>
> >>>>>
> >>>>> The case 07changelevels in mdadm can trigger this problem. Now it can't
> >>>>> run successfully. This patch is used to fix this. There is also a
> >>>>> userspace patch set that work together with this patch to fix this problem.
> >>>>>
> >>>>> Signed-off-by: Xiao Ni <xni@redhat.com>
> >>>>> ---
> >>>>> V2: add detail about test information
> >>>>>     drivers/md/md.c | 29 +++++++++++++++++++++++++++++
> >>>>>     1 file changed, 29 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >>>>> index d3a837506a36..3c354e7a7825 100644
> >>>>> --- a/drivers/md/md.c
> >>>>> +++ b/drivers/md/md.c
> >>>>> @@ -4141,6 +4141,34 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
> >>>>>     static struct md_sysfs_entry md_level =
> >>>>>     __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
> >>>>>
> >>>>> +static ssize_t
> >>>>> +new_level_show(struct mddev *mddev, char *page)
> >>>>> +{
> >>>>> +     return sprintf(page, "%d\n", mddev->new_level);
> >>>>> +}
> >>>>> +
> >>>>> +static ssize_t
> >>>>> +new_level_store(struct mddev *mddev, const char *buf, size_t len)
> >>>>> +{
> >>>>> +     unsigned int n;
> >>>>> +     int err;
> >>>>> +
> >>>>> +     err = kstrtouint(buf, 10, &n);
> >>>>> +     if (err < 0)
> >>>>> +             return err;
> >>>>> +     err = mddev_lock(mddev);
> >>>>> +     if (err)
> >>>>> +             return err;
> >>>>> +
> >>>>> +     mddev->new_level = n;
> >>>>> +     md_update_sb(mddev, 1);
> >>>>> +
> >>>>> +     mddev_unlock(mddev);
> >>>>> +     return len;
> >>>>> +}
> >>>>> +static struct md_sysfs_entry md_new_level =
> >>>>> +__ATTR(new_level, 0664, new_level_show, new_level_store);
> >>>>> +
> >>>>>     static ssize_t
> >>>>>     layout_show(struct mddev *mddev, char *page)
> >>>>>     {
> >>>>> @@ -5666,6 +5694,7 @@ __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
> >>>>>
> >>>>>     static struct attribute *md_default_attrs[] = {
> >>>>>         &md_level.attr,
> >>>>> +     &md_new_level.attr,
> >>>>>         &md_layout.attr,
> >>>>>         &md_raid_disks.attr,
> >>>>>         &md_uuid.attr,
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> .
> >>>
> >>
> >
> >
> > .
> >
>

Patch

diff --git a/drivers/md/md.c b/drivers/md/md.c
index d3a837506a36..3c354e7a7825 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4141,6 +4141,34 @@  level_store(struct mddev *mddev, const char *buf, size_t len)
 static struct md_sysfs_entry md_level =
 __ATTR(level, S_IRUGO|S_IWUSR, level_show, level_store);
 
+static ssize_t
+new_level_show(struct mddev *mddev, char *page)
+{
+	return sprintf(page, "%d\n", mddev->new_level);
+}
+
+static ssize_t
+new_level_store(struct mddev *mddev, const char *buf, size_t len)
+{
+	unsigned int n;
+	int err;
+
+	err = kstrtouint(buf, 10, &n);
+	if (err < 0)
+		return err;
+	err = mddev_lock(mddev);
+	if (err)
+		return err;
+
+	mddev->new_level = n;
+	md_update_sb(mddev, 1);
+
+	mddev_unlock(mddev);
+	return len;
+}
+static struct md_sysfs_entry md_new_level =
+__ATTR(new_level, 0664, new_level_show, new_level_store);
+
 static ssize_t
 layout_show(struct mddev *mddev, char *page)
 {
@@ -5666,6 +5694,7 @@  __ATTR(serialize_policy, S_IRUGO | S_IWUSR, serialize_policy_show,
 
 static struct attribute *md_default_attrs[] = {
 	&md_level.attr,
+	&md_new_level.attr,
 	&md_layout.attr,
 	&md_raid_disks.attr,
 	&md_uuid.attr,
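
With both this kernel patch and the userspace patch set applied, the
reproduction from the commit message is expected to end with the array at the
new level. A rough check (a sketch only; the wait step is an assumption):

mdadm -CR /dev/md0 -l6 -n4 /dev/loop[0-3]
mdadm --wait /dev/md0
mdadm /dev/md0 --grow -l5 --backup=backup
mdadm --wait /dev/md0     # let mdadm-grow-continue finish the reshape
cat /proc/mdstat          # md0 is expected to show raid5 rather than stay at raid6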