Message ID | d78cbb12-9a9c-7965-db9a-70b4b277f908@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | mdadm: Don't set bad_blocks flag when initializing metadata | expand |
Context | Check | Description |
---|---|---|
mdraidci/vmtest-md-6_14-PR | fail | merge-conflict |
Hi, 在 2025/03/04 14:12, Wu Guanghao 写道: > When testing the raid1, I found that adding disk to raid1 fails. > Here's how to reproduce it: > > 1. modprobe brd rd_nr=3 rd_size=524288 > 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 > > 3. mdadm /dev/md0 -f /dev/ram0 > 4. mdadm /dev/md0 -r /dev/ram0 > > 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks > 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state > > 7. mkfs.xfs /dev/md0 > 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen > Bad-blocks on /dev/ram1: > 10000 for 100 sectors > > 9. mdadm /dev/md0 -a /dev/ram2 > mdadm: add new device failed for /dev/ram2 as 2: Invalid argument Can you add a new regression test as well? > > When adding a disk to a RAID1 array, the metadata is read from the existing > member disks for synchronization. However, only the bad_blocks flag are copied, > the bad_blocks records are not copied, so the bad_blocks records are all zeros. > The kernel function super_1_load() detects bad_blocks flag and reads the > bad_blocks record, then sets the bad block using badblocks_set(). > > After the kernel commit 1726c7746("badblocks: improve badblocks_set() for > multiple ranges handling"), if the length of a bad_blocks record is 0, it will > return a failure. Therefore the device addition will fail. > > So, don't set the bad_blocks flag when initializing the metadata, kernel can > handle it. > > Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> > --- > super1.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/super1.c b/super1.c > index fe3c4c64..03578e5b 100644 > --- a/super1.c > +++ b/super1.c > @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) > if (raid0_need_layout) > sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); > > + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) > + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT and MD_FEATURE_JOURNAL, they should be excluded as well. Thanks, Kuai > + > sb->sb_csum = calc_sb_1_csum(sb); > rv = store_super1(st, di->fd); >
在 2025/3/4 20:26, Yu Kuai 写道: > Hi, > > 在 2025/03/04 14:12, Wu Guanghao 写道: >> When testing the raid1, I found that adding disk to raid1 fails. >> Here's how to reproduce it: >> >> 1. modprobe brd rd_nr=3 rd_size=524288 >> 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 >> >> 3. mdadm /dev/md0 -f /dev/ram0 >> 4. mdadm /dev/md0 -r /dev/ram0 >> >> 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks >> 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state >> >> 7. mkfs.xfs /dev/md0 >> 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen >> Bad-blocks on /dev/ram1: >> 10000 for 100 sectors >> >> 9. mdadm /dev/md0 -a /dev/ram2 >> mdadm: add new device failed for /dev/ram2 as 2: Invalid argument > > Can you add a new regression test as well? > Yeah, I will add in v2. >> >> When adding a disk to a RAID1 array, the metadata is read from the existing >> member disks for synchronization. However, only the bad_blocks flag are copied, >> the bad_blocks records are not copied, so the bad_blocks records are all zeros. >> The kernel function super_1_load() detects bad_blocks flag and reads the >> bad_blocks record, then sets the bad block using badblocks_set(). >> >> After the kernel commit 1726c7746("badblocks: improve badblocks_set() for >> multiple ranges handling"), if the length of a bad_blocks record is 0, it will >> return a failure. Therefore the device addition will fail. >> >> So, don't set the bad_blocks flag when initializing the metadata, kernel can >> handle it. >> >> Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> >> --- >> super1.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/super1.c b/super1.c >> index fe3c4c64..03578e5b 100644 >> --- a/super1.c >> +++ b/super1.c >> @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) >> if (raid0_need_layout) >> sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); >> >> + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) >> + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); > > There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT > and MD_FEATURE_JOURNAL, they should be excluded as well. > OK, These will be modified in v2. > Thanks, > Kuai > >> + >> sb->sb_csum = calc_sb_1_csum(sb); >> rv = store_super1(st, di->fd); >> > > .
Hi Guanghao Thanks for your patch. On Tue, Mar 4, 2025 at 8:27 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2025/03/04 14:12, Wu Guanghao 写道: > > When testing the raid1, I found that adding disk to raid1 fails. > > Here's how to reproduce it: > > > > 1. modprobe brd rd_nr=3 rd_size=524288 > > 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 > > > > 3. mdadm /dev/md0 -f /dev/ram0 > > 4. mdadm /dev/md0 -r /dev/ram0 > > > > 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks > > 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state > > > > 7. mkfs.xfs /dev/md0 Do we need this step7 here? > > 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen > > Bad-blocks on /dev/ram1: > > 10000 for 100 sectors > > > > 9. mdadm /dev/md0 -a /dev/ram2 > > mdadm: add new device failed for /dev/ram2 as 2: Invalid argument > > Can you add a new regression test as well? > > > > > When adding a disk to a RAID1 array, the metadata is read from the existing > > member disks for synchronization. However, only the bad_blocks flag are copied, > > the bad_blocks records are not copied, so the bad_blocks records are all zeros. > > The kernel function super_1_load() detects bad_blocks flag and reads the > > bad_blocks record, then sets the bad block using badblocks_set(). > > > > After the kernel commit 1726c7746("badblocks: improve badblocks_set() for > > multiple ranges handling"), if the length of a bad_blocks record is 0, it will > > return a failure. Therefore the device addition will fail. Can you give the specific function replace with "it will return a failure" here? > > > > So, don't set the bad_blocks flag when initializing the metadata, kernel can > > handle it. > > > > Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> > > --- > > super1.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/super1.c b/super1.c > > index fe3c4c64..03578e5b 100644 > > --- a/super1.c > > +++ b/super1.c > > @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) > > if (raid0_need_layout) > > sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); > > > > + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) > > + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); > > There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT > and MD_FEATURE_JOURNAL, they should be excluded as well. Hmm, why do we need to remove these flags too? It's better to use a separate patch rather than using this patch and describe it in detail. It's better to give an example also. Best Regards Xiao > > Thanks, > Kuai > > > + > > sb->sb_csum = calc_sb_1_csum(sb); > > rv = store_super1(st, di->fd); > > > >
On Sat, Mar 8, 2025 at 11:06 AM Xiao Ni <xni@redhat.com> wrote: > > Hi Guanghao > > Thanks for your patch. > > On Tue, Mar 4, 2025 at 8:27 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > > > Hi, > > > > 在 2025/03/04 14:12, Wu Guanghao 写道: > > > When testing the raid1, I found that adding disk to raid1 fails. > > > Here's how to reproduce it: > > > > > > 1. modprobe brd rd_nr=3 rd_size=524288 > > > 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 > > > > > > 3. mdadm /dev/md0 -f /dev/ram0 > > > 4. mdadm /dev/md0 -r /dev/ram0 > > > > > > 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks > > > 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state > > > > > > 7. mkfs.xfs /dev/md0 > > Do we need this step7 here? I confirmed myself. The answer is yes. > > > > 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen > > > Bad-blocks on /dev/ram1: > > > 10000 for 100 sectors > > > > > > 9. mdadm /dev/md0 -a /dev/ram2 > > > mdadm: add new device failed for /dev/ram2 as 2: Invalid argument > > > > Can you add a new regression test as well? > > > > > > > > When adding a disk to a RAID1 array, the metadata is read from the existing > > > member disks for synchronization. However, only the bad_blocks flag are copied, > > > the bad_blocks records are not copied, so the bad_blocks records are all zeros. > > > The kernel function super_1_load() detects bad_blocks flag and reads the > > > bad_blocks record, then sets the bad block using badblocks_set(). > > > > > > After the kernel commit 1726c7746("badblocks: improve badblocks_set() for > > > multiple ranges handling"), if the length of a bad_blocks record is 0, it will > > > return a failure. Therefore the device addition will fail. > > Can you give the specific function replace with "it will return a failure" here? I know, you mean badblocks_set which is called in super_1_load. It's better to describe clearly in a commit message. > > > > > > > So, don't set the bad_blocks flag when initializing the metadata, kernel can > > > handle it. > > > > > > Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> > > > --- > > > super1.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/super1.c b/super1.c > > > index fe3c4c64..03578e5b 100644 > > > --- a/super1.c > > > +++ b/super1.c > > > @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) > > > if (raid0_need_layout) > > > sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); > > > > > > + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) > > > + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); > > > > There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT > > and MD_FEATURE_JOURNAL, they should be excluded as well. > > Hmm, why do we need to remove these flags too? It's better to use a > separate patch rather than using this patch and describe it in detail. > It's better to give an example also. Please answer this question. Regards Xiao > > Best Regards > Xiao > > > > Thanks, > > Kuai > > > > > + > > > sb->sb_csum = calc_sb_1_csum(sb); > > > rv = store_super1(st, di->fd); > > > > > > >
Hi, 在 2025/03/08 13:27, Xiao Ni 写道: > On Sat, Mar 8, 2025 at 11:06 AM Xiao Ni <xni@redhat.com> wrote: >> >> Hi Guanghao >> >> Thanks for your patch. >> >> On Tue, Mar 4, 2025 at 8:27 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: >>> >>> Hi, >>> >>> 在 2025/03/04 14:12, Wu Guanghao 写道: >>>> When testing the raid1, I found that adding disk to raid1 fails. >>>> Here's how to reproduce it: >>>> >>>> 1. modprobe brd rd_nr=3 rd_size=524288 >>>> 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 >>>> >>>> 3. mdadm /dev/md0 -f /dev/ram0 >>>> 4. mdadm /dev/md0 -r /dev/ram0 >>>> >>>> 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks >>>> 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state >>>> >>>> 7. mkfs.xfs /dev/md0 >> >> Do we need this step7 here? > > I confirmed myself. The answer is yes. >> >>>> 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen >>>> Bad-blocks on /dev/ram1: >>>> 10000 for 100 sectors >>>> >>>> 9. mdadm /dev/md0 -a /dev/ram2 >>>> mdadm: add new device failed for /dev/ram2 as 2: Invalid argument >>> >>> Can you add a new regression test as well? >>> >>>> >>>> When adding a disk to a RAID1 array, the metadata is read from the existing >>>> member disks for synchronization. However, only the bad_blocks flag are copied, >>>> the bad_blocks records are not copied, so the bad_blocks records are all zeros. >>>> The kernel function super_1_load() detects bad_blocks flag and reads the >>>> bad_blocks record, then sets the bad block using badblocks_set(). >>>> >>>> After the kernel commit 1726c7746("badblocks: improve badblocks_set() for >>>> multiple ranges handling"), if the length of a bad_blocks record is 0, it will >>>> return a failure. Therefore the device addition will fail. >> >> Can you give the specific function replace with "it will return a failure" here? > > I know, you mean badblocks_set which is called in super_1_load. It's > better to describe clearly in a commit message. >> >>>> >>>> So, don't set the bad_blocks flag when initializing the metadata, kernel can >>>> handle it. >>>> >>>> Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> >>>> --- >>>> super1.c | 3 +++ >>>> 1 file changed, 3 insertions(+) >>>> >>>> diff --git a/super1.c b/super1.c >>>> index fe3c4c64..03578e5b 100644 >>>> --- a/super1.c >>>> +++ b/super1.c >>>> @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) >>>> if (raid0_need_layout) >>>> sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); >>>> >>>> + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) >>>> + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); >>> >>> There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT >>> and MD_FEATURE_JOURNAL, they should be excluded as well. >> >> Hmm, why do we need to remove these flags too? It's better to use a >> separate patch rather than using this patch and describe it in detail. >> It's better to give an example also. This MD_FEATURE_REPLACEMENT should be removed, because this is per-rdev flag, means this rdev is an replacement, and this should never be copied to new rdev: if (le32_to_cpu(sb->feature_map) & MD_FEATURE_REPLACEMENT) set_bit(Replacement, &rdev->flags); This is exactaly the same as MD_FEATURE_BAD_BLOCKS, means this rdev has bad blocks. And I'm wrong about MD_FEATURE_JOURNAL, this doesn't not mean this rdev is journal. Thanks, Kuai > > Please answer this question. > > Regards > Xiao >> >> Best Regards >> Xiao >>> >>> Thanks, >>> Kuai >>> >>>> + >>>> sb->sb_csum = calc_sb_1_csum(sb); >>>> rv = store_super1(st, di->fd); >>>> >>> >>> > > > . >
On Mon, Mar 10, 2025 at 9:38 AM Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2025/03/08 13:27, Xiao Ni 写道: > > On Sat, Mar 8, 2025 at 11:06 AM Xiao Ni <xni@redhat.com> wrote: > >> > >> Hi Guanghao > >> > >> Thanks for your patch. > >> > >> On Tue, Mar 4, 2025 at 8:27 PM Yu Kuai <yukuai1@huaweicloud.com> wrote: > >>> > >>> Hi, > >>> > >>> 在 2025/03/04 14:12, Wu Guanghao 写道: > >>>> When testing the raid1, I found that adding disk to raid1 fails. > >>>> Here's how to reproduce it: > >>>> > >>>> 1. modprobe brd rd_nr=3 rd_size=524288 > >>>> 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 > >>>> > >>>> 3. mdadm /dev/md0 -f /dev/ram0 > >>>> 4. mdadm /dev/md0 -r /dev/ram0 > >>>> > >>>> 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks > >>>> 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state > >>>> > >>>> 7. mkfs.xfs /dev/md0 > >> > >> Do we need this step7 here? > > > > I confirmed myself. The answer is yes. > >> > >>>> 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen > >>>> Bad-blocks on /dev/ram1: > >>>> 10000 for 100 sectors > >>>> > >>>> 9. mdadm /dev/md0 -a /dev/ram2 > >>>> mdadm: add new device failed for /dev/ram2 as 2: Invalid argument > >>> > >>> Can you add a new regression test as well? > >>> > >>>> > >>>> When adding a disk to a RAID1 array, the metadata is read from the existing > >>>> member disks for synchronization. However, only the bad_blocks flag are copied, > >>>> the bad_blocks records are not copied, so the bad_blocks records are all zeros. > >>>> The kernel function super_1_load() detects bad_blocks flag and reads the > >>>> bad_blocks record, then sets the bad block using badblocks_set(). > >>>> > >>>> After the kernel commit 1726c7746("badblocks: improve badblocks_set() for > >>>> multiple ranges handling"), if the length of a bad_blocks record is 0, it will > >>>> return a failure. Therefore the device addition will fail. > >> > >> Can you give the specific function replace with "it will return a failure" here? > > > > I know, you mean badblocks_set which is called in super_1_load. It's > > better to describe clearly in a commit message. > >> > >>>> > >>>> So, don't set the bad_blocks flag when initializing the metadata, kernel can > >>>> handle it. > >>>> > >>>> Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> > >>>> --- > >>>> super1.c | 3 +++ > >>>> 1 file changed, 3 insertions(+) > >>>> > >>>> diff --git a/super1.c b/super1.c > >>>> index fe3c4c64..03578e5b 100644 > >>>> --- a/super1.c > >>>> +++ b/super1.c > >>>> @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) > >>>> if (raid0_need_layout) > >>>> sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); > >>>> > >>>> + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) > >>>> + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); > >>> > >>> There are also other flags that is per rdev, like MD_FEATURE_REPLACEMENT > >>> and MD_FEATURE_JOURNAL, they should be excluded as well. > >> > >> Hmm, why do we need to remove these flags too? It's better to use a > >> separate patch rather than using this patch and describe it in detail. > >> It's better to give an example also. > > This MD_FEATURE_REPLACEMENT should be removed, because this is per-rdev > flag, means this rdev is an replacement, and this should never be copied > to new rdev: > > if (le32_to_cpu(sb->feature_map) & MD_FEATURE_REPLACEMENT) > set_bit(Replacement, &rdev->flags); > > This is exactaly the same as MD_FEATURE_BAD_BLOCKS, means this rdev has > bad blocks. > > And I'm wrong about MD_FEATURE_JOURNAL, this doesn't not mean this rdev > is journal. Thanks for clarifying this. So please send the new version of this patch. Regards Xiao > > Thanks, > Kuai > > > > > Please answer this question. > > > > Regards > > Xiao > >> > >> Best Regards > >> Xiao > >>> > >>> Thanks, > >>> Kuai > >>> > >>>> + > >>>> sb->sb_csum = calc_sb_1_csum(sb); > >>>> rv = store_super1(st, di->fd); > >>>> > >>> > >>> > > > > > > . > > >
diff --git a/super1.c b/super1.c index fe3c4c64..03578e5b 100644 --- a/super1.c +++ b/super1.c @@ -2139,6 +2139,9 @@ static int write_init_super1(struct supertype *st) if (raid0_need_layout) sb->feature_map |= __cpu_to_le32(MD_FEATURE_RAID0_LAYOUT); + if (sb->feature_map & MD_FEATURE_BAD_BLOCKS) + sb->feature_map &= ~__cpu_to_le32(MD_FEATURE_BAD_BLOCKS); + sb->sb_csum = calc_sb_1_csum(sb); rv = store_super1(st, di->fd);
When testing the raid1, I found that adding disk to raid1 fails. Here's how to reproduce it: 1. modprobe brd rd_nr=3 rd_size=524288 2. mdadm -Cv /dev/md0 -l1 -n2 -e1.0 /dev/ram0 /dev/ram1 3. mdadm /dev/md0 -f /dev/ram0 4. mdadm /dev/md0 -r /dev/ram0 5. echo "10000 100" > /sys/block/md0/md/dev-ram1/bad_blocks 6. echo "write_error" > /sys/block/md0/md/dev-ram1/state 7. mkfs.xfs /dev/md0 8. mdadm --examine-badblocks /dev/ram1 # Bad block records can be seen Bad-blocks on /dev/ram1: 10000 for 100 sectors 9. mdadm /dev/md0 -a /dev/ram2 mdadm: add new device failed for /dev/ram2 as 2: Invalid argument When adding a disk to a RAID1 array, the metadata is read from the existing member disks for synchronization. However, only the bad_blocks flag are copied, the bad_blocks records are not copied, so the bad_blocks records are all zeros. The kernel function super_1_load() detects bad_blocks flag and reads the bad_blocks record, then sets the bad block using badblocks_set(). After the kernel commit 1726c7746("badblocks: improve badblocks_set() for multiple ranges handling"), if the length of a bad_blocks record is 0, it will return a failure. Therefore the device addition will fail. So, don't set the bad_blocks flag when initializing the metadata, kernel can handle it. Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> --- super1.c | 3 +++ 1 file changed, 3 insertions(+)