Message ID | 20250221081109.734170-6-zhengqixing@huaweicloud.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | badblocks: bugfix and cleanup for badblocks | expand |
On Fri, Feb 21, 2025 at 04:11:02PM +0800, Zheng Qixing wrote: > From: Li Nan <linan122@huawei.com> > > _badblocks_set() returns success if at least one badblock is set > successfully, even if others fail. This can lead to data inconsistencies > in raid, where a failed badblock set should trigger the disk to be kicked > out to prevent future reads from failed write areas. > > _badblocks_set() should return error if any badblock set fails. Instead > of relying on 'rv', directly returning 'sectors' for clearer logic. If all > badblocks are successfully set, 'sectors' will be 0, otherwise it > indicates the number of badblocks that have not been set yet, thus > signaling failure. > > By the way, it can also fix an issue: when a newly set unack badblock is > included in an existing ack badblock, the setting will return an error. > ··· > echo "0 100" /sys/block/md0/md/dev-loop1/bad_blocks > echo "0 100" /sys/block/md0/md/dev-loop1/unacknowledged_bad_blocks > -bash: echo: write error: No space left on device > ``` > After fix, it will return success. > > Fixes: aa511ff8218b ("badblocks: switch to the improved badblock handling code") > Signed-off-by: Li Nan <linan122@huawei.com> > --- > block/badblocks.c | 16 ++++------------ > 1 file changed, 4 insertions(+), 12 deletions(-) > NACK. Such modification will break current API. Current API doesn’t handle partial success/fail condition, if any bad block range is set it should return true. It is not about correct or wrong, just current fragile design. A new API is necessary to handle such thing. This is why I leave the return value as int other than bool. Thanks. Coly Li > diff --git a/block/badblocks.c b/block/badblocks.c > index 1c8b8f65f6df..a953d2e9417f 100644 > --- a/block/badblocks.c > +++ b/block/badblocks.c > @@ -843,7 +843,6 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > struct badblocks_context bad; > int prev = -1, hint = -1; > unsigned long flags; > - int rv = 0; > u64 *p; > > if (bb->shift < 0) > @@ -873,10 +872,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > bad.len = sectors; > len = 0; > > - if (badblocks_full(bb)) { > - rv = 1; > + if (badblocks_full(bb)) > goto out; > - } > > if (badblocks_empty(bb)) { > len = insert_at(bb, 0, &bad); > @@ -916,10 +913,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > int extra = 0; > > if (!can_front_overwrite(bb, prev, &bad, &extra)) { > - if (extra > 0) { > - rv = 1; > + if (extra > 0) > goto out; > - } > > len = min_t(sector_t, > BB_END(p[prev]) - s, sectors); > @@ -986,10 +981,7 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > > write_sequnlock_irqrestore(&bb->lock, flags); > > - if (!added) > - rv = 1; > - > - return rv; > + return sectors; > } > > /* > @@ -1353,7 +1345,7 @@ EXPORT_SYMBOL_GPL(badblocks_check); > * > * Return: > * 0: success > - * 1: failed to set badblocks (out of space) > + * other: failed to set badblocks (out of space) > */ > int badblocks_set(struct badblocks *bb, sector_t s, int sectors, > int acknowledged) > -- > 2.39.2 >
Hi, 在 2025/02/21 17:52, Coly Li 写道: > On Fri, Feb 21, 2025 at 04:11:02PM +0800, Zheng Qixing wrote: >> From: Li Nan <linan122@huawei.com> >> >> _badblocks_set() returns success if at least one badblock is set >> successfully, even if others fail. This can lead to data inconsistencies >> in raid, where a failed badblock set should trigger the disk to be kicked >> out to prevent future reads from failed write areas. >> >> _badblocks_set() should return error if any badblock set fails. Instead >> of relying on 'rv', directly returning 'sectors' for clearer logic. If all >> badblocks are successfully set, 'sectors' will be 0, otherwise it >> indicates the number of badblocks that have not been set yet, thus >> signaling failure. >> >> By the way, it can also fix an issue: when a newly set unack badblock is >> included in an existing ack badblock, the setting will return an error. >> ··· >> echo "0 100" /sys/block/md0/md/dev-loop1/bad_blocks >> echo "0 100" /sys/block/md0/md/dev-loop1/unacknowledged_bad_blocks >> -bash: echo: write error: No space left on device >> ``` >> After fix, it will return success. >> >> Fixes: aa511ff8218b ("badblocks: switch to the improved badblock handling code") >> Signed-off-by: Li Nan <linan122@huawei.com> >> --- >> block/badblocks.c | 16 ++++------------ >> 1 file changed, 4 insertions(+), 12 deletions(-) >> > > NACK. Such modification will break current API. Take a look at current APIs: - for raid, error should be returned, otherwise data may be corrupted. - for nvdimm, there is only error message if fail, and it make sense as well if any badblocks set failed: if (badblocks_set(bb, s, num, 1)) dev_info_once(bb->dev, "%s: failed for sector %llx\n", __func__, (u64) s); - for null_blk, I think it's fine as well. Hence I think it's fine to return error if any badblocks set failed. There is no need to invent a new API and switch all callers to a new API. Thanks, Kuai > > Current API doesn’t handle partial success/fail condition, if any bad block range is set it should return true. > > It is not about correct or wrong, just current fragile design. A new API is necessary to handle such thing. This is why I leave the return value as int other than bool. > > Thanks. > > Coly Li > > > >> diff --git a/block/badblocks.c b/block/badblocks.c >> index 1c8b8f65f6df..a953d2e9417f 100644 >> --- a/block/badblocks.c >> +++ b/block/badblocks.c >> @@ -843,7 +843,6 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, >> struct badblocks_context bad; >> int prev = -1, hint = -1; >> unsigned long flags; >> - int rv = 0; >> u64 *p; >> >> if (bb->shift < 0) >> @@ -873,10 +872,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, >> bad.len = sectors; >> len = 0; >> >> - if (badblocks_full(bb)) { >> - rv = 1; >> + if (badblocks_full(bb)) >> goto out; >> - } >> >> if (badblocks_empty(bb)) { >> len = insert_at(bb, 0, &bad); >> @@ -916,10 +913,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, >> int extra = 0; >> >> if (!can_front_overwrite(bb, prev, &bad, &extra)) { >> - if (extra > 0) { >> - rv = 1; >> + if (extra > 0) >> goto out; >> - } >> >> len = min_t(sector_t, >> BB_END(p[prev]) - s, sectors); >> @@ -986,10 +981,7 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, >> >> write_sequnlock_irqrestore(&bb->lock, flags); >> >> - if (!added) >> - rv = 1; >> - >> - return rv; >> + return sectors; >> } >> >> /* >> @@ -1353,7 +1345,7 @@ EXPORT_SYMBOL_GPL(badblocks_check); >> * >> * Return: >> * 0: success >> - * 1: failed to set badblocks (out of space) >> + * other: failed to set badblocks (out of space) >> */ >> int badblocks_set(struct badblocks *bb, sector_t s, int sectors, >> int acknowledged) >> -- >> 2.39.2 >> >
> 2025年2月21日 18:09,Yu Kuai <yukuai1@huaweicloud.com> 写道: > > Hi, > > 在 2025/02/21 17:52, Coly Li 写道: >> On Fri, Feb 21, 2025 at 04:11:02PM +0800, Zheng Qixing wrote: >>> From: Li Nan <linan122@huawei.com> >>> >>> _badblocks_set() returns success if at least one badblock is set >>> successfully, even if others fail. This can lead to data inconsistencies >>> in raid, where a failed badblock set should trigger the disk to be kicked >>> out to prevent future reads from failed write areas. >>> >>> _badblocks_set() should return error if any badblock set fails. Instead >>> of relying on 'rv', directly returning 'sectors' for clearer logic. If all >>> badblocks are successfully set, 'sectors' will be 0, otherwise it >>> indicates the number of badblocks that have not been set yet, thus >>> signaling failure. >>> >>> By the way, it can also fix an issue: when a newly set unack badblock is >>> included in an existing ack badblock, the setting will return an error. >>> ··· >>> echo "0 100" /sys/block/md0/md/dev-loop1/bad_blocks >>> echo "0 100" /sys/block/md0/md/dev-loop1/unacknowledged_bad_blocks >>> -bash: echo: write error: No space left on device >>> ``` >>> After fix, it will return success. >>> >>> Fixes: aa511ff8218b ("badblocks: switch to the improved badblock handling code") >>> Signed-off-by: Li Nan <linan122@huawei.com> >>> --- >>> block/badblocks.c | 16 ++++------------ >>> 1 file changed, 4 insertions(+), 12 deletions(-) >>> >> NACK. Such modification will break current API. > > Take a look at current APIs: > - for raid, error should be returned, otherwise data may be corrupted. > - for nvdimm, there is only error message if fail, and it make sense as > well if any badblocks set failed: > if (badblocks_set(bb, s, num, 1)) > dev_info_once(bb->dev, "%s: failed for sector %llx\n", > __func__, (u64) s); > - for null_blk, I think it's fine as well. > > Hence I think it's fine to return error if any badblocks set failed. > There is no need to invent a new API and switch all callers to a new > API. So we don’t need to add a negative return value for partial success/failure? Coly Li
Hi, 在 2025/02/21 18:12, Coly Li 写道: > So we don’t need to add a negative return value for partial success/failure? > > Coly Li. Yes, I think so. No one really use this value, and patch 10 clean this up by changing return type to bool. Thanks, Kuai
On Sat, Feb 22, 2025 at 09:12:53AM +0800, Yu Kuai wrote: > Hi, > > 在 2025/02/21 18:12, Coly Li 写道: > > So we don’t need to add a negative return value for partial success/failure? > > > > Coly Li. > > Yes, I think so. No one really use this value, and patch 10 clean this > up by changing return type to bool. OK, then it is fine to me. It will be good to add a code comment that parital setting will be treated as failure. Thanks.
On Fri, Feb 21, 2025 at 04:11:02PM +0800, Zheng Qixing wrote: > From: Li Nan <linan122@huawei.com> > > _badblocks_set() returns success if at least one badblock is set > successfully, even if others fail. This can lead to data inconsistencies > in raid, where a failed badblock set should trigger the disk to be kicked > out to prevent future reads from failed write areas. > > _badblocks_set() should return error if any badblock set fails. Instead > of relying on 'rv', directly returning 'sectors' for clearer logic. If all > badblocks are successfully set, 'sectors' will be 0, otherwise it > indicates the number of badblocks that have not been set yet, thus > signaling failure. > > By the way, it can also fix an issue: when a newly set unack badblock is > included in an existing ack badblock, the setting will return an error. > ··· > echo "0 100" /sys/block/md0/md/dev-loop1/bad_blocks > echo "0 100" /sys/block/md0/md/dev-loop1/unacknowledged_bad_blocks > -bash: echo: write error: No space left on device > ``` > After fix, it will return success. > > Fixes: aa511ff8218b ("badblocks: switch to the improved badblock handling code") > Signed-off-by: Li Nan <linan122@huawei.com> Based on Kuai's reply that we dont handle partial set and treat it as failure, I am fine with this patch. It will be good to add comment to explain that the parital set condition will be treated as failure. Acked-by: Coly Li <colyli@kernel.org> Thanks. > --- > block/badblocks.c | 16 ++++------------ > 1 file changed, 4 insertions(+), 12 deletions(-) > > diff --git a/block/badblocks.c b/block/badblocks.c > index 1c8b8f65f6df..a953d2e9417f 100644 > --- a/block/badblocks.c > +++ b/block/badblocks.c > @@ -843,7 +843,6 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > struct badblocks_context bad; > int prev = -1, hint = -1; > unsigned long flags; > - int rv = 0; > u64 *p; > > if (bb->shift < 0) > @@ -873,10 +872,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > bad.len = sectors; > len = 0; > > - if (badblocks_full(bb)) { > - rv = 1; > + if (badblocks_full(bb)) > goto out; > - } > > if (badblocks_empty(bb)) { > len = insert_at(bb, 0, &bad); > @@ -916,10 +913,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > int extra = 0; > > if (!can_front_overwrite(bb, prev, &bad, &extra)) { > - if (extra > 0) { > - rv = 1; > + if (extra > 0) > goto out; > - } > > len = min_t(sector_t, > BB_END(p[prev]) - s, sectors); > @@ -986,10 +981,7 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, > > write_sequnlock_irqrestore(&bb->lock, flags); > > - if (!added) > - rv = 1; > - > - return rv; > + return sectors; > } > > /* > @@ -1353,7 +1345,7 @@ EXPORT_SYMBOL_GPL(badblocks_check); > * > * Return: > * 0: success > - * 1: failed to set badblocks (out of space) > + * other: failed to set badblocks (out of space) > */ > int badblocks_set(struct badblocks *bb, sector_t s, int sectors, > int acknowledged) > -- > 2.39.2 >
diff --git a/block/badblocks.c b/block/badblocks.c index 1c8b8f65f6df..a953d2e9417f 100644 --- a/block/badblocks.c +++ b/block/badblocks.c @@ -843,7 +843,6 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, struct badblocks_context bad; int prev = -1, hint = -1; unsigned long flags; - int rv = 0; u64 *p; if (bb->shift < 0) @@ -873,10 +872,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, bad.len = sectors; len = 0; - if (badblocks_full(bb)) { - rv = 1; + if (badblocks_full(bb)) goto out; - } if (badblocks_empty(bb)) { len = insert_at(bb, 0, &bad); @@ -916,10 +913,8 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, int extra = 0; if (!can_front_overwrite(bb, prev, &bad, &extra)) { - if (extra > 0) { - rv = 1; + if (extra > 0) goto out; - } len = min_t(sector_t, BB_END(p[prev]) - s, sectors); @@ -986,10 +981,7 @@ static int _badblocks_set(struct badblocks *bb, sector_t s, int sectors, write_sequnlock_irqrestore(&bb->lock, flags); - if (!added) - rv = 1; - - return rv; + return sectors; } /* @@ -1353,7 +1345,7 @@ EXPORT_SYMBOL_GPL(badblocks_check); * * Return: * 0: success - * 1: failed to set badblocks (out of space) + * other: failed to set badblocks (out of space) */ int badblocks_set(struct badblocks *bb, sector_t s, int sectors, int acknowledged)