Message ID | 20220829131502.165356-2-yukuai1@huaweicloud.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Song Liu |
Series | md/raid10: reduce lock contention for io |
>>>>> "Yu" == Yu Kuai <yukuai1@huaweicloud.com> writes:
Yu> From: Yu Kuai <yukuai3@huawei.com>
Yu> 'conf->barrier' is protected by 'conf->resync_lock', reading
Yu> 'conf->barrier' without holding the lock is wrong.
Yu> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Yu> ---
Yu> drivers/md/raid10.c | 2 +-
Yu> 1 file changed, 1 insertion(+), 1 deletion(-)
Yu> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
Yu> index 9117fcdee1be..b70c207f7932 100644
Yu> --- a/drivers/md/raid10.c
Yu> +++ b/drivers/md/raid10.c
Yu> @@ -930,8 +930,8 @@ static void flush_pending_writes(struct r10conf *conf)
Yu> static void raise_barrier(struct r10conf *conf, int force)
Yu> {
Yu> - BUG_ON(force && !conf->barrier);
Yu> spin_lock_irq(&conf->resync_lock);
Yu> + BUG_ON(force && !conf->barrier);
I don't like this BUG_ON() at all, why are you crashing the system
here instead of just doing a simple WARN_ONCE() instead? Is there
anything the user can do to get into this situation on their own, or
does it really signify a logic error in the code? If so, why are you
killing the system?
Yu> /* Wait until no block IO is waiting (unless 'force') */
Yu> wait_event_lock_irq(conf->wait_barrier, force || !conf->nr_waiting,
Yu> --
Yu> 2.31.1
Hi, John

On 2022/08/30 3:53, John Stoffel wrote:
>>>>>> "Yu" == Yu Kuai <yukuai1@huaweicloud.com> writes:
>
> Yu> From: Yu Kuai <yukuai3@huawei.com>
> Yu> 'conf->barrier' is protected by 'conf->resync_lock', reading
> Yu> 'conf->barrier' without holding the lock is wrong.
>
> Yu> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> Yu> ---
> Yu> drivers/md/raid10.c | 2 +-
> Yu> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Yu> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> Yu> index 9117fcdee1be..b70c207f7932 100644
> Yu> --- a/drivers/md/raid10.c
> Yu> +++ b/drivers/md/raid10.c
> Yu> @@ -930,8 +930,8 @@ static void flush_pending_writes(struct r10conf *conf)
>
> Yu> static void raise_barrier(struct r10conf *conf, int force)
> Yu> {
> Yu> - BUG_ON(force && !conf->barrier);
> Yu> spin_lock_irq(&conf->resync_lock);
> Yu> + BUG_ON(force && !conf->barrier);
>
> I don't like this BUG_ON() at all, why are you crashing the system
> here instead of just doing a simple WARN_ONCE() instead? Is there
> anything the user can do to get into this situation on their own, or
> does it really signify a logic error in the code? If so, why are you
> killing the system?

I'm not sure why BUG_ON() is used here. I just noticed that
'conf->barrier' is read without holding 'resync_lock', so the BUG_ON()
can trigger as a false positive.

Thanks,
Kuai

> Yu> /* Wait until no block IO is waiting (unless 'force') */
> Yu> wait_event_lock_irq(conf->wait_barrier, force || !conf->nr_waiting,
> Yu> --
> Yu> 2.31.1
Dear John,

On 29.08.22 at 21:53, John Stoffel wrote:
>>>>>> "Yu" == Yu Kuai <yukuai1@huaweicloud.com> writes:
>
> Yu> From: Yu Kuai <yukuai3@huawei.com>

The quoting style is really confusing, as it does not seem to be the
standard, and a lot of MUAs won’t mark up the citation.

[…]

> Yu> 'conf->barrier' is protected by 'conf->resync_lock', reading
> Yu> 'conf->barrier' without holding the lock is wrong.
>
> Yu> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> Yu> ---
> Yu> drivers/md/raid10.c | 2 +-
> Yu> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Yu> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> Yu> index 9117fcdee1be..b70c207f7932 100644
> Yu> --- a/drivers/md/raid10.c
> Yu> +++ b/drivers/md/raid10.c
> Yu> @@ -930,8 +930,8 @@ static void flush_pending_writes(struct r10conf *conf)
>
> Yu> static void raise_barrier(struct r10conf *conf, int force)
> Yu> {
> Yu> - BUG_ON(force && !conf->barrier);
> Yu> spin_lock_irq(&conf->resync_lock);
> Yu> + BUG_ON(force && !conf->barrier);
>
> I don't like this BUG_ON() at all, why are you crashing the system
> here instead of just doing a simple WARN_ONCE() instead? Is there
> anything the user can do to get into this situation on their own, or
> does it really signify a logic error in the code? If so, why are you
> killing the system?

As you can see, the BUG_ON() was there before, so it’s unrelated to this
patch and Yu is not killing anything.

[…]

> Yu> /* Wait until no block IO is waiting (unless 'force') */
> Yu> wait_event_lock_irq(conf->wait_barrier, force || !conf->nr_waiting,
> Yu> --
> Yu> 2.31.1

Kind regards,

Paul
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 9117fcdee1be..b70c207f7932 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -930,8 +930,8 @@ static void flush_pending_writes(struct r10conf *conf)
 
 static void raise_barrier(struct r10conf *conf, int force)
 {
-	BUG_ON(force && !conf->barrier);
 	spin_lock_irq(&conf->resync_lock);
+	BUG_ON(force && !conf->barrier);
 
 	/* Wait until no block IO is waiting (unless 'force') */
 	wait_event_lock_irq(conf->wait_barrier, force || !conf->nr_waiting,
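
The two points raised in this thread (read 'conf->barrier' only while holding 'resync_lock', and prefer a one-time warning over crashing) can be illustrated with a small userspace sketch. This is not kernel code and not part of the patch: struct demo_conf, demo_raise_barrier(), the 'warned' flag, and the -1 return value are hypothetical names and choices made for illustration, and a pthread mutex stands in for the spinlock. The real raise_barrier() returns void and also waits on 'conf->wait_barrier', which is left out here.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the r10conf fields discussed above. */
struct demo_conf {
	pthread_mutex_t resync_lock;	/* stands in for conf->resync_lock */
	int barrier;			/* stands in for conf->barrier */
	bool warned;			/* set once the warning has been printed */
};

/*
 * Simplified sketch: check the invariant only after taking the lock that
 * protects 'barrier', and warn once instead of aborting.
 * Returns 0 on success, -1 if 'force' was set while no barrier was raised.
 */
static int demo_raise_barrier(struct demo_conf *conf, int force)
{
	int ret = 0;

	pthread_mutex_lock(&conf->resync_lock);

	/* 'barrier' is read under the lock, so a concurrent update cannot race. */
	if (force && !conf->barrier) {
		if (!conf->warned) {
			conf->warned = true;
			fprintf(stderr, "demo_raise_barrier: force set but barrier is 0\n");
		}
		ret = -1;
	} else {
		conf->barrier++;
	}

	pthread_mutex_unlock(&conf->resync_lock);
	return ret;
}
```

Calling demo_raise_barrier(&conf, 1) on a freshly initialised conf (barrier == 0) prints the warning once and returns -1 without touching 'barrier'; calling it with force == 0 increments 'barrier'. Whether returning an error is acceptable in the real driver is a separate question that this thread does not settle.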