Message ID | 20220408073916.1428590-9-yukuai3@huawei.com (mailing list archive)
---|---
State | New, archived
Series | improve tag allocation under heavy load
On 4/8/22 00:39, Yu Kuai wrote:
> Always waking up 'wake_batch' threads will intensify competition and
> split I/O won't be issued continuously. Now that the number of tags
> required is recorded for huge I/O, it's safe to wake up based on the
> required tags.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  lib/sbitmap.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index 8d01e02ea4b1..eac9fa5c2b4d 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -614,6 +614,26 @@ static inline void sbq_update_preemption(struct sbitmap_queue *sbq,
>          WRITE_ONCE(sbq->force_tag_preemption, force);
>  }
>
> +static unsigned int get_wake_nr(struct sbq_wait_state *ws, unsigned int nr_tags)

Consider renaming "get_wake_nr()" into "nr_to_wake_up()".

> +{
> +        struct sbq_wait *wait;
> +        struct wait_queue_entry *entry;
> +        unsigned int nr = 1;
> +
> +        spin_lock_irq(&ws->wait.lock);
> +        list_for_each_entry(entry, &ws->wait.head, entry) {
> +                wait = container_of(entry, struct sbq_wait, wait);
> +                if (nr_tags <= wait->nr_tags)
> +                        break;
> +
> +                nr++;
> +                nr_tags -= wait->nr_tags;
> +        }
> +        spin_unlock_irq(&ws->wait.lock);
> +
> +        return nr;
> +}
> +
>  static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>  {
>          struct sbq_wait_state *ws;
> @@ -648,7 +668,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>          smp_mb__before_atomic();
>          atomic_set(&ws->wait_cnt, wake_batch);
>          sbq_update_preemption(sbq, wake_batch);
> -        wake_up_nr(&ws->wait, wake_batch);
> +        wake_up_nr(&ws->wait, get_wake_nr(ws, wake_batch));
>
>          return true;
>  }

ws->wait.lock is unlocked after the number of threads to wake up has been
computed and is locked again by wake_up_nr(). The ws->wait.head list may be
modified after get_wake_nr() returns and before wake_up_nr() is called. Isn't
that a race condition?

Thanks,

Bart.
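For reference, one hypothetical way to close the window described above (not
part of the posted series) would be to compute the count and issue the
wake-ups under a single acquisition of ws->wait.lock, for example via
__wake_up_locked(), which expects its caller to hold the wait-queue lock. A
rough sketch, assuming the sbq_wait/sbq_wait_state definitions used by the
patch:

/*
 * Hypothetical sketch, not from the posted patch: count and wake under one
 * acquisition of ws->wait.lock so the waiter list cannot change in between.
 */
static void sbq_wake_nr_locked(struct sbq_wait_state *ws, unsigned int nr_tags)
{
        struct sbq_wait *wait;
        struct wait_queue_entry *entry;
        unsigned int nr = 1;

        spin_lock_irq(&ws->wait.lock);
        list_for_each_entry(entry, &ws->wait.head, entry) {
                wait = container_of(entry, struct sbq_wait, wait);
                if (nr_tags <= wait->nr_tags)
                        break;

                nr++;
                nr_tags -= wait->nr_tags;
        }
        /* Wake-ups happen before the lock is dropped. */
        __wake_up_locked(&ws->wait, TASK_NORMAL, nr);
        spin_unlock_irq(&ws->wait.lock);
}

This trades the race window for a longer hold of ws->wait.lock while the
wake-ups are performed.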
On 4/8/22 00:39, Yu Kuai wrote:
> +static unsigned int get_wake_nr(struct sbq_wait_state *ws, unsigned int nr_tags)
> +{
> +        struct sbq_wait *wait;
> +        struct wait_queue_entry *entry;
> +        unsigned int nr = 1;
> +
> +        spin_lock_irq(&ws->wait.lock);
> +        list_for_each_entry(entry, &ws->wait.head, entry) {
> +                wait = container_of(entry, struct sbq_wait, wait);
> +                if (nr_tags <= wait->nr_tags)
> +                        break;
> +
> +                nr++;
> +                nr_tags -= wait->nr_tags;
> +        }
> +        spin_unlock_irq(&ws->wait.lock);
> +
> +        return nr;
> +}
> +
>  static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>  {
>          struct sbq_wait_state *ws;
> @@ -648,7 +668,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>          smp_mb__before_atomic();
>          atomic_set(&ws->wait_cnt, wake_batch);
>          sbq_update_preemption(sbq, wake_batch);
> -        wake_up_nr(&ws->wait, wake_batch);
> +        wake_up_nr(&ws->wait, get_wake_nr(ws, wake_batch));
>
>          return true;
>  }

An additional comment: my understanding is that __sbq_wake_up() should wake up
exactly `wake_batch` waiters. The above patch changes that into waking up at
most `wake_batch` waiters. I think that's wrong.

Bart.
On 2022/04/09 5:13, Bart Van Assche wrote:
> On 4/8/22 00:39, Yu Kuai wrote:
>> +static unsigned int get_wake_nr(struct sbq_wait_state *ws, unsigned int nr_tags)
>> +{
>> +        struct sbq_wait *wait;
>> +        struct wait_queue_entry *entry;
>> +        unsigned int nr = 1;
>> +
>> +        spin_lock_irq(&ws->wait.lock);
>> +        list_for_each_entry(entry, &ws->wait.head, entry) {
>> +                wait = container_of(entry, struct sbq_wait, wait);
>> +                if (nr_tags <= wait->nr_tags)
>> +                        break;
>> +
>> +                nr++;
>> +                nr_tags -= wait->nr_tags;
>> +        }
>> +        spin_unlock_irq(&ws->wait.lock);
>> +
>> +        return nr;
>> +}
>> +
>>  static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>>  {
>>          struct sbq_wait_state *ws;
>> @@ -648,7 +668,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>>          smp_mb__before_atomic();
>>          atomic_set(&ws->wait_cnt, wake_batch);
>>          sbq_update_preemption(sbq, wake_batch);
>> -        wake_up_nr(&ws->wait, wake_batch);
>> +        wake_up_nr(&ws->wait, get_wake_nr(ws, wake_batch));
>>          return true;
>>  }
>
> An additional comment: my understanding is that __sbq_wake_up() should
> wake up exactly `wake_batch` waiters. The above patch changes that into
> waking up at most `wake_batch` waiters. I think that's wrong.

Hi,

I think the reason to wake up 'wake_batch' waiters is to make sure the
wakers will use up the 'wake_batch' tags that were just freed, because
each waker should acquire at least one tag. Thus I think that if we can
make sure the wakers will use up 'wake_batch' tags, it's OK to wake up
fewer waiters. Please kindly correct me if I'm wrong.

Thanks,
Kuai

>
> Bart.
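The argument above can be checked with a small userspace model of the counting
loop. The waiter demands and the wake_batch value below are made-up examples
rather than values from the patch; only the loop itself mirrors the proposed
get_wake_nr():

/*
 * Userspace model of the counting logic in the proposed get_wake_nr().
 * The demands[] array stands in for sbq_wait->nr_tags of the queued
 * waiters; the values are invented for illustration.
 */
#include <stdio.h>

static unsigned int demands[] = { 1, 3, 4, 2, 1 };
#define NR_WAITERS (sizeof(demands) / sizeof(demands[0]))

/* Mirrors the list walk in get_wake_nr(), but over the array above. */
static unsigned int nr_to_wake(unsigned int nr_tags)
{
        unsigned int nr = 1;
        unsigned int i;

        for (i = 0; i < NR_WAITERS; i++) {
                if (nr_tags <= demands[i])
                        break;
                nr++;
                nr_tags -= demands[i];
        }
        return nr;
}

int main(void)
{
        unsigned int wake_batch = 8;
        unsigned int nr = nr_to_wake(wake_batch);
        unsigned int covered = 0, i;

        for (i = 0; i < nr && i < NR_WAITERS; i++)
                covered += demands[i];

        /* Prints: wake_batch=8 -> wake 3 waiters covering 8 tags */
        printf("wake_batch=%u -> wake %u waiters covering %u tags\n",
               wake_batch, nr, covered);
        return 0;
}

With demands {1, 3, 4, 2, 1} and wake_batch = 8, only three of the five
waiters are woken, but together they ask for all eight freed tags.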
On 2022/04/08 22:31, Bart Van Assche wrote:
> On 4/8/22 00:39, Yu Kuai wrote:
>> Always waking up 'wake_batch' threads will intensify competition and
>> split I/O won't be issued continuously. Now that the number of tags
>> required is recorded for huge I/O, it's safe to wake up based on the
>> required tags.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>  lib/sbitmap.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>> index 8d01e02ea4b1..eac9fa5c2b4d 100644
>> --- a/lib/sbitmap.c
>> +++ b/lib/sbitmap.c
>> @@ -614,6 +614,26 @@ static inline void sbq_update_preemption(struct sbitmap_queue *sbq,
>>          WRITE_ONCE(sbq->force_tag_preemption, force);
>>  }
>> +static unsigned int get_wake_nr(struct sbq_wait_state *ws, unsigned int nr_tags)
>
> Consider renaming "get_wake_nr()" into "nr_to_wake_up()".
>
>> +{
>> +        struct sbq_wait *wait;
>> +        struct wait_queue_entry *entry;
>> +        unsigned int nr = 1;
>> +
>> +        spin_lock_irq(&ws->wait.lock);
>> +        list_for_each_entry(entry, &ws->wait.head, entry) {
>> +                wait = container_of(entry, struct sbq_wait, wait);
>> +                if (nr_tags <= wait->nr_tags)
>> +                        break;
>> +
>> +                nr++;
>> +                nr_tags -= wait->nr_tags;
>> +        }
>> +        spin_unlock_irq(&ws->wait.lock);
>> +
>> +        return nr;
>> +}
>> +
>>  static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>>  {
>>          struct sbq_wait_state *ws;
>> @@ -648,7 +668,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
>>          smp_mb__before_atomic();
>>          atomic_set(&ws->wait_cnt, wake_batch);
>>          sbq_update_preemption(sbq, wake_batch);
>> -        wake_up_nr(&ws->wait, wake_batch);
>> +        wake_up_nr(&ws->wait, get_wake_nr(ws, wake_batch));
>>          return true;
>>  }
>
> ws->wait.lock is unlocked after the number of threads to wake up has
> been computed and is locked again by wake_up_nr(). The ws->wait.head
> list may be modified after get_wake_nr() returns and before wake_up_nr()
> is called. Isn't that a race condition?

Hi,

That is a race condition; I was hoping that the problem that patch 5
fixes can cover this.

Thanks,
Kuai

>
> Thanks,
>
> Bart.
On 4/8/22 19:17, yukuai (C) wrote:
> I think the reason to wake up 'wake_batch' waiters is to make sure the
> wakers will use up the 'wake_batch' tags that were just freed, because
> each waker should acquire at least one tag. Thus I think that if we can
> make sure the wakers will use up 'wake_batch' tags, it's OK to wake up
> fewer waiters.

Hmm ... I think it's up to you to (a) explain this behavior change in detail
in the commit message and (b) to prove that this behavior change won't cause
trouble (I guess this change will cause trouble).

Thanks,

Bart.
On 2022/04/09 12:16, Bart Van Assche wrote:
> On 4/8/22 19:17, yukuai (C) wrote:
>> I think the reason to wake up 'wake_batch' waiters is to make sure the
>> wakers will use up the 'wake_batch' tags that were just freed, because
>> each waker should acquire at least one tag. Thus I think that if we can
>> make sure the wakers will use up 'wake_batch' tags, it's OK to wake up
>> fewer waiters.
>
> Hmm ... I think it's up to you to (a) explain this behavior change in
> detail in the commit message and (b) to prove that this behavior change
> won't cause trouble (I guess this change will cause trouble).

Hi, Bart

Sorry that the commit message doesn't explain this clearly. There are only
two situations in which fewer than 'wake_batch' waiters will be woken up
after this patch:

(a) Some wakers will acquire multiple tags. As I mentioned above, this is
OK because the wakers will use up 'wake_batch' tags.

(b) The total number of waiters is less than 'wake_batch'. This is
problematic if tag preemption is disabled, because I/O concurrency will
decline. (Patch 5 should fix that problem.)

For the race where new threads start to wait after get_wake_nr() and
before wake_up_nr() in situation (b), I can't figure out how this can be
problematic; however, it can be optimized by triggering additional
wake-ups:

@@ -623,15 +623,17 @@ static unsigned int get_wake_nr(struct sbq_wait_state *ws, unsigned int nr_tags)
         spin_lock_irq(&ws->wait.lock);
         list_for_each_entry(entry, &ws->wait.head, entry) {
                 wait = container_of(entry, struct sbq_wait, wait);
-                if (nr_tags <= wait->nr_tags)
+                if (nr_tags <= wait->nr_tags) {
+                        nr_tags = 0;
                         break;
+                }
 
                 nr++;
                 nr_tags -= wait->nr_tags;
         }
         spin_unlock_irq(&ws->wait.lock);
 
-        return nr;
+        return nr + nr_tags;
 }

What do you think?

Thanks,
Kuai

>
> Thanks,
>
> Bart.
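Applied to the same kind of toy model as before (again with made-up waiter
demands, not values from the patch), the proposed hunk pads the return value
with the tags the queued waiters cannot absorb, so wake_up_nr() may also reach
threads that queue up in the meantime:

/*
 * Userspace model of get_wake_nr() with the hunk above applied: when the
 * waiter list runs out before nr_tags is used up, the leftover is added to
 * the return value so additional wake-ups are requested. Demands are
 * invented for illustration.
 */
#include <stdio.h>

static unsigned int demands[] = { 1, 2 };      /* fewer waiters than tags */
#define NR_WAITERS (sizeof(demands) / sizeof(demands[0]))

static unsigned int nr_to_wake_padded(unsigned int nr_tags)
{
        unsigned int nr = 1;
        unsigned int i;

        for (i = 0; i < NR_WAITERS; i++) {
                if (nr_tags <= demands[i]) {
                        nr_tags = 0;
                        break;
                }
                nr++;
                nr_tags -= demands[i];
        }
        /* nr_tags is non-zero only if the list ran out first (case (b)). */
        return nr + nr_tags;
}

int main(void)
{
        /* Two waiters asking for 3 tags in total, 8 tags freed: the walk
         * yields nr == 3 and 5 tags are left over, so 8 wake-ups are
         * requested. Prints: wake 8 */
        printf("wake %u\n", nr_to_wake_padded(8));
        return 0;
}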
On 4/9/22 00:01, yukuai (C) wrote:
> For the race where new threads start to wait after get_wake_nr() and
> before wake_up_nr() in situation (b), I can't figure out how this can be
> problematic [ ... ]

If the atomic_dec_return() statement in __sbq_wake_up() returns a value that
is less than or equal to zero, wake_batch waiters should be woken up or I/O
will hang until sbitmap_queue_wake_all() is called. That last function should
not be called unless there is no alternative.

Thanks,

Bart.
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 8d01e02ea4b1..eac9fa5c2b4d 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -614,6 +614,26 @@ static inline void sbq_update_preemption(struct sbitmap_queue *sbq,
         WRITE_ONCE(sbq->force_tag_preemption, force);
 }
 
+static unsigned int get_wake_nr(struct sbq_wait_state *ws, unsigned int nr_tags)
+{
+        struct sbq_wait *wait;
+        struct wait_queue_entry *entry;
+        unsigned int nr = 1;
+
+        spin_lock_irq(&ws->wait.lock);
+        list_for_each_entry(entry, &ws->wait.head, entry) {
+                wait = container_of(entry, struct sbq_wait, wait);
+                if (nr_tags <= wait->nr_tags)
+                        break;
+
+                nr++;
+                nr_tags -= wait->nr_tags;
+        }
+        spin_unlock_irq(&ws->wait.lock);
+
+        return nr;
+}
+
 static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 {
         struct sbq_wait_state *ws;
@@ -648,7 +668,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
         smp_mb__before_atomic();
         atomic_set(&ws->wait_cnt, wake_batch);
         sbq_update_preemption(sbq, wake_batch);
-        wake_up_nr(&ws->wait, wake_batch);
+        wake_up_nr(&ws->wait, get_wake_nr(ws, wake_batch));
 
         return true;
 }
Always waking up 'wake_batch' threads will intensify competition and
split I/O won't be issued continuously. Now that the number of tags
required is recorded for huge I/O, it's safe to wake up based on the
required tags.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 lib/sbitmap.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)