Message ID | 20190710195227.92322-1-josef@toxicpanda.com (mailing list archive)
---|---
State | New, archived
Series | [1/2] wait: add wq_has_multiple_sleepers helper
On 7/10/19 1:52 PM, Josef Bacik wrote:
> rq-qos sits in the io path so we want to take locks as sparingly as
> possible. To accomplish this we try not to take the waitqueue head lock
> unless we are sure we need to go to sleep, and we have an optimization
> to make sure that we don't starve out existing waiters. Since we check
> if there are existing waiters locklessly we need to be able to update
> our view of the waitqueue list after we've added ourselves to the
> waitqueue. Accomplish this by adding this helper to see if there are
> more than two waiters on the waitqueue.
>
> Suggested-by: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  include/linux/wait.h | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/include/linux/wait.h b/include/linux/wait.h
> index b6f77cf60dd7..89c41a7b3046 100644
> --- a/include/linux/wait.h
> +++ b/include/linux/wait.h
> @@ -126,6 +126,27 @@ static inline int waitqueue_active(struct wait_queue_head *wq_head)
>  	return !list_empty(&wq_head->head);
>  }
>
> +/**
> + * wq_has_multiple_sleepers - check if there are multiple waiting prcesses
> + * @wq_head: wait queue head
> + *
> + * Returns true of wq_head has multiple waiting processes.
> + *
> + * Please refer to the comment for waitqueue_active.
> + */
> +static inline bool wq_has_multiple_sleepers(struct wait_queue_head *wq_head)
> +{
> +	/*
> +	 * We need to be sure we are in sync with the
> +	 * add_wait_queue modifications to the wait queue.
> +	 *
> +	 * This memory barrier should be paired with one on the
> +	 * waiting side.
> +	 */
> +	smp_mb();
> +	return !list_is_singular(&wq_head->head);
> +}
> +
>  /**
>   * wq_has_sleeper - check if there are any waiting processes
>   * @wq_head: wait queue head

This (and 2/2) looks good to me, better than v1 for sure. Peter/Ingo,
are you OK with adding this new helper?

For reference, this (and the next patch) replace the alternative, which
is an open-coding of prepare_to_wait():

https://lore.kernel.org/linux-block/20190710190514.86911-1-josef@toxicpanda.com/
On Wed, Jul 10, 2019 at 02:23:23PM -0600, Jens Axboe wrote:
> On 7/10/19 1:52 PM, Josef Bacik wrote:
> > [...]
> > +static inline bool wq_has_multiple_sleepers(struct wait_queue_head *wq_head)
> > +{
> > +	/*
> > +	 * We need to be sure we are in sync with the
> > +	 * add_wait_queue modifications to the wait queue.
> > +	 *
> > +	 * This memory barrier should be paired with one on the
> > +	 * waiting side.
> > +	 */
> > +	smp_mb();
> > +	return !list_is_singular(&wq_head->head);
> > +}
> > [...]
>
> This (and 2/2) looks good to me, better than v1 for sure. Peter/Ingo,
> are you OK with adding this new helper? For reference, this (and the
> next patch) replace the alternative, which is an open-coding of
> prepare_to_wait():
>
> https://lore.kernel.org/linux-block/20190710190514.86911-1-josef@toxicpanda.com/

Yet another approach would be to have prepare_to_wait*() return this
state, but I think this is ok.

The smp_mb() is superfluous -- in your specific case -- since
prepare_to_wait*() already does one through set_current_state().

So you could do without it, I think.
On 7/10/19 2:35 PM, Peter Zijlstra wrote:
> On Wed, Jul 10, 2019 at 02:23:23PM -0600, Jens Axboe wrote:
>> [...]
>> This (and 2/2) looks good to me, better than v1 for sure. Peter/Ingo,
>> are you OK with adding this new helper? For reference, this (and the
>> next patch) replace the alternative, which is an open-coding of
>> prepare_to_wait():
>>
>> https://lore.kernel.org/linux-block/20190710190514.86911-1-josef@toxicpanda.com/
>
> Yet another approach would be to have prepare_to_wait*() return this
> state, but I think this is ok.

We did discuss that case, but it seems somewhat random to have it return
that specific piece of info. But it'd work for this case.

> The smp_mb() is superfluous -- in your specific case -- since
> prepare_to_wait*() already does one through set_current_state().
>
> So you could do without it, I think.

But that's specific to this use case. Maybe it's the only one we'll
have, and then it's fine, but as a generic helper it seems safer to
include the same ordering protection as wq_has_sleeper().
Jens,

I managed to convince myself I understand why 2/2 needs this change...
But rq_qos_wait() still looks suspicious to me. Why can't the main loop
"break" right after io_schedule()? rq_qos_wake_function() either sets
data->got_token = true or it doesn't wake up the waiter sleeping in
io_schedule().

This means that data.got_token = F at the 2nd iteration is only possible
after a spurious wakeup, right? But in this case we need to set state =
TASK_UNINTERRUPTIBLE again to avoid busy-wait looping?

Oleg.
On 07/11, Oleg Nesterov wrote:
>
> Jens,
>
> I managed to convince myself I understand why 2/2 needs this change...
> But rq_qos_wait() still looks suspicious to me. Why can't the main loop
> "break" right after io_schedule()? rq_qos_wake_function() either sets
> data->got_token = true or it doesn't wake up the waiter sleeping in
> io_schedule()
>
> This means that data.got_token = F at the 2nd iteration is only possible
> after a spurious wakeup, right? But in this case we need to set state =
> TASK_UNINTERRUPTIBLE again to avoid busy-wait looping ?

Oh. I can be easily wrong, I never read this code before, but it seems to
me there is another unrelated race.

rq_qos_wait() can't rely on finish_wait() because it doesn't necessarily
take wq_head->lock.

rq_qos_wait() inside the main loop does

	if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
		finish_wait(&rqw->wait, &data.wq);

		/*
		 * We raced with wbt_wake_function() getting a token,
		 * which means we now have two. Put our local token
		 * and wake anyone else potentially waiting for one.
		 */
		if (data.got_token)
			cleanup_cb(rqw, private_data);
		break;
	}

finish_wait() + "if (data.got_token)" can race with rq_qos_wake_function()
which does

	data->got_token = true;
	list_del_init(&curr->entry);

rq_qos_wait() can see these changes out-of-order: finish_wait() can see
list_empty_careful() == T and avoid wq_head->lock, and in this case the
code above can see data->got_token = false.

No?

and I don't really understand

	has_sleeper = false;

at the end of the main loop. I think it should do "has_sleeper = true",
we need to execute the code above only once, right after prepare_to_wait().
But this is harmless.

Oleg.
On Thu, Jul 11, 2019 at 03:40:06PM +0200, Oleg Nesterov wrote:
> On 07/11, Oleg Nesterov wrote:
> > [...]
>
> Oh. I can be easily wrong, I never read this code before, but it seems to
> me there is another unrelated race.
>
> rq_qos_wait() can't rely on finish_wait() because it doesn't necessarily
> take wq_head->lock.
>
> [...]
>
> finish_wait() + "if (data.got_token)" can race with rq_qos_wake_function()
> which does
>
> 	data->got_token = true;
> 	list_del_init(&curr->entry);
>

Argh, finish_wait() does __set_current_state, well that's shitty. I guess
we need to do

	data->got_token = true;
	smp_wmb();
	list_del_init(&curr->entry);

and then do

	smp_rmb();
	if (data.got_token)
		cleanup_cb(rqw, private_data);

to be safe?

> rq_qos_wait() can see these changes out-of-order: finish_wait() can see
> list_empty_careful() == T and avoid wq_head->lock, and in this case the
> code above can see data->got_token = false.
>
> No?
>
> and I don't really understand
>
> 	has_sleeper = false;
>
> at the end of the main loop. I think it should do "has_sleeper = true",
> we need to execute the code above only once, right after prepare_to_wait().
> But this is harmless.

We want has_sleeper = false because the second time around we just want
to grab the inflight counter. Yes, we should have been woken up by our
special thing and so should already have data.got_token, but that sort of
thinking ends in hung boxes and me having to try to mitigate thousands of
boxes suddenly hitting a case we didn't think was possible. Thanks,

Josef
On 07/11, Josef Bacik wrote:
>
> On Thu, Jul 11, 2019 at 03:40:06PM +0200, Oleg Nesterov wrote:
> > [...]
> > finish_wait() + "if (data.got_token)" can race with rq_qos_wake_function()
> > which does
> >
> > 	data->got_token = true;
> > 	list_del_init(&curr->entry);
>
> Argh, finish_wait() does __set_current_state, well that's shitty.

Hmm. I think this is irrelevant,

> 	data->got_token = true;
> 	smp_wmb();
> 	list_del_init(&curr->entry);
>
> and then do
>
> 	smp_rmb();
> 	if (data.got_token)
> 		cleanup_cb(rqw, private_data);

Yes, this should work,

> > and I don't really understand
> >
> > 	has_sleeper = false;
> >
> > at the end of the main loop. I think it should do "has_sleeper = true",
> > we need to execute the code above only once, right after prepare_to_wait().
> > But this is harmless.
>
> We want has_sleeper = false because the second time around we just want
> to grab the inflight counter.

I don't think so.

> Yes, we should have been woken up by our special thing
> and so should already have data.got_token,

Yes. Again, unless wakeup was spurious, and this needs another trivial fix.
If we can't rely on this then this code is simply broken?

> but that sort of thinking ends in hung boxes and me having to try to
> mitigate thousands of boxes suddenly hitting a case we didn't think was
> possible.

I can't understand this logic, but I can't argue. However, in this case
I'd suggest the patch below instead of this series.

If rq_qos_wait() does the unnecessary acquire_inflight_cb() because it can
hit a case we didn't think was possible, then why can't it do it on the
first iteration for the same reason? This should equally fix the problem
and simplify the code.

In case it is not clear: no, I don't like it. Just I can't understand
your logic.

And btw... again, I won't argue, but wq_has_multiple_sleepers is badly
named, and the comments are simply wrong. It can return T if wq has no
sleepers, iow if list_empty(wq_head->head). 2/2 actually uses
!wq_has_multiple_sleepers(), this turns the condition back into
list_is_singular(), but to me this all looks very confusing. Plus I too
do not understand smp_mb() in this helper.

Oleg.

--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -247,7 +247,7 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 	do {
 		if (data.got_token)
 			break;
-		if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) {
+		if (acquire_inflight_cb(rqw, private_data)) {
 			finish_wait(&rqw->wait, &data.wq);
 
 			/*
@@ -260,7 +260,6 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
 			break;
 		}
 		io_schedule();
-		has_sleeper = false;
 	} while (1);
 	finish_wait(&rqw->wait, &data.wq);
 }
diff --git a/include/linux/wait.h b/include/linux/wait.h
index b6f77cf60dd7..89c41a7b3046 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -126,6 +126,27 @@ static inline int waitqueue_active(struct wait_queue_head *wq_head)
 	return !list_empty(&wq_head->head);
 }
 
+/**
+ * wq_has_multiple_sleepers - check if there are multiple waiting processes
+ * @wq_head: wait queue head
+ *
+ * Returns true if @wq_head has multiple waiting processes.
+ *
+ * Please refer to the comment for waitqueue_active.
+ */
+static inline bool wq_has_multiple_sleepers(struct wait_queue_head *wq_head)
+{
+	/*
+	 * We need to be sure we are in sync with the
+	 * add_wait_queue modifications to the wait queue.
+	 *
+	 * This memory barrier should be paired with one on the
+	 * waiting side.
+	 */
+	smp_mb();
+	return !list_is_singular(&wq_head->head);
+}
+
 /**
  * wq_has_sleeper - check if there are any waiting processes
  * @wq_head: wait queue head
rq-qos sits in the io path so we want to take locks as sparingly as
possible. To accomplish this we try not to take the waitqueue head lock
unless we are sure we need to go to sleep, and we have an optimization
to make sure that we don't starve out existing waiters. Since we check
if there are existing waiters locklessly we need to be able to update
our view of the waitqueue list after we've added ourselves to the
waitqueue. Accomplish this by adding this helper to see if there is
more than one waiter on the waitqueue.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 include/linux/wait.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)