Message ID | 20241029085559.98390-2-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2,1/2] block: introduce init_wait_func() | expand |
On 2024/10/29 16:55, Muchun Song wrote: > When rq_qos_wait() is first introduced, it is easy to understand. But > with some bug fixes applied, it is not easy for newcomers to understand > the whole logic under those fixes. In this patch, rq_qos_wait() is > refactored and more comments are added for better understanding. There > are 3 points for the improvement: > > 1) Use waitqueue_active() instead of wq_has_sleeper() to eliminate > unnecessary memory barrier in wq_has_sleeper() which is supposed > to be used in waker side. In this case, we do need the barrier. > So use the cheaper one to locklessly test for waiters on the queue. > > 2) Remove acquire_inflight_cb() logic for the first waiter out of the > while loop to make the code clear. > > 3) Add more comments to explain how to sync with different waiters and > the waker. > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > Reviewed-by: Yu Kuai <yukuai3@huawei.com> Looks good to me! Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Thanks. > --- > v2: > - Introduce init_wait_func() in a seprate patch (Yu Kuai). > > block/blk-rq-qos.c | 68 ++++++++++++++++++++++++++++++++-------------- > 1 file changed, 47 insertions(+), 21 deletions(-) > > diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c > index 858ce69c04ece..5d995d389eaf5 100644 > --- a/block/blk-rq-qos.c > +++ b/block/blk-rq-qos.c > @@ -223,6 +223,14 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr, > * Remove explicitly and use default_wake_function(). > */ > default_wake_function(curr, mode, wake_flags, key); > + /* > + * Note that the order of operations is important as finish_wait() > + * tests whether @curr is removed without grabbing the lock. This > + * should be the last thing to do to make sure we will not have a > + * UAF access to @data. And the semantics of memory barrier in it > + * also make sure the waiter will see the latest @data->got_token > + * once list_empty_careful() in finish_wait() returns true. > + */ > list_del_init_careful(&curr->entry); > return 1; > } > @@ -248,37 +256,55 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data, > cleanup_cb_t *cleanup_cb) > { > struct rq_qos_wait_data data = { > - .rqw = rqw, > - .cb = acquire_inflight_cb, > - .private_data = private_data, > + .rqw = rqw, > + .cb = acquire_inflight_cb, > + .private_data = private_data, > + .got_token = false, > }; > - bool has_sleeper; > + bool first_waiter; > > - has_sleeper = wq_has_sleeper(&rqw->wait); > - if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) > + /* > + * If there are no waiters in the waiting queue, try to increase the > + * inflight counter if we can. Otherwise, prepare for adding ourselves > + * to the waiting queue. > + */ > + if (!waitqueue_active(&rqw->wait) && acquire_inflight_cb(rqw, private_data)) > return; > > init_wait_func(&data.wq, rq_qos_wake_function); > - has_sleeper = !prepare_to_wait_exclusive(&rqw->wait, &data.wq, > + first_waiter = prepare_to_wait_exclusive(&rqw->wait, &data.wq, > TASK_UNINTERRUPTIBLE); > + /* > + * Make sure there is at least one inflight process; otherwise, waiters > + * will never be woken up. Since there may be no inflight process before > + * adding ourselves to the waiting queue above, we need to try to > + * increase the inflight counter for ourselves. And it is sufficient to > + * guarantee that at least the first waiter to enter the waiting queue > + * will re-check the waiting condition before going to sleep, thus > + * ensuring forward progress. > + */ > + if (!data.got_token && first_waiter && acquire_inflight_cb(rqw, private_data)) { > + finish_wait(&rqw->wait, &data.wq); > + /* > + * We raced with rq_qos_wake_function() getting a token, > + * which means we now have two. Put our local token > + * and wake anyone else potentially waiting for one. > + * > + * Enough memory barrier in list_empty_careful() in > + * finish_wait() is paired with list_del_init_careful() > + * in rq_qos_wake_function() to make sure we will see > + * the latest @data->got_token. > + */ > + if (data.got_token) > + cleanup_cb(rqw, private_data); > + return; > + } > + > + /* we are now relying on the waker to increase our inflight counter. */ > do { > - /* The memory barrier in set_task_state saves us here. */ > if (data.got_token) > break; > - if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) { > - finish_wait(&rqw->wait, &data.wq); > - > - /* > - * We raced with rq_qos_wake_function() getting a token, > - * which means we now have two. Put our local token > - * and wake anyone else potentially waiting for one. > - */ > - if (data.got_token) > - cleanup_cb(rqw, private_data); > - return; > - } > io_schedule(); > - has_sleeper = true; > set_current_state(TASK_UNINTERRUPTIBLE); > } while (1); > finish_wait(&rqw->wait, &data.wq);
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c index 858ce69c04ece..5d995d389eaf5 100644 --- a/block/blk-rq-qos.c +++ b/block/blk-rq-qos.c @@ -223,6 +223,14 @@ static int rq_qos_wake_function(struct wait_queue_entry *curr, * Remove explicitly and use default_wake_function(). */ default_wake_function(curr, mode, wake_flags, key); + /* + * Note that the order of operations is important as finish_wait() + * tests whether @curr is removed without grabbing the lock. This + * should be the last thing to do to make sure we will not have a + * UAF access to @data. And the semantics of memory barrier in it + * also make sure the waiter will see the latest @data->got_token + * once list_empty_careful() in finish_wait() returns true. + */ list_del_init_careful(&curr->entry); return 1; } @@ -248,37 +256,55 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data, cleanup_cb_t *cleanup_cb) { struct rq_qos_wait_data data = { - .rqw = rqw, - .cb = acquire_inflight_cb, - .private_data = private_data, + .rqw = rqw, + .cb = acquire_inflight_cb, + .private_data = private_data, + .got_token = false, }; - bool has_sleeper; + bool first_waiter; - has_sleeper = wq_has_sleeper(&rqw->wait); - if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) + /* + * If there are no waiters in the waiting queue, try to increase the + * inflight counter if we can. Otherwise, prepare for adding ourselves + * to the waiting queue. + */ + if (!waitqueue_active(&rqw->wait) && acquire_inflight_cb(rqw, private_data)) return; init_wait_func(&data.wq, rq_qos_wake_function); - has_sleeper = !prepare_to_wait_exclusive(&rqw->wait, &data.wq, + first_waiter = prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE); + /* + * Make sure there is at least one inflight process; otherwise, waiters + * will never be woken up. Since there may be no inflight process before + * adding ourselves to the waiting queue above, we need to try to + * increase the inflight counter for ourselves. And it is sufficient to + * guarantee that at least the first waiter to enter the waiting queue + * will re-check the waiting condition before going to sleep, thus + * ensuring forward progress. + */ + if (!data.got_token && first_waiter && acquire_inflight_cb(rqw, private_data)) { + finish_wait(&rqw->wait, &data.wq); + /* + * We raced with rq_qos_wake_function() getting a token, + * which means we now have two. Put our local token + * and wake anyone else potentially waiting for one. + * + * Enough memory barrier in list_empty_careful() in + * finish_wait() is paired with list_del_init_careful() + * in rq_qos_wake_function() to make sure we will see + * the latest @data->got_token. + */ + if (data.got_token) + cleanup_cb(rqw, private_data); + return; + } + + /* we are now relying on the waker to increase our inflight counter. */ do { - /* The memory barrier in set_task_state saves us here. */ if (data.got_token) break; - if (!has_sleeper && acquire_inflight_cb(rqw, private_data)) { - finish_wait(&rqw->wait, &data.wq); - - /* - * We raced with rq_qos_wake_function() getting a token, - * which means we now have two. Put our local token - * and wake anyone else potentially waiting for one. - */ - if (data.got_token) - cleanup_cb(rqw, private_data); - return; - } io_schedule(); - has_sleeper = true; set_current_state(TASK_UNINTERRUPTIBLE); } while (1); finish_wait(&rqw->wait, &data.wq);