Message ID | 20250204194814.393112-6-axboe@kernel.dk (mailing list archive)
---|---
State | New
Series | io_uring epoll wait support
On 2/4/25 19:46, Jens Axboe wrote:
> If a wait_queue_entry is passed in to epoll_wait(), then utilize this
> new helper for reaping events and/or adding to the epoll waitqueue
> rather than calling the potentially sleeping ep_poll(). It works like
> ep_poll(), except it doesn't block - it either returns the events that
> are already available, or it adds the specified entry to the struct
> eventpoll waitqueue to get a callback when events are triggered. It
> returns -EIOCBQUEUED for that case.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
>  1 file changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index ecaa5591f4be..a8be0c7110e4 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
>  	return res;
>  }
>
> +static int ep_poll_queue(struct eventpoll *ep,
> +			 struct epoll_event __user *events, int maxevents,
> +			 struct wait_queue_entry *wait)
> +{
> +	int res, eavail;
> +
> +	/* See ep_poll() for commentary */
> +	eavail = ep_events_available(ep);
> +	while (1) {
> +		if (eavail) {
> +			res = ep_try_send_events(ep, events, maxevents);
> +			if (res)
> +				return res;
> +		}
> +
> +		eavail = ep_busy_loop(ep, true);

I have doubts we want to busy loop here even if it's just one iteration /
nonblocking. And there is already napi polling support in io_uring, done
from the right spot for io_uring users.
On 2/7/25 5:28 AM, Pavel Begunkov wrote:
> On 2/4/25 19:46, Jens Axboe wrote:
>> If a wait_queue_entry is passed in to epoll_wait(), then utilize this
>> new helper for reaping events and/or adding to the epoll waitqueue
>> rather than calling the potentially sleeping ep_poll(). It works like
>> ep_poll(), except it doesn't block - it either returns the events that
>> are already available, or it adds the specified entry to the struct
>> eventpoll waitqueue to get a callback when events are triggered. It
>> returns -EIOCBQUEUED for that case.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>> index ecaa5591f4be..a8be0c7110e4 100644
>> --- a/fs/eventpoll.c
>> +++ b/fs/eventpoll.c
>> @@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
>>  	return res;
>>  }
>>
>> +static int ep_poll_queue(struct eventpoll *ep,
>> +			 struct epoll_event __user *events, int maxevents,
>> +			 struct wait_queue_entry *wait)
>> +{
>> +	int res, eavail;
>> +
>> +	/* See ep_poll() for commentary */
>> +	eavail = ep_events_available(ep);
>> +	while (1) {
>> +		if (eavail) {
>> +			res = ep_try_send_events(ep, events, maxevents);
>> +			if (res)
>> +				return res;
>> +		}
>> +
>> +		eavail = ep_busy_loop(ep, true);
>
> I have doubts we want to busy loop here even if it's just one iteration /
> nonblocking. And there is already napi polling support in io_uring, done
> from the right spot for io_uring users.

Yeah, I did ponder that, which is why it's passing in timed_out == true to
just do a single loop. We could certainly get rid of that.
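For reference, dropping the busy poll as suggested would be a small change: a
minimal sketch of ep_poll_queue() with the ep_busy_loop() call removed,
leaving NAPI busy polling to io_uring's own support. This is just the hunk
quoted above minus the busy-loop lines, not a tested patch:

static int ep_poll_queue(struct eventpoll *ep,
			 struct epoll_event __user *events, int maxevents,
			 struct wait_queue_entry *wait)
{
	int res, eavail;

	/* See ep_poll() for commentary */
	eavail = ep_events_available(ep);
	while (1) {
		if (eavail) {
			res = ep_try_send_events(ep, events, maxevents);
			if (res)
				return res;
		}

		/* Already armed from a previous call? Caller must wait. */
		if (!list_empty_careful(&wait->entry))
			return -EIOCBQUEUED;

		/* Re-check for events under ep->lock before arming the entry */
		write_lock_irq(&ep->lock);
		eavail = ep_events_available(ep);
		if (!eavail)
			__add_wait_queue_exclusive(&ep->wq, wait);
		write_unlock_irq(&ep->lock);

		if (!eavail)
			return -EIOCBQUEUED;
	}
}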
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index ecaa5591f4be..a8be0c7110e4 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
 	return res;
 }
 
+static int ep_poll_queue(struct eventpoll *ep,
+			 struct epoll_event __user *events, int maxevents,
+			 struct wait_queue_entry *wait)
+{
+	int res, eavail;
+
+	/* See ep_poll() for commentary */
+	eavail = ep_events_available(ep);
+	while (1) {
+		if (eavail) {
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
+				return res;
+		}
+
+		eavail = ep_busy_loop(ep, true);
+		if (eavail)
+			continue;
+
+		if (!list_empty_careful(&wait->entry))
+			return -EIOCBQUEUED;
+
+		write_lock_irq(&ep->lock);
+		eavail = ep_events_available(ep);
+		if (!eavail)
+			__add_wait_queue_exclusive(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+
+		if (!eavail)
+			return -EIOCBQUEUED;
+	}
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  * event buffer.
@@ -2497,7 +2530,9 @@ int epoll_wait(struct file *file, struct epoll_event __user *events,
 	ep = file->private_data;
 
 	/* Time to fish for events ... */
-	return ep_poll(ep, events, maxevents, to);
+	if (!wait)
+		return ep_poll(ep, events, maxevents, to);
+	return ep_poll_queue(ep, events, maxevents, wait);
 }
 
 /*
If a wait_queue_entry is passed in to epoll_wait(), then utilize this
new helper for reaping events and/or adding to the epoll waitqueue
rather than calling the potentially sleeping ep_poll(). It works like
ep_poll(), except it doesn't block - it either returns the events that
are already available, or it adds the specified entry to the struct
eventpoll waitqueue to get a callback when events are triggered. It
returns -EIOCBQUEUED for that case.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)
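To make the queueing contract concrete, below is a hedged sketch of an
in-kernel caller. The epoll_waiter struct, the wake callback, and the retry
scheduling are hypothetical; the internal epoll_wait() signature with the
trailing wait argument is assumed from earlier patches in this series, and
only the non-blocking / -EIOCBQUEUED behavior is defined by this patch:

#include <linux/eventpoll.h>
#include <linux/fs.h>
#include <linux/wait.h>

/* Hypothetical caller state; io_uring would embed this in a request */
struct epoll_waiter {
	struct wait_queue_entry wait;
	/* ... retry context for the caller ... */
};

/* Called from the eventpoll waitqueue once events become available */
static int epoll_waiter_wake(struct wait_queue_entry *curr, unsigned mode,
			     int wake_flags, void *key)
{
	/* container_of(curr, struct epoll_waiter, wait) recovers caller state */
	list_del_init(&curr->entry);
	/* ... schedule a retry of the epoll_wait() call, e.g. via task_work ... */
	return 1;
}

static int epoll_reap_or_queue(struct file *epfile,
			       struct epoll_event __user *evs, int maxevents,
			       struct epoll_waiter *ew)
{
	int ret;

	init_waitqueue_func_entry(&ew->wait, epoll_waiter_wake);
	INIT_LIST_HEAD(&ew->wait.entry);

	/* A non-NULL wait entry selects the non-blocking ep_poll_queue() path */
	ret = epoll_wait(epfile, evs, maxevents, NULL, &ew->wait);
	if (ret == -EIOCBQUEUED) {
		/* No events yet; epoll_waiter_wake() fires when some arrive */
	}
	return ret;
}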