[05/11] eventpoll: add ep_poll_queue() loop

Message ID 20250204194814.393112-6-axboe@kernel.dk (mailing list archive)
State New
Series io_uring epoll wait support

Commit Message

Jens Axboe Feb. 4, 2025, 7:46 p.m. UTC
If a wait_queue_entry is passed in to epoll_wait(), then utilize this
new helper for reaping events and/or adding to the epoll waitqueue
rather than calling the potentially sleeping ep_poll(). It works like
ep_poll(), except it doesn't block - it either returns the events that
are already available, or it adds the specified entry to the struct
eventpoll waitqueue to get a callback when events are triggered. It
returns -EIOCBQUEUED for that case.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)
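
For readers who want to see the control flow outside the kernel, here is a minimal userspace sketch of the "reap what is ready, otherwise queue a callback" behaviour described in the commit message. It is a toy model, not the kernel code: struct model_ep, struct model_waiter, model_poll_queue(), model_post_event() and MODEL_EIOCBQUEUED are all hypothetical names, the single waiter slot stands in for the real waitqueue, and locking and busy polling are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel-internal -EIOCBQUEUED; the value is illustrative. */
#define MODEL_EIOCBQUEUED (-529)

/* Hypothetical waiter, playing the role of the wait_queue_entry. */
struct model_waiter {
	bool queued;   /* true while parked on the model's wait slot */
	int wakeups;   /* number of times the wakeup has fired */
};

/* Hypothetical, heavily simplified stand-in for struct eventpoll. */
struct model_ep {
	int ready;                    /* count of ready events */
	struct model_waiter *waiter;  /* single-slot "waitqueue" */
};

/*
 * Never blocks: returns the number of events reaped (> 0), or
 * MODEL_EIOCBQUEUED after making sure the waiter is queued.
 */
static int model_poll_queue(struct model_ep *ep, int *events, int maxevents,
			    struct model_waiter *w)
{
	assert(maxevents > 0);

	if (ep->ready > 0) {
		int n = ep->ready < maxevents ? ep->ready : maxevents;

		for (int i = 0; i < n; i++)
			events[i] = i;	/* placeholder payload */
		ep->ready -= n;
		return n;
	}

	/* Already parked by an earlier call: do not add it twice. */
	if (!w->queued) {
		ep->waiter = w;
		w->queued = true;
	}
	return MODEL_EIOCBQUEUED;
}

/* Producer side: mark one event ready and fire the parked waiter, if any. */
static void model_post_event(struct model_ep *ep)
{
	ep->ready++;
	if (ep->waiter) {
		struct model_waiter *w = ep->waiter;

		ep->waiter = NULL;	/* assumption: the wakeup dequeues the waiter */
		w->queued = false;
		w->wakeups++;
	}
}
```

In this model a caller that gets MODEL_EIOCBQUEUED simply returns and calls model_poll_queue() again once its wakeup has fired. That is the same shape as the retry an io_uring-style caller would do after its wait_queue_entry callback runs.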

Comments

Pavel Begunkov Feb. 7, 2025, 12:28 p.m. UTC | #1
On 2/4/25 19:46, Jens Axboe wrote:
> If a wait_queue_entry is passed in to epoll_wait(), then utilize this
> new helper for reaping events and/or adding to the epoll waitqueue
> rather than calling the potentially sleeping ep_poll(). It works like
> ep_poll(), except it doesn't block - it either returns the events that
> are already available, or it adds the specified entry to the struct
> eventpoll waitqueue to get a callback when events are triggered. It
> returns -EIOCBQUEUED for that case.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
>   1 file changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index ecaa5591f4be..a8be0c7110e4 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
>   	return res;
>   }
>   
> +static int ep_poll_queue(struct eventpoll *ep,
> +			 struct epoll_event __user *events, int maxevents,
> +			 struct wait_queue_entry *wait)
> +{
> +	int res, eavail;
> +
> +	/* See ep_poll() for commentary */
> +	eavail = ep_events_available(ep);
> +	while (1) {
> +		if (eavail) {
> +			res = ep_try_send_events(ep, events, maxevents);
> +			if (res)
> +				return res;
> +		}
> +
> +		eavail = ep_busy_loop(ep, true);

I have doubts we want to busy loop here, even if it's just one iteration /
nonblocking. And there is already NAPI polling support in io_uring, done
from the right spot for io_uring users.

Jens Axboe Feb. 7, 2025, 2:29 p.m. UTC | #2
On 2/7/25 5:28 AM, Pavel Begunkov wrote:
> On 2/4/25 19:46, Jens Axboe wrote:
>> If a wait_queue_entry is passed in to epoll_wait(), then utilize this
>> new helper for reaping events and/or adding to the epoll waitqueue
>> rather than calling the potentially sleeping ep_poll(). It works like
>> ep_poll(), except it doesn't block - it either returns the events that
>> are already available, or it adds the specified entry to the struct
>> eventpoll waitqueue to get a callback when events are triggered. It
>> returns -EIOCBQUEUED for that case.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>> index ecaa5591f4be..a8be0c7110e4 100644
>> --- a/fs/eventpoll.c
>> +++ b/fs/eventpoll.c
>> @@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
>>       return res;
>>   }
>>   +static int ep_poll_queue(struct eventpoll *ep,
>> +             struct epoll_event __user *events, int maxevents,
>> +             struct wait_queue_entry *wait)
>> +{
>> +    int res, eavail;
>> +
>> +    /* See ep_poll() for commentary */
>> +    eavail = ep_events_available(ep);
>> +    while (1) {
>> +        if (eavail) {
>> +            res = ep_try_send_events(ep, events, maxevents);
>> +            if (res)
>> +                return res;
>> +        }
>> +
>> +        eavail = ep_busy_loop(ep, true);
> 
> I have doubts we want to busy loop here, even if it's just one iteration /
> nonblocking. And there is already NAPI polling support in io_uring, done
> from the right spot for io_uring users.

Yeah, I did ponder that, which is why it's passing in timed_out == true
to just do a single loop. We could certainly get rid of that.
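
To make the "single loop" point concrete, here is a small userspace sketch of a busy-poll helper whose nonblock flag limits it to one pass. It only models the behaviour discussed in this exchange. struct toy_dev, toy_poll_once() and toy_busy_loop() are hypothetical names, and the max_spins cap is an assumption standing in for whatever time budget a real busy-poll loop would use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device: an event shows up on the ready_after-th poll. */
struct toy_dev {
	int ready_after;
	int polls;	/* how many times the device has been polled */
};

/* One poll pass; returns true once an event is available. */
static bool toy_poll_once(struct toy_dev *d)
{
	d->polls++;
	return d->polls >= d->ready_after;
}

/*
 * Busy-poll the device. With nonblock set, exactly one pass is made,
 * which is the behaviour the patch asks for by passing true. Without
 * it, the loop spins until an event appears or max_spins is used up.
 */
static bool toy_busy_loop(struct toy_dev *d, bool nonblock, int max_spins)
{
	bool found;
	int spins = 0;

	assert(max_spins > 0);
	do {
		found = toy_poll_once(d);
		spins++;
	} while (!found && !nonblock && spins < max_spins);

	return found;
}
```

Dropping the call altogether, as suggested above, would correspond to never invoking toy_busy_loop() from the non-blocking path and leaving any polling to the caller.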

Patch

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index ecaa5591f4be..a8be0c7110e4 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2032,6 +2032,39 @@  static int ep_try_send_events(struct eventpoll *ep,
 	return res;
 }
 
+static int ep_poll_queue(struct eventpoll *ep,
+			 struct epoll_event __user *events, int maxevents,
+			 struct wait_queue_entry *wait)
+{
+	int res, eavail;
+
+	/* See ep_poll() for commentary */
+	eavail = ep_events_available(ep);
+	while (1) {
+		if (eavail) {
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
+				return res;
+		}
+
+		eavail = ep_busy_loop(ep, true);
+		if (eavail)
+			continue;
+
+		if (!list_empty_careful(&wait->entry))
+			return -EIOCBQUEUED;
+
+		write_lock_irq(&ep->lock);
+		eavail = ep_events_available(ep);
+		if (!eavail)
+			__add_wait_queue_exclusive(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+
+		if (!eavail)
+			return -EIOCBQUEUED;
+	}
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2497,7 +2530,9 @@  int epoll_wait(struct file *file, struct epoll_event __user *events,
 	ep = file->private_data;
 
 	/* Time to fish for events ... */
-	return ep_poll(ep, events, maxevents, to);
+	if (!wait)
+		return ep_poll(ep, events, maxevents, to);
+	return ep_poll_queue(ep, events, maxevents, wait);
 }
 
 /*