
[RFC] io_uring: make signalfd work with io_uring (and aio) POLL

Message ID 58059c9c-adf9-1683-99f5-7e45280aea87@kernel.dk (mailing list archive)
State New, archived

Commit Message

Jens Axboe Nov. 14, 2019, 4:31 a.m. UTC
This is a case of "I don't really know what I'm doing, but this works
for me". Caveat emptor, but I'd love some input on this.

I got a bug report that using the poll command with signalfd doesn't
work for io_uring. The reporter also noted that it doesn't work with the
aio poll implementation either. So I took a look at it.

What happens is that the original task issues the poll request, we call
->poll() (which ends up with signalfd for this fd), and find that
nothing is pending. Then we wait, and the poll is passed to async
context. When the requested signal comes in, that worker is woken up,
and proceeds to call ->poll() again, and signalfd unsurprisingly finds
no signals pending, since it's the async worker calling it.

That's obviously no good. The below allows you to pass in the task in
the poll_table, and it does the right thing for me, signal is delivered
and the correct mask is checked in signalfd_poll().

Similar patch for aio would be trivial, of course.

Not-really-signed-off-by: Jens Axboe <axboe@kernel.dk>

---

Comments

Jens Axboe Nov. 14, 2019, 4:49 a.m. UTC | #1
On 11/13/19 9:31 PM, Jens Axboe wrote:
> This is a case of "I don't really know what I'm doing, but this works
> for me". Caveat emptor, but I'd love some input on this.
> 
> I got a bug report that using the poll command with signalfd doesn't
> work for io_uring. The reporter also noted that it doesn't work with the
> aio poll implementation either. So I took a look at it.
> 
> What happens is that the original task issues the poll request, we call
> ->poll() (which ends up with signalfd for this fd), and find that
> nothing is pending. Then we wait, and the poll is passed to async
> context. When the requested signal comes in, that worker is woken up,
> and proceeds to call ->poll() again, and signalfd unsurprisingly finds
> no signals pending, since it's the async worker calling it.
> 
> That's obviously no good. The below allows you to pass in the task in
> the poll_table, and it does the right thing for me, signal is delivered
> and the correct mask is checked in signalfd_poll().
> 
> Similar patch for aio would be trivial, of course.

From the probably-less-nasty category, Jann Horn helpfully pointed out
that it'd be easier if signalfd just looked at the task that originally
created the fd instead. That looks like the below, and works equally
well for the test case at hand.

diff --git a/fs/signalfd.c b/fs/signalfd.c
index 44b6845b071c..cc72b5b08946 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -50,6 +50,7 @@ void signalfd_cleanup(struct sighand_struct *sighand)
 
 struct signalfd_ctx {
 	sigset_t sigmask;
+	struct task_struct *task;
 };
 
 static int signalfd_release(struct inode *inode, struct file *file)
@@ -63,14 +64,14 @@ static __poll_t signalfd_poll(struct file *file, poll_table *wait)
 	struct signalfd_ctx *ctx = file->private_data;
 	__poll_t events = 0;
 
-	poll_wait(file, &current->sighand->signalfd_wqh, wait);
+	poll_wait(file, &ctx->task->sighand->signalfd_wqh, wait);
 
-	spin_lock_irq(&current->sighand->siglock);
-	if (next_signal(&current->pending, &ctx->sigmask) ||
-	    next_signal(&current->signal->shared_pending,
+	spin_lock_irq(&ctx->task->sighand->siglock);
+	if (next_signal(&ctx->task->pending, &ctx->sigmask) ||
+	    next_signal(&ctx->task->signal->shared_pending,
 			&ctx->sigmask))
 		events |= EPOLLIN;
-	spin_unlock_irq(&current->sighand->siglock);
+	spin_unlock_irq(&ctx->task->sighand->siglock);
 
 	return events;
 }
@@ -280,6 +281,7 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 			return -ENOMEM;
 
 		ctx->sigmask = *mask;
+		ctx->task = current;
 
 		/*
 		 * When we call this, the initialization must be complete, since
Rasmus Villemoes Nov. 14, 2019, 9:19 a.m. UTC | #2
On 14/11/2019 05.49, Jens Axboe wrote:
> On 11/13/19 9:31 PM, Jens Axboe wrote:
>> This is a case of "I don't really know what I'm doing, but this works
>> for me". Caveat emptor, but I'd love some input on this.
>>
>> I got a bug report that using the poll command with signalfd doesn't
>> work for io_uring. The reporter also noted that it doesn't work with the
>> aio poll implementation either. So I took a look at it.
>>
>> What happens is that the original task issues the poll request, we call
>> ->poll() (which ends up with signalfd for this fd), and find that
>> nothing is pending. Then we wait, and the poll is passed to async
>> context. When the requested signal comes in, that worker is woken up,
>> and proceeds to call ->poll() again, and signalfd unsurprisingly finds
>> no signals pending, since it's the async worker calling it.
>>
>> That's obviously no good. The below allows you to pass in the task in
>> the poll_table, and it does the right thing for me, signal is delivered
>> and the correct mask is checked in signalfd_poll().
>>
>> Similar patch for aio would be trivial, of course.
> 
> From the probably-less-nasty category, Jann Horn helpfully pointed out
> that it'd be easier if signalfd just looked at the task that originally
> created the fd instead. That looks like the below, and works equally
> well for the test case at hand.

Eh, how should that work? If I create a signalfd() and fork(), the
child's signalfd should only be concerned with signals sent to the
child. Not to mention what happens after the parent dies and the child
polls its fd.

Or am I completely confused?

Rasmus
Jann Horn Nov. 14, 2019, 1:46 p.m. UTC | #3
On Thu, Nov 14, 2019 at 10:20 AM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
> On 14/11/2019 05.49, Jens Axboe wrote:
> > On 11/13/19 9:31 PM, Jens Axboe wrote:
> >> This is a case of "I don't really know what I'm doing, but this works
> >> for me". Caveat emptor, but I'd love some input on this.
> >>
> >> I got a bug report that using the poll command with signalfd doesn't
> >> work for io_uring. The reporter also noted that it doesn't work with the
> >> aio poll implementation either. So I took a look at it.
> >>
> >> What happens is that the original task issues the poll request, we call
> >> ->poll() (which ends up with signalfd for this fd), and find that
> >> nothing is pending. Then we wait, and the poll is passed to async
> >> context. When the requested signal comes in, that worker is woken up,
> >> and proceeds to call ->poll() again, and signalfd unsurprisingly finds
> >> no signals pending, since it's the async worker calling it.
> >>
> >> That's obviously no good. The below allows you to pass in the task in
> >> the poll_table, and it does the right thing for me, signal is delivered
> >> and the correct mask is checked in signalfd_poll().
> >>
> >> Similar patch for aio would be trivial, of course.
> >
> > From the probably-less-nasty category, Jann Horn helpfully pointed out
> > that it'd be easier if signalfd just looked at the task that originally
> > created the fd instead. That looks like the below, and works equally
> > well for the test case at hand.
>
> Eh, how should that work? If I create a signalfd() and fork(), the
> child's signalfd should only be concerned with signals sent to the
> child. Not to mention what happens after the parent dies and the child
> polls its fd.
>
> Or am I completely confused?

I think the child should not be getting signals for the child when
it's reading from the parent's signalfd. read() and write() aren't
supposed to look at properties of `current`. If I send an fd to some
daemon via SCM_RIGHTS, and the daemon does a read() on it, that should
never cause signals to disappear from the daemon's signal queue.

Of course, if someone does rely on the current (silly) semantics, this
might break stuff.

And we probably also don't want to just let the signalfd keep a
reference to a task, because then if the task later goes through a
setuid transition, you'd still be able to dequeue its signals. So it'd
have to also check against ->self_exec_id or something like that.
Rasmus Villemoes Nov. 14, 2019, 2:12 p.m. UTC | #4
On 14/11/2019 14.46, Jann Horn wrote:
> On Thu, Nov 14, 2019 at 10:20 AM Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
>> On 14/11/2019 05.49, Jens Axboe wrote:
>>> On 11/13/19 9:31 PM, Jens Axboe wrote:
>>>> This is a case of "I don't really know what I'm doing, but this works
>>>> for me". Caveat emptor, but I'd love some input on this.
>>>>
>>>> I got a bug report that using the poll command with signalfd doesn't
>>>> work for io_uring. The reporter also noted that it doesn't work with the
>>>> aio poll implementation either. So I took a look at it.
>>>>
>>>> What happens is that the original task issues the poll request, we call
>>>> ->poll() (which ends up with signalfd for this fd), and find that
>>>> nothing is pending. Then we wait, and the poll is passed to async
>>>> context. When the requested signal comes in, that worker is woken up,
>>>> and proceeds to call ->poll() again, and signalfd unsurprisingly finds
>>>> no signals pending, since it's the async worker calling it.
>>>>
>>>> That's obviously no good. The below allows you to pass in the task in
>>>> the poll_table, and it does the right thing for me, signal is delivered
>>>> and the correct mask is checked in signalfd_poll().
>>>>
>>>> Similar patch for aio would be trivial, of course.
>>>
>>> From the probably-less-nasty category, Jann Horn helpfully pointed out
>>> that it'd be easier if signalfd just looked at the task that originally
>>> created the fd instead. That looks like the below, and works equally
>>> well for the test case at hand.
>>
>> Eh, how should that work? If I create a signalfd() and fork(), the
>> child's signalfd should only be concerned with signals sent to the
>> child. Not to mention what happens after the parent dies and the child
>> polls its fd.
>>
>> Or am I completely confused?
> 
> I think the child should not be getting signals for the child when
> it's reading from the parent's signalfd. read() and write() aren't
> supposed to look at properties of `current`.

That may be, but this has always been the semantics of signalfd(), quite
clearly documented in 'man signalfd'.

> Of course, if someone does rely on the current (silly) semantics, this
> might break stuff.

That, and Jens' patch only seemed to change the poll callback, so the
child (or whoever else got a hand on that signalfd) would wait for the
parent to get a signal, but then a subsequent read would attempt to
dequeue from the child itself.

So, I can't really think of anybody that might be relying on inheriting
a signalfd instead of just setting it up in the child, but changing the
semantics of it now seems rather dangerous. Also, I _can_ imagine
threads in a process sharing a signalfd (initial thread sets it up and
blocks the signals, all threads subsequently use that same fd), and for
that case it would be wrong for one thread to dequeue signals directed
at the initial thread. Plus the lifetime problems.

Rasmus
Jens Axboe Nov. 14, 2019, 3:09 p.m. UTC | #5
On 11/14/19 7:12 AM, Rasmus Villemoes wrote:
> On 14/11/2019 14.46, Jann Horn wrote:
>> On Thu, Nov 14, 2019 at 10:20 AM Rasmus Villemoes
>> <linux@rasmusvillemoes.dk> wrote:
>>> On 14/11/2019 05.49, Jens Axboe wrote:
>>>> On 11/13/19 9:31 PM, Jens Axboe wrote:
>>>>> This is a case of "I don't really know what I'm doing, but this works
>>>>> for me". Caveat emptor, but I'd love some input on this.
>>>>>
>>>>> I got a bug report that using the poll command with signalfd doesn't
>>>>> work for io_uring. The reporter also noted that it doesn't work with the
>>>>> aio poll implementation either. So I took a look at it.
>>>>>
>>>>> What happens is that the original task issues the poll request, we call
>>>>> ->poll() (which ends up with signalfd for this fd), and find that
>>>>> nothing is pending. Then we wait, and the poll is passed to async
>>>>> context. When the requested signal comes in, that worker is woken up,
>>>>> and proceeds to call ->poll() again, and signalfd unsurprisingly finds
>>>>> no signals pending, since it's the async worker calling it.
>>>>>
>>>>> That's obviously no good. The below allows you to pass in the task in
>>>>> the poll_table, and it does the right thing for me, signal is delivered
>>>>> and the correct mask is checked in signalfd_poll().
>>>>>
>>>>> Similar patch for aio would be trivial, of course.
>>>>
>>>> From the probably-less-nasty category, Jann Horn helpfully pointed out
>>>> that it'd be easier if signalfd just looked at the task that originally
>>>> created the fd instead. That looks like the below, and works equally
>>>> well for the test case at hand.
>>>
>>> Eh, how should that work? If I create a signalfd() and fork(), the
>>> child's signalfd should only be concerned with signals sent to the
>>> child. Not to mention what happens after the parent dies and the child
>>> polls its fd.
>>>
>>> Or am I completely confused?
>>
>> I think the child should not be getting signals for the child when
>> it's reading from the parent's signalfd. read() and write() aren't
>> supposed to look at properties of `current`.
> 
> That may be, but this has always been the semantics of signalfd(), quite
> clearly documented in 'man signalfd'.
> 
>> Of course, if someone does rely on the current (silly) semantics, this
>> might break stuff.
> 
> That, and Jens' patch only seemed to change the poll callback, so the
> child (or whoever else got a hand on that signalfd) would wait for the
> parent to get a signal, but then a subsequent read would attempt to
> dequeue from the child itself.
> 
> So, I can't really think of anybody that might be relying on inheriting
> a signalfd instead of just setting it up in the child, but changing the
> semantics of it now seems rather dangerous. Also, I _can_ imagine
> threads in a process sharing a signalfd (initial thread sets it up and
> blocks the signals, all threads subsequently use that same fd), and for
> that case it would be wrong for one thread to dequeue signals directed
> at the initial thread. Plus the lifetime problems.

What if we just made it specific to SFD_CLOEXEC? I don't want to break
existing applications, even if the use case is nonsensical, but it is
important to allow signalfd to be properly used with use cases that are
already in the kernel (aio with IOCB_CMD_POLL, io_uring with
IORING_OP_POLL_ADD). Alternatively, if need be, we could add a specific
SFD_ flag for this. Might also help with applications knowing if this
will work with io_uring/aio at all.
Rasmus Villemoes Nov. 14, 2019, 3:19 p.m. UTC | #6
On 14/11/2019 16.09, Jens Axboe wrote:
> On 11/14/19 7:12 AM, Rasmus Villemoes wrote:

>> So, I can't really think of anybody that might be relying on inheriting
>> a signalfd instead of just setting it up in the child, but changing the
>> semantics of it now seems rather dangerous. Also, I _can_ imagine
>> threads in a process sharing a signalfd (initial thread sets it up and
>> blocks the signals, all threads subsequently use that same fd), and for
>> that case it would be wrong for one thread to dequeue signals directed
>> at the initial thread. Plus the lifetime problems.
> 
> What if we just made it specific to SFD_CLOEXEC?

O_CLOEXEC can be set and removed afterwards. Sure, we're far into
"nobody does that" land, but having signalfd() have wildly different
semantics based on whether it was initially created with O_CLOEXEC seems
rather dubious.

> I don't want to break
> existing applications, even if the use case is nonsensical, but it is
> important to allow signalfd to be properly used with use cases that are
> already in the kernel (aio with IOCB_CMD_POLL, io_uring with
> IORING_OP_POLL_ADD). Alternatively, if need be, we could add a specific
> SFD_ flag for this.

Yeah, if you want another signalfd flavour, adding it via a new SFD_
flag seems the way to go. Though I can't imagine the resulting code
would be very pretty.

Rasmus
Jens Axboe Nov. 14, 2019, 3:20 p.m. UTC | #7
On 11/14/19 8:19 AM, Rasmus Villemoes wrote:
> On 14/11/2019 16.09, Jens Axboe wrote:
>> On 11/14/19 7:12 AM, Rasmus Villemoes wrote:
> 
>>> So, I can't really think of anybody that might be relying on inheriting
>>> a signalfd instead of just setting it up in the child, but changing the
>>> semantics of it now seems rather dangerous. Also, I _can_ imagine
>>> threads in a process sharing a signalfd (initial thread sets it up and
>>> blocks the signals, all threads subsequently use that same fd), and for
>>> that case it would be wrong for one thread to dequeue signals directed
>>> at the initial thread. Plus the lifetime problems.
>>
>> What if we just made it specific to SFD_CLOEXEC?
> 
> O_CLOEXEC can be set and removed afterwards. Sure, we're far into
> "nobody does that" land, but having signalfd() have wildly different
> semantics based on whether it was initially created with O_CLOEXEC seems
> rather dubious.
> 
>> I don't want to break
>> existing applications, even if the use case is nonsensical, but it is
>> important to allow signalfd to be properly used with use cases that are
>> already in the kernel (aio with IOCB_CMD_POLL, io_uring with
>> IORING_OP_POLL_ADD). Alternatively, if need be, we could add a specific
>> SFD_ flag for this.
> 
> Yeah, if you want another signalfd flavour, adding it via a new SFD_
> flag seems the way to go. Though I can't imagine the resulting code
> would be very pretty.

Well, it's currently _broken_ for the listed in-kernel use cases, so
I think making it work is the first priority here.
Jens Axboe Nov. 14, 2019, 3:27 p.m. UTC | #8
On 11/14/19 8:20 AM, Jens Axboe wrote:
> On 11/14/19 8:19 AM, Rasmus Villemoes wrote:
>> On 14/11/2019 16.09, Jens Axboe wrote:
>>> On 11/14/19 7:12 AM, Rasmus Villemoes wrote:
>>
>>>> So, I can't really think of anybody that might be relying on inheriting
>>>> a signalfd instead of just setting it up in the child, but changing the
>>>> semantics of it now seems rather dangerous. Also, I _can_ imagine
>>>> threads in a process sharing a signalfd (initial thread sets it up and
>>>> blocks the signals, all threads subsequently use that same fd), and for
>>>> that case it would be wrong for one thread to dequeue signals directed
>>>> at the initial thread. Plus the lifetime problems.
>>>
>>> What if we just made it specific to SFD_CLOEXEC?
>>
>> O_CLOEXEC can be set and removed afterwards. Sure, we're far into
>> "nobody does that" land, but having signalfd() have wildly different
>> semantics based on whether it was initially created with O_CLOEXEC seems
>> rather dubious.
>>
>>> I don't want to break
>>> existing applications, even if the use case is nonsensical, but it is
>>> important to allow signalfd to be properly used with use cases that are
>>> already in the kernel (aio with IOCB_CMD_POLL, io_uring with
>>> IORING_OP_POLL_ADD). Alternatively, if need be, we could add a specific
>>> SFD_ flag for this.
>>
>> Yeah, if you want another signalfd flavour, adding it via a new SFD_
>> flag seems the way to go. Though I can't imagine the resulting code
>> would be very pretty.
> 
> Well, it's currently _broken_ for the listed in-kernel use cases, so
> I think making it work is the first priority here.

How about something like this, then? Not tested.

diff --git a/fs/signalfd.c b/fs/signalfd.c
index 44b6845b071c..d8b183ec1d4e 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -50,6 +50,8 @@ void signalfd_cleanup(struct sighand_struct *sighand)
 
 struct signalfd_ctx {
 	sigset_t sigmask;
+	struct task_struct *task;
+	u32 task_exec_id;
 };
 
 static int signalfd_release(struct inode *inode, struct file *file)
@@ -61,16 +63,22 @@ static int signalfd_release(struct inode *inode, struct file *file)
 static __poll_t signalfd_poll(struct file *file, poll_table *wait)
 {
 	struct signalfd_ctx *ctx = file->private_data;
+	struct task_struct *tsk;
 	__poll_t events = 0;
 
-	poll_wait(file, &current->sighand->signalfd_wqh, wait);
+	if (ctx->task && ctx->task->self_exec_id == ctx->task_exec_id)
+		tsk = ctx->task;
+	else
+		tsk = current;
 
-	spin_lock_irq(&current->sighand->siglock);
-	if (next_signal(&current->pending, &ctx->sigmask) ||
-	    next_signal(&current->signal->shared_pending,
+	poll_wait(file, &tsk->sighand->signalfd_wqh, wait);
+
+	spin_lock_irq(&tsk->sighand->siglock);
+	if (next_signal(&tsk->pending, &ctx->sigmask) ||
+	    next_signal(&tsk->signal->shared_pending,
 			&ctx->sigmask))
 		events |= EPOLLIN;
-	spin_unlock_irq(&current->sighand->siglock);
+	spin_unlock_irq(&tsk->sighand->siglock);
 
 	return events;
 }
@@ -267,19 +275,26 @@ static int do_signalfd4(int ufd, sigset_t *mask, int flags)
 	/* Check the SFD_* constants for consistency.  */
 	BUILD_BUG_ON(SFD_CLOEXEC != O_CLOEXEC);
 	BUILD_BUG_ON(SFD_NONBLOCK != O_NONBLOCK);
+	BUILD_BUG_ON(SFD_TASK & (SFD_CLOEXEC | SFD_NONBLOCK));
 
-	if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK))
+	if (flags & ~(SFD_CLOEXEC | SFD_NONBLOCK | SFD_TASK))
+		return -EINVAL;
+	if ((flags & (SFD_CLOEXEC | SFD_TASK)) == SFD_TASK)
 		return -EINVAL;
 
 	sigdelsetmask(mask, sigmask(SIGKILL) | sigmask(SIGSTOP));
 	signotset(mask);
 
 	if (ufd == -1) {
-		ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
+		ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 		if (!ctx)
 			return -ENOMEM;
 
 		ctx->sigmask = *mask;
+		if (flags & SFD_TASK) {
+			ctx->task = current;
+			ctx->task_exec_id = current->self_exec_id;
+		}
 
 		/*
 		 * When we call this, the initialization must be complete, since
diff --git a/include/uapi/linux/signalfd.h b/include/uapi/linux/signalfd.h
index 83429a05b698..064c5dc3eb99 100644
--- a/include/uapi/linux/signalfd.h
+++ b/include/uapi/linux/signalfd.h
@@ -16,6 +16,7 @@
 /* Flags for signalfd4.  */
 #define SFD_CLOEXEC O_CLOEXEC
 #define SFD_NONBLOCK O_NONBLOCK
+#define SFD_TASK 00000001
 
 struct signalfd_siginfo {
 	__u32 ssi_signo;
Jens Axboe Nov. 14, 2019, 3:51 p.m. UTC | #9
On 11/14/19 8:27 AM, Jens Axboe wrote:
> On 11/14/19 8:20 AM, Jens Axboe wrote:
>> On 11/14/19 8:19 AM, Rasmus Villemoes wrote:
>>> On 14/11/2019 16.09, Jens Axboe wrote:
>>>> On 11/14/19 7:12 AM, Rasmus Villemoes wrote:
>>>
>>>>> So, I can't really think of anybody that might be relying on inheriting
>>>>> a signalfd instead of just setting it up in the child, but changing the
>>>>> semantics of it now seems rather dangerous. Also, I _can_ imagine
>>>>> threads in a process sharing a signalfd (initial thread sets it up and
>>>>> blocks the signals, all threads subsequently use that same fd), and for
>>>>> that case it would be wrong for one thread to dequeue signals directed
>>>>> at the initial thread. Plus the lifetime problems.
>>>>
>>>> What if we just made it specific to SFD_CLOEXEC?
>>>
>>> O_CLOEXEC can be set and removed afterwards. Sure, we're far into
>>> "nobody does that" land, but having signalfd() have wildly different
>>> semantics based on whether it was initially created with O_CLOEXEC seems
>>> rather dubious.
>>>
>>>> I don't want to break
>>>> existing applications, even if the use case is nonsensical, but it is
>>>> important to allow signalfd to be properly used with use cases that are
>>>> already in the kernel (aio with IOCB_CMD_POLL, io_uring with
>>>> IORING_OP_POLL_ADD). Alternatively, if need be, we could add a specific
>>>> SFD_ flag for this.
>>>
>>> Yeah, if you want another signalfd flavour, adding it via a new SFD_
>>> flag seems the way to go. Though I can't imagine the resulting code
>>> would be very pretty.
>>
>> Well, it's currently _broken_ for the listed in-kernel use cases, so
>> I think making it work is the first priority here.
> 
> How about something like this, then? Not tested.

Tested, works for me. Here's the test case I used. We setup a signalfd
with SIGALRM, and arm a timer for 100msec. Then we queue a poll for the
signalfd, and wait for that to complete with a timeout of 1 second. If
we time out waiting for the completion, we failed. If we do get a
completion but we don't have POLLIN set, we failed.


#include <unistd.h>
#include <sys/signalfd.h>
#include <sys/poll.h>
#include <signal.h>
#include <sys/time.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#include <liburing.h>

#define SFD_TASK	00000001

int main(int argc, char *argv[])
{
	struct __kernel_timespec ts;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	struct io_uring ring;
	struct itimerval itv;
	sigset_t mask;
	int sfd, ret;

	sigemptyset(&mask);
	sigaddset(&mask, SIGALRM);
	sigprocmask(SIG_BLOCK, &mask, NULL);

	sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC | SFD_TASK);
	if (sfd < 0) {
		if (errno == EINVAL) {
			printf("Not supported\n");
			return 0;
		}
		perror("signalfd");
		return 1;
	}

	memset(&itv, 0, sizeof(itv));
	itv.it_value.tv_sec = 0;
	itv.it_value.tv_usec = 100000;
	setitimer(ITIMER_REAL, &itv, NULL);

	io_uring_queue_init(32, &ring, 0);
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_poll_add(sqe, sfd, POLLIN);
	io_uring_submit(&ring);

	ts.tv_sec = 1;
	ts.tv_nsec = 0;
	ret = io_uring_wait_cqe_timeout(&ring, &cqe, &ts);
	if (ret < 0) {
		fprintf(stderr, "Timed out waiting for cqe\n");
		ret = 1;
	} else {
		if (cqe->res < 0) {
			fprintf(stderr, "cqe failed with %d\n", cqe->res);
			ret = 1;
		} else if (!(cqe->res & POLLIN)) {
			fprintf(stderr, "POLLIN not set in result mask?\n");
			ret = 1;
		} else {
			ret = 0;
		}
	}
	io_uring_cqe_seen(&ring, cqe);

	io_uring_queue_exit(&ring);
	close(sfd);
	return ret;
}

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index d8ea9b4f83a7..d9a4c9aac958 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -299,6 +299,7 @@  struct io_poll_iocb {
 	bool				done;
 	bool				canceled;
 	struct wait_queue_entry		wait;
+	struct task_struct		*task;
 };
 
 struct io_timeout {
@@ -2021,7 +2022,10 @@  static void io_poll_complete_work(struct io_wq_work **workptr)
 	struct io_wq_work *work = *workptr;
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 	struct io_poll_iocb *poll = &req->poll;
-	struct poll_table_struct pt = { ._key = poll->events };
+	struct poll_table_struct pt = {
+		._key = poll->events,
+		.task = poll->task
+	};
 	struct io_ring_ctx *ctx = req->ctx;
 	struct io_kiocb *nxt = NULL;
 	__poll_t mask = 0;
@@ -2139,9 +2143,11 @@  static int io_poll_add(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	poll->head = NULL;
 	poll->done = false;
 	poll->canceled = false;
+	poll->task = current;
 
 	ipt.pt._qproc = io_poll_queue_proc;
 	ipt.pt._key = poll->events;
+	ipt.pt.task = poll->task;
 	ipt.req = req;
 	ipt.error = -EINVAL; /* same as no support for IOCB_CMD_POLL */
 
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 44b6845b071c..a7f31758db1a 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -61,16 +61,17 @@  static int signalfd_release(struct inode *inode, struct file *file)
 static __poll_t signalfd_poll(struct file *file, poll_table *wait)
 {
 	struct signalfd_ctx *ctx = file->private_data;
+	struct task_struct *tsk = wait->task ?: current;
 	__poll_t events = 0;
 
-	poll_wait(file, &current->sighand->signalfd_wqh, wait);
+	poll_wait(file, &tsk->sighand->signalfd_wqh, wait);
 
-	spin_lock_irq(&current->sighand->siglock);
-	if (next_signal(&current->pending, &ctx->sigmask) ||
-	    next_signal(&current->signal->shared_pending,
+	spin_lock_irq(&tsk->sighand->siglock);
+	if (next_signal(&tsk->pending, &ctx->sigmask) ||
+	    next_signal(&tsk->signal->shared_pending,
 			&ctx->sigmask))
 		events |= EPOLLIN;
-	spin_unlock_irq(&current->sighand->siglock);
+	spin_unlock_irq(&tsk->sighand->siglock);
 
 	return events;
 }
diff --git a/include/linux/poll.h b/include/linux/poll.h
index 1cdc32b1f1b0..6d2b6d923b2b 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -43,6 +43,7 @@  typedef void (*poll_queue_proc)(struct file *, wait_queue_head_t *, struct poll_
 typedef struct poll_table_struct {
 	poll_queue_proc _qproc;
 	__poll_t _key;
+	struct task_struct *task;
 } poll_table;
 
 static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
@@ -76,6 +77,7 @@  static inline void init_poll_funcptr(poll_table *pt, poll_queue_proc qproc)
 {
 	pt->_qproc = qproc;
 	pt->_key   = ~(__poll_t)0; /* all events enabled */
+	pt->task = NULL;
 }
 
 static inline bool file_can_poll(struct file *file)