Message ID | nhoaiykqnoid3df3ckmqqgycbjqtd2rutrpeat25j4bbm7tbjl@tpncnt7cp26n (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | PIDFD_THREAD behavior for thread-group leaders | expand |
On 03/06, Christian Brauner wrote: > > Back when we implemented support for PIDFD_THREAD we ended up with the > decision that if userspace holds: > > pidfd_leader_thread = pidfd_open(<thread-group-leader-pid>, PIDFD_THREAD) > > that exit notification is not strictly defined if a non-thread-group > leader thread execs: Yes, this was even documented in commit 64bef697d33b ... > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -745,8 +745,11 @@ static void exit_notify(struct task_struct *tsk, int group_dead) > /* > * sub-thread or delay_group_leader(), wake up the > * PIDFD_THREAD waiters. > + * > + * The thread-group leader will be taken over by the execing > + * task so don't cause spurious wakeups. > */ > - if (!thread_group_empty(tsk)) > + if (!thread_group_empty(tsk) && (tsk->signal->notify_count >= 0)) > do_notify_pidfd(tsk); > > if (unlikely(tsk->ptrace)) { perhaps... but this won't help if the leader exits and that another thread does exec? From the changelog Perhaps we can improve this behaviour later, pidfd_poll() can probably take sig->group_exec_task into account. But this doesn't really differ from the case when the leader exits before other threads (so pidfd_poll() succeeds) and then another thread execs and pidfd_poll() will block again. so I am not sure what can we do. I'll try to think more later, but I can't promise much :( Oleg.
On Thu, Mar 06, 2025 at 01:17:13PM +0100, Oleg Nesterov wrote: > On 03/06, Christian Brauner wrote: > > > > Back when we implemented support for PIDFD_THREAD we ended up with the > > decision that if userspace holds: > > > > pidfd_leader_thread = pidfd_open(<thread-group-leader-pid>, PIDFD_THREAD) > > > > that exit notification is not strictly defined if a non-thread-group > > leader thread execs: > > Yes, this was even documented in commit 64bef697d33b ... Yeah, I'm aware I was just revisiting this decision. > > > --- a/kernel/exit.c > > +++ b/kernel/exit.c > > @@ -745,8 +745,11 @@ static void exit_notify(struct task_struct *tsk, int group_dead) > > /* > > * sub-thread or delay_group_leader(), wake up the > > * PIDFD_THREAD waiters. > > + * > > + * The thread-group leader will be taken over by the execing > > + * task so don't cause spurious wakeups. > > */ > > - if (!thread_group_empty(tsk)) > > + if (!thread_group_empty(tsk) && (tsk->signal->notify_count >= 0)) > > do_notify_pidfd(tsk); > > > > if (unlikely(tsk->ptrace)) { > > perhaps... but this won't help if the leader exits and that another > thread does exec? > > >From the changelog > > Perhaps we can improve this behaviour later, pidfd_poll() > can probably take sig->group_exec_task into account. But > this doesn't really differ from the case when the leader > exits before other threads (so pidfd_poll() succeeds) and > then another thread execs and pidfd_poll() will block again. > > so I am not sure what can we do. I think early thread-group leader exit is a bug in the userspace program whereas multi-threaded exec is well-defined (if ugly). To detect early-thread-group leader exec userspace could do: pidfd_leader_thread = pidfd_open(<thread-group-leader-pid>, 0) pidfd_thread = pidfd_open(<thread-group-leader-pid>, PIDFD_THREAD) and then they need to add both file descriptors to the poll instance. If proper multi-threaded exec happens no notification will be generated on either file descriptor. However, if the thread-group leader exits prematurely then a notification will be generated on pidfd_thread allowing detection of malformed behavior. So the premature thread-group leader exec thing is probably something we can ignore as a real use-case. > > I'll try to think more later, but I can't promise much :( > > Oleg. >
diff --git a/kernel/exit.c b/kernel/exit.c index 9916305e34d3..b79ded1b3bf5 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -745,8 +745,11 @@ static void exit_notify(struct task_struct *tsk, int group_dead) /* * sub-thread or delay_group_leader(), wake up the * PIDFD_THREAD waiters. + * + * The thread-group leader will be taken over by the execing + * task so don't cause spurious wakeups. */ - if (!thread_group_empty(tsk)) + if (!thread_group_empty(tsk) && (tsk->signal->notify_count >= 0)) do_notify_pidfd(tsk); if (unlikely(tsk->ptrace)) {