diff mbox series

[v2,2/2] pidfs: ensure consistent ENOENT/ESRCH reporting

Message ID 20250411-work-pidfs-enoent-v2-2-60b2d3bb545f@kernel.org (mailing list archive)
State New
Headers show
Series pidfs: ensure consistent ENOENT/ESRCH reporting | expand

Commit Message

Christian Brauner April 11, 2025, 1:22 p.m. UTC
In a prior patch series we tried to cleanly differentiate between:

(1) The task has already been reaped.
(2) The caller requested a pidfd for a thread-group leader but the pid
    actually references a struct pid that isn't used as a thread-group
    leader.

as this was causing issues for non-threaded workloads.

But there's cases where the current simple logic is wrong. Specifically,
if the pid was a leader pid and the check races with __unhash_process().
Stabilize this by using the pidfd waitqueue lock.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 kernel/fork.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

Comments

Oleg Nesterov April 11, 2025, 1:54 p.m. UTC | #1
For both patches:

Reviewed-by: Oleg Nesterov <oleg@redhat.com>

a minor nit below...

On 04/11, Christian Brauner wrote:
>
>  int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
>  {
> -	int err = 0;
> -
> -	if (!(flags & PIDFD_THREAD)) {
> +	scoped_guard(spinlock_irq, &pid->wait_pidfd.lock) {
> +		/*
> +		 * If this wasn't a thread-group leader struct pid or
> +		 * the task already been reaped report ESRCH to
> +		 * userspace.
> +		 */
> +		if (!pid_has_task(pid, PIDTYPE_PID))
> +			return -ESRCH;

The "If this wasn't a thread-group leader struct pid" part of the
comment looks a bit confusing to me, as if pid_has_task(PIDTYPE_PID)
should return false in this case.

OTOH, perhaps it makes sense to explain scoped_guard(wait_pidfd.lock)?
Something like "see unhash_process -> wake_up_all(), detach_pid(TGID)
isn't possible if pid_has_task(PID) succeeds".

Oleg.
Christian Brauner April 11, 2025, 3:14 p.m. UTC | #2
On Fri, Apr 11, 2025 at 03:54:45PM +0200, Oleg Nesterov wrote:
> For both patches:
> 
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
> 
> a minor nit below...
> 
> On 04/11, Christian Brauner wrote:
> >
> >  int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
> >  {
> > -	int err = 0;
> > -
> > -	if (!(flags & PIDFD_THREAD)) {
> > +	scoped_guard(spinlock_irq, &pid->wait_pidfd.lock) {
> > +		/*
> > +		 * If this wasn't a thread-group leader struct pid or
> > +		 * the task already been reaped report ESRCH to
> > +		 * userspace.
> > +		 */
> > +		if (!pid_has_task(pid, PIDTYPE_PID))
> > +			return -ESRCH;
> 
> The "If this wasn't a thread-group leader struct pid" part of the
> comment looks a bit confusing to me, as if pid_has_task(PIDTYPE_PID)
> should return false in this case.

Ok.

> 
> OTOH, perhaps it makes sense to explain scoped_guard(wait_pidfd.lock)?
> Something like "see unhash_process -> wake_up_all(), detach_pid(TGID)
> isn't possible if pid_has_task(PID) succeeds".

I'm verbose. I hope you can live with it:

        /*
         * While holding the pidfd waitqueue lock removing the task
         * linkage for the thread-group leader pid (PIDTYPE_TGID) isn't
         * possible. Thus, if there's still task linkage for PIDTYPE_PID
         * not having thread-group leader linkage for the pid means it
         * wasn't a thread-group leader in the first place.
         */

:)
Oleg Nesterov April 11, 2025, 3:28 p.m. UTC | #3
On 04/11, Christian Brauner wrote:
>
> I'm verbose. I hope you can live with it:
>
>         /*
>          * While holding the pidfd waitqueue lock removing the task
>          * linkage for the thread-group leader pid (PIDTYPE_TGID) isn't
>          * possible. Thus, if there's still task linkage for PIDTYPE_PID
>          * not having thread-group leader linkage for the pid means it
>          * wasn't a thread-group leader in the first place.
>          */
>
> :)

LGTM ;)

Oleg.
diff mbox series

Patch

diff --git a/kernel/fork.c b/kernel/fork.c
index 4a2080b968c8..cde960fd0c71 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2108,28 +2108,23 @@  static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
  */
 int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
 {
-	int err = 0;
-
-	if (!(flags & PIDFD_THREAD)) {
+	scoped_guard(spinlock_irq, &pid->wait_pidfd.lock) {
+		/*
+		 * If this wasn't a thread-group leader struct pid or
+		 * the task already been reaped report ESRCH to
+		 * userspace.
+		 */
+		if (!pid_has_task(pid, PIDTYPE_PID))
+			return -ESRCH;
 		/*
-		 * If this is struct pid isn't used as a thread-group
-		 * leader pid but the caller requested to create a
-		 * thread-group leader pidfd then report ENOENT to the
-		 * caller as a hint.
+		 * If this struct pid isn't used as a thread-group
+		 * leader but the caller requested to create a
+		 * thread-group leader pidfd then report ENOENT.
 		 */
-		if (!pid_has_task(pid, PIDTYPE_TGID))
-			err = -ENOENT;
+		if (!(flags & PIDFD_THREAD) && !pid_has_task(pid, PIDTYPE_TGID))
+			return -ENOENT;
 	}
 
-	/*
-	 * If this wasn't a thread-group leader struct pid or the task
-	 * got reaped in the meantime report -ESRCH to userspace.
-	 */
-	if (!pid_has_task(pid, PIDTYPE_PID))
-		err = -ESRCH;
-	if (err)
-		return err;
-
 	return __pidfd_prepare(pid, flags, ret);
 }