diff mbox series

[6/6] kernel: add a kernel_wait helper

Message ID 20200618144627.114057-7-hch@lst.de (mailing list archive)
State New, archived
Headers show
Series [1/6] exec: cleanup the execve wrappers | expand

Commit Message

Christoph Hellwig June 18, 2020, 2:46 p.m. UTC
Add a helper that waits for a pid and stores the status in the passed
in kernel pointer.  Use it to fix the usage of kernel_wait4 in
call_usermodehelper_exec_sync that only happens to work due to the
implicit set_fs(KERNEL_DS) for kernel threads.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/sched/task.h |  1 +
 kernel/exit.c              | 16 ++++++++++++++++
 kernel/umh.c               | 29 ++++-------------------------
 3 files changed, 21 insertions(+), 25 deletions(-)

Comments

Luis Chamberlain June 19, 2020, 9:17 p.m. UTC | #1
On Thu, Jun 18, 2020 at 04:46:27PM +0200, Christoph Hellwig wrote:
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1626,6 +1626,22 @@ long kernel_wait4(pid_t upid, int __user *stat_addr, int options,
>  	return ret;
>  }
>  
> +int kernel_wait(pid_t pid, int *stat)
> +{
> +	struct wait_opts wo = {
> +		.wo_type	= PIDTYPE_PID,
> +		.wo_pid		= find_get_pid(pid),
> +		.wo_flags	= WEXITED,
> +	};
> +	int ret;
> +
> +	ret = do_wait(&wo);
> +	if (ret > 0 && wo.wo_stat)
> +		*stat = wo.wo_stat;

Since all we care about is WEXITED, that could be simplified
to something like this:

if (ret > 0 && KWIFEXITED(wo.wo_stat)
 	*stat = KWEXITSTATUS(wo.wo_stat)

Otherwise callers have to use W*() wrappers.

> +	put_pid(wo.wo_pid);
> +	return ret;
> +}

Then we don't get *any* in-kernel code dealing with the W*() crap.
I just unwrapped this for the umh [0], given that otherwise we'd
have to use KW*() callers elsewhere. Doing it upshot one level
further would be even better.

[0] https://lkml.kernel.org/r/20200610154923.27510-1-mcgrof@kernel.org              

  Luis
Christoph Hellwig June 20, 2020, 6:35 a.m. UTC | #2
On Fri, Jun 19, 2020 at 09:17:00PM +0000, Luis Chamberlain wrote:
> On Thu, Jun 18, 2020 at 04:46:27PM +0200, Christoph Hellwig wrote:
> > --- a/kernel/exit.c
> > +++ b/kernel/exit.c
> > @@ -1626,6 +1626,22 @@ long kernel_wait4(pid_t upid, int __user *stat_addr, int options,
> >  	return ret;
> >  }
> >  
> > +int kernel_wait(pid_t pid, int *stat)
> > +{
> > +	struct wait_opts wo = {
> > +		.wo_type	= PIDTYPE_PID,
> > +		.wo_pid		= find_get_pid(pid),
> > +		.wo_flags	= WEXITED,
> > +	};
> > +	int ret;
> > +
> > +	ret = do_wait(&wo);
> > +	if (ret > 0 && wo.wo_stat)
> > +		*stat = wo.wo_stat;
> 
> Since all we care about is WEXITED, that could be simplified
> to something like this:
> 
> if (ret > 0 && KWIFEXITED(wo.wo_stat)
>  	*stat = KWEXITSTATUS(wo.wo_stat)
> 
> Otherwise callers have to use W*() wrappers.
> 
> > +	put_pid(wo.wo_pid);
> > +	return ret;
> > +}
> 
> Then we don't get *any* in-kernel code dealing with the W*() crap.
> I just unwrapped this for the umh [0], given that otherwise we'd
> have to use KW*() callers elsewhere. Doing it upshot one level
> further would be even better.
> 
> [0] https://lkml.kernel.org/r/20200610154923.27510-1-mcgrof@kernel.org              
Do you just want to pick this patch up, add your suggested bits and
add it to the beginning of your series?  That should clean the whole
thing up a bit.  Nothing else in this series depends on the patch.
Luis Chamberlain June 20, 2020, 5:02 p.m. UTC | #3
On Sat, Jun 20, 2020 at 08:35:38AM +0200, Christoph Hellwig wrote:
> On Fri, Jun 19, 2020 at 09:17:00PM +0000, Luis Chamberlain wrote:
> > On Thu, Jun 18, 2020 at 04:46:27PM +0200, Christoph Hellwig wrote:
> > > --- a/kernel/exit.c
> > > +++ b/kernel/exit.c
> > > @@ -1626,6 +1626,22 @@ long kernel_wait4(pid_t upid, int __user *stat_addr, int options,
> > >  	return ret;
> > >  }
> > >  
> > > +int kernel_wait(pid_t pid, int *stat)
> > > +{
> > > +	struct wait_opts wo = {
> > > +		.wo_type	= PIDTYPE_PID,
> > > +		.wo_pid		= find_get_pid(pid),
> > > +		.wo_flags	= WEXITED,
> > > +	};
> > > +	int ret;
> > > +
> > > +	ret = do_wait(&wo);
> > > +	if (ret > 0 && wo.wo_stat)
> > > +		*stat = wo.wo_stat;
> > 
> > Since all we care about is WEXITED, that could be simplified
> > to something like this:
> > 
> > if (ret > 0 && KWIFEXITED(wo.wo_stat)
> >  	*stat = KWEXITSTATUS(wo.wo_stat)
> > 
> > Otherwise callers have to use W*() wrappers.
> > 
> > > +	put_pid(wo.wo_pid);
> > > +	return ret;
> > > +}
> > 
> > Then we don't get *any* in-kernel code dealing with the W*() crap.
> > I just unwrapped this for the umh [0], given that otherwise we'd
> > have to use KW*() callers elsewhere. Doing it upshot one level
> > further would be even better.
> > 
> > [0] https://lkml.kernel.org/r/20200610154923.27510-1-mcgrof@kernel.org              
> Do you just want to pick this patch up, add your suggested bits and
> add it to the beginning of your series?  That should clean the whole
> thing up a bit.  Nothing else in this series depends on the patch.

Sure but let's wait to hear from the NFS folks.

I'm waiting to hear from NFS folks if the one place where the UMH is
fixed for the error code (on fs/nfsd/nfs4recover.c we never were
disabling the upcall as the error code of -ENOENT or -EACCES was *never*
properly checked for) to see how critical that was. If it can help
stable kernels the fix can go in as I proposed, followed by this patch
to further take the KWEXITSTATUS() up further, and ensure we *never*
deal with this in-kernel. If its not a fix stable kernels should care
for what you suggest of taking this patch first would be best and I'd be
happy to do that.

  Luis
diff mbox series

Patch

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 38359071236ad7..a80007df396e95 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -102,6 +102,7 @@  struct task_struct *fork_idle(int);
 struct mm_struct *copy_init_mm(void);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
+int kernel_wait(pid_t pid, int *stat);
 
 extern void free_task(struct task_struct *tsk);
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 727150f2810338..fd598846df0b17 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1626,6 +1626,22 @@  long kernel_wait4(pid_t upid, int __user *stat_addr, int options,
 	return ret;
 }
 
+int kernel_wait(pid_t pid, int *stat)
+{
+	struct wait_opts wo = {
+		.wo_type	= PIDTYPE_PID,
+		.wo_pid		= find_get_pid(pid),
+		.wo_flags	= WEXITED,
+	};
+	int ret;
+
+	ret = do_wait(&wo);
+	if (ret > 0 && wo.wo_stat)
+		*stat = wo.wo_stat;
+	put_pid(wo.wo_pid);
+	return ret;
+}
+
 SYSCALL_DEFINE4(wait4, pid_t, upid, int __user *, stat_addr,
 		int, options, struct rusage __user *, ru)
 {
diff --git a/kernel/umh.c b/kernel/umh.c
index 1284823dbad338..6fd948e478bec4 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -126,37 +126,16 @@  static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
 {
 	pid_t pid;
 
-	/* If SIGCLD is ignored kernel_wait4 won't populate the status. */
+	/* If SIGCLD is ignored do_wait won't populate the status. */
 	kernel_sigaction(SIGCHLD, SIG_DFL);
 	pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
-	if (pid < 0) {
+	if (pid < 0)
 		sub_info->retval = pid;
-	} else {
-		int ret = -ECHILD;
-		/*
-		 * Normally it is bogus to call wait4() from in-kernel because
-		 * wait4() wants to write the exit code to a userspace address.
-		 * But call_usermodehelper_exec_sync() always runs as kernel
-		 * thread (workqueue) and put_user() to a kernel address works
-		 * OK for kernel threads, due to their having an mm_segment_t
-		 * which spans the entire address space.
-		 *
-		 * Thus the __user pointer cast is valid here.
-		 */
-		kernel_wait4(pid, (int __user *)&ret, 0, NULL);
-
-		/*
-		 * If ret is 0, either call_usermodehelper_exec_async failed and
-		 * the real error code is already in sub_info->retval or
-		 * sub_info->retval is 0 anyway, so don't mess with it then.
-		 */
-		if (ret)
-			sub_info->retval = ret;
-	}
+	else
+		kernel_wait(pid, &sub_info->retval);
 
 	/* Restore default kernel sig handler */
 	kernel_sigaction(SIGCHLD, SIG_IGN);
-
 	umh_complete(sub_info);
 }