
[v3,1/2] exec: fix up /proc/pid/comm in the execveat(AT_EMPTY_PATH) case

Message ID 20241001134945.798662-1-tycho@tycho.pizza (mailing list archive)
State New

Commit Message

Tycho Andersen Oct. 1, 2024, 1:49 p.m. UTC
From: Tycho Andersen <tandersen@netflix.com>

Zbigniew mentioned at Linux Plumbers that systemd is interested in
switching to execveat() for service execution, but can't, because the
contents of /proc/pid/comm are then the number of the file descriptor
which was used, instead of the name of the binary. This makes the output
of tools like top and ps useless, especially in a world where most fds
are opened CLOEXEC so the number is truly meaningless.

Change exec path to fix up /proc/pid/comm in the case where we have
allocated one of these synthetic paths in bprm_init(). This way the actual
exec machinery is unchanged, but cosmetically the comm looks reasonable to
admins investigating things.
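
The behaviour described above can be observed from user space. The
following is a minimal sketch and not part of the patch:
comm_after_fd_exec() is a hypothetical helper, and it assumes Linux with
/proc mounted and a cat binary at the path passed in. It execs that
binary through an O_CLOEXEC fd with AT_EMPTY_PATH and captures what the
new program sees in /proc/self/comm. On a kernel without this change the
result is expected to be the fd number; with it, the name of the dentry
behind the fd.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: exec `path` (assumed to be a cat binary) via
 * execveat(fd, "", ..., AT_EMPTY_PATH) in a child and let it print its
 * own /proc/self/comm. The comm, without the trailing newline, is stored
 * in `out`. Returns 0 on success, -1 on failure.
 */
int comm_after_fd_exec(const char *path, char *out, size_t len)
{
	int pipefd[2];
	size_t used = 0;
	ssize_t n;
	int status;
	pid_t pid;

	if (len == 0 || pipe(pipefd) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	if (pid == 0) {
		char *const argv[] = { "cat", "/proc/self/comm", NULL };
		char *const envp[] = { NULL };
		/* CLOEXEC: the fd number means nothing once exec succeeds. */
		int fd = open(path, O_RDONLY | O_CLOEXEC);

		if (fd < 0)
			_exit(127);
		dup2(pipefd[1], STDOUT_FILENO);
		close(pipefd[0]);
		close(pipefd[1]);
		/* Raw syscall, so no libc execveat() wrapper is required. */
		syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
		_exit(127);
	}

	close(pipefd[1]);
	while (used < len - 1 &&
	       (n = read(pipefd[0], out + used, len - 1 - used)) > 0)
		used += (size_t)n;
	close(pipefd[0]);
	out[used] = '\0';
	out[strcspn(out, "\n")] = '\0';

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling comm_after_fd_exec("/bin/cat", buf, sizeof(buf)) and printing buf
on kernels with and without the patch should show the difference.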

Signed-off-by: Tycho Andersen <tandersen@netflix.com>
Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
CC: Aleksa Sarai <cyphar@cyphar.com>
Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
---
v2: * drop the flag, everyone :)
    * change the rendered value to f_path.dentry->d_name.name instead of
      argv[0], Eric
v3: * fix up subject line, Eric
---
 fs/exec.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)


base-commit: baeb9a7d8b60b021d907127509c44507539c15e5

Comments

Aleksa Sarai Oct. 1, 2024, 6:42 p.m. UTC | #1
On 2024-10-01, Tycho Andersen <tycho@tycho.pizza> wrote:
> From: Tycho Andersen <tandersen@netflix.com>
> 
> Zbigniew mentioned at Linux Plumbers that systemd is interested in
> switching to execveat() for service execution, but can't, because the
> contents of /proc/pid/comm are then the number of the file descriptor
> which was used, instead of the name of the binary. This makes the output
> of tools like top and ps useless, especially in a world where most fds
> are opened CLOEXEC so the number is truly meaningless.
> 
> Change exec path to fix up /proc/pid/comm in the case where we have
> allocated one of these synthetic paths in bprm_init(). This way the actual
> exec machinery is unchanged, but cosmetically the comm looks reasonable to
> admins investigating things.

While I still think the argv[0] solution was semantically nicer, it
seems this is enough to fix the systemd problem for most cases and so we
can revisit the argv[0] discussion in another 10 years. :D

Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>

> Signed-off-by: Tycho Andersen <tandersen@netflix.com>
> Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
> CC: Aleksa Sarai <cyphar@cyphar.com>
> Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
> ---
> v2: * drop the flag, everyone :)
>     * change the rendered value to f_path.dentry->d_name.name instead of
>       argv[0], Eric
> v3: * fix up subject line, Eric
> ---
>  fs/exec.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index dad402d55681..9520359a8dcc 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
>  		set_dumpable(current->mm, SUID_DUMP_USER);
>  
>  	perf_event_exec();
> -	__set_task_comm(me, kbasename(bprm->filename), true);
> +
> +	/*
> +	 * If fdpath was set, execveat() made up a path that will
> +	 * probably not be useful to admins running ps or similar.
> +	 * Let's fix it up to be something reasonable.
> +	 */
> +	if (bprm->fdpath) {
> +		BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
> +		__set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
> +	} else {
> +		__set_task_comm(me, kbasename(bprm->filename), true);
> +	}
>  
>  	/* An exec changes our domain. We are no longer part of the thread
>  	   group */
> 
> base-commit: baeb9a7d8b60b021d907127509c44507539c15e5
> -- 
> 2.34.1
>
Aleksa Sarai Oct. 1, 2024, 6:48 p.m. UTC | #2
On 2024-10-01, Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2024-10-01, Tycho Andersen <tycho@tycho.pizza> wrote:
> > From: Tycho Andersen <tandersen@netflix.com>
> > 
> > Zbigniew mentioned at Linux Plumbers that systemd is interested in
> > switching to execveat() for service execution, but can't, because the
> > contents of /proc/pid/comm are then the number of the file descriptor
> > which was used, instead of the name of the binary. This makes the output
> > of tools like top and ps useless, especially in a world where most fds
> > are opened CLOEXEC so the number is truly meaningless.
> > 
> > Change exec path to fix up /proc/pid/comm in the case where we have
> > allocated one of these synthetic paths in bprm_init(). This way the actual
> > exec machinery is unchanged, but cosmetically the comm looks reasonable to
> > admins investigating things.
> 
> While I still think the argv[0] solution was semantically nicer, it
> seems this is enough to fix the systemd problem for most cases and so we
> can revisit the argv[0] discussion in another 10 years. :D

Of course, this assumes the busybox problem I mentioned really is not an
issue. But at least this option is "less wrong" than using the fd
number. I suspect we will eventually need the argv[0] thing.
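
The busybox concern is not spelled out in this message. One plausible
reading, assumed here, is that a multi-call binary is usually reached
through a symlink named after the applet, while the dentry behind the
opened fd belongs to the symlink target. The sketch below approximates
d_name.name from user space by resolving /proc/self/fd/N; name_behind_fd()
is a hypothetical helper, and /proc is assumed to be mounted.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` (following symlinks) and report the
 * final path component of the file the fd really refers to, as shown by
 * the /proc/self/fd/N link. This is a user-space stand-in for
 * f_path.dentry->d_name.name. Returns 0 on success, -1 on error.
 */
int name_behind_fd(const char *path, char *out, size_t len)
{
	char link[64];
	char target[4096];
	const char *base;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, target, sizeof(target) - 1);
	close(fd);
	if (n < 0)
		return -1;
	target[n] = '\0';

	/* Keep only the last component, like a dentry name. */
	base = strrchr(target, '/');
	snprintf(out, len, "%s", base ? base + 1 : target);
	return 0;
}
```

With a symlink ls -> busybox, opening "ls" reports "busybox". Under this
reading, comm would name the multi-call binary rather than the applet,
which is the case where argv[0] would carry more information.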

> Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>
> 
> > Signed-off-by: Tycho Andersen <tandersen@netflix.com>
> > Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
> > CC: Aleksa Sarai <cyphar@cyphar.com>
> > Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
> > ---
> > v2: * drop the flag, everyone :)
> >     * change the rendered value to f_path.dentry->d_name.name instead of
> >       argv[0], Eric
> > v3: * fix up subject line, Eric
> > ---
> >  fs/exec.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/exec.c b/fs/exec.c
> > index dad402d55681..9520359a8dcc 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
> >  		set_dumpable(current->mm, SUID_DUMP_USER);
> >  
> >  	perf_event_exec();
> > -	__set_task_comm(me, kbasename(bprm->filename), true);
> > +
> > +	/*
> > +	 * If fdpath was set, execveat() made up a path that will
> > +	 * probably not be useful to admins running ps or similar.
> > +	 * Let's fix it up to be something reasonable.
> > +	 */
> > +	if (bprm->fdpath) {
> > +		BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
> > +		__set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
> > +	} else {
> > +		__set_task_comm(me, kbasename(bprm->filename), true);
> > +	}
> >  
> >  	/* An exec changes our domain. We are no longer part of the thread
> >  	   group */
> > 
> > base-commit: baeb9a7d8b60b021d907127509c44507539c15e5
> > -- 
> > 2.34.1
> > 
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> <https://www.cyphar.com/>

Patch

diff --git a/fs/exec.c b/fs/exec.c
index dad402d55681..9520359a8dcc 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1416,7 +1416,18 @@  int begin_new_exec(struct linux_binprm * bprm)
 		set_dumpable(current->mm, SUID_DUMP_USER);
 
 	perf_event_exec();
-	__set_task_comm(me, kbasename(bprm->filename), true);
+
+	/*
+	 * If fdpath was set, execveat() made up a path that will
+	 * probably not be useful to admins running ps or similar.
+	 * Let's fix it up to be something reasonable.
+	 */
+	if (bprm->fdpath) {
+		BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
+		__set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
+	} else {
+		__set_task_comm(me, kbasename(bprm->filename), true);
+	}
 
 	/* An exec changes our domain. We are no longer part of the thread
 	   group */
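
The selection logic added by the hunk above can be modelled in user
space. This is only a sketch under stated assumptions: struct exec_model,
model_kbasename() and model_set_comm() are hypothetical stand-ins for
struct linux_binprm, kbasename() and __set_task_comm(), and the value used
for DNAME_INLINE_LEN is illustrative, since the kernel's value depends on
the configuration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16	/* comm holds 15 characters plus a NUL */
#define DNAME_INLINE_LEN 32	/* illustrative value, config dependent */

/* User-space counterpart of the BUILD_BUG_ON() in the patch. */
_Static_assert(TASK_COMM_LEN <= DNAME_INLINE_LEN,
	       "comm must not be longer than an inline dentry name");

/* Hypothetical model of the linux_binprm fields the hunk looks at. */
struct exec_model {
	const char *filename;	 /* bprm->filename */
	const char *fdpath;	 /* bprm->fdpath, NULL unless synthetic */
	const char *dentry_name; /* bprm->file->f_path.dentry->d_name.name */
};

/* Return the part of `path` after the last '/', like kbasename(). */
const char *model_kbasename(const char *path)
{
	const char *tail = strrchr(path, '/');

	return tail ? tail + 1 : path;
}

/* Pick the comm source as the patched code does, truncating to fit. */
void model_set_comm(const struct exec_model *m, char comm[TASK_COMM_LEN])
{
	const char *src = m->fdpath ? m->dentry_name
				    : model_kbasename(m->filename);

	snprintf(comm, TASK_COMM_LEN, "%s", src);
}
```

For a model with filename and fdpath both "/dev/fd/3" and dentry_name
"sleep", model_set_comm() yields "sleep"; with fdpath set to NULL it falls
back to the basename of filename, which for "/dev/fd/3" is "3".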