linux-user: Fix access to /proc/self/exe

Message ID 20221205113825.20615-1-deller@gmx.de (mailing list archive)
State New, archived
Series linux-user: Fix access to /proc/self/exe

Commit Message

Helge Deller Dec. 5, 2022, 11:38 a.m. UTC
When accessing /proc/self/exe from a userspace program, linux-user tries
to resolve the name via realpath(), which may fail if the process
changed the working directory in the meantime.

An example:
- a userspace program is started with ./testprogram
- the program runs chdir("/tmp")
- then the program calls readlink("/proc/self/exe")
- linux-user tries to run realpath("./testprogram") which fails
  because ./testprogram isn't in /tmp
- readlink() will return -ENOENT back to the program

Avoid this issue by resolving the full path name of the started process
at startup of linux-user and storing it in real_exec_path[]. This also
simplifies the emulation of readlink() and readlinkat(), because
they can simply copy the path string to userspace.

I noticed this bug because the test suite of the Debian package "pandoc"
failed on linux-user while it succeeded on real hardware. The full log
is here:
https://buildd.debian.org/status/fetch.php?pkg=pandoc&arch=hppa&ver=2.17.1.1-1.1%2Bb1&stamp=1670153210&raw=0

Signed-off-by: Helge Deller <deller@gmx.de>
---
 linux-user/main.c    |  6 ++++++
 linux-user/syscall.c | 34 ++++++++++------------------------
 2 files changed, 16 insertions(+), 24 deletions(-)

--
2.38.1

Comments

Laurent Vivier March 6, 2023, 10:23 p.m. UTC | #1

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Laurent Vivier March 7, 2023, 9:30 a.m. UTC | #2

Applied to my linux-user-for-8.0 branch.

Thanks,
Laurent
Patch

diff --git a/linux-user/main.c b/linux-user/main.c
index aedd707459..e7e53f7d5e 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -64,6 +64,7 @@ 
 #endif

 char *exec_path;
+char real_exec_path[PATH_MAX];

 int singlestep;
 static const char *argv0;
@@ -744,6 +745,11 @@  int main(int argc, char **argv, char **envp)
         }
     }

+    /* Resolve executable file name to full path name */
+    if (realpath(exec_path, real_exec_path)) {
+        exec_path = real_exec_path;
+    }
+
     /*
      * get binfmt_misc flags
      * but only if not already done by parse_args() above
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 0468a1bad7..71ae867024 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -10011,18 +10011,11 @@  static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
                 /* Short circuit this for the magic exe check. */
                 ret = -TARGET_EINVAL;
             } else if (is_proc_myself((const char *)p, "exe")) {
-                char real[PATH_MAX], *temp;
-                temp = realpath(exec_path, real);
-                /* Return value is # of bytes that we wrote to the buffer. */
-                if (temp == NULL) {
-                    ret = get_errno(-1);
-                } else {
-                    /* Don't worry about sign mismatch as earlier mapping
-                     * logic would have thrown a bad address error. */
-                    ret = MIN(strlen(real), arg3);
-                    /* We cannot NUL terminate the string. */
-                    memcpy(p2, real, ret);
-                }
+                /* Don't worry about sign mismatch as earlier mapping
+                 * logic would have thrown a bad address error. */
+                ret = MIN(strlen(exec_path), arg3);
+                /* We cannot NUL terminate the string. */
+                memcpy(p2, exec_path, ret);
             } else {
                 ret = get_errno(readlink(path(p), p2, arg3));
             }
@@ -10043,18 +10036,11 @@  static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
                 /* Short circuit this for the magic exe check. */
                 ret = -TARGET_EINVAL;
             } else if (is_proc_myself((const char *)p, "exe")) {
-                char real[PATH_MAX], *temp;
-                temp = realpath(exec_path, real);
-                /* Return value is # of bytes that we wrote to the buffer. */
-                if (temp == NULL) {
-                    ret = get_errno(-1);
-                } else {
-                    /* Don't worry about sign mismatch as earlier mapping
-                     * logic would have thrown a bad address error. */
-                    ret = MIN(strlen(real), arg4);
-                    /* We cannot NUL terminate the string. */
-                    memcpy(p2, real, ret);
-                }
+                /* Don't worry about sign mismatch as earlier mapping
+                 * logic would have thrown a bad address error. */
+                ret = MIN(strlen(exec_path), arg4);
+                /* We cannot NUL terminate the string. */
+                memcpy(p2, exec_path, ret);
             } else {
                 ret = get_errno(readlinkat(arg1, path(p), p2, arg4));
             }