diff mbox series

exec: fix linux_binprm::exec in transfer_args_to_stack()

Message ID 20240320182607.1472887-1-jcmvbkbc@gmail.com (mailing list archive)
State New
Headers show
Series exec: fix linux_binprm::exec in transfer_args_to_stack() | expand

Commit Message

Max Filippov March 20, 2024, 6:26 p.m. UTC
In NUMMU kernel the value of linux_binprm::p is the offset inside the
temporary program arguments array maintained in separate pages in the
linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
thus must be adjusted when that array is copied to the user stack.
Without that adjustment the value passed by the NOMMU kernel to the ELF
program in the AT_EXECFN entry of the aux array doesn't make any sense
and it may break programs that try to access memory pointed to by that
entry.

Adjust linux_binprm::exec before the successful return from the
transfer_args_to_stack().

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
 fs/exec.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Matthew Wilcox March 20, 2024, 7:29 p.m. UTC | #1
On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> In NUMMU kernel the value of linux_binprm::p is the offset inside the
> temporary program arguments array maintained in separate pages in the
> linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> thus must be adjusted when that array is copied to the user stack.
> Without that adjustment the value passed by the NOMMU kernel to the ELF
> program in the AT_EXECFN entry of the aux array doesn't make any sense
> and it may break programs that try to access memory pointed to by that
> entry.
> 
> Adjust linux_binprm::exec before the successful return from the
> transfer_args_to_stack().

Do you know which commit broke this, ie how far back should this be
backported?  Or has it always been broken?

> Cc: stable@vger.kernel.org
> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
> ---
>  fs/exec.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index af4fbb61cd53..5ee2545c3e18 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -895,6 +895,7 @@ int transfer_args_to_stack(struct linux_binprm *bprm,
>  			goto out;
>  	}
>  
> +	bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
>  	*sp_location = sp;
>  
>  out:
> -- 
> 2.39.2
> 
>
Max Filippov March 20, 2024, 8:31 p.m. UTC | #2
On Wed, Mar 20, 2024 at 12:29 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> > In NUMMU kernel the value of linux_binprm::p is the offset inside the
> > temporary program arguments array maintained in separate pages in the
> > linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> > thus must be adjusted when that array is copied to the user stack.
> > Without that adjustment the value passed by the NOMMU kernel to the ELF
> > program in the AT_EXECFN entry of the aux array doesn't make any sense
> > and it may break programs that try to access memory pointed to by that
> > entry.
> >
> > Adjust linux_binprm::exec before the successful return from the
> > transfer_args_to_stack().
>
> Do you know which commit broke this, ie how far back should this be
> backported?  Or has it always been broken?

From reading the code I see that linux_binprm::p started being an offset
in the commit b6a2fea39318 ("mm: variable length argument support")
which is v2.6.22-3328-gb6a2fea39318 and filling in the AT_EXECFN
aux entry was added in the commit
 5edc2a5123a7 ("binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE")
which is v2.6.27-4641-g5edc2a5123a7. I don't see any translation
of the linux_binprm::exec at that time so to me it looks like it's always
been broken.

--
Thanks.
-- Max
Kees Cook March 21, 2024, 5:05 p.m. UTC | #3
On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> In NUMMU kernel the value of linux_binprm::p is the offset inside the
> temporary program arguments array maintained in separate pages in the
> linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> thus must be adjusted when that array is copied to the user stack.
> Without that adjustment the value passed by the NOMMU kernel to the ELF
> program in the AT_EXECFN entry of the aux array doesn't make any sense
> and it may break programs that try to access memory pointed to by that
> entry.
> 
> Adjust linux_binprm::exec before the successful return from the
> transfer_args_to_stack().

What's the best way to test this? (Is there a qemu setup I can use to
see the before/after of AT_EXECFN?)

How did you encounter the problem?

> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
> ---
>  fs/exec.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index af4fbb61cd53..5ee2545c3e18 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -895,6 +895,7 @@ int transfer_args_to_stack(struct linux_binprm *bprm,
>  			goto out;
>  	}
>  
> +	bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
>  	*sp_location = sp;
>  
>  out:
> -- 
> 2.39.2
>
Max Filippov March 21, 2024, 7:52 p.m. UTC | #4
On Thu, Mar 21, 2024 at 10:05 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> > In NUMMU kernel the value of linux_binprm::p is the offset inside the
> > temporary program arguments array maintained in separate pages in the
> > linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> > thus must be adjusted when that array is copied to the user stack.
> > Without that adjustment the value passed by the NOMMU kernel to the ELF
> > program in the AT_EXECFN entry of the aux array doesn't make any sense
> > and it may break programs that try to access memory pointed to by that
> > entry.
> >
> > Adjust linux_binprm::exec before the successful return from the
> > transfer_args_to_stack().
>
> What's the best way to test this? (Is there a qemu setup I can use to
> see the before/after of AT_EXECFN?)

I put a readme with the steps to build such system here:
  http://jcmvbkbc.org/~dumb/tmp/202403211236/README
it uses a prebuilt rootfs image and a 6.8 kernel branch with two
patches on top of it: one adds a dts and a defconfig and the other
is this fix. The rootfs boots successfully with this fix, but panics
if this fix is removed.
The easiest way to actually see the AT_EXECFN is, I guess, to
do something like that:
---8<---
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index fefc642541cb..22d34272a570 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -659,6 +659,7 @@ static int create_elf_fdpic_tables(struct
linux_binprm *bprm,
       NEW_AUX_ENT(AT_EGID,    (elf_addr_t)
from_kgid_munged(cred->user_ns, cred->egid));
       NEW_AUX_ENT(AT_SECURE,  bprm->secureexec);
       NEW_AUX_ENT(AT_EXECFN,  bprm->exec);
+       pr_info("%s: AT_EXECFN = %#lx\n", __func__, bprm->exec);

#ifdef ARCH_DLINFO
       nr = 0;
---8<---

> How did you encounter the problem?

I'm doing xtensa FDPIC port of musl libc and this issue popped up when
I began testing it on qemu-system-xtensa with the real linux kernel.
Related post to the musl ML:
  https://www.openwall.com/lists/musl/2024/03/20/2

--
Thanks.
-- Max
Kees Cook March 22, 2024, 3:18 a.m. UTC | #5
On Wed, 20 Mar 2024 11:26:07 -0700, Max Filippov wrote:
> In NUMMU kernel the value of linux_binprm::p is the offset inside the
> temporary program arguments array maintained in separate pages in the
> linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> thus must be adjusted when that array is copied to the user stack.
> Without that adjustment the value passed by the NOMMU kernel to the ELF
> program in the AT_EXECFN entry of the aux array doesn't make any sense
> and it may break programs that try to access memory pointed to by that
> entry.
> 
> [...]

Applied to for-next/execve, thanks!

[1/1] exec: fix linux_binprm::exec in transfer_args_to_stack()
      https://git.kernel.org/kees/c/2aea94ac14d1

Take care,
Kees Cook March 22, 2024, 3:48 a.m. UTC | #6
On Thu, Mar 21, 2024 at 12:52:16PM -0700, Max Filippov wrote:
> On Thu, Mar 21, 2024 at 10:05 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Wed, Mar 20, 2024 at 11:26:07AM -0700, Max Filippov wrote:
> > > In NUMMU kernel the value of linux_binprm::p is the offset inside the
> > > temporary program arguments array maintained in separate pages in the
> > > linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p
> > > thus must be adjusted when that array is copied to the user stack.
> > > Without that adjustment the value passed by the NOMMU kernel to the ELF
> > > program in the AT_EXECFN entry of the aux array doesn't make any sense
> > > and it may break programs that try to access memory pointed to by that
> > > entry.
> > >
> > > Adjust linux_binprm::exec before the successful return from the
> > > transfer_args_to_stack().
> >
> > What's the best way to test this? (Is there a qemu setup I can use to
> > see the before/after of AT_EXECFN?)
> 
> I put a readme with the steps to build such system here:
>   http://jcmvbkbc.org/~dumb/tmp/202403211236/README
> it uses a prebuilt rootfs image and a 6.8 kernel branch with two
> patches on top of it: one adds a dts and a defconfig and the other
> is this fix. The rootfs boots successfully with this fix, but panics
> if this fix is removed.

Ah, perfect! Thanks for this.

> The easiest way to actually see the AT_EXECFN is, I guess, to
> do something like that:
> ---8<---
> diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
> index fefc642541cb..22d34272a570 100644
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -659,6 +659,7 @@ static int create_elf_fdpic_tables(struct
> linux_binprm *bprm,
>        NEW_AUX_ENT(AT_EGID,    (elf_addr_t)
> from_kgid_munged(cred->user_ns, cred->egid));
>        NEW_AUX_ENT(AT_SECURE,  bprm->secureexec);
>        NEW_AUX_ENT(AT_EXECFN,  bprm->exec);
> +       pr_info("%s: AT_EXECFN = %#lx\n", __func__, bprm->exec);
> 
> #ifdef ARCH_DLINFO
>        nr = 0;
> ---8<---

Does musl have something like the LD_SHOW_AUXV env variable. With glibc,
I usually explore auxv like so:

$ LD_SHOW_AUXV=1 uname -a | grep EXECFN
AT_EXECFN:            /usr/bin/uname

> > How did you encounter the problem?
> 
> I'm doing xtensa FDPIC port of musl libc and this issue popped up when
> I began testing it on qemu-system-xtensa with the real linux kernel.
> Related post to the musl ML:
>   https://www.openwall.com/lists/musl/2024/03/20/2

Thanks!
Max Filippov March 22, 2024, 4:39 p.m. UTC | #7
On Thu, Mar 21, 2024 at 8:48 PM Kees Cook <keescook@chromium.org> wrote:
> On Thu, Mar 21, 2024 at 12:52:16PM -0700, Max Filippov wrote:
> > On Thu, Mar 21, 2024 at 10:05 AM Kees Cook <keescook@chromium.org> wrote:
> > > What's the best way to test this? (Is there a qemu setup I can use to
> > > see the before/after of AT_EXECFN?)
> >
> > I put a readme with the steps to build such system here:
> >   http://jcmvbkbc.org/~dumb/tmp/202403211236/README
> > it uses a prebuilt rootfs image and a 6.8 kernel branch with two
> > patches on top of it: one adds a dts and a defconfig and the other
> > is this fix. The rootfs boots successfully with this fix, but panics
> > if this fix is removed.
>
> Does musl have something like the LD_SHOW_AUXV env variable. With glibc,
> I usually explore auxv like so:
>
> $ LD_SHOW_AUXV=1 uname -a | grep EXECFN
> AT_EXECFN:            /usr/bin/uname

I couldn't find anything like that in either musl or uClibc-ng.
So I updated the above rootfs and put the following program
into it as /bin/test-auxv:

#include <stdio.h>
#include <sys/auxv.h>

int main()
{
       unsigned long p = getauxval(AT_EXECFN);
       fprintf(stderr, "AT_EXECFN: 0x%lx\n", p);
       if (p)
               fprintf(stderr, "%s\n", (const char *)p);
       return 0;
}

While looking at it I also noticed that /proc/<pid>/auxv is empty
on NOMMU, looks like there will be yet another fix for that.
diff mbox series

Patch

diff --git a/fs/exec.c b/fs/exec.c
index af4fbb61cd53..5ee2545c3e18 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -895,6 +895,7 @@  int transfer_args_to_stack(struct linux_binprm *bprm,
 			goto out;
 	}
 
+	bprm->exec += *sp_location - MAX_ARG_PAGES * PAGE_SIZE;
 	*sp_location = sp;
 
 out: