diff mbox series

[1/1] target/ppc: fix ELFv2 signal handler endianness

Message ID 20200315155202.13107-1-vfazio@xes-inc.com (mailing list archive)
State New, archived
Headers show
Series [1/1] target/ppc: fix ELFv2 signal handler endianness | expand

Commit Message

Vincent Fazio March 15, 2020, 3:52 p.m. UTC
From: Vincent Fazio <vfazio@gmail.com>

In ELFv2, function pointers are entry points and are in host endianness.

Previously, the signal handler would be swapped if the target CPU was a
different endianness than the host. This would cause a SIGSEGV when
attempting to translate the opcode pointed to by the swapped address.

 Thread 1 "qemu-ppc64" received signal SIGSEGV, Segmentation fault.
 0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
 351        __builtin_memcpy(&r, ptr, sizeof(r));

 #0  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
 #1  0x00000000600a92fe in ldl_be_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:449
 #2  0x00000000600c0790 in translator_ldl_swap at qemu/include/exec/translator.h:201
 #3  0x000000006011c1ab in ppc_tr_translate_insn at qemu/target/ppc/translate.c:7856
 #4  0x000000006005ae70 in translator_loop at qemu/accel/tcg/translator.c:102

Now, no swap is performed and execution continues properly.

Signed-off-by: Vincent Fazio <vfazio@gmail.com>
---
 linux-user/ppc/signal.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Laurent Vivier March 15, 2020, 6:10 p.m. UTC | #1
Le 15/03/2020 à 16:52, Vincent Fazio a écrit :
> From: Vincent Fazio <vfazio@gmail.com>
> 
> In ELFv2, function pointers are entry points and are in host endianness.

"host endianness" is misleading here. "target endianness" is better.

> 
> Previously, the signal handler would be swapped if the target CPU was a
> different endianness than the host. This would cause a SIGSEGV when
> attempting to translate the opcode pointed to by the swapped address.

This is correct.

>  Thread 1 "qemu-ppc64" received signal SIGSEGV, Segmentation fault.
>  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
>  351        __builtin_memcpy(&r, ptr, sizeof(r));
> 
>  #0  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
>  #1  0x00000000600a92fe in ldl_be_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:449
>  #2  0x00000000600c0790 in translator_ldl_swap at qemu/include/exec/translator.h:201
>  #3  0x000000006011c1ab in ppc_tr_translate_insn at qemu/target/ppc/translate.c:7856
>  #4  0x000000006005ae70 in translator_loop at qemu/accel/tcg/translator.c:102
> 
> Now, no swap is performed and execution continues properly.
> 
> Signed-off-by: Vincent Fazio <vfazio@gmail.com>
> ---
>  linux-user/ppc/signal.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
> index 5b82af6cb6..c7f6455170 100644
> --- a/linux-user/ppc/signal.c
> +++ b/linux-user/ppc/signal.c
> @@ -567,9 +567,13 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
>          env->nip = tswapl(handler->entry);
>          env->gpr[2] = tswapl(handler->toc);
>      } else {
> -        /* ELFv2 PPC64 function pointers are entry points, but R12
> -         * must also be set */
> -        env->nip = tswapl((target_ulong) ka->_sa_handler);
> +        /*
> +         * ELFv2 PPC64 function pointers are entry points and are in host
> +         * endianness so should not to be swapped.

"target endianness"

> +         *
> +         * Note: R12 must also be set.
> +         */
> +        env->nip = (target_ulong) ka->_sa_handler;

The cast is not needed: nip and _sa_handler are abi_ulong.

>          env->gpr[12] = env->nip;
>      }
>  #else
> 

If you repost with the fix I've reported above you can add my:

Reviewed-by: Laurent Vivier <laurent@vivier.eu>

Thanks,
Laurent
Vincent Fazio March 16, 2020, 12:29 a.m. UTC | #2
Laurent,

On Sun, Mar 15, 2020 at 1:10 PM Laurent Vivier <laurent@vivier.eu> wrote:
>
> Le 15/03/2020 à 16:52, Vincent Fazio a écrit :
> > From: Vincent Fazio <vfazio@gmail.com>
> >
> > In ELFv2, function pointers are entry points and are in host endianness.
>
> "host endianness" is misleading here. "target endianness" is better.

I do want to clarify here. In a mixed endian scenario (my test case
was an x86 host and e5500 PPC BE target), the function pointers are in
host endianness (little endian) so that the virtual address can be
dereferenced by the host for the target instructions to be translated.

>
> >
> > Previously, the signal handler would be swapped if the target CPU was a
> > different endianness than the host. This would cause a SIGSEGV when
> > attempting to translate the opcode pointed to by the swapped address.
>
> This is correct.
>
> >  Thread 1 "qemu-ppc64" received signal SIGSEGV, Segmentation fault.
> >  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
> >  351        __builtin_memcpy(&r, ptr, sizeof(r));
> >
> >  #0  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
> >  #1  0x00000000600a92fe in ldl_be_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:449
> >  #2  0x00000000600c0790 in translator_ldl_swap at qemu/include/exec/translator.h:201
> >  #3  0x000000006011c1ab in ppc_tr_translate_insn at qemu/target/ppc/translate.c:7856
> >  #4  0x000000006005ae70 in translator_loop at qemu/accel/tcg/translator.c:102
> >
> > Now, no swap is performed and execution continues properly.
> >
> > Signed-off-by: Vincent Fazio <vfazio@gmail.com>
> > ---
> >  linux-user/ppc/signal.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
> > index 5b82af6cb6..c7f6455170 100644
> > --- a/linux-user/ppc/signal.c
> > +++ b/linux-user/ppc/signal.c
> > @@ -567,9 +567,13 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
> >          env->nip = tswapl(handler->entry);
> >          env->gpr[2] = tswapl(handler->toc);
> >      } else {
> > -        /* ELFv2 PPC64 function pointers are entry points, but R12
> > -         * must also be set */
> > -        env->nip = tswapl((target_ulong) ka->_sa_handler);
> > +        /*
> > +         * ELFv2 PPC64 function pointers are entry points and are in host
> > +         * endianness so should not to be swapped.
>
> "target endianness"
>
> > +         *
> > +         * Note: R12 must also be set.
> > +         */
> > +        env->nip = (target_ulong) ka->_sa_handler;
>
> The cast is not needed: nip and _sa_handler are abi_ulong.

I'll drop this in v2

>
> >          env->gpr[12] = env->nip;
> >      }
> >  #else
> >
>
> If you repost with the fix I've reported above you can add my:
>
> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
>

I'll hold off on reposting until the endianness wording is figured out.

> Thanks,
> Laurent

Thanks,
-Vincent
David Gibson March 16, 2020, 2:21 a.m. UTC | #3
On Sun, Mar 15, 2020 at 07:29:04PM -0500, Vincent Fazio wrote:
> Laurent,
> 
> On Sun, Mar 15, 2020 at 1:10 PM Laurent Vivier <laurent@vivier.eu> wrote:
> >
> > Le 15/03/2020 à 16:52, Vincent Fazio a écrit :
> > > From: Vincent Fazio <vfazio@gmail.com>
> > >
> > > In ELFv2, function pointers are entry points and are in host endianness.
> >
> > "host endianness" is misleading here. "target endianness" is better.

Yeah, the trouble here is that I think the ELF spec will use "host"
and "target" in a quite different sense than qemu.

> I do want to clarify here. In a mixed endian scenario (my test case
> was an x86 host and e5500 PPC BE target), the function pointers are in
> host endianness (little endian) so that the virtual address can be
> dereferenced by the host for the target instructions to be
> translated.

This can't be right.  The ELF is operating entirely within the guest,
and has no concept of a host (in the qemu sense).  Therefore it's
impossible for it to specify anything as "host endian" (again in the
qemu sense).

It *is* possible that it's little endian explicitly (in which case
we'd need a conditional swap that's different from the one we have
now).

But even that seems pretty odd.  AFAICT that target_sigaction
structure is copied verbatim from guest memory when the guest makes
the sigaction() syscall.  Are we expecting a BE process to put LE
parameters into a syscall structure?  That seems unlikely.

I really think you need to put some instrumentation in the sigaction()
call that comes before this, to see exactly what the guest process is
supplying there.

And then we maybe need to look at your guest side libc and/or a native
e5500 BE kernel to see what it expects in that structure.

> > > Previously, the signal handler would be swapped if the target CPU was a
> > > different endianness than the host. This would cause a SIGSEGV when
> > > attempting to translate the opcode pointed to by the swapped address.
> >
> > This is correct.
> >
> > >  Thread 1 "qemu-ppc64" received signal SIGSEGV, Segmentation fault.
> > >  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
> > >  351        __builtin_memcpy(&r, ptr, sizeof(r));
> > >
> > >  #0  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
> > >  #1  0x00000000600a92fe in ldl_be_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:449
> > >  #2  0x00000000600c0790 in translator_ldl_swap at qemu/include/exec/translator.h:201
> > >  #3  0x000000006011c1ab in ppc_tr_translate_insn at qemu/target/ppc/translate.c:7856
> > >  #4  0x000000006005ae70 in translator_loop at qemu/accel/tcg/translator.c:102
> > >
> > > Now, no swap is performed and execution continues properly.
> > >
> > > Signed-off-by: Vincent Fazio <vfazio@gmail.com>
> > > ---
> > >  linux-user/ppc/signal.c | 10 +++++++---
> > >  1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
> > > index 5b82af6cb6..c7f6455170 100644
> > > --- a/linux-user/ppc/signal.c
> > > +++ b/linux-user/ppc/signal.c
> > > @@ -567,9 +567,13 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
> > >          env->nip = tswapl(handler->entry);
> > >          env->gpr[2] = tswapl(handler->toc);
> > >      } else {
> > > -        /* ELFv2 PPC64 function pointers are entry points, but R12
> > > -         * must also be set */
> > > -        env->nip = tswapl((target_ulong) ka->_sa_handler);
> > > +        /*
> > > +         * ELFv2 PPC64 function pointers are entry points and are in host
> > > +         * endianness so should not to be swapped.
> >
> > "target endianness"
> >
> > > +         *
> > > +         * Note: R12 must also be set.
> > > +         */
> > > +        env->nip = (target_ulong) ka->_sa_handler;
> >
> > The cast is not needed: nip and _sa_handler are abi_ulong.
> 
> I'll drop this in v2
> 
> >
> > >          env->gpr[12] = env->nip;
> > >      }
> > >  #else
> > >
> >
> > If you repost with the fix I've reported above you can add my:
> >
> > Reviewed-by: Laurent Vivier <laurent@vivier.eu>
> >
> 
> I'll hold off on reposting until the endianness wording is figured out.
> 
> > Thanks,
> > Laurent
> 
> Thanks,
> -Vincent
>
Vincent Fazio March 18, 2020, 3 p.m. UTC | #4
David, Laurent,

On 3/15/20 9:21 PM, David Gibson wrote:
> On Sun, Mar 15, 2020 at 07:29:04PM -0500, Vincent Fazio wrote:
>> Laurent,
>>
>> On Sun, Mar 15, 2020 at 1:10 PM Laurent Vivier <laurent@vivier.eu> wrote:
>>> Le 15/03/2020 à 16:52, Vincent Fazio a écrit :
>>>> From: Vincent Fazio <vfazio@gmail.com>
>>>>
>>>> In ELFv2, function pointers are entry points and are in host endianness.
>>> "host endianness" is misleading here. "target endianness" is better.
> Yeah, the trouble here is that I think the ELF spec will use "host"
> and "target" in a quite different sense than qemu.
>
I'll be simplifying the wording in the message to just mention the 
problematic cross-endian scenario
>> I do want to clarify here. In a mixed endian scenario (my test case
>> was an x86 host and e5500 PPC BE target), the function pointers are in
>> host endianness (little endian) so that the virtual address can be
>> dereferenced by the host for the target instructions to be
>> translated.
> This can't be right.  The ELF is operating entirely within the guest,
> and has no concept of a host (in the qemu sense).  Therefore it's
> impossible for it to specify anything as "host endian" (again in the
> qemu sense).
>
> It *is* possible that it's little endian explicitly (in which case
> we'd need a conditional swap that's different from the one we have
> now).
>
> But even that seems pretty odd.  AFAICT that target_sigaction
> structure is copied verbatim from guest memory when the guest makes
> the sigaction() syscall.  Are we expecting a BE process to put LE
> parameters into a syscall structure?  That seems unlikely.
>
> I really think you need to put some instrumentation in the sigaction()
> call that comes before this, to see exactly what the guest process is
> supplying there.
>
> And then we maybe need to look at your guest side libc and/or a native
> e5500 BE kernel to see what it expects in that structure.
As we discussed in the other thread, I missed the endian swap done as 
part of get_user in do_sigaction. So while my initial determination for 
the root cause of the problem was wrong, the fix is still the same (drop 
the `tswapl` call). The commit message will be updated.
>>>> Previously, the signal handler would be swapped if the target CPU was a
>>>> different endianness than the host. This would cause a SIGSEGV when
>>>> attempting to translate the opcode pointed to by the swapped address.
>>> This is correct.
>>>
>>>>   Thread 1 "qemu-ppc64" received signal SIGSEGV, Segmentation fault.
>>>>   0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
>>>>   351        __builtin_memcpy(&r, ptr, sizeof(r));
>>>>
>>>>   #0  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
>>>>   #1  0x00000000600a92fe in ldl_be_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:449
>>>>   #2  0x00000000600c0790 in translator_ldl_swap at qemu/include/exec/translator.h:201
>>>>   #3  0x000000006011c1ab in ppc_tr_translate_insn at qemu/target/ppc/translate.c:7856
>>>>   #4  0x000000006005ae70 in translator_loop at qemu/accel/tcg/translator.c:102
>>>>
>>>> Now, no swap is performed and execution continues properly.
>>>>
>>>> Signed-off-by: Vincent Fazio <vfazio@gmail.com>
>>>> ---
>>>>   linux-user/ppc/signal.c | 10 +++++++---
>>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
>>>> index 5b82af6cb6..c7f6455170 100644
>>>> --- a/linux-user/ppc/signal.c
>>>> +++ b/linux-user/ppc/signal.c
>>>> @@ -567,9 +567,13 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
>>>>           env->nip = tswapl(handler->entry);
>>>>           env->gpr[2] = tswapl(handler->toc);
>>>>       } else {
>>>> -        /* ELFv2 PPC64 function pointers are entry points, but R12
>>>> -         * must also be set */
>>>> -        env->nip = tswapl((target_ulong) ka->_sa_handler);
>>>> +        /*
>>>> +         * ELFv2 PPC64 function pointers are entry points and are in host
>>>> +         * endianness so should not to be swapped.
>>> "target endianness"
>>>
>>>> +         *
>>>> +         * Note: R12 must also be set.
>>>> +         */
>>>> +        env->nip = (target_ulong) ka->_sa_handler;
>>> The cast is not needed: nip and _sa_handler are abi_ulong.
>> I'll drop this in v2
>>
>>>>           env->gpr[12] = env->nip;
>>>>       }
>>>>   #else
>>>>
>>> If you repost with the fix I've reported above you can add my:
>>>
>>> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
>>>
>> I'll hold off on reposting until the endianness wording is figured out.
I'll be submitting v2 shortly, but it will have a different commit 
message to better reflect the issue.
>>> Thanks,
>>> Laurent
>> Thanks,
>> -Vincent
>>
David Gibson March 19, 2020, 5:17 a.m. UTC | #5
On Wed, Mar 18, 2020 at 10:00:20AM -0500, Vincent Fazio wrote:
> David, Laurent,
> 
> On 3/15/20 9:21 PM, David Gibson wrote:
> > On Sun, Mar 15, 2020 at 07:29:04PM -0500, Vincent Fazio wrote:
> > > Laurent,
> > > 
> > > On Sun, Mar 15, 2020 at 1:10 PM Laurent Vivier <laurent@vivier.eu> wrote:
> > > > Le 15/03/2020 à 16:52, Vincent Fazio a écrit :
> > > > > From: Vincent Fazio <vfazio@gmail.com>
> > > > > 
> > > > > In ELFv2, function pointers are entry points and are in host endianness.
> > > > "host endianness" is misleading here. "target endianness" is better.
> > Yeah, the trouble here is that I think the ELF spec will use "host"
> > and "target" in a quite different sense than qemu.
> > 
> I'll be simplifying the wording in the message to just mention the
> problematic cross-endian scenario
> > > I do want to clarify here. In a mixed endian scenario (my test case
> > > was an x86 host and e5500 PPC BE target), the function pointers are in
> > > host endianness (little endian) so that the virtual address can be
> > > dereferenced by the host for the target instructions to be
> > > translated.
> > This can't be right.  The ELF is operating entirely within the guest,
> > and has no concept of a host (in the qemu sense).  Therefore it's
> > impossible for it to specify anything as "host endian" (again in the
> > qemu sense).
> > 
> > It *is* possible that it's little endian explicitly (in which case
> > we'd need a conditional swap that's different from the one we have
> > now).
> > 
> > But even that seems pretty odd.  AFAICT that target_sigaction
> > structure is copied verbatim from guest memory when the guest makes
> > the sigaction() syscall.  Are we expecting a BE process to put LE
> > parameters into a syscall structure?  That seems unlikely.
> > 
> > I really think you need to put some instrumentation in the sigaction()
> > call that comes before this, to see exactly what the guest process is
> > supplying there.
> > 
> > And then we maybe need to look at your guest side libc and/or a native
> > e5500 BE kernel to see what it expects in that structure.
> As we discussed in the other thread, I missed the endian swap done as part
> of get_user in do_sigaction.

Right, as did I when I looked the first time.

> So while my initial determination for the root
> cause of the problem was wrong, the fix is still the same (drop the `tswapl`
> call). The commit message will be updated.

Right, agreed.

> > > > > Previously, the signal handler would be swapped if the target CPU was a
> > > > > different endianness than the host. This would cause a SIGSEGV when
> > > > > attempting to translate the opcode pointed to by the swapped address.
> > > > This is correct.
> > > > 
> > > > >   Thread 1 "qemu-ppc64" received signal SIGSEGV, Segmentation fault.
> > > > >   0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
> > > > >   351        __builtin_memcpy(&r, ptr, sizeof(r));
> > > > > 
> > > > >   #0  0x00000000600a9257 in ldl_he_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:351
> > > > >   #1  0x00000000600a92fe in ldl_be_p (ptr=0x4c2c061000000000) at qemu/include/qemu/bswap.h:449
> > > > >   #2  0x00000000600c0790 in translator_ldl_swap at qemu/include/exec/translator.h:201
> > > > >   #3  0x000000006011c1ab in ppc_tr_translate_insn at qemu/target/ppc/translate.c:7856
> > > > >   #4  0x000000006005ae70 in translator_loop at qemu/accel/tcg/translator.c:102
> > > > > 
> > > > > Now, no swap is performed and execution continues properly.
> > > > > 
> > > > > Signed-off-by: Vincent Fazio <vfazio@gmail.com>
> > > > > ---
> > > > >   linux-user/ppc/signal.c | 10 +++++++---
> > > > >   1 file changed, 7 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
> > > > > index 5b82af6cb6..c7f6455170 100644
> > > > > --- a/linux-user/ppc/signal.c
> > > > > +++ b/linux-user/ppc/signal.c
> > > > > @@ -567,9 +567,13 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
> > > > >           env->nip = tswapl(handler->entry);
> > > > >           env->gpr[2] = tswapl(handler->toc);
> > > > >       } else {
> > > > > -        /* ELFv2 PPC64 function pointers are entry points, but R12
> > > > > -         * must also be set */
> > > > > -        env->nip = tswapl((target_ulong) ka->_sa_handler);
> > > > > +        /*
> > > > > +         * ELFv2 PPC64 function pointers are entry points and are in host
> > > > > +         * endianness so should not to be swapped.
> > > > "target endianness"
> > > > 
> > > > > +         *
> > > > > +         * Note: R12 must also be set.
> > > > > +         */
> > > > > +        env->nip = (target_ulong) ka->_sa_handler;
> > > > The cast is not needed: nip and _sa_handler are abi_ulong.
> > > I'll drop this in v2
> > > 
> > > > >           env->gpr[12] = env->nip;
> > > > >       }
> > > > >   #else
> > > > > 
> > > > If you repost with the fix I've reported above you can add my:
> > > > 
> > > > Reviewed-by: Laurent Vivier <laurent@vivier.eu>
> > > > 
> > > I'll hold off on reposting until the endianness wording is figured out.
> I'll be submitting v2 shortly, but it will have a different commit message
> to better reflect the issue.
> > > > Thanks,
> > > > Laurent
> > > Thanks,
> > > -Vincent
> > >
diff mbox series

Patch

diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
index 5b82af6cb6..c7f6455170 100644
--- a/linux-user/ppc/signal.c
+++ b/linux-user/ppc/signal.c
@@ -567,9 +567,13 @@  void setup_rt_frame(int sig, struct target_sigaction *ka,
         env->nip = tswapl(handler->entry);
         env->gpr[2] = tswapl(handler->toc);
     } else {
-        /* ELFv2 PPC64 function pointers are entry points, but R12
-         * must also be set */
-        env->nip = tswapl((target_ulong) ka->_sa_handler);
+        /*
+         * ELFv2 PPC64 function pointers are entry points and are in host
+         * endianness so should not to be swapped.
+         *
+         * Note: R12 must also be set.
+         */
+        env->nip = (target_ulong) ka->_sa_handler;
         env->gpr[12] = env->nip;
     }
 #else