diff mbox

[v2] target-sh4: add atomic tas

Message ID 1478182068-14082-1-git-send-email-laurent@vivier.eu (mailing list archive)
State New, archived
Headers show

Commit Message

Laurent Vivier Nov. 3, 2016, 2:07 p.m. UTC
Implement real atomic tas:

    When (Rn) = 0, 1 -> T
    Otherwise, 0 -> T
    In both cases, 1 -> MSB of (Rn)

using atomic_fetch_or_i32() and setcondi_i32().

Tested with image from:
http://wiki.qemu.org/download/sh-test-0.2.tar.bz2

This image contains a "tas_test" that runs without
error with this change.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
---
v2:
- don't use helper but atomic_fetch_or_i32
  Thank you Paolo!

 target-sh4/translate.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

Comments

Richard Henderson Nov. 3, 2016, 2:53 p.m. UTC | #1
On 11/03/2016 08:07 AM, Laurent Vivier wrote:
> Implement real atomic tas:
>
>     When (Rn) = 0, 1 -> T
>     Otherwise, 0 -> T
>     In both cases, 1 -> MSB of (Rn)
>
> using atomic_fetch_or_i32() and setcondi_i32().
>
> Tested with image from:
> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
>
> This image contains a "tas_test" that runs without
> error with this change.
>
> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
> ---
> v2:
> - don't use helper but atomic_fetch_or_i32
>   Thank yo

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
Paolo Bonzini Nov. 3, 2016, 3:32 p.m. UTC | #2
On 03/11/2016 15:07, Laurent Vivier wrote:
> Implement real atomic tas:
> 
>     When (Rn) = 0, 1 -> T
>     Otherwise, 0 -> T
>     In both cases, 1 -> MSB of (Rn)
> 
> using atomic_fetch_or_i32() and setcondi_i32().
> 
> Tested with image from:
> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
> 
> This image contains a "tas_test" that runs without
> error with this change.
> 
> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
> ---
> v2:
> - don't use helper but atomic_fetch_or_i32
>   Thank you Paolo!
> 
>  target-sh4/translate.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/target-sh4/translate.c b/target-sh4/translate.c
> index c89a147..1b83d59 100644
> --- a/target-sh4/translate.c
> +++ b/target-sh4/translate.c
> @@ -1640,18 +1640,15 @@ static void _decode_opc(DisasContext * ctx)
>  	tcg_gen_shri_i32(REG(B11_8), REG(B11_8), 16);
>  	return;
>      case 0x401b:		/* tas.b @Rn */
> -	{
> -	    TCGv addr, val;
> -	    addr = tcg_temp_local_new();
> -	    tcg_gen_mov_i32(addr, REG(B11_8));
> -	    val = tcg_temp_local_new();
> -            tcg_gen_qemu_ld_i32(val, addr, ctx->memidx, MO_UB);
> +        {
> +            TCGv val = tcg_temp_new();
> +            TCGv msb = tcg_const_i32(0x80);
> +            tcg_gen_atomic_fetch_or_i32(val, REG(B11_8), msb,
> +                                        ctx->memidx, MO_UB);
> +            tcg_temp_free(msb);
>              tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, val, 0);
> -	    tcg_gen_ori_i32(val, val, 0x80);
> -            tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_UB);
> -	    tcg_temp_free(val);
> -	    tcg_temp_free(addr);
> -	}
> +            tcg_temp_free(val);
> +        }
>  	return;
>      case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
>  	CHECK_FPU_ENABLED
> 

For 2.8?

Paolo
Laurent Vivier Nov. 3, 2016, 3:35 p.m. UTC | #3
Le 03/11/2016 à 16:32, Paolo Bonzini a écrit :
> 
> 
> On 03/11/2016 15:07, Laurent Vivier wrote:
>> Implement real atomic tas:
>>
>>     When (Rn) = 0, 1 -> T
>>     Otherwise, 0 -> T
>>     In both cases, 1 -> MSB of (Rn)
>>
>> using atomic_fetch_or_i32() and setcondi_i32().
>>
>> Tested with image from:
>> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
>>
>> This image contains a "tas_test" that runs without
>> error with this change.
>>
>> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
>> ---
>> v2:
>> - don't use helper but atomic_fetch_or_i32
>>   Thank you Paolo!
>>
>>  target-sh4/translate.c | 19 ++++++++-----------
>>  1 file changed, 8 insertions(+), 11 deletions(-)
>>
>> diff --git a/target-sh4/translate.c b/target-sh4/translate.c
>> index c89a147..1b83d59 100644
>> --- a/target-sh4/translate.c
>> +++ b/target-sh4/translate.c
>> @@ -1640,18 +1640,15 @@ static void _decode_opc(DisasContext * ctx)
>>  	tcg_gen_shri_i32(REG(B11_8), REG(B11_8), 16);
>>  	return;
>>      case 0x401b:		/* tas.b @Rn */
>> -	{
>> -	    TCGv addr, val;
>> -	    addr = tcg_temp_local_new();
>> -	    tcg_gen_mov_i32(addr, REG(B11_8));
>> -	    val = tcg_temp_local_new();
>> -            tcg_gen_qemu_ld_i32(val, addr, ctx->memidx, MO_UB);
>> +        {
>> +            TCGv val = tcg_temp_new();
>> +            TCGv msb = tcg_const_i32(0x80);
>> +            tcg_gen_atomic_fetch_or_i32(val, REG(B11_8), msb,
>> +                                        ctx->memidx, MO_UB);
>> +            tcg_temp_free(msb);
>>              tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, val, 0);
>> -	    tcg_gen_ori_i32(val, val, 0x80);
>> -            tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_UB);
>> -	    tcg_temp_free(val);
>> -	    tcg_temp_free(addr);
>> -	}
>> +            tcg_temp_free(val);
>> +        }
>>  	return;
>>      case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
>>  	CHECK_FPU_ENABLED
>>
> 
> For 2.8?

Is it possible?

Otherwise, can it be stored on a maintainer branch waiting the end of
the freeze?

Thanks,
Laurent
Paolo Bonzini Nov. 3, 2016, 4:18 p.m. UTC | #4
On 03/11/2016 16:35, Laurent Vivier wrote:
> Le 03/11/2016 à 16:32, Paolo Bonzini a écrit :
>>
>>
>> On 03/11/2016 15:07, Laurent Vivier wrote:
>>> Implement real atomic tas:
>>>
>>>     When (Rn) = 0, 1 -> T
>>>     Otherwise, 0 -> T
>>>     In both cases, 1 -> MSB of (Rn)
>>>
>>> using atomic_fetch_or_i32() and setcondi_i32().
>>>
>>> Tested with image from:
>>> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
>>>
>>> This image contains a "tas_test" that runs without
>>> error with this change.
>>>
>>> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
>>> ---
>>> v2:
>>> - don't use helper but atomic_fetch_or_i32
>>>   Thank you Paolo!
>>>
>>>  target-sh4/translate.c | 19 ++++++++-----------
>>>  1 file changed, 8 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/target-sh4/translate.c b/target-sh4/translate.c
>>> index c89a147..1b83d59 100644
>>> --- a/target-sh4/translate.c
>>> +++ b/target-sh4/translate.c
>>> @@ -1640,18 +1640,15 @@ static void _decode_opc(DisasContext * ctx)
>>>  	tcg_gen_shri_i32(REG(B11_8), REG(B11_8), 16);
>>>  	return;
>>>      case 0x401b:		/* tas.b @Rn */
>>> -	{
>>> -	    TCGv addr, val;
>>> -	    addr = tcg_temp_local_new();
>>> -	    tcg_gen_mov_i32(addr, REG(B11_8));
>>> -	    val = tcg_temp_local_new();
>>> -            tcg_gen_qemu_ld_i32(val, addr, ctx->memidx, MO_UB);
>>> +        {
>>> +            TCGv val = tcg_temp_new();
>>> +            TCGv msb = tcg_const_i32(0x80);
>>> +            tcg_gen_atomic_fetch_or_i32(val, REG(B11_8), msb,
>>> +                                        ctx->memidx, MO_UB);
>>> +            tcg_temp_free(msb);
>>>              tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, val, 0);
>>> -	    tcg_gen_ori_i32(val, val, 0x80);
>>> -            tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_UB);
>>> -	    tcg_temp_free(val);
>>> -	    tcg_temp_free(addr);
>>> -	}
>>> +            tcg_temp_free(val);
>>> +        }
>>>  	return;
>>>      case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
>>>  	CHECK_FPU_ENABLED
>>>
>>
>> For 2.8?
> 
> Is it possible?

Well, tas_test "runs without error with this change", I suppose it fails
before?  In other words, is this patch enough to run multithreaded sh4
programs with qemu-user?

Paolo
Laurent Vivier Nov. 3, 2016, 4:21 p.m. UTC | #5
Le 03/11/2016 à 17:18, Paolo Bonzini a écrit :
> 
> 
> On 03/11/2016 16:35, Laurent Vivier wrote:
>> Le 03/11/2016 à 16:32, Paolo Bonzini a écrit :
>>>
>>>
>>> On 03/11/2016 15:07, Laurent Vivier wrote:
>>>> Implement real atomic tas:
>>>>
>>>>     When (Rn) = 0, 1 -> T
>>>>     Otherwise, 0 -> T
>>>>     In both cases, 1 -> MSB of (Rn)
>>>>
>>>> using atomic_fetch_or_i32() and setcondi_i32().
>>>>
>>>> Tested with image from:
>>>> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
>>>>
>>>> This image contains a "tas_test" that runs without
>>>> error with this change.
>>>>
>>>> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
>>>> ---
>>>> v2:
>>>> - don't use helper but atomic_fetch_or_i32
>>>>   Thank you Paolo!
>>>>
>>>>  target-sh4/translate.c | 19 ++++++++-----------
>>>>  1 file changed, 8 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/target-sh4/translate.c b/target-sh4/translate.c
>>>> index c89a147..1b83d59 100644
>>>> --- a/target-sh4/translate.c
>>>> +++ b/target-sh4/translate.c
>>>> @@ -1640,18 +1640,15 @@ static void _decode_opc(DisasContext * ctx)
>>>>  	tcg_gen_shri_i32(REG(B11_8), REG(B11_8), 16);
>>>>  	return;
>>>>      case 0x401b:		/* tas.b @Rn */
>>>> -	{
>>>> -	    TCGv addr, val;
>>>> -	    addr = tcg_temp_local_new();
>>>> -	    tcg_gen_mov_i32(addr, REG(B11_8));
>>>> -	    val = tcg_temp_local_new();
>>>> -            tcg_gen_qemu_ld_i32(val, addr, ctx->memidx, MO_UB);
>>>> +        {
>>>> +            TCGv val = tcg_temp_new();
>>>> +            TCGv msb = tcg_const_i32(0x80);
>>>> +            tcg_gen_atomic_fetch_or_i32(val, REG(B11_8), msb,
>>>> +                                        ctx->memidx, MO_UB);
>>>> +            tcg_temp_free(msb);
>>>>              tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, val, 0);
>>>> -	    tcg_gen_ori_i32(val, val, 0x80);
>>>> -            tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_UB);
>>>> -	    tcg_temp_free(val);
>>>> -	    tcg_temp_free(addr);
>>>> -	}
>>>> +            tcg_temp_free(val);
>>>> +        }
>>>>  	return;
>>>>      case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
>>>>  	CHECK_FPU_ENABLED
>>>>
>>>
>>> For 2.8?
>>
>> Is it possible?
> 
> Well, tas_test "runs without error with this change", I suppose it fails
> before?  In other words, is this patch enough to run multithreaded sh4
> programs with qemu-user?

It should,:the problem was reported by Adrian (cc:) while compiling ghc
in qemu-sh4, but I have just tested the functionality with the softmmu
version, not the atomicity.

Adrian, could you test this patch?

Thanks,
Laurent
Laurent Vivier Nov. 3, 2016, 4:51 p.m. UTC | #6
Le 03/11/2016 à 17:21, Laurent Vivier a écrit :
> Le 03/11/2016 à 17:18, Paolo Bonzini a écrit :
>>
>>
>> On 03/11/2016 16:35, Laurent Vivier wrote:
>>> Le 03/11/2016 à 16:32, Paolo Bonzini a écrit :
>>>>
>>>>
>>>> On 03/11/2016 15:07, Laurent Vivier wrote:
>>>>> Implement real atomic tas:
>>>>>
>>>>>     When (Rn) = 0, 1 -> T
>>>>>     Otherwise, 0 -> T
>>>>>     In both cases, 1 -> MSB of (Rn)
>>>>>
>>>>> using atomic_fetch_or_i32() and setcondi_i32().
>>>>>
>>>>> Tested with image from:
>>>>> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
>>>>>
>>>>> This image contains a "tas_test" that runs without
>>>>> error with this change.
>>>>>
>>>>> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
>>>>> ---
>>>>> v2:
>>>>> - don't use helper but atomic_fetch_or_i32
>>>>>   Thank you Paolo!
>>>>>
>>>>>  target-sh4/translate.c | 19 ++++++++-----------
>>>>>  1 file changed, 8 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/target-sh4/translate.c b/target-sh4/translate.c
>>>>> index c89a147..1b83d59 100644
>>>>> --- a/target-sh4/translate.c
>>>>> +++ b/target-sh4/translate.c
>>>>> @@ -1640,18 +1640,15 @@ static void _decode_opc(DisasContext * ctx)
>>>>>  	tcg_gen_shri_i32(REG(B11_8), REG(B11_8), 16);
>>>>>  	return;
>>>>>      case 0x401b:		/* tas.b @Rn */
>>>>> -	{
>>>>> -	    TCGv addr, val;
>>>>> -	    addr = tcg_temp_local_new();
>>>>> -	    tcg_gen_mov_i32(addr, REG(B11_8));
>>>>> -	    val = tcg_temp_local_new();
>>>>> -            tcg_gen_qemu_ld_i32(val, addr, ctx->memidx, MO_UB);
>>>>> +        {
>>>>> +            TCGv val = tcg_temp_new();
>>>>> +            TCGv msb = tcg_const_i32(0x80);
>>>>> +            tcg_gen_atomic_fetch_or_i32(val, REG(B11_8), msb,
>>>>> +                                        ctx->memidx, MO_UB);
>>>>> +            tcg_temp_free(msb);
>>>>>              tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, val, 0);
>>>>> -	    tcg_gen_ori_i32(val, val, 0x80);
>>>>> -            tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_UB);
>>>>> -	    tcg_temp_free(val);
>>>>> -	    tcg_temp_free(addr);
>>>>> -	}
>>>>> +            tcg_temp_free(val);
>>>>> +        }
>>>>>  	return;
>>>>>      case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
>>>>>  	CHECK_FPU_ENABLED
>>>>>
>>>>
>>>> For 2.8?
>>>
>>> Is it possible?
>>
>> Well, tas_test "runs without error with this change", I suppose it fails
>> before?  

I need to add:

It doesn't fail before. The problem is only with qemu-sh4.

Laurent
Richard Henderson Nov. 3, 2016, 4:51 p.m. UTC | #7
On 11/03/2016 10:21 AM, Laurent Vivier wrote:
> Le 03/11/2016 à 17:18, Paolo Bonzini a écrit :
>> Well, tas_test "runs without error with this change", I suppose it fails
>> before?  In other words, is this patch enough to run multithreaded sh4
>> programs with qemu-user?
>
> It should,:the problem was reported by Adrian (cc:) while compiling ghc
> in qemu-sh4, but I have just tested the functionality with the softmmu
> version, not the atomicity.
>
> Adrian, could you test this patch?

It's a start, but sh4 has an interesting scheme to implement atomic sequences 
via special values in the stack pointer.  E.g. xchg is

	mova	1f, r0
	mov	sp, r1
	mov	#(0f-1f), sp
0:	mov.l	mem, out
	mov.l	in, mem
1:	mov	r1, sp

which is only atomic if you've got a UP kernel and have code in your interrupt 
entry point that recognizes the small negative value in SP to reset the PC as 
necessary.

For SH4A, there are proper load-locked/store-condition insns, but prior to that 
TAS was the only truly atomic insn.


r~
Paolo Bonzini Nov. 3, 2016, 5:52 p.m. UTC | #8
On 03/11/2016 17:51, Richard Henderson wrote:
>>> Well, tas_test "runs without error with this change", I suppose it fails
>>> before?  In other words, is this patch enough to run multithreaded sh4
>>> programs with qemu-user?
>>
>> It should,:the problem was reported by Adrian (cc:) while compiling ghc
>> in qemu-sh4, but I have just tested the functionality with the softmmu
>> version, not the atomicity.
>>
>> Adrian, could you test this patch?
> 
> It's a start, but sh4 has an interesting scheme to implement atomic
> sequences via special values in the stack pointer.  E.g. xchg is
> 
>     mova    1f, r0
>     mov    sp, r1
>     mov    #(0f-1f), sp
> 0:    mov.l    mem, out
>     mov.l    in, mem
> 1:    mov    r1, sp
> 
> which is only atomic if you've got a UP kernel and have code in your
> interrupt entry point that recognizes the small negative value in SP to
> reset the PC as necessary.

UP kernel = no sane way to implement this in user-mode qemu?  Doing
pattern matching on negative sp moves and the instructions in between is
probably not sane, even though GCC always has:

- mov/mov for exchange
- mov/cmpeq/bf/mov for compare-and-swap
- mov/mov/op/mov for fetch-and-foo
- mov/mov/and/not/mov for fetch-and-nand
- mov/op/mov for foo-and-fetch
- mov/and/not/mov for nand-and-fetch

Another possibility is to treat the load as a LL and the store as a SC
(implemented in turn with cmpxchg+branch if it fails).  cmpxchg spans
two basic blocks, so maybe one also needs to look at r0 and sp in
cpu_get_tb_cpu_state...

Anyhow this patch seems like a bugfix.

Paolo

> For SH4A, there are proper load-locked/store-condition insns, but prior
> to that TAS was the only truly atomic insn.
Richard Henderson Nov. 3, 2016, 7:15 p.m. UTC | #9
On 11/03/2016 11:52 AM, Paolo Bonzini wrote:
> UP kernel = no sane way to implement this in user-mode qemu?

Probably no straight-forward way, no.

> Another possibility is to treat the load as a LL and the store as a SC
> (implemented in turn with cmpxchg+branch if it fails).  cmpxchg spans
> two basic blocks, so maybe one also needs to look at r0 and sp in
> cpu_get_tb_cpu_state...

Yeah, that's a possibility.  With the store-conditional failure auto-branching 
back to the start of the sequence (r0+sp).

> Anyhow this patch seems like a bugfix.

Absolutely.


r~
Aurelien Jarno Nov. 4, 2016, 12:02 a.m. UTC | #10
On 2016-11-03 15:07, Laurent Vivier wrote:
> Implement real atomic tas:
> 
>     When (Rn) = 0, 1 -> T
>     Otherwise, 0 -> T
>     In both cases, 1 -> MSB of (Rn)
> 
> using atomic_fetch_or_i32() and setcondi_i32().
> 
> Tested with image from:
> http://wiki.qemu.org/download/sh-test-0.2.tar.bz2
> 
> This image contains a "tas_test" that runs without
> error with this change.
> 
> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
> ---
> v2:
> - don't use helper but atomic_fetch_or_i32
>   Thank you Paolo!

Thanks, this look good. I have tried it with my test image, and it
doesn't break it.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>

I consider this as a bugfix, not a new feature, so that should be fine
despite the soft freeze. Do you want me to send a pull request?

Aurelien
John Paul Adrian Glaubitz Nov. 4, 2016, 9:23 a.m. UTC | #11
On 11/03/2016 05:21 PM, Laurent Vivier wrote:
> It should,:the problem was reported by Adrian (cc:) while compiling ghc
> in qemu-sh4, but I have just tested the functionality with the softmmu
> version, not the atomicity.
> 
> Adrian, could you test this patch?

Will absolutely do that. Awesome to see progress here :).

I just can't do it right now as I just returned from Asia and just came
to the office to go through a long backlog of work and mail.

Adrian
John Paul Adrian Glaubitz Nov. 4, 2016, 9:43 a.m. UTC | #12
On 11/04/2016 10:23 AM, John Paul Adrian Glaubitz wrote:
> On 11/03/2016 05:21 PM, Laurent Vivier wrote:
>> It should,:the problem was reported by Adrian (cc:) while compiling ghc
>> in qemu-sh4, but I have just tested the functionality with the softmmu
>> version, not the atomicity.
>>
>> Adrian, could you test this patch?
> 
> Will absolutely do that. Awesome to see progress here :).

Ok, I couldn't wait. It doesn't help with the GHC issue, unfortunately:

root@ikarus:~/ghc-7.8.4/utils/ghc-pwd# ghc Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
qemu-sh4-static: /home/glaubitz/upstream/qemu/translate-all.c:175: tb_lock: Assertion `!have_tb_lock' failed.
qemu-sh4-static: /home/glaubitz/upstream/qemu/translate-all.c:175: tb_lock: Assertion `!have_tb_lock' failed.
Segmentation fault
root@ikarus:~/ghc-7.8.4/utils/ghc-pwd#

I've also seen it lock up and strace showing it hanging in a futex lock.

Adrian
Laurent Vivier Nov. 4, 2016, 9:53 a.m. UTC | #13
Le 04/11/2016 à 10:43, John Paul Adrian Glaubitz a écrit :
> On 11/04/2016 10:23 AM, John Paul Adrian Glaubitz wrote:
>> On 11/03/2016 05:21 PM, Laurent Vivier wrote:
>>> It should,:the problem was reported by Adrian (cc:) while compiling ghc
>>> in qemu-sh4, but I have just tested the functionality with the softmmu
>>> version, not the atomicity.
>>>
>>> Adrian, could you test this patch?
>>
>> Will absolutely do that. Awesome to see progress here :).
> 
> Ok, I couldn't wait. It doesn't help with the GHC issue, unfortunately:
> 
> root@ikarus:~/ghc-7.8.4/utils/ghc-pwd# ghc Main.hs
> [1 of 1] Compiling Main             ( Main.hs, Main.o )
> qemu-sh4-static: /home/glaubitz/upstream/qemu/translate-all.c:175: tb_lock: Assertion `!have_tb_lock' failed.
> qemu-sh4-static: /home/glaubitz/upstream/qemu/translate-all.c:175: tb_lock: Assertion `!have_tb_lock' failed.
> Segmentation fault
> root@ikarus:~/ghc-7.8.4/utils/ghc-pwd#
> 
> I've also seen it lock up and strace showing it hanging in a futex lock.

I think it's more likely a linux-user bug than a target-sh4 bug.

As you report in a mail to me in February, "do_futex()" must be
protected against parallel execution for some futex commands.

Thanks,
Laurent
John Paul Adrian Glaubitz Nov. 4, 2016, 10 a.m. UTC | #14
On 11/04/2016 10:53 AM, Laurent Vivier wrote:
> I think it's more likely a linux-user bug than a target-sh4 bug.
> 
> As you report in a mail to me in February, "do_futex()" must be
> protected against parallel execution for some futex commands.

FWIW, it works fine on qemu-user-armel last time I tested. I could
build GHC completely on qemu-user for armel without any issues.

Btw, if anyone wants to test themselves:

$ wget http://users.physik.fu-berlin.de/~glaubitz/sid-sh4-sbuild-ghc.tgz
$ tar xf sid-sh4-sbuild-ghc.tgz
(compile qemu with --target-list=sh4-linux-user --static)
$ cp -av qemu-sh4 sid-sh4-sbuild-ghc/usr/bin/qemu-sh4-static
$ chroot sid-sh4-sbuild-ghc
(in chroot):
$ cd /root/ghc-7.8.4/utils/ghc-pwd
$ ghc Main.hs

Adrian
Paolo Bonzini Nov. 4, 2016, 10:13 a.m. UTC | #15
On 04/11/2016 11:00, John Paul Adrian Glaubitz wrote:
> On 11/04/2016 10:53 AM, Laurent Vivier wrote:
>> I think it's more likely a linux-user bug than a target-sh4 bug.
>>
>> As you report in a mail to me in February, "do_futex()" must be
>> protected against parallel execution for some futex commands.
> 
> FWIW, it works fine on qemu-user-armel last time I tested. I could
> build GHC completely on qemu-user for armel without any issues.
> 
> Btw, if anyone wants to test themselves:
> 
> $ wget http://users.physik.fu-berlin.de/~glaubitz/sid-sh4-sbuild-ghc.tgz
> $ tar xf sid-sh4-sbuild-ghc.tgz
> (compile qemu with --target-list=sh4-linux-user --static)
> $ cp -av qemu-sh4 sid-sh4-sbuild-ghc/usr/bin/qemu-sh4-static
> $ chroot sid-sh4-sbuild-ghc
> (in chroot):
> $ cd /root/ghc-7.8.4/utils/ghc-pwd
> $ ghc Main.hs

If Haskell is compiled to use the "negative sp" trick that Richard
mentioned, it would rely on the SH machine being uniprocessor.  Try
running chroot with "taskset -c 0".

Paolo
John Paul Adrian Glaubitz Nov. 4, 2016, 10:16 a.m. UTC | #16
On 11/04/2016 11:13 AM, Paolo Bonzini wrote:
>> $ wget http://users.physik.fu-berlin.de/~glaubitz/sid-sh4-sbuild-ghc.tgz
>> $ tar xf sid-sh4-sbuild-ghc.tgz
>> (compile qemu with --target-list=sh4-linux-user --static)
>> $ cp -av qemu-sh4 sid-sh4-sbuild-ghc/usr/bin/qemu-sh4-static
>> $ chroot sid-sh4-sbuild-ghc
>> (in chroot):
>> $ cd /root/ghc-7.8.4/utils/ghc-pwd
>> $ ghc Main.hs
> 
> If Haskell is compiled to use the "negative sp" trick that Richard
> mentioned, it would rely on the SH machine being uniprocessor.  Try
> running chroot with "taskset -c 0".

Doesn't help unfortunately, still either crashes or locks up like this:

root@ikarus:~# strace -p 32415
strace: Process 32415 attached
futex(0x7f836c8bd4c8, FUTEX_WAIT_PRIVATE, 7, NULL

with 32415 being the qemu ghc process.

Adrian
Paolo Bonzini Nov. 4, 2016, 10:52 a.m. UTC | #17
On 04/11/2016 11:16, John Paul Adrian Glaubitz wrote:
>> > 
>> > If Haskell is compiled to use the "negative sp" trick that Richard
>> > mentioned, it would rely on the SH machine being uniprocessor.  Try
>> > running chroot with "taskset -c 0".
> Doesn't help unfortunately, still either crashes or locks up like this:
> 
> root@ikarus:~# strace -p 32415
> strace: Process 32415 attached
> futex(0x7f836c8bd4c8, FUTEX_WAIT_PRIVATE, 7, NULL
> 
> with 32415 being the qemu ghc process.

Indeed you would still need luck. :(

Paolo
diff mbox

Patch

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index c89a147..1b83d59 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1640,18 +1640,15 @@  static void _decode_opc(DisasContext * ctx)
 	tcg_gen_shri_i32(REG(B11_8), REG(B11_8), 16);
 	return;
     case 0x401b:		/* tas.b @Rn */
-	{
-	    TCGv addr, val;
-	    addr = tcg_temp_local_new();
-	    tcg_gen_mov_i32(addr, REG(B11_8));
-	    val = tcg_temp_local_new();
-            tcg_gen_qemu_ld_i32(val, addr, ctx->memidx, MO_UB);
+        {
+            TCGv val = tcg_temp_new();
+            TCGv msb = tcg_const_i32(0x80);
+            tcg_gen_atomic_fetch_or_i32(val, REG(B11_8), msb,
+                                        ctx->memidx, MO_UB);
+            tcg_temp_free(msb);
             tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_sr_t, val, 0);
-	    tcg_gen_ori_i32(val, val, 0x80);
-            tcg_gen_qemu_st_i32(val, addr, ctx->memidx, MO_UB);
-	    tcg_temp_free(val);
-	    tcg_temp_free(addr);
-	}
+            tcg_temp_free(val);
+        }
 	return;
     case 0xf00d: /* fsts FPUL,FRn - FPSCR: Nothing */
 	CHECK_FPU_ENABLED