diff mbox

[2/8] target/arm: optimize smul_dual() and neon_trn_u8() using extract op

Message ID 20170510200535.13268-3-f4bug@amsat.org (mailing list archive)
State New, archived
Headers show

Commit Message

Philippe Mathieu-Daudé May 10, 2017, 8:05 p.m. UTC
Applied using Coccinelle script.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 target/arm/translate.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Eric Blake May 10, 2017, 8:15 p.m. UTC | #1
On 05/10/2017 03:05 PM, Philippe Mathieu-Daudé wrote:
> Applied using Coccinelle script.

Thinking forward a year - if I want to reproduce this (to see if other
instances have crept in), I have to dig up a mail archive to learn the
formula you used.  Better is to list your coccinelle command line
directly in the commit message, so that reproducing the fix involves
less effort.  Same comment applies throughout the series.

For reference, see how I did it in commit de6e7951.
Richard Henderson May 10, 2017, 8:20 p.m. UTC | #2
On 05/10/2017 01:05 PM, Philippe Mathieu-Daudé wrote:
> -    tcg_gen_shri_i32(t1, t1, 8);
> -    tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
> +    tcg_gen_extract_i32(t1, t1, 8, 0x00ff00ff);

This is very wrong.  See my previous comment.


r~
Philippe Mathieu-Daudé May 10, 2017, 8:32 p.m. UTC | #3
On 05/10/2017 05:20 PM, Richard Henderson wrote:
> On 05/10/2017 01:05 PM, Philippe Mathieu-Daudé wrote:
>> -    tcg_gen_shri_i32(t1, t1, 8);
>> -    tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
>> +    tcg_gen_extract_i32(t1, t1, 8, 0x00ff00ff);
>
> This is very wrong.  See my previous comment.

Arghhh I see, I checked manually and though I had it...

I'll first check with the cocci script if it can handles this better 
then review the serie manually before bother you again.

Thinking about it, this should be quite easy unit-testable somehow ... 
Not sure if I want to start this path although.

Sorry for the noise and thank a lot for the review!

Phil.
Philippe Mathieu-Daudé May 12, 2017, 1:31 a.m. UTC | #4
Hi Richard,

On 05/10/2017 05:32 PM, Philippe Mathieu-Daudé wrote:
> On 05/10/2017 05:20 PM, Richard Henderson wrote:
>> On 05/10/2017 01:05 PM, Philippe Mathieu-Daudé wrote:
>>> -    tcg_gen_shri_i32(t1, t1, 8);
>>> -    tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
>>> +    tcg_gen_extract_i32(t1, t1, 8, 0x00ff00ff);
>>
>> This is very wrong.  See my previous comment.

Indeed, after correcting the script:

$ docker run -it -v `pwd`:`pwd` -w `pwd` petersenna/coccinelle --sp-file 
scripts/coccinelle/tcg_gen_extract.cocci --macro-file 
scripts/cocci-macro-file.h target/arm/translate.c --in-place
init_defs_builtins: /usr/lib64/coccinelle/standard.h
init_defs: scripts/cocci-macro-file.h
HANDLING: target/arm/translate.c
candidate at target/arm/translate.c:4703
   op_size: i32/i32 (same)
   low_bits: 8 (value: 0xff)
   len: 0xff00ff
   len_bits != low_bits
   candidate is NOT optimizable

candidate at target/arm/translate.c:342
   op_size: i32/i32 (same)
   low_bits: 8 (value: 0xff)
   len: 0xff00ff
   len_bits != low_bits
   candidate is NOT optimizable

>
> Arghhh I see, I checked manually and though I had it...
>
> I'll first check with the cocci script if it can handles this better
> then review the serie manually before bother you again.
>
> Thinking about it, this should be quite easy unit-testable somehow ...
> Not sure if I want to start this path although.
>
> Sorry for the noise and thank a lot for the review!
>
> Phil.
Philippe Mathieu-Daudé May 12, 2017, 2:49 a.m. UTC | #5
Hi Eric,

On 05/10/2017 05:15 PM, Eric Blake wrote:
> On 05/10/2017 03:05 PM, Philippe Mathieu-Daudé wrote:
>> Applied using Coccinelle script.
>
> Thinking forward a year - if I want to reproduce this (to see if other
> instances have crept in), I have to dig up a mail archive to learn the
> formula you used.  Better is to list your coccinelle command line
> directly in the commit message, so that reproducing the fix involves
> less effort.  Same comment applies throughout the series.
>
> For reference, see how I did it in commit de6e7951.

Ok!

In the cover I used cocci spatch directly thru a unofficial docker image 
which I have no idea it will be around in a year...
I'll let the docker example in the cover and add the spatch command in 
the commits.

Regards,

Phil.
diff mbox

Patch

diff --git a/target/arm/translate.c b/target/arm/translate.c
index 0b5a0bca06..3230efe1be 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -339,8 +339,7 @@  static void gen_smul_dual(TCGv_i32 a, TCGv_i32 b)
 static void gen_rev16(TCGv_i32 var)
 {
     TCGv_i32 tmp = tcg_temp_new_i32();
-    tcg_gen_shri_i32(tmp, var, 8);
-    tcg_gen_andi_i32(tmp, tmp, 0x00ff00ff);
+    tcg_gen_extract_i32(tmp, var, 8, 0x00ff00ff);
     tcg_gen_shli_i32(var, var, 8);
     tcg_gen_andi_i32(var, var, 0xff00ff00);
     tcg_gen_or_i32(var, var, tmp);
@@ -4700,8 +4699,7 @@  static void gen_neon_trn_u8(TCGv_i32 t0, TCGv_i32 t1)
     tcg_gen_andi_i32(tmp, t1, 0x00ff00ff);
     tcg_gen_or_i32(rd, rd, tmp);
 
-    tcg_gen_shri_i32(t1, t1, 8);
-    tcg_gen_andi_i32(t1, t1, 0x00ff00ff);
+    tcg_gen_extract_i32(t1, t1, 8, 0x00ff00ff);
     tcg_gen_andi_i32(tmp, t0, 0xff00ff00);
     tcg_gen_or_i32(t1, t1, tmp);
     tcg_gen_mov_i32(t0, rd);