[2/6] target/ppc: add vmulld instruction

Message ID 20200613042029.22321-3-ljp@linux.ibm.com (mailing list archive)
State New, archived
Series Add several Power ISA 3.1 32/64-bit vector instructions

Commit Message

Lijun Pan June 13, 2020, 4:20 a.m. UTC
vmulld: Vector Multiply Low Doubleword.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 target/ppc/helper.h                 | 1 +
 target/ppc/int_helper.c             | 1 +
 target/ppc/translate/vmx-impl.inc.c | 1 +
 target/ppc/translate/vmx-ops.inc.c  | 1 +
 4 files changed, 4 insertions(+)

Comments

Richard Henderson June 18, 2020, 11:27 p.m. UTC | #1
On 6/12/20 9:20 PM, Lijun Pan wrote:
> vmulld: Vector Multiply Low Doubleword.
> 
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
>  target/ppc/helper.h                 | 1 +
>  target/ppc/int_helper.c             | 1 +
>  target/ppc/translate/vmx-impl.inc.c | 1 +
>  target/ppc/translate/vmx-ops.inc.c  | 1 +
>  4 files changed, 4 insertions(+)
> 
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index 2dfa1c6942..c3f087ccb3 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -185,6 +185,7 @@ DEF_HELPER_3(vmuloub, void, avr, avr, avr)
>  DEF_HELPER_3(vmulouh, void, avr, avr, avr)
>  DEF_HELPER_3(vmulouw, void, avr, avr, avr)
>  DEF_HELPER_3(vmuluwm, void, avr, avr, avr)
> +DEF_HELPER_3(vmulld, void, avr, avr, avr)
>  DEF_HELPER_3(vslo, void, avr, avr, avr)
>  DEF_HELPER_3(vsro, void, avr, avr, avr)
>  DEF_HELPER_3(vsrv, void, avr, avr, avr)
> diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
> index be53cd6f68..afbcdd05b4 100644
> --- a/target/ppc/int_helper.c
> +++ b/target/ppc/int_helper.c
> @@ -533,6 +533,7 @@ void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
>          }                                                               \
>      }
>  VARITH_DO(muluwm, *, u32)
> +VARITH_DO(mulld, *, s64)

From this implementation, I would say that both vmuluwm and vmulld can be
implemented with tcg_gen_gvec_mul().

I guess vmuluwm was missed when many of the other vmx operations were converted
to gvec.

Please first convert vmuluwm to tcg_gen_gvec_mul, then implement vmulld in the
same manner.


r~
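
The point above, that both vmuluwm and vmulld reduce to a plain element-wise multiply keeping the low half of each product, can be illustrated with a standalone model. This is only a sketch: `vec128`, `model_vmuluwm`, and `model_vmulld` are hypothetical names, not QEMU's `ppc_avr_t` or its helpers. It uses unsigned lanes because the low 64 bits of a 64x64 product are the same whether the inputs are read as signed or unsigned, so one generic multiply per element size would cover the `s64` case as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register (not QEMU's ppc_avr_t). */
typedef union {
    uint32_t u32[4];
    uint64_t u64[2];
} vec128;

/* vmuluwm-style model: keep the low 32 bits of each 32-bit lane product. */
static void model_vmuluwm(vec128 *r, const vec128 *a, const vec128 *b)
{
    for (int i = 0; i < 4; i++) {
        r->u32[i] = a->u32[i] * b->u32[i];
    }
}

/* vmulld-style model: keep the low 64 bits of each 64-bit lane product. */
static void model_vmulld(vec128 *r, const vec128 *a, const vec128 *b)
{
    for (int i = 0; i < 2; i++) {
        /* Unsigned multiply wraps modulo 2^64, giving the low doubleword. */
        r->u64[i] = a->u64[i] * b->u64[i];
    }
}
```

The two loops differ only in lane width, which is the property that makes a single size-parameterised vector multiply a candidate for both instructions.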
Lijun Pan June 19, 2020, 5:30 a.m. UTC | #2
> On Jun 18, 2020, at 6:27 PM, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> On 6/12/20 9:20 PM, Lijun Pan wrote:
>> vmulld: Vector Multiply Low Doubleword.
>> 
>> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
>> ---
>> target/ppc/helper.h                 | 1 +
>> target/ppc/int_helper.c             | 1 +
>> target/ppc/translate/vmx-impl.inc.c | 1 +
>> target/ppc/translate/vmx-ops.inc.c  | 1 +
>> 4 files changed, 4 insertions(+)
>> 
>> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
>> index 2dfa1c6942..c3f087ccb3 100644
>> --- a/target/ppc/helper.h
>> +++ b/target/ppc/helper.h
>> @@ -185,6 +185,7 @@ DEF_HELPER_3(vmuloub, void, avr, avr, avr)
>> DEF_HELPER_3(vmulouh, void, avr, avr, avr)
>> DEF_HELPER_3(vmulouw, void, avr, avr, avr)
>> DEF_HELPER_3(vmuluwm, void, avr, avr, avr)
>> +DEF_HELPER_3(vmulld, void, avr, avr, avr)
>> DEF_HELPER_3(vslo, void, avr, avr, avr)
>> DEF_HELPER_3(vsro, void, avr, avr, avr)
>> DEF_HELPER_3(vsrv, void, avr, avr, avr)
>> diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
>> index be53cd6f68..afbcdd05b4 100644
>> --- a/target/ppc/int_helper.c
>> +++ b/target/ppc/int_helper.c
>> @@ -533,6 +533,7 @@ void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
>>         }                                                               \
>>     }
>> VARITH_DO(muluwm, *, u32)
>> +VARITH_DO(mulld, *, s64)
> 
> From this implementation, I would say that both vmuluwm and vmulld can be
> implemented with tcg_gen_gvec_mul().
> 
> I guess vmuluwm was missed when many of the other vmx operations were converted
> to gvec.
> 
> Please first convert vmuluwm to tcg_gen_gvec_mul, then implement vmulld in the
> same manner.

I grepped the git repo and found that only arm uses tcg_gen_gvec_mul.
The original implementation is very straightforward and is used in many places
all over target/ppc/int_helper.c. Why do we need to convert
to tcg_gen_gvec_mul, which seems to me very convoluted?

Thanks,
Lijun
Richard Henderson June 19, 2020, 9:16 p.m. UTC | #3
On 6/18/20 10:30 PM, Lijun Pan wrote:
> Why do we need to convert
> to tcg_gen_gvec_mul, which seems to me very convoluted?

Because that way we can generate a single host vector multiply instruction in
the compiled translation block.


r~
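
As a rough analogy for the answer above, the difference between a per-lane loop in an out-of-line helper and one vector operation can be sketched in plain C using the GCC/Clang `vector_size` extension. This is not TCG code and not how QEMU emits host instructions; `v2u64`, `mul_lanes_loop`, and `mul_lanes_vector` are hypothetical names. Whether the compiler turns the vector expression into a single instruction depends on the host and the flags, since not every host has a 64-bit lane multiply.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang extension: two 64-bit lanes packed into one 16-byte value. */
typedef uint64_t v2u64 __attribute__((vector_size(16)));

/* Helper-style: a function that walks the lanes one at a time. */
static void mul_lanes_loop(uint64_t r[2], const uint64_t a[2],
                           const uint64_t b[2])
{
    for (int i = 0; i < 2; i++) {
        r[i] = a[i] * b[i];
    }
}

/* Vector-style: one lane-wise multiply expression over the whole value. */
static inline v2u64 mul_lanes_vector(v2u64 a, v2u64 b)
{
    return a * b;
}
```

Both functions compute the same lanes; the second form leaves the code generator free to use a host vector multiply where one exists.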
Patch

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 2dfa1c6942..c3f087ccb3 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -185,6 +185,7 @@  DEF_HELPER_3(vmuloub, void, avr, avr, avr)
 DEF_HELPER_3(vmulouh, void, avr, avr, avr)
 DEF_HELPER_3(vmulouw, void, avr, avr, avr)
 DEF_HELPER_3(vmuluwm, void, avr, avr, avr)
+DEF_HELPER_3(vmulld, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index be53cd6f68..afbcdd05b4 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -533,6 +533,7 @@  void helper_vprtybq(ppc_avr_t *r, ppc_avr_t *b)
         }                                                               \
     }
 VARITH_DO(muluwm, *, u32)
+VARITH_DO(mulld, *, s64)
 #undef VARITH_DO
 #undef VARITH
 
diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c
index 403ed3a01c..4ee1df48f2 100644
--- a/target/ppc/translate/vmx-impl.inc.c
+++ b/target/ppc/translate/vmx-impl.inc.c
@@ -807,6 +807,7 @@  GEN_VXFORM_DUAL(vmulouw, PPC_ALTIVEC, PPC_NONE,
 GEN_VXFORM(vmulosb, 4, 4);
 GEN_VXFORM(vmulosh, 4, 5);
 GEN_VXFORM(vmulosw, 4, 6);
+GEN_VXFORM(vmulld,  4, 7);
 GEN_VXFORM(vmuleub, 4, 8);
 GEN_VXFORM(vmuleuh, 4, 9);
 GEN_VXFORM(vmuleuw, 4, 10);
diff --git a/target/ppc/translate/vmx-ops.inc.c b/target/ppc/translate/vmx-ops.inc.c
index 84e05fb827..499bed0a44 100644
--- a/target/ppc/translate/vmx-ops.inc.c
+++ b/target/ppc/translate/vmx-ops.inc.c
@@ -104,6 +104,7 @@  GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vmulosb, 4, 4),
 GEN_VXFORM(vmulosh, 4, 5),
 GEN_VXFORM_207(vmulosw, 4, 6),
+GEN_VXFORM_300(vmulld, 4, 7),
 GEN_VXFORM(vmuleub, 4, 8),
 GEN_VXFORM(vmuleuh, 4, 9),
 GEN_VXFORM_207(vmuleuw, 4, 10),