[1/2] target/arm: Disable VFPv4-D32 when NEON is not available

Message ID 20220928164719.655586-2-clg@kaod.org (mailing list archive)
State New, archived
Series ast2600: Disable NEON and VFPv4-D32

Commit Message

Cédric Le Goater Sept. 28, 2022, 4:47 p.m. UTC
As the Cortex-A7 MPCore Technical Reference Manual says:

  "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
  uses 16 double-precision registers. When the FPU is implemented with
  NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
  This register bank is shared with NEON."

Modify the MVFR0 register value of the Cortex-A7 to advertise 16
registers, rather than 32, when NEON is not available.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 target/arm/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Joel Stanley Sept. 28, 2022, 11 p.m. UTC | #1
On Wed, 28 Sept 2022 at 16:47, Cédric Le Goater <clg@kaod.org> wrote:
>
> As the Cortex A7 MPCore Technical reference says :
>
>   "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>   uses 16 double-precision registers. When the FPU is implemented with
>   NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>   This register bank is shared with NEON."
>
> Modify the mvfr0 register value of the cortex A7 to advertise only 16
> registers when NEON is not available, and not 32 registers.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>



> ---
>  target/arm/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 7ec3281da9aa..01dc74c32add 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>          cpu->isar.id_isar6 = u;
>
>          if (!arm_feature(env, ARM_FEATURE_M)) {

Can you explain why the test is put behind the !ARM_FEATURE_M check?

Reviewed-by: Joel Stanley <joel@jms.id.au>

> +            u = cpu->isar.mvfr0;
> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
> +            cpu->isar.mvfr0 = u;
> +
>              u = cpu->isar.mvfr1;
>              u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
>              u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
> --
> 2.37.3
>
Cédric Le Goater Sept. 29, 2022, 7:20 a.m. UTC | #2
On 9/29/22 01:00, Joel Stanley wrote:
> On Wed, 28 Sept 2022 at 16:47, Cédric Le Goater <clg@kaod.org> wrote:
>>
>> As the Cortex A7 MPCore Technical reference says :
>>
>>    "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>>    uses 16 double-precision registers. When the FPU is implemented with
>>    NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>>    This register bank is shared with NEON."
>>
>> Modify the mvfr0 register value of the cortex A7 to advertise only 16
>> registers when NEON is not available, and not 32 registers.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> 
> 
> 
>> ---
>>   target/arm/cpu.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>> index 7ec3281da9aa..01dc74c32add 100644
>> --- a/target/arm/cpu.c
>> +++ b/target/arm/cpu.c
>> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>>           cpu->isar.id_isar6 = u;
>>
>>           if (!arm_feature(env, ARM_FEATURE_M)) {
> 
> Can you explain why the test is put behind the !ARM_FEATURE_M check?

Do you mean the setting of MVFR0?

I put it there because it is next to the code clearing the SIMD (NEON)
bits of MVFR1, and that seemed more in sync with the spec:

    "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
     uses 16 double-precision registers. When the FPU is implemented with
     NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
     This register bank is shared with NEON."

(That said, M processors don't have NEON, so this part of the code
should never be reached.)

It could also be done outside of this test, because SIMDREG = 1 is
a valid value for M processors and the code path:

     if (!cpu->has_neon && !cpu->has_vfp) {

will set MVFR0 to 0 later on if needed.


M55 seems to be a special case though:

     cpu->isar.mvfr1 = 0x12100211

These are the FPU and MVE bits.

C.

> 
> Reviewed-by: Joel Stanley <joel@jms.id.au>
> 
>> +            u = cpu->isar.mvfr0;
>> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
>> +            cpu->isar.mvfr0 = u;
>> +
>>               u = cpu->isar.mvfr1;
>>               u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
>>               u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
>> --
>> 2.37.3
>>
Peter Maydell Sept. 29, 2022, 11:44 a.m. UTC | #3
On Wed, 28 Sept 2022 at 17:47, Cédric Le Goater <clg@kaod.org> wrote:
>
> As the Cortex A7 MPCore Technical reference says :
>
>   "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>   uses 16 double-precision registers. When the FPU is implemented with
>   NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>   This register bank is shared with NEON."
>
> Modify the mvfr0 register value of the cortex A7 to advertise only 16
> registers when NEON is not available, and not 32 registers.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  target/arm/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 7ec3281da9aa..01dc74c32add 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>          cpu->isar.id_isar6 = u;
>
>          if (!arm_feature(env, ARM_FEATURE_M)) {
> +            u = cpu->isar.mvfr0;
> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
> +            cpu->isar.mvfr0 = u;
> +

Architecturally, Neon implies that you must have 32 dp registers,
but not having Neon does not imply that you must only have 16.
In particular, the Cortex-A15 always implements VFPv4-D32
whether Neon is enabled or not.

If you want to be able to turn off D32 and restrict to 16
registers, I think you need to add a separate property to
control that.

thanks
-- PMM
Peter Maydell Sept. 29, 2022, 11:48 a.m. UTC | #4
On Thu, 29 Sept 2022 at 08:20, Cédric Le Goater <clg@kaod.org> wrote:
>
> On 9/29/22 01:00, Joel Stanley wrote:
> > On Wed, 28 Sept 2022 at 16:47, Cédric Le Goater <clg@kaod.org> wrote:
> >> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
> >>           cpu->isar.id_isar6 = u;
> >>
> >>           if (!arm_feature(env, ARM_FEATURE_M)) {
> >
> > Can you explain why the test is put behind the !ARM_FEATURE_M check?
>
> Do you mean the setting of MVFR0 ?
>
> because it was close to the code clearing the SIMD bits (NEON)
> of MVFR1 and it seemed more in sync with the specs :
>
>     "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>      uses 16 double-precision registers. When the FPU is implemented with
>      NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>      This register bank is shared with NEON."
>
> (That said, M processors don't have NEON, so this part of the code
> should never be reached )

They don't have Neon, but that means that cpu->has_neon is
false, so this part of the code *will* be reached. The reason
this sub-part of the "disable Neon" handling is inside
a not-M check is because M-profile has a different assignment
for some of the MVFR1 fields (check the comments in the FIELD
definitions in cpu.h), and zeroing things out based on the
A-profile meanings would be wrong.

thanks
-- PMM
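
Editorial sketch of the point above (not QEMU code): the A-profile MVFR1 fields that the "disable Neon" path zeroes overlap fields with other meanings on M-profile. The layout used here is an assumption taken from memory of the FIELD definitions in cpu.h: SIMDLS [11:8], SIMDINT [15:12], SIMDSP [19:16] and SIMDHP [23:20] on A-profile, with MVE at [11:8] and FP16 at [23:20] on M-profile. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Clear one 4-bit ID field whose lowest bit is at position 'shift'. */
static uint32_t clear_nibble(uint32_t value, int shift)
{
    assert(shift >= 0 && shift <= 28 && shift % 4 == 0);
    return value & ~(UINT32_C(0xf) << shift);
}

/*
 * Zero the (assumed) A-profile Advanced SIMD fields of MVFR1.
 * On M-profile the same bit positions hold unrelated fields, so the
 * value is returned unchanged, mirroring the not-M check in the patch.
 */
static uint32_t mvfr1_disable_neon(uint32_t mvfr1, bool is_m_profile)
{
    if (is_m_profile) {
        return mvfr1;
    }
    mvfr1 = clear_nibble(mvfr1, 8);   /* SIMDLS  */
    mvfr1 = clear_nibble(mvfr1, 12);  /* SIMDINT */
    mvfr1 = clear_nibble(mvfr1, 16);  /* SIMDSP  */
    mvfr1 = clear_nibble(mvfr1, 20);  /* SIMDHP  */
    return mvfr1;
}
```

With the M55 value quoted earlier in the thread, mvfr1_disable_neon(0x12100211, true) leaves the register alone, while applying the A-profile clearing to the same value would yield 0x12000011 and drop the nibbles at [11:8] and [23:20].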
Richard Henderson Sept. 29, 2022, 3:22 p.m. UTC | #5
On 9/29/22 04:44, Peter Maydell wrote:
> Architecturally, Neon implies that you must have 32 dp registers,
> but not having Neon does not imply that you must only have 16.
> In particular, the Cortex-A15 always implements VFPv4-D32
> whether Neon is enabled or not.

A15 requires VFP == NEON in its configuration.


r~
Peter Maydell Sept. 29, 2022, 3:29 p.m. UTC | #6
On Thu, 29 Sept 2022 at 16:22, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 9/29/22 04:44, Peter Maydell wrote:
> > Architecturally, Neon implies that you must have 32 dp registers,
> > but not having Neon does not imply that you must only have 16.
> > In particular, the Cortex-A15 always implements VFPv4-D32
> > whether Neon is enabled or not.
>
> A15 requires VFP == NEON in its configuration.

No, it requires that if you have Neon then you have VFP;
but it allows all of:
 * no VFP or Neon
 * VFP, no Neon
 * VFP and Neon

https://developer.arm.com/documentation/ddi0438/i/neon-and-vfp-unit/about-neon-and-vfp-unit

-- PMM
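
Editorial sketch of the three permitted configurations listed above, written as a small classifier. The enum and function names are hypothetical, and the rule encoded is only the one stated in this message for the Cortex-A15 (Neon requires VFP).

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    FP_NONE,          /* no VFP or Neon */
    FP_VFP_ONLY,      /* VFP, no Neon */
    FP_VFP_AND_NEON,  /* VFP and Neon */
    FP_INVALID        /* Neon without VFP: not in the permitted list */
} FpCombo;

/* Map the two build options onto one of the combinations above. */
static FpCombo classify_fp_config(bool has_vfp, bool has_neon)
{
    if (has_neon) {
        return has_vfp ? FP_VFP_AND_NEON : FP_INVALID;
    }
    return has_vfp ? FP_VFP_ONLY : FP_NONE;
}
```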
Cédric Le Goater Sept. 30, 2022, 2:59 p.m. UTC | #7
On 9/29/22 13:44, Peter Maydell wrote:
> On Wed, 28 Sept 2022 at 17:47, Cédric Le Goater <clg@kaod.org> wrote:
>>
>> As the Cortex A7 MPCore Technical reference says :
>>
>>    "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>>    uses 16 double-precision registers. When the FPU is implemented with
>>    NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>>    This register bank is shared with NEON."
>>
>> Modify the mvfr0 register value of the cortex A7 to advertise only 16
>> registers when NEON is not available, and not 32 registers.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>   target/arm/cpu.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>> index 7ec3281da9aa..01dc74c32add 100644
>> --- a/target/arm/cpu.c
>> +++ b/target/arm/cpu.c
>> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>>           cpu->isar.id_isar6 = u;
>>
>>           if (!arm_feature(env, ARM_FEATURE_M)) {
>> +            u = cpu->isar.mvfr0;
>> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
>> +            cpu->isar.mvfr0 = u;
>> +
> 
> Architecturally, Neon implies that you must have 32 dp registers,
> but not having Neon does not imply that you must only have 16.
> In particular, the Cortex-A15 always implements VFPv4-D32
> whether Neon is enabled or not.
> 
> If you want to be able to turn off D32 and restrict to 16
> registers, I think you need to add a separate property to
> control that.

Something like "vfp-d16" ?

Thanks,

C.
Peter Maydell Sept. 30, 2022, 3:10 p.m. UTC | #8
On Fri, 30 Sept 2022 at 15:59, Cédric Le Goater <clg@kaod.org> wrote:
>
> On 9/29/22 13:44, Peter Maydell wrote:
> > If you want to be able to turn off D32 and restrict to 16
> > registers, I think you need to add a separate property to
> > control that.
>
> Something like "vfp-d16" ?

That ends up being a sort of negative-polarity feature.
Maybe "vfp-d32" for "have 32 dregs", with 'no' meaning "only 16" ?

thanks
-- PMM
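
Editorial sketch of how a positive-polarity "vfp-d32" setting could be kept separate from the Neon setting, as suggested above. This is not the eventual QEMU implementation; the struct, its field names and the -1 error convention are all hypothetical. The only facts relied on are the MVFR0.SIMDReg encodings (0 = no register bank, 1 = 16 x 64-bit, 2 = 32 x 64-bit) and the rule from earlier in the thread that Neon implies 32 registers while the absence of Neon implies nothing about the register count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool has_vfp;
    bool has_neon;
    bool has_vfp_d32;  /* hypothetical "vfp-d32" property: true = 32 dregs */
} FpuConfig;

/*
 * Return the MVFR0.SIMDReg value for a configuration, or -1 if the
 * combination is inconsistent (Neon requested with only 16 registers).
 */
static int simdreg_for_config(const FpuConfig *cfg)
{
    assert(cfg != NULL);
    if (cfg->has_neon) {
        return cfg->has_vfp_d32 ? 2 : -1;  /* Neon implies D32 */
    }
    if (!cfg->has_vfp) {
        return 0;                          /* no FP/SIMD register bank */
    }
    return cfg->has_vfp_d32 ? 2 : 1;       /* VFP alone: D32 or D16 */
}
```

Under this sketch, "VFP, no Neon, vfp-d32=on" (the Cortex-A15 case described above) gives 2, and "VFP, no Neon, vfp-d32=off" (the Cortex-A7 case from the commit message) gives 1.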

Patch

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7ec3281da9aa..01dc74c32add 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1684,6 +1684,10 @@  static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         cpu->isar.id_isar6 = u;
 
         if (!arm_feature(env, ARM_FEATURE_M)) {
+            u = cpu->isar.mvfr0;
+            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
+            cpu->isar.mvfr0 = u;
+
             u = cpu->isar.mvfr1;
             u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
             u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
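
Editorial sketch of what the added lines do to MVFR0, written without QEMU's headers. In QEMU, FIELD_DP32 deposits a value into a named register field; the helpers below are hypothetical stand-ins for that, and the sample value 0x10110222 is used only as an ID value whose SIMDReg field is 2. MVFR0.SIMDReg occupies bits [3:0], where 1 means 16 x 64-bit registers and 2 means 32 x 64-bit registers.

```c
#include <assert.h>
#include <stdint.h>

#define MVFR0_SIMDREG_SHIFT 0
#define MVFR0_SIMDREG_LEN   4

/* Replace 'length' bits of 'value' starting at bit 'start' with 'field'. */
static uint32_t field_deposit32(uint32_t value, int start, int length,
                                uint32_t field)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~UINT32_C(0) >> (32 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* Read 'length' bits of 'value' starting at bit 'start'. */
static uint32_t field_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    return (value >> start) & (~UINT32_C(0) >> (32 - length));
}

/* Equivalent of the hunk: force SIMDReg to 1, i.e. advertise 16 registers. */
static uint32_t mvfr0_restrict_to_d16(uint32_t mvfr0)
{
    return field_deposit32(mvfr0, MVFR0_SIMDREG_SHIFT, MVFR0_SIMDREG_LEN, 1);
}
```

Only the low nibble changes: mvfr0_restrict_to_d16(0x10110222) returns 0x10110221, and every other field of the register is preserved.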