[RFC,1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set

Message ID	20220803122217.20847-2-lucas.araujo@eldorado.org.br (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org> From: "Lucas Mateus Castro(alqotel)" <lucas.araujo@eldorado.org.br> To: qemu-ppc@nongnu.org Cc: richard.henderson@linaro.org, danielhb413@gmail.com, "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>, Aurelien Jarno <aurelien@aurel32.net>, Peter Maydell <peter.maydell@linaro.org>, =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>, David Gibson <david@gibson.dropbear.id.au>, Greg Kurz <groug@kaod.org>, qemu-devel@nongnu.org (open list:All patches CC here) Subject: [RFC PATCH 1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set Date: Wed, 3 Aug 2022 09:22:15 -0300 Message-Id: <20220803122217.20847-2-lucas.araujo@eldorado.org.br> In-Reply-To: <20220803122217.20847-1-lucas.araujo@eldorado.org.br> References: <20220803122217.20847-1-lucas.araujo@eldorado.org.br> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=200.168.210.66; envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br X-Spam_score_int: -4 X-Spam_score: -0.5 X-Spam_bar: / X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
Series	[RFC,1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set \| expand [RFC,1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set [RFC,2/3] target/ppc: Bugfix fmul result with OE/UE set [RFC,3/3] target/ppc: Bugfix fdiv result with OE/UE set

Message ID

20220803122217.20847-2-lucas.araujo@eldorado.org.br (mailing list archive)

State

New, archived

Headers

From: "Lucas Mateus Castro(alqotel)" <lucas.araujo@eldorado.org.br>
To: qemu-ppc@nongnu.org
Cc: richard.henderson@linaro.org, danielhb413@gmail.com,
 "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>,
 Aurelien Jarno <aurelien@aurel32.net>,
 Peter Maydell <peter.maydell@linaro.org>,
 =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>,
 David Gibson <david@gibson.dropbear.id.au>, Greg Kurz <groug@kaod.org>,
 qemu-devel@nongnu.org (open list:All patches CC here)
Subject: [RFC PATCH 1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set
Date: Wed,  3 Aug 2022 09:22:15 -0300
Message-Id: <20220803122217.20847-2-lucas.araujo@eldorado.org.br>
In-Reply-To: <20220803122217.20847-1-lucas.araujo@eldorado.org.br>
References: <20220803122217.20847-1-lucas.araujo@eldorado.org.br>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=200.168.210.66;
 envelope-from=lucas.araujo@eldorado.org.br; helo=outlook.eldorado.org.br
X-Spam_score_int: -4
X-Spam_score: -0.5
X-Spam_bar: /
X-Spam_report: (-0.5 / 5.0 requ) BAYES_00=-1.9, PDS_HP_HELO_NORDNS=0.659,
 RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001,
 T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: "Qemu-devel"
 <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>

Series

[RFC,1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set | expand

Commit Message

Lucas Mateus Martins Araujo e Castro Aug. 3, 2022, 12:22 p.m. UTC

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

As mentioned in the functions float_overflow_excp and
float_underflow_excp, the result should be adjusted as mentioned in the
ISA (subtracted 192/1536 from the exponent of the intermediate result if
an overflow occurs with OE set and added 192/1536 to the exponent of the
intermediate result if an underflow occurs with UE set), but at those
functions the result has already been rounded so it is not possible to
add/subtract from the intermediate result anymore.
 
This patch creates a new function that receives the value that should be
subtracted/added from the exponent if an overflow/underflow happens, to
not leave some arbitrary numbers from the PowerISA in the middle of the
FPU code. If these numbers are 0 the new functions just call the old
ones.

I used 2 values here for overflow and underflow, maybe it'd be better to
just use the same ones, any thoughts?

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
An alternative I've thought was to always return the value adjusted if a
overflow or underflow occurs and in float_underflow_excp and
float_overflow_excp adjust it to inf/den/0 if OE/UE is 0, but I didn't
saw many advantages to that approach.
---
 fpu/softfloat.c         | 75 +++++++++++++++++++++++++++++++++++++++++
 include/fpu/softfloat.h |  2 ++
 target/ppc/fpu_helper.c | 10 ++++--
 3 files changed, 85 insertions(+), 2 deletions(-)

Comments

Richard Henderson Aug. 3, 2022, 4:18 p.m. UTC | #1

On 8/3/22 05:22, Lucas Mateus Castro(alqotel) wrote:
> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
> 
> As mentioned in the functions float_overflow_excp and
> float_underflow_excp, the result should be adjusted as mentioned in the
> ISA (subtracted 192/1536 from the exponent of the intermediate result if
> an overflow occurs with OE set and added 192/1536 to the exponent of the
> intermediate result if an underflow occurs with UE set), but at those
> functions the result has already been rounded so it is not possible to
> add/subtract from the intermediate result anymore.
>   
> This patch creates a new function that receives the value that should be
> subtracted/added from the exponent if an overflow/underflow happens, to
> not leave some arbitrary numbers from the PowerISA in the middle of the
> FPU code. If these numbers are 0 the new functions just call the old
> ones.
> 
> I used 2 values here for overflow and underflow, maybe it'd be better to
> just use the same ones, any thoughts?
> 
> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
> ---
> An alternative I've thought was to always return the value adjusted if a
> overflow or underflow occurs and in float_underflow_excp and
> float_overflow_excp adjust it to inf/den/0 if OE/UE is 0, but I didn't
> saw many advantages to that approach.
> ---
>   fpu/softfloat.c         | 75 +++++++++++++++++++++++++++++++++++++++++
>   include/fpu/softfloat.h |  2 ++
>   target/ppc/fpu_helper.c | 10 ++++--
>   3 files changed, 85 insertions(+), 2 deletions(-)
> 
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index 4a871ef2a1..a407129dcb 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -268,6 +268,8 @@ typedef bool (*f64_check_fn)(union_float64 a, union_float64 b);
>   
>   typedef float32 (*soft_f32_op2_fn)(float32 a, float32 b, float_status *s);
>   typedef float64 (*soft_f64_op2_fn)(float64 a, float64 b, float_status *s);
> +typedef float64 (*soft_f64_op2_int2_fn)(float64 a, float64 b, int c, int d,
> +                                        float_status *s);
>   typedef float   (*hard_f32_op2_fn)(float a, float b);
>   typedef double  (*hard_f64_op2_fn)(double a, double b);
>   
> @@ -401,6 +403,19 @@ float64_gen2(float64 xa, float64 xb, float_status *s,
>       return soft(ua.s, ub.s, s);
>   }
>   
> +static inline float64
> +float64_gen2_excp(float64 xa, float64 xb, int xc, int xd, float_status *s,
> +                  hard_f64_op2_fn hard, soft_f64_op2_fn soft,
> +                  soft_f64_op2_int2_fn soft_excp, f64_check_fn pre,
> +                  f64_check_fn post)
> +{
> +    if (xc || xd) {
> +        return soft_excp(xa, xb, xc, xd, s);
> +    } else {
> +        return float64_gen2(xa, xb, s, hard, soft, pre, post);
> +    }
> +}
> +
>   /*
>    * Classify a floating point number. Everything above float_class_qnan
>    * is a NaN so cls >= float_class_qnan is any NaN.
> @@ -1929,6 +1944,39 @@ static double hard_f64_sub(double a, double b)
>       return a - b;
>   }
>   
> +static float64 QEMU_SOFTFLOAT_ATTR
> +soft_f64_addsub_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
> +                        float_status *status, bool subtract)
> +{
> +    FloatParts64 pa, pb, *pr;
> +
> +    float64_unpack_canonical(&pa, a, status);
> +    float64_unpack_canonical(&pb, b, status);
> +    pr = parts_addsub(&pa, &pb, status, subtract);
> +
> +    if (unlikely(oe_sub && (pr->exp > 1023))) {
> +        pr->exp -= oe_sub;
> +        float_raise(float_flag_overflow, status);
> +    } else if (unlikely(ue_sum && (pr->exp < -1022))) {
> +        pr->exp += ue_sum;
> +        float_raise(float_flag_underflow, status);
> +    }
> +
> +    return float64_round_pack_canonical(pr, status);

This is incorrect, because the exponent is not fixed until the middle of round_pack_canonical.

I think you should not add new functions like this, with new parameters, but instead add 
fields to float_status, which would then be checked at the places currently setting 
underflow and overflow.


r~

Lucas Mateus Martins Araujo e Castro Aug. 3, 2022, 5:45 p.m. UTC | #2

On 03/08/2022 13:18, Richard Henderson wrote:
>
> On 8/3/22 05:22, Lucas Mateus Castro(alqotel) wrote:
>> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>>
>> As mentioned in the functions float_overflow_excp and
>> float_underflow_excp, the result should be adjusted as mentioned in the
>> ISA (subtracted 192/1536 from the exponent of the intermediate result if
>> an overflow occurs with OE set and added 192/1536 to the exponent of the
>> intermediate result if an underflow occurs with UE set), but at those
>> functions the result has already been rounded so it is not possible to
>> add/subtract from the intermediate result anymore.
>>
>> This patch creates a new function that receives the value that should be
>> subtracted/added from the exponent if an overflow/underflow happens, to
>> not leave some arbitrary numbers from the PowerISA in the middle of the
>> FPU code. If these numbers are 0 the new functions just call the old
>> ones.
>>
>> I used 2 values here for overflow and underflow, maybe it'd be better to
>> just use the same ones, any thoughts?
>>
>> Signed-off-by: Lucas Mateus Castro (alqotel) 
>> <lucas.araujo@eldorado.org.br>
>> ---
>> An alternative I've thought was to always return the value adjusted if a
>> overflow or underflow occurs and in float_underflow_excp and
>> float_overflow_excp adjust it to inf/den/0 if OE/UE is 0, but I didn't
>> saw many advantages to that approach.
>> ---
>>   fpu/softfloat.c         | 75 +++++++++++++++++++++++++++++++++++++++++
>>   include/fpu/softfloat.h |  2 ++
>>   target/ppc/fpu_helper.c | 10 ++++--
>>   3 files changed, 85 insertions(+), 2 deletions(-)
>>
>> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
>> index 4a871ef2a1..a407129dcb 100644
>> --- a/fpu/softfloat.c
>> +++ b/fpu/softfloat.c
>> @@ -268,6 +268,8 @@ typedef bool (*f64_check_fn)(union_float64 a, 
>> union_float64 b);
>>
>>   typedef float32 (*soft_f32_op2_fn)(float32 a, float32 b, 
>> float_status *s);
>>   typedef float64 (*soft_f64_op2_fn)(float64 a, float64 b, 
>> float_status *s);
>> +typedef float64 (*soft_f64_op2_int2_fn)(float64 a, float64 b, int c, 
>> int d,
>> +                                        float_status *s);
>>   typedef float   (*hard_f32_op2_fn)(float a, float b);
>>   typedef double  (*hard_f64_op2_fn)(double a, double b);
>>
>> @@ -401,6 +403,19 @@ float64_gen2(float64 xa, float64 xb, 
>> float_status *s,
>>       return soft(ua.s, ub.s, s);
>>   }
>>
>> +static inline float64
>> +float64_gen2_excp(float64 xa, float64 xb, int xc, int xd, 
>> float_status *s,
>> +                  hard_f64_op2_fn hard, soft_f64_op2_fn soft,
>> +                  soft_f64_op2_int2_fn soft_excp, f64_check_fn pre,
>> +                  f64_check_fn post)
>> +{
>> +    if (xc || xd) {
>> +        return soft_excp(xa, xb, xc, xd, s);
>> +    } else {
>> +        return float64_gen2(xa, xb, s, hard, soft, pre, post);
>> +    }
>> +}
>> +
>>   /*
>>    * Classify a floating point number. Everything above float_class_qnan
>>    * is a NaN so cls >= float_class_qnan is any NaN.
>> @@ -1929,6 +1944,39 @@ static double hard_f64_sub(double a, double b)
>>       return a - b;
>>   }
>>
>> +static float64 QEMU_SOFTFLOAT_ATTR
>> +soft_f64_addsub_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
>> +                        float_status *status, bool subtract)
>> +{
>> +    FloatParts64 pa, pb, *pr;
>> +
>> +    float64_unpack_canonical(&pa, a, status);
>> +    float64_unpack_canonical(&pb, b, status);
>> +    pr = parts_addsub(&pa, &pb, status, subtract);
>> +
>> +    if (unlikely(oe_sub && (pr->exp > 1023))) {
>> +        pr->exp -= oe_sub;
>> +        float_raise(float_flag_overflow, status);
>> +    } else if (unlikely(ue_sum && (pr->exp < -1022))) {
>> +        pr->exp += ue_sum;
>> +        float_raise(float_flag_underflow, status);
>> +    }
>> +
>> +    return float64_round_pack_canonical(pr, status);
>
> This is incorrect, because the exponent is not fixed until the middle 
> of round_pack_canonical.
>
> I think you should not add new functions like this, with new 
> parameters, but instead add
> fields to float_status, which would then be checked at the places 
> currently setting
> underflow and overflow.

So add overflow_correction and underflow_correction in 
'partsN(uncanon_normal)' so that:

if (exp >= exp_max) {
     if (overflow_correction != 0) {
         exp -= overflow_correction;
     }
}

And the equivalent for underflow, or a bool ppc_overflow_enable that 
uses a fixed value like:

if (exp >= exp_max) {
     if (ppc_overflow_enable) {
         exp -= ((fmt->exp_bias + 1) + (fmt->exp_bias + 1)/2);
     }
}

(and the equivalent for underflow) ?

>
>
> r~

Richard Henderson Aug. 3, 2022, 6:16 p.m. UTC | #3

On 8/3/22 10:45, Lucas Mateus Martins Araujo e Castro wrote:
> 
> On 03/08/2022 13:18, Richard Henderson wrote:
>>
>> On 8/3/22 05:22, Lucas Mateus Castro(alqotel) wrote:
>>> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>>>
>>> As mentioned in the functions float_overflow_excp and
>>> float_underflow_excp, the result should be adjusted as mentioned in the
>>> ISA (subtracted 192/1536 from the exponent of the intermediate result if
>>> an overflow occurs with OE set and added 192/1536 to the exponent of the
>>> intermediate result if an underflow occurs with UE set), but at those
>>> functions the result has already been rounded so it is not possible to
>>> add/subtract from the intermediate result anymore.
>>>
>>> This patch creates a new function that receives the value that should be
>>> subtracted/added from the exponent if an overflow/underflow happens, to
>>> not leave some arbitrary numbers from the PowerISA in the middle of the
>>> FPU code. If these numbers are 0 the new functions just call the old
>>> ones.
>>>
>>> I used 2 values here for overflow and underflow, maybe it'd be better to
>>> just use the same ones, any thoughts?
>>>
>>> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
>>> ---
>>> An alternative I've thought was to always return the value adjusted if a
>>> overflow or underflow occurs and in float_underflow_excp and
>>> float_overflow_excp adjust it to inf/den/0 if OE/UE is 0, but I didn't
>>> saw many advantages to that approach.
>>> ---
>>>   fpu/softfloat.c         | 75 +++++++++++++++++++++++++++++++++++++++++
>>>   include/fpu/softfloat.h |  2 ++
>>>   target/ppc/fpu_helper.c | 10 ++++--
>>>   3 files changed, 85 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
>>> index 4a871ef2a1..a407129dcb 100644
>>> --- a/fpu/softfloat.c
>>> +++ b/fpu/softfloat.c
>>> @@ -268,6 +268,8 @@ typedef bool (*f64_check_fn)(union_float64 a, union_float64 b);
>>>
>>>   typedef float32 (*soft_f32_op2_fn)(float32 a, float32 b, float_status *s);
>>>   typedef float64 (*soft_f64_op2_fn)(float64 a, float64 b, float_status *s);
>>> +typedef float64 (*soft_f64_op2_int2_fn)(float64 a, float64 b, int c, int d,
>>> +                                        float_status *s);
>>>   typedef float   (*hard_f32_op2_fn)(float a, float b);
>>>   typedef double  (*hard_f64_op2_fn)(double a, double b);
>>>
>>> @@ -401,6 +403,19 @@ float64_gen2(float64 xa, float64 xb, float_status *s,
>>>       return soft(ua.s, ub.s, s);
>>>   }
>>>
>>> +static inline float64
>>> +float64_gen2_excp(float64 xa, float64 xb, int xc, int xd, float_status *s,
>>> +                  hard_f64_op2_fn hard, soft_f64_op2_fn soft,
>>> +                  soft_f64_op2_int2_fn soft_excp, f64_check_fn pre,
>>> +                  f64_check_fn post)
>>> +{
>>> +    if (xc || xd) {
>>> +        return soft_excp(xa, xb, xc, xd, s);
>>> +    } else {
>>> +        return float64_gen2(xa, xb, s, hard, soft, pre, post);
>>> +    }
>>> +}
>>> +
>>>   /*
>>>    * Classify a floating point number. Everything above float_class_qnan
>>>    * is a NaN so cls >= float_class_qnan is any NaN.
>>> @@ -1929,6 +1944,39 @@ static double hard_f64_sub(double a, double b)
>>>       return a - b;
>>>   }
>>>
>>> +static float64 QEMU_SOFTFLOAT_ATTR
>>> +soft_f64_addsub_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
>>> +                        float_status *status, bool subtract)
>>> +{
>>> +    FloatParts64 pa, pb, *pr;
>>> +
>>> +    float64_unpack_canonical(&pa, a, status);
>>> +    float64_unpack_canonical(&pb, b, status);
>>> +    pr = parts_addsub(&pa, &pb, status, subtract);
>>> +
>>> +    if (unlikely(oe_sub && (pr->exp > 1023))) {
>>> +        pr->exp -= oe_sub;
>>> +        float_raise(float_flag_overflow, status);
>>> +    } else if (unlikely(ue_sum && (pr->exp < -1022))) {
>>> +        pr->exp += ue_sum;
>>> +        float_raise(float_flag_underflow, status);
>>> +    }
>>> +
>>> +    return float64_round_pack_canonical(pr, status);
>>
>> This is incorrect, because the exponent is not fixed until the middle of 
>> round_pack_canonical.
>>
>> I think you should not add new functions like this, with new parameters, but instead add
>> fields to float_status, which would then be checked at the places currently setting
>> underflow and overflow.
> 
> So add overflow_correction and underflow_correction in 'partsN(uncanon_normal)' so that:
> 
> if (exp >= exp_max) {
>      if (overflow_correction != 0) {
>          exp -= overflow_correction;
>      }
> }
> 
> And the equivalent for underflow, or a bool ppc_overflow_enable that uses a fixed value like:
> 
> if (exp >= exp_max) {
>      if (ppc_overflow_enable) {
>          exp -= ((fmt->exp_bias + 1) + (fmt->exp_bias + 1)/2);
>      }
> }
> 
> (and the equivalent for underflow) ?

Something like that.

I would suggest pre-computing that adjustment into fmt, via FLOAT_PARAMS.
Naming is always hard, but how about exp_re_bias?

The flag(s) should not contain "ppc" in the name.  But perhaps

   s->rebias_overflow
   s->rebias_underflow


r~

Lucas Mateus Martins Araujo e Castro Aug. 3, 2022, 6:42 p.m. UTC | #4

On 03/08/2022 15:16, Richard Henderson wrote:
>
> On 8/3/22 10:45, Lucas Mateus Martins Araujo e Castro wrote:
>>
>> On 03/08/2022 13:18, Richard Henderson wrote:
>>>
>>> On 8/3/22 05:22, Lucas Mateus Castro(alqotel) wrote:
>>>> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>>>>
>>>> As mentioned in the functions float_overflow_excp and
>>>> float_underflow_excp, the result should be adjusted as mentioned in 
>>>> the
>>>> ISA (subtracted 192/1536 from the exponent of the intermediate 
>>>> result if
>>>> an overflow occurs with OE set and added 192/1536 to the exponent 
>>>> of the
>>>> intermediate result if an underflow occurs with UE set), but at those
>>>> functions the result has already been rounded so it is not possible to
>>>> add/subtract from the intermediate result anymore.
>>>>
>>>> This patch creates a new function that receives the value that 
>>>> should be
>>>> subtracted/added from the exponent if an overflow/underflow 
>>>> happens, to
>>>> not leave some arbitrary numbers from the PowerISA in the middle of 
>>>> the
>>>> FPU code. If these numbers are 0 the new functions just call the old
>>>> ones.
>>>>
>>>> I used 2 values here for overflow and underflow, maybe it'd be 
>>>> better to
>>>> just use the same ones, any thoughts?
>>>>
>>>> Signed-off-by: Lucas Mateus Castro (alqotel) 
>>>> <lucas.araujo@eldorado.org.br>
>>>> ---
>>>> An alternative I've thought was to always return the value adjusted 
>>>> if a
>>>> overflow or underflow occurs and in float_underflow_excp and
>>>> float_overflow_excp adjust it to inf/den/0 if OE/UE is 0, but I didn't
>>>> saw many advantages to that approach.
>>>> ---
>>>>   fpu/softfloat.c         | 75 
>>>> +++++++++++++++++++++++++++++++++++++++++
>>>>   include/fpu/softfloat.h |  2 ++
>>>>   target/ppc/fpu_helper.c | 10 ++++--
>>>>   3 files changed, 85 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
>>>> index 4a871ef2a1..a407129dcb 100644
>>>> --- a/fpu/softfloat.c
>>>> +++ b/fpu/softfloat.c
>>>> @@ -268,6 +268,8 @@ typedef bool (*f64_check_fn)(union_float64 a, 
>>>> union_float64 b);
>>>>
>>>>   typedef float32 (*soft_f32_op2_fn)(float32 a, float32 b, 
>>>> float_status *s);
>>>>   typedef float64 (*soft_f64_op2_fn)(float64 a, float64 b, 
>>>> float_status *s);
>>>> +typedef float64 (*soft_f64_op2_int2_fn)(float64 a, float64 b, int 
>>>> c, int d,
>>>> +                                        float_status *s);
>>>>   typedef float   (*hard_f32_op2_fn)(float a, float b);
>>>>   typedef double  (*hard_f64_op2_fn)(double a, double b);
>>>>
>>>> @@ -401,6 +403,19 @@ float64_gen2(float64 xa, float64 xb, 
>>>> float_status *s,
>>>>       return soft(ua.s, ub.s, s);
>>>>   }
>>>>
>>>> +static inline float64
>>>> +float64_gen2_excp(float64 xa, float64 xb, int xc, int xd, 
>>>> float_status *s,
>>>> +                  hard_f64_op2_fn hard, soft_f64_op2_fn soft,
>>>> +                  soft_f64_op2_int2_fn soft_excp, f64_check_fn pre,
>>>> +                  f64_check_fn post)
>>>> +{
>>>> +    if (xc || xd) {
>>>> +        return soft_excp(xa, xb, xc, xd, s);
>>>> +    } else {
>>>> +        return float64_gen2(xa, xb, s, hard, soft, pre, post);
>>>> +    }
>>>> +}
>>>> +
>>>>   /*
>>>>    * Classify a floating point number. Everything above 
>>>> float_class_qnan
>>>>    * is a NaN so cls >= float_class_qnan is any NaN.
>>>> @@ -1929,6 +1944,39 @@ static double hard_f64_sub(double a, double b)
>>>>       return a - b;
>>>>   }
>>>>
>>>> +static float64 QEMU_SOFTFLOAT_ATTR
>>>> +soft_f64_addsub_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
>>>> +                        float_status *status, bool subtract)
>>>> +{
>>>> +    FloatParts64 pa, pb, *pr;
>>>> +
>>>> +    float64_unpack_canonical(&pa, a, status);
>>>> +    float64_unpack_canonical(&pb, b, status);
>>>> +    pr = parts_addsub(&pa, &pb, status, subtract);
>>>> +
>>>> +    if (unlikely(oe_sub && (pr->exp > 1023))) {
>>>> +        pr->exp -= oe_sub;
>>>> +        float_raise(float_flag_overflow, status);
>>>> +    } else if (unlikely(ue_sum && (pr->exp < -1022))) {
>>>> +        pr->exp += ue_sum;
>>>> +        float_raise(float_flag_underflow, status);
>>>> +    }
>>>> +
>>>> +    return float64_round_pack_canonical(pr, status);
>>>
>>> This is incorrect, because the exponent is not fixed until the 
>>> middle of
>>> round_pack_canonical.
>>>
>>> I think you should not add new functions like this, with new 
>>> parameters, but instead add
>>> fields to float_status, which would then be checked at the places 
>>> currently setting
>>> underflow and overflow.
>>
>> So add overflow_correction and underflow_correction in 
>> 'partsN(uncanon_normal)' so that:
>>
>> if (exp >= exp_max) {
>>      if (overflow_correction != 0) {
>>          exp -= overflow_correction;
>>      }
>> }
>>
>> And the equivalent for underflow, or a bool ppc_overflow_enable that 
>> uses a fixed value like:
>>
>> if (exp >= exp_max) {
>>      if (ppc_overflow_enable) {
>>          exp -= ((fmt->exp_bias + 1) + (fmt->exp_bias + 1)/2);
>>      }
>> }
>>
>> (and the equivalent for underflow) ?
>
> Something like that.
>
> I would suggest pre-computing that adjustment into fmt, via FLOAT_PARAMS.
> Naming is always hard, but how about exp_re_bias?
>
> The flag(s) should not contain "ppc" in the name.  But perhaps
>
>   s->rebias_overflow
>   s->rebias_underflow

rebias_* sounds good to me.

Also I imagine that these bools would be set by mtfsf, mtfsfi, mtfsb0 
and mtfsb1, in which case it'd make these patches significantly shorter. 
I'll send a v2 with these changes

>
>
>
> r~

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 4a871ef2a1..a407129dcb 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -268,6 +268,8 @@  typedef bool (*f64_check_fn)(union_float64 a, union_float64 b);
 
 typedef float32 (*soft_f32_op2_fn)(float32 a, float32 b, float_status *s);
 typedef float64 (*soft_f64_op2_fn)(float64 a, float64 b, float_status *s);
+typedef float64 (*soft_f64_op2_int2_fn)(float64 a, float64 b, int c, int d,
+                                        float_status *s);
 typedef float   (*hard_f32_op2_fn)(float a, float b);
 typedef double  (*hard_f64_op2_fn)(double a, double b);
 
@@ -401,6 +403,19 @@  float64_gen2(float64 xa, float64 xb, float_status *s,
     return soft(ua.s, ub.s, s);
 }
 
+static inline float64
+float64_gen2_excp(float64 xa, float64 xb, int xc, int xd, float_status *s,
+                  hard_f64_op2_fn hard, soft_f64_op2_fn soft,
+                  soft_f64_op2_int2_fn soft_excp, f64_check_fn pre,
+                  f64_check_fn post)
+{
+    if (xc || xd) {
+        return soft_excp(xa, xb, xc, xd, s);
+    } else {
+        return float64_gen2(xa, xb, s, hard, soft, pre, post);
+    }
+}
+
 /*
  * Classify a floating point number. Everything above float_class_qnan
  * is a NaN so cls >= float_class_qnan is any NaN.
@@ -1929,6 +1944,39 @@  static double hard_f64_sub(double a, double b)
     return a - b;
 }
 
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_f64_addsub_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
+                        float_status *status, bool subtract)
+{
+    FloatParts64 pa, pb, *pr;
+
+    float64_unpack_canonical(&pa, a, status);
+    float64_unpack_canonical(&pb, b, status);
+    pr = parts_addsub(&pa, &pb, status, subtract);
+
+    if (unlikely(oe_sub && (pr->exp > 1023))) {
+        pr->exp -= oe_sub;
+        float_raise(float_flag_overflow, status);
+    } else if (unlikely(ue_sum && (pr->exp < -1022))) {
+        pr->exp += ue_sum;
+        float_raise(float_flag_underflow, status);
+    }
+
+    return float64_round_pack_canonical(pr, status);
+}
+
+static float64 soft_f64_add_excp_en(float64 a, float64 b, int oe_sub,
+                                    int ue_sum, float_status *status)
+{
+    return soft_f64_addsub_excp_en(a, b, oe_sub, ue_sum, status, false);
+}
+
+static float64 soft_f64_sub_excp_en(float64 a, float64 b, int oe_sub,
+                                    int ue_sum, float_status *status)
+{
+    return soft_f64_addsub_excp_en(a, b, oe_sub, ue_sum, status, true);
+}
+
 static bool f32_addsubmul_post(union_float32 a, union_float32 b)
 {
     if (QEMU_HARDFLOAT_2F32_USE_FP) {
@@ -1960,6 +2008,15 @@  static float64 float64_addsub(float64 a, float64 b, float_status *s,
                         f64_is_zon2, f64_addsubmul_post);
 }
 
+static float64 float64_addsub_excp_en(float64 a, float64 b, int oe_sum,
+                                      int ue_sub, float_status *s,
+                                      hard_f64_op2_fn hard, soft_f64_op2_fn soft,
+                                      soft_f64_op2_int2_fn soft_excp)
+{
+    return float64_gen2_excp(a, b, oe_sum, ue_sub, s, hard, soft, soft_excp,
+                             f64_is_zon2, f64_addsubmul_post);
+}
+
 float32 QEMU_FLATTEN
 float32_add(float32 a, float32 b, float_status *s)
 {
@@ -1984,6 +2041,24 @@  float64_sub(float64 a, float64 b, float_status *s)
     return float64_addsub(a, b, s, hard_f64_sub, soft_f64_sub);
 }
 
+float64 QEMU_FLATTEN
+float64_add_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
+                    float_status *s)
+{
+    return float64_addsub_excp_en(a, b, oe_sub, ue_sum, s,
+                                  hard_f64_add, soft_f64_add,
+                                  soft_f64_add_excp_en);
+}
+
+float64 QEMU_FLATTEN
+float64_sub_excp_en(float64 a, float64 b, int oe_sub, int ue_sum,
+                    float_status *s)
+{
+    return float64_addsub_excp_en(a, b, oe_sub, ue_sum, s,
+                                  hard_f64_sub, soft_f64_sub,
+                                  soft_f64_sub_excp_en);
+}
+
 static float64 float64r32_addsub(float64 a, float64 b, float_status *status,
                                  bool subtract)
 {
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 3dcf20e3a2..76bf628a29 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -772,7 +772,9 @@  float128 float64_to_float128(float64, float_status *status);
 *----------------------------------------------------------------------------*/
 float64 float64_round_to_int(float64, float_status *status);
 float64 float64_add(float64, float64, float_status *status);
+float64 float64_add_excp_en(float64, float64, int, int, float_status *status);
 float64 float64_sub(float64, float64, float_status *status);
+float64 float64_sub_excp_en(float64, float64, int, int, float_status *status);
 float64 float64_mul(float64, float64, float_status *status);
 float64 float64_div(float64, float64, float_status *status);
 float64 float64_rem(float64, float64, float_status *status);
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 7ab6beadad..cb82c91340 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -529,7 +529,10 @@  static void float_invalid_op_addsub(CPUPPCState *env, int flags,
 /* fadd - fadd. */
 float64 helper_fadd(CPUPPCState *env, float64 arg1, float64 arg2)
 {
-    float64 ret = float64_add(arg1, arg2, &env->fp_status);
+    int oe_sub = (FP_OE & env->fpscr) ? 1536 : 0;
+    int ue_add = (FP_UE & env->fpscr) ? 1536 : 0;
+    float64 ret = float64_add_excp_en(arg1, arg2, oe_sub, ue_add,
+                                      &env->fp_status);
     int flags = get_float_exception_flags(&env->fp_status);
 
     if (unlikely(flags & float_flag_invalid)) {
@@ -554,7 +557,10 @@  float64 helper_fadds(CPUPPCState *env, float64 arg1, float64 arg2)
 /* fsub - fsub. */
 float64 helper_fsub(CPUPPCState *env, float64 arg1, float64 arg2)
 {
-    float64 ret = float64_sub(arg1, arg2, &env->fp_status);
+    int oe_sub = (FP_OE & env->fpscr) ? 1536 : 0;
+    int ue_add = (FP_UE & env->fpscr) ? 1536 : 0;
+    float64 ret = float64_sub_excp_en(arg1, arg2, oe_sub, ue_add,
+                                      &env->fp_status);
     int flags = get_float_exception_flags(&env->fp_status);
 
     if (unlikely(flags & float_flag_invalid)) {

[RFC,1/3] target/ppc: Bugfix fadd/fsub result with OE/UE set

Commit Message

Comments

Patch