Message ID | 20181225070305.18221-1-cota@braap.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] softfloat: enforce softfloat if the host's FMA is broken | expand |
Hi Emilio, On 12/25/18 8:03 AM, Emilio G. Cota wrote: > The added branch to the FMA ops is marked as unlikely and therefore > its impact on performance (measured with fp-bench) is within noise range > when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. Did you forget to write here the result of your measurements? > > Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Suggested-by: Richard Henderson <richard.henderson@linaro.org> > Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> > --- > v2: drop the cpuid bits, since that work (1) is only a performance > optimization, not a correctness fix and (2) requires additional > consolidation work that I do not have time to do right now. > > fpu/softfloat.c | 33 +++++++++++++++++++++++++++++++++ > 1 file changed, 33 insertions(+) > > diff --git a/fpu/softfloat.c b/fpu/softfloat.c > index 59eac97d10..9132d7a0b0 100644 > --- a/fpu/softfloat.c > +++ b/fpu/softfloat.c > @@ -1542,6 +1542,8 @@ soft_f64_muladd(float64 a, float64 b, float64 c, int flags, > return float64_round_pack_canonical(pr, status); > } > > +static bool force_soft_fma; > + > float32 QEMU_FLATTEN > float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) > { > @@ -1562,6 +1564,11 @@ float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) > if (unlikely(!f32_is_zon3(ua, ub, uc))) { > goto soft; > } > + > + if (unlikely(force_soft_fma)) { > + goto soft; > + } > + > /* > * When (a || b) == 0, there's no need to check for under/over flow, > * since we know the addend is (normal || 0) and the product is 0. > @@ -1623,6 +1630,11 @@ float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s) > if (unlikely(!f64_is_zon3(ua, ub, uc))) { > goto soft; > } > + > + if (unlikely(force_soft_fma)) { > + goto soft; > + } > + > /* > * When (a || b) == 0, there's no need to check for under/over flow, > * since we know the addend is (normal || 0) and the product is 0. > @@ -7974,3 +7986,24 @@ float128 float128_scalbn(float128 a, int n, float_status *status) > , status); > > } > + > +static void __attribute__((constructor)) softfloat_init(void) > +{ > + union_float64 ua, ub, uc, ur; > + > + if (QEMU_NO_HARDFLOAT) { > + return; > + } > + /* > + * Test that the host's FMA is not obviously broken. For example, > + * glibc < 2.23 can perform an incorrect FMA on certain hosts; see > + * https://sourceware.org/bugzilla/show_bug.cgi?id=13304 > + */ > + ua.s = 0x0020000000000001ULL; > + ub.s = 0x3ca0000000000000ULL; > + uc.s = 0x0020000000000000ULL; > + ur.h = fma(ua.h, ub.h, uc.h); > + if (ur.s != 0x0020000000000001ULL) { > + force_soft_fma = true; > + } > +} >
On Tue, Dec 25, 2018 at 12:15 PM Philippe Mathieu-Daudé <f4bug@amsat.org> wrote: > > Hi Emilio, > > On 12/25/18 8:03 AM, Emilio G. Cota wrote: > > The added branch to the FMA ops is marked as unlikely and therefore > > its impact on performance (measured with fp-bench) is within noise range > > when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. > > Did you forget to write here the result of your measurements? > > > > > Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> > > Suggested-by: Richard Henderson <richard.henderson@linaro.org> > > > Signed-off-by: Emilio G. Cota <cota@braap.org> > > Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Oops I meant: Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> > > > --- > > v2: drop the cpuid bits, since that work (1) is only a performance > > optimization, not a correctness fix and (2) requires additional > > consolidation work that I do not have time to do right now. > > > > fpu/softfloat.c | 33 +++++++++++++++++++++++++++++++++ > > 1 file changed, 33 insertions(+) > > > > diff --git a/fpu/softfloat.c b/fpu/softfloat.c > > index 59eac97d10..9132d7a0b0 100644 > > --- a/fpu/softfloat.c > > +++ b/fpu/softfloat.c > > @@ -1542,6 +1542,8 @@ soft_f64_muladd(float64 a, float64 b, float64 c, int flags, > > return float64_round_pack_canonical(pr, status); > > } > > > > +static bool force_soft_fma; > > + > > float32 QEMU_FLATTEN > > float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) > > { > > @@ -1562,6 +1564,11 @@ float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) > > if (unlikely(!f32_is_zon3(ua, ub, uc))) { > > goto soft; > > } > > + > > + if (unlikely(force_soft_fma)) { > > + goto soft; > > + } > > + > > /* > > * When (a || b) == 0, there's no need to check for under/over flow, > > * since we know the addend is (normal || 0) and the product is 0. > > @@ -1623,6 +1630,11 @@ float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s) > > if (unlikely(!f64_is_zon3(ua, ub, uc))) { > > goto soft; > > } > > + > > + if (unlikely(force_soft_fma)) { > > + goto soft; > > + } > > + > > /* > > * When (a || b) == 0, there's no need to check for under/over flow, > > * since we know the addend is (normal || 0) and the product is 0. > > @@ -7974,3 +7986,24 @@ float128 float128_scalbn(float128 a, int n, float_status *status) > > , status); > > > > } > > + > > +static void __attribute__((constructor)) softfloat_init(void) > > +{ > > + union_float64 ua, ub, uc, ur; > > + > > + if (QEMU_NO_HARDFLOAT) { > > + return; > > + } > > + /* > > + * Test that the host's FMA is not obviously broken. For example, > > + * glibc < 2.23 can perform an incorrect FMA on certain hosts; see > > + * https://sourceware.org/bugzilla/show_bug.cgi?id=13304 > > + */ > > + ua.s = 0x0020000000000001ULL; > > + ub.s = 0x3ca0000000000000ULL; > > + uc.s = 0x0020000000000000ULL; > > + ur.h = fma(ua.h, ub.h, uc.h); > > + if (ur.s != 0x0020000000000001ULL) { > > + force_soft_fma = true; > > + } > > +} > >
diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 59eac97d10..9132d7a0b0 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1542,6 +1542,8 @@ soft_f64_muladd(float64 a, float64 b, float64 c, int flags, return float64_round_pack_canonical(pr, status); } +static bool force_soft_fma; + float32 QEMU_FLATTEN float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) { @@ -1562,6 +1564,11 @@ float32_muladd(float32 xa, float32 xb, float32 xc, int flags, float_status *s) if (unlikely(!f32_is_zon3(ua, ub, uc))) { goto soft; } + + if (unlikely(force_soft_fma)) { + goto soft; + } + /* * When (a || b) == 0, there's no need to check for under/over flow, * since we know the addend is (normal || 0) and the product is 0. @@ -1623,6 +1630,11 @@ float64_muladd(float64 xa, float64 xb, float64 xc, int flags, float_status *s) if (unlikely(!f64_is_zon3(ua, ub, uc))) { goto soft; } + + if (unlikely(force_soft_fma)) { + goto soft; + } + /* * When (a || b) == 0, there's no need to check for under/over flow, * since we know the addend is (normal || 0) and the product is 0. @@ -7974,3 +7986,24 @@ float128 float128_scalbn(float128 a, int n, float_status *status) , status); } + +static void __attribute__((constructor)) softfloat_init(void) +{ + union_float64 ua, ub, uc, ur; + + if (QEMU_NO_HARDFLOAT) { + return; + } + /* + * Test that the host's FMA is not obviously broken. For example, + * glibc < 2.23 can perform an incorrect FMA on certain hosts; see + * https://sourceware.org/bugzilla/show_bug.cgi?id=13304 + */ + ua.s = 0x0020000000000001ULL; + ub.s = 0x3ca0000000000000ULL; + uc.s = 0x0020000000000000ULL; + ur.h = fma(ua.h, ub.h, uc.h); + if (ur.s != 0x0020000000000001ULL) { + force_soft_fma = true; + } +}
The added branch to the FMA ops is marked as unlikely and therefore its impact on performance (measured with fp-bench) is within noise range when measured on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> --- v2: drop the cpuid bits, since that work (1) is only a performance optimization, not a correctness fix and (2) requires additional consolidation work that I do not have time to do right now. fpu/softfloat.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+)