[RFC,63/65] fpu: implement full set compare for fp16

Message ID	20200710104920.13550-64-frank.chang@sifive.com (mailing list archive)
State	New, archived
Headers
From: frank.chang@sifive.com
To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>, Frank Chang <frank.chang@sifive.com>, Chih-Min Chao <chihmin.chao@sifive.com>, Kito Cheng <kito.cheng@sifive.com>, Alex Bennée <alex.bennee@linaro.org>, Aurelien Jarno <aurelien@aurel32.net>
Subject: [RFC 63/65] fpu: implement full set compare for fp16
Date: Fri, 10 Jul 2020 18:49:17 +0800
Message-Id: <20200710104920.13550-64-frank.chang@sifive.com>
In-Reply-To: <20200710104920.13550-1-frank.chang@sifive.com>
References: <20200710104920.13550-1-frank.chang@sifive.com>
Series	target/riscv: support vector extension v0.9

Commit Message

Frank Chang July 10, 2020, 10:49 a.m. UTC

From: Kito Cheng <kito.cheng@sifive.com>

Signed-off-by: Kito Cheng <kito.cheng@sifive.com>
Signed-off-by: Chih-Min Chao <chihmin.chao@sifive.com>
Signed-off-by: Frank Chang <frank.chang@sifive.com>
---
 fpu/softfloat.c         | 240 ++++++++++++++++++++++++++++++++++++++++
 include/fpu/softfloat.h |   8 ++
 2 files changed, 248 insertions(+)

Comments

Chih-Min Chao July 14, 2020, 9:29 a.m. UTC | #1

On Fri, Jul 10, 2020 at 8:26 PM Alex Bennée <alex.bennee@linaro.org> wrote:

>
> Alex Bennée <alex.bennee@linaro.org> writes:
>
> > frank.chang@sifive.com writes:
> >
> >> From: Kito Cheng <kito.cheng@sifive.com>
> >>
> >> Signed-off-by: Kito Cheng <kito.cheng@sifive.com>
> >> Signed-off-by: Chih-Min Chao <chihmin.chao@sifive.com>
> >> Signed-off-by: Frank Chang <frank.chang@sifive.com>
> >
> > NACK I'm afraid. What's wrong with the existing float_compare support?
> >
> > Even if you did want to bring in aliases for these functions within
> > softfloat itself the correct way would be to use the decomposed
> > float_compare support for a bunch of stubs and not restore the old style
> > error prone bit masking code.
>
> In fact see the example float32_eq inline function in the softfloat.h
> header.
>
> --
> Alex Bennée
>

Hi Alex,

Thanks for pointing out that this patch uses the wrong, old-style
implementation. This part will be reworked in the next, separate softfloat PR.

Thanks
Chih-Min Chao
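
For readers following the thread, the stub approach suggested in the review
can be sketched roughly as below. This is only an illustration under stated
assumptions, not the code that was eventually merged: the names f16_bits,
f16_relation, f16_status and f16_compare_sketch are hypothetical stand-ins so
that the example compiles outside the QEMU tree. Inside softfloat the wrappers
would call float16_compare() and float16_compare_quiet() instead. The sketch
also assumes the default NaN encoding (quiet bit set means quiet NaN) and
skips input-denormal squashing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's float16, FloatRelation and float_status. */
typedef uint16_t f16_bits;              /* raw IEEE 754 binary16 bits */

typedef enum {
    REL_LESS = -1,
    REL_EQUAL = 0,
    REL_GREATER = 1,
    REL_UNORDERED = 2,
} f16_relation;

typedef struct {
    bool invalid;                       /* stand-in for float_flag_invalid */
} f16_status;

static bool f16_is_nan(f16_bits a)
{
    return (a & 0x7fff) > 0x7c00;       /* exponent all ones, fraction != 0 */
}

static bool f16_is_snan(f16_bits a)
{
    return f16_is_nan(a) && !(a & 0x0200);  /* assumes quiet bit = bit 9 */
}

/* Map sign-magnitude bits to an integer that orders like the value. */
static int32_t f16_key(f16_bits a)
{
    int32_t mag = a & 0x7fff;
    return (a & 0x8000) ? -mag : mag;   /* +0 and -0 both map to 0 */
}

/* One decomposed compare; every predicate below is a thin stub over it. */
static f16_relation f16_compare_sketch(f16_bits a, f16_bits b, bool quiet,
                                       f16_status *s)
{
    if (f16_is_nan(a) || f16_is_nan(b)) {
        if (!quiet || f16_is_snan(a) || f16_is_snan(b)) {
            s->invalid = true;
        }
        return REL_UNORDERED;
    }
    int32_t ka = f16_key(a), kb = f16_key(b);
    return ka < kb ? REL_LESS : ka > kb ? REL_GREATER : REL_EQUAL;
}

static inline bool f16_eq(f16_bits a, f16_bits b, f16_status *s)
{
    return f16_compare_sketch(a, b, false, s) == REL_EQUAL;
}

static inline bool f16_le(f16_bits a, f16_bits b, f16_status *s)
{
    return f16_compare_sketch(a, b, false, s) <= REL_EQUAL;
}

static inline bool f16_lt(f16_bits a, f16_bits b, f16_status *s)
{
    return f16_compare_sketch(a, b, false, s) == REL_LESS;
}

static inline bool f16_unordered(f16_bits a, f16_bits b, f16_status *s)
{
    return f16_compare_sketch(a, b, false, s) == REL_UNORDERED;
}

static inline bool f16_eq_quiet(f16_bits a, f16_bits b, f16_status *s)
{
    return f16_compare_sketch(a, b, true, s) == REL_EQUAL;
}

static inline bool f16_unordered_quiet(f16_bits a, f16_bits b, f16_status *s)
{
    return f16_compare_sketch(a, b, true, s) == REL_UNORDERED;
}
```

Each predicate becomes a one-line test on the relation value, so the NaN
handling and signed-zero logic live in one place rather than being repeated
in eight functions.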

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 028b857167..8bebea1142 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -401,6 +401,34 @@  float64_gen2(float64 xa, float64 xb, float_status *s,
     return soft(ua.s, ub.s, s);
 }
 
+/*----------------------------------------------------------------------------
+| Returns the fraction bits of the half-precision floating-point value `a'.
+*----------------------------------------------------------------------------*/
+
+static inline uint32_t extractFloat16Frac(float16 a)
+{
+    return float16_val(a) & 0x3ff;
+}
+
+/*----------------------------------------------------------------------------
+| Returns the exponent bits of the half-precision floating-point value `a'.
+*----------------------------------------------------------------------------*/
+
+static inline int extractFloat16Exp(float16 a)
+{
+    return (float16_val(a) >> 10) & 0x1f;
+}
+
+/*----------------------------------------------------------------------------
+| Returns the sign bit of the half-precision floating-point value `a'.
+*----------------------------------------------------------------------------*/
+
+static inline bool extractFloat16Sign(float16 a)
+{
+    return float16_val(a) >> 15;
+}
+
+
 /*----------------------------------------------------------------------------
 | Returns the fraction bits of the single-precision floating-point value `a'.
 *----------------------------------------------------------------------------*/
@@ -5006,6 +5034,218 @@  float64 float64_log2(float64 a, float_status *status)
     return normalizeRoundAndPackFloat64(zSign, 0x408, zSig, status);
 }
 
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point value `a' is equal to
+| the corresponding value `b', and 0 otherwise.  The invalid exception is
+| raised if either operand is a NaN.  Otherwise, the comparison is performed
+| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_eq(float16 a, float16 b, float_status *status)
+{
+    uint16_t av, bv;
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        float_raise(float_flag_invalid, status);
+        return 0;
+    }
+    av = float16_val(a);
+    bv = float16_val(b);
+    return (av == bv) || ((uint16_t) ((av | bv) << 1) == 0);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point value `a' is less than
+| or equal to the corresponding value `b', and 0 otherwise.  The invalid
+| exception is raised if either operand is a NaN.  The comparison is performed
+| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_le(float16 a, float16 b, float_status *status)
+{
+    bool aSign, bSign;
+    uint16_t av, bv;
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        float_raise(float_flag_invalid, status);
+        return 0;
+    }
+    aSign = extractFloat16Sign(a);
+    bSign = extractFloat16Sign(b);
+    av = float16_val(a);
+    bv = float16_val(b);
+    if (aSign != bSign) {
+        return aSign || ((uint16_t) ((av | bv) << 1) == 0);
+    }
+    return (av == bv) || (aSign ^ (av < bv));
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point value `a' is less than
+| the corresponding value `b', and 0 otherwise.  The invalid exception is
+| raised if either operand is a NaN.  The comparison is performed according
+| to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_lt(float16 a, float16 b, float_status *status)
+{
+    bool aSign, bSign;
+    uint16_t av, bv;
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        float_raise(float_flag_invalid, status);
+        return 0;
+    }
+    aSign = extractFloat16Sign(a);
+    bSign = extractFloat16Sign(b);
+    av = float16_val(a);
+    bv = float16_val(b);
+    if (aSign != bSign) {
+        return aSign && ((uint16_t) ((av | bv) << 1) != 0);
+    }
+    return (av != bv) && (aSign ^ (av < bv));
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point values `a' and `b' cannot
+| be compared, and 0 otherwise.  The invalid exception is raised if either
+| operand is a NaN.  The comparison is performed according to the IEC/IEEE
+| Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_unordered(float16 a, float16 b, float_status *status)
+{
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        float_raise(float_flag_invalid, status);
+        return 1;
+    }
+    return 0;
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point value `a' is equal to
+| the corresponding value `b', and 0 otherwise.  Quiet NaNs do not cause an
+| exception.  The comparison is performed according to the IEC/IEEE Standard
+| for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_eq_quiet(float16 a, float16 b, float_status *status)
+{
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        if (float16_is_signaling_nan(a, status)
+        || float16_is_signaling_nan(b, status)) {
+            float_raise(float_flag_invalid, status);
+        }
+        return 0;
+    }
+    return (float16_val(a) == float16_val(b)) ||
+            ((uint16_t) ((float16_val(a) | float16_val(b)) << 1) == 0);
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point value `a' is less than or
+| equal to the corresponding value `b', and 0 otherwise.  Quiet NaNs do not
+| cause an exception.  Otherwise, the comparison is performed according to the
+| IEC/IEEE Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_le_quiet(float16 a, float16 b, float_status *status)
+{
+    bool aSign, bSign;
+    uint16_t av, bv;
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        if (float16_is_signaling_nan(a, status)
+        || float16_is_signaling_nan(b, status)) {
+            float_raise(float_flag_invalid, status);
+        }
+        return 0;
+    }
+    aSign = extractFloat16Sign(a);
+    bSign = extractFloat16Sign(b);
+    av = float16_val(a);
+    bv = float16_val(b);
+    if (aSign != bSign) {
+        return aSign || ((uint16_t) ((av | bv) << 1) == 0);
+    }
+    return (av == bv) || (aSign ^ (av < bv));
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point value `a' is less than
+| the corresponding value `b', and 0 otherwise.  Quiet NaNs do not cause an
+| exception.  Otherwise, the comparison is performed according to the IEC/IEEE
+| Standard for Binary Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_lt_quiet(float16 a, float16 b, float_status *status)
+{
+    bool aSign, bSign;
+    uint16_t av, bv;
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        if (float16_is_signaling_nan(a, status)
+        || float16_is_signaling_nan(b, status)) {
+            float_raise(float_flag_invalid, status);
+        }
+        return 0;
+    }
+    aSign = extractFloat16Sign(a);
+    bSign = extractFloat16Sign(b);
+    av = float16_val(a);
+    bv = float16_val(b);
+    if (aSign != bSign) {
+        return aSign && ((uint16_t) ((av | bv) << 1) != 0);
+    }
+    return (av != bv) && (aSign ^ (av < bv));
+}
+
+/*----------------------------------------------------------------------------
+| Returns 1 if the half-precision floating-point values `a' and `b' cannot
+| be compared, and 0 otherwise.  Quiet NaNs do not cause an exception.  The
+| comparison is performed according to the IEC/IEEE Standard for Binary
+| Floating-Point Arithmetic.
+*----------------------------------------------------------------------------*/
+
+int float16_unordered_quiet(float16 a, float16 b, float_status *status)
+{
+    a = float16_squash_input_denormal(a, status);
+    b = float16_squash_input_denormal(b, status);
+
+    if (((extractFloat16Exp(a) == 0x1F) && extractFloat16Frac(a))
+        || ((extractFloat16Exp(b) == 0x1F) && extractFloat16Frac(b))) {
+        if (float16_is_signaling_nan(a, status)
+        || float16_is_signaling_nan(b, status)) {
+            float_raise(float_flag_invalid, status);
+        }
+        return 1;
+    }
+    return 0;
+}
+
 /*----------------------------------------------------------------------------
 | Returns the result of converting the extended double-precision floating-
 | point value `a' to the 32-bit two's complement integer format.  The
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 075c680456..d36a54be3e 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -244,6 +244,14 @@  float16 float16_maxnum_noprop(float16, float16, float_status *status);
 float16 float16_sqrt(float16, float_status *status);
 FloatRelation float16_compare(float16, float16, float_status *status);
 FloatRelation float16_compare_quiet(float16, float16, float_status *status);
+int float16_eq(float16, float16, float_status *status);
+int float16_le(float16, float16, float_status *status);
+int float16_lt(float16, float16, float_status *status);
+int float16_unordered(float16, float16, float_status *status);
+int float16_eq_quiet(float16, float16, float_status *status);
+int float16_le_quiet(float16, float16, float_status *status);
+int float16_lt_quiet(float16, float16, float_status *status);
+int float16_unordered_quiet(float16, float16, float_status *status);
 
 bool float16_is_quiet_nan(float16, float_status *status);
 bool float16_is_signaling_nan(float16, float_status *status);
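
The bit tests that the patch repeats in each comparison can be looked at in
isolation. The following is a minimal standalone sketch, assuming the IEEE 754
binary16 layout (1 sign bit, 5 exponent bits, 10 fraction bits); the half_*
names are hypothetical and operate on raw uint16_t bits rather than QEMU's
float16 type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field extraction, mirroring the masks and shifts used in the patch. */
static inline uint32_t half_frac(uint16_t v)
{
    return v & 0x3ff;                   /* low 10 bits */
}

static inline int half_exp(uint16_t v)
{
    return (v >> 10) & 0x1f;            /* next 5 bits */
}

static inline bool half_sign(uint16_t v)
{
    return v >> 15;                     /* top bit */
}

/* NaN: exponent all ones and a non-zero fraction (infinity has fraction 0). */
static inline bool half_is_nan(uint16_t v)
{
    return half_exp(v) == 0x1f && half_frac(v) != 0;
}

/*
 * True when both operands are zero of either sign: shifting the OR of the
 * two bit patterns left by one discards the sign bit, so only +0 and -0
 * leave nothing behind.
 */
static inline bool half_both_zero(uint16_t a, uint16_t b)
{
    return (uint16_t)((a | b) << 1) == 0;
}
```

For example, 0x3c00 (1.0) has exponent field 15 and fraction 0, 0x7c00 is
infinity rather than a NaN, and the pair 0x0000 and 0x8000 passes the
both-zero test, which is why +0 and -0 compare equal.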
