
[2/2] target-mips: Implement IEEE 754-2008 functionality for R6 and MSA instructions

Message ID 1458910214-12239-3-git-send-email-aleksandar.markovic@rt-rk.com (mailing list archive)
State New, archived

Commit Message

Aleksandar Markovic March 25, 2016, 12:50 p.m. UTC
From: Aleksandar Markovic <aleksandar.markovic@imgtec.com>

This patch utilizes provisions from the previous patch, and configures
MIPS R6 CPUs and MIPS MSA units appropriately with reference to the meaning
of the signaling NaN bit (this is mentioned in point 2 in the list below).
The majority of involved MIPS instructions will be fixed just with that
change. A number of other IEEE 754-2008 standard-related MIPS issues
are addressed with this patch as well.

The changes can be summarized this way:

1) Definitions of MIPS processors are updated to reflect supported
   IEEE-754-2008-related features. (file target-mips/translate_init.c)

2) Functions fpu_init() and msa_reset() are updated so that flag
   snan_bit_is_one is properly set for any MIPS configuration.
   (file target-mips/translate_init.c)

3) Helpers helper_float_abs_<fmt>() and helper_float_chs_<fmt>() are
   rewritten to reflect new behavior of instructions ABS.fmt and NEG.fmt
   in MIPS Release 6. Affected MIPS instructions are:

   ABS.S
   ABS.D
   NEG.S
   NEG.D

   Note that legacy (pre-R6) ABS and NEG instructions are arithmetic
   (any NaN operand signals invalid operation), while R6 ones are
   non-arithmetic, always changing the sign bit, even for NaN-like operands.

   Details on these instructions are documented in [1] p. 35 and 359.

   Affected files are target-mips/helper.h and target-mips/op_helper.c.

4) Helpers helper_float_ceilxxx(), helper_float_cvtxxx(),
   helper_float_floorxxx(), helper_float_roundxxx(), and
   helper_float_truncxxx() are rewritten to reflect the behavior of
   the relevant instructions when their operands are floating-point
   numbers outside the range of the integer destination.

   Affected MIPS instructions are:

   CEIL.L.fmt
   CEIL.W.fmt
   CVT.L.fmt
   CVT.W.fmt
   FLOOR.L.fmt
   FLOOR.W.fmt
   ROUND.L.fmt
   ROUND.W.fmt
   TRUNC.L.fmt
   TRUNC.W.fmt

   Details on these instructions are presented in [1] p. 129, 130, 149,
   155, 222, 223, 393, 394, 504, 505.

   Affected file is target-mips/op_helper.c.

5) Helpers helper_msa_class_s() and helper_msa_class_d() are added so that
   the MSA version of instruction CLASS can operate independently of the one
   from the base set of instructions. Affected MIPS instructions are:

   FCLASS.W
   FCLASS.D

   Details on these instructions can be found in [2] p. 158.

   Affected source code files are target-mips/helper.h and
   target-mips/msa_helper.c.

6) Handling of instructions CVT.S.PU and CVT.S.PL is updated to reflect
   the fact that they are removed in the MIPS R6 architecture and belong to
   the so-called paired-single class of instructions. Details on these
   instructions can be found in [1], p. 152 and 153. Affected source
   code file is target-mips/translate.c.

[1] "MIPS® Architecture For Programmers Volume II-A:
    The MIPS64® Instruction Set Reference Manual",
    Imagination Technologies LTD, Revision 6.04, November 13, 2015
    (https://imagination-technologies-cloudfront-assets.s3.amazonaws.com/
     documentation/MD00087-2B-MIPS64BIS-AFP-06.04.pdf)

[2] "MIPS Architecture for Programmers Volume IV-j:
    The MIPS32® SIMD Architecture Module",
    Imagination Technologies LTD, Revision 1.12, February 3, 2016
    (https://imagination-technologies-cloudfront-assets.s3.amazonaws.com/
     documentation/MD00866-2B-MSA32-AFP-01.12.pdf)

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
---
 target-mips/helper.h         |  10 +-
 target-mips/msa_helper.c     |  60 ++++-
 target-mips/op_helper.c      | 516 ++++++++++++++++++++++++++++++++++++-------
 target-mips/translate.c      |  16 +-
 target-mips/translate_init.c |  22 +-
 5 files changed, 520 insertions(+), 104 deletions(-)

Comments

Richard Henderson March 28, 2016, 9:49 p.m. UTC | #1
On 03/25/2016 05:50 AM, Aleksandar Markovic wrote:
> @@ -2621,9 +2621,23 @@ uint64_t helper_float_cvtl_d(CPUMIPSState *env, uint64_t fdt0)
>       uint64_t dt2;
>
>       dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
> -    if (get_float_exception_flags(&env->active_fpu.fp_status)
> -        & (float_flag_invalid | float_flag_overflow)) {
> -        dt2 = FP_TO_INT64_OVERFLOW;
> +    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
> +        if (get_float_exception_flags(&env->active_fpu.fp_status)
> +                & (float_flag_invalid | float_flag_overflow)) {
> +            if (float64_is_any_nan(fdt0)) {
> +                dt2 = 0;
> +            } else {
> +                if (float64_is_neg(fdt0))
> +                    dt2 = INT64_MIN;
> +                else
> +                    dt2 = INT64_MAX;
> +            }
> +        }
> +    } else {
> +        if (get_float_exception_flags(&env->active_fpu.fp_status)
> +                & (float_flag_invalid | float_flag_overflow)) {
> +            dt2 = FP_TO_INT64_OVERFLOW;
> +        }

Better to swap the tests here, so that you test the exception flags first (and 
once).  That is the exceptional condition, the one that will be true least 
often.  After that, FCR31_NAN2008 will be tested only when needed.

But also, this pattern is replicated so many times you'd do well to pull this 
sequence out to helper functions (one for s, one for d).
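
As a rough illustration of the ordering suggested above, here is a minimal
standalone sketch (not QEMU code). The names prefixed with sketch_ and
SKETCH_ are hypothetical, the flag values are stand-ins for softfloat's
exception flags, and the legacy result is assumed to be the largest positive
int64 value. The exception flags are tested first, and the NAN2008 mode is
consulted only in that rare case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for softfloat's exception flag bits. */
enum {
    SKETCH_FLAG_INVALID  = 1 << 0,
    SKETCH_FLAG_OVERFLOW = 1 << 1,
};

/* Assumed legacy (pre-2008) out-of-range result. */
#define SKETCH_LEGACY_INT64_OVERFLOW INT64_C(0x7fffffffffffffff)

/* True if the raw float64 bit pattern encodes any NaN. */
static bool sketch_f64_is_any_nan(uint64_t bits)
{
    return (bits & UINT64_C(0x7fffffffffffffff)) > UINT64_C(0x7ff0000000000000);
}

/* True if the sign bit of the raw float64 bit pattern is set. */
static bool sketch_f64_is_neg(uint64_t bits)
{
    return (bits >> 63) != 0;
}

/*
 * Pick the final result of a float64 -> int64 conversion.
 * 'converted' is what the conversion routine returned, 'flags' are the
 * exception flags it raised, 'src_bits' is the original operand.
 */
static int64_t sketch_fixup_int64(int64_t converted, int flags,
                                  bool nan2008, uint64_t src_bits)
{
    /* Common case first: nothing exceptional happened. */
    if (!(flags & (SKETCH_FLAG_INVALID | SKETCH_FLAG_OVERFLOW))) {
        return converted;
    }
    /* Only now does the NAN2008 mode matter. */
    if (!nan2008) {
        return SKETCH_LEGACY_INT64_OVERFLOW;
    }
    if (sketch_f64_is_any_nan(src_bits)) {
        return 0;
    }
    return sketch_f64_is_neg(src_bits) ? INT64_MIN : INT64_MAX;
}
```

A 32-bit variant would follow the same shape with INT32_MIN and INT32_MAX.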

> +uint64_t helper_float_abs_d(CPUMIPSState *env, uint64_t fdt0)
> +{
> +    uint64_t fdt1;
> +
> +    if (env->active_fpu.fcr31 & (1 << FCR31_ABS2008)) {
> +        fdt1 = float64_abs(fdt0);
> +    } else {
> +        if (float64_is_neg(fdt0)) {
> +            fdt1 = float64_sub(0, fdt0, &env->active_fpu.fp_status);
> +        } else {
> +            fdt1 = float64_add(0, fdt0, &env->active_fpu.fp_status);
> +        }
> +        update_fcr31(env, GETPC());

Here you're better off using two separate helper functions, and chose the 
correct one during translation.  Indeed, since the 2008 version is a simple 
bit-flip, you needn't actually have a helper; just expand the sequence inline.


r~
Aleksandar Markovic March 30, 2016, 7:28 p.m. UTC | #2
I really appreciate your guidance and help. I will respond shortly with a proposal that will address all issues that you brought up. Thanks again for your support and time.

Aleksandar
Aleksandar Markovic March 31, 2016, 11:55 a.m. UTC | #3
Hi, Richard, what would you think about this approach:

Functionality of <ABS|NEG>.<S|D> and <CVT|FLOOR|CEIL|TRUNC|ROUND>.<L|W>.<S|D>
instructions is dependent on flags ABS2008 and NAN2008 in FCR31. There are
MIPS architectures (for example mips32r5) that allow implementations
with different values of these flags. So, in order to detect the desired
behavior at translate time, insn_flags field can't be used - and, therefore,
it makes sense to add two new members to the MIPS's DisasContext:

typedef struct DisasContext {
    . . .
    bool nan2008;
    bool abs2008;
} DisasContext;

Their initialization could be in gen_intermediate_code_internal():

    ctx.nan2008 = (env->active_fpu.fcr31 >> FCR31_NAN2008) & 1;
    ctx.abs2008 = (env->active_fpu.fcr31 >> FCR31_ABS2008) & 1;
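
For illustration only, the same shift-and-mask extraction can be tried outside
QEMU. In this minimal sketch the bit positions (18 for NAN2008, 19 for
ABS2008) are assumptions taken from the FCSR layout, and the struct and
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed FCSR bit positions; check them against the architecture manual. */
#define SKETCH_FCR31_NAN2008 18
#define SKETCH_FCR31_ABS2008 19

/* Hypothetical holder for the two mode bits cached at translate time. */
struct sketch_fp_modes {
    bool nan2008;
    bool abs2008;
};

/* Extract both mode bits from a raw FCR31 value. */
static struct sketch_fp_modes sketch_decode_fcr31(uint32_t fcr31)
{
    struct sketch_fp_modes m;

    m.nan2008 = (fcr31 >> SKETCH_FCR31_NAN2008) & 1;
    m.abs2008 = (fcr31 >> SKETCH_FCR31_ABS2008) & 1;
    return m;
}
```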

Now, ABS.D (and all <ABS|NEG>.<S|D>) handling might look like this:

    case OPC_ABS_D:
        check_cp1_registers(ctx, fs | fd);
        {
            TCGv_i64 fp0 = tcg_temp_new_i64();

            gen_load_fpr64(ctx, fp0, fs);
            if (ctx->abs2008) {
                tcg_gen_andi_i64(fp0, fp0, 0x7fffffffffffffffULL);
            } else {
                gen_helper_float_abs_d(fp0, fp0);
            }
            gen_store_fpr64(ctx, fp0, fd);
            tcg_temp_free_i64(fp0);
        }
        opn = "abs.d";
        break;

Here, 2008-style ABS.D is implemented inline, without a helper, and
gen_helper_float_abs_d() is an old pre-2008 helper that would be intact
(the same as it is currently) with this change.
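
The inline 2008-style operation amounts to a plain operation on the sign bit
of the raw bit pattern, so it never inspects or signals NaNs. A minimal
standalone sketch of the same masking, with hypothetical function names:

```c
#include <stdint.h>

#define SKETCH_F64_SIGN_BIT UINT64_C(0x8000000000000000)

/* 2008-style ABS.D: clear the sign bit, leave everything else untouched. */
static uint64_t sketch_abs2008_d(uint64_t bits)
{
    return bits & ~SKETCH_F64_SIGN_BIT;
}

/* 2008-style NEG.D: flip the sign bit, leave everything else untouched. */
static uint64_t sketch_neg2008_d(uint64_t bits)
{
    return bits ^ SKETCH_F64_SIGN_BIT;
}
```

Because only one bit changes, a NaN payload passes through unmodified, which
is the non-arithmetic behavior described for R6.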

On the other hand, CVT.L.D (and all <CVT|FLOOR|CEIL|TRUNC|ROUND>.<L|W>.<S|D>)
handling would take this form:

    case OPC_CVT_L_D:
        check_cp1_64bitmode(ctx);
        {
            TCGv_i64 fp0 = tcg_temp_new_i64();

            gen_load_fpr64(ctx, fp0, fs);
            if (ctx->nan2008) {
                gen_helper_float_cvt_2008_l_d(fp0, cpu_env, fp0);
            } else {
                gen_helper_float_cvt_l_d(fp0, cpu_env, fp0);
            }
            gen_store_fpr64(ctx, fp0, fd);
            tcg_temp_free_i64(fp0);
        }
        opn = "cvt.l.d";
        break;

Function helper_float_cvt_2008_l_d() is a new, only-2008-style helper for
CVT.L.D and would look like this:

uint64_t helper_float_cvt_2008_l_d(CPUMIPSState *env, uint64_t fdt0)
{
    uint64_t dt2;

    dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
    if (get_float_exception_flags(&env->active_fpu.fp_status)
            & (float_flag_invalid | float_flag_overflow)) {
        dt2 = DBL_TO_INT64_OVERFLOW(fdt0);
    }
    update_fcr31(env, GETPC());
    return dt2;
}

(macro DBL_TO_INT64_OVERFLOW(x) would be defined this way:

#define DBL_TO_INT64_OVERFLOW(x)                                             \
    (float64_is_any_nan(x) ? 0 : (float64_is_neg(x) ? INT64_MIN : INT64_MAX))

to avoid awkwardly repeating "if" statements in multiple helpers)

gen_helper_float_cvt_l_d() and all old style helpers for instructions
<CVT|FLOOR|CEIL|TRUNC|ROUND>.<L|W>.<S|D> would remain the same.

Please let me know about your opinion. I greatly appreciate your kind
consideration of this matter. I am looking forward to hearing from you.

Yours,
Aleksandar
Richard Henderson March 31, 2016, 4:30 p.m. UTC | #4
On 03/31/2016 04:55 AM, Aleksandar Markovic wrote:
> Hi, Richard, what would you think about this approach:
> 
> Functionality of <ABS|NEG>.<S|D> and <CVT|FLOOR|CEIL|TRUNC|ROUND>.<L|W>.<S|D>
> instructions is dependent on flags ABS2008 and NAN2008 in FCR31. There are
> MIPS architectures (for example mips32r5) that allow implementations
> with different values of these flags. So, in order to detect the desired
> behavior in translate-time, insn_flags field can't be used - and, therefore,
> it makes sense to add two new members to the MIPS's DisasContext:
> 
> typedef struct DisasContext {
>     . . .
>     bool nan2008;
>     bool abs2008;
> } DisasContext;
> 
> Their initialization could be in gen_intermediate_code_internal():
> 
>     ctx.nan2008 = (env->active_fpu.fcr31 >> FCR31_NAN2008) & 1;
>     ctx.abs2008 = (env->active_fpu.fcr31 >> FCR31_ABS2008) & 1;
> 
> Now, ABS.D (and all <ABS|NEG>.<S|D>) handling might look like this:
> 
>     case OPC_ABS_D:
>         check_cp1_registers(ctx, fs | fd);
>         {
>             TCGv_i64 fp0 = tcg_temp_new_i64();
> 
>             gen_load_fpr64(ctx, fp0, fs);
>             if (ctx->abs2008) {
>                 tcg_gen_andi_i64(fp0, fp0, 0x7fffffffffffffffULL);
>             } else {
>                 gen_helper_float_abs_d(fp0, fp0);
>             }
>             gen_store_fpr64(ctx, fp0, fd);
>             tcg_temp_free_i64(fp0);
>         }
>         opn = "abs.d";
>         break;
> 
> Here, 2008-style ABS.D is implemented inline, without a helper, and
> gen_helper_float_abs_d() is an old pre-2008 helper that would be intact
> (the same as it is currently) with this change.

Yes, that's exactly what I had in mind.

> On the other hand, CVT.L.D (and all <CVT|FLOOR|CEIL|TRUNC|ROUND>.<L|W>.<S|D>)
> handling would take this form:
> 
>     case OPC_CVT_L_D:
>         check_cp1_64bitmode(ctx);
>         {
>             TCGv_i64 fp0 = tcg_temp_new_i64();
> 
>             gen_load_fpr64(ctx, fp0, fs);
>             if (ctx->nan2008) {
>                 gen_helper_float_cvt_2008_l_d(fp0, cpu_env, fp0);
>             } else {
>                 gen_helper_float_cvt_l_d(fp0, cpu_env, fp0);
>             }
>             gen_store_fpr64(ctx, fp0, fd);
>             tcg_temp_free_i64(fp0);
>         }
>         opn = "cvt.l.d";
>         break;
> 
> Function helper_float_cvt_2008_l_d() is a new, only-2008-style helper for
> CVT.L.D and would look like this:
> 
> uint64_t helper_float_cvt_2008_l_d(CPUMIPSState *env, uint64_t fdt0)
> {
>     uint64_t dt2;
> 
>     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
>     if (get_float_exception_flags(&env->active_fpu.fp_status)
>             & (float_flag_invalid | float_flag_overflow)) {
>         dt2 = DBL_TO_INT64_OVERFLOW(fdt0)
>     }
>     update_fcr31(env, GETPC());
>     return dt2;
> }

That looks fine as well.



r~
Leon Alrae April 1, 2016, 7:07 p.m. UTC | #5
On 25/03/16 12:50, Aleksandar Markovic wrote:
> +#define MSA_CLASS_SIGNALING_NAN      0x001
> +#define MSA_CLASS_QUIET_NAN          0x002
> +#define MSA_CLASS_NEGATIVE_INFINITY  0x004
> +#define MSA_CLASS_NEGATIVE_NORMAL    0x008
> +#define MSA_CLASS_NEGATIVE_SUBNORMAL 0x010
> +#define MSA_CLASS_NEGATIVE_ZERO      0x020
> +#define MSA_CLASS_POSITIVE_INFINITY  0x040
> +#define MSA_CLASS_POSITIVE_NORMAL    0x080
> +#define MSA_CLASS_POSITIVE_SUBNORMAL 0x100
> +#define MSA_CLASS_POSITIVE_ZERO      0x200
> +
> +#define MSA_CLASS(name, bits)                                        \
> +uint ## bits ## _t helper_msa_ ## name (CPUMIPSState *env,           \
> +                                        uint ## bits ## _t arg)      \
> +{                                                                    \
> +    if (float ## bits ## _is_signaling_nan(arg,                      \
> +                &env->active_tc.msa_fp_status)) {                    \
> +        return MSA_CLASS_SIGNALING_NAN;                              \
> +    } else if (float ## bits ## _is_quiet_nan(arg,                   \
> +                    &env->active_tc.msa_fp_status)) {                \
> +        return MSA_CLASS_QUIET_NAN;                                  \
> +    } else if (float ## bits ## _is_neg(arg)) {                      \
> +        if (float ## bits ## _is_infinity(arg)) {                    \
> +            return MSA_CLASS_NEGATIVE_INFINITY;                      \
> +        } else if (float ## bits ## _is_zero(arg)) {                 \
> +            return MSA_CLASS_NEGATIVE_ZERO;                          \
> +        } else if (float ## bits ## _is_zero_or_denormal(arg)) {     \
> +            return MSA_CLASS_NEGATIVE_SUBNORMAL;                     \
> +        } else {                                                     \
> +            return MSA_CLASS_NEGATIVE_NORMAL;                        \
> +        }                                                            \
> +    } else {                                                         \
> +        if (float ## bits ## _is_infinity(arg)) {                    \
> +            return MSA_CLASS_POSITIVE_INFINITY;                      \
> +        } else if (float ## bits ## _is_zero(arg)) {                 \
> +            return MSA_CLASS_POSITIVE_ZERO;                          \
> +        } else if (float ## bits ## _is_zero_or_denormal(arg)) {     \
> +            return MSA_CLASS_POSITIVE_SUBNORMAL;                     \
> +        } else {                                                     \
> +            return MSA_CLASS_POSITIVE_NORMAL;                        \
> +        }                                                            \
> +    }                                                                \
> +}

Duplicating the class operation is unnecessary. We can just have common
function for FPU and MSA which takes additional float_status argument.
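
One possible shape for such a shared routine, shown as a standalone sketch
rather than actual QEMU code: the classifier takes the status as a parameter,
so an FPU caller and an MSA caller can each pass their own. The struct below
is a hypothetical reduction of float_status to the single field that matters
here, and the class bit values mirror the MSA_CLASS_* constants quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    SKETCH_CLASS_SIGNALING_NAN      = 0x001,
    SKETCH_CLASS_QUIET_NAN          = 0x002,
    SKETCH_CLASS_NEGATIVE_INFINITY  = 0x004,
    SKETCH_CLASS_NEGATIVE_NORMAL    = 0x008,
    SKETCH_CLASS_NEGATIVE_SUBNORMAL = 0x010,
    SKETCH_CLASS_NEGATIVE_ZERO      = 0x020,
    SKETCH_CLASS_POSITIVE_INFINITY  = 0x040,
    SKETCH_CLASS_POSITIVE_NORMAL    = 0x080,
    SKETCH_CLASS_POSITIVE_SUBNORMAL = 0x100,
    SKETCH_CLASS_POSITIVE_ZERO      = 0x200,
};

/* Hypothetical stand-in for float_status: only the NaN encoding mode. */
struct sketch_float_status {
    bool snan_bit_is_one;
};

/* Classify a raw float32 bit pattern using the caller-supplied status. */
static uint32_t sketch_class_s(uint32_t bits,
                               const struct sketch_float_status *status)
{
    uint32_t exp = (bits >> 23) & 0xff;
    uint32_t frac = bits & 0x7fffff;
    bool neg = (bits >> 31) != 0;

    if (exp == 0xff && frac != 0) {
        /* The top fraction bit separates quiet from signaling NaNs. */
        bool top = (frac >> 22) & 1;
        bool signaling = status->snan_bit_is_one ? top : !top;
        return signaling ? SKETCH_CLASS_SIGNALING_NAN : SKETCH_CLASS_QUIET_NAN;
    }
    if (exp == 0xff) {
        return neg ? SKETCH_CLASS_NEGATIVE_INFINITY
                   : SKETCH_CLASS_POSITIVE_INFINITY;
    }
    if (exp == 0) {
        if (frac == 0) {
            return neg ? SKETCH_CLASS_NEGATIVE_ZERO : SKETCH_CLASS_POSITIVE_ZERO;
        }
        return neg ? SKETCH_CLASS_NEGATIVE_SUBNORMAL
                   : SKETCH_CLASS_POSITIVE_SUBNORMAL;
    }
    return neg ? SKETCH_CLASS_NEGATIVE_NORMAL : SKETCH_CLASS_POSITIVE_NORMAL;
}
```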

Also I noticed that this patch series doesn't provide Flush Subnormals
(the FCSR.FS bit), but probably this functionality can come later...

Leon
Aleksandar Markovic April 3, 2016, 3:05 p.m. UTC | #6
Hello, Leon, thank you very much for the kind feedback. Let me clarify my take on the involved issues.

1) Class operations

I am going to correct the code as you hinted.

The reason I wanted separate handling of the MSA class operation is code and module decoupling. Handling of MSA instructions (in file msa_helper.c) and regular instructions (in file op_helper.c) has many overlapping areas. However, my understanding is that the designer of the MSA module wanted it to be as independent of code in other files/modules as possible. Handling of the class operation is one of the rare instances where code in msa_helper.c relies on code in op_helper.c, and it made sense to me that this dependence should be removed for the sake of consistency within the MSA module, even if the functionalities are virtually identical. That said, I will follow your advice, since you most probably see more than I do regarding this, and I am going to revert to a single handling of class operations for both the MSA and regular versions.

2) Flush subnormals

My impression is that this set of features should be treated and implemented separately, at some later point in time.

Although the implementation does not seem too complex (defining FCR31_FS, invoking set_flush_to_zero() and set_flush_inputs_to_zero() appropriately on CPU init, plus special exception handling, as is already done for the MSA equivalents), it looks to me that it would add a lot of risk to a patch series that is already touching many sensitive areas. Once this patch series is hopefully integrated, flush subnormals will be much easier to integrate, since it will be a MIPS-only issue. Therefore, if you agree, I will leave it for the future. I will definitely mention it in the commit messages though (as a limitation), for future reference.
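
For future reference, here is a minimal standalone sketch of what flushing a
subnormal input to a same-signed zero means at the bit level. It is not QEMU
code, the function name is hypothetical, and it does not model the exception
handling mentioned above.

```c
#include <stdint.h>

/*
 * Replace a subnormal float32 input (zero exponent, non-zero fraction)
 * with a zero of the same sign; every other bit pattern passes through.
 */
static uint32_t sketch_flush_input_s(uint32_t bits)
{
    uint32_t exp = (bits >> 23) & 0xff;
    uint32_t frac = bits & 0x7fffff;

    if (exp == 0 && frac != 0) {
        return bits & UINT32_C(0x80000000);   /* keep only the sign bit */
    }
    return bits;
}
```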

Thanks again for your consideration of this matter.

Sincerely yours,
Aleksandar



Patch

diff --git a/target-mips/helper.h b/target-mips/helper.h
index 1aaa316..952af63 100644
--- a/target-mips/helper.h
+++ b/target-mips/helper.h
@@ -254,10 +254,10 @@  FOP_PROTO(recip)
 FOP_PROTO(rint)
 #undef FOP_PROTO
 
-#define FOP_PROTO(op)                       \
-DEF_HELPER_1(float_ ## op ## _s, i32, i32)  \
-DEF_HELPER_1(float_ ## op ## _d, i64, i64)  \
-DEF_HELPER_1(float_ ## op ## _ps, i64, i64)
+#define FOP_PROTO(op)                            \
+DEF_HELPER_2(float_ ## op ## _s, i32, env, i32)  \
+DEF_HELPER_2(float_ ## op ## _d, i64, env, i64)  \
+DEF_HELPER_2(float_ ## op ## _ps, i64, env, i64)
 FOP_PROTO(abs)
 FOP_PROTO(chs)
 #undef FOP_PROTO
@@ -924,6 +924,8 @@  DEF_HELPER_4(msa_pcnt_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nloc_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_nlzc_df, void, env, i32, i32, i32)
 
+DEF_HELPER_2(msa_class_s, i32, env, i32)
+DEF_HELPER_2(msa_class_d, i64, env, i64)
 DEF_HELPER_4(msa_fclass_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ftrunc_s_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ftrunc_u_df, void, env, i32, i32, i32)
diff --git a/target-mips/msa_helper.c b/target-mips/msa_helper.c
index 47fbba0..fed430d 100644
--- a/target-mips/msa_helper.c
+++ b/target-mips/msa_helper.c
@@ -2924,19 +2924,67 @@  void helper_msa_fmax_a_df(CPUMIPSState *env, uint32_t df, uint32_t wd,
     msa_move_v(pwd, pwx);
 }
 
+#define MSA_CLASS_SIGNALING_NAN      0x001
+#define MSA_CLASS_QUIET_NAN          0x002
+#define MSA_CLASS_NEGATIVE_INFINITY  0x004
+#define MSA_CLASS_NEGATIVE_NORMAL    0x008
+#define MSA_CLASS_NEGATIVE_SUBNORMAL 0x010
+#define MSA_CLASS_NEGATIVE_ZERO      0x020
+#define MSA_CLASS_POSITIVE_INFINITY  0x040
+#define MSA_CLASS_POSITIVE_NORMAL    0x080
+#define MSA_CLASS_POSITIVE_SUBNORMAL 0x100
+#define MSA_CLASS_POSITIVE_ZERO      0x200
+
+#define MSA_CLASS(name, bits)                                        \
+uint ## bits ## _t helper_msa_ ## name (CPUMIPSState *env,           \
+                                        uint ## bits ## _t arg)      \
+{                                                                    \
+    if (float ## bits ## _is_signaling_nan(arg,                      \
+                &env->active_tc.msa_fp_status)) {                    \
+        return MSA_CLASS_SIGNALING_NAN;                              \
+    } else if (float ## bits ## _is_quiet_nan(arg,                   \
+                    &env->active_tc.msa_fp_status)) {                \
+        return MSA_CLASS_QUIET_NAN;                                  \
+    } else if (float ## bits ## _is_neg(arg)) {                      \
+        if (float ## bits ## _is_infinity(arg)) {                    \
+            return MSA_CLASS_NEGATIVE_INFINITY;                      \
+        } else if (float ## bits ## _is_zero(arg)) {                 \
+            return MSA_CLASS_NEGATIVE_ZERO;                          \
+        } else if (float ## bits ## _is_zero_or_denormal(arg)) {     \
+            return MSA_CLASS_NEGATIVE_SUBNORMAL;                     \
+        } else {                                                     \
+            return MSA_CLASS_NEGATIVE_NORMAL;                        \
+        }                                                            \
+    } else {                                                         \
+        if (float ## bits ## _is_infinity(arg)) {                    \
+            return MSA_CLASS_POSITIVE_INFINITY;                      \
+        } else if (float ## bits ## _is_zero(arg)) {                 \
+            return MSA_CLASS_POSITIVE_ZERO;                          \
+        } else if (float ## bits ## _is_zero_or_denormal(arg)) {     \
+            return MSA_CLASS_POSITIVE_SUBNORMAL;                     \
+        } else {                                                     \
+            return MSA_CLASS_POSITIVE_NORMAL;                        \
+        }                                                            \
+    }                                                                \
+}
+
+MSA_CLASS(class_s, 32)
+MSA_CLASS(class_d, 64)
+#undef MSA_CLASS
+
 void helper_msa_fclass_df(CPUMIPSState *env, uint32_t df,
         uint32_t wd, uint32_t ws)
 {
     wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
     wr_t *pws = &(env->active_fpu.fpr[ws].wr);
     if (df == DF_WORD) {
-        pwd->w[0] = helper_float_class_s(env, pws->w[0]);
-        pwd->w[1] = helper_float_class_s(env, pws->w[1]);
-        pwd->w[2] = helper_float_class_s(env, pws->w[2]);
-        pwd->w[3] = helper_float_class_s(env, pws->w[3]);
+        pwd->w[0] = helper_msa_class_s(env, pws->w[0]);
+        pwd->w[1] = helper_msa_class_s(env, pws->w[1]);
+        pwd->w[2] = helper_msa_class_s(env, pws->w[2]);
+        pwd->w[3] = helper_msa_class_s(env, pws->w[3]);
     } else {
-        pwd->d[0] = helper_float_class_d(env, pws->d[0]);
-        pwd->d[1] = helper_float_class_d(env, pws->d[1]);
+        pwd->d[0] = helper_msa_class_d(env, pws->d[0]);
+        pwd->d[1] = helper_msa_class_d(env, pws->d[1]);
     }
 }
 
diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 0d22b25..407d5e0 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2621,9 +2621,23 @@  uint64_t helper_float_cvtl_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t dt2;
 
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                dt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2634,14 +2648,29 @@  uint64_t helper_float_cvtl_s(CPUMIPSState *env, uint32_t fst0)
     uint64_t dt2;
 
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                dt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
 }
 
+
 uint64_t helper_float_cvtps_pw(CPUMIPSState *env, uint64_t dt0)
 {
     uint32_t fst2;
@@ -2729,9 +2758,23 @@  uint32_t helper_float_cvtw_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t wt2;
 
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                wt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2742,9 +2785,23 @@  uint32_t helper_float_cvtw_d(CPUMIPSState *env, uint64_t fdt0)
     uint32_t wt2;
 
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                wt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2757,9 +2814,23 @@  uint64_t helper_float_roundl_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                dt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2772,9 +2843,23 @@  uint64_t helper_float_roundl_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                dt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2787,9 +2872,23 @@  uint32_t helper_float_roundw_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                wt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2802,9 +2901,23 @@  uint32_t helper_float_roundw_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_nearest_even, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                wt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2815,9 +2928,23 @@  uint64_t helper_float_truncl_d(CPUMIPSState *env, uint64_t fdt0)
     uint64_t dt2;
 
     dt2 = float64_to_int64_round_to_zero(fdt0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                dt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2828,9 +2955,23 @@  uint64_t helper_float_truncl_s(CPUMIPSState *env, uint32_t fst0)
     uint64_t dt2;
 
     dt2 = float32_to_int64_round_to_zero(fst0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                dt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2841,9 +2982,23 @@  uint32_t helper_float_truncw_d(CPUMIPSState *env, uint64_t fdt0)
     uint32_t wt2;
 
     wt2 = float64_to_int32_round_to_zero(fdt0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                wt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2854,9 +3009,23 @@  uint32_t helper_float_truncw_s(CPUMIPSState *env, uint32_t fst0)
     uint32_t wt2;
 
     wt2 = float32_to_int32_round_to_zero(fst0, &env->active_fpu.fp_status);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                wt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2869,9 +3038,23 @@  uint64_t helper_float_ceill_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                dt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2884,9 +3067,23 @@  uint64_t helper_float_ceill_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                dt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2899,9 +3096,23 @@  uint32_t helper_float_ceilw_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                wt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2914,9 +3125,23 @@  uint32_t helper_float_ceilw_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_up, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                wt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2929,9 +3154,23 @@  uint64_t helper_float_floorl_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     dt2 = float64_to_int64(fdt0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                dt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2944,9 +3183,23 @@  uint64_t helper_float_floorl_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     dt2 = float32_to_int64(fst0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        dt2 = FP_TO_INT64_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                dt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    dt2 = INT64_MIN;
+                else
+                    dt2 = INT64_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            dt2 = FP_TO_INT64_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return dt2;
@@ -2959,9 +3212,23 @@  uint32_t helper_float_floorw_d(CPUMIPSState *env, uint64_t fdt0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     wt2 = float64_to_int32(fdt0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float64_is_any_nan(fdt0)) {
+                wt2 = 0;
+            } else {
+                if (float64_is_neg(fdt0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
@@ -2974,36 +3241,121 @@  uint32_t helper_float_floorw_s(CPUMIPSState *env, uint32_t fst0)
     set_float_rounding_mode(float_round_down, &env->active_fpu.fp_status);
     wt2 = float32_to_int32(fst0, &env->active_fpu.fp_status);
     restore_rounding_mode(env);
-    if (get_float_exception_flags(&env->active_fpu.fp_status)
-        & (float_flag_invalid | float_flag_overflow)) {
-        wt2 = FP_TO_INT32_OVERFLOW;
+    if (env->active_fpu.fcr31 & (1 << FCR31_NAN2008)) {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            if (float32_is_any_nan(fst0)) {
+                wt2 = 0;
+            } else {
+                if (float32_is_neg(fst0))
+                    wt2 = INT32_MIN;
+                else
+                    wt2 = INT32_MAX;
+            }
+        }
+    } else {
+        if (get_float_exception_flags(&env->active_fpu.fp_status)
+                & (float_flag_invalid | float_flag_overflow)) {
+            wt2 = FP_TO_INT32_OVERFLOW;
+        }
     }
     update_fcr31(env, GETPC());
     return wt2;
 }
 
-/* unary operations, not modifying fp status  */
-#define FLOAT_UNOP(name)                                       \
-uint64_t helper_float_ ## name ## _d(uint64_t fdt0)                \
-{                                                              \
-    return float64_ ## name(fdt0);                             \
-}                                                              \
-uint32_t helper_float_ ## name ## _s(uint32_t fst0)                \
-{                                                              \
-    return float32_ ## name(fst0);                             \
-}                                                              \
-uint64_t helper_float_ ## name ## _ps(uint64_t fdt0)               \
-{                                                              \
-    uint32_t wt0;                                              \
-    uint32_t wth0;                                             \
-                                                               \
-    wt0 = float32_ ## name(fdt0 & 0XFFFFFFFF);                 \
-    wth0 = float32_ ## name(fdt0 >> 32);                       \
-    return ((uint64_t)wth0 << 32) | wt0;                       \
-}
-FLOAT_UNOP(abs)
-FLOAT_UNOP(chs)
-#undef FLOAT_UNOP
+uint64_t helper_float_abs_d(CPUMIPSState *env, uint64_t fdt0)
+{
+    uint64_t fdt1;
+
+    if (env->active_fpu.fcr31 & (1 << FCR31_ABS2008)) {
+        fdt1 = float64_abs(fdt0);
+    } else {
+        if (float64_is_neg(fdt0)) {
+            fdt1 = float64_sub(0, fdt0, &env->active_fpu.fp_status);
+        } else {
+            fdt1 = float64_add(0, fdt0, &env->active_fpu.fp_status);
+        }
+        update_fcr31(env, GETPC());
+    }
+    return fdt1;
+}
+
+uint32_t helper_float_abs_s(CPUMIPSState *env, uint32_t fst0)
+{
+    uint32_t fst1;
+
+    if (env->active_fpu.fcr31 & (1 << FCR31_ABS2008)) {
+        fst1 = float32_abs(fst0);
+    } else {
+        if (float32_is_neg(fst0)) {
+            fst1 = float32_sub(0, fst0, &env->active_fpu.fp_status);
+        } else {
+            fst1 = float32_add(0, fst0, &env->active_fpu.fp_status);
+        }
+        update_fcr31(env, GETPC());
+    }
+    return fst1;
+}
+
+uint64_t helper_float_abs_ps(CPUMIPSState *env, uint64_t fpst0)
+{
+    uint32_t fst0 = fpst0 & 0XFFFFFFFF;
+    uint32_t fsth0 = fpst0 >> 32;
+    uint32_t fst1;
+    uint32_t fsth1;
+
+    if (float32_is_neg(fst0)) {
+        fst1 = float32_sub(0, fst0, &env->active_fpu.fp_status);
+    } else {
+        fst1 = float32_add(0, fst0, &env->active_fpu.fp_status);
+    }
+    if (float32_is_neg(fsth0)) {
+        fsth1 = float32_sub(0, fsth0, &env->active_fpu.fp_status);
+    } else {
+        fsth1 = float32_add(0, fsth0, &env->active_fpu.fp_status);
+    }
+    update_fcr31(env, GETPC());
+    return ((uint64_t)fsth1 << 32) | fst1;
+}
+
+uint64_t helper_float_chs_d(CPUMIPSState *env, uint64_t fdt0)
+{
+    uint64_t fdt1;
+
+    if (env->active_fpu.fcr31 & (1 << FCR31_ABS2008)) {
+        fdt1 = float64_chs(fdt0);
+    } else {
+        fdt1 = float64_sub(0, fdt0, &env->active_fpu.fp_status);
+        update_fcr31(env, GETPC());
+    }
+    return fdt1;
+}
+
+uint32_t helper_float_chs_s(CPUMIPSState *env, uint32_t fst0)
+{
+    uint32_t fst1;
+
+    if (env->active_fpu.fcr31 & (1 << FCR31_ABS2008)) {
+        fst1 = float32_chs(fst0);
+    } else {
+        fst1 = float32_sub(0, fst0, &env->active_fpu.fp_status);
+        update_fcr31(env, GETPC());
+    }
+    return fst1;
+}
+
+uint64_t helper_float_chs_ps(CPUMIPSState *env, uint64_t fpst0)
+{
+    uint32_t fst0 = fpst0 & 0XFFFFFFFF;
+    uint32_t fsth0 = fpst0 >> 32;
+    uint32_t fst1;
+    uint32_t fsth1;
+
+    fst1 = float32_sub(0, fst0, &env->active_fpu.fp_status);
+    fsth1 = float32_sub(0, fsth0, &env->active_fpu.fp_status);
+    update_fcr31(env, GETPC());
+    return ((uint64_t)fsth1 << 32) | fst1;
+}
 
 /* MIPS specific unary operations */
 uint64_t helper_float_recip_d(CPUMIPSState *env, uint64_t fdt0)
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 669bd0c..327c532 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -8792,7 +8792,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
             TCGv_i32 fp0 = tcg_temp_new_i32();
 
             gen_load_fpr32(ctx, fp0, fs);
-            gen_helper_float_abs_s(fp0, fp0);
+            gen_helper_float_abs_s(fp0, cpu_env, fp0);
             gen_store_fpr32(ctx, fp0, fd);
             tcg_temp_free_i32(fp0);
         }
@@ -8811,7 +8811,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
             TCGv_i32 fp0 = tcg_temp_new_i32();
 
             gen_load_fpr32(ctx, fp0, fs);
-            gen_helper_float_chs_s(fp0, fp0);
+            gen_helper_float_chs_s(fp0, cpu_env, fp0);
             gen_store_fpr32(ctx, fp0, fd);
             tcg_temp_free_i32(fp0);
         }
@@ -9282,7 +9282,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
             TCGv_i64 fp0 = tcg_temp_new_i64();
 
             gen_load_fpr64(ctx, fp0, fs);
-            gen_helper_float_abs_d(fp0, fp0);
+            gen_helper_float_abs_d(fp0, cpu_env, fp0);
             gen_store_fpr64(ctx, fp0, fd);
             tcg_temp_free_i64(fp0);
         }
@@ -9303,7 +9303,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
             TCGv_i64 fp0 = tcg_temp_new_i64();
 
             gen_load_fpr64(ctx, fp0, fs);
-            gen_helper_float_chs_d(fp0, fp0);
+            gen_helper_float_chs_d(fp0, cpu_env, fp0);
             gen_store_fpr64(ctx, fp0, fd);
             tcg_temp_free_i64(fp0);
         }
@@ -9794,7 +9794,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
             TCGv_i64 fp0 = tcg_temp_new_i64();
 
             gen_load_fpr64(ctx, fp0, fs);
-            gen_helper_float_abs_ps(fp0, fp0);
+            gen_helper_float_abs_ps(fp0, cpu_env, fp0);
             gen_store_fpr64(ctx, fp0, fd);
             tcg_temp_free_i64(fp0);
         }
@@ -9815,7 +9815,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
             TCGv_i64 fp0 = tcg_temp_new_i64();
 
             gen_load_fpr64(ctx, fp0, fs);
-            gen_helper_float_chs_ps(fp0, fp0);
+            gen_helper_float_chs_ps(fp0, cpu_env, fp0);
             gen_store_fpr64(ctx, fp0, fd);
             tcg_temp_free_i64(fp0);
         }
@@ -9934,7 +9934,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
         }
         break;
     case OPC_CVT_S_PU:
-        check_cp1_64bitmode(ctx);
+        check_ps(ctx);
         {
             TCGv_i32 fp0 = tcg_temp_new_i32();
 
@@ -9956,7 +9956,7 @@  static void gen_farith (DisasContext *ctx, enum fopcode op1,
         }
         break;
     case OPC_CVT_S_PL:
-        check_cp1_64bitmode(ctx);
+        check_ps(ctx);
         {
             TCGv_i32 fp0 = tcg_temp_new_i32();
 
diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
index 3e48680..18bc7cb 100644
--- a/target-mips/translate_init.c
+++ b/target-mips/translate_init.c
@@ -273,6 +273,7 @@  static const mips_def_t mips_defs[] =
         .CP0_Status_rw_bitmask = 0x3678FF1F,
         .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_L) | (1 << FCR0_W) |
                     (1 << FCR0_D) | (1 << FCR0_S) | (0x93 << FCR0_PRID),
+        .CP1_fcr31 = 0,
         .SEGBITS = 32,
         .PABITS = 32,
         .insn_flags = CPU_MIPS32R2 | ASE_MIPS16,
@@ -303,6 +304,7 @@  static const mips_def_t mips_defs[] =
                     (0xff << CP0TCSt_TASID),
         .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_L) | (1 << FCR0_W) |
                     (1 << FCR0_D) | (1 << FCR0_S) | (0x95 << FCR0_PRID),
+        .CP1_fcr31 = 0,
         .CP0_SRSCtl = (0xf << CP0SRSCtl_HSS),
         .CP0_SRSConf0_rw_bitmask = 0x3fffffff,
         .CP0_SRSConf0 = (1U << CP0SRSC0_M) | (0x3fe << CP0SRSC0_SRS3) |
@@ -343,6 +345,7 @@  static const mips_def_t mips_defs[] =
         .CP0_Status_rw_bitmask = 0x3778FF1F,
         .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_L) | (1 << FCR0_W) |
                     (1 << FCR0_D) | (1 << FCR0_S) | (0x93 << FCR0_PRID),
+        .CP1_fcr31 = 0,
         .SEGBITS = 32,
         .PABITS = 32,
         .insn_flags = CPU_MIPS32R2 | ASE_MIPS16 | ASE_DSP | ASE_DSPR2,
@@ -433,8 +436,7 @@  static const mips_def_t mips_defs[] =
     },
     {
         /* A generic CPU supporting MIPS32 Release 6 ISA.
-           FIXME: Support IEEE 754-2008 FP.
-                  Eventually this should be replaced by a real CPU model. */
+           FIXME: Eventually this should be replaced by a real CPU model. */
         .name = "mips32r6-generic",
         .CP0_PRid = 0x00010000,
         .CP0_Config0 = MIPS_CONFIG0 | (0x2 << CP0C0_AR) |
@@ -484,6 +486,7 @@  static const mips_def_t mips_defs[] =
         .CP0_Status_rw_bitmask = 0x3678FFFF,
         /* The R4000 has a full 64bit FPU but doesn't use the fcr0 bits. */
         .CP1_fcr0 = (0x5 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 40,
         .PABITS = 36,
         .insn_flags = CPU_MIPS3,
@@ -502,6 +505,7 @@  static const mips_def_t mips_defs[] =
         .CP0_Status_rw_bitmask = 0x3678FFFF,
         /* The VR5432 has a full 64bit FPU but doesn't use the fcr0 bits. */
         .CP1_fcr0 = (0x54 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 40,
         .PABITS = 32,
         .insn_flags = CPU_VR54XX,
@@ -547,6 +551,7 @@  static const mips_def_t mips_defs[] =
         /* The 5Kf has F64 / L / W but doesn't use the fcr0 bits. */
         .CP1_fcr0 = (1 << FCR0_D) | (1 << FCR0_S) |
                     (0x81 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 42,
         .PABITS = 36,
         .insn_flags = CPU_MIPS64,
@@ -574,6 +579,7 @@  static const mips_def_t mips_defs[] =
         .CP1_fcr0 = (1 << FCR0_3D) | (1 << FCR0_PS) |
                     (1 << FCR0_D) | (1 << FCR0_S) |
                     (0x82 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 40,
         .PABITS = 36,
         .insn_flags = CPU_MIPS64 | ASE_MIPS3D,
@@ -600,6 +606,7 @@  static const mips_def_t mips_defs[] =
         .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_3D) | (1 << FCR0_PS) |
                     (1 << FCR0_L) | (1 << FCR0_W) | (1 << FCR0_D) |
                     (1 << FCR0_S) | (0x00 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 42,
         .PABITS = 36,
         .insn_flags = CPU_MIPS64R2 | ASE_MIPS3D,
@@ -702,6 +709,7 @@  static const mips_def_t mips_defs[] =
         .CCRes = 2,
         .CP0_Status_rw_bitmask = 0x35D0FFFF,
         .CP1_fcr0 = (0x5 << FCR0_PRID) | (0x1 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 40,
         .PABITS = 40,
         .insn_flags = CPU_LOONGSON2E,
@@ -720,6 +728,7 @@  static const mips_def_t mips_defs[] =
         .CCRes = 2,
         .CP0_Status_rw_bitmask = 0xF5D0FF1F,   /* Bits 7:5 not writable.  */
         .CP1_fcr0 = (0x5 << FCR0_PRID) | (0x1 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 40,
         .PABITS = 40,
         .insn_flags = CPU_LOONGSON2F,
@@ -747,6 +756,7 @@  static const mips_def_t mips_defs[] =
         .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_3D) | (1 << FCR0_PS) |
                     (1 << FCR0_L) | (1 << FCR0_W) | (1 << FCR0_D) |
                     (1 << FCR0_S) | (0x00 << FCR0_PRID) | (0x0 << FCR0_REV),
+        .CP1_fcr31 = 0,
         .SEGBITS = 42,
         .PABITS = 36,
         .insn_flags = CPU_MIPS64R2 | ASE_DSP | ASE_DSPR2,
@@ -834,7 +844,11 @@  static void fpu_init (CPUMIPSState *env, const mips_def_t *def)
 
     for (i = 0; i < MIPS_FPU_MAX; i++) {
         env->fpus[i].fcr0 = def->CP1_fcr0;
-        set_snan_bit_is_one(1, &env->fpus[i].fp_status);
+        if (env->insn_flags & ISA_MIPS32R6) {
+            set_snan_bit_is_one(0, &env->fpus[i].fp_status);
+        } else {
+            set_snan_bit_is_one(1, &env->fpus[i].fp_status);
+        }
     }
 
     memcpy(&env->active_fpu, &env->fpus[0], sizeof(env->active_fpu));
@@ -893,5 +907,5 @@  static void msa_reset(CPUMIPSState *env)
     /* clear float_status nan mode */
     set_default_nan_mode(0, &env->active_tc.msa_fp_status);
 
-    set_snan_bit_is_one(1, &env->active_tc.msa_fp_status);
+    set_snan_bit_is_one(0, &env->active_tc.msa_fp_status);
 }
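
Note on the conversion helpers: every ceil/floor/round/trunc hunk above applies the same rule once softfloat has raised float_flag_invalid or float_flag_overflow. With FCR31.NAN2008 set, a NaN source yields 0 and an out-of-range source saturates to the minimum or maximum integer according to its sign. With NAN2008 clear, the legacy FP_TO_INT32_OVERFLOW / FP_TO_INT64_OVERFLOW value is returned. The following is a minimal standalone sketch of that selection rule only, not code from the patch. It uses host doubles and <math.h> instead of QEMU's softfloat types, and the names invalid_cvt_result_w(), invalid_cvt_result_l(), self_check() and the LEGACY_* constants (including their values) are assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for FP_TO_INT32_OVERFLOW / FP_TO_INT64_OVERFLOW. */
#define LEGACY_INT32_OVERFLOW INT32_MAX
#define LEGACY_INT64_OVERFLOW INT64_MAX

/*
 * Value substituted for a 32-bit conversion result after an invalid or
 * overflow exception. 'nan2008' models the FCR31.NAN2008 bit.
 */
static int32_t invalid_cvt_result_w(double src, bool nan2008)
{
    if (!nan2008) {
        return LEGACY_INT32_OVERFLOW;   /* legacy: one fixed value */
    }
    if (isnan(src)) {
        return 0;                       /* NAN2008: NaN converts to 0 */
    }
    /* NAN2008: saturate towards the sign of the source operand */
    return signbit(src) ? INT32_MIN : INT32_MAX;
}

/* Same rule for a 64-bit integer destination. */
static int64_t invalid_cvt_result_l(double src, bool nan2008)
{
    if (!nan2008) {
        return LEGACY_INT64_OVERFLOW;
    }
    if (isnan(src)) {
        return 0;
    }
    return signbit(src) ? INT64_MIN : INT64_MAX;
}

/* Small usage example covering each branch. */
static void self_check(void)
{
    assert(invalid_cvt_result_w(NAN, true) == 0);
    assert(invalid_cvt_result_w(-INFINITY, true) == INT32_MIN);
    assert(invalid_cvt_result_w(1e30, true) == INT32_MAX);
    assert(invalid_cvt_result_w(-1e30, false) == LEGACY_INT32_OVERFLOW);
    assert(invalid_cvt_result_l(NAN, true) == 0);
    assert(invalid_cvt_result_l(-1e30, true) == INT64_MIN);
    assert(invalid_cvt_result_l(1e30, true) == INT64_MAX);
}
```

In the helpers themselves the substitution happens only after the exception flags have been checked, so a function like this would be called from inside that check. Whether a shared helper of this kind is preferable to the open-coded blocks is a design choice for the patch author.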