Message ID | 20240228111151.287738-5-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | target/i386: use TSTEQ/TSTNE in x86 frontend | expand |
On 2/28/24 01:11, Paolo Bonzini wrote: > - /* TSTNE x,sign -> LT x,0 */ > - if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32 > - ? INT32_MIN : INT64_MIN))) { > + /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */ > + if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) { This is a good idea, but s_mask isn't defined like you think -- it is *repetitions* of the sign bit, but not including the sign bit itself. For INT64_MIN, s_mask == 0. So for TSTNE min,min, (min & ~0) != 0, so the test won't pass. r~
On 2/29/24 00:10, Richard Henderson wrote: > On 2/28/24 01:11, Paolo Bonzini wrote: >> - /* TSTNE x,sign -> LT x,0 */ >> - if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32 >> - ? INT32_MIN : INT64_MIN))) { >> + /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */ >> + if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) { > > This is a good idea, but s_mask isn't defined like you think -- it is > *repetitions* of the sign bit, but not including the sign bit itself. > For INT64_MIN, s_mask == 0. > > So for TSTNE min,min, (min & ~0) != 0, so the test won't pass. Oh! So I have to squash: diff --git a/tcg/optimize.c b/tcg/optimize.c index ab976a5bbe7..44d1b1a6d8a 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -140,6 +140,12 @@ static inline bool arg_is_const_val(TCGArg arg, uint64_t val) return ts_is_const_val(arg_temp(arg), val); } +/* Calculate all the copies of the sign bit, both redundant and not. */ +static inline uint64_t all_sign_bit_copies(TempOptInfo *info) +{ + return (info->s_mask >> 1) | INT64_MIN; +} + static inline bool ts_is_copy(TCGTemp *ts) { return ts_info(ts)->next_copy != ts; @@ -825,7 +831,7 @@ static int do_constant_folding_cond1(OptContext *ctx, TCGOp *op, TCGArg dest, } /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */ - if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) { + if (arg_is_const(*p2) && (arg_info(*p2)->val & ~all_sign_bit_copies(i1)) == 0) { *p2 = arg_new_constant(ctx, 0); *pcond = tcg_tst_ltge_cond(cond); return -1; I tested with movq $0xffffffff80000000, %rbx test %ebx, %ebx js y and I get brcond_i64 cc_dst,$0x80000000,tstne,$L1 which works and matches your explanation: i1.s_mask == 0xffffffff00000000 i2.val == 0x80000000 all_sign_bit_copies(i1) == 0xffffffff80000000 u2.val & ~all_sign_bit_copies(i1) == 0 Thanks! Paolo
On 2/28/24 23:35, Paolo Bonzini wrote: > On 2/29/24 00:10, Richard Henderson wrote: >> On 2/28/24 01:11, Paolo Bonzini wrote: >>> - /* TSTNE x,sign -> LT x,0 */ >>> - if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32 >>> - ? INT32_MIN : INT64_MIN))) { >>> + /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */ >>> + if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) { >> >> This is a good idea, but s_mask isn't defined like you think -- it is *repetitions* of >> the sign bit, but not including the sign bit itself. For INT64_MIN, s_mask == 0. >> >> So for TSTNE min,min, (min & ~0) != 0, so the test won't pass. > > Oh! So I have to squash: > > diff --git a/tcg/optimize.c b/tcg/optimize.c > index ab976a5bbe7..44d1b1a6d8a 100644 > --- a/tcg/optimize.c > +++ b/tcg/optimize.c > @@ -140,6 +140,12 @@ static inline bool arg_is_const_val(TCGArg arg, uint64_t val) > return ts_is_const_val(arg_temp(arg), val); > } > > +/* Calculate all the copies of the sign bit, both redundant and not. */ > +static inline uint64_t all_sign_bit_copies(TempOptInfo *info) > +{ > + return (info->s_mask >> 1) | INT64_MIN; > +} You need to care about type too -- for TCG_TYPE_I32, you'll want to OR in INT32_MIN. The high bits of s_mask will be unknown (might be 1's from fold_masks, might be 0 from reset_ts). But otherwise that's a good solution. r~
diff --git a/tcg/optimize.c b/tcg/optimize.c index 752cc5c56b6..ab976a5bbe7 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -793,6 +793,7 @@ static int do_constant_folding_cond1(OptContext *ctx, TCGOp *op, TCGArg dest, TCGArg *p1, TCGArg *p2, TCGArg *pcond) { TCGCond cond; + TempOptInfo *i1; bool swap; int r; @@ -810,19 +811,21 @@ static int do_constant_folding_cond1(OptContext *ctx, TCGOp *op, TCGArg dest, return -1; } + i1 = arg_info(*p1); + /* * TSTNE x,x -> NE x,0 - * TSTNE x,-1 -> NE x,0 + * TSTNE x,i -> NE x,0 if i includes all nonzero bits of x */ - if (args_are_copies(*p1, *p2) || arg_is_const_val(*p2, -1)) { + if (args_are_copies(*p1, *p2) || + (arg_is_const(*p2) && (i1->z_mask & ~arg_info(*p2)->val) == 0)) { *p2 = arg_new_constant(ctx, 0); *pcond = tcg_tst_eqne_cond(cond); return -1; } - /* TSTNE x,sign -> LT x,0 */ - if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32 - ? INT32_MIN : INT64_MIN))) { + /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */ + if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) { *p2 = arg_new_constant(ctx, 0); *pcond = tcg_tst_ltge_cond(cond); return -1;
Generalize the existing optimization of "TSTNE x,sign" and "TSTNE x,-1". This can be useful in some cases when the i386 frontend creates opcodes that test against 0xff or 0x80. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- tcg/optimize.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)