diff mbox series

[v11,03/16] target/riscv: clwz must ignore high bits (use shift-left & changed logic)

Message ID 20210911140016.834071-4-philipp.tomsich@vrull.eu (mailing list archive)
State New, archived
Headers show
Series target/riscv: Update QEmu for Zb[abcs] 1.0.0 | expand

Commit Message

Philipp Tomsich Sept. 11, 2021, 2 p.m. UTC
Assume clzw being executed on a register that is not sign-extended, such
as for the following sequence that uses (1ULL << 63) | 392 as the operand
to clzw:
	bseti	a2, zero, 63
	addi	a2, a2, 392
	clzw    a3, a2
The correct result of clzw would be 23, but the current implementation
returns -32 (as it performs a 64bit clz, which results in 0 leading zero
bits, and then subtracts 32).

Fix this by changing the implementation to:
 1. shift the original register up by 32
 2. performs a target-length (64bit) clz
 3. return 32 if no bits are set

Marking this instruction as 'w-form' (i.e., setting ctx->w) would not
correctly model the behaviour, as the instruction should not perform
a zero-extensions on the input (after all, it is not a .uw instruction)
and the result is always in the range 0..32 (so neither a sign-extension
nor a zero-extension on the result will ever be needed).  Consequently,
we do not set ctx->w and mark the instruction as EXT_NONE.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---

Changes in v11:
- Swaps out the EXT_ZERO to EXT_NONE, as no extension is to be performed.

Changes in v10:
- New patch, fixing correctnes for clzw called on a register with undefined
  (as in: not properly sign-extended) upper bits.

 target/riscv/insn_trans/trans_rvb.c.inc | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

LIU Zhiwei Sept. 14, 2021, 9:15 a.m. UTC | #1
On 2021/9/11 下午10:00, Philipp Tomsich wrote:
> Assume clzw being executed on a register that is not sign-extended, such
> as for the following sequence that uses (1ULL << 63) | 392 as the operand
> to clzw:
> 	bseti	a2, zero, 63
> 	addi	a2, a2, 392
> 	clzw    a3, a2
> The correct result of clzw would be 23, but the current implementation
> returns -32 (as it performs a 64bit clz, which results in 0 leading zero
> bits, and then subtracts 32).

As the MSB word of  a3 has been cleaned,  the result of current 
implementation will be 23. So there is no
error here.

Thanks,
Zhiwei

> Fix this by changing the implementation to:
>   1. shift the original register up by 32
>   2. performs a target-length (64bit) clz
>   3. return 32 if no bits are set
>
> Marking this instruction as 'w-form' (i.e., setting ctx->w) would not
> correctly model the behaviour, as the instruction should not perform
> a zero-extensions on the input (after all, it is not a .uw instruction)
> and the result is always in the range 0..32 (so neither a sign-extension
> nor a zero-extension on the result will ever be needed).  Consequently,
> we do not set ctx->w and mark the instruction as EXT_NONE.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
>
> Changes in v11:
> - Swaps out the EXT_ZERO to EXT_NONE, as no extension is to be performed.
>
> Changes in v10:
> - New patch, fixing correctnes for clzw called on a register with undefined
>    (as in: not properly sign-extended) upper bits.
>
>   target/riscv/insn_trans/trans_rvb.c.inc | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
> index 6c85c89f6d..73d1e45026 100644
> --- a/target/riscv/insn_trans/trans_rvb.c.inc
> +++ b/target/riscv/insn_trans/trans_rvb.c.inc
> @@ -349,15 +349,17 @@ GEN_TRANS_SHADD(3)
>   
>   static void gen_clzw(TCGv ret, TCGv arg1)
>   {
> -    tcg_gen_clzi_tl(ret, arg1, 64);
> -    tcg_gen_subi_tl(ret, ret, 32);
> +    TCGv t = tcg_temp_new();
> +    tcg_gen_shli_tl(t, arg1, 32);
> +    tcg_gen_clzi_tl(ret, t, 32);
> +    tcg_temp_free(t);
>   }
>   
>   static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
>   {
>       REQUIRE_64BIT(ctx);
>       REQUIRE_EXT(ctx, RVB);
> -    return gen_unary(ctx, a, EXT_ZERO, gen_clzw);
> +    return gen_unary(ctx, a, EXT_NONE, gen_clzw);
>   }
>   
>   static void gen_ctzw(TCGv ret, TCGv arg1)
Philipp Tomsich Sept. 14, 2021, 10:24 a.m. UTC | #2
On Tue 14. Sep 2021 at 11:15, LIU Zhiwei <zhiwei_liu@c-sky.com> wrote:

>
> On 2021/9/11 下午10:00, Philipp Tomsich wrote:
> > Assume clzw being executed on a register that is not sign-extended, such
> > as for the following sequence that uses (1ULL << 63) | 392 as the operand
> > to clzw:
> >       bseti   a2, zero, 63
> >       addi    a2, a2, 392
> >       clzw    a3, a2
> > The correct result of clzw would be 23, but the current implementation
> > returns -32 (as it performs a 64bit clz, which results in 0 leading zero
> > bits, and then subtracts 32).
>
> As the MSB word of  a3 has been cleaned,  the result of current
> implementation will be 23. So there is no
> error here.


Zhiwei,

bits [63:32] on rs (arg1) are not zero-extended, as ctx->w is not being set
(the EXT_ZERO doesn’t have any effect, unless ctx->w is true).  Please see
the earlier discussion on this topic in v9 and v10.

Thanks,
Philipp.

Thanks,
> Zhiwei
>
> > Fix this by changing the implementation to:
> >   1. shift the original register up by 32
> >   2. performs a target-length (64bit) clz
> >   3. return 32 if no bits are set
> >
> > Marking this instruction as 'w-form' (i.e., setting ctx->w) would not
> > correctly model the behaviour, as the instruction should not perform
> > a zero-extensions on the input (after all, it is not a .uw instruction)
> > and the result is always in the range 0..32 (so neither a sign-extension
> > nor a zero-extension on the result will ever be needed).  Consequently,
> > we do not set ctx->w and mark the instruction as EXT_NONE.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> >
> > Changes in v11:
> > - Swaps out the EXT_ZERO to EXT_NONE, as no extension is to be performed.
> >
> > Changes in v10:
> > - New patch, fixing correctnes for clzw called on a register with
> undefined
> >    (as in: not properly sign-extended) upper bits.
> >
> >   target/riscv/insn_trans/trans_rvb.c.inc | 8 +++++---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/target/riscv/insn_trans/trans_rvb.c.inc
> b/target/riscv/insn_trans/trans_rvb.c.inc
> > index 6c85c89f6d..73d1e45026 100644
> > --- a/target/riscv/insn_trans/trans_rvb.c.inc
> > +++ b/target/riscv/insn_trans/trans_rvb.c.inc
> > @@ -349,15 +349,17 @@ GEN_TRANS_SHADD(3)
> >
> >   static void gen_clzw(TCGv ret, TCGv arg1)
> >   {
> > -    tcg_gen_clzi_tl(ret, arg1, 64);
> > -    tcg_gen_subi_tl(ret, ret, 32);
> > +    TCGv t = tcg_temp_new();
> > +    tcg_gen_shli_tl(t, arg1, 32);
> > +    tcg_gen_clzi_tl(ret, t, 32);
> > +    tcg_temp_free(t);
> >   }
> >
> >   static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
> >   {
> >       REQUIRE_64BIT(ctx);
> >       REQUIRE_EXT(ctx, RVB);
> > -    return gen_unary(ctx, a, EXT_ZERO, gen_clzw);
> > +    return gen_unary(ctx, a, EXT_NONE, gen_clzw);
> >   }
> >
> >   static void gen_ctzw(TCGv ret, TCGv arg1)
>
LIU Zhiwei Sept. 15, 2021, 8:11 a.m. UTC | #3
On 2021/9/14 下午6:24, Philipp Tomsich wrote:
>
>
> On Tue 14. Sep 2021 at 11:15, LIU Zhiwei <zhiwei_liu@c-sky.com 
> <mailto:zhiwei_liu@c-sky.com>> wrote:
>
>
>     On 2021/9/11 下午10:00, Philipp Tomsich wrote:
>     > Assume clzw being executed on a register that is not
>     sign-extended, such
>     > as for the following sequence that uses (1ULL << 63) | 392 as
>     the operand
>     > to clzw:
>     >       bseti   a2, zero, 63
>     >       addi    a2, a2, 392
>     >       clzw    a3, a2
>     > The correct result of clzw would be 23, but the current
>     implementation
>     > returns -32 (as it performs a 64bit clz, which results in 0
>     leading zero
>     > bits, and then subtracts 32).
>
>     As the MSB word of  a3 has been cleaned,  the result of current
>     implementation will be 23. So there is no
>     error here.
>
>
> Zhiwei,
>
> bits [63:32] on rs (arg1) are not zero-extended, as ctx->w is not 
> being set (the EXT_ZERO doesn’t have any effect, unless ctx->w is 
> true).  Please see the earlier discussion on this topic in v9 and v10.

Yes,  you are right.

Reviewed-by: LIU Zhiwei<zhiwei_liu@c-sky.com>

Zhiwei

>
> Thanks,
> Philipp.
>
>     Thanks,
>     Zhiwei
>
>     > Fix this by changing the implementation to:
>     >   1. shift the original register up by 32
>     >   2. performs a target-length (64bit) clz
>     >   3. return 32 if no bits are set
>     >
>     > Marking this instruction as 'w-form' (i.e., setting ctx->w)
>     would not
>     > correctly model the behaviour, as the instruction should not perform
>     > a zero-extensions on the input (after all, it is not a .uw
>     instruction)
>     > and the result is always in the range 0..32 (so neither a
>     sign-extension
>     > nor a zero-extension on the result will ever be needed). 
>     Consequently,
>     > we do not set ctx->w and mark the instruction as EXT_NONE.
>     >
>     > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu
>     <mailto:philipp.tomsich@vrull.eu>>
>     > ---
>     >
>     > Changes in v11:
>     > - Swaps out the EXT_ZERO to EXT_NONE, as no extension is to be
>     performed.
>     >
>     > Changes in v10:
>     > - New patch, fixing correctnes for clzw called on a register
>     with undefined
>     >    (as in: not properly sign-extended) upper bits.
>     >
>     >   target/riscv/insn_trans/trans_rvb.c.inc | 8 +++++---
>     >   1 file changed, 5 insertions(+), 3 deletions(-)
>     >
>     > diff --git a/target/riscv/insn_trans/trans_rvb.c.inc
>     b/target/riscv/insn_trans/trans_rvb.c.inc
>     > index 6c85c89f6d..73d1e45026 100644
>     > --- a/target/riscv/insn_trans/trans_rvb.c.inc
>     > +++ b/target/riscv/insn_trans/trans_rvb.c.inc
>     > @@ -349,15 +349,17 @@ GEN_TRANS_SHADD(3)
>     >
>     >   static void gen_clzw(TCGv ret, TCGv arg1)
>     >   {
>     > -    tcg_gen_clzi_tl(ret, arg1, 64);
>     > -    tcg_gen_subi_tl(ret, ret, 32);
>     > +    TCGv t = tcg_temp_new();
>     > +    tcg_gen_shli_tl(t, arg1, 32);
>     > +    tcg_gen_clzi_tl(ret, t, 32);
>     > +    tcg_temp_free(t);
>     >   }
>     >
>     >   static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
>     >   {
>     >       REQUIRE_64BIT(ctx);
>     >       REQUIRE_EXT(ctx, RVB);
>     > -    return gen_unary(ctx, a, EXT_ZERO, gen_clzw);
>     > +    return gen_unary(ctx, a, EXT_NONE, gen_clzw);
>     >   }
>     >
>     >   static void gen_ctzw(TCGv ret, TCGv arg1)
>
diff mbox series

Patch

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 6c85c89f6d..73d1e45026 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -349,15 +349,17 @@  GEN_TRANS_SHADD(3)
 
 static void gen_clzw(TCGv ret, TCGv arg1)
 {
-    tcg_gen_clzi_tl(ret, arg1, 64);
-    tcg_gen_subi_tl(ret, ret, 32);
+    TCGv t = tcg_temp_new();
+    tcg_gen_shli_tl(t, arg1, 32);
+    tcg_gen_clzi_tl(ret, t, 32);
+    tcg_temp_free(t);
 }
 
 static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVB);
-    return gen_unary(ctx, a, EXT_ZERO, gen_clzw);
+    return gen_unary(ctx, a, EXT_NONE, gen_clzw);
 }
 
 static void gen_ctzw(TCGv ret, TCGv arg1)