[0/2] arm: Implement M-profile trapping on division by zero

Message ID 20210730151636.17254-1-peter.maydell@linaro.org (mailing list archive)

Message

Peter Maydell July 30, 2021, 3:16 p.m. UTC
Unlike A-profile, for M-profile the UDIV and SDIV insns can be
configured to raise an exception on division by zero, using the CCR
DIV_0_TRP bit.  This patchset implements that missing functionality
by having the udiv and sdiv helpers raise an exception if needed.
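
As a rough illustration of the intended behaviour, here is a minimal
standalone C model. It is not the QEMU helper: ModelEnv, its fields and
model_udiv are hypothetical names, and a real helper would raise a CPU
exception rather than set a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state; only what this sketch needs. */
typedef struct {
    bool div_0_trp;   /* models the CCR DIV_0_TRP bit */
    bool trapped;     /* set here instead of raising a real exception */
} ModelEnv;

/* Unsigned divide: trap on a zero divisor if configured, else return 0. */
uint32_t model_udiv(ModelEnv *env, uint32_t num, uint32_t den)
{
    if (den == 0) {
        if (env->div_0_trp) {
            env->trapped = true;   /* assumption: models the exception */
        }
        return 0;
    }
    return num / den;
}
```

With div_0_trp clear the model simply returns 0 for x/0, which is the
non-trapping behaviour.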

Some questions:

Is it worth allowing A-profile to retain the mildly better codegen it
gets from not having to pass in 'env' and marking the helper as
no-side-effects (ie having M-specific udiv/sdiv helpers) ?

Is it worth inlining either udiv or sdiv for the A-profile case?
udiv can be done with movcond/movcond/divu, something like:

    /* t1 = (t2 == 0) ? 0 : t1;    t2 = (t2 == 0) ? 1 : t2 */
    tcg_gen_movcond_i32(TCG_COND_EQ, t1, t2, tcg_constant_i32(0),
                        tcg_constant_i32(0), t1);
    tcg_gen_movcond_i32(TCG_COND_EQ, t2, t2, tcg_constant_i32(0),
                        tcg_constant_i32(1), t2);
    /* Either t1 / t2; or 0 / 1 to give 0 for division-by-zero */
    tcg_gen_divu_i32(t1, t1, t2);
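
To sanity-check the idea outside TCG, the same sequence can be modelled
in plain C and compared against a reference that returns 0 for a zero
divisor. This is only a sketch; udiv_ref and udiv_movcond are
illustrative names, not existing QEMU functions.

```c
#include <stdint.h>

/* Reference: non-trapping unsigned divide, 0 for a zero divisor. */
uint32_t udiv_ref(uint32_t num, uint32_t den)
{
    return den == 0 ? 0 : num / den;
}

/* Model of the movcond/movcond/divu sequence sketched above. */
uint32_t udiv_movcond(uint32_t t1, uint32_t t2)
{
    t1 = (t2 == 0) ? 0 : t1;   /* first movcond: zero the dividend */
    t2 = (t2 == 0) ? 1 : t2;   /* second movcond: make the divisor 1 */
    return t1 / t2;            /* either t1 / t2, or 0 / 1 */
}
```

The order matters: the first movcond has to test t2 before the second
one overwrites it.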

sdiv is more painful because it needs to check for both x/0 and
INTMIN/-1 cases.  Some other targets choose to generate inline TCG
ops for it, though.
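
For reference, the two special cases look like this in a plain C model
(a sketch only; sdiv_model is a hypothetical name). It assumes the
non-trapping Arm results: 0 for x/0 and INT32_MIN for INT32_MIN / -1.
The second check is needed in C as well, because that division
overflows and is undefined behaviour.

```c
#include <stdint.h>

/* Signed divide with both special cases handled explicitly. */
int32_t sdiv_model(int32_t num, int32_t den)
{
    if (den == 0) {
        return 0;               /* x/0 gives 0 when not trapping */
    }
    if (num == INT32_MIN && den == -1) {
        return INT32_MIN;       /* would overflow; result wraps */
    }
    return num / den;           /* C division truncates toward zero */
}
```

An inline TCG version would need to cover both of these conditions
before emitting the divide.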

Side note, I don't understand the x86-64 codegen for the above
sketch of an inline udiv. When I try it the TCG ops are

  mov_i32 tmp3,r2
  mov_i32 tmp6,r3
  movcond_i32 tmp3,tmp6,$0x0,$0x0,tmp3,eq
  movcond_i32 tmp6,tmp6,$0x0,$0x1,tmp6,eq
  mov_i32 tmp7,$0x0
  divu2_i32 tmp3,tmp7,tmp3,tmp7,tmp6
  mov_i32 r3,tmp3

but the x86 code is
0x7f5f1807dc0c:  45 33 f6                 xorl     %r14d, %r14d
0x7f5f1807dc0f:  45 85 ed                 testl    %r13d, %r13d
0x7f5f1807dc12:  45 0f 44 e6              cmovel   %r14d, %r12d
0x7f5f1807dc16:  41 bf 01 00 00 00        movl     $1, %r15d
0x7f5f1807dc1c:  45 3b ee                 cmpl     %r14d, %r13d
0x7f5f1807dc1f:  45 0f 44 ef              cmovel   %r15d, %r13d
0x7f5f1807dc23:  41 8b c4                 movl     %r12d, %eax
0x7f5f1807dc26:  41 8b d6                 movl     %r14d, %edx
0x7f5f1807dc29:  41 f7 f5                 divl     %r13d

where the comparison for the first cmovel is 'testl %r13d, %r13d',
but the second comparison is 'cmpl %r14d, %r13d'.  That's the same
effect (given r14 is 0) but I don't understand why the backend has
chosen to generate different code for the two cases.  (Ideally of
course it would notice that it already had generated the condition
check and not repeat it.)

thanks
-- PMM

Peter Maydell (2):
  target/arm: Re-indent sdiv and udiv helpers
  target/arm: Implement M-profile trapping on division by zero

 target/arm/cpu.h       |  1 +
 target/arm/helper.h    |  4 ++--
 target/arm/helper.c    | 34 ++++++++++++++++++++++++++--------
 target/arm/m_helper.c  |  4 ++++
 target/arm/translate.c |  4 ++--
 5 files changed, 35 insertions(+), 12 deletions(-)

Comments

Richard Henderson Aug. 2, 2021, 10:23 p.m. UTC | #1
On 7/30/21 5:16 AM, Peter Maydell wrote:
> Unlike A-profile, for M-profile the UDIV and SDIV insns can be
> configured to raise an exception on division by zero, using the CCR
> DIV_0_TRP bit.  This patchset implements that missing functionality
> by having the udiv and sdiv helpers raise an exception if needed.
> 
> Some questions:
> 
> Is it worth allowing A-profile to retain the mildly better codegen it
> gets from not having to pass in 'env' and marking the helper as
> no-side-effects (ie having M-specific udiv/sdiv helpers) ?

Probably not.

> Is it worth inlining either udiv or sdiv for the A-profile case?

Probably not.

>    mov_i32 tmp3,r2
>    mov_i32 tmp6,r3
>    movcond_i32 tmp3,tmp6,$0x0,$0x0,tmp3,eq
>    movcond_i32 tmp6,tmp6,$0x0,$0x1,tmp6,eq
>    mov_i32 tmp7,$0x0
>    divu2_i32 tmp3,tmp7,tmp3,tmp7,tmp6
>    mov_i32 r3,tmp3
> 
> but the x86 code is
> 0x7f5f1807dc0c:  45 33 f6                 xorl     %r14d, %r14d
> 0x7f5f1807dc0f:  45 85 ed                 testl    %r13d, %r13d
> 0x7f5f1807dc12:  45 0f 44 e6              cmovel   %r14d, %r12d


At the start of the first movcond, $0x0 is not allocated to a register, and the
constraints allow a constant for argument 3.  The constraints do not allow a constant
for argument 4, however, so we load $0x0 into a register.

> 0x7f5f1807dc16:  41 bf 01 00 00 00        movl     $1, %r15d
> 0x7f5f1807dc1c:  45 3b ee                 cmpl     %r14d, %r13d
> 0x7f5f1807dc1f:  45 0f 44 ef              cmovel   %r15d, %r13d

At the start of the second movcond, $0x0 is already loaded into a register, so we use it.

> (Ideally of
> course it would notice that it already had generated the condition
> check and not repeat it.)

Yep.


r~