Message ID | 1566236601-22954-1-git-send-email-pc@us.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] ppc: conform to processor User's Manual for xscvdpspn |
On Mon, Aug 19, 2019 at 12:43:21PM -0500, Paul A. Clarke wrote:
> From: "Paul A. Clarke" <pc@us.ibm.com>
>
> The POWER8 and POWER9 User's Manuals specify the implementation
> behavior for what the ISA leaves "undefined" behavior for the
> xscvdpspn and xscvdpsp instructions. This patch corrects the QEMU
> implementation to match the hardware implementation for that case.
>
> ISA 3.0B has xscvdpspn leaving its result in word 0 of the target register,
> with the other words of the target register left "undefined".
>
> The User's Manuals specify:
>   VSX scalar convert from double-precision to single-precision (xscvdpsp,
>   xscvdpspn).
>   VSR[32:63] is set to VSR[0:31].
> So, words 0 and 1 both contain the result.
>
> Note: this is important because GCC, as of version 8 or so, assumes and takes
> advantage of this behavior to optimize the following sequence:
>   xscvdpspn vs0,vs1
>   mffprwz r8,f0
> ISA 3.0B has xscvdpspn leaving its result in word 0 of the target register,
> and mffprwz expecting its input to come from word 1 of the source register.
> This sequence fails with QEMU, as a shift is required between those two
> instructions. However, since the hardware splats the result to both words 0
> and 1 of its output register, the shift is not necessary.
>
> Expect a future revision of the ISA to specify this behavior.
>
> Signed-off-by: Paul A. Clarke <pc@us.ibm.com>

Applied to ppc-for-4.2, thanks.

>
> v2
> - Splitting patch "ppc: Three floating point fixes"; this is just one part.
> - Updated commit message to clarify behavior is documented in User's Manuals.
> - Updated commit message to correct which words are in output and source of
>   xscvdpspn and mffprwz.
> - No source changes to this part of the original patch.
>
> ---
>  target/ppc/fpu_helper.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 5611cf0..23b9c97 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -2871,10 +2871,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
>
>  uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
>  {
> +    uint64_t result;
> +
>      float_status tstat = env->fp_status;
>      set_float_exception_flags(0, &tstat);
>
> -    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
> +    result = (uint64_t)float64_to_float32(xb, &tstat);
> +    /* hardware replicates result to both words of the doubleword result. */
> +    return (result << 32) | result;
>  }
>
>  uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 5611cf0..23b9c97 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2871,10 +2871,14 @@ void helper_xscvqpdp(CPUPPCState *env, uint32_t opcode,
 
 uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
 {
+    uint64_t result;
+
     float_status tstat = env->fp_status;
     set_float_exception_flags(0, &tstat);
 
-    return (uint64_t)float64_to_float32(xb, &tstat) << 32;
+    result = (uint64_t)float64_to_float32(xb, &tstat);
+    /* hardware replicates result to both words of the doubleword result. */
+    return (result << 32) | result;
 }
 
 uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb)