diff mbox

[28/32] ppc: Avoid double translation for lvx/lvxl/stvx/stvxl

Message ID 1469782800.5978.284.camel@kernel.crashing.org (mailing list archive)
State New, archived
Headers show

Commit Message

Benjamin Herrenschmidt July 29, 2016, 9 a.m. UTC
On Fri, 2016-07-29 at 06:19 +0530, Richard Henderson wrote:
> (1) The helper, since it writes to registers controlled by tcg, must be 
> described to clobber all registers.  Which will noticeably increase memory 
> traffic to ENV.  For instance, you won't be able to hold the guest register 
> holding the address in a host register across the call.

So after fixing my test setup, I did observe indeed a small performance
loss using the helper in qemu-user. It might still win us something in
softmmu due to avoiding extra translations but I will leave that aside
as I mentioned separately.

Now out of curosity, I tried this:


If my understanding is right, the above is correct, as none of these
instructions will write to the env, though they can read from it and/
or generate faults.

Sadly I haven't observed any performance improvement as a result in
a few micro-benchmarks I cooked up.

Cheers,
Ben

Comments

Richard Henderson July 29, 2016, 12:43 p.m. UTC | #1
On 07/29/2016 02:30 PM, Benjamin Herrenschmidt wrote:
>  DEF_HELPER_3(lmw, void, env, tl, i32)
> -DEF_HELPER_3(stmw, void, env, tl, i32)
> +DEF_HELPER_FLAGS_3(stmw, TCG_CALL_NO_WG, void, env, tl, i32)
>  DEF_HELPER_4(lsw, void, env, tl, i32, i32)
>  DEF_HELPER_5(lswx, void, env, tl, i32, i32, i32)
> -DEF_HELPER_4(stsw, void, env, tl, i32, i32)
> -DEF_HELPER_3(dcbz, void, env, tl, i32)
> -DEF_HELPER_2(icbi, void, env, tl)
> +DEF_HELPER_FLAGS_4(stsw, TCG_CALL_NO_WG, void, env, tl, i32, i32)
> +DEF_HELPER_FLAGS_3(dcbz, TCG_CALL_NO_WG, void, env, tl, i32)
> +DEF_HELPER_FLAGS_2(icbi, TCG_CALL_NO_WG, void, env, tl)
>  DEF_HELPER_5(lscbx, tl, env, tl, i32, i32, i32)
>
>  #if defined(TARGET_PPC64)
>
> If my understanding is right, the above is correct, as none of these
> instructions will write to the env, though they can read from it and/
> or generate faults.
>
> Sadly I haven't observed any performance improvement as a result in
> a few micro-benchmarks I cooked up.

Too bad.

But the changes look correct, so you might as well queue them.  ;-)


r~
diff mbox

Patch

--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -22,12 +22,12 @@  DEF_HELPER_1(check_tlb_flush, void, env)
 #endif
 
 DEF_HELPER_3(lmw, void, env, tl, i32)
-DEF_HELPER_3(stmw, void, env, tl, i32)
+DEF_HELPER_FLAGS_3(stmw, TCG_CALL_NO_WG, void, env, tl, i32)
 DEF_HELPER_4(lsw, void, env, tl, i32, i32)
 DEF_HELPER_5(lswx, void, env, tl, i32, i32, i32)
-DEF_HELPER_4(stsw, void, env, tl, i32, i32)
-DEF_HELPER_3(dcbz, void, env, tl, i32)
-DEF_HELPER_2(icbi, void, env, tl)
+DEF_HELPER_FLAGS_4(stsw, TCG_CALL_NO_WG, void, env, tl, i32, i32)
+DEF_HELPER_FLAGS_3(dcbz, TCG_CALL_NO_WG, void, env, tl, i32)
+DEF_HELPER_FLAGS_2(icbi, TCG_CALL_NO_WG, void, env, tl)
 DEF_HELPER_5(lscbx, tl, env, tl, i32, i32, i32)
 
 #if defined(TARGET_PPC64)