
[RFC,v3,4/5] mttcg: Implement implicit ordering semantics

Message ID 20170829063313.10237-4-bobby.prani@gmail.com (mailing list archive)
State New, archived

Commit Message

Pranith Kumar Aug. 29, 2017, 6:33 a.m. UTC
Currently, we cannot use mttcg for running strong memory model guests
on weak memory model hosts due to missing ordering semantics.

We implicitly generate fence instructions for stronger guests if an
ordering mismatch is detected. We generate fences only for the orders
for which fence instructions are necessary, for example a fence is not
necessary between a store and a subsequent load on x86 since its
absence in the guest binary tells that ordering need not be
ensured. Also note that if we find multiple subsequent fence
instructions in the generated IR, we combine them in the TCG
optimization pass.

This patch allows us to boot an x86 guest on ARM64 hosts using mttcg.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 tcg/tcg-op.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Richard Henderson Aug. 29, 2017, 2:53 p.m. UTC | #1
On 08/28/2017 11:33 PM, Pranith Kumar wrote:
> Currently, we cannot use mttcg for running strong memory model guests
> on weak memory model hosts due to missing ordering semantics.
> 
> We implicitly generate fence instructions for stronger guests if an
> ordering mismatch is detected. We generate fences only for the orders
> for which fence instructions are necessary, for example a fence is not
> necessary between a store and a subsequent load on x86 since its
> absence in the guest binary tells that ordering need not be
> ensured. Also note that if we find multiple subsequent fence
> instructions in the generated IR, we combine them in the TCG
> optimization pass.
> 
> This patch allows us to boot an x86 guest on ARM64 hosts using mttcg.
> 
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> ---
>  tcg/tcg-op.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
Emilio Cota Sept. 2, 2017, 1:44 a.m. UTC | #2
On Tue, Aug 29, 2017 at 02:33:12 -0400, Pranith Kumar wrote:
> Currently, we cannot use mttcg for running strong memory model guests
> on weak memory model hosts due to missing ordering semantics.
> 
> We implicitly generate fence instructions for stronger guests if an

This confused me. By "We implicitly" are we still talking about
the current state (as per the "currently" above?). If not, I'd
rephrase as:

"We cannot use [...].

To fix it, generate fences [...]"

Also, I think you meant s/stronger/weaker/ in the last sentence.

> ordering mismatch is detected. We generate fences only for the orders
> for which fence instructions are necessary, for example a fence is not
> necessary between a store and a subsequent load on x86 since its
> absence in the guest binary tells that ordering need not be
> ensured. Also note that if we find multiple subsequent fence
> instructions in the generated IR, we combine them in the TCG
> optimization pass.

A before/after example of -d out_asm would be great to have here.
> 
> This patch allows us to boot an x86 guest on ARM64 hosts using mttcg.

A test with a simple program that *cannot* work without this patch
would be even better.

Thanks,

		Emilio

Patch

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 87f673ef49..688d91755b 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -28,6 +28,7 @@ 
 #include "exec/exec-all.h"
 #include "tcg.h"
 #include "tcg-op.h"
+#include "tcg-mo.h"
 #include "trace-tcg.h"
 #include "trace/mem.h"
 
@@ -2662,8 +2663,20 @@  static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
 #endif
 }
 
+static void tcg_gen_req_mo(TCGBar type)
+{
+#ifdef TCG_GUEST_DEFAULT_MO
+    type &= TCG_GUEST_DEFAULT_MO;
+#endif
+    type &= ~TCG_TARGET_DEFAULT_MO;
+    if (type) {
+        tcg_gen_mb(type | TCG_BAR_SC);
+    }
+}
+
 void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
+    tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     memop = tcg_canonicalize_memop(memop, 0, 0);
     trace_guest_mem_before_tcg(tcg_ctx.cpu, tcg_ctx.tcg_env,
                                addr, trace_mem_get_info(memop, 0));
@@ -2672,6 +2685,7 @@  void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 
 void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
+    tcg_gen_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     memop = tcg_canonicalize_memop(memop, 0, 1);
     trace_guest_mem_before_tcg(tcg_ctx.cpu, tcg_ctx.tcg_env,
                                addr, trace_mem_get_info(memop, 1));
@@ -2680,6 +2694,7 @@  void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 
 void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
+    tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
         tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
         if (memop & MO_SIGN) {
@@ -2698,6 +2713,7 @@  void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 
 void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
+    tcg_gen_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
         tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
         return;