diff mbox

[2/5] trace: [all] Add "guest_vmem" event

Message ID 145625173880.12025.6630606700468410319.stgit@localhost (mailing list archive)
State New, archived

Commit Message

Lluís Vilanova Feb. 23, 2016, 6:22 p.m. UTC
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
---
 include/exec/cpu_ldst_template.h          |   17 +++++++++++++++
 include/exec/cpu_ldst_useronly_template.h |   14 ++++++++++++
 tcg/tcg-op.c                              |   34 ++++++++++++++++++++++++++---
 trace-events                              |   13 +++++++++++
 4 files changed, 74 insertions(+), 4 deletions(-)

Comments

Peter Maydell March 16, 2016, 3:01 p.m. UTC | #1
On 23 February 2016 at 18:22, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
> Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>

> +### Guest events, keep at bottom
> +
> +# @vaddr: Access' virtual address.
> +# @size : Access' size (bytes).
> +# @store: Whether the access is a store.
> +#
> +# Start virtual memory access (before any potential access violation).
> +#
> +# Does not include memory accesses performed by devices.
> +#
> +# Targets: TCG(all)
> +disable vcpu tcg guest_vmem(TCGv vaddr, uint8_t size, uint8_t store) "size=%d store=%d", "vaddr=0x%016"PRIx64" size=%d store=%d"

Shouldn't we also be tracing some of the other information in the memop?
In particular, endianness of the access seems useful. You could also
say whether we're zero- or sign-extending the access, though I guess
you could defend not printing that since we don't print anything about
what the target- code does with the loaded data.

Otherwise I think this looks OK (though the various paths memory accesses
take are a bit opaque since I've forgotten how this bit of QEMU works).

thanks
-- PMM
Lluís Vilanova March 17, 2016, 7:22 p.m. UTC | #2
Peter Maydell writes:

> On 23 February 2016 at 18:22, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
>> Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>

>> +### Guest events, keep at bottom
>> +
>> +# @vaddr: Access' virtual address.
>> +# @size : Access' size (bytes).
>> +# @store: Whether the access is a store.
>> +#
>> +# Start virtual memory access (before any potential access violation).
>> +#
>> +# Does not include memory accesses performed by devices.
>> +#
>> +# Targets: TCG(all)
>> +disable vcpu tcg guest_vmem(TCGv vaddr, uint8_t size, uint8_t store) "size=%d store=%d", "vaddr=0x%016"PRIx64" size=%d store=%d"

> Shouldn't we also be tracing some of the other information in the memop?
> In particular, endianness of the access seems useful. You could also
> say whether we're zero- or sign-extending the access, though I guess
> you could defend not printing that since we don't print anything about
> what the target- code does with the loaded data.

> Otherwise I think this looks OK (though the various paths memory accesses
> take are a bit opaque since I've forgotten how this bit of QEMU works).

Mmmmm, the endianness seems more of a vCPU property than one of the memory
access. A separate event could be added for that (e.g., at vCPU
initialization/hot-plug and whenever it is dynamically changed like in ARM).

For the sign extension and memory value, what about adding multiple events?
What I did for instructions is have a simple event and one with extended
information, so that we can tweak performance of a tracing QEMU by choosing one
or the other.  We could do the same for memory accesses (e.g., also show the
memory value, sign extension and physical address).
Richard Henderson March 17, 2016, 8:18 p.m. UTC | #3
On 03/17/2016 12:22 PM, Lluís Vilanova wrote:
> Mmmmm, the endianness seems more of a vCPU property than one of the memory
> access.

On the contrary.  Plenty of cpus have endian-swapping load/store insns: x86
(haswell's movbe), powerpc, sparcv9, s390.  Maybe others I've forgotten.


r~
Peter Maydell March 17, 2016, 8:25 p.m. UTC | #4
On 17 March 2016 at 19:22, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
> Mmmmm, the endianness seems more of a vCPU property than one of the memory
> access. A separate event could be added for that (e.g., at vCPU
> initialization/hot-plug and whenever it is dynamically changed like in ARM).

We've chosen to implement it in QEMU as a property of the memory
access. Consider for instance instructions like x86 movbe --
sometimes an individual instruction will be a BE access even if the
CPU more usually does LE accesses. Equally, CPUs might have different
accesses for data and code, or other complicated things.
I think we're probably better off tracing as part of the memory
access the things we've implemented as memory access properties
or flags; at least that's consistent.

> For the sign extension and memory value, what about adding multiple events?
> What I did for instructions is have a simple event and one with extended
> information, so that we can tweak performance of a tracing QEMU by choosing one
> or the other.  We could do the same for memory accesses (e.g., also show the
> memory value, sign extension and physical address).

I think the info that's in the memop value is probably worth
just printing always, it won't make much difference.

Trying to trace physaddrs is very tricky -- in the case of
a TLB hit there is no guarantee you can still identify the
physaddr of what you're accessing (the guest might have
changed the page tables and not invalidated the TLB).

thanks
-- PMM
Lluís Vilanova March 18, 2016, 6:50 p.m. UTC | #5
Peter Maydell writes:

> On 17 March 2016 at 19:22, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
>> Mmmmm, the endianness seems more of a vCPU property than one of the memory
>> access. A separate event could be added for that (e.g., at vCPU
>> initialization/hot-plug and whenever it is dynamically changed like in ARM).

> We've chosen to implement it in QEMU as a property of the memory
> access. Consider for instance instructions like x86 movbe --
> sometimes an individual instruction will be a BE access even if the
> CPU more usually does LE accesses. Equally, CPUs might have different
> accesses for data and code, or other complicated things.
> I think we're probably better off tracing as part of the memory
> access the things we've implemented as memory access properties
> or flags; at least that's consistent.

Oh, I wasn't aware of the endianness swapping, but your argument makes sense.


>> For the sign extension and memory value, what about adding multiple events?
>> What I did for instructions is have a simple event and one with extended
>> information, so that we can tweak performance of a tracing QEMU by choosing one
>> or the other.  We could do the same for memory accesses (e.g., also show the
>> memory value, sign extension and physical address).

> I think the info that's in the memop value is probably worth
> just printing always, it won't make much difference.

But is the endianness, sign extension, etc. valuable information when the data
is not part of the trace?

If it is, I can easily add that; but there is still some additional
per-argument overhead when generating a TCG call to the exec-time tracing
routine, hence my aim of adding two trace events with different levels of
detail. I haven't measured the impact, though.


> Trying to trace physaddrs is very tricky -- in the case of
> a TLB hit there is no guarantee you can still identify the
> physaddr of what you're accessing (the guest might have
> changed the page tables and not invalidated the TLB).

I was looking at how to modify the soft TLB code to generate that information
for the trace event, but it requires changes at every TCG target. But in any
case, that should always provide the same phys address used by QEMU, so the
event info is "correct" in that sense. Or did I miss something?


Cheers,
  Lluis
Peter Maydell March 19, 2016, 1:59 p.m. UTC | #6
On 18 March 2016 at 18:50, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
> Peter Maydell writes:
>> Trying to trace physaddrs is very tricky -- in the case of
>> a TLB hit there is no guarantee you can still identify the
>> physaddr of what you're accessing (the guest might have
>> changed the page tables and not invalidated the TLB).
>
> I was looking at how to modify the soft TLB code to generate that information
> for the trace event, but it requires changes at every TCG target. But in any
> case, that should always provide the same phys address used by QEMU, so the
> event info is "correct" in that sense. Or did I miss something?

Consider the sequence:
 * guest makes access to vaddr V, currently mapped to physaddr P1
   (which is host address H)
 * we put the mapping V -> H into QEMU's TLB
 * guest changes its page tables so V now maps to P2,
   but doesn't do a TLB flush
 * guest makes another access to vaddr V
 * we hit in QEMU's TLB, so we know V and H; but we don't
   know P1 (because we don't record that in the TLB) and we
   can't even get it by walking the page table because
   at this point V maps to P2, not P1. (And for sw-TLB
   guest archs like MIPS you can't even do a V-to-P lookup
   in QEMU non-intrusively.)

(This is often defined to be unpredictable or similar in the guest
architecture. But a buggy guest might do it, and tracing the
wrong thing would be pretty confusing if you were trying to
track down that bug.)

thanks
-- PMM
Lluís Vilanova March 20, 2016, 6:09 p.m. UTC | #7
Peter Maydell writes:

> On 18 March 2016 at 18:50, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
>> Peter Maydell writes:
>>> Trying to trace physaddrs is very tricky -- in the case of
>>> a TLB hit there is no guarantee you can still identify the
>>> physaddr of what you're accessing (the guest might have
>>> changed the page tables and not invalidated the TLB).
>> 
>> I was looking at how to modify the soft TLB code to generate that information
>> for the trace event, but it requires changes at every TCG target. But in any
>> case, that should always provide the same phys address used by QEMU, so the
>> event info is "correct" in that sense. Or did I miss something?

> Consider the sequence:
>  * guest makes access to vaddr V, currently mapped to physaddr P1
>    (which is host address H)
>  * we put the mapping V -> H into QEMU's TLB
>  * guest changes its page tables so V now maps to P2,
>    but doesn't do a TLB flush
>  * guest makes another access to vaddr V
>  * we hit in QEMU's TLB, so we know V and H; but we don't
>    know P1 (because we don't record that in the TLB) and we
>    can't even get it by walking the page table because
>    at this point V maps to P2, not P1. (And for sw-TLB
>    guest archs like MIPS you can't even do a V-to-P lookup
>    in QEMU non-intrusively.)

> (This is often defined to be unpredictable or similar in the guest
> architecture. But a buggy guest might do it, and tracing the
> wrong thing would be pretty confusing if you were trying to
> track down that bug.)

Oh! Yes, I seem to remember that code path now; I checked it a really long time
ago. I was assuming that whenever this event is enabled at compile time, I would
have to modify QEMU's TLB to store the guest physical address (then used by the
tracing event).


Cheers,
  Lluis
Peter Maydell March 20, 2016, 7:59 p.m. UTC | #8
On 20 March 2016 at 18:09, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
> Oh! Yes, I seem to remember that code path now, I checked it a
> really long time ago. I was assuming that whenever this event is
> enabled at compile time, I would have to modify QEMU's TLB to store
> the guest physical address (then used by the tracing event).

I guess we could maybe put that into the iotlb. You definitely
don't want it in the CPUTLBEntry struct as that one is space
critical for performance. (If you're really lucky you can
reconstruct the physaddr from the iotlb addr field but I suspect
you can't.)

Once you've decided to take the hit of keeping the paddr in the
TLB it's probably faster to just unconditionally store it rather
than doing a "store if trace event X enabled" test.

PS: you probably also want to be able to trace the MemTxAttrs
(which tells you whether you have an S or NS access on ARM,
among other things).

thanks
-- PMM
Lluís Vilanova March 21, 2016, 4:51 p.m. UTC | #9
Peter Maydell writes:

> On 20 March 2016 at 18:09, Lluís Vilanova <vilanova@ac.upc.edu> wrote:
>> Oh! Yes, I seem to remember that code path now, I checked it a
>> really long time ago. I was assuming that whenever this event is
>> enabled at compile time, I would have to modify QEMU's TLB to store
>> the guest physical address (then used by the tracing event).

> I guess we could maybe put that into the iotlb. You definitely
> don't want it in the CPUTLBEntry struct as that one is space
> critical for performance. (If you're really lucky you can
> reconstruct the physaddr from the iotlb addr field but I suspect
> you can't.)

> Once you've decided to take the hit of keeping the paddr in the
> TLB it's probably faster to just unconditionally store it rather
> than doing a "store if trace event X enabled" test.

I meant to make the check at compile time, since we have defines to check which
events are enabled/disabled in trace-events.


> PS: you probably also want to be able to trace the MemTxAttrs
> (which tells you whether you have an S or NS access on ARM,
> among other things).

I'll keep these in mind for a separate series with extended memory info.


Thanks,
  Lluis

Patch

diff --git a/include/exec/cpu_ldst_template.h b/include/exec/cpu_ldst_template.h
index 3091c00..516f378 100644
--- a/include/exec/cpu_ldst_template.h
+++ b/include/exec/cpu_ldst_template.h
@@ -23,6 +23,11 @@ 
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
+
+#if !defined(SOFTMMU_CODE_ACCESS)
+#include "trace.h"
+#endif
+
 #if DATA_SIZE == 8
 #define SUFFIX q
 #define USUFFIX q
@@ -80,6 +85,10 @@  glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
     int mmu_idx;
     TCGMemOpIdx oi;
 
+#if !defined(SOFTMMU_CODE_ACCESS)
+    trace_guest_vmem_exec(ENV_GET_CPU(env), ptr, DATA_SIZE, 0);
+#endif
+
     addr = ptr;
     page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = CPU_MMU_INDEX;
@@ -112,6 +121,10 @@  glue(glue(glue(cpu_lds, SUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
     int mmu_idx;
     TCGMemOpIdx oi;
 
+#if !defined(SOFTMMU_CODE_ACCESS)
+    trace_guest_vmem_exec(ENV_GET_CPU(env), ptr, DATA_SIZE, 0);
+#endif
+
     addr = ptr;
     page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = CPU_MMU_INDEX;
@@ -148,6 +161,10 @@  glue(glue(glue(cpu_st, SUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
     int mmu_idx;
     TCGMemOpIdx oi;
 
+#if !defined(SOFTMMU_CODE_ACCESS)
+    trace_guest_vmem_exec(ENV_GET_CPU(env), ptr, DATA_SIZE, 1);
+#endif
+
     addr = ptr;
     page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = CPU_MMU_INDEX;
diff --git a/include/exec/cpu_ldst_useronly_template.h b/include/exec/cpu_ldst_useronly_template.h
index 040b147..cde3d00 100644
--- a/include/exec/cpu_ldst_useronly_template.h
+++ b/include/exec/cpu_ldst_useronly_template.h
@@ -22,6 +22,11 @@ 
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
+
+#if !defined(CODE_ACCESS)
+#include "trace.h"
+#endif
+
 #if DATA_SIZE == 8
 #define SUFFIX q
 #define USUFFIX q
@@ -53,6 +58,9 @@ 
 static inline RES_TYPE
 glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
 {
+#if !defined(CODE_ACCESS)
+    trace_guest_vmem_exec(ENV_GET_CPU(env), ptr, DATA_SIZE, 0);
+#endif
     return glue(glue(ld, USUFFIX), _p)(g2h(ptr));
 }
 
@@ -68,6 +76,9 @@  glue(glue(glue(cpu_ld, USUFFIX), MEMSUFFIX), _ra)(CPUArchState *env,
 static inline int
 glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
 {
+#if !defined(CODE_ACCESS)
+    trace_guest_vmem_exec(ENV_GET_CPU(env), ptr, DATA_SIZE, 0);
+#endif
     return glue(glue(lds, SUFFIX), _p)(g2h(ptr));
 }
 
@@ -85,6 +96,9 @@  static inline void
 glue(glue(cpu_st, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr,
                                       RES_TYPE v)
 {
+#if !defined(CODE_ACCESS)
+    trace_guest_vmem_exec(ENV_GET_CPU(env), ptr, DATA_SIZE, 1);
+#endif
     glue(glue(st, SUFFIX), _p)(g2h(ptr), v);
 }
 
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index f554b86..789e427 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -25,6 +25,7 @@ 
 #include "qemu/osdep.h"
 #include "tcg.h"
 #include "tcg-op.h"
+#include "trace-tcg.h"
 
 /* Reduce the number of ifdefs below.  This assumes that all uses of
    TCGV_HIGH and TCGV_LOW are properly protected by a conditional that
@@ -1904,22 +1905,44 @@  static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
 #endif
 }
 
-void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+static inline uint8_t tcg_memop_size(TCGMemOp op)
+{
+    return 1 << (op & MO_SIZE);
+}
+
+static inline void do_tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
     memop = tcg_canonicalize_memop(memop, 0, 0);
     gen_ldst_i32(INDEX_op_qemu_ld_i32, val, addr, memop, idx);
 }
 
-void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+{
+    trace_guest_vmem_tcg(tcg_ctx.cpu, tcg_ctx.tcg_env,
+                         addr, tcg_memop_size(memop), 0);
+    do_tcg_gen_qemu_ld_i32(val, addr, idx, memop);
+}
+
+static inline void do_tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
     memop = tcg_canonicalize_memop(memop, 0, 1);
     gen_ldst_i32(INDEX_op_qemu_st_i32, val, addr, memop, idx);
 }
 
+void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+{
+    trace_guest_vmem_tcg(tcg_ctx.cpu, tcg_ctx.tcg_env,
+                         addr, tcg_memop_size(memop), 1);
+    do_tcg_gen_qemu_st_i32(val, addr, idx, memop);
+}
+
 void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
+    trace_guest_vmem_tcg(tcg_ctx.cpu, tcg_ctx.tcg_env,
+                         addr, tcg_memop_size(memop), 0);
+
     if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
-        tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
+        do_tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
         if (memop & MO_SIGN) {
             tcg_gen_sari_i32(TCGV_HIGH(val), TCGV_LOW(val), 31);
         } else {
@@ -1934,8 +1957,11 @@  void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 
 void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
+    trace_guest_vmem_tcg(tcg_ctx.cpu, tcg_ctx.tcg_env,
+                         addr, tcg_memop_size(memop), 1);
+
     if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
-        tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
+        do_tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
         return;
     }
 
diff --git a/trace-events b/trace-events
index f986c81..1088fe0 100644
--- a/trace-events
+++ b/trace-events
@@ -1890,3 +1890,16 @@  qio_channel_command_new_pid(void *ioc, int writefd, int readfd, int pid) "Comman
 qio_channel_command_new_spawn(void *ioc, const char *binary, int flags) "Command new spawn ioc=%p binary=%s flags=%d"
 qio_channel_command_abort(void *ioc, int pid) "Command abort ioc=%p pid=%d"
 qio_channel_command_wait(void *ioc, int pid, int ret, int status) "Command abort ioc=%p pid=%d ret=%d status=%d"
+
+### Guest events, keep at bottom
+
+# @vaddr: Access' virtual address.
+# @size : Access' size (bytes).
+# @store: Whether the access is a store.
+#
+# Start virtual memory access (before any potential access violation).
+#
+# Does not include memory accesses performed by devices.
+#
+# Targets: TCG(all)
+disable vcpu tcg guest_vmem(TCGv vaddr, uint8_t size, uint8_t store) "size=%d store=%d", "vaddr=0x%016"PRIx64" size=%d store=%d"