diff mbox series

x86/monitor: add option to disable Xen's pagetable walking on events

Message ID 992f16e8331363f4bc1eef49763810948ad5fff2.1609700210.git.tamas@tklengyel.com (mailing list archive)
State New, archived
Headers show
Series x86/monitor: add option to disable Xen's pagetable walking on events | expand

Commit Message

Tamas K Lengyel Jan. 3, 2021, 7:01 p.m. UTC
Add option to the monitor interface to disable walking of the guest pagetable
on certain events. This is a performance optimization for tools that never
require that information or prefer to do it themselves. For example LibVMI
maintains a virtual TLB which is faster to lookup then what Xen does here.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
---
 xen/arch/x86/hvm/monitor.c    | 3 +++
 xen/common/monitor.c          | 3 +++
 xen/include/asm-arm/monitor.h | 7 +++++++
 xen/include/asm-x86/domain.h  | 1 +
 xen/include/asm-x86/monitor.h | 6 ++++++
 xen/include/public/domctl.h   | 1 +
 6 files changed, 21 insertions(+)

Comments

Andrew Cooper Jan. 4, 2021, 1:04 p.m. UTC | #1
On 03/01/2021 19:01, Tamas K Lengyel wrote:
> Add option to the monitor interface to disable walking of the guest pagetable
> on certain events. This is a performance optimization for tools that never
> require that information or prefer to do it themselves. For example LibVMI
> maintains a virtual TLB which is faster to lookup then what Xen does here.

There is no plausible way that a remote agent can do this (correctly)
faster than Xen can.  Even if you foreign map the entire VM up front,
and track every PTE write (to maintain the vTLB properly), the best you
can achieve is the same speed as Xen, but that would also require
intercepting the TLB management instructions which isn't available in
the monitor API.

Also, there is an important side effect of setting A/D bits which libVMI
doesn't handle, but is relevant for gla-not-valid faults.


I accept that "not doing things the agent doesn't care about" is a valid
reason, but this isn't the only place where a pagewalk occurs, and some
cases require pagewalks before we can even generate the event (e.g. LMSW
on AMD for CR0 monitoring).

As such, I don't think "disable pagewalks" is something we can actually
do.  Wouldn't it be better to call this "auto translate rip to gfn" or
similar, seeing as it is ancillary information?

~Andrew
Tamas K Lengyel Jan. 4, 2021, 1:24 p.m. UTC | #2
On Mon, Jan 4, 2021 at 8:04 AM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 03/01/2021 19:01, Tamas K Lengyel wrote:
> > Add option to the monitor interface to disable walking of the guest pagetable
> > on certain events. This is a performance optimization for tools that never
> > require that information or prefer to do it themselves. For example LibVMI
> > maintains a virtual TLB which is faster to lookup then what Xen does here.
>
> There is no plausible way that a remote agent can do this (correctly)
> faster than Xen can.  Even if you foreign map the entire VM up front,
> and track every PTE write (to maintain the vTLB properly), the best you
> can achieve is the same speed as Xen, but that would also require
> intercepting the TLB management instructions which isn't available in
> the monitor API.

When the value is in the vTLB, especially for nested addresses, it is
faster. To maintain the vTLB properly is a hurdle but that's a
separate issue and may not be relevant to all use-cases. Can also be
done by the way without having to track every PTE write simply by
trapping on the OS functions known for updating/moving the pagetable.

>
> Also, there is an important side effect of setting A/D bits which libVMI
> doesn't handle, but is relevant for gla-not-valid faults.

Care to elaborate?

>
> I accept that "not doing things the agent doesn't care about" is a valid
> reason, but this isn't the only place where a pagewalk occurs, and some
> cases require pagewalks before we can even generate the event (e.g. LMSW
> on AMD for CR0 monitoring).
>
> As such, I don't think "disable pagewalks" is something we can actually
> do.  Wouldn't it be better to call this "auto translate rip to gfn" or
> similar, seeing as it is ancillary information?

I don't much care about what it's called, but
XEN_DOMCTL_MONITOR_OP_FLAG_DISABLE_AUTO_TRANSLATE_RIP_TO_GFN is a
mouthful.

Tamas
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/monitor.c b/xen/arch/x86/hvm/monitor.c
index e4a09964a0..4c8272ab11 100644
--- a/xen/arch/x86/hvm/monitor.c
+++ b/xen/arch/x86/hvm/monitor.c
@@ -124,6 +124,9 @@  static inline unsigned long gfn_of_rip(unsigned long rip)
     struct segment_register sreg;
     uint32_t pfec = PFEC_page_present | PFEC_insn_fetch;
 
+    if ( curr->domain->arch.monitor.disable_ptwalks )
+        return 0;
+
     if ( hvm_get_cpl(curr) == 3 )
         pfec |= PFEC_user_mode;
 
diff --git a/xen/common/monitor.c b/xen/common/monitor.c
index d5c9ff1cbf..3a808ae7e3 100644
--- a/xen/common/monitor.c
+++ b/xen/common/monitor.c
@@ -58,6 +58,9 @@  int monitor_domctl(struct domain *d, struct xen_domctl_monitor_op *mop)
         mop->event = arch_monitor_get_capabilities(d);
         return 0;
 
+    case XEN_DOMCTL_MONITOR_OP_DISABLE_PTWALKS:
+        return arch_monitor_disable_ptwalks(d);
+
     default:
         /* The monitor op is probably handled on the arch-side. */
         return arch_monitor_domctl_op(d, mop);
diff --git a/xen/include/asm-arm/monitor.h b/xen/include/asm-arm/monitor.h
index 7567be66bd..1398a3ee26 100644
--- a/xen/include/asm-arm/monitor.h
+++ b/xen/include/asm-arm/monitor.h
@@ -63,6 +63,13 @@  static inline uint32_t arch_monitor_get_capabilities(struct domain *d)
     return capabilities;
 }
 
+static inline
+int arch_monitor_disable_ptwalks(struct domain *d)
+{
+    /* Not supported on ARM. */
+    return -EOPNOTSUPP;
+}
+
 int monitor_smc(void);
 
 #endif /* __ASM_ARM_MONITOR_H__ */
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 3900d7b48b..83f210c750 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -430,6 +430,7 @@  struct arch_domain
          */
         unsigned int inguest_pagefault_disabled                            : 1;
         unsigned int control_register_values                               : 1;
+        unsigned int disable_ptwalks                                       : 1;
         struct monitor_msr_bitmap *msr_bitmap;
         uint64_t write_ctrlreg_mask[4];
     } monitor;
diff --git a/xen/include/asm-x86/monitor.h b/xen/include/asm-x86/monitor.h
index 01c6d63bb9..d9e53a0499 100644
--- a/xen/include/asm-x86/monitor.h
+++ b/xen/include/asm-x86/monitor.h
@@ -100,6 +100,12 @@  static inline uint32_t arch_monitor_get_capabilities(struct domain *d)
     return capabilities;
 }
 
+static inline int arch_monitor_disable_ptwalks(struct domain *d)
+{
+    d->arch.monitor.disable_ptwalks = true;
+    return 0;
+}
+
 int arch_monitor_domctl_event(struct domain *d,
                               struct xen_domctl_monitor_op *mop);
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 666aeb71bf..a75b731a57 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1028,6 +1028,7 @@  struct xen_domctl_psr_cmt_op {
  * to ensure all vCPUs have resumed before it is safe to turn it off.
  */
 #define XEN_DOMCTL_MONITOR_OP_CONTROL_REGISTERS 4
+#define XEN_DOMCTL_MONITOR_OP_DISABLE_PTWALKS   5
 
 #define XEN_DOMCTL_MONITOR_EVENT_WRITE_CTRLREG         0
 #define XEN_DOMCTL_MONITOR_EVENT_MOV_TO_MSR            1