
[v3,2/2] x86/ept: limit calls to memory_type_changed()

Message ID 20220928141117.51351-3-roger.pau@citrix.com (mailing list archive)
State New, archived
Series Move calls to memory_type_changed()

Commit Message

Roger Pau Monné Sept. 28, 2022, 2:11 p.m. UTC
memory_type_changed() is currently only implemented for Intel EPT, and
results in the invalidation of EMT attributes on all the entries in
the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
when the guest tries to access any gfns for the first time, which
results in the recalculation of the EMT for the accessed page.  The
vmexit and the recalculations are expensive, and as such should be
avoided when possible.
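
To make the "when possible" explicit, the following standalone sketch (not
Xen code; the helper name is made up) models the condition under which such
an invalidation is actually needed for a domain:

#include <stdbool.h>

/*
 * Illustrative model only: the EMT special-casing in epte_get_entry_emt()
 * consults cache_flush_permitted() solely when no IOMMU is enabled for the
 * domain, so an update to the iomem/ioport capability rangesets requires a
 * global EMT invalidation only when the value of that predicate may have
 * changed.
 */
static bool emt_invalidation_needed(bool iommu_enabled,
                                    bool flush_permitted_before,
                                    bool flush_permitted_after)
{
    if ( iommu_enabled )
        return false;

    return flush_permitted_before != flush_permitted_after;
}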

Remove the call to memory_type_changed() from
XEN_DOMCTL_memory_mapping: there are no modifications of the
iomem_caps ranges anymore that could alter the return of
cache_flush_permitted() from that domctl.

Encapsulate calls to memory_type_changed() resulting from changes to
the domain iomem_caps or ioport_caps ranges in the helpers themselves
(io{ports,mem}_{permit,deny}_access()), and add a note in
epte_get_entry_emt() to remind that changes to the logic there likely
need to be propagated to the IO capabilities helpers.
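
The resulting caller pattern (a condensed sketch based on the first
xen/arch/x86/domctl.c hunk below; surrounding error handling elided):

/* Before: each caller had to trigger the invalidation itself. */
ret = ioports_permit_access(d, fp, fp + np - 1);
if ( !ret )
    memory_type_changed(d);

/* After: the helper decides internally whether the call is needed. */
ret = ioports_permit_access(d, fp, fp + np - 1);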

Note that changes to the IO ports or memory ranges are not very common
during guest runtime, but Citrix Hypervisor has a use case for them
related to device passthrough.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - Split the Arm side changes into a pre-patch.

Changes since v1:
 - Place the calls to memory_type_changed() inside the
   io{ports,mem}_{permit,deny}_access() helpers.
---
 xen/arch/x86/domctl.c            |  4 ----
 xen/arch/x86/include/asm/iocap.h | 33 +++++++++++++++++++++++----
 xen/arch/x86/mm/p2m-ept.c        |  4 ++++
 xen/common/domctl.c              |  4 ----
 xen/include/xen/iocap.h          | 38 ++++++++++++++++++++++++++++----
 5 files changed, 67 insertions(+), 16 deletions(-)

Comments

Jan Beulich Sept. 29, 2022, 10:13 a.m. UTC | #1
On 28.09.2022 16:11, Roger Pau Monne wrote:
> memory_type_changed() is currently only implemented for Intel EPT, and
> results in the invalidation of EMT attributes on all the entries in
> the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
> when the guest tries to access any gfns for the first time, which
> results in the recalculation of the EMT for the accessed page.  The
> vmexit and the recalculations are expensive, and as such should be
> avoided when possible.
> 
> Remove the call to memory_type_changed() from
> XEN_DOMCTL_memory_mapping: there are no modifications of the
> iomem_caps ranges anymore that could alter the return of
> cache_flush_permitted() from that domctl.
> 
> Encapsulate calls to memory_type_changed() resulting from changes to
> the domain iomem_caps or ioport_caps ranges in the helpers themselves
> (io{ports,mem}_{permit,deny}_access()), and add a note in
> epte_get_entry_emt() to remind that changes to the logic there likely
> need to be propagated to the IO capabilities helpers.
> 
> Note that changes to the IO ports or memory ranges are not very common
> during guest runtime, but Citrix Hypervisor has a use case for them
> related to device passthrough.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one minor remark at the end, which can be taken care of while committing.

> ---
> Changes since v2:
>  - Split the Arm side changes into a pre-patch.

Despite this I'd prefer to have an Arm maintainer's view on this as well. As
previously pointed out, the resulting code is going to be sub-optimal there.

> --- a/xen/include/xen/iocap.h
> +++ b/xen/include/xen/iocap.h
> @@ -7,13 +7,43 @@
>  #ifndef __XEN_IOCAP_H__
>  #define __XEN_IOCAP_H__
>  
> +#include <xen/sched.h>
>  #include <xen/rangeset.h>
>  #include <asm/iocap.h>
> +#include <asm/p2m.h>
> +
> +static inline int iomem_permit_access(struct domain *d, unsigned long s,
> +                                      unsigned long e)
> +{
> +    bool flush = cache_flush_permitted(d);
> +    int ret = rangeset_add_range(d->iomem_caps, s, e);
> +
> +    if ( !ret && !is_iommu_enabled(d) && !flush )
> +        /*
> +         * Only flush if the range(s) are empty before this addition and
> +         * IOMMU is not enabled for the domain, otherwise it makes no
> +         * difference for effective cache attribute calculation purposes.
> +         */
> +        memory_type_changed(d);
> +
> +    return ret;
> +}
> +static inline int iomem_deny_access(struct domain *d, unsigned long s,

A blank line would be nice between these two (and similarly for the
x86-only pair). Omitting such blank lines is imo advisable only for
trivial inline functions.

Jan
Bertrand Marquis Sept. 29, 2022, 12:08 p.m. UTC | #2
Hi Jan,

> On 29 Sep 2022, at 12:13, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 28.09.2022 16:11, Roger Pau Monne wrote:
>> memory_type_changed() is currently only implemented for Intel EPT, and
>> results in the invalidation of EMT attributes on all the entries in
>> the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
>> when the guest tries to access any gfns for the first time, which
>> results in the recalculation of the EMT for the accessed page.  The
>> vmexit and the recalculations are expensive, and as such should be
>> avoided when possible.
>> 
>> Remove the call to memory_type_changed() from
>> XEN_DOMCTL_memory_mapping: there are no modifications of the
>> iomem_caps ranges anymore that could alter the return of
>> cache_flush_permitted() from that domctl.
>> 
>> Encapsulate calls to memory_type_changed() resulting from changes to
>> the domain iomem_caps or ioport_caps ranges in the helpers themselves
>> (io{ports,mem}_{permit,deny}_access()), and add a note in
>> epte_get_entry_emt() to remind that changes to the logic there likely
>> need to be propagated to the IO capabilities helpers.
>> 
>> Note that changes to the IO ports or memory ranges are not very common
>> during guest runtime, but Citrix Hypervisor has a use case for them
>> related to device passthrough.
>> 
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with one minor remark at the end, which can be taken care of while committing.
> 
>> ---
>> Changes since v2:
>> - Split the Arm side changes into a pre-patch.
> 
> Despite this I'd prefer to have an Arm maintainer view on this as well. As
> previously pointed out the resulting code is going to be sub-optimal there.

On Arm none of those will be called at runtime; they happen only during guest
creation, so the potential performance impact is very limited.

Cheers
Bertrand

> 
>> --- a/xen/include/xen/iocap.h
>> +++ b/xen/include/xen/iocap.h
>> @@ -7,13 +7,43 @@
>> #ifndef __XEN_IOCAP_H__
>> #define __XEN_IOCAP_H__
>> 
>> +#include <xen/sched.h>
>> #include <xen/rangeset.h>
>> #include <asm/iocap.h>
>> +#include <asm/p2m.h>
>> +
>> +static inline int iomem_permit_access(struct domain *d, unsigned long s,
>> +                                      unsigned long e)
>> +{
>> +    bool flush = cache_flush_permitted(d);
>> +    int ret = rangeset_add_range(d->iomem_caps, s, e);
>> +
>> +    if ( !ret && !is_iommu_enabled(d) && !flush )
>> +        /*
>> +         * Only flush if the range(s) are empty before this addition and
>> +         * IOMMU is not enabled for the domain, otherwise it makes no
>> +         * difference for effective cache attribute calculation purposes.
>> +         */
>> +        memory_type_changed(d);
>> +
>> +    return ret;
>> +}
>> +static inline int iomem_deny_access(struct domain *d, unsigned long s,
> 
> A blank line would be nice between these two (and similarly for the
> x86-only pair). Omitting such blank lines is imo advisable only for
> trivial inline functions.
> 
> Jan

Patch

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 020df615bd..e9bfbc57a7 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -232,8 +232,6 @@  long arch_do_domctl(
             ret = ioports_permit_access(d, fp, fp + np - 1);
         else
             ret = ioports_deny_access(d, fp, fp + np - 1);
-        if ( !ret )
-            memory_type_changed(d);
         break;
     }
 
@@ -666,8 +664,6 @@  long arch_do_domctl(
                        "ioport_map: error %ld denying dom%d access to [%x,%x]\n",
                        ret, d->domain_id, fmp, fmp + np - 1);
         }
-        if ( !ret )
-            memory_type_changed(d);
         break;
     }
 
diff --git a/xen/arch/x86/include/asm/iocap.h b/xen/arch/x86/include/asm/iocap.h
index eee47228d4..ce83c3d8a4 100644
--- a/xen/arch/x86/include/asm/iocap.h
+++ b/xen/arch/x86/include/asm/iocap.h
@@ -7,10 +7,11 @@ 
 #ifndef __X86_IOCAP_H__
 #define __X86_IOCAP_H__
 
-#define ioports_permit_access(d, s, e)                  \
-    rangeset_add_range((d)->arch.ioport_caps, s, e)
-#define ioports_deny_access(d, s, e)                    \
-    rangeset_remove_range((d)->arch.ioport_caps, s, e)
+#include <xen/sched.h>
+#include <xen/rangeset.h>
+
+#include <asm/p2m.h>
+
 #define ioports_access_permitted(d, s, e)               \
     rangeset_contains_range((d)->arch.ioport_caps, s, e)
 
@@ -18,4 +19,28 @@ 
     (!rangeset_is_empty((d)->iomem_caps) ||             \
      !rangeset_is_empty((d)->arch.ioport_caps))
 
+static inline int ioports_permit_access(struct domain *d, unsigned long s,
+                                        unsigned long e)
+{
+    bool flush = cache_flush_permitted(d);
+    int ret = rangeset_add_range(d->arch.ioport_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !flush )
+        /* See comment in iomem_permit_access(). */
+        memory_type_changed(d);
+
+    return ret;
+}
+static inline int ioports_deny_access(struct domain *d, unsigned long s,
+                                      unsigned long e)
+{
+    int ret = rangeset_remove_range(d->arch.ioport_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
+        /* See comment in iomem_deny_access(). */
+        memory_type_changed(d);
+
+    return ret;
+}
+
 #endif /* __X86_IOCAP_H__ */
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b4919bad51..d61d66c20e 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -518,6 +518,10 @@  int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
+    /*
+     * Conditional must be kept in sync with the code in
+     * {iomem,ioports}_{permit,deny}_access().
+     */
     if ( type != p2m_mmio_direct && !is_iommu_enabled(d) &&
          !cache_flush_permitted(d) )
     {
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 452266710a..69fb9abd34 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -716,8 +716,6 @@  long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
         else
             ret = iomem_deny_access(d, mfn, mfn + nr_mfns - 1);
-        if ( !ret )
-            memory_type_changed(d);
         break;
     }
 
@@ -778,8 +776,6 @@  long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
                        "memory_map: error %ld removing dom%d access to [%lx,%lx]\n",
                        ret, d->domain_id, mfn, mfn_end);
         }
-        /* Do this unconditionally to cover errors on above failure paths. */
-        memory_type_changed(d);
         break;
     }
 
diff --git a/xen/include/xen/iocap.h b/xen/include/xen/iocap.h
index 1ca3858fc0..0ca4c9745f 100644
--- a/xen/include/xen/iocap.h
+++ b/xen/include/xen/iocap.h
@@ -7,13 +7,43 @@ 
 #ifndef __XEN_IOCAP_H__
 #define __XEN_IOCAP_H__
 
+#include <xen/sched.h>
 #include <xen/rangeset.h>
 #include <asm/iocap.h>
+#include <asm/p2m.h>
+
+static inline int iomem_permit_access(struct domain *d, unsigned long s,
+                                      unsigned long e)
+{
+    bool flush = cache_flush_permitted(d);
+    int ret = rangeset_add_range(d->iomem_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !flush )
+        /*
+         * Only flush if the range(s) are empty before this addition and
+         * IOMMU is not enabled for the domain, otherwise it makes no
+         * difference for effective cache attribute calculation purposes.
+         */
+        memory_type_changed(d);
+
+    return ret;
+}
+static inline int iomem_deny_access(struct domain *d, unsigned long s,
+                                    unsigned long e)
+{
+    int ret = rangeset_remove_range(d->iomem_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
+        /*
+         * Only flush if the range(s) are empty after this removal and
+         * IOMMU is not enabled for the domain, otherwise it makes no
+         * difference for effective cache attribute calculation purposes.
+         */
+        memory_type_changed(d);
+
+    return ret;
+}
 
-#define iomem_permit_access(d, s, e)                    \
-    rangeset_add_range((d)->iomem_caps, s, e)
-#define iomem_deny_access(d, s, e)                      \
-    rangeset_remove_range((d)->iomem_caps, s, e)
 #define iomem_access_permitted(d, s, e)                 \
     rangeset_contains_range((d)->iomem_caps, s, e)