
[3/3] x86/ept: force WB cache attributes for grant and foreign maps

Message ID 20210528173935.29919-4-roger.pau@citrix.com (mailing list archive)
State New, archived
Series x86/ept: force WB to foreign and grant mappings

Commit Message

Roger Pau Monné May 28, 2021, 5:39 p.m. UTC
Force WB type for grants and foreign pages. Those are usually mapped
over unpopulated physical ranges in the p2m, and those ranges would
usually be UC in the MTRR state, which is unlikely to be the correct
cache attribute. It's also cumbersome (or even impossible) for the
guest to be setting the MTRR type for all those mappings as WB, as
MTRR ranges are finite.

Note that on AMD we cannot force a cache attribute because of the lack
of ignore PAT equivalent, so the behavior here slightly diverges
between AMD and Intel (or EPT vs NPT/shadow).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vmx/vmx.c        |  2 +-
 xen/arch/x86/mm/p2m-ept.c         | 35 ++++++++++++++++++++++++++-----
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 +-
 3 files changed, 32 insertions(+), 7 deletions(-)
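The selection logic this patch introduces can be modeled as a small standalone function. The sketch below is not Xen code: the enum values and the `-1` "fall through" convention are illustrative stand-ins for the real `p2m_type_t` and the MTRR-based path in `epte_get_entry_emt()`.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for Xen's p2m types and MTRR memory types. */
typedef enum {
    p2m_ram_rw,
    p2m_mmio_direct,
    p2m_grant_map_ro,
    p2m_grant_map_rw,
    p2m_map_foreign,
} p2m_type_t;

#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRBACK     6

/*
 * Model of the logic the patch adds to epte_get_entry_emt(): direct MMIO
 * stays UC, while grant and foreign mappings get WB with the "ignore PAT"
 * (ipat) bit set so guest MTRR/PAT settings cannot override it.  A
 * negative return means "fall through to the MTRR-based logic".
 */
static int emt_for_type(p2m_type_t type, bool *ipat)
{
    *ipat = false;

    switch ( type )
    {
    case p2m_mmio_direct:
        return MTRR_TYPE_UNCACHABLE;

    case p2m_grant_map_ro:
    case p2m_grant_map_rw:
    case p2m_map_foreign:
        *ipat = true;
        return MTRR_TYPE_WRBACK;

    default:
        return -1; /* consult guest/host MTRR state instead */
    }
}
```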

Comments

Jan Beulich May 31, 2021, 7:21 a.m. UTC | #1
On 28.05.2021 19:39, Roger Pau Monne wrote:
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -487,11 +487,12 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
>  }
>  
>  int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
> -                       unsigned int order, bool *ipat, bool direct_mmio)
> +                       unsigned int order, bool *ipat, p2m_type_t type)
>  {
>      int gmtrr_mtype, hmtrr_mtype;
>      struct vcpu *v = current;
>      unsigned long i;
> +    bool direct_mmio = type == p2m_mmio_direct;

I don't think this variable is worthwhile to retain/introduce:

> @@ -535,9 +536,33 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
>          }
>      }
>  
> -    if ( direct_mmio )

With this gone, there's exactly one further use left. Preferably
with this adjustment (which I'd be fine to make while committing, as
long as you and/or the maintainers agree)
Reviewed-by: Jan Beulich <jbeulich@suse.com>

> +    switch ( type )
> +    {
> +    case p2m_mmio_direct:
>          return MTRR_TYPE_UNCACHABLE;

As a largely unrelated note: We really want to find a way to return
WC here for e.g. the frame buffer of graphics cards, all the more so
since hvm_get_mem_pinned_cacheattr() gets invoked only below this
point (unlike at the initial introduction of the function, where it
was called ahead of the direct_mmio check, but still after the
mfn_valid() check, so the results were inconsistent anyway). Perhaps
we should obtain the host MTRR setting for the page (or range) in
question.
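One possible shape of that idea, sketched as standalone C rather than Xen code: look the MFN up in the host's variable-range MTRRs and propagate WC instead of hardwiring UC for direct MMIO. The `mtrr_range` struct and the range array are assumed inputs for illustration, not an existing Xen API.

```c
#include <assert.h>
#include <stdint.h>

#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRCOMB     1

/* Hypothetical: one variable-range MTRR entry as the host might program it. */
struct mtrr_range {
    uint64_t base_mfn, nr_mfns;
    int type;
};

/*
 * If the host marked the range covering this MFN as WC (e.g. a frame
 * buffer), return WC; otherwise keep the current behaviour (UC) as the
 * safe default for direct MMIO.
 */
static int emt_for_direct_mmio(uint64_t mfn,
                               const struct mtrr_range *ranges,
                               unsigned int nr)
{
    for ( unsigned int i = 0; i < nr; i++ )
        if ( mfn >= ranges[i].base_mfn &&
             mfn < ranges[i].base_mfn + ranges[i].nr_mfns &&
             ranges[i].type == MTRR_TYPE_WRCOMB )
            return MTRR_TYPE_WRCOMB;

    return MTRR_TYPE_UNCACHABLE;
}
```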

As to hvm_get_mem_pinned_cacheattr(), XEN_DOMCTL_pin_mem_cacheattr
is documented to be intended to be used on RAM only anyway ...

Jan
Roger Pau Monné June 2, 2021, 9:36 a.m. UTC | #2
On Mon, May 31, 2021 at 09:21:25AM +0200, Jan Beulich wrote:
> On 28.05.2021 19:39, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/mm/p2m-ept.c
> > +++ b/xen/arch/x86/mm/p2m-ept.c
> > @@ -487,11 +487,12 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
> >  }
> >  
> >  int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
> > -                       unsigned int order, bool *ipat, bool direct_mmio)
> > +                       unsigned int order, bool *ipat, p2m_type_t type)
> >  {
> >      int gmtrr_mtype, hmtrr_mtype;
> >      struct vcpu *v = current;
> >      unsigned long i;
> > +    bool direct_mmio = type == p2m_mmio_direct;
> 
> I don't think this variable is worthwhile to retain/introduce:
> 
> > @@ -535,9 +536,33 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
> >          }
> >      }
> >  
> > -    if ( direct_mmio )
> 
> With this gone, there's exactly one further use left. Preferably
> with this adjustment (which I'd be fine to make while committing, as
> long as you and/or the maintainers agree)
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks. That's fine, I was about to drop it but didn't want to introduce any
more changes than necessary.

> 
> > +    switch ( type )
> > +    {
> > +    case p2m_mmio_direct:
> >          return MTRR_TYPE_UNCACHABLE;
> 
> As a largely unrelated note: We really want to find a way to return
> WC here for e.g. the frame buffer of graphics cards, all the more so
> since hvm_get_mem_pinned_cacheattr() gets invoked only below this
> point (unlike at the initial introduction of the function, where it
> was called ahead of the direct_mmio check, but still after the
> mfn_valid() check, so the results were inconsistent anyway). Perhaps
> we should obtain the host MTRR setting for the page (or range) in
> question.
> 
> As to hvm_get_mem_pinned_cacheattr(), XEN_DOMCTL_pin_mem_cacheattr
> is documented to be intended to be used on RAM only anyway ...

I also think we should make epte_get_entry_emt available to all p2m
code so it can partially replace the logic in p2m_type_to_flags to
account for cache attributes. I don't think there's much point in
keeping such different methods for accounting for cache attributes. I
know AMD lacks an ignore PAT equivalent, but there's no reason why p2m
cache attributes calculation should be done differently for AMD and
Intel AFAICT.
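The unification suggested here could take the form of one shared table that both paths consult, with the EPT path additionally honouring a "can force" flag (since only EPT has an ignore-PAT bit). This is a hypothetical sketch of that direction, not existing Xen code; names and layout are invented.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    p2m_ram_rw,
    p2m_mmio_direct,
    p2m_grant_map_rw,
    p2m_map_foreign,
} p2m_type_t;

#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRBACK     6

/*
 * One mapping from p2m type to cache attribute, usable by both the EPT
 * path (which can set ipat to override guest PAT/MTRR) and the
 * NPT/shadow path (which cannot force, so can_force is advisory there).
 */
struct cacheattr {
    int type;        /* MTRR memory type to use */
    bool can_force;  /* EPT only: set ipat to ignore guest PAT */
};

static struct cacheattr p2m_type_cacheattr(p2m_type_t t)
{
    switch ( t )
    {
    case p2m_mmio_direct:
        return (struct cacheattr){ MTRR_TYPE_UNCACHABLE, false };
    case p2m_grant_map_rw:
    case p2m_map_foreign:
        return (struct cacheattr){ MTRR_TYPE_WRBACK, true };
    default:
        return (struct cacheattr){ MTRR_TYPE_WRBACK, false };
    }
}
```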

Roger.
Tian, Kevin June 17, 2021, 9:31 a.m. UTC | #3
> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: Saturday, May 29, 2021 1:40 AM
> 
> Force WB type for grants and foreign pages. Those are usually mapped
> over unpopulated physical ranges in the p2m, and those ranges would
> usually be UC in the MTRR state, which is unlikely to be the correct
> cache attribute. It's also cumbersome (or even impossible) for the
> guest to be setting the MTRR type for all those mappings as WB, as
> MTRR ranges are finite.
> 
> Note that on AMD we cannot force a cache attribute because of the lack
> of ignore PAT equivalent, so the behavior here slightly diverges
> between AMD and Intel (or EPT vs NPT/shadow).
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

btw, an incorrect cache attribute brings functional/performance problems.
It'd be good to explain a bit in the commit msg why this problem doesn't
hurt AMD...

> ---
>  xen/arch/x86/hvm/vmx/vmx.c        |  2 +-
>  xen/arch/x86/mm/p2m-ept.c         | 35 ++++++++++++++++++++++++++-----
>  xen/include/asm-x86/hvm/vmx/vmx.h |  2 +-
>  3 files changed, 32 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 0d4b47681b..e09b7e3af9 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -423,7 +423,7 @@ static void domain_creation_finished(struct domain *d)
>          return;
> 
>      ASSERT(epte_get_entry_emt(d, gfn, apic_access_mfn, 0, &ipat,
> -                              true) == MTRR_TYPE_WRBACK);
> +                              p2m_mmio_direct) == MTRR_TYPE_WRBACK);
>      ASSERT(ipat);
> 
>      if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index f1d1d07e92..59c0325473 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -487,11 +487,12 @@ static int ept_invalidate_emt_range(struct p2m_domain *p2m,
>  }
> 
>  int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
> -                       unsigned int order, bool *ipat, bool direct_mmio)
> +                       unsigned int order, bool *ipat, p2m_type_t type)
>  {
>      int gmtrr_mtype, hmtrr_mtype;
>      struct vcpu *v = current;
>      unsigned long i;
> +    bool direct_mmio = type == p2m_mmio_direct;
> 
>      *ipat = false;
> 
> @@ -535,9 +536,33 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
>          }
>      }
> 
> -    if ( direct_mmio )
> +    switch ( type )
> +    {
> +    case p2m_mmio_direct:
>          return MTRR_TYPE_UNCACHABLE;
> 
> +    case p2m_grant_map_ro:
> +    case p2m_grant_map_rw:
> +    case p2m_map_foreign:
> +        /*
> +         * Force WB type for grants and foreign pages. Those are usually mapped
> +         * over unpopulated physical ranges in the p2m, and those would usually
> +         * be UC in the MTRR state, which is unlikely to be the correct cache
> +         * attribute. It's also cumbersome (or even impossible) for the guest
> +         * to be setting the MTRR type for all those mappings as WB, as MTRR
> +         * ranges are finite.
> +         *
> +         * Note that on AMD we cannot force a cache attribute because of the
> +         * lack of ignore PAT equivalent, so the behavior here slightly
> +         * diverges. See p2m_type_to_flags for the AMD attributes.
> +         */
> +        *ipat = true;
> +        return MTRR_TYPE_WRBACK;
> +
> +    default:
> +        break;
> +    }
> +
>      gmtrr_mtype = hvm_get_mem_pinned_cacheattr(d, gfn, order);
>      if ( gmtrr_mtype >= 0 )
>      {
> @@ -640,7 +665,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>                          continue;
>                      e.emt = epte_get_entry_emt(p2m->domain, _gfn(gfn + i),
>                                                 _mfn(e.mfn), 0, &ipat,
> -                                               e.sa_p2mt == p2m_mmio_direct);
> +                                               e.sa_p2mt);
>                      e.ipat = ipat;
> 
>                      nt = p2m_recalc_type(e.recalc, e.sa_p2mt, p2m, gfn + i);
> @@ -659,7 +684,7 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
>                  int emt = epte_get_entry_emt(p2m->domain, _gfn(gfn),
>                                               _mfn(e.mfn),
>                                               level * EPT_TABLE_ORDER, &ipat,
> -                                             e.sa_p2mt == p2m_mmio_direct);
> +                                             e.sa_p2mt);
>                  bool_t recalc = e.recalc;
> 
>                  if ( recalc && p2m_is_changeable(e.sa_p2mt) )
> @@ -895,7 +920,7 @@ ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
>          bool ipat;
>          int emt = epte_get_entry_emt(p2m->domain, _gfn(gfn), mfn,
>                                       i * EPT_TABLE_ORDER, &ipat,
> -                                     p2mt == p2m_mmio_direct);
> +                                     p2mt);
> 
>          if ( emt >= 0 )
>              new_entry.emt = emt;
> diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
> index f668ee1f09..0deb507490 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmx.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmx.h
> @@ -600,7 +600,7 @@ void ept_p2m_uninit(struct p2m_domain *p2m);
>  void ept_walk_table(struct domain *d, unsigned long gfn);
>  bool_t ept_handle_misconfig(uint64_t gpa);
>  int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
> -                       unsigned int order, bool *ipat, bool direct_mmio);
> +                       unsigned int order, bool *ipat, p2m_type_t type);
>  void setup_ept_dump(void);
>  void p2m_init_altp2m_ept(struct domain *d, unsigned int i);
>  /* Locate an alternate p2m by its EPTP */
> --
> 2.31.1
Roger Pau Monné June 17, 2021, 11:40 a.m. UTC | #4
On Thu, Jun 17, 2021 at 09:31:28AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: Saturday, May 29, 2021 1:40 AM
> > 
> > Force WB type for grants and foreign pages. Those are usually mapped
> > over unpopulated physical ranges in the p2m, and those ranges would
> > usually be UC in the MTRR state, which is unlikely to be the correct
> > cache attribute. It's also cumbersome (or even impossible) for the
> > guest to be setting the MTRR type for all those mappings as WB, as
> > MTRR ranges are finite.
> > 
> > Note that on AMD we cannot force a cache attribute because of the lack
> > of ignore PAT equivalent, so the behavior here slightly diverges
> > between AMD and Intel (or EPT vs NPT/shadow).
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> btw, an incorrect cache attribute brings functional/performance problems.
> It'd be good to explain a bit in the commit msg why this problem doesn't
> hurt AMD...

What about re-writing the last commit paragraph as:

Note that this is not an issue on AMD because WB cache attribute is
already set on grants and foreign mappings in the p2m and MTRR types
are ignored. Also on AMD Xen cannot force a cache attribute because of
the lack of ignore PAT equivalent, so the behavior here slightly
diverges between AMD and Intel (or EPT vs NPT/shadow).

Thanks, Roger.
Jan Beulich June 17, 2021, 11:57 a.m. UTC | #5
On 17.06.2021 13:40, Roger Pau Monné wrote:
> On Thu, Jun 17, 2021 at 09:31:28AM +0000, Tian, Kevin wrote:
>>> From: Roger Pau Monne <roger.pau@citrix.com>
>>> Sent: Saturday, May 29, 2021 1:40 AM
>>>
>>> Force WB type for grants and foreign pages. Those are usually mapped
>>> over unpopulated physical ranges in the p2m, and those ranges would
>>> usually be UC in the MTRR state, which is unlikely to be the correct
>>> cache attribute. It's also cumbersome (or even impossible) for the
>>> guest to be setting the MTRR type for all those mappings as WB, as
>>> MTRR ranges are finite.
>>>
>>> Note that on AMD we cannot force a cache attribute because of the lack
>>> of ignore PAT equivalent, so the behavior here slightly diverges
>>> between AMD and Intel (or EPT vs NPT/shadow).
>>>
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>>
>> btw incorrect cache attribute brings functional/performance problem. 
>> it'd be good to explain a bit why this problem doesn't hurt AMD in the 
>> commit msg...
> 
> What about re-writing the last commit paragraph as:
> 
> Note that this is not an issue on AMD because WB cache attribute is
> already set on grants and foreign mappings in the p2m and MTRR types
> are ignored. Also on AMD Xen cannot force a cache attribute because of
> the lack of ignore PAT equivalent, so the behavior here slightly
> diverges between AMD and Intel (or EPT vs NPT/shadow).

I'll try to remember to swap this in when committing.

Jan

Patch

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 0d4b47681b..e09b7e3af9 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -423,7 +423,7 @@  static void domain_creation_finished(struct domain *d)
         return;
 
     ASSERT(epte_get_entry_emt(d, gfn, apic_access_mfn, 0, &ipat,
-                              true) == MTRR_TYPE_WRBACK);
+                              p2m_mmio_direct) == MTRR_TYPE_WRBACK);
     ASSERT(ipat);
 
     if ( set_mmio_p2m_entry(d, gfn, apic_access_mfn, PAGE_ORDER_4K) )
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index f1d1d07e92..59c0325473 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -487,11 +487,12 @@  static int ept_invalidate_emt_range(struct p2m_domain *p2m,
 }
 
 int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
-                       unsigned int order, bool *ipat, bool direct_mmio)
+                       unsigned int order, bool *ipat, p2m_type_t type)
 {
     int gmtrr_mtype, hmtrr_mtype;
     struct vcpu *v = current;
     unsigned long i;
+    bool direct_mmio = type == p2m_mmio_direct;
 
     *ipat = false;
 
@@ -535,9 +536,33 @@  int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
         }
     }
 
-    if ( direct_mmio )
+    switch ( type )
+    {
+    case p2m_mmio_direct:
         return MTRR_TYPE_UNCACHABLE;
 
+    case p2m_grant_map_ro:
+    case p2m_grant_map_rw:
+    case p2m_map_foreign:
+        /*
+         * Force WB type for grants and foreign pages. Those are usually mapped
+         * over unpopulated physical ranges in the p2m, and those would usually
+         * be UC in the MTRR state, which is unlikely to be the correct cache
+         * attribute. It's also cumbersome (or even impossible) for the guest
+         * to be setting the MTRR type for all those mappings as WB, as MTRR
+         * ranges are finite.
+         *
+         * Note that on AMD we cannot force a cache attribute because of the
+         * lack of ignore PAT equivalent, so the behavior here slightly
+         * diverges. See p2m_type_to_flags for the AMD attributes.
+         */
+        *ipat = true;
+        return MTRR_TYPE_WRBACK;
+
+    default:
+        break;
+    }
+
     gmtrr_mtype = hvm_get_mem_pinned_cacheattr(d, gfn, order);
     if ( gmtrr_mtype >= 0 )
     {
@@ -640,7 +665,7 @@  static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                         continue;
                     e.emt = epte_get_entry_emt(p2m->domain, _gfn(gfn + i),
                                                _mfn(e.mfn), 0, &ipat,
-                                               e.sa_p2mt == p2m_mmio_direct);
+                                               e.sa_p2mt);
                     e.ipat = ipat;
 
                     nt = p2m_recalc_type(e.recalc, e.sa_p2mt, p2m, gfn + i);
@@ -659,7 +684,7 @@  static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                 int emt = epte_get_entry_emt(p2m->domain, _gfn(gfn),
                                              _mfn(e.mfn),
                                              level * EPT_TABLE_ORDER, &ipat,
-                                             e.sa_p2mt == p2m_mmio_direct);
+                                             e.sa_p2mt);
                 bool_t recalc = e.recalc;
 
                 if ( recalc && p2m_is_changeable(e.sa_p2mt) )
@@ -895,7 +920,7 @@  ept_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
         bool ipat;
         int emt = epte_get_entry_emt(p2m->domain, _gfn(gfn), mfn,
                                      i * EPT_TABLE_ORDER, &ipat,
-                                     p2mt == p2m_mmio_direct);
+                                     p2mt);
 
         if ( emt >= 0 )
             new_entry.emt = emt;
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index f668ee1f09..0deb507490 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -600,7 +600,7 @@  void ept_p2m_uninit(struct p2m_domain *p2m);
 void ept_walk_table(struct domain *d, unsigned long gfn);
 bool_t ept_handle_misconfig(uint64_t gpa);
 int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
-                       unsigned int order, bool *ipat, bool direct_mmio);
+                       unsigned int order, bool *ipat, p2m_type_t type);
 void setup_ept_dump(void);
 void p2m_init_altp2m_ept(struct domain *d, unsigned int i);
 /* Locate an alternate p2m by its EPTP */