diff mbox series

[1/2] x86/tlbflush: do not toggle the PGE CR4 bit unless necessary

Message ID 20191125172213.1904-2-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/pvshim: improve tlb flush performance | expand

Commit Message

Roger Pau Monne Nov. 25, 2019, 5:22 p.m. UTC
When PCID is not available Xen does a full tlbflush by toggling the
PGE bit in CR4. This is not necessary if PGE is not enabled, since a
flush can be performed by writing to CR3 in that case.

Change the code in do_tlb_flush to only toggle the PGE bit in CR4 if
it's already enabled, otherwise do the tlb flush by writing to CR3.
This is relevant when running virtualized, since hypervisors don't
usually trap accesses to CR3 when using hardware assisted paging, but
do trap accesses to CR4 specially on AMD hardware, which makes such
accesses much more expensive.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/flushtlb.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Jan Beulich Nov. 29, 2019, 12:04 p.m. UTC | #1
On 25.11.2019 18:22, Roger Pau Monne wrote:
> When PCID is not available Xen does a full tlbflush by toggling the
> PGE bit in CR4. This is not necessary if PGE is not enabled, since a
> flush can be performed by writing to CR3 in that case.
> 
> Change the code in do_tlb_flush to only toggle the PGE bit in CR4 if
> it's already enabled, otherwise do the tlb flush by writing to CR3.
> This is relevant when running virtualized, since hypervisors don't
> usually trap accesses to CR3 when using hardware assisted paging, but
> do trap accesses to CR4 specially on AMD hardware, which makes such
> accesses much more expensive.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one remark:

> --- a/xen/arch/x86/flushtlb.c
> +++ b/xen/arch/x86/flushtlb.c
> @@ -84,6 +84,7 @@ static void post_flush(u32 t)
>  static void do_tlb_flush(void)
>  {
>      unsigned long flags;
> +    unsigned long cr4;

This would better be merged with the adjacent declaration. Can surely
be done while committing.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index c1ae0d9467..540209c856 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -84,6 +84,7 @@  static void post_flush(u32 t)
 static void do_tlb_flush(void)
 {
     unsigned long flags;
+    unsigned long cr4;
     u32 t;
 
     /* This non-reentrant function is sometimes called in interrupt context. */
@@ -93,13 +94,13 @@  static void do_tlb_flush(void)
 
     if ( use_invpcid )
         invpcid_flush_all();
-    else
+    else if ( (cr4 = read_cr4()) & X86_CR4_PGE )
     {
-        unsigned long cr4 = read_cr4();
-
-        write_cr4(cr4 ^ X86_CR4_PGE);
+        write_cr4(cr4 & ~X86_CR4_PGE);
         write_cr4(cr4);
     }
+    else
+        write_cr3(read_cr3());
 
     post_flush(t);