Message ID | 20200127181115.82709-7-roger.pau@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86: improve assisted tlb flush and use it in guest mode | expand |
On Mon, Jan 27, 2020 at 07:11:14PM +0100, Roger Pau Monne wrote: > The TLB clock is helpful when running Xen on bare metal because when > doing a TLB flush each CPU is IPI'ed and can keep a timestamp of the > last flush. > > This is not the case however when Xen is running virtualized, and the > underlying hypervisor provides mechanism to assist in performing TLB > flushes: Xen itself for example offers a HVMOP_flush_tlbs hypercall in > order to perform a TLB flush without having to IPI each CPU. When > using such mechanisms it's no longer possible to keep a timestamp of > the flushes on each CPU, as they are performed by the underlying > hypervisor. > > Offer a boolean in order to signal Xen that the timestamped TLB > shouldn't be used. This avoids keeping the timestamps of the flushes, > and also forces NEED_FLUSH to always return true. > > No functional change intended, as this change doesn't introduce any > user that disables the timestamped TLB. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > xen/arch/x86/flushtlb.c | 19 +++++++++++++------ > xen/include/asm-x86/flushtlb.h | 17 ++++++++++++++++- > 2 files changed, 29 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c > index e7ccd4ec7b..3649900793 100644 > --- a/xen/arch/x86/flushtlb.c > +++ b/xen/arch/x86/flushtlb.c > @@ -32,6 +32,9 @@ > u32 tlbflush_clock = 1U; > DEFINE_PER_CPU(u32, tlbflush_time); > > +/* Signals whether the TLB flush clock is in use. */ > +bool __read_mostly tlb_clk_enabled = true; > + > /* > * pre_flush(): Increment the virtual TLB-flush clock. Returns new clock value. > * > @@ -82,12 +85,13 @@ static void post_flush(u32 t) > static void do_tlb_flush(void) > { > unsigned long flags, cr4; > - u32 t; > + u32 t = 0; > > /* This non-reentrant function is sometimes called in interrupt context. */ > local_irq_save(flags); > > - t = pre_flush(); > + if ( tlb_clk_enabled ) > + t = pre_flush(); I think it makes more sense to push the check to pre_flush and post_flush -- they are the ones that care about the clock, after all. This also has the effect of making this patch a bit shorter, I think. Wei.
On Thu, Feb 06, 2020 at 01:31:34PM +0000, Wei Liu wrote: > On Mon, Jan 27, 2020 at 07:11:14PM +0100, Roger Pau Monne wrote: > > The TLB clock is helpful when running Xen on bare metal because when > > doing a TLB flush each CPU is IPI'ed and can keep a timestamp of the > > last flush. > > > > This is not the case however when Xen is running virtualized, and the > > underlying hypervisor provides mechanism to assist in performing TLB > > flushes: Xen itself for example offers a HVMOP_flush_tlbs hypercall in > > order to perform a TLB flush without having to IPI each CPU. When > > using such mechanisms it's no longer possible to keep a timestamp of > > the flushes on each CPU, as they are performed by the underlying > > hypervisor. > > > > Offer a boolean in order to signal Xen that the timestamped TLB > > shouldn't be used. This avoids keeping the timestamps of the flushes, > > and also forces NEED_FLUSH to always return true. > > > > No functional change intended, as this change doesn't introduce any > > user that disables the timestamped TLB. > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > --- > > xen/arch/x86/flushtlb.c | 19 +++++++++++++------ > > xen/include/asm-x86/flushtlb.h | 17 ++++++++++++++++- > > 2 files changed, 29 insertions(+), 7 deletions(-) > > > > diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c > > index e7ccd4ec7b..3649900793 100644 > > --- a/xen/arch/x86/flushtlb.c > > +++ b/xen/arch/x86/flushtlb.c > > @@ -32,6 +32,9 @@ > > u32 tlbflush_clock = 1U; > > DEFINE_PER_CPU(u32, tlbflush_time); > > > > +/* Signals whether the TLB flush clock is in use. */ > > +bool __read_mostly tlb_clk_enabled = true; > > + > > /* > > * pre_flush(): Increment the virtual TLB-flush clock. Returns new clock value. > > * > > @@ -82,12 +85,13 @@ static void post_flush(u32 t) > > static void do_tlb_flush(void) > > { > > unsigned long flags, cr4; > > - u32 t; > > + u32 t = 0; > > > > /* This non-reentrant function is sometimes called in interrupt context. */ > > local_irq_save(flags); > > > > - t = pre_flush(); > > + if ( tlb_clk_enabled ) > > + t = pre_flush(); > > I think it makes more sense to push the check to pre_flush and > post_flush -- they are the ones that care about the clock, after all. > > This also has the effect of making this patch a bit shorter, I think. I added the check here in order to avoid the call just to return 0. This is a hot path, so I thought that a check here would be less expensive that a function call when the TLB stamps are disabled. I don't mind moving it inside {pre/post}_flush functions if that's the consensus. Thanks, Roger.
diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c index e7ccd4ec7b..3649900793 100644 --- a/xen/arch/x86/flushtlb.c +++ b/xen/arch/x86/flushtlb.c @@ -32,6 +32,9 @@ u32 tlbflush_clock = 1U; DEFINE_PER_CPU(u32, tlbflush_time); +/* Signals whether the TLB flush clock is in use. */ +bool __read_mostly tlb_clk_enabled = true; + /* * pre_flush(): Increment the virtual TLB-flush clock. Returns new clock value. * @@ -82,12 +85,13 @@ static void post_flush(u32 t) static void do_tlb_flush(void) { unsigned long flags, cr4; - u32 t; + u32 t = 0; /* This non-reentrant function is sometimes called in interrupt context. */ local_irq_save(flags); - t = pre_flush(); + if ( tlb_clk_enabled ) + t = pre_flush(); if ( use_invpcid ) invpcid_flush_all(); @@ -99,7 +103,8 @@ static void do_tlb_flush(void) else write_cr3(read_cr3()); - post_flush(t); + if ( tlb_clk_enabled ) + post_flush(t); local_irq_restore(flags); } @@ -107,7 +112,7 @@ static void do_tlb_flush(void) void switch_cr3_cr4(unsigned long cr3, unsigned long cr4) { unsigned long flags, old_cr4; - u32 t; + u32 t = 0; /* Throughout this function we make this assumption: */ ASSERT(!(cr4 & X86_CR4_PCIDE) || !(cr4 & X86_CR4_PGE)); @@ -115,7 +120,8 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4) /* This non-reentrant function is sometimes called in interrupt context. */ local_irq_save(flags); - t = pre_flush(); + if ( tlb_clk_enabled ) + t = pre_flush(); old_cr4 = read_cr4(); ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE)); @@ -167,7 +173,8 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4) if ( cr4 & X86_CR4_PCIDE ) invpcid_flush_all_nonglobals(); - post_flush(t); + if ( tlb_clk_enabled ) + post_flush(t); local_irq_restore(flags); } diff --git a/xen/include/asm-x86/flushtlb.h b/xen/include/asm-x86/flushtlb.h index 07f9bc6103..9773014320 100644 --- a/xen/include/asm-x86/flushtlb.h +++ b/xen/include/asm-x86/flushtlb.h @@ -21,10 +21,21 @@ extern u32 tlbflush_clock; /* Time at which each CPU's TLB was last flushed. */ DECLARE_PER_CPU(u32, tlbflush_time); -#define tlbflush_current_time() tlbflush_clock +/* TLB clock is in use. */ +extern bool tlb_clk_enabled; + +static inline uint32_t tlbflush_current_time(void) +{ + /* Returning 0 from tlbflush_current_time will always force a flush. */ + return tlb_clk_enabled ? tlbflush_clock : 0; +} static inline void page_set_tlbflush_timestamp(struct page_info *page) { + /* Avoid the write if the TLB clock is disabled. */ + if ( !tlb_clk_enabled ) + return; + /* * Prevent storing a stale time stamp, which could happen if an update * to tlbflush_clock plus a subsequent flush IPI happen between the @@ -67,6 +78,10 @@ static inline void tlbflush_filter(cpumask_t *mask, uint32_t page_timestamp) { unsigned int cpu; + /* Short-circuit: there's no need to iterate if the clock is disabled. */ + if ( !tlb_clk_enabled ) + return; + for_each_cpu ( cpu, mask ) if ( !NEED_FLUSH(per_cpu(tlbflush_time, cpu), page_timestamp) ) __cpumask_clear_cpu(cpu, mask);
The TLB clock is helpful when running Xen on bare metal because when doing a TLB flush each CPU is IPI'ed and can keep a timestamp of the last flush. This is not the case however when Xen is running virtualized, and the underlying hypervisor provides mechanism to assist in performing TLB flushes: Xen itself for example offers a HVMOP_flush_tlbs hypercall in order to perform a TLB flush without having to IPI each CPU. When using such mechanisms it's no longer possible to keep a timestamp of the flushes on each CPU, as they are performed by the underlying hypervisor. Offer a boolean in order to signal Xen that the timestamped TLB shouldn't be used. This avoids keeping the timestamps of the flushes, and also forces NEED_FLUSH to always return true. No functional change intended, as this change doesn't introduce any user that disables the timestamped TLB. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/flushtlb.c | 19 +++++++++++++------ xen/include/asm-x86/flushtlb.h | 17 ++++++++++++++++- 2 files changed, 29 insertions(+), 7 deletions(-)