diff mbox series

[v8,1/3] x86/tlb: introduce a flush HVM ASIDs flag

Message ID 20200320184240.41769-2-roger.pau@citrix.com (mailing list archive)
State Superseded
Headers show
Series x86/guest: use assisted TLB flush in guest mode | expand

Commit Message

Roger Pau Monné March 20, 2020, 6:42 p.m. UTC
Introduce a specific flag to request a HVM guest linear TLB flush,
which is an ASID/VPID tickle that forces a guest linear to guest
physical TLB flush for all HVM guests.

This was previously unconditionally done in each pre_flush call, but
that's not required: HVM guests not using shadow don't require linear
TLB flushes as Xen doesn't modify the guest page tables in that case
(ie: when using HAP). Note that shadow paging code already takes care
of issuing the necessary flushes when the shadow page tables are
modified.

In order to keep the previous behavior modify all shadow code TLB
flushes to also flush the guest linear to physical TLB. I haven't
looked at each specific shadow code TLB flush in order to figure out
whether it actually requires a guest TLB flush or not, so there might
be room for improvement in that regard.

Also perform ASID/VPIT flushes when modifying the p2m tables as it's a
requirement for AMD hardware. Finally keep the flush in
switch_cr3_cr4, as it's not clear whether code could rely on
switch_cr3_cr4 also performing a guest linear TLB flush. A following
patch can remove the ASID/VPIT tickle from switch_cr3_cr4 if found to
not be necessary.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v7:
 - Do not perform an ASID flush in filtered_flush_tlb_mask: the
   requested flush is related to the page need_tlbflush field and not
   to p2m changes (applies to both callers).

Changes since v6:
 - Add ASID/VPIT flushes when modifying the p2m.
 - Keep the ASID/VPIT flush in switch_cr3_cr4.

Changes since v5:
 - Rename FLUSH_GUESTS_TLB to FLUSH_HVM_ASID_CORE.
 - Clarify commit message.
 - Define FLUSH_HVM_ASID_CORE to 0 when !CONFIG_HVM.
---
 xen/arch/x86/flushtlb.c          |  6 ++++--
 xen/arch/x86/mm/hap/hap.c        |  8 ++++----
 xen/arch/x86/mm/hap/nested_hap.c |  2 +-
 xen/arch/x86/mm/p2m-pt.c         |  3 ++-
 xen/arch/x86/mm/paging.c         |  2 +-
 xen/arch/x86/mm/shadow/common.c  | 18 +++++++++---------
 xen/arch/x86/mm/shadow/hvm.c     |  2 +-
 xen/arch/x86/mm/shadow/multi.c   | 16 ++++++++--------
 xen/include/asm-x86/flushtlb.h   |  6 ++++++
 9 files changed, 36 insertions(+), 27 deletions(-)

Comments

Wei Liu March 29, 2020, 2:52 p.m. UTC | #1
On Fri, Mar 20, 2020 at 07:42:38PM +0100, Roger Pau Monne wrote:
> Introduce a specific flag to request a HVM guest linear TLB flush,
> which is an ASID/VPID tickle that forces a guest linear to guest
> physical TLB flush for all HVM guests.
> 
> This was previously unconditionally done in each pre_flush call, but
> that's not required: HVM guests not using shadow don't require linear
> TLB flushes as Xen doesn't modify the guest page tables in that case
> (ie: when using HAP). Note that shadow paging code already takes care
> of issuing the necessary flushes when the shadow page tables are
> modified.
> 
> In order to keep the previous behavior modify all shadow code TLB
> flushes to also flush the guest linear to physical TLB. I haven't
> looked at each specific shadow code TLB flush in order to figure out
> whether it actually requires a guest TLB flush or not, so there might
> be room for improvement in that regard.
> 
> Also perform ASID/VPIT flushes when modifying the p2m tables as it's a
> requirement for AMD hardware. Finally keep the flush in
> switch_cr3_cr4, as it's not clear whether code could rely on
> switch_cr3_cr4 also performing a guest linear TLB flush. A following
> patch can remove the ASID/VPIT tickle from switch_cr3_cr4 if found to
> not be necessary.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

As far as I can tell all previous comments are addressed:

Reviewed-by: Wei Liu <wl@xen.org>
Jan Beulich March 31, 2020, 3:40 p.m. UTC | #2
On 20.03.2020 19:42, Roger Pau Monne wrote:
> Introduce a specific flag to request a HVM guest linear TLB flush,
> which is an ASID/VPID tickle that forces a guest linear to guest
> physical TLB flush for all HVM guests.
> 
> This was previously unconditionally done in each pre_flush call, but
> that's not required: HVM guests not using shadow don't require linear
> TLB flushes as Xen doesn't modify the guest page tables in that case
> (ie: when using HAP). Note that shadow paging code already takes care
> of issuing the necessary flushes when the shadow page tables are
> modified.
> 
> In order to keep the previous behavior modify all shadow code TLB
> flushes to also flush the guest linear to physical TLB. I haven't
> looked at each specific shadow code TLB flush in order to figure out
> whether it actually requires a guest TLB flush or not, so there might
> be room for improvement in that regard.
> 
> Also perform ASID/VPIT flushes when modifying the p2m tables as it's a
> requirement for AMD hardware. Finally keep the flush in
> switch_cr3_cr4, as it's not clear whether code could rely on
> switch_cr3_cr4 also performing a guest linear TLB flush. A following
> patch can remove the ASID/VPIT tickle from switch_cr3_cr4 if found to
> not be necessary.

s/VPIT/VPID/ in this paragraph?

> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
>                                    p2m_ram_rw, p2m_ram_logdirty);
>  
> -            flush_tlb_mask(d->dirty_cpumask);
> +            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>  
>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
>          }
> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>           * to be read-only, or via hardware-assisted log-dirty.
>           */
>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> -        flush_tlb_mask(d->dirty_cpumask);
> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>      }
>      return 0;
>  }
> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
>       * be read-only, or via hardware-assisted log-dirty.
>       */
>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> -    flush_tlb_mask(d->dirty_cpumask);
> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>  }
>  
>  /************************************************/
> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
>  
>      safe_write_pte(p, new);
>      if ( old_flags & _PAGE_PRESENT )
> -        flush_tlb_mask(d->dirty_cpumask);
> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);

For all four - why FLUSH_TLB? Doesn't the flushing here solely care
about guest translations?

> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>      safe_write_pte(p, new);
>  
>      if (old_flags & _PAGE_PRESENT)
> -        flush_tlb_mask(p2m->dirty_cpumask);
> +        flush_mask(p2m->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);

Same here then I guess.

> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -896,7 +896,8 @@ static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
>      unmap_domain_page(tab);
>  
>      if ( changed )
> -         flush_tlb_mask(p2m->domain->dirty_cpumask);
> +         flush_mask(p2m->domain->dirty_cpumask,
> +                    FLUSH_TLB | FLUSH_HVM_ASID_CORE);

Given that this code is used in shadow mode as well, perhaps
better to keep it here. Albeit maybe FLUSH_TLB could be dependent
upon !hap_enabled()?

> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -613,7 +613,7 @@ void paging_log_dirty_range(struct domain *d,
>  
>      p2m_unlock(p2m);
>  
> -    flush_tlb_mask(d->dirty_cpumask);
> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);

Same here?

> @@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
>                                 pagetable_get_mfn(v->arch.shadow_table[i]), 0);
>  
>      /* Make sure everyone sees the unshadowings */
> -    flush_tlb_mask(d->dirty_cpumask);
> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);

Taking this as example, wouldn't it be more consistent overall if
paths not being HVM-only would specify FLUSH_HVM_ASID_CORE only
for HVM domains?

Also, seeing the large number of conversions, perhaps have another
wrapper, e.g. flush_tlb_mask_hvm(), at least for the cases where
both flags get specified unconditionally?

Jan
Roger Pau Monné March 31, 2020, 4:45 p.m. UTC | #3
On Tue, Mar 31, 2020 at 05:40:59PM +0200, Jan Beulich wrote:
> On 20.03.2020 19:42, Roger Pau Monne wrote:
> > Introduce a specific flag to request a HVM guest linear TLB flush,
> > which is an ASID/VPID tickle that forces a guest linear to guest
> > physical TLB flush for all HVM guests.
> > 
> > This was previously unconditionally done in each pre_flush call, but
> > that's not required: HVM guests not using shadow don't require linear
> > TLB flushes as Xen doesn't modify the guest page tables in that case
> > (ie: when using HAP). Note that shadow paging code already takes care
> > of issuing the necessary flushes when the shadow page tables are
> > modified.
> > 
> > In order to keep the previous behavior modify all shadow code TLB
> > flushes to also flush the guest linear to physical TLB. I haven't
> > looked at each specific shadow code TLB flush in order to figure out
> > whether it actually requires a guest TLB flush or not, so there might
> > be room for improvement in that regard.
> > 
> > Also perform ASID/VPIT flushes when modifying the p2m tables as it's a
> > requirement for AMD hardware. Finally keep the flush in
> > switch_cr3_cr4, as it's not clear whether code could rely on
> > switch_cr3_cr4 also performing a guest linear TLB flush. A following
> > patch can remove the ASID/VPIT tickle from switch_cr3_cr4 if found to
> > not be necessary.
> 
> s/VPIT/VPID/ in this paragraph?

Right, sorry.

> > --- a/xen/arch/x86/mm/hap/hap.c
> > +++ b/xen/arch/x86/mm/hap/hap.c
> > @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >                                    p2m_ram_rw, p2m_ram_logdirty);
> >  
> > -            flush_tlb_mask(d->dirty_cpumask);
> > +            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >  
> >              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >          }
> > @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >           * to be read-only, or via hardware-assisted log-dirty.
> >           */
> >          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> > -        flush_tlb_mask(d->dirty_cpumask);
> > +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >      }
> >      return 0;
> >  }
> > @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >       * be read-only, or via hardware-assisted log-dirty.
> >       */
> >      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> > -    flush_tlb_mask(d->dirty_cpumask);
> > +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >  }
> >  
> >  /************************************************/
> > @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >  
> >      safe_write_pte(p, new);
> >      if ( old_flags & _PAGE_PRESENT )
> > -        flush_tlb_mask(d->dirty_cpumask);
> > +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> 
> For all four - why FLUSH_TLB? Doesn't the flushing here solely care
> about guest translations?

Not on AMD, at least to my understanding, the AMD SDM states:

"If a hypervisor modifies a nested page table by decreasing permission
levels, clearing present bits, or changing address translations and
intends to return to the same ASID, it should use either TLB command
011b or 001b."

It's in section 15.16.1.

This to my understanding implies that on AMD hardware modifications to
the NPT require an ASID flush. I assume that on AMD ASIDs also cache
combined translations, guest linear -> host physical.

In fact without doing such flushes when modifying the nested page
tables XenRT was seeing multiple issues on AMD hardware.

> > --- a/xen/arch/x86/mm/hap/nested_hap.c
> > +++ b/xen/arch/x86/mm/hap/nested_hap.c
> > @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
> >      safe_write_pte(p, new);
> >  
> >      if (old_flags & _PAGE_PRESENT)
> > -        flush_tlb_mask(p2m->dirty_cpumask);
> > +        flush_mask(p2m->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> 
> Same here then I guess.
> 
> > --- a/xen/arch/x86/mm/p2m-pt.c
> > +++ b/xen/arch/x86/mm/p2m-pt.c
> > @@ -896,7 +896,8 @@ static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
> >      unmap_domain_page(tab);
> >  
> >      if ( changed )
> > -         flush_tlb_mask(p2m->domain->dirty_cpumask);
> > +         flush_mask(p2m->domain->dirty_cpumask,
> > +                    FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> 
> Given that this code is used in shadow mode as well, perhaps
> better to keep it here. Albeit maybe FLUSH_TLB could be dependent
> upon !hap_enabled()?
> 
> > --- a/xen/arch/x86/mm/paging.c
> > +++ b/xen/arch/x86/mm/paging.c
> > @@ -613,7 +613,7 @@ void paging_log_dirty_range(struct domain *d,
> >  
> >      p2m_unlock(p2m);
> >  
> > -    flush_tlb_mask(d->dirty_cpumask);
> > +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> 
> Same here?

I'm fine with doing further refinements, but I would like to be on the
conservative side and keep such flushes.

> > @@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
> >                                 pagetable_get_mfn(v->arch.shadow_table[i]), 0);
> >  
> >      /* Make sure everyone sees the unshadowings */
> > -    flush_tlb_mask(d->dirty_cpumask);
> > +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> 
> Taking this as example, wouldn't it be more consistent overall if
> paths not being HVM-only would specify FLUSH_HVM_ASID_CORE only
> for HVM domains?

I think there's indeed room for improvement here, as it's likely
possible to drop some of the ASID/VPID flushes. Given that previous to
this patch we would flush ASIDs on every TLB flush I think the current
approach is safe, and as said above further improvements can be done
afterwards.

> Also, seeing the large number of conversions, perhaps have another
> wrapper, e.g. flush_tlb_mask_hvm(), at least for the cases where
> both flags get specified unconditionally?

That's fine for me, if you agree with the proposed naming
(flush_tlb_mask_hvm) I'm happy to introduce the helper.

Thanks, Roger.
Jan Beulich April 1, 2020, 6:34 a.m. UTC | #4
On 31.03.2020 18:45, Roger Pau Monné wrote:
> On Tue, Mar 31, 2020 at 05:40:59PM +0200, Jan Beulich wrote:
>> On 20.03.2020 19:42, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/mm/hap/hap.c
>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
>>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
>>>                                    p2m_ram_rw, p2m_ram_logdirty);
>>>  
>>> -            flush_tlb_mask(d->dirty_cpumask);
>>> +            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>>  
>>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
>>>          }
>>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>>>           * to be read-only, or via hardware-assisted log-dirty.
>>>           */
>>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
>>> -        flush_tlb_mask(d->dirty_cpumask);
>>> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>>      }
>>>      return 0;
>>>  }
>>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
>>>       * be read-only, or via hardware-assisted log-dirty.
>>>       */
>>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
>>> -    flush_tlb_mask(d->dirty_cpumask);
>>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>>  }
>>>  
>>>  /************************************************/
>>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
>>>  
>>>      safe_write_pte(p, new);
>>>      if ( old_flags & _PAGE_PRESENT )
>>> -        flush_tlb_mask(d->dirty_cpumask);
>>> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>
>> For all four - why FLUSH_TLB? Doesn't the flushing here solely care
>> about guest translations?
> 
> Not on AMD, at least to my understanding, the AMD SDM states:
> 
> "If a hypervisor modifies a nested page table by decreasing permission
> levels, clearing present bits, or changing address translations and
> intends to return to the same ASID, it should use either TLB command
> 011b or 001b."
> 
> It's in section 15.16.1.
> 
> This to my understanding implies that on AMD hardware modifications to
> the NPT require an ASID flush. I assume that on AMD ASIDs also cache
> combined translations, guest linear -> host physical.

I guess I don't follow - I asked about FLUSH_TLB. I agree there needs
to be FLUSH_HVM_ASID_CORE here.

>>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>>> @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>>>      safe_write_pte(p, new);
>>>  
>>>      if (old_flags & _PAGE_PRESENT)
>>> -        flush_tlb_mask(p2m->dirty_cpumask);
>>> +        flush_mask(p2m->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>
>> Same here then I guess.
>>
>>> --- a/xen/arch/x86/mm/p2m-pt.c
>>> +++ b/xen/arch/x86/mm/p2m-pt.c
>>> @@ -896,7 +896,8 @@ static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
>>>      unmap_domain_page(tab);
>>>  
>>>      if ( changed )
>>> -         flush_tlb_mask(p2m->domain->dirty_cpumask);
>>> +         flush_mask(p2m->domain->dirty_cpumask,
>>> +                    FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>
>> Given that this code is used in shadow mode as well, perhaps
>> better to keep it here. Albeit maybe FLUSH_TLB could be dependent
>> upon !hap_enabled()?
>>
>>> --- a/xen/arch/x86/mm/paging.c
>>> +++ b/xen/arch/x86/mm/paging.c
>>> @@ -613,7 +613,7 @@ void paging_log_dirty_range(struct domain *d,
>>>  
>>>      p2m_unlock(p2m);
>>>  
>>> -    flush_tlb_mask(d->dirty_cpumask);
>>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>
>> Same here?
> 
> I'm fine with doing further refinements, but I would like to be on the
> conservative side and keep such flushes.

Well, if hap.c had FLUSH_TLB dropped, for consistency it should
become conditional here, imo.

>>> @@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
>>>                                 pagetable_get_mfn(v->arch.shadow_table[i]), 0);
>>>  
>>>      /* Make sure everyone sees the unshadowings */
>>> -    flush_tlb_mask(d->dirty_cpumask);
>>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>
>> Taking this as example, wouldn't it be more consistent overall if
>> paths not being HVM-only would specify FLUSH_HVM_ASID_CORE only
>> for HVM domains?
> 
> I think there's indeed room for improvement here, as it's likely
> possible to drop some of the ASID/VPID flushes. Given that previous to
> this patch we would flush ASIDs on every TLB flush I think the current
> approach is safe, and as said above further improvements can be done
> afterwards.

There's no safety implication from my suggestion. Needless
FLUSH_HVM_ASID_CORE for non-HVM will result in a call to
hvm_flush_guest_tlbs(), with it then causing the generation
to be incremented without there being any vCPU to consume
this, as there's not going to be a VM entry without a prior
context switch on the specific CPU.

>> Also, seeing the large number of conversions, perhaps have another
>> wrapper, e.g. flush_tlb_mask_hvm(), at least for the cases where
>> both flags get specified unconditionally?
> 
> That's fine for me, if you agree with the proposed naming
> (flush_tlb_mask_hvm) I'm happy to introduce the helper.

Well, I couldn't (and still can't) think of a better (yet not
overly long) name, yet I'm also not fully happy with it.

Jan
Roger Pau Monné April 1, 2020, 7:15 a.m. UTC | #5
On Wed, Apr 01, 2020 at 08:34:23AM +0200, Jan Beulich wrote:
> On 31.03.2020 18:45, Roger Pau Monné wrote:
> > On Tue, Mar 31, 2020 at 05:40:59PM +0200, Jan Beulich wrote:
> >> On 20.03.2020 19:42, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/mm/hap/hap.c
> >>> +++ b/xen/arch/x86/mm/hap/hap.c
> >>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >>>                                    p2m_ram_rw, p2m_ram_logdirty);
> >>>  
> >>> -            flush_tlb_mask(d->dirty_cpumask);
> >>> +            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>>  
> >>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >>>          }
> >>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >>>           * to be read-only, or via hardware-assisted log-dirty.
> >>>           */
> >>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>> -        flush_tlb_mask(d->dirty_cpumask);
> >>> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>>      }
> >>>      return 0;
> >>>  }
> >>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >>>       * be read-only, or via hardware-assisted log-dirty.
> >>>       */
> >>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>>  }
> >>>  
> >>>  /************************************************/
> >>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >>>  
> >>>      safe_write_pte(p, new);
> >>>      if ( old_flags & _PAGE_PRESENT )
> >>> -        flush_tlb_mask(d->dirty_cpumask);
> >>> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> For all four - why FLUSH_TLB? Doesn't the flushing here solely care
> >> about guest translations?
> > 
> > Not on AMD, at least to my understanding, the AMD SDM states:
> > 
> > "If a hypervisor modifies a nested page table by decreasing permission
> > levels, clearing present bits, or changing address translations and
> > intends to return to the same ASID, it should use either TLB command
> > 011b or 001b."
> > 
> > It's in section 15.16.1.
> > 
> > This to my understanding implies that on AMD hardware modifications to
> > the NPT require an ASID flush. I assume that on AMD ASIDs also cache
> > combined translations, guest linear -> host physical.
> 
> I guess I don't follow - I asked about FLUSH_TLB. I agree there needs
> to be FLUSH_HVM_ASID_CORE here.

I clearly misread your comment, I'm really sorry.

The main point of this patch was to remove the ASID flush from some of
the flush TLB callers, not to remove TLB flushes. I understand this
was all intertwined together, and now that it's possible to split
those there are a lot of callers that can be further refined. I think
such further improvements should be done in a separate patch, as it's
IMO a tricky area that can trigger very subtle bugs.

> >>> --- a/xen/arch/x86/mm/hap/nested_hap.c
> >>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> >>> @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
> >>>      safe_write_pte(p, new);
> >>>  
> >>>      if (old_flags & _PAGE_PRESENT)
> >>> -        flush_tlb_mask(p2m->dirty_cpumask);
> >>> +        flush_mask(p2m->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Same here then I guess.
> >>
> >>> --- a/xen/arch/x86/mm/p2m-pt.c
> >>> +++ b/xen/arch/x86/mm/p2m-pt.c
> >>> @@ -896,7 +896,8 @@ static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
> >>>      unmap_domain_page(tab);
> >>>  
> >>>      if ( changed )
> >>> -         flush_tlb_mask(p2m->domain->dirty_cpumask);
> >>> +         flush_mask(p2m->domain->dirty_cpumask,
> >>> +                    FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Given that this code is used in shadow mode as well, perhaps
> >> better to keep it here. Albeit maybe FLUSH_TLB could be dependent
> >> upon !hap_enabled()?
> >>
> >>> --- a/xen/arch/x86/mm/paging.c
> >>> +++ b/xen/arch/x86/mm/paging.c
> >>> @@ -613,7 +613,7 @@ void paging_log_dirty_range(struct domain *d,
> >>>  
> >>>      p2m_unlock(p2m);
> >>>  
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Same here?
> > 
> > I'm fine with doing further refinements, but I would like to be on the
> > conservative side and keep such flushes.
> 
> Well, if hap.c had FLUSH_TLB dropped, for consistency it should
> become conditional here, imo.
> 
> >>> @@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
> >>>                                 pagetable_get_mfn(v->arch.shadow_table[i]), 0);
> >>>  
> >>>      /* Make sure everyone sees the unshadowings */
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Taking this as example, wouldn't it be more consistent overall if
> >> paths not being HVM-only would specify FLUSH_HVM_ASID_CORE only
> >> for HVM domains?

I could introduce something specific for shadow:

sh_flush_tlb_mask(d, m) \
    flush_mask(m, FLUSH_TLB | (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0))

And likely a similar macro for hap, that uses hap_enabled.

> > I think there's indeed room for improvement here, as it's likely
> > possible to drop some of the ASID/VPID flushes. Given that previous to
> > this patch we would flush ASIDs on every TLB flush I think the current
> > approach is safe, and as said above further improvements can be done
> > afterwards.
> 
> There's no safety implication from my suggestion. Needless
> FLUSH_HVM_ASID_CORE for non-HVM will result in a call to
> hvm_flush_guest_tlbs(), with it then causing the generation
> to be incremented without there being any vCPU to consume
> this, as there's not going to be a VM entry without a prior
> context switch on the specific CPU.

As said above, I would rather do this in smaller steps, as I already
had plenty of fun with this change, but anyway. I think what I
proposed is correct and already an improvement from the current
status.

Will prepare a new version with your suggestions.

> >> Also, seeing the large number of conversions, perhaps have another
> >> wrapper, e.g. flush_tlb_mask_hvm(), at least for the cases where
> >> both flags get specified unconditionally?
> > 
> > That's fine for me, if you agree with the proposed naming
> > (flush_tlb_mask_hvm) I'm happy to introduce the helper.
> 
> Well, I couldn't (and still can't) think of a better (yet not
> overly long) name, yet I'm also not fully happy with it.

I'm going to use the proposed name unless someone comes up with a
better suggestion that has consensus.

Thanks, Roger.
Jan Beulich April 1, 2020, 7:58 a.m. UTC | #6
On 01.04.2020 09:15, Roger Pau Monné wrote:
> On Wed, Apr 01, 2020 at 08:34:23AM +0200, Jan Beulich wrote:
>> On 31.03.2020 18:45, Roger Pau Monné wrote:
>>> On Tue, Mar 31, 2020 at 05:40:59PM +0200, Jan Beulich wrote:
>>>> On 20.03.2020 19:42, Roger Pau Monne wrote:
>>>>> @@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
>>>>>                                 pagetable_get_mfn(v->arch.shadow_table[i]), 0);
>>>>>  
>>>>>      /* Make sure everyone sees the unshadowings */
>>>>> -    flush_tlb_mask(d->dirty_cpumask);
>>>>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
>>>>
>>>> Taking this as example, wouldn't it be more consistent overall if
>>>> paths not being HVM-only would specify FLUSH_HVM_ASID_CORE only
>>>> for HVM domains?
> 
> I could introduce something specific for shadow:
> 
> sh_flush_tlb_mask(d, m) \
>     flush_mask(m, FLUSH_TLB | (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0))

This looks good.

> And likely a similar macro for hap, that uses hap_enabled.

And then there's no point to use it anywhere in hap.c, as that code
runs only when hap_enabled is true. Hence my suggestion to simply
drop FLUSH_TLB there (assuming by "similar" you meant making
FLUSH_TLB conditional there).

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 03f92c23dc..c81e53c0ae 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -59,8 +59,6 @@  static u32 pre_flush(void)
         raise_softirq(NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ);
 
  skip_clocktick:
-    hvm_flush_guest_tlbs();
-
     return t2;
 }
 
@@ -118,6 +116,7 @@  void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
     local_irq_save(flags);
 
     t = pre_flush();
+    hvm_flush_guest_tlbs();
 
     old_cr4 = read_cr4();
     ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE));
@@ -221,6 +220,9 @@  unsigned int flush_area_local(const void *va, unsigned int flags)
             do_tlb_flush();
     }
 
+    if ( flags & FLUSH_HVM_ASID_CORE )
+        hvm_flush_guest_tlbs();
+
     if ( flags & FLUSH_CACHE )
     {
         const struct cpuinfo_x86 *c = &current_cpu_data;
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index a6d5e39b02..004a89b4b9 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -118,7 +118,7 @@  int hap_track_dirty_vram(struct domain *d,
             p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
                                   p2m_ram_rw, p2m_ram_logdirty);
 
-            flush_tlb_mask(d->dirty_cpumask);
+            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
             memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
         }
@@ -205,7 +205,7 @@  static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
          * to be read-only, or via hardware-assisted log-dirty.
          */
         p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     }
     return 0;
 }
@@ -234,7 +234,7 @@  static void hap_clean_dirty_bitmap(struct domain *d)
      * be read-only, or via hardware-assisted log-dirty.
      */
     p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
-    flush_tlb_mask(d->dirty_cpumask);
+    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 }
 
 /************************************************/
@@ -798,7 +798,7 @@  hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
 
     safe_write_pte(p, new);
     if ( old_flags & _PAGE_PRESENT )
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     paging_unlock(d);
 
diff --git a/xen/arch/x86/mm/hap/nested_hap.c b/xen/arch/x86/mm/hap/nested_hap.c
index abe5958a52..9c0750be17 100644
--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -84,7 +84,7 @@  nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     safe_write_pte(p, new);
 
     if (old_flags & _PAGE_PRESENT)
-        flush_tlb_mask(p2m->dirty_cpumask);
+        flush_mask(p2m->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     paging_unlock(d);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index eb66077496..fbcea181ba 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -896,7 +896,8 @@  static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
     unmap_domain_page(tab);
 
     if ( changed )
-         flush_tlb_mask(p2m->domain->dirty_cpumask);
+         flush_mask(p2m->domain->dirty_cpumask,
+                    FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 }
 
 static int p2m_pt_change_entry_type_range(struct p2m_domain *p2m,
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 469bb76429..f9d930b7a9 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -613,7 +613,7 @@  void paging_log_dirty_range(struct domain *d,
 
     p2m_unlock(p2m);
 
-    flush_tlb_mask(d->dirty_cpumask);
+    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 }
 
 /*
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 121ddf1255..aa750eafae 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -363,7 +363,7 @@  static int oos_remove_write_access(struct vcpu *v, mfn_t gmfn,
     }
 
     if ( ftlb )
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     return 0;
 }
@@ -939,7 +939,7 @@  static void _shadow_prealloc(struct domain *d, unsigned int pages)
                 /* See if that freed up enough space */
                 if ( d->arch.paging.shadow.free_pages >= pages )
                 {
-                    flush_tlb_mask(d->dirty_cpumask);
+                    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
                     return;
                 }
             }
@@ -993,7 +993,7 @@  static void shadow_blow_tables(struct domain *d)
                                pagetable_get_mfn(v->arch.shadow_table[i]), 0);
 
     /* Make sure everyone sees the unshadowings */
-    flush_tlb_mask(d->dirty_cpumask);
+    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 }
 
 void shadow_blow_tables_per_domain(struct domain *d)
@@ -1102,7 +1102,7 @@  mfn_t shadow_alloc(struct domain *d,
         if ( unlikely(!cpumask_empty(&mask)) )
         {
             perfc_incr(shadow_alloc_tlbflush);
-            flush_tlb_mask(&mask);
+            flush_mask(&mask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
         }
         /* Now safe to clear the page for reuse */
         clear_domain_page(page_to_mfn(sp));
@@ -2290,7 +2290,7 @@  void sh_remove_shadows(struct domain *d, mfn_t gmfn, int fast, int all)
 
     /* Need to flush TLBs now, so that linear maps are safe next time we
      * take a fault. */
-    flush_tlb_mask(d->dirty_cpumask);
+    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     paging_unlock(d);
 }
@@ -3005,7 +3005,7 @@  static void sh_unshadow_for_p2m_change(struct domain *d, unsigned long gfn,
         {
             sh_remove_all_shadows_and_parents(d, mfn);
             if ( sh_remove_all_mappings(d, mfn, _gfn(gfn)) )
-                flush_tlb_mask(d->dirty_cpumask);
+                flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
         }
     }
 
@@ -3045,7 +3045,7 @@  static void sh_unshadow_for_p2m_change(struct domain *d, unsigned long gfn,
                 }
                 omfn = mfn_add(omfn, 1);
             }
-            flush_tlb_mask(&flushmask);
+            flush_mask(&flushmask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
             if ( npte )
                 unmap_domain_page(npte);
@@ -3332,7 +3332,7 @@  int shadow_track_dirty_vram(struct domain *d,
         }
     }
     if ( flush_tlb )
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     goto out;
 
 out_sl1ma:
@@ -3402,7 +3402,7 @@  bool shadow_flush_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
     }
 
     /* Flush TLBs on all CPUs with dirty vcpu state. */
-    flush_tlb_mask(mask);
+    flush_mask(mask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     /* Done. */
     for_each_vcpu ( d, v )
diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
index 1e6024c71f..509162cdce 100644
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -591,7 +591,7 @@  static void validate_guest_pt_write(struct vcpu *v, mfn_t gmfn,
 
     if ( rc & SHADOW_SET_FLUSH )
         /* Need to flush TLBs to pick up shadow PT changes */
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     if ( rc & SHADOW_SET_ERROR )
     {
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index b6afc0fba4..667fca96c7 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3066,7 +3066,7 @@  static int sh_page_fault(struct vcpu *v,
         perfc_incr(shadow_rm_write_flush_tlb);
         smp_wmb();
         atomic_inc(&d->arch.paging.shadow.gtable_dirty_version);
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     }
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
@@ -3575,7 +3575,7 @@  static bool sh_invlpg(struct vcpu *v, unsigned long linear)
     if ( mfn_to_page(sl1mfn)->u.sh.type
          == SH_type_fl1_shadow )
     {
-        flush_tlb_local();
+        flush_local(FLUSH_TLB | FLUSH_HVM_ASID_CORE);
         return false;
     }
 
@@ -3810,7 +3810,7 @@  sh_update_linear_entries(struct vcpu *v)
          * table entry. But, without this change, it would fetch the wrong
          * value due to a stale TLB.
          */
-        flush_tlb_local();
+        flush_local(FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     }
 }
 
@@ -4011,7 +4011,7 @@  sh_update_cr3(struct vcpu *v, int do_locking, bool noflush)
      * (old) shadow linear maps in the writeable mapping heuristics. */
 #if GUEST_PAGING_LEVELS == 2
     if ( sh_remove_write_access(d, gmfn, 2, 0) != 0 )
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     sh_set_toplevel_shadow(v, 0, gmfn, SH_type_l2_shadow);
 #elif GUEST_PAGING_LEVELS == 3
     /* PAE guests have four shadow_table entries, based on the
@@ -4035,7 +4035,7 @@  sh_update_cr3(struct vcpu *v, int do_locking, bool noflush)
             }
         }
         if ( flush )
-            flush_tlb_mask(d->dirty_cpumask);
+            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
         /* Now install the new shadows. */
         for ( i = 0; i < 4; i++ )
         {
@@ -4056,7 +4056,7 @@  sh_update_cr3(struct vcpu *v, int do_locking, bool noflush)
     }
 #elif GUEST_PAGING_LEVELS == 4
     if ( sh_remove_write_access(d, gmfn, 4, 0) != 0 )
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     sh_set_toplevel_shadow(v, 0, gmfn, SH_type_l4_shadow);
     if ( !shadow_mode_external(d) && !is_pv_32bit_domain(d) )
     {
@@ -4502,7 +4502,7 @@  static void sh_pagetable_dying(paddr_t gpa)
         }
     }
     if ( flush )
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
 
     /* Remember that we've seen the guest use this interface, so we
      * can rely on it using it in future, instead of guessing at
@@ -4539,7 +4539,7 @@  static void sh_pagetable_dying(paddr_t gpa)
         mfn_to_page(gmfn)->pagetable_dying = true;
         shadow_unhook_mappings(d, smfn, 1/* user pages only */);
         /* Now flush the TLB: we removed toplevel mappings. */
-        flush_tlb_mask(d->dirty_cpumask);
+        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
     }
 
     /* Remember that we've seen the guest use this interface, so we
diff --git a/xen/include/asm-x86/flushtlb.h b/xen/include/asm-x86/flushtlb.h
index 2cfe4e6e97..579dc56803 100644
--- a/xen/include/asm-x86/flushtlb.h
+++ b/xen/include/asm-x86/flushtlb.h
@@ -105,6 +105,12 @@  void switch_cr3_cr4(unsigned long cr3, unsigned long cr4);
 #define FLUSH_VCPU_STATE 0x1000
  /* Flush the per-cpu root page table */
 #define FLUSH_ROOT_PGTBL 0x2000
+#if CONFIG_HVM
+ /* Flush all HVM guests linear TLB (using ASID/VPID) */
+#define FLUSH_HVM_ASID_CORE 0x4000
+#else
+#define FLUSH_HVM_ASID_CORE 0
+#endif
 
 /* Flush local TLBs/caches. */
 unsigned int flush_area_local(const void *va, unsigned int flags);