Message ID: 7b2b7cc9-8828-41bd-7949-764161bbe7ff@suse.com (mailing list archive)
State: Superseded
Series: x86/p2m: hook adjustments
At 10:24 +0100 on 28 Oct (1603880693), Jan Beulich wrote:
> Fair parts of the present handlers are identical; in fact
> nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
> common parts right into write_p2m_entry(), splitting the hooks into a
> "pre" one (needed just by shadow code) and a "post" one.
>
> For the common parts moved I think that the p2m_flush_nestedp2m() is,
> at least from an abstract perspective, also applicable in the shadow
> case. Hence it doesn't get a 3rd hook put in place.
>
> The initial comment that was in hap_write_p2m_entry() gets dropped: Its
> placement was bogus, and looking back at the commit introducing it
> (dd6de3ab9985 "Implement Nested-on-Nested") I can't see either what use
> of a p2m it was meant to be associated with.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

This seems like a good approach to me. I'm happy with the shadow parts
but am not confident enough on nested p2m any more to have an opinion
on that side.

Acked-by: Tim Deegan <tim@xen.org>
On Wed, Oct 28, 2020 at 10:24:53AM +0100, Jan Beulich wrote:
> Fair parts of the present handlers are identical; in fact
> nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
> common parts right into write_p2m_entry(), splitting the hooks into a
> "pre" one (needed just by shadow code) and a "post" one.
>
> For the common parts moved I think that the p2m_flush_nestedp2m() is,
> at least from an abstract perspective, also applicable in the shadow
> case. Hence it doesn't get a 3rd hook put in place.
>
> The initial comment that was in hap_write_p2m_entry() gets dropped: Its
> placement was bogus, and looking back at the commit introducing it
> (dd6de3ab9985 "Implement Nested-on-Nested") I can't see either what use
> of a p2m it was meant to be associated with.

Is there any performance implication of moving from one hook to two
hooks? Since this shouldn't be a hot path I assume it's fine.

> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> RFC: This is effectively the alternative to the suggestion in an earlier
>      patch that we might do away with the hook altogether. Of course a
>      hybrid approach would also be possible, by using direct calls here
>      instead of splitting the hook into two.
> ---
> I'm unsure whether p2m_init_nestedp2m() zapping the "pre" hook is
> actually correct, or whether previously the sh_unshadow_for_p2m_change()
> invocation was wrongly skipped.
>
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -774,55 +774,18 @@ static void hap_update_paging_modes(stru
>      put_gfn(d, cr3_gfn);
>  }
>
> -static int
> -hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> -                    l1_pgentry_t new, unsigned int level)
> +static void
> +hap_write_p2m_entry_post(struct p2m_domain *p2m, unsigned int oflags)
>  {
>      struct domain *d = p2m->domain;
> -    uint32_t old_flags;
> -    mfn_t omfn;
> -    int rc;
>
> -    /* We know always use the host p2m here, regardless if the vcpu
> -     * is in host or guest mode. The vcpu can be in guest mode by
> -     * a hypercall which passes a domain and chooses mostly the first
> -     * vcpu. */
> -
> -    paging_lock(d);
> -    old_flags = l1e_get_flags(*p);
> -    omfn = l1e_get_mfn(*p);
> -
> -    rc = p2m_entry_modify(p2m, p2m_flags_to_type(l1e_get_flags(new)),
> -                          p2m_flags_to_type(old_flags), l1e_get_mfn(new),
> -                          omfn, level);
> -    if ( rc )
> -    {
> -        paging_unlock(d);
> -        return rc;
> -    }
> -
> -    safe_write_pte(p, new);
> -    if ( old_flags & _PAGE_PRESENT )
> +    if ( oflags & _PAGE_PRESENT )
>          guest_flush_tlb_mask(d, d->dirty_cpumask);
> -
> -    paging_unlock(d);
> -
> -    if ( nestedhvm_enabled(d) && (old_flags & _PAGE_PRESENT) &&
> -         !p2m_get_hostp2m(d)->defer_nested_flush &&
> -         /*
> -          * We are replacing a valid entry so we need to flush nested p2ms,
> -          * unless the only change is an increase in access rights.
> -          */
> -         (!mfn_eq(omfn, l1e_get_mfn(new)) ||
> -          !perms_strictly_increased(old_flags, l1e_get_flags(new))) )
> -        p2m_flush_nestedp2m(d);
> -
> -    return 0;
>  }
>
>  void hap_p2m_init(struct p2m_domain *p2m)
>  {
> -    p2m->write_p2m_entry = hap_write_p2m_entry;
> +    p2m->write_p2m_entry_post = hap_write_p2m_entry_post;
>  }
>
>  static unsigned long hap_gva_to_gfn_real_mode(
> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -71,24 +71,11 @@
>  /* NESTED VIRT P2M FUNCTIONS */
>  /********************************************/
>
> -int
> -nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
> -    l1_pgentry_t *p, l1_pgentry_t new, unsigned int level)
> +void
> +nestedp2m_write_p2m_entry_post(struct p2m_domain *p2m, unsigned int oflags)
>  {
> -    struct domain *d = p2m->domain;
> -    uint32_t old_flags;
> -
> -    paging_lock(d);
> -
> -    old_flags = l1e_get_flags(*p);
> -    safe_write_pte(p, new);
> -
> -    if (old_flags & _PAGE_PRESENT)
> -        guest_flush_tlb_mask(d, p2m->dirty_cpumask);
> -
> -    paging_unlock(d);
> -
> -    return 0;
> +    if ( oflags & _PAGE_PRESENT )
> +        guest_flush_tlb_mask(p2m->domain, p2m->dirty_cpumask);
>  }

This is a verbatim copy of hap_write_p2m_entry_post. I assume there's
a reason why we need both, but I'm missing it.

>
>  /********************************************/
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -122,17 +122,55 @@ static int write_p2m_entry(struct p2m_do
>  {
>      struct domain *d = p2m->domain;
>      struct vcpu *v = current;
> -    int rc = 0;
>
>      if ( v->domain != d )
>          v = d->vcpu ? d->vcpu[0] : NULL;
>      if ( likely(v && paging_mode_enabled(d) && paging_get_hostmode(v)) ||
>           p2m_is_nestedp2m(p2m) )
> -        rc = p2m->write_p2m_entry(p2m, gfn, p, new, level);
> +    {
> +        unsigned int oflags;
> +        mfn_t omfn;
> +        int rc;
> +
> +        paging_lock(d);
> +
> +        if ( p2m->write_p2m_entry_pre )
> +            p2m->write_p2m_entry_pre(d, gfn, p, new, level);
> +
> +        oflags = l1e_get_flags(*p);
> +        omfn = l1e_get_mfn(*p);
> +
> +        rc = p2m_entry_modify(p2m, p2m_flags_to_type(l1e_get_flags(new)),
> +                              p2m_flags_to_type(oflags), l1e_get_mfn(new),
> +                              omfn, level);
> +        if ( rc )
> +        {
> +            paging_unlock(d);
> +            return rc;
> +        }
> +
> +        safe_write_pte(p, new);
> +
> +        if ( p2m->write_p2m_entry_post )
> +            p2m->write_p2m_entry_post(p2m, oflags);
> +
> +        paging_unlock(d);
> +
> +        if ( nestedhvm_enabled(d) && !p2m_is_nestedp2m(p2m) &&
> +             (oflags & _PAGE_PRESENT) &&
> +             !p2m_get_hostp2m(d)->defer_nested_flush &&
> +             /*
> +              * We are replacing a valid entry so we need to flush nested p2ms,
> +              * unless the only change is an increase in access rights.
> +              */
> +             (!mfn_eq(omfn, l1e_get_mfn(new)) ||
> +              !perms_strictly_increased(oflags, l1e_get_flags(new))) )
> +            p2m_flush_nestedp2m(d);

It feels slightly weird to have a nested p2m hook post, and yet have
nested specific code here.

Have you considered if the post hook could be moved outside of the
locked region, so that we could put this chunk there in the nested p2m
case?

Thanks, Roger.
On 10.11.2020 14:59, Roger Pau Monné wrote:
> On Wed, Oct 28, 2020 at 10:24:53AM +0100, Jan Beulich wrote:
>> Fair parts of the present handlers are identical; in fact
>> nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
>> common parts right into write_p2m_entry(), splitting the hooks into a
>> "pre" one (needed just by shadow code) and a "post" one.
>> [...]
>
> Is there any performance implication of moving from one hook to two
> hooks? Since this shouldn't be a hot path I assume it's fine.

Well, first of all just a couple of patches ago two indirect calls were
folded into one, so it's at least not getting worse compared to where
we started from. And then both HAP and nested install just one of the
two hooks. As per the remark in an earlier patch, referred to ...

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> RFC: This is effectively the alternative to the suggestion in an earlier
>>      patch that we might do away with the hook altogether. Of course a
>>      hybrid approach would also be possible, by using direct calls here
>>      instead of splitting the hook into two.

... here, there is the option of doing away with the indirect calls
altogether, but - as said earlier - at the price of at least coming
close to a layering violation.

>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>> [...]
>> +    if ( oflags & _PAGE_PRESENT )
>> +        guest_flush_tlb_mask(p2m->domain, p2m->dirty_cpumask);
>>  }
>
> This is a verbatim copy of hap_write_p2m_entry_post. I assume there's
> a reason why we need both, but I'm missing it.

Only almost, since HAP has

    if ( oflags & _PAGE_PRESENT )
        guest_flush_tlb_mask(d, d->dirty_cpumask);

instead (i.e. they differ in which dirty_cpumask gets used).

>> --- a/xen/arch/x86/mm/p2m-pt.c
>> +++ b/xen/arch/x86/mm/p2m-pt.c
>> [...]
>> +        if ( nestedhvm_enabled(d) && !p2m_is_nestedp2m(p2m) &&
>> +             (oflags & _PAGE_PRESENT) &&
>> +             !p2m_get_hostp2m(d)->defer_nested_flush &&
>> +             (!mfn_eq(omfn, l1e_get_mfn(new)) ||
>> +              !perms_strictly_increased(oflags, l1e_get_flags(new))) )
>> +            p2m_flush_nestedp2m(d);
>
> It feels slightly weird to have a nested p2m hook post, and yet have
> nested specific code here.
>
> Have you considered if the post hook could be moved outside of the
> locked region, so that we could put this chunk there in the nested p2m
> case?

Yes, I did, but I don't think the post hook can be moved out. The only
alternative therefore would be a 3rd hook. And this hook would then
need to be installed on the host p2m for nested guests, as opposed to
nestedp2m_write_p2m_entry_post, which gets installed in the nested
p2m-s. As said in the description, the main reason I decided against a
3rd hook is that I suppose the code here isn't HAP-specific (while
prior to this patch it was).

Jan
On Tue, Nov 10, 2020 at 03:50:44PM +0100, Jan Beulich wrote:
> On 10.11.2020 14:59, Roger Pau Monné wrote:
>> On Wed, Oct 28, 2020 at 10:24:53AM +0100, Jan Beulich wrote:
>>> --- a/xen/arch/x86/mm/hap/nested_hap.c
>>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
>>> [...]
>>> +    if ( oflags & _PAGE_PRESENT )
>>> +        guest_flush_tlb_mask(p2m->domain, p2m->dirty_cpumask);
>>>  }
>>
>> This is a verbatim copy of hap_write_p2m_entry_post. I assume there's
>> a reason why we need both, but I'm missing it.
>
> Only almost, since HAP has
>
>     if ( oflags & _PAGE_PRESENT )
>         guest_flush_tlb_mask(d, d->dirty_cpumask);
>
> instead (i.e. they differ in which dirty_cpumask gets used).

Sorry, missed that bit.

>>> --- a/xen/arch/x86/mm/p2m-pt.c
>>> +++ b/xen/arch/x86/mm/p2m-pt.c
>>> [...]
>>> +        if ( nestedhvm_enabled(d) && !p2m_is_nestedp2m(p2m) &&
>>> +             (oflags & _PAGE_PRESENT) &&
>>> +             !p2m_get_hostp2m(d)->defer_nested_flush &&
>>> +             (!mfn_eq(omfn, l1e_get_mfn(new)) ||
>>> +              !perms_strictly_increased(oflags, l1e_get_flags(new))) )
>>> +            p2m_flush_nestedp2m(d);
>>
>> It feels slightly weird to have a nested p2m hook post, and yet have
>> nested specific code here.
>>
>> Have you considered if the post hook could be moved outside of the
>> locked region, so that we could put this chunk there in the nested
>> p2m case?
>
> Yes, I did, but I don't think the post hook can be moved out. The only
> alternative therefore would be a 3rd hook. And this hook would then
> need to be installed on the host p2m for nested guests, as opposed to
> nestedp2m_write_p2m_entry_post, which gets installed in the nested
> p2m-s. As said in the description, the main reason I decided against a
> 3rd hook is that I suppose the code here isn't HAP-specific (while
> prior to this patch it was).

I'm not convinced the guest TLB flush needs to be performed while
holding the paging lock. The point of such flush is to invalidate any
intermediate guest visible translations that might now be invalid as a
result of the p2m change, but the paging lock doesn't affect the guest
in any way.

It's true that the dirty_cpumask might change, but I think we only care
that when returning from the function there are no stale cache entries
that contain the now invalid translation, and this can be achieved
equally by doing the flush outside of the locked region.

Roger.
On 11.11.2020 13:17, Roger Pau Monné wrote:
> On Tue, Nov 10, 2020 at 03:50:44PM +0100, Jan Beulich wrote:
>> On 10.11.2020 14:59, Roger Pau Monné wrote:
>>> On Wed, Oct 28, 2020 at 10:24:53AM +0100, Jan Beulich wrote:
>>>> --- a/xen/arch/x86/mm/p2m-pt.c
>>>> +++ b/xen/arch/x86/mm/p2m-pt.c
>>>> [...]
>>>
>>> It feels slightly weird to have a nested p2m hook post, and yet have
>>> nested specific code here.
>>>
>>> Have you considered if the post hook could be moved outside of the
>>> locked region, so that we could put this chunk there in the nested
>>> p2m case?
>>
>> Yes, I did, but I don't think the post hook can be moved out. The
>> only alternative therefore would be a 3rd hook. And this hook would
>> then need to be installed on the host p2m for nested guests, as
>> opposed to nestedp2m_write_p2m_entry_post, which gets installed in
>> the nested p2m-s. As said in the description, the main reason I
>> decided against a 3rd hook is that I suppose the code here isn't
>> HAP-specific (while prior to this patch it was).
>
> I'm not convinced the guest TLB flush needs to be performed while
> holding the paging lock. The point of such flush is to invalidate any
> intermediate guest visible translations that might now be invalid as a
> result of the p2m change, but the paging lock doesn't affect the guest
> in any way.
>
> It's true that the dirty_cpumask might change, but I think we only
> care that when returning from the function there are no stale cache
> entries that contain the now invalid translation, and this can be
> achieved equally by doing the flush outside of the locked region.

I agree with all this. If only it was merely about TLB flushes. In the
shadow case, shadow_blow_all_tables() gets invoked, and that one -
looking at the other call sites - wants the paging lock held.

Additionally moving the stuff out of the locked region wouldn't allow
us to achieve the goal of moving the nested flush into the hook, unless
we introduced further hook handlers to be installed on the host p2m-s
of nested guests.

Jan
On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
> On 11.11.2020 13:17, Roger Pau Monné wrote:
>> On Tue, Nov 10, 2020 at 03:50:44PM +0100, Jan Beulich wrote:
>>> [...]
>>
>> I'm not convinced the guest TLB flush needs to be performed while
>> holding the paging lock.
>> [...]
>> It's true that the dirty_cpumask might change, but I think we only
>> care that when returning from the function there are no stale cache
>> entries that contain the now invalid translation, and this can be
>> achieved equally by doing the flush outside of the locked region.
>
> I agree with all this. If only it was merely about TLB flushes. In
> the shadow case, shadow_blow_all_tables() gets invoked, and that one
> - looking at the other call sites - wants the paging lock held.

You got me confused here, I think you meant shadow_blow_tables?

The post hook for shadow could pick the lock again, as I don't think
the removal of the tables needs to be strictly done inside of the same
locked region? Something to consider from a performance PoV.

> Additionally moving the stuff out of the locked region wouldn't
> allow us to achieve the goal of moving the nested flush into the
> hook, unless we introduced further hook handlers to be installed
> on the host p2m-s of nested guests.

Right, or else we would also need to add that chunk in the
non-nestedp2m hook also?

Maybe you could join both the nested and non-nested hooks and use a
different dirty bitmap for the flush?

Thanks, Roger.
On 12.11.2020 14:07, Roger Pau Monné wrote:
> On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
>> On 11.11.2020 13:17, Roger Pau Monné wrote:
>>> [...]
>>
>> I agree with all this. If only it was merely about TLB flushes. In
>> the shadow case, shadow_blow_all_tables() gets invoked, and that one
>> - looking at the other call sites - wants the paging lock held.
>
> You got me confused here, I think you meant shadow_blow_tables?

Oh, yes, sorry - copy-and-paste from the wrong source.

> The post hook for shadow could pick the lock again, as I don't think
> the removal of the tables needs to be strictly done inside of the same
> locked region?

I think it does, or else a use of the now stale tables may occur before
they got blown away. Tim?

> Something to consider from a performance PoV.
>
>> Additionally moving the stuff out of the locked region wouldn't
>> allow us to achieve the goal of moving the nested flush into the
>> hook, unless we introduced further hook handlers to be installed
>> on the host p2m-s of nested guests.
>
> Right, or else we would also need to add that chunk in the
> non-nestedp2m hook also?

I'm a little confused by the question: If we wanted to move this into
the hook functions, it would need to be both hap's and shadow's, i.e.
_only_ the non-nested ones. IOW it could then stay in hap's and be
duplicated into shadow's. Avoiding the duplication _and_ keeping it
outside the locked region is why I moved it into the common logic
(provided, of course, I'm right with my understanding of it also being
needed in the shadow case; else it could stay in the hap function
alone), of which the 2nd aspect would go away if the hook invocation
itself lived outside the locked region. But the duplication of this
would still concern me ...

> Maybe you could join both the nested and non-nested hooks and use a
> different dirty bitmap for the flush?

What would this gain us? Extra conditionals in a hook, when the hook is
(indirectly) to avoid having endless conditionals in the common logic?

Jan
At 15:04 +0100 on 12 Nov (1605193496), Jan Beulich wrote:
> On 12.11.2020 14:07, Roger Pau Monné wrote:
>> On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
>>> I agree with all this. If only it was merely about TLB flushes. In
>>> the shadow case, shadow_blow_all_tables() gets invoked, and that
>>> one - looking at the other call sites - wants the paging lock held.
[...]
>> The post hook for shadow could pick the lock again, as I don't think
>> the removal of the tables needs to be strictly done inside of the
>> same locked region?
>
> I think it does, or else a use of the now stale tables may occur
> before they got blown away. Tim?

Is this the call to shadow_blow_tables() in the write_p2m_entry path?

I think it would be safe to drop and re-take the paging lock there as
long as the call happens before the write is considered to have
finished.

But it would not be a useful performance improvement - any update that
takes this path is going to be very slow regardless. So unless you have
another pressing reason to split it up, I would be inclined to leave it
as it is. That way it's easier to see that the locking is correct.

Cheers, Tim.
On 12.11.2020 18:52, Tim Deegan wrote:
> At 15:04 +0100 on 12 Nov (1605193496), Jan Beulich wrote:
>> On 12.11.2020 14:07, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
>>>> I agree with all this. If only it was merely about TLB flushes. In
>>>> the shadow case, shadow_blow_all_tables() gets invoked, and that
>>>> one - looking at the other call sites - wants the paging lock held.
> [...]
>>> The post hook for shadow could pick the lock again, as I don't think
>>> the removal of the tables needs to be strictly done inside of the
>>> same locked region?
>>
>> I think it does, or else a use of the now stale tables may occur
>> before they got blown away. Tim?
>
> Is this the call to shadow_blow_tables() in the write_p2m_entry path?

Yes.

> I think it would be safe to drop and re-take the paging lock there as
> long as the call happens before the write is considered to have
> finished.
>
> But it would not be a useful performance improvement - any update that
> takes this path is going to be very slow regardless. So unless you
> have another pressing reason to split it up, I would be inclined to
> leave it as it is. That way it's easier to see that the locking is
> correct.

Thanks for the clarification.

Roger - what's your view at this point?

Jan
On Fri, Nov 13, 2020 at 10:52:58AM +0100, Jan Beulich wrote:
> On 12.11.2020 18:52, Tim Deegan wrote:
>> [...]
>> But it would not be a useful performance improvement - any update that
>> takes this path is going to be very slow regardless. So unless you
>> have another pressing reason to split it up, I would be inclined to
>> leave it as it is. That way it's easier to see that the locking is
>> correct.
>
> Thanks for the clarification.
>
> Roger - what's your view at this point?

So my main concern was not really about making this path faster - after
all this shouldn't be a hot path anyway, but rather reducing the region
protected by the paging lock, since it's a global lock that's quite
contended. In the HAP case we could move the flush outside of the
locked region, thus reducing the lock time.

Anyway, seeing there's not much consensus on this aspect, leaving it
as-is is no worse than what's currently in there.

Roger.
On Wed, Oct 28, 2020 at 10:24:53AM +0100, Jan Beulich wrote:
> Fair parts of the present handlers are identical; in fact
> nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
> common parts right into write_p2m_entry(), splitting the hooks into a
> "pre" one (needed just by shadow code) and a "post" one.
>
> For the common parts moved I think that the p2m_flush_nestedp2m() is,
> at least from an abstract perspective, also applicable in the shadow
> case. Hence it doesn't get a 3rd hook put in place.
>
> The initial comment that was in hap_write_p2m_entry() gets dropped: Its
> placement was bogus, and looking back at the commit introducing it
> (dd6de3ab9985 "Implement Nested-on-Nested") I can't see what use of a
> p2m it was meant to be associated with either.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -774,55 +774,18 @@ static void hap_update_paging_modes(stru
     put_gfn(d, cr3_gfn);
 }
 
-static int
-hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
-                    l1_pgentry_t new, unsigned int level)
+static void
+hap_write_p2m_entry_post(struct p2m_domain *p2m, unsigned int oflags)
 {
     struct domain *d = p2m->domain;
-    uint32_t old_flags;
-    mfn_t omfn;
-    int rc;
 
-    /* We know always use the host p2m here, regardless if the vcpu
-     * is in host or guest mode. The vcpu can be in guest mode by
-     * a hypercall which passes a domain and chooses mostly the first
-     * vcpu. */
-
-    paging_lock(d);
-    old_flags = l1e_get_flags(*p);
-    omfn = l1e_get_mfn(*p);
-
-    rc = p2m_entry_modify(p2m, p2m_flags_to_type(l1e_get_flags(new)),
-                          p2m_flags_to_type(old_flags), l1e_get_mfn(new),
-                          omfn, level);
-    if ( rc )
-    {
-        paging_unlock(d);
-        return rc;
-    }
-
-    safe_write_pte(p, new);
-
-    if ( old_flags & _PAGE_PRESENT )
+    if ( oflags & _PAGE_PRESENT )
         guest_flush_tlb_mask(d, d->dirty_cpumask);
-
-    paging_unlock(d);
-
-    if ( nestedhvm_enabled(d) && (old_flags & _PAGE_PRESENT) &&
-         !p2m_get_hostp2m(d)->defer_nested_flush &&
-         /*
-          * We are replacing a valid entry so we need to flush nested p2ms,
-          * unless the only change is an increase in access rights.
-          */
-         (!mfn_eq(omfn, l1e_get_mfn(new)) ||
-          !perms_strictly_increased(old_flags, l1e_get_flags(new))) )
-        p2m_flush_nestedp2m(d);
-
-    return 0;
 }
 
 void hap_p2m_init(struct p2m_domain *p2m)
 {
-    p2m->write_p2m_entry = hap_write_p2m_entry;
+    p2m->write_p2m_entry_post = hap_write_p2m_entry_post;
 }
 
 static unsigned long hap_gva_to_gfn_real_mode(
--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -71,24 +71,11 @@
 /*        NESTED VIRT P2M FUNCTIONS         */
 /********************************************/
 
-int
-nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
-    l1_pgentry_t *p, l1_pgentry_t new, unsigned int level)
+void
+nestedp2m_write_p2m_entry_post(struct p2m_domain *p2m, unsigned int oflags)
 {
-    struct domain *d = p2m->domain;
-    uint32_t old_flags;
-
-    paging_lock(d);
-
-    old_flags = l1e_get_flags(*p);
-    safe_write_pte(p, new);
-
-    if (old_flags & _PAGE_PRESENT)
-        guest_flush_tlb_mask(d, p2m->dirty_cpumask);
-
-    paging_unlock(d);
-
-    return 0;
+    if ( oflags & _PAGE_PRESENT )
+        guest_flush_tlb_mask(p2m->domain, p2m->dirty_cpumask);
 }
 
 /********************************************/
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -122,17 +122,55 @@ static int write_p2m_entry(struct p2m_do
 {
     struct domain *d = p2m->domain;
     struct vcpu *v = current;
-    int rc = 0;
 
     if ( v->domain != d )
         v = d->vcpu ? d->vcpu[0] : NULL;
 
     if ( likely(v && paging_mode_enabled(d) && paging_get_hostmode(v)) ||
          p2m_is_nestedp2m(p2m) )
-        rc = p2m->write_p2m_entry(p2m, gfn, p, new, level);
+    {
+        unsigned int oflags;
+        mfn_t omfn;
+        int rc;
+
+        paging_lock(d);
+
+        if ( p2m->write_p2m_entry_pre )
+            p2m->write_p2m_entry_pre(d, gfn, p, new, level);
+
+        oflags = l1e_get_flags(*p);
+        omfn = l1e_get_mfn(*p);
+
+        rc = p2m_entry_modify(p2m, p2m_flags_to_type(l1e_get_flags(new)),
+                              p2m_flags_to_type(oflags), l1e_get_mfn(new),
+                              omfn, level);
+        if ( rc )
+        {
+            paging_unlock(d);
+            return rc;
+        }
+
+        safe_write_pte(p, new);
+
+        if ( p2m->write_p2m_entry_post )
+            p2m->write_p2m_entry_post(p2m, oflags);
+
+        paging_unlock(d);
+
+        if ( nestedhvm_enabled(d) && !p2m_is_nestedp2m(p2m) &&
+             (oflags & _PAGE_PRESENT) &&
+             !p2m_get_hostp2m(d)->defer_nested_flush &&
+             /*
+              * We are replacing a valid entry so we need to flush nested p2ms,
+              * unless the only change is an increase in access rights.
+              */
+             (!mfn_eq(omfn, l1e_get_mfn(new)) ||
+              !perms_strictly_increased(oflags, l1e_get_flags(new))) )
+            p2m_flush_nestedp2m(d);
+    }
     else
         safe_write_pte(p, new);
 
-    return rc;
+    return 0;
 }
 
 // Find the next level's P2M entry, checking for out-of-range gfn's...
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -198,7 +198,8 @@ static int p2m_init_nestedp2m(struct dom
             return -ENOMEM;
         }
         p2m->p2m_class = p2m_nested;
-        p2m->write_p2m_entry = nestedp2m_write_p2m_entry;
+        p2m->write_p2m_entry_pre = NULL;
+        p2m->write_p2m_entry_post = nestedp2m_write_p2m_entry_post;
         list_add(&p2m->np2m_list, &p2m_get_hostp2m(d)->np2m_list);
     }
 
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -3137,34 +3137,22 @@ static void sh_unshadow_for_p2m_change(s
     }
 }
 
-static int
-shadow_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
-                       l1_pgentry_t *p, l1_pgentry_t new,
-                       unsigned int level)
+static void
+sh_write_p2m_entry_pre(struct domain *d, unsigned long gfn, l1_pgentry_t *p,
+                       l1_pgentry_t new, unsigned int level)
 {
-    struct domain *d = p2m->domain;
-    int rc;
-
-    paging_lock(d);
-
     /* If there are any shadows, update them.  But if shadow_teardown()
      * has already been called then it's not safe to try. */
     if ( likely(d->arch.paging.shadow.total_pages != 0) )
         sh_unshadow_for_p2m_change(d, gfn, p, new, level);
-
-    rc = p2m_entry_modify(p2m, p2m_flags_to_type(l1e_get_flags(new)),
-                          p2m_flags_to_type(l1e_get_flags(*p)),
-                          l1e_get_mfn(new), l1e_get_mfn(*p), level);
-    if ( rc )
-    {
-        paging_unlock(d);
-        return rc;
-    }
-
-    /* Update the entry with new content */
-    safe_write_pte(p, new);
+}
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_FAST_FAULT_PATH)
+static void
+sh_write_p2m_entry_post(struct p2m_domain *p2m, unsigned int oflags)
+{
+    struct domain *d = p2m->domain;
+
     /* If we're doing FAST_FAULT_PATH, then shadow mode may have
        cached the fact that this is an mmio region in the shadow
        page tables.  Blow the tables away to remove the cache.
@@ -3176,16 +3164,15 @@ shadow_write_p2m_entry(struct p2m_domain
         shadow_blow_tables(d);
         d->arch.paging.shadow.has_fast_mmio_entries = false;
     }
-#endif
-
-    paging_unlock(d);
-
-    return 0;
 }
+#else
+# define sh_write_p2m_entry_post NULL
+#endif
 
 void shadow_p2m_init(struct p2m_domain *p2m)
 {
-    p2m->write_p2m_entry = shadow_write_p2m_entry;
+    p2m->write_p2m_entry_pre = sh_write_p2m_entry_pre;
+    p2m->write_p2m_entry_post = sh_write_p2m_entry_post;
 }
 
 /**************************************************************************/
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -272,10 +272,13 @@ struct p2m_domain {
                                           unsigned long first_gfn,
                                           unsigned long last_gfn);
     void               (*memory_type_changed)(struct p2m_domain *p2m);
-
-    int                (*write_p2m_entry)(struct p2m_domain *p2m,
-                                          unsigned long gfn, l1_pgentry_t *p,
-                                          l1_pgentry_t new, unsigned int level);
+    void               (*write_p2m_entry_pre)(struct domain *d,
+                                              unsigned long gfn,
+                                              l1_pgentry_t *p,
+                                              l1_pgentry_t new,
+                                              unsigned int level);
+    void               (*write_p2m_entry_post)(struct p2m_domain *p2m,
+                                               unsigned int oflags);
 #if P2M_AUDIT
     long               (*audit_p2m)(struct p2m_domain *p2m);
 #endif
@@ -472,7 +475,7 @@ void __put_gfn(struct p2m_domain *p2m, u
  *
  * This is also used in the shadow code whenever the paging lock is
  * held -- in those cases, the caller is protected against concurrent
- * p2m updates by the fact that shadow_write_p2m_entry() also takes
+ * p2m updates by the fact that write_p2m_entry() also takes
  * the paging lock.
  *
  * Note that an unlocked accessor only makes sense for a "query" lookup.
@@ -841,8 +844,8 @@ void np2m_flush_base(struct vcpu *v, uns
 void hap_p2m_init(struct p2m_domain *p2m);
 void shadow_p2m_init(struct p2m_domain *p2m);
 
-int nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
-                              l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
+void nestedp2m_write_p2m_entry_post(struct p2m_domain *p2m,
+                                    unsigned int oflags);
 
 /*
  * Alternate p2m: shadow p2m tables used for alternate memory views
Fair parts of the present handlers are identical; in fact
nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
common parts right into write_p2m_entry(), splitting the hooks into a
"pre" one (needed just by shadow code) and a "post" one.

For the common parts moved I think that the p2m_flush_nestedp2m() is,
at least from an abstract perspective, also applicable in the shadow
case. Hence it doesn't get a 3rd hook put in place.

The initial comment that was in hap_write_p2m_entry() gets dropped: Its
placement was bogus, and looking back at the commit introducing it
(dd6de3ab9985 "Implement Nested-on-Nested") I can't see what use of a
p2m it was meant to be associated with either.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: This is effectively the alternative to the suggestion in an earlier
     patch that we might do away with the hook altogether. Of course a
     hybrid approach would also be possible, by using direct calls here
     instead of splitting the hook into two.
---
I'm unsure whether p2m_init_nestedp2m() zapping the "pre" hook is
actually correct, or whether previously the sh_unshadow_for_p2m_change()
invocation was wrongly skipped.