Message ID | 5b3f46f3-3c9f-57fb-00a5-94128f41e34a@suse.com (mailing list archive)
---|---
State | New, archived
Series | [v2] x86/PoD: move increment of entry count
On 04.01.2022 11:57, Jan Beulich wrote:
> When not holding the PoD lock across the entire region covering P2M
> update and stats update, the entry count should indicate too large a
> value in preference to a too small one, to avoid functions bailing early
> when they find the count is zero. Hence increments should happen ahead
> of P2M updates, while decrements should happen only after. Deal with the
> one place where this hasn't been the case yet.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Add comments.

Ping?

Jan

> ---
> While it might be possible to hold the PoD lock over the entire
> operation, I didn't want to chance introducing a lock order violation on
> a perhaps rarely taken code path.
>
> --- a/xen/arch/x86/mm/p2m-pod.c
> +++ b/xen/arch/x86/mm/p2m-pod.c
> @@ -1342,19 +1342,22 @@ mark_populate_on_demand(struct domain *d
>          }
>      }
>
> +    /*
> +     * Without holding the PoD lock across the entire operation, bump the
> +     * entry count up front assuming success of p2m_set_entry(), undoing the
> +     * bump as necessary upon failure. Bumping only upon success would risk
> +     * code elsewhere observing entry count being zero despite there actually
> +     * still being PoD entries.
> +     */
> +    pod_lock(p2m);
> +    p2m->pod.entry_count += (1UL << order) - pod_count;
> +    pod_unlock(p2m);
> +
>      /* Now, actually do the two-way mapping */
>      rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
>                         p2m_populate_on_demand, p2m->default_access);
>      if ( rc == 0 )
> -    {
> -        pod_lock(p2m);
> -        p2m->pod.entry_count += 1UL << order;
> -        p2m->pod.entry_count -= pod_count;
> -        BUG_ON(p2m->pod.entry_count < 0);
> -        pod_unlock(p2m);
> -
>          ioreq_request_mapcache_invalidate(d);
> -    }
>      else if ( order )
>      {
>          /*
> @@ -1366,6 +1369,14 @@ mark_populate_on_demand(struct domain *d
>                 d, gfn_l, order, rc);
>          domain_crash(d);
>      }
> +    else if ( !pod_count )
> +    {
> +        /* Undo earlier increment; see comment above. */
> +        pod_lock(p2m);
> +        BUG_ON(!p2m->pod.entry_count);
> +        --p2m->pod.entry_count;
> +        pod_unlock(p2m);
> +    }
>
>  out:
>      gfn_unlock(p2m, gfn, order);
On 27.01.2022 16:04, Jan Beulich wrote:
> On 04.01.2022 11:57, Jan Beulich wrote:
>> When not holding the PoD lock across the entire region covering P2M
>> update and stats update, the entry count should indicate too large a
>> value in preference to a too small one, to avoid functions bailing early
>> when they find the count is zero. Hence increments should happen ahead
>> of P2M updates, while decrements should happen only after. Deal with the
>> one place where this hasn't been the case yet.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v2: Add comments.
>
> Ping?

Similarly here: Another 4 weeks have passed ...

Thanks for feedback either way,
Jan

>> ---
>> While it might be possible to hold the PoD lock over the entire
>> operation, I didn't want to chance introducing a lock order violation on
>> a perhaps rarely taken code path.
>>
>> --- a/xen/arch/x86/mm/p2m-pod.c
>> +++ b/xen/arch/x86/mm/p2m-pod.c
>> @@ -1342,19 +1342,22 @@ mark_populate_on_demand(struct domain *d
>>          }
>>      }
>>
>> +    /*
>> +     * Without holding the PoD lock across the entire operation, bump the
>> +     * entry count up front assuming success of p2m_set_entry(), undoing the
>> +     * bump as necessary upon failure. Bumping only upon success would risk
>> +     * code elsewhere observing entry count being zero despite there actually
>> +     * still being PoD entries.
>> +     */
>> +    pod_lock(p2m);
>> +    p2m->pod.entry_count += (1UL << order) - pod_count;
>> +    pod_unlock(p2m);
>> +
>>      /* Now, actually do the two-way mapping */
>>      rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
>>                         p2m_populate_on_demand, p2m->default_access);
>>      if ( rc == 0 )
>> -    {
>> -        pod_lock(p2m);
>> -        p2m->pod.entry_count += 1UL << order;
>> -        p2m->pod.entry_count -= pod_count;
>> -        BUG_ON(p2m->pod.entry_count < 0);
>> -        pod_unlock(p2m);
>> -
>>          ioreq_request_mapcache_invalidate(d);
>> -    }
>>      else if ( order )
>>      {
>>          /*
>> @@ -1366,6 +1369,14 @@ mark_populate_on_demand(struct domain *d
>>                 d, gfn_l, order, rc);
>>          domain_crash(d);
>>      }
>> +    else if ( !pod_count )
>> +    {
>> +        /* Undo earlier increment; see comment above. */
>> +        pod_lock(p2m);
>> +        BUG_ON(!p2m->pod.entry_count);
>> +        --p2m->pod.entry_count;
>> +        pod_unlock(p2m);
>> +    }
>>
>>  out:
>>      gfn_unlock(p2m, gfn, order);
On Tue, Jan 4, 2022 at 10:58 AM Jan Beulich <jbeulich@suse.com> wrote:

> When not holding the PoD lock across the entire region covering P2M
> update and stats update, the entry count should indicate too large a
> value in preference to a too small one, to avoid functions bailing early
> when they find the count is zero. Hence increments should happen ahead
> of P2M updates, while decrements should happen only after. Deal with the
> one place where this hasn't been the case yet.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Add comments.
> ---
> While it might be possible to hold the PoD lock over the entire
> operation, I didn't want to chance introducing a lock order violation on
> a perhaps rarely taken code path.
>

[No idea how I missed this 2 years ago, sorry for the delay]

Actually I think just holding the lock is probably the right thing to do.
We already grab gfn_lock() over the entire operation, and p2m_set_entry()
ASSERTs gfn_locked_by_me(); and all we have to do to trigger the check is
boot any guest in PoD mode at all; surely we have osstest tests for that?

 -George
On 11.03.2024 17:04, George Dunlap wrote:
> On Tue, Jan 4, 2022 at 10:58 AM Jan Beulich <jbeulich@suse.com> wrote:
>
>> When not holding the PoD lock across the entire region covering P2M
>> update and stats update, the entry count should indicate too large a
>> value in preference to a too small one, to avoid functions bailing early
>> when they find the count is zero. Hence increments should happen ahead
>> of P2M updates, while decrements should happen only after. Deal with the
>> one place where this hasn't been the case yet.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> v2: Add comments.
>> ---
>> While it might be possible to hold the PoD lock over the entire
>> operation, I didn't want to chance introducing a lock order violation on
>> a perhaps rarely taken code path.
>>
>
> [No idea how I missed this 2 years ago, sorry for the delay]
>
> Actually I think just holding the lock is probably the right thing to do.
> We already grab gfn_lock() over the entire operation, and p2m_set_entry()
> ASSERTs gfn_locked_by_me(); and all we have to do to trigger the check is
> boot any guest in PoD mode at all; surely we have osstest tests for that?

The gfn (aka p2m) lock isn't of interest here. It's the locks whose lock
level is between p2m and pod which are. Then again there are obviously
other calls to p2m_set_entry() with the PoD lock held. So if there was a
problem (e.g. with ept_set_entry() calling p2m_altp2m_propagate_change()),
I wouldn't make things meaningfully worse by holding the PoD lock for
longer here. So yes, let me switch to that and then hope ...

Jan
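For orientation, the direction agreed above — holding the PoD lock across both the P2M update and the stats update — could be sketched as below. This is an illustration of the discussion only, not the actual follow-up patch; whether it is safe depends on the pod/p2m lock-ordering concerns raised in the thread:

```c
    /*
     * Sketch only (hypothetical v3 shape): take the PoD lock before the
     * P2M update so the entry count and the P2M state change atomically
     * with respect to other PoD-lock holders, making the up-front bump
     * and its undo path unnecessary.
     */
    pod_lock(p2m);

    rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
                       p2m_populate_on_demand, p2m->default_access);
    if ( rc == 0 )
    {
        p2m->pod.entry_count += (1UL << order) - pod_count;
        BUG_ON(p2m->pod.entry_count < 0);
    }

    pod_unlock(p2m);

    if ( rc == 0 )
        ioreq_request_mapcache_invalidate(d);
```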
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -1342,19 +1342,22 @@ mark_populate_on_demand(struct domain *d
         }
     }
 
+    /*
+     * Without holding the PoD lock across the entire operation, bump the
+     * entry count up front assuming success of p2m_set_entry(), undoing the
+     * bump as necessary upon failure. Bumping only upon success would risk
+     * code elsewhere observing entry count being zero despite there actually
+     * still being PoD entries.
+     */
+    pod_lock(p2m);
+    p2m->pod.entry_count += (1UL << order) - pod_count;
+    pod_unlock(p2m);
+
     /* Now, actually do the two-way mapping */
     rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
                        p2m_populate_on_demand, p2m->default_access);
     if ( rc == 0 )
-    {
-        pod_lock(p2m);
-        p2m->pod.entry_count += 1UL << order;
-        p2m->pod.entry_count -= pod_count;
-        BUG_ON(p2m->pod.entry_count < 0);
-        pod_unlock(p2m);
-
         ioreq_request_mapcache_invalidate(d);
-    }
     else if ( order )
     {
         /*
@@ -1366,6 +1369,14 @@ mark_populate_on_demand(struct domain *d
                d, gfn_l, order, rc);
         domain_crash(d);
     }
+    else if ( !pod_count )
+    {
+        /* Undo earlier increment; see comment above. */
+        pod_lock(p2m);
+        BUG_ON(!p2m->pod.entry_count);
+        --p2m->pod.entry_count;
+        pod_unlock(p2m);
+    }
 
 out:
     gfn_unlock(p2m, gfn, order);
When not holding the PoD lock across the entire region covering P2M
update and stats update, the entry count should indicate too large a
value in preference to a too small one, to avoid functions bailing early
when they find the count is zero. Hence increments should happen ahead
of P2M updates, while decrements should happen only after. Deal with the
one place where this hasn't been the case yet.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add comments.
---
While it might be possible to hold the PoD lock over the entire
operation, I didn't want to chance introducing a lock order violation on
a perhaps rarely taken code path.