Message ID | e447da22-23d6-d3db-313d-4e4ca009c3df@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [4.17] x86/shadow: drop (replace) bogus assertions | expand |
On 14/10/2022 09:49, Jan Beulich wrote: > The addition of a call to shadow_blow_tables() from shadow_teardown() > has resulted in the "no vcpus" related assertion becoming triggerable: > If domain_create() fails with at least one page successfully allocated > in the course of shadow_enable(), or if domain_create() succeeds and > the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. It warrants pointing out that are unit tests in the tree which do exactly this. > The assertion's comment was bogus anyway: Shadow mode has been getting > enabled before allocation of vCPU-s for quite some time. I agree with the principle of what you're saying, but I don't think it's accurate. Shadow (vs hap) has always been part of domain create. But we've always had a problem where we need to wait for a shadow op to allocate some shadow memory before various domain-centric operations can proceed. As with the ARM GICv2 issues, we do want to address this and let domain_create() (or some continuable version of it) allocate the bare minimum shadow pool size, which simplifies a load of other codepaths. > Convert the > assertion to a conditional: As long as there are no vCPU-s, there's > nothing to blow away. > > Fixes: e7aa55c0aab3 ("x86/p2m: free the paging memory pool preemptively") > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > > A similar assertion/comment pair exists in _shadow_prealloc(); the > comment is similarly bogus, and the assertion could in principle trigger > e.g. when shadow_alloc_p2m_page() is called early enough. Replace those > at the same time by a similar early return, here indicating failure to > the caller (which will generally lead to the domain being crashed in > shadow_prealloc()). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > While in shadow_blow_tables() the option exists to simply remove the > assertion without adding a new conditional (the two loops simply will > do nothing), the same isn't true for _shadow_prealloc(): There we > would then trigger the ASSERT_UNREACHABLE() near the end of the > function. IMO, blow_tables() has no business caring about vcpus. It should be idempotent, and ideally wants to be left in a state where it doesn't need modifying when the aformentioned create changes happen. For prealloc(), I'd argue that it shouldn't care, but as this is a bugfix to an XSA, leaving it in this form for now is the safer course of action. Whomever cleans up the create path can do the work to ensure that all prealloc() paths are safe before vcpus are allocated. ~Andrew
On Fri, Oct 14, 2022 at 10:49:55AM +0200, Jan Beulich wrote: > The addition of a call to shadow_blow_tables() from shadow_teardown() > has resulted in the "no vcpus" related assertion becoming triggerable: > If domain_create() fails with at least one page successfully allocated > in the course of shadow_enable(), or if domain_create() succeeds and > the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. > > The assertion's comment was bogus anyway: Shadow mode has been getting > enabled before allocation of vCPU-s for quite some time. Convert the > assertion to a conditional: As long as there are no vCPU-s, there's > nothing to blow away. > > Fixes: e7aa55c0aab3 ("x86/p2m: free the paging memory pool preemptively") > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > > A similar assertion/comment pair exists in _shadow_prealloc(); the > comment is similarly bogus, and the assertion could in principle trigger > e.g. when shadow_alloc_p2m_page() is called early enough. Replace those > at the same time by a similar early return, here indicating failure to > the caller (which will generally lead to the domain being crashed in > shadow_prealloc()). It's my understanding we do care about this because a control domain could try to populate the p2m before calling XEN_DOMCTL_max_vcpus, and hence could trigger the ASSERT, as otherwise asserting would be fine. > Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> > --- > While in shadow_blow_tables() the option exists to simply remove the > assertion without adding a new conditional (the two loops simply will > do nothing), the same isn't true for _shadow_prealloc(): There we > would then trigger the ASSERT_UNREACHABLE() near the end of the > function. I think it's fine to exit early. Thanks, Roger.
On 14.10.2022 12:02, Andrew Cooper wrote: > On 14/10/2022 09:49, Jan Beulich wrote: >> The addition of a call to shadow_blow_tables() from shadow_teardown() >> has resulted in the "no vcpus" related assertion becoming triggerable: >> If domain_create() fails with at least one page successfully allocated >> in the course of shadow_enable(), or if domain_create() succeeds and >> the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. > > It warrants pointing out that are unit tests in the tree which do > exactly this. Can do. >> The assertion's comment was bogus anyway: Shadow mode has been getting >> enabled before allocation of vCPU-s for quite some time. > > I agree with the principle of what you're saying, but I don't think it's > accurate. I'm afraid I can't derive from ... > Shadow (vs hap) has always been part of domain create. But we've always > had a problem where we need to wait for a shadow op to allocate some > shadow memory before various domain-centric operations can proceed. > > As with the ARM GICv2 issues, we do want to address this and let > domain_create() (or some continuable version of it) allocate the bare > minimum shadow pool size, which simplifies a load of other codepaths. ... this why the statement isn't accurate. What both functions are trying to do is reclaim pages from in-use shadows. None can exist without vCPU-s. Yet still shadow mode has been getting enabled before vCPU-s are allocated. >> --- >> While in shadow_blow_tables() the option exists to simply remove the >> assertion without adding a new conditional (the two loops simply will >> do nothing), the same isn't true for _shadow_prealloc(): There we >> would then trigger the ASSERT_UNREACHABLE() near the end of the >> function. > > IMO, blow_tables() has no business caring about vcpus. It should be > idempotent, and ideally wants to be left in a state where it doesn't > need modifying when the aformentioned create changes happen. First: Both the change as done by the patch as well as the alternative pointed out fulfill this requirement, afaict at least. Hence what you say doesn't make clear whether you agree with the change as done or whether you'd prefer the alternative (and if so, why). Then the two functions do about the same thing, just with prealloc having an early exit condition (once having reached the intended count of available pages). Therefore ... > For prealloc(), I'd argue that it shouldn't care, but as this is a > bugfix to an XSA, leaving it in this form for now is the safer course of > action. Whomever cleans up the create path can do the work to ensure > that all prealloc() paths are safe before vcpus are allocated. ... I think the two functions want to remain as closely in sync as possible (I'm afraid I didn't fully have this in mind when writing the remark - it should have been worded a little differently); really I think it would be best if the duplicate code was folded. Hence to me leaving alone that function (which is what I understand you suggest) is not a good option, yet as explained in the post-commit- message remark not replacing the assertion by an if() would have an unwanted consequence. Jan
On 14.10.2022 12:30, Roger Pau Monné wrote: > On Fri, Oct 14, 2022 at 10:49:55AM +0200, Jan Beulich wrote: >> The addition of a call to shadow_blow_tables() from shadow_teardown() >> has resulted in the "no vcpus" related assertion becoming triggerable: >> If domain_create() fails with at least one page successfully allocated >> in the course of shadow_enable(), or if domain_create() succeeds and >> the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. >> >> The assertion's comment was bogus anyway: Shadow mode has been getting >> enabled before allocation of vCPU-s for quite some time. Convert the >> assertion to a conditional: As long as there are no vCPU-s, there's >> nothing to blow away. >> >> Fixes: e7aa55c0aab3 ("x86/p2m: free the paging memory pool preemptively") >> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> A similar assertion/comment pair exists in _shadow_prealloc(); the >> comment is similarly bogus, and the assertion could in principle trigger >> e.g. when shadow_alloc_p2m_page() is called early enough. Replace those >> at the same time by a similar early return, here indicating failure to >> the caller (which will generally lead to the domain being crashed in >> shadow_prealloc()). > > It's my understanding we do care about this because a control domain > could try to populate the p2m before calling XEN_DOMCTL_max_vcpus, and > hence could trigger the ASSERT, as otherwise asserting would be fine. Yes, that's the scenario I had in mind. >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> Thanks, but Andrew and I will need to reach agreement before I can put this (or whatever alternative) in. Jan
On 14.10.2022 12:30, Roger Pau Monné wrote: > On Fri, Oct 14, 2022 at 10:49:55AM +0200, Jan Beulich wrote: >> The addition of a call to shadow_blow_tables() from shadow_teardown() >> has resulted in the "no vcpus" related assertion becoming triggerable: >> If domain_create() fails with at least one page successfully allocated >> in the course of shadow_enable(), or if domain_create() succeeds and >> the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. >> >> The assertion's comment was bogus anyway: Shadow mode has been getting >> enabled before allocation of vCPU-s for quite some time. Convert the >> assertion to a conditional: As long as there are no vCPU-s, there's >> nothing to blow away. >> >> Fixes: e7aa55c0aab3 ("x86/p2m: free the paging memory pool preemptively") >> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> A similar assertion/comment pair exists in _shadow_prealloc(); the >> comment is similarly bogus, and the assertion could in principle trigger >> e.g. when shadow_alloc_p2m_page() is called early enough. Replace those >> at the same time by a similar early return, here indicating failure to >> the caller (which will generally lead to the domain being crashed in >> shadow_prealloc()). > > It's my understanding we do care about this because a control domain > could try to populate the p2m before calling XEN_DOMCTL_max_vcpus, and > hence could trigger the ASSERT, as otherwise asserting would be fine. > >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> In a discussion amongst maintainers we've settled Andrew's reservations. May I please ask for a release-ack for this change, so it can go in (as a bug fix on top of the recent batch of XSAs)? Thanks, Jan
Hi Jan, > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Subject: Re: [PATCH][4.17] x86/shadow: drop (replace) bogus assertions > > On 14.10.2022 12:30, Roger Pau Monné wrote: > > On Fri, Oct 14, 2022 at 10:49:55AM +0200, Jan Beulich wrote: > >> The addition of a call to shadow_blow_tables() from shadow_teardown() > >> has resulted in the "no vcpus" related assertion becoming triggerable: > >> If domain_create() fails with at least one page successfully allocated > >> in the course of shadow_enable(), or if domain_create() succeeds and > >> the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. > >> > >> The assertion's comment was bogus anyway: Shadow mode has been > getting > >> enabled before allocation of vCPU-s for quite some time. Convert the > >> assertion to a conditional: As long as there are no vCPU-s, there's > >> nothing to blow away. > >> > >> Fixes: e7aa55c0aab3 ("x86/p2m: free the paging memory pool > preemptively") > >> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > >> > >> A similar assertion/comment pair exists in _shadow_prealloc(); the > >> comment is similarly bogus, and the assertion could in principle trigger > >> e.g. when shadow_alloc_p2m_page() is called early enough. Replace > those > >> at the same time by a similar early return, here indicating failure to > >> the caller (which will generally lead to the domain being crashed in > >> shadow_prealloc()). > > > > It's my understanding we do care about this because a control domain > > could try to populate the p2m before calling XEN_DOMCTL_max_vcpus, > and > > hence could trigger the ASSERT, as otherwise asserting would be fine. > > > >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > > > Acked-by: Roger Pau Monné <roger.pau@citrix.com> > > In a discussion amongst maintainers we've settled Andrew's reservations. > May I please ask for a release-ack for this change, so it can go in (as > a bug fix on top of the recent batch of XSAs)? Absolutely. Thanks for noticing. Release-acked-by: Henry Wang <Henry.Wang@arm.com> Kind regards, Henry > > Thanks, Jan
--- a/xen/arch/x86/mm/shadow/common.c +++ b/xen/arch/x86/mm/shadow/common.c @@ -943,8 +943,9 @@ static bool __must_check _shadow_preallo /* No reclaim when the domain is dying, teardown will take care of it. */ return false; - /* Shouldn't have enabled shadows if we've no vcpus. */ - ASSERT(d->vcpu && d->vcpu[0]); + /* Nothing to reclaim when there are no vcpus yet. */ + if ( !d->vcpu[0] ) + return false; /* Stage one: walk the list of pinned pages, unpinning them */ perfc_incr(shadow_prealloc_1); @@ -1034,8 +1035,9 @@ void shadow_blow_tables(struct domain *d mfn_t smfn; int i; - /* Shouldn't have enabled shadows if we've no vcpus. */ - ASSERT(d->vcpu && d->vcpu[0]); + /* Nothing to do when there are no vcpus yet. */ + if ( !d->vcpu[0] ) + return; /* Pass one: unpin all pinned pages */ foreach_pinned_shadow(d, sp, t)
The addition of a call to shadow_blow_tables() from shadow_teardown() has resulted in the "no vcpus" related assertion becoming triggerable: If domain_create() fails with at least one page successfully allocated in the course of shadow_enable(), or if domain_create() succeeds and the domain is then killed without ever invoking XEN_DOMCTL_max_vcpus. The assertion's comment was bogus anyway: Shadow mode has been getting enabled before allocation of vCPU-s for quite some time. Convert the assertion to a conditional: As long as there are no vCPU-s, there's nothing to blow away. Fixes: e7aa55c0aab3 ("x86/p2m: free the paging memory pool preemptively") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> A similar assertion/comment pair exists in _shadow_prealloc(); the comment is similarly bogus, and the assertion could in principle trigger e.g. when shadow_alloc_p2m_page() is called early enough. Replace those at the same time by a similar early return, here indicating failure to the caller (which will generally lead to the domain being crashed in shadow_prealloc()). Signed-off-by: Jan Beulich <jbeulich@suse.com> --- While in shadow_blow_tables() the option exists to simply remove the assertion without adding a new conditional (the two loops simply will do nothing), the same isn't true for _shadow_prealloc(): There we would then trigger the ASSERT_UNREACHABLE() near the end of the function.