Message ID | de46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia@amazon.com
---|---
State | New, archived
Series | Revert "domctl: improve locking during domain destruction"
On 24.03.2020 16:21, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
>
> Unfortunately, even though that commit dropped the domctl lock and
> allowed other domctl to continue, it created severe lock contention
> within domain destructions themselves. Multiple domain destructions in
> parallel now spin for the global heap lock when freeing memory and could
> spend a long time before the next hypercall continuation.

I'm not at all happy to see this reverted; instead I was hoping that we
could drop the domctl lock in further cases. If a lack of continuations
is the problem, did you try forcing them to occur more frequently?

> In contrast,
> after dropping that commit, parallel domain destructions will just fail
> to take the domctl lock, creating a hypercall continuation and backing
> off immediately, allowing the thread that holds the lock to destroy a
> domain much more quickly and allowing backed-off threads to process
> events and irqs.
>
> On a 144-core server with 4TiB of memory, destroying 32 guests (each
> with 4 vcpus and 122GiB memory) simultaneously takes:
>
> before the revert: 29 minutes
> after the revert: 6 minutes

This wants comparing against numbers demonstrating the bad effects of
the global domctl lock. Iirc they were quite a bit higher than 6 min,
perhaps depending on guest properties.

Jan
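[To illustrate the alternative Jan raises here — forcing continuations to occur more frequently during the memory-freeing phase — a minimal C sketch follows. It is not the code touched by this patch: the function name relinquish_pages_sketch and the batch size of 64 are illustrative assumptions, though hypercall_preempt_check(), page_list_remove_head() and put_page() are existing Xen primitives.]

/*
 * Illustrative sketch only -- not the code changed by this patch.
 * Shows the kind of periodic preemption check being suggested: bail
 * out with -ERESTART every so many freed pages, so a hypercall
 * continuation is created even while a destruction is contending on
 * the global heap lock for its individual frees.
 */
static int relinquish_pages_sketch(struct domain *d,
                                   struct page_list_head *list)
{
    struct page_info *pg;
    unsigned int done = 0;

    while ( (pg = page_list_remove_head(list)) != NULL )
    {
        put_page(pg);                 /* eventually takes the heap lock */

        /* Hypothetical batch size; tune to trade latency vs. overhead. */
        if ( ++done >= 64 && hypercall_preempt_check() )
            return -ERESTART;         /* caller creates a continuation */
    }

    return 0;
}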
On 24/03/2020 16:13, Jan Beulich wrote:
> On 24.03.2020 16:21, Hongyan Xia wrote:
>> From: Hongyan Xia <hongyxia@amazon.com>
>> In contrast,
>> after dropping that commit, parallel domain destructions will just fail
>> to take the domctl lock, creating a hypercall continuation and backing
>> off immediately, allowing the thread that holds the lock to destroy a
>> domain much more quickly and allowing backed-off threads to process
>> events and irqs.
>>
>> On a 144-core server with 4TiB of memory, destroying 32 guests (each
>> with 4 vcpus and 122GiB memory) simultaneously takes:
>>
>> before the revert: 29 minutes
>> after the revert: 6 minutes
>
> This wants comparing against numbers demonstrating the bad effects of
> the global domctl lock. Iirc they were quite a bit higher than 6 min,
> perhaps depending on guest properties.

Your original commit message doesn't contain any clue in which
cases the domctl lock was an issue. So please provide information
on the setups you think it will make it worse.

Cheers,
On 24/03/2020 15:21, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
>
> Unfortunately, even though that commit dropped the domctl lock and
> allowed other domctl to continue, it created severe lock contention
> within domain destructions themselves. Multiple domain destructions in
> parallel now spin for the global heap lock when freeing memory and could
> spend a long time before the next hypercall continuation. In contrast,
> after dropping that commit, parallel domain destructions will just fail
> to take the domctl lock, creating a hypercall continuation and backing
> off immediately, allowing the thread that holds the lock to destroy a
> domain much more quickly and allowing backed-off threads to process
> events and irqs.
>
> On a 144-core server with 4TiB of memory, destroying 32 guests (each
> with 4 vcpus and 122GiB memory) simultaneously takes:
>
> before the revert: 29 minutes
> after the revert: 6 minutes
>
> This is timed between the first page and the very last page of all 32
> guests is released back to the heap.
>
> This reverts commit 228ab9992ffb1d8f9d2475f2581e68b2913acb88.
>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>

Reviewed-by: Julien Grall <julien@xen.org>

> ---
>  xen/common/domain.c | 11 +----------
>  xen/common/domctl.c |  5 +----
>  2 files changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index b4eb476a9c..7b02f5ead7 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -698,20 +698,11 @@ int domain_kill(struct domain *d)
>      if ( d == current->domain )
>          return -EINVAL;
>
> -    /* Protected by d->domain_lock. */
> +    /* Protected by domctl_lock. */
>      switch ( d->is_dying )
>      {
>      case DOMDYING_alive:
> -        domain_unlock(d);
>          domain_pause(d);
> -        domain_lock(d);
> -        /*
> -         * With the domain lock dropped, d->is_dying may have changed. Call
> -         * ourselves recursively if so, which is safe as then we won't come
> -         * back here.
> -         */
> -        if ( d->is_dying != DOMDYING_alive )
> -            return domain_kill(d);
>          d->is_dying = DOMDYING_dying;
>          argo_destroy(d);
>          evtchn_destroy(d);
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index a69b3b59a8..e010079203 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -571,14 +571,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>          break;
>
>      case XEN_DOMCTL_destroydomain:
> -        domctl_lock_release();
> -        domain_lock(d);
>          ret = domain_kill(d);
> -        domain_unlock(d);
>          if ( ret == -ERESTART )
>              ret = hypercall_create_continuation(
>                  __HYPERVISOR_domctl, "h", u_domctl);
> -        goto domctl_out_unlock_domonly;
> +        break;
>
>      case XEN_DOMCTL_setnodeaffinity:
>      {
>
On 24.03.2020 19:39, Julien Grall wrote:
> On 24/03/2020 16:13, Jan Beulich wrote:
>> On 24.03.2020 16:21, Hongyan Xia wrote:
>>> From: Hongyan Xia <hongyxia@amazon.com>
>>> In contrast,
>>> after dropping that commit, parallel domain destructions will just fail
>>> to take the domctl lock, creating a hypercall continuation and backing
>>> off immediately, allowing the thread that holds the lock to destroy a
>>> domain much more quickly and allowing backed-off threads to process
>>> events and irqs.
>>>
>>> On a 144-core server with 4TiB of memory, destroying 32 guests (each
>>> with 4 vcpus and 122GiB memory) simultaneously takes:
>>>
>>> before the revert: 29 minutes
>>> after the revert: 6 minutes
>>
>> This wants comparing against numbers demonstrating the bad effects of
>> the global domctl lock. Iirc they were quite a bit higher than 6 min,
>> perhaps depending on guest properties.
>
> Your original commit message doesn't contain any clue in which
> cases the domctl lock was an issue. So please provide information
> on the setups you think it will make it worse.

I did never observe the issue myself - let's see whether one of the SUSE
people possibly involved in this back then recall (or have further
pointers; Jim, Charles?), or whether any of the (partly former) Citrix
folks do. My vague recollection is that the issue was the tool stack as
a whole stalling for far too long in particular when destroying very
large guests. One important aspect not discussed in the commit message
at all is that holding the domctl lock block basically _all_ tool stack
operations (including e.g. creation of new guests), whereas the new
issue attempted to be addressed is limited to just domain cleanup.

Jan
On Wed, 2020-03-25 at 08:11 +0100, Jan Beulich wrote:
> On 24.03.2020 19:39, Julien Grall wrote:
> > On 24/03/2020 16:13, Jan Beulich wrote:
> > > On 24.03.2020 16:21, Hongyan Xia wrote:
> > > > From: Hongyan Xia <hongyxia@amazon.com>
> > > > In contrast,
> > > > after dropping that commit, parallel domain destructions will just fail
> > > > to take the domctl lock, creating a hypercall continuation and backing
> > > > off immediately, allowing the thread that holds the lock to destroy a
> > > > domain much more quickly and allowing backed-off threads to process
> > > > events and irqs.
> > > >
> > > > On a 144-core server with 4TiB of memory, destroying 32 guests (each
> > > > with 4 vcpus and 122GiB memory) simultaneously takes:
> > > >
> > > > before the revert: 29 minutes
> > > > after the revert: 6 minutes
> > >
> > > This wants comparing against numbers demonstrating the bad effects of
> > > the global domctl lock. Iirc they were quite a bit higher than 6 min,
> > > perhaps depending on guest properties.
> >
> > Your original commit message doesn't contain any clue in which
> > cases the domctl lock was an issue. So please provide information
> > on the setups you think it will make it worse.
>
> I did never observe the issue myself - let's see whether one of the SUSE
> people possibly involved in this back then recall (or have further
> pointers; Jim, Charles?), or whether any of the (partly former) Citrix
> folks do. My vague recollection is that the issue was the tool stack as
> a whole stalling for far too long in particular when destroying very
> large guests. One important aspect not discussed in the commit message
> at all is that holding the domctl lock block basically _all_ tool stack
> operations (including e.g. creation of new guests), whereas the new
> issue attempted to be addressed is limited to just domain cleanup.

The best solution is to make the heap scalable instead of a global
lock, but that is not going to be trivial. Of course, another solution
is to keep the domctl lock dropped in domain_kill() but have another
domain_kill lock so that competing domain_kill()s will try to take that
lock and back off with hypercall continuation. But this is kind of
hacky (we introduce a lock to reduce spinlock contention elsewhere),
which is probably not a solution but a workaround.

Seeing the dramatic increase from 6 to 29 minutes in concurrent guest
destruction, I wonder if the benefit of that commit can outweigh this
negative though.

Hongyan
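[A rough sketch of the workaround Hongyan describes above, not a proposed patch: a dedicated global lock serialises destructions while the domctl lock stays dropped. The lock name domain_kill_lock and the wrapper domain_kill_serialised() are hypothetical; DEFINE_SPINLOCK(), spin_trylock() and the -ERESTART/continuation handling in do_domctl() are existing Xen machinery.]

/*
 * Sketch of the "extra domain_kill lock" workaround. A competing
 * caller fails the trylock and returns -ERESTART, so do_domctl()
 * creates a hypercall continuation instead of spinning on the
 * global heap lock behind the current destruction.
 */
static DEFINE_SPINLOCK(domain_kill_lock);    /* hypothetical name */

int domain_kill_serialised(struct domain *d)
{
    int rc;

    if ( !spin_trylock(&domain_kill_lock) )
        return -ERESTART;        /* back off, retry via continuation */

    rc = domain_kill(d);

    spin_unlock(&domain_kill_lock);

    return rc;
}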
On 3/25/20 1:11 AM, Jan Beulich wrote:
> On 24.03.2020 19:39, Julien Grall wrote:
>> On 24/03/2020 16:13, Jan Beulich wrote:
>>> On 24.03.2020 16:21, Hongyan Xia wrote:
>>>> From: Hongyan Xia <hongyxia@amazon.com>
>>>> In contrast,
>>>> after dropping that commit, parallel domain destructions will just fail
>>>> to take the domctl lock, creating a hypercall continuation and backing
>>>> off immediately, allowing the thread that holds the lock to destroy a
>>>> domain much more quickly and allowing backed-off threads to process
>>>> events and irqs.
>>>>
>>>> On a 144-core server with 4TiB of memory, destroying 32 guests (each
>>>> with 4 vcpus and 122GiB memory) simultaneously takes:
>>>>
>>>> before the revert: 29 minutes
>>>> after the revert: 6 minutes
>>>
>>> This wants comparing against numbers demonstrating the bad effects of
>>> the global domctl lock. Iirc they were quite a bit higher than 6 min,
>>> perhaps depending on guest properties.
>>
>> Your original commit message doesn't contain any clue in which
>> cases the domctl lock was an issue. So please provide information
>> on the setups you think it will make it worse.
>
> I did never observe the issue myself - let's see whether one of the SUSE
> people possibly involved in this back then recall (or have further
> pointers; Jim, Charles?), or whether any of the (partly former) Citrix
> folks do. My vague recollection is that the issue was the tool stack as
> a whole stalling for far too long in particular when destroying very
> large guests.

I too only have a vague memory of the issue but do recall shutting down
large guests (e.g. 500GB) taking a long time and blocking other
toolstack operations. I haven't checked on the behavior in quite some
time though.

> One important aspect not discussed in the commit message
> at all is that holding the domctl lock block basically _all_ tool stack
> operations (including e.g. creation of new guests), whereas the new
> issue attempted to be addressed is limited to just domain cleanup.

I more vaguely recall shutting down the host taking a *long* time when
dom0 had large amounts of memory, e.g. when it had all host memory (no
dom0_mem= setting and autoballooning enabled).

Regards,
Jim
Hi Jim,

On 26/03/2020 16:55, Jim Fehlig wrote:
> On 3/25/20 1:11 AM, Jan Beulich wrote:
>> On 24.03.2020 19:39, Julien Grall wrote:
>>> On 24/03/2020 16:13, Jan Beulich wrote:
>>>> On 24.03.2020 16:21, Hongyan Xia wrote:
>>>>> From: Hongyan Xia <hongyxia@amazon.com>
>>>>> In contrast,
>>>>> after dropping that commit, parallel domain destructions will just fail
>>>>> to take the domctl lock, creating a hypercall continuation and backing
>>>>> off immediately, allowing the thread that holds the lock to destroy a
>>>>> domain much more quickly and allowing backed-off threads to process
>>>>> events and irqs.
>>>>>
>>>>> On a 144-core server with 4TiB of memory, destroying 32 guests (each
>>>>> with 4 vcpus and 122GiB memory) simultaneously takes:
>>>>>
>>>>> before the revert: 29 minutes
>>>>> after the revert: 6 minutes
>>>>
>>>> This wants comparing against numbers demonstrating the bad effects of
>>>> the global domctl lock. Iirc they were quite a bit higher than 6 min,
>>>> perhaps depending on guest properties.
>>>
>>> Your original commit message doesn't contain any clue in which
>>> cases the domctl lock was an issue. So please provide information
>>> on the setups you think it will make it worse.
>>
>> I did never observe the issue myself - let's see whether one of the SUSE
>> people possibly involved in this back then recall (or have further
>> pointers; Jim, Charles?), or whether any of the (partly former) Citrix
>> folks do. My vague recollection is that the issue was the tool stack as
>> a whole stalling for far too long in particular when destroying very
>> large guests.
>
> I too only have a vague memory of the issue but do recall shutting down
> large guests (e.g. 500GB) taking a long time and blocking other
> toolstack operations. I haven't checked on the behavior in quite some
> time though.

It might be worth checking how toolstack operations (such as domain
creation) are affected by the revert. @Hongyan would you be able to
test it?

>> One important aspect not discussed in the commit message
>> at all is that holding the domctl lock block basically _all_ tool stack
>> operations (including e.g. creation of new guests), whereas the new
>> issue attempted to be addressed is limited to just domain cleanup.
>
> I more vaguely recall shutting down the host taking a *long* time when
> dom0 had large amounts of memory, e.g. when it had all host memory (no
> dom0_mem= setting and autoballooning enabled).

AFAIK, we never relinquish memory from dom0. So I am not sure how a
large amount of memory in Dom0 would affect the host shutting down.

Cheers,
diff --git a/xen/common/domain.c b/xen/common/domain.c
index b4eb476a9c..7b02f5ead7 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -698,20 +698,11 @@ int domain_kill(struct domain *d)
     if ( d == current->domain )
         return -EINVAL;

-    /* Protected by d->domain_lock. */
+    /* Protected by domctl_lock. */
     switch ( d->is_dying )
     {
     case DOMDYING_alive:
-        domain_unlock(d);
         domain_pause(d);
-        domain_lock(d);
-        /*
-         * With the domain lock dropped, d->is_dying may have changed. Call
-         * ourselves recursively if so, which is safe as then we won't come
-         * back here.
-         */
-        if ( d->is_dying != DOMDYING_alive )
-            return domain_kill(d);
         d->is_dying = DOMDYING_dying;
         argo_destroy(d);
         evtchn_destroy(d);
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index a69b3b59a8..e010079203 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -571,14 +571,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         break;

     case XEN_DOMCTL_destroydomain:
-        domctl_lock_release();
-        domain_lock(d);
         ret = domain_kill(d);
-        domain_unlock(d);
         if ( ret == -ERESTART )
             ret = hypercall_create_continuation(
                 __HYPERVISOR_domctl, "h", u_domctl);
-        goto domctl_out_unlock_domonly;
+        break;

     case XEN_DOMCTL_setnodeaffinity:
     {
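[For context on the behaviour this revert restores, a simplified sketch of the do_domctl() entry path follows. It is not a verbatim copy of the real function: the helper handle_one_domctl() is a hypothetical stand-in for the domctl dispatch, and error handling is omitted. domctl_lock_acquire()/domctl_lock_release() and the continuation call are the existing primitives referenced in the patch.]

/*
 * Simplified sketch: with the revert applied, the global domctl lock is
 * taken via trylock at hypercall entry, so a parallel destruction that
 * loses the race backs off through a continuation instead of spinning
 * on the heap lock inside domain_kill().
 */
long do_domctl_sketch(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
{
    long ret;

    if ( !domctl_lock_acquire() )            /* trylock of the global lock */
        return hypercall_create_continuation(
            __HYPERVISOR_domctl, "h", u_domctl);

    /* Hypothetical dispatch helper; XEN_DOMCTL_destroydomain ends up in
     * domain_kill(), which returns -ERESTART when it needs to be resumed. */
    ret = handle_one_domctl(u_domctl);

    if ( ret == -ERESTART )
        ret = hypercall_create_continuation(
            __HYPERVISOR_domctl, "h", u_domctl);

    domctl_lock_release();

    return ret;
}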