Message ID | 2e7044de3cd8a6768a20250e61fe262f3a018724.1631790362.git.isaikin-dmitry@yandex.ru (mailing list archive)
---|---
State | New, archived
Series | [v1] domctl: hold domctl lock while domain is destroyed
On 16.09.2021 13:10, Dmitry Isaikin wrote:
> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>
> This significantly speeds up concurrent destruction of multiple domains on x86.

This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things; now you're claiming the opposite. It'll take more justification, clearly identifying that you actually revert an earlier change, and an explanation of why you then don't revert that change altogether. You will also want to specifically consider the cleanup of huge VMs, where use of the (global) domctl lock may hamper progress of other (parallel) operations on the system.

> I identified the place taking the most time:
>
> do_domctl(case XEN_DOMCTL_destroydomain)
> -> domain_kill()
> -> domain_relinquish_resources()
> -> relinquish_memory(d, &d->page_list, PGT_l4_page_table)
>
> My reference setup: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, Xen 4.14.
>
> I use this command for the test:
>
> for i in $(seq 1 5) ; do xl destroy test-vm-${i} & done
>
> Without holding the lock, all calls of `relinquish_memory(d, &d->page_list, PGT_l4_page_table)` took on my setup (an HVM guest with 2GB of memory) about 3 seconds for each domain being destroyed.
>
> With the lock held it took only 100 ms.

I'm further afraid I can't make the connection. Do you have an explanation for why there would be such a massive difference? What would prevent progress of relinquish_memory() with the domctl lock not held?

Jan
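Dmitry's measurement can be reproduced with a small timing harness built around his test command; this is only a sketch, and the guest names test-vm-1 .. test-vm-5 are placeholders for domains that already exist:

    #!/bin/sh
    # Sketch: time the concurrent destruction of five already-running guests.
    start=$(date +%s)
    for i in $(seq 1 5); do
        xl destroy "test-vm-${i}" &    # destroy the guests in parallel
    done
    wait                               # wait for every xl invocation to finish
    echo "concurrent destroy took $(( $(date +%s) - start )) seconds"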
On Thu, Sep 16, 2021 at 02:30:39PM +0200, Jan Beulich wrote:
> On 16.09.2021 13:10, Dmitry Isaikin wrote:
> > From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
> >
> > This significantly speeds up concurrent destruction of multiple domains on x86.
>
> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things; now you're claiming the opposite. It'll take more justification, clearly identifying that you actually revert an earlier change, and an explanation of why you then don't revert that change altogether. You will also want to specifically consider the cleanup of huge VMs, where use of the (global) domctl lock may hamper progress of other (parallel) operations on the system.
>
> > I identified the place taking the most time:
> >
> > do_domctl(case XEN_DOMCTL_destroydomain)
> > -> domain_kill()
> > -> domain_relinquish_resources()
> > -> relinquish_memory(d, &d->page_list, PGT_l4_page_table)
> >
> > My reference setup: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, Xen 4.14.
> >
> > I use this command for the test:
> >
> > for i in $(seq 1 5) ; do xl destroy test-vm-${i} & done
> >
> > Without holding the lock, all calls of `relinquish_memory(d, &d->page_list, PGT_l4_page_table)` took on my setup (an HVM guest with 2GB of memory) about 3 seconds for each domain being destroyed.
> >
> > With the lock held it took only 100 ms.
>
> I'm further afraid I can't make the connection. Do you have an explanation for why there would be such a massive difference? What would prevent progress of relinquish_memory() with the domctl lock not held?

I would recommend that Dmitry use lock profiling with and without this change applied and try to spot which lock is causing the contention as a starting point. That should be fairly easy and could shed some light.

Regards, Roger.
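As a rough illustration of Roger's suggestion — assuming a hypervisor built with lock profiling enabled and the xenlockprof utility from tools/misc available in dom0 (whether both are present depends on the build) — the per-lock counters could be snapshotted around the concurrent destroys:

    # Sketch only: compare per-lock statistics before and after the destroys.
    xenlockprof > lockprof.before
    for i in $(seq 1 5); do xl destroy "test-vm-${i}" & done
    wait
    xenlockprof > lockprof.after
    diff -u lockprof.before lockprof.after   # locks whose block counts grew stand out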
On 16/09/2021 13:30, Jan Beulich wrote:
> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>
>> This significantly speeds up concurrent destruction of multiple domains on x86.
> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things;

Was it? I recall that it was simply an expectation that performance would be better...

Amazon previously identified 228ab9992ffb as a massive perf hit, too.

Clearly some of the reasoning behind 228ab9992ffb was flawed and/or incomplete, and it appears as if it wasn't necessarily a wise move in hindsight.

~Andrew
On 16.09.2021 19:52, Andrew Cooper wrote:
> On 16/09/2021 13:30, Jan Beulich wrote:
>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>
>>> This significantly speeds up concurrent destruction of multiple domains on x86.
>> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things;
>
> Was it? I recall that it was simply an expectation that performance would be better...

My recollection is that it was, for one of our customers.

> Amazon previously identified 228ab9992ffb as a massive perf hit, too.

Interesting. I don't recall any mail to that effect.

> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or incomplete, and it appears as if it wasn't necessarily a wise move in hindsight.

Possible; I continue to think though that the present observation wants proper understanding instead of more or less blindly undoing that change.

Jan
Hi,

(+ some AWS folks)

On 17/09/2021 11:17, Jan Beulich wrote:
> On 16.09.2021 19:52, Andrew Cooper wrote:
>> On 16/09/2021 13:30, Jan Beulich wrote:
>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>>
>>>> This significantly speeds up concurrent destruction of multiple domains on x86.
>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things;
>>
>> Was it? I recall that it was simply an expectation that performance would be better...
>
> My recollection is that it was, for one of our customers.
>
>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>
> Interesting. I don't recall any mail to that effect.

Here we go:

https://lore.kernel.org/xen-devel/de46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia@amazon.com/

We have been using the revert for quite a while in production and didn't notice any regression.

>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or incomplete, and it appears as if it wasn't necessarily a wise move in hindsight.
>
> Possible; I continue to think though that the present observation wants proper understanding instead of more or less blindly undoing that change.

To be honest, I think this is the other way around. You wrote and merged a patch with the following justification:

"
There is no need to hold the global domctl lock across domain_kill() - the domain lock is fully sufficient here, and parallel cleanup after multiple domains performs quite a bit better this way.
"

Clearly, the original commit message is lacking details on the exact setups and numbers. But we now have two stakeholders with proof that your patch is harmful to setups you claimed would perform better with your patch.

To me this is enough justification to revert the original patch. Anyone against the revert should provide clear details of why the patch should not be reverted.

Cheers,
On 17/09/2021 10:27, Julien Grall wrote:
> Hi,
>
> (+ some AWS folks)
>
> On 17/09/2021 11:17, Jan Beulich wrote:
>> On 16.09.2021 19:52, Andrew Cooper wrote:
>>> On 16/09/2021 13:30, Jan Beulich wrote:
>>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>>>
>>>>> This significantly speeds up concurrent destruction of multiple domains on x86.
>>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things;
>>>
>>> Was it? I recall that it was simply an expectation that performance would be better...
>>
>> My recollection is that it was, for one of our customers.
>>
>>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>>
>> Interesting. I don't recall any mail to that effect.
>
> Here we go:
>
> https://lore.kernel.org/xen-devel/de46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia@amazon.com/
>
> We have been using the revert for quite a while in production and didn't notice any regression.
>
>>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or incomplete, and it appears as if it wasn't necessarily a wise move in hindsight.
>>
>> Possible; I continue to think though that the present observation wants proper understanding instead of more or less blindly undoing that change.
>
> To be honest, I think this is the other way around. You wrote and merged a patch with the following justification:
>
> "
> There is no need to hold the global domctl lock across domain_kill() - the domain lock is fully sufficient here, and parallel cleanup after multiple domains performs quite a bit better this way.
> "
>
> Clearly, the original commit message is lacking details on the exact setups and numbers. But we now have two stakeholders with proof that your patch is harmful to setups you claimed would perform better with your patch.
>
> To me this is enough justification to revert the original patch. Anyone against the revert should provide clear details of why the patch should not be reverted.

I second a revert.

I was concerned at the time that the claim was unsubstantiated, and now there is plenty of evidence to counter the claim.

~Andrew
On 17.09.2021 11:41, Andrew Cooper wrote:
> On 17/09/2021 10:27, Julien Grall wrote:
>> Hi,
>>
>> (+ some AWS folks)
>>
>> On 17/09/2021 11:17, Jan Beulich wrote:
>>> On 16.09.2021 19:52, Andrew Cooper wrote:
>>>> On 16/09/2021 13:30, Jan Beulich wrote:
>>>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>>>>
>>>>>> This significantly speeds up concurrent destruction of multiple domains on x86.
>>>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things;
>>>>
>>>> Was it? I recall that it was simply an expectation that performance would be better...
>>>
>>> My recollection is that it was, for one of our customers.
>>>
>>>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>>>
>>> Interesting. I don't recall any mail to that effect.
>>
>> Here we go:
>>
>> https://lore.kernel.org/xen-devel/de46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia@amazon.com/
>>
>> We have been using the revert for quite a while in production and didn't notice any regression.
>>
>>>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or incomplete, and it appears as if it wasn't necessarily a wise move in hindsight.
>>>
>>> Possible; I continue to think though that the present observation wants proper understanding instead of more or less blindly undoing that change.
>>
>> To be honest, I think this is the other way around. You wrote and merged a patch with the following justification:
>>
>> "
>> There is no need to hold the global domctl lock across domain_kill() - the domain lock is fully sufficient here, and parallel cleanup after multiple domains performs quite a bit better this way.
>> "
>>
>> Clearly, the original commit message is lacking details on the exact setups and numbers. But we now have two stakeholders with proof that your patch is harmful to setups you claimed would perform better with your patch.
>>
>> To me this is enough justification to revert the original patch. Anyone against the revert should provide clear details of why the patch should not be reverted.
>
> I second a revert.
>
> I was concerned at the time that the claim was unsubstantiated, and now there is plenty of evidence to counter the claim.

Well, I won't object to a proper revert. I still think we'd better get to the bottom of this, not least because I thought there was agreement that mid to long term we should get rid of global locking wherever possible. Or are both of you saying that using a global lock here is obviously fine? And does either of you have at least a theory to explain the observation? I can only say that I find it puzzling.

Jan
Hi Jan,

On 17/09/2021 14:47, Jan Beulich wrote:
> On 17.09.2021 11:41, Andrew Cooper wrote:
>> On 17/09/2021 10:27, Julien Grall wrote:
>>> Hi,
>>>
>>> (+ some AWS folks)
>>>
>>> On 17/09/2021 11:17, Jan Beulich wrote:
>>>> On 16.09.2021 19:52, Andrew Cooper wrote:
>>>>> On 16/09/2021 13:30, Jan Beulich wrote:
>>>>>> On 16.09.2021 13:10, Dmitry Isaikin wrote:
>>>>>>> From: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
>>>>>>>
>>>>>>> This significantly speeds up concurrent destruction of multiple domains on x86.
>>>>>> This effectively is a simplistic revert of 228ab9992ffb ("domctl: improve locking during domain destruction"). There it was found to actually improve things;
>>>>>
>>>>> Was it? I recall that it was simply an expectation that performance would be better...
>>>>
>>>> My recollection is that it was, for one of our customers.
>>>>
>>>>> Amazon previously identified 228ab9992ffb as a massive perf hit, too.
>>>>
>>>> Interesting. I don't recall any mail to that effect.
>>>
>>> Here we go:
>>>
>>> https://lore.kernel.org/xen-devel/de46590ad566d9be55b26eaca0bc4dc7fbbada59.1585063311.git.hongyxia@amazon.com/
>>>
>>> We have been using the revert for quite a while in production and didn't notice any regression.
>>>
>>>>> Clearly some of the reasoning behind 228ab9992ffb was flawed and/or incomplete, and it appears as if it wasn't necessarily a wise move in hindsight.
>>>>
>>>> Possible; I continue to think though that the present observation wants proper understanding instead of more or less blindly undoing that change.
>>>
>>> To be honest, I think this is the other way around. You wrote and merged a patch with the following justification:
>>>
>>> "
>>> There is no need to hold the global domctl lock across domain_kill() - the domain lock is fully sufficient here, and parallel cleanup after multiple domains performs quite a bit better this way.
>>> "
>>>
>>> Clearly, the original commit message is lacking details on the exact setups and numbers. But we now have two stakeholders with proof that your patch is harmful to setups you claimed would perform better with your patch.
>>>
>>> To me this is enough justification to revert the original patch. Anyone against the revert should provide clear details of why the patch should not be reverted.
>>
>> I second a revert.
>>
>> I was concerned at the time that the claim was unsubstantiated, and now there is plenty of evidence to counter the claim.
>
> Well, I won't object to a proper revert. I still think we'd better get to the bottom of this, not least because I thought there was agreement that mid to long term we should get rid of global locking wherever possible. Or are both of you saying that using a global lock here is obviously fine? And does either of you have at least a theory to explain the observation? I can only say that I find it puzzling.

I will quote what Hongyan wrote back on the first report:

"
The best solution is to make the heap scalable instead of a global lock, but that is not going to be trivial.

Of course, another solution is to keep the domctl lock dropped in domain_kill() but have another domain_kill lock so that competing domain_kill()s will try to take that lock and back off with hypercall continuation. But this is kind of hacky (we introduce a lock to reduce spinlock contention elsewhere), which is probably not a solution but a workaround.
"

Cheers,
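For illustration only, a minimal sketch of what Hongyan's second suggestion might look like in do_domctl()'s XEN_DOMCTL_destroydomain handling; the domain_kill_lock name is made up here, and this is a sketch rather than a proposed or tested patch:

    /* Sketch: serialise domain_kill() on a dedicated lock instead of the
     * global domctl lock, backing off with a continuation when contended.
     * The lock name is illustrative only. */
    static DEFINE_SPINLOCK(domain_kill_lock);

        case XEN_DOMCTL_destroydomain:
            domctl_lock_release();
            if ( !spin_trylock(&domain_kill_lock) )
            {
                /* Another destroy is in progress: retry via continuation. */
                ret = hypercall_create_continuation(
                    __HYPERVISOR_domctl, "h", u_domctl);
                goto domctl_out_unlock_domonly;
            }
            domain_lock(d);
            ret = domain_kill(d);
            domain_unlock(d);
            spin_unlock(&domain_kill_lock);
            if ( ret == -ERESTART )
                ret = hypercall_create_continuation(
                    __HYPERVISOR_domctl, "h", u_domctl);
            goto domctl_out_unlock_domonly;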
On 17.09.2021 18:01, Julien Grall wrote:
> I will quote what Hongyan wrote back on the first report:
>
> "
> The best solution is to make the heap scalable instead of a global lock, but that is not going to be trivial.
>
> Of course, another solution is to keep the domctl lock dropped in domain_kill() but have another domain_kill lock so that competing domain_kill()s will try to take that lock and back off with hypercall continuation. But this is kind of hacky (we introduce a lock to reduce spinlock contention elsewhere), which is probably not a solution but a workaround.
> "

Interesting. Is contention on the heap lock merely suspected, or was that lock proven to be the problem one?

Jan
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 12d6144d28..b9a50d3e5d 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -497,14 +497,13 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
         break;
 
     case XEN_DOMCTL_destroydomain:
-        domctl_lock_release();
         domain_lock(d);
         ret = domain_kill(d);
         domain_unlock(d);
         if ( ret == -ERESTART )
             ret = hypercall_create_continuation(
                 __HYPERVISOR_domctl, "h", u_domctl);
-        goto domctl_out_unlock_domonly;
+        break;
 
     case XEN_DOMCTL_setnodeaffinity:
     {