Message ID | 20160527181139.GA18797@potion (mailing list archive) |
---|---|
State | New, archived |
On Fri, May 27, 2016 at 08:11:40PM +0200, Radim Krčmář wrote:
> 2016-05-27 20:28+0300, Roman Kagan:
> >> Queueing unconditionally seems to be the correct thing to do.
> >
> > The notifier is registered at kvm module init, so the work will be
> > scheduled even when there are no VMs at all.
>
> Good point, we don't want to call pvclock_gtod_notify in that case
> either.  Registering (unregistering) with the first (last) VM should be
> good enough ... what about adding something based on this?
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 37af23052470..0779f0f01523 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -655,6 +655,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
>  		goto out_err;
>
>  	spin_lock(&kvm_lock);
> +	if (list_empty(&kvm->vm_list))
> +		kvm_arch_create_first_vm(kvm);
>  	list_add(&kvm->vm_list, &vm_list);
>  	spin_unlock(&kvm_lock);
>
> @@ -709,6 +711,8 @@ static void kvm_destroy_vm(struct kvm *kvm)
>  	kvm_arch_sync_events(kvm);
>  	spin_lock(&kvm_lock);
>  	list_del(&kvm->vm_list);
> +	if (list_empty(&kvm->vm_list))
> +		kvm_arch_destroy_last_vm(kvm);
>  	spin_unlock(&kvm_lock);
>  	kvm_free_irq_routing(kvm);
>  	for (i = 0; i < KVM_NR_BUSES; i++)

Makes perfect sense IMO.

> >> Interaction between kvm_gen_update_masterclock(), pvclock_gtod_work(),
> >> and NTP could be a problem: kvm_gen_update_masterclock() only has to
> >> run once per VM, but pvclock_gtod_work() calls it on every VCPU, so
> >> frequent NTP updates on bigger guests could kill performance.
> >
> > Unfortunately, things are worse than that: this stuff is updated on
> > every *tick* on the timekeeping CPU, so, as long as you keep at least
> > one of your CPUs busy, the update rate can reach HZ.  The frequency of
> > NTP updates is unimportant; it happens without NTP updates at all.
> >
> > So I tend to agree that we're perhaps better off not fixing this bug and
> > leaving the kvm_clocks to drift until we figure out how to do it with
> > acceptable overhead.
>
> Yuck ... the hunk below could help a bit.
> I haven't checked if the timekeeping code updates gtod and therefore
> sets 'was_set' even when the resulting time hasn't changed, so we might
> need to do more to avoid useless situations.
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a8c7ca34ee5d..37ed0a342bf1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5802,12 +5802,15 @@ static DECLARE_WORK(pvclock_gtod_work, pvclock_gtod_update_fn);
>  /*
>   * Notification about pvclock gtod data update.
>   */
> -static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
> +static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long was_set,
>  			       void *priv)
>  {
>  	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
>  	struct timekeeper *tk = priv;
>
> +	if (!was_set)
> +		return 0;
> +
>  	update_pvclock_gtod(tk);
>

Nope, this parameter is only set when there's a step-like change in the
time.  The timekeeper itself is always updated.  I guess we could
mitigate the costs somewhat if we skipped updating the gtod copy until
the accumulated error reaches certain limit; not sure if that's gonna
help though.

Roman.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
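
The "register with the first VM, unregister with the last VM" idea discussed in this message can be modelled in user space as follows. This is only a sketch under stated assumptions, not kernel code: the list helpers, `all_vms`, `vm_create()`, `vm_destroy()` and the `notifier_registered` flag are hypothetical stand-ins, and locking is left out. Note that the sketch tests the global list head for emptiness. The quoted hunk tests `&kvm->vm_list`, which is the per-VM node; the assumption here is that the intent was to check whether any VM is on the global list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_node(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

static void list_del_node(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = NULL;
}

/* Hypothetical VM object; only the list linkage matters here. */
struct vm { struct list_head vm_list; };

static struct list_head all_vms;   /* global list head (assumed role of vm_list) */
static bool notifier_registered;   /* stands in for the gtod notifier state */
static int register_calls;         /* how often the "first VM" hook ran */

static void vm_registry_init(void)
{
	list_init(&all_vms);
	notifier_registered = false;
	register_calls = 0;
}

static void vm_create(struct vm *vm)
{
	/* First VM: the global list is still empty before we add ourselves. */
	if (list_is_empty(&all_vms)) {
		notifier_registered = true;
		register_calls++;
	}
	list_add_node(&vm->vm_list, &all_vms);
}

static void vm_destroy(struct vm *vm)
{
	/* A VM can only be destroyed while the notifier is registered. */
	assert(notifier_registered);
	list_del_node(&vm->vm_list);
	/* Last VM: the global list is empty after we removed ourselves. */
	if (list_is_empty(&all_vms))
		notifier_registered = false;
}
```

With this shape, creating two VMs runs the "first VM" hook once, and the notifier stays registered until both VMs are gone.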
2016-05-27 21:46+0300, Roman Kagan:
> On Fri, May 27, 2016 at 08:11:40PM +0200, Radim Krčmář wrote:
> > 2016-05-27 20:28+0300, Roman Kagan:
>> >> Interaction between kvm_gen_update_masterclock(), pvclock_gtod_work(),
>> >> and NTP could be a problem: kvm_gen_update_masterclock() only has to
>> >> run once per VM, but pvclock_gtod_work() calls it on every VCPU, so
>> >> frequent NTP updates on bigger guests could kill performance.
>> >
>> > Unfortunately, things are worse than that: this stuff is updated on
>> > every *tick* on the timekeeping CPU, so, as long as you keep at least
>> > one of your CPUs busy, the update rate can reach HZ.  The frequency of
>> > NTP updates is unimportant; it happens without NTP updates at all.
>> >
>> > So I tend to agree that we're perhaps better off not fixing this bug and
>> > leaving the kvm_clocks to drift until we figure out how to do it with
>> > acceptable overhead.
>>
>> Yuck ... the hunk below could help a bit.
>> I haven't checked if the timekeeping code updates gtod and therefore
>> sets 'was_set' even when the resulting time hasn't changed, so we might
>> need to do more to avoid useless situations.
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a8c7ca34ee5d..37ed0a342bf1 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5802,12 +5802,15 @@ static DECLARE_WORK(pvclock_gtod_work, pvclock_gtod_update_fn);
>>  /*
>>   * Notification about pvclock gtod data update.
>>   */
>> -static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
>> +static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long was_set,
>>  			       void *priv)
>>  {
>>  	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
>>  	struct timekeeper *tk = priv;
>>
>> +	if (!was_set)
>> +		return 0;
>> +
>>  	update_pvclock_gtod(tk);
>>
>
> Nope, this parameter is only set when there's a step-like change in the
> time.  The timekeeper itself is always updated.  I guess we could
> mitigate the costs somewhat if we skipped updating the gtod copy until
> the accumulated error reaches certain limit; not sure if that's gonna
> help though.

I see, timekeeping_adjust() isn't covered, but it should not adjust
every tick, so we could propagate information about adjustments to
pvclock_gtod_notify (rename unused to has_changed), because pvclock only
cares about change of time.

Adding another threshold is a reasonable improvement if adjustments
happen too often, but we need to fix pvclock_gtod_update_fn() in any
case.

Am I missing anything else?

Thanks.
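
The two mitigations mentioned in this exchange (acting on a "time has changed" flag, and deferring the refresh of the gtod copy until the accumulated error passes a limit) can be combined in a small user-space model. This is a sketch under assumptions, not the kernel implementation: `struct gtod_copy`, `gtod_notify()`, the single `offset_ns` field and the `GTOD_ERR_LIMIT_NS` value are all hypothetical, and the real notifier would also have to decide whether to queue pvclock_gtod_work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the host timekeeping state as seen by guests. */
struct gtod_copy {
	int64_t offset_ns;     /* last value published to the copy */
	unsigned int updates;  /* how many times the copy was refreshed */
};

/* Hypothetical limit on the error tolerated before refreshing the copy. */
#define GTOD_ERR_LIMIT_NS 1000

/*
 * Model of the notifier callback. 'was_set' mirrors a step-like change
 * of the host time; small adjustments arrive with was_set == false.
 * Returns true when the copy was refreshed, i.e. when the follow-up
 * work would have to be queued.
 */
static bool gtod_notify(struct gtod_copy *g, int64_t host_offset_ns,
			bool was_set)
{
	int64_t err;

	assert(g != NULL);

	/* Error accumulated between the host value and the stale copy. */
	err = host_offset_ns - g->offset_ns;
	if (err < 0)
		err = -err;

	/* Skip the update unless time was stepped or the error is too big. */
	if (!was_set && err < GTOD_ERR_LIMIT_NS)
		return false;

	g->offset_ns = host_offset_ns;
	g->updates++;
	return true;
}
```

In this model a run of per-tick notifications with small adjustments leaves the copy untouched, while a step change or an error at or above the limit refreshes it at once. Whether a fixed limit of this kind gives acceptable guest clock accuracy is exactly the open question in the thread.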
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 37af23052470..0779f0f01523 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -655,6 +655,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
 		goto out_err;

 	spin_lock(&kvm_lock);
+	if (list_empty(&kvm->vm_list))
+		kvm_arch_create_first_vm(kvm);
 	list_add(&kvm->vm_list, &vm_list);
 	spin_unlock(&kvm_lock);

@@ -709,6 +711,8 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	kvm_arch_sync_events(kvm);
 	spin_lock(&kvm_lock);
 	list_del(&kvm->vm_list);
+	if (list_empty(&kvm->vm_list))
+		kvm_arch_destroy_last_vm(kvm);
 	spin_unlock(&kvm_lock);
 	kvm_free_irq_routing(kvm);
 	for (i = 0; i < KVM_NR_BUSES; i++)

>> Interaction between kvm_gen_update_masterclock(), pvclock_gtod_work(),
>> and NTP could be a problem: kvm_gen_update_masterclock() only has to
>> run once per VM, but pvclock_gtod_work() calls it on every VCPU, so
>> frequent NTP updates on bigger guests could kill performance.
>
> Unfortunately, things are worse than that: this stuff is updated on
> every *tick* on the timekeeping CPU, so, as long as you keep at least
> one of your CPUs busy, the update rate can reach HZ.  The frequency of
> NTP updates is unimportant; it happens without NTP updates at all.
>
> So I tend to agree that we're perhaps better off not fixing this bug and
> leaving the kvm_clocks to drift until we figure out how to do it with
> acceptable overhead.

Yuck ... the hunk below could help a bit.
I haven't checked if the timekeeping code updates gtod and therefore
sets 'was_set' even when the resulting time hasn't changed, so we might
need to do more to avoid useless situations.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a8c7ca34ee5d..37ed0a342bf1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5802,12 +5802,15 @@ static DECLARE_WORK(pvclock_gtod_work, pvclock_gtod_update_fn);
 /*
  * Notification about pvclock gtod data update.
  */
-static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
+static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long was_set,
 			       void *priv)
 {
 	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
 	struct timekeeper *tk = priv;

+	if (!was_set)
+		return 0;
+
 	update_pvclock_gtod(tk);

 	/* disable master clock if host does not trust, or does not