From patchwork Fri May 27 18:11:40 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Radim Krčmář
X-Patchwork-Id: 9138847
Date: Fri, 27 May 2016 20:11:40 +0200
From: Radim Krčmář
To: Roman Kagan, kvm@vger.kernel.org, "Denis V. Lunev", Owen Hofmann,
 Paolo Bonzini, Marcelo Tosatti
Subject: Re: [PATCH] x86/kvm: fix condition to update kvm master clocks
Message-ID: <20160527181139.GA18797@potion>
References: <1464274195-31296-1-git-send-email-rkagan@virtuozzo.com>
 <20160526201936.GA25334@potion>
 <20160527172809.GB17398@rkaganb.sw.ru>
In-Reply-To: <20160527172809.GB17398@rkaganb.sw.ru>
X-Mailing-List: kvm@vger.kernel.org

2016-05-27 20:28+0300, Roman Kagan:
> On Thu, May 26, 2016 at 10:19:36PM +0200, Radim Krčmář wrote:
>> > atomic_read(&kvm_guest_has_master_clock) != 0)
>>
>> And I don't see why we don't want to enable master clock if the host
>> switches back to TSC.
>
> Agreed (even though I guess it's not very likely: AFAICS once switched
> to a different clocksource, the host can switch back to TSC only upon
> human manipulating /sys/devices/system/clocksource).

Yeah, it's a corner case.  A human would have to switch away from tsc as
well; the automatic switch happens only when tsc is not usable anymore,
AFAIK.

>> > queue_work(system_long_wq, &pvclock_gtod_work);
>>
>> Queueing unconditionally seems to be the correct thing to do.
>
> The notifier is registered at kvm module init, so the work will be
> scheduled even when there are no VMs at all.

Good point, we don't want to call pvclock_gtod_notify in that case
either.  Registering (unregistering) with the first (last) VM should be
good enough ... what about adding something based on this?
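
For illustration only, and separate from the kernel hunk that follows, the
first/last-VM idea can be sketched as a standalone userspace program.  All
names here (list_node, vm_create, arch_create_first_vm, ...) are
hypothetical stand-ins, not kernel API.  One assumption is made about the
intent: the emptiness test is applied to the global list head, because a
node that is not linked yet cannot tell whether other VMs exist.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

/* Global list of VMs; an empty list points back at its own head. */
static struct list_node vm_list = { &vm_list, &vm_list };

/* Stand-in for "the gtod notifier is currently registered". */
static int notifier_registered;

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_insert(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_remove(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
}

struct vm {
	struct list_node link;
};

/* Hypothetical arch hooks: register with the first VM, unregister with the last. */
static void arch_create_first_vm(void)
{
	notifier_registered = 1;
}

static void arch_destroy_last_vm(void)
{
	notifier_registered = 0;
}

/* The caller is assumed to hold whatever lock protects vm_list. */
static void vm_create(struct vm *vm)
{
	if (list_is_empty(&vm_list))	/* no VM yet, so this one is the first */
		arch_create_first_vm();
	list_insert(&vm->link, &vm_list);
}

static void vm_destroy(struct vm *vm)
{
	assert(!list_is_empty(&vm_list));	/* vm must be on the list */
	list_remove(&vm->link);
	if (list_is_empty(&vm_list))	/* that was the last VM */
		arch_destroy_last_vm();
}
```

Creating two VMs and destroying them in any order leaves
notifier_registered at 1 until the second vm_destroy() call, after which
it drops back to 0.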

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 37af23052470..0779f0f01523 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -655,6 +655,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
 		goto out_err;
 
 	spin_lock(&kvm_lock);
+	if (list_empty(&kvm->vm_list))
+		kvm_arch_create_first_vm(kvm);
 	list_add(&kvm->vm_list, &vm_list);
 	spin_unlock(&kvm_lock);
 
@@ -709,6 +711,8 @@ static void kvm_destroy_vm(struct kvm *kvm)
 	kvm_arch_sync_events(kvm);
 	spin_lock(&kvm_lock);
 	list_del(&kvm->vm_list);
+	if (list_empty(&kvm->vm_list))
+		kvm_arch_destroy_last_vm(kvm);
 	spin_unlock(&kvm_lock);
 	kvm_free_irq_routing(kvm);
 	for (i = 0; i < KVM_NR_BUSES; i++)

>> Interaction between kvm_gen_update_masterclock(), pvclock_gtod_work(),
>> and NTP could be a problem: kvm_gen_update_masterclock() only has to
>> run once per VM, but pvclock_gtod_work() calls it on every VCPU, so
>> frequent NTP updates on bigger guests could kill performance.
>
> Unfortunately, things are worse than that: this stuff is updated on
> every *tick* on the timekeeping CPU, so, as long as you keep at least
> one of your CPUs busy, the update rate can reach HZ.  The frequency of
> NTP updates is unimportant; it happens without NTP updates at all.
>
> So I tend to agree that we're perhaps better off not fixing this bug and
> leaving the kvm_clocks to drift until we figure out how to do it with
> acceptable overhead.

Yuck ... the hunk below could help a bit.  I haven't checked whether the
timekeeping code updates gtod, and therefore sets 'was_set', even when the
resulting time hasn't changed, so we might need to do more to avoid
useless updates.
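
As a rough standalone model of that filter (not the kernel code; the names
gtod_notify, expensive_update and simulate are made up for this sketch),
the callback below returns early unless 'was_set' is nonzero.  It assumes
'was_set' is nonzero only when the time was actually set, which is exactly
the point left unchecked above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for update_pvclock_gtod() plus queueing the work item. */
static void expensive_update(unsigned long *counter)
{
	(*counter)++;
}

/*
 * Notifier-style callback.  'was_set' is assumed to be nonzero only when
 * the clock was really set; ordinary ticks are assumed to pass 0.
 */
static int gtod_notify(unsigned long was_set, void *priv)
{
	assert(priv != NULL);

	if (!was_set)
		return 0;	/* ordinary tick: nothing to propagate */

	expensive_update(priv);
	return 0;
}

/*
 * Deliver 'ticks' notifications, with every 'set_every'-th one marked as
 * a clock set (0 means never), and return how often the expensive path ran.
 */
static unsigned long simulate(unsigned long ticks, unsigned long set_every)
{
	unsigned long updates = 0;

	for (unsigned long i = 1; i <= ticks; i++) {
		unsigned long was_set = set_every != 0 && i % set_every == 0;

		gtod_notify(was_set, &updates);
	}
	return updates;
}
```

In this model, simulate(1000, 250) runs the expensive path 4 times rather
than 1000, and simulate(1000, 0) never runs it.  If the timekeeping core
reports 'was_set' on every tick, the filter saves nothing.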

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a8c7ca34ee5d..37ed0a342bf1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5802,12 +5802,15 @@ static DECLARE_WORK(pvclock_gtod_work, pvclock_gtod_update_fn);
 /*
  * Notification about pvclock gtod data update.
  */
-static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long unused,
+static int pvclock_gtod_notify(struct notifier_block *nb, unsigned long was_set,
 			       void *priv)
 {
 	struct pvclock_gtod_data *gtod = &pvclock_gtod_data;
 	struct timekeeper *tk = priv;
 
+	if (!was_set)
+		return 0;
+
 	update_pvclock_gtod(tk);
 
 	/* disable master clock if host does not trust, or does not