From patchwork Mon Sep 19 11:39:12 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paolo Bonzini X-Patchwork-Id: 9339123 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 93616607D0 for ; Mon, 19 Sep 2016 11:40:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 83FBE28960 for ; Mon, 19 Sep 2016 11:40:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 787A928A10; Mon, 19 Sep 2016 11:40:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.3 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI, RCVD_IN_SORBS_SPAM, T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1E32228960 for ; Mon, 19 Sep 2016 11:40:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932322AbcISLje (ORCPT ); Mon, 19 Sep 2016 07:39:34 -0400 Received: from mail-wm0-f68.google.com ([74.125.82.68]:35381 "EHLO mail-wm0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761027AbcISLjW (ORCPT ); Mon, 19 Sep 2016 07:39:22 -0400 Received: by mail-wm0-f68.google.com with SMTP id 133so14467525wmq.2; Mon, 19 Sep 2016 04:39:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=S2iB3UiDFuBZW8ikL6+11zIkQrQaph48QKeRnGTHh1c=; b=rAUvpGlZkODb4qmZnqEItJTwPf97EmeJf/+0eaIu2ldZNfeZOGhr3AVfPvY9KWWReb Ikr+Vtxm4T+XwWwl9iiltEyZc0OMo4w8b6hsYljRTQOJfFGQvdz4/s174Mz8Iolov52u T6ov/xdh+SGs3JgOK8N7B20oayfNvSvbB2xsBmV0EKaKZEVkD3xaH3x4CzYJq48BrF89 jL080/9izehiAuxccZG+4i+9X/6P52HrxW6TDXB89HD7HSibAPldzmPy0odoxcCoiT5g IEUe2VRpOPgTuapZjk8gVPau+C/tF2NjIQ1ZC4BPlPEtwLRPXdzvW4iuoxVg6Ul8+CuB J+nQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references; bh=S2iB3UiDFuBZW8ikL6+11zIkQrQaph48QKeRnGTHh1c=; b=bdXbEzClmo26kNQhYQ5ZotNTCGpuumRRiGxt1TWn5cSNl+iXGUQR2p4qqjQZFFVGBd SXWuJl75OiY8Ef83i48OVqBen2jDYdHbvkpqmpIdpwkK5TpmTntkCyHE7wHaZZ0mGaTO oi4r5kg30uNspd/dg/uwud7MH06gV1xYtfdutV3AFwOWH63elvA7zqQiRGBKs6IaCzoP +kJUKe7B8aNzM+ofl5EqbQ74QKdPGnWNYHTdy17p5L0r/s6BAPoprkrgS2WYGWXGlNVN ekPLF7mH4SQhl2do5Yns5bWBkq7lbhFI19cD6dyCDcI+RrjX3t/3Aa4BuyE9zXnA0xNu Zz0Q== X-Gm-Message-State: AE9vXwPKVHpK/ymKMPC3oMUvKU4Ss1BiVh6iEatmgAJ9kNi+606YVDS9y/V5YS1VJcehrg== X-Received: by 10.28.74.156 with SMTP id n28mr6646372wmi.96.1474285159961; Mon, 19 Sep 2016 04:39:19 -0700 (PDT) Received: from 640k.lan (94-39-176-182.adsl-ull.clienti.tiscali.it. [94.39.176.182]) by smtp.gmail.com with ESMTPSA id f8sm22676029wjh.45.2016.09.19.04.39.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 19 Sep 2016 04:39:19 -0700 (PDT) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: rkrcmar@redhat.com, den@openvz.org, rkagan@virtuozzo.com, peterhornyack@google.com Subject: [PATCH 3/4] KVM: x86: introduce get_kvmclock_ns Date: Mon, 19 Sep 2016 13:39:12 +0200 Message-Id: <1474285153-4216-4-git-send-email-pbonzini@redhat.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1474285153-4216-1-git-send-email-pbonzini@redhat.com> References: <1474285153-4216-1-git-send-email-pbonzini@redhat.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Introduce a function that reads the exact nanoseconds value that is provided to the guest in kvmclock. This crystallizes the notion of kvmclock as a thin veneer over a stable TSC, that the guest will (hopefully) convert with NTP. In other words, kvmclock is *not* a paravirtualized host-to-guest NTP. Drop the get_kernel_ns() function, that was used both to get the base value of the master clock and to get the current value of kvmclock. The former use is replaced by ktime_get_boot_ns(), the latter is the purpose of get_kernel_ns(). This also allows KVM to provide a Hyper-V time reference counter that is synchronized with the time that is computed from the TSC page. Signed-off-by: Paolo Bonzini Reviewed-by: Roman Kagan --- arch/x86/entry/vdso/vclock_gettime.c | 2 +- arch/x86/include/asm/pvclock.h | 5 ++-- arch/x86/kernel/pvclock.c | 2 +- arch/x86/kvm/hyperv.c | 2 +- arch/x86/kvm/x86.c | 48 +++++++++++++++++++++++++++--------- arch/x86/kvm/x86.h | 6 +---- 6 files changed, 43 insertions(+), 22 deletions(-) diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c index 94d54d0defa7..02223cb4bcfd 100644 --- a/arch/x86/entry/vdso/vclock_gettime.c +++ b/arch/x86/entry/vdso/vclock_gettime.c @@ -129,7 +129,7 @@ static notrace cycle_t vread_pvclock(int *mode) return 0; } - ret = __pvclock_read_cycles(pvti); + ret = __pvclock_read_cycles(pvti, rdtsc_ordered()); } while (pvclock_read_retry(pvti, version)); /* refer to vread_tsc() comment for rationale */ diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h index d019f0cc80ec..3ad741b84072 100644 --- a/arch/x86/include/asm/pvclock.h +++ b/arch/x86/include/asm/pvclock.h @@ -87,9 +87,10 @@ static inline u64 pvclock_scale_delta(u64 delta, u32 mul_frac, int shift) } static __always_inline -cycle_t __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src) +cycle_t __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src, + u64 tsc) { - u64 delta = rdtsc_ordered() - src->tsc_timestamp; + u64 delta = tsc - src->tsc_timestamp; cycle_t offset = pvclock_scale_delta(delta, src->tsc_to_system_mul, src->tsc_shift); return src->system_time + offset; diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c index 3599404e3089..5b2cc889ce34 100644 --- a/arch/x86/kernel/pvclock.c +++ b/arch/x86/kernel/pvclock.c @@ -80,7 +80,7 @@ cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src) do { version = pvclock_read_begin(src); - ret = __pvclock_read_cycles(src); + ret = __pvclock_read_cycles(src, rdtsc_ordered()); flags = src->flags; } while (pvclock_read_retry(src, version)); diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 01bd7b7a6866..ed5b77f39ffb 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -386,7 +386,7 @@ static void synic_init(struct kvm_vcpu_hv_synic *synic) static u64 get_time_ref_counter(struct kvm *kvm) { - return div_u64(get_kernel_ns() + kvm->arch.kvmclock_offset, 100); + return div_u64(get_kvmclock_ns(kvm), 100); } static void stimer_mark_pending(struct kvm_vcpu_hv_stimer *stimer, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 00e569c3ca71..81e9945cdf28 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1431,7 +1431,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr) raw_spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags); offset = kvm_compute_tsc_offset(vcpu, data); - ns = get_kernel_ns(); + ns = ktime_get_boot_ns(); elapsed = ns - kvm->arch.last_tsc_nsec; if (vcpu->arch.virtual_tsc_khz) { @@ -1722,6 +1722,34 @@ static void kvm_gen_update_masterclock(struct kvm *kvm) #endif } +static u64 __get_kvmclock_ns(struct kvm *kvm) +{ + struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0); + struct kvm_arch *ka = &kvm->arch; + s64 ns; + + if (vcpu->arch.hv_clock.flags & PVCLOCK_TSC_STABLE_BIT) { + u64 tsc = kvm_read_l1_tsc(vcpu, rdtsc()); + ns = __pvclock_read_cycles(&vcpu->arch.hv_clock, tsc); + } else { + ns = ktime_get_boot_ns() + ka->kvmclock_offset; + } + + return ns; +} + +u64 get_kvmclock_ns(struct kvm *kvm) +{ + unsigned long flags; + s64 ns; + + local_irq_save(flags); + ns = __get_kvmclock_ns(kvm); + local_irq_restore(flags); + + return ns; +} + static void kvm_setup_pvclock_page(struct kvm_vcpu *v) { struct kvm_vcpu_arch *vcpu = &v->arch; @@ -1811,7 +1839,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v) } if (!use_master_clock) { host_tsc = rdtsc(); - kernel_ns = get_kernel_ns(); + kernel_ns = ktime_get_boot_ns(); } tsc_timestamp = kvm_read_l1_tsc(v, host_tsc); @@ -4054,7 +4082,6 @@ long kvm_arch_vm_ioctl(struct file *filp, case KVM_SET_CLOCK: { struct kvm_clock_data user_ns; u64 now_ns; - s64 delta; r = -EFAULT; if (copy_from_user(&user_ns, argp, sizeof(user_ns))) @@ -4066,10 +4093,9 @@ long kvm_arch_vm_ioctl(struct file *filp, r = 0; local_irq_disable(); - now_ns = get_kernel_ns(); - delta = user_ns.clock - now_ns; + now_ns = __get_kvmclock_ns(kvm); + kvm->arch.kvmclock_offset += user_ns.clock - now_ns; local_irq_enable(); - kvm->arch.kvmclock_offset = delta; kvm_gen_update_masterclock(kvm); break; } @@ -4077,10 +4103,8 @@ long kvm_arch_vm_ioctl(struct file *filp, struct kvm_clock_data user_ns; u64 now_ns; - local_irq_disable(); - now_ns = get_kernel_ns(); - user_ns.clock = kvm->arch.kvmclock_offset + now_ns; - local_irq_enable(); + now_ns = get_kvmclock_ns(kvm); + user_ns.clock = now_ns; user_ns.flags = 0; memset(&user_ns.pad, 0, sizeof(user_ns.pad)); @@ -7544,7 +7568,7 @@ int kvm_arch_hardware_enable(void) * before any KVM threads can be running. Unfortunately, we can't * bring the TSCs fully up to date with real time, as we aren't yet far * enough into CPU bringup that we know how much real time has actually - * elapsed; our helper function, get_kernel_ns() will be using boot + * elapsed; our helper function, ktime_get_boot_ns() will be using boot * variables that haven't been updated yet. * * So we simply find the maximum observed TSC above, then record the @@ -7779,7 +7803,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) mutex_init(&kvm->arch.apic_map_lock); spin_lock_init(&kvm->arch.pvclock_gtod_sync_lock); - kvm->arch.kvmclock_offset = -get_kernel_ns(); + kvm->arch.kvmclock_offset = -ktime_get_boot_ns(); pvclock_update_vm_gtod_copy(kvm); INIT_DELAYED_WORK(&kvm->arch.kvmclock_update_work, kvmclock_update_fn); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index a82ca466b62e..e8ff3e4ce38a 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -148,11 +148,6 @@ static inline void kvm_register_writel(struct kvm_vcpu *vcpu, return kvm_register_write(vcpu, reg, val); } -static inline u64 get_kernel_ns(void) -{ - return ktime_get_boot_ns(); -} - static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk) { return !(kvm->arch.disabled_quirks & quirk); @@ -164,6 +159,7 @@ void kvm_set_pending_timer(struct kvm_vcpu *vcpu); int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip); void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr); +u64 get_kvmclock_ns(struct kvm *kvm); int kvm_read_guest_virt(struct x86_emulate_ctxt *ctxt, gva_t addr, void *val, unsigned int bytes,