From patchwork Tue Dec 19 08:34:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 13498023 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E7F011709; Tue, 19 Dec 2023 08:35:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Yzjn6pCn" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702974918; x=1734510918; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yppM1vzyIg0mJzvtILq30t1ac8vqS11G7wocpCARxsA=; b=Yzjn6pCn9lHK35vJFmvSSRpYKVJTHpyOp+Z6r5jYoCWDHOl+a5sHi+vM 4CGWIHvNuNozKqro65WQ2ZALaf2aVyFye2JSGqcSUFcijCQ1/v3Vzl5An vjbD1AY9UFWIDEGMDJw9b3L84U2dZ4DA7nXrB2DY5n5egbaW69kY627Rc 6DEXWIhKsmbOiVMOZK85Pwnn5fM7uML2xvqdi8SUGcab7gT9ExnkGBKEl NhALDMBh3AkD7xvzBLKcZeuNZToqHcstKdhsf7tIfSJh6Ghv+DJjifNgI lWnDHMuBHHhd48c4gAxxAzSbejB0TmSTZoTCMOchrjHzewuX0Nx/aGJAD Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="395355754" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="395355754" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:54 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="725658892" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="725658892" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:53 -0800 From: Isaku Yamahata To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Vishal Annapurve , Jim Mattson , Maxim Levitsky , Xiaoyao Li Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com Subject: [PATCH v3 1/4] KVM: x86/hyperv: Calculate APIC bus frequency for hyper-v Date: Tue, 19 Dec 2023 00:34:38 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Remove APIC_BUS_FREUQNCY and calculate it based on APIC bus cycles per NS. APIC_BUS_FREUQNCY is used only for HV_X64_MSR_APIC_FREQUENCY. The MSR is not frequently read, calculate it every time. In order to make APIC bus frequency configurable, we need to make make two related constants into variables. APIC_BUS_FREUQNCY and APIC_BUS_CYCLE_NS. One can be calculated from the other. APIC_BUS_CYCLES_NS = 1000 * 1000 * 1000 / APIC_BUS_FREQUENCY. By removing APIC_BUS_FREQUENCY, we need to track only single variable instead of two. Signed-off-by: Isaku Yamahata Reviewed-by: Maxim Levitsky Reviewed-by: Xiaoyao Li --- Changes v3: - Newly added according to Maxim Levistsky suggestion. --- arch/x86/kvm/hyperv.c | 2 +- arch/x86/kvm/lapic.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 238afd7335e4..a40ca2fef58c 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1687,7 +1687,7 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, data = (u64)vcpu->arch.virtual_tsc_khz * 1000; break; case HV_X64_MSR_APIC_FREQUENCY: - data = APIC_BUS_FREQUENCY; + data = div64_u64(1000000000ULL, APIC_BUS_CYCLE_NS); break; default: kvm_pr_unimpl_rdmsr(vcpu, msr); diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h index 0a0ea4b5dd8c..a20cb006b6c8 100644 --- a/arch/x86/kvm/lapic.h +++ b/arch/x86/kvm/lapic.h @@ -17,7 +17,6 @@ #define APIC_DEST_MASK 0x800 #define APIC_BUS_CYCLE_NS 1 -#define APIC_BUS_FREQUENCY (1000000000ULL / APIC_BUS_CYCLE_NS) #define APIC_BROADCAST 0xFF #define X2APIC_BROADCAST 0xFFFFFFFFul From patchwork Tue Dec 19 08:34:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 13498024 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17DE411710; Tue, 19 Dec 2023 08:35:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="BFdQF7Si" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702974919; x=1734510919; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mIsYk9plOBZsfC+weyfMTrhinzlBM70nbvAtVPy5xdA=; b=BFdQF7SiOFtS/h+zoNR1qR2D9bQ9DX8rG0ZjlB4wTJ+CvwaAQZlZd1Ap WrFMcXEGoko2DvHSSbiTqK5lHpeE7ertYTvNVlE2yzwafOklmnfAmwWx7 l44V5Sz6GZeJ8aUb2pZdKMWlXa+OcickVaZ6fsGa3eyXqsI35RN6J90W6 pQG/77bOVi7fwuH38EvDnTzt6QxiF9QP1AqVsf+J0Y2eAJ2uLJmFYmZzR IFWGbodKDH+071chjm97q3QeMdAaZGpOcwnIjLnX0+0xYET8+jK8+tXpY 2ujpOvMV5L5RoK+BleieqsBYiQVARn7OxTUvY0BYvTpdK03iGfobKHbml A==; X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="395355758" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="395355758" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="725658900" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="725658900" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:54 -0800 From: Isaku Yamahata To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Vishal Annapurve , Jim Mattson , Maxim Levitsky , Xiaoyao Li Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com Subject: [PATCH v3 2/4] KVM: x86: Make the APIC bus cycles per nsec VM variable Date: Tue, 19 Dec 2023 00:34:39 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Introduce the VM variable of the APIC cycles per nano second as the preparation to make the APIC APIC bus frequency configurable. The TDX architecture hard-codes the APIC bus frequency to 25MHz in the CPUID leaf 0x15. The TDX mandates it to be exposed and doesn't allow the VMM to override its value. The KVM APIC timer emulation hard-codes the frequency to 1GHz. To ensure that the guest doesn't have a conflicting view of the APIC bus frequency, allow the userspace to tell KVM to use the same frequency that TDX mandates instead of the default 1Ghz. Signed-off-by: Isaku Yamahata Reviewed-by: Maxim Levitsky --- Changes v3: - Update commit message. - Dropped apic_bus_frequency according to Maxim Levitsky. Changes v2: - no change. --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/hyperv.c | 3 ++- arch/x86/kvm/lapic.c | 6 ++++-- arch/x86/kvm/lapic.h | 2 +- arch/x86/kvm/x86.c | 1 + 5 files changed, 9 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index d7036982332e..45ee7a1d13f6 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1334,6 +1334,7 @@ struct kvm_arch { u32 default_tsc_khz; bool user_set_tsc; + u64 apic_bus_cycle_ns; seqcount_raw_spinlock_t pvclock_sc; bool use_master_clock; diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index a40ca2fef58c..37ff31c18ad1 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1687,7 +1687,8 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, data = (u64)vcpu->arch.virtual_tsc_khz * 1000; break; case HV_X64_MSR_APIC_FREQUENCY: - data = div64_u64(1000000000ULL, APIC_BUS_CYCLE_NS); + data = div64_u64(1000000000ULL, + vcpu->kvm->arch.apic_bus_cycle_ns); break; default: kvm_pr_unimpl_rdmsr(vcpu, msr); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 245b20973cae..73956b0ac1f1 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1542,7 +1542,8 @@ static u32 apic_get_tmcct(struct kvm_lapic *apic) remaining = 0; ns = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period); - return div64_u64(ns, (APIC_BUS_CYCLE_NS * apic->divide_count)); + return div64_u64(ns, (apic->vcpu->kvm->arch.apic_bus_cycle_ns * + apic->divide_count)); } static void __report_tpr_access(struct kvm_lapic *apic, bool write) @@ -1960,7 +1961,8 @@ static void start_sw_tscdeadline(struct kvm_lapic *apic) static inline u64 tmict_to_ns(struct kvm_lapic *apic, u32 tmict) { - return (u64)tmict * APIC_BUS_CYCLE_NS * (u64)apic->divide_count; + return (u64)tmict * apic->vcpu->kvm->arch.apic_bus_cycle_ns * + (u64)apic->divide_count; } static void update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor) diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h index a20cb006b6c8..51e09f5a7fc5 100644 --- a/arch/x86/kvm/lapic.h +++ b/arch/x86/kvm/lapic.h @@ -16,7 +16,7 @@ #define APIC_DEST_NOSHORT 0x0 #define APIC_DEST_MASK 0x800 -#define APIC_BUS_CYCLE_NS 1 +#define APIC_BUS_CYCLE_NS_DEFAULT 1 #define APIC_BROADCAST 0xFF #define X2APIC_BROADCAST 0xFFFFFFFFul diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1a3aaa7dafae..d7d865f7c847 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12466,6 +12466,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); kvm->arch.default_tsc_khz = max_tsc_khz ? : tsc_khz; + kvm->arch.apic_bus_cycle_ns = APIC_BUS_CYCLE_NS_DEFAULT; kvm->arch.guest_can_read_msr_platform_info = true; kvm->arch.enable_pmu = enable_pmu; From patchwork Tue Dec 19 08:34:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 13498025 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2B84211725; Tue, 19 Dec 2023 08:35:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="IkTRuIQi" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702974919; x=1734510919; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QOT8xDWFhRMaeV97RW8V0jSs4k7dpFvDcqIoLtVT2/Y=; b=IkTRuIQixJexHuJfGzdGwdYW+nsHu77FUm/qj7H+Y6uY/Hoadurm1fHv WwROvIVeZxQWrC5DMWnAoa68v/GD2I88c66FToYJWYVQ+sSajdd+JleDa Fa0458SxNmZXNt1DiQoXMKuvj9loPV0fz/exqGcHxdggNw+pmT/Lx3gsW pVhXnxOVQiocm6WdNsoq+GSqHTwHkIpwDPpWy7OObpKdPPuNBFY6Z8zER kztPWRYkFu/zm8pGt5moYnWH1AkdZ382e7H78LdDz2trFQZ6X/hPBUiAj i3dedo1YS2XFKbQPQXjSj8bYKPn+QqFcKm8Kn2vXd9WvNgTHl2Um4hU60 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="395355765" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="395355765" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:55 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="725658907" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="725658907" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:55 -0800 From: Isaku Yamahata To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Vishal Annapurve , Jim Mattson , Maxim Levitsky , Xiaoyao Li Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com Subject: [PATCH v3 3/4] KVM: X86: Add a capability to configure bus frequency for APIC timer Date: Tue, 19 Dec 2023 00:34:40 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Add KVM_CAP_X86_BUS_FREQUENCY_CONTROL capability to configure the core crystal clock (or processor's bus clock) for APIC timer emulation. Allow KVM_ENABLE_CAPABILITY(KVM_CAP_X86_BUS_FREQUENCY_CONTROL) to set the frequency. The TDX architecture hard-codes the APIC bus frequency to 25MHz. The TDX mandates it to be enumerated in CPUIID leaf 0x15 and doesn't allow the VMM to override its value. The KVM APIC timer emulation hard-codes the frequency to 1GHz. The KVM doesn't enumerate it to the guest unless the user space VMM sets the CPUID leaf 0x15 by KVM_SET_CPUID. If the CPUID leaf 0x15 is enumerated, the guest kernel uses it as the APIC bus frequency. If not, the guest kernel measures the frequency based on other known timers like the ACPI timer or the legacy PIT. The TDX guest kernel gets timer interrupt more times by (1GHz as the frequency KVM used) / (25MHz as TDX CPUID enumerates). To ensure that the guest doesn't have a conflicting view of the APIC bus frequency, allow the userspace to tell KVM to use the same frequency that TDX mandates instead of the default 1Ghz. There are several options to address this. 1. Make the KVM able to configure APIC bus frequency (This patch). Pro: It resembles the existing hardware. The recent Intel CPUs adapts 25MHz. Con: Require the VMM to emulate the APIC timer at 25MHz. 2. Make the TDX architecture enumerate CPUID 0x15 to configurable frequency or not enumerate it. Pro: Any APIC bus frequency is allowed. Con: Deviation from the real hardware. 3. Make the TDX guest kernel use 1GHz when it's running on KVM. Con: The kernel ignores CPUID leaf 0x15. 4. Change CPUID.15H under TDX to report the crystal clock frequency as 1 GHz. Pro: This has been the virtual APIC frequency for KVM guests for 13 years. Pro: This requires changing only one hard-coded constant in TDX. Con: It doesn't work with other VMMs as TDX isn't specific to KVM. This patch doesn't affect the TSC deadline timer emulation. The APIC timer emulation path calculates the TSC value from the TMICT register value and uses the TSC deadline timer path. This patch touches only the APIC timer-specific code. [1] https://lore.kernel.org/lkml/20231006011255.4163884-1-vannapurve@google.com/ Reported-by: Vishal Annapurve Signed-off-by: Isaku Yamahata Reviewed-by: Maxim Levitsky --- Changes v3: - Added reviewed-by Maxim Levitsky. - minor update of the commit message. Changes v2: - Add check if vcpu isn't created. - Add check if lapic chip is in-kernel emulation. - Fix build error for i386. - Add document to api.rst. - typo in the commit message. --- Documentation/virt/kvm/api.rst | 14 ++++++++++++++ arch/x86/kvm/x86.c | 33 +++++++++++++++++++++++++++++++++ include/uapi/linux/kvm.h | 1 + 3 files changed, 48 insertions(+) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 7025b3751027..cc976df2651e 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7858,6 +7858,20 @@ This capability is aimed to mitigate the threat that malicious VMs can cause CPU stuck (due to event windows don't open up) and make the CPU unavailable to host or other VMs. +7.34 KVM_CAP_X86_BUS_FREQUENCY_CONTROL +-------------------------------------- + +:Architectures: x86 +:Target: VM +:Parameters: args[0] is the value of apic bus clock frequency +:Returns: 0 on success, -EINVAL if args[0] contains invalid value for the + frequency, or -ENXIO if virtual local APIC isn't enabled by + KVM_CREATE_IRQCHIP, or -EBUSY if any vcpu is created. + +This capability sets the APIC bus clock frequency (or core crystal clock +frequency) for kvm to emulate APIC in the kernel. The default value is 1000000 +(1GHz). + 8. Other capabilities. ====================== diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index d7d865f7c847..97f81d612366 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4625,6 +4625,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ENABLE_CAP: case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: case KVM_CAP_IRQFD_RESAMPLE: + case KVM_CAP_X86_BUS_FREQUENCY_CONTROL: r = 1; break; case KVM_CAP_EXIT_HYPERCALL: @@ -6616,6 +6617,38 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, } mutex_unlock(&kvm->lock); break; + case KVM_CAP_X86_BUS_FREQUENCY_CONTROL: { + u64 bus_frequency = cap->args[0]; + u64 bus_cycle_ns; + + if (!bus_frequency) + return -EINVAL; + /* CPUID[0x15] only support 32bits. */ + if (bus_frequency != (u32)bus_frequency) + return -EINVAL; + + /* Cast to avoid 64bit division on 32bit platform. */ + bus_cycle_ns = 1000000000UL / (u32)bus_frequency; + if (!bus_cycle_ns) + return -EINVAL; + + r = 0; + mutex_lock(&kvm->lock); + /* + * Don't allow to change the frequency dynamically during vcpu + * running to avoid potentially bizarre behavior. + */ + if (kvm->created_vcpus) + r = -EBUSY; + /* This is for in-kernel vAPIC emulation. */ + else if (!irqchip_in_kernel(kvm)) + r = -ENXIO; + + if (!r) + kvm->arch.apic_bus_cycle_ns = bus_cycle_ns; + mutex_unlock(&kvm->lock); + return r; + } default: r = -EINVAL; break; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 211b86de35ac..d74a057df173 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1201,6 +1201,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228 #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229 #define KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES 230 +#define KVM_CAP_X86_BUS_FREQUENCY_CONTROL 231 #ifdef KVM_CAP_IRQ_ROUTING From patchwork Tue Dec 19 08:34:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 13498026 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 391AB12B9E; Tue, 19 Dec 2023 08:35:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UyvDST5G" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702974920; x=1734510920; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sp+9xcBrDRa7FSf2M1vCbvXXsVMU/OMOwmsjWVEj29A=; b=UyvDST5GVlpfGzfI0Y+eovq6zTLTAkdsnEZJvZ1V5sJMW/uqPglJoCm+ PnfciMp66NGFCw0z4c1X4faQ5kBr4oleMvTPdPQBNZnzcbFlk8Yg88mxU GgmtOYN4+nz5I7fzoV3G9u8S2tBoUp+zWs6Q8NpFuHDXFvs/VyT1f/ufi emfk0UAhs/NA6R6bKeI5OgN/52QOBNYf6BukWjxAvJjjD3EHE6nd+SUzk QQ1GJA+O/U4Em07gr1HvRGTPA2xJiGtPPTo8iqyuySzzUl0HJB8EhOaLQ rk4cu0+mosrZP8pyP0bjGaxrSqdkw225uReDkZmS1Y1zcEa2jFkWiAKrl Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="395355772" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="395355772" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10928"; a="725658911" X-IronPort-AV: E=Sophos;i="6.04,287,1695711600"; d="scan'208";a="725658911" Received: from ls.sc.intel.com (HELO localhost) ([172.25.112.31]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Dec 2023 00:34:56 -0800 From: Isaku Yamahata To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Vishal Annapurve , Jim Mattson , Maxim Levitsky , Xiaoyao Li Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com Subject: [PATCH v3 4/4] KVM: selftests: Add test case for x86 apic_bus_clock_frequency Date: Tue, 19 Dec 2023 00:34:41 -0800 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Test if the apic bus clock frequency is exptected to the configured value. Set APIC TMICT to the maximum value and busy wait for 100 msec (any value is okay) with tsc value, and read TMCCT. Calculate apic bus clock frequency based on TSC frequency. Signed-off-by: Isaku Yamahata Reviewed-by: Maxim Levitsky --- Changes v3: - Use 1.5GHz instead of 1GHz as frequency. Changes v2: - Newly added. --- tools/testing/selftests/kvm/Makefile | 1 + .../selftests/kvm/include/x86_64/apic.h | 7 + .../kvm/x86_64/apic_bus_clock_test.c | 135 ++++++++++++++++++ 3 files changed, 143 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 963435959a92..e07ec9c1dbd1 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -116,6 +116,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_invalid_nested_guest_state TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test TEST_GEN_PROGS_x86_64 += x86_64/vmx_nested_tsc_scaling_test +TEST_GEN_PROGS_x86_64 += x86_64/apic_bus_clock_test TEST_GEN_PROGS_x86_64 += x86_64/xapic_ipi_test TEST_GEN_PROGS_x86_64 += x86_64/xapic_state_test TEST_GEN_PROGS_x86_64 += x86_64/xcr0_cpuid_test diff --git a/tools/testing/selftests/kvm/include/x86_64/apic.h b/tools/testing/selftests/kvm/include/x86_64/apic.h index bed316fdecd5..866a58d5fa11 100644 --- a/tools/testing/selftests/kvm/include/x86_64/apic.h +++ b/tools/testing/selftests/kvm/include/x86_64/apic.h @@ -60,6 +60,13 @@ #define APIC_VECTOR_MASK 0x000FF #define APIC_ICR2 0x310 #define SET_APIC_DEST_FIELD(x) ((x) << 24) +#define APIC_LVT0 0x350 +#define APIC_LVT_TIMER_ONESHOT (0 << 17) +#define APIC_LVT_TIMER_PERIODIC (1 << 17) +#define APIC_LVT_TIMER_TSCDEADLINE (2 << 17) +#define APIC_TMICT 0x380 +#define APIC_TMCCT 0x390 +#define APIC_TDCR 0x3E0 void apic_disable(void); void xapic_enable(void); diff --git a/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c new file mode 100644 index 000000000000..e7896d703e7d --- /dev/null +++ b/tools/testing/selftests/kvm/x86_64/apic_bus_clock_test.c @@ -0,0 +1,135 @@ +// SPDX-License-Identifier: GPL-2.0-only +#define _GNU_SOURCE /* for program_invocation_short_name */ + +#include "apic.h" +#include "test_util.h" + +/* + * Pick one convenient value, 1.5Ghz. No special meaning and different from + * the default value, 1Ghz. + */ +#define TSC_HZ (1500 * 1000 * 1000ULL) + +/* Wait for 100 msec, not too long, not too short value. */ +#define LOOP_MSEC 100ULL +#define TSC_WAIT_DELTA (TSC_HZ / 1000 * LOOP_MSEC) + +/* Pick up typical value. Different enough from the default value, 1GHz. */ +#define APIC_BUS_CLOCK_FREQ (25 * 1000 * 1000ULL) + +static void guest_code(void) +{ + /* Possible tdcr values and its divide count. */ + struct { + u32 tdcr; + u32 divide_count; + } tdcrs[] = { + {0x0, 2}, + {0x1, 4}, + {0x2, 8}, + {0x3, 16}, + {0x8, 32}, + {0x9, 64}, + {0xa, 128}, + {0xb, 1}, + }; + + u32 tmict, tmcct; + u64 tsc0, tsc1; + int i; + + asm volatile("cli"); + + xapic_enable(); + + /* + * Setup one-shot timer. Because we don't fire the interrupt, the + * vector doesn't matter. + */ + xapic_write_reg(APIC_LVT0, APIC_LVT_TIMER_ONESHOT); + + for (i = 0; i < ARRAY_SIZE(tdcrs); i++) { + xapic_write_reg(APIC_TDCR, tdcrs[i].tdcr); + + /* Set the largest value to not trigger the interrupt. */ + tmict = ~0; + xapic_write_reg(APIC_TMICT, tmict); + + /* Busy wait for LOOP_MSEC */ + tsc0 = rdtsc(); + tsc1 = tsc0; + while (tsc1 - tsc0 < TSC_WAIT_DELTA) + tsc1 = rdtsc(); + + /* Read apic timer and tsc */ + tmcct = xapic_read_reg(APIC_TMCCT); + tsc1 = rdtsc(); + + /* Stop timer */ + xapic_write_reg(APIC_TMICT, 0); + + /* Report it. */ + GUEST_SYNC_ARGS(tdcrs[i].divide_count, tmict - tmcct, + tsc1 - tsc0, 0, 0); + } + + GUEST_DONE(); +} + +void test_apic_bus_clock(struct kvm_vcpu *vcpu) +{ + bool done = false; + struct ucall uc; + + while (!done) { + vcpu_run(vcpu); + TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO); + + switch (get_ucall(vcpu, &uc)) { + case UCALL_DONE: + done = true; + break; + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + break; + case UCALL_SYNC: { + u32 divide_counter = uc.args[1]; + u32 apic_cycles = uc.args[2]; + u64 tsc_cycles = uc.args[3]; + u64 freq; + + TEST_ASSERT(tsc_cycles > 0, + "tsc cycles must not be zero."); + + /* Allow 1% slack. */ + freq = apic_cycles * divide_counter * TSC_HZ / tsc_cycles; + TEST_ASSERT(freq < APIC_BUS_CLOCK_FREQ * 101 / 100, + "APIC bus clock frequency is too large"); + TEST_ASSERT(freq > APIC_BUS_CLOCK_FREQ * 99 / 100, + "APIC bus clock frequency is too small"); + break; + } + default: + TEST_FAIL("Unknown ucall %lu", uc.cmd); + break; + } + } +} + +int main(int argc, char *argv[]) +{ + struct kvm_vm *vm; + struct kvm_vcpu *vcpu; + + vm = __vm_create(VM_MODE_DEFAULT, 1, 0); + vm_ioctl(vm, KVM_SET_TSC_KHZ, (void *) (TSC_HZ / 1000)); + /* KVM_CAP_X86_BUS_FREQUENCY_CONTROL requires that no vcpu is created. */ + vm_enable_cap(vm, KVM_CAP_X86_BUS_FREQUENCY_CONTROL, + APIC_BUS_CLOCK_FREQ); + vcpu = vm_vcpu_add(vm, 0, guest_code); + + virt_pg_map(vm, APIC_DEFAULT_GPA, APIC_DEFAULT_GPA); + + test_apic_bus_clock(vcpu); + kvm_vm_free(vm); +}