From patchwork Sat Feb 3 09:11:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543951 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C6D515CDD3; Sat, 3 Feb 2024 08:59:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950793; cv=none; b=jB2FYW8QvJ1N970S63K/2qVzkMsdA3/3mCU/AssqXuULmQJYDOsQIC6ZTX6hr4P/+yaWU2jK7a2fEvyaUHKfzaGtZ5p49D1ZBpYjql1/hPsXHvkyXPIsJsDGI6VqngCMOwbLFcf3AwNR7wVINlw39TJVzkDnoNgyx01MCcU/GL4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950793; c=relaxed/simple; bh=O309/Vt9MktFf0PdG6Lw5tGr+fycilz2bwzopTCTR0Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=sh+l+RyY4JMFnWNvCkbH5znmfsdaEkEfhbq+CFyJSGOtHD3/FqZk6mVG2HDYtwTBroJgMfmDMjo2MmaGDUR2JB7lXkrRFbbzoixGEJdXTNE968Xwzrk2Y5d5iTuBi5bQfpl++X2HJd+1WgUnRdW+z7c6+tKw0+57XSUwPDf2kCE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=OqxgokOW; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="OqxgokOW" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950792; x=1738486792; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=O309/Vt9MktFf0PdG6Lw5tGr+fycilz2bwzopTCTR0Y=; b=OqxgokOWH3FKqLfnco6VJfiyyvGuBZbnJ6NynOs6fn4KIPjx1LE1FNB9 xgzQzA3Y4SBTDWyBiYuKjXUInV0AZl1T4T9UPXlB5L9+pImPGPGxW+6lQ PcBH8/v5fwYQWd1X4KV76Zg1vfkPLzA1YUitqhExME5Mzp2ZLDLmYJoPp JlWG3CppyELiyTjKUCOOaQFH7rADn3njTs6jBquyCE6TuwXOYfIOY09Tf 4aZMgtpPekVpAtxonDLwh1eiCSpAl9dZpoUYIs3pQytIVwGGQhIxGX4gf wE+X9ioOvTNCMH/tXmlAIJLI+Uo05ST0/nbYyFXfMlFpolNR4ZNi3Ffkn Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4131855" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4131855" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 00:59:52 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291146" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 00:59:45 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 01/26] thermal: Add bit definition for x86 thermal related MSRs
Date: Sat, 3 Feb 2024 17:11:49 +0800
Message-Id: <20240203091214.411862-2-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhao Liu

Add the definition of more bits of these MSRs:

* MSR_IA32_THERM_CONTROL
* MSR_IA32_THERM_INTERRUPT
* MSR_IA32_THERM_STATUS
* MSR_IA32_PACKAGE_THERM_STATUS
* MSR_IA32_PACKAGE_THERM_INTERRUPT

The virtualization of thermal events needs these extra definitions.

While here, regroup the definitions and use the BIT_ULL() and
GENMASK_ULL() macros to improve readability.
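For readers unfamiliar with the two macros, here is a minimal user-space sketch of how a single-bit flag and a multi-bit field defined this way are typically consumed. The macro bodies are simplified stand-ins (an assumption, not the kernel's exact definitions), and the helpers pkg_dig_readout() and pkg_hfi_updated() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel macros (assumption). */
#define BIT_ULL(n)		(1ULL << (n))
#define GENMASK_ULL(h, l)	(((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Two of the definitions added by this patch. */
#define PACKAGE_THERM_STATUS_DIG_READOUT_MASK	GENMASK_ULL(22, 16)
#define PACKAGE_THERM_STATUS_HFI_UPDATED	BIT_ULL(26)

/* Hypothetical helper: extract the digital readout field (bits 22:16). */
unsigned int pkg_dig_readout(uint64_t msr_val)
{
	return (unsigned int)((msr_val & PACKAGE_THERM_STATUS_DIG_READOUT_MASK) >> 16);
}

/* Hypothetical helper: test the "HFI updated" status bit (bit 26). */
int pkg_hfi_updated(uint64_t msr_val)
{
	return (msr_val & PACKAGE_THERM_STATUS_HFI_UPDATED) != 0;
}
```

With a made-up raw value whose bits 22:16 hold 0x2a and whose bit 26 is set, pkg_dig_readout() yields 42 and pkg_hfi_updated() yields 1. Using the _ULL variants keeps the masks 64 bits wide, which matches the width of an MSR value.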
Tested-by: Yanting Jiang
Signed-off-by: Zhao Liu
---
 arch/x86/include/asm/msr-index.h    | 54 +++++++++++++++++++----------
 drivers/thermal/intel/therm_throt.c |  1 -
 2 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 65b1bfb9c304..4f7ebfafa46a 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -829,17 +829,26 @@
 #define MSR_IA32_MPERF			0x000000e7
 #define MSR_IA32_APERF			0x000000e8
 
-#define MSR_IA32_THERM_CONTROL		0x0000019a
-#define MSR_IA32_THERM_INTERRUPT	0x0000019b
-
-#define THERM_INT_HIGH_ENABLE		(1 << 0)
-#define THERM_INT_LOW_ENABLE		(1 << 1)
-#define THERM_INT_PLN_ENABLE		(1 << 24)
-
-#define MSR_IA32_THERM_STATUS		0x0000019c
+#define MSR_IA32_THERM_CONTROL			0x0000019a
+#define THERM_ON_DEM_CLO_MOD_DUTY_CYC_MASK	GENMASK_ULL(3, 1)
+#define THERM_ON_DEM_CLO_MOD_ENABLE		BIT_ULL(4)
 
-#define THERM_STATUS_PROCHOT		(1 << 0)
-#define THERM_STATUS_POWER_LIMIT	(1 << 10)
+#define MSR_IA32_THERM_INTERRUPT		0x0000019b
+#define THERM_INT_HIGH_ENABLE			BIT_ULL(0)
+#define THERM_INT_LOW_ENABLE			BIT_ULL(1)
+#define THERM_INT_PROCHOT_ENABLE		BIT_ULL(2)
+#define THERM_INT_FORCEPR_ENABLE		BIT_ULL(3)
+#define THERM_INT_CRITICAL_TEM_ENABLE		BIT_ULL(4)
+#define THERM_INT_PLN_ENABLE			BIT_ULL(24)
+
+#define MSR_IA32_THERM_STATUS			0x0000019c
+#define THERM_STATUS_PROCHOT			BIT_ULL(0)
+#define THERM_STATUS_PROCHOT_LOG		BIT_ULL(1)
+#define THERM_STATUS_PROCHOT_FORCEPR_EVENT	BIT_ULL(2)
+#define THERM_STATUS_PROCHOT_FORCEPR_LOG	BIT_ULL(3)
+#define THERM_STATUS_CRITICAL_TEMP		BIT_ULL(4)
+#define THERM_STATUS_CRITICAL_TEMP_LOG		BIT_ULL(5)
+#define THERM_STATUS_POWER_LIMIT		BIT_ULL(10)
 
 #define MSR_THERM2_CTL			0x0000019d
 
@@ -861,17 +870,24 @@
 #define ENERGY_PERF_BIAS_POWERSAVE	15
 
 #define MSR_IA32_PACKAGE_THERM_STATUS	0x000001b1
-
-#define PACKAGE_THERM_STATUS_PROCHOT		(1 << 0)
-#define PACKAGE_THERM_STATUS_POWER_LIMIT	(1 << 10)
-#define PACKAGE_THERM_STATUS_HFI_UPDATED	(1 << 26)
+#define PACKAGE_THERM_STATUS_PROCHOT		BIT_ULL(0)
+#define PACKAGE_THERM_STATUS_PROCHOT_LOG	BIT_ULL(1)
+#define PACKAGE_THERM_STATUS_PROCHOT_EVENT	BIT_ULL(2)
+#define PACKAGE_THERM_STATUS_PROCHOT_EVENT_LOG	BIT_ULL(3)
+#define PACKAGE_THERM_STATUS_CRITICAL_TEMP	BIT_ULL(4)
+#define PACKAGE_THERM_STATUS_CRITICAL_TEMP_LOG	BIT_ULL(5)
+#define PACKAGE_THERM_STATUS_POWER_LIMIT	BIT_ULL(10)
+#define PACKAGE_THERM_STATUS_POWER_LIMIT_LOG	BIT_ULL(11)
+#define PACKAGE_THERM_STATUS_DIG_READOUT_MASK	GENMASK_ULL(22, 16)
+#define PACKAGE_THERM_STATUS_HFI_UPDATED	BIT_ULL(26)
 
 #define MSR_IA32_PACKAGE_THERM_INTERRUPT	0x000001b2
-
-#define PACKAGE_THERM_INT_HIGH_ENABLE		(1 << 0)
-#define PACKAGE_THERM_INT_LOW_ENABLE		(1 << 1)
-#define PACKAGE_THERM_INT_PLN_ENABLE		(1 << 24)
-#define PACKAGE_THERM_INT_HFI_ENABLE		(1 << 25)
+#define PACKAGE_THERM_INT_HIGH_ENABLE		BIT_ULL(0)
+#define PACKAGE_THERM_INT_LOW_ENABLE		BIT_ULL(1)
+#define PACKAGE_THERM_INT_PROCHOT_ENABLE	BIT_ULL(2)
+#define PACKAGE_THERM_INT_OVERHEAT_ENABLE	BIT_ULL(4)
+#define PACKAGE_THERM_INT_PLN_ENABLE		BIT_ULL(24)
+#define PACKAGE_THERM_INT_HFI_ENABLE		BIT_ULL(25)
 
 /* Thermal Thresholds Support */
 #define THERM_INT_THRESHOLD0_ENABLE    (1 << 15)
diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
index e69868e868eb..4c72fee32bf2 100644
--- a/drivers/thermal/intel/therm_throt.c
+++ b/drivers/thermal/intel/therm_throt.c
@@ -191,7 +191,6 @@ static const struct attribute_group thermal_attr_group = {
 #endif /* CONFIG_SYSFS */
 
 #define THERM_THROT_POLL_INTERVAL	HZ
-#define THERM_STATUS_PROCHOT_LOG	BIT(1)
 
 static u64 therm_intr_core_clear_mask;
 static u64 therm_intr_pkg_clear_mask;

From patchwork Sat Feb 3 09:11:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543952
From: Zhao Liu
To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 02/26] thermal: intel: hfi: Add helpers to build HFI/ITD structures
Date: Sat, 3 Feb 2024 17:11:50 +0800
Message-Id: <20240203091214.411862-3-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhuocheng Ding

Virtual machines need to compose their own HFI tables. Provide helper
functions that collect the relevant features and data from the host
machine.
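The table sizing that intel_hfi_build_virt_features() performs in this patch can be tried out in user space with the sketch below. It is only an illustration under stated assumptions: the macros are simplified stand-ins for the kernel helpers, 4 KiB pages are assumed, and struct virt_hfi_layout and compute_virt_hfi_layout() are hypothetical names that do not exist in the kernel.

```c
#include <stddef.h>

/* Simplified stand-ins for kernel helpers; 4 KiB pages are an assumption. */
#define PAGE_SHIFT		12
#define PAGE_SIZE		(1UL << PAGE_SHIFT)
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define PAGE_ALIGN(x)		(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Hypothetical, trimmed-down mirror of the sizing fields in struct hfi_features. */
struct virt_hfi_layout {
	unsigned int hdr_size;		/* bytes, rounded up to a multiple of 8 */
	unsigned int cpu_stride;	/* bytes per table row, multiple of 8 */
	size_t nr_table_pages;		/* pages needed for header plus rows */
};

/* Same arithmetic as the patch: header and row are both 8-byte aligned. */
struct virt_hfi_layout compute_virt_hfi_layout(unsigned int class_stride,
					       unsigned int nr_classes,
					       unsigned int nr_entries)
{
	struct virt_hfi_layout l;
	size_t data_size;

	l.hdr_size = DIV_ROUND_UP(class_stride * nr_classes, 8) * 8;
	l.cpu_stride = DIV_ROUND_UP(class_stride * nr_classes, 8) * 8;
	data_size = l.hdr_size + (size_t)nr_entries * l.cpu_stride;
	l.nr_table_pages = PAGE_ALIGN(data_size) >> PAGE_SHIFT;
	return l;
}
```

Assuming a class stride of 2 bytes (one byte each for the performance and energy-efficiency capabilities), both a 1-class (HFI only) and a 4-class (HFI plus ITD) guest end up with an 8-byte header and 8-byte rows, because of the rounding to a multiple of 8. The number of pages then depends only on the number of entries.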
Tested-by: Yanting Jiang
Signed-off-by: Zhuocheng Ding
Co-developed-by: Zhao Liu
Signed-off-by: Zhao Liu
---
 arch/x86/include/asm/hfi.h        |  20 ++++
 drivers/thermal/intel/intel_hfi.c | 149 ++++++++++++++++++++++++++++++
 2 files changed, 169 insertions(+)

diff --git a/arch/x86/include/asm/hfi.h b/arch/x86/include/asm/hfi.h
index b7fda3e0e8c8..e0fe5b30fb53 100644
--- a/arch/x86/include/asm/hfi.h
+++ b/arch/x86/include/asm/hfi.h
@@ -82,4 +82,24 @@ struct hfi_features {
 	unsigned int	hdr_size;
 };
 
+#if defined(CONFIG_INTEL_HFI_THERMAL)
+int intel_hfi_max_instances(void);
+int intel_hfi_build_virt_features(struct hfi_features *features, unsigned int nr_classes,
+				  unsigned int nr_entries);
+int intel_hfi_build_virt_table(struct hfi_table *table, struct hfi_features *features,
+			       unsigned int nr_classes, unsigned int hfi_index,
+			       unsigned int cpu);
+static inline bool intel_hfi_enabled(void) { return intel_hfi_max_instances() > 0; }
+#else
+static inline int intel_hfi_max_instances(void) { return 0; }
+static inline int intel_hfi_build_virt_features(struct hfi_features *features,
+						unsigned int nr_classes,
+						unsigned int nr_entries) { return 0; }
+static inline int intel_hfi_build_virt_table(struct hfi_table *table,
+					     struct hfi_features *features,
+					     unsigned int nr_classes, unsigned int hfi_index,
+					     unsigned int cpu) { return 0; }
+static inline bool intel_hfi_enabled(void) { return false; }
+#endif
+
 #endif /* _ASM_X86_HFI_H */
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index b69fa234b317..139ce2d4b26b 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -642,3 +643,151 @@ void __init intel_hfi_init(void)
 	kfree(hfi_instances);
 	hfi_instances = NULL;
 }
+
+/**
+ * intel_hfi_max_instances() - Get the maximum number of hfi instances.
+ *
+ * Return: the maximum number of hfi instances.
+ */
+int intel_hfi_max_instances(void)
+{
+	return max_hfi_instances;
+}
+EXPORT_SYMBOL_GPL(intel_hfi_max_instances);
+
+/**
+ * intel_hfi_build_virt_features() - Build a virtual hfi_features structure.
+ *
+ * @features:	Feature structure to be filled
+ * @nr_classes:	Maximum number of classes supported. 1 class indicates
+ *		that only the HFI feature is configured and 4 classes
+ *		indicate both the HFI and ITD features.
+ * @nr_entries:	Number of HFI entries in the HFI table.
+ *
+ * Fill a virtual hfi_features structure which is used for HFI/ITD virtualization.
+ * HFI and ITD have different feature information, and the virtual feature
+ * structure is built according to the number of classes configured in the
+ * Guest CPUID.
+ *
+ * Return: -EINVAL if a parameter is invalid, otherwise 0.
+ */
+int intel_hfi_build_virt_features(struct hfi_features *features,
+				  unsigned int nr_classes,
+				  unsigned int nr_entries)
+{
+	unsigned int data_size;
+
+	if (!features || !nr_classes || !nr_entries)
+		return -EINVAL;
+
+	/*
+	 * The virtual feature must be based on the Host's feature; when the Host
+	 * enables both HFI and ITD, the Guest is allowed to create only the
+	 * HFI feature structure, which has fewer classes than ITD.
+	 */
+	if (nr_classes > hfi_features.nr_classes)
+		return -EINVAL;
+
+	features->nr_classes = nr_classes;
+	features->class_stride = hfi_features.class_stride;
+	/*
+	 * For the meaning of these two calculations, please refer to the comments
+	 * in hfi_parse_features().
+	 */
+	features->hdr_size = DIV_ROUND_UP(features->class_stride *
+					  features->nr_classes, 8) * 8;
+	features->cpu_stride = DIV_ROUND_UP(features->class_stride *
+					    features->nr_classes, 8) * 8;
+
+	data_size = features->hdr_size + nr_entries * features->cpu_stride;
+	features->nr_table_pages = PAGE_ALIGN(data_size) >> PAGE_SHIFT;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(intel_hfi_build_virt_features);
+
+/**
+ * intel_hfi_build_virt_table() - Fill the data of @hfi_index in virtual HFI table.
+ *
+ * @table:	HFI table to be filled
+ * @features:	Configured feature information of the HFI table
+ * @nr_classes:	Number of classes to be updated for @table. This field is
+ *		based on the enabled feature, which may differ from
+ *		the feature information configured in @features.
+ * @hfi_index:	Index of the HFI data in HFI table to be filled
+ * @cpu:	CPU whose real HFI data is used to fill the @hfi_index
+ *
+ * Fill the row data of @hfi_index in a virtual HFI table which is used for HFI/ITD
+ * virtualization. The size of the virtual HFI table is decided by the configured
+ * feature information in @features, and the filled HFI data range is decided by
+ * the specified number of classes @nr_classes.
+ *
+ * A virtual machine may disable ITD at runtime through MSR_IA32_HW_FEEDBACK_CONFIG;
+ * in this case, only the data of 1 class (class 0) can be dynamically updated in
+ * the virtual HFI table.
+ *
+ * Return: 1 if the @table is changed, 0 if the @table isn't changed, and
+ * -EINVAL/-ENOMEM on a parameter error.
+ */
+int intel_hfi_build_virt_table(struct hfi_table *table,
+			       struct hfi_features *features,
+			       unsigned int nr_classes,
+			       unsigned int hfi_index,
+			       unsigned int cpu)
+{
+	struct hfi_instance *hfi_instance;
+	struct hfi_hdr *hfi_hdr = table->hdr;
+	s16 host_hfi_index;
+	void *src_ptr, *dst_ptr;
+	int table_changed = 0;
+
+	if (!table || !features || !nr_classes)
+		return -EINVAL;
+
+	if (nr_classes > features->nr_classes ||
+	    nr_classes > hfi_features.nr_classes)
+		return -EINVAL;
+
+	/*
+	 * Make sure that the row that will be filled doesn't cause overflow.
+	 * features->nr_classes indicates the maximum number of possible
+	 * classes.
+	 */
+	if (features->hdr_size + (hfi_index + 1) * features->cpu_stride >
+	    features->nr_table_pages << PAGE_SHIFT)
+		return -ENOMEM;
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	if (features->class_stride != hfi_features.class_stride)
+		return -EINVAL;
+
+	hfi_instance = per_cpu(hfi_cpu_info, cpu).hfi_instance;
+	host_hfi_index = per_cpu(hfi_cpu_info, cpu).index;
+
+	src_ptr = hfi_instance->local_table.data +
+		  host_hfi_index * hfi_features.cpu_stride;
+	dst_ptr = table->data + hfi_index * features->cpu_stride;
+
+	raw_spin_lock_irq(&hfi_instance->table_lock);
+	for (int i = 0; i < nr_classes; i++) {
+		struct hfi_cpu_data *src = src_ptr + i * hfi_features.class_stride;
+		struct hfi_cpu_data *dst = dst_ptr + i * features->class_stride;
+
+		if (dst->perf_cap != src->perf_cap) {
+			dst->perf_cap = src->perf_cap;
+			hfi_hdr->perf_updated = 1;
+		}
+		if (dst->ee_cap != src->ee_cap) {
+			dst->ee_cap = src->ee_cap;
+			hfi_hdr->ee_updated = 1;
+		}
+		if (hfi_hdr->perf_updated || hfi_hdr->ee_updated)
+			table_changed = 1;
+		hfi_hdr++;
+	}
+	raw_spin_unlock_irq(&hfi_instance->table_lock);
+
+	return table_changed;
+}
+EXPORT_SYMBOL_GPL(intel_hfi_build_virt_table);

From patchwork Sat Feb 3 09:11:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543953
From: Zhao Liu
To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 03/26] thermal: intel: hfi: Add HFI notifier helpers to notify HFI update
Date: Sat, 3 Feb 2024 17:11:51 +0800
Message-Id: <20240203091214.411862-4-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhuocheng Ding

KVM builds virtual HFI tables for virtual machines, so it also needs to
pick up the Host's HFI table updates in time.
Add a notifier chain to the HFI instance to notify other modules about
HFI table updates, and provide two helpers to register/unregister a
notifier hook in the HFI driver.

Tested-by: Yanting Jiang
Signed-off-by: Zhuocheng Ding
Co-developed-by: Zhao Liu
Signed-off-by: Zhao Liu
---
 arch/x86/include/asm/hfi.h        |  8 ++++
 drivers/thermal/intel/intel_hfi.c | 63 ++++++++++++++++++++++++++++---
 2 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/hfi.h b/arch/x86/include/asm/hfi.h
index e0fe5b30fb53..19e3e5a7fb77 100644
--- a/arch/x86/include/asm/hfi.h
+++ b/arch/x86/include/asm/hfi.h
@@ -90,6 +90,10 @@ int intel_hfi_build_virt_table(struct hfi_table *table, struct hfi_features *fea
 			       unsigned int nr_classes, unsigned int hfi_index,
 			       unsigned int cpu);
 static inline bool intel_hfi_enabled(void) { return intel_hfi_max_instances() > 0; }
+int intel_hfi_notifier_register(struct notifier_block *notifier,
+				unsigned int cpu);
+int intel_hfi_notifier_unregister(struct notifier_block *notifier,
+				  unsigned int cpu);
 #else
 static inline int intel_hfi_max_instances(void) { return 0; }
 static inline int intel_hfi_build_virt_features(struct hfi_features *features,
@@ -100,6 +104,10 @@ static inline int intel_hfi_build_virt_table(struct hfi_table *table,
 					     unsigned int nr_classes, unsigned int hfi_index,
 					     unsigned int cpu) { return 0; }
 static inline bool intel_hfi_enabled(void) { return false; }
+static inline int intel_hfi_notifier_register(struct notifier_block *notifier,
+					      unsigned int cpu) { return -ENODEV; }
+static inline int intel_hfi_notifier_unregister(struct notifier_block *notifier,
+						unsigned int cpu) { return -ENODEV; }
 #endif
 
 #endif /* _ASM_X86_HFI_H */
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 139ce2d4b26b..330b264ca23d 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -72,18 +72,20 @@ struct hfi_cpu_data {
  * @cpus:		CPUs represented in this HFI table instance
  * @hw_table:		Pointer to the HFI table of this instance
  * @update_work:	Delayed work to process HFI updates
+ * @notifier_chain:	Notification chain dedicated to this instance
  * @table_lock:		Lock to protect acceses to the table of this instance
  * @event_lock:		Lock to process HFI interrupts
  *
  * A set of parameters to parse and navigate a specific HFI table.
  */
 struct hfi_instance {
-	struct hfi_table	local_table;
-	cpumask_var_t		cpus;
-	void			*hw_table;
-	struct delayed_work	update_work;
-	raw_spinlock_t		table_lock;
-	raw_spinlock_t		event_lock;
+	struct hfi_table		local_table;
+	cpumask_var_t			cpus;
+	void				*hw_table;
+	struct delayed_work		update_work;
+	struct raw_notifier_head	notifier_chain;
+	raw_spinlock_t			table_lock;
+	raw_spinlock_t			event_lock;
 };
 
 /**
@@ -189,6 +191,7 @@ static void hfi_update_work_fn(struct work_struct *work)
 				    update_work);
 
 	update_capabilities(hfi_instance);
+	raw_notifier_call_chain(&hfi_instance->notifier_chain, 0, NULL);
 }
 
 void intel_hfi_process_event(__u64 pkg_therm_status_msr_val)
@@ -448,6 +451,7 @@ void intel_hfi_online(unsigned int cpu)
 	init_hfi_instance(hfi_instance);
 
 	INIT_DELAYED_WORK(&hfi_instance->update_work, hfi_update_work_fn);
+	RAW_INIT_NOTIFIER_HEAD(&hfi_instance->notifier_chain);
 	raw_spin_lock_init(&hfi_instance->table_lock);
 	raw_spin_lock_init(&hfi_instance->event_lock);
 
@@ -791,3 +795,50 @@ int intel_hfi_build_virt_table(struct hfi_table *table,
 	return table_changed;
 }
 EXPORT_SYMBOL_GPL(intel_hfi_build_virt_table);
+
+/**
+ * intel_hfi_notifier_register() - Register @notifier hook at @hfi_instance.
+ *
+ * @notifier:	HFI notifier hook to be registered
+ * @cpu:	CPU whose HFI instance the notifier is registered at
+ *
+ * When the HFI instance of @cpu receives an HFI interrupt and updates its local
+ * HFI table, the registered HFI notifier will be called.
+ *
+ * Return: 0 if successful, otherwise error.
+ */
+int intel_hfi_notifier_register(struct notifier_block *notifier,
+				unsigned int cpu)
+{
+	struct hfi_instance *hfi_instance;
+
+	if (!notifier || cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	hfi_instance = per_cpu(hfi_cpu_info, cpu).hfi_instance;
+	return raw_notifier_chain_register(&hfi_instance->notifier_chain,
+					   notifier);
+}
+EXPORT_SYMBOL_GPL(intel_hfi_notifier_register);
+
+/**
+ * intel_hfi_notifier_unregister() - Unregister @notifier hook at @hfi_instance
+ *
+ * @notifier:	HFI notifier hook to be unregistered
+ * @cpu:	CPU whose HFI instance the notifier is unregistered from
+ *
+ * Return: 0 if successful, otherwise error.
+ */
+int intel_hfi_notifier_unregister(struct notifier_block *notifier,
+				  unsigned int cpu)
+{
+	struct hfi_instance *hfi_instance;
+
+	if (!notifier || cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	hfi_instance = per_cpu(hfi_cpu_info, cpu).hfi_instance;
+	return raw_notifier_chain_unregister(&hfi_instance->notifier_chain,
+					     notifier);
+}
+EXPORT_SYMBOL_GPL(intel_hfi_notifier_unregister);

From patchwork Sat Feb 3 09:11:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543954
From: Zhao Liu
To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 04/26] KVM: Add kvm_arch_sched_out() hook
Date: Sat, 3 Feb 2024 17:11:52 +0800
Message-Id: <20240203091214.411862-5-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhao Liu

x86 needs to reset the classification history when a vCPU is scheduled
out.

Add the kvm_arch_sched_out() hook to allow x86 to implement its own
history reset logic at sched_out.
Tested-by: Yanting Jiang Signed-off-by: Zhao Liu --- arch/arm64/include/asm/kvm_host.h | 1 + arch/mips/include/asm/kvm_host.h | 1 + arch/powerpc/include/asm/kvm_host.h | 1 + arch/riscv/include/asm/kvm_host.h | 1 + arch/s390/include/asm/kvm_host.h | 1 + arch/x86/include/asm/kvm_host.h | 2 ++ include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 1 + 8 files changed, 9 insertions(+) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 21c57b812569..a7898fceb761 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -1127,6 +1127,7 @@ static inline bool kvm_system_needs_idmapped_vectors(void) static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} void kvm_arm_init_debug(void); void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu); diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index 179f320cc231..2bcd462db11a 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -891,6 +891,7 @@ static inline void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) {} static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 8abac532146e..96bcf62439b2 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -898,6 +898,7 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} static inline void 
kvm_arch_flush_shadow_all(struct kvm *kvm) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {} diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h index 484d04a92fa6..a395a366f034 100644 --- a/arch/riscv/include/asm/kvm_host.h +++ b/arch/riscv/include/asm/kvm_host.h @@ -273,6 +273,7 @@ struct kvm_vcpu_arch { static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} #define KVM_RISCV_GSTAGE_TLB_MIN_ORDER 12 diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h index 52664105a473..6e03188d11b0 100644 --- a/arch/s390/include/asm/kvm_host.h +++ b/arch/s390/include/asm/kvm_host.h @@ -1045,6 +1045,7 @@ extern int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc); static inline void kvm_arch_sync_events(struct kvm *kvm) {} static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {} +static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} static inline void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) {} static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {} diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b5b2d0fde579..2be78549bec8 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2280,6 +2280,8 @@ static inline int kvm_cpu_get_apicid(int mps_cpu) int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages); +static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} + #define KVM_CLOCK_VALID_FLAGS \ (KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC) diff --git a/include/linux/kvm_host.h 
b/include/linux/kvm_host.h index 7e7fd25b09b3..3aabd3813de0 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1478,6 +1478,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu, int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu); void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu); +void kvm_arch_sched_out(struct kvm_vcpu *vcpu); void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu); void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 10bfc88a69f7..671f88dff006 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -6317,6 +6317,7 @@ static void kvm_sched_out(struct preempt_notifier *pn, WRITE_ONCE(vcpu->ready, true); } kvm_arch_vcpu_put(vcpu); + kvm_arch_sched_out(vcpu); __this_cpu_write(kvm_running_vcpu, NULL); }

From patchwork Sat Feb 3 09:11:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543955
From: Zhao Liu
To: Paolo Bonzini, Sean Christopherson, "Rafael J . Wysocki", Daniel Lezcano, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H . Peter Anvin", kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri, Len Brown, Zhang Rui, Zhenyu Wang, Zhuocheng Ding, Dapeng Mi, Yanting Jiang, Yongwei Ma, Vineeth Pillai, Suleiman Souhlal, Masami Hiramatsu, David Dai, Saravana Kannan, Zhao Liu
Subject: [RFC 05/26] KVM: x86: Reset hardware history at vCPU's sched_in/out
Date: Sat, 3 Feb 2024 17:11:53 +0800
Message-Id: <20240203091214.411862-6-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

From: Zhao Liu

Reset the classification history of the vCPU thread when it is scheduled in and when it is scheduled out, so that hardware starts classifying the vCPU thread from scratch. This helps prevent Host history from leaking to VMs and VM history from leaking to sibling VMs.
Tested-by: Yanting Jiang Signed-off-by: Zhao Liu --- arch/x86/include/asm/kvm_host.h | 2 -- arch/x86/kvm/x86.c | 8 ++++++++ 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2be78549bec8..b5b2d0fde579 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2280,8 +2280,6 @@ static inline int kvm_cpu_get_apicid(int mps_cpu) int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages); -static inline void kvm_arch_sched_out(struct kvm_vcpu *vcpu) {} - #define KVM_CLOCK_VALID_FLAGS \ (KVM_CLOCK_TSC_STABLE | KVM_CLOCK_REALTIME | KVM_CLOCK_HOST_TSC) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 363b1c080205..cd9a7251c768 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -79,6 +79,7 @@ #include #include #include +#include #include #include #include @@ -12491,9 +12492,16 @@ void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) pmu->need_cleanup = true; kvm_make_request(KVM_REQ_PMU, vcpu); } + + reset_hardware_history(); static_call(kvm_x86_sched_in)(vcpu, cpu); } +void kvm_arch_sched_out(struct kvm_vcpu *vcpu) +{ + reset_hardware_history(); +} + void kvm_arch_free_vm(struct kvm *kvm) { #if IS_ENABLED(CONFIG_HYPERV)

From patchwork Sat Feb 3 09:11:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543956
From: Zhao Liu
To: Paolo Bonzini, Sean Christopherson, "Rafael J . Wysocki", Daniel Lezcano, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H . Peter Anvin", kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri, Len Brown, Zhang Rui, Zhenyu Wang, Zhuocheng Ding, Dapeng Mi, Yanting Jiang, Yongwei Ma, Vineeth Pillai, Suleiman Souhlal, Masami Hiramatsu, David Dai, Saravana Kannan, Zhao Liu
Subject: [RFC 06/26] KVM: VMX: Add helpers to handle the writes to MSR's R/O and R/WC0 bits
Date: Sat, 3 Feb 2024 17:11:54 +0800
Message-Id: <20240203091214.411862-7-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

From: Zhao Liu

For WRMSR emulation, any write to an R/O bit and any nonzero write to an R/WC0 bit must be ignored. Provide two helpers to emulate this R/O and R/WC0 write behavior.

Tested-by: Yanting Jiang Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index e262bc2ba4e5..8f5981635fe5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2147,6 +2147,20 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated return debugctl; } +/* Ignore writes to R/O bits.
*/ +static inline u64 vmx_set_msr_ro_bits(u64 new_val, u64 old_val, u64 ro_mask) +{ + return (new_val & ~ro_mask) | (old_val & ro_mask); +} + +/* Ignore non-zero writes to R/WC0 bits. */ +static inline u64 vmx_set_msr_rwc0_bits(u64 new_val, u64 old_val, u64 rwc0_mask) +{ + u64 new_rwc0 = new_val & rwc0_mask, old_rwc0 = old_val & rwc0_mask; + + return ((new_rwc0 | ~old_rwc0) & old_rwc0) | (new_val & ~rwc0_mask); +} + /* * Writes msr value into the appropriate "register". * Returns 0 on success, non-0 otherwise.

From patchwork Sat Feb 3 09:11:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543957
From: Zhao Liu
To: Paolo Bonzini, Sean Christopherson, "Rafael J . Wysocki", Daniel Lezcano, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H . Peter Anvin", kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri, Len Brown, Zhang Rui, Zhenyu Wang, Zhuocheng Ding, Dapeng Mi, Yanting Jiang, Yongwei Ma, Vineeth Pillai, Suleiman Souhlal, Masami Hiramatsu, David Dai, Saravana Kannan, Zhao Liu
Subject: [RFC 07/26] KVM: VMX: Emulate ACPI (CPUID.0x01.edx[bit 22]) feature
Date: Sat, 3 Feb 2024 17:11:55 +0800
Message-Id: <20240203091214.411862-8-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org

From: Zhuocheng Ding

The ACPI (Thermal Monitor and Software Controlled Clock Facilities) feature is a dependency of thermal interrupt processing, so it is required for handling the HFI notification (a thermal interrupt).

To let a VM handle thermal interrupts, emulate the ACPI feature in KVM:

1. Emulate MSR_IA32_THERM_CONTROL (alias IA32_CLOCK_MODULATION), MSR_IA32_THERM_INTERRUPT and MSR_IA32_THERM_STATUS with dummy values. According to SDM [1], the ACPI feature means: "The ACPI flag (bit 22) of the CPUID feature flags indicates the presence of the IA32_THERM_STATUS, IA32_THERM_INTERRUPT, IA32_CLOCK_MODULATION MSRs, and the xAPIC thermal LVT entry." Dummy values in KVM are enough to emulate RDMSR/WRMSR on these MSRs.

2. Add the thermal interrupt injection interface. This interface is added for the completeness of the ACPI emulation. Although thermal interrupts are not actually injected into the Guest yet, the later HFI/ITD emulation patches inject a thermal interrupt into the Guest once the conditions are met.

3. Additionally, expose the CPUID bit of the ACPI feature to the VM, which helps enable thermal interrupt handling in the VM.

[1]: SDM, vol.
3B, section 15.8.4.1, Detection of Software Controlled Clock Modulation Extension. Tested-by: Yanting Jiang Signed-off-by: Zhuocheng Ding Co-developed-by: Zhao Liu Signed-off-by: Zhao Liu --- arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/irq.h | 1 + arch/x86/kvm/lapic.c | 9 ++++ arch/x86/kvm/svm/svm.c | 3 ++ arch/x86/kvm/vmx/vmx.c | 94 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 3 ++ arch/x86/kvm/x86.c | 3 ++ 7 files changed, 114 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index adba49afb5fe..1ad547651022 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -623,7 +623,7 @@ void kvm_set_cpu_caps(void) F(CX8) | F(APIC) | 0 /* Reserved */ | F(SEP) | F(MTRR) | F(PGE) | F(MCA) | F(CMOV) | F(PAT) | F(PSE36) | 0 /* PSN */ | F(CLFLUSH) | - 0 /* Reserved, DS, ACPI */ | F(MMX) | + 0 /* Reserved, DS */ | F(ACPI) | F(MMX) | F(FXSR) | F(XMM) | F(XMM2) | F(SELFSNOOP) | 0 /* HTT, TM, Reserved, PBE */ ); diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h index c2d7cfe82d00..e11c1fb6e1e6 100644 --- a/arch/x86/kvm/irq.h +++ b/arch/x86/kvm/irq.h @@ -99,6 +99,7 @@ static inline int irqchip_in_kernel(struct kvm *kvm) void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu); void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu); void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu); +void kvm_apic_therm_deliver(struct kvm_vcpu *vcpu); void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu); void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu); void __kvm_migrate_timers(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 3242f3da2457..af8572798976 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2783,6 +2783,15 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu) kvm_apic_local_deliver(apic, APIC_LVT0); } +void kvm_apic_therm_deliver(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu->arch.apic; + + if (apic) + kvm_apic_local_deliver(apic, APIC_LVTTHMR); +} 
+EXPORT_SYMBOL_GPL(kvm_apic_therm_deliver); + static const struct kvm_io_device_ops apic_mmio_ops = { .read = apic_mmio_read, .write = apic_mmio_write, diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e90b429c84f1..2e22d5e86768 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4288,6 +4288,9 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index) switch (index) { case MSR_IA32_MCG_EXT_CTL: case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR: + case MSR_IA32_THERM_CONTROL: + case MSR_IA32_THERM_INTERRUPT: + case MSR_IA32_THERM_STATUS: return false; case MSR_IA32_SMBASE: if (!IS_ENABLED(CONFIG_KVM_SMM)) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 8f5981635fe5..aa37b55cf045 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -157,6 +157,32 @@ module_param(allow_smaller_maxphyaddr, bool, S_IRUGO); RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED | \ RTIT_STATUS_BYTECNT)) +/* + * TM2 (CPUID.01H:ECX[8]), DTHERM (CPUID.06H:EAX[0]), PLN (CPUID.06H:EAX[4]), + * and HWP (CPUID.06H:EAX[7]) are not emulated in kvm. + */ +#define MSR_IA32_THERM_STATUS_RO_MASK (THERM_STATUS_PROCHOT | \ + THERM_STATUS_PROCHOT_FORCEPR_EVENT | THERM_STATUS_CRITICAL_TEMP) +#define MSR_IA32_THERM_STATUS_RWC0_MASK (THERM_STATUS_PROCHOT_LOG | \ + THERM_STATUS_PROCHOT_FORCEPR_LOG | THERM_STATUS_CRITICAL_TEMP_LOG) +/* MSR_IA32_THERM_STATUS unavailable bits mask: unsupported and reserved bits. */ +#define MSR_IA32_THERM_STATUS_UNAVAIL_MASK (~(MSR_IA32_THERM_STATUS_RO_MASK | \ + MSR_IA32_THERM_STATUS_RWC0_MASK)) + +/* ECMD (CPUID.06H:EAX[5]) is not emulated in kvm. */ +#define MSR_IA32_THERM_CONTROL_AVAIL_MASK (THERM_ON_DEM_CLO_MOD_ENABLE | \ + THERM_ON_DEM_CLO_MOD_DUTY_CYC_MASK) + +/* + * MSR_IA32_THERM_INTERRUPT available bits mask. + * PLN (CPUID.06H:EAX[4]) and HFN (CPUID.06H:EAX[24]) are not emulated in kvm. 
+ */ +#define MSR_IA32_THERM_INTERRUPT_AVAIL_MASK (THERM_INT_HIGH_ENABLE | \ + THERM_INT_LOW_ENABLE | THERM_INT_PROCHOT_ENABLE | \ + THERM_INT_FORCEPR_ENABLE | THERM_INT_CRITICAL_TEM_ENABLE | \ + THERM_MASK_THRESHOLD0 | THERM_INT_THRESHOLD0_ENABLE | \ + THERM_MASK_THRESHOLD1 | THERM_INT_THRESHOLD1_ENABLE) + /* * List of MSRs that can be directly passed to the guest. * In addition to these x2apic and PT MSRs are handled specially. @@ -1470,6 +1496,19 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu, } } +static void vmx_inject_therm_interrupt(struct kvm_vcpu *vcpu) +{ + /* + * From SDM, the ACPI flag also indicates the presence of the + * xAPIC thermal LVT entry. + */ + if (!guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return; + + if (irqchip_in_kernel(vcpu->kvm)) + kvm_apic_therm_deliver(vcpu); +} + /* * Switches to specified vcpu, until a matching vcpu_put(), but assumes * vcpu mutex is already taken. @@ -2109,6 +2148,24 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_IA32_DEBUGCTLMSR: msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); break; + case MSR_IA32_THERM_CONTROL: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return 1; + msr_info->data = vmx->msr_ia32_therm_control; + break; + case MSR_IA32_THERM_INTERRUPT: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return 1; + msr_info->data = vmx->msr_ia32_therm_interrupt; + break; + case MSR_IA32_THERM_STATUS: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return 1; + msr_info->data = vmx->msr_ia32_therm_status; + break; default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_info->index); @@ -2452,6 +2509,40 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } ret = kvm_set_msr_common(vcpu, msr_info); break; + case MSR_IA32_THERM_CONTROL: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return 1; + if 
(!msr_info->host_initiated && + data & ~MSR_IA32_THERM_CONTROL_AVAIL_MASK) + return 1; + vmx->msr_ia32_therm_control = data; + break; + case MSR_IA32_THERM_INTERRUPT: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return 1; + if (!msr_info->host_initiated && + data & ~MSR_IA32_THERM_INTERRUPT_AVAIL_MASK) + return 1; + vmx->msr_ia32_therm_interrupt = data; + break; + case MSR_IA32_THERM_STATUS: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ACPI)) + return 1; + /* Unsupported and reserved bits: generate the exception. */ + if (!msr_info->host_initiated && + data & MSR_IA32_THERM_STATUS_UNAVAIL_MASK) + return 1; + if (!msr_info->host_initiated) { + data = vmx_set_msr_rwc0_bits(data, vmx->msr_ia32_therm_status, + MSR_IA32_THERM_STATUS_RWC0_MASK); + data = vmx_set_msr_ro_bits(data, vmx->msr_ia32_therm_status, + MSR_IA32_THERM_STATUS_RO_MASK); + } + vmx->msr_ia32_therm_status = data; + break; default: find_uret_msr: @@ -4870,6 +4961,9 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx->spec_ctrl = 0; vmx->msr_ia32_umwait_control = 0; + vmx->msr_ia32_therm_control = 0; + vmx->msr_ia32_therm_interrupt = 0; + vmx->msr_ia32_therm_status = 0; vmx->hv_deadline_tsc = -1; kvm_set_cr8(vcpu, 0); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index e3b0985bb74a..e159dd5b7a66 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -282,6 +282,9 @@ struct vcpu_vmx { u64 spec_ctrl; u32 msr_ia32_umwait_control; + u64 msr_ia32_therm_control; + u64 msr_ia32_therm_interrupt; + u64 msr_ia32_therm_status; /* * loaded_vmcs points to the VMCS currently used in this vcpu. 
For a diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cd9a7251c768..50aceb0ce4ee 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1545,6 +1545,9 @@ static const u32 emulated_msrs_all[] = { MSR_AMD64_TSC_RATIO, MSR_IA32_POWER_CTL, MSR_IA32_UCODE_REV, + MSR_IA32_THERM_CONTROL, + MSR_IA32_THERM_INTERRUPT, + MSR_IA32_THERM_STATUS, /* * KVM always supports the "true" VMX control MSRs, even if the host From patchwork Sat Feb 3 09:11:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543958 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BC77E5C91F; Sat, 3 Feb 2024 09:00:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950834; cv=none; b=fDo+xgd9UgXGDhToWeWck2fDvRAuxVRxiiSxVRL+xFioPP2oj4ETaX+lvHR5fT+2YKWDoG070ZoqIfgQEJDwdUXALZyLgqYvVIgX3ic44v+LYD+YH/9f1iSOIQIsdnpJyZb8kPg+EriphqEMb1c2MNIrR19ps+Xe4B9T2v+jD8c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950834; c=relaxed/simple; bh=dsIPqJPqsn+oitClY92kZLn0nh7sst/LKOjgIgrgVVw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=gSXpcA78Al8inqPHIWR1OmtmJqsZ0ojH8boq6w8t3/nzGU1QunIU5VOdaOHJ6OGxo2AK3pif0E7TVrRMScdcaJtQxRgxqtxZaCRyQnGuYu6hOwTmG2VHe2hDkNABkQwQzJ9dEDdlAI3c2CudoTSTKm95dvsat620p9nSpDbLEC8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=OtMm7tD1; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="OtMm7tD1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950833; x=1738486833; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dsIPqJPqsn+oitClY92kZLn0nh7sst/LKOjgIgrgVVw=; b=OtMm7tD18vgfuEJ7bcBko5HNjO1NGT9lx1G/IJVxZ/t9c9JDP8oAhobz yILgOgtymZ8mVX3IbIE8CInIHlf/JLc2v3gkdYf75ne73PF2yZt25WIG4 GNjtV2MVElTVWleKGrM2s3PM1pqOFKuDAG9Vt7INz+xyb9yBISq0sSO6t UU3sKZggb/38GI992PpvJVvxdDfnSvRcVXyxGGjWuAP1UKYzV2Mc3n+x+ sI4xBRcpIf9RJuLNkdqWmwUPUa2dVEwjiNq2RC0Prfhu0nzteFP2f9Qed TjkK7BJfCWJaGr/hWhckVozl1JX/qsDRdUZs/76HFM2Tx7aMPknCXcrbn g==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4131954" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4131954" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:00:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291265" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:00:26 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 08/26] KVM: x86: Expose TM/ACC (CPUID.0x01.edx[bit 29]) feature bit to VM Date: Sat, 3 Feb 2024 17:11:56 +0800 Message-Id: <20240203091214.411862-9-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhuocheng Ding The TM (Thermal Monitor, alias, TM1/ACC) feature is a dependency of thermal interrupt processing so that it is required for the HFI notification (a thermal interrupt) handling. According to SDM [1], the TM feature means: "The TM1 flag (bit 29) of the CPUID feature flags indicates the presence of the automatic thermal monitoring facilities that modulate clock duty cycles." Considering that the TM feature does not provide any OS interaction interface, but only indicates the presence of a hardware feature. Therefore, we do not need to perform any additional software emulation while exposing the TM feature bit. Expose the TM feature bit to the VM to support the VM in handling the thermal interrupt. [1]: SDM, vol. 3B, section 15.8.4.1, Detection of Software Controlled Clock Modulation Extension. 
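As an illustration of what a guest sees once these bits are exposed, here is a minimal sketch (not part of the patch) that tests the ACPI and TM flags in a CPUID.01H:EDX value. The bit positions are the ones named in the subjects of patches 07 and 08; the macro and function names are hypothetical, and the EDX value is passed in as a plain integer rather than read with the CPUID instruction so that the sketch stays portable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in CPUID.01H:EDX, as named in the patch subjects. */
#define CPUID_01_EDX_ACPI_BIT 22u /* thermal monitor and clock modulation MSRs */
#define CPUID_01_EDX_TM_BIT   29u /* automatic thermal monitor (TM1/ACC) */

/* Hypothetical helper: report whether a single feature bit is set. */
static bool cpuid_edx_has(uint32_t edx, unsigned int bit)
{
	return (edx >> bit) & 1u;
}

/*
 * Hypothetical helper: both flags are described as dependencies of
 * thermal interrupt handling, so require both of them.
 */
static bool guest_can_handle_thermal_irq(uint32_t edx)
{
	return cpuid_edx_has(edx, CPUID_01_EDX_ACPI_BIT) &&
	       cpuid_edx_has(edx, CPUID_01_EDX_TM_BIT);
}
```

A guest that sees only one of the two bits would, under this assumption, treat thermal interrupt handling as unavailable.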
Tested-by: Yanting Jiang
Signed-off-by: Zhuocheng Ding
Co-developed-by: Zhao Liu
Signed-off-by: Zhao Liu
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 1ad547651022..829bb9c6516f 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -625,7 +625,7 @@ void kvm_set_cpu_caps(void) F(PAT) | F(PSE36) | 0 /* PSN */ | F(CLFLUSH) | 0 /* Reserved, DS */ | F(ACPI) | F(MMX) | F(FXSR) | F(XMM) | F(XMM2) | F(SELFSNOOP) | - 0 /* HTT, TM, Reserved, PBE */ + 0 /* HTT */ | F(ACC) | 0 /* Reserved, PBE */ ); kvm_cpu_cap_mask(CPUID_7_0_EBX,

From patchwork Sat Feb 3 09:11:57 2024
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543959
From: Zhao Liu
Subject: [RFC 09/26] KVM: x86: cpuid: Define CPUID 0x06.eax by kvm_cpu_cap_mask()
Date: Sat, 3 Feb 2024 17:11:57 +0800
Message-Id: <20240203091214.411862-10-zhao1.liu@linux.intel.com>
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>

From: Zhao Liu

The PTS, HFI, and ITD feature bits are to be specified in kvm_cpu_caps and depend on Host support.

Define kvm_cpu_caps[CPUID_6_EAX] with kvm_cpu_cap_mask() and use this 0x06 capability word to set the 0x06 leaf of the Guest.

Currently, only ARAT is supported in 0x06.eax. Although ARAT is not available on all CPUs with VMX support [1], commit e453aa0f7e7b ("KVM: x86: Allow ARAT CPU feature") always sets ARAT for the Guest because the APIC timer is emulated. Explicitly check ARAT in __do_cpuid_func() and make sure this feature bit is always set.
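The resulting value of CPUID.0x06.eax can be modelled as follows. This is a simplified sketch rather than the KVM code: guest_cpuid06_eax() and its parameters are hypothetical names, and the capability word is approximated as the Host's EAX masked by the set of bits KVM is willing to expose.

```c
#include <stdint.h>

/* ARAT is CPUID.0x06.eax[bit 2]; the diff sets it with the literal 0x4. */
#define CPUID_06_EAX_ARAT (UINT32_C(1) << 2)

/*
 * Hypothetical model of the Guest-visible CPUID.0x06.eax:
 * keep only the Host bits that KVM exposes, then force ARAT on,
 * since the APIC timer is emulated regardless of Host support.
 */
static uint32_t guest_cpuid06_eax(uint32_t host_eax, uint32_t kvm_mask)
{
	uint32_t eax = host_eax & kvm_mask;

	return eax | CPUID_06_EAX_ARAT;
}
```

With only ARAT in the mask, the result is 0x4 whether or not the Host reports ARAT, which matches the behaviour before this change.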
[1]: https://lore.kernel.org/kvm/1523455369.20087.16.camel@intel.com/

Tested-by: Yanting Jiang
Signed-off-by: Zhao Liu
---
 arch/x86/kvm/cpuid.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 829bb9c6516f..d8cfae17cc92 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -628,6 +628,10 @@ void kvm_set_cpu_caps(void) 0 /* HTT */ | F(ACC) | 0 /* Reserved, PBE */ ); + kvm_cpu_cap_mask(CPUID_6_EAX, + F(ARAT) + ); + kvm_cpu_cap_mask(CPUID_7_0_EBX, F(FSGSBASE) | F(SGX) | F(BMI1) | F(HLE) | F(AVX2) | F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) | F(INVPCID) | @@ -964,7 +968,12 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) } break; case 6: /* Thermal management */ - entry->eax = 0x4; /* allow ARAT */ + cpuid_entry_override(entry, CPUID_6_EAX); + + /* Always allow ARAT since APICs are emulated. */ + if (!kvm_cpu_cap_has(X86_FEATURE_ARAT)) + entry->eax |= 0x4; + entry->ebx = 0; entry->ecx = 0; entry->edx = 0;

From patchwork Sat Feb 3 09:11:58 2024
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543960
From: Zhao Liu
Subject: [RFC 10/26] KVM: VMX: Emulate PTM/PTS (CPUID.0x06.eax[bit 6]) feature
Date: Sat, 3 Feb 2024 17:11:58 +0800
Message-Id: <20240203091214.411862-11-zhao1.liu@linux.intel.com>
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>

From: Zhuocheng Ding

The PTM feature (Package Thermal Management, also known as PTS) is a dependency of the Hardware Feedback Interface (HFI) feature. To support HFI virtualization, the PTM feature must also be emulated in KVM.

The PTM feature provides 2 package-level thermal MSRs: MSR_IA32_PACKAGE_THERM_INTERRUPT and MSR_IA32_PACKAGE_THERM_STATUS.

Currently KVM doesn't support MSR topology (only thread-scope MSRs are handled; no other topological scopes are supported), but since PTM's package thermal MSRs are only used on client platforms with a single package, it's enough to handle these 2 MSRs at VM level. Additionally, a mutex is used to serialize accesses from different vCPUs to the emulated MSR values stored in kvm_vmx.

PTM also indicates the presence of package-level thermal interrupts, which the VM needs in order to handle such interrupts.
The ACPI emulation patch has already added the support for thermal interrupt injection, and this also reflects the integrity of the PTM emulation. Although thermal interrupts are not actually injected into the Guest now, in the following HFI/ITD emulations, thermal interrupts will be injected into the Guest once the conditions are met. In addition, expose the CPUID bit of the PTM feature to the VM, which can help enable package thermal interrupt handling in VM. Tested-by: Yanting Jiang Signed-off-by: Zhuocheng Ding Co-developed-by: Zhao Liu Signed-off-by: Zhao Liu --- arch/x86/kvm/cpuid.c | 11 ++++++ arch/x86/kvm/svm/svm.c | 2 ++ arch/x86/kvm/vmx/vmx.c | 76 +++++++++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 9 +++++ arch/x86/kvm/x86.c | 2 ++ 5 files changed, 99 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index d8cfae17cc92..eaac2c8d98b9 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -632,6 +632,17 @@ void kvm_set_cpu_caps(void) F(ARAT) ); + /* + * PTS is the dependency of ITD, currently we only use PTS for + * enabling ITD in KVM. Since KVM does not support msr topology at + * present, the emulation of PTS has restrictions on the topology of + * Guest, so we only expose PTS when Host enables ITD. 
+ */ + if (cpu_feature_enabled(X86_FEATURE_ITD)) { + if (boot_cpu_has(X86_FEATURE_PTS)) + kvm_cpu_cap_set(X86_FEATURE_PTS); + } + kvm_cpu_cap_mask(CPUID_7_0_EBX, F(FSGSBASE) | F(SGX) | F(BMI1) | F(HLE) | F(AVX2) | F(FDP_EXCPTN_ONLY) | F(SMEP) | F(BMI2) | F(ERMS) | F(INVPCID) | diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 2e22d5e86768..7039ae48d8d0 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4291,6 +4291,8 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index) case MSR_IA32_THERM_CONTROL: case MSR_IA32_THERM_INTERRUPT: case MSR_IA32_THERM_STATUS: + case MSR_IA32_PACKAGE_THERM_INTERRUPT: + case MSR_IA32_PACKAGE_THERM_STATUS: return false; case MSR_IA32_SMBASE: if (!IS_ENABLED(CONFIG_KVM_SMM)) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index aa37b55cf045..45b40a47b448 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -183,6 +183,29 @@ module_param(allow_smaller_maxphyaddr, bool, S_IRUGO); THERM_MASK_THRESHOLD0 | THERM_INT_THRESHOLD0_ENABLE | \ THERM_MASK_THRESHOLD1 | THERM_INT_THRESHOLD1_ENABLE) +/* HFI (CPUID.06H:EAX[19]) is not emulated in kvm yet. */ +#define MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK (PACKAGE_THERM_STATUS_PROCHOT | \ + PACKAGE_THERM_STATUS_PROCHOT_EVENT | PACKAGE_THERM_STATUS_CRITICAL_TEMP | \ + THERM_STATUS_THRESHOLD0 | THERM_STATUS_THRESHOLD1 | \ + PACKAGE_THERM_STATUS_POWER_LIMIT | PACKAGE_THERM_STATUS_DIG_READOUT_MASK) +#define MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK (PACKAGE_THERM_STATUS_PROCHOT_LOG | \ + PACKAGE_THERM_STATUS_PROCHOT_EVENT_LOG | PACKAGE_THERM_STATUS_CRITICAL_TEMP_LOG | \ + THERM_LOG_THRESHOLD0 | THERM_LOG_THRESHOLD1 | \ + PACKAGE_THERM_STATUS_POWER_LIMIT_LOG) +/* MSR_IA32_PACKAGE_THERM_STATUS unavailable bits mask: unsupported and reserved bits. 
*/ +#define MSR_IA32_PACKAGE_THERM_STATUS_UNAVAIL_MASK (~(MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK | \ + MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK)) + +/* + * MSR_IA32_PACKAGE_THERM_INTERRUPT available bits mask. + * HFI (CPUID.06H:EAX[19]) is not emulated in kvm yet. + */ +#define MSR_IA32_PACKAGE_THERM_INTERRUPT_AVAIL_MASK (PACKAGE_THERM_INT_HIGH_ENABLE | \ + PACKAGE_THERM_INT_LOW_ENABLE | PACKAGE_THERM_INT_PROCHOT_ENABLE | \ + PACKAGE_THERM_INT_OVERHEAT_ENABLE | THERM_MASK_THRESHOLD0 | \ + THERM_INT_THRESHOLD0_ENABLE | THERM_MASK_THRESHOLD1 | \ + THERM_INT_THRESHOLD1_ENABLE | PACKAGE_THERM_INT_PLN_ENABLE) + /* * List of MSRs that can be directly passed to the guest. * In addition to these x2apic and PT MSRs are handled specially. @@ -2013,6 +2036,7 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr) static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct vcpu_vmx *vmx = to_vmx(vcpu); + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); struct vmx_uret_msr *msr; u32 index; @@ -2166,6 +2190,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; msr_info->data = vmx->msr_ia32_therm_status; break; + case MSR_IA32_PACKAGE_THERM_INTERRUPT: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_PTS)) + return 1; + msr_info->data = kvm_vmx->pkg_therm.msr_pkg_therm_int; + break; + case MSR_IA32_PACKAGE_THERM_STATUS: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_PTS)) + return 1; + msr_info->data = kvm_vmx->pkg_therm.msr_pkg_therm_status; + break; default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_info->index); @@ -2226,6 +2262,7 @@ static inline u64 vmx_set_msr_rwc0_bits(u64 new_val, u64 old_val, u64 rwc0_mask) static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct vcpu_vmx *vmx = to_vmx(vcpu); + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); struct vmx_uret_msr *msr; int ret = 0; u32 msr_index = msr_info->index; @@ -2543,7 +2580,35 
@@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } vmx->msr_ia32_therm_status = data; break; + case MSR_IA32_PACKAGE_THERM_INTERRUPT: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_PTS)) + return 1; + /* Unsupported and reserved bits: generate the exception. */ + if (!msr_info->host_initiated && + data & ~MSR_IA32_PACKAGE_THERM_INTERRUPT_AVAIL_MASK) + return 1; + kvm_vmx->pkg_therm.msr_pkg_therm_int = data; + break; + case MSR_IA32_PACKAGE_THERM_STATUS: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_PTS)) + return 1; + /* Unsupported and reserved bits: generate the exception. */ + if (!msr_info->host_initiated && + data & MSR_IA32_PACKAGE_THERM_STATUS_UNAVAIL_MASK) + return 1; + mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); + if (!msr_info->host_initiated) { + data = vmx_set_msr_rwc0_bits(data, kvm_vmx->pkg_therm.msr_pkg_therm_status, + MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK); + data = vmx_set_msr_ro_bits(data, kvm_vmx->pkg_therm.msr_pkg_therm_status, + MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK); + } + kvm_vmx->pkg_therm.msr_pkg_therm_status = data; + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); + break; default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_index); @@ -7649,6 +7714,14 @@ static int vmx_vcpu_create(struct kvm_vcpu *vcpu) return err; } +static int vmx_vm_init_pkg_therm(struct kvm *kvm) +{ + struct pkg_therm_desc *pkg_therm = &to_kvm_vmx(kvm)->pkg_therm; + + mutex_init(&pkg_therm->pkg_therm_lock); + return 0; +} + #define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n" #define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. 
See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n" @@ -7680,7 +7753,8 @@ static int vmx_vm_init(struct kvm *kvm) break; } } - return 0; + + return vmx_vm_init_pkg_therm(kvm); } static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index e159dd5b7a66..5723780da180 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -369,6 +369,13 @@ struct vcpu_vmx { } shadow_msr_intercept; }; +struct pkg_therm_desc { + u64 msr_pkg_therm_int; + u64 msr_pkg_therm_status; + /* All members before "struct mutex pkg_therm_lock" are protected by the lock. */ + struct mutex pkg_therm_lock; +}; + struct kvm_vmx { struct kvm kvm; @@ -377,6 +384,8 @@ struct kvm_vmx { gpa_t ept_identity_map_addr; /* Posted Interrupt Descriptor (PID) table for IPI virtualization */ u64 *pid_table; + + struct pkg_therm_desc pkg_therm; }; void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 50aceb0ce4ee..7d787ced513f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1548,6 +1548,8 @@ static const u32 emulated_msrs_all[] = { MSR_IA32_THERM_CONTROL, MSR_IA32_THERM_INTERRUPT, MSR_IA32_THERM_STATUS, + MSR_IA32_PACKAGE_THERM_INTERRUPT, + MSR_IA32_PACKAGE_THERM_STATUS, /* * KVM always supports the "true" VMX control MSRs, even if the host

From patchwork Sat Feb 3 09:11:59 2024
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543961
From: Zhao Liu
Subject: [RFC 11/26] KVM: VMX: Introduce HFI description structure
Date: Sat, 3 Feb 2024 17:11:59 +0800
Message-Id: <20240203091214.411862-12-zhao1.liu@linux.intel.com>
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>

From: Zhao Liu

HFI/ITD virtualization needs to maintain a virtual HFI table in KVM. This virtual HFI table is kept in sync with updates to the Host's HFI table, and KVM then writes it to the Guest's HFI table address. During this sync process, KVM also needs to know the status of the Guest HFI.

Therefore, provide the hfi_desc structure to store the following:
* The state flags of Guest HFI.
* The basic information of Guest HFI.
* The local virtual HFI table.

The PTS feature is emulated at VM level, so hfi_desc is also kept at VM level.
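As a rough sketch of how such state flags could be derived from the Guest's MSR values: the struct and function below are hypothetical and are not the code of this series. Bits 25 and 26 are the positions quoted in the kernel-doc of this patch; treating bit 0 of MSR_IA32_HW_FEEDBACK_CONFIG as the HFI enable bit, bit 0 of MSR_IA32_HW_FEEDBACK_PTR as the valid bit, and the low 12 bits of the pointer as non-address bits are assumptions based on my reading of the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

#define HW_FEEDBACK_CONFIG_HFI_ENABLE	(UINT64_C(1) << 0)	/* assumption */
#define HW_FEEDBACK_PTR_VALID		(UINT64_C(1) << 0)	/* assumption */
#define HW_FEEDBACK_PTR_ADDR_MASK	(~UINT64_C(0xfff))	/* assumption: 4 KiB aligned */
#define PKG_THERM_INT_HFI_ENABLE	(UINT64_C(1) << 25)	/* per kernel-doc */
#define PKG_THERM_STATUS_HFI_UPDATED	(UINT64_C(1) << 26)	/* per kernel-doc */

/* Hypothetical, reduced stand-in for the state flags kept in hfi_desc. */
struct hfi_state_sketch {
	bool hfi_enabled;
	bool hfi_int_enabled;
	bool table_ptr_valid;
	bool hfi_update_status;
	uint64_t table_base;
};

/* Derive the flags from the four Guest MSR values. */
static struct hfi_state_sketch parse_hfi_state(uint64_t hw_feedback_config,
					       uint64_t hw_feedback_ptr,
					       uint64_t pkg_therm_int,
					       uint64_t pkg_therm_status)
{
	struct hfi_state_sketch s = {
		.hfi_enabled = hw_feedback_config & HW_FEEDBACK_CONFIG_HFI_ENABLE,
		.hfi_int_enabled = pkg_therm_int & PKG_THERM_INT_HFI_ENABLE,
		.table_ptr_valid = hw_feedback_ptr & HW_FEEDBACK_PTR_VALID,
		.hfi_update_status = pkg_therm_status & PKG_THERM_STATUS_HFI_UPDATED,
		.table_base = hw_feedback_ptr & HW_FEEDBACK_PTR_ADDR_MASK,
	};

	return s;
}
```

Keeping the parsed flags next to the raw MSR values avoids re-decoding the MSRs every time KVM has to decide whether a table update can be delivered.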
Tested-by: Yanting Jiang
Signed-off-by: Zhao Liu
---
 arch/x86/kvm/vmx/vmx.c | 2 ++
 arch/x86/kvm/vmx/vmx.h | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 45b40a47b448..48f304683d6f 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -8386,7 +8386,9 @@ static void vmx_hardware_unsetup(void) static void vmx_vm_destroy(struct kvm *kvm) { struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + kfree(kvm_vmx_hfi->hfi_table.base_addr); free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm)); } diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 5723780da180..4bf4ca6ac1c0 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -7,6 +7,7 @@ #include #include #include +#include #include "capabilities.h" #include "../kvm_cache_regs.h" @@ -369,9 +370,49 @@ struct vcpu_vmx { } shadow_msr_intercept; }; +/** + * struct hfi_desc - Representation of an HFI instance (i.e., a table) + * @hfi_enabled: Flag to indicate whether HFI is enabled at runtime. + * Parsed from the Guest's MSR_IA32_HW_FEEDBACK_CONFIG. + * @hfi_int_enabled: Flag to indicate whether the HFI notification interrupt is enabled at runtime. + * Parsed from Guest's MSR_IA32_PACKAGE_THERM_INTERRUPT[bit 25]. + * @table_ptr_valid: Flag to indicate whether the memory of Guest HFI table is ready. + * Parsed from the valid bit of Guest's MSR_IA32_HW_FEEDBACK_PTR. + * @hfi_update_status: Flag to indicate whether Guest has handled the virtual HFI table + * update. + * Parsed from Guest's MSR_IA32_PACKAGE_THERM_STATUS[bit 26]. + * @hfi_update_pending: Flag to indicate whether there's any update on Host that is not + * synced to Guest. + * KVM should update the Guest's HFI table and inject the notification + * until Guest has cleared hfi_update_status.
+ * @table_base: GPA of Guest's HFI table, which is parsed from Guest's + * MSR_IA32_HW_FEEDBACK_PTR. + * @hfi_features: Feature information based on Guest's HFI/ITD CPUID. + * @hfi_table: Local virtual HFI table based on the HFI data of the pCPU that + * the vCPU is running on. + * When KVM updates the Guest's HFI table, it writes the local + * virtual HFI table to the Guest HFI table memory in @table_base. + * + * A set of status flags and feature information, used to maintain local virtual HFI table + * and sync updates to Guest HFI table. + */ + +struct hfi_desc { + bool hfi_enabled; + bool hfi_int_enabled; + bool table_ptr_valid; + bool hfi_update_status; + bool hfi_update_pending; + gpa_t table_base; + struct hfi_features hfi_features; + struct hfi_table hfi_table; +}; + struct pkg_therm_desc { u64 msr_pkg_therm_int; u64 msr_pkg_therm_status; + /* Currently HFI is only supported at package level. */ + struct hfi_desc hfi_desc; /* All members before "struct mutex pkg_therm_lock" are protected by the lock. 
*/ struct mutex pkg_therm_lock; };

From patchwork Sat Feb 3 09:12:00 2024
X-Patchwork-Submitter: Zhao Liu
X-Patchwork-Id: 13543962
From: Zhao Liu
Subject: [RFC 12/26] KVM: VMX: Introduce HFI table index for vCPU
Date: Sat, 3 Feb 2024 17:12:00 +0800
Message-Id: <20240203091214.411862-13-zhao1.liu@linux.intel.com>
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>

From: Zhao Liu

The HFI table contains a table header and many table entries. Each table entry is identified by an HFI table index, and each CPU corresponds to one of the HFI table indexes [1].

Add hfi_table_idx to vcpu_vmx; it will be used to build the virtual HFI table.

This HFI index is initialized to 0, but a following patch will allow the VMM to configure it with a custom value (CPUID.0x06.edx[bits 16-31]).

[1]: SDM, vol. 3B, section 15.6.1 Hardware Feedback Interface Table Structure

Tested-by: Yanting Jiang
Signed-off-by: Zhao Liu
---
 arch/x86/kvm/vmx/vmx.c | 6 ++++++
 arch/x86/kvm/vmx/vmx.h | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 48f304683d6f..96f0f768939d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7648,6 +7648,12 @@ static int vmx_vcpu_create(struct kvm_vcpu *vcpu) tsx_ctrl->mask = ~(u64)TSX_CTRL_CPUID_CLEAR; } + /* + * hfi_table_idx is initialized to 0, but later it may be changed according + * to the value in the Guest's CPUID.0x06.edx[bits 16-31].
+ */ + vmx->hfi_table_idx = 0; + err = alloc_loaded_vmcs(&vmx->vmcs01); if (err < 0) goto free_pml; diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 4bf4ca6ac1c0..63874aad7ae3 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -362,6 +362,9 @@ struct vcpu_vmx { struct pt_desc pt_desc; struct lbr_desc lbr_desc; + /* Should be extracted from Guest's CPUID.0x06.edx[bits 16-31]. */ + int hfi_table_idx; + /* Save desired MSR intercept (read: pass-through) state */ #define MAX_POSSIBLE_PASSTHROUGH_MSRS 16 struct { From patchwork Sat Feb 3 09:12:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543963 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA5185FDB7; Sat, 3 Feb 2024 09:01:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950862; cv=none; b=khSBATT0lVndlQBILkMyrdEy1saxLdkVadGwReSyn8Aw+u3XU8PUsYyahW1MPhTwhH0ju538heY/EW8ANDmAbN1pf4IebmTTHISP899PTypOFEzho1RxEqQgrhzl+cAdBuPZ2y8onpS8rtFbAMFD+9Z83LaPjIDghtfbD2gBc2k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950862; c=relaxed/simple; bh=SIIfDj0zX5GYY4IEtDgxTxiCH4JC5qldRU/UOzkQNm0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=VjN9MZAPmylvYZzs1RC8YCREskD9bb3Sf+t885xnuWWRuzByUCGMKT8eJfHgcqT1gUvhzg2fb55RZ6EhIsefMS8UbHedIsqxWNC06tYOS2iFrNxu1uDTe3wveiwewRj7TapsH4YbLT5pq+o60LUp3V6vOK63CJE3q4CgcLLhsfc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com 
header.b=JYdl/2He; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="JYdl/2He" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950861; x=1738486861; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SIIfDj0zX5GYY4IEtDgxTxiCH4JC5qldRU/UOzkQNm0=; b=JYdl/2HeBi52H//58wlbNOnv/kE3XYETHRw0Nr8pWcqgD7jfTIxrHgKf /+mN/JKcCJ1d70yJFajX5hpBxnO2XutWkkNlWkpobbELwLXMeTMA0AoQx 5Mwt+TUj3djuxynDkjLhrg7FwuzWIYUm6UM3VP2TBqVOXNc0GFtg3xNwP 0XatsHBv/WQe83v60OTIixBPjlG21kEI+01vzyfx7SbgGbiUoOlf9TOTW YfQHVVN2TslvCRvQmuXBYMLPl8T16YUrPLFTNfseyWrpJiZ/1fZnPGv8E autM435WmNrIjsyT8b8FNWMYLyaWRo3cdiM/dL6WUm7sJNoRjIaM/TZB/ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132017" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132017" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:01 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291390" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:00:54 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 13/26] KVM: VMX: Support virtual HFI table for VM
Date: Sat, 3 Feb 2024 17:12:01 +0800
Message-Id: <20240203091214.411862-14-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhuocheng Ding

The Hardware Feedback Interface (HFI) allows hardware to provide guidance to the operating system scheduler through a hardware feedback interface structure (the HFI table) in memory [1], so that the scheduler can schedule workloads optimally.

Both ITD (Intel Thread Director) and HFI depend on the HFI table, but their tables differ slightly: the table provided by ITD has 4 classes (i.e. more columns in the table), while native HFI supports 1 class [2].

The size of the HFI table is determined by the feature bits that the processor supports, but the range of data updated in the table is determined by the feature actually enabled (HFI or ITD) [3], which is controlled by MSR_IA32_HW_FEEDBACK_CONFIG.

To let scheduling inside a VM benefit from HFI/ITD, KVM needs to maintain virtual HFI tables. The virtual HFI table is based on the real HFI table: we extract the HFI entries corresponding to the pCPU that each vCPU is running on, and reorganize these entries into a new virtual HFI table indexed by the vCPU's HFI index.
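For illustration only, the row remapping described above can be sketched in user-space C. This is a simplified model under stated assumptions, not the intel_hfi_build_virt_table() helper used by this series: struct hfi_entry, build_virt_row() and the flat array layout are hypothetical, and only one class is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-class HFI entry: performance and energy-efficiency capability. */
struct hfi_entry {
	uint8_t perf_cap;
	uint8_t ee_cap;
};

/*
 * Copy the host entry of the pCPU that a vCPU runs on into the row of the
 * virtual table selected by the vCPU's HFI index.
 *
 * Returns 1 if the virtual row changed, 0 if it was already up to date,
 * and -1 if either index is out of range.
 */
static int build_virt_row(struct hfi_entry *virt, size_t nr_virt,
			  const struct hfi_entry *host, size_t nr_host,
			  size_t hfi_table_idx, size_t pcpu)
{
	assert(virt && host);

	if (hfi_table_idx >= nr_virt || pcpu >= nr_host)
		return -1;

	if (virt[hfi_table_idx].perf_cap == host[pcpu].perf_cap &&
	    virt[hfi_table_idx].ee_cap == host[pcpu].ee_cap)
		return 0;

	virt[hfi_table_idx] = host[pcpu];
	return 1;
}
```

A caller would invoke this once per vCPU and OR the positive return values together to learn whether anything changed, which mirrors how vmx_build_hfi_table() accumulates its "updated" flag.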
Also, to simplify the logic, before the emulation of ITD is supported, we build virtual HFI table based on HFI feature by default (i.e. only 1 class is supported, based on class 0 of real hardware). Add the interfaces to initialize and build the virtual HFI table, and to inject the thermal interrupt into the VM to notify about HFI updates. [1]: SDM, vol. 3B, section 15.6 HARDWARE FEEDBACK INTERFACE AND INTEL THREAD DIRECTOR [2]: SDM, vol. 3B, section 15.6.2 Intel Thread Director Table Structure [3]: SDM, vol. 3B, section 15.6.5 Hardware Feedback Interface Configuration, Table 15-10. IA32_HW_FEEDBACK_CONFIG Control Option Tested-by: Yanting Jiang Signed-off-by: Zhuocheng Ding Co-developed-by: Zhao Liu Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 119 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 96f0f768939d..7881f6b51daa 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1532,6 +1532,125 @@ static void vmx_inject_therm_interrupt(struct kvm_vcpu *vcpu) kvm_apic_therm_deliver(vcpu); } +static inline bool vmx_hfi_initialized(struct kvm_vmx *kvm_vmx) +{ + return kvm_vmx->pkg_therm.hfi_desc.hfi_enabled && + kvm_vmx->pkg_therm.hfi_desc.table_ptr_valid; +} + +static inline bool vmx_hfi_int_enabled(struct kvm_vmx *kvm_vmx) +{ + return kvm_vmx->pkg_therm.hfi_desc.hfi_int_enabled; +} + +static int vmx_init_hfi_table(struct kvm *kvm) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + struct hfi_features *hfi_features = &kvm_vmx_hfi->hfi_features; + struct hfi_table *hfi_table = &kvm_vmx_hfi->hfi_table; + int nr_classes, ret = 0; + + /* + * Currently we haven't supported ITD. HFI is the default feature + * with 1 class. 
+ */ + nr_classes = 1; + ret = intel_hfi_build_virt_features(hfi_features, + nr_classes, + kvm->created_vcpus); + if (unlikely(ret)) + return ret; + + hfi_table->base_addr = kzalloc(hfi_features->nr_table_pages << + PAGE_SHIFT, GFP_KERNEL); + if (!hfi_table->base_addr) + return -ENOMEM; + + hfi_table->hdr = hfi_table->base_addr + sizeof(*hfi_table->timestamp); + hfi_table->data = hfi_table->hdr + hfi_features->hdr_size; + return 0; +} + +static int vmx_build_hfi_table(struct kvm *kvm) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + struct hfi_features *hfi_features = &kvm_vmx_hfi->hfi_features; + struct hfi_table *hfi_table = &kvm_vmx_hfi->hfi_table; + struct hfi_hdr *hfi_hdr = hfi_table->hdr; + int nr_classes, ret = 0, updated = 0; + struct kvm_vcpu *v; + unsigned long i; + + /* + * Currently we haven't supported ITD. HFI is the default feature + * with 1 class. + */ + nr_classes = 1; + for (int j = 0; j < nr_classes; j++) { + hfi_hdr->perf_updated = 0; + hfi_hdr->ee_updated = 0; + hfi_hdr++; + } + + kvm_for_each_vcpu(i, v, kvm) { + ret = intel_hfi_build_virt_table(hfi_table, hfi_features, + nr_classes, + to_vmx(v)->hfi_table_idx, + v->cpu); + if (unlikely(ret < 0)) + return ret; + updated |= ret; + } + + if (!updated) + return updated; + + /* Timestamp must be monotonic. */ + (*kvm_vmx_hfi->hfi_table.timestamp)++; + + /* Update the HFI table, whether the HFI interrupt is enabled or not. 
*/ + kvm_write_guest(kvm, kvm_vmx_hfi->table_base, hfi_table->base_addr, + hfi_features->nr_table_pages << PAGE_SHIFT); + return 1; +} + +static void vmx_update_hfi_table(struct kvm *kvm) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + int ret = 0; + + if (!intel_hfi_enabled()) + return; + + if (!vmx_hfi_initialized(kvm_vmx)) + return; + + if (!kvm_vmx_hfi->hfi_table.base_addr) { + ret = vmx_init_hfi_table(kvm); + if (unlikely(ret)) + return; + } + + ret = vmx_build_hfi_table(kvm); + if (ret <= 0) + return; + + kvm_vmx_hfi->hfi_update_status = true; + kvm_vmx_hfi->hfi_update_pending = false; + + /* + * Since HFI is shared for all vCPUs of the same VM, we + * actually support only 1 package topology VMs, so when + * emulating package level interrupt, we only inject an + * interrupt into one vCPU to reduce the overhead. + */ + if (vmx_hfi_int_enabled(kvm_vmx)) + vmx_inject_therm_interrupt(kvm_get_vcpu(kvm, 0)); +} + /* * Switches to specified vcpu, until a matching vcpu_put(), but assumes * vcpu mutex is already taken. 
From patchwork Sat Feb 3 09:12:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543964 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9B7325FEF8; Sat, 3 Feb 2024 09:01:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950868; cv=none; b=uqTvmk30TqcNMYNXiY2DDSVk4NyEZwa6C+JymrqDwGkLcUVk9NIgfXa5N6nAVTvQmkZG1ejlgWirgsHFZadqQ0MyYBxJKQJw0rSQm/S1ATf7bviHX2uUfwdpPBcg+udqLOI8tGjFgIanm1v3OFJN4xmYHdGXDqyM3XIrUki6a94= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950868; c=relaxed/simple; bh=jaSpkN47fYUzN6obTA+Fv5SVXL8pgNQFR+mSAgkAyoY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=g1+bRW+MUyYl0Bf8xOvOTxbkWj/bm/qosPckRRpIL7zUqs5qiuwQ8ntutdEBbo+bdkjNEkfexsiz7CkTkWe+/4i1Vz9T8Eqr3vflqqkULYh1mOtbUER9ae810DlFqqccK52Fac9Qd8x+FphyQzg7cC7e7MPhqt/TwT6iuZkqG6k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=WUvoFJal; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="WUvoFJal" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950867; x=1738486867; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jaSpkN47fYUzN6obTA+Fv5SVXL8pgNQFR+mSAgkAyoY=; b=WUvoFJal5haVfkyljBUyamGtlIUZ1MjP3iQsB8q/hgYQqgp8U9zg9sRY VnpoaGiT4v0tOzA2joZnFdBocnopNyXILz1plYCCx8AA/wygz+WrleDeF cK1j/8vwYsA9x/8hvw+YhACGGSreULeRyZbA+1SpXrhw8BWiGWqbsEm0e dRreTo35TxaG0h78FReEKOWO5zhfgZ1/LMjUXI0aITik3bFIDIN504c/Y hhR9xqnnFX3/2YTPKIWVU/zVgnhmMDPvR7xOA6v5gaCpZMpCwdkWVszBh J2F6dlfoNrl4EaRMLxrOs9T5oDMItXfRxPK1UQUMGvkwqlfeAX6yZner3 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132038" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132038" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291420" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:00 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 14/26] KVM: x86: Introduce the HFI dynamic update request and kvm_x86_ops
Date: Sat, 3 Feb 2024 17:12:02 +0800
Message-Id: <20240203091214.411862-15-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhao Liu

There are 2 cases in which the Guest HFI table needs to be updated dynamically:

1. When the Host's HFI table is updated, the change needs to be synced to the Guest.

2. When a vCPU thread migrates to another pCPU, the Guest HFI table needs to be rebuilt based on the HFI data of the pCPU that the vCPU is now running on.

So add the update mechanism, with a new request and a new op, to prepare for the above 2 cases:

- New KVM request type to perform the HFI update at vcpu_enter_guest().

  Updating the VM's HFI table results in writes to the VM's memory. This requires vCPU context, so HFI updates are pended via a KVM request until the vCPU is running.

  Here the request is only made for one vCPU per VM, because all vCPUs of the same VM share the same HFI table. This allows one vCPU to update the HFI table for the entire VM.

- New kvm_x86_op (optional for x86).

  When KVM processes KVM_REQ_HFI_UPDATE, this op is called to update the corresponding HFI table row for the specified vCPU.
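For illustration only, the pend-then-process pattern described above can be modelled in user-space C. This is a simplified sketch, not KVM code: the struct layouts and the make_request(), check_request() and enter_guest() helpers are hypothetical stand-ins, and plain (non-atomic) bit operations are used where KVM uses atomic ones.

```c
#include <assert.h>
#include <stdbool.h>

#define REQ_HFI_UPDATE (1u << 0)	/* hypothetical request bit */

struct vm {
	int table_updates;		/* how often the shared table was rebuilt */
};

struct vcpu {
	unsigned int requests;		/* pending request bits */
	struct vm *vm;
};

/* Pend a request; pending the same request twice leaves a single bit set. */
static void make_request(struct vcpu *v, unsigned int req)
{
	v->requests |= req;
}

/* Test and clear a pending request. */
static bool check_request(struct vcpu *v, unsigned int req)
{
	if (!(v->requests & req))
		return false;

	v->requests &= ~req;
	return true;
}

/* Model of the guest-entry path: the pended update runs in vCPU context. */
static void enter_guest(struct vcpu *v)
{
	if (check_request(v, REQ_HFI_UPDATE))
		v->vm->table_updates++;
}
```

Because the request is a single bit, several host notifications that arrive before the next guest entry collapse into one table update in this model.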
Tested-by: Yanting Jiang Signed-off-by: Zhao Liu --- arch/x86/include/asm/kvm-x86-ops.h | 3 ++- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/vmx/vmx.c | 30 ++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 2 ++ 4 files changed, 36 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 378ed944b849..1b16de7a03eb 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -136,8 +136,9 @@ KVM_X86_OP_OPTIONAL(migrate_timers) KVM_X86_OP(msr_filter_changed) KVM_X86_OP(complete_emulated_msr) KVM_X86_OP(vcpu_deliver_sipi_vector) -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons); +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons) KVM_X86_OP_OPTIONAL(get_untagged_addr) +KVM_X86_OP_OPTIONAL(update_hfi) #undef KVM_X86_OP #undef KVM_X86_OP_OPTIONAL diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b5b2d0fde579..e476a86b0766 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -121,6 +121,7 @@ KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) #define KVM_REQ_HV_TLB_FLUSH \ KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) +#define KVM_REQ_HFI_UPDATE KVM_ARCH_REQ(33) #define CR0_RESERVED_BITS \ (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \ @@ -1794,6 +1795,7 @@ struct kvm_x86_ops { unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu); gva_t (*get_untagged_addr)(struct kvm_vcpu *vcpu, gva_t gva, unsigned int flags); + void (*update_hfi)(struct kvm_vcpu *vcpu); }; struct kvm_x86_nested_ops { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 7881f6b51daa..93c47ba0817b 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1651,6 +1651,35 @@ static void vmx_update_hfi_table(struct kvm *kvm) vmx_inject_therm_interrupt(kvm_get_vcpu(kvm, 0)); } +static void vmx_dynamic_update_hfi_table(struct 
kvm_vcpu *vcpu) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + + if (!intel_hfi_enabled()) + return; + + mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); + + /* + * If Guest hasn't handled the previous update, just mark a pending + * flag to indicate that Host has more updates that KVM needs to sync. + */ + if (kvm_vmx_hfi->hfi_update_status) { + kvm_vmx_hfi->hfi_update_pending = true; + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); + return; + } + + /* + * The virtual HFI table is maintained at VM level so that vCPUs + * of the same VM are sharing the one HFI table. Therefore, one + * vCPU can update the HFI table for the whole VM. + */ + vmx_update_hfi_table(vcpu->kvm); + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); +} + /* * Switches to specified vcpu, until a matching vcpu_put(), but assumes * vcpu mutex is already taken. @@ -8703,6 +8732,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = { .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector, .get_untagged_addr = vmx_get_untagged_addr, + .update_hfi = vmx_dynamic_update_hfi_table, }; static unsigned int vmx_handle_intel_pt_intr(void) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7d787ced513f..bea3def6a4b1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10850,6 +10850,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (kvm_check_request(KVM_REQ_UPDATE_CPU_DIRTY_LOGGING, vcpu)) static_call(kvm_x86_update_cpu_dirty_logging)(vcpu); + if (kvm_check_request(KVM_REQ_HFI_UPDATE, vcpu)) + static_call(kvm_x86_update_hfi)(vcpu); } if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win || From patchwork Sat Feb 3 09:12:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543965 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B81A60270; Sat, 3 Feb 2024 09:01:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950873; cv=none; b=MZwWisrT+LS4zOuUH96ZCS5yIOHV783Qhado8YJqi0Bk0jlLhNoVEwhDnzAjTVBqB+K4buhRAEl+1Fm96xjHwJh0KS9v797I27uEAZl1UOYLdzFOOACZU/v+Fei1g9uCNSMmKd6KHu0XcjAGtHEPnGkLd5Jgsdb/aB7iJDXRxlI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950873; c=relaxed/simple; bh=6sr4GCiFFLTfE8DEtWSDum9V0+zRPamVMwEHqH2SYw0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=d19L3x5MNiskiHTsaP5Yq+/4wfbs+yjn1eCLjJHBGjLi/V8HRel0UoW0/9Brgbw/N0rSvq6PzP+Ya95PqJ2F4kFQBlGmVMn/DBiVWNuhGVn7Spouvv7hpVnfniwasT43hJS7j4qzPvR3r/P8PPglN32UwL4MNHRpamWbPaRY9+o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=PTSqnYHo; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="PTSqnYHo" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950872; x=1738486872; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6sr4GCiFFLTfE8DEtWSDum9V0+zRPamVMwEHqH2SYw0=; b=PTSqnYHo/xv1pvxEecK9sDeuLcw+ho3TsV0PnIVw1auIeJQ0WTXDXRxI NgyMdDJmeXTEbTT059lbzeIYoncdb+c6JXrargq2uCWMtasuNnth6PeKz PJH0cwRkcz2FtzUnnVIam49bWwKg+bqEMlgP9iHCvJYHiBbgydmBD5kbL 
lUAcXYxSMjO8kzO2fOrHeUu4w4qPLBLccQ5VeNeaPa+DTU1yWjCSGfd0/ SOAfgu+xhTAej0dDFiU+WNh9+//hIp4R3qLPZoEGTIuQSWXSZwk1smvIm rDcySKvtm4xQSA9HgUhyrpSVkT8jRmlfprkQBJWv6i+FULw72gA2EfKrk A==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132041" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132041" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:12 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291438" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:06 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 15/26] KVM: VMX: Sync update of Host HFI table to Guest Date: Sat, 3 Feb 2024 17:12:03 +0800 Message-Id: <20240203091214.411862-16-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhuocheng Ding The HFI table could be updated via thermal interrupt as the actual operating conditions of the processor change during runtime [1], so it is required to synchronize hardware hint changes to Guest's HFI table in time. 
Provide the interfaces to register/unregister the Host's HFI update, and in the callback of the notification, make HFI update request to update Guest's HFI table before entering Guest. [1]: SDM, vol. 3B, section 15.6.7 Hardware Feedback Interface and Intel Thread Director Structure Dynamic Update Tested-by: Yanting Jiang Signed-off-by: Zhuocheng Ding Co-developed-by: Zhao Liu Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 59 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 8 ++++++ 2 files changed, 67 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 93c47ba0817b..0ad5e3473a28 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1651,6 +1651,61 @@ static void vmx_update_hfi_table(struct kvm *kvm) vmx_inject_therm_interrupt(kvm_get_vcpu(kvm, 0)); } +static void vmx_hfi_notifier_register(struct kvm *kvm) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + + if (!intel_hfi_enabled()) + return; + + if (!vmx_hfi_initialized(kvm_vmx)) + return; + + if (kvm_vmx_hfi->has_hfi_instance) + return; + + /* + * HFI/ITD virtualization is supported on the platforms with only + * 1 HFI instance. Just register notifier for vCPU 0. 
+ */ + kvm_vmx_hfi->has_hfi_instance = + !intel_hfi_notifier_register(&kvm_vmx_hfi->hfi_nb, + kvm_get_vcpu(kvm, 0)->cpu); +} + +static void vmx_hfi_notifier_unregister(struct kvm *kvm) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + + if (!kvm_vmx_hfi->has_hfi_instance) + return; + + intel_hfi_notifier_unregister(&kvm_vmx_hfi->hfi_nb, + kvm_get_vcpu(kvm, 0)->cpu); + kvm_vmx_hfi->has_hfi_instance = false; +} + +static int vmx_hfi_update_notify(struct notifier_block *nb, + unsigned long code, void *data) +{ + struct hfi_desc *kvm_vmx_hfi; + struct kvm *kvm; + + kvm_vmx_hfi = container_of(nb, struct hfi_desc, hfi_nb); + kvm = &kvm_vmx_hfi->vmx->kvm; + + /* + * Don't need to check if vcpu 0 belongs to + * kvm_vmx_hfi->host_hfi_instance since currently ITD/HFI + * virtualization is only supported for client platforms + * (with only one HFI instance). + */ + kvm_make_request(KVM_REQ_HFI_UPDATE, kvm_get_vcpu(kvm, 0)); + return NOTIFY_OK; +} + static void vmx_dynamic_update_hfi_table(struct kvm_vcpu *vcpu) { struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); @@ -7871,8 +7926,11 @@ static int vmx_vcpu_create(struct kvm_vcpu *vcpu) static int vmx_vm_init_pkg_therm(struct kvm *kvm) { struct pkg_therm_desc *pkg_therm = &to_kvm_vmx(kvm)->pkg_therm; + struct hfi_desc *kvm_vmx_hfi = &pkg_therm->hfi_desc; mutex_init(&pkg_therm->pkg_therm_lock); + kvm_vmx_hfi->hfi_nb.notifier_call = vmx_hfi_update_notify; + kvm_vmx_hfi->vmx = to_kvm_vmx(kvm); return 0; } @@ -8542,6 +8600,7 @@ static void vmx_vm_destroy(struct kvm *kvm) struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + vmx_hfi_notifier_unregister(kvm); kfree(kvm_vmx_hfi->hfi_table.base_addr); free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm)); } diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 63874aad7ae3..ff205bc0e99a 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ 
b/arch/x86/kvm/vmx/vmx.h @@ -395,6 +395,11 @@ struct vcpu_vmx { * the vCPU is running on. * When KVM updates the Guest's HFI table, it writes the local * virtual HFI table to the Guest HFI table memory in @table_base. + * @has_hfi_instance: Flag indicates if VM registers @hfi_nb on Host's HFI instance. + * @hfi_nb: Notifier block to be registered in Host HFI instance. + * @vmx: Points to the kvm_vmx where the current nb is located. + * Used to get the corresponding kvm_vmx of the nb when it + * is executed. * * A set of status flags and feature information, used to maintain local virtual HFI table * and sync updates to Guest HFI table. @@ -409,6 +414,9 @@ struct hfi_desc { gpa_t table_base; struct hfi_features hfi_features; struct hfi_table hfi_table; + bool has_hfi_instance; + struct notifier_block hfi_nb; + struct kvm_vmx *vmx; }; struct pkg_therm_desc { From patchwork Sat Feb 3 09:12:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543966 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DBF8C6086E; Sat, 3 Feb 2024 09:01:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950879; cv=none; b=UaGQTJqd38Ux+1UH/zMbkyTZJ7wyhVeUEwnaGNANCeGAv0uqJUdUM5mudZpYYvtGLXXl7DddxskIzsua+umUPtDLbWo8tGgshtgbNzyHRT00OvZ23SPBwms2Xwiz1Klg1sSB4xZbNaHfCWJCpJGUeis/qSqNcVqe8MCIAKwHuRk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950879; c=relaxed/simple; bh=o9Ea4he1GyJdoS3fqNSGXkZQw07L9///FXUYj80JU5U=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=oLhRptss+HYgza5tvMN2hw4SKyBx71bmpStT+7/c27Y325Rdwgw/GseVobg/FuTcqMv7+pJxkzeAmBrItsWUwHEHfTQ/sjyJsqlKQhwhTYZfejVc0AoyLvvl+UwTIKJxLTm0XK6lzx+Eg9JixGZagWnTvwwVhI4wR41QrdiTbK4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ho84Tknx; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ho84Tknx" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950878; x=1738486878; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=o9Ea4he1GyJdoS3fqNSGXkZQw07L9///FXUYj80JU5U=; b=ho84TknxSFDncqBiIGAInwPEc9R1sZIU2l7u5x6rySaMBs8YvndVtUro fVFoF+XKepcIRIVwcLuEKq9GlIWFkjCRBD9+PrLhA/6YBOWhk0FbSz6Df x2++Q+OzoyrxVCPDKwWq4ezXLGNY6vd+nrqjXn86RvB2wlGM42xsQmwUp 1czQ//dGkTV2DoYrMLovGsCbVdudttw8V2nLZ9uJunDS3ZH5VpDNQ6Y46 SxBsw3eXNOa91AFOVzZDMC6iD0ILANm5zJ+acuD1lHLA1SSbhU1QECmeo +5GWtX3O36LE6ZOtZyx81/35TnAZ9eQBwrd4DKuDoRww0gnfsazBe6oF9 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132054" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132054" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291457" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:11 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . 
Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 16/26] KVM: VMX: Update HFI table when vCPU migrates
Date: Sat, 3 Feb 2024 17:12:04 +0800
Message-Id: <20240203091214.411862-17-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhuocheng Ding

When a vCPU migrates to a different pCPU, the virtual HFI data corresponding to the vCPU's HFI index should be updated with the new pCPU's data.

There is no need to re-register the HFI notifier, because ITD/HFI virtualization is currently only supported on client platforms (with only one HFI instance).

So, when a vCPU migrates, make the request to update the virtual HFI table.
Tested-by: Yanting Jiang Signed-off-by: Zhuocheng Ding Co-developed-by: Zhao Liu Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0ad5e3473a28..44c09c995120 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1735,6 +1735,17 @@ static void vmx_dynamic_update_hfi_table(struct kvm_vcpu *vcpu) mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); } +static void vmx_vcpu_hfi_load(struct kvm_vcpu *vcpu, int cpu) +{ + if (!intel_hfi_enabled()) + return; + + if (!vmx_hfi_initialized(to_kvm_vmx(vcpu->kvm))) + return; + + kvm_make_request(KVM_REQ_HFI_UPDATE, vcpu); +} + /* * Switches to specified vcpu, until a matching vcpu_put(), but assumes * vcpu mutex is already taken. @@ -1748,6 +1759,9 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) vmx_vcpu_pi_load(vcpu, cpu); vmx->host_debugctlmsr = get_debugctlmsr(); + + if (unlikely(vcpu->cpu != cpu)) + vmx_vcpu_hfi_load(vcpu, cpu); } static void vmx_vcpu_put(struct kvm_vcpu *vcpu) From patchwork Sat Feb 3 09:12:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543967 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB9C45D489; Sat, 3 Feb 2024 09:01:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950886; cv=none; b=hqfQireyjwqG8rMVmn70vqkSPtYmPr6+y95c9cC4zpNyPlqkPI8pteJ5dtfA1y3eqc1aOY3a8jWcabAKRwsT6Sd1zU/puyTFivwdAQWiflqF3eTyYw7uDHxnbN/1kBKtzRvbk5594gUDcCJCrb6B7lhh6tCUtq6eOOG9bdubNtY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950886; 
c=relaxed/simple; bh=SMoF3nIPJhY2CvvIts5nS/Mk2TwIlNjiRdux5w3q9k8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=L7e0QZd/iimOaJ4hsH9HHb+Kw7D9wWlFTsHEcEbf7RmGAreprzaUFaInu8pGbog5L844+lHDd+uSHwSEK1ouAifXE7iOs4KsGTOc/ixm5fjMkK77ZsU1bIy0k/DOUR18r4aCT92d42zF9BqNyjJu6QfQBEDlHQ3p/pjoIfq5xGs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=AWRm4P9l; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="AWRm4P9l" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950885; x=1738486885; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SMoF3nIPJhY2CvvIts5nS/Mk2TwIlNjiRdux5w3q9k8=; b=AWRm4P9lOt9jckK0W6gew+9NvQoJCEKUaNX1QZvA+GPVwjjGfwPc3xeF Bmuq7/zV05gFQQJ8S0ZoER1p0C77FBuQ+7LWgh1gMgXFvaMh/qHKgdm8o NE2qS4A/HESDHIQPVfXSmIBjkFqKpf4oNTT+Y5p6xjSo0r/zrGtvW9zZJ ej0ueyTlte1W+ROoNk27EhYW6uKxwNZVKSExFSl1mdpCADYgNOcHa7V/W lTdCaUknK8FuwX+djLi0oBdZzjcUfpZDfwEiLZCadmHntxdmzZ6HagHp0 Rlpny6jWxckhzUdAr19vGRC3++m4Xb0q2lskJ1P+ZCl+CQWqBL15D31bC A==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132070" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132070" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:24 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291478" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) 
by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:17 -0800
From: Zhao Liu
To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu
Subject: [RFC 17/26] KVM: VMX: Allow to inject thermal interrupt without HFI update
Date: Sat, 3 Feb 2024 17:12:05 +0800
Message-Id: <20240203091214.411862-18-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

From: Zhao Liu

When the HFI table memory address is set by MSR_IA32_HW_FEEDBACK_PTR, or when MSR_IA32_HW_FEEDBACK_CONFIG enables the HFI feature, the hardware sends an initial HFI notification via thermal interrupt and sets the thermal status bit.

To prepare for these cases, extend vmx_update_hfi_table() with a forced_int parameter that allows forcing the thermal interrupt injection (with the thermal status bit set) regardless of whether there is an HFI table change to sync.
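For illustration only, the new early-return condition can be written as a small stand-alone predicate. The helper name hfi_update_proceeds() is hypothetical; the return-value convention (negative on error, 0 for no change, positive for a changed table) is the one used by vmx_build_hfi_table() in this series.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * build_ret: < 0 on error, 0 if the table did not change, > 0 if it did.
 * Returns true when the update path continues past the early return,
 * i.e. the logical negation of "ret < 0 || (!ret && !forced_int)".
 */
static bool hfi_update_proceeds(int build_ret, bool forced_int)
{
	if (build_ret < 0)
		return false;

	return build_ret > 0 || forced_int;
}
```

With forced_int set to false this reduces to the previous "ret > 0" behaviour, so existing callers are unaffected. An error from the table build still suppresses the notification even when forced_int is true.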
Tested-by: Yanting Jiang Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 44c09c995120..97bb7b304213 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1616,7 +1616,7 @@ static int vmx_build_hfi_table(struct kvm *kvm) return 1; } -static void vmx_update_hfi_table(struct kvm *kvm) +static void vmx_update_hfi_table(struct kvm *kvm, bool forced_int) { struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; @@ -1635,7 +1635,7 @@ static void vmx_update_hfi_table(struct kvm *kvm) } ret = vmx_build_hfi_table(kvm); - if (ret <= 0) + if (ret < 0 || (!ret && !forced_int)) return; kvm_vmx_hfi->hfi_update_status = true; @@ -1731,7 +1731,7 @@ static void vmx_dynamic_update_hfi_table(struct kvm_vcpu *vcpu) * of the same VM are sharing the one HFI table. Therefore, one * vCPU can update the HFI table for the whole VM. 
*/ - vmx_update_hfi_table(vcpu->kvm); + vmx_update_hfi_table(vcpu->kvm, false); mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); } From patchwork Sat Feb 3 09:12:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543968 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8422E612F5; Sat, 3 Feb 2024 09:01:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950891; cv=none; b=YaEBwA9S5F+ESzB/8SFZiyDVtX6dCbpzn+4Rc/XftNoTKuFY1Jm5sQsjp2f19rAYm0OJ6rSfw/dIVooVi8Jwx3Ntl+GReFvu9UUBIOQUQ3AjO8x7zcPFvfUXBc2xFrOgi5xFhxwTd1Cba/BY6zgFRyMNAipv6tQVbYM1WotmMEU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950891; c=relaxed/simple; bh=xPsoiGvQ/WepqnMDgjkmeX0HGRQmeVD7pNcmGBM6KEo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=FX0hp02W2hzeXLl9BJ8PRCRjW0fNN9AOup/SZB5mzVD7e1dHBv+T37E61kx8/SPyD2uY9dixF9ny8hKwFWI2UcXD2RPEtIRALZwACthrtZRrCCgkPrcB1c2nKTMxmOr07V8CEwQeOc/r/r7Fozkk9OYnAAxP5ZjZkWkoKgMhyzA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=XNxX8FLq; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="XNxX8FLq" DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950890; x=1738486890; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xPsoiGvQ/WepqnMDgjkmeX0HGRQmeVD7pNcmGBM6KEo=; b=XNxX8FLqfNEcl+ZjdB9QXcRB+D8nbyOonzMzp0jSwBw+TgRmZvbiKLMe sXehU72p9x095U2bxoBuzLEzHGmi5VtS5ydQKYTnmtrmvAP8Jzd6dvv6B xERrScTWeZ4xR5FXaUfchVY6aLJcEBJEA0W9KYupS6zoirgYmR/Pcy+8y TUK+wy8VwGF2MIEyTzsVnxuPUwg0exuKXd+sH/a3sgntX/ilMzv4ZbzBq ULPmgKL0lV/TY5FPpTgIuOKulOXqtuU/eUzPWLzPkt35BnxogbXc5pNFC H5DMPOvFimt90cXsiYKdQUECK/5mIdaysaP9OesjknFQ9LehYqpgDBB9V w==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132081" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132081" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291496" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:23 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 18/26] KVM: VMX: Emulate HFI related bits in package thermal MSRs Date: Sat, 3 Feb 2024 17:12:06 +0800 Message-Id: <20240203091214.411862-19-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhao Liu The HFI feature adds new bits to MSR_IA32_PACKAGE_THERM_STATUS and MSR_IA32_PACKAGE_THERM_INTERRUPT to control HFI status and notification: * MSR_IA32_PACKAGE_THERM_STATUS: PACKAGE_THERM_STATUS_HFI_UPDATED bit. This bit indicates whether there is a new HFI update. Whenever the HFI table is updated, the hardware sends an HFI notification and sets this bit to 1. The HFI table is updated again only after the OS clears this bit to 0. Emulate the logic of this bit to coordinate with updates of the Guest HFI table and to support the Guest's write-0-to-clear operation. * MSR_IA32_PACKAGE_THERM_INTERRUPT: PACKAGE_THERM_INT_HFI_ENABLE bit. This bit enables HFI notification. If it is set to 1, the hardware sends a thermal interrupt to notify the OS every time the HFI table is updated. Therefore, also emulate this bit to support the thermal interrupt when the Guest HFI table is updated. These status/control bits correspond to the flags in struct hfi_desc (namely hfi_update_status and hfi_int_enabled).
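The write-0-to-clear behaviour of the HFI status bit can be sketched outside the kernel as follows. This is a minimal model under stated assumptions: rwc0_write() is a hypothetical name that mirrors the expression in vmx_set_msr_rwc0_bits(), SKETCH_HFI_UPDATED stands in for PACKAGE_THERM_STATUS_HFI_UPDATED, and bit 26 is taken from the SDM reference quoted elsewhere in this series.

```c
#include <stdint.h>

/* Stand-in for PACKAGE_THERM_STATUS_HFI_UPDATED (bit 26, assumed). */
#define SKETCH_HFI_UPDATED (1ULL << 26)

/*
 * Model of a guest write to a register with "read / write-0-to-clear"
 * bits. For bits inside rwc0_mask, a bit stays set only if it was set
 * before and the guest wrote 1; writing 0 clears it, and writing 1 to
 * a clear bit cannot set it. Bits outside rwc0_mask take the written
 * value unchanged.
 */
static uint64_t rwc0_write(uint64_t new_val, uint64_t old_val,
			   uint64_t rwc0_mask)
{
	uint64_t new_rwc0 = new_val & rwc0_mask;
	uint64_t old_rwc0 = old_val & rwc0_mask;

	/* (new | ~old) & old reduces to new & old. */
	return (new_rwc0 & old_rwc0) | (new_val & ~rwc0_mask);
}
```

In the patch the read-only bits are filtered by a separate helper afterwards; this sketch covers only the write-0-to-clear part.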
Note that for the thermal interrupt-related features, we only fully emulate HFI, so MSR_IA32_PACKAGE_THERM_STATUS and MSR_IA32_PACKAGE_THERM_INTERRUPT do not (and should not, even though we do not disable the initial exception MSR value via KVM_SET_MSRS) take effect by setting other bits. Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 129 +++++++++++++++++++++++++++++++++++------ 1 file changed, 111 insertions(+), 18 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 97bb7b304213..92dded89ae3c 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -183,7 +183,6 @@ module_param(allow_smaller_maxphyaddr, bool, S_IRUGO); THERM_MASK_THRESHOLD0 | THERM_INT_THRESHOLD0_ENABLE | \ THERM_MASK_THRESHOLD1 | THERM_INT_THRESHOLD1_ENABLE) -/* HFI (CPUID.06H:EAX[19]) is not emulated in kvm yet. */ #define MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK (PACKAGE_THERM_STATUS_PROCHOT | \ PACKAGE_THERM_STATUS_PROCHOT_EVENT | PACKAGE_THERM_STATUS_CRITICAL_TEMP | \ THERM_STATUS_THRESHOLD0 | THERM_STATUS_THRESHOLD1 | \ @@ -191,20 +190,17 @@ module_param(allow_smaller_maxphyaddr, bool, S_IRUGO); #define MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK (PACKAGE_THERM_STATUS_PROCHOT_LOG | \ PACKAGE_THERM_STATUS_PROCHOT_EVENT_LOG | PACKAGE_THERM_STATUS_CRITICAL_TEMP_LOG | \ THERM_LOG_THRESHOLD0 | THERM_LOG_THRESHOLD1 | \ - PACKAGE_THERM_STATUS_POWER_LIMIT_LOG) + PACKAGE_THERM_STATUS_POWER_LIMIT_LOG | PACKAGE_THERM_STATUS_HFI_UPDATED) /* MSR_IA32_PACKAGE_THERM_STATUS unavailable bits mask: unsupported and reserved bits. */ #define MSR_IA32_PACKAGE_THERM_STATUS_UNAVAIL_MASK (~(MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK | \ MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK)) -/* - * MSR_IA32_PACKAGE_THERM_INTERRUPT available bits mask. - * HFI (CPUID.06H:EAX[19]) is not emulated in kvm yet. 
- */ -#define MSR_IA32_PACKAGE_THERM_INTERRUPT_AVAIL_MASK (PACKAGE_THERM_INT_HIGH_ENABLE | \ +#define MSR_IA32_PACKAGE_THERM_INTERRUPT_MASK (PACKAGE_THERM_INT_HIGH_ENABLE | \ PACKAGE_THERM_INT_LOW_ENABLE | PACKAGE_THERM_INT_PROCHOT_ENABLE | \ PACKAGE_THERM_INT_OVERHEAT_ENABLE | THERM_MASK_THRESHOLD0 | \ THERM_INT_THRESHOLD0_ENABLE | THERM_MASK_THRESHOLD1 | \ - THERM_INT_THRESHOLD1_ENABLE | PACKAGE_THERM_INT_PLN_ENABLE) + THERM_INT_THRESHOLD1_ENABLE | PACKAGE_THERM_INT_PLN_ENABLE | \ + PACKAGE_THERM_INT_HFI_ENABLE) /* * List of MSRs that can be directly passed to the guest. @@ -2417,7 +2413,16 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_PTS)) return 1; + + mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); + if (kvm_vmx->pkg_therm.hfi_desc.hfi_update_status) + kvm_vmx->pkg_therm.msr_pkg_therm_status |= + PACKAGE_THERM_STATUS_HFI_UPDATED; + else + kvm_vmx->pkg_therm.msr_pkg_therm_status &= + ~PACKAGE_THERM_STATUS_HFI_UPDATED; msr_info->data = kvm_vmx->pkg_therm.msr_pkg_therm_status; + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); break; default: find_uret_msr: @@ -2471,6 +2476,87 @@ static inline u64 vmx_set_msr_rwc0_bits(u64 new_val, u64 old_val, u64 rwc0_mask) return ((new_rwc0 | ~old_rwc0) & old_rwc0) | (new_val & ~rwc0_mask); } +static int vmx_set_pkg_therm_int_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + u64 data = msr_info->data; + bool hfi_int_enabled, hfi_int_changed; + + hfi_int_enabled = data & PACKAGE_THERM_INT_HFI_ENABLE; + hfi_int_changed = vmx_hfi_int_enabled(kvm_vmx) != hfi_int_enabled; + + kvm_vmx->pkg_therm.msr_pkg_therm_int = data; + kvm_vmx_hfi->hfi_int_enabled = hfi_int_enabled; + + /* + * Only HFI notification is supported, otherwise behave as a + * dummy MSR. 
+ */ + if (!intel_hfi_enabled() || + !guest_cpuid_has(vcpu, X86_FEATURE_HFI) || + !hfi_int_changed) + return 0; + + if (!hfi_int_enabled) + return 0; + + /* + * SDM: (For IA32_HW_FEEDBACK_CONFIG) no (HFI) status bit + * set, no interrupt is generated. + */ + if (!kvm_vmx_hfi->hfi_enabled) + return 0; + + /* + * When HFI interrupt enable bit transitions from 0 to 1, + * try to inject initial interrupt. No need to force + * injection of the interrupt if there's no HFI table update. + */ + vmx_update_hfi_table(vcpu->kvm, false); + + return 0; +} + +static int vmx_set_pkg_therm_status_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + u64 data = msr_info->data; + bool hfi_status_updated, hfi_status_changed; + + if (!msr_info->host_initiated) { + data = vmx_set_msr_rwc0_bits(data, kvm_vmx->pkg_therm.msr_pkg_therm_status, + MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK); + data = vmx_set_msr_ro_bits(data, kvm_vmx->pkg_therm.msr_pkg_therm_status, + MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK); + } + + hfi_status_updated = data & PACKAGE_THERM_STATUS_HFI_UPDATED; + hfi_status_changed = kvm_vmx_hfi->hfi_update_status != hfi_status_updated; + + kvm_vmx->pkg_therm.msr_pkg_therm_status = data; + kvm_vmx_hfi->hfi_update_status = hfi_status_updated; + + if (!intel_hfi_enabled() || + !guest_cpuid_has(vcpu, X86_FEATURE_HFI) || + !hfi_status_changed) + return 0; + + /* + * From SDM, once the HFI (thermal) status bit is set, the hardware + * will not generate any further updates to HFI table until the OS + * clears this bit by writing 0. When this bit is cleared, apply any + * pending updates to guest HFI table. + */ + if (!kvm_vmx_hfi->hfi_update_status && kvm_vmx_hfi->hfi_update_pending) + vmx_update_hfi_table(vcpu->kvm, false); + + return 0; +} + /* * Writes msr value into the appropriate "register". * Returns 0 on success, non-0 otherwise. 
@@ -2801,11 +2887,19 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_PTS)) return 1; - /* Unsupported and reserved bits: generate the exception. */ + /* Unsupported bit: generate the exception. */ + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_HFI) && + data & PACKAGE_THERM_INT_HFI_ENABLE) + return 1; + /* Reserved bits: generate the exception. */ if (!msr_info->host_initiated && - data & ~MSR_IA32_PACKAGE_THERM_INTERRUPT_AVAIL_MASK) + data & ~MSR_IA32_PACKAGE_THERM_INTERRUPT_MASK) return 1; - kvm_vmx->pkg_therm.msr_pkg_therm_int = data; + + mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); + ret = vmx_set_pkg_therm_int_msr(vcpu, msr_info); + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); break; case MSR_IA32_PACKAGE_THERM_STATUS: if (!msr_info->host_initiated && @@ -2815,15 +2909,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (!msr_info->host_initiated && data & MSR_IA32_PACKAGE_THERM_STATUS_UNAVAIL_MASK) return 1; + /* Unsupported bit: generate the exception. 
*/ + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_HFI) && + data & PACKAGE_THERM_STATUS_HFI_UPDATED) + return 1; mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); - if (!msr_info->host_initiated) { - data = vmx_set_msr_rwc0_bits(data, kvm_vmx->pkg_therm.msr_pkg_therm_status, - MSR_IA32_PACKAGE_THERM_STATUS_RWC0_MASK); - data = vmx_set_msr_ro_bits(data, kvm_vmx->pkg_therm.msr_pkg_therm_status, - MSR_IA32_PACKAGE_THERM_STATUS_RO_MASK); - } - kvm_vmx->pkg_therm.msr_pkg_therm_status = data; + ret = vmx_set_pkg_therm_status_msr(vcpu, msr_info); mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); break; default: From patchwork Sat Feb 3 09:12:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543969 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B30EB63101; Sat, 3 Feb 2024 09:01:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950897; cv=none; b=OmIWpHLNNC9RI8yvmyynEUyY2oHNz+nn4s+p1OBk+zRwDqDijXSiLPYK1YZnfDaKKK834llbzqLuPkvRbVgZOVznUVKw/bhfjPXB7HO0XDOHG0YNwM0QxrOp1fzH1t9nDw9j5qLgOT/1CfujtESSfd8ypJaUMBz7+WmwEqj+3XQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950897; c=relaxed/simple; bh=SGebCb4uhyf1FjJh6l9SreVJIhhK/w3FLryXtt84pw8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=C7m7pPFqgl6+g0q9OESrmk+1xDzZQuZEQ2h4S5Vw1u+lomzrjb7q6lRTaWDiCxNI2MLuyQJD+j6qwPO0JwmSU9f3WaRsEbyQZiJmXqLML6rXX4HF4WZYaebJ1aN+hAEddJV5hMepP8YdrtRNh1KV7ZgBEpRILC14oSo5Nj4aKFY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none 
smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=FJLd0h6V; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="FJLd0h6V" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950896; x=1738486896; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SGebCb4uhyf1FjJh6l9SreVJIhhK/w3FLryXtt84pw8=; b=FJLd0h6VaqwYXyjXvE9CR+VFXPvMY7FGpcBjbRhO9Ch4qmaNFflPNWrR pBmKh19vRegZLi/2BEIG1bOMGhZeLqBPQQEtR+bCgQBGt4+p6sue99iZ9 Han8wPbN30Ohqh91PLfOCRfnIHvmXWYr2SMWeqaoA6ZVnPK5grJlflxRx wCeoOLVqjY1UAstPekJN53636sCdQejutS4gPCt0N73/9T0c3JtCfROvT /SEc3pU7iw4IbRJeVFK0QA4xNQgo//et94UmexLHJaI97TtIsTHKJXHaW ZWgbb14gijy9WwJ7E+wDjqG6o/Jq9J0YmTBpoVBJZmSYRQR9Xxlw8cyEL w==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132093" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132093" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:35 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291510" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:29 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 19/26] KVM: VMX: Emulate the MSRs of HFI feature Date: Sat, 3 Feb 2024 17:12:07 +0800 Message-Id: <20240203091214.411862-20-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhao Liu In addition to adding new bits to the package thermal MSRs, HFI also introduces two new MSRs: * MSR_IA32_HW_FEEDBACK_CONFIG: used to enable/disable the HFI feature at runtime. Emulate this MSR by parsing the HFI enable bit. * MSR_IA32_HW_FEEDBACK_PTR: used to configure the HFI table's memory address. Emulate this MSR by storing the Guest HFI table's GPA and writing the local virtual HFI table to this GPA when the Guest's HFI table needs to be updated. The Guest has a valid HFI table, and that table can be updated, only when HFI is enabled (set by the Guest in MSR_IA32_HW_FEEDBACK_CONFIG) and the Guest HFI table address is valid (set by the Guest in MSR_IA32_HW_FEEDBACK_PTR). Because the current virtual HFI table is maintained per VM, not per virtual package, these two MSRs are also emulated at the VM level.
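The two conditions above can be sketched as a small state model. Everything here is illustrative: struct sketch_hfi and the sketch_* helpers are hypothetical names, and placing both the enable bit and the valid bit at bit 0 is an assumption; the patch itself uses the kernel's HW_FEEDBACK_CONFIG_HFI_ENABLE and HW_FEEDBACK_PTR_VALID macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_CFG_HFI_ENABLE (1ULL << 0)
#define SKETCH_PTR_VALID      (1ULL << 0)

/* Hypothetical per-VM state, loosely modelled on struct hfi_desc. */
struct sketch_hfi {
	bool enabled;        /* from the CONFIG MSR */
	bool ptr_valid;      /* from the PTR MSR valid bit */
	uint64_t table_base; /* guest physical address of the table */
};

/* Model of a guest write to the CONFIG MSR. */
static void sketch_write_cfg(struct sketch_hfi *h, uint64_t data)
{
	h->enabled = (data & SKETCH_CFG_HFI_ENABLE) != 0;
}

/* Model of a guest write to the PTR MSR: split valid bit and base. */
static void sketch_write_ptr(struct sketch_hfi *h, uint64_t data)
{
	h->ptr_valid = (data & SKETCH_PTR_VALID) != 0;
	h->table_base = data & ~SKETCH_PTR_VALID;
}

/* The guest table can be updated only when both conditions hold. */
static bool sketch_table_usable(const struct sketch_hfi *h)
{
	return h->enabled && h->ptr_valid;
}
```

Writing the pointer alone, or the enable bit alone, leaves sketch_table_usable() false; only after both writes does the model report a usable table.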
Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/svm/svm.c | 2 + arch/x86/kvm/vmx/vmx.c | 112 +++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 2 + arch/x86/kvm/x86.c | 2 + 4 files changed, 118 insertions(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 7039ae48d8d0..980d93c70eb6 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4293,6 +4293,8 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index) case MSR_IA32_THERM_STATUS: case MSR_IA32_PACKAGE_THERM_INTERRUPT: case MSR_IA32_PACKAGE_THERM_STATUS: + case MSR_IA32_HW_FEEDBACK_CONFIG: + case MSR_IA32_HW_FEEDBACK_PTR: return false; case MSR_IA32_SMBASE: if (!IS_ENABLED(CONFIG_KVM_SMM)) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 92dded89ae3c..9c28d4ea0b2d 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2424,6 +2424,18 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = kvm_vmx->pkg_therm.msr_pkg_therm_status; mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); break; + case MSR_IA32_HW_FEEDBACK_CONFIG: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_HFI)) + return 1; + msr_info->data = kvm_vmx->pkg_therm.msr_ia32_hfi_cfg; + break; + case MSR_IA32_HW_FEEDBACK_PTR: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_HFI)) + return 1; + msr_info->data = kvm_vmx->pkg_therm.msr_ia32_hfi_ptr; + break; default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_info->index); @@ -2557,6 +2569,77 @@ static int vmx_set_pkg_therm_status_msr(struct kvm_vcpu *vcpu, return 0; } +static int vmx_set_hfi_cfg_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + u64 data = msr_info->data; + bool hfi_enabled, hfi_changed; + + /* + * When the HFI enable 
bit changes (either from 0 to 1 or 1 to + * 0), HFI status bit is set and an interrupt is generated if + * enabled. + */ + hfi_enabled = data & HW_FEEDBACK_CONFIG_HFI_ENABLE; + hfi_changed = kvm_vmx_hfi->hfi_enabled != hfi_enabled; + + kvm_vmx->pkg_therm.msr_ia32_hfi_cfg = data; + kvm_vmx_hfi->hfi_enabled = hfi_enabled; + + if (!hfi_changed) + return 0; + + if (!hfi_enabled) { + /* + * SDM: hardware sets the IA32_PACKAGE_THERM_STATUS[bit 26] + * to 1 to acknowledge disabling of the interface. + */ + kvm_vmx_hfi->hfi_update_status = true; + if (vmx_hfi_int_enabled(kvm_vmx)) + vmx_inject_therm_interrupt(vcpu); + } else { + /* + * Here we don't care pending updates, because the enabed + * feature change may cause the HFI table update range to + * change. + */ + vmx_update_hfi_table(vcpu->kvm, true); + vmx_hfi_notifier_register(vcpu->kvm); + } + + return 0; +} + +static int vmx_set_hfi_ptr_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); + struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; + u64 data = msr_info->data; + + if (kvm_vmx->pkg_therm.msr_ia32_hfi_ptr == data) + return 0; + + kvm_vmx->pkg_therm.msr_ia32_hfi_ptr = data; + kvm_vmx_hfi->table_ptr_valid = data & HW_FEEDBACK_PTR_VALID; + /* + * Currently we don't really support MSR handling for package + * scope, so when Guest writes, it is not possible to distinguish + * between writes from different packages or repeated writes from + * the same package. To simplify the process, we just assume that + * multiple writes are duplicate writes of the same package and + * overwrite the old. + */ + kvm_vmx_hfi->table_base = data & ~HW_FEEDBACK_PTR_VALID; + + vmx_update_hfi_table(vcpu->kvm, true); + vmx_hfi_notifier_register(vcpu->kvm); + + return 0; +} + /* * Writes msr value into the appropriate "register". * Returns 0 on success, non-0 otherwise. 
@@ -2919,6 +3002,35 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) ret = vmx_set_pkg_therm_status_msr(vcpu, msr_info); mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); break; + case MSR_IA32_HW_FEEDBACK_CONFIG: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_HFI)) + return 1; + /* + * Unsupported and reserved bits. ITD is not supported + * (CPUID.06H:EAX[19]) yet. + */ + if (!msr_info->host_initiated && + data & ~(HW_FEEDBACK_CONFIG_HFI_ENABLE)) + return 1; + + mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); + ret = vmx_set_hfi_cfg_msr(vcpu, msr_info); + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); + break; + case MSR_IA32_HW_FEEDBACK_PTR: + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_HFI)) + return 1; + /* Reserved bits: generate the exception. */ + if (!msr_info->host_initiated && + data & HW_FEEDBACK_PTR_RESERVED_MASK) + return 1; + + mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); + ret = vmx_set_hfi_ptr_msr(vcpu, msr_info); + mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); + break; default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_index); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index ff205bc0e99a..d9db8bf3726f 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -422,6 +422,8 @@ struct hfi_desc { struct pkg_therm_desc { u64 msr_pkg_therm_int; u64 msr_pkg_therm_status; + u64 msr_ia32_hfi_cfg; + u64 msr_ia32_hfi_ptr; /* Currently HFI is only supported at package level. */ struct hfi_desc hfi_desc; /* All members before "struct mutex pkg_therm_lock" are protected by the lock. 
*/ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bea3def6a4b1..27bec359907c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1550,6 +1550,8 @@ static const u32 emulated_msrs_all[] = { MSR_IA32_THERM_STATUS, MSR_IA32_PACKAGE_THERM_INTERRUPT, MSR_IA32_PACKAGE_THERM_STATUS, + MSR_IA32_HW_FEEDBACK_CONFIG, + MSR_IA32_HW_FEEDBACK_PTR, /* * KVM always supports the "true" VMX control MSRs, even if the host From patchwork Sat Feb 3 09:12:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543970 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 368BC633F1; Sat, 3 Feb 2024 09:01:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950903; cv=none; b=Siu6zYVhcpfpXS95qlZfRs3ahBNUtb9qqAH60eKDbJ7nPPHPps9qS1qzqTAiaIM4tubbmGhZe7Cr8s1Z3r8EBdGrgdIrkzvfA1zrHg3wG8kFJBgMHAD17QtdjzDnUfrDOFCURDEOQEuY9ipSbsy5HjXykiUovOhvLowBh+rY/bk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950903; c=relaxed/simple; bh=tPickwMWphnvoa4nUW5041d+Jv4leQezd407dqDzWtA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ezYkUzvjPGkDbsrBF/PMUP/VFujJoZGXQhuBNQ0eRGR/sFvcNbcZaG2L3++nmh4YKHUBVGqACKxBBXoJUj6QeZstPSqvrNix+1pN61shxVB2pcnj7ZPlEVZnav2+2zG4REwZAsWnEQcPsslgei8ZvYtIM6uJMJrpkXUfTkgEgeQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=U+uoOdHJ; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass 
(p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="U+uoOdHJ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950902; x=1738486902; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tPickwMWphnvoa4nUW5041d+Jv4leQezd407dqDzWtA=; b=U+uoOdHJ2mz53GnZP++rPgr8RLw2uc2m55byKaoul3C07JKYnPJmrgjm JQmfZxAxct9Klalxx944Pi1b0kjxyFaBY/hR3m/dM0stw31bvkp7TJg4h nx1OJ3/C2WNiefm2Hi8WAzKrEB7+5fMHAW1vIV9KxuJ3EqL8EAZMdQePd DwJVmWGfl/ipdnWpImp9DqdLwF1gOphyJGBsb8gi2LoYpNR6ZJ9r+udt9 /GxnUu+0wiiZW+wnqTSgfHih7WLm3OZARLvMEISOrfIx5OQkIwc5pHiMd ff7H5+FPWBalMVWW19LGR6hW+/0rprEZ8dI+ypE8MmDeDPenmcM/hud6L Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132106" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132106" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:41 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291526" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:34 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 20/26] KVM: x86: Expose HFI feature bit and HFI info in CPUID Date: Sat, 3 Feb 2024 17:12:08 +0800 Message-Id: <20240203091214.411862-21-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhao Liu The HFI feature contains the following relevant CPUID fields: * 0x06.eax[bit 19]: HFI feature bit * 0x06.ecx[bits 08-15]: Number of HFI/ITD supported classes * 0x06.edx[bits 00-07]: Bitmap of supported HFI capabilities * 0x06.edx[bits 08-11]: Enumerates the size of the HFI table in number of 4 KB pages * 0x06.edx[bits 16-31]: HFI table index of processor The Guest's HFI feature bit (0x06.eax[bit 19]) is based on the Host's HFI enabling. The other HFI related CPUID fields affect the memory allocation and HFI data filling of the virtual HFI table in KVM, so check them after KVM_SET_CPUID/KVM_SET_CPUID2 to ensure valid HFI feature information and a valid memory size. As for the HFI table index, since KVM currently creates the same CPUID template for all vCPUs, we follow the CPU topology handling and leave the specific filling of the HFI table index to the user. If the user does not specify the HFI index, all vCPUs share the HFI entry with HFI index 0.
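The CPUID.06H field layout listed above can be sketched with plain shifts and masks. This is only an illustration: struct sketch_hfi_cpuid and sketch_decode_cpuid6() are hypothetical names, whereas the patch uses the kernel's union cpuid6_ecx and union cpuid6_edx bitfields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the HFI related CPUID.06H fields. */
struct sketch_hfi_cpuid {
	bool has_hfi;             /* eax[19] */
	unsigned int nr_classes;  /* ecx[15:8] */
	unsigned int capabilities;/* edx[7:0], bitmap */
	unsigned int table_pages; /* edx[11:8], raw field value */
	unsigned int index;       /* edx[31:16] */
};

/* Extract each field at the bit positions given in the list above. */
static struct sketch_hfi_cpuid sketch_decode_cpuid6(uint32_t eax,
						    uint32_t ecx,
						    uint32_t edx)
{
	struct sketch_hfi_cpuid info;

	info.has_hfi = ((eax >> 19) & 0x1) != 0;
	info.nr_classes = (ecx >> 8) & 0xff;
	info.capabilities = edx & 0xff;
	info.table_pages = (edx >> 8) & 0xf;
	info.index = (edx >> 16) & 0xffff;

	return info;
}
```

For example, an edx value of 0x00030003 decodes to a capability bitmap of 0x3 (two capabilities), a raw table size field of 0, and HFI table index 3.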
The shared HFI index is valid in spec [1], but considering that the data of the virtual HFI table is all from the pCPU on which the vCPU is running, the shared hfi index of vCPUs on different pCPUs might cause frequent HFI updates, and the virtual HFI table cannot accurately reflect the actual processor situation, which might have a negative impact on the Guest performance. Therefore, it is better to assign different HFI table indexes to different vCPUs. [1]: SDM, vol. 2A, chap. CPUID--CPU Identification, CPUID.06H.EDX[Bits 31-16], about HFI table index sharing, it said, "Note that on some parts the index may be same for multiple logical processors". Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/cpuid.c | 136 ++++++++++++++++++++++++++++++++++++----- arch/x86/kvm/vmx/vmx.c | 7 +++ 2 files changed, 128 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index eaac2c8d98b9..4da8f3319917 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -17,6 +17,7 @@ #include #include +#include #include #include #include @@ -130,12 +131,77 @@ static inline struct kvm_cpuid_entry2 *cpuid_entry2_find( return NULL; } +static int kvm_check_hfi_cpuid(struct kvm_vcpu *vcpu, + struct kvm_cpuid_entry2 *entries, + int nent) +{ + struct hfi_features hfi_features; + struct kvm_cpuid_entry2 *best = NULL; + bool has_hfi; + int nr_classes, ret; + union cpuid6_ecx ecx; + union cpuid6_edx edx; + unsigned int data_size; + + best = cpuid_entry2_find(entries, nent, 0x6, 0); + if (!best) + return 0; + + has_hfi = cpuid_entry_has(best, X86_FEATURE_HFI); + if (!has_hfi) + return 0; + + /* + * Only the platform with 1 HFI instance (i.e., client platform) + * can enable HFI in Guest. For more information, please refer to + * the comment in kvm_set_cpu_caps(). + */ + if (intel_hfi_max_instances() != 1) + return -EINVAL; + + /* + * Currently we haven't supported ITD. 
HFI is the default feature + * with 1 class. + */ + nr_classes = 1; + ret = intel_hfi_build_virt_features(&hfi_features, + nr_classes, + vcpu->kvm->created_vcpus); + if (ret) + return ret; + + ecx.full = best->ecx; + edx.full = best->edx; + + if (ecx.split.nr_classes != hfi_features.nr_classes) + return -EINVAL; + + if (hweight8(edx.split.capabilities.bits) != hfi_features.class_stride) + return -EINVAL; + + if (edx.split.table_pages + 1 != hfi_features.nr_table_pages) + return -EINVAL; + + /* + * The total size of the row corresponding to index and all + * previous data. + */ + data_size = hfi_features.hdr_size + (edx.split.index + 1) * + hfi_features.cpu_stride; + /* Invalid index. */ + if (data_size > hfi_features.nr_table_pages << PAGE_SHIFT) + return -EINVAL; + + return 0; +} + static int kvm_check_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries, int nent) { struct kvm_cpuid_entry2 *best; u64 xfeatures; + int ret; /* * The existing code assumes virtual address is 48-bit or 57-bit in the @@ -155,15 +221,18 @@ static int kvm_check_cpuid(struct kvm_vcpu *vcpu, * enabling in the FPU, e.g. to expand the guest XSAVE state size. */ best = cpuid_entry2_find(entries, nent, 0xd, 0); - if (!best) - return 0; - - xfeatures = best->eax | ((u64)best->edx << 32); - xfeatures &= XFEATURE_MASK_USER_DYNAMIC; - if (!xfeatures) - return 0; + if (best) { + xfeatures = best->eax | ((u64)best->edx << 32); + xfeatures &= XFEATURE_MASK_USER_DYNAMIC; + if (xfeatures) { + ret = fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, + xfeatures); + if (ret) + return ret; + } + } - return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu, xfeatures); + return kvm_check_hfi_cpuid(vcpu, entries, nent); } /* Check whether the supplied CPUID data is equal to what is already set for the vCPU. */ @@ -633,14 +702,27 @@ void kvm_set_cpu_caps(void) ); /* - * PTS is the dependency of ITD, currently we only use PTS for - * enabling ITD in KVM. 
Since KVM does not support msr topology at - * present, the emulation of PTS has restrictions on the topology of - * Guest, so we only expose PTS when Host enables ITD. + * PTS and HFI are the dependencies of ITD, currently we only use PTS/HFI + * for enabling ITD in KVM. Since KVM does not support msr topology at + * present, the emulation of PTS/HFI has restrictions on the topology of + * Guest, so we only expose PTS/HFI when Host enables ITD. + * + * We also restrict HFI virtualization support to platforms with only 1 HFI + * instance (i.e., this is the client platform, and ITD is currently a + * client-specific feature), while server platforms with multiple instances + * do not require HFI virtualization. This restriction avoids adding + * additional complex logic to handle notification register updates when + * vCPUs migrate between different HFI instances. */ - if (cpu_feature_enabled(X86_FEATURE_ITD)) { + if (cpu_feature_enabled(X86_FEATURE_ITD) && intel_hfi_max_instances() == 1) { if (boot_cpu_has(X86_FEATURE_PTS)) kvm_cpu_cap_set(X86_FEATURE_PTS); + /* + * Set HFI based on hardware capability. Only when the Host has + * the valid HFI instance, KVM can build the virtual HFI table. + */ + if (intel_hfi_enabled()) + kvm_cpu_cap_set(X86_FEATURE_HFI); } kvm_cpu_cap_mask(CPUID_7_0_EBX, @@ -986,8 +1068,32 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) entry->eax |= 0x4; entry->ebx = 0; - entry->ecx = 0; - entry->edx = 0; + + if (kvm_cpu_cap_has(X86_FEATURE_HFI)) { + union cpuid6_ecx ecx; + union cpuid6_edx edx; + + ecx.full = 0; + edx.full = 0; + /* Number of supported HFI classes */ + ecx.split.nr_classes = 1; + /* HFI supports performance and energy efficiency capabilities. */ + edx.split.capabilities.split.performance = 1; + edx.split.capabilities.split.energy_efficiency = 1; + /* As default, keep the same HFI table size as host. 
*/ + edx.split.table_pages = ((union cpuid6_edx)entry->edx).split.table_pages; + /* + * Default HFI index = 0. User should be careful that + * the index differ for each CPUs. + */ + edx.split.index = 0; + + entry->ecx = ecx.full; + entry->edx = edx.full; + } else { + entry->ecx = 0; + entry->edx = 0; + } break; /* function 7 has additional index. */ case 7: diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 9c28d4ea0b2d..636f2bd68546 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -8434,6 +8434,13 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) vmx->msr_ia32_feature_control_valid_bits &= ~FEAT_CTL_SGX_LC_ENABLED; + if (guest_cpuid_has(vcpu, X86_FEATURE_HFI) && intel_hfi_enabled()) { + struct kvm_cpuid_entry2 *best = kvm_find_cpuid_entry_index(vcpu, 0x6, 0); + + if (best) + vmx->hfi_table_idx = ((union cpuid6_edx)best->edx).split.index; + } + /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); } From patchwork Sat Feb 3 09:12:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543971
From: Zhao Liu
To: Paolo Bonzini, Sean Christopherson, "Rafael J . Wysocki", Daniel Lezcano, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H . Peter Anvin", kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri, Len Brown, Zhang Rui, Zhenyu Wang, Zhuocheng Ding, Dapeng Mi, Yanting Jiang, Yongwei Ma, Vineeth Pillai, Suleiman Souhlal, Masami Hiramatsu, David Dai, Saravana Kannan, Zhao Liu
Subject: [RFC 21/26] KVM: VMX: Extend HFI table and MSR emulation to support ITD
Date: Sat, 3 Feb 2024 17:12:09 +0800
Message-Id: <20240203091214.411862-22-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
MIME-Version: 1.0

From: Zhao Liu

ITD (Intel Thread Director) is an extension of the HFI feature. Based on HFI, it adds 4 classes to the HFI table and provides an MSR interface that lets the OS classify the currently running task into one of these 4 classes.

As the first step of ITD support, extend the HFI table and the emulation of the related HFI MSRs to support ITD:

* More classes in HFI table

  If ITD is configured in the Guest's CPUID, the virtual HFI table is built with 4 classes. However, all 4 classes are updated only when ITD is enabled in MSR_IA32_HW_FEEDBACK_CONFIG; otherwise, only class 0's data is updated, and only if HFI is enabled.

* MSR_IA32_HW_FEEDBACK_CONFIG (HW_FEEDBACK_CONFIG_ITD_ENABLE bit)

  With ITD support, MSR_IA32_HW_FEEDBACK_CONFIG has 2 feature enabling bits: HW_FEEDBACK_CONFIG_HFI_ENABLE and HW_FEEDBACK_CONFIG_ITD_ENABLE. These 2 bits control whether the HFI and ITD features are enabled, and also affect which class data is actually updated in the virtual HFI table [1].
For the MSR_IA32_HW_FEEDBACK_CONFIG's emulation, add support for dynamically changing these two bits and the corresponding HFI update adjustments. [1]: SDM, vol. 3B, section 15.6.5 Hardware Feedback Interface Configuration, Table 15-10. IA32_HW_FEEDBACK_CONFIG Control Option Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 68 +++++++++++++++++++++++++++++++----------- arch/x86/kvm/vmx/vmx.h | 3 ++ 2 files changed, 54 insertions(+), 17 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 636f2bd68546..bdff1d424b2f 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1547,11 +1547,11 @@ static int vmx_init_hfi_table(struct kvm *kvm) struct hfi_table *hfi_table = &kvm_vmx_hfi->hfi_table; int nr_classes, ret = 0; - /* - * Currently we haven't supported ITD. HFI is the default feature - * with 1 class. - */ - nr_classes = 1; + if (guest_cpuid_has(kvm_get_vcpu(kvm, 0), X86_FEATURE_ITD)) + nr_classes = 4; + else + nr_classes = 1; + ret = intel_hfi_build_virt_features(hfi_features, nr_classes, kvm->created_vcpus); @@ -1579,11 +1579,11 @@ static int vmx_build_hfi_table(struct kvm *kvm) struct kvm_vcpu *v; unsigned long i; - /* - * Currently we haven't supported ITD. HFI is the default feature - * with 1 class. 
- */ - nr_classes = 1; + if (kvm_vmx_hfi->itd_enabled) + nr_classes = kvm_vmx_hfi->hfi_features.nr_classes; + else + nr_classes = 1; + for (int j = 0; j < nr_classes; j++) { hfi_hdr->perf_updated = 0; hfi_hdr->ee_updated = 0; @@ -2575,7 +2575,7 @@ static int vmx_set_hfi_cfg_msr(struct kvm_vcpu *vcpu, struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm); struct hfi_desc *kvm_vmx_hfi = &kvm_vmx->pkg_therm.hfi_desc; u64 data = msr_info->data; - bool hfi_enabled, hfi_changed; + bool hfi_enabled, hfi_changed, itd_enabled, itd_changed; /* * When the HFI enable bit changes (either from 0 to 1 or 1 to @@ -2584,12 +2584,44 @@ static int vmx_set_hfi_cfg_msr(struct kvm_vcpu *vcpu, */ hfi_enabled = data & HW_FEEDBACK_CONFIG_HFI_ENABLE; hfi_changed = kvm_vmx_hfi->hfi_enabled != hfi_enabled; + itd_enabled = data & HW_FEEDBACK_CONFIG_ITD_ENABLE; + itd_changed = kvm_vmx_hfi->itd_enabled != itd_enabled; kvm_vmx->pkg_therm.msr_ia32_hfi_cfg = data; kvm_vmx_hfi->hfi_enabled = hfi_enabled; + kvm_vmx_hfi->itd_enabled = itd_enabled; + + if (!hfi_changed && !itd_changed) + return 0; + + /* + * Refer to SDM, vol. 3B, Table 15-10. IA32_HW_FEEDBACK_CONFIG + * Control Option. + */ + + /* Invalid option; quietly ignored by the hardware. */ + if (!hfi_changed && itd_changed && !hfi_enabled && itd_enabled) { + /* No action (no update in the table). */ + return 0; + } - if (!hfi_changed) + /* No action; keep HFI and Intel Thread Director disabled. */ + if (!hfi_changed && itd_changed && !hfi_enabled && !itd_enabled) { + /* No action (no update in the table). */ return 0; + } + + /* No action; keep HFI enabled. */ + if (!hfi_changed && itd_changed && hfi_enabled && !itd_enabled) { + /* No action (no update in the table). */ + return 0; + } + + /* Disable HFI and Intel Thread Director whether ITD changed. 
*/ + if (hfi_changed && !hfi_enabled && itd_enabled) { + kvm_vmx_hfi->hfi_enabled = false; + kvm_vmx_hfi->itd_enabled = false; + } if (!hfi_enabled) { /* @@ -3006,12 +3038,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_HFI)) return 1; - /* - * Unsupported and reserved bits. ITD is not supported - * (CPUID.06H:EAX[19]) yet. - */ + /* Unsupported bit: generate the exception. */ + if (!msr_info->host_initiated && + !guest_cpuid_has(vcpu, X86_FEATURE_ITD) && + (data & HW_FEEDBACK_CONFIG_ITD_ENABLE)) + return 1; + /* Reserved bits: generate the exception. */ if (!msr_info->host_initiated && - data & ~(HW_FEEDBACK_CONFIG_HFI_ENABLE)) + data & ~(HW_FEEDBACK_CONFIG_HFI_ENABLE | HW_FEEDBACK_CONFIG_ITD_ENABLE)) return 1; mutex_lock(&kvm_vmx->pkg_therm.pkg_therm_lock); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index d9db8bf3726f..0ef767d63def 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -377,6 +377,8 @@ struct vcpu_vmx { * struct hfi_desc - Representation of an HFI instance (i.e., a table) * @hfi_enabled: Flag to indicate whether HFI is enabled at runtime. * Parsed from the Guest's MSR_IA32_HW_FEEDBACK_CONFIG. + * @itd_enabled: Flag to indicate whether ITD is enabled at runtime. + * Parsed from the Guest's MSR_IA32_HW_FEEDBACK_CONFIG. * @hfi_int_enabled: Flag to indicate whether HFI is enabled at runtime. * Parsed from Guest's MSR_IA32_PACKAGE_THERM_INTERRUPT[bit 25]. * @table_ptr_valid: Flag to indicate whether the memory of Guest HFI table is ready. 
@@ -407,6 +409,7 @@ struct vcpu_vmx { struct hfi_desc { bool hfi_enabled; + bool itd_enabled; bool hfi_int_enabled; bool table_ptr_valid; bool hfi_update_status; From patchwork Sat Feb 3 09:12:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543972
From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H .
Peter Anvin", kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri, Len Brown, Zhang Rui, Zhenyu Wang, Zhuocheng Ding, Dapeng Mi, Yanting Jiang, Yongwei Ma, Vineeth Pillai, Suleiman Souhlal, Masami Hiramatsu, David Dai, Saravana Kannan, Zhao Liu
Subject: [RFC 22/26] KVM: VMX: Pass through ITD classification related MSRs to Guest
Date: Sat, 3 Feb 2024 17:12:10 +0800
Message-Id: <20240203091214.411862-23-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
MIME-Version: 1.0

From: Zhao Liu

ITD adds 2 new MSRs, MSR_IA32_HW_FEEDBACK_CHAR and MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, to allow the OS to classify the running task into one of four classes [1].

Pass through these 2 MSRs to the Guest:

* MSR_IA32_HW_FEEDBACK_CHAR.

  MSR_IA32_HW_FEEDBACK_CHAR is a thread-scope MSR. It is used to specify the class for the currently running workload.

* MSR_IA32_HW_FEEDBACK_THREAD_CONFIG.

  MSR_IA32_HW_FEEDBACK_THREAD_CONFIG is also a thread-scope MSR and is used to enable or disable the classification function.

[1]: SDM, vol.
3B, section 15.6.8 Logical Processor Scope Intel Thread Director Configuration Suggested-by: Zhenyu Wang Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/vmx/vmx.c | 37 +++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 8 +++++++- 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index bdff1d424b2f..11d42e0a208b 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -225,6 +225,8 @@ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = { MSR_CORE_C3_RESIDENCY, MSR_CORE_C6_RESIDENCY, MSR_CORE_C7_RESIDENCY, + MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, + MSR_IA32_HW_FEEDBACK_CHAR, }; /* @@ -1288,6 +1290,30 @@ static void pt_guest_exit(struct vcpu_vmx *vmx) wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl); } +static void itd_guest_enter(struct vcpu_vmx *vmx) +{ + struct vcpu_hfi_desc *vcpu_hfi = &vmx->vcpu_hfi_desc; + + if (!guest_cpuid_has(&vmx->vcpu, X86_FEATURE_ITD) || + !kvm_cpu_cap_has(X86_FEATURE_ITD)) + return; + + rdmsrl(MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, vcpu_hfi->host_thread_cfg); + wrmsrl(MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, vcpu_hfi->guest_thread_cfg); +} + +static void itd_guest_exit(struct vcpu_vmx *vmx) +{ + struct vcpu_hfi_desc *vcpu_hfi = &vmx->vcpu_hfi_desc; + + if (!guest_cpuid_has(&vmx->vcpu, X86_FEATURE_ITD) || + !kvm_cpu_cap_has(X86_FEATURE_ITD)) + return; + + rdmsrl(MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, vcpu_hfi->guest_thread_cfg); + wrmsrl(MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, vcpu_hfi->host_thread_cfg); +} + void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel, unsigned long fs_base, unsigned long gs_base) { @@ -5485,6 +5511,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx->msr_ia32_therm_control = 0; vmx->msr_ia32_therm_interrupt = 0; vmx->msr_ia32_therm_status = 0; + vmx->vcpu_hfi_desc.host_thread_cfg = 0; + 
vmx->vcpu_hfi_desc.guest_thread_cfg = 0; vmx->hv_deadline_tsc = -1; kvm_set_cr8(vcpu, 0); @@ -7977,6 +8005,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) kvm_load_guest_xsave_state(vcpu); pt_guest_enter(vmx); + itd_guest_enter(vmx); atomic_switch_perf_msrs(vmx); if (intel_pmu_lbr_is_enabled(vcpu)) @@ -8015,6 +8044,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) loadsegment(es, __USER_DS); #endif + itd_guest_exit(vmx); pt_guest_exit(vmx); kvm_load_host_xsave_state(vcpu); @@ -8475,6 +8505,13 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) vmx->hfi_table_idx = ((union cpuid6_edx)best->edx).split.index; } + if (guest_cpuid_has(vcpu, X86_FEATURE_ITD) && kvm_cpu_cap_has(X86_FEATURE_ITD)) { + vmx_set_intercept_for_msr(vcpu, MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, + MSR_TYPE_RW, !guest_cpuid_has(vcpu, X86_FEATURE_ITD)); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_HW_FEEDBACK_CHAR, + MSR_TYPE_RW, !guest_cpuid_has(vcpu, X86_FEATURE_ITD)); + } + /* Refresh #PF interception to account for MAXPHYADDR changes. */ vmx_update_exception_bitmap(vcpu); } diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 0ef767d63def..3d3238dd8fc3 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -71,6 +71,11 @@ struct pt_desc { struct pt_ctx guest; }; +struct vcpu_hfi_desc { + u64 host_thread_cfg; + u64 guest_thread_cfg; +}; + union vmx_exit_reason { struct { u32 basic : 16; @@ -286,6 +291,7 @@ struct vcpu_vmx { u64 msr_ia32_therm_control; u64 msr_ia32_therm_interrupt; u64 msr_ia32_therm_status; + struct vcpu_hfi_desc vcpu_hfi_desc; /* * loaded_vmcs points to the VMCS currently used in this vcpu. 
For a @@ -366,7 +372,7 @@ struct vcpu_vmx { int hfi_table_idx; /* Save desired MSR intercept (read: pass-through) state */ -#define MAX_POSSIBLE_PASSTHROUGH_MSRS 16 +#define MAX_POSSIBLE_PASSTHROUGH_MSRS 18 struct { DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS); DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS); From patchwork Sat Feb 3 09:12:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543973 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5AA4A6775C; Sat, 3 Feb 2024 09:01:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950919; cv=none; b=iZRycaHAchAEpswdDobEl4fyuNXW9+zAdVpXLUTWe2iAJR34Y3MA9BVOct9Gjsc65w+O8t8ibpetkY49GxTMGJyy5gpxEnloZuLbze/INW7e08KuHjhHlLrk0Jsf5KTn9t2MqK+CQUaL+RQ5ps+GLtKoo6MNOPR4QqJprFPSuL4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950919; c=relaxed/simple; bh=VCv51ZKM2GW64nR2mDsrfM6GGX/jccl5W1ec8CqBGLo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=S2kbMQ2Pg0UMjRQGIWZnoF4GC46BEIxmzZfRw7+yLKbglE1Rk4CigkDS+97btCWk3CtwG0VVr4RKa0Mb6U9PghIjhYJpTBN1jYdgFaMpu5hNVeTQW3uZGw35tlLx4NtqYbH+ncRo7eqoTnJI9jA563+pB+GG3Jte3Tdcl2CRwcs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=VdseGHXE; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none 
smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="VdseGHXE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950919; x=1738486919; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VCv51ZKM2GW64nR2mDsrfM6GGX/jccl5W1ec8CqBGLo=; b=VdseGHXEhR8GTHqm6waOSbYG2+7VoEhbB9O/wxaW1kEGyVt1b+3sgjc1 TAr5g2JuGxAhRNF4T68573D6axKhLa2mIwPQqnfXNSXRhfyQNu5TJf2+n 93q73yd5QQANXA0NDgDk6FifxLWyDHYSb7MI+jWD8D/Ppsh2oGgEARBl1 wk57uC24NnmCdnXfSfLzAX+u6zRgx7uWuSBz7IObkrWQmt/Km5yXJJzRm nKwbVrnfy4Km7ebDJG1frSXMcv+N2jcjZRBpHFVKtg6Uj/iUiRbow5PbH QqqDteyW8ykwHwi7574//y7cnpZ0yyJ28zGd3XfhxP1PzW9P46Tc55AOT w==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132204" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132204" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:01:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291576" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:01:52 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin", kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Ricardo Neri, Len Brown, Zhang Rui, Zhenyu Wang, Zhuocheng Ding, Dapeng Mi, Yanting Jiang, Yongwei Ma, Vineeth Pillai, Suleiman Souhlal, Masami Hiramatsu, David Dai, Saravana Kannan, Zhao Liu
Subject: [RFC 23/26] KVM: x86: Expose ITD feature bit and related info in CPUID
Date: Sat, 3 Feb 2024 17:12:11 +0800
Message-Id: <20240203091214.411862-24-zhao1.liu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
References: <20240203091214.411862-1-zhao1.liu@linux.intel.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
MIME-Version: 1.0

From: Zhao Liu

The Guest's Intel Thread Director (ITD) feature bit depends not only on ITD being enabled on the Host, but also on the Guest's HFI feature bit.

When the Host supports both HFI and ITD, try to support both HFI and ITD for the Guest. If the Host doesn't support ITD, the Guest is not allowed to enable either HFI or ITD.
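
As an illustration only, the dependency rules above can be sketched as a standalone check. This is not the KVM code: `check_guest_hfi_itd()` and its parameters are hypothetical names, and the sketch loosely follows the order of the checks this patch adds to kvm_check_hfi_cpuid(), leaving out the single-HFI-instance check. Whether HFI/ITD are advertised to userspace at all is decided separately in kvm_set_cpu_caps(). The class counts (1 for HFI only, 4 with ITD) are the values used in this series.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not part of KVM): validate the HFI/ITD bits
 * requested in a Guest CPUID against the Host's ITD enablement.
 *
 * Returns the number of HFI classes the virtual table would be built
 * with (0 when neither feature is requested), or -EINVAL when the
 * requested combination is not allowed.
 */
static int check_guest_hfi_itd(bool guest_hfi, bool guest_itd, bool host_itd)
{
	/* Neither feature requested: nothing to validate. */
	if (!guest_hfi && !guest_itd)
		return 0;

	/* ITD is an extension of HFI, so ITD without HFI is invalid. */
	if (guest_itd && !guest_hfi)
		return -EINVAL;

	/* Guest ITD requires ITD to be enabled on the Host. */
	if (guest_itd && !host_itd)
		return -EINVAL;

	/* 4 classes with ITD, otherwise the single default HFI class. */
	return guest_itd ? 4 : 1;
}
```

For example, a Guest CPUID with ITD set but HFI clear is rejected regardless of the Host state, while HFI plus ITD on an ITD-enabled Host yields a 4-class table.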
Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/cpuid.c | 55 +++++++++++++++++++++++++++++++------------- 1 file changed, 39 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 4da8f3319917..9e78398f29dc 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -137,7 +137,7 @@ static int kvm_check_hfi_cpuid(struct kvm_vcpu *vcpu, { struct hfi_features hfi_features; struct kvm_cpuid_entry2 *best = NULL; - bool has_hfi; + bool has_hfi, has_itd; int nr_classes, ret; union cpuid6_ecx ecx; union cpuid6_edx edx; @@ -148,9 +148,14 @@ static int kvm_check_hfi_cpuid(struct kvm_vcpu *vcpu, return 0; has_hfi = cpuid_entry_has(best, X86_FEATURE_HFI); - if (!has_hfi) + has_itd = cpuid_entry_has(best, X86_FEATURE_ITD); + if (!has_hfi && !has_itd) return 0; + /* ITD must base on HFI. */ + if (!has_hfi && has_itd) + return -EINVAL; + /* * Only the platform with 1 HFI instance (i.e., client platform) * can enable HFI in Guest. For more information, please refer to @@ -159,11 +164,11 @@ static int kvm_check_hfi_cpuid(struct kvm_vcpu *vcpu, if (intel_hfi_max_instances() != 1) return -EINVAL; - /* - * Currently we haven't supported ITD. HFI is the default feature - * with 1 class. - */ - nr_classes = 1; + /* Guest's ITD must base on Host's ITD enablement. */ + if (!cpu_feature_enabled(X86_FEATURE_ITD) && has_itd) + return -EINVAL; + + nr_classes = has_itd ? 4 : 1; ret = intel_hfi_build_virt_features(&hfi_features, nr_classes, vcpu->kvm->created_vcpus); @@ -718,11 +723,13 @@ void kvm_set_cpu_caps(void) if (boot_cpu_has(X86_FEATURE_PTS)) kvm_cpu_cap_set(X86_FEATURE_PTS); /* - * Set HFI based on hardware capability. Only when the Host has + * Set HFI/ITD based on hardware capability. Only when the Host has * the valid HFI instance, KVM can build the virtual HFI table. 
*/ - if (intel_hfi_enabled()) + if (intel_hfi_enabled()) { kvm_cpu_cap_set(X86_FEATURE_HFI); + kvm_cpu_cap_set(X86_FEATURE_ITD); + } } kvm_cpu_cap_mask(CPUID_7_0_EBX, @@ -1069,19 +1076,35 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) entry->ebx = 0; - if (kvm_cpu_cap_has(X86_FEATURE_HFI)) { + /* + * When Host enables ITD, we will expose ITD and HFI, + * otherwise, HFI/ITD will not be exposed to Guest. + * ITD is an extension of HFI, so after KVM supports ITD + * emulation, HFI-related info in 0x6 leaf should be consistent + * with the Host, that is, use the Host's ITD info, except + * for the HFI index. + * + * HFI table size is related to the HFI table indexes, but + * this item will be checked in kvm_check_cpuid() after + * KVM_SET_CPUID/KVM_SET_CPUID2. + */ + if (kvm_cpu_cap_has(X86_FEATURE_ITD)) { union cpuid6_ecx ecx; union cpuid6_edx edx; + union cpuid6_ecx *host_ecx = (union cpuid6_ecx *)&entry->ecx; + union cpuid6_edx *host_edx = (union cpuid6_edx *)&entry->edx; ecx.full = 0; edx.full = 0; - /* Number of supported HFI classes */ - ecx.split.nr_classes = 1; - /* HFI supports performance and energy efficiency capabilities. */ - edx.split.capabilities.split.performance = 1; - edx.split.capabilities.split.energy_efficiency = 1; + /* Number of supported HFI/ITD classes. */ + ecx.split.nr_classes = host_ecx->split.nr_classes; + /* HFI/ITD supports performance and energy efficiency capabilities. */ + edx.split.capabilities.split.performance = + host_edx->split.capabilities.split.performance; + edx.split.capabilities.split.energy_efficiency = + host_edx->split.capabilities.split.energy_efficiency; /* As default, keep the same HFI table size as host. */ - edx.split.table_pages = ((union cpuid6_edx)entry->edx).split.table_pages; + edx.split.table_pages = host_edx->split.table_pages; /* * Default HFI index = 0. User should be careful that * the index differ for each CPUs. 
From patchwork Sat Feb 3 09:12:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543974 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H .
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 24/26] KVM: VMX: Emulate the MSR of HRESET feature Date: Sat, 3 Feb 2024 17:12:12 +0800 Message-Id: <20240203091214.411862-25-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhao Liu HRESET is a feature associated with ITD. It provides the HRESET instruction, which resets the ITD-related history accumulated on the logical processor that executes it [1]. The HRESET instruction does not cause a VM exit, so it is available to the Guest by default once the HRESET feature bit is set for the Guest. The HRESET feature also provides a thread-scope MSR, MSR_IA32_HW_HRESET_ENABLE, which controls whether the HRESET instruction resets the ITD history [2]. Because this MSR controls the hardware, emulate it for the Guest so that the Guest's changes to the hardware stay under the Host's control. Since the Guest and the Host may differ in their HRESET enabling status, store the MSR_IA32_HW_HRESET_ENABLE values of both Host and Guest in vcpu_vmx, and save/load the respective configuration on each Guest/Host switch. [1]: SDM, vol. 3B, section 15.6.11 Logical Processor Scope History [2]: SDM, vol. 2A, chap. CPUID--CPU Identification, CPUID.07H.01H.EAX [Bit 22], HRESET. 
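The Host/Guest switching described above can be modeled in user space. This is a minimal sketch under stated assumptions, not the patch's code: sim_msr, sim_rdmsr(), sim_wrmsr() and struct sim_hreset_state are hypothetical stand-ins for the hardware MSR, rdmsrl()/wrmsrl() and the two fields added to vcpu_hfi_desc.

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space model only. sim_msr stands in for the thread-scope
 * MSR_IA32_HW_HRESET_ENABLE; sim_rdmsr()/sim_wrmsr() are hypothetical
 * stand-ins for rdmsrl()/wrmsrl().
 */
static uint64_t sim_msr;
static unsigned int sim_msr_writes;

/* Hypothetical mirror of the two fields the patch adds to vcpu_hfi_desc. */
struct sim_hreset_state {
	uint64_t host_hreset_enable;
	uint64_t guest_hreset_enable;
};

static uint64_t sim_rdmsr(void)
{
	return sim_msr;
}

static void sim_wrmsr(uint64_t val)
{
	sim_msr = val;
	sim_msr_writes++;
}

/* Before VM entry: snapshot the Host value, load the Guest value if it differs. */
static void sim_hreset_guest_enter(struct sim_hreset_state *s)
{
	assert(s);
	s->host_hreset_enable = sim_rdmsr();
	if (s->host_hreset_enable != s->guest_hreset_enable)
		sim_wrmsr(s->guest_hreset_enable);
}

/*
 * After VM exit: restore the Host value if it differs. The Guest value is
 * not read back, because Guest writes are intercepted and already cached.
 */
static void sim_hreset_guest_exit(struct sim_hreset_state *s)
{
	assert(s);
	if (s->host_hreset_enable != s->guest_hreset_enable)
		sim_wrmsr(s->host_hreset_enable);
}
```

In this model, an enter/exit pair costs two MSR writes only when the Host and Guest values differ, and none when they match, which is the reason for the comparison before each write.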
Tested-by: Yanting Jiang Co-developed-by: Zhuocheng Ding Signed-off-by: Zhuocheng Ding Signed-off-by: Zhao Liu --- arch/x86/kvm/svm/svm.c | 1 + arch/x86/kvm/vmx/vmx.c | 54 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.h | 2 ++ arch/x86/kvm/x86.c | 1 + 4 files changed, 58 insertions(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 980d93c70eb6..d847dd8eb193 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4295,6 +4295,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index) case MSR_IA32_PACKAGE_THERM_STATUS: case MSR_IA32_HW_FEEDBACK_CONFIG: case MSR_IA32_HW_FEEDBACK_PTR: + case MSR_IA32_HW_HRESET_ENABLE: return false; case MSR_IA32_SMBASE: if (!IS_ENABLED(CONFIG_KVM_SMM)) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 11d42e0a208b..2d733c959f32 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1314,6 +1314,35 @@ static void itd_guest_exit(struct vcpu_vmx *vmx) wrmsrl(MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, vcpu_hfi->host_thread_cfg); } +static void hreset_guest_enter(struct vcpu_vmx *vmx) +{ + struct vcpu_hfi_desc *vcpu_hfi = &vmx->vcpu_hfi_desc; + + if (!kvm_cpu_cap_has(X86_FEATURE_HRESET) || + !guest_cpuid_has(&vmx->vcpu, X86_FEATURE_HRESET)) + return; + + rdmsrl(MSR_IA32_HW_HRESET_ENABLE, vcpu_hfi->host_hreset_enable); + if (unlikely(vcpu_hfi->host_hreset_enable != vcpu_hfi->guest_hreset_enable)) + wrmsrl(MSR_IA32_HW_HRESET_ENABLE, vcpu_hfi->guest_hreset_enable); +} + +static void hreset_guest_exit(struct vcpu_vmx *vmx) +{ + struct vcpu_hfi_desc *vcpu_hfi = &vmx->vcpu_hfi_desc; + + if (!kvm_cpu_cap_has(X86_FEATURE_HRESET) || + !guest_cpuid_has(&vmx->vcpu, X86_FEATURE_HRESET)) + return; + + /* + * MSR_IA32_HW_HRESET_ENABLE is not passed through to Guest, so there + * is no need to read the MSR to save the Guest's value. 
+ */ + if (unlikely(vcpu_hfi->host_hreset_enable != vcpu_hfi->guest_hreset_enable)) + wrmsrl(MSR_IA32_HW_HRESET_ENABLE, vcpu_hfi->host_hreset_enable); +} + void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel, unsigned long fs_base, unsigned long gs_base) { @@ -2462,6 +2491,12 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; msr_info->data = kvm_vmx->pkg_therm.msr_ia32_hfi_ptr; break; + case MSR_IA32_HW_HRESET_ENABLE: + if (!msr_info->host_initiated && + !guest_cpuid_has(&vmx->vcpu, X86_FEATURE_HRESET)) + return 1; + msr_info->data = vmx->vcpu_hfi_desc.guest_hreset_enable; + break; default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_info->index); @@ -3091,6 +3126,21 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) ret = vmx_set_hfi_ptr_msr(vcpu, msr_info); mutex_unlock(&kvm_vmx->pkg_therm.pkg_therm_lock); break; + case MSR_IA32_HW_HRESET_ENABLE: { + struct kvm_cpuid_entry2 *entry; + + if (!msr_info->host_initiated && + !guest_cpuid_has(&vmx->vcpu, X86_FEATURE_HRESET)) + return 1; + + entry = kvm_find_cpuid_entry_index(&vmx->vcpu, 0x20, 0); + /* Reserved bits: generate the exception. */ + if (!msr_info->host_initiated && data & ~entry->ebx) + return 1; + /* hreset_guest_enter() will update MSR for Guest. 
*/ + vmx->vcpu_hfi_desc.guest_hreset_enable = data; + break; + } default: find_uret_msr: msr = vmx_find_uret_msr(vmx, msr_index); @@ -5513,6 +5563,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx->msr_ia32_therm_status = 0; vmx->vcpu_hfi_desc.host_thread_cfg = 0; vmx->vcpu_hfi_desc.guest_thread_cfg = 0; + vmx->vcpu_hfi_desc.host_hreset_enable = 0; + vmx->vcpu_hfi_desc.guest_hreset_enable = 0; vmx->hv_deadline_tsc = -1; kvm_set_cr8(vcpu, 0); @@ -8006,6 +8058,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) pt_guest_enter(vmx); itd_guest_enter(vmx); + hreset_guest_enter(vmx); atomic_switch_perf_msrs(vmx); if (intel_pmu_lbr_is_enabled(vcpu)) @@ -8044,6 +8097,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) loadsegment(es, __USER_DS); #endif + hreset_guest_exit(vmx); itd_guest_exit(vmx); pt_guest_exit(vmx); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 3d3238dd8fc3..c5b4684a5b51 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -74,6 +74,8 @@ struct pt_desc { struct vcpu_hfi_desc { u64 host_thread_cfg; u64 guest_thread_cfg; + u64 host_hreset_enable; + u64 guest_hreset_enable; }; union vmx_exit_reason { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 27bec359907c..04489efc2fb4 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1552,6 +1552,7 @@ static const u32 emulated_msrs_all[] = { MSR_IA32_PACKAGE_THERM_STATUS, MSR_IA32_HW_FEEDBACK_CONFIG, MSR_IA32_HW_FEEDBACK_PTR, + MSR_IA32_HW_HRESET_ENABLE, /* * KVM always supports the "true" VMX control MSRs, even if the host From patchwork Sat Feb 3 09:12:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543975 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
ESMTPS id B0E9369D13; Sat, 3 Feb 2024 09:02:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950931; cv=none; b=Oa8fSH4IG/oZqErhRbiU078Qx+dkhOT+Mo1qRS/ZwwR3T4OYtYYZ3TYDbIJm3Ka7ucGsZmxaNsI2vZys744zeYc1JpaRwkf+AEID6iDhuq/VnnwcPqGrkT9GbN1dz+Ih/sX6oie8ozW3BGo6sfa4JZvM0sOTUFpXbWoH8TlO748= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950931; c=relaxed/simple; bh=McvWIfrMDVQQK98OonhC4shsYqFjFCcrohaIF+ih4cY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mUB5f/K/wtDBQ/W9tZ1mVaI8JVBiVhoQSxv9PXwDs6K0IsuzDS5fy8s7umVFo1CURDYuy1r9voI8P7bS87d8ept4wJJlB9UGVmJgS62l3DOturN8UWucLej8RB3hnau4v61uGnAfHB3oWM28pRK61LUJRfcF5RSMPxGDClCt7eI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=CkZuKTJ1; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="CkZuKTJ1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950930; x=1738486930; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=McvWIfrMDVQQK98OonhC4shsYqFjFCcrohaIF+ih4cY=; b=CkZuKTJ1SXWDRkQOtDVAYmKcvQUPUsrmemyaHFXpRknte9OGcIP/e9Qr kycoguiF6c7fcsgC39hZjBF8zuYmvklm8nxuTztkLhgx7lNEnAqyDYhUA 3L+F3tkDbmZv9KwTvdJVJtCCftjAwIr86E3weiEd5nPDoVct7EUh5rDia oxFB1INPCOsk14R0bYmcrpjJ4pLQFHeVR8RrgWB5Y7DDKfjpWkAjVc3JG 
8vGndeEv5N7D/FkELrHXf+mgl/NMOpmksCFeq18JPxlPs+i8lkljKLBqZ 6qw4vFpxBsm4mjQy7JsaLpU8k2rtVZuAcprQnpqkI5FU0REWjt1PxViXZ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132259" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132259" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:02:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291606" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:02:03 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 25/26] KVM: x86: Expose HRESET feature's CPUID to Guest Date: Sat, 3 Feb 2024 17:12:13 +0800 Message-Id: <20240203091214.411862-26-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhuocheng Ding Exposing the HRESET feature requires not only the feature bit CPUID.07H.01H.EAX[bit 22], but also the associated 0x20 leaf, so pass the Host's 0x20 leaf through to the Guest. Since HRESET is currently used only to clear ITD's classification history, expose the HRESET-related CPUID only when the Guest has the ITD capability. 
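The two CPUID rules in this patch (a Guest 0x20 leaf may not set EBX bits the Host lacks, and the leaf is zeroed when HRESET is not supported by KVM) can be sketched in user space as follows. This is an illustrative model under stated assumptions: struct sim_cpuid_regs, sim_check_hreset_ebx() and sim_fill_leaf_0x20() are hypothetical names, and the Host EBX value is passed in as a parameter instead of being read with cpuid_ebx(0x20).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four CPUID output registers. */
struct sim_cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Model of the check added to kvm_check_hfi_cpuid(): reject a Guest
 * 0x20.EBX value that sets any bit the Host does not support.
 */
static int sim_check_hreset_ebx(uint32_t guest_ebx, uint32_t host_ebx)
{
	return (guest_ebx & ~host_ebx) ? -EINVAL : 0;
}

/*
 * Model of the 0x20 case in __do_cpuid_func(): pass the Host leaf through
 * when HRESET is supported, otherwise report an all-zero leaf.
 */
static void sim_fill_leaf_0x20(bool hreset_supported,
			       const struct sim_cpuid_regs *host,
			       struct sim_cpuid_regs *out)
{
	assert(host && out);
	if (!hreset_supported) {
		*out = (struct sim_cpuid_regs){ 0 };
		return;
	}
	*out = *host;
}
```

The mask test is a plain subset check: any Guest bit outside the Host's EBX makes the expression non-zero, so the configuration is refused before the Guest can later program that bit into the MSR.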
Tested-by: Yanting Jiang Signed-off-by: Zhuocheng Ding Co-developed-by: Zhao Liu Signed-off-by: Zhao Liu --- arch/x86/kvm/cpuid.c | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 9e78398f29dc..726b723ee34b 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -197,6 +197,16 @@ static int kvm_check_hfi_cpuid(struct kvm_vcpu *vcpu, if (data_size > hfi_features.nr_table_pages << PAGE_SHIFT) return -EINVAL; + /* + * Check HRESET leaf since Guest's control of MSR_IA32_HW_HRESET_ENABLE + * needs to take effect on hardware. + */ + best = cpuid_entry2_find(entries, nent, 0x20, 0); + + /* The Guest cannot set a bit that is unsupported by the Host. */ + if (best && best->ebx & ~cpuid_ebx(0x20)) + return -EINVAL; + return 0; } @@ -784,6 +794,10 @@ void kvm_set_cpu_caps(void) F(AMX_FP16) | F(AVX_IFMA) | F(LAM) ); + /* Currently HRESET is used to reset the ITD related history. */ + if (kvm_cpu_cap_has(X86_FEATURE_ITD)) + kvm_cpu_cap_set(X86_FEATURE_HRESET); + kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX, F(AVX_VNNI_INT8) | F(AVX_NE_CONVERT) | F(PREFETCHITI) | F(AMX_COMPLEX) @@ -1030,7 +1044,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) switch (function) { case 0: /* Limited to the highest leaf implemented in KVM. 
*/ - entry->eax = min(entry->eax, 0x1fU); + entry->eax = min(entry->eax, 0x20U); break; case 1: cpuid_entry_override(entry, CPUID_1_EDX); @@ -1300,6 +1314,16 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) break; } break; + /* Intel HRESET */ + case 0x20: + if (!kvm_cpu_cap_has(X86_FEATURE_HRESET)) { + entry->eax = 0; + entry->ebx = 0; + entry->ecx = 0; + entry->edx = 0; + break; + } + break; case KVM_CPUID_SIGNATURE: { const u32 *sigptr = (const u32 *)KVM_SIGNATURE; entry->eax = KVM_CPUID_FEATURES; From patchwork Sat Feb 3 09:12:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhao Liu X-Patchwork-Id: 13543976 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 607F56E2DC; Sat, 3 Feb 2024 09:02:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950936; cv=none; b=QYDZxDz/Y10TU9o9kzTi6OwxQ3SolVp0JFe8VALXNddKUe/12VssOSbckD09/UsS2I7Lb7iSdbkXFrKsh7+AEmQEovZKAxbzOvwa8rETneZEaMxgKqKVkdXkQQTDm3m0hFoS9mUVal+o94IaVa+VfrpYPk2U3ECoZ6D1SW1aReQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706950936; c=relaxed/simple; bh=qd5tUyNdUljBtdVqZxN09u5VE/YXPZNXPRlXvr1e0nQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=G+AkXa9y5shuEoDm/VVuEoZLIRMlxQrwRH+JVSyJAXm8iHlSPljtQkHOzawGqMMfGHlRL7jY5XeJn96rbzqSAfSaUB7kGvtsz4KHa3m5xix5obg4yr5T678X7rVlrIc/SkZjv7Shjd6mP7nliFwkATN0RQUTsULWeAufjx7c8Mw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com 
header.b=VhF4F9uH; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="VhF4F9uH" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1706950936; x=1738486936; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qd5tUyNdUljBtdVqZxN09u5VE/YXPZNXPRlXvr1e0nQ=; b=VhF4F9uHPsiQEfgzIALsBC591Bf2JZwj+N/i98F27t8blI4rVupo6T0s MZvk9eESQBodl2jJAbHCdg5iL/J+eYKJWlXYxfLwBUPCgH92SrUevKYrL ZHROiCXCmx4ISEX8hMKUBS8D+7riFxGD9mklvY/i87UngBGJ8mUhrqzxO rpc6gTD0bgRlOqCLPw1/XwDqETadKuUlTVxQsa6PdL7W9WgZC+mIG0LUS CxSOMkXShxLC9FJ29H52Tu6kr/K6M7W+PzRgKYFCq3bSR2ystbs4i2aqI SbeWrqZpVzUqKMR3hqyAfT8STiB/MW3kZPeZy8eyAu3RH1lKCgZQbZcWZ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10971"; a="4132272" X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="4132272" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Feb 2024 01:02:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.05,240,1701158400"; d="scan'208";a="291619" Received: from liuzhao-optiplex-7080.sh.intel.com ([10.239.160.36]) by fmviesa009.fm.intel.com with ESMTP; 03 Feb 2024 01:02:08 -0800 From: Zhao Liu To: Paolo Bonzini , Sean Christopherson , "Rafael J . Wysocki" , Daniel Lezcano , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H . 
Peter Anvin" , kvm@vger.kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org Cc: Ricardo Neri , Len Brown , Zhang Rui , Zhenyu Wang , Zhuocheng Ding , Dapeng Mi , Yanting Jiang , Yongwei Ma , Vineeth Pillai , Suleiman Souhlal , Masami Hiramatsu , David Dai , Saravana Kannan , Zhao Liu Subject: [RFC 26/26] Documentation: KVM: Add description of pkg_therm_lock Date: Sat, 3 Feb 2024 17:12:14 +0800 Message-Id: <20240203091214.411862-27-zhao1.liu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240203091214.411862-1-zhao1.liu@linux.intel.com> References: <20240203091214.411862-1-zhao1.liu@linux.intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Zhao Liu pkg_therm_lock is a per-VM lock used by the PTS, HFI and ITD virtualization support. Add a description of it. Tested-by: Yanting Jiang Signed-off-by: Zhao Liu --- Documentation/virt/kvm/locking.rst | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst index 02880d5552d5..84a138916898 100644 --- a/Documentation/virt/kvm/locking.rst +++ b/Documentation/virt/kvm/locking.rst @@ -290,7 +290,7 @@ time it will be set using the Dirty tracking mechanism described above. wakeup. ``vendor_module_lock`` -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +^^^^^^^^^^^^^^^^^^^^^^ :Type: mutex :Arch: x86 :Protects: loading a vendor module (kvm_amd or kvm_intel) @@ -298,3 +298,14 @@ time it will be set using the Dirty tracking mechanism described above. taken outside of kvm_lock, e.g. in KVM's CPU online/offline callbacks, and many operations need to take cpu_hotplug_lock when loading a vendor module, e.g. updating static calls. 
+ +``pkg_therm_lock`` +^^^^^^^^^^^^^^^^^^ +:Type: mutex +:Arch: x86 (vmx) +:Protects: PTS, HFI and ITD emulation +:Comment: This is a per-VM lock and it is used for VM level thermal features' + emulation (PTS, HFI and ITD). When these features' emulated MSRs need to + be changed, or when we handle the virtual HFI table's update, this lock is + needed to make the update atomic and to avoid races with other + vCPUs in the same VM.