diff mbox series

[v5,03/10] platform/x86/intel/ifs: Add stub driver for In-Field Scan

Message ID 20220428153849.295779-4-tony.luck@intel.com (mailing list archive)
State Deferred, archived
Headers show
Series Introduce In Field Scan driver | expand

Commit Message

Luck, Tony April 28, 2022, 3:38 p.m. UTC
Cloud Service Providers that operate fleets of servers have reported
[1] occasions where they can detect that a CPU has gone bad due to
effects like electromigration, or isolated manufacturing defects.
However, that detection method is A/B testing seemingly random
application failures looking for a pattern. In-Field Scan (IFS) is
a driver for a platform capability to load a crafted 'scan image'
to run targeted low level diagnostics outside of the CPU's architectural
error detection capabilities.

Stub version of driver just does initial part of check for the IFS
feature. MSR_IA32_CORE_CAPS must enumerate the presence of the
MSR_INTEGRITY_CAPS MSR.

[1]: https://www.youtube.com/watch?v=QMF3rqhjYuM

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 MAINTAINERS                             |  7 ++++
 drivers/platform/x86/intel/Kconfig      |  1 +
 drivers/platform/x86/intel/Makefile     |  1 +
 drivers/platform/x86/intel/ifs/Kconfig  | 13 +++++++
 drivers/platform/x86/intel/ifs/Makefile |  3 ++
 drivers/platform/x86/intel/ifs/core.c   | 48 +++++++++++++++++++++++++
 6 files changed, 73 insertions(+)
 create mode 100644 drivers/platform/x86/intel/ifs/Kconfig
 create mode 100644 drivers/platform/x86/intel/ifs/Makefile
 create mode 100644 drivers/platform/x86/intel/ifs/core.c

Comments

Thomas Gleixner May 4, 2022, 10:35 a.m. UTC | #1
On Thu, Apr 28 2022 at 08:38, Tony Luck wrote:
> +#define X86_MATCH(model)				\
> +	X86_MATCH_VENDOR_FAM_MODEL_FEATURE(INTEL, 6,	\
> +		INTEL_FAM6_##model, X86_FEATURE_CORE_CAPABILITIES, NULL)
> +
> +static const struct x86_cpu_id ifs_cpu_ids[] __initconst = {
> +	X86_MATCH(SAPPHIRERAPIDS_X),

Why do we need a model match here? The core capabilities MSR is only
available when X86_FEATURE_CORE_CAPABILITIES is set:

    "If CPUID.(EAX=07H, ECX=0):EDX[30] = 1.
     This MSR provides an architectural enumeration
     function for model-specific behavior."

So checking for Intel Fam6 ANYMODEL and X86_FEATURE_CORE_CAPABILITIES is
sufficient, no?

We really don't need more match id tables with gazillions of CPU models.

Thanks,

        tglx
Luck, Tony May 4, 2022, 4:24 p.m. UTC | #2
>> +static const struct x86_cpu_id ifs_cpu_ids[] __initconst = {
>> +	X86_MATCH(SAPPHIRERAPIDS_X),
>
> Why do we need a model match here? The core capabilities MSR is only
> available when X86_FEATURE_CORE_CAPABILITIES is set:
>
>    "If CPUID.(EAX=07H, ECX=0):EDX[30] = 1.
>     This MSR provides an architectural enumeration
>     function for model-specific behavior."
>
> So checking for Intel Fam6 ANYMODEL and X86_FEATURE_CORE_CAPABILITIES is
> sufficient, no?

IA32_CORE_CAPABILITES is a nightmare. Although it is an architectural
register, the bits inside it are model specific.

In particular bit 2 (which we check here for the existence of the INTEGRITY
MSR) has been assigned for other use on other models. See SDM volume 4
table 2-45 where bit 2 means FUSA supported on 06_8C and 06_8D (Tigerlake
mobile and desktop). Ditto in table 2-46 (Alderlake and Raptorlake).

> We really don't need more match id tables with gazillions of CPU models.

Sadly we do :-(

-Tony
Borislav Petkov May 4, 2022, 4:28 p.m. UTC | #3
On Wed, May 04, 2022 at 04:24:50PM +0000, Luck, Tony wrote:
> > We really don't need more match id tables with gazillions of CPU models.
> 
> Sadly we do :-(

So what was the reasoning about CPUID bits being so expensive so that we
need to match models? Ditto for the splitlock situation - that thing is
supported on a bunch of models but nope, not a CPUID bit in sight. What
was the convincing argument that made hw folks give a CPUID bit to the
PPIN thing? Perhaps we could use it there too. :)
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e8c2f611766..bc902d0c64d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9859,6 +9859,13 @@  B:	https://bugzilla.kernel.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git
 F:	drivers/idle/intel_idle.c
 
+INTEL IN FIELD SCAN (IFS) DEVICE
+M:	Jithu Joseph <jithu.joseph@intel.com>
+R:	Ashok Raj <ashok.raj@intel.com>
+R:	Tony Luck <tony.luck@intel.com>
+S:	Maintained
+F:	drivers/platform/x86/intel/ifs
+
 INTEL INTEGRATED SENSOR HUB DRIVER
 M:	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
 M:	Jiri Kosina <jikos@kernel.org>
diff --git a/drivers/platform/x86/intel/Kconfig b/drivers/platform/x86/intel/Kconfig
index 1f01a8a23c57..794968bda115 100644
--- a/drivers/platform/x86/intel/Kconfig
+++ b/drivers/platform/x86/intel/Kconfig
@@ -4,6 +4,7 @@ 
 #
 
 source "drivers/platform/x86/intel/atomisp2/Kconfig"
+source "drivers/platform/x86/intel/ifs/Kconfig"
 source "drivers/platform/x86/intel/int1092/Kconfig"
 source "drivers/platform/x86/intel/int3472/Kconfig"
 source "drivers/platform/x86/intel/pmc/Kconfig"
diff --git a/drivers/platform/x86/intel/Makefile b/drivers/platform/x86/intel/Makefile
index c61bc3e97121..717933dd0cfd 100644
--- a/drivers/platform/x86/intel/Makefile
+++ b/drivers/platform/x86/intel/Makefile
@@ -5,6 +5,7 @@ 
 #
 
 obj-$(CONFIG_INTEL_ATOMISP2_PDX86)	+= atomisp2/
+obj-$(CONFIG_INTEL_IFS)			+= ifs/
 obj-$(CONFIG_INTEL_SAR_INT1092)		+= int1092/
 obj-$(CONFIG_INTEL_SKL_INT3472)		+= int3472/
 obj-$(CONFIG_INTEL_PMC_CORE)		+= pmc/
diff --git a/drivers/platform/x86/intel/ifs/Kconfig b/drivers/platform/x86/intel/ifs/Kconfig
new file mode 100644
index 000000000000..d84491cfb0db
--- /dev/null
+++ b/drivers/platform/x86/intel/ifs/Kconfig
@@ -0,0 +1,13 @@ 
+config INTEL_IFS
+	tristate "Intel In Field Scan"
+	depends on X86 && 64BIT && SMP
+	select INTEL_IFS_DEVICE
+	help
+	  Enable support for the In Field Scan capability in select
+	  CPUs. The capability allows for running low level tests via
+	  a scan image distributed by Intel via Github to validate CPU
+	  operation beyond baseline RAS capabilities. To compile this
+	  support as a module, choose M here. The module will be called
+	  intel_ifs.
+
+	  If unsure, say N.
diff --git a/drivers/platform/x86/intel/ifs/Makefile b/drivers/platform/x86/intel/ifs/Makefile
new file mode 100644
index 000000000000..af904880e959
--- /dev/null
+++ b/drivers/platform/x86/intel/ifs/Makefile
@@ -0,0 +1,3 @@ 
+obj-$(CONFIG_INTEL_IFS)		+= intel_ifs.o
+
+intel_ifs-objs			:= core.o
diff --git a/drivers/platform/x86/intel/ifs/core.c b/drivers/platform/x86/intel/ifs/core.c
new file mode 100644
index 000000000000..e3623ac691b5
--- /dev/null
+++ b/drivers/platform/x86/intel/ifs/core.c
@@ -0,0 +1,48 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2022 Intel Corporation. */
+
+#include <linux/module.h>
+#include <linux/kdev_t.h>
+
+#include <asm/cpu_device_id.h>
+
+#define X86_MATCH(model)				\
+	X86_MATCH_VENDOR_FAM_MODEL_FEATURE(INTEL, 6,	\
+		INTEL_FAM6_##model, X86_FEATURE_CORE_CAPABILITIES, NULL)
+
+static const struct x86_cpu_id ifs_cpu_ids[] __initconst = {
+	X86_MATCH(SAPPHIRERAPIDS_X),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, ifs_cpu_ids);
+
+static int __init ifs_init(void)
+{
+	const struct x86_cpu_id *m;
+	u64 msrval;
+
+	m = x86_match_cpu(ifs_cpu_ids);
+	if (!m)
+		return -ENODEV;
+
+	if (rdmsrl_safe(MSR_IA32_CORE_CAPS, &msrval))
+		return -ENODEV;
+
+	if (!(msrval & MSR_IA32_CORE_CAPS_INTEGRITY_CAPS))
+		return -ENODEV;
+
+	if (rdmsrl_safe(MSR_INTEGRITY_CAPS, &msrval))
+		return -ENODEV;
+
+	return 0;
+}
+
+static void __exit ifs_exit(void)
+{
+}
+
+module_init(ifs_init);
+module_exit(ifs_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Intel In Field Scan (IFS) device");