[v3,02/11] Documentation: In-Field Scan

Message ID 20220419163859.2228874-3-tony.luck@intel.com (mailing list archive)
State Superseded, archived
Series Introduce In Field Scan driver

Commit Message

Luck, Tony April 19, 2022, 4:38 p.m. UTC
Add documentation for In-Field Scan (IFS). This documentation
describes the basics of IFS, loading the IFS image, chunk
authentication, running a scan, and how to check the result via
sysfs, as well as tunable parameters.

The CORE_CAPABILITIES MSR enumerates whether IFS is supported.

The full GitHub location for distributing the IFS images is
still being decided, so only a placeholder is included for now
in the documentation.

Future CPUs will support more than one type of test. Plan for
that now by using a ".0" suffix on the ABI directory names.
Additional test types will use ".1", etc.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 Documentation/x86/ifs.rst   | 101 ++++++++++++++++++++++++++++++++++++
 Documentation/x86/index.rst |   1 +
 2 files changed, 102 insertions(+)
 create mode 100644 Documentation/x86/ifs.rst

Comments

Greg KH April 19, 2022, 4:48 p.m. UTC | #1
On Tue, Apr 19, 2022 at 09:38:50AM -0700, Tony Luck wrote:
> +The result of the last test is provided in /sys::
> +
> +  $ cat /sys/devices/platform/intel_ifs.0/status
> +  pass

sysfs documentation belongs in Documentation/ABI/

And why not just include this whole thing in the driver itself and suck
the documentation out of that?  No need to have a separate file.

thanks,

greg k-h
Dan Williams April 19, 2022, 7:45 p.m. UTC | #2
On Tue, Apr 19, 2022 at 9:48 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> sysfs documentation belongs in Documentation/ABI/
>
> And why not just include this whole thing in the driver itself and suck
> the documentation out of that?  No need to have a separate file.

At a minimum a separate file is needed to house the
---
 .. kernel-doc:: $source_file
   :doc: $header
---
...statements, but ok, I'll recommend that going forward to
de-emphasize shipping content directly from Documentation/ when it can
be ingested from "DOC:" source. I had been assuming DOC: blocks in the
code were more for augmenting kernel-doc on driver internal ABIs and
not longer theory of operation documentation that is an awkward fit
for Documentation/ABI/.
Greg KH April 20, 2022, 7:48 a.m. UTC | #3
On Tue, Apr 19, 2022 at 12:45:03PM -0700, Dan Williams wrote:
> On Tue, Apr 19, 2022 at 9:48 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > sysfs documentation belongs in Documentation/ABI/
> >
> > And why not just include this whole thing in the driver itself and suck
> > the documentation out of that?  No need to have a separate file.
> 
> At a minimum a separate file is needed to house the
> ---
>  .. kernel-doc:: $source_file
>    :doc: $header
> ---
> ...statements, but ok, I'll recommend that going forward to
> de-emphasize shipping content directly from Documentation/ when it can
> be ingested from "DOC:" source. I had been assuming DOC: blocks in the
> code were more for augmenting kernel-doc on driver internal ABIs and
> not longer theory of operation documentation that is an awkward fit
> for Documentation/ABI/.

I don't know which is better, it's just that creating a whole new
documentation file for a single tiny driver feels very odd as it will
get out of date and is totally removed from the driver itself.

I'd prefer that drivers be self-contained, including the documentation,
as it is much more obvious what is happening with that.  Spreading stuff
around the tree only causes stuff to get out of sync easier.

thanks,

greg k-h
Patch

diff --git a/Documentation/x86/ifs.rst b/Documentation/x86/ifs.rst
new file mode 100644
index 000000000000..62f3c07d433a
--- /dev/null
+++ b/Documentation/x86/ifs.rst
@@ -0,0 +1,101 @@ 
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+In-Field Scan
+=============
+
+Introduction
+------------
+
+In-Field Scan (IFS) is a hardware feature to run circuit-level tests on
+a CPU core to detect problems that are not caught by parity or ECC checks.
+Future CPUs will support more than one type of test, each of which will show
+up with a new platform-device instance-id; for now only .0 is exposed.
+
+
+IFS Image
+---------
+
+Intel provides a firmware file containing the scan tests via
+GitHub [#f1]_.  Similar to microcode, there is a separate file for each
+family-model-stepping.
+
+IFS Image Loading
+-----------------
+
+The driver loads the tests into memory reserved by the BIOS local to each CPU
+socket in a two-step process using writes to MSRs: first the SHA hashes
+for the tests, then the tests themselves. Status MSRs provide
+feedback on the success/failure of these steps. When a new test file
+is installed, it can be loaded by writing to the driver reload file::
+
+  # echo 1 > /sys/bus/platform/drivers/intel_ifs.0/reload
+
+Similar to microcode, the current version of the scan tests is stored
+in a fixed location: /lib/firmware/intel/ifs.0/family-model-stepping.scan
+
+Running tests
+-------------
+
+Tests are run by the driver synchronizing execution of all threads on a
+core and then writing to the ACTIVATE_SCAN MSR on all threads. Instruction
+execution continues when:
+
+1) All tests have completed.
+2) Execution was interrupted.
+3) A test detected a problem.
+
+In all cases, reading the SCAN_STATUS MSR provides details on what
+happened. The driver makes the value of this MSR visible to applications
+via the "details" file (see below). Interrupted tests may be restarted.
+
+The IFS driver provides sysfs interfaces via /sys/devices/platform/intel_ifs.0/
+to control execution:
+
+Test a specific core::
+
+  # echo <cpu#> > /sys/devices/platform/intel_ifs.0/run_test
+
+When HT is enabled, any of the sibling cpu# values can be specified to test
+its corresponding physical core. Since the tests are per physical core, the
+result of testing any thread is the same. It is only necessary to test one
+thread.
+
+For example, to test the core corresponding to cpu5::
+
+  # echo 5 > /sys/devices/platform/intel_ifs.0/run_test
+
+The result of the last test is provided in /sys::
+
+  $ cat /sys/devices/platform/intel_ifs.0/status
+  pass
+
+Status can be one of: pass, fail, untested.
+
+Additional details of the last test are provided by the details file::
+
+  $ cat /sys/devices/platform/intel_ifs.0/details
+  0x8081
+
+The details file reports the hex value of the SCAN_STATUS MSR.
+Hardware-defined error codes are documented in volume 4 of the Intel
+Software Developer's Manual, but the error_code field may contain one of
+the following driver-defined software codes:
+
++------+--------------------+
+| 0xFD | Software timeout   |
++------+--------------------+
+| 0xFE | Partial completion |
++------+--------------------+
+
+Driver design choices
+---------------------
+
+1) The ACTIVATE_SCAN MSR allows for running any consecutive subrange of
+available tests, but the driver always tries to run all tests and only
+uses the subrange feature to restart an interrupted test.
+
+2) Hardware allows for some number of cores to be tested in parallel.
+The driver does not make use of this; it only tests one core at a time.
+
+.. [#f1] https://github.com/intel/TBD
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 91b2fa456618..9d8e8a73d57b 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -35,6 +35,7 @@  x86-specific Documentation
    usb-legacy-support
    i386/index
    x86_64/index
+   ifs
    sva
    sgx
    features
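
The sysfs flow described in the ifs.rst patch above (write a cpu number to
run_test, then read status and details) can be sketched as a small shell
script. This is a minimal sketch and not part of the posted series. It assumes
the v3 paths from this patch (/sys/devices/platform/intel_ifs.0/run_test,
status, details), which may not match other revisions of this superseded
series, and it uses the standard topology/thread_siblings_list file to pick
one thread per physical core. The function names and the SCAN_DIR and CPU_DIR
variables are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: paths follow the v3 patch and are assumptions for any
# other kernel revision. Override them to point at a different location.
SCAN_DIR="${SCAN_DIR:-/sys/devices/platform/intel_ifs.0}"
CPU_DIR="${CPU_DIR:-/sys/devices/system/cpu}"

# Print one logical CPU number per physical core (the first sibling
# listed in each thread_siblings_list, e.g. "0,2" or "4-5").
first_siblings() {
    for f in "$CPU_DIR"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue
        read -r list < "$f"
        # Keep everything before the first ',' or '-'.
        echo "${list%%[,-]*}"
    done | sort -n -u
}

# Test the core that owns logical CPU $1 and report the result.
# Returns 0 only when the status file reads "pass".
run_core() {
    cpu="$1"
    echo "$cpu" > "$SCAN_DIR/run_test" || return 1
    read -r status < "$SCAN_DIR/status"
    read -r details < "$SCAN_DIR/details"
    echo "cpu$cpu: $status ($details)"
    [ "$status" = "pass" ]
}

# Test every physical core once, one core at a time.
scan_all() {
    rc=0
    for c in $(first_siblings); do
        run_core "$c" || rc=1
    done
    return "$rc"
}
```

Calling scan_all as root would test each physical core once and return
non-zero if any core reports something other than pass. Only one thread per
core is written to run_test because, per the documentation in the patch,
testing any sibling thread gives the same result.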