[RFC,0/4] EDAC/ghes: Add EDAC device for recording the CPU error count

Message ID 20201105174233.1146-1-shiju.jose@huawei.com (mailing list archive)
Series EDAC/ghes: Add EDAC device for recording the CPU error count

Message

Shiju Jose Nov. 5, 2020, 5:42 p.m. UTC
For firmware-first error handling on ARM64 hardware platforms, the
CPU cache corrected error count is not recorded.
Create a CPU EDAC device and device blocks for the CPU caches
for this purpose. The EDAC device blocks are created based on the
CPU cache information represented in the ACPI PPTT.

User-space applications can monitor the recorded corrected error
count for early fault detection.
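
As a rough illustration of the user-space monitoring mentioned above, the
sketch below reads a corrected error counter from a text file and flags
growth beyond a threshold. It assumes the counter is exposed as a plain
decimal sysfs attribute, as EDAC ce_count files normally are. The cover
letter does not give the sysfs path of the new device, so the path is left
to the caller, and the helper names read_ce_count() and ce_count_alarm()
are hypothetical.

```c
#include <stdio.h>

/*
 * Read an unsigned decimal counter (for example an EDAC ce_count
 * attribute) from a sysfs-style text file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_ce_count(const char *path, unsigned long long *count)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%llu", count) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Compare the current counter value with the previously seen one.
 * Returns 1 if it grew by at least 'threshold', 0 if it did not,
 * and -1 if the counter could not be read. On a successful read,
 * '*previous' is updated to the current value.
 */
static int ce_count_alarm(const char *path, unsigned long long *previous,
			  unsigned long long threshold)
{
	unsigned long long now;
	int alarm;

	if (read_ce_count(path, &now))
		return -1;
	alarm = (now >= *previous) && (now - *previous >= threshold);
	*previous = now;
	return alarm;
}
```

A monitoring daemon could call ce_count_alarm() periodically for each cache
block's counter file and raise a warning when it returns 1.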

Jonathan Cameron (1):
  ACPI: PPTT: Fix for a high level cache node detected in the low level

Shiju Jose (3):
  ACPI: PPTT: Add function acpi_find_cache_info
  EDAC/ghes: Add EDAC device for the CPU caches
  ACPI / APEI: Add reporting ARM64 CPU cache corrected error count

 drivers/acpi/apei/ghes.c  |  79 +++++++++++++++++++++-
 drivers/acpi/pptt.c       | 123 +++++++++++++++++++++++++++++++++-
 drivers/edac/Kconfig      |  10 +++
 drivers/edac/ghes_edac.c  | 135 ++++++++++++++++++++++++++++++++++++++
 include/acpi/ghes.h       |  27 ++++++++
 include/linux/cacheinfo.h |  12 ++++
 include/linux/cper.h      |   4 ++
 7 files changed, 386 insertions(+), 4 deletions(-)

Comments

James Morse Nov. 6, 2020, 7:33 p.m. UTC | #1
Hi Shiju,

On 05/11/2020 17:42, Shiju Jose wrote:
> For firmware-first error handling on ARM64 hardware platforms, the
> CPU cache corrected error count is not recorded.
> Create a CPU EDAC device and device blocks for the CPU caches
> for this purpose. The EDAC device blocks are created based on the
> CPU cache information represented in the ACPI PPTT.

Using the PPTT won't work on x86 systems. Can we use the core code's common data to learn
about caches: struct cpu_cacheinfo and struct cacheinfo?


Thanks,

James
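
For illustration only, here is a minimal user-space sketch of the
suggestion above: walk a CPU's cache leaves and derive one block name per
cache. The two structs are simplified stand-ins for struct cpu_cacheinfo
and struct cacheinfo from include/linux/cacheinfo.h, modelling only the
fields used here. In the kernel the data would come from
get_cpu_cacheinfo(cpu). The "L1D"/"L2" naming scheme and the helper names
are assumptions, not taken from the series.

```c
#include <stdio.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions (assumption: only
 * the fields needed for this sketch are modelled). */
enum cache_type {
	CACHE_TYPE_NOCACHE = 0,
	CACHE_TYPE_INST = 1,
	CACHE_TYPE_DATA = 2,
	CACHE_TYPE_SEPARATE = 3,
	CACHE_TYPE_UNIFIED = 4,
};

struct cacheinfo {
	enum cache_type type;
	unsigned int level;
};

struct cpu_cacheinfo {
	struct cacheinfo *info_list;
	unsigned int num_leaves;
};

/* Build a hypothetical block name such as "L1D", "L1I" or "L2".
 * Returns 0 on success, -1 for leaf types that get no block. */
static int cache_block_name(const struct cacheinfo *leaf, char *buf,
			    size_t len)
{
	const char *suffix;

	switch (leaf->type) {
	case CACHE_TYPE_DATA:
		suffix = "D";
		break;
	case CACHE_TYPE_INST:
		suffix = "I";
		break;
	case CACHE_TYPE_UNIFIED:
		suffix = "";
		break;
	default:
		return -1;
	}
	snprintf(buf, len, "L%u%s", leaf->level, suffix);
	return 0;
}

/* Name up to 'max' blocks, one per usable cache leaf.
 * Returns the number of names written. */
static unsigned int name_cache_blocks(const struct cpu_cacheinfo *cci,
				      char names[][8], unsigned int max)
{
	unsigned int i, n = 0;

	for (i = 0; i < cci->num_leaves && n < max; i++) {
		if (!cache_block_name(&cci->info_list[i], names[n],
				      sizeof(names[n])))
			n++;
	}
	return n;
}
```

Because this walks only the generic cache description, the same loop would
apply regardless of whether the architecture filled it in from the PPTT or
from another source.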