[ndctl,v4,0/6] Add support for reporting papr-scm nvdimm health
mbox series

Message ID 20200527044737.40615-1-vaibhav@linux.ibm.com
Headers show
Series
  • Add support for reporting papr-scm nvdimm health
Related show

Message

Vaibhav Jain May 27, 2020, 4:47 a.m. UTC
Changes since v3 [1]:
* Updated 'papr_scm_psdm.h' to recent kernel version that removes
  'payload_offset' field from 'struct nd_pdsm_cmd_pkg'.
* Changes to 'papr_scm.c' to remove code that was populating
  'payload_offset'.

[1] https://lore.kernel.org/linux-nvdimm/20200519190147.258142-1-vaibhav@linux.ibm.com

---
This patch-set proposes changes to libndctl to add support for reporting
health for nvdimms that support the PAPR standard[2]. The standard defines
machenism (HCALL) through which a guest kernel can query and fetch health
and performance stats of an nvdimm attached to the hypervisor[3]. Until
now 'ndctl' was unable to report these stats for papr_scm dimms on PPC64
guests due to absence of ACPI/NFIT, a limitation which this patch-set tries
to address.

The patch-set introduces support for the new PAPR-SCM PDSM family
defined at [4] via a new dimm-ops named
'papr_scm_dimm_ops'. Infrastructure to probe and distinguish papr-scm
dimms from other dimm families that may support ACPI/NFIT is
implemented by updating the 'struct ndctl_dimm' initialization
routines to bifurcate based on the nvdimm type. We also introduce two
new dimm-ops member for handling initialization of dimm specific data
for specific DSM families.

These changes coupled with proposed kernel changes located at Ref[1] should
provide a way for the user to retrieve NVDIMM health status using ndtcl for
papr_scm guests. Below is a sample output using proposed kernel + ndctl
changes:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "flag_smart_event":true,
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Structure of the patchset
=========================

We start with a re-factoring patch that splits the 'add_dimm()' function
into two functions one that take care of allocating and initializing
'struct ndctl_dimm' and another that takes care of initializing nfit
specific dimm attributes.

Patch-2 introduces probe function of papr_scm nvdimms and assigning
'papr_scm_dimm_ops' defined in 'papr_scm.c' to 'dimm->ops' if
needed. The patch also code to parse the dimm flags specific to
papr-scm nvdimms

Patch-3 introduces new dimm ops 'dimm_init()' & 'dimm_uninit()' to handle
DSM family specific initialization of 'struct ndctl_dimm'.

Patches-4,5 implements scaffolding to add support for PAPR_SCM PDSM
requests and pull in their definitions from the kernel.

Finally Patch-6 add support for issuing and handling the result of
'struct ndctl_cmd' to request dimm health stats from papr_scm kernel module
and returning appropriate health status to libndctl for reporting.

References
==========
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference

[3] "Hypercall Op-codes (hcalls)"
https://github.com/torvalds/linux/blob/master/Documentation/powerpc/papr_hcalls.rst

[4] "ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods"
https://lore.kernel.org/linux-nvdimm/20200527041244.37821-5-vaibhav@linux.ibm.com/

---
Vaibhav Jain (6):
  libndctl: Refactor out add_dimm() to handle NFIT specific init
  libncdtl: Add initial support for NVDIMM_FAMILY_PAPR_SCM dimm family
  libndctl: Introduce new dimm-ops dimm_init() & dimm_uninit()
  libndctl,papr_scm: Add definitions for PAPR nvdimm specific methods
  libndctl,papr_scm: Add scaffolding to issue and handle PDSM requests
  libndctl,papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH

 ndctl/lib/Makefile.am     |   1 +
 ndctl/lib/libndctl.c      | 260 +++++++++++++++++++++++---------
 ndctl/lib/papr_scm.c      | 301 ++++++++++++++++++++++++++++++++++++++
 ndctl/lib/papr_scm_pdsm.h | 175 ++++++++++++++++++++++
 ndctl/lib/private.h       |   9 ++
 ndctl/libndctl.h          |   1 +
 ndctl/ndctl.h             |   1 +
 7 files changed, 678 insertions(+), 70 deletions(-)
 create mode 100644 ndctl/lib/papr_scm.c
 create mode 100644 ndctl/lib/papr_scm_pdsm.h