[ndctl,0/8] Add support for reporting papr-scm nvdimm health
mbox series

Message ID 20200220104928.198625-1-vaibhav@linux.ibm.com
Headers show
Series
  • Add support for reporting papr-scm nvdimm health
Related show

Message

Vaibhav Jain Feb. 20, 2020, 10:49 a.m. UTC
This patch-set proposes changes to libndctl to add support for reporting
health for nvdimms that support the PAPR standard[1]. The standard defines
heathenism (HCALL) through which a guest kernel can query and fetch health
and performance stats of an nvdimm attached to the hypervisor[2]. Until
now 'ndctl' was unable to report these stats for papr_scm dimms on PPC64
guests due to absence of ACPI/NFIT, a limitation which this patch-set tries
to address.

The patch-set introduces support for the new PAPR-SCM DSM family defined at
[3] via a new dimm-ops named 'papr_scm_dimm_ops'. Infrastructure to probe
and distinguish papr-scm dimms from other dimm families that may support
ACPI/NFIT is implemented by updating the 'struct ndctl_dimm' initialization
routines to bifurcate based on the nvdimm type. We also introduce two new
dimm-ops for handling initialization of dimm specific data for specific DSM
families. 

These changes coupled with proposed ndtcl changes located at Ref[4] should
provide a way for the user to retrieve NVDIMM health status using ndtcl for
papr_scm guests. Below is a sample output using proposed kernel + ndctl
changes:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Structure of the patchset
=========================

We start with a re-factoring patch that splits the 'add_dimm()' function
into two functions one that take care of allocating and initializing
'struct ndctl_dimm' and another that takes care of initializing nfit
specific dimm attributes.

Patch-2 introduces new dimm ops 'dimm_init()' & 'dimm_uninit()' to handle
DSM family specific initialization of 'struct ndctl_dimm'.

Patch-3 introduces probe function of papr_scm nvdimms and assigning
'papr_scm_dimm_ops' to 'dimm->ops' if needed. Subsequent patch-4 adds
support for parsing papr_scm nvdimm flags and updating 'dimm->flags' with
them.

Patches-5,6 implements scaffolding to add support for PAPR_SCM DSM commands
and pull in their definitions from the kernel. This implementation resides
in newly added 'papr_scm.c' file.

Finally Patched-7,8 add support for issuing and handling the result of
'struct ndctl_cmd' to request dimm health stats from papr_scm kernel module
and returning appropriate health status to libndctl for reporting.

References:
[1]: "Power Architecture Platform Reference"
      https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[2]: "[DOC,v2] powerpc: Provide initial documentation for PAPR hcalls"
     https://patchwork.ozlabs.org/patch/1154292/
[3]: "[PATCH 5/8] powerpc/uapi: Introduce uapi header 'papr_scm_dsm.h' for
     papr_scm DSMs"
     https://patchwork.ozlabs.org/patch/1241352/
[4]: "powerpc/papr_scm: Add support for reporting nvdimm health"
     https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=159701     

Vaibhav Jain (8):
  libndctl: Refactor out add_dimm() to handle NFIT specific init
  libndctl: Introduce a new dimm-ops dimm_init() & dimm_uninit()
  libncdtl: Add initial support for NVDIMM_FAMILY_PAPR_SCM dimm family
  libndctl: Add support for parsing of_pmem dimm flags and monitor mode
  libndctl,papr_scm: Add definitions for PAPR_SCM DSM commands
  libndctl,papr_scm: Implement scaffolding to issue and handle DSM cmds
  libndctl,papr_scm: Implement support for DSM_PAPR_SCM_HEALTH
  libndctl,papr_scm: Add support for reporting bad shutdown

 ndctl/lib/Makefile.am    |   1 +
 ndctl/lib/libndctl.c     | 263 +++++++++++++++++++++++++----------
 ndctl/lib/papr_scm.c     | 292 +++++++++++++++++++++++++++++++++++++++
 ndctl/lib/papr_scm_dsm.h | 143 +++++++++++++++++++
 ndctl/lib/private.h      |   7 +
 ndctl/libndctl.h         |   1 +
 ndctl/ndctl.h            |   1 +
 7 files changed, 634 insertions(+), 74 deletions(-)
 create mode 100644 ndctl/lib/papr_scm.c
 create mode 100644 ndctl/lib/papr_scm_dsm.h