[ndctl,v6,0/5] Add support for reporting papr nvdimm health

Message ID 20200616053029.84731-1-vaibhav@linux.ibm.com (mailing list archive)

Vaibhav Jain June 16, 2020, 5:30 a.m. UTC
Changes since v5 [1]:
* Removed the patch introducing new dimm-ops 'dimm_init()' &
  'dimm_uninit()'. Corresponding code that used the dimm private
  initialization is also removed.
* Updated various dimm-ops callbacks to rely on the 'struct ndctl_cmd' arg
  instead of dimm-private.
* Added ndctl_bus_has_of_node() and ndctl_bus_is_papr_scm() to library
  ld version script.
* Simplified probing of papr compatible nvdimms by using the newly
  exported library function ndctl_bus_is_papr_scm().
* Reworked various dimm-ops callbacks based on the updated uapi interface
  with papr_scm as defined at [5].
* Introduced a new header 'papr.h' that defines 'struct nd_pkg_papr',
  which holds 'struct nd_cmd_pkg gen' and 'struct nd_pkg_pdsm pdsm'
  together.

[1] https://lore.kernel.org/linux-nvdimm/20200529220600.225320-1-vaibhav@linux.ibm.com
---
This patch-set proposes changes to libndctl to add support for reporting
health for nvdimms that support the PAPR standard[2]. The standard defines
a mechanism (HCALL) through which a guest kernel can query and fetch health
and performance stats of an nvdimm attached to the hypervisor[3]. Until
now 'ndctl' was unable to report these stats for papr_scm dimms on PPC64
guests due to the absence of ACPI/NFIT, a limitation that this patch-set
tries to address.

The patch-set introduces support for the new PAPR PDSM family
defined at [4] & [5] via a new dimm-ops named
'papr_dimm_ops'. Infrastructure to probe and distinguish papr-scm
dimms from other dimm families that may support ACPI/NFIT is
implemented by updating the 'struct ndctl_dimm' initialization
routines to bifurcate based on the nvdimm type. We also introduce two
new dimm-ops members for handling initialization of dimm specific data
for specific DSM families.
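
The probing and ops selection described above can be pictured with the
minimal sketch below. It is not the libndctl implementation: every type
and function name prefixed with 'sketch_' is a hypothetical stand-in, and
the two booleans on the bus merely model what checks such as
ndctl_bus_is_papr_scm() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the libndctl types. */
struct sketch_dimm;

struct sketch_dimm_ops {
	const char *name;
	/* Returns a label for the health source; illustrative only. */
	const char *(*health_source)(const struct sketch_dimm *dimm);
};

struct sketch_bus {
	bool has_nfit;     /* assumed: bus was described by ACPI/NFIT */
	bool is_papr_scm;  /* assumed: bus is backed by papr_scm */
};

struct sketch_dimm {
	const struct sketch_bus *bus;
	const struct sketch_dimm_ops *ops;
};

static const char *sketch_nfit_health_source(const struct sketch_dimm *dimm)
{
	(void)dimm;
	return "nfit-dsm";
}

static const char *sketch_papr_health_source(const struct sketch_dimm *dimm)
{
	(void)dimm;
	return "papr-pdsm";
}

static const struct sketch_dimm_ops sketch_nfit_ops = {
	"nfit", sketch_nfit_health_source,
};

static const struct sketch_dimm_ops sketch_papr_ops = {
	"papr", sketch_papr_health_source,
};

/* Common initialization first, then branch on the bus type to pick ops. */
static void sketch_dimm_init(struct sketch_dimm *dimm,
			     const struct sketch_bus *bus)
{
	assert(dimm != NULL && bus != NULL);

	dimm->bus = bus;
	dimm->ops = NULL;	/* unknown family: no family specific ops */

	if (bus->is_papr_scm)
		dimm->ops = &sketch_papr_ops;
	else if (bus->has_nfit)
		dimm->ops = &sketch_nfit_ops;
}
```

Callers would then go through 'dimm->ops' without needing to know which
family backs the dimm, and would treat a NULL 'ops' as "no family
specific support".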

These changes coupled with proposed kernel changes located at Ref[4] should
provide a way for the user to retrieve NVDIMM health status using ndctl for
pseries guests. Below is a sample output using proposed kernel + ndctl
changes:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "flag_smart_event":true,
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Structure of the patchset
=========================

We start with a re-factoring patch that splits the 'add_dimm()' function
into two functions: one that takes care of allocating and initializing
'struct ndctl_dimm' and another that takes care of initializing nfit
specific dimm attributes.

Patch-2 introduces a probe function for papr nvdimms and assigns
'papr_dimm_ops' defined in 'papr.c' to 'dimm->ops' if
needed. The patch also adds code to parse the dimm flags specific to
papr nvdimms.
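
As a rough illustration of the flag parsing mentioned for Patch-2, the
sketch below turns a whitespace-separated flag string into a set of
booleans. The token names ("not_armed", "flush_fail", "restore_fail",
"smart_notify") and the 'sketch_' identifiers are assumptions made for
this example; the authoritative flag names are whatever the papr_scm
kernel module exposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of per-dimm flags; names are illustrative. */
struct sketch_dimm_flags {
	bool not_armed;
	bool flush_fail;
	bool restore_fail;
	bool smart_notify;
};

/* True if the token of length 'len' starting at 'tok' equals 'name'. */
static bool sketch_token_is(const char *tok, size_t len, const char *name)
{
	return strlen(name) == len && strncmp(tok, name, len) == 0;
}

/*
 * Parse a whitespace-separated flag string into 'flags'.
 * Unknown tokens are ignored; returns the number of recognized tokens.
 */
static int sketch_parse_flags(const char *str,
			      struct sketch_dimm_flags *flags)
{
	static const char delim[] = " \t\n";
	int known = 0;

	assert(str != NULL && flags != NULL);
	memset(flags, 0, sizeof(*flags));

	while (*str) {
		size_t len;

		str += strspn(str, delim);	/* skip separators */
		len = strcspn(str, delim);	/* length of next token */
		if (len == 0)
			break;

		if (sketch_token_is(str, len, "not_armed")) {
			flags->not_armed = true;
			known++;
		} else if (sketch_token_is(str, len, "flush_fail")) {
			flags->flush_fail = true;
			known++;
		} else if (sketch_token_is(str, len, "restore_fail")) {
			flags->restore_fail = true;
			known++;
		} else if (sketch_token_is(str, len, "smart_notify")) {
			flags->smart_notify = true;
			known++;
		}
		str += len;
	}

	return known;
}
```

The input string is never modified, so the same buffer read from sysfs
could be reused by the caller afterwards.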

Patches-3,4 implement scaffolding to add support for PAPR PDSM
requests and pull in their definitions from the kernel.
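
To make the request envelope concrete, here is a sketch of how a generic
command header and a PDSM payload can sit back to back in one structure,
in the spirit of the 'struct nd_pkg_papr' described in the changelog.
The 'sketch_' types are stand-ins, not the kernel uapi definitions: the
PDSM field layout, the payload size and the way the in/out sizes are
split are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PDSM_PAYLOAD_SIZE 64	/* hypothetical payload size */

/*
 * Stand-in for the generic 'struct nd_cmd_pkg' header. The real
 * structure ends in a flexible payload array, which is omitted here.
 */
struct sketch_cmd_pkg {
	uint64_t nd_family;
	uint64_t nd_command;
	uint32_t nd_size_in;
	uint32_t nd_size_out;
	uint32_t nd_reserved2[9];
	uint32_t nd_fw_size;
};

/* Hypothetical stand-in for 'struct nd_pkg_pdsm'; layout is assumed. */
struct sketch_pkg_pdsm {
	int32_t cmd_status;
	uint16_t reserved[2];
	uint8_t payload[SKETCH_PDSM_PAYLOAD_SIZE];
};

/* Generic header and PDSM envelope held together. */
struct sketch_pkg_papr {
	struct sketch_cmd_pkg gen;
	struct sketch_pkg_pdsm pdsm;
};

/* The PDSM envelope must start exactly where the header ends. */
static_assert(offsetof(struct sketch_pkg_papr, pdsm) ==
	      sizeof(struct sketch_cmd_pkg),
	      "pdsm must directly follow the generic header");

/* Zero the package and fill in the generic header for one request. */
static void sketch_prepare_request(struct sketch_pkg_papr *pkg,
				   uint64_t family, uint64_t command)
{
	assert(pkg != NULL);

	memset(pkg, 0, sizeof(*pkg));
	pkg->gen.nd_family = family;
	pkg->gen.nd_command = command;
	/* Assumed split: envelope header goes in, payload comes out. */
	pkg->gen.nd_size_in = offsetof(struct sketch_pkg_pdsm, payload);
	pkg->gen.nd_size_out = SKETCH_PDSM_PAYLOAD_SIZE;
}
```

Keeping both parts in one structure lets a callback address the header
and the payload through named members instead of casting the raw
payload bytes.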

Finally Patch-5 adds support for issuing a 'struct ndctl_cmd' that requests
dimm health stats from the papr_scm kernel module, handling its result, and
returning the appropriate health status to libndctl for reporting.
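
The last step, turning a health value reported by the kernel into
something like the "health_state" string shown in the sample output, can
be sketched as below. The enum values, flag bits and function names are
hypothetical and only model the idea of mapping a PAPR health level onto
generic health flags; the real definitions come from the PDSM header and
libndctl.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical health levels as a PDSM health payload might report. */
enum sketch_papr_health {
	SKETCH_PAPR_HEALTHY,
	SKETCH_PAPR_UNHEALTHY,
	SKETCH_PAPR_CRITICAL,
	SKETCH_PAPR_FATAL,
};

/* Hypothetical library-side health flags. */
#define SKETCH_HEALTH_NON_CRITICAL	(1u << 0)
#define SKETCH_HEALTH_CRITICAL		(1u << 1)
#define SKETCH_HEALTH_FATAL		(1u << 2)

/* Map a PAPR health level onto the generic health flags. */
static unsigned int sketch_health_to_flags(enum sketch_papr_health health)
{
	switch (health) {
	case SKETCH_PAPR_FATAL:
		return SKETCH_HEALTH_FATAL;
	case SKETCH_PAPR_CRITICAL:
		return SKETCH_HEALTH_CRITICAL;
	case SKETCH_PAPR_UNHEALTHY:
		return SKETCH_HEALTH_NON_CRITICAL;
	case SKETCH_PAPR_HEALTHY:
	default:
		return 0;
	}
}

/* Pick the most severe flag that is set and name it for reporting. */
static const char *sketch_health_name(unsigned int flags)
{
	if (flags & SKETCH_HEALTH_FATAL)
		return "fatal";
	if (flags & SKETCH_HEALTH_CRITICAL)
		return "critical";
	if (flags & SKETCH_HEALTH_NON_CRITICAL)
		return "non-critical";
	return "ok";
}
```

Checking the most severe flag first means a dimm that reports several
conditions at once is always described by its worst state.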

References
==========
[2] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference

[3] "Hypercall Op-codes (hcalls)"
https://github.com/torvalds/linux/blob/master/Documentation/powerpc/papr_hcalls.rst

[4] "powerpc/papr_scm: Add support for reporting nvdimm health"
https://lore.kernel.org/linux-nvdimm/20200615124407.32596-1-vaibhav@linux.ibm.com/

[5] "ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods"
https://lore.kernel.org/linux-nvdimm/20200615124407.32596-6-vaibhav@linux.ibm.com/

Vaibhav Jain (5):
  libndctl: Refactor out add_dimm() to handle NFIT specific init
  libncdtl: Add initial support for NVDIMM_FAMILY_PAPR nvdimm family
  libndctl,papr_scm: Add definitions for PAPR nvdimm specific methods
  papr: Add scaffolding to issue and handle PDSM requests
  libndctl,papr_scm: Implement support for PAPR_PDSM_HEALTH

 ndctl/lib/Makefile.am  |   1 +
 ndctl/lib/libndctl.c   | 264 +++++++++++++++++++++++++++++------------
 ndctl/lib/libndctl.sym |   5 +
 ndctl/lib/papr.c       | 218 ++++++++++++++++++++++++++++++++++
 ndctl/lib/papr.h       |  15 +++
 ndctl/lib/papr_pdsm.h  | 132 +++++++++++++++++++++
 ndctl/lib/private.h    |   4 +
 ndctl/libndctl.h       |   2 +
 ndctl/ndctl.h          |   1 +
 9 files changed, 569 insertions(+), 73 deletions(-)
 create mode 100644 ndctl/lib/papr.c
 create mode 100644 ndctl/lib/papr.h
 create mode 100644 ndctl/lib/papr_pdsm.h