[v3,0/2] papr: Implement initial support for injecting smart errors

Message ID 163638167629.400685.8268507373653839032.stgit@lep8c.aus.stglabs.ibm.com (mailing list archive)

Shivaprasad G Bhat Nov. 8, 2021, 2:27 p.m. UTC
From: Vaibhav Jain <vaibhav@linux.ibm.com>

Changes since v2:
Link: https://lore.kernel.org/nvdimm/163102311841.258999.14260383111577082134.stgit@99912bbcb4c7/
* Removed redundant comments as suggested by Ira.
* Added the Reviewed-by: Ira tag

Changes since v1:
Link: https://patchwork.kernel.org/project/linux-nvdimm/cover/20210712173132.1205192-1-vaibhav@linux.ibm.com/
* Minor update to patch description
* The changes are based on the new kernel patch [1]

This patch series implements limited support for injecting SMART errors for PAPR
NVDIMMs via the ndctl-inject-smart(1) command. SMART errors are emulated in the
papr_scm module, as PAPR presently doesn't support injecting SMART errors on an
NVDIMM. Currently, support for injecting the 'fatal' health state and the 'dirty'
shutdown state is implemented. With the proposed ndctl patches and the
corresponding kernel patch [1], the following command flow is expected:

$ sudo ndctl list -DH -d nmem0
...
      "health_state":"ok",
      "shutdown_state":"clean",
...
 # inject unsafe shutdown and fatal health error
$ sudo ndctl inject-smart nmem0 -Uf
...
      "health_state":"fatal",
      "shutdown_state":"dirty",
...
 # uninject all errors
$ sudo ndctl inject-smart nmem0 -N
...
      "health_state":"ok",
      "shutdown_state":"clean",
...

Structure of the patch series
=============================

* The first patch updates the 'inject-smart' code so that it no longer assumes
  support for injecting all smart-errors. It also updates 'intel.c' to
  explicitly indicate the types of smart-inject errors supported.

* The second patch updates 'papr.c' to add support for injecting smart 'fatal'
  health and 'dirty-shutdown' errors.
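
To illustrate the idea behind the two patches, here is a minimal sketch of
per-family capability flags and the check a caller could make before
attempting an injection. It is not the code from the series: the flag names,
the 'struct dimm_backend' type and the 'unsupported_injections()' helper are
all hypothetical, and the set shown for the Intel family is only an
assumption. The PAPR set (fatal health and dirty shutdown) follows the
description above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical capability flags, one bit per injectable SMART error type.
 * The names and values actually used in libndctl.h may differ.
 */
enum smart_inject_flag {
	INJECT_MTEMP    = 1 << 0, /* media temperature */
	INJECT_SPARES   = 1 << 1, /* spare capacity */
	INJECT_FATAL    = 1 << 2, /* fatal health state */
	INJECT_SHUTDOWN = 1 << 3, /* dirty (unsafe) shutdown */
};

/* Hypothetical stand-in for a DIMM family backend such as intel.c or papr.c. */
struct dimm_backend {
	const char *name;
	unsigned int supported; /* bitmask of smart_inject_flag values */
};

/* Assumed for illustration: this family advertises every type listed above. */
static const struct dimm_backend intel_backend = {
	"intel",
	INJECT_MTEMP | INJECT_SPARES | INJECT_FATAL | INJECT_SHUTDOWN,
};

/* Per the cover letter: only 'fatal' health and 'dirty' shutdown are emulated. */
static const struct dimm_backend papr_backend = {
	"papr",
	INJECT_FATAL | INJECT_SHUTDOWN,
};

/*
 * Return the subset of 'requested' that the backend cannot inject.
 * A return value of 0 means every requested type is supported.
 */
unsigned int unsupported_injections(const struct dimm_backend *be,
				    unsigned int requested)
{
	assert(be != NULL);
	return requested & ~be->supported;
}
```

With such a mask, a request like '-Uf' (dirty shutdown plus fatal health)
passes the check on a PAPR DIMM, while a request for a media temperature
injection can be rejected up front with a clear error instead of being
assumed to work.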

[1] : https://patchwork.kernel.org/project/linux-nvdimm/patch/163091917031.334.16212158243308361834.stgit@82313cf9f602/
---

Vaibhav Jain (2):
      libndctl, intel: Indicate supported smart-inject types
      libndctl/papr: Add limited support for inject-smart


 ndctl/inject-smart.c  | 33 ++++++++++++++++++-----
 ndctl/lib/intel.c     |  7 ++++-
 ndctl/lib/papr.c      | 61 +++++++++++++++++++++++++++++++++++++++++++
 ndctl/lib/papr_pdsm.h | 17 ++++++++++++
 ndctl/libndctl.h      |  8 ++++++
 5 files changed, 118 insertions(+), 8 deletions(-)


Comments

Vaibhav Jain Dec. 13, 2021, 5:34 a.m. UTC | #1
Hi Dan, Ira and Vishal,


Gentle reminder about this patch series. If there are any objections to
this please let us know.

Thanks,
~ Vaibhav