[v8,1/5] powerpc: Document details on H_SCM_HEALTH hcall

Message ID 20200527041244.37821-2-vaibhav@linux.ibm.com
State Superseded
Series
  • powerpc/papr_scm: Add support for reporting nvdimm health

Commit Message

Vaibhav Jain May 27, 2020, 4:12 a.m. UTC
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:
v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 45 ++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 4 deletions(-)

Comments

Dan Williams May 27, 2020, 6:56 p.m. UTC | #1
On Tue, May 26, 2020 at 9:13 PM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>
> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
> specification.
>

Please do a global s/SCM/PMEM/ or s/SCM/NVDIMM/. It's unfortunate that
we already have 2 ways to describe persistent memory devices, let's
not perpetuate a third so that "grep" has a chance to find
interrelated code across architectures. Other than that this looks
good to me.

Vaibhav Jain May 28, 2020, 7:24 p.m. UTC | #2
Thanks for looking into this patchset, Dan.


Dan Williams <dan.j.williams@intel.com> writes:

> On Tue, May 26, 2020 at 9:13 PM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>>
>> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
>> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
>> specification.
>>
>
> Please do a global s/SCM/PMEM/ or s/SCM/NVDIMM/. It's unfortunate that
> we already have 2 ways to describe persistent memory devices, let's
> not perpetuate a third so that "grep" has a chance to find
> interrelated code across architectures. Other than that this looks
> good to me.

Sure, I will use PAPR_NVDIMM instead of PAPR_SCM for the new code being
introduced. However, certain identifiers such as H_SCM_HEALTH are taken from
the PAPR specification and hence need to keep the same name.


Patch

diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..45063f305813 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,50 @@  from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
+(described in the table below) of the NVDIMM, and health-bit-valid-bitmap
+indicates which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering; for example, a value of 0xC400000000000000
+indicates that bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+|  Bit |               Definition                                              |
++======+=======================================================================+
+|  00  | SCM device is unable to persist memory contents.                      |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | SCM device failed to persist memory contents. Either contents were not|
+|      | saved successfully on power down or were not restored properly on     |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | SCM device contents are persisted from previous IPL. The data from    |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | SCM device contents are not persisted from previous IPL. There was no |
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | SCM device memory life remaining is critically low                    |
++------+-----------------------------------------------------------------------+
+|  05  | SCM device will be garded off next IPL due to failure                 |
++------+-----------------------------------------------------------------------+
+|  06  | SCM contents cannot persist due to current platform health status. A  |
+|      | hardware failure may prevent data from being saved or restored.       |
++------+-----------------------------------------------------------------------+
+|  07  | SCM device is unable to persist memory contents in certain conditions |
++------+-----------------------------------------------------------------------+
+|  08  | SCM device is encrypted                                               |
++------+-----------------------------------------------------------------------+
+|  09  | SCM device has successfully completed a requested erase or secure     |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+
 
 **H_SCM_PERFORMANCE_STATS**
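
The bit-ordering note in the patch above can be illustrated with a small C sketch. It assumes that "reverse bit ordering" means bit 0 is the most significant bit of the 64-bit register value, which is consistent with the 0xC400000000000000 example. The helper names papr_bit() and health_flag_set() are hypothetical and are not taken from the patch series; treating a flag as set only when the valid bitmap also marks it is likewise an assumption drawn from the description of health-bit-valid-bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the mask for PAPR bit number 'bit',
 * assuming bit 0 is the most significant bit of the 64-bit value.
 */
static inline uint64_t papr_bit(unsigned int bit)
{
	assert(bit < 64);
	return UINT64_C(1) << (63 - bit);
}

/*
 * Hypothetical helper: report a health flag as set only when the
 * valid bitmap marks the bit as valid and the health bitmap asserts it.
 */
static inline bool health_flag_set(uint64_t health, uint64_t valid,
				   unsigned int bit)
{
	uint64_t mask = papr_bit(bit);

	return (valid & mask) != 0 && (health & mask) != 0;
}
```

Under these assumptions, papr_bit(0) | papr_bit(1) | papr_bit(5) yields 0xC400000000000000, matching the example in the documentation text, and health_flag_set(health, valid, 4) would test the "memory life remaining is critically low" flag from the table.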