diff mbox series

powerpc/papr_scm: Reduce error severity if nvdimm stats inaccessible

Message ID 20210414124026.332472-1-vaibhav@linux.ibm.com (mailing list archive)
State Superseded
Headers show
Series powerpc/papr_scm: Reduce error severity if nvdimm stats inaccessible | expand

Commit Message

Vaibhav Jain April 14, 2021, 12:40 p.m. UTC
Currently drc_pmem_qeury_stats() generates a dev_err in case
"Enable Performance Information Collection" feature is disabled from
HMC. The error is of the form below:

papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
	 performance stats, Err:-10

This error message confuses users as it implies a possible problem
with the nvdimm even though its due to a disabled feature.

So we fix this by explicitly handling the H_AUTHORITY error from the
H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
error, saying that "Performance stats in-accessible".

Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Ira Weiny April 14, 2021, 3:36 p.m. UTC | #1
On Wed, Apr 14, 2021 at 06:10:26PM +0530, Vaibhav Jain wrote:
> Currently drc_pmem_qeury_stats() generates a dev_err in case
> "Enable Performance Information Collection" feature is disabled from
> HMC. The error is of the form below:
> 
> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
> 	 performance stats, Err:-10
> 
> This error message confuses users as it implies a possible problem
> with the nvdimm even though its due to a disabled feature.
> 
> So we fix this by explicitly handling the H_AUTHORITY error from the
> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
> error, saying that "Performance stats in-accessible".
> 
> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 835163f54244..9216424f8be3 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
>  		dev_err(&p->pdev->dev,
>  			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
>  		return -ENOENT;
> +	} else if (rc == H_AUTHORITY) {
> +		dev_warn(&p->pdev->dev, "Performance stats in-accessible");
> +		return -EPERM;

Is this because of a disabled feature or because of permissions?

Ira

>  	} else if (rc != H_SUCCESS) {
>  		dev_err(&p->pdev->dev,
>  			"Failed to query performance stats, Err:%lld\n", rc);
> -- 
> 2.30.2
>
Vaibhav Jain April 14, 2021, 4:21 p.m. UTC | #2
Thanks for looking into this patch Ira,

Ira Weiny <ira.weiny@intel.com> writes:

> On Wed, Apr 14, 2021 at 06:10:26PM +0530, Vaibhav Jain wrote:
>> Currently drc_pmem_qeury_stats() generates a dev_err in case
>> "Enable Performance Information Collection" feature is disabled from
>> HMC. The error is of the form below:
>> 
>> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
>> 	 performance stats, Err:-10
>> 
>> This error message confuses users as it implies a possible problem
>> with the nvdimm even though its due to a disabled feature.
>> 
>> So we fix this by explicitly handling the H_AUTHORITY error from the
>> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
>> error, saying that "Performance stats in-accessible".
>> 
>> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> index 835163f54244..9216424f8be3 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
>>  		dev_err(&p->pdev->dev,
>>  			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
>>  		return -ENOENT;
>> +	} else if (rc == H_AUTHORITY) {
>> +		dev_warn(&p->pdev->dev, "Performance stats in-accessible");
>> +		return -EPERM;
>
> Is this because of a disabled feature or because of permissions?

Its because of a disabled feature that revokes permission for a guest to
retrieve performance statistics.

The feature is called "Enable Performance Information Collection" and
once disabled the hcall H_SCM_PERFORMANCE_STATS returns an error
H_AUTHORITY indicating that the guest doesn't have permission to retrieve
performance statistics.

>
> Ira
>
>>  	} else if (rc != H_SUCCESS) {
>>  		dev_err(&p->pdev->dev,
>>  			"Failed to query performance stats, Err:%lld\n", rc);
>> -- 
>> 2.30.2
>>
Ira Weiny April 14, 2021, 9:24 p.m. UTC | #3
On Wed, Apr 14, 2021 at 09:51:40PM +0530, Vaibhav Jain wrote:
> Thanks for looking into this patch Ira,
> 
> Ira Weiny <ira.weiny@intel.com> writes:
> 
> > On Wed, Apr 14, 2021 at 06:10:26PM +0530, Vaibhav Jain wrote:
> >> Currently drc_pmem_qeury_stats() generates a dev_err in case
> >> "Enable Performance Information Collection" feature is disabled from
> >> HMC. The error is of the form below:
> >> 
> >> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
> >> 	 performance stats, Err:-10
> >> 
> >> This error message confuses users as it implies a possible problem
> >> with the nvdimm even though its due to a disabled feature.
> >> 
> >> So we fix this by explicitly handling the H_AUTHORITY error from the
> >> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
> >> error, saying that "Performance stats in-accessible".
> >> 
> >> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> >> ---
> >>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >> 
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 835163f54244..9216424f8be3 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
> >>  		dev_err(&p->pdev->dev,
> >>  			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
> >>  		return -ENOENT;
> >> +	} else if (rc == H_AUTHORITY) {
> >> +		dev_warn(&p->pdev->dev, "Performance stats in-accessible");
> >> +		return -EPERM;
> >
> > Is this because of a disabled feature or because of permissions?
> 
> Its because of a disabled feature that revokes permission for a guest to
> retrieve performance statistics.
> 
> The feature is called "Enable Performance Information Collection" and
> once disabled the hcall H_SCM_PERFORMANCE_STATS returns an error
> H_AUTHORITY indicating that the guest doesn't have permission to retrieve
> performance statistics.

In that case would it be appropriate to have the error message indicate a
permission issue?

Something like 'permission denied'?

Ira
Dan Williams April 14, 2021, 9:32 p.m. UTC | #4
On Wed, Apr 14, 2021 at 5:40 AM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>
> Currently drc_pmem_qeury_stats() generates a dev_err in case
> "Enable Performance Information Collection" feature is disabled from
> HMC. The error is of the form below:
>
> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
>          performance stats, Err:-10
>
> This error message confuses users as it implies a possible problem
> with the nvdimm even though its due to a disabled feature.
>
> So we fix this by explicitly handling the H_AUTHORITY error from the
> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
> error, saying that "Performance stats in-accessible".
>
> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 835163f54244..9216424f8be3 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
>                 dev_err(&p->pdev->dev,
>                         "Unknown performance stats, Err:0x%016lX\n", ret[0]);
>                 return -ENOENT;
> +       } else if (rc == H_AUTHORITY) {
> +               dev_warn(&p->pdev->dev, "Performance stats in-accessible");
> +               return -EPERM;

So userspace can spam the kernel log? Why is kernel log message needed
at all? EPERM told the caller what happened.
Vaibhav Jain April 15, 2021, 11:43 a.m. UTC | #5
Thanks for looking into this Dan,

Dan Williams <dan.j.williams@intel.com> writes:

> On Wed, Apr 14, 2021 at 5:40 AM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>>
>> Currently drc_pmem_qeury_stats() generates a dev_err in case
>> "Enable Performance Information Collection" feature is disabled from
>> HMC. The error is of the form below:
>>
>> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
>>          performance stats, Err:-10
>>
>> This error message confuses users as it implies a possible problem
>> with the nvdimm even though its due to a disabled feature.
>>
>> So we fix this by explicitly handling the H_AUTHORITY error from the
>> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
>> error, saying that "Performance stats in-accessible".
>>
>> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> index 835163f54244..9216424f8be3 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
>>                 dev_err(&p->pdev->dev,
>>                         "Unknown performance stats, Err:0x%016lX\n", ret[0]);
>>                 return -ENOENT;
>> +       } else if (rc == H_AUTHORITY) {
>> +               dev_warn(&p->pdev->dev, "Performance stats in-accessible");
>> +               return -EPERM;
>
> So userspace can spam the kernel log? Why is kernel log message needed
> at all? EPERM told the caller what happened.
Currently this error message is only reported during probe of the
nvdimm. So userspace cannot directly spam kernel log.

The callsite for this function in papr_scm_nvdimm_init() doesnt handle
specific error codes. Instead in case of an error it only reports that
"Dimm performance stats are unavailable". The log message just
preceeding that mentions the real cause of failure. Thats why just
returning -EPERM wont be usefui.

Alternatively I can update papr_scm_nvdimm_init() to report the error
code returned from drc_pmem_query_stats().
Vaibhav Jain April 15, 2021, 11:48 a.m. UTC | #6
Ira Weiny <ira.weiny@intel.com> writes:

> On Wed, Apr 14, 2021 at 09:51:40PM +0530, Vaibhav Jain wrote:
>> Thanks for looking into this patch Ira,
>> 
>> Ira Weiny <ira.weiny@intel.com> writes:
>> 
>> > On Wed, Apr 14, 2021 at 06:10:26PM +0530, Vaibhav Jain wrote:
>> >> Currently drc_pmem_qeury_stats() generates a dev_err in case
>> >> "Enable Performance Information Collection" feature is disabled from
>> >> HMC. The error is of the form below:
>> >> 
>> >> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
>> >> 	 performance stats, Err:-10
>> >> 
>> >> This error message confuses users as it implies a possible problem
>> >> with the nvdimm even though its due to a disabled feature.
>> >> 
>> >> So we fix this by explicitly handling the H_AUTHORITY error from the
>> >> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
>> >> error, saying that "Performance stats in-accessible".
>> >> 
>> >> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
>> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> >> ---
>> >>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
>> >>  1 file changed, 3 insertions(+)
>> >> 
>> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> >> index 835163f54244..9216424f8be3 100644
>> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> >> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
>> >>  		dev_err(&p->pdev->dev,
>> >>  			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
>> >>  		return -ENOENT;
>> >> +	} else if (rc == H_AUTHORITY) {
>> >> +		dev_warn(&p->pdev->dev, "Performance stats in-accessible");
>> >> +		return -EPERM;
>> >
>> > Is this because of a disabled feature or because of permissions?
>> 
>> Its because of a disabled feature that revokes permission for a guest to
>> retrieve performance statistics.
>> 
>> The feature is called "Enable Performance Information Collection" and
>> once disabled the hcall H_SCM_PERFORMANCE_STATS returns an error
>> H_AUTHORITY indicating that the guest doesn't have permission to retrieve
>> performance statistics.
>
> In that case would it be appropriate to have the error message indicate a
> permission issue?
>
> Something like 'permission denied'?

Yes, Something like "Permission denied while accessing performance
stats" might be more clear and intuitive.

Will update the warn message in v2.

>
> Ira
>
Dan Williams April 15, 2021, 5:03 p.m. UTC | #7
On Thu, Apr 15, 2021 at 4:44 AM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
>
> Thanks for looking into this Dan,
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
> > On Wed, Apr 14, 2021 at 5:40 AM Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
> >>
> >> Currently drc_pmem_qeury_stats() generates a dev_err in case
> >> "Enable Performance Information Collection" feature is disabled from
> >> HMC. The error is of the form below:
> >>
> >> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
> >>          performance stats, Err:-10
> >>
> >> This error message confuses users as it implies a possible problem
> >> with the nvdimm even though its due to a disabled feature.
> >>
> >> So we fix this by explicitly handling the H_AUTHORITY error from the
> >> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
> >> error, saying that "Performance stats in-accessible".
> >>
> >> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
> >> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> >> ---
> >>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> >> index 835163f54244..9216424f8be3 100644
> >> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> >> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> >> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
> >>                 dev_err(&p->pdev->dev,
> >>                         "Unknown performance stats, Err:0x%016lX\n", ret[0]);
> >>                 return -ENOENT;
> >> +       } else if (rc == H_AUTHORITY) {
> >> +               dev_warn(&p->pdev->dev, "Performance stats in-accessible");
> >> +               return -EPERM;
> >
> > So userspace can spam the kernel log? Why is kernel log message needed
> > at all? EPERM told the caller what happened.
> Currently this error message is only reported during probe of the
> nvdimm. So userspace cannot directly spam kernel log.

Oh, ok, I saw things like papr_pdsm_fuel_gauge() in the call stack and
thought this was reachable through an ioctl. Sorry for the noise.
kajoljain April 18, 2021, 7:58 a.m. UTC | #8
On 4/15/21 5:18 PM, Vaibhav Jain wrote:
> Ira Weiny <ira.weiny@intel.com> writes:
> 
>> On Wed, Apr 14, 2021 at 09:51:40PM +0530, Vaibhav Jain wrote:
>>> Thanks for looking into this patch Ira,
>>>
>>> Ira Weiny <ira.weiny@intel.com> writes:
>>>
>>>> On Wed, Apr 14, 2021 at 06:10:26PM +0530, Vaibhav Jain wrote:
>>>>> Currently drc_pmem_qeury_stats() generates a dev_err in case
>>>>> "Enable Performance Information Collection" feature is disabled from
>>>>> HMC. The error is of the form below:
>>>>>
>>>>> papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Failed to query
>>>>> 	 performance stats, Err:-10
>>>>>
>>>>> This error message confuses users as it implies a possible problem
>>>>> with the nvdimm even though its due to a disabled feature.
>>>>>
>>>>> So we fix this by explicitly handling the H_AUTHORITY error from the
>>>>> H_SCM_PERFORMANCE_STATS hcall and generating a warning instead of an
>>>>> error, saying that "Performance stats in-accessible".
>>>>>
>>>>> Fixes: 2d02bf835e57('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
>>>>> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>>>>> ---
>>>>>  arch/powerpc/platforms/pseries/papr_scm.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>>>>> index 835163f54244..9216424f8be3 100644
>>>>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>>>>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>>>>> @@ -277,6 +277,9 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
>>>>>  		dev_err(&p->pdev->dev,
>>>>>  			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
>>>>>  		return -ENOENT;
>>>>> +	} else if (rc == H_AUTHORITY) {
>>>>> +		dev_warn(&p->pdev->dev, "Performance stats in-accessible");
>>>>> +		return -EPERM;
>>>>
>>>> Is this because of a disabled feature or because of permissions?
>>>
>>> Its because of a disabled feature that revokes permission for a guest to
>>> retrieve performance statistics.
>>>
>>> The feature is called "Enable Performance Information Collection" and
>>> once disabled the hcall H_SCM_PERFORMANCE_STATS returns an error
>>> H_AUTHORITY indicating that the guest doesn't have permission to retrieve
>>> performance statistics.
>>
>> In that case would it be appropriate to have the error message indicate a
>> permission issue?
>>
>> Something like 'permission denied'?
> 
> Yes, Something like "Permission denied while accessing performance
> stats" might be more clear and intuitive.

Hi Vaibhav,
   Thanks for the patch. I agree with Ira and above warning message with "Permission denied" looks more clear.
With that change, patch looks good to me.

Reviewed-By: Kajol Jain<kjain@linux.ibm.com>

Thanks,
Kajol Jain
> 
> Will update the warn message in v2.
> 
>>
>> Ira
>>
>
diff mbox series

Patch

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 835163f54244..9216424f8be3 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -277,6 +277,9 @@  static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
 		dev_err(&p->pdev->dev,
 			"Unknown performance stats, Err:0x%016lX\n", ret[0]);
 		return -ENOENT;
+	} else if (rc == H_AUTHORITY) {
+		dev_warn(&p->pdev->dev, "Performance stats in-accessible");
+		return -EPERM;
 	} else if (rc != H_SUCCESS) {
 		dev_err(&p->pdev->dev,
 			"Failed to query performance stats, Err:%lld\n", rc);