diff mbox series

[4/4] platform/x86: intel_telemetry: report debugfs failure

Message ID 20180903180415.31575-4-rajneesh.bhardwaj@linux.intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: Andy Shevchenko
Headers show
Series [1/4] platform/x86: intel_pmc_core: Show Latency Tolerance info | expand

Commit Message

Bhardwaj, Rajneesh Sept. 3, 2018, 6:04 p.m. UTC
On some Goldmont based systems such as ASRock J3455M the BIOS may not
enable the IPC1 device that provides access to the PMC and PUNIT. In
such scenarios, the ioss and pss resources from the platform device can
not be obtained and result in a invalid telemetry_plt_config.
This is also applicable to the platforms where the BIOS supports IPC1
device under debug configurations but IPC1 is disabled by user or the
policy.

This change allows user to know the reason for not seeing entries under
/sys/kernel/debug/telemetry/* when there is no apparent failure at boot.

Cc: Matt Turner <matt.turner@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@intel.com>

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198779

Acked-by: Matt Turner <matt.turner@intel.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
---
 drivers/platform/x86/intel_telemetry_debugfs.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Andy Shevchenko Sept. 26, 2018, 1:56 p.m. UTC | #1
On Mon, Sep 3, 2018 at 9:05 PM Rajneesh Bhardwaj
<rajneesh.bhardwaj@linux.intel.com> wrote:
>
> On some Goldmont based systems such as ASRock J3455M the BIOS may not
> enable the IPC1 device that provides access to the PMC and PUNIT. In
> such scenarios, the ioss and pss resources from the platform device can

IOSS
PSS

> not be obtained and result in a invalid telemetry_plt_config.

What is telemetry_plt_config?

> This is also applicable to the platforms where the BIOS supports IPC1
> device under debug configurations but IPC1 is disabled by user or the
> policy.
>
> This change allows user to know the reason for not seeing entries under
> /sys/kernel/debug/telemetry/* when there is no apparent failure at boot.
>
> Cc: Matt Turner <matt.turner@intel.com>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@intel.com>
>

> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198779
>
There should be not a blank line.

> Acked-by: Matt Turner <matt.turner@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>

> +exit:
> +       pr_debug(pr_fmt(DRIVER_NAME) " Failed\n");

Completely useless.

Device core does it in generic way.
Bhardwaj, Rajneesh Sept. 26, 2018, 2:24 p.m. UTC | #2
On 26-Sep-18 7:26 PM, Andy Shevchenko wrote:
> On Mon, Sep 3, 2018 at 9:05 PM Rajneesh Bhardwaj
> <rajneesh.bhardwaj@linux.intel.com> wrote:
>> On some Goldmont based systems such as ASRock J3455M the BIOS may not
>> enable the IPC1 device that provides access to the PMC and PUNIT. In
>> such scenarios, the ioss and pss resources from the platform device can
> IOSS
> PSS

Fine.

>
>> not be obtained and result in a invalid telemetry_plt_config.
> What is telemetry_plt_config?

Internal data structure that holds platform config, maintained by the 
telemetry platform driver.

>
>> This is also applicable to the platforms where the BIOS supports IPC1
>> device under debug configurations but IPC1 is disabled by user or the
>> policy.
>>
>> This change allows user to know the reason for not seeing entries under
>> /sys/kernel/debug/telemetry/* when there is no apparent failure at boot.
>>
>> Cc: Matt Turner <matt.turner@intel.com>
>> Cc: Len Brown <len.brown@intel.com>
>> Cc: Souvik Kumar Chakravarty <souvik.k.chakravarty@intel.com>
>> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@intel.com>
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198779
>>
> There should be not a blank line.

OK.

>
>> Acked-by: Matt Turner <matt.turner@intel.com>
>> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
>> +exit:
>> +       pr_debug(pr_fmt(DRIVER_NAME) " Failed\n");
> Completely useless.
>
> Device core does it in generic way.

If i remove this print then perhaps there is no need of this patch. 
Reason to print this is that the platform driver / core driver does not 
show any error. In-fact they are even loaded in module table. OTOH, this 
debugfs interface fails. This is very confusing to the users if they 
check the lsmod output so i feel this print might help.

>
Andy Shevchenko Sept. 26, 2018, 5:18 p.m. UTC | #3
On Wed, Sep 26, 2018 at 5:24 PM Bhardwaj, Rajneesh
<rajneesh.bhardwaj@linux.intel.com> wrote:
> On 26-Sep-18 7:26 PM, Andy Shevchenko wrote:
> > On Mon, Sep 3, 2018 at 9:05 PM Rajneesh Bhardwaj
> > <rajneesh.bhardwaj@linux.intel.com> wrote:

> >> not be obtained and result in a invalid telemetry_plt_config.
> > What is telemetry_plt_config?
>
> Internal data structure that holds platform config, maintained by the
> telemetry platform driver.

You need to spell if for the reader.

> >> This is also applicable to the platforms where the BIOS supports IPC1
> >> device under debug configurations but IPC1 is disabled by user or the
> >> policy.
> >>
> >> This change allows user to know the reason for not seeing entries under
> >> /sys/kernel/debug/telemetry/* when there is no apparent failure at boot.

> >> +exit:
> >> +       pr_debug(pr_fmt(DRIVER_NAME) " Failed\n");
> > Completely useless.
> >
> > Device core does it in generic way.
>
> If i remove this print then perhaps there is no need of this patch.

Maybe.

> Reason to print this is that the platform driver / core driver does not
> show any error.

If the code fails and returns 0 — it's a bug in error reporting inside the code.

> In-fact they are even loaded in module table. OTOH, this
> debugfs interface fails. This is very confusing to the users if they
> check the lsmod output so i feel this print might help.

Again, device core *already has* this and even more (it prints also a
return code!).
Rajneesh Bhardwaj Sept. 28, 2018, 9:10 a.m. UTC | #4
On Fri, Sep 28, 2018 at 02:44:06PM +0530, Bhardwaj, Rajneesh wrote:

Resending it since previous message had few HTML contents so it was not
delivered to the list.

>
>
> On 26-Sep-18 10:48 PM, Andy Shevchenko wrote:
> >On Wed, Sep 26, 2018 at 5:24 PM Bhardwaj, Rajneesh
> ><rajneesh.bhardwaj@linux.intel.com> wrote:
> >>On 26-Sep-18 7:26 PM, Andy Shevchenko wrote:
> >>>On Mon, Sep 3, 2018 at 9:05 PM Rajneesh Bhardwaj
> >>><rajneesh.bhardwaj@linux.intel.com> wrote:
> >>>>not be obtained and result in a invalid telemetry_plt_config.
> >>>What is telemetry_plt_config?
> >>Internal data structure that holds platform config, maintained by the
> >>telemetry platform driver.
> >You need to spell if for the reader.
>
> Sure, thanks for the suggestion. I will do it if you agree to my explanation
> below and require v2.
>
> >
> >>>>This is also applicable to the platforms where the BIOS supports IPC1
> >>>>device under debug configurations but IPC1 is disabled by user or the
> >>>>policy.
> >>>>
> >>>>This change allows user to know the reason for not seeing entries under
> >>>>/sys/kernel/debug/telemetry/* when there is no apparent failure at boot.
> >>>>+exit:
> >>>>+       pr_debug(pr_fmt(DRIVER_NAME) " Failed\n");
> >>>Completely useless.
> >>>
> >>>Device core does it in generic way.
> >>If i remove this print then perhaps there is no need of this patch.
> >Maybe.
> >
> >
> >
> >>Reason to print this is that the platform driver / core driver does not
> >>show any error.
> >If the code fails and returns 0 — it's a bug in error reporting inside the code.
>

Below are my comments as per previous mail.

> The existing Telemetry ecosystem for Atom consists of IPC driver, telem core
> driver and telem platform driver but there are no consumers of the APIs
> except the ones in telem debugfs driver. Here in this case, not all drivers
> that make up the telemetry infra return failure. Here is how the system
> looks w/wo IPC1 setting in the BIOS.
>
> When IPC is present
>
> root@raj-glk:~# lsmod | grep -i telem
>
> intel_telemetry_debugfs245760
>
> intel_telemetry_pltdrv204800
>
> intel_punit_ipc163841 intel_telemetry_pltdrv
>
> intel_telemetry_core163842 intel_telemetry_debugfs,intel_telemetry_pltdrv
>
> intel_pmc_ipc204802 intel_telemetry_debugfs,intel_telemetry_pltdrv
>
> When IPC is missing
>
> root@raj-glk:~# lsmod | grep -i telem
>
> intel_telemetry_pltdrv204800
>
> intel_punit_ipc163841 intel_telemetry_pltdrv
>
> intel_telemetry_core163841 intel_telemetry_pltdrv
>
> intel_pmc_ipc204801 intel_telemetry_pltdrv
>
>
> If we look at the dmesg log, we see  "intel_telemetry_core Init". So one
> might think that the driver has loaded fine since user may not know that
> telemetry is more than just one driver. Such things are often reported by
> users of this driver.
> So IMHO, keeping this error helps user triage the problem easily. I can
> change to pr_info you'd like it.
>
> >
> >>In-fact they are even loaded in module table. OTOH, this
> >>debugfs interface fails. This is very confusing to the users if they
> >>check the lsmod output so i feel this print might help.
> >Again, device core *already has* this and even more (it prints also a
> >return code!).
> >
>
diff mbox series

Patch

diff --git a/drivers/platform/x86/intel_telemetry_debugfs.c b/drivers/platform/x86/intel_telemetry_debugfs.c
index ffd0474b0531..77212e9b22d6 100644
--- a/drivers/platform/x86/intel_telemetry_debugfs.c
+++ b/drivers/platform/x86/intel_telemetry_debugfs.c
@@ -951,12 +951,16 @@  static int __init telemetry_debugfs_init(void)
 	debugfs_conf = (struct telemetry_debugfs_conf *)id->driver_data;
 
 	err = telemetry_pltconfig_valid();
-	if (err < 0)
-		return -ENODEV;
+	if (err < 0) {
+		pr_debug("Invalid pltconfig, ensure IPC1 device is enabled in BIOS\n");
+		goto exit;
+	}
 
 	err = telemetry_debugfs_check_evts();
-	if (err < 0)
-		return -EINVAL;
+	if (err < 0) {
+		pr_debug("telemetry_debugfs_check_evts failed\n");
+		goto exit;
+	}
 
 	register_pm_notifier(&pm_notifier);
 
@@ -1020,6 +1024,8 @@  static int __init telemetry_debugfs_init(void)
 	debugfs_conf->telemetry_dbg_dir = NULL;
 out_pm:
 	unregister_pm_notifier(&pm_notifier);
+exit:
+	pr_debug(pr_fmt(DRIVER_NAME) " Failed\n");
 
 	return err;
 }