diff mbox

drm/i915: Handle msr read failure gracefully

Message ID 87o9s7zrx3.fsf@dilma.collabora.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Gabriel Krisman Bertazi July 26, 2017, 5:30 a.m. UTC
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Gabriel Krisman Bertazi (2017-07-25 19:19:22)
>>         power = (power & 0x1f00) >> 8;
>>         units = 1000000 / (1 << power); /* convert to uJ */
>>         power = I915_READ(MCH_SECP_NRG_STTS);
>
> Just after this is a useless cast. Though it will be neater to kill the
> (long long unsigned) and s/u64/unsigned long long/ so that we are
> consistent with the rdmsrl_safe interface.
>
> Also we should use 1u << power as we allow power to be 31, or better yet
> use:
>
> 	units = (power & 0x1f00) >> 8;
> 	power = I915_READ(MCH_SECP_NRG_STTS);
> 	power = (100000 * power) >> units; /* convert to uJ */

Hi Chris,

Thanks for your review.  I have added your suggestions on a v2 of the
patch below.

>8
Subject: [PATCH] drm/i915: Handle msr read failure gracefully

When reading the i915_energy_uJ debugfs file, it tries to fetch
MSR_RAPL_POWER_UNIT, which might not be available, like in a vm
environment, causing the exception shown below.

We can easily prevent it by doing a rdmsrl_safe read instead, which will
handle the exception, allowing us to abort the debugfs file read.

This was caught by the new igt@debugfs_test@read_all_entries testcase in
the CI.

  unchecked MSR access error: RDMSR from 0x606 at rIP:0xffffffffa0078f66
    (i915_energy_uJ+0x36/0xb0 [i915])
  Call Trace:
   seq_read+0xdc/0x3a0
   full_proxy_read+0x4f/0x70
   __vfs_read+0x23/0x120
   ? putname+0x4f/0x60
   ? rcu_read_lock_sched_held+0x75/0x80
   ? entry_SYSCALL_64_fastpath+0x5/0xb1
   vfs_read+0xa0/0x150
   SyS_read+0x44/0xb0
   entry_SYSCALL_64_fastpath+0x1c/0xb1
  RIP: 0033:0x7f1f5e9f4500
  RSP: 002b:00007ffc77e65cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  RAX: ffffffffffffffda RBX: ffffffff8146e003 RCX: 00007f1f5e9f4500
  RDX: 0000000000000200 RSI: 00007ffc77e65d10 RDI: 0000000000000006
  RBP: ffffc900007abf88 R08: 0000000001eaff20 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  R13: 0000000000000006 R14: 0000000000000005 R15: 0000000001eb94db
   ? __this_cpu_preempt_check+0x13/0x20

v2:
 - Drop unsigned long long cast and improve calculation (Chris)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101901
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

Chris Wilson July 26, 2017, 8:39 a.m. UTC | #1
Quoting Gabriel Krisman Bertazi (2017-07-26 06:30:16)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Quoting Gabriel Krisman Bertazi (2017-07-25 19:19:22)
> >>         power = (power & 0x1f00) >> 8;
> >>         units = 1000000 / (1 << power); /* convert to uJ */
> >>         power = I915_READ(MCH_SECP_NRG_STTS);
> >
> > Just after this is a useless cast. Though it will be neater to kill the
> > (long long unsigned) and s/u64/unsigned long long/ so that we are
> > consistent with the rdmsrl_safe interface.
> >
> > Also we should use 1u << power as we allow power to be 31, or better yet
> > use:
> >
> >       units = (power & 0x1f00) >> 8;
> >       power = I915_READ(MCH_SECP_NRG_STTS);
> >       power = (100000 * power) >> units; /* convert to uJ */
> 
> Hi Chris,
> 
> Thanks for your review.  I have added your suggestions on a v2 of the
> patch below.
> 
> >8
> Subject: [PATCH] drm/i915: Handle msr read failure gracefully
> 
> When reading the i915_energy_uJ debugfs file, it tries to fetch
> MSR_RAPL_POWER_UNIT, which might not be available, like in a vm
> environment, causing the exception shown below.
> 
> We can easily prevent it by doing a rdmsrl_safe read instead, which will
> handle the exception, allowing us to abort the debugfs file read.
> 
> This was caught by the new igt@debugfs_test@read_all_entries testcase in
> the CI.
> 
>   unchecked MSR access error: RDMSR from 0x606 at rIP:0xffffffffa0078f66
>     (i915_energy_uJ+0x36/0xb0 [i915])
>   Call Trace:
>    seq_read+0xdc/0x3a0
>    full_proxy_read+0x4f/0x70
>    __vfs_read+0x23/0x120
>    ? putname+0x4f/0x60
>    ? rcu_read_lock_sched_held+0x75/0x80
>    ? entry_SYSCALL_64_fastpath+0x5/0xb1
>    vfs_read+0xa0/0x150
>    SyS_read+0x44/0xb0
>    entry_SYSCALL_64_fastpath+0x1c/0xb1
>   RIP: 0033:0x7f1f5e9f4500
>   RSP: 002b:00007ffc77e65cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>   RAX: ffffffffffffffda RBX: ffffffff8146e003 RCX: 00007f1f5e9f4500
>   RDX: 0000000000000200 RSI: 00007ffc77e65d10 RDI: 0000000000000006
>   RBP: ffffc900007abf88 R08: 0000000001eaff20 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>   R13: 0000000000000006 R14: 0000000000000005 R15: 0000000001eb94db
>    ? __this_cpu_preempt_check+0x13/0x20
> 
> v2:
>  - Drop unsigned long long cast and improve calculation (Chris)
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101901
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Chris Wilson July 26, 2017, 11:44 a.m. UTC | #2
Quoting Chris Wilson (2017-07-26 09:39:31)
> Quoting Gabriel Krisman Bertazi (2017-07-26 06:30:16)
> > When reading the i915_energy_uJ debugfs file, it tries to fetch
> > MSR_RAPL_POWER_UNIT, which might not be available, like in a vm
> > environment, causing the exception shown below.
> > 
> > We can easily prevent it by doing a rdmsrl_safe read instead, which will
> > handle the exception, allowing us to abort the debugfs file read.
> > 
> > This was caught by the new igt@debugfs_test@read_all_entries testcase in
> > the CI.
> > 
> >   unchecked MSR access error: RDMSR from 0x606 at rIP:0xffffffffa0078f66
> >     (i915_energy_uJ+0x36/0xb0 [i915])
> >   Call Trace:
> >    seq_read+0xdc/0x3a0
> >    full_proxy_read+0x4f/0x70
> >    __vfs_read+0x23/0x120
> >    ? putname+0x4f/0x60
> >    ? rcu_read_lock_sched_held+0x75/0x80
> >    ? entry_SYSCALL_64_fastpath+0x5/0xb1
> >    vfs_read+0xa0/0x150
> >    SyS_read+0x44/0xb0
> >    entry_SYSCALL_64_fastpath+0x1c/0xb1
> >   RIP: 0033:0x7f1f5e9f4500
> >   RSP: 002b:00007ffc77e65cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> >   RAX: ffffffffffffffda RBX: ffffffff8146e003 RCX: 00007f1f5e9f4500
> >   RDX: 0000000000000200 RSI: 00007ffc77e65d10 RDI: 0000000000000006
> >   RBP: ffffc900007abf88 R08: 0000000001eaff20 R09: 0000000000000000
> >   R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> >   R13: 0000000000000006 R14: 0000000000000005 R15: 0000000001eb94db
> >    ? __this_cpu_preempt_check+0x13/0x20
> > 
> > v2:
> >  - Drop unsigned long long cast and improve calculation (Chris)
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101901
> > Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

And pushed, thanks for the fix.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c25f42c60d61..1dba3a2be849 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2783,7 +2783,7 @@  static int i915_sink_crc(struct seq_file *m, void *data)
 static int i915_energy_uJ(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	u64 power;
+	unsigned long long power;
 	u32 units;
 
 	if (INTEL_GEN(dev_priv) < 6)
@@ -2791,15 +2791,18 @@  static int i915_energy_uJ(struct seq_file *m, void *data)
 
 	intel_runtime_pm_get(dev_priv);
 
-	rdmsrl(MSR_RAPL_POWER_UNIT, power);
-	power = (power & 0x1f00) >> 8;
-	units = 1000000 / (1 << power); /* convert to uJ */
+	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
+		intel_runtime_pm_put(dev_priv);
+		return -ENODEV;
+	}
+
+	units = (power & 0x1f00) >> 8;
 	power = I915_READ(MCH_SECP_NRG_STTS);
-	power *= units;
+	power = (1000000 * power) >> units; /* convert to uJ */
 
 	intel_runtime_pm_put(dev_priv);
 
-	seq_printf(m, "%llu", (long long unsigned)power);
+	seq_printf(m, "%llu", power);
 
 	return 0;
 }