diff mbox

[v2] drm/i915/vma: Correct use after free in eviction

Message ID 1376684973-14743-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ben Widawsky Aug. 16, 2013, 8:29 p.m. UTC
The vma will [possibly] be destroyed during unbind in eviction.
Immediately after this, we try to delete the list entry.

Chris and Ville did the debug on this before I woke up, I just get to
take credit for the fix :p

v2: Missed the drm_object_unreference use after free (Ville)

Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Chris Wilson Aug. 16, 2013, 10:31 p.m. UTC | #1
On Fri, Aug 16, 2013 at 01:29:33PM -0700, Ben Widawsky wrote:
> The vma will [possibly] be destroyed during unbind in eviction.
> Immediately after this, we try to delete the list entry.
> 
> Chris and Ville did the debug on this before I woke up, I just get to
> take credit for the fix :p
> 
> v2: Missed the drm_object_unreference use after free (Ville)
> 
> Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Should have spotted this earlier :(

Can I have a couple more lines of whitespace?
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Daniel Vetter Aug. 18, 2013, 5:26 p.m. UTC | #2
On Fri, Aug 16, 2013 at 11:31:12PM +0100, Chris Wilson wrote:
> On Fri, Aug 16, 2013 at 01:29:33PM -0700, Ben Widawsky wrote:
> > The vma will [possibly] be destroyed during unbind in eviction.
> > Immediately after this, we try to delete the list entry.
> > 
> > Chris and Ville did the debug on this before I woke up, I just get to
> > take credit for the fix :p
> > 
> > v2: Missed the drm_object_unreference use after free (Ville)
> > 
> > Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Since the commit message lacks that crucial piece of information: How was
this discovered? I use that to tune my gut feeling for gauging the
-nightly test effectiveness ...

> Should have spotted this earlier :(
> 
> Can I have a couple more lines of whitespace?
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Queued for -next, thanks for the patch.
-Daniel
Ben Widawsky Aug. 18, 2013, 10:35 p.m. UTC | #3
On Sun, Aug 18, 2013 at 07:26:57PM +0200, Daniel Vetter wrote:
> On Fri, Aug 16, 2013 at 11:31:12PM +0100, Chris Wilson wrote:
> > On Fri, Aug 16, 2013 at 01:29:33PM -0700, Ben Widawsky wrote:
> > > The vma will [possibly] be destroyed during unbind in eviction.
> > > Immediately after this, we try to delete the list entry.
> > > 
> > > Chris and Ville did the debug on this before I woke up, I just get to
> > > take credit for the fix :p
> > > 
> > > v2: Missed the drm_object_unreference use after free (Ville)
> > > 
> > > Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Since the commit message lacks that crucial piece of information: How was
> this discovered? I use that to tune my gut feeling for gauging the
> -nightly test effectiveness ...

Mika pasted an oops on #intel-gfx. Chris and Ville had is solved before
I woke up. It's pretty strange, Chris said the bug existed in the
original ppgtt2 branch (I'm too lazy to check). In many runs for myself,
and QA, I'd not seen the oops though. I really can't explain it.
Daniel Vetter Aug. 19, 2013, 6:39 a.m. UTC | #4
On Mon, Aug 19, 2013 at 12:35 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sun, Aug 18, 2013 at 07:26:57PM +0200, Daniel Vetter wrote:
>> On Fri, Aug 16, 2013 at 11:31:12PM +0100, Chris Wilson wrote:
>> > On Fri, Aug 16, 2013 at 01:29:33PM -0700, Ben Widawsky wrote:
>> > > The vma will [possibly] be destroyed during unbind in eviction.
>> > > Immediately after this, we try to delete the list entry.
>> > >
>> > > Chris and Ville did the debug on this before I woke up, I just get to
>> > > take credit for the fix :p
>> > >
>> > > v2: Missed the drm_object_unreference use after free (Ville)
>> > >
>> > > Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
>> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>
>> Since the commit message lacks that crucial piece of information: How was
>> this discovered? I use that to tune my gut feeling for gauging the
>> -nightly test effectiveness ...
>
> Mika pasted an oops on #intel-gfx. Chris and Ville had is solved before
> I woke up. It's pretty strange, Chris said the bug existed in the
> original ppgtt2 branch (I'm too lazy to check). In many runs for myself,
> and QA, I'd not seen the oops though. I really can't explain it.

Thanks for the explanation. Please add such information (including the
full Oops) to the commit message next time around. I've asked Mika for
the backtrace meanwhile.

I guess this is another candidate for a testcase - if you and QA have
beat on this for a while and couldn't hit this bug we need to try
harder to hit bugs ;-)
-Daniel
Mika Kuoppala Aug. 19, 2013, 8:49 a.m. UTC | #5
Daniel Vetter <daniel@ffwll.ch> writes:

> On Mon, Aug 19, 2013 at 12:35 AM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> On Sun, Aug 18, 2013 at 07:26:57PM +0200, Daniel Vetter wrote:
>>> On Fri, Aug 16, 2013 at 11:31:12PM +0100, Chris Wilson wrote:
>>> > On Fri, Aug 16, 2013 at 01:29:33PM -0700, Ben Widawsky wrote:
>>> > > The vma will [possibly] be destroyed during unbind in eviction.
>>> > > Immediately after this, we try to delete the list entry.
>>> > >
>>> > > Chris and Ville did the debug on this before I woke up, I just get to
>>> > > take credit for the fix :p
>>> > >
>>> > > v2: Missed the drm_object_unreference use after free (Ville)
>>> > >
>>> > > Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
>>> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>>
>>> Since the commit message lacks that crucial piece of information: How was
>>> this discovered? I use that to tune my gut feeling for gauging the
>>> -nightly test effectiveness ...
>>
>> Mika pasted an oops on #intel-gfx. Chris and Ville had is solved before
>> I woke up. It's pretty strange, Chris said the bug existed in the
>> original ppgtt2 branch (I'm too lazy to check). In many runs for myself,
>> and QA, I'd not seen the oops though. I really can't explain it.
>
> Thanks for the explanation. Please add such information (including the
> full Oops) to the commit message next time around. I've asked Mika for
> the backtrace meanwhile.

Here is the trace:

[  403.472448] BUG: unable to handle kernel paging request at 6b6b6b6b
[  403.472473] IP: [<c12c1500>] __list_del_entry+0x20/0xe0
[  403.472514] *pdpt = 000000002e89c001 *pde = 0000000000000000 
[  403.472556] Oops: 0000 [#1] SMP 
[  403.472582] Modules linked in: mxm_wmi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi psmouse snd_seq_midi_event snd_seq serio_raw snd_timer snd_seq_device snd soundcore snd_page_alloc wmi bnep rfcomm bluetooth mac_hid parport_pc ppdev lp parport usbhid dm_crypt firewire_ohci firewire_core crc_itu_t i915 drm_kms_helper e1000e ptp drm i2c_algo_bit pps_core xhci_hcd video
[  403.472895] CPU: 2 PID: 1940 Comm: Xorg Not tainted 3.11.0-rc2+ #827
[  403.472938] Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0070.2012.0416.2117 04/16/2012
[  403.473002] task: ec866c00 ti: ee6a2000 task.ti: ee6a2000
[  403.473039] EIP: 0060:[<c12c1500>] EFLAGS: 00013202 CPU: 2
[  403.473078] EIP is at __list_del_entry+0x20/0xe0
[  403.473109] EAX: f016d9bc EBX: f016d9bc ECX: 6b6b6b6b EDX: 6b6b6b6b
[  403.473151] ESI: 00000000 EDI: ee6a3c90 EBP: ee6a3c60 ESP: ee6a3c48
[  403.473193]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  403.473230] CR0: 80050033 CR2: 6b6b6b6b CR3: 2ec43000 CR4: 001407f0
[  403.473271] Stack:
[  403.473285]  f63b2ff0 f61f98c0 f61f8000 f016d9bc 00000000 f016d9bc ee6a3cac f8519a4a
[  403.473347]  00000000 00000000 10000000 f61f8000 0100a000 10000000 00000001 008ca000
[  403.473410]  f64ee840 f61f98c0 f016d9bc f016dcec ee6a3c98 ee6a3c98 f61f98c0 dcc58f00
[  403.473472] Call Trace:
[  403.473509]  [<f8519a4a>] i915_gem_evict_something+0x17a/0x2d0 [i915]
[  403.473567]  [<f8516ed1>] i915_gem_object_pin+0x271/0x660 [i915]
[  403.473622]  [<f851c740>] ? i915_ggtt_clear_range+0x20/0x20 [i915]
[  403.473676]  [<f8517afa>] i915_gem_object_pin_to_display_plane+0xda/0x190 [i915]
[  403.473742]  [<f852d9fa>] intel_pin_and_fence_fb_obj+0xba/0x140 [i915]
[  403.473800]  [<f852db40>] intel_gen7_queue_flip+0x30/0x1c0 [i915]
[  403.473856]  [<f85337b0>] intel_crtc_page_flip+0x1a0/0x320 [i915]
[  403.473911]  [<f847b549>] ? drm_framebuffer_reference+0x39/0x80 [drm]
[  403.473965]  [<f847f9fb>] drm_mode_page_flip_ioctl+0x28b/0x320 [drm]
[  403.474018]  [<f846fec8>] drm_ioctl+0x4b8/0x560 [drm]
[  403.474064]  [<f847f770>] ? drm_mode_gamma_get_ioctl+0xd0/0xd0 [drm]
[  403.474113]  [<c1140f8a>] ? do_sync_read+0x6a/0xa0
[  403.474154]  [<f846fa10>] ? drm_copy_field+0x80/0x80 [drm]
[  403.474193]  [<c115134c>] do_vfs_ioctl+0x7c/0x5b0
[  403.474228]  [<c1141d2f>] ? vfs_read+0xef/0x160
[  403.474263]  [<c108dcbb>] ? ktime_get_ts+0x4b/0x120
[  403.474298]  [<c1151917>] SyS_ioctl+0x97/0xa0
[  403.474330]  [<c1590bc1>] sysenter_do_call+0x12/0x22
[  403.474364] Code: 55 f4 8b 45 f8 e9 75 ff ff ff 90 55 89 e5 53 83 ec 14 8b 08 8b 50 04 81 f9 00 01 10 00 74 24 81 fa 00 02 20 00 0f 84 8e 00 00 00 <8b> 1a 39 d8 75 62 8b 59 04 39 d8 75 35 89 51 04 89 0a 83 c4 14
[  403.474566] EIP: [<c12c1500>] __list_del_entry+0x20/0xe0 SS:ESP 0068:ee6a3c48
[  403.476513] CR2: 000000006b6b6b6b
Daniel Vetter Aug. 19, 2013, 9:01 a.m. UTC | #6
On Mon, Aug 19, 2013 at 10:49 AM, Mika Kuoppala
<mika.kuoppala@linux.intel.com> wrote:
>>> Mika pasted an oops on #intel-gfx. Chris and Ville had is solved before
>>> I woke up. It's pretty strange, Chris said the bug existed in the
>>> original ppgtt2 branch (I'm too lazy to check). In many runs for myself,
>>> and QA, I'd not seen the oops though. I really can't explain it.
>>
>> Thanks for the explanation. Please add such information (including the
>> full Oops) to the commit message next time around. I've asked Mika for
>> the backtrace meanwhile.
>
> Here is the trace:

Thanks, added to the commit message.
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 0cbaad4..3b7b74e 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -136,14 +136,17 @@  found:
 
 	/* Unbinding will emit any required flushes */
 	while (!list_empty(&eviction_list)) {
+		struct drm_gem_object *obj;
 		vma = list_first_entry(&eviction_list,
 				       struct i915_vma,
 				       exec_list);
+
+		obj =  &vma->obj->base;
+		list_del_init(&vma->exec_list);
 		if (ret == 0)
 			ret = i915_vma_unbind(vma);
 
-		list_del_init(&vma->exec_list);
-		drm_gem_object_unreference(&vma->obj->base);
+		drm_gem_object_unreference(obj);
 	}
 
 	return ret;