drm/i915: Revert i915.semaphores=1 default from 47ae63e0

Message ID efc5a05bb650e18fb9fb6976d0f8a999536e38f0.1305303009.git.luto@mit.edu (mailing list archive)
State New, archived

Commit Message

Andrew Lutomirski May 13, 2011, 4:14 p.m. UTC
My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
instantly when GNOME loads and it hangs so hard the reset button
doesn't work.  Setting i915.semaphores=0 fixes it.

Semaphores were disabled in a1656b9090f7008d2941c314f5a64724bea2ae37
in 2.6.38 and were re-enabled by

commit 47ae63e0c2e5fdb582d471dc906eb29be94c732f
Merge: c59a333 467cffb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Mar 7 12:32:44 2011 +0000

    Merge branch 'drm-intel-fixes' into drm-intel-next

    Apply the trivial conflicting regression fixes, but keep GPU semaphores
    enabled.

    Conflicts:
        drivers/gpu/drm/i915/i915_drv.h
        drivers/gpu/drm/i915/i915_gem_execbuffer.c

(It's worth noting that the offending change is in i915_drv.c,
 which is not one of the conflicting files.)

Signed-off-by: Andy Lutomirski <luto@mit.edu>
---

This fixes a 2.6.39 regression that affects my Q67 box but not
my SNB laptop.

Sent to Linus and airlied because 2.6.39 is apparently about to
be released.

 drivers/gpu/drm/i915/i915_drv.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Keith Packard May 15, 2011, 11:09 p.m. UTC | #1
On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU> wrote:

> -unsigned int i915_semaphores = 1;
> +unsigned int i915_semaphores = 0;
>  module_param_named(semaphores, i915_semaphores, int, 0600);

Acked-by: Keith Packard <keithp@keithp.com>
Keith Packard May 19, 2011, 7:56 p.m. UTC | #2
On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU> wrote:

> My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
> instantly when GNOME loads and it hangs so hard the reset button
> doesn't work.  Setting i915.semaphores=0 fixes it.

Can you describe precisely what hardware you have? The make and model of
the motherboard and the exact model of CPU? I'd like to get this issue
resolved for 2.6.40 and need to have a reliable way to reproduce the
hang; replicating your hardware may be a good way of doing that.
Andrew Lutomirski May 19, 2011, 8:50 p.m. UTC | #3
On Thu, May 19, 2011 at 3:56 PM, Keith Packard <keithp@keithp.com> wrote:
> On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU> wrote:
>
>> My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
>> instantly when GNOME loads and it hangs so hard the reset button
>> doesn't work.  Setting i915.semaphores=0 fixes it.
>
> Can you describe precisely what hardware you have? The make and model of
> the motherboard and the exact model of CPU? I'd like to get this issue
> resolved for 2.6.40 and need to have a reliable way to reproduce the
> hang; replicating your hardware may be a good way of doing that.

Intel DQ67SW.  AMT is enabled but the port 5900 is closed (because I
can't figure out how to turn on the kvm).

The system boots from UEFI (which is buggy, both because efifb doesn't
recognize my hardware, because asking the firmware for the boot menu
causes all efivars entries to be ignored, and because unless I set
reboot=k the system crashes on reboot).

Userspace is F15 running compiz (*not* gnome-shell).

BIOS is SWQ6710H.86A.0051.2011.0413.1154.

CPU is:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
stepping        : 7

config and dmidecode output attached.

I'm happy to help test after Sunday.  I'm currently out of town and
busy trying to make my Sandy Bridge laptop work.  (It has a different
non-i915-related problem.)

--Andy
Andrew Lutomirski May 24, 2011, 5:10 p.m. UTC | #4
On Thu, May 19, 2011 at 4:50 PM, Andrew Lutomirski <luto@mit.edu> wrote:
> On Thu, May 19, 2011 at 3:56 PM, Keith Packard <keithp@keithp.com> wrote:
>> On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU> wrote:
>>
>>> My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
>>> instantly when GNOME loads and it hangs so hard the reset button
>>> doesn't work.  Setting i915.semaphores=0 fixes it.
>>
>> Can you describe precisely what hardware you have? The make and model of
>> the motherboard and the exact model of CPU? I'd like to get this issue
>> resolved for 2.6.40 and need to have a reliable way to reproduce the
>> hang; replicating your hardware may be a good way of doing that.
>

I'm getting hangs on my X220 Core i7 laptop as well with 2.6.39 and
i915.semaphores=1.  They're not as reliable -- sometimes the system
hangs on log in, sometimes after a few seconds, and sometimes it
doesn't hang.

--Andy
Keith Packard May 24, 2011, 5:46 p.m. UTC | #5
On Tue, 24 May 2011 13:10:27 -0400, Andrew Lutomirski <luto@mit.edu> wrote:

> I'm getting hangs on my X220 Core i7 laptop as well with 2.6.39 and
> i915.semaphores=1.  They're not as reliable -- sometimes the system
> hangs on log in, sometimes after a few seconds, and sometimes it
> doesn't hang.

That's consistent with our theory at least -- lockups caused by GPU/CPU
interactions, which occur far more often with semaphores disabled, but
still occur with semaphores enabled.

We're working on it...
Ivan Bulatovic May 24, 2011, 8:05 p.m. UTC | #6
On Tue, 24 May 2011 13:10:27 -0400
Andrew Lutomirski <luto@mit.edu> wrote:

> On Thu, May 19, 2011 at 4:50 PM, Andrew Lutomirski <luto@mit.edu>
> wrote:
> > On Thu, May 19, 2011 at 3:56 PM, Keith Packard <keithp@keithp.com>
> > wrote:
> >> On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU>
> >> wrote:
> >>
> >>> My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
> >>> instantly when GNOME loads and it hangs so hard the reset button
> >>> doesn't work.  Setting i915.semaphores=0 fixes it.
> >>
> >> Can you describe precisely what hardware you have? The make and
> >> model of the motherboard and the exact model of CPU? I'd like to
> >> get this issue resolved for 2.6.40 and need to have a reliable way
> >> to reproduce the hang; replicating your hardware may be a good way
> >> of doing that.
> >
> 
> I'm getting hangs on my X220 Core i7 laptop as well with 2.6.39 and
> i915.semaphores=1.  They're not as reliable -- sometimes the system
> hangs on log in, sometimes after a few seconds, and sometimes it
> doesn't hang.
> 
> --Andy

I haven't had GPU hangs since this patch:

https://bugs.freedesktop.org/show_bug.cgi?id=33921#c23

Poll the FIFO for free entries before writing the register

Gigabyte GA-H67MA-UD2H-B3 + i5 2400 here.
Eric Anholt June 7, 2011, 7:12 a.m. UTC | #7
On Thu, 19 May 2011 16:50:00 -0400, Andrew Lutomirski <luto@mit.edu> wrote:
> On Thu, May 19, 2011 at 3:56 PM, Keith Packard <keithp@keithp.com> wrote:
> > On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU> wrote:
> >
> >> My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
> >> instantly when GNOME loads and it hangs so hard the reset button
> >> doesn't work.  Setting i915.semaphores=0 fixes it.
> >
> > Can you describe precisely what hardware you have? The make and model of
> > the motherboard and the exact model of CPU? I'd like to get this issue
> > resolved for 2.6.40 and need to have a reliable way to reproduce the
> > hang; replicating your hardware may be a good way of doing that.
> 
> Intel DQ67SW.  AMT is enabled but the port 5900 is closed (because I
> can't figure out how to turn on the kvm).
> 
> The system boots from UEFI (which is buggy, both because efifb doesn't
> recognize my hardware, because asking the firmware for the boot menu
> causes all efivars entries to be ignored, and because unless I set
> reboot=k the system crashes on reboot).
> 
> Userspace is F15 running compiz (*not* gnome-shell).
> 
> BIOS is SWQ6710H.86A.0051.2011.0413.1154.
> 
> CPU is:
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 42
> model name      : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> stepping        : 7

I fear this bug is just "getting more asynchronous with the GPU means we
trigger race conditions on the GPU more easily."  On that note, could
you retest with Mesa as of this commit, since I'm told this is a GT1 CPU:

commit ef59049c5242a1be7fa59a182d342191185dd62b
Author: Eric Anholt <eric@anholt.net>
Date:   Sun Jun 5 23:20:57 2011 -0700

    i965: Fix flipped GT1 vs GT2 URB VS entry count limits.
Andrew Lutomirski June 10, 2011, 2:06 p.m. UTC | #8
On Tue, Jun 7, 2011 at 3:12 AM, Eric Anholt <eric@anholt.net> wrote:
> On Thu, 19 May 2011 16:50:00 -0400, Andrew Lutomirski <luto@mit.edu> wrote:
>> On Thu, May 19, 2011 at 3:56 PM, Keith Packard <keithp@keithp.com> wrote:
>> > On Fri, 13 May 2011 12:14:54 -0400, Andy Lutomirski <luto@MIT.EDU> wrote:
>> >
>> >> My Q67 / i7-2600 box has rev09 Sandy Bridge graphics.  It hangs
>> >> instantly when GNOME loads and it hangs so hard the reset button
>> >> doesn't work.  Setting i915.semaphores=0 fixes it.
>> >
>> > Can you describe precisely what hardware you have? The make and model of
>> > the motherboard and the exact model of CPU? I'd like to get this issue
>> > resolved for 2.6.40 and need to have a reliable way to reproduce the
>> > hang; replicating your hardware may be a good way of doing that.
>>
>> Intel DQ67SW.  AMT is enabled but the port 5900 is closed (because I
>> can't figure out how to turn on the kvm).
>>
>> The system boots from UEFI (which is buggy, both because efifb doesn't
>> recognize my hardware, because asking the firmware for the boot menu
>> causes all efivars entries to be ignored, and because unless I set
>> reboot=k the system crashes on reboot).
>>
>> Userspace is F15 running compiz (*not* gnome-shell).
>>
>> BIOS is SWQ6710H.86A.0051.2011.0413.1154.
>>
>> CPU is:
>> processor       : 0
>> vendor_id       : GenuineIntel
>> cpu family      : 6
>> model           : 42
>> model name      : Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
>> stepping        : 7
>
> I fear this bug is just "getting more asynchronous with the GPU means we
> trigger race conditions on the GPU more easily."  On that note, could
> you retest with Mesa as of this commit, since I'm told this is a GT1 CPU:
>
> commit ef59049c5242a1be7fa59a182d342191185dd62b
> Author: Eric Anholt <eric@anholt.net>
> Date:   Sun Jun 5 23:20:57 2011 -0700
>
>    i965: Fix flipped GT1 vs GT2 URB VS entry count limits.
>

Nope -- I still got the reset-button-not-working hang.

I'm pretty sure I installed mesa right -- I moved everything in
/usr/lib64/dri out of the way and put the files in mesa/lib/ in there.

--Andy
Jesse Barnes Aug. 22, 2011, 4:53 p.m. UTC | #9
On Fri, 10 Jun 2011 10:06:39 -0400
Andrew Lutomirski <luto@mit.edu> wrote:

> On Tue, Jun 7, 2011 at 3:12 AM, Eric Anholt <eric@anholt.net> wrote:
> > I fear this bug is just "getting more asynchronous with the GPU means we
> > trigger race conditions on the GPU more easily."  On that note, could
> > you retest with Mesa as of this commit, since I'm told this is a GT1 CPU:
> >
> > commit ef59049c5242a1be7fa59a182d342191185dd62b
> > Author: Eric Anholt <eric@anholt.net>
> > Date:   Sun Jun 5 23:20:57 2011 -0700
> >
> >    i965: Fix flipped GT1 vs GT2 URB VS entry count limits.
> >
> 
> Nope -- I still got the reset-button-not-working hang.
> 
> I'm pretty sure I installed mesa right -- I moved everything in
> /usr/lib64/dri out of the way and put the files in mesa/lib/ in there.

Andrew, what's the latest with this?  Are you still seeing problems
when semaphores are enabled?

Thanks,
Ben Widawsky Aug. 31, 2011, 6:24 p.m. UTC | #10
On Mon, 22 Aug 2011 09:53:11 -0700
Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> On Fri, 10 Jun 2011 10:06:39 -0400
> Andrew Lutomirski <luto@mit.edu> wrote:
> 
> > On Tue, Jun 7, 2011 at 3:12 AM, Eric Anholt <eric@anholt.net> wrote:
> > > I fear this bug is just "getting more asynchronous with the GPU
> > > means we trigger race conditions on the GPU more easily."  On
> > > that note, could you retest with Mesa as of this commit, since
> > > I'm told this is a GT1 CPU:
> > >
> > > commit ef59049c5242a1be7fa59a182d342191185dd62b
> > > Author: Eric Anholt <eric@anholt.net>
> > > Date:   Sun Jun 5 23:20:57 2011 -0700
> > >
> > >    i965: Fix flipped GT1 vs GT2 URB VS entry count limits.
> > >
> > 
> > Nope -- I still got the reset-button-not-working hang.
> > 
> > I'm pretty sure I installed mesa right -- I moved everything in
> > /usr/lib64/dri out of the way and put the files in mesa/lib/ in
> > there.
> 
> Andrew, what's the latest with this?  Are you still seeing problems
> when semaphores are enabled?
> 
> Thanks,


Ping
Andrew Lutomirski Aug. 31, 2011, 6:30 p.m. UTC | #11
On Mon, Aug 22, 2011 at 12:53 PM, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> On Fri, 10 Jun 2011 10:06:39 -0400
> Andrew Lutomirski <luto@mit.edu> wrote:
>
>> On Tue, Jun 7, 2011 at 3:12 AM, Eric Anholt <eric@anholt.net> wrote:
>> > I fear this bug is just "getting more asynchronous with the GPU means we
>> > trigger race conditions on the GPU more easily."  On that note, could
>> > you retest with Mesa as of this commit, since I'm told this is a GT1 CPU:
>> >
>> > commit ef59049c5242a1be7fa59a182d342191185dd62b
>> > Author: Eric Anholt <eric@anholt.net>
>> > Date:   Sun Jun 5 23:20:57 2011 -0700
>> >
>> >    i965: Fix flipped GT1 vs GT2 URB VS entry count limits.
>> >
>>
>> Nope -- I still got the reset-button-not-working hang.
>>
>> I'm pretty sure I installed mesa right -- I moved everything in
>> /usr/lib64/dri out of the way and put the files in mesa/lib/ in there.
>
> Andrew, what's the latest with this?  Are you still seeing problems
> when semaphores are enabled?

Yes. On the latest -rc, the system still freezes hard when mutter starts up
if I set i915.semaphores=1.

[I typed this email last week but apparently it got stuck as a draft.
Sorry.]

--Andy
Keith Packard Aug. 31, 2011, 7:07 p.m. UTC | #12
On Wed, 31 Aug 2011 14:30:00 -0400, Andrew Lutomirski <luto@mit.edu> wrote:

> Yes. On the latest -rc, the system still freezes hard when mutter starts up
> if I set i915.semaphores=1.

Ok. You said that you were running compiz before and that failed, and
are now running mutter and that fails also?
Andrew Lutomirski Aug. 31, 2011, 7:37 p.m. UTC | #13
Yes.  I suspect that, as soon as any 3D happens, the machine locks hard.

How does it even get into a state that makes the hardware reset button not
work?

On Aug 31, 2011 3:08 PM, "Keith Packard" <keithp@keithp.com> wrote:
> On Wed, 31 Aug 2011 14:30:00 -0400, Andrew Lutomirski <luto@mit.edu> wrote:
>
>> Yes. On the latest -rc, the system still freezes hard when mutter starts up
>> if I set i915.semaphores=1.
>
> Ok. You said that you were running compiz before and that failed, and
> are now running mutter and that fails also?
>
> --
> keith.packard@intel.com

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c34a8dd..32d1b3e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -49,7 +49,7 @@  module_param_named(panel_ignore_lid, i915_panel_ignore_lid, int, 0600);
 unsigned int i915_powersave = 1;
 module_param_named(powersave, i915_powersave, int, 0600);
 
-unsigned int i915_semaphores = 1;
+unsigned int i915_semaphores = 0;
 module_param_named(semaphores, i915_semaphores, int, 0600);
 
 unsigned int i915_enable_rc6 = 0;
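
Because the parameter is registered with module_param_named(semaphores, ..., 0600), its current value should be visible to root under /sys/module/i915/parameters/semaphores, which is a convenient way to confirm which default a given kernel is actually running with. The following is a minimal userspace sketch of such a check, not part of the patch; the helper names parse_uint_param and read_uint_param are made up for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text of a module parameter file (e.g. "0\n") into *out.
 * Returns 0 on success, -1 if the text is not a plain unsigned integer.
 * Hypothetical helper for illustration; not a kernel or libc API.
 */
static int parse_uint_param(const char *text, unsigned int *out)
{
    char *end = NULL;
    unsigned long value;

    /* Reject empty input, signs and leading whitespace up front. */
    if (!isdigit((unsigned char)text[0]))
        return -1;

    errno = 0;
    value = strtoul(text, &end, 10);
    if (errno != 0 || value > UINT_MAX)
        return -1;

    /* sysfs appends a newline after the value; accept exactly one. */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *out = (unsigned int)value;
    return 0;
}

/*
 * Read an unsigned integer parameter from the given path.
 * Returns 0 on success, -1 if the file cannot be opened (for example
 * when not running as root, given the 0600 mode) or cannot be parsed.
 */
static int read_uint_param(const char *path, unsigned int *out)
{
    char buf[32];
    int rc = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) != NULL)
        rc = parse_uint_param(buf, out);
    fclose(fp);
    return rc;
}
```

Calling read_uint_param("/sys/module/i915/parameters/semaphores", &v) as root should then yield 0 with this patch applied and no override, or whatever was passed as i915.semaphores= on the kernel command line.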