nouveau: nv46: Change mc subdev oclass from nv44 to nv4c

Message ID 1437664812-5943-1-git-send-email-hdegoede@redhat.com (mailing list archive)
State New, archived

Commit Message

Hans de Goede July 23, 2015, 3:20 p.m. UTC
MSI interrupts appear not to work on nv46-based cards. Change the mc
subdev oclass for these cards from nv44 to nv4c; the nv4c mc code is
identical to the nv44 mc code except that it does not use MSI
(it does not define a msi_rearm callback).

BugLink: https://bugs.freedesktop.org/show_bug.cgi?id=90435
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/gpu/drm/nouveau/nvkm/engine/device/nv40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
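
The commit message relies on one mechanism: an mc class without a msi_rearm callback never ends up using MSI. The following is a minimal sketch of that idea only, not the nouveau source. The struct, the `_sketch` names and the `mc_wants_msi()` helper are hypothetical and exist purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an mc subdev class. */
struct mc_oclass_sketch {
	const char *name;
	/* Re-arms MSI after an interrupt; NULL when the class does not use MSI. */
	void (*msi_rearm)(void);
};

static void nv44_msi_rearm_sketch(void)
{
	/* A real driver would write a chipset register here. */
}

/* nv44-style class: provides a rearm hook, so MSI may be used. */
static const struct mc_oclass_sketch nv44_mc_sketch = {
	.name = "nv44",
	.msi_rearm = nv44_msi_rearm_sketch,
};

/* nv4c-style class: same otherwise, but no rearm hook. */
static const struct mc_oclass_sketch nv4c_mc_sketch = {
	.name = "nv4c",
	.msi_rearm = NULL,
};

/* MSI is only attempted when it is requested AND a rearm hook exists. */
static bool mc_wants_msi(const struct mc_oclass_sketch *oclass,
			 bool msi_requested)
{
	return msi_requested && oclass->msi_rearm != NULL;
}
```

Under these assumptions, `mc_wants_msi(&nv4c_mc_sketch, true)` is false, so switching the class pointer is enough to fall back to legacy interrupts without touching any other mc code path.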

Comments

Ben Skeggs July 24, 2015, 2:32 a.m. UTC | #1
On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
> MSI interrupts appear to not work for nv46 based cards. Change the mc
> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
> identical to the nv44 mc code except that it does not use msi
> (it does not define a msi_rearm callback).
I'm fine with this, but it'd be nice to check that the binary driver
doesn't/can't use MSI on these too (there might be an alternate method
we need to use).

Would you be able to grab the latest proprietary driver that works on
nv4x, and do a mmiotrace of it?  You *might* need to use "modprobe
nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
by default anywhere.

Thanks,
Ben.

Ilia Mirkin July 24, 2015, 2:39 a.m. UTC | #2
On Thu, Jul 23, 2015 at 10:32 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
> On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
>> MSI interrupts appear to not work for nv46 based cards. Change the mc
>> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
>> identical to the nv44 mc code except that it does not use msi
>> (it does not define a msi_rearm callback).
> I'm fine with this, but it'd be nice to check that the binary driver
> doesn't/can't use MSI on these too (there might be an alternate method
> we need to use).
>
> Would you be able to grab the latest proprietary driver that works on
> nv4x, and do a mmiotrace of it?  You *might* need to use "modprobe
> nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
> by default anywhere.

AFAIK the blob never used MSI on nv4x. Perhaps we should have just
left it alone...
Ben Skeggs July 24, 2015, 2:56 a.m. UTC | #3
On 24 July 2015 at 12:39, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> On Thu, Jul 23, 2015 at 10:32 PM, Ben Skeggs <skeggsb@gmail.com> wrote:
>> On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
>>> MSI interrupts appear to not work for nv46 based cards. Change the mc
>>> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
>>> identical to the nv44 mc code except that it does not use msi
>>> (it does not define a msi_rearm callback).
>> I'm fine with this, but it'd be nice to check that the binary driver
>> doesn't/can't use MSI on these too (there might be an alternate method
>> we need to use).
>>
>> Would you be able to grab the latest proprietary driver that works on
>> nv4x, and do a mmiotrace of it?  You *might* need to use "modprobe
>> nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
>> by default anywhere.
>
> AFAIK the blob never used MSI on nv4x. Perhaps we should have just
> left it alone...
They have support for it, it was just never on by default (for any
chipset) during the supported lifetime of nv4x.
Hans de Goede July 24, 2015, 9:23 a.m. UTC | #4
Hi,

On 24-07-15 04:32, Ben Skeggs wrote:
> On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
>> MSI interrupts appear to not work for nv46 based cards. Change the mc
>> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
>> identical to the nv44 mc code except that it does not use msi
>> (it does not define a msi_rearm callback).
> I'm fine with this, but it'd be nice to check that the binary driver
> doesn't/can't use MSI on these too (there might be an alternate method
> we need to use).
>
> Would you be able to grab the latest proprietary driver that works on
> nv4x, and do a mmiotrace of it?  You *might* need to use "modprobe
> nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
> by default anywhere.

Will do first thing coming monday.

Regards,

Hans


Hans de Goede July 27, 2015, 3:52 p.m. UTC | #5
Hi,

On 24-07-15 04:32, Ben Skeggs wrote:
> On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
>> MSI interrupts appear to not work for nv46 based cards. Change the mc
>> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
>> identical to the nv44 mc code except that it does not use msi
>> (it does not define a msi_rearm callback).
> I'm fine with this, but it'd be nice to check that the binary driver
> doesn't/can't use MSI on these too (there might be an alternate method
> we need to use).
>
> Would you be able to grab the latest proprietary driver that works on
> nv4x, and do a mmiotrace of it?

I've grabbed 304.125

> You *might* need to use "modprobe
> nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
> by default anywhere.

You're right I needed to specify NVreg_EnableMSI=1, with that set
/proc/interrupts shows that MSI is used.
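
The /proc/interrupts check can also be done programmatically. The sketch below assumes that the line for the device contains both the driver name and an interrupt chip name with "MSI" in it (for example "PCI-MSI-edge"). The exact chip name string varies between kernel versions, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* True if one /proc/interrupts line names `driver` and an MSI interrupt chip. */
static bool irq_line_uses_msi(const char *line, const char *driver)
{
	return strstr(line, driver) != NULL && strstr(line, "MSI") != NULL;
}

/* Scans a stream such as fopen("/proc/interrupts", "r") line by line. */
static bool stream_shows_msi(FILE *fp, const char *driver)
{
	char line[4096]; /* assumed long enough for one line on typical systems */

	while (fgets(line, sizeof(line), fp) != NULL) {
		if (irq_line_uses_msi(line, driver))
			return true;
	}
	return false;
}
```

A caller would open /proc/interrupts, pass the stream together with "nvidia" or "nouveau", and close the stream afterwards. Plain substring matching is a deliberate simplification and would also match another driver whose name contains the same text.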

Here is an mmiotrace of running glxgears with the binary driver using MSI interrupts:

https://fedorapeople.org/~jwrdegoede/nvidia-bin-nv46-msi-on-glxgears.mmiotrace.gz

AFAIK there are some nouveau tools to parse this a bit, right? I'm going
to call it a day for today; if you can give me some pointers on what to do with the
mmiotrace to find a potential fix for the MSI issues, that would be appreciated.


BTW I had to build my own kernel with mmiotrace enabled in Kconfig, as this
is disabled in the Fedora kernels by default. Do you know if there is a good
reason to have this disabled by default, or should I ask the Fedora
kernel maintainers to enable it by default?


Slightly offtopic:

I decided to be bold and try gnome-shell on the nv46 with MSI disabled,
which so far was a guaranteed way to freeze the system, and it now works
somewhat (latest kernel, ddx and mesa). I see something which shows
horizontal lines which are small parts of my desktop background, and
things change significantly when I switch to the overview mode.

But other than that the display is completely wrong. It looks a bit
like a framebuffer pitch problem, but different. I think it
is likely some tiling problem or some such.

Note that metacity + glxgears works; this only shows up with
gnome-shell. Any hints on where to start looking wrt debugging this?

Or should I first try to run piglit and see if some tests there
point out the culprit?


Regards,

Hans



Ilia Mirkin July 27, 2015, 4:25 p.m. UTC | #6
On Mon, Jul 27, 2015 at 11:52 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> https://fedorapeople.org/~jwrdegoede/nvidia-bin-nv46-msi-on-glxgears.mmiotrace.gz
>
> AFAIK there are some nouveau tools to parse this a bit, right ? I'm going
> to call it a day for today, if you can give me some pointers what to do with
> the
> mmiotrace to find a potential fix for the msi issues, that would be
> appreciated.

rnn/demmio -l foo-mmiotrace.gz

Enjoy :)
Ben Skeggs July 28, 2015, 7:26 a.m. UTC | #7
On 28 July 2015 at 01:52, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
> On 24-07-15 04:32, Ben Skeggs wrote:
>>
>> On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>> MSI interrupts appear to not work for nv46 based cards. Change the mc
>>> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
>>> identical to the nv44 mc code except that it does not use msi
>>> (it does not define a msi_rearm callback).
>>
>> I'm fine with this, but it'd be nice to check that the binary driver
>> doesn't/can't use MSI on these too (there might be an alternate method
>> we need to use).
>>
>> Would you be able to grab the latest proprietary driver that works on
>> nv4x, and do a mmiotrace of it?
>
>
> I've grabbed 304.125
>
>> You *might* need to use "modprobe
>> nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
>> by default anywhere.
>
>
> You're right I needed to specify NVreg_EnableMSI=1, with that set
> /proc/interrupts shows that MSI is used.
>
> Here is an of running glxgears with the binary driver using msi interrupts
> mmiotrace:
>
> https://fedorapeople.org/~jwrdegoede/nvidia-bin-nv46-msi-on-glxgears.mmiotrace.gz
>
> AFAIK there are some nouveau tools to parse this a bit, right ? I'm going
> to call it a day for today, if you can give me some pointers what to do with
> the
> mmiotrace to find a potential fix for the msi issues, that would be
> appreciated.
>
>
> BTW I had to build my own kernel with mmiotrace enabled in Kconfig, as this
> is disabled in the Fedora kernels by default. Do you know if there is a good
> reason to have this disabled by default, or should I ask the Fedora
> kernel maintainers to enable it by default ?
The -debug kernel has it enabled already.  However, it's also
problematic in that it enables various lockdep debugging stuff that
causes the binary driver kernel module to end up depending on GPL-only
symbols, which you have to hack around by changing the
MODULE_LICENSE() for the binary driver to "GPL"... Which is clearly a
pain :)  So, I guess if you want a slightly more straight-forward
approach, it'd be good to enable in the non-debug kernels too.

>
>
> Slightly offtopic:
>
> I decided to be bold and try gnome-shell on the nv46 with msi disabled,
> which sofar was a guaranteed way to freeze the system, and it now works
> somewhat (latest kernel, ddx and mesa). I see something which shows
> horizontal lines which are small parts from my desktop background, and
> things change significantly when I switch to the overview mode.
>
> But other then that the display is completely wrong, it looks a bit
> like a framebuffer pitch problem, but then different. I think it
> is likely some tiling problem or some such.
>
> Note that metacity + glxgears works, this only shows with
> gnome-shell, any hints where to start looking wrt debugging this?
These are the main issues that I'd like to see resolved :)

>
> Or should I first try to run piglet and see if some tests there
> point out the culprit?
I think this is a good place to start.

Thanks,
Ben.

Hans de Goede July 29, 2015, 3:36 p.m. UTC | #8
Hi,

On 28-07-15 09:26, Ben Skeggs wrote:
> On 28 July 2015 at 01:52, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>> On 24-07-15 04:32, Ben Skeggs wrote:
>>>
>>> On 24 July 2015 at 01:20, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> MSI interrupts appear to not work for nv46 based cards. Change the mc
>>>> subdev oclass for these cards from nv44 to nv4c, the nv4c mc code is
>>>> identical to the nv44 mc code except that it does not use msi
>>>> (it does not define a msi_rearm callback).
>>>
>>> I'm fine with this, but it'd be nice to check that the binary driver
>>> doesn't/can't use MSI on these too (there might be an alternate method
>>> we need to use).
>>>
>>> Would you be able to grab the latest proprietary driver that works on
>>> nv4x, and do a mmiotrace of it?
>>
>>
>> I've grabbed 304.125
>>
>>> You *might* need to use "modprobe
>>> nvidia NVreg_EnableMSI=1", because at some point NVIDIA didn't use it
>>> by default anywhere.
>>
>>
>> You're right I needed to specify NVreg_EnableMSI=1, with that set
>> /proc/interrupts shows that MSI is used.
>>
>> Here is an of running glxgears with the binary driver using msi interrupts
>> mmiotrace:
>>
>> https://fedorapeople.org/~jwrdegoede/nvidia-bin-nv46-msi-on-glxgears.mmiotrace.gz
>>
>> AFAIK there are some nouveau tools to parse this a bit, right ? I'm going
>> to call it a day for today, if you can give me some pointers what to do with
>> the
>> mmiotrace to find a potential fix for the msi issues, that would be
>> appreciated.

I've run demmio on this as suggested by Ilia and checked all the writes
to the pmc, pbus and pci ranges, but I'm afraid I've been unable to find
anything which helps. I've also checked the interrupt regs of the crt block;
those are correct, and the interrupt flag for vblank is set.

So I'm afraid I'm all out of clues. One thing which does stand out is that
lspci -vvv shows the following differences between the nouveau and nvidia output:

@@ -361,23 +361,23 @@
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Ste
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
         Latency: 0
-       Interrupt: pin A routed to IRQ 28
+       Interrupt: pin A routed to IRQ 29
         Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
         Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
         Region 3: Memory at fc000000 (64-bit, non-prefetchable) [size=16M]
-       Expansion ROM at fe9e0000 [disabled] [size=128K]
+       [virtual] Expansion ROM at fe9e0000 [disabled] [size=128K]
         Capabilities: [60] Power Management version 2
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3ho
                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
         Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
-               Address: 00000000fee0300c  Data: 41a2
+               Address: 00000000fee0300c  Data: 41c2
         Capabilities: [78] Express (v1) Endpoint, MSI 00
                 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <256ns,
                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                 DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupport
                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                         MaxPayload 128 bytes, MaxReadReq 512 bytes
-               DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransP
+               DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransP
                 LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit La
                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                 LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk+
@@ -393,7 +393,7 @@
                         Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                         Status: NegoPending- InProgress-
         Capabilities: [128 v1] Power Budgeting <?>
-       Kernel driver in use: nouveau
+       Kernel driver in use: nvidia
         Kernel modules: nouveau, nvidia
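
One of the differences above is the MSI Data value (41a2 vs 41c2). The sketch below assumes the x86 MSI message data layout: vector in bits 7:0, delivery mode in bits 10:8. Under that assumption only the vector number differs between the two drivers, which by itself looks benign. The helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed x86 MSI message data layout. */
#define MSI_DATA_VECTOR_MASK    0x00ffu /* bits 7:0: interrupt vector */
#define MSI_DATA_DELIVERY_SHIFT 8       /* bits 10:8: delivery mode */
#define MSI_DATA_DELIVERY_MASK  0x7u

/* Extracts the interrupt vector from an MSI data word. */
static unsigned int msi_data_vector(uint16_t data)
{
	return data & MSI_DATA_VECTOR_MASK;
}

/* Extracts the delivery mode field from an MSI data word. */
static unsigned int msi_data_delivery_mode(uint16_t data)
{
	return (data >> MSI_DATA_DELIVERY_SHIFT) & MSI_DATA_DELIVERY_MASK;
}

/* True if two data words differ in nothing but the vector field. */
static int msi_data_differs_only_in_vector(uint16_t a, uint16_t b)
{
	return ((a ^ b) & ~MSI_DATA_VECTOR_MASK) == 0;
}
```

Applied to the values in the diff, 0x41a2 carries vector 0xa2 and 0x41c2 carries vector 0xc2, with the same delivery mode field in both.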

The DevSta shows that we are sending some commands the device does not like.

At first I thought this would be the culprit, but as discussed before on some
boots things just work and on others they do not (when using nouveau). I've
checked a boot with nouveau where things just happen to work, and there
UncorrErr+ and UnsuppReq+ are still set when things just work.
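
For reference, the DevSta flags can be decoded from the raw PCIe Device Status register as sketched below. The bit positions follow the PCIe Device Status layout; the macro and function names are hypothetical, and the example value 0x000a is an assumption chosen to reproduce the flags visible in the truncated nouveau line. The error bits are specified as write-one-to-clear, so they may stay set from an earlier event, which would be consistent with seeing them on working boots too.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* PCIe Device Status register bits (names are local to this sketch). */
#define DEVSTA_CORR_ERR     (1u << 0) /* Correctable Error Detected */
#define DEVSTA_NONFATAL_ERR (1u << 1) /* Non-Fatal Error Detected */
#define DEVSTA_FATAL_ERR    (1u << 2) /* Fatal Error Detected */
#define DEVSTA_UNSUPP_REQ   (1u << 3) /* Unsupported Request Detected */

/* Returns only the four error-detected bits of a Device Status value. */
static uint16_t devsta_error_bits(uint16_t devsta)
{
	return devsta & (DEVSTA_CORR_ERR | DEVSTA_NONFATAL_ERR |
			 DEVSTA_FATAL_ERR | DEVSTA_UNSUPP_REQ);
}

/*
 * Formats the error flags in the style of the lspci output quoted above,
 * where the non-fatal bit is labelled "UncorrErr".
 */
static int devsta_format(uint16_t devsta, char *buf, size_t len)
{
	return snprintf(buf, len,
			"CorrErr%c UncorrErr%c FatalErr%c UnsuppReq%c",
			(devsta & DEVSTA_CORR_ERR) ? '+' : '-',
			(devsta & DEVSTA_NONFATAL_ERR) ? '+' : '-',
			(devsta & DEVSTA_FATAL_ERR) ? '+' : '-',
			(devsta & DEVSTA_UNSUPP_REQ) ? '+' : '-');
}
```

With the assumed value 0x000a, the non-fatal and unsupported-request bits are set and the other two error bits are clear, matching the flags shown for nouveau.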

So I'm officially giving up on this, and I'm going to continue to work on
the nv46 with msi disabled.

Note that when things do not work, we do get some interrupts, they just stop
coming at one point shortly after boot.

>> BTW I had to build my own kernel with mmiotrace enabled in Kconfig, as this
>> is disabled in the Fedora kernels by default. Do you know if there is a good
>> reason to have this disabled by default, or should I ask the Fedora
>> kernel maintainers to enable it by default ?
> The -debug kernel has it enabled already.  However, it's also
> problematic in that it enables various lockdep debugging stuff that
> causes the binary driver kernel module to end up depending on GPL-only
> symbols, which you have to hack around by changing the
> MODULE_LICENSE() for the binary driver to "GPL"... Which is clearly a
> pain :)  So, I guess if you want a slightly more straight-forward
> approach, it'd be good to enable in the non-debug kernels too.

OK, before I submit a patch to the Fedora kernel devs for this: mmiotrace
uses live patching like the other ftrace stuff, so there is no performance
impact unless it is actually used, right?

>
>>
>>
>> Slightly offtopic:
>>
>> I decided to be bold and try gnome-shell on the nv46 with msi disabled,
>> which sofar was a guaranteed way to freeze the system, and it now works
>> somewhat (latest kernel, ddx and mesa). I see something which shows
>> horizontal lines which are small parts from my desktop background, and
>> things change significantly when I switch to the overview mode.
>>
>> But other then that the display is completely wrong, it looks a bit
>> like a framebuffer pitch problem, but then different. I think it
>> is likely some tiling problem or some such.
>>
>> Note that metacity + glxgears works, this only shows with
>> gnome-shell, any hints where to start looking wrt debugging this?

> These are the main issues that I'd like to see resolved :)

Agreed, getting gnome-shell running is really the minimum level at which
we should support cards.

>> Or should I first try to run piglet and see if some tests there
>> point out the culprit?
> I think this is a good place to start.

Ok, will do.

Regards,

Hans
Hans de Goede July 30, 2015, 12:42 p.m. UTC | #9
Hi,

On 27-07-15 17:52, Hans de Goede wrote:

> Slightly offtopic:
>
> I decided to be bold and try gnome-shell on the nv46 with msi disabled,
> which sofar was a guaranteed way to freeze the system, and it now works
> somewhat (latest kernel, ddx and mesa). I see something which shows
> horizontal lines which are small parts from my desktop background, and
> things change significantly when I switch to the overview mode.
>
> But other then that the display is completely wrong, it looks a bit
> like a framebuffer pitch problem, but then different. I think it
> is likely some tiling problem or some such.
>
> Note that metacity + glxgears works, this only shows with
> gnome-shell, any hints where to start looking wrt debugging this?
>
> Or should I first try to run piglet and see if some tests there
> point out the culprit?

I've been working on this today. I decided to first make sure
that the latest ddx + mesa did not have a regression on nv4x in
general, so I plugged in my nv43 card, which used to run gnome-shell
fine, and that shows the same problem.

Some debugging with that card shows that things break with this
ddx commit:

http://cgit.freedesktop.org/nouveau/xf86-video-nouveau/commit/?id=241e7289f25a342a457952b9b0e539c2f0b81d99

"enable dri3 support without glamor"

Using an older ddx + latest mesa master gnome-shell runs fine
on my nv43 card.

And adding my patch to disable msi interrupts on nv46 makes
gnome-shell run fine on my nv46 card too :)

So unless someone has a good idea to fix msi interrupts on
nv46, I suggest we merge my patch to disable them
(with a Cc: stable@vger.kernel.org), which should fix most
problems nv46 users have been seeing.

Regards,

Hans
Patch

diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/nv40.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/nv40.c
index c630136..b4ad791 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/nv40.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/nv40.c
@@ -265,7 +265,7 @@  nv40_identify(struct nvkm_device *device)
 		device->oclass[NVDEV_SUBDEV_CLK    ] = &nv40_clk_oclass;
 		device->oclass[NVDEV_SUBDEV_THERM  ] = &nv40_therm_oclass;
 		device->oclass[NVDEV_SUBDEV_DEVINIT] =  nv1a_devinit_oclass;
-		device->oclass[NVDEV_SUBDEV_MC     ] =  nv44_mc_oclass;
+		device->oclass[NVDEV_SUBDEV_MC     ] =  nv4c_mc_oclass;
 		device->oclass[NVDEV_SUBDEV_BUS    ] =  nv31_bus_oclass;
 		device->oclass[NVDEV_SUBDEV_TIMER  ] = &nv04_timer_oclass;
 		device->oclass[NVDEV_SUBDEV_FB     ] =  nv46_fb_oclass;