diff mbox

Regression: bug 85491: radeon 0000:01:00.0: Fatal error during GPU init

Message ID 20141208233855.GA15713@shangw (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Gavin Shan Dec. 8, 2014, 11:38 p.m. UTC
On Mon, Dec 08, 2014 at 02:46:00PM -0700, Bjorn Helgaas wrote:
>[-cc Saifi (bouncing)]
>
>On Mon, Dec 8, 2014 at 2:38 PM, Benjamin Herrenschmidt
><benh@kernel.crashing.org> wrote:
>> On Mon, 2014-12-08 at 14:04 -0700, Bjorn Helgaas wrote:
>>>
>>> However, I think 5b28541552ef is taking the wrong approach.  Consider
>>> a device that, like this Radeon, has a 32-bit prefetchable BAR.  If
>>> the upstream bridge has a 32-bit prefetchable window, things work as
>>> expected -- we put the prefetchable BAR in the prefetchable window.
>>> But if the bridge has a 64-bit prefetchable window, we put that same
>>> BAR in a 32-bit non-prefetchable window.
>>
>> Well, that's expected, unless the 64-bit prefetchable window happens to
>> be fully enclosed in 32-bit space ... So maybe the approach is to check
>> that this is the case and "downgrade" such 64-bit prefetchable BARs to
>> 32-bit ...
>
>Yeah, I didn't word that very clearly.  I meant that after
>5b28541552ef, that 64-bit window is wasted unless there's a 64-bit BAR
>below it; we can't even place the window below 4GB and use it for
>32-bit prefetchable BARs.
>

Yes, I agree. It's why I suggested to return error from pbus_size_mem()
to indicate the case: 64-bits prefetchable window isn't used and it's
still available to be used by child 32-bits prefetchable BARs. Please
take a look on the attached draft patch, which helps to explain the idea
only.

Thanks,
Gavin

Comments

Yinghai Lu Dec. 9, 2014, 12:21 a.m. UTC | #1
On Mon, Dec 8, 2014 at 3:38 PM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> On Mon, Dec 08, 2014 at 02:46:00PM -0700, Bjorn Helgaas wrote:
>>Yeah, I didn't word that very clearly.  I meant that after
>>5b28541552ef, that 64-bit window is wasted unless there's a 64-bit BAR
>>below it; we can't even place the window below 4GB and use it for
>>32-bit prefetchable BARs.
>>
>
> Yes, I agree. It's why I suggested to return error from pbus_size_mem()
> to indicate the case: 64-bits prefetchable window isn't used and it's
> still available to be used by child 32-bits prefetchable BARs. Please
> take a look on the attached draft patch, which helps to explain the idea
> only.

That would not help. The 00:01.0 on Zermond's system support hotplug. so mem
pref will be used for 64bit pref.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 19, 2014, 10:17 p.m. UTC | #2
[+cc linux-kernel for visibility]

For context, this is about a regression caused by 5b28541552ef ("PCI:
Restrict 64-bit prefetchable bridge windows to 64-bit resources"),
which appeared in v3.16.  The bugzilla is at
https://bugzilla.kernel.org/show_bug.cgi?id=85491

The problem is that BIOS programmed an invalid Root Port window
leading to the Radeon device.  The window contains the Radeon device,
so the configuration actually *works* fine, but the window is invalid
because it either overlaps system RAM or starts below the upstream
host bridge window, so Linux discards it:

Zermond's system:
  acpi PNP0A08:00: host bridge window [mem 0xc0000000-0xffffffff] (ignored)
  pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffff]
  pci 0000:00:01.0:   bridge window [mem 0xbdf00000-0xddefffff 64bit
pref]  # invalid Root Port window
  pci 0000:00:01.0: can't claim BAR 15 [mem 0xbdf00000-0xddefffff
64bit pref]: address conflict with System RAM [mem
0x00100000-0xbff9ffff]
  pci 0000:01:00.0: can't claim BAR 0 [mem 0xc0000000-0xcfffffff
pref]: no compatible bridge window  # Radeon

Marek's system:
  pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff]   (from _CRS)
  pci 0000:00:01.0:   bridge window [mem 0xbdf00000-0xddefffff 64bit
pref]  # invalid Root Port window
  pci 0000:00:01.0: can't claim BAR 15 [mem 0xbdf00000-0xddefffff
64bit pref]: no compatible bridge window
  pci 0000:01:00.0: can't claim BAR 0 [mem 0xc0000000-0xcfffffff
pref]: no compatible bridge window  # Radeon

The Root Port window had the same problem before 5b28541552ef, of
course, since BIOS set it up.  But before 5b28541552ef, Linux assigned
a valid window big enough for the Radeon:

  pci 0000:00:01.0:   bridge window [mem 0xc0000000-0xcfffffff pref]

After 5b28541552ef, we won't put a 64-bit window below 4GB, so we
assign space above 4GB:

  pci 0000:00:01.0:   bridge window [mem 0x140000000-0x1401fffff 64bit pref]

which is not usable by Radeon, since it only has a 32-bit BAR.

On Mon, Dec 8, 2014 at 5:21 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Mon, Dec 8, 2014 at 3:38 PM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
>> On Mon, Dec 08, 2014 at 02:46:00PM -0700, Bjorn Helgaas wrote:
>>>Yeah, I didn't word that very clearly.  I meant that after
>>>5b28541552ef, that 64-bit window is wasted unless there's a 64-bit BAR
>>>below it; we can't even place the window below 4GB and use it for
>>>32-bit prefetchable BARs.
>>>
>>
>> Yes, I agree. It's why I suggested to return error from pbus_size_mem()
>> to indicate the case: 64-bits prefetchable window isn't used and it's
>> still available to be used by child 32-bits prefetchable BARs. Please
>> take a look on the attached draft patch, which helps to explain the idea
>> only.
>
> That would not help. The 00:01.0 on Zermond's system support hotplug. so mem
> pref will be used for 64bit pref.

I don't think it is an improvement to restrict use of the 64bit pref
window to a hypothetical future 64-bit capable device.  The
configuration done by BIOS was optimal for the hardware already in the
machine -- the prefetchable BAR is in a prefetchable window.  I don't
think we should change that configuration just because there's a
possibility of a different device in the future.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 20, 2014, 12:23 a.m. UTC | #3
On Mon, Dec 08, 2014 at 04:21:34PM -0800, Yinghai Lu wrote:
> On Mon, Dec 8, 2014 at 3:38 PM, Gavin Shan <gwshan@linux.vnet.ibm.com> wrote:
> > On Mon, Dec 08, 2014 at 02:46:00PM -0700, Bjorn Helgaas wrote:
> >>Yeah, I didn't word that very clearly.  I meant that after
> >>5b28541552ef, that 64-bit window is wasted unless there's a 64-bit BAR
> >>below it; we can't even place the window below 4GB and use it for
> >>32-bit prefetchable BARs.
> >
> > Yes, I agree. It's why I suggested to return error from pbus_size_mem()
> > to indicate the case: 64-bits prefetchable window isn't used and it's
> > still available to be used by child 32-bits prefetchable BARs. Please
> > take a look on the attached draft patch, which helps to explain the idea
> > only.
> 
> That would not help. The 00:01.0 on Zermond's system support hotplug. so mem
> pref will be used for 64bit pref.

Maybe it doesn't work as-is, but I think the idea is worth pursuing.  We
can tell whether there are existing children, and we can tell whether there
are any 64-bit prefetchable BARs.

If there is already a device below a hotplug bridge, and it has no 64-bit
BARs, I think we should use the prefetchable window for that device.

It seems silly to reserve the prefetchable window for a possible future
hot-added device.  That means we penalize the device we already know about
because it can't have any prefetchable space.  And we've consumed some
64-bit space that is unused.  We only benefit if we hot-remove the existing
device and hot-add a new device with 64-bit BARs that happen to fit inside
the 2MB (DEFAULT_HOTPLUG_MEM_SIZE) we allocated for it.  That seems pretty
unlikely.

I don't see any patches that resolve the regression (bug 85491) yet.  If we
don't figure something out, I'm going to have to revert 5b28541552ef.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 20, 2014, 12:34 a.m. UTC | #4
On Fri, Dec 19, 2014 at 4:23 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Mon, Dec 08, 2014 at 04:21:34PM -0800, Yinghai Lu wrote:
>
> I don't see any patches that resolve the regression (bug 85491) yet.  If we
> don't figure something out, I'm going to have to revert 5b28541552ef.

As you don't like idea to clear the MEM64 bit in resource ...

After this merging window, I would like to post 5 patches in

https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/log/?h=for-pci-allocate-fit-3.18
that will make those two systems works with "pci=realloc".

We could have another patch that enable pci=realloc when VGA bar get
rejected at the first point.

The point is there is no performance difference between two solutions according
to Marek's test.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas Dec. 20, 2014, 12:56 a.m. UTC | #5
On Fri, Dec 19, 2014 at 5:34 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Fri, Dec 19, 2014 at 4:23 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Mon, Dec 08, 2014 at 04:21:34PM -0800, Yinghai Lu wrote:
>>
>> I don't see any patches that resolve the regression (bug 85491) yet.  If we
>> don't figure something out, I'm going to have to revert 5b28541552ef.
>
> As you don't like idea to clear the MEM64 bit in resource ...
>
> After this merging window, I would like to post 5 patches in
>
> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/log/?h=for-pci-allocate-fit-3.18
> that will make those two systems works with "pci=realloc".

It's fine with me if you post things during the merge window.  But a
fix that depends on adding "pci=realloc" to the kernel parameters is
only a workaround, not sufficient to fix a regression.

> We could have another patch that enable pci=realloc when VGA bar get
> rejected at the first point.

That sounds pretty hacky, but I haven't seen your patch.

> The point is there is no performance difference between two solutions according
> to Marek's test.

I don't believe that.  There might not be a difference in Marek's
case, but this problem could happen with any device, and a
prefetchable window clearly performs better when we do significant
amounts of MMIO.  I'm not interested in fixes targeted at individual
cases.  We have to come up with designs that are good in general.

Marek's case is a little easier because his system is using _CRS.  We
should be able to notice that the Root Port window overlaps the host
bridge window, but isn't contained by it.  In that case, if we merely
trim the Root Port window so it fits, everything will just work.

Zermond's case is similar, but his system isn't using _CRS.  If we
turned on _CRS across the board instead of just for 2008 and newer,
his system would be fixed, too.  That's too risky for v3.19, but worth
considering.

I think we need to think about a different way of resolving the
PowerPC issue (https://bugzilla.kernel.org/show_bug.cgi?id=74151).
5b28541552ef was one way, but certainly not the only possible
approach.

SeaBIOS uses a different PCI configuration strategy.  It doesn't
handle some of the things we need, e.g., SR-IOV, but it's worth
reading for some new ideas:
http://code.coreboot.org/p/seabios/source/tree/master/src/fw/pciinit.c

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu Dec. 20, 2014, 1:33 a.m. UTC | #6
On Fri, Dec 19, 2014 at 4:56 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> Marek's case is a little easier because his system is using _CRS.  We
> should be able to notice that the Root Port window overlaps the host
> bridge window, but isn't contained by it.  In that case, if we merely
> trim the Root Port window so it fits, everything will just work.

Ok, will check that Monday.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wei Yang Dec. 22, 2014, 8:17 a.m. UTC | #7
On Fri, Dec 19, 2014 at 04:34:57PM -0800, Yinghai Lu wrote:
>On Fri, Dec 19, 2014 at 4:23 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Mon, Dec 08, 2014 at 04:21:34PM -0800, Yinghai Lu wrote:
>>
>> I don't see any patches that resolve the regression (bug 85491) yet.  If we
>> don't figure something out, I'm going to have to revert 5b28541552ef.
>
>As you don't like idea to clear the MEM64 bit in resource ...
>
>After this merging window, I would like to post 5 patches in
>
>https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/log/?h=for-pci-allocate-fit-3.18
>that will make those two systems works with "pci=realloc".
>
>We could have another patch that enable pci=realloc when VGA bar get
>rejected at the first point.
>
>The point is there is no performance difference between two solutions according
>to Marek's test.

One quick question. 

The patch in http://www.spinics.net/lists/linux-pci/msg37377.html doesn't use
"pci=realloc" as it mentioned. And seem worked on Marek's machine.

We wouldn't take it into consideration?

>
>Thanks
>
>Yinghai
Bjorn Helgaas Dec. 22, 2014, 7:50 p.m. UTC | #8
On Mon, Dec 22, 2014 at 1:17 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
> On Fri, Dec 19, 2014 at 04:34:57PM -0800, Yinghai Lu wrote:
>>On Fri, Dec 19, 2014 at 4:23 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> On Mon, Dec 08, 2014 at 04:21:34PM -0800, Yinghai Lu wrote:
>>>
>>> I don't see any patches that resolve the regression (bug 85491) yet.  If we
>>> don't figure something out, I'm going to have to revert 5b28541552ef.
>>
>>As you don't like idea to clear the MEM64 bit in resource ...
>>
>>After this merging window, I would like to post 5 patches in
>>
>>https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/log/?h=for-pci-allocate-fit-3.18
>>that will make those two systems works with "pci=realloc".
>>
>>We could have another patch that enable pci=realloc when VGA bar get
>>rejected at the first point.
>>
>>The point is there is no performance difference between two solutions according
>>to Marek's test.
>
> One quick question.
>
> The patch in http://www.spinics.net/lists/linux-pci/msg37377.html doesn't use
> "pci=realloc" as it mentioned. And seem worked on Marek's machine.
>
> We wouldn't take it into consideration?

Nope.  https://lkml.org/lkml/2014/12/19/392
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

From bef332d8e2bfc464da202daa2c9d1415db1f1224 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date: Tue, 9 Dec 2014 10:21:24 +1100
Subject: [PATCH] PCI: Use 64-bits pref window accomodate pref BARs

The PCI resource allocation and reassignment has been changed a
lot by Commit 5b28541552ef ("PCI: Restrict 64-bit prefetchable
bridge windows to 64-bit resources"): If parent bridge has 64-bits
prefetchable window, all child 64-bits prefetchable BARs are
expected to be assigned from the window. The left child BARs are
going to be allocated from parent non-prefetchable window.

If there're no child 64-bits prefetchable BARs, pbus_size_mem()
should return error so that the 64-bits prefetchable window still
can be used to accomodate child 32-bits prefetchable BARs.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
v1: Draft patch to explain idea only
---
 drivers/pci/setup-bus.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 0482235..7a7778f 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -989,6 +989,16 @@  static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
 		}
 	}
 
+	/*
+	 * If we don't have unassigned BARs with indicated type, the
+	 * corresponding window won't be used. However, we might use
+	 * this window for BARs with other types, e.g. 64-bits pref
+	 * window for 32-bits pref BARs. Here to return error to
+	 * indicate the case.
+	 */
+	if (!size)
+		return -ENODEV;
+
 	min_align = calculate_mem_align(aligns, max_order);
 	min_align = max(min_align, window_alignment(bus, b_res->flags));
 	size0 = calculate_memsize(size, min_size, 0, resource_size(b_res), min_align);
-- 
1.8.3.2