diff mbox

[1/2] PCI: make pci_claim_resource() work with conflict resources as appropriate

Message ID 1311852512-7340-2-git-send-email-dengcheng.zhu@gmail.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Deng-Cheng Zhu July 28, 2011, 11:28 a.m. UTC
In resolving a network driver issue with the MIPS Malta platform, the root
cause was traced into pci_claim_resource():

MIPS System Controller's PCI I/O resources stay in 0x1000-0xffffff. When
PCI quirks start claiming resources using request_resource_conflict(),
collisions happen and -EBUSY is returned, thereby rendering the onboard AMD
PCnet32 NIC unaware of quirks' region and preventing the NIC from functioning.
For PCI quirks, PIIX4 ACPI is expected to claim 0x1000-0x103f, and PIIX4 SMB to
claim 0x1100-0x110f, both of which fall into the MSC I/O range. Certainly, we
can increase the start point of this range in arch/mips/mti-malta/malta-pci.c to
avoid the collisions. But a fix in here looks more justified, though it seems to
have a wider impact. Using insert_xxx as opposed to request_xxx will register
PCI quirks' resources as children of MSC I/O and return OK, instead of seeing
collisions which are actually resolvable.

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
---
 drivers/pci/setup-res.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Ralf Baechle July 28, 2011, 11:53 a.m. UTC | #1
On Thu, Jul 28, 2011 at 07:28:31PM +0800, Deng-Cheng Zhu wrote:

> In resolving a network driver issue with the MIPS Malta platform, the root
> cause was traced into pci_claim_resource():
> 
> MIPS System Controller's PCI I/O resources stay in 0x1000-0xffffff. When
> PCI quirks start claiming resources using request_resource_conflict(),
> collisions happen and -EBUSY is returned, thereby rendering the onboard AMD
> PCnet32 NIC unaware of quirks' region and preventing the NIC from functioning.
> For PCI quirks, PIIX4 ACPI is expected to claim 0x1000-0x103f, and PIIX4 SMB to
> claim 0x1100-0x110f, both of which fall into the MSC I/O range. Certainly, we
> can increase the start point of this range in arch/mips/mti-malta/malta-pci.c to
> avoid the collisions. But a fix in here looks more justified, though it seems to
> have a wider impact. Using insert_xxx as opposed to request_xxx will register
> PCI quirks' resources as children of MSC I/O and return OK, instead of seeing
> collisions which are actually resolvable.

This used to work in the past; do you know which commit broke the resource
handling for the NIC?

  Ralf
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 28, 2011, 3:59 p.m. UTC | #2
On Thu, Jul 28, 2011 at 5:28 AM, Deng-Cheng Zhu <dengcheng.zhu@gmail.com> wrote:
> In resolving a network driver issue with the MIPS Malta platform, the root
> cause was traced into pci_claim_resource():
>
> MIPS System Controller's PCI I/O resources stay in 0x1000-0xffffff. When
> PCI quirks start claiming resources using request_resource_conflict(),
> collisions happen and -EBUSY is returned, thereby rendering the onboard AMD
> PCnet32 NIC unaware of quirks' region and preventing the NIC from functioning.
> For PCI quirks, PIIX4 ACPI is expected to claim 0x1000-0x103f, and PIIX4 SMB to
> claim 0x1100-0x110f, both of which fall into the MSC I/O range. Certainly, we
> can increase the start point of this range in arch/mips/mti-malta/malta-pci.c to
> avoid the collisions. But a fix in here looks more justified, though it seems to
> have a wider impact. Using insert_xxx as opposed to request_xxx will register
> PCI quirks' resources as children of MSC I/O and return OK, instead of seeing
> collisions which are actually resolvable.

What's the collision?  Can we see the dmesg log (which should have
that information) and maybe the /proc/ioports contents?  Did something
change the order in which we claim resources, so things that used to
work now cause conflicts?

I think insert_resource() (where the newly-inserted resource can
become the parent of something that was previously inserted) is sort
of a hack, and the fact that we need it is telling us that we're doing
things in the wrong order.  It's nicer when we can discover and claim
resources in a top-down hierarchical way.  But I recognize that may
not always be possible, or at least not convenient.

Bjorn
Deng-Cheng Zhu July 29, 2011, 6:32 a.m. UTC | #3
I noticed that at 79896cf42f Linus changed the function from insert_resource()
to request_resource() (and later evolved into request_resource_conflict()) and
he explained the reason. So, in the NIC's case, the problem is that in
pci_claim_resource() the function pci_find_parent_resource() returns the root
(0x0-0xffffff) rather than the MSC PCI I/O (0x1000-0xffffff). So
request_resource_conflict() for PCI quirks (0x1000-0x103f and 0x1100-0x110f)
will simply return an error, because these two regions 'conflict' with MSC PCI
I/O. In contrast, insert_resource_conflict() will also find the collisions, but
it registers the quirks as children of MSC PCI I/O (is this supposed to be
correct?) and returns success.
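
To make the difference between the two calls concrete, here is a minimal
userspace sketch of the behaviour described above. It is not the kernel code:
struct res, model_request() and model_insert() are hypothetical stand-ins, and
the insert model covers only the case where the colliding resource fully
contains the new one (the real insert_resource_conflict() can also make the new
resource the parent of existing ones, which is left out here).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct resource. */
struct res {
	const char *name;
	unsigned long start, end;	/* inclusive range */
	struct res *parent, *sibling, *child;
};

/*
 * Model of request_resource_conflict(): link nres directly under root.
 * Returns NULL on success, or the resource it collides with.
 */
static struct res *model_request(struct res *root, struct res *nres)
{
	struct res **p, *tmp;

	assert(root && nres);
	if (nres->start > nres->end ||
	    nres->start < root->start || nres->end > root->end)
		return root;		/* does not fit inside root */

	/* Children are kept sorted by start address. */
	for (p = &root->child; (tmp = *p) != NULL; p = &tmp->sibling) {
		if (tmp->start > nres->end)
			break;		/* free gap found before tmp */
		if (tmp->end >= nres->start)
			return tmp;	/* overlaps an existing child */
	}
	nres->sibling = *p;
	nres->parent = root;
	*p = nres;
	return NULL;
}

/*
 * Model of insert_resource_conflict(), descent case only: when the
 * colliding resource fully contains nres, retry one level further down.
 */
static struct res *model_insert(struct res *parent, struct res *nres)
{
	for (;;) {
		struct res *first = model_request(parent, nres);

		if (first == NULL || first == parent)
			return first;
		if (first->start > nres->start || first->end < nres->end)
			return first;	/* partial overlap: reported as conflict */
		if (first->start == nres->start && first->end == nres->end)
			return first;	/* identical range: reported as conflict */
		parent = first;		/* fully contained: descend */
	}
}
```

With a root of 0x0-0xffffff that already holds MSC PCI I/O (0x1000-0xffffff),
model_request() on 0x1000-0x103f returns the MSC entry as the conflict, while
model_insert() descends one level and links the range as a child of MSC PCI I/O.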


Deng-Cheng

Deng-Cheng Zhu July 29, 2011, 6:35 a.m. UTC | #4
Well, here are dmesg and /proc/ioports *BEFORE* applying the patch:

...
...
pci 0000:00:0a.0: [8086:7110] type 0 class 0x000601
pci 0000:00:0a.1: [8086:7111] type 0 class 0x000101
pci 0000:00:0a.1: reg 20: [io  0x1240-0x124f]
pci 0000:00:0a.2: [8086:7112] type 0 class 0x000c03
pci 0000:00:0a.2: reg 20: [io  0x1220-0x123f]
pci 0000:00:0a.3: [8086:7113] type 0 class 0x000680
pci 0000:00:0a.3: address space collision: [io  0x1000-0x103f]
conflicts with MSC PCI I/O [io  0x1000-0xffffff]
pci 0000:00:0a.3: address space collision: [io  0x1100-0x110f]
conflicts with MSC PCI I/O [io  0x1000-0xffffff]
pci 0000:00:0b.0: [1022:2000] type 0 class 0x000200
pci 0000:00:0b.0: reg 10: [io  0x1200-0x121f]
pci 0000:00:0b.0: reg 14: [mem 0x10000000-0x1000001f]
pci 0000:00:0b.0: reg 30: [mem 0x00000000-0x000fffff pref]
pci 0000:00:0b.0: PME# supported from D0 D3hot D3cold
pci 0000:00:0b.0: PME# disabled
pci 0000:00:11.0: [153f:0001] type 0 class 0x000600
pci 0000:00:11.0: reg 10: [mem 0x00000000-0x0fffffff]
pci 0000:00:0a.3: BAR 13: [io  0x1000-0x103f] has bogus alignment
pci 0000:00:0a.3: BAR 14: [io  0x1100-0x110f] has bogus alignment
pci 0000:00:0b.0: BAR 6: assigned [mem 0x10000000-0x100fffff pref]
pci 0000:00:0a.2: BAR 4: assigned [io  0x1000-0x101f]
pci 0000:00:0a.2: BAR 4: set to [io  0x1000-0x101f] (PCI address
[0x1000-0x101f])
pci 0000:00:0b.0: BAR 0: assigned [io  0x1020-0x103f]
pci 0000:00:0b.0: BAR 0: set to [io  0x1020-0x103f] (PCI address
[0x1020-0x103f])
pci 0000:00:0b.0: BAR 1: assigned [mem 0x10100000-0x1010001f]
pci 0000:00:0b.0: BAR 1: set to [mem 0x10100000-0x1010001f] (PCI
address [0x10100000-0x1010001f])
pci 0000:00:0a.1: BAR 4: assigned [io  0x1040-0x104f]
pci 0000:00:0a.1: BAR 4: set to [io  0x1040-0x104f] (PCI address
[0x1040-0x104f])
...
...
pcnet32: pcnet32.c:v1.35 21.Apr.2008 tsbogend@alpha.franken.de
pcnet32: No access methods
...
...

-sh-4.0# cat /proc/ioports
00000000-0000001f : dma1
00000020-00000021 : pic1
00000040-0000005f : timer
00000060-0000006f : keyboard
00000070-00000077 : rtc_cmos
00000080-0000008f : dma page reg
000000a0-000000a1 : pic2
000000c0-000000df : dma2
00000170-00000177 : piix
000001f0-000001f7 : piix
000002f8-000002ff : serial
00000376-00000376 : piix
000003f6-000003f6 : piix
000003f8-000003ff : serial
00001000-00ffffff : MSC PCI I/O
  00001000-0000101f : 0000:00:0a.2
  00001020-0000103f : 0000:00:0b.0
  00001040-0000104f : 0000:00:0a.1
    00001040-0000104f : piix

And *AFTER* applying the patch:

...
...
pci 0000:00:0a.0: [8086:7110] type 0 class 0x000601
pci 0000:00:0a.1: [8086:7111] type 0 class 0x000101
pci 0000:00:0a.1: reg 20: [io  0x1240-0x124f]
pci 0000:00:0a.2: [8086:7112] type 0 class 0x000c03
pci 0000:00:0a.2: reg 20: [io  0x1220-0x123f]
pci 0000:00:0a.3: [8086:7113] type 0 class 0x000680
pci 0000:00:0a.3: quirk: [io  0x1000-0x103f] claimed by PIIX4 ACPI
pci 0000:00:0a.3: quirk: [io  0x1100-0x110f] claimed by PIIX4 SMB
pci 0000:00:0b.0: [1022:2000] type 0 class 0x000200
pci 0000:00:0b.0: reg 10: [io  0x1200-0x121f]
pci 0000:00:0b.0: reg 14: [mem 0x10000000-0x1000001f]
pci 0000:00:0b.0: reg 30: [mem 0x00000000-0x000fffff pref]
pci 0000:00:0b.0: PME# supported from D0 D3hot D3cold
pci 0000:00:0b.0: PME# disabled
pci 0000:00:11.0: [153f:0001] type 0 class 0x000600
pci 0000:00:11.0: reg 10: [mem 0x00000000-0x0fffffff]
pci 0000:00:0b.0: BAR 6: assigned [mem 0x10000000-0x100fffff pref]
pci 0000:00:0a.2: BAR 4: assigned [io  0x1040-0x105f]
pci 0000:00:0a.2: BAR 4: set to [io  0x1040-0x105f] (PCI address
[0x1040-0x105f])
pci 0000:00:0b.0: BAR 0: assigned [io  0x1060-0x107f]
pci 0000:00:0b.0: BAR 0: set to [io  0x1060-0x107f] (PCI address
[0x1060-0x107f])
pci 0000:00:0b.0: BAR 1: assigned [mem 0x10100000-0x1010001f]
pci 0000:00:0b.0: BAR 1: set to [mem 0x10100000-0x1010001f] (PCI
address [0x10100000-0x1010001f])
pci 0000:00:0a.1: BAR 4: assigned [io  0x1080-0x108f]
pci 0000:00:0a.1: BAR 4: set to [io  0x1080-0x108f] (PCI address
[0x1080-0x108f])
...
...
pcnet32: pcnet32.c:v1.35 21.Apr.2008 tsbogend@alpha.franken.de
pcnet32: PCnet/FAST III 79C973 at 0x1060, 00:d0:a0:00:06:72 assigned IRQ 10
pcnet32: Found PHY 0000:6b60 at address 30
pcnet32: eth0: registered as PCnet/FAST III 79C973
pcnet32: 1 cards_found
...
...

-sh-4.0# cat /proc/ioports
00000000-0000001f : dma1
00000020-00000021 : pic1
00000040-0000005f : timer
00000060-0000006f : keyboard
00000070-00000077 : rtc_cmos
00000080-0000008f : dma page reg
000000a0-000000a1 : pic2
000000c0-000000df : dma2
00000170-00000177 : piix
000001f0-000001f7 : piix
000002f8-000002ff : serial
00000376-00000376 : piix
000003f6-000003f6 : piix
000003f8-000003ff : serial
00001000-00ffffff : MSC PCI I/O
  00001000-0000103f : 0000:00:0a.3
  00001040-0000105f : 0000:00:0a.2
  00001060-0000107f : 0000:00:0b.0
    00001060-0000107f : pcnet32_probe_pci
  00001080-0000108f : 0000:00:0a.1
    00001080-0000108f : piix
  00001100-0000110f : 0000:00:0a.3

> Did something change the order in which we claim resources, so things that
> used to work now cause conflicts?

It used to work (as I see on 2.6.29) because the function insert_resource() was
used. The /proc/ioports of 2.6.29 has exactly the same contents as that of the
patched kernel.

> I think insert_resource() (where the newly-inserted resource can become the
> parent of something that was previously inserted) is sort of a hack...

Yes, agreed, though in this case the newly-inserted resource actually becomes
the child of a previously inserted one, the MIPS System Controller's PCI I/O.
So, if we stick to request_resource_conflict() as opposed to
insert_resource_conflict(), the question is, as I mentioned in my reply to
Ralf, why pci_find_parent_resource() returns the root rather than MSC PCI I/O.
If MSC PCI I/O is supposed to be the parent of the PCI quirks, then something
is wrong somewhere else. Otherwise, the start point of MSC PCI I/O might be
raised to avoid the collisions.


Deng-Cheng

Bjorn Helgaas July 29, 2011, 5:35 p.m. UTC | #5
On Fri, Jul 29, 2011 at 12:32 AM, Deng-Cheng Zhu
<dengcheng.zhu@gmail.com> wrote:
> I noticed that at 79896cf42f Linus changed the function from insert_resource()
> to request_resource() (and later evolved into request_resource_conflict()) and
> he explained the reason. So, in the NIC's case, the problem is that in
> pci_claim_resource() the function pci_find_parent_resource() returns the root
> (0x0-0xffffff) rather than the MSC PCI I/O (0x1000-0xffffff).

This seems like the real problem: PCI has the wrong idea of the
resources available on bus 00.  The pci_bus->resource[0] for bus 00
points to ioport_resource (the default put there by pci_create_bus()),
when it should point to msc_io_resource instead.

Some architectures fill in the pci_bus->resource[] array directly for
host bridges (for examples, try 'grep -r "resource\[0\] = " arch/').
On x86 and ia64, we use pci_bus_remove_resources() and
pci_bus_add_resource(), and I'd prefer that style for new code because
it hides some ugly implementation details.

I'm a little puzzled that we don't see this problem on more
architectures.  The grep above only found a few arches that update the
root bus resources.  I would expect most of the ones it didn't find to
be broken the same way Malta is.

Bjorn
Deng-Cheng Zhu Aug. 1, 2011, 10:13 a.m. UTC | #6
It was found that PCI quirks claim resources (by calling pci_claim_resource())
*BEFORE* pcibios_fixup_bus() is called. In pcibios_fixup_bus(),
pci_bus->resource[0] for the root bus DOES point to msc_io_resource. If PCI
quirks do the resource claim after the arch-defined pcibios_fixup_bus() has been
called, then the problem with Malta goes away.

So it looks like there are two possible solutions:

1) Change the call sequence. This does not seem desirable, as it affects other
arches.

2) Raise the start point of the system controller's io_resource in
mips_pcibios_init() in arch/mips/mti-malta/malta-pci.c. This will place the PCI
quirks' resources at the same level as the system controller's resources.

Ralf and Bjorn, which one sounds good to you?


Deng-Cheng


Bjorn Helgaas Aug. 1, 2011, 3:21 p.m. UTC | #7
On Mon, Aug 1, 2011 at 4:13 AM, Deng-Cheng Zhu <dengcheng.zhu@gmail.com> wrote:
> It was found that PCI quirks claim resources (by calling pci_claim_resource())
> *BEFORE* pcibios_fixup_bus() is called. In pcibios_fixup_bus(),
> pci_bus->resource[0] for the root bus DOES point to msc_io_resource. If PCI
> quirks do the resource claim after the arch-defined pcibios_fixup_bus() has been
> called, then the problem with Malta goes away.

Oh, I see.  pcibios_fixup_bus() copies the hose resources to the root
bus pci_bus structure.  I think that's bogus because we have the
interval between mips_pcibios_init() and pcibios_fixup_bus() where the
root bus resources are incorrect.
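
The window described above can be illustrated with a small userspace model.
This is only a sketch under simplifying assumptions: struct res, struct bus,
request_under() and model_claim() are hypothetical names, the bus has a single
I/O slot standing in for pci_bus->resource[0], and the parent lookup is reduced
to "use whatever the bus currently points at".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct resource. */
struct res {
	const char *name;
	unsigned long start, end;	/* inclusive range */
	struct res *sibling, *child;
};

/* Stand-in for a root bus with one I/O window (pci_bus->resource[0]). */
struct bus {
	struct res *io;
};

/* Link r directly under parent unless it overlaps an existing child. */
static int request_under(struct res *parent, struct res *r)
{
	struct res **p, *tmp;

	assert(parent && r);
	if (r->start < parent->start || r->end > parent->end)
		return -EBUSY;
	for (p = &parent->child; (tmp = *p) != NULL; p = &tmp->sibling) {
		if (tmp->start > r->end)
			break;		/* free gap found before tmp */
		if (tmp->end >= r->start)
			return -EBUSY;	/* collides with a child of parent */
	}
	r->sibling = *p;
	*p = r;
	return 0;
}

/* Model of a claim: the parent is whatever the bus points at right now. */
static int model_claim(struct bus *b, struct res *r)
{
	return request_under(b->io, r);
}
```

If the bus still points at the whole I/O space (the default) while MSC PCI I/O
is already registered under it, a claim for 0x1000-0x103f collides with the MSC
entry and fails with -EBUSY. Once the bus points at the MSC window itself, the
same claim succeeds and the range becomes its child.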

I think it would be better to set up the resources correctly right
away, as we do on x86.  In fact, I'm dubious about pci_create_bus()
filling in ioport_resource and iomem_resource as defaults -- that's
never what we really want there, and we have to rely on the arch
coming back later to fix it up.

I'd like to see some sort of restructuring there so we could pass in a
list of resources at the time we create the bus.

Bjorn

Patch

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index bc0e6ee..40d767e 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -102,7 +102,7 @@  int pci_claim_resource(struct pci_dev *dev, int resource)
 		return -EINVAL;
 	}
 
-	conflict = request_resource_conflict(root, res);
+	conflict = insert_resource_conflict(root, res);
 	if (conflict) {
 		dev_info(&dev->dev,
 			 "address space collision: %pR conflicts with %s %pR\n",