Hint HB6 - kernel doesn't see chips behind it.

Message ID 20160129162400.GB12965@localhost (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Bjorn Helgaas Jan. 29, 2016, 4:24 p.m. UTC
On Thu, Jan 28, 2016 at 10:23:03AM +0000, Richard F wrote:
> Bjorn,
> 
> Thanks for looking at it. Actually I've tried 3 different PCI cards in
> both slots and they all work fine.  I have 2 different single channel
> BT878 based PCI cards in there now capturing CCTV 24 x 7, the machine
> stays up for months. That's why it's so odd.  Any other pointers?  Does
> the kernel need the BIOS to detect the card right?

The kernel should at least discover the card even if the BIOS doesn't
do anything.  But on x86, the BIOS usually *does* configure things, so
it's very possible we could trip over a Linux defect if it doesn't.

Try booting with "pci=pcie_scan_all".  That *shouldn't* make a
difference because this isn't a PCIe device, but maybe our logic is
broken.

Your topology looks a little strange:

  00:1c.0 PCIe root port to [bus 01]    slot 0
  00:1c.1 PCIe root port to [bus 02]    slot 1
  00:1c.2 PCIe root port to [bus 03-05] slot 2
  03:00.0 PCI bridge to [bus 04-05] (Integrated Technology Express)
  04:01.0 PCI bridge to [bus 05] (Hint Corp)

00:1c.2 is a normal PCIe Root Port, so the device it's connected to
*should* be a PCIe device, but 03:00.0 doesn't have a PCIe capability.
Is this an adapter card of some kind?

03:00.0 is an ITE 8893, and we do have a quirk related to a similar
device:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/quirks.c?h=v4.4#n3662

Can you try the patch below, please?

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Richard F Jan. 30, 2016, 5:54 p.m. UTC | #1
On 29/01/2016 16:24, Bjorn Helgaas wrote:
> On Thu, Jan 28, 2016 at 10:23:03AM +0000, Richard F wrote:
>> Bjorn,
>>
>> Thanks for looking at it. Actually I've tried 3 different PCI cards in
>> both slots and they all work fine.  I have 2 different single channel
>> BT878 based PCI cards in there now capturing CCTV 24 x 7, the machine
>> stays up for months. That's why it's so odd.  Any other pointers?  Does
>> the kernel need the BIOS to detect the card right?
> 
> The kernel should at least discover the card even if the BIOS doesn't
> do anything.  But on x86, the BIOS usually *does* configure things, so
> it's very possible we could trip over a Linux defect if it doesn't.
> 
> Try booting with "pci=pcie_scan_all".  That *shouldn't* make a
> difference because this isn't a PCIe device, but maybe our logic is
> broken.
> 
> Your topology looks a little strange:
> 
>   00:1c.0 PCIe root port to [bus 01]    slot 0
>   00:1c.1 PCIe root port to [bus 02]    slot 1
>   00:1c.2 PCIe root port to [bus 03-05] slot 2
>   03:00.0 PCI bridge to [bus 04-05] (Integrated Technology Express)
>   04:01.0 PCI bridge to [bus 05] (Hint Corp)
> 
> 00:1c.2 is a normal PCIe Root Port, so the device it's connected to
> *should* be a PCIe device, but 03:00.0 doesn't have a PCIe capability.
> Is this an adapter card of some kind?

It's a motherboard bridge to get from PCIe to legacy PCI slots, quite a
few motherboards use it I think. It's not an adapter I plugged in.

pci=pcie_scan_all didn't yield anything new, as you expected.

I posted the output of DMESG with your patch (in 4.4.0) to the bugzilla
https://bugzilla.kernel.org/show_bug.cgi?id=110851

It produced a fair bit of output but doesn't look like the card was
recognised. At least modprobe'ing bttv with the right parameters didn't
yield the right response.

I also tried pci=routeirq and posted the result; that didn't help either.

Please let me know if there's anything else I can try. The AMI BIOS
doesn't have any setup parameters for the PCI/e.

Thanks
Richard


Bjorn Helgaas Feb. 1, 2016, 7:06 p.m. UTC | #2
On Sat, Jan 30, 2016 at 05:54:28PM +0000, Richard F wrote:
> On 29/01/2016 16:24, Bjorn Helgaas wrote:
> > On Thu, Jan 28, 2016 at 10:23:03AM +0000, Richard F wrote:
> > Your topology looks a little strange:
> > 
> >   00:1c.0 PCIe root port to [bus 01]    slot 0
> >   00:1c.1 PCIe root port to [bus 02]    slot 1
> >   00:1c.2 PCIe root port to [bus 03-05] slot 2
> >   03:00.0 PCI bridge to [bus 04-05] (Integrated Technology Express)
> >   04:01.0 PCI bridge to [bus 05] (Hint Corp)
> > 
> > 00:1c.2 is a normal PCIe Root Port, so the device it's connected to
> > *should* be a PCIe device, but 03:00.0 doesn't have a PCIe capability.
> > Is this an adapter card of some kind?
> 
> It's a motherboard bridge to get from PCIe to legacy PCI slots, quite a
> few motherboards use it I think. It's not an adapter I plugged in.

That makes sense.  It sounds like there are two conventional PCI
slots?  I think it's also a minor platform bug that the 00:1c.2 root
port advertises a slot.  1c.2 is connected to a system-integrated
device, i.e., 03:00.0, not a slot.  This might cause pciehp to claim
1c.2 when it shouldn't.  But that's unrelated to the current issue, of
course.

> I posted the output of DMESG with your patch (in 4.4.0) to the bugzilla
> https://bugzilla.kernel.org/show_bug.cgi?id=110851
> 
> It produced a fair bit of output but doesn't look like the card was
> recognised. At least modprobe'ing bttv with the right parameters didn't
> yield the right response.

I only added printks, so I didn't expect it to change the behavior.  I
just wanted to confirm that we are scanning the bus and device numbers
where we expect the bttv devices to be, and we are.  I think your bttv
card includes these devices:

  04:01.0 PCI-PCI bridge (Hint Corp)
  05:0c.0 bt878
  05:0c.1 bt878
  05:0d.0 bt878
  ...
  05:0f.1 bt878

For conventional PCI, I think the device number is determined by the
slot wiring.  That affects the device number of the Hint Corp bridge,
so if you move it between slots, it should change from device 01 to
something else.

The device and function numbers of the bt878 devices are determined by
wiring on the card, so those should be the same between machine A and
B.  These are 5- and 3-bit fields, respectively, so 0c.0 means we have
01100 000 encoded into an 8-bit devfn as 0110 0000 or 0x60.  When we
tried to read the vendor & device ID from 0c.0, we got no response
from the device:

  pci_bus 0000:05: pci_scan_slot 60
  pci_bus 0000:05: pci_scan_single_device 60
  pci_bus 0000:05: pci_scan_device 60
  pci_bus 0000:05: pci_bus_read_dev_vendor_id 60 0xffffffff
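
As a minimal sketch of the devfn arithmetic above and of how to read the
last log line: the packing follows the same layout as the kernel's
PCI_DEVFN/PCI_SLOT/PCI_FUNC macros, and the ID test mirrors the check
visible in the patch context for pci_bus_read_dev_vendor_id().  The helper
names here (devfn_pack, devfn_dev, devfn_fn, id_means_no_device) are made
up for illustration; this is plain userspace C, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack a 5-bit device number and a 3-bit function number into an 8-bit
 * devfn, using the same layout as the kernel's PCI_DEVFN() macro. */
static inline uint8_t devfn_pack(unsigned int dev, unsigned int fn)
{
	return (uint8_t)(((dev & 0x1f) << 3) | (fn & 0x07));
}

/* Extract the device number (upper 5 bits) from a devfn. */
static inline unsigned int devfn_dev(uint8_t devfn)
{
	return (devfn >> 3) & 0x1f;
}

/* Extract the function number (lower 3 bits) from a devfn. */
static inline unsigned int devfn_fn(uint8_t devfn)
{
	return devfn & 0x07;
}

/* True if the dword read at PCI_VENDOR_ID looks like "nothing answered".
 * These are the four values tested in pci_bus_read_dev_vendor_id(). */
static inline bool id_means_no_device(uint32_t l)
{
	return l == 0xffffffff || l == 0x00000000 ||
	       l == 0x0000ffff || l == 0xffff0000;
}
```

With these, devfn_pack(0x0c, 0) gives 0x60, the value printed in the log,
and id_means_no_device(0xffffffff) is true, i.e. nothing answered the
config read at 05:0c.0.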

I'm out of ideas.  Other cards work in this slot; it's only the bttv
card that doesn't work.  So it seems like it must be something about
that card that's different.

Maybe somebody on the list will have more ideas?

Bjorn
Richard F Feb. 1, 2016, 8:06 p.m. UTC | #3
Bjorn,

Thanks so much for your investigation.

Yes, MOBO has 2 PCI slots fed off the IT8893E.
This guy -
http://linux.derkeiler.com/Mailing-Lists/Debian/2005-08/3243.html  had a
similar problem 10yrs ago(!), fixed by disabling ACPI, I tried that
without any success.   I did extract the BIOS tables and
disassemble/reassemble them to see if there was anything obvious broken
there, and not much shows, a few warnings, AML is pretty indecipherable
stuff... I also tried fooling it with some Windows acpi_os_name's
matching those in the AML, without luck.

For kicks I spun up an old Win XP image today, and it recognised the PCI
bridge but may not have been able to see the BT878's behind, but then I
didn't have a reliable source for drivers for it.

Is there something that needs configuring in that Hint HB6/PCI6140
bridge?  When working, it loads the shpchp module, and it does advertise
itself as "non transparent" mode. The other difference is a latency of
64 in the working scenario, 32 when not. Not configurable on the AMI
BIOS unfortunately.

Thanks again
Richard


Bjorn Helgaas Feb. 1, 2016, 11:35 p.m. UTC | #4
On Mon, Feb 01, 2016 at 08:06:51PM +0000, Richard F wrote:
> Bjorn,
> 
> Thanks so much for your investigation.
> 
> Yes, MOBO has 2 PCI slots fed off the IT8893E.
> This guy -
> http://linux.derkeiler.com/Mailing-Lists/Debian/2005-08/3243.html  had a
> similar problem 10yrs ago(!), fixed by disabling ACPI, I tried that
> without any success.   

Interesting.  That makes you think there's some bridge init
difference.

> I did extract the BIOS tables and
> disassemble/reassemble them to see if there was anything obvious broken
> there, and not much shows, a few warnings, AML is pretty indecipherable
> stuff... I also tried fooling it with some Windows acpi_os_name's
> matching those in the AML, without luck.
> 
> For kicks I spun up an old Win XP image today, and it recognised the PCI
> bridge but may not have been able to see the BT878's behind, but then I
> didn't have a reliable source for drivers for it.

You should be able to tell whether Windows sees the BT878 even without
drivers.  I think there might be something in Device Manager, or you
can use a tool like AIDA64 (there was a free trial version last I
checked).

> Is there something that needs configuring in that Hint HB6/PCI6140
> bridge?

I can't think of anything, but that does seem like the most likely
explanation.

> When working, it loads the shpchp module, and it does advertise
> itself as "non transparent" mode. 

I see "Hint Corp HB6 Universal PCI-PCI bridge (non-transparent mode)"
in both lspci outputs.  Is that what you mean, or do you see a
difference somewhere else?  It looks like that string is just looked
up from the device ID; it's not influenced by anything the kernel
does.

> The other difference is a latency of
> 64 in the working scenario, 32 when not. Not configurable on the AMI
> BIOS unfortunately.

I did notice the shpchp and latency timer differences, but I couldn't
figure out how they could possibly be related.  But it certainly
wouldn't hurt to enable shpchp in your kernel and see if it makes a
difference.

I can't figure out how the latency timer could be involved either, but
you can try fiddling with it, e.g., set it to 64:

  # setpci -s04:01.0 0x0d.b=0x40
  # echo 1 > /sys/bus/pci/rescan

You can also use "lspci -vvv -s04:01.0" and compare with the working
system and see if there are other differences.  I think AIDA64 will
also dump that config space, so you might be able to compare with
what Windows XP does, too.
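
For reference, the setpci command above writes the byte at config offset
0x0d, which is the Latency Timer register (PCI_LATENCY_TIMER in the kernel
headers); 0x40 is 64 decimal.  A minimal sketch of reading that byte back,
assuming the bridge's config space is available as a binary file such as
/sys/bus/pci/devices/0000:04:01.0/config.  The helper name cfg_byte_at is
hypothetical:

```c
#include <stdio.h>

#define CFG_LATENCY_TIMER 0x0d	/* same offset as the kernel's PCI_LATENCY_TIMER */

/* Return the config-space byte at offset 'off' from an open binary stream,
 * or -1 if the seek or the read fails (e.g. offset past the end). */
static inline int cfg_byte_at(FILE *f, long off)
{
	unsigned char b;

	if (fseek(f, off, SEEK_SET) != 0)
		return -1;
	if (fread(&b, 1, 1, f) != 1)
		return -1;
	return b;
}
```

Opening the sysfs config file with fopen(path, "rb") and calling
cfg_byte_at(f, CFG_LATENCY_TIMER) should return 64 after the setpci write,
if the write took effect.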

Richard F Feb. 3, 2016, 1:52 p.m. UTC | #5
On 1/02/2016 23:35, Bjorn Helgaas wrote:
---
> 
> You should be able to tell whether Windows sees the BT878 even without
> drivers.  I think there might be something in Device Manager, or you
> can use a tool like AIDA64 (there was a free trial version last I
> checked).

I ran up AIDA64. The Hint device was recognised as something slightly
different. It also didn't list anything behind the bridge - same issue.
Not sure if the Subsystem ID of 0000 is an issue.

[ HiNT HB1-SE33 PCI-PCI Bridge ]

Device Properties:
    Device Description               HiNT HB1-SE33 PCI-PCI Bridge
    Bus Type                         PCI
    Bus / Device / Function          4 / 1 / 0
    Device ID                        3388-0021
    Subsystem ID                     0000-0000
    Device Class                     0604 (PCI/PCI Bridge)
    Revision                         11
    Fast Back-to-Back Transactions   Supported, Disabled

Device Features:
    66 MHz Operation                 Not Supported
    Bus Mastering                    Enabled

The IT8893 similarly listed:

[ ITE IT8893 PCI Bridge ]

Device Properties:
    Device Description               ITE IT8893 PCI Bridge
    Bus Type                         PCI
    Bus / Device / Function          3 / 0 / 0
    Device ID                        1283-8893
    Subsystem ID                     0000-0000
    Device Class                     0604 (PCI/PCI Bridge)
    Revision                         10
    Fast Back-to-Back Transactions   Not Supported

Device Features:
    66 MHz Operation                 Not Supported


>> Is there something that needs configuring in that Hint HB6/PCI6140
>> bridge?
> 
> I can't think of anything, but that does seem like the most likely
> explanation.
> 
>> When working, it loads the shpchp module, and it does advertise
>> itself as "non transparent" mode. 
> 
> I see "Hint Corp HB6 Universal PCI-PCI bridge (non-transparent mode)"
> in both lspci outputs.  Is that what you mean, or do you see a
> difference somewhere else?  It looks like that string is just looked
> up from the device ID; it's not influenced by anything the kernel
> does.
> 
>> The other difference is a latency of
>> 64 in the working scenario, 32 when not. Not configurable on the AMI
>> BIOS unfortunately.
> 
> I did notice the shpchp and latency timer differences, but I couldn't
> figure out how they could possibly be related.  But it certainly
> wouldn't hurt to enable shpchp in your kernel and see if it makes a
> difference.
> 
> I can't figure out how the latency timer could be involved either, but
> you can try fiddling with it, e.g., set it to 64:
> 
>   # setpci -s04:01.0 0x0d.b=0x40
>   # echo 1 > /sys/bus/pci/rescan

The shpchp module was already in the kernel config, but not used.
rmmoding and modprobing again doesn't appear to help.

I tried the above setpci and rescan, but that didn't do anything new.

Must be a broken BIOS somehow masking the bridge - are we at a dead end?

Thanks
Richard


Bjorn Helgaas Feb. 3, 2016, 3:51 p.m. UTC | #6
On Wed, Feb 03, 2016 at 01:52:42PM +0000, Richard F wrote:
> On 1/02/2016 23:35, Bjorn Helgaas wrote:
> ---
> > 
> > You should be able to tell whether Windows sees the BT878 even without
> > drivers.  I think there might be something in Device Manager, or you
> > can use a tool like AIDA64 (there was a free trial version last I
> > checked).
> 
> I ran up AIDA64. The Hint device was recognised as something slightly
> different. It also didn't list anything behind the bridge - same issue.
> Not sure if the Subsystem ID of 0000 is an issue.

I don't think the Subsystem ID is relevant.

> [ HiNT HB1-SE33 PCI-PCI Bridge ]
> 
> Device Properties:
>     Device Description               HiNT HB1-SE33 PCI-PCI Bridge
>     Bus Type                         PCI
>     Bus / Device / Function          4 / 1 / 0
>     Device ID                        3388-0021
>     Subsystem ID                     0000-0000
>     Device Class                     0604 (PCI/PCI Bridge)
>     Revision                         11
>     Fast Back-to-Back Transactions   Supported, Disabled
> 
> Device Features:
>     66 MHz Operation                 Not Supported
>     Bus Mastering                    Enabled
> 
> The IT8893 similarly listed:
> 
> [ ITE IT8893 PCI Bridge ]
> 
> Device Properties:
>     Device Description               ITE IT8893 PCI Bridge
>     Bus Type                         PCI
>     Bus / Device / Function          3 / 0 / 0
>     Device ID                        1283-8893
>     Subsystem ID                     0000-0000
>     Device Class                     0604 (PCI/PCI Bridge)
>     Revision                         10
>     Fast Back-to-Back Transactions   Not Supported
> 
> Device Features:
>     66 MHz Operation                 Not Supported
> 
> 
> >> Is there something that needs configuring in that Hint HB6/PCI6140
> >> bridge?
> > 
> > I can't think of anything, but that does seem like the most likely
> > explanation.
> > 
> >> When working, it loads the shpchp module, and it does advertise
> >> itself as "non transparent" mode. 
> > 
> > I see "Hint Corp HB6 Universal PCI-PCI bridge (non-transparent mode)"
> > in both lspci outputs.  Is that what you mean, or do you see a
> > difference somewhere else?  It looks like that string is just looked
> > up from the device ID; it's not influenced by anything the kernel
> > does.
> > 
> >> The other difference is a latency of
> >> 64 in the working scenario, 32 when not. Not configurable on the AMI
> >> BIOS unfortunately.
> > 
> > I did notice the shpchp and latency timer differences, but I couldn't
> > figure out how they could possibly be related.  But it certainly
> > wouldn't hurt to enable shpchp in your kernel and see if it makes a
> > difference.
> > 
> > I can't figure out how the latency timer could be involved either, but
> > you can try fiddling with it, e.g., set it to 64:
> > 
> >   # setpci -s04:01.0 0x0d.b=0x40
> >   # echo 1 > /sys/bus/pci/rescan
> 
> The shpchp module was already in the kernel config, but not used.
> rmmoding and modprobing again doesn't appear to help.
> 
> I tried the above setpci and rescan, but that didn't do anything new.
> 
> Must be a broken BIOS somehow masking the bridge - are we at a dead end?

I mentioned "lspci -vvv" before, but I meant "lspci -xxx": that will
show you the whole config space.  You could compare them between the
working and non-working machines.  I think the Hint bridge is the
important device.

The BIOS isn't directly involved when we're enumerating devices.  It
may have done setup earlier that affects how the hardware works, but
it doesn't have a chance to intervene when we do config reads to find
devices.  So if the BIOS configured something in the bridge that
causes this problem, the "lspci -xxx" should show it.
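
The comparison suggested above can be sketched as a byte-by-byte diff of
two config-space images, for example 256-byte buffers filled from the hex
dump that "lspci -xxx" prints, or from the sysfs config file of 04:01.0 on
each machine.  The function name cfg_diff and its output format are
assumptions for illustration, not an existing tool:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Report every offset where two config-space images differ and return the
 * number of differing bytes.  'a' is the image from the working machine,
 * 'b' the one from the non-working machine.  Pass out == NULL to count
 * the differences without printing them. */
static inline size_t cfg_diff(const uint8_t *a, const uint8_t *b,
			      size_t len, FILE *out)
{
	size_t i, ndiff = 0;

	for (i = 0; i < len; i++) {
		if (a[i] == b[i])
			continue;
		if (out)
			fprintf(out, "offset 0x%02zx: working=0x%02x failing=0x%02x\n",
				i, (unsigned int)a[i], (unsigned int)b[i]);
		ndiff++;
	}
	return ndiff;
}
```

Given the latency timer values reported earlier in the thread (64 when
working, 32 when not), a diff of the Hint bridge would be expected to list
at least offset 0x0d; any other offsets it lists are the ones worth
looking at.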

Other than that, I don't have any other ideas.

Bjorn

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6d7ab9b..f6d8e85 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1530,6 +1530,7 @@  bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 	if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, l))
 		return false;
 
+	dev_info(&bus->dev, "%s %02x %#010x\n", __func__, devfn, *l);
 	/* some broken boards return 0 or ~0 if a slot is empty: */
 	if (*l == 0xffffffff || *l == 0x00000000 ||
 	    *l == 0x0000ffff || *l == 0xffff0000)
@@ -1571,6 +1572,7 @@  static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn)
 	struct pci_dev *dev;
 	u32 l;
 
+	dev_info(&bus->dev, "%s %02x\n", __func__, devfn);
 	if (!pci_bus_read_dev_vendor_id(bus, devfn, &l, 60*1000))
 		return NULL;
 
@@ -1751,6 +1753,7 @@  struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn)
 {
 	struct pci_dev *dev;
 
+	dev_info(&bus->dev, "%s %02x\n", __func__, devfn);
 	dev = pci_get_slot(bus, devfn);
 	if (dev) {
 		pci_dev_put(dev);
@@ -1825,6 +1828,7 @@  int pci_scan_slot(struct pci_bus *bus, int devfn)
 	unsigned fn, nr = 0;
 	struct pci_dev *dev;
 
+	dev_info(&bus->dev, "%s %02x\n", __func__, devfn);
 	if (only_one_child(bus) && (devfn > 0))
 		return 0; /* Already scanned the entire slot */