pci-express hotplug

Submitter Alexander Chiang
Date 2009-10-27 02:48:41
Message ID <20091027024841.GA30509@ldl.fc.hp.com>
Permalink /patch/56021/
State RFC

Comments

Alexander Chiang - 2009-10-27 02:48:41
* Jens Axboe <jens.axboe@oracle.com>:
> > > acpiphp: enable_slot - physical_slot = 1
> > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > acpiphp: enable_slot - physical_slot = 2
> > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > acpiphp: enable_slot - physical_slot = 6
> > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > acpiphp: enable_slot - physical_slot = 7
> > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > 
> > Hm, so for some reason, firmware on your machine is telling us
> > that it doesn't think cards are present and/or enabled.
> > 
> > Unfortunately, I don't know why your firmware would be saying
> > that. We could add some more debug printks to see what firmware
> > thinks about your system... Or we could just wait and see what
> > happens after you get your hardware replaced.

Let's try and find out why firmware is telling us that we didn't
get ACPI_STA_ALL.
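
For readers following along, here is a minimal sketch of what "not
ACPI_STA_ALL" means. It assumes ACPI_STA_ALL is the OR of the four low
_STA bits defined by the ACPI specification (present, enabled, shown in
UI, functioning). The macro and function names are local to this sketch
and are not the driver's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* _STA bit meanings per the ACPI spec; names are hypothetical. */
#define STA_PRESENT	0x1u	/* device is present */
#define STA_ENABLED	0x2u	/* device is enabled and decoding resources */
#define STA_SHOW_IN_UI	0x4u	/* device should be shown in the UI */
#define STA_FUNCTIONING	0x8u	/* device is functioning properly */
#define STA_ALL		(STA_PRESENT | STA_ENABLED | \
			 STA_SHOW_IN_UI | STA_FUNCTIONING)

/* True only when all four low status bits are set. */
static bool sta_is_all(uint32_t sta)
{
	return (sta & STA_ALL) == STA_ALL;
}

/* Print which of the four bits are missing from a status value. */
static void sta_describe(uint32_t sta)
{
	if (!(sta & STA_PRESENT))
		printf("  not present\n");
	if (!(sta & STA_ENABLED))
		printf("  not enabled\n");
	if (!(sta & STA_SHOW_IN_UI))
		printf("  not shown in UI\n");
	if (!(sta & STA_FUNCTIONING))
		printf("  not functioning\n");
}
```

Under that assumption, a status of 0 (nothing found) fails the check in
the same way as a partially set value such as 0x9.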

Can you please apply this debug patch and send the output? Again,
please modprobe with debug=1.

Thanks,
/ac

---

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jens Axboe - 2009-10-27 08:26:35
On Mon, Oct 26 2009, Alex Chiang wrote:
> * Jens Axboe <jens.axboe@oracle.com>:
> > > > acpiphp: enable_slot - physical_slot = 1
> > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > acpiphp: enable_slot - physical_slot = 2
> > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > acpiphp: enable_slot - physical_slot = 6
> > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > acpiphp: enable_slot - physical_slot = 7
> > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > 
> > > Hm, so for some reason, firmware on your machine is telling us
> > > that it doesn't think cards are present and/or enabled.
> > > 
> > > Unfortunately, I don't know why your firmware would be saying
> > > that. We could add some more debug printks to see what firmware
> > > thinks about your system... Or we could just wait and see what
> > > happens after you get your hardware replaced.
> 
> Let's try and find out why firmware is telling us that we didn't
> get ACPI_STA_ALL.
> 
> Can you please apply this debug patch and send the output? Again,
> please modprobe with debug=1.

acpiphp: enable_slot - physical_slot = 1
power_on_slot
  no _PS0
  no _PS0
  no _PS0
  no _PS0
  no _PS0
  no _PS0
  no _PS0
  no _PS0
get_slot_status
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
  reading config space dvid 0xffffffff
acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
Jens Axboe - 2009-10-27 08:34:19
On Tue, Oct 27 2009, Jens Axboe wrote:
> On Mon, Oct 26 2009, Alex Chiang wrote:
> > * Jens Axboe <jens.axboe@oracle.com>:
> > > > > acpiphp: enable_slot - physical_slot = 1
> > > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > > acpiphp: enable_slot - physical_slot = 2
> > > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > > acpiphp: enable_slot - physical_slot = 6
> > > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > > acpiphp: enable_slot - physical_slot = 7
> > > > > acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
> > > > 
> > > > Hm, so for some reason, firmware on your machine is telling us
> > > > that it doesn't think cards are present and/or enabled.
> > > > 
> > > > Unfortunately, I don't know why your firmware would be saying
> > > > that. We could add some more debug printks to see what firmware
> > > > thinks about your system... Or we could just wait and see what
> > > > happens after you get your hardware replaced.
> > 
> > Let's try and find out why firmware is telling us that we didn't
> > get ACPI_STA_ALL.
> > 
> > Can you please apply this debug patch and send the output? Again,
> > please modprobe with debug=1.
> 
> acpiphp: enable_slot - physical_slot = 1
> power_on_slot
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
> get_slot_status
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL

Since this is a new board and BIOS, below is the info from loading
acpiphp with debug enabled and acpi debug enabled.

acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:05.0
acpiphp_glue: found ACPI PCI Hotplug slot 1 at PCI 0000:08:00
get_slot_status
get_slot_status
acpiphp: Slot [1] registered
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:07.0
acpiphp_glue: found ACPI PCI Hotplug slot 2 at PCI 0000:0b:00
get_slot_status
get_slot_status
acpiphp: Slot [2] registered
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:07.0
acpiphp_glue: found ACPI PCI Hotplug slot 6 at PCI 0000:84:00
get_slot_status
get_slot_status
acpiphp: Slot [6] registered
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:09.0
acpiphp_glue: found ACPI PCI Hotplug slot 7 at PCI 0000:87:00
get_slot_status
get_slot_status
acpiphp: Slot [7] registered
acpiphp_glue: Bus 0000:87 has 1 slot
acpiphp_glue: Bus 0000:84 has 1 slot
acpiphp_glue: Bus 0000:0b has 1 slot
acpiphp_glue: Bus 0000:08 has 1 slot
acpiphp_glue: Total 4 slots
acpiphp: Slot [1] unregistered
acpiphp: release_slot - physical_slot = 1
acpiphp: Slot [2] unregistered
acpiphp: release_slot - physical_slot = 2
acpiphp: Slot [6] unregistered
acpiphp: release_slot - physical_slot = 6
acpiphp: Slot [7] unregistered
acpiphp: release_slot - physical_slot = 7
Alexander Chiang - 2009-10-27 15:15:52
* Jens Axboe <jens.axboe@oracle.com>:
> 
> Since this is a new board and BIOS, below is the info from loading
> acpiphp with debug enabled and acpi debug enabled.

Thanks. Could you also send your DSDT?

You can obtain that with the acpidump tools. If they're not part
of your distro, you can find them on lesswatts.org.

Thanks.

/ac

Jens Axboe - 2009-10-28 09:18:18
On Tue, Oct 27 2009, Alex Chiang wrote:
> * Jens Axboe <jens.axboe@oracle.com>:
> > 
> > Since this is a new board and BIOS, below is the info from loading
> > acpiphp with debug enabled and acpi debug enabled.
> 
> Thanks. Could you also send your DSDT?
> 
> You can obtain that with the acpidump tools. If they're not part
> of your distro, you can find them on lesswatts.org.

Sent privately.
Alexander Chiang - 2009-10-28 19:55:21
* Jens Axboe <jens.axboe@oracle.com>:
> On Tue, Oct 27 2009, Alex Chiang wrote:
> > * Jens Axboe <jens.axboe@oracle.com>:
> > > 
> > > Since this is a new board and BIOS, below is the info from loading
> > > acpiphp with debug enabled and acpi debug enabled.
> > 
> > Thanks. Could you also send your DSDT?
> > 
> > You can obtain that with the acpidump tools. If they're not part
> > of your distro, you can find them on lesswatts.org.
> 
> Sent privately.

Sorry, one more RTT of debug info needed. :-/

Can you send dmesg output and contents of /proc/iomem?

Thanks.

Alexander Chiang - 2009-10-28 20:46:57
* Jens Axboe <jens.axboe@oracle.com>:
> 
> acpiphp: enable_slot - physical_slot = 1
> power_on_slot
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
> get_slot_status
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
>   reading config space dvid 0xffffffff
> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL

Hm, as Kenji-san writes in an earlier email:

	The direct cause of the problem that your slot was not
	turned on is power fault. I guess acpiphp is suffering
	the same problem.  Unfortunately, it's difficult for me
	to analyze the root cause of this power fault. Please ask
	the hardware vendor about it. I hope board replacement
	will fix the problem.

In get_slot_status(), we're trying to read the card's vendor ID,
which is a mandatory PCI config space register. A read of
0xffffffff means nothing responded at that address. The fact that
we can't even read the vendor ID suggests something is going wrong
well before we get to this point.
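
As a rough illustration of that check, the sketch below treats the
first config space dword the way the debug output does: an all-ones
value means no function answered the read. The helper names are
hypothetical. The split into vendor ID (low 16 bits) and device ID
(high 16 bits) follows the standard PCI config header layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value returned by a config read when no device responds. */
#define NO_DEVICE_DWORD	0xffffffffu

/* Low 16 bits of the dword at config offset 0x00. */
static uint16_t vendor_id(uint32_t dvid)
{
	return (uint16_t)(dvid & 0xffffu);
}

/* High 16 bits of the dword at config offset 0x00. */
static uint16_t device_id(uint32_t dvid)
{
	return (uint16_t)(dvid >> 16);
}

/* A function is considered present if the read was not all-ones. */
static bool function_responded(uint32_t dvid)
{
	return dvid != NO_DEVICE_DWORD;
}
```

Every "dvid 0xffffffff" line in the log above would make
function_responded() return false, which is why the slot never reaches
ACPI_STA_ALL on this path.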

Bjorn wondered on IRC if your slots are physically working. Do
you know if they work under Windows? If they do, then it would be
good to find out how your bridges are being programmed, which I
believe you can discover with the Device Manager.

/ac
Alexander Chiang - 2009-10-28 21:39:48
* Jens Axboe <jens.axboe@oracle.com>:
> 
> acpiphp: enable_slot - physical_slot = 1
> power_on_slot
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0
>   no _PS0

One final thought -- your DSDT doesn't provide any power methods
such as _PS[0-3] (I grepped your DSDT, so I'm basing my statement
on more than just the output above), and without those, I'm pretty
sure that there's no way for the OS to communicate to the BIOS
that we want to power those slots on.
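
To make the "no _PS0" output concrete, here is a simplified,
self-contained model of the loop that the debug patch instruments. It
is only a sketch: the struct, the flag value, and the function name
are assumptions for illustration, and unlike the real driver it only
counts capable functions instead of evaluating any ACPI method.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag: set when a function's ACPI node has a _PS0 method. */
#define FUNC_HAS_PS0	0x1u

/* Hypothetical stand-in for one function in a hotplug slot. */
struct func {
	unsigned int flags;
};

/*
 * Walk all functions in a slot and count those that offer _PS0.
 * For each function without it, print the same message as the
 * debug patch. A result of 0 means the OS has no power-on method
 * to call for this slot.
 */
static size_t count_ps0_capable(const struct func *funcs, size_t n)
{
	size_t capable = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (funcs[i].flags & FUNC_HAS_PS0)
			capable++;
		else
			printf("  no _PS0\n");
	}
	return capable;
}
```

With eight functions and no flag set, this prints "no _PS0" eight
times and returns 0, matching the shape of the log above.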

So, something funky is going on with your BIOS. This isn't some
weird proto board or something, is it? ;)

/ac

Jens Axboe - 2009-10-29 08:57:55
On Wed, Oct 28 2009, Alex Chiang wrote:
> * Jens Axboe <jens.axboe@oracle.com>:
> > 
> > acpiphp: enable_slot - physical_slot = 1
> > power_on_slot
> >   no _PS0
> >   no _PS0
> >   no _PS0
> >   no _PS0
> >   no _PS0
> >   no _PS0
> >   no _PS0
> >   no _PS0
> 
> One final thought -- your DSDT doesn't provide any power methods
> such as _PS[0-3] (I grepped your DSDT, so I'm basing my statement
> on more than just the output above), and without those, I'm pretty
> sure that there's no way for the OS to communicate to the BIOS
> that we want to power those slots on.
> 
> So, something funky is going on with your BIOS. This isn't some
> weird proto board or something, is it? ;)

It's pre-production, but not a prototype. I'll take it up with the
vendor.
Jens Axboe - 2009-10-29 18:55:51
Just a note for the archives - after chatting with Alex on IRC about
this issue and trying other cards, the likely suspect seems to be the
specific card used and/or the firmware on that card. Hotplug otherwise
works, just not with that card.

Patch

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 58d25a1..2caa447 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -797,9 +797,13 @@  static int power_on_slot(struct acpiphp_slot *slot)
 	struct list_head *l;
 	int retval = 0;
 
+	printk("%s\n", __func__);
+
 	/* if already enabled, just skip */
-	if (slot->flags & SLOT_POWEREDON)
+	if (slot->flags & SLOT_POWEREDON) {
+		printk("  slot %llu already powered on\n", slot->sun);
 		goto err_exit;
+	}
 
 	list_for_each (l, &slot->funcs) {
 		func = list_entry(l, struct acpiphp_func, sibling);
@@ -813,6 +817,8 @@  static int power_on_slot(struct acpiphp_slot *slot)
 				goto err_exit;
 			} else
 				break;
+		} else {
+			printk("  no _PS0\n");
 		}
 	}
 
@@ -1122,11 +1128,14 @@  static unsigned int get_slot_status(struct acpiphp_slot *slot)
 	struct list_head *l;
 	struct acpiphp_func *func;
 
+	printk("%s\n", __func__);
+
 	list_for_each (l, &slot->funcs) {
 		func = list_entry(l, struct acpiphp_func, sibling);
 
 		if (func->flags & FUNC_HAS_STA) {
 			status = acpi_evaluate_integer(func->handle, "_STA", NULL, &sta);
+			printk("  FUNC_HAS_STA status %d _STA %#llx\n", status, sta);
 			if (ACPI_SUCCESS(status) && sta)
 				break;
 		} else {
@@ -1134,6 +1143,7 @@  static unsigned int get_slot_status(struct acpiphp_slot *slot)
 						  PCI_DEVFN(slot->device,
 							    func->function),
 						  PCI_VENDOR_ID, &dvid);
+			printk("  reading config space dvid %#x\n", dvid);
 			if (dvid != 0xffffffff) {
 				sta = ACPI_STA_ALL;
 				break;