Message ID | 20130401235256.GA31966@google.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Mon, Apr 1, 2013 at 4:52 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Fri, Mar 29, 2013 at 11:04:48AM -0700, Yinghai Lu wrote: >> attatched -v3 again >> >> > Please check attached -v3. > > It's getting late in the v3.9 cycle already, and while your v3 patch > probably fixes Roman's problem, I can't convince myself that it is > safe in general. > > I think the safest thing to do at this point is to revert 8c33f51df > ("PCI/ACPI: Request _OSC control before scanning PCI root bus") with a > patch like the one below. Agreed. > > That does mean the booting path and hotplug paths will be different (we set > aspm_disabled after boot but before hotplug), but it was that way for a > long time before 8c33f51df. I think it's more important to fix this recent > ath5k regression than to fix a long-standing hotplug bug that nobody ever > complained about. > > Obviously, I think we should fix the hotplug bug and clean up the ASPM > mess, too. But we need to do that when we have more time to do it right > and test it. Sure. > > Bjorn > > > commit 96e5d01cd536458435ef0678d9fa3dc542afb41f > Author: Bjorn Helgaas <bhelgaas@google.com> > Date: Mon Apr 1 15:47:39 2013 -0600 > > Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" > > This reverts commit 8c33f51df406e1a1f7fa4e9b244845b7ebd61fa6. > > Conflicts: > drivers/acpi/pci_root.c > > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=55211 > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org> stable need this reverting too. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Monday, April 01, 2013 05:52:56 PM Bjorn Helgaas wrote: > On Fri, Mar 29, 2013 at 11:04:48AM -0700, Yinghai Lu wrote: > > attatched -v3 again > > > > On Fri, Mar 29, 2013 at 11:02 AM, Yinghai Lu <yinghai@kernel.org> wrote: > > > On Fri, Mar 29, 2013 at 5:24 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > > >> > > >> Half of your v1 patch (removing the pcie_aspm_sanity_check() test) > > >> *might* be the right thing, but only if you can clearly explain why > > >> that will not reintroduce the bug Matthew fixed with c9651e70. > > >> > > >> I think we also need to fix the PCI_FIXUP_FINAL quirk regression, but > > >> that's a separate issue and should be a separate patch. > > > > > > > > > First commit from Matthew > > > 0ae5eaf10 PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled > > > Right now we won't touch ASPM state if ASPM is disabled, except in the case > > > where we find a device that appears to be too old to reliably support ASPM. > > > Right now we'll clear it in that case, which is almost certainly the wrong > > > thing to do > > > > > > Try to not touch pre-1.1 ASPM for all, and it causes lots of regression. > > > > > > So second commit > > > > > > cdb0f9a1ad2e ASPM: Fix pcie devices with non-pcie children > > > Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON. > > > Some other systems using the pata_jmicron driver fail to boot because no > > > disks are detected. Passing pcie_aspm=force on the kernel command line > > > works around it. > > > > > > move the check aspm_disabled down. > > > > > > but ath5 and etc (pre-1.1) really need to aspm_disable to change their > > > hw setting. > > > > > > So the right solution would be dropping pcie_aspm_sanity_check() > > > change -in v2 should make all both happy, as quirk and disable that in > > > driver for ath5 are calling > > > pcie_disable_aspm_state explicitly. > > > > > > In v2, we already removed pcie_clear_aspm() that is calling > > > pcie_disable_aspm_state. > > > > > > > > > Please check attached -v3. > > It's getting late in the v3.9 cycle already, and while your v3 patch > probably fixes Roman's problem, I can't convince myself that it is > safe in general. > > I think the safest thing to do at this point is to revert 8c33f51df > ("PCI/ACPI: Request _OSC control before scanning PCI root bus") with a > patch like the one below. > > That does mean the booting path and hotplug paths will be different (we set > aspm_disabled after boot but before hotplug), but it was that way for a > long time before 8c33f51df. I think it's more important to fix this recent > ath5k regression than to fix a long-standing hotplug bug that nobody ever > complained about. > > Obviously, I think we should fix the hotplug bug and clean up the ASPM > mess, too. But we need to do that when we have more time to do it right > and test it. > > Bjorn > > > commit 96e5d01cd536458435ef0678d9fa3dc542afb41f > Author: Bjorn Helgaas <bhelgaas@google.com> > Date: Mon Apr 1 15:47:39 2013 -0600 > > Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" > > This reverts commit 8c33f51df406e1a1f7fa4e9b244845b7ebd61fa6. > > Conflicts: > drivers/acpi/pci_root.c > > Reference: https://bugzilla.kernel.org/show_bug.cgi?id=55211 > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c > index 0ac546d..c740364 100644 > --- a/drivers/acpi/pci_root.c > +++ b/drivers/acpi/pci_root.c > @@ -415,7 +415,6 @@ static int acpi_pci_root_add(struct acpi_device *device, > struct acpi_pci_root *root; > struct acpi_pci_driver *driver; > u32 flags, base_flags; > - bool is_osc_granted = false; > > root = kzalloc(sizeof(struct acpi_pci_root), GFP_KERNEL); > if (!root) > @@ -476,6 +475,30 @@ static int acpi_pci_root_add(struct acpi_device *device, > flags = base_flags = OSC_PCI_SEGMENT_GROUPS_SUPPORT; > acpi_pci_osc_support(root, flags); > > + /* > + * TBD: Need PCI interface for enumeration/configuration of roots. > + */ > + > + mutex_lock(&acpi_pci_root_lock); > + list_add_tail(&root->node, &acpi_pci_roots); > + mutex_unlock(&acpi_pci_root_lock); > + > + /* > + * Scan the Root Bridge > + * -------------------- > + * Must do this prior to any attempt to bind the root device, as the > + * PCI namespace does not get created until this call is made (and > + * thus the root bridge's pci_dev does not exist). > + */ > + root->bus = pci_acpi_scan_root(root); > + if (!root->bus) { > + printk(KERN_ERR PREFIX > + "Bus %04x:%02x not present in PCI namespace\n", > + root->segment, (unsigned int)root->secondary.start); > + result = -ENODEV; > + goto out_del_root; > + } > + > /* Indicate support for various _OSC capabilities. */ > if (pci_ext_cfg_avail()) > flags |= OSC_EXT_PCI_CONFIG_SUPPORT; > @@ -494,6 +517,7 @@ static int acpi_pci_root_add(struct acpi_device *device, > flags = base_flags; > } > } > + > if (!pcie_ports_disabled > && (flags & ACPI_PCIE_REQ_SUPPORT) == ACPI_PCIE_REQ_SUPPORT) { > flags = OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL > @@ -514,54 +538,28 @@ static int acpi_pci_root_add(struct acpi_device *device, > status = acpi_pci_osc_control_set(device->handle, &flags, > OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL); > if (ACPI_SUCCESS(status)) { > - is_osc_granted = true; > dev_info(&device->dev, > "ACPI _OSC control (0x%02x) granted\n", flags); > + if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) { > + /* > + * We have ASPM control, but the FADT indicates > + * that it's unsupported. Clear it. > + */ > + pcie_clear_aspm(root->bus); > + } > } else { > - is_osc_granted = false; > dev_info(&device->dev, > "ACPI _OSC request failed (%s), " > "returned control mask: 0x%02x\n", > acpi_format_exception(status), flags); > + pr_info("ACPI _OSC control for PCIe not granted, " > + "disabling ASPM\n"); > + pcie_no_aspm(); > } > } else { > dev_info(&device->dev, > - "Unable to request _OSC control " > - "(_OSC support mask: 0x%02x)\n", flags); > - } > - > - /* > - * TBD: Need PCI interface for enumeration/configuration of roots. > - */ > - > - mutex_lock(&acpi_pci_root_lock); > - list_add_tail(&root->node, &acpi_pci_roots); > - mutex_unlock(&acpi_pci_root_lock); > - > - /* > - * Scan the Root Bridge > - * -------------------- > - * Must do this prior to any attempt to bind the root device, as the > - * PCI namespace does not get created until this call is made (and > - * thus the root bridge's pci_dev does not exist). > - */ > - root->bus = pci_acpi_scan_root(root); > - if (!root->bus) { > - printk(KERN_ERR PREFIX > - "Bus %04x:%02x not present in PCI namespace\n", > - root->segment, (unsigned int)root->secondary.start); > - result = -ENODEV; > - goto out_del_root; > - } > - > - /* ASPM setting */ > - if (is_osc_granted) { > - if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) > - pcie_clear_aspm(root->bus); > - } else { > - pr_info("ACPI _OSC control for PCIe not granted, " > - "disabling ASPM\n"); > - pcie_no_aspm(); > + "Unable to request _OSC control " > + "(_OSC support mask: 0x%02x)\n", flags); > } > > pci_acpi_add_bus_pm_notifier(device, root->bus);
On Mon, Apr 1, 2013 at 6:03 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Mon, Apr 1, 2013 at 4:52 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> On Fri, Mar 29, 2013 at 11:04:48AM -0700, Yinghai Lu wrote: >>> attatched -v3 again >>> >>> > Please check attached -v3. >> >> It's getting late in the v3.9 cycle already, and while your v3 patch >> probably fixes Roman's problem, I can't convince myself that it is >> safe in general. >> >> I think the safest thing to do at this point is to revert 8c33f51df >> ("PCI/ACPI: Request _OSC control before scanning PCI root bus") with a >> patch like the one below. > > Agreed. > >> >> That does mean the booting path and hotplug paths will be different (we set >> aspm_disabled after boot but before hotplug), but it was that way for a >> long time before 8c33f51df. I think it's more important to fix this recent >> ath5k regression than to fix a long-standing hotplug bug that nobody ever >> complained about. >> >> Obviously, I think we should fix the hotplug bug and clean up the ASPM >> mess, too. But we need to do that when we have more time to do it right >> and test it. > > Sure. > >> >> Bjorn >> >> >> commit 96e5d01cd536458435ef0678d9fa3dc542afb41f >> Author: Bjorn Helgaas <bhelgaas@google.com> >> Date: Mon Apr 1 15:47:39 2013 -0600 >> >> Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" >> >> This reverts commit 8c33f51df406e1a1f7fa4e9b244845b7ebd61fa6. >> >> Conflicts: >> drivers/acpi/pci_root.c >> >> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=55211 >> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> > > Acked-by: Yinghai Lu <yinghai@kernel.org> > > stable need this reverting too. I updated the changelog and added this to my for-linus branch, headed for v3.9. -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 2, 2013 at 1:10 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Mon, Apr 1, 2013 at 6:03 PM, Yinghai Lu <yinghai@kernel.org> wrote: >>> commit 96e5d01cd536458435ef0678d9fa3dc542afb41f >>> Author: Bjorn Helgaas <bhelgaas@google.com> >>> Date: Mon Apr 1 15:47:39 2013 -0600 >>> >>> Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" >>> >>> This reverts commit 8c33f51df406e1a1f7fa4e9b244845b7ebd61fa6. >>> >>> Conflicts: >>> drivers/acpi/pci_root.c >>> >>> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=55211 >>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> >> >> Acked-by: Yinghai Lu <yinghai@kernel.org> >> >> stable need this reverting too. > > I updated the changelog and added this to my for-linus branch, headed for v3.9. We may need to revert this reverting for 3.10. Today noticed that acpiphp jump in before pciehp on one setup. current code from acpi_pci_root_add we have 1. pci_acpi_scan_root ==> pci devices enumeration and bus scanning. ==> pci_alloc_child_bus ==> pcibios_add_bus ==> acpi_pci_add_bus ==> acpiphp_enumerate_slots ==> ...==> register_slot ==> device_is_managed_by_native_pciehp ==> check osc_set with OSC_PCI_EXPRESS_NATIVE_HP_CONTROL 2. _OSC set request so we always have acpiphp hotplug slot registered at first. so either we need to A. revert reverting about _OSC B. move pcibios_add_bus down to pci_bus_add_devices() as acpiphp and apci pci slot driver are some kind of drivers for pci_bus C. A+B Thanks Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 12, 2013 at 12:20 AM, Yinghai Lu <yinghai@kernel.org> wrote: > On Tue, Apr 2, 2013 at 1:10 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> On Mon, Apr 1, 2013 at 6:03 PM, Yinghai Lu <yinghai@kernel.org> wrote: >>>> commit 96e5d01cd536458435ef0678d9fa3dc542afb41f >>>> Author: Bjorn Helgaas <bhelgaas@google.com> >>>> Date: Mon Apr 1 15:47:39 2013 -0600 >>>> >>>> Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" >>>> >>>> This reverts commit 8c33f51df406e1a1f7fa4e9b244845b7ebd61fa6. >>>> >>>> Conflicts: >>>> drivers/acpi/pci_root.c >>>> >>>> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=55211 >>>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> >>> >>> Acked-by: Yinghai Lu <yinghai@kernel.org> >>> >>> stable need this reverting too. >> >> I updated the changelog and added this to my for-linus branch, headed for v3.9. > > We may need to revert this reverting for 3.10. > > Today noticed that acpiphp jump in before pciehp on one setup. > > current code from acpi_pci_root_add we have > 1. pci_acpi_scan_root > ==> pci devices enumeration and bus scanning. > ==> pci_alloc_child_bus > ==> pcibios_add_bus > ==> acpi_pci_add_bus > ==> acpiphp_enumerate_slots > ==> ...==> register_slot > ==> device_is_managed_by_native_pciehp > ==> check osc_set with > OSC_PCI_EXPRESS_NATIVE_HP_CONTROL > 2. _OSC set request > > so we always have acpiphp hotplug slot registered at first. > > so either we need to > A. revert reverting about _OSC > B. move pcibios_add_bus down to pci_bus_add_devices() > as acpiphp and apci pci slot driver are some kind of drivers for pci_bus > C. A+B It doesn't surprise me at all that there are problems in the _OSC code and the acpiphp/pciehp interaction. That whole area is a complete disaster. It'd really be nice if somebody stepped up and reworked it so it makes sense. But this report is useless to me. I don't have time to work out what the problem is and how it affects users and come up with a fix. My advice is to simplify the path first, and worry about fixing the bug afterwards. We've already done several iterations of fiddling with things, and I think all we're doing is playing "whack-a-mole" and pushing the bugs around from one place to another. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 12, 2013 at 10:05 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> current code from acpi_pci_root_add we have >> 1. pci_acpi_scan_root >> ==> pci devices enumeration and bus scanning. >> ==> pci_alloc_child_bus >> ==> pcibios_add_bus >> ==> acpi_pci_add_bus >> ==> acpiphp_enumerate_slots >> ==> ...==> register_slot >> ==> device_is_managed_by_native_pciehp >> ==> check osc_set with >> OSC_PCI_EXPRESS_NATIVE_HP_CONTROL >> 2. _OSC set request >> >> so we always have acpiphp hotplug slot registered at first. >> >> so either we need to >> A. revert reverting about _OSC >> B. move pcibios_add_bus down to pci_bus_add_devices() >> as acpiphp and apci pci slot driver are some kind of drivers for pci_bus >> C. A+B > > It doesn't surprise me at all that there are problems in the _OSC code > and the acpiphp/pciehp interaction. That whole area is a complete > disaster. It'd really be nice if somebody stepped up and reworked it > so it makes sense. > > But this report is useless to me. I don't have time to work out what > the problem is and how it affects users and come up with a fix. effects: without fix the problem, user can not use pcie native hotplug if their system's firmware support acpihp and pciehp. And make it worse, that acpiphp have to be built-in, so they have no way to blacklist acpiphp in config. > > My advice is to simplify the path first, and worry about fixing the > bug afterwards. We've already done several iterations of fiddling > with things, and I think all we're doing is playing "whack-a-mole" and > pushing the bugs around from one place to another. We need to address regression at first. my suggestion is : revert the reverting and apply my -v3 version that will fix regression that Roman Yepishev met. please check attached two patches, hope it could save your some time. Yinghai
On Wed, Jun 12, 2013 at 1:41 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Wed, Jun 12, 2013 at 10:05 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >>> current code from acpi_pci_root_add we have >>> 1. pci_acpi_scan_root >>> ==> pci devices enumeration and bus scanning. >>> ==> pci_alloc_child_bus >>> ==> pcibios_add_bus >>> ==> acpi_pci_add_bus >>> ==> acpiphp_enumerate_slots >>> ==> ...==> register_slot >>> ==> device_is_managed_by_native_pciehp >>> ==> check osc_set with >>> OSC_PCI_EXPRESS_NATIVE_HP_CONTROL >>> 2. _OSC set request >>> >>> so we always have acpiphp hotplug slot registered at first. >>> >>> so either we need to >>> A. revert reverting about _OSC >>> B. move pcibios_add_bus down to pci_bus_add_devices() >>> as acpiphp and apci pci slot driver are some kind of drivers for pci_bus >>> C. A+B >> >> It doesn't surprise me at all that there are problems in the _OSC code >> and the acpiphp/pciehp interaction. That whole area is a complete >> disaster. It'd really be nice if somebody stepped up and reworked it >> so it makes sense. >> >> But this report is useless to me. I don't have time to work out what >> the problem is and how it affects users and come up with a fix. > > effects: without fix the problem, user can not use pcie native hotplug > if their system's firmware support acpihp and pciehp. > And make it worse, that acpiphp have to be built-in, so they have no > way to blacklist acpiphp in config. > >> >> My advice is to simplify the path first, and worry about fixing the >> bug afterwards. We've already done several iterations of fiddling >> with things, and I think all we're doing is playing "whack-a-mole" and >> pushing the bugs around from one place to another. > > We need to address regression at first. > my suggestion is : revert the reverting and apply my -v3 version that will fix > regression that Roman Yepishev met. > > please check attached two patches, hope it could save your some time. OK, you're right. It's not reasonable to do anything more than a minimal fix when we're at -rc5. Sigh. I'll spend tomorrow trying to understand your patches and write changelogs for you. I think you're saying that in systems that support both acpiphp and pciehp, we should be using pciehp, but we currently use acpiphp. If so, that's certainly a bug. How serious is it? Is it a disaster if we use acpiphp until we can resolve this cleanly? Are there a lot of systems that claim to support acpiphp but it doesn't actually work? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Bjorn, I'm working on several acpiphp related bugfixes, and feel some are materials for 3.10 too. Actually we have identified four bugs related to dock station support on Sony VAIO VPCZ23A4R laptop. I will try to send out patchset to address these bugs tonight. Seems we really need to rethink about acpiphp and pciehp now. Regards! Gerry On 2013/6/13 11:50, Bjorn Helgaas wrote: > On Wed, Jun 12, 2013 at 1:41 PM, Yinghai Lu <yinghai@kernel.org> wrote: >> On Wed, Jun 12, 2013 at 10:05 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >>>> current code from acpi_pci_root_add we have >>>> 1. pci_acpi_scan_root >>>> ==> pci devices enumeration and bus scanning. >>>> ==> pci_alloc_child_bus >>>> ==> pcibios_add_bus >>>> ==> acpi_pci_add_bus >>>> ==> acpiphp_enumerate_slots >>>> ==> ...==> register_slot >>>> ==> device_is_managed_by_native_pciehp >>>> ==> check osc_set with >>>> OSC_PCI_EXPRESS_NATIVE_HP_CONTROL >>>> 2. _OSC set request >>>> >>>> so we always have acpiphp hotplug slot registered at first. >>>> >>>> so either we need to >>>> A. revert reverting about _OSC >>>> B. move pcibios_add_bus down to pci_bus_add_devices() >>>> as acpiphp and apci pci slot driver are some kind of drivers for pci_bus >>>> C. A+B >>> >>> It doesn't surprise me at all that there are problems in the _OSC code >>> and the acpiphp/pciehp interaction. That whole area is a complete >>> disaster. It'd really be nice if somebody stepped up and reworked it >>> so it makes sense. >>> >>> But this report is useless to me. I don't have time to work out what >>> the problem is and how it affects users and come up with a fix. >> >> effects: without fix the problem, user can not use pcie native hotplug >> if their system's firmware support acpihp and pciehp. >> And make it worse, that acpiphp have to be built-in, so they have no >> way to blacklist acpiphp in config. >> >>> >>> My advice is to simplify the path first, and worry about fixing the >>> bug afterwards. We've already done several iterations of fiddling >>> with things, and I think all we're doing is playing "whack-a-mole" and >>> pushing the bugs around from one place to another. >> >> We need to address regression at first. >> my suggestion is : revert the reverting and apply my -v3 version that will fix >> regression that Roman Yepishev met. >> >> please check attached two patches, hope it could save your some time. > > OK, you're right. It's not reasonable to do anything more than a > minimal fix when we're at -rc5. > > Sigh. I'll spend tomorrow trying to understand your patches and write > changelogs for you. > > I think you're saying that in systems that support both acpiphp and > pciehp, we should be using pciehp, but we currently use acpiphp. If > so, that's certainly a bug. How serious is it? Is it a disaster if > we use acpiphp until we can resolve this cleanly? Are there a lot of > systems that claim to support acpiphp but it doesn't actually work? > > Bjorn > > . > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 12, 2013 at 8:50 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Wed, Jun 12, 2013 at 1:41 PM, Yinghai Lu <yinghai@kernel.org> wrote: > > I think you're saying that in systems that support both acpiphp and > pciehp, we should be using pciehp, but we currently use acpiphp. If > so, that's certainly a bug. How serious is it? Is it a disaster if > we use acpiphp until we can resolve this cleanly? Are there a lot of > systems that claim to support acpiphp but it doesn't actually work? No sure. To make acpiphp would need more expertise in bios. Normally BIOS vendor would have half done work there, and will need OEM or system vendor have someone to make it work .... You would not want to read asl code in DSDT to help them out. That is not something that we can control. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday, June 12, 2013 10:47:08 PM Yinghai Lu wrote: > On Wed, Jun 12, 2013 at 8:50 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > > On Wed, Jun 12, 2013 at 1:41 PM, Yinghai Lu <yinghai@kernel.org> wrote: > > > > I think you're saying that in systems that support both acpiphp and > > pciehp, we should be using pciehp, but we currently use acpiphp. If > > so, that's certainly a bug. How serious is it? Is it a disaster if > > we use acpiphp until we can resolve this cleanly? Are there a lot of > > systems that claim to support acpiphp but it doesn't actually work? > > No sure. To make acpiphp would need more expertise in bios. > Normally BIOS vendor would have half done work there, and will need > OEM or system vendor have someone to make it work .... > You would not want to read asl code in DSDT to help them out. > That is not something that we can control. However, pciehp may simply not work by itself on those systems. It's pretty much like saying "Oh, _CRS may be screwed up, so let's just ignore it", which isn't overly smart. Thanks, Rafael
On Wed, Jun 12, 2013 at 10:11 PM, Jiang Liu (Gerry) <jiang.liu@huawei.com> wrote: > Hi Bjorn, > I'm working on several acpiphp related bugfixes, and feel some > are materials for 3.10 too. Actually we have identified four bugs > related to dock station support on Sony VAIO VPCZ23A4R laptop. > I will try to send out patchset to address these bugs tonight. > Seems we really need to rethink about acpiphp and pciehp now. We certainly need more rework in acpiphp and pciehp. But unless it's an obvious fix for a serious regression, I'm doubtful about putting it in 3.10. rc6 is imminent and it's not the time to be putting significant changes in. I don't know the details of the dock issues, but you might not need to be in a huge rush about them. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jun 12, 2013 at 12:41:42PM -0700, Yinghai Lu wrote: > On Wed, Jun 12, 2013 at 10:05 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > >> current code from acpi_pci_root_add we have > >> 1. pci_acpi_scan_root > >> ==> pci devices enumeration and bus scanning. > >> ==> pci_alloc_child_bus > >> ==> pcibios_add_bus > >> ==> acpi_pci_add_bus > >> ==> acpiphp_enumerate_slots > >> ==> ...==> register_slot > >> ==> device_is_managed_by_native_pciehp > >> ==> check osc_set with > >> OSC_PCI_EXPRESS_NATIVE_HP_CONTROL > >> 2. _OSC set request > >> > >> so we always have acpiphp hotplug slot registered at first. > >> > >> so either we need to > >> A. revert reverting about _OSC > >> B. move pcibios_add_bus down to pci_bus_add_devices() > >> as acpiphp and apci pci slot driver are some kind of drivers for pci_bus > >> C. A+B > > > > It doesn't surprise me at all that there are problems in the _OSC code > > and the acpiphp/pciehp interaction. That whole area is a complete > > disaster. It'd really be nice if somebody stepped up and reworked it > > so it makes sense. > > > > But this report is useless to me. I don't have time to work out what > > the problem is and how it affects users and come up with a fix. > > effects: without fix the problem, user can not use pcie native hotplug > if their system's firmware support acpihp and pciehp. > And make it worse, that acpiphp have to be built-in, so they have no > way to blacklist acpiphp in config. > > > > > My advice is to simplify the path first, and worry about fixing the > > bug afterwards. We've already done several iterations of fiddling > > with things, and I think all we're doing is playing "whack-a-mole" and > > pushing the bugs around from one place to another. > > We need to address regression at first. > my suggestion is : revert the reverting and apply my -v3 version that will fix > regression that Roman Yepishev met. > > please check attached two patches, hope it could save your some time. Here are some of my notes from trying to sort this out, in chronological order: 29594404 v3.7 Bus scanned before requesting _OSC control pre-1.1 ath5k has ASPM disabled (works fine) 8c33f51d "request _OSC control before scanning bus" 19f949f5 v3.8 _OSC control requested before scanning bus Now pre-1.1 ath5k has ASPM enabled and doesn't work https://bugzilla.kernel.org/show_bug.cgi?id=55211 opened b8178f13 "revert 'request _OSC control before scanning bus' (8c33f51d)" Bus now scanned before requesting _OSC control (as in v3.7) c1be5a5b v3.9 pciehp claims slots first, even when both pciehp & acpiphp are built-in, because pciehp module_init precedes acpiphp module_init in link order 6037a803 "Convert acpiphp to be builtin only" This also adds "acpiphp.disable" boot option 3b63aaa7 "Do not use ACPI PCI subdriver mechanism" Now acpiphp claims slots first because we call acpiphp_enumerate_slots() from pcibios_add_bus() during PCI device enumeration. This happens before pciehp, which still uses module_init. f722406f v3.10-rc1 ........ "Revert reverting of 'request _OSC control before scanning bus' (b8178f13)" _OSC control requested before scanning bus (as in v3.8) pre-1.1 ath5k probably has ASPM enabled and doesn't work ........ "Remove not needed check in disable aspm link" Now pci_disable_link_state() unconditionally disables ASPM, even when BIOS hasn't given us ASPM control 1) The problem you're trying to fix is that when both acpiphp and pciehp are supported for the same slot, acpiphp claims the slot first and pciehp will not claim it. I think this problem was introduced by 3b63aaa7, which was merged after v3.9. Therefore, v3.9 should work correctly, and this regression appeared in v3.10-rc1. 2) As you say, acpiphp cannot be a module, so the user would have to rebuild the kernel to remove it. However, 6037a803 *did* add a "acpiphp.disable" boot option, so that should be a workaround that allows pciehp to claim the slot. 3) I think your "revert reverting" patch gets us back to the same situation we had after 8c33f51d, i.e., Roman's pre-1.1 ath5k device will have ASPM enabled and won't work. I don't want to leave the tree in this broken state, even though you intend to fix it in the next patch. If you can reorder your patches so the ASPM fix is first, that would be better. 4) Your "Remove not needed check in disable aspm link" patch makes pci_disable_link_state() disable ASPM even when the OS doesn't have permission to control ASPM. I think this is a mistake. I proposed a similar change in [1], but Rafael and Matthew thought it was too risky, and I agree. Bjorn [1] https://lkml.kernel.org/r/20130510225257.GA10847@google.com -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 14, 2013 at 7:11 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > Here are some of my notes from trying to sort this out, in chronological > order: > > 29594404 v3.7 > Bus scanned before requesting _OSC control > pre-1.1 ath5k has ASPM disabled (works fine) > > 8c33f51d "request _OSC control before scanning bus" > > 19f949f5 v3.8 > _OSC control requested before scanning bus > Now pre-1.1 ath5k has ASPM enabled and doesn't work > https://bugzilla.kernel.org/show_bug.cgi?id=55211 opened > > b8178f13 "revert 'request _OSC control before scanning bus' (8c33f51d)" > Bus now scanned before requesting _OSC control (as in v3.7) > > c1be5a5b v3.9 > pciehp claims slots first, even when both pciehp & acpiphp are > built-in, because pciehp module_init precedes acpiphp module_init > in link order > > 6037a803 "Convert acpiphp to be builtin only" > This also adds "acpiphp.disable" boot option > > 3b63aaa7 "Do not use ACPI PCI subdriver mechanism" > Now acpiphp claims slots first because we call > acpiphp_enumerate_slots() from pcibios_add_bus() during PCI device > enumeration. This happens before pciehp, which still uses > module_init. > > f722406f v3.10-rc1 > > ........ "Revert reverting of 'request _OSC control before scanning bus' (b8178f13)" > _OSC control requested before scanning bus (as in v3.8) > pre-1.1 ath5k probably has ASPM enabled and doesn't work > > ........ "Remove not needed check in disable aspm link" > Now pci_disable_link_state() unconditionally disables ASPM, > even when BIOS hasn't given us ASPM control > > > 1) The problem you're trying to fix is that when both acpiphp and > pciehp are supported for the same slot, acpiphp claims the slot first > and pciehp will not claim it. I think this problem was introduced by > 3b63aaa7, which was merged after v3.9. Therefore, v3.9 should work > correctly, and this regression appeared in v3.10-rc1. > > 2) As you say, acpiphp cannot be a module, so the user would have to > rebuild the kernel to remove it. However, 6037a803 *did* add a > "acpiphp.disable" boot option, so that should be a workaround that > allows pciehp to claim the slot. How about the same system that some slots need to be handled by acpiphp and some others need to be handled by pciehp ? for example: laptop that have dock that will need acpiphp, and also have pci express card that need pciehp. > > 3) I think your "revert reverting" patch gets us back to the same > situation we had after 8c33f51d, i.e., Roman's pre-1.1 ath5k device > will have ASPM enabled and won't work. I don't want to leave the tree > in this broken state, even though you intend to fix it in the next > patch. If you can reorder your patches so the ASPM fix is first, that > would be better. yes. We could apply your patch in [1] at first, and revert the reverting. and do not touch pcie_clear_aspm now. > > 4) Your "Remove not needed check in disable aspm link" patch makes > pci_disable_link_state() disable ASPM even when the OS doesn't have > permission to control ASPM. I think this is a mistake. I proposed a > similar change in [1], but Rafael and Matthew thought it was too > risky, and I agree. before all those changes, and in current state: quirk disable aspm is before _osc support and control are set. aka in pci_acpi_scan_root will allocate all link state struct, and quirk call pci_disable_link_state, and later will _osc support or control can not be set, pcie_no_aspm is called, can will block all aspm operation. That is risky too?, why booting path quirk could do that, but driver and hot-add quirk path can not do that ? or we can have another pci_disable_link_state always work on quirk path only? Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 14, 2013 at 10:17 AM, Yinghai Lu <yinghai@kernel.org> wrote: > On Fri, Jun 14, 2013 at 7:11 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> Here are some of my notes from trying to sort this out, in chronological >> order: >> >> 29594404 v3.7 >> Bus scanned before requesting _OSC control >> pre-1.1 ath5k has ASPM disabled (works fine) >> >> 8c33f51d "request _OSC control before scanning bus" >> >> 19f949f5 v3.8 >> _OSC control requested before scanning bus >> Now pre-1.1 ath5k has ASPM enabled and doesn't work >> https://bugzilla.kernel.org/show_bug.cgi?id=55211 opened >> >> b8178f13 "revert 'request _OSC control before scanning bus' (8c33f51d)" >> Bus now scanned before requesting _OSC control (as in v3.7) >> >> c1be5a5b v3.9 >> pciehp claims slots first, even when both pciehp & acpiphp are >> built-in, because pciehp module_init precedes acpiphp module_init >> in link order >> >> 6037a803 "Convert acpiphp to be builtin only" >> This also adds "acpiphp.disable" boot option >> >> 3b63aaa7 "Do not use ACPI PCI subdriver mechanism" >> Now acpiphp claims slots first because we call >> acpiphp_enumerate_slots() from pcibios_add_bus() during PCI device >> enumeration. This happens before pciehp, which still uses >> module_init. >> >> f722406f v3.10-rc1 >> >> ........ "Revert reverting of 'request _OSC control before scanning bus' (b8178f13)" >> _OSC control requested before scanning bus (as in v3.8) >> pre-1.1 ath5k probably has ASPM enabled and doesn't work >> >> ........ "Remove not needed check in disable aspm link" >> Now pci_disable_link_state() unconditionally disables ASPM, >> even when BIOS hasn't given us ASPM control >> >> >> 1) The problem you're trying to fix is that when both acpiphp and >> pciehp are supported for the same slot, acpiphp claims the slot first >> and pciehp will not claim it. I think this problem was introduced by >> 3b63aaa7, which was merged after v3.9. Therefore, v3.9 should work >> correctly, and this regression appeared in v3.10-rc1. >> >> 2) As you say, acpiphp cannot be a module, so the user would have to >> rebuild the kernel to remove it. However, 6037a803 *did* add a >> "acpiphp.disable" boot option, so that should be a workaround that >> allows pciehp to claim the slot. > > How about the same system that some slots need to be handled by acpiphp > and some others need to be handled by pciehp ? > > for example: laptop that have dock that will need acpiphp, and also have > pci express card that need pciehp. > >> >> 3) I think your "revert reverting" patch gets us back to the same >> situation we had after 8c33f51d, i.e., Roman's pre-1.1 ath5k device >> will have ASPM enabled and won't work. I don't want to leave the tree >> in this broken state, even though you intend to fix it in the next >> patch. If you can reorder your patches so the ASPM fix is first, that >> would be better. > > yes. > > We could apply your patch in [1] at first, and revert the reverting. > and do not touch pcie_clear_aspm now. > >> >> 4) Your "Remove not needed check in disable aspm link" patch makes >> pci_disable_link_state() disable ASPM even when the OS doesn't have >> permission to control ASPM. I think this is a mistake. I proposed a >> similar change in [1], but Rafael and Matthew thought it was too >> risky, and I agree. > > before all those changes, and in current state: > quirk disable aspm is before _osc support and control are set. Can you please refer to specific function names? I can't read your mind. You might be referring to quirk_disable_aspm_l0s(). This is a pci_fixup_final quirk that calls pci_disable_link_state(). In the current tree, we enumerate devices before requesting _OSC control. However, pci_fixup_final quirks are not run until the pci_apply_final_quirks() fs_initcall, which is after we request _OSC control. As far as I can tell, we never call pci_disable_link_state() before calling pcie_no_aspm(). > aka in pci_acpi_scan_root will allocate all link state struct, and quirk > call pci_disable_link_state, and later will _osc support or control can > not be set, pcie_no_aspm is called, can will block all aspm operation. > > That is risky too?, why booting path quirk could do that, but driver > and hot-add quirk path can not do that ? > > or we can have another pci_disable_link_state always work on quirk path only? > > Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 14, 2013 at 9:33 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Fri, Jun 14, 2013 at 10:17 AM, Yinghai Lu <yinghai@kernel.org> wrote: > > Can you please refer to specific function names? I can't read your mind. > > You might be referring to quirk_disable_aspm_l0s(). This is a > pci_fixup_final quirk that calls pci_disable_link_state(). In the > current tree, we enumerate devices before requesting _OSC control. > However, pci_fixup_final quirks are not run until the > pci_apply_final_quirks() fs_initcall, which is after we request _OSC > control. > > As far as I can tell, we never call pci_disable_link_state() before > calling pcie_no_aspm(). ok, you are right, that is not pci_disable_link_state. It is pcie_aspm_init_link_state ==> pcie_aspm_sanity_check in booting path that disable aspm. It has "if (aspm_disabled)" in it, and it cause the difference. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 14, 2013 at 10:57 AM, Yinghai Lu <yinghai@kernel.org> wrote: > On Fri, Jun 14, 2013 at 9:33 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> On Fri, Jun 14, 2013 at 10:17 AM, Yinghai Lu <yinghai@kernel.org> wrote: >> >> Can you please refer to specific function names? I can't read your mind. >> >> You might be referring to quirk_disable_aspm_l0s(). This is a >> pci_fixup_final quirk that calls pci_disable_link_state(). In the >> current tree, we enumerate devices before requesting _OSC control. >> However, pci_fixup_final quirks are not run until the >> pci_apply_final_quirks() fs_initcall, which is after we request _OSC >> control. >> >> As far as I can tell, we never call pci_disable_link_state() before >> calling pcie_no_aspm(). > > ok, you are right, that is not pci_disable_link_state. > > It is pcie_aspm_init_link_state ==> pcie_aspm_sanity_check in booting path > that disable aspm. It has "if (aspm_disabled)" in it, and it cause > the difference. Yes, I agree, the pcie_aspm_init_link_state() path uses aspm_disabled before we set it: acpi_pci_root_add pci_acpi_scan_root pci_scan_child_bus pci_scan_slot pcie_aspm_init_link_state pcie_aspm_sanity_check if (aspm_disabled) # used before set ... acpi_pci_osc_control_set pcie_no_aspm aspm_disabled = 1 # set That might mean we do some ASPM configuration during enumeration (in pci_scan_slot()) even though the BIOS hasn't given us permission. It looks like we did that even in v3.7, since we did the enumeration before the _OSC there as well. That looks like a bug to me. I don't think the fact that we have been doing ASPM config during enumeration before _OSC is an argument for dropping the check in pci_disable_link_state(). Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 14, 2013 at 10:44 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Fri, Jun 14, 2013 at 10:57 AM, Yinghai Lu <yinghai@kernel.org> wrote: >> On Fri, Jun 14, 2013 at 9:33 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >>> On Fri, Jun 14, 2013 at 10:17 AM, Yinghai Lu <yinghai@kernel.org> wrote: >>> >>> Can you please refer to specific function names? I can't read your mind. >>> >>> You might be referring to quirk_disable_aspm_l0s(). This is a >>> pci_fixup_final quirk that calls pci_disable_link_state(). In the >>> current tree, we enumerate devices before requesting _OSC control. >>> However, pci_fixup_final quirks are not run until the >>> pci_apply_final_quirks() fs_initcall, which is after we request _OSC >>> control. >>> >>> As far as I can tell, we never call pci_disable_link_state() before >>> calling pcie_no_aspm(). >> >> ok, you are right, that is not pci_disable_link_state. >> >> It is pcie_aspm_init_link_state ==> pcie_aspm_sanity_check in booting path >> that disable aspm. It has "if (aspm_disabled)" in it, and it cause >> the difference. > > Yes, I agree, the pcie_aspm_init_link_state() path uses aspm_disabled > before we set it: > > acpi_pci_root_add > pci_acpi_scan_root > pci_scan_child_bus > pci_scan_slot > pcie_aspm_init_link_state > pcie_aspm_sanity_check > if (aspm_disabled) # used before set > ... > acpi_pci_osc_control_set > pcie_no_aspm > aspm_disabled = 1 # set > > That might mean we do some ASPM configuration during enumeration (in > pci_scan_slot()) even though the BIOS hasn't given us permission. It > looks like we did that even in v3.7, since we did the enumeration > before the _OSC there as well. That looks like a bug to me. agreed. that means commits from Matthew Garrett commit 4949be16822e92a18ea0cc1616319926628092ee Author: Matthew Garrett <mjg@redhat.com> Date: Tue Mar 6 13:41:49 2012 -0500 PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1 Author: Matthew Garrett <mjg@redhat.com> Date: Tue Mar 27 10:17:41 2012 -0400 ASPM: Fix pcie devices with non-pcie children will only works when the user specify "aspm=off" in boot command line. (Roman's should have problem when he boot current linus tree with "aspm=off", as no one will disable aspm for the offending pci devices) To close the hole that Matthew' commits miss, that we should move _OSC support/control set ahead. For Roman's system it will have to fail, as BIOS enable prep-1.1 pcie devices aspm, and do not handle over control to OS, so os can not disable aspm link state for it. To workaround the problem in Roman's system, we can add pcie_aspm=force_off so we will have pcie_aspm=off pcie_aspm=force pcie_aspm=force_off What a mess! > > I don't think the fact that we have been doing ASPM config during > enumeration before _OSC is an argument for dropping the check in > pci_disable_link_state(). Agreed. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[+cc Maxim, Jussi] On Fri, Jun 14, 2013 at 12:26 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Fri, Jun 14, 2013 at 10:44 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> On Fri, Jun 14, 2013 at 10:57 AM, Yinghai Lu <yinghai@kernel.org> wrote: >>> On Fri, Jun 14, 2013 at 9:33 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >>>> On Fri, Jun 14, 2013 at 10:17 AM, Yinghai Lu <yinghai@kernel.org> wrote: >>>> >>>> Can you please refer to specific function names? I can't read your mind. >>>> >>>> You might be referring to quirk_disable_aspm_l0s(). This is a >>>> pci_fixup_final quirk that calls pci_disable_link_state(). In the >>>> current tree, we enumerate devices before requesting _OSC control. >>>> However, pci_fixup_final quirks are not run until the >>>> pci_apply_final_quirks() fs_initcall, which is after we request _OSC >>>> control. >>>> >>>> As far as I can tell, we never call pci_disable_link_state() before >>>> calling pcie_no_aspm(). >>> >>> ok, you are right, that is not pci_disable_link_state. >>> >>> It is pcie_aspm_init_link_state ==> pcie_aspm_sanity_check in booting path >>> that disable aspm. It has "if (aspm_disabled)" in it, and it cause >>> the difference. >> >> Yes, I agree, the pcie_aspm_init_link_state() path uses aspm_disabled >> before we set it: >> >> acpi_pci_root_add >> pci_acpi_scan_root >> pci_scan_child_bus >> pci_scan_slot >> pcie_aspm_init_link_state >> pcie_aspm_sanity_check >> if (aspm_disabled) # used before set >> ... >> acpi_pci_osc_control_set >> pcie_no_aspm >> aspm_disabled = 1 # set >> >> That might mean we do some ASPM configuration during enumeration (in >> pci_scan_slot()) even though the BIOS hasn't given us permission. It >> looks like we did that even in v3.7, since we did the enumeration >> before the _OSC there as well. That looks like a bug to me. > > agreed. that means commits from Matthew Garrett > > commit 4949be16822e92a18ea0cc1616319926628092ee > Author: Matthew Garrett <mjg@redhat.com> > Date: Tue Mar 6 13:41:49 2012 -0500 > > PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled > > commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1 > Author: Matthew Garrett <mjg@redhat.com> > Date: Tue Mar 27 10:17:41 2012 -0400 > > ASPM: Fix pcie devices with non-pcie children > > will only works when the user specify "aspm=off" in boot command line. > > (Roman's should have problem when he boot current linus tree with > "aspm=off", as no one will disable aspm for the offending pci devices) > > To close the hole that Matthew' commits miss, that we should move _OSC > support/control set ahead. > > For Roman's system it will have to fail, as BIOS enable prep-1.1 pcie devices > aspm, and do not handle over control to OS, so os can not disable aspm link > state for it. > > To workaround the problem in Roman's system, we can add pcie_aspm=force_off > > so we will have > pcie_aspm=off > pcie_aspm=force > pcie_aspm=force_off > > What a mess! Yeah, this is a huge mess. It makes my head hurt. I don't think it's reasonable to add more flags because that will make my head hurt even more. If I understand correctly, on Roman's system (the Acer Aspire One AOA150 netbook mentioned in https://bugzilla.kernel.org/show_bug.cgi?id=55211): - The BIOS leaves ASPM enabled for the ath5k device (03:00.0) - The BIOS does not allow the OS to manage ASPM (via _OSC) - The ath5k device does not work correctly with ASPM enabled - The ath5k driver calls pci_disable_link_state(), but we do not disable ASPM because we don't have permission from the BIOS This is basically the case I investigated in bz 57331 [1], and my conclusion was that Windows behaves the same way, i.e., Windows also leaves ASPM enabled in this situation. It would be interesting to know whether that device on Roman's machine works under Windows and what the ASPM configuration there is. When Maxim added the pci_disable_link_state() to ath5k with 6ccf15a1, he did say that the Windows driver disabled L0s [2], but I don't know what machine that was or what its _OSC method said. At the time of 6ccf15a1, Linux evaluated _OSC but did not call pcie_no_aspm() when it failed, so the pci_disable_link_state() in ath5k actually *did* disable ASPM. I did find the Atheros Windows driver for the AOA150 on the Acer website [3], and the .INF file has several interesting mentions of ASPM, but I don't know what they mean. Bjorn [1] https://bugzilla.kernel.org/show_bug.cgi?id=57331#c5 [2] https://lists.ath5k.org/pipermail/ath5k-devel/2010-June/003842.html [3] http://global-download.acer.com/GDFiles/Driver/Wireless%20LAN/WLAN_Atheros_7.6.0.224_XPx86_A.zip?acerid=633639843308102758&Step1=NETBOOK,%20CHROMEBOOK&Step2=ASPIRE%20ONE&Step3=AOA150&OS=ALL&LC=en&BC=ACER&SC=PA_6 -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-06-14 at 15:26 -0600, Bjorn Helgaas wrote: > I did find the Atheros Windows driver for the AOA150 on the Acer > website [3], and the .INF file has several interesting mentions of > ASPM, but I don't know what they mean. They're not the standard functions, so it's possible that the Windows driver for this hardware disables ASPM via its own register writes. -- Matthew Garrett | mjg59@srcf.ucam.org
On Fri, Jun 14, 2013 at 2:26 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > [+cc Maxim, Jussi] > > Yeah, this is a huge mess. It makes my head hurt. I don't think it's > reasonable to add more flags because that will make my head hurt even > more. > > If I understand correctly, on Roman's system (the Acer Aspire One > AOA150 netbook mentioned in > https://bugzilla.kernel.org/show_bug.cgi?id=55211): > > - The BIOS leaves ASPM enabled for the ath5k device (03:00.0) > - The BIOS does not allow the OS to manage ASPM (via _OSC) > - The ath5k device does not work correctly with ASPM enabled > - The ath5k driver calls pci_disable_link_state(), but we do not > disable ASPM because we don't have permission from the BIOS looks like Matthew Garrett path is causing problem: commit 4949be16822e92a18ea0cc1616319926628092ee Author: Matthew Garrett <mjg@redhat.com> Date: Tue Mar 6 13:41:49 2012 -0500 PCI: ignore pre-1.1 ASPM quirking when ASPM is disabled commit c9651e70ad0aa499814817cbf3cc1d0b806ed3a1 Author: Matthew Garrett <mjg@redhat.com> Date: Tue Mar 27 10:17:41 2012 -0400 ASPM: Fix pcie devices with non-pcie children after those two patches, it aspm_disabled is set, via _osc early, pre-1.1 devices aspm register will be touched even aspm_force is not specified. pcie_aspm_init_link_state will all the way to pcie_config_aspm_path ==> pcie_config_aspm_link in that path, aspm_disabled is not checked nowhere. BTW, when aspm is not disabled, even the link is allocated, because it is black listed, so it is never get touched. Matthew's patch is not needed in any case. I would suspect that that aspm enabling in Roman's system could set by that path instead of BIOS. Roman, can you please check two patches + linsus' tree on your system? Thanks Yinghai
On Fri, 2013-06-14 at 15:17 -0700, Yinghai Lu wrote: > after those two patches, it aspm_disabled is set, via _osc early, > pre-1.1 devices aspm register will be touched even aspm_force is not specified. I don't follow. We were previously automatically disabling ASPM on pre-1.1 devices even if _OSC didn't give us control. I've confirmed that this was the wrong thing for us to be doing, and my patch changed the behaviour such that if the firmware enables ASPM on a pre-1.1 device and refuses to give us control via _OSC we will leave ASPM enabled. -- Matthew Garrett | mjg59@srcf.ucam.org
On Fri, Jun 14, 2013 at 3:27 PM, Matthew Garrett <matthew.garrett@nebula.com> wrote: > On Fri, 2013-06-14 at 15:17 -0700, Yinghai Lu wrote: > >> after those two patches, it aspm_disabled is set, via _osc early, >> pre-1.1 devices aspm register will be touched even aspm_force is not specified. > > I don't follow. We were previously automatically disabling ASPM on > pre-1.1 devices even if _OSC didn't give us control. I don't think so, we just moved _OSC support/control setting before pci scan in 3.8 and revert that in v3.9. > I've confirmed that > this was the wrong thing for us to be doing, and my patch changed the > behaviour such that if the firmware enables ASPM on a pre-1.1 device and > refuses to give us control via _OSC we will leave ASPM enabled. not sure, aspm_disabled should be false on booting path when that function is called, if you don't pass aspm=off. Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-06-14 at 15:40 -0700, Yinghai Lu wrote: > On Fri, Jun 14, 2013 at 3:27 PM, Matthew Garrett > <matthew.garrett@nebula.com> wrote: > > On Fri, 2013-06-14 at 15:17 -0700, Yinghai Lu wrote: > > > >> after those two patches, it aspm_disabled is set, via _osc early, > >> pre-1.1 devices aspm register will be touched even aspm_force is not specified. > > > > I don't follow. We were previously automatically disabling ASPM on > > pre-1.1 devices even if _OSC didn't give us control. > > I don't think so, we just moved _OSC support/control setting before pci scan > in 3.8 and revert that in v3.9. Right, sorry, I don't mean _OSC, I mean the FADT flag. We were previously automatically disabling ASPM on pre-1.1 devices even if the FADT flag was set. -- Matthew Garrett | mjg59@srcf.ucam.org
On Fri, Jun 14, 2013 at 3:48 PM, Matthew Garrett <matthew.garrett@nebula.com> wrote: > On Fri, 2013-06-14 at 15:40 -0700, Yinghai Lu wrote: >> On Fri, Jun 14, 2013 at 3:27 PM, Matthew Garrett >> <matthew.garrett@nebula.com> wrote: >> > On Fri, 2013-06-14 at 15:17 -0700, Yinghai Lu wrote: >> > >> >> after those two patches, it aspm_disabled is set, via _osc early, >> >> pre-1.1 devices aspm register will be touched even aspm_force is not specified. >> > >> > I don't follow. We were previously automatically disabling ASPM on >> > pre-1.1 devices even if _OSC didn't give us control. >> >> I don't think so, we just moved _OSC support/control setting before pci scan >> in 3.8 and revert that in v3.9. > > Right, sorry, I don't mean _OSC, I mean the FADT flag. We were > previously automatically disabling ASPM on pre-1.1 devices even if the > FADT flag was set. so in that case aspm_disabled, never get set. booting path: pcie_aspm_init_link_state will not touch aspm on pre-1.1 devices. late, FADT checking will cause pcie_clear_aspm() get called, it will call __pci_disable_link_state, and because following line in pcie_aspm_link() /* Nothing to do if the link is already in the requested state */ state &= (link->aspm_capable & ~link->aspm_disable); if (link->aspm_enabled == state) return; aspm in pre-1.1 devices still does not get touched. Maybe I miss something. Thanks Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c index 0ac546d..c740364 100644 --- a/drivers/acpi/pci_root.c +++ b/drivers/acpi/pci_root.c @@ -415,7 +415,6 @@ static int acpi_pci_root_add(struct acpi_device *device, struct acpi_pci_root *root; struct acpi_pci_driver *driver; u32 flags, base_flags; - bool is_osc_granted = false; root = kzalloc(sizeof(struct acpi_pci_root), GFP_KERNEL); if (!root) @@ -476,6 +475,30 @@ static int acpi_pci_root_add(struct acpi_device *device, flags = base_flags = OSC_PCI_SEGMENT_GROUPS_SUPPORT; acpi_pci_osc_support(root, flags); + /* + * TBD: Need PCI interface for enumeration/configuration of roots. + */ + + mutex_lock(&acpi_pci_root_lock); + list_add_tail(&root->node, &acpi_pci_roots); + mutex_unlock(&acpi_pci_root_lock); + + /* + * Scan the Root Bridge + * -------------------- + * Must do this prior to any attempt to bind the root device, as the + * PCI namespace does not get created until this call is made (and + * thus the root bridge's pci_dev does not exist). + */ + root->bus = pci_acpi_scan_root(root); + if (!root->bus) { + printk(KERN_ERR PREFIX + "Bus %04x:%02x not present in PCI namespace\n", + root->segment, (unsigned int)root->secondary.start); + result = -ENODEV; + goto out_del_root; + } + /* Indicate support for various _OSC capabilities. */ if (pci_ext_cfg_avail()) flags |= OSC_EXT_PCI_CONFIG_SUPPORT; @@ -494,6 +517,7 @@ static int acpi_pci_root_add(struct acpi_device *device, flags = base_flags; } } + if (!pcie_ports_disabled && (flags & ACPI_PCIE_REQ_SUPPORT) == ACPI_PCIE_REQ_SUPPORT) { flags = OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL @@ -514,54 +538,28 @@ static int acpi_pci_root_add(struct acpi_device *device, status = acpi_pci_osc_control_set(device->handle, &flags, OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL); if (ACPI_SUCCESS(status)) { - is_osc_granted = true; dev_info(&device->dev, "ACPI _OSC control (0x%02x) granted\n", flags); + if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) { + /* + * We have ASPM control, but the FADT indicates + * that it's unsupported. Clear it. + */ + pcie_clear_aspm(root->bus); + } } else { - is_osc_granted = false; dev_info(&device->dev, "ACPI _OSC request failed (%s), " "returned control mask: 0x%02x\n", acpi_format_exception(status), flags); + pr_info("ACPI _OSC control for PCIe not granted, " + "disabling ASPM\n"); + pcie_no_aspm(); } } else { dev_info(&device->dev, - "Unable to request _OSC control " - "(_OSC support mask: 0x%02x)\n", flags); - } - - /* - * TBD: Need PCI interface for enumeration/configuration of roots. - */ - - mutex_lock(&acpi_pci_root_lock); - list_add_tail(&root->node, &acpi_pci_roots); - mutex_unlock(&acpi_pci_root_lock); - - /* - * Scan the Root Bridge - * -------------------- - * Must do this prior to any attempt to bind the root device, as the - * PCI namespace does not get created until this call is made (and - * thus the root bridge's pci_dev does not exist). - */ - root->bus = pci_acpi_scan_root(root); - if (!root->bus) { - printk(KERN_ERR PREFIX - "Bus %04x:%02x not present in PCI namespace\n", - root->segment, (unsigned int)root->secondary.start); - result = -ENODEV; - goto out_del_root; - } - - /* ASPM setting */ - if (is_osc_granted) { - if (acpi_gbl_FADT.boot_flags & ACPI_FADT_NO_ASPM) - pcie_clear_aspm(root->bus); - } else { - pr_info("ACPI _OSC control for PCIe not granted, " - "disabling ASPM\n"); - pcie_no_aspm(); + "Unable to request _OSC control " + "(_OSC support mask: 0x%02x)\n", flags); } pci_acpi_add_bus_pm_notifier(device, root->bus);