diff mbox

[RFC] PCI: Workaround to enable poweroff on Mac Pro 11

Message ID CAE9FiQV_Mc9VuyBvCB7hL_0pgj+-wW1_jupzhbTHij+VQeLF-A@mail.gmail.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Yinghai Lu May 31, 2016, 7 a.m. UTC
On Mon, May 30, 2016 at 8:24 PM, Chen Yu <yu.c.chen@intel.com> wrote:

> and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal
> resource window and then update related base/limit registers:
>
> [    0.865342] pci 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus
> 02] add_size 1000
> [    0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff
> 64bit pref] to [bus 02] add_size 200000 add_align 100000
> [    0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff]
> to [bus 02] add_size 200000 add_align 100000
>

That is for hotplug bridge, then we could use following instead.

  * some mulifunction chips.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Chen Yu May 31, 2016, 7:18 a.m. UTC | #1
On 2016年05月31日 15:00, Yinghai Lu wrote:
> On Mon, May 30, 2016 at 8:24 PM, Chen Yu <yu.c.chen@intel.com> wrote:
>
>> and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal
>> resource window and then update related base/limit registers:
>>
>> [    0.865342] pci 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus
>> 02] add_size 1000
>> [    0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff
>> 64bit pref] to [bus 02] add_size 200000 add_align 100000
>> [    0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff]
>> to [bus 02] add_size 200000 add_align 100000
>>
> That is for hotplug bridge, then we could use following instead.
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ee72ebe..d3ec833 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
>
>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
>
> +static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
> +{
> +       dev->is_hotplug_bridge = 0;
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> quirk_hotplug_bridge_skip);
> +
>   /*
>    * This is a quirk for the Ricoh MMC controller found as a part of
>    * some mulifunction chips.
Good idea, but in this way we might not have io window allocated for 
it?I'm not sure
if this would break wifi,etc, I'll suggest reporters to have a try.

thanks,
Yu
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas May 31, 2016, 1:16 p.m. UTC | #2
On Tue, May 31, 2016 at 03:18:02PM +0800, Chen Yu wrote:
> On 2016年05月31日 15:00, Yinghai Lu wrote:
> >On Mon, May 30, 2016 at 8:24 PM, Chen Yu <yu.c.chen@intel.com> wrote:
> >
> >>and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal
> >>resource window and then update related base/limit registers:
> >>
> >>[    0.865342] pci 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus
> >>02] add_size 1000
> >>[    0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff
> >>64bit pref] to [bus 02] add_size 200000 add_align 100000
> >>[    0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff]
> >>to [bus 02] add_size 200000 add_align 100000
> >>
> >That is for hotplug bridge, then we could use following instead.
> >
> >diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> >index ee72ebe..d3ec833 100644
> >--- a/drivers/pci/quirks.c
> >+++ b/drivers/pci/quirks.c
> >@@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
> >
> >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> >
> >+static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
> >+{
> >+       dev->is_hotplug_bridge = 0;
> >+}
> >+
> >+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> >quirk_hotplug_bridge_skip);
> >+
> >  /*
> >   * This is a quirk for the Ricoh MMC controller found as a part of
> >   * some mulifunction chips.
> Good idea, but in this way we might not have io window allocated for
> it?I'm not sure
> if this would break wifi,etc, I'll suggest reporters to have a try.

Let's figure out the root cause before trying more random fixes.  I
have the same objection to the patch above.  No doubt there are many
ways we could "fix" this, but unless we know the root cause, we're
likely to make a change that's not a complete fix or will cause other
issues in the future.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chen Yu June 8, 2016, 4:31 a.m. UTC | #3
Hi Bjorn,

On 2016年05月31日 21:16, Bjorn Helgaas wrote:
> On Tue, May 31, 2016 at 03:18:02PM +0800, Chen Yu wrote:
>> On 2016年05月31日 15:00, Yinghai Lu wrote:
>>> On Mon, May 30, 2016 at 8:24 PM, Chen Yu <yu.c.chen@intel.com> wrote:
>>>
>>>> and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal
>>>> resource window and then update related base/limit registers:
>>>>
>>>> [    0.865342] pci 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus
>>>> 02] add_size 1000
>>>> [    0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff
>>>> 64bit pref] to [bus 02] add_size 200000 add_align 100000
>>>> [    0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff]
>>>> to [bus 02] add_size 200000 add_align 100000
>>>>
>>> That is for hotplug bridge, then we could use following instead.
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index ee72ebe..d3ec833 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
>>>
>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
>>>
>>> +static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
>>> +{
>>> +       dev->is_hotplug_bridge = 0;
>>> +}
>>> +
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
>>> quirk_hotplug_bridge_skip);
>>> +
>>>   /*
>>>    * This is a quirk for the Ricoh MMC controller found as a part of
>>>    * some mulifunction chips.
>> Good idea, but in this way we might not have io window allocated for
>> it?I'm not sure
>> if this would break wifi,etc, I'll suggest reporters to have a try.
> Let's figure out the root cause before trying more random fixes.  I
> have the same objection to the patch above.  No doubt there are many
> ways we could "fix" this, but unless we know the root cause, we're
> likely to make a change that's not a complete fix or will cause other
> issues in the future.
>
I just let the reporter  paste their lspci in Mac OS, it looks that Mac 
OS also
does not allocate any resource for this broken pci bridge, and other pci
devices have almost the same resource region as it is in linux, so I think
this is an evidence that Apple does not want to use this firmware for now,
maybe reserved for future use, declaim resource for this pci bridge might
cause unpredictable result, how about adding a dmi+pci quirk for this
special platform?

lspci result from Mac OS, there's no resource behind this bridge:
https://bugzilla.kernel.org/attachment.cgi?id=219321


thanks,
Yu
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 8, 2016, 12:47 p.m. UTC | #4
On Wed, Jun 08, 2016 at 12:31:27PM +0800, Chen Yu wrote:
> Hi Bjorn,
> 
> On 2016年05月31日 21:16, Bjorn Helgaas wrote:
> >On Tue, May 31, 2016 at 03:18:02PM +0800, Chen Yu wrote:
> >>On 2016年05月31日 15:00, Yinghai Lu wrote:
> >>>On Mon, May 30, 2016 at 8:24 PM, Chen Yu <yu.c.chen@intel.com> wrote:
> >>>
> >>>>and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal
> >>>>resource window and then update related base/limit registers:
> >>>>
> >>>>[    0.865342] pci 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus
> >>>>02] add_size 1000
> >>>>[    0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff
> >>>>64bit pref] to [bus 02] add_size 200000 add_align 100000
> >>>>[    0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff]
> >>>>to [bus 02] add_size 200000 add_align 100000
> >>>>
> >>>That is for hotplug bridge, then we could use following instead.
> >>>
> >>>diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> >>>index ee72ebe..d3ec833 100644
> >>>--- a/drivers/pci/quirks.c
> >>>+++ b/drivers/pci/quirks.c
> >>>@@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
> >>>
> >>>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> >>>
> >>>+static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
> >>>+{
> >>>+       dev->is_hotplug_bridge = 0;
> >>>+}
> >>>+
> >>>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> >>>quirk_hotplug_bridge_skip);
> >>>+
> >>>  /*
> >>>   * This is a quirk for the Ricoh MMC controller found as a part of
> >>>   * some mulifunction chips.
> >>Good idea, but in this way we might not have io window allocated for
> >>it?I'm not sure
> >>if this would break wifi,etc, I'll suggest reporters to have a try.
> >Let's figure out the root cause before trying more random fixes.  I
> >have the same objection to the patch above.  No doubt there are many
> >ways we could "fix" this, but unless we know the root cause, we're
> >likely to make a change that's not a complete fix or will cause other
> >issues in the future.
> >
> I just let the reporter  paste their lspci in Mac OS, it looks that
> Mac OS also
> does not allocate any resource for this broken pci bridge, and other pci
> devices have almost the same resource region as it is in linux, so I think
> this is an evidence that Apple does not want to use this firmware for now,
> maybe reserved for future use, declaim resource for this pci bridge might
> cause unpredictable result, how about adding a dmi+pci quirk for this
> special platform?

This is not a root cause.  We still have no idea why the problem
occurs.  I do not think a quirk is a good idea until we know what the
problem is and exactly how a quirk would fix it.

> lspci result from Mac OS, there's no resource behind this bridge:
> https://bugzilla.kernel.org/attachment.cgi?id=219321
> 
> 
> thanks,
> Yu
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 9, 2016, 4:50 p.m. UTC | #5
On Wed, Jun 08, 2016 at 07:47:42AM -0500, Bjorn Helgaas wrote:
> On Wed, Jun 08, 2016 at 12:31:27PM +0800, Chen Yu wrote:
> > Hi Bjorn,
> > 
> > On 2016年05月31日 21:16, Bjorn Helgaas wrote:
> > >On Tue, May 31, 2016 at 03:18:02PM +0800, Chen Yu wrote:
> > >>On 2016年05月31日 15:00, Yinghai Lu wrote:
> > >>>On Mon, May 30, 2016 at 8:24 PM, Chen Yu <yu.c.chen@intel.com> wrote:
> > >>>
> > >>>>and then in pcibios_assign_resources, 0000:00:1c.0 tries to allocate minimal
> > >>>>resource window and then update related base/limit registers:
> > >>>>
> > >>>>[    0.865342] pci 0000:00:1c.0: bridge window [io  0x1000-0x0fff] to [bus
> > >>>>02] add_size 1000
> > >>>>[    0.865343] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff
> > >>>>64bit pref] to [bus 02] add_size 200000 add_align 100000
> > >>>>[    0.865344] pci 0000:00:1c.0: bridge window [mem 0x00100000-0x000fffff]
> > >>>>to [bus 02] add_size 200000 add_align 100000
> > >>>>
> > >>>That is for hotplug bridge, then we could use following instead.
> > >>>
> > >>>diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > >>>index ee72ebe..d3ec833 100644
> > >>>--- a/drivers/pci/quirks.c
> > >>>+++ b/drivers/pci/quirks.c
> > >>>@@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
> > >>>
> > >>>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> > >>>
> > >>>+static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
> > >>>+{
> > >>>+       dev->is_hotplug_bridge = 0;
> > >>>+}
> > >>>+
> > >>>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> > >>>quirk_hotplug_bridge_skip);
> > >>>+
> > >>>  /*
> > >>>   * This is a quirk for the Ricoh MMC controller found as a part of
> > >>>   * some mulifunction chips.
> > >>Good idea, but in this way we might not have io window allocated for
> > >>it?I'm not sure
> > >>if this would break wifi,etc, I'll suggest reporters to have a try.
> > >Let's figure out the root cause before trying more random fixes.  I
> > >have the same objection to the patch above.  No doubt there are many
> > >ways we could "fix" this, but unless we know the root cause, we're
> > >likely to make a change that's not a complete fix or will cause other
> > >issues in the future.
> > >
> > I just let the reporter  paste their lspci in Mac OS, it looks that
> > Mac OS also
> > does not allocate any resource for this broken pci bridge, and other pci
> > devices have almost the same resource region as it is in linux, so I think
> > this is an evidence that Apple does not want to use this firmware for now,
> > maybe reserved for future use, declaim resource for this pci bridge might
> > cause unpredictable result, how about adding a dmi+pci quirk for this
> > special platform?
> 
> This is not a root cause.  We still have no idea why the problem
> occurs.  I do not think a quirk is a good idea until we know what the
> problem is and exactly how a quirk would fix it.
> 
> > lspci result from Mac OS, there's no resource behind this bridge:
> > https://bugzilla.kernel.org/attachment.cgi?id=219321

Here are some ideas for debugging this:

1) Apparently the problem is sensitive to programming the prefetchable
   memory aperture of 00:1c.0 to [mem 0x7fc00000-0x7fdfffff 64bit
   pref].  The only effect of that *should* be that the bridge will
   now claim accesses in the aperture, when it didn't before.

   We *think* there's nothing else at the address of that aperture,
   but if there is an unreported device there, it may stop working.  I
   would pore over the E820 memory map, EFI memory map, all ACPI _CRS
   methods, etc., looking for anything in that area that we haven't
   accounted for.

   There are lots of anomalies in this system, e.g., (these are from
   Bastien's dmesg log at
   https://bugzilla.kernel.org/attachment.cgi?id=208961)

     BIOS-e820: [mem 0x00000000e00f8000-0x00000000e00f8fff] reserved
     PCI: MMCONFIG for domain 0000 [bus 00-9b] at [mem 0xe0000000-0xe9bfffff] (base 0xe0000000)
     acpi PNP0A08:00: [Firmware Info]: MMCONFIG for domain 0000 [bus 00-9b] only partially covers this bridge
     pci_bus 0000:00: root bus resource [mem 0x7fa00000-0xfeafffff window]
     system 00:04: [mem 0xe0000000-0xefffffff] could not be reserved
     ACPI Warning: SystemIO range 0x000000000000EFA0-0x000000000000EFBF conflicts with OpRegion 0x000000000000EFA0-0x000000000000EFAF (\_SB.PCI0.SBUS.SMBI)

   The MMCONFIG area appears correctly described in the 00:04 _CRS,
   but incorrectly in MCFG.  The E820 region appears to be a chunk in
   the middle of the MMCONFIG area.  The host bridge window 
   [mem 0x7fa00000-0xfeafffff window] is clearly bogus -- it includes
   the MMCONFIG area, which is definitely not a window.  I doubt
   anything above 0xdfffffff should be included.

   I don't know what the ACPI conflict warning is about, but I'd try
   to figure it out.

   Either the firmware is badly broken, or we're not interpreting
   something correctly.

   I'd try assigning the 00:1c.0 aperture at the end of the actual
   aperture instead of the beginning.  I think Windows, and likely
   MacOS uses a top-down allocation strategy instead of bottom-up like
   Linux does.  If that makes poweroff work, there is likely an
   unreported device in the [mem 0x7fc00000-0x7fdfffff 64bit pref]
   area.

   This experimentation could all be done with setpci, without
   requiring kernel patches.

2) I would explore exactly what the grub "halt" command does and how
   it compares to what Linux is doing.  I see the assertion that this
   is related to [io 0x1804], but I don't know what that's based on.
   Programming the 00:1c.0 prefetchable aperture shouldn't have
   anything to do with I/O ports, so if it really is related, there
   might be some SMM magic or something where SMM code is doing
   something that relates to the memory aperture.

3) The lspci output at https://bugzilla.kernel.org/attachment.cgi?id=219321
   (I think this is from MacOS) shows invalid data for devices
   starting at 04:00.0.  Why?  Maybe this is an unrelated artifact,
   but it doesn't smell right.

4) If you can hot-add devices under MacOS, look to see what address
   space they get assigned.  That may tell you what allocation
   strategy MacOS uses.

5) 00:1c.0 claims to have a slot that supports hotplug.  Is that
   actually true?  Could you add a device below it?  If not, maybe the
   problem is that the BIOS should have configured 00:1c.0 so it
   doesn't report a slot.  If it didn't report a slot, we shouldn't
   assign resources to it, since there is no possibility of a device
   below it.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 9, 2016, 5:04 p.m. UTC | #6
On Thu, Jun 09, 2016 at 11:50:11AM -0500, Bjorn Helgaas wrote:

> 5) 00:1c.0 claims to have a slot that supports hotplug.  Is that
>    actually true?  Could you add a device below it?  If not, maybe the
>    problem is that the BIOS should have configured 00:1c.0 so it
>    doesn't report a slot.  If it didn't report a slot, we shouldn't
>    assign resources to it, since there is no possibility of a device
>    below it.

Of course, this would only be *part* of the problem, because a hot-added
device somewhere else could still be assigned the space at 
[mem 0x7fc00000-0x7fdfffff].

This just smells like an unreported device in there somewhere.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ee72ebe..d3ec833 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2775,6 +2775,13 @@  static void quirk_hotplug_bridge(struct pci_dev *dev)

 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);

+static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
+{
+       dev->is_hotplug_bridge = 0;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
quirk_hotplug_bridge_skip);
+
 /*
  * This is a quirk for the Ricoh MMC controller found as a part of