Message ID | 1364444938-19801-1-git-send-email-yinghai@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Wednesday, March 27, 2013 09:28:58 PM Yinghai Lu wrote: > Found problem on system that firmware that could handle pci aer. > Firmware get error reporting after pci injecting error, before os boots. > But after os boots, firmware can not get report anymore, even pci=noaer > is passed. > > Root cause: BIOS _OSC has problem with query bit checking. > It turns out that BIOS vendor is copying example code from ACPI Spec. > In ACPI Spec 5.0, page 290: > > If (Not(And(CDW1,1))) // Query flag clear? > { // Disable GPEs for features granted native control. > If (And(CTRL,0x01)) // Hot plug control granted? > { > Store(0,HPCE) // clear the hot plug SCI enable bit > Store(1,HPCS) // clear the hot plug SCI status bit > } > ... > } > > When Query flag is set, And(CDW1,1) will be 1, Not(1) will return 0xfffffffe. > So it will get into code path that should be for control set only. > BIOS acpi code should be changed to "If (LEqual(And(CDW1,1), 0)))" > > Current kernel code is using _OSC query to notify firmware about support > from OS and then use _OSC to set control bits. > During query support, current code is using all possible controls. > So will execute code that should be only for control set stage. > > That will have problem when pci=noaer or aer firmware_first is used. > As firmware have that control set for os aer already in query support stage, > but later will not os aer handling. > > We should avoid passing all possible controls, just use osc_control_set > instead. > That should workaround BIOS bugs with affected systems on the field > as more bios vendors are copying sample code from ACPI spec. > > Signed-off-by: Yinghai Lu <yinghai@kernel.org> > Cc: stable@kernel.org > > --- > drivers/acpi/pci_root.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > Index: linux-2.6/drivers/acpi/pci_root.c > =================================================================== > --- linux-2.6.orig/drivers/acpi/pci_root.c > +++ linux-2.6/drivers/acpi/pci_root.c > @@ -201,8 +201,8 @@ static acpi_status acpi_pci_query_osc(st > *control &= OSC_PCI_CONTROL_MASKS; > capbuf[OSC_CONTROL_TYPE] = *control | root->osc_control_set; > } else { > - /* Run _OSC query for all possible controls. */ > - capbuf[OSC_CONTROL_TYPE] = OSC_PCI_CONTROL_MASKS; > + /* Run _OSC query only with existing controls. */ > + capbuf[OSC_CONTROL_TYPE] = root->osc_control_set; I suppose we can do that, but then why this should be root->osc_control_set and not just 0? > } > > status = acpi_pci_run_osc(root->device->handle, capbuf, &result); > -- Rafael
On Fri, Mar 29, 2013 at 5:36 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> - /* Run _OSC query for all possible controls. */ >> - capbuf[OSC_CONTROL_TYPE] = OSC_PCI_CONTROL_MASKS; >> + /* Run _OSC query only with existing controls. */ >> + capbuf[OSC_CONTROL_TYPE] = root->osc_control_set; > > I suppose we can do that, but then why this should be root->osc_control_set and > not just 0? in case query support and set control are called in mixed sequence. And ACPI spec says if control set and can not be revoked. also when it control is passed, it is always OR with root->os_control_set. capbuf[OSC_CONTROL_TYPE] = *control | root->osc_control_set; Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Mar 29, 2013 at 6:12 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Friday, March 29, 2013 06:02:45 PM Yinghai Lu wrote: >> On Fri, Mar 29, 2013 at 5:36 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> >> - /* Run _OSC query for all possible controls. */ >> >> - capbuf[OSC_CONTROL_TYPE] = OSC_PCI_CONTROL_MASKS; >> >> + /* Run _OSC query only with existing controls. */ >> >> + capbuf[OSC_CONTROL_TYPE] = root->osc_control_set; >> > >> > I suppose we can do that, but then why this should be root->osc_control_set and >> > not just 0? >> >> in case query support and set control are called in mixed sequence. > > OK, that's a good enough reason I think. > > I'm kind of afarid of regressions that may result from this, though, so I'm > going to queue it up for 3.10. Ok, Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Friday, March 29, 2013 06:02:45 PM Yinghai Lu wrote: > On Fri, Mar 29, 2013 at 5:36 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > >> - /* Run _OSC query for all possible controls. */ > >> - capbuf[OSC_CONTROL_TYPE] = OSC_PCI_CONTROL_MASKS; > >> + /* Run _OSC query only with existing controls. */ > >> + capbuf[OSC_CONTROL_TYPE] = root->osc_control_set; > > > > I suppose we can do that, but then why this should be root->osc_control_set and > > not just 0? > > in case query support and set control are called in mixed sequence. OK, that's a good enough reason I think. I'm kind of afarid of regressions that may result from this, though, so I'm going to queue it up for 3.10. Thanks, Rafael
[+cc Bob for spec typo question] On Wed, Mar 27, 2013 at 10:28 PM, Yinghai Lu <yinghai@kernel.org> wrote: > Found problem on system that firmware that could handle pci aer. > Firmware get error reporting after pci injecting error, before os boots. > But after os boots, firmware can not get report anymore, even pci=noaer > is passed. > > Root cause: BIOS _OSC has problem with query bit checking. > It turns out that BIOS vendor is copying example code from ACPI Spec. > In ACPI Spec 5.0, page 290: > > If (Not(And(CDW1,1))) // Query flag clear? > { // Disable GPEs for features granted native control. > If (And(CTRL,0x01)) // Hot plug control granted? > { > Store(0,HPCE) // clear the hot plug SCI enable bit > Store(1,HPCS) // clear the hot plug SCI status bit > } > ... > } > > When Query flag is set, And(CDW1,1) will be 1, Not(1) will return 0xfffffffe. > So it will get into code path that should be for control set only. > BIOS acpi code should be changed to "If (LEqual(And(CDW1,1), 0)))" Isn't this just a typo in the spec? Shouldn't it be using "LNot" instead of "Not"? If (LNot(And(CDW1,1))) // Query flag clear? Of course, that doesn't change the need for your Linux change, though a comment about the hazard might be nice for future readers. > Current kernel code is using _OSC query to notify firmware about support > from OS and then use _OSC to set control bits. > During query support, current code is using all possible controls. > So will execute code that should be only for control set stage. > > That will have problem when pci=noaer or aer firmware_first is used. > As firmware have that control set for os aer already in query support stage, > but later will not os aer handling. > > We should avoid passing all possible controls, just use osc_control_set > instead. > That should workaround BIOS bugs with affected systems on the field > as more bios vendors are copying sample code from ACPI spec. > > Signed-off-by: Yinghai Lu <yinghai@kernel.org> > Cc: stable@kernel.org > > --- > drivers/acpi/pci_root.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > Index: linux-2.6/drivers/acpi/pci_root.c > =================================================================== > --- linux-2.6.orig/drivers/acpi/pci_root.c > +++ linux-2.6/drivers/acpi/pci_root.c > @@ -201,8 +201,8 @@ static acpi_status acpi_pci_query_osc(st > *control &= OSC_PCI_CONTROL_MASKS; > capbuf[OSC_CONTROL_TYPE] = *control | root->osc_control_set; > } else { > - /* Run _OSC query for all possible controls. */ > - capbuf[OSC_CONTROL_TYPE] = OSC_PCI_CONTROL_MASKS; > + /* Run _OSC query only with existing controls. */ > + capbuf[OSC_CONTROL_TYPE] = root->osc_control_set; > } > > status = acpi_pci_run_osc(root->device->handle, capbuf, &result); -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: linux-2.6/drivers/acpi/pci_root.c =================================================================== --- linux-2.6.orig/drivers/acpi/pci_root.c +++ linux-2.6/drivers/acpi/pci_root.c @@ -201,8 +201,8 @@ static acpi_status acpi_pci_query_osc(st *control &= OSC_PCI_CONTROL_MASKS; capbuf[OSC_CONTROL_TYPE] = *control | root->osc_control_set; } else { - /* Run _OSC query for all possible controls. */ - capbuf[OSC_CONTROL_TYPE] = OSC_PCI_CONTROL_MASKS; + /* Run _OSC query only with existing controls. */ + capbuf[OSC_CONTROL_TYPE] = root->osc_control_set; } status = acpi_pci_run_osc(root->device->handle, capbuf, &result);
Found problem on system that firmware that could handle pci aer. Firmware get error reporting after pci injecting error, before os boots. But after os boots, firmware can not get report anymore, even pci=noaer is passed. Root cause: BIOS _OSC has problem with query bit checking. It turns out that BIOS vendor is copying example code from ACPI Spec. In ACPI Spec 5.0, page 290: If (Not(And(CDW1,1))) // Query flag clear? { // Disable GPEs for features granted native control. If (And(CTRL,0x01)) // Hot plug control granted? { Store(0,HPCE) // clear the hot plug SCI enable bit Store(1,HPCS) // clear the hot plug SCI status bit } ... } When Query flag is set, And(CDW1,1) will be 1, Not(1) will return 0xfffffffe. So it will get into code path that should be for control set only. BIOS acpi code should be changed to "If (LEqual(And(CDW1,1), 0)))" Current kernel code is using _OSC query to notify firmware about support from OS and then use _OSC to set control bits. During query support, current code is using all possible controls. So will execute code that should be only for control set stage. That will have problem when pci=noaer or aer firmware_first is used. As firmware have that control set for os aer already in query support stage, but later will not os aer handling. We should avoid passing all possible controls, just use osc_control_set instead. That should workaround BIOS bugs with affected systems on the field as more bios vendors are copying sample code from ACPI spec. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: stable@kernel.org --- drivers/acpi/pci_root.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html