diff mbox

PCI: Quirk for ASUS S3 issue

Message ID Pine.LNX.4.44L0.1207041442430.15857-100000@netrider.rowland.org (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Alan Stern July 4, 2012, 6:47 p.m. UTC
On Wed, 4 Jul 2012, AceLan Kao wrote:

> We contacted the ASUS' BIOS engineer and try to verify this issue, for
> it happened
> on many ASUS' machines. Then they found that the system hangs in their
> BIOS code.
> The code will try to disable the USB if they found the PCI COMMAND
> register is not zero.
> 
> That's not a correct behavior that BIOS should do, the PCI COMMAND
> register is not
> represent if the USB is disabled or not, It's a workaround they tried to fix
> another issue in windows long time ago, but ASUS' BIOS engineer refuse to remove
> that part of code. But windows will clear the PCI COMMAND register if
> windows is already
> disabled the USB.
> So, I try to write this quirk.

> > Is there any reason not to clear the PCI COMMAND register of every PCI
> > USB host controller when entering S3?
> Quote from ASUS, they only mentioned EHCI
> ---------
> BIOS will check EHCI command register PCI regiter offset 04h to see if
> USB is disabled or not.
> Because regiter offset 04h is not cleared, BIOS think USB is not disabled.
> Then BIOS will try to disabled USB, but the USB is disabled by Ubuntu
> already. This conflict will cause system hang.
> ASUS think since Ubuntu will disable USB, it also need to clear the
> register too.
> ---------

How about this patch instead?  (I haven't tested it yet...)

Rafael, does this seem okay with no special quirk flag?  Or should the 
command register be cleared as part of the USB code?

Alan Stern




--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rafael Wysocki July 4, 2012, 9:14 p.m. UTC | #1
On Wednesday, July 04, 2012, Alan Stern wrote:
> On Wed, 4 Jul 2012, AceLan Kao wrote:
> 
> > We contacted the ASUS' BIOS engineer and try to verify this issue, for
> > it happened
> > on many ASUS' machines. Then they found that the system hangs in their
> > BIOS code.
> > The code will try to disable the USB if they found the PCI COMMAND
> > register is not zero.
> > 
> > That's not a correct behavior that BIOS should do, the PCI COMMAND
> > register is not
> > represent if the USB is disabled or not, It's a workaround they tried to fix
> > another issue in windows long time ago, but ASUS' BIOS engineer refuse to remove
> > that part of code. But windows will clear the PCI COMMAND register if
> > windows is already
> > disabled the USB.
> > So, I try to write this quirk.
> 
> > > Is there any reason not to clear the PCI COMMAND register of every PCI
> > > USB host controller when entering S3?
> > Quote from ASUS, they only mentioned EHCI
> > ---------
> > BIOS will check EHCI command register PCI regiter offset 04h to see if
> > USB is disabled or not.
> > Because regiter offset 04h is not cleared, BIOS think USB is not disabled.
> > Then BIOS will try to disabled USB, but the USB is disabled by Ubuntu
> > already. This conflict will cause system hang.
> > ASUS think since Ubuntu will disable USB, it also need to clear the
> > register too.
> > ---------
> 
> How about this patch instead?  (I haven't tested it yet...)
> 
> Rafael, does this seem okay with no special quirk flag?  Or should the 
> command register be cleared as part of the USB code?

Well, the clearing of it in the USB code would probably be too early,
so we don't really have much choice.

I suppose we can add pci_fixup_suspend_noirq for that in analogy with
pci_fixup_suspend, but I'm not sure it's worth the effort.

Bjorn, what do you think?

Rafael



> Index: usb-3.5/drivers/pci/pci-driver.c
> ===================================================================
> --- usb-3.5.orig/drivers/pci/pci-driver.c
> +++ usb-3.5/drivers/pci/pci-driver.c
> @@ -748,6 +748,18 @@ static int pci_pm_suspend_noirq(struct d
>  
>  	pci_pm_set_unknown_state(pci_dev);
>  
> +	/*
> +	 * Some BIOSes from ASUS have a bug: If a USB EHCI host controller's
> +	 * PCI COMMAND register isn't 0, the BIOS assumes that the controller
> +	 * hasn't been quiesced and tries to turn it off.  If the controller
> +	 * is already in D3, this can hang or cause memory corruption.
> +	 *
> +	 * Since the value of the COMMAND register doesn't matter once the
> +	 * device has been suspended, we can safely set it to 0 here.
> +	 */
> +	if (pci_dev->class == PCI_CLASS_SERIAL_USB_EHCI)
> +		pci_write_config_word(pci_dev, PCI_COMMAND, 0);
> +
>  	return 0;
>  }
>  
> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 5, 2012, 6:17 p.m. UTC | #2
On Wed, Jul 4, 2012 at 3:14 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Wednesday, July 04, 2012, Alan Stern wrote:
>> On Wed, 4 Jul 2012, AceLan Kao wrote:
>>
>> > We contacted the ASUS' BIOS engineer and try to verify this issue, for
>> > it happened
>> > on many ASUS' machines. Then they found that the system hangs in their
>> > BIOS code.
>> > The code will try to disable the USB if they found the PCI COMMAND
>> > register is not zero.
>> >
>> > That's not a correct behavior that BIOS should do, the PCI COMMAND
>> > register is not
>> > represent if the USB is disabled or not, It's a workaround they tried to fix
>> > another issue in windows long time ago, but ASUS' BIOS engineer refuse to remove
>> > that part of code. But windows will clear the PCI COMMAND register if
>> > windows is already
>> > disabled the USB.
>> > So, I try to write this quirk.
>>
>> > > Is there any reason not to clear the PCI COMMAND register of every PCI
>> > > USB host controller when entering S3?
>> > Quote from ASUS, they only mentioned EHCI
>> > ---------
>> > BIOS will check EHCI command register PCI regiter offset 04h to see if
>> > USB is disabled or not.
>> > Because regiter offset 04h is not cleared, BIOS think USB is not disabled.
>> > Then BIOS will try to disabled USB, but the USB is disabled by Ubuntu
>> > already. This conflict will cause system hang.
>> > ASUS think since Ubuntu will disable USB, it also need to clear the
>> > register too.
>> > ---------
>>
>> How about this patch instead?  (I haven't tested it yet...)
>>
>> Rafael, does this seem okay with no special quirk flag?  Or should the
>> command register be cleared as part of the USB code?
>
> Well, the clearing of it in the USB code would probably be too early,
> so we don't really have much choice.
>
> I suppose we can add pci_fixup_suspend_noirq for that in analogy with
> pci_fixup_suspend, but I'm not sure it's worth the effort.
>
> Bjorn, what do you think?

I agree; it doesn't seem like a new fixup type is worth the trouble.

I wish we knew more about what's really going on here.  I looked
through the bug reports AceLan dug up, and I don't get the feeling
that anybody really has a grand unified theory about why Windows works
but Linux doesn't.  But I'm willing to apply one of these patches if
you and/or Alan sign off on it.

>> Index: usb-3.5/drivers/pci/pci-driver.c
>> ===================================================================
>> --- usb-3.5.orig/drivers/pci/pci-driver.c
>> +++ usb-3.5/drivers/pci/pci-driver.c
>> @@ -748,6 +748,18 @@ static int pci_pm_suspend_noirq(struct d
>>
>>       pci_pm_set_unknown_state(pci_dev);
>>
>> +     /*
>> +      * Some BIOSes from ASUS have a bug: If a USB EHCI host controller's
>> +      * PCI COMMAND register isn't 0, the BIOS assumes that the controller
>> +      * hasn't been quiesced and tries to turn it off.  If the controller
>> +      * is already in D3, this can hang or cause memory corruption.
>> +      *
>> +      * Since the value of the COMMAND register doesn't matter once the
>> +      * device has been suspended, we can safely set it to 0 here.
>> +      */
>> +     if (pci_dev->class == PCI_CLASS_SERIAL_USB_EHCI)
>> +             pci_write_config_word(pci_dev, PCI_COMMAND, 0);
>> +
>>       return 0;
>>  }
>>
>>
>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern July 5, 2012, 9:33 p.m. UTC | #3
On Thu, 5 Jul 2012, Bjorn Helgaas wrote:

> I wish we knew more about what's really going on here.  I looked
> through the bug reports AceLan dug up, and I don't get the feeling
> that anybody really has a grand unified theory about why Windows works
> but Linux doesn't.  But I'm willing to apply one of these patches if
> you and/or Alan sign off on it.

His description seems to make sense, even if it may not be complete.
  
Windows sets the COMMAND register to 0 before suspending, so it doesn't
trigger the BIOS bus.  Linux doesn't, so the BIOS tries to quiesce the
EHCI controller.  This involves doing various MMIO accesses, which of
course don't work if the controller is already in D3.

For obvious reasons, I'm in favor of my two-line change over AceLan's
new PCI quirk.  So far there have been positive responses from a few 
volunteers testing the patch.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki July 5, 2012, 10 p.m. UTC | #4
On Thursday, July 05, 2012, Alan Stern wrote:
> On Thu, 5 Jul 2012, Bjorn Helgaas wrote:
> 
> > I wish we knew more about what's really going on here.  I looked
> > through the bug reports AceLan dug up, and I don't get the feeling
> > that anybody really has a grand unified theory about why Windows works
> > but Linux doesn't.  But I'm willing to apply one of these patches if
> > you and/or Alan sign off on it.
> 
> His description seems to make sense, even if it may not be complete.
>   
> Windows sets the COMMAND register to 0 before suspending, so it doesn't
> trigger the BIOS bus.  Linux doesn't, so the BIOS tries to quiesce the
> EHCI controller.  This involves doing various MMIO accesses, which of
> course don't work if the controller is already in D3.
> 
> For obvious reasons, I'm in favor of my two-line change over AceLan's
> new PCI quirk.  So far there have been positive responses from a few 
> volunteers testing the patch.

The two-liner is good enough, I think, and tested.  We can always add a quirk
based on it in the future.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 5, 2012, 10:39 p.m. UTC | #5
On Thu, Jul 5, 2012 at 4:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday, July 05, 2012, Alan Stern wrote:
>> On Thu, 5 Jul 2012, Bjorn Helgaas wrote:
>>
>> > I wish we knew more about what's really going on here.  I looked
>> > through the bug reports AceLan dug up, and I don't get the feeling
>> > that anybody really has a grand unified theory about why Windows works
>> > but Linux doesn't.  But I'm willing to apply one of these patches if
>> > you and/or Alan sign off on it.
>>
>> His description seems to make sense, even if it may not be complete.
>>
>> Windows sets the COMMAND register to 0 before suspending, so it doesn't
>> trigger the BIOS bus.  Linux doesn't, so the BIOS tries to quiesce the
>> EHCI controller.  This involves doing various MMIO accesses, which of
>> course don't work if the controller is already in D3.

Should we clear the COMMAND register for more than just EHCI devices?
It seems like known differences from Windows are just waiting to bite
us.

I like your patch and explanation for how this can cause a hang...
that helps make sense of things.

>> For obvious reasons, I'm in favor of my two-line change over AceLan's
>> new PCI quirk.  So far there have been positive responses from a few
>> volunteers testing the patch.
>
> The two-liner is good enough, I think, and tested.  We can always add a quirk
> based on it in the future.

Cool, do you want to send me a patch with Signed-off-by, etc?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alan Stern July 6, 2012, 2:01 a.m. UTC | #6
On Thu, 5 Jul 2012, Bjorn Helgaas wrote:

> >> Windows sets the COMMAND register to 0 before suspending, so it doesn't
> >> trigger the BIOS bus.  Linux doesn't, so the BIOS tries to quiesce the
> >> EHCI controller.  This involves doing various MMIO accesses, which of
> >> course don't work if the controller is already in D3.
> 
> Should we clear the COMMAND register for more than just EHCI devices?
> It seems like known differences from Windows are just waiting to bite
> us.

You're undoubtedly right, but I don't know what Windows does!  I'm just 
relying on what AceLan reported.

For the patch, I wanted to change the system behavior as little as
possible -- hence the restriction to EHCI controllers.  For all I know
it might be perfectly okay to do this with all PCI devices, but it
seems more likely that it would cause trouble in some obscure cases.  

And at least with EHCI controllers, I know what the underlying driver
is doing.

> I like your patch and explanation for how this can cause a hang...
> that helps make sense of things.
> 
> >> For obvious reasons, I'm in favor of my two-line change over AceLan's
> >> new PCI quirk.  So far there have been positive responses from a few
> >> volunteers testing the patch.
> >
> > The two-liner is good enough, I think, and tested.  We can always add a quirk
> > based on it in the future.
> 
> Cool, do you want to send me a patch with Signed-off-by, etc?

I'll send it early next week.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: usb-3.5/drivers/pci/pci-driver.c
===================================================================
--- usb-3.5.orig/drivers/pci/pci-driver.c
+++ usb-3.5/drivers/pci/pci-driver.c
@@ -748,6 +748,18 @@  static int pci_pm_suspend_noirq(struct d
 
 	pci_pm_set_unknown_state(pci_dev);
 
+	/*
+	 * Some BIOSes from ASUS have a bug: If a USB EHCI host controller's
+	 * PCI COMMAND register isn't 0, the BIOS assumes that the controller
+	 * hasn't been quiesced and tries to turn it off.  If the controller
+	 * is already in D3, this can hang or cause memory corruption.
+	 *
+	 * Since the value of the COMMAND register doesn't matter once the
+	 * device has been suspended, we can safely set it to 0 here.
+	 */
+	if (pci_dev->class == PCI_CLASS_SERIAL_USB_EHCI)
+		pci_write_config_word(pci_dev, PCI_COMMAND, 0);
+
 	return 0;
 }