bisected regression, v3.5 -> next-20120724: PCI PM causes USB hotplug failure

Message ID 1343292851.3874.12.camel@yhuang-dev (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Huang, Ying July 26, 2012, 8:54 a.m. UTC
On Wed, 2012-07-25 at 15:58 +0200, Bjørn Mork wrote:
> huang ying <huang.ying.caritas@gmail.com> writes:
> 
> > Hi, Bjorn,
> >
> > Thank you very much for your detailed information.
> >
> > On Wed, Jul 25, 2012 at 5:58 PM, Bjørn Mork <bjorn@mork.no> wrote:
> >> Huang Ying <ying.huang@intel.com> writes:
> >>> On Wed, 2012-07-25 at 06:08 +0200, Bjørn Mork wrote:
> >>>> Enabling autosuspend for USB causes hotplug failure in the current
> >>>> linux-next. Newly plugged devices are not detected at all until the
> >>>> port/controller is manually powered on by writing "on" to power/control.
> >>>> Testing is pretty simple:
> >>>>
> >>>>   1) for f in /sys/bus/usb/devices/*/power/control; do echo auto > $f; done
> >>>
> >>> Have you done:
> >>>
> >>> for f in /sys/bus/pci/devices/*/power/control; do echo auto > $f; done
> >>>
> >>> ?
> >>>
> >>> If not, the pci device will not be suspended at all.
> >>
> >> Yes, sorry for missing that.  I had it automatically enabled.  Yes,
> >> autosuspend for the PCI device and all child devices must be enabled for
> >> the device to be suspended at all, of course.
> >>
> >>>>   2) wait for the controllers to suspend
> >>>>   3) plug in a new USB device
> >>>
> >>> After plugging in the new USB device, is there anything in dmesg?
> >>
> >> No. Absolutely nothing, so the USB device is not enumerated.  Another
> >> indication of the same:  Plugging a device like an Android phone, which
> >> normally detects being connected to a host and presents a device type
> >> menu to the user, results in the charging LED lighting up but no menu.
> >>
> >>
> >> Trying to show the sequence of events:
> >>
> >> 1)  the controllers are suspended:
> >>
> >> Jul 25 11:27:12 nemi kernel: [   38.962792] uhci_hcd 0000:00:1a.2: power state changed by ACPI to D2
> >> Jul 25 11:27:12 nemi kernel: [   39.006718] uhci_hcd 0000:00:1d.0: power state changed by ACPI to D2
> >> Jul 25 11:27:15 nemi kernel: [   41.808471] uhci_hcd 0000:00:1a.0: power state changed by ACPI to D2
> >> Jul 25 11:27:15 nemi kernel: [   41.824123] ehci_hcd 0000:00:1a.7: power state changed by ACPI to D2
> >> Jul 25 11:27:15 nemi kernel: [   41.824194] ehci_hcd 0000:00:1d.7: power state changed by ACPI to D2
> >
> > Here the uhci controller is put into D2.
> >
> > [snip]
> >>
> >> Doing the same with commit 448bd857d reverted:
> >>
> >>
> >> 1)  the controllers are suspended (to state D3? instead of D2?):
> >>
> >> Jul 25 11:34:01 nemi kernel: [   37.064955] uhci_hcd 0000:00:1a.2: power state changed by ACPI to D3
> >> Jul 25 11:34:01 nemi kernel: [   37.106586] uhci_hcd 0000:00:1d.0: power state changed by ACPI to D3
> >> Jul 25 11:34:04 nemi kernel: [   39.808329] uhci_hcd 0000:00:1a.0: power state changed by ACPI to D3
> >> Jul 25 11:34:04 nemi kernel: [   39.840054] ehci_hcd 0000:00:1d.7: power state changed by ACPI to D3
> >> Jul 25 11:34:04 nemi kernel: [   39.840068] ehci_hcd 0000:00:1a.7: power state changed by ACPI to D3
> >
> > With the commit reverted, the uhci controller is put into D3 (ACPI D3cold).
> >
> > The uhci controller on your system may not work properly in the D2
> > state while being OK in D3, and the commit makes the uhci controller
> > choose D2 instead of D3.
> >
> > Please try the following command line before testing.
> >
> > for f in /sys/bus/pci/devices/*/d3cold_allowed; do echo 1 > $f; done
> 
> That made it work.  The USB controllers ended up in D4 though:
> 
> Jul 25 15:52:33 nemi kernel: [  152.753280] uhci_hcd 0000:00:1a.0: power state changed by ACPI to D0
> Jul 25 15:52:33 nemi kernel: [  152.753453] uhci_hcd 0000:00:1a.0: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.753619] uhci_hcd 0000:00:1a.0: power state changed by ACPI to D4
> Jul 25 15:52:33 nemi kernel: [  152.753875] uhci_hcd 0000:00:1a.1: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.754153] uhci_hcd 0000:00:1a.2: power state changed by ACPI to D0
> Jul 25 15:52:33 nemi kernel: [  152.754279] uhci_hcd 0000:00:1a.2: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.754432] uhci_hcd 0000:00:1a.2: power state changed by ACPI to D4
> Jul 25 15:52:33 nemi kernel: [  152.754668] ehci_hcd 0000:00:1a.7: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.768089] ehci_hcd 0000:00:1a.7: power state changed by ACPI to D4
> Jul 25 15:52:33 nemi kernel: [  152.768573] uhci_hcd 0000:00:1d.0: power state changed by ACPI to D0
> Jul 25 15:52:33 nemi kernel: [  152.768726] uhci_hcd 0000:00:1d.0: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.768891] uhci_hcd 0000:00:1d.0: power state changed by ACPI to D4
> Jul 25 15:52:33 nemi kernel: [  152.769144] uhci_hcd 0000:00:1d.1: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.769530] uhci_hcd 0000:00:1d.2: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.769902] ehci_hcd 0000:00:1d.7: setting latency timer to 64
> Jul 25 15:52:33 nemi kernel: [  152.784189] ehci_hcd 0000:00:1d.7: power state changed by ACPI to D4
> 
> 
> was that expected?  Anyway, waking up the controller from this state by
> plugging a USB device works, so it's a perfectly OK workaround.  Is this
> something that could/should be implemented as a system specific quirk,
> or is the problem more generic?  Even if it is an ACPI implementation
> issue I would expect it to be common to a number of Lenovo laptops of
> the same generation as mine (~2008).
> 
> > And please provide the output of the following command line.
> >
> > acpidump
> 
> Attached.  Thanks a lot for all your help debugging this issue.
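
For anyone repeating the test quoted above, the two "echo auto" loops can
be wrapped in one helper.  This is only a sketch: the function name
set_power_control is made up for illustration, and the directory
argument is an assumption added so it can be tried on a scratch tree
before touching the real /sys.

```shell
# Sketch: write one value to every power/control file found directly
# under the device directories of "devdir".
# Hypothetical helper, not part of any existing tool.
set_power_control() {
    devdir=$1   # e.g. /sys/bus/usb/devices or /sys/bus/pci/devices
    value=$2    # "auto" or "on"
    for f in "$devdir"/*/power/control; do
        # Skip the unexpanded glob when nothing matches.
        [ -e "$f" ] || continue
        # Report, but do not abort on, files that reject the write.
        echo "$value" > "$f" || echo "skipped: $f" >&2
    done
}
```

Run as root, "set_power_control /sys/bus/usb/devices auto" followed by
"set_power_control /sys/bus/pci/devices auto" should correspond to the
two loops in the quoted test; passing "on" instead forces the devices
back on.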

Do you have time to try the below patch?

Best Regards,
Huang Ying

Subject: [BUGFIX] PCI/PM: enable D3/D3cold by default for most devices

This patch fixes the following bug:

http://marc.info/?l=linux-usb&m=134318961120825&w=2

Originally, the device low-power states were D1, D2 and D3.  Later,
D3 was further divided into D3hot and D3cold.  To support both scenarios
safely, the original D3 is mapped to D3cold.

When D3cold support was added, D3cold was disabled by default out of
concern that some devices may have broken D3cold support.  This
disabled D3 on the original platforms too.  But some of those platforms
may have a working D3 only, with no working D1 or D2.  That is the root
cause of the above bug.

To deal with this, this patch enables D3/D3cold by default for most
devices.  This restores the original behavior.  For devices that are
suspected to have broken D3cold support, such as PCIe ports, D3cold is
disabled by default.

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/pci/pci.c              |    1 +
 drivers/pci/pcie/portdrv_pci.c |    5 +++++
 2 files changed, 6 insertions(+)



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bjørn Mork July 26, 2012, 10:35 a.m. UTC | #1
Huang Ying <ying.huang@intel.com> writes:

> Do you have time to try the below patch?

Sure.  Looks OK wrt the USB problems, but may cause problems with the
PCIe WiFi card.  Unless those are related to other changes in -next.

Anyway, I applied your patch on top of next-20120724 for
consistency (still without Rafael's print fix, so we get the D4 below).
This results in different states for the uhci_hcd and ehci_hcd:


Jul 26 12:13:42 nemi kernel: [   72.820305] uhci_hcd 0000:00:1a.2: power state changed by ACPI to D2
Jul 26 12:13:42 nemi kernel: [   72.835101] uhci_hcd 0000:00:1d.0: power state changed by ACPI to D2
Jul 26 12:13:44 nemi kernel: [   74.808770] uhci_hcd 0000:00:1a.0: power state changed by ACPI to D2
Jul 26 12:13:44 nemi kernel: [   74.840293] ehci_hcd 0000:00:1d.7: power state changed by ACPI to D4
Jul 26 12:13:44 nemi kernel: [   74.840326] ehci_hcd 0000:00:1a.7: power state changed by ACPI to D4

I assume that is expected, based on the lspci output I posted earlier.
Overall I get a nice mix of allowed/disallowed:



nemi:/home/bjorn# grep . /sys/bus/pci/devices/*/d3cold_allowed
/sys/bus/pci/devices/0000:00:00.0/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:02.0/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:02.1/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:03.0/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:19.0/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:1a.0/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1a.1/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1a.2/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1a.7/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:1b.0/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:1c.0/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1c.1/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1d.0/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1d.1/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1d.2/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1d.7/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:1e.0/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1f.0/d3cold_allowed:0
/sys/bus/pci/devices/0000:00:1f.2/d3cold_allowed:1
/sys/bus/pci/devices/0000:00:1f.3/d3cold_allowed:0
/sys/bus/pci/devices/0000:03:00.0/d3cold_allowed:1
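
A listing like the one above can be reduced to two numbers with a small
helper.  This is a sketch only: count_d3cold_allowed is a hypothetical
name, and the directory argument is an assumption so the function can
be pointed at a copy of the tree.

```shell
# Sketch: count devices by the value of their d3cold_allowed attribute.
# Hypothetical helper, not part of any existing tool.
count_d3cold_allowed() {
    devdir=$1   # e.g. /sys/bus/pci/devices
    allowed=0
    disallowed=0
    for f in "$devdir"/*/d3cold_allowed; do
        # Skip devices without a readable attribute.
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "1" ]; then
            allowed=$((allowed + 1))
        else
            disallowed=$((disallowed + 1))
        fi
    done
    echo "allowed=$allowed disallowed=$disallowed"
}
```

For the 21 devices listed above, "count_d3cold_allowed
/sys/bus/pci/devices" would print "allowed=9 disallowed=12".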


USB hotplugging seems to work fine with this.  But attempting to
unload/reload the wifi drivers resulted in this:




Jul 26 12:20:43 nemi kernel: [  493.812266] iwlwifi 0000:03:00.0: Refused to change power state, currently in D3
Jul 26 12:20:43 nemi kernel: [  493.812331] iwlwifi 0000:03:00.0: pci_resource_len = 0x00002000
Jul 26 12:20:43 nemi kernel: [  493.812335] iwlwifi 0000:03:00.0: pci_resource_base = ffffc900055a4000
Jul 26 12:20:43 nemi kernel: [  493.812339] iwlwifi 0000:03:00.0: HW Revision ID = 0x0
Jul 26 12:20:43 nemi kernel: [  493.812350] iwlwifi 0000:03:00.0: pci_enable_msi failed(0Xffffffea)
Jul 26 12:20:43 nemi kernel: [  493.812377] driver: '0000:03:00.0': driver_bound: bound to device 'iwlwifi'
Jul 26 12:20:43 nemi kernel: [  493.812385] bus: 'pci': really_probe: bound device 0000:03:00.0 to driver iwlwifi
Jul 26 12:20:43 nemi kernel: [  493.813634] iwldvm: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:
Jul 26 12:20:43 nemi kernel: [  493.813637] iwldvm: Copyright(c) 2003-2012 Intel Corporation
Jul 26 12:20:43 nemi kernel: [  493.813912] device: '0000:03:00.0': device_add
Jul 26 12:20:43 nemi kernel: [  493.813939] PM: Adding info for No Bus:0000:03:00.0
Jul 26 12:20:43 nemi kernel: [  493.813947] firmware 0000:03:00.0: firmware: requesting iwlwifi-5000-5.ucode
Jul 26 12:20:43 nemi kernel: [  493.827551] PM: Removing info for No Bus:0000:03:00.0
Jul 26 12:20:43 nemi kernel: [  493.827615] iwlwifi 0000:03:00.0: loaded firmware version 8.83.5.1 build 33692
Jul 26 12:20:43 nemi kernel: [  493.828170] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEBUG disabled
Jul 26 12:20:43 nemi kernel: [  493.828175] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEBUGFS disabled
Jul 26 12:20:43 nemi kernel: [  493.828178] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEVICE_TRACING disabled
Jul 26 12:20:43 nemi kernel: [  493.828182] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_DEVICE_TESTMODE disabled
Jul 26 12:20:43 nemi kernel: [  493.828185] iwlwifi 0000:03:00.0: CONFIG_IWLWIFI_P2P disabled
Jul 26 12:20:43 nemi kernel: [  493.828190] iwlwifi 0000:03:00.0: Detected Intel(R) Ultimate N WiFi Link 5300 AGN, REV=0xFFFFFFFF
Jul 26 12:20:43 nemi kernel: [  493.828218] iwlwifi 0000:03:00.0: L1 Enabled; Disabling L0S
Jul 26 12:20:43 nemi kernel: [  493.832013] ------------[ cut here ]------------
Jul 26 12:20:43 nemi kernel: [  493.832013] WARNING: at drivers/net/wireless/iwlwifi/iwl-io.c:150 iwl_grab_nic_access+0x47/0x54 [iwlwifi]()
Jul 26 12:20:43 nemi kernel: [  493.832013] Hardware name: 2776LEG
Jul 26 12:20:43 nemi kernel: [  493.832013] Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)
Jul 26 12:20:43 nemi kernel: [  493.832013] Modules linked in: iwldvm iwlwifi mac80211 cfg80211 cpufreq_userspace cpufreq_stats cpufreq_conservative cpufreq_powersave xt_multiport xt_hl ip6table_filter iptable_filter ip_tables ip6_tables x_tables rfcomm bnep binfmt_misc fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc tun ext2 loop snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm thinkpad_acpi snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device arc4 snd i915 coretemp kvm_intel qcserial drm_kms_helper usb_wwan acpi_cpufreq drm psmouse lpc_ich i2c_algo_bit soundcore usbserial qmi_wwan usbnet mii cdc_wdm kvm microcode btusb evdev bluetooth i2c_i801 serio_raw mfd_core rfkill mei snd_page_alloc crc16 i2c_core battery nvram ac video wmi mperf processor button ext3 mbcache jbd sha256_generic ablk_helper cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod nbd sd_mod crc_t10dif sr_mod cdrom uhci_hcd ahci libahci libata ehci_hc
Jul 26 12:20:43 nemi kernel: d scsi_mod thermal thermal_sys usbcore usb_common e1000e [last unloaded: cfg80211]
Jul 26 12:20:43 nemi kernel: [  493.832013] Pid: 200, comm: kworker/0:2 Not tainted 3.5.0-rc2-next-20120724+ #23
Jul 26 12:20:43 nemi kernel: [  493.832013] Call Trace:
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff8103d0fd>] ? warn_slowpath_common+0x78/0x8c
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff8103d1af>] ? warn_slowpath_fmt+0x45/0x4a
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffffa03822be>] ? iwl_grab_nic_access+0x47/0x54 [iwlwifi]
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffffa0382590>] ? iwl_write_prph+0x29/0x56 [iwlwifi]
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffffa03882cb>] ? iwl_apm_init+0x13a/0x16b [iwlwifi]
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffffa03883fd>] ? iwl_trans_pcie_start_hw+0x101/0x15b [iwlwifi]
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffffa026aa5b>] ? iwl_op_mode_dvm_start+0x246/0x96a [iwldvm]
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffffa03835d1>] ? iwl_ucode_callback+0x9e5/0xad8 [iwlwifi]
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff812793c2>] ? _request_firmware_prepare+0x1e2/0x1e2
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff81279471>] ? request_firmware_work_func+0xaf/0xe4
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff81054dea>] ? process_one_work+0x1ff/0x311
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff810550f7>] ? worker_thread+0x1fb/0x2fb
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff81054efc>] ? process_one_work+0x311/0x311
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff81054efc>] ? process_one_work+0x311/0x311
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff810588fd>] ? kthread+0x81/0x89
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff813703c4>] ? kernel_thread_helper+0x4/0x10
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff8105887c>] ? kthread_freezable_should_stop+0x53/0x53
Jul 26 12:20:43 nemi kernel: [  493.832013]  [<ffffffff813703c0>] ? gs_change+0x13/0x13
Jul 26 12:20:43 nemi kernel: [  493.832013] ---[ end trace fcaaf916dd43b7ca ]---
Jul 26 12:20:43 nemi kernel: [  493.864785] iwlwifi 0000:03:00.0: bad EEPROM/OTP signature, type=OTP, EEPROM_GP=0x00000007
Jul 26 12:20:43 nemi kernel: [  493.864791] iwlwifi 0000:03:00.0: EEPROM not found, EEPROM_GP=0xffffffff
Jul 26 12:20:43 nemi kernel: [  493.864795] iwlwifi 0000:03:00.0: Unable to init EEPROM


That does not look good...

The bridge and WiFi devices power status at this point is:

bjorn@nemi:~$  grep . /sys/bus/pci/devices/0000:00:1c.1/power/*
/sys/bus/pci/devices/0000:00:1c.1/power/async:enabled
grep: /sys/bus/pci/devices/0000:00:1c.1/power/autosuspend_delay_ms: Input/output error
/sys/bus/pci/devices/0000:00:1c.1/power/control:auto
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_active_kids:0
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_active_time:390576
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_enabled:enabled
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_status:suspended
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_suspended_time:697004
/sys/bus/pci/devices/0000:00:1c.1/power/runtime_usage:0
/sys/bus/pci/devices/0000:00:1c.1/power/wakeup:disabled

bjorn@nemi:~$ grep . /sys/bus/pci/devices/0000:03:00.0/power/*
/sys/bus/pci/devices/0000:03:00.0/power/async:enabled
grep: /sys/bus/pci/devices/0000:03:00.0/power/autosuspend_delay_ms: Input/output error
/sys/bus/pci/devices/0000:03:00.0/power/control:auto
/sys/bus/pci/devices/0000:03:00.0/power/runtime_active_kids:0
/sys/bus/pci/devices/0000:03:00.0/power/runtime_active_time:0
/sys/bus/pci/devices/0000:03:00.0/power/runtime_enabled:disabled
/sys/bus/pci/devices/0000:03:00.0/power/runtime_status:unsupported
/sys/bus/pci/devices/0000:03:00.0/power/runtime_suspended_time:70220
/sys/bus/pci/devices/0000:03:00.0/power/runtime_usage:0
/sys/bus/pci/devices/0000:03:00.0/power/wakeup:disabled



I don't really understand the last one.  How can suspended_time > 0 when
status is unsupported and autosuspend is disabled?



Bjørn
Bjørn Mork July 26, 2012, 11:02 a.m. UTC | #2
Bjørn Mork <bjorn@mork.no> writes:

>
> Jul 26 12:20:43 nemi kernel: [  493.812266] iwlwifi 0000:03:00.0: Refused to change power state, currently in D3
> Jul 26 12:20:43 nemi kernel: [  493.812331] iwlwifi 0000:03:00.0: pci_resource_len = 0x00002000
> Jul 26 12:20:43 nemi kernel: [  493.812335] iwlwifi 0000:03:00.0: pci_resource_base = ffffc900055a4000
> Jul 26 12:20:43 nemi kernel: [  493.812339] iwlwifi 0000:03:00.0: HW Revision ID = 0x0
> Jul 26 12:20:43 nemi kernel: [  493.812350] iwlwifi 0000:03:00.0: pci_enable_msi failed(0Xffffffea)
[..]
> Jul 26 12:20:43 nemi kernel: [  493.832013] ---[ end trace fcaaf916dd43b7ca ]---
> Jul 26 12:20:43 nemi kernel: [  493.864785] iwlwifi 0000:03:00.0: bad EEPROM/OTP signature, type=OTP, EEPROM_GP=0x00000007
> Jul 26 12:20:43 nemi kernel: [  493.864791] iwlwifi 0000:03:00.0: EEPROM not found, EEPROM_GP=0xffffffff
> Jul 26 12:20:43 nemi kernel: [  493.864795] iwlwifi 0000:03:00.0: Unable to init EEPROM


Doing 

 echo on >/sys/bus/pci/devices/0000:00:1c.1/power/control

to force the bridge on and then unloading/reloading the driver "fixes" this issue.
But I believe it is safe to say that this does not work as it should.  I
assume the bridge should have been automatically woken up first here?
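
For reference, the bridge address used in the workaround above does not
have to be looked up by hand.  A rough sketch, assuming the usual sysfs
layout where each entry under /sys/bus/pci/devices is a symlink into
the device hierarchy (pci_parent is a hypothetical name):

```shell
# Sketch: print the name of the directory directly above a PCI device
# in the resolved sysfs hierarchy, i.e. its upstream bridge.
# Hypothetical helper, not part of any existing tool.
pci_parent() {
    devdir=$1   # e.g. /sys/bus/pci/devices
    dev=$2      # e.g. 0000:03:00.0
    # Resolve the symlink to the physical path; fail if it is missing.
    real=$(cd "$devdir/$dev" 2>/dev/null && pwd -P) || return 1
    basename "$(dirname "$real")"
}
```

On this system "pci_parent /sys/bus/pci/devices 0000:03:00.0" should
print 0000:00:1c.1.  For a device sitting directly on the root bus the
result is the host bridge directory (something like pci0000:00) rather
than a PCI address, so callers need to check for that.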

And why the odd autosuspend output?  The Wifi device and bridge PM caps
look compatible to my untrained eye, and should make both D3hot and
D3cold supported ?:


        Capabilities: [c8] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-




Bjørn
Bjørn Mork July 26, 2012, 12:04 p.m. UTC | #3
Bjørn Mork <bjorn@mork.no> writes:

>         Capabilities: [c8] Power Management version 3
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>
>         Capabilities: [a0] Power Management version 2
>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>

That was from the previous captured lspci output.  But I assume you
might be interested in the state when the driver fails. The default with
your newest patch is to allow d3cold for the WiFi device but disallow it
for the bridge (PCIe port).  Which I guess is the intention based on the
patch description.

This seems to work.  The PCIe port ends up in D3:


nemi:/home/bjorn# lspci -vvnns 00:1c.1
00:1c.1 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 [8086:2942] (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: f0500000-f05fffff
        Prefetchable memory behind bridge: 00000000c0400000-00000000c05fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #2, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
                        ClockPM- Surprise- LLActRep+ BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
                        Slot #1, PowerLimit 6.500W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet+ CmdCplt+ HPIrq+ LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0300c  Data: 4143
        Capabilities: [90] Subsystem: Lenovo Device [17aa:20f3]
        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed+ WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [180 v1] Root Complex Link
                Desc:   PortNumber=02 ComponentID=02 EltType=Config
                Link0:  Desc:   TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
                        Addr:   00000000fed1c000
        Kernel driver in use: pcieport


Attempting to read the WiFi device config at this point is futile:

nemi:/home/bjorn# lspci -vvnns 03:00.0
03:00.0 Network controller [0280]: Intel Corporation Ultimate N WiFi Link 5300 [8086:4236] (rev ff) (prog-if ff)
        !!! Unknown header type 7f



Does that work as expected BTW?  Would be nice if any attempt to read
config would wake the bridge to allow it, would it not?  I have
absolutely no idea whether that is achievable.

But any attempt to load a driver for the WiFi device should most
definitely wake the bridge.  But it does not:

nemi:/home/bjorn# modprobe iwldvm; lspci -vvnns 00:1c.1
00:1c.1 PCI bridge [0604]: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 [8086:2942] (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
        I/O behind bridge: 00003000-00003fff
        Memory behind bridge: f0500000-f05fffff
        Prefetchable memory behind bridge: 00000000c0400000-00000000c05fffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #2, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
                        ClockPM- Surprise- LLActRep+ BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
                SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+
                        Slot #1, PowerLimit 6.500W; Interlock- NoCompl-
                SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet+ CmdCplt+ HPIrq+ LinkChg-
                        Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
                SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
                        Changed: MRL- PresDet- LinkState+
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0300c  Data: 4143
        Capabilities: [90] Subsystem: Lenovo Device [17aa:20f3]
        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed+ WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed+ WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [180 v1] Root Complex Link
                Desc:   PortNumber=02 ComponentID=02 EltType=Config
                Link0:  Desc:   TargetPort=00 TargetComponent=02 AssocRCRB- LinkType=MemMapped LinkValid+
                        Addr:   00000000fed1c000
        Kernel driver in use: pcieport


I have tried all 4 combinations of d3cold_allowed for these 2 devices,
but none of them make any difference.  The default with your patches is
to disallow it for the PCIe port.  One strange issue is that the PCIe
port goes into the same state even if I set d3cold_allowed to 1:

        Capabilities: [a0] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D3 NoSoftRst- PME-Enable+ DSel=0 DScale=0 PME-


Shouldn't the status then be "D4" or "D3cold" or whatever lspci will call
it?  At least different?  The d3cold_allowed setting does not seem to
change anything for this port.



Bjørn
Alan Stern July 26, 2012, 3:03 p.m. UTC | #4
On Thu, 26 Jul 2012, Bjørn Mork wrote:

> Huang Ying <ying.huang@intel.com> writes:
> 
> > Do you have time to try the below patch?
> 
> Sure.  Looks OK wrt the USB problems, but may cause problems with the
> PCIe WiFi card.  Unless those are related to other changes in -next.
> 
> Anyway, I applied your patch on top of next-20120724 for
> consistency (still without Rafael's print fix, so we get the D4 below).
> This results in different states for the uhci_hcd and ehci_hcd:
> 
> 
> Jul 26 12:13:42 nemi kernel: [   72.820305] uhci_hcd 0000:00:1a.2: power state changed by ACPI to D2
> Jul 26 12:13:42 nemi kernel: [   72.835101] uhci_hcd 0000:00:1d.0: power state changed by ACPI to D2
> Jul 26 12:13:44 nemi kernel: [   74.808770] uhci_hcd 0000:00:1a.0: power state changed by ACPI to D2
> Jul 26 12:13:44 nemi kernel: [   74.840293] ehci_hcd 0000:00:1d.7: power state changed by ACPI to D4
> Jul 26 12:13:44 nemi kernel: [   74.840326] ehci_hcd 0000:00:1a.7: power state changed by ACPI to D4
> 
> I assume that is expected, based on the lspci output I posted earlier.
> Overall I get a nice mix of allowed/disallowed:

...

> USB hotplugging seems to work fine with this.

Don't be too sure.  Have you tested to see if it still works after 
doing "rmmod ehci-hcd"?

So far you have tested the EHCI controllers, but have you tested the 
UHCI controllers?  Unloading ehci-hcd will force them to be used.

Alan Stern

Bjørn Mork July 26, 2012, 4:24 p.m. UTC | #5
Alan Stern <stern@rowland.harvard.edu> writes:

>> USB hotplugging seems to work fine with this.
>
> Don't be too sure.  Have you tested to see if it still works after 
> doing "rmmod ehci-hcd"?
>
> So far you have tested the EHCI controllers, but have you tested the 
> UHCI controllers?  Unloading ehci-hcd will force them to be used.

Good point.  Now tested and seems to work.

But I'd appreciate it if someone else were able to doublecheck my
results.  I am juggling with a number of settings and could easily mess
up something, invalidating the tests.  I believe the issue should be
present in lots of systems, isn't that correct?  Any GM45 based laptop
should do as a test bench, and maybe others too.

The WiFi issues I had also indicate that these changes should be tested
on as many different systems as possible. 



Bjørn
Patch

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1941,6 +1941,7 @@  void pci_pm_init(struct pci_dev *dev)
 	dev->pm_cap = pm;
 	dev->d3_delay = PCI_PM_D3_WAIT;
 	dev->d3cold_delay = PCI_PM_D3COLD_WAIT;
+	dev->d3cold_allowed = true;
 
 	dev->d1_support = false;
 	dev->d2_support = false;
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -200,6 +200,11 @@  static int __devinit pcie_portdrv_probe(
 		return status;
 
 	pci_save_state(dev);
+	/*
+	 * D3cold may not work properly on some PCIe port, so disable
+	 * it by default.
+	 */
+	dev->d3cold_allowed = false;
 	if (!pci_match_id(port_runtime_pm_black_list, dev))
 		pm_runtime_put_noidle(&dev->dev);