Asus eeepc 1008HA suspend issue and mac80211 suspend corner case

Message ID 20091222022355.GA32508@bombadil.infradead.org (mailing list archive)
State Not Applicable, archived

Commit Message

Luis Rodriguez Dec. 22, 2009, 2:23 a.m. UTC

Patch

diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index e6c08da..63d42fa 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1031,7 +1031,14 @@  int ieee80211_reconfig(struct ieee80211_local *local)
 
 	/* restart hardware */
 	if (local->open_count) {
+		/*
+		 * Upon resume hardware can sometimes be goofy due to
+		 * various platform issues, so restarting the device may
+		 * at times not work immediately. Propagate the error.
+		 */
 		res = drv_start(local);
+		if (res)
+			return res;
 
 		ieee80211_led_radio(local, true);
 	}

But this isn't enough. Since the hardware is unresponsive at that
point we cannot talk to it, so we can't try to send a disassoc either.
I also don't see us handling such cases so far in either cfg80211 or
mac80211, so I'm curious what we should do. Propagating the error as
above is not enough: userspace will still believe it is associated if
it left the device in an associated state. If you then kill userspace
and restart it, you run into cfg80211/mac80211 warnings due to the
unexpected state we left things in. This is currently busted on
2.6.32.2 and I don't see an obvious fix; I'm hoping others might.
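
To make the leftover-state problem concrete, here is a minimal userspace
sketch of it. This is a toy model, not kernel code: struct toy_local,
toy_drv_start(), toy_reconfig() and the -EIO return value are all
hypothetical stand-ins, and only the open_count check and the early
return mirror the patch above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the driver/stack state. NOT the real mac80211 struct. */
struct toy_local {
	int open_count;     /* interfaces that were up before suspend */
	bool hw_responds;   /* false models hardware that fails to restart */
	bool radio_led_on;  /* set only once the restart succeeded */
	bool associated;    /* what the stack and userspace still believe */
};

/* Models drv_start(): fails when the hardware does not come back.
 * The -EIO value is an assumption made for this sketch. */
int toy_drv_start(struct toy_local *local)
{
	return local->hw_responds ? 0 : -EIO;
}

/* Models the patched "restart hardware" step of ieee80211_reconfig(). */
int toy_reconfig(struct toy_local *local)
{
	int res;

	assert(local != NULL);

	if (local->open_count) {
		res = toy_drv_start(local);
		if (res)
			return res;  /* propagate the error, as in the patch */

		local->radio_led_on = true;
	}

	/* Nothing on the error path above touches local->associated. */
	return 0;
}
```

With hw_responds set to false and associated set to true, toy_reconfig()
returns the error but associated stays true, which is the stale state
described above.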

As for the specific Asus eeepc 1008HA issue, what I'm seeing is ath9k
talking to the hardware fine prior to suspend and then disabling it;
upon resume the hardware becomes unusable, failing at the first
hardware reset. lspci tells me the following when the device is
functional, both during initial boot and during successful pm-suspend
cycles:

01:00.0 Network controller: Atheros Communications Inc. AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
	Subsystem: Device 1a3b:1089
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 18
	Region 0: Memory at fbef0000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] Message Signalled Interrupts: Mask- 64bit- Queue=0/0 Enable-
		Address: 00000000  Data: 0000
	Capabilities: [60] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
			ClockPM- Suprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100] Advanced Error Reporting <?>
	Capabilities: [140] Virtual Channel <?>
	Capabilities: [160] Device Serial Number 12-14-24-ff-ff-17-15-00
	Capabilities: [170] Power Budgeting <?>
	Kernel driver in use: ath9k
	Kernel modules: ath9k

I do notice a difference in the lspci output when resume goes bust and
the ath9k device becomes unhappy. This is what I see:

--- lspci-ok.txt	2009-12-21 17:22:24.000000000 -0800
+++ lspci-busted.txt	2009-12-21 17:22:50.000000000 -0800
@@ -16,7 +16,7 @@ 
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
 			MaxPayload 128 bytes, MaxReadReq 512 bytes
-		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
+		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <64us
 			ClockPM- Suprise- LLActRep- BwNot-
 		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
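
For reference, the only line that changes is the PCI Express Device
Status register. A small sketch of how those flags map to register bits
follows. The bit values match the PCI_EXP_DEVSTA_* definitions in the
Linux headers; the helper names are made up for illustration, and the
raw values 0x0019 and 0x0010 are reconstructed from the flags lspci
printed, not read from the device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* PCI Express Device Status bits and the lspci label shown for each. */
#define DEVSTA_CED   0x0001 /* Correctable Error Detected   (CorrErr)   */
#define DEVSTA_NFED  0x0002 /* Non-Fatal Error Detected     (UncorrErr) */
#define DEVSTA_FED   0x0004 /* Fatal Error Detected         (FatalErr)  */
#define DEVSTA_URD   0x0008 /* Unsupported Request Detected (UnsuppReq) */
#define DEVSTA_AUXPD 0x0010 /* AUX Power Detected           (AuxPwr)    */
#define DEVSTA_TRPND 0x0020 /* Transactions Pending         (TransPend) */

struct devsta_flag {
	uint16_t mask;
	const char *name; /* label as printed in the lspci output above */
};

static const struct devsta_flag devsta_flags[] = {
	{ DEVSTA_CED,   "CorrErr"   },
	{ DEVSTA_NFED,  "UncorrErr" },
	{ DEVSTA_FED,   "FatalErr"  },
	{ DEVSTA_URD,   "UnsuppReq" },
	{ DEVSTA_AUXPD, "AuxPwr"    },
	{ DEVSTA_TRPND, "TransPend" },
};

#define DEVSTA_NFLAGS (sizeof(devsta_flags) / sizeof(devsta_flags[0]))

/* Bits that differ between two DevSta readings. */
uint16_t devsta_changed(uint16_t before, uint16_t after)
{
	return before ^ after;
}

/* Print a reading as "Name+ Name- ..."; returns the number of set flags. */
int devsta_print(const char *label, uint16_t val)
{
	int set = 0;
	size_t i;

	assert(label != NULL);

	printf("%s:", label);
	for (i = 0; i < DEVSTA_NFLAGS; i++) {
		int on = (val & devsta_flags[i].mask) != 0;

		printf(" %s%c", devsta_flags[i].name, on ? '+' : '-');
		set += on;
	}
	printf("\n");
	return set;
}
```

Calling devsta_print() with 0x0019 and then 0x0010 gives the same flag
patterns as the two DevSta lines in the diff, and
devsta_changed(0x0019, 0x0010) is 0x0009, i.e. only CorrErr and
UnsuppReq differ. What that difference means for the resume failure is
not established here.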