[5/6] wifi: mt76: mt7996: Mitigate mcu communication loss.

Message ID 20240307192951.3271156-5-greearb@candelatech.com (mailing list archive)
State New
Delegated to: Felix Fietkau
Series [1/6] wifi: mt76: mt7996: add debugging for MCU command timeouts.

Commit Message

Ben Greear March 7, 2024, 7:29 p.m. UTC
From: Ben Greear <greearb@candelatech.com>

Many calls that end up sending MCU messages to the firmware hold
RTNL or other important locks.  When the radio stops answering,
each of those calls waits for its command to time out while still
holding the lock, so the entire system becomes very sluggish.

Add a timeout counter.  If the radio times out 3 times in a row,
consider it dead and no longer attempt to talk to it.

Signed-off-by: Ben Greear <greearb@candelatech.com>
---
 drivers/net/wireless/mediatek/mt76/mt7996/mcu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
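
The counting scheme described in the commit message can be sketched as a
small standalone C model.  This is only an illustration, not the driver
code: struct mcu_state, mcu_is_dead(), mcu_record_result() and mcu_send()
are hypothetical names, the value 3 for MAX_MCU_TIMEOUTS is taken from the
commit message rather than from this diff, and returning -EIO for a dead
radio is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed threshold: the commit message says "3 times in a row". */
#define MAX_MCU_TIMEOUTS 3

/* Hypothetical stand-in for the struct mt76_dev fields used by the patch. */
struct mcu_state {
	int mcu_timeouts;            /* consecutive timeouts seen so far */
	int last_successful_mcu_cmd; /* most recent command that was answered */
};

/* Hypothetical helper: true once the radio should be treated as dead. */
static bool mcu_is_dead(const struct mcu_state *st)
{
	assert(st != NULL);
	return st->mcu_timeouts >= MAX_MCU_TIMEOUTS;
}

/*
 * Record the outcome of one command.  got_response == false models the
 * "!skb" branch of mt7996_mcu_parse_response(); any answer resets the
 * counter, so only consecutive timeouts accumulate.
 */
static int mcu_record_result(struct mcu_state *st, int cmd, bool got_response)
{
	assert(st != NULL);

	if (!got_response) {
		st->mcu_timeouts++;
		return -ETIMEDOUT;
	}

	st->mcu_timeouts = 0;
	st->last_successful_mcu_cmd = cmd;
	return 0;
}

/*
 * Hypothetical send path: once the radio is considered dead, fail
 * immediately instead of waiting for another timeout.  The -EIO return
 * value is an assumption, not taken from the patch.
 */
static int mcu_send(struct mcu_state *st, int cmd, bool radio_answers)
{
	if (mcu_is_dead(st))
		return -EIO;

	return mcu_record_result(st, cmd, radio_answers);
}
```

In this model a single answered command clears the counter, so isolated
timeouts never disable the radio.  After three consecutive timeouts every
later mcu_send() call returns at once, which is what keeps lock holders
from stalling on a radio that no longer responds.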

Patch

diff --git a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
index 5550671cdaf6..77c89d2d2423 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7996/mcu.c
@@ -202,14 +202,16 @@  mt7996_mcu_parse_response(struct mt76_dev *mdev, int cmd,
 	if (!skb) {
 		const char *first = "Secondary";
 
+		mdev->mcu_timeouts++;
 		if (!mdev->first_failed_mcu_cmd)
 			first = "Initial";
 
 		dev_err(mdev->dev,
-			"MCU: %s Failure: Message %08x (cid %lx ext_cid: %lx seq %d) timeout.  Last successful cmd: 0x%x\n",
+			"MCU: %s Failure: Message %08x (cid %lx ext_cid: %lx seq %d) timeout (%d/%d).  Last successful cmd: 0x%x\n",
 			first,
 			cmd, FIELD_GET(__MCU_CMD_FIELD_ID, cmd),
 			FIELD_GET(__MCU_CMD_FIELD_EXT_ID, cmd), seq,
+			mdev->mcu_timeouts, MAX_MCU_TIMEOUTS,
 			mdev->last_successful_mcu_cmd);
 
 		if (!mdev->first_failed_mcu_cmd)
@@ -217,6 +219,7 @@  mt7996_mcu_parse_response(struct mt76_dev *mdev, int cmd,
 		return -ETIMEDOUT;
 	}
 
+	mdev->mcu_timeouts = 0;
 	mdev->last_successful_mcu_cmd = cmd;
 
 	if (mdev->first_failed_mcu_cmd) {