From patchwork Fri Dec 2 02:30:00 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ben Greear
X-Patchwork-Id: 9457711
X-Patchwork-Delegate: kvalo@adurom.com
From: greearb@candelatech.com
To: linux-wireless@vger.kernel.org
Subject: [PATCH 2/2] ath10k: work-around for stale txq in ar->txqs
Date: Thu, 1 Dec 2016 18:30:00 -0800
Message-Id: <1480645800-2148-2-git-send-email-greearb@candelatech.com>
X-Mailer: git-send-email 2.4.11
In-Reply-To: <1480645800-2148-1-git-send-email-greearb@candelatech.com>
References: <1480645800-2148-1-git-send-email-greearb@candelatech.com>
Cc: Ben Greear, ath10k@lists.infradead.org

From: Ben Greear

For reasons I do not fully understand, when ath10k firmware crashes
while trying to bring up lots of vdevs, ar->txqs may still hold
references to the txq struct when mac80211 re-adds the network devices.

The device add logic was re-initializing the list members, but if they
were already linked into ar->txqs, the list ended up broken, and trying
to walk it would end up in an infinite loop.

So, check for this particular issue, and remove the reference from
ar->txqs before re-initializing the list head.

There must be a cleaner way to do this, but I am not sure exactly
what that would be.
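To illustrate why re-initializing a list member that is still linked breaks the walk, here is a minimal userspace sketch. It is not driver code: struct list_head and the helpers are simplified re-implementations written for this example (the real ones live in <linux/list.h>, and the kernel's list_del() also poisons the removed entry's pointers), and bounded_walk() and the two demo functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (assumption:
 * same circular doubly-linked layout, nothing else from the kernel). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink only; the kernel version additionally poisons n->next/n->prev. */
static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Hypothetical helper: count entries, or return -1 if the walk does not
 * get back to head within 'limit' steps, i.e. the list is corrupted. */
static int bounded_walk(const struct list_head *head, int limit)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		if (++n > limit)
			return -1;
	return n;
}

/* Re-init an entry that is still queued: head.next keeps pointing at it,
 * but its own next now points back at itself, so the walk never ends. */
int walk_after_stale_reinit(void)
{
	struct list_head txqs, a, b;

	INIT_LIST_HEAD(&txqs);
	list_add_tail(&a, &txqs);
	list_add_tail(&b, &txqs);
	assert(bounded_walk(&txqs, 100) == 2);

	INIT_LIST_HEAD(&a);
	return bounded_walk(&txqs, 100);
}

/* The work-around in this patch: unlink first, then re-init. */
int walk_after_del_then_reinit(void)
{
	struct list_head txqs, a, b;

	INIT_LIST_HEAD(&txqs);
	list_add_tail(&a, &txqs);
	list_add_tail(&b, &txqs);

	list_del(&a);
	INIT_LIST_HEAD(&a);
	return bounded_walk(&txqs, 100);
}
```

In this sketch the first function reports a corrupted list because the walk keeps landing on the same entry, while the second sees only the one entry that is still queued. The step limit exists only so the demo terminates; the driver's list_for_each_entry_safe() has no such bound.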
Signed-off-by: Ben Greear
---
 drivers/net/wireless/ath/ath10k/mac.c | 48 ++++++++++++++++++++++++++++++-----
 drivers/net/wireless/ath/ath10k/wmi.c |  9 +++++++
 2 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 784cf2b..2f50915 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -4190,13 +4190,37 @@ void ath10k_mgmt_over_wmi_tx_work(struct work_struct *work)
 	}
 }
 
-static void ath10k_mac_txq_init(struct ieee80211_txq *txq)
+static void ath10k_mac_txq_init(struct ath10k *ar, struct ieee80211_txq *txq)
 {
 	struct ath10k_txq *artxq = (void *)txq->drv_priv;
+	struct ath10k_txq *tmp, *walker;
+	struct ieee80211_txq *txq_tmp;
+	int i = 0;
 
 	if (!txq)
 		return;
 
+	spin_lock_bh(&ar->txqs_lock);
+
+	/* Remove from ar->txqs in case it still exists there. */
+	list_for_each_entry_safe(walker, tmp, &ar->txqs, list) {
+		txq_tmp = container_of((void *)walker, struct ieee80211_txq,
+				       drv_priv);
+		if ((++i % 10000) == 0) {
+			ath10k_err(ar, "txq-init: Checking txq_tmp: %p i: %d\n", txq_tmp, i);
+			ath10k_err(ar, "txq-init: txqs: %p walker->list: %p w->next: %p w->prev: %p ar->txqs: %p\n",
+				   &ar->txqs, &(walker->list), walker->list.next, walker->list.prev, &ar->txqs);
+		}
+
+		if (txq_tmp == txq) {
+			WARN_ON_ONCE(1);
+			ath10k_err(ar, "txq-init: Found txq when it should be deleted, txq_tmp: %p txq: %p\n",
+				   txq_tmp, txq);
+			list_del(&walker->list);
+		}
+	}
+	spin_unlock_bh(&ar->txqs_lock);
+
 	INIT_LIST_HEAD(&artxq->list);
 }
 
@@ -4208,6 +4232,7 @@ static void ath10k_mac_txq_unref(struct ath10k *ar, struct ieee80211_txq *txq)
 	struct sk_buff *msdu;
 	struct ieee80211_txq *txq_tmp;
 	int msdu_id;
+	int i = 0;
 
 	if (!txq)
 		return;
@@ -4220,8 +4245,18 @@ static void ath10k_mac_txq_unref(struct ath10k *ar, struct ieee80211_txq *txq)
 	list_for_each_entry_safe(walker, tmp, &ar->txqs, list) {
 		txq_tmp = container_of((void *)walker, struct ieee80211_txq,
 				       drv_priv);
-		if (txq_tmp == txq)
+		if ((++i % 10000) == 0) {
+			ath10k_err(ar, "Checking txq_tmp: %p i: %d\n", txq_tmp, i);
+			ath10k_err(ar, "txqs: %p walker->list: %p w->next: %p w->prev: %p ar->txqs: %p\n",
+				   &ar->txqs, &(walker->list), walker->list.next, walker->list.prev, &ar->txqs);
+		}
+
+		if (txq_tmp == txq) {
+			WARN_ON_ONCE(1);
+			ath10k_err(ar, "Found txq when it should be deleted, txq_tmp: %p txq: %p\n",
+				   txq_tmp, txq);
 			list_del(&walker->list);
+		}
 	}
 	spin_unlock_bh(&ar->txqs_lock);
 
@@ -5255,7 +5290,7 @@ static int ath10k_add_interface(struct ieee80211_hw *hw,
 	mutex_lock(&ar->conf_mutex);
 
 	memset(arvif, 0, sizeof(*arvif));
-	ath10k_mac_txq_init(vif->txq);
+	ath10k_mac_txq_init(ar, vif->txq);
 
 	memset(&arvif->bcast_rate, WMI_FIXED_RATE_NONE, sizeof(arvif->bcast_rate));
 	memset(&arvif->mcast_rate, WMI_FIXED_RATE_NONE, sizeof(arvif->mcast_rate));
@@ -5620,8 +5655,9 @@ static void ath10k_remove_interface(struct ieee80211_hw *hw,
 		kfree(arvif->u.ap.noa_data);
 	}
 
-	ath10k_dbg(ar, ATH10K_DBG_MAC, "mac vdev %i delete (remove interface)\n",
-		   arvif->vdev_id);
+	ath10k_dbg(ar, ATH10K_DBG_MAC,
+		   "mac vdev %i delete (remove interface), vif: %p arvif: %p\n",
+		   arvif->vdev_id, vif, arvif);
 
 	ret = ath10k_wmi_vdev_delete(ar, arvif->vdev_id);
 	if (ret)
@@ -6437,7 +6473,7 @@ static int ath10k_sta_state(struct ieee80211_hw *hw,
 		INIT_WORK(&arsta->update_wk, ath10k_sta_rc_update_wk);
 
 		for (i = 0; i < ARRAY_SIZE(sta->txq); i++)
-			ath10k_mac_txq_init(sta->txq[i]);
+			ath10k_mac_txq_init(ar, sta->txq[i]);
 	}
 
 	/* cancel must be done outside the mutex to avoid deadlock */
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index fd685c4..1c8ceb2 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -1771,6 +1771,15 @@ static void ath10k_wmi_tx_beacon_nowait(struct ath10k_vif *arvif)
 	bool deliver_cab;
 	int ret;
 
+	/* I saw a kasan warning here, looks like arvif and/or ar might have been
+	 * NULL, add something to catch this if it happens again.
+	 */
+	if ((((unsigned long)(arvif)) < 8000) || (((unsigned long)(ar)) < 8000)) {
+		pr_err("tx-beacon-nowait: arvif: %p ar: %p\n", arvif, ar);
+		BUG_ON(((unsigned long)(arvif)) < 8000);
+		BUG_ON(((unsigned long)(ar)) < 8000);
+	}
+
 	spin_lock_bh(&ar->data_lock);
 	bcn = arvif->beacon;