From patchwork Wed Dec 13 15:20:18 2017
X-Patchwork-Submitter: Stanislaw Gruszka
X-Patchwork-Id: 10110259
X-Patchwork-Delegate: kvalo@adurom.com
Date: Wed, 13 Dec 2017 16:20:18 +0100
From: Stanislaw Gruszka
To: Enrico Mioso
Cc: linux-wireless@vger.kernel.org, Johannes Berg, Daniel Golle,
 Arnd Bergmann, John Crispin, nbd@nbd.name
Subject: Re: ieee80211 phy0: rt2x00queue_write_tx_frame: Error - Dropping
 frame due to full tx queue...?
Message-ID: <20171213152017.GA3554@redhat.com>
X-Mailing-List: linux-wireless@vger.kernel.org

On Mon, Dec 11, 2017 at 09:51:29PM +0100, Enrico Mioso wrote:
> Hello guys, and sorry for the big CC list.
> I would like to point out a bug that has survived for years - at least
> from 2015 until now - regarding the Ralink driver getting stuck, and in
> some cases not being able to recover.
> The problem manifested with an MT7620A chip, and the wireless card inside
> the WL-330N3G device.
> The error message is in rt2x00/rt2x00queue.c .
>
> This bug was discussed, and a patch proposed, in this thread:
> https://lists.openwrt.org/pipermail/openwrt-devel/2015-September/thread.html#35778
>
> I would like to help if possible - I have both an Archer MR200 and the
> WL330n3G hardware.
> BTW, the Archer MR200 is a nice MT7610 device, and the problem manifests
> itself there too, see:
> https://forum.openwrt.org/viewtopic.php?id=64293&p=6
>
> Any help, hint, anything would be appreciated.
> Thank you to all.

First I would try to remove this patch:

http://git.lede-project.org/?p=source.git;a=blob;f=package/kernel/mac80211/patches/600-23-rt2x00-rt2800mmio-add-a-workaround-for-spurious-TX_F.patch

and see if it makes things better. However, I think that for the stuck
problem we need a TX status timeout mechanism similar to the one used for
rt2800usb.

The attached patch may mitigate the "Dropping frame due to full tx queue"
errors. I did not test it, so it may be totally broken.

Regards
Stanislaw

diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c b/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c
index ecc96312a370..c8a6f163102f 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00mac.c
@@ -152,16 +152,6 @@ void rt2x00mac_tx(struct ieee80211_hw *hw,
 	if (unlikely(rt2x00queue_write_tx_frame(queue, skb, control->sta, false)))
 		goto exit_fail;
 
-	/*
-	 * Pausing queue has to be serialized with rt2x00lib_txdone(). Note
-	 * we should not use spin_lock_bh variant as bottom halve was already
-	 * disabled before ieee80211_xmit() call.
-	 */
-	spin_lock(&queue->tx_lock);
-	if (rt2x00queue_threshold(queue))
-		rt2x00queue_pause_queue(queue);
-	spin_unlock(&queue->tx_lock);
-
 	return;
 
  exit_fail:
diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c
index a2c1ca5c76d1..39d523bbb661 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c
@@ -714,6 +714,13 @@ int rt2x00queue_write_tx_frame(struct data_queue *queue, struct sk_buff *skb,
 	rt2x00queue_write_tx_descriptor(entry, &txdesc);
 	rt2x00queue_kick_tx_queue(queue, &txdesc);
 
+	/*
+	 * Pausing queue has to be serialized with rt2x00lib_txdone(), so we
+	 * do this under queue->tx_lock. Bottom halve was already disabled
+	 * before ieee80211_xmit() call.
+	 */
+	if (rt2x00queue_threshold(queue))
+		rt2x00queue_pause_queue(queue);
 out:
 	spin_unlock(&queue->tx_lock);
 	return ret;
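
For illustration only, the locking idea behind the patch can be sketched as
a small userspace C model. This is not driver code: struct toy_queue and
the toy_* functions are hypothetical stand-ins, and a pthread mutex takes
the place of queue->tx_lock. The point it tries to show is that the
threshold check and the pause happen in the same critical section as the
enqueue, so a completion handler taking the same lock cannot slip in
between the two.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for the driver's queue state; every name here is hypothetical. */
struct toy_queue {
	pthread_mutex_t tx_lock;
	unsigned int length;    /* entries currently in flight */
	unsigned int limit;     /* total number of entries in the ring */
	unsigned int threshold; /* pause when free entries drop below this */
	bool paused;
};

static void toy_queue_init(struct toy_queue *q, unsigned int limit,
			   unsigned int threshold)
{
	assert(threshold <= limit);
	pthread_mutex_init(&q->tx_lock, NULL);
	q->length = 0;
	q->limit = limit;
	q->threshold = threshold;
	q->paused = false;
}

/* True when fewer than 'threshold' entries are free. Caller holds tx_lock. */
static bool toy_threshold(const struct toy_queue *q)
{
	return q->limit - q->length < q->threshold;
}

/* Enqueue one frame; returns 0 on success, -1 if the ring is full. */
static int toy_write_tx_frame(struct toy_queue *q)
{
	int ret = 0;

	pthread_mutex_lock(&q->tx_lock);
	if (q->length == q->limit) {
		ret = -1; /* the "dropping frame due to full tx queue" case */
		goto out;
	}
	q->length++;

	/* Pause decision is made before the lock is dropped. */
	if (toy_threshold(q))
		q->paused = true;
out:
	pthread_mutex_unlock(&q->tx_lock);
	return ret;
}

/* Completion path: release one entry and unpause if there is room again. */
static void toy_txdone(struct toy_queue *q)
{
	pthread_mutex_lock(&q->tx_lock);
	if (q->length > 0)
		q->length--;
	if (!toy_threshold(q))
		q->paused = false;
	pthread_mutex_unlock(&q->tx_lock);
}
```

In this model the window between "frame queued" and "queue paused" no longer
exists, because both happen under one lock acquisition. Whether that is
sufficient for the real hardware is a separate question that only testing on
the affected devices can answer.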