diff mbox series

[v2,1/2] rt2x00usb: fix rx queue hang

Message ID 20190701105314.9707-1-smoch@web.de (mailing list archive)
State Accepted
Commit 41a531ffa4c5aeb062f892227c00fabb3b4a9c91
Delegated to: Kalle Valo
Headers show
Series [v2,1/2] rt2x00usb: fix rx queue hang | expand

Commit Message

Soeren Moch July 1, 2019, 10:53 a.m. UTC
Since commit ed194d136769 ("usb: core: remove local_irq_save() around
 ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is
not running with interrupts disabled anymore. So this completion handler
is not guaranteed to run completely before workqueue processing starts
for the same queue entry.
Be sure to set all other flags in the entry correctly before marking
this entry ready for workqueue processing. This way we cannot miss error
conditions that need to be signalled from the completion handler to the
worker thread.
Note that rt2x00usb_work_rxdone() processes all available entries, not
only such for which queue_work() was called.

This patch is similar to what commit df71c9cfceea ("rt2x00: fix order
of entry flags modification") did for TX processing.

This fixes a regression on a RT5370 based wifi stick in AP mode, which
suddenly stopped data transmission after some period of heavy load. Also
stopping the hanging hostapd resulted in the error message "ieee80211
phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
Other operation modes are probably affected as well, this just was
the used testcase.

Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
Cc: stable@vger.kernel.org # 4.20+
Signed-off-by: Soeren Moch <smoch@web.de>
---
Changes in v2:
 complete rework

Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Helmut Schaa <helmut.schaa@googlemail.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/wireless/ralink/rt2x00/rt2x00usb.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--
2.17.1

Comments

Stanislaw Gruszka July 1, 2019, 11:07 a.m. UTC | #1
On Mon, Jul 01, 2019 at 12:53:13PM +0200, Soeren Moch wrote:
> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
>  ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is
> not running with interrupts disabled anymore. So this completion handler
> is not guaranteed to run completely before workqueue processing starts
> for the same queue entry.
> Be sure to set all other flags in the entry correctly before marking
> this entry ready for workqueue processing. This way we cannot miss error
> conditions that need to be signalled from the completion handler to the
> worker thread.
> Note that rt2x00usb_work_rxdone() processes all available entries, not
> only such for which queue_work() was called.
> 
> This patch is similar to what commit df71c9cfceea ("rt2x00: fix order
> of entry flags modification") did for TX processing.
> 
> This fixes a regression on a RT5370 based wifi stick in AP mode, which
> suddenly stopped data transmission after some period of heavy load. Also
> stopping the hanging hostapd resulted in the error message "ieee80211
> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
> Other operation modes are probably affected as well, this just was
> the used testcase.
> 
> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
> Cc: stable@vger.kernel.org # 4.20+
> Signed-off-by: Soeren Moch <smoch@web.de>

Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Kalle Valo July 15, 2019, 8:48 a.m. UTC | #2
Soeren Moch <smoch@web.de> writes:

> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
>  ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is
> not running with interrupts disabled anymore. So this completion handler
> is not guaranteed to run completely before workqueue processing starts
> for the same queue entry.
> Be sure to set all other flags in the entry correctly before marking
> this entry ready for workqueue processing. This way we cannot miss error
> conditions that need to be signalled from the completion handler to the
> worker thread.
> Note that rt2x00usb_work_rxdone() processes all available entries, not
> only such for which queue_work() was called.
>
> This patch is similar to what commit df71c9cfceea ("rt2x00: fix order
> of entry flags modification") did for TX processing.
>
> This fixes a regression on a RT5370 based wifi stick in AP mode, which
> suddenly stopped data transmission after some period of heavy load. Also
> stopping the hanging hostapd resulted in the error message "ieee80211
> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
> Other operation modes are probably affected as well, this just was
> the used testcase.
>
> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
> Cc: stable@vger.kernel.org # 4.20+
> Signed-off-by: Soeren Moch <smoch@web.de>

I'll queue this for v5.3.
Soeren Moch July 15, 2019, 1:33 p.m. UTC | #3
On 15.07.19 10:48, Kalle Valo wrote:
> Soeren Moch <smoch@web.de> writes:
>
>> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
>>  ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is
>> not running with interrupts disabled anymore. So this completion handler
>> is not guaranteed to run completely before workqueue processing starts
>> for the same queue entry.
>> Be sure to set all other flags in the entry correctly before marking
>> this entry ready for workqueue processing. This way we cannot miss error
>> conditions that need to be signalled from the completion handler to the
>> worker thread.
>> Note that rt2x00usb_work_rxdone() processes all available entries, not
>> only such for which queue_work() was called.
>>
>> This patch is similar to what commit df71c9cfceea ("rt2x00: fix order
>> of entry flags modification") did for TX processing.
>>
>> This fixes a regression on a RT5370 based wifi stick in AP mode, which
>> suddenly stopped data transmission after some period of heavy load. Also
>> stopping the hanging hostapd resulted in the error message "ieee80211
>> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
>> Other operation modes are probably affected as well, this just was
>> the used testcase.
>>
>> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
>> Cc: stable@vger.kernel.org # 4.20+
>> Signed-off-by: Soeren Moch <smoch@web.de>
> I'll queue this for v5.3.
>
OK, thanks,
Soeren
Kalle Valo July 15, 2019, 5:53 p.m. UTC | #4
Soeren Moch <smoch@web.de> wrote:

> Since commit ed194d136769 ("usb: core: remove local_irq_save() around
>  ->complete() handler") the handler rt2x00usb_interrupt_rxdone() is
> not running with interrupts disabled anymore. So this completion handler
> is not guaranteed to run completely before workqueue processing starts
> for the same queue entry.
> Be sure to set all other flags in the entry correctly before marking
> this entry ready for workqueue processing. This way we cannot miss error
> conditions that need to be signalled from the completion handler to the
> worker thread.
> Note that rt2x00usb_work_rxdone() processes all available entries, not
> only such for which queue_work() was called.
> 
> This patch is similar to what commit df71c9cfceea ("rt2x00: fix order
> of entry flags modification") did for TX processing.
> 
> This fixes a regression on a RT5370 based wifi stick in AP mode, which
> suddenly stopped data transmission after some period of heavy load. Also
> stopping the hanging hostapd resulted in the error message "ieee80211
> phy0: rt2x00queue_flush_queue: Warning - Queue 14 failed to flush".
> Other operation modes are probably affected as well, this just was
> the used testcase.
> 
> Fixes: ed194d136769 ("usb: core: remove local_irq_save() around ->complete() handler")
> Cc: stable@vger.kernel.org # 4.20+
> Signed-off-by: Soeren Moch <smoch@web.de>
> Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>

Patch applied to wireless-drivers.git, thanks.

41a531ffa4c5 rt2x00usb: fix rx queue hang
diff mbox series

Patch

diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
index 67b81c7221c4..7e3a621b9c0d 100644
--- a/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
+++ b/drivers/net/wireless/ralink/rt2x00/rt2x00usb.c
@@ -372,14 +372,9 @@  static void rt2x00usb_interrupt_rxdone(struct urb *urb)
 	struct queue_entry *entry = (struct queue_entry *)urb->context;
 	struct rt2x00_dev *rt2x00dev = entry->queue->rt2x00dev;

-	if (!test_and_clear_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags))
+	if (!test_bit(ENTRY_OWNER_DEVICE_DATA, &entry->flags))
 		return;

-	/*
-	 * Report the frame as DMA done
-	 */
-	rt2x00lib_dmadone(entry);
-
 	/*
 	 * Check if the received data is simply too small
 	 * to be actually valid, or if the urb is signaling
@@ -388,6 +383,11 @@  static void rt2x00usb_interrupt_rxdone(struct urb *urb)
 	if (urb->actual_length < entry->queue->desc_size || urb->status)
 		set_bit(ENTRY_DATA_IO_FAILED, &entry->flags);

+	/*
+	 * Report the frame as DMA done
+	 */
+	rt2x00lib_dmadone(entry);
+
 	/*
 	 * Schedule the delayed work for reading the RX status
 	 * from the device.