diff mbox

[v2,4/4] ath10k: fix issues on non-preemptible systems

Message ID 1377507205-5386-5-git-send-email-michal.kazior@tieto.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Michal Kazior Aug. 26, 2013, 8:53 a.m. UTC
Workers may not call into a sleepable function
(e.g. mutex_lock). Since ath10k workers can run
for a very long time it is necessary to explicitly
allow process rescheduling in case there's no
preemption.

This fixes some issues with system freezes, hangs,
watchdogs being triggered, userspace being
unresponsive on slow host machines under heavy
load.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/htc.c    |    4 ++++
 drivers/net/wireless/ath/ath10k/htt_rx.c |    4 ++++
 drivers/net/wireless/ath/ath10k/wmi.c    |    4 ++++
 3 files changed, 12 insertions(+)

Comments

Kalle Valo Aug. 27, 2013, 7:06 a.m. UTC | #1
Michal Kazior <michal.kazior@tieto.com> writes:

> Workers may not call into a sleepable function
> (e.g. mutex_lock). Since ath10k workers can run
> for a very long time it is necessary to explicitly
> allow process rescheduling in case there's no
> preemption.
>
> This fixes some issues with system freezes, hangs,
> watchdogs being triggered, userspace being
> unresponsive on slow host machines under heavy
> load.

I consider this more as a workaround as a real fix. Would NAPI be a
proper fix for issues like this?

NAPI support was removed from mac80211 six months ago, but I guess we
could try to get it back if we have a good reason:

http://marc.info/?l=linux-wireless&m=136204135907491

> --- a/drivers/net/wireless/ath/ath10k/htt_rx.c
> +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
> @@ -1229,6 +1229,10 @@ static void ath10k_htt_rx_work(struct work_struct *work)
>  			break;
>  
>  		ath10k_htt_rx_process_skb(htt->ar, skb);
> +
> +#ifndef CONFIG_PREEMPT
> +		cond_resched();
> +#endif

Why the #ifndef? Why should we not call cond_resched() when PREEMPT is
enabled? Does something negative happen then?
Michal Kazior Aug. 27, 2013, 7:30 a.m. UTC | #2
On 27 August 2013 09:06, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
>
>> Workers may not call into a sleepable function
>> (e.g. mutex_lock). Since ath10k workers can run
>> for a very long time it is necessary to explicitly
>> allow process rescheduling in case there's no
>> preemption.
>>
>> This fixes some issues with system freezes, hangs,
>> watchdogs being triggered, userspace being
>> unresponsive on slow host machines under heavy
>> load.
>
> I consider this more as a workaround as a real fix. Would NAPI be a
> proper fix for issues like this?
>
> NAPI support was removed from mac80211 six months ago, but I guess we
> could try to get it back if we have a good reason:
>
> http://marc.info/?l=linux-wireless&m=136204135907491

Hmm. From what I understand NAPI is used for RX polling. My patch
addresses mainly WMI/HTT TX starvation.

There's another solution that I had in mind. Instead of:

  for (;;) { dequeue(z); process; }

I did:

  q = dequeue_all(z); for (;;) { dequeue(q); process; }

I.e. move all queued stuff at the worker entry and move it out of the
global queue, that can, and will be, having more stuff queued while
the worker does its job).

This way workers would exit/restart more often, but from what I tested
it didn't really solve the problem. Given enough traffic HTC worker
responsible for HTT TX gets overwhelmed eventually. You could try
limit how many frames a worker can process during one execution, but
how do you guess that? This starvation depends on how fast your CPU
is.

Thus I opted for sole cond_resched().


>> --- a/drivers/net/wireless/ath/ath10k/htt_rx.c
>> +++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
>> @@ -1229,6 +1229,10 @@ static void ath10k_htt_rx_work(struct work_struct *work)
>>                       break;
>>
>>               ath10k_htt_rx_process_skb(htt->ar, skb);
>> +
>> +#ifndef CONFIG_PREEMPT
>> +             cond_resched();
>> +#endif
>
> Why the #ifndef? Why should we not call cond_resched() when PREEMPT is
> enabled? Does something negative happen then?

I think it should be okay. I saw the #ifndef thing in b43legacy and
thought it simply makes sense (although it's unsightly).


Micha?.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kalle Valo Aug. 28, 2013, 4:02 a.m. UTC | #3
Michal Kazior <michal.kazior@tieto.com> writes:

> On 27 August 2013 09:06, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
>
>> I consider this more as a workaround as a real fix. Would NAPI be a
>> proper fix for issues like this?
>>
>> NAPI support was removed from mac80211 six months ago, but I guess we
>> could try to get it back if we have a good reason:
>>
>> http://marc.info/?l=linux-wireless&m=136204135907491
>
> Hmm. From what I understand NAPI is used for RX polling. My patch
> addresses mainly WMI/HTT TX starvation.

I thought I saw something related to NAPI on TX as well, but I can't
find it anymore.

> There's another solution that I had in mind. Instead of:
>
>   for (;;) { dequeue(z); process; }
>
> I did:
>
>   q = dequeue_all(z); for (;;) { dequeue(q); process; }
>
> I.e. move all queued stuff at the worker entry and move it out of the
> global queue, that can, and will be, having more stuff queued while
> the worker does its job).
>
> This way workers would exit/restart more often, but from what I tested
> it didn't really solve the problem. Given enough traffic HTC worker
> responsible for HTT TX gets overwhelmed eventually. You could try
> limit how many frames a worker can process during one execution, but
> how do you guess that? This starvation depends on how fast your CPU
> is.

I think we should come up with better ways to handle this. To have a
quota (like you mention above) would be one option. Other option would
be to have multiple queues and some sort of priorisation or fair
queueing.

And most importantly of all, we should minimise the lenght of queue we
have inside ath10k. I'm worried that we queue way too many packets
within ath10k right now.

> Thus I opted for sole cond_resched().

Yeah, this is a good workaround for the problem. But it would be good to
get bottom of this as well.

>>> +#ifndef CONFIG_PREEMPT
>>> +             cond_resched();
>>> +#endif
>>
>> Why the #ifndef? Why should we not call cond_resched() when PREEMPT is
>> enabled? Does something negative happen then?
>
> I think it should be okay. I saw the #ifndef thing in b43legacy and
> thought it simply makes sense (although it's unsightly).

For various reasons I would prefer to have cond_resched() without the
#ifndef, if possible.
Michal Kazior Aug. 28, 2013, 1:16 p.m. UTC | #4
On 28 August 2013 06:02, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
>> There's another solution that I had in mind. Instead of:
>>
>>   for (;;) { dequeue(z); process; }
>>
>> I did:
>>
>>   q = dequeue_all(z); for (;;) { dequeue(q); process; }
>>
>> I.e. move all queued stuff at the worker entry and move it out of the
>> global queue, that can, and will be, having more stuff queued while
>> the worker does its job).
>>
>> This way workers would exit/restart more often, but from what I tested
>> it didn't really solve the problem. Given enough traffic HTC worker
>> responsible for HTT TX gets overwhelmed eventually. You could try
>> limit how many frames a worker can process during one execution, but
>> how do you guess that? This starvation depends on how fast your CPU
>> is.
>
> I think we should come up with better ways to handle this. To have a
> quota (like you mention above) would be one option. Other option would
> be to have multiple queues and some sort of priorisation or fair
> queueing.

Having quota will not help here in any way. You can re-queue a worker
after each single frame and avoid WMI starvation, however you can
still starve the rest of the system (and that can lead to system reset
via watchdog). I'm also unsure about the overhead queueing a work may
have (on a uP system without preemption in might be negligible, but
what about other systems?), so you'd have to guess the quota size or
else you'd get increased latency/overhead and perhaps slower
performance.

I believe cond_resched is a solution, not a workaround. Slow systems
without preemption need this. I wonder how other drivers got around
it? Or perhaps none of the other drivers had to deal with really
insufficient number of CPU cycles versus lots of work.

We could perhaps move workers out of HTC and have a single TX worker
in core.c for both WMI and HTT that would prioritize WMI, before
trying HTT. This could help guarantee that all beacons (which go
through WMI) are sent in a timely fashion in response to SWBA event.
But that won't fix the overall system starvation.


> And most importantly of all, we should minimise the lenght of queue we
> have inside ath10k. I'm worried that we queue way too many packets
> within ath10k right now.

Felix pointed that out quite some time ago. I would agree but I'm
affraid you'll hurt performance if you decrease the queue depth. There
seems to be some kind of latency thing going on (either on host, or on
firmware/hardware, or both combined). I tried decreasing HTT TX ring
buffer from 2048 to 256. In result on AP135 UDP TX got trimmed at
~330mbps max. Stuffing more throughput even left some idle CPU cycles.
If you consider 3x3 devices that are supposed to get you 1.3gbps, then
you apparently need that 2048 depth.


Micha?.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/wireless/ath/ath10k/htc.c b/drivers/net/wireless/ath/ath10k/htc.c
index 7d445d3..99f1dbd 100644
--- a/drivers/net/wireless/ath/ath10k/htc.c
+++ b/drivers/net/wireless/ath/ath10k/htc.c
@@ -237,6 +237,10 @@  static void ath10k_htc_send_work(struct work_struct *work)
 		ret = ath10k_htc_issue_skb(htc, ep, skb, credits);
 		if (ret == -ENOSR)
 			break;
+
+#ifndef CONFIG_PREEMPT
+		cond_resched();
+#endif
 	}
 }
 
diff --git a/drivers/net/wireless/ath/ath10k/htt_rx.c b/drivers/net/wireless/ath/ath10k/htt_rx.c
index 6cf4d95..80ea398 100644
--- a/drivers/net/wireless/ath/ath10k/htt_rx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_rx.c
@@ -1229,6 +1229,10 @@  static void ath10k_htt_rx_work(struct work_struct *work)
 			break;
 
 		ath10k_htt_rx_process_skb(htt->ar, skb);
+
+#ifndef CONFIG_PREEMPT
+		cond_resched();
+#endif
 	}
 }
 
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 32fd5e7..f36f0be 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -1155,6 +1155,10 @@  static void ath10k_wmi_event_work(struct work_struct *work)
 			break;
 
 		ath10k_wmi_event_process(ar, skb);
+
+#ifndef CONFIG_PREEMPT
+		cond_resched();
+#endif
 	}
 }