diff mbox series

brcmfmac: Set SDIO workqueue as WQ_HIGHPRI

Message ID 20210802170904.3116223-1-sean.anderson@seco.com (mailing list archive)
State Accepted
Commit 41b637bac0b0a90424793aa1ec265b24c4c50fb1
Delegated to: Kalle Valo
Headers show
Series brcmfmac: Set SDIO workqueue as WQ_HIGHPRI | expand

Commit Message

Sean Anderson Aug. 2, 2021, 5:09 p.m. UTC
This puts tasks submitted to the SDIO workqueue at the head of the queue
and runs them immediately. This gets higher RX throughput with the SDIO
bus.

This was originally submitted as [1]. The original author Wright Feng
reports

> throughput result with 43455(11ac) on 1 core 1.6 Ghz platform is
>     Without WQ_HIGGPRI TX/RX: 293/301 (mbps)
>     With    WQ_HIGHPRI TX/RX: 293/321 (mbps)

I tested this with a 43364(11bgn) on a 1 core 800 MHz platform and got
    Without WQ_HIGHPRI TX/RX: 16/19 (Mbits/sec)
    With    WQ_HIGHPRI TX/RX: 24/20 (MBits/sec)

[1] https://lore.kernel.org/linux-wireless/1584604406-15452-4-git-send-email-wright.feng@cypress.com/

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
---

 drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Sean Anderson Aug. 17, 2021, 4:48 p.m. UTC | #1
ping?

On 8/2/21 1:09 PM, Sean Anderson wrote:
> This puts tasks submitted to the SDIO workqueue at the head of the queue
> and runs them immediately. This gets higher RX throughput with the SDIO
> bus.
> 
> This was originally submitted as [1]. The original author Wright Feng
> reports
> 
>> throughput result with 43455(11ac) on 1 core 1.6 Ghz platform is
>>     Without WQ_HIGGPRI TX/RX: 293/301 (mbps)
>>     With    WQ_HIGHPRI TX/RX: 293/321 (mbps)
> 
> I tested this with a 43364(11bgn) on a 1 core 800 MHz platform and got
>      Without WQ_HIGHPRI TX/RX: 16/19 (Mbits/sec)
>      With    WQ_HIGHPRI TX/RX: 24/20 (MBits/sec)
> 
> [1] https://lore.kernel.org/linux-wireless/1584604406-15452-4-git-send-email-wright.feng@cypress.com/
> 
> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> ---
> 
>   drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> index 97ee9e2e2e35..5e10176c6c7e 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> @@ -4442,7 +4442,7 @@ struct brcmf_sdio *brcmf_sdio_probe(struct brcmf_sdio_dev *sdiodev)
>   	bus->tx_seq = SDPCM_SEQ_WRAP - 1;
>   
>   	/* single-threaded workqueue */
> -	wq = alloc_ordered_workqueue("brcmf_wq/%s", WQ_MEM_RECLAIM,
> +	wq = alloc_ordered_workqueue("brcmf_wq/%s", WQ_MEM_RECLAIM | WQ_HIGHPRI,
>   				     dev_name(&sdiodev->func1->dev));
>   	if (!wq) {
>   		brcmf_err("insufficient memory to create txworkqueue\n");
>
Arend Van Spriel Aug. 17, 2021, 5:17 p.m. UTC | #2
On August 17, 2021 6:50:50 PM Sean Anderson <sean.anderson@seco.com> wrote:

> ping?

Good idea to ping with a top-level post :-p

>
>
> On 8/2/21 1:09 PM, Sean Anderson wrote:
>> This puts tasks submitted to the SDIO workqueue at the head of the queue
>> and runs them immediately. This gets higher RX throughput with the SDIO
>> bus.
>>
>> This was originally submitted as [1]. The original author Wright Feng
>> reports
>>
>>> throughput result with 43455(11ac) on 1 core 1.6 Ghz platform is
>>> Without WQ_HIGGPRI TX/RX: 293/301 (mbps)
>>> With    WQ_HIGHPRI TX/RX: 293/321 (mbps)
>>
>> I tested this with a 43364(11bgn) on a 1 core 800 MHz platform and got
>> Without WQ_HIGHPRI TX/RX: 16/19 (Mbits/sec)
>> With    WQ_HIGHPRI TX/RX: 24/20 (MBits/sec)
>>
>> [1] 
>> https://lore.kernel.org/linux-wireless/1584604406-15452-4-git-send-email-wright.feng@cypress.com/

While I understand the obvious gain it seems like a wrong move to me. What 
if all workqueues in the kernel would start using this flag? I bet the gain 
above would be negated and all are equal in the eyes of .. the kernel

Regards,
Arend
Sean Anderson Aug. 17, 2021, 6:21 p.m. UTC | #3
On 8/17/21 1:17 PM, Arend van Spriel wrote:
> On August 17, 2021 6:50:50 PM Sean Anderson <sean.anderson@seco.com> wrote:
>
>> ping?
>
> Good idea to ping with a top-level post :-p

Sorry, I'm not subscribed to this list; did I mess up the list/CC for the original?

>
>>
>>
>> On 8/2/21 1:09 PM, Sean Anderson wrote:
>>> This puts tasks submitted to the SDIO workqueue at the head of the queue
>>> and runs them immediately. This gets higher RX throughput with the SDIO
>>> bus.
>>>
>>> This was originally submitted as [1]. The original author Wright Feng
>>> reports
>>>
>>>> throughput result with 43455(11ac) on 1 core 1.6 Ghz platform is
>>>> Without WQ_HIGGPRI TX/RX: 293/301 (mbps)
>>>> With    WQ_HIGHPRI TX/RX: 293/321 (mbps)
>>>
>>> I tested this with a 43364(11bgn) on a 1 core 800 MHz platform and got
>>> Without WQ_HIGHPRI TX/RX: 16/19 (Mbits/sec)
>>> With    WQ_HIGHPRI TX/RX: 24/20 (MBits/sec)
>>>
>>> [1] https://lore.kernel.org/linux-wireless/1584604406-15452-4-git-send-email-wright.feng@cypress.com/
>
> While I understand the obvious gain it seems like a wrong move to me. What if all workqueues in the kernel would start using this flag? I bet the gain above would be negated and all are equal in the eyes of .. the kernel

Is there an official policy on what counts as high-priority? Using some
very-scientific methodology [1], it seems like most high-priority
workqueues are in drivers/net and fs. Making these queues high-priority
seems to be commonplace. For example, in fe101716c7c9 ("rtw88: replace
tx tasklet with work queue"), Po-Hao Huang remarks:

> Since throughput is delay-sensitive in most cases, we allocate a
> dedicated, high priority wq for our needs.

which is effectively the same rationale as this patch. At least for my
application, network transfer speed is one of the most important
performance metrics.

The original patch got the following feedback [2] from Kalle Valo:

> Why would someone want to disable this? Like in patch 2, please avoid
> adding new module parameters as much as possible.

so for this patch I just made the workqueue high-priority
unconditionally.

--Sean

[1] git grep -l WQ_HIGHPRI | cut -f 1,2 -d / | uniq -c | sort -n
[2] https://lore.kernel.org/linux-wireless/87tv2gbgv1.fsf@kamboji.qca.qualcomm.com/
Tejun Heo Aug. 17, 2021, 6:28 p.m. UTC | #4
Hello,

On Tue, Aug 17, 2021 at 02:21:55PM -0400, Sean Anderson wrote:
> > While I understand the obvious gain it seems like a wrong move to me. What if all workqueues in the kernel would start using this flag? I bet the gain above would be negated and all are equal in the eyes of .. the kernel
> 
> Is there an official policy on what counts as high-priority? Using some
> very-scientific methodology [1], it seems like most high-priority
> workqueues are in drivers/net and fs. Making these queues high-priority
> seems to be commonplace. For example, in fe101716c7c9 ("rtw88: replace
> tx tasklet with work queue"), Po-Hao Huang remarks:

I think this is actually a good candidate for HIGHPRI. As you noted, stuff
which interacts with hardware in latency sensitive manner with impact on
observable performance is one of the common use cases. The alternatives
would be doing it from hard/softirqs which are higher priorities anyway.

Thanks.
Arend Van Spriel Aug. 17, 2021, 9:14 p.m. UTC | #5
On August 17, 2021 8:28:23 PM Tejun Heo <tj@kernel.org> wrote:

> Hello,
>
> On Tue, Aug 17, 2021 at 02:21:55PM -0400, Sean Anderson wrote:
>>> While I understand the obvious gain it seems like a wrong move to me. What 
>>> if all workqueues in the kernel would start using this flag? I bet the gain 
>>> above would be negated and all are equal in the eyes of .. the kernel
>>
>> Is there an official policy on what counts as high-priority? Using some
>> very-scientific methodology [1], it seems like most high-priority
>> workqueues are in drivers/net and fs. Making these queues high-priority
>> seems to be commonplace. For example, in fe101716c7c9 ("rtw88: replace
>> tx tasklet with work queue"), Po-Hao Huang remarks:
>
> I think this is actually a good candidate for HIGHPRI. As you noted, stuff
> which interacts with hardware in latency sensitive manner with impact on
> observable performance is one of the common use cases. The alternatives
> would be doing it from hard/softirqs which are higher priorities anyway.

Hi, Tejun

Thanks for the explanation.

Regards,
Arend
Arend Van Spriel Aug. 18, 2021, 4:53 a.m. UTC | #6
On August 2, 2021 7:11:12 PM Sean Anderson <sean.anderson@seco.com> wrote:

> This puts tasks submitted to the SDIO workqueue at the head of the queue
> and runs them immediately. This gets higher RX throughput with the SDIO
> bus.
>
> This was originally submitted as [1]. The original author Wright Feng
> reports
>
>> throughput result with 43455(11ac) on 1 core 1.6 Ghz platform is
>> Without WQ_HIGGPRI TX/RX: 293/301 (mbps)
>> With    WQ_HIGHPRI TX/RX: 293/321 (mbps)
>
> I tested this with a 43364(11bgn) on a 1 core 800 MHz platform and got
>    Without WQ_HIGHPRI TX/RX: 16/19 (Mbits/sec)
>    With    WQ_HIGHPRI TX/RX: 24/20 (MBits/sec)
>
> [1] 
> https://lore.kernel.org/linux-wireless/1584604406-15452-4-git-send-email-wright.feng@cypress.com/

Not sure if Wright Feng needs to be attributed as you clearly had a good 
look at his patch and the discussion related to it. You can add my ...

Reviewed-by: Arend van Spriel <aspriel@gmail.com>
> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> ---
>
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Kalle Valo Aug. 21, 2021, 4:59 p.m. UTC | #7
Sean Anderson <sean.anderson@seco.com> wrote:

> This puts tasks submitted to the SDIO workqueue at the head of the queue
> and runs them immediately. This gets higher RX throughput with the SDIO
> bus.
> 
> This was originally submitted as [1]. The original author Wright Feng
> reports
> 
> > throughput result with 43455(11ac) on 1 core 1.6 Ghz platform is
> >     Without WQ_HIGGPRI TX/RX: 293/301 (mbps)
> >     With    WQ_HIGHPRI TX/RX: 293/321 (mbps)
> 
> I tested this with a 43364(11bgn) on a 1 core 800 MHz platform and got
>     Without WQ_HIGHPRI TX/RX: 16/19 (Mbits/sec)
>     With    WQ_HIGHPRI TX/RX: 24/20 (MBits/sec)
> 
> [1] https://lore.kernel.org/linux-wireless/1584604406-15452-4-git-send-email-wright.feng@cypress.com/
> 
> Signed-off-by: Sean Anderson <sean.anderson@seco.com>
> Reviewed-by: Arend van Spriel <aspriel@gmail.com>

Patch applied to wireless-drivers-next.git, thanks.

41b637bac0b0 brcmfmac: Set SDIO workqueue as WQ_HIGHPRI
diff mbox series

Patch

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index 97ee9e2e2e35..5e10176c6c7e 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -4442,7 +4442,7 @@  struct brcmf_sdio *brcmf_sdio_probe(struct brcmf_sdio_dev *sdiodev)
 	bus->tx_seq = SDPCM_SEQ_WRAP - 1;
 
 	/* single-threaded workqueue */
-	wq = alloc_ordered_workqueue("brcmf_wq/%s", WQ_MEM_RECLAIM,
+	wq = alloc_ordered_workqueue("brcmf_wq/%s", WQ_MEM_RECLAIM | WQ_HIGHPRI,
 				     dev_name(&sdiodev->func1->dev));
 	if (!wq) {
 		brcmf_err("insufficient memory to create txworkqueue\n");