[v5,0/8] ath10k: improve throughout of tcp/udp TX/RX of sdio
mbox series

Message ID 1567679893-14029-1-git-send-email-wgong@codeaurora.org
Headers show
Series
  • ath10k: improve throughout of tcp/udp TX/RX of sdio
Related show

Message

Wen Gong Sept. 5, 2019, 10:38 a.m. UTC
The bottleneck of throughout on sdio chip is the bus bandwidth, to the
patches are all to increase the use ratio of sdio bus.

                      udp-rx    udp-tx    tcp-rx    tcp-tx
without patches(Mbps)  320        180       170       151
with patches(Mbps)     450        410       400       320

These patches only affect sdio bus chip, explanation is mentioned in each
patch's commit log.

Alagu Sankar (1):
  ath10k: enable RX bundle receive for sdio
v2: fix incorrect skb tail of rx bundle in ath10k_sdio_mbox_rx_process_packet
v3: change some code style
split fix incorrect skb tail of rx bundle to patch "adjust skb length in ath10k_sdio_mbox_rx_packet"
v4: add err handle in ath10k_sdio_mbox_rx_fetch_bundle
v5: no change

Nicolas Boichat (1):
  ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet
v2: no this patch
v3: new added
v4: change commit log
v5: no change

Wen Gong (6):
  ath10k: change max RX bundle size from 8 to 32 for sdio
v2: change macro HTC_GET_BUNDLE_COUNT
v3: change some code style
v4: add macro ATH10K_HTC_FLAG_BUNDLE_MASK
v5: no change

  ath10k: add workqueue for RX path of sdio
v2: no change
v3: change some code style
v4: no change
v5: no change

  ath10k: disable TX complete indication of htt for sdio
v2: change some code style
v3: change some code style
v4: no change
v5: no change

  ath10k: add htt TX bundle for sdio
v2: no change
v3: change some code style
v4: no change
v5: change ath10k_htc_setup_tx_req to add check bundle_tx
to forbidden init 2 times

  ath10k: enable alt data of TX path for sdio
v2: no change
v3: change some code style
v4: add macro ATH10K_HTC_MSG_READY_EXT_ALT_DATA_MASK
v5: no change

  ath10k: enable napi on RX path for sdio
v2: no change
v3: change some code style
v4: change some code style
v5: move skb_queue_head_init(&ar->htt.rx_indication_head)
from ath10k_htt_connect to ath10k_core_create to forbidden 
init 2 times

 drivers/net/wireless/ath/ath10k/core.c   |  44 +++-
 drivers/net/wireless/ath/ath10k/core.h   |   4 +-
 drivers/net/wireless/ath/ath10k/hif.h    |   9 +
 drivers/net/wireless/ath/ath10k/htc.c    | 378 ++++++++++++++++++++++++++++---
 drivers/net/wireless/ath/ath10k/htc.h    |  49 +++-
 drivers/net/wireless/ath/ath10k/htt.c    |  13 ++
 drivers/net/wireless/ath/ath10k/htt.h    |  20 +-
 drivers/net/wireless/ath/ath10k/htt_rx.c |  80 ++++++-
 drivers/net/wireless/ath/ath10k/htt_tx.c |  38 +++-
 drivers/net/wireless/ath/ath10k/hw.h     |   2 +-
 drivers/net/wireless/ath/ath10k/sdio.c   | 281 ++++++++++++++++++++---
 drivers/net/wireless/ath/ath10k/sdio.h   |  31 ++-
 12 files changed, 852 insertions(+), 97 deletions(-)

Comments

Kalle Valo Sept. 12, 2019, 3:39 p.m. UTC | #1
Wen Gong <wgong@codeaurora.org> writes:

> The bottleneck of throughout on sdio chip is the bus bandwidth, to the
> patches are all to increase the use ratio of sdio bus.
>
>                       udp-rx    udp-tx    tcp-rx    tcp-tx
> without patches(Mbps)  320        180       170       151
> with patches(Mbps)     450        410       400       320
>
> These patches only affect sdio bus chip, explanation is mentioned in each
> patch's commit log.

I tried to apply patches 2-8, patch 2 had a conflict due to my changes
and patch 8 didn't apply at all. Also I saw few warnings with the
patches I was able to test:

$ ath10k-check
drivers/net/wireless/ath/ath10k/htc.c: In function 'ath10k_htc_bundle_tx_work':
drivers/net/wireless/ath/ath10k/htc.c:796:24: warning: variable 'eid' set but not used [-Wunused-but-set-variable]
drivers/net/wireless/ath/ath10k/sdio.c:2169:6: warning: no previous prototype for 'ath10k_sdio_check_fw_reg' [-Wmissing-prototypes]
drivers/net/wireless/ath/ath10k/sdio.c: In function 'ath10k_sdio_check_fw_reg':
drivers/net/wireless/ath/ath10k/sdio.c:2171:6: warning: variable 'ret' set but not used [-Wunused-but-set-variable]
drivers/net/wireless/ath/ath10k/sdio.c: In function 'ath10k_sdio_fw_crashed_dump':
drivers/net/wireless/ath/ath10k/sdio.c:2434:17: warning: too many arguments for format [-Wformat-extra-args]
drivers/net/wireless/ath/ath10k/sdio.c:2169:6: warning: symbol 'ath10k_sdio_check_fw_reg' was not declared. Should it be static?
drivers/net/wireless/ath/ath10k/sdio.c: In function 'ath10k_sdio_fw_crashed_dump':
drivers/net/wireless/ath/ath10k/sdio.c:2434:17: warning: too many arguments for format [-Wformat-extra-args]
drivers/net/wireless/ath/ath10k/sdio.c:734: braces {} are not necessary for single statement blocks
drivers/net/wireless/ath/ath10k/sdio.c:969: braces {} are not necessary for single statement blocks

But please don't send a new version until I have provided review
comments.
Kalle Valo Sept. 12, 2019, 5:51 p.m. UTC | #2
Kalle Valo <kvalo@codeaurora.org> writes:

> Wen Gong <wgong@codeaurora.org> writes:
>
>> The bottleneck of throughout on sdio chip is the bus bandwidth, to the
>> patches are all to increase the use ratio of sdio bus.
>>
>>                       udp-rx    udp-tx    tcp-rx    tcp-tx
>> without patches(Mbps)  320        180       170       151
>> with patches(Mbps)     450        410       400       320
>>
>> These patches only affect sdio bus chip, explanation is mentioned in each
>> patch's commit log.
>
> I tried to apply patches 2-8, patch 2 had a conflict due to my changes
> and patch 8 didn't apply at all. Also I saw few warnings with the
> patches I was able to test:
>
> $ ath10k-check
> drivers/net/wireless/ath/ath10k/htc.c: In function 'ath10k_htc_bundle_tx_work':
> drivers/net/wireless/ath/ath10k/htc.c:796:24: warning: variable 'eid'
> set but not used [-Wunused-but-set-variable]
> drivers/net/wireless/ath/ath10k/sdio.c:2169:6: warning: no previous
> prototype for 'ath10k_sdio_check_fw_reg' [-Wmissing-prototypes]
> drivers/net/wireless/ath/ath10k/sdio.c: In function 'ath10k_sdio_check_fw_reg':
> drivers/net/wireless/ath/ath10k/sdio.c:2171:6: warning: variable 'ret'
> set but not used [-Wunused-but-set-variable]
> drivers/net/wireless/ath/ath10k/sdio.c: In function 'ath10k_sdio_fw_crashed_dump':
> drivers/net/wireless/ath/ath10k/sdio.c:2434:17: warning: too many
> arguments for format [-Wformat-extra-args]
> drivers/net/wireless/ath/ath10k/sdio.c:2169:6: warning: symbol
> 'ath10k_sdio_check_fw_reg' was not declared. Should it be static?
> drivers/net/wireless/ath/ath10k/sdio.c: In function 'ath10k_sdio_fw_crashed_dump':
> drivers/net/wireless/ath/ath10k/sdio.c:2434:17: warning: too many
> arguments for format [-Wformat-extra-args]
> drivers/net/wireless/ath/ath10k/sdio.c:734: braces {} are not
> necessary for single statement blocks
> drivers/net/wireless/ath/ath10k/sdio.c:969: braces {} are not
> necessary for single statement blocks

Actually some of the warnings are from another patch.
Wen Gong Sept. 13, 2019, 3:54 a.m. UTC | #3
>>> The bottleneck of throughout on sdio chip is the bus bandwidth, to the
>>>patches are all to increase the use ratio of sdio bus.

>> I tried to apply patches 2-8, patch 2 had a conflict due to my changes
>> and patch 8 didn't apply at all. Also I saw few warnings with the
>> patches I was able to test:

HI kalle,

i see some warning is from patch "ath10k: add fw coredump for sdio when firmware assert"
and this patch also have change in sdio.c, so maybe it will have conflict with the 8 patches.

patch 8 didn't apply at all, is it means each change of the patch is conflict?

I used command to check each patch.
perl ~/opensource/checkpatch.pl --strict --no-tree --max-line-length=90  --show-types --ignore CONST_STRUCT ./*

I found it not check Wunused-but-set-variable.
Kalle Valo Sept. 20, 2019, 12:44 p.m. UTC | #4
Wen Gong <wgong@qti.qualcomm.com> writes:

>>>> The bottleneck of throughout on sdio chip is the bus bandwidth, to the
>>>>patches are all to increase the use ratio of sdio bus.
>
>>> I tried to apply patches 2-8, patch 2 had a conflict due to my changes
>>> and patch 8 didn't apply at all. Also I saw few warnings with the
>>> patches I was able to test:
>
> Hi kalle,
>
> i see some warning is from patch "ath10k: add fw coredump for sdio when firmware assert"
> and this patch also have change in sdio.c, so maybe it will have
> conflict with the 8 patches.
>
> patch 8 didn't apply at all, is it means each change of the patch is
> conflict?

Patches 1-7 applied fine, but patch 8 didn't apply. I didn't investigate
what was the conflict.

> I used command to check each patch.
> perl ~/opensource/checkpatch.pl --strict --no-tree
> --max-line-length=90 --show-types --ignore CONST_STRUCT ./*
>
> I found it not check Wunused-but-set-variable.

checkpatch only checks style issues, unused-but-set-variable is a
warning from GCC.

I use ath10k-check script to check all ath10k patches:

https://wireless.wiki.kernel.org/en/users/drivers/ath10k/codingstyle#checking_code

And I use latest GCC and sparse with that. crosstool is a simple way to
install a relatively new version of GCC for kernel compilation:

https://mirrors.edge.kernel.org/pub/tools/crosstool/
Kalle Valo Sept. 23, 2019, 9:29 a.m. UTC | #5
Wen Gong <wgong@codeaurora.org> writes:

> The bottleneck of throughout on sdio chip is the bus bandwidth, to the
> patches are all to increase the use ratio of sdio bus.
>
>                       udp-rx    udp-tx    tcp-rx    tcp-tx
> without patches(Mbps)  320        180       170       151
> with patches(Mbps)     450        410       400       320
>
> These patches only affect sdio bus chip, explanation is mentioned in each
> patch's commit log.

Below is my summary of the patchset. I recommend splitting these into
smaller sets, makes it a lot easier to review and apply. And please send
only one or two patchsets at a time.

[PATCH v5 1/8] ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet

Applied.

Patchset 1:

[PATCH v5 2/8] ath10k: enable RX bundle receive for sdio
[PATCH v5 3/8] ath10k: change max RX bundle size from 8 to 32 for sdio

Reasonal but needs some cleanup.

Patchset 2:

[PATCH v5 4/8] ath10k: add workqueue for RX path of sdio

Is really another thread needed? We already have one for SDIO.

[PATCH v5 6/8] ath10k: add htt TX bundle for sdio

And again a new thread so that we would have three threads for SDIO? I'm
not convinced about that.

Patchset 3:

[PATCH v5 7/8] ath10k: enable alt data of TX path for sdio

Again another module parameter?

[PATCH v5 8/8] ath10k: enable napi on RX path for sdio

Seems reasonable, but worried about breaking USB.

Patchset 4:

[PATCH v5 5/8] ath10k: disable TX complete indication of htt for sdio

Quite hackish and I need numbers how much it really improves throughput
Wen Gong Sept. 24, 2019, 12:32 p.m. UTC | #6
On 2019-09-23 17:29, Kalle Valo wrote:
> Wen Gong <wgong@codeaurora.org> writes:
> 
>> The bottleneck of throughout on sdio chip is the bus bandwidth, to the
>> patches are all to increase the use ratio of sdio bus.
>> 
>>                       udp-rx    udp-tx    tcp-rx    tcp-tx
>> without patches(Mbps)  320        180       170       151
>> with patches(Mbps)     450        410       400       320
>> 
>> These patches only affect sdio bus chip, explanation is mentioned in 
>> each
>> patch's commit log.
> 
> Below is my summary of the patchset. I recommend splitting these into
> smaller sets, makes it a lot easier to review and apply. And please 
> send
> only one or two patchsets at a time.
> 
> [PATCH v5 1/8] ath10k: adjust skb length in ath10k_sdio_mbox_rx_packet
> 
> Applied.
> 
> Patchset 1:
> 
> [PATCH v5 2/8] ath10k: enable RX bundle receive for sdio
> [PATCH v5 3/8] ath10k: change max RX bundle size from 8 to 32 for sdio
> 
> Reasonal but needs some cleanup.
[PATCH v5 2/8] I will use sk_buff_head to replace the 
ath10k_sdio_rx_request, then it will be simple
[PATCH v5 3/8] FIELD_GET is to >>, not <<, so << still need by my 
understand
> 
> Patchset 2:
> 
> [PATCH v5 4/8] ath10k: add workqueue for RX path of sdio
> 
> Is really another thread needed? We already have one for SDIO.

the SDIO thread is used for async tx, this queue is for RX, and it will 
improve udp rx
from 200Mbps to 400Mbps. And it used the workqueue_aux of ar, not new 
created.
this patch is better to put to Patchset 1, it helps RX, so it should put 
together with the
[PATCH v5 2/8] ath10k: enable RX bundle receive for sdio

> 
> [PATCH v5 6/8] ath10k: add htt TX bundle for sdio
> 
> And again a new thread so that we would have three threads for SDIO? 
> I'm
> not convinced about that.
The thread is for tx complete indication and the thread only used for tx 
bundle, if it does not have
heavy traffic, then it will not bundle, then the thread will not run.
for bundled tx, after bundled, it has max 32 packets in each bundle, the 
tx complete for the 32 packets will cost much time, if not give the tx 
complete task to the thread, then it will much delay the bundle of the 
next packets, then it will drop throughput.
> 
> Patchset 3:
> 
> [PATCH v5 7/8] ath10k: enable alt data of TX path for sdio
> 
> Again another module parameter?
the alt_data could be removed
> 
> [PATCH v5 8/8] ath10k: enable napi on RX path for sdio
> 
> Seems reasonable, but worried about breaking USB.
it can change to check napi enabled, if not, will use old 
ieee80211_rx_ni in ath10k_htt_rx_proc_rx_ind_hl
> 
> Patchset 4:
> 
> [PATCH v5 5/8] ath10k: disable TX complete indication of htt for sdio
> 
> Quite hackish and I need numbers how much it really improves throughput
it will improve throughput,
for udp tx, it can arrive to 400Mbps, if remove this patch, it will drop 
to
130M. it not only remove the tx complete message's bus bandwidth of 
sdio, and it also
has a improvement in firmware's tx path's logic, it will change the 
logic
of tx simple both in firmware and ath10k and faster the tx circle.
And the paramter disable_tx_comp can be removed.
this patch shoud be put in Patchset 2, it help TX, so it is better to 
put together with the
[PATCH v5 6/8] ath10k: add htt TX bundle for sdio.
Wen Gong Sept. 26, 2019, 2:33 a.m. UTC | #7
On 2019-09-24 20:32, Wen Gong wrote:
> On 2019-09-23 17:29, Kalle Valo wrote:
>> Wen Gong <wgong@codeaurora.org> writes:


patch v6 ath10k: improve throughout of RX of sdio has sent
new patch v6 only have 3 patches, the left patches will also sent later

Alagu Sankar (1):
   ath10k: enable RX bundle receive for sdio

Wen Gong (2):
   ath10k: change max RX bundle size from 8 to 32 for sdio
   ath10k: add workqueue for RX path of sdio

https://patchwork.kernel.org/patch/11160247/
https://patchwork.kernel.org/patch/11160245/
https://patchwork.kernel.org/patch/11160241/
Wen Gong Oct. 14, 2019, 11:53 a.m. UTC | #8
> -----Original Message-----
> From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Wen Gong
> Sent: Thursday, September 26, 2019 10:33 AM
> To: kvalo@codeaurora.org
> Cc: linux-wireless@vger.kernel.org; ath10k@lists.infradead.org
> Subject: [EXT] Re: [PATCH v5 0/8] ath10k: improve throughout of tcp/udp
> TX/RX of sdio
> 
> On 2019-09-24 20:32, Wen Gong wrote:
> > On 2019-09-23 17:29, Kalle Valo wrote:
> >> Wen Gong <wgong@codeaurora.org> writes:
> 
> 
> patch v6 ath10k: improve throughout of RX of sdio has sent
> new patch v6 only have 3 patches, the left patches will also sent later
> 
> Alagu Sankar (1):
>    ath10k: enable RX bundle receive for sdio
> 
> Wen Gong (2):
>    ath10k: change max RX bundle size from 8 to 32 for sdio
>    ath10k: add workqueue for RX path of sdio
> 
> https://patchwork.kernel.org/patch/11160247/
> https://patchwork.kernel.org/patch/11160245/
> https://patchwork.kernel.org/patch/11160241/
> 
Left 4 patches sent v6:
https://patchwork.kernel.org/patch/11188393/
https://patchwork.kernel.org/patch/11188403/
https://patchwork.kernel.org/patch/11188405/
https://patchwork.kernel.org/patch/11188407/

> _______________________________________________
> ath10k mailing list
> ath10k@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k