[1/2] ath10k: use cumulative survey statistics
diff mbox series

Message ID 20200504154122.91862-1-markus.theil@tu-ilmenau.de
State New
Headers show
Series
  • [1/2] ath10k: use cumulative survey statistics
Related show

Commit Message

Markus Theil May 4, 2020, 3:41 p.m. UTC
ath10k currently reports survey results for the last interval between each
invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, this
can lead to unexpectedly small results, e.g. when hostapd uses survey
data and iw survey dump is invoked in parallel. Fix this by returning
cumulative results, that don't depend on the last invocation. Other
drivers, e.g. ath9k or mt76 also use this behavior.

Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
---
 drivers/net/wireless/ath/ath10k/wmi.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Comments

Sven Eckelmann May 4, 2020, 4:29 p.m. UTC | #1
On Monday, 4 May 2020 17:41:21 CEST Markus Theil wrote:
> ath10k currently reports survey results for the last interval between each
> invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, this
> can lead to unexpectedly small results, e.g. when hostapd uses survey
> data and iw survey dump is invoked in parallel. Fix this by returning
> cumulative results, that don't depend on the last invocation. Other
> drivers, e.g. ath9k or mt76 also use this behavior.

It is (unfortunately) not that trivial:

See code and comments from other people:

* https://patchwork.kernel.org/cover/11150285/
* https://patchwork.kernel.org/patch/11150287/
* https://patchwork.kernel.org/patch/11150289/

Kind regards,
	Sven
Markus Theil May 4, 2020, 4:33 p.m. UTC | #2
On 5/4/20 6:29 PM, Sven Eckelmann wrote:
> On Monday, 4 May 2020 17:41:21 CEST Markus Theil wrote:
>> ath10k currently reports survey results for the last interval between each
>> invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, this
>> can lead to unexpectedly small results, e.g. when hostapd uses survey
>> data and iw survey dump is invoked in parallel. Fix this by returning
>> cumulative results, that don't depend on the last invocation. Other
>> drivers, e.g. ath9k or mt76 also use this behavior.
> It is (unfortunately) not that trivial:
>
> See code and comments from other people:
>
> * https://patchwork.kernel.org/cover/11150285/
> * https://patchwork.kernel.org/patch/11150287/
> * https://patchwork.kernel.org/patch/11150289/
>
> Kind regards,
> 	Sven
Thanks a lot for pointing this out! I was not aware of your patch.
Rajkumar Manoharan May 4, 2020, 11:46 p.m. UTC | #3
On 2020-05-04 08:41, Markus Theil wrote:
> ath10k currently reports survey results for the last interval between 
> each
> invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, this
> can lead to unexpectedly small results, e.g. when hostapd uses survey
> data and iw survey dump is invoked in parallel. Fix this by returning
> cumulative results, that don't depend on the last invocation. Other
> drivers, e.g. ath9k or mt76 also use this behavior.
> 
> Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
> 

IIRC this was fixed a while ago by below patch. Somehow it never landed 
in ath.git.
Simple one line change is enough.

https://patchwork.kernel.org/patch/10550707/

-Rajkumar
Ben Greear May 4, 2020, 11:49 p.m. UTC | #4
On 05/04/2020 04:46 PM, Rajkumar Manoharan wrote:
> On 2020-05-04 08:41, Markus Theil wrote:
>> ath10k currently reports survey results for the last interval between each
>> invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, this
>> can lead to unexpectedly small results, e.g. when hostapd uses survey
>> data and iw survey dump is invoked in parallel. Fix this by returning
>> cumulative results, that don't depend on the last invocation. Other
>> drivers, e.g. ath9k or mt76 also use this behavior.
>>
>> Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
>>
>
> IIRC this was fixed a while ago by below patch. Somehow it never landed in ath.git.
> Simple one line change is enough.
>
> https://patchwork.kernel.org/patch/10550707/
>
> -Rajkumar

Have you tested this with wave-1?  Lots of older, at least, firmware has brokenness in this area.

Thanks,
Ben
Rajkumar Manoharan May 4, 2020, 11:52 p.m. UTC | #5
On 2020-05-04 16:49, Ben Greear wrote:
> On 05/04/2020 04:46 PM, Rajkumar Manoharan wrote:
>> On 2020-05-04 08:41, Markus Theil wrote:
>>> ath10k currently reports survey results for the last interval between 
>>> each
>>> invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, 
>>> this
>>> can lead to unexpectedly small results, e.g. when hostapd uses survey
>>> data and iw survey dump is invoked in parallel. Fix this by returning
>>> cumulative results, that don't depend on the last invocation. Other
>>> drivers, e.g. ath9k or mt76 also use this behavior.
>>> 
>>> Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
>>> 
>> 
>> IIRC this was fixed a while ago by below patch. Somehow it never 
>> landed in ath.git.
>> Simple one line change is enough.
>> 
>> https://patchwork.kernel.org/patch/10550707/
>> 
>> -Rajkumar
> 
> Have you tested this with wave-1?  Lots of older, at least, firmware
> has brokenness in this area.
> 
Yes. It was tested in wave-1 as well. Venkat replied to your comment on 
original change.

-Rajkumar
Ben Greear May 5, 2020, 12:02 a.m. UTC | #6
On 05/04/2020 04:52 PM, Rajkumar Manoharan wrote:
> On 2020-05-04 16:49, Ben Greear wrote:
>> On 05/04/2020 04:46 PM, Rajkumar Manoharan wrote:
>>> On 2020-05-04 08:41, Markus Theil wrote:
>>>> ath10k currently reports survey results for the last interval between each
>>>> invocation of NL80211_CMD_GET_SURVEY. For concurrent invocations, this
>>>> can lead to unexpectedly small results, e.g. when hostapd uses survey
>>>> data and iw survey dump is invoked in parallel. Fix this by returning
>>>> cumulative results, that don't depend on the last invocation. Other
>>>> drivers, e.g. ath9k or mt76 also use this behavior.
>>>>
>>>> Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
>>>>
>>>
>>> IIRC this was fixed a while ago by below patch. Somehow it never landed in ath.git.
>>> Simple one line change is enough.
>>>
>>> https://patchwork.kernel.org/patch/10550707/
>>>
>>> -Rajkumar
>>
>> Have you tested this with wave-1?  Lots of older, at least, firmware
>> has brokenness in this area.
>>
> Yes. It was tested in wave-1 as well. Venkat replied to your comment on original change.

Ahh, sorry I missed that.

Hopefully no one is using the broken firmware anymore then!

--Ben
Sven Eckelmann May 5, 2020, 7:01 a.m. UTC | #7
On Tuesday, 5 May 2020 01:46:12 CEST Rajkumar Manoharan wrote:
[...]
> IIRC this was fixed a while ago by below patch. Somehow it never landed 
> in ath.git.
> Simple one line change is enough.
> 
> https://patchwork.kernel.org/patch/10550707/

Because it doesn't work for everything. Remember that 10.2.4.x overflows all 
the time (14-30s) because it used only 31 bit for the counters.

But feel free to point me to the firmware version which fixed this.

Kind regards,
	Sven
Sven Eckelmann May 5, 2020, 7:49 a.m. UTC | #8
On Tuesday, 5 May 2020 09:01:34 CEST Sven Eckelmann wrote:
> On Tuesday, 5 May 2020 01:46:12 CEST Rajkumar Manoharan wrote:
> [...]
> > IIRC this was fixed a while ago by below patch. Somehow it never landed 
> > in ath.git.
> > Simple one line change is enough.
> > 
> > https://patchwork.kernel.org/patch/10550707/
> 
> Because it doesn't work for everything. Remember that 10.2.4.x overflows all 
> the time (14-30s) because it used only 31 bit for the counters.
> 
> But feel free to point me to the firmware version which fixed this.

See also https://patchwork.kernel.org/patch/9701459/

Kind regards,
	Sven
Sven Eckelmann May 5, 2020, 7:55 a.m. UTC | #9
On Tuesday, 5 May 2020 09:49:46 CEST Sven Eckelmann wrote:
[...]
> See also https://patchwork.kernel.org/patch/9701459/

And I completely forgot about this one: https://patchwork.kernel.org/patch/10417673/

Kind regards,
	Sven
Markus Theil May 5, 2020, 3:37 p.m. UTC | #10
On 5/5/20 9:49 AM, Sven Eckelmann wrote:
> On Tuesday, 5 May 2020 09:01:34 CEST Sven Eckelmann wrote:
>> On Tuesday, 5 May 2020 01:46:12 CEST Rajkumar Manoharan wrote:
>> [...]
>>> IIRC this was fixed a while ago by below patch. Somehow it never landed 
>>> in ath.git.
>>> Simple one line change is enough.
>>>
>>> https://patchwork.kernel.org/patch/10550707/
>> Because it doesn't work for everything. Remember that 10.2.4.x overflows all 
>> the time (14-30s) because it used only 31 bit for the counters.
>>
>> But feel free to point me to the firmware version which fixed this.
> See also https://patchwork.kernel.org/patch/9701459/
>
> Kind regards,
> 	Sven

This patch already fixes the problem for me. I tested it on QCA988X hw
with firmware 10.2.4.

[   10.350919] ath10k_pci 0000:04:00.0: qca988x hw2.0 target 0x4100016c
chip_id 0x043222ff sub 0000:0000
[   10.350930] ath10k_pci 0000:04:00.0: kconfig debug 1 debugfs 1
tracing 1 dfs 0 testmode 0
[   10.351803] ath10k_pci 0000:04:00.0: firmware ver 10.2.4-1.0-00047
api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 35bd9258
[   10.385617] ath10k_pci 0000:04:00.0: board_file api 1 bmi_id N/A
crc32 bebc7c08
[   11.536818] ath10k_pci 0000:04:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
cal otp max-sta 128 raw 0 hwcrypto 1

I also did not see the 31 bit overflow after a small amount of seconds.

Survey data from wlp4s0
    frequency:            2412 MHz [in use]
    noise:                -65 dBm
    channel active time:        5370225 ms
    channel busy time:        924199 ms
    channel receive time:        140 ms
    channel transmit time:        0 ms

Kind regards,
Markus
Rajkumar Manoharan May 5, 2020, 6:32 p.m. UTC | #11
On 2020-05-05 08:37, Markus Theil wrote:
> On 5/5/20 9:49 AM, Sven Eckelmann wrote:
>> On Tuesday, 5 May 2020 09:01:34 CEST Sven Eckelmann wrote:
>>> On Tuesday, 5 May 2020 01:46:12 CEST Rajkumar Manoharan wrote:
>>> [...]
>>>> IIRC this was fixed a while ago by below patch. Somehow it never 
>>>> landed
>>>> in ath.git.
>>>> Simple one line change is enough.
>>>> 
>>>> https://patchwork.kernel.org/patch/10550707/
>>> Because it doesn't work for everything. Remember that 10.2.4.x 
>>> overflows all
>>> the time (14-30s) because it used only 31 bit for the counters.
>>> 
>>> But feel free to point me to the firmware version which fixed this.
>> See also https://patchwork.kernel.org/patch/9701459/
>> 
>> Kind regards,
>> 	Sven
> 
> This patch already fixes the problem for me. I tested it on QCA988X hw
> with firmware 10.2.4.
> 
> [   10.350919] ath10k_pci 0000:04:00.0: qca988x hw2.0 target 0x4100016c
> chip_id 0x043222ff sub 0000:0000
> [   10.350930] ath10k_pci 0000:04:00.0: kconfig debug 1 debugfs 1
> tracing 1 dfs 0 testmode 0
> [   10.351803] ath10k_pci 0000:04:00.0: firmware ver 10.2.4-1.0-00047
> api 5 features no-p2p,raw-mode,mfp,allows-mesh-bcast crc32 35bd9258
> [   10.385617] ath10k_pci 0000:04:00.0: board_file api 1 bmi_id N/A
> crc32 bebc7c08
> [   11.536818] ath10k_pci 0000:04:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
> cal otp max-sta 128 raw 0 hwcrypto 1
> 
> I also did not see the 31 bit overflow after a small amount of seconds.
> 
> Survey data from wlp4s0
>     frequency:            2412 MHz [in use]
>     noise:                -65 dBm
>     channel active time:        5370225 ms
>     channel busy time:        924199 ms
>     channel receive time:        140 ms
>     channel transmit time:        0 ms
> 
Great. Thanks for validating the patch. Venkat will post the patch on 
ToT.

-Rajkumar
Rajkumar Manoharan May 5, 2020, 6:45 p.m. UTC | #12
On 2020-05-05 00:55, Sven Eckelmann wrote:
> On Tuesday, 5 May 2020 09:49:46 CEST Sven Eckelmann wrote:
> [...]
>> See also https://patchwork.kernel.org/patch/9701459/
> 
> And I completely forgot about this one:
> https://patchwork.kernel.org/patch/10417673/
> 
_Me_too_ :) Hope this change is not needed when all drivers report
accumulated values.

-Rajkumar

Patch
diff mbox series

diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index a81a1ab2de19..451c8275d07b 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -5802,16 +5802,16 @@  static int ath10k_wmi_event_pdev_bss_chan_info(struct ath10k *ar,
 
 	survey = &ar->survey[idx];
 
-	survey->noise     = noise_floor;
-	survey->time      = div_u64(total, cc_freq_hz);
-	survey->time_busy = div_u64(busy, cc_freq_hz);
-	survey->time_rx   = div_u64(rx_bss, cc_freq_hz);
-	survey->time_tx   = div_u64(tx, cc_freq_hz);
-	survey->filled   |= (SURVEY_INFO_NOISE_DBM |
-			     SURVEY_INFO_TIME |
-			     SURVEY_INFO_TIME_BUSY |
-			     SURVEY_INFO_TIME_RX |
-			     SURVEY_INFO_TIME_TX);
+	survey->noise      = noise_floor;
+	survey->time      += div_u64(total, cc_freq_hz);
+	survey->time_busy += div_u64(busy, cc_freq_hz);
+	survey->time_rx   += div_u64(rx_bss, cc_freq_hz);
+	survey->time_tx   += div_u64(tx, cc_freq_hz);
+	survey->filled    |= (SURVEY_INFO_NOISE_DBM |
+			      SURVEY_INFO_TIME |
+			      SURVEY_INFO_TIME_BUSY |
+			      SURVEY_INFO_TIME_RX |
+			      SURVEY_INFO_TIME_TX);
 exit:
 	spin_unlock_bh(&ar->data_lock);
 	complete(&ar->bss_survey_done);