Message ID | 20190918124259.17804-1-sven@narfation.org (mailing list archive) |
---|---|
Headers | show |
Series | ath10k: provide survey info as accumulated data | expand |
Sven Eckelmann <sven@narfation.org> writes: > it was observed that ath9k provides accumulated survey counters but ath10k > neither provides deltas nor accumulated counters. Instead it returns > some value which was returned at some point from the firmware. > > But as it turns out, this data is not reliable. To make it more useful, > ath10k has to: > > * retrieve counters rather frequently for hardware which is known to use > firmware versions with low number counter bits (for only 14-30s) > * clean up received counter values > * accumulate counters from firmware > > A comparison of the resulting output with these fixes can be seen under > https://stats.freifunk-vogtland.net/d/ffv_node/nodeinfo?orgId=1&var-node=ac86749f4d60&fullscreen&panelId=5&from=1568782046974&to=1568807068706 > > The left side of the graph shows the output before the patches were applied > and the right side the output with the patches applied. Just as reference, an > ath9k device in the same building is > https://stats.freifunk-vogtland.net/d/ffv_node/nodeinfo?orgId=1&var-node=ac86740037e0&fullscreen&panelId=5&from=1568782046974&to=1568807068706 Thanks, this looks very good to me and I had only cosmetic comments. Please submit next version without RFC so that I can apply these.
i checked your patch on 10.4 based chipsets with 9984. the values are now looking bogus and wrong at all. busy and active time time in ms does increase in hours each second the problem seem to be that your patch is 10.2.4 only related. ath_clean_survey does not trigger on 10.4 so the values double itself each time the event raises since you add the full values and not just a delta on top Sebastian Am 18.09.2019 um 14:42 schrieb Sven Eckelmann: > From: Sven Eckelmann <seckelmann@datto.com> > > Hi, > > it was observed that ath9k provides accumulated survey counters but ath10k > neither provides deltas nor accumulated counters. Instead it returns > some value which was returned at some point from the firmware. > > But as it turns out, this data is not reliable. To make it more useful, > ath10k has to: > > * retrieve counters rather frequently for hardware which is known to use > firmware versions with low number counter bits (for only 14-30s) > * clean up received counter values > * accumulate counters from firmware > > A comparison of the resulting output with these fixes can be seen under > https://stats.freifunk-vogtland.net/d/ffv_node/nodeinfo?orgId=1&var-node=ac86749f4d60&fullscreen&panelId=5&from=1568782046974&to=1568807068706 > > The left side of the graph shows the output before the patches were applied > and the right side the output with the patches applied. Just as reference, an > ath9k device in the same building is > https://stats.freifunk-vogtland.net/d/ffv_node/nodeinfo?orgId=1&var-node=ac86740037e0&fullscreen&panelId=5&from=1568782046974&to=1568807068706 > > Kind regards, > Sven > > Sven Eckelmann (2): > ath10k: report survey info as accumulated values > ath10k: regularly fetch survey counters > > drivers/net/wireless/ath/ath10k/core.c | 8 ++++ > drivers/net/wireless/ath/ath10k/core.h | 3 ++ > drivers/net/wireless/ath/ath10k/hw.c | 13 +++-- > drivers/net/wireless/ath/ath10k/mac.c | 52 ++++++++++++++++++++ > drivers/net/wireless/ath/ath10k/mac.h | 3 ++ > drivers/net/wireless/ath/ath10k/wmi.c | 66 ++++++++++++++++++++++---- > 6 files changed, 130 insertions(+), 15 deletions(-) >
On Monday, 14 October 2019 00:15:20 CEST Sebastian Gottschall wrote: > i checked your patch on 10.4 based chipsets with 9984. the values are > now looking bogus and wrong at all. busy and active time time in ms does > increase in hours each second > the problem seem to be that your patch is 10.2.4 only related. > ath_clean_survey does not trigger on 10.4 so the values double itself > each time the event raises since you add the full values and not just a > delta on top Thanks for the feedback. So we have now a firmware 10.2.4 which is counting busy + active up and has wraparound problems. And then we have a 10.4 firmware (on QCA9888 and QCA4019) which is clearing everything as expected with WMI_BSS_SURVEY_REQ_TYPE_READ_CLEAR and then we have some 10.4 firmware (one QCA9984) which behaves more like ath 10.2.4 firmware but is marked as ATH10K_HW_CC_WRAP_SHIFTED_EACH like the QCA4019. So I have no idea how to fix this when QCA4019 and QCA9984 are currently marked the same but behave differently. Does somebody have a overview how the different HW versions should behave or is there some special bit in the data reported by the firmware which can be used to evaluate the expected behavior? Kind regards, Sven
Sven Eckelmann <sven@narfation.org> writes: > On Monday, 14 October 2019 00:15:20 CEST Sebastian Gottschall wrote: >> i checked your patch on 10.4 based chipsets with 9984. the values are >> now looking bogus and wrong at all. busy and active time time in ms does >> increase in hours each second >> the problem seem to be that your patch is 10.2.4 only related. >> ath_clean_survey does not trigger on 10.4 so the values double itself >> each time the event raises since you add the full values and not just a >> delta on top > > Thanks for the feedback. So we have now a firmware 10.2.4 which is counting > busy + active up and has wraparound problems. And then we have a 10.4 firmware > (on QCA9888 and QCA4019) which is clearing everything as expected with > WMI_BSS_SURVEY_REQ_TYPE_READ_CLEAR and then we have some 10.4 firmware (one > QCA9984) which behaves more like ath 10.2.4 firmware but is marked as > ATH10K_HW_CC_WRAP_SHIFTED_EACH like the QCA4019. > > So I have no idea how to fix this when QCA4019 and QCA9984 are currently > marked the same but behave differently. Does somebody have a overview how the > different HW versions should behave or is there some special bit in the data > reported by the firmware which can be used to evaluate the expected behavior? I hope there's an easy way to detect this behaviour change, but if nothing else we could add a new bit to enum ath10k_fw_features. But of course that's the last resort, maintaining the firmware features bitfield accross different firmware branches is quite cumbersome.
10.4 has additional 64 bit cycle counters. see wmi_pdev_bss_chan_info_event the standard 32 cycle counters do individual wrap around as far as i know Am 14.10.2019 um 09:07 schrieb Sven Eckelmann: > On Monday, 14 October 2019 00:15:20 CEST Sebastian Gottschall wrote: >> i checked your patch on 10.4 based chipsets with 9984. the values are >> now looking bogus and wrong at all. busy and active time time in ms does >> increase in hours each second >> the problem seem to be that your patch is 10.2.4 only related. >> ath_clean_survey does not trigger on 10.4 so the values double itself >> each time the event raises since you add the full values and not just a >> delta on top > Thanks for the feedback. So we have now a firmware 10.2.4 which is > counting > busy + active up and has wraparound problems. And then we have a 10.4 > firmware > (on QCA9888 and QCA4019) which is clearing everything as expected with > WMI_BSS_SURVEY_REQ_TYPE_READ_CLEAR and then we have some 10.4 > firmware (one > QCA9984) which behaves more like ath 10.2.4 firmware but is marked as > ATH10K_HW_CC_WRAP_SHIFTED_EACH like the QCA4019. > > So I have no idea how to fix this when QCA4019 and QCA9984 are currently > marked the same but behave differently. Does somebody have a overview > how the > different HW versions should behave or is there some special bit in > the data > reported by the firmware which can be used to evaluate the expected > behavior? > > Kind regards, > Sven
From: Sven Eckelmann <seckelmann@datto.com> Hi, it was observed that ath9k provides accumulated survey counters but ath10k neither provides deltas nor accumulated counters. Instead it returns some value which was returned at some point from the firmware. But as it turns out, this data is not reliable. To make it more useful, ath10k has to: * retrieve counters rather frequently for hardware which is known to use firmware versions with low number counter bits (for only 14-30s) * clean up received counter values * accumulate counters from firmware A comparison of the resulting output with these fixes can be seen under https://stats.freifunk-vogtland.net/d/ffv_node/nodeinfo?orgId=1&var-node=ac86749f4d60&fullscreen&panelId=5&from=1568782046974&to=1568807068706 The left side of the graph shows the output before the patches were applied and the right side the output with the patches applied. Just as reference, an ath9k device in the same building is https://stats.freifunk-vogtland.net/d/ffv_node/nodeinfo?orgId=1&var-node=ac86740037e0&fullscreen&panelId=5&from=1568782046974&to=1568807068706 Kind regards, Sven Sven Eckelmann (2): ath10k: report survey info as accumulated values ath10k: regularly fetch survey counters drivers/net/wireless/ath/ath10k/core.c | 8 ++++ drivers/net/wireless/ath/ath10k/core.h | 3 ++ drivers/net/wireless/ath/ath10k/hw.c | 13 +++-- drivers/net/wireless/ath/ath10k/mac.c | 52 ++++++++++++++++++++ drivers/net/wireless/ath/ath10k/mac.h | 3 ++ drivers/net/wireless/ath/ath10k/wmi.c | 66 ++++++++++++++++++++++---- 6 files changed, 130 insertions(+), 15 deletions(-)