Message ID | 1556283505-29539-1-git-send-email-vnaralas@codeaurora.org (mailing list archive) |
---|---|
State | Accepted |
Commit | 9d740d6380e5030f356e2811b14fe45684c793b1 |
Delegated to: | Kalle Valo |
Series | [PATCHv2] ath10k: Add wrapper function to ath10k debug |
On 4/26/19 5:58 AM, Venkateswara Naralasetty wrote:
> ath10k_dbg() is called in ath10k_process_rx() with huge set of arguments
> which is causing CPU overhead even when debug_mask is not set.
> Good improvement was observed in the receive side performance when call
> to ath10k_dbg() is avoided in the RX path.
>
> Since currently all debug messages are sent via tracing infrastructure,
> we cannot entirely avoid calling ath10k_dbg. Therefore, call to
> ath10k_dbg() is made conditional based on tracing config in the driver.
>
> Transmit performance remains unchanged with this patch; below are some
> experimental results with this patch and tracing disabled.
>
> mesh mode:
>
>             w/o this patch          with this patch
> Traffic     TP        CPU Usage     TP        CPU usage
>
> TCP         840Mbps   76.53%        960Mbps   78.14%
> UDP         1030Mbps  74.58%        1132Mbps  74.31%
>
> Infra mode:
>
>             w/o this patch          with this patch
> Traffic     TP        CPU Usage     TP        CPU usage
>
> TCP Rx      1241Mbps  80.89%        1270Mbps  73.50%
> UDP Rx      1433Mbps  81.77%        1472Mbps  72.80%
>
> Tested platform : IPQ8064
> hardware used   : QCA9984
> firmware ver    : ver 10.4-3.5.3-00057
>
> Signed-off-by: Kan Yan <kyan@chromium.org>
> Signed-off-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
> ---
> v2:
>  * changed trace enabled check from IS_ENABLED(CONFIG_ATH10K_TRACING)
>  * to trace_ath10k_log_dbg_enabled().
>
>  drivers/net/wireless/ath/ath10k/core.c  |  2 ++
>  drivers/net/wireless/ath/ath10k/debug.c |  8 ++++----
>  drivers/net/wireless/ath/ath10k/debug.h | 22 ++++++++++++++++------
>  drivers/net/wireless/ath/ath10k/trace.c |  1 +
>  drivers/net/wireless/ath/ath10k/trace.h |  6 +++++-
>  5 files changed, 28 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
> index cfd7bb2..ab709bf 100644
> --- a/drivers/net/wireless/ath/ath10k/core.c
> +++ b/drivers/net/wireless/ath/ath10k/core.c
> @@ -26,6 +26,8 @@
>  #include "coredump.h"
>
>  unsigned int ath10k_debug_mask;
> +EXPORT_SYMBOL(ath10k_debug_mask);
> +
>  static unsigned int ath10k_cryptmode_param;
>  static bool uart_print;
>  static bool skip_otp;
> diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c
> index 32d967a..1b63929 100644
> --- a/drivers/net/wireless/ath/ath10k/debug.c
> +++ b/drivers/net/wireless/ath/ath10k/debug.c
> @@ -2620,8 +2620,8 @@ void ath10k_debug_unregister(struct ath10k *ar)
>  #endif /* CONFIG_ATH10K_DEBUGFS */
>
>  #ifdef CONFIG_ATH10K_DEBUG
> -void ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
> -		const char *fmt, ...)
> +void __ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
> +		  const char *fmt, ...)
>  {
>  	struct va_format vaf;
>  	va_list args;

Do you still need the check later in this method:

if (ath10k_debug_mask & mask)

since you already checked in the ath10k_dbg() macro?

Thanks,
Ben
> -----Original Message-----
> From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Ben Greear
> Sent: Friday, April 26, 2019 6:52 PM
> To: Venkateswara Naralasetty <vnaralas@codeaurora.org>; ath10k@lists.infradead.org
> Cc: Kan Yan <kyan@chromium.org>; linux-wireless@vger.kernel.org
> Subject: [EXT] Re: [PATCHv2] ath10k: Add wrapper function to ath10k debug
>
> On 4/26/19 5:58 AM, Venkateswara Naralasetty wrote:
> > ath10k_dbg() is called in ath10k_process_rx() with huge set of
> > arguments which is causing CPU overhead even when debug_mask is not set.
> > Good improvement was observed in the receive side performance when
> > call to ath10k_dbg() is avoided in the RX path.
> >
> > Since currently all debug messages are sent via tracing
> > infrastructure, we cannot entirely avoid calling ath10k_dbg.
> > Therefore, call to
> > ath10k_dbg() is made conditional based on tracing config in the driver.
> >
> > Transmit performance remains unchanged with this patch; below are some
> > experimental results with this patch and tracing disabled.
> >
> > mesh mode:
> >
> >             w/o this patch          with this patch
> > Traffic     TP        CPU Usage     TP        CPU usage
> >
> > TCP         840Mbps   76.53%        960Mbps   78.14%
> > UDP         1030Mbps  74.58%        1132Mbps  74.31%
> >
> > Infra mode:
> >
> >             w/o this patch          with this patch
> > Traffic     TP        CPU Usage     TP        CPU usage
> >
> > TCP Rx      1241Mbps  80.89%        1270Mbps  73.50%
> > UDP Rx      1433Mbps  81.77%        1472Mbps  72.80%
> >
> > Tested platform : IPQ8064
> > hardware used   : QCA9984
> > firmware ver    : ver 10.4-3.5.3-00057
> >
> > Signed-off-by: Kan Yan <kyan@chromium.org>
> > Signed-off-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
> > ---
> > v2:
> >  * changed trace enabled check from IS_ENABLED(CONFIG_ATH10K_TRACING)
> >  * to trace_ath10k_log_dbg_enabled().
> >
> >  drivers/net/wireless/ath/ath10k/core.c  |  2 ++
> >  drivers/net/wireless/ath/ath10k/debug.c |  8 ++++----
> >  drivers/net/wireless/ath/ath10k/debug.h | 22 ++++++++++++++++------
> >  drivers/net/wireless/ath/ath10k/trace.c |  1 +
> >  drivers/net/wireless/ath/ath10k/trace.h |  6 +++++-
> >  5 files changed, 28 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
> > index cfd7bb2..ab709bf 100644
> > --- a/drivers/net/wireless/ath/ath10k/core.c
> > +++ b/drivers/net/wireless/ath/ath10k/core.c
> > @@ -26,6 +26,8 @@
> >  #include "coredump.h"
> >
> >  unsigned int ath10k_debug_mask;
> > +EXPORT_SYMBOL(ath10k_debug_mask);
> > +
> >  static unsigned int ath10k_cryptmode_param;
> >  static bool uart_print;
> >  static bool skip_otp;
> > diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c
> > index 32d967a..1b63929 100644
> > --- a/drivers/net/wireless/ath/ath10k/debug.c
> > +++ b/drivers/net/wireless/ath/ath10k/debug.c
> > @@ -2620,8 +2620,8 @@ void ath10k_debug_unregister(struct ath10k *ar)
> >  #endif /* CONFIG_ATH10K_DEBUGFS */
> >
> >  #ifdef CONFIG_ATH10K_DEBUG
> > -void ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
> > -		const char *fmt, ...)
> > +void __ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
> > +		  const char *fmt, ...)
> >  {
> >  	struct va_format vaf;
> >  	va_list args;
>
> Do you still need the check later in this method:
>
> if (ath10k_debug_mask & mask)
>
> since you already checked in the ath10k_dbg() macro?

Yes, we need this check.
Otherwise all debug messages will be printed even without any debug mask set in case of tracing enabled.

>
> Thanks,
> Ben
>
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
>
>
> _______________________________________________
> ath10k mailing list
> ath10k@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k
On Fri, 26 Apr 2019 at 14:58, Venkateswara Naralasetty
<vnaralas@codeaurora.org> wrote:
>
> ath10k_dbg() is called in ath10k_process_rx() with huge set of arguments
> which is causing CPU overhead even when debug_mask is not set.
> Good improvement was observed in the receive side performance when call
> to ath10k_dbg() is avoided in the RX path.
[...]

> +/* Avoid calling __ath10k_dbg() if debug_mask is not set and tracing
> + * disabled.
> + */
> +#define ath10k_dbg(ar, dbg_mask, fmt, ...)			\
> +do {								\
> +	if ((ath10k_debug_mask & dbg_mask) ||			\
> +	    trace_ath10k_log_dbg_enabled())			\
> +		__ath10k_dbg(ar, dbg_mask, fmt, ##__VA_ARGS__); \
> +} while (0)

Did you consider using jump labels (see include/linux/jump_label.h)?
It's what tracing uses under the hood. I wonder if you could squeeze
out a bit more performance with that? I guess you'd need to add
`struct static_key ath10k_dbg_mask_keys[ATH10K_DBG_MAX]` and re-do
ath10k_debug_mask enum a bit.

Michal
On 4/26/19 6:44 AM, Michał Kazior wrote:
> On Fri, 26 Apr 2019 at 14:58, Venkateswara Naralasetty
> <vnaralas@codeaurora.org> wrote:
>>
>> ath10k_dbg() is called in ath10k_process_rx() with huge set of arguments
>> which is causing CPU overhead even when debug_mask is not set.
>> Good improvement was observed in the receive side performance when call
>> to ath10k_dbg() is avoided in the RX path.
> [...]
>
>> +/* Avoid calling __ath10k_dbg() if debug_mask is not set and tracing
>> + * disabled.
>> + */
>> +#define ath10k_dbg(ar, dbg_mask, fmt, ...)			\
>> +do {								\
>> +	if ((ath10k_debug_mask & dbg_mask) ||			\
>> +	    trace_ath10k_log_dbg_enabled())			\
>> +		__ath10k_dbg(ar, dbg_mask, fmt, ##__VA_ARGS__); \
>> +} while (0)
>
> Did you consider using jump labels (see include/linux/jump_label.h)?
> It's what tracing uses under the hood. I wonder if you could squeeze
> out a bit more performance with that? I guess you'd need to add
> `struct static_key ath10k_dbg_mask_keys[ATH10K_DBG_MAX]` and re-do
> ath10k_debug_mask enum a bit.

Maybe first test with debugging just compiled out to see if there is still any
significant overhead with this new patch applied?

Thanks,
Ben
On 4/26/19 6:38 AM, Venkateswara Naralasetty wrote:

>>>  #ifdef CONFIG_ATH10K_DEBUG
>>> -void ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
>>> -		const char *fmt, ...)
>>> +void __ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
>>> +		  const char *fmt, ...)
>>>  {
>>>  	struct va_format vaf;
>>>  	va_list args;
>>
>> Do you still need the check later in this method:
>>
>> if (ath10k_debug_mask & mask)
>>
>> since you already checked in the ath10k_dbg() macro?
> Yes, we need this check.
> Otherwise all debug messages will be printed even without any debug mask set in case of tracing enabled.

Ahh, I see.

Thanks,
Ben
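The exchange above turns on the fact that the outer macro lets a call through when either the mask matches or tracing is on, so the inner function still has to check the mask before printing. A minimal userspace sketch of that two-level check follows. All names here (`debug_mask`, `tracing_enabled`, `inner_dbg`, `outer_dbg`, `run_case`) are hypothetical stand-ins and not the driver's symbols, and the trace point is modelled by a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for driver state; not the real ath10k symbols. */
static unsigned int debug_mask;   /* plays the role of ath10k_debug_mask */
static bool tracing_enabled;      /* plays the role of trace_ath10k_log_dbg_enabled() */
static int traced, printed;       /* counters so the behaviour can be observed */

struct counts {
	int traced;
	int printed;
};

/* Inner function: feeds the (simulated) trace point, prints only if the mask matches. */
static void inner_dbg(unsigned int mask, const char *msg)
{
	assert(msg != NULL);
	if (tracing_enabled)
		traced++;
	if (debug_mask & mask) {
		printed++;
		printf("dbg: %s\n", msg);
	}
}

/* Outer guard: skip the call entirely unless the mask matches or tracing is on. */
#define outer_dbg(mask, msg)                              \
do {                                                      \
	if ((debug_mask & (mask)) || tracing_enabled)     \
		inner_dbg(mask, msg);                     \
} while (0)

/* Emit one message under the given settings and report what happened. */
struct counts run_case(unsigned int mask_setting, bool tracing)
{
	debug_mask = mask_setting;
	tracing_enabled = tracing;
	traced = printed = 0;
	outer_dbg(0x1, "rx frame");
	return (struct counts){ traced, printed };
}
```

Calling `run_case(0, true)` models "tracing on, no debug mask": the message reaches the trace counter but is not printed, which is the case the inner check protects. Dropping the inner `if (debug_mask & mask)` in this sketch would make that case print as well.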
> -----Original Message-----
> From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Ben Greear
> Sent: Friday, April 26, 2019 7:27 PM
> To: Michał Kazior <kazikcz@gmail.com>; Venkateswara Naralasetty <vnaralas@codeaurora.org>
> Cc: Kan Yan <kyan@chromium.org>; linux-wireless <linux-wireless@vger.kernel.org>; ath10k@lists.infradead.org
> Subject: [EXT] Re: [PATCHv2] ath10k: Add wrapper function to ath10k debug
>
> On 4/26/19 6:44 AM, Michał Kazior wrote:
> > On Fri, 26 Apr 2019 at 14:58, Venkateswara Naralasetty
> > <vnaralas@codeaurora.org> wrote:
> >>
> >> ath10k_dbg() is called in ath10k_process_rx() with huge set of
> >> arguments which is causing CPU overhead even when debug_mask is not set.
> >> Good improvement was observed in the receive side performance when
> >> call to ath10k_dbg() is avoided in the RX path.
> > [...]
> >
> >> +/* Avoid calling __ath10k_dbg() if debug_mask is not set and tracing
> >> + * disabled.
> >> + */
> >> +#define ath10k_dbg(ar, dbg_mask, fmt, ...)			\
> >> +do {								\
> >> +	if ((ath10k_debug_mask & dbg_mask) ||			\
> >> +	    trace_ath10k_log_dbg_enabled())			\
> >> +		__ath10k_dbg(ar, dbg_mask, fmt, ##__VA_ARGS__); \
> >> +} while (0)
> >
> > Did you consider using jump labels (see include/linux/jump_label.h)?
> > It's what tracing uses under the hood. I wonder if you could squeeze
> > out a bit more performance with that? I guess you'd need to add
> > `struct static_key ath10k_dbg_mask_keys[ATH10K_DBG_MAX]` and re-do
> > ath10k_debug_mask enum a bit.
>
> Maybe first test with debugging just compiled out to see if there is still any
> significant overhead with this new patch applied?

Since the ath10k_dbg macro is defined outside of CONFIG_ATH10K_DEBUG, will it make any difference even if debugging is compiled out?

Thanks,
Venkatesh.

>
> Thanks,
> Ben
>
> --
> Ben Greear <greearb@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
> -----Original Message-----
> From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Michal Kazior
> Sent: Friday, April 26, 2019 7:15 PM
> To: Venkateswara Naralasetty <vnaralas@codeaurora.org>
> Cc: Kan Yan <kyan@chromium.org>; linux-wireless <linux-wireless@vger.kernel.org>; ath10k@lists.infradead.org
> Subject: [EXT] Re: [PATCHv2] ath10k: Add wrapper function to ath10k debug
>
> On Fri, 26 Apr 2019 at 14:58, Venkateswara Naralasetty
> <vnaralas@codeaurora.org> wrote:
> >
> > ath10k_dbg() is called in ath10k_process_rx() with huge set of
> > arguments which is causing CPU overhead even when debug_mask is not set.
> > Good improvement was observed in the receive side performance when
> > call to ath10k_dbg() is avoided in the RX path.
> [...]
>
> > +/* Avoid calling __ath10k_dbg() if debug_mask is not set and tracing
> > + * disabled.
> > + */
> > +#define ath10k_dbg(ar, dbg_mask, fmt, ...)			\
> > +do {								\
> > +	if ((ath10k_debug_mask & dbg_mask) ||			\
> > +	    trace_ath10k_log_dbg_enabled())			\
> > +		__ath10k_dbg(ar, dbg_mask, fmt, ##__VA_ARGS__); \
> > +} while (0)
>
> Did you consider using jump labels (see include/linux/jump_label.h)?
> It's what tracing uses under the hood. I wonder if you could squeeze out a bit
> more performance with that? I guess you'd need to add `struct static_key
> ath10k_dbg_mask_keys[ATH10K_DBG_MAX]` and re-do
> ath10k_debug_mask enum a bit.
>

I could not observe any significant throughput/CPU improvement after using jump labels.
For now, shall we go with my patch?

Thanks,
Venkatesh.

>
> Michal
Venkateswara Naralasetty <vnaralas@codeaurora.org> wrote:

> ath10k_dbg() is called in ath10k_process_rx() with huge set of arguments
> which is causing CPU overhead even when debug_mask is not set.
> Good improvement was observed in the receive side performance when call
> to ath10k_dbg() is avoided in the RX path.
>
> Since currently all debug messages are sent via tracing infrastructure,
> we cannot entirely avoid calling ath10k_dbg. Therefore, call to
> ath10k_dbg() is made conditional based on tracing config in the driver.
>
> Transmit performance remains unchanged with this patch; below are some
> experimental results with this patch and tracing disabled.
>
> mesh mode:
>
>             w/o this patch          with this patch
> Traffic     TP        CPU Usage     TP        CPU usage
>
> TCP         840Mbps   76.53%        960Mbps   78.14%
> UDP         1030Mbps  74.58%        1132Mbps  74.31%
>
> Infra mode:
>
>             w/o this patch          with this patch
> Traffic     TP        CPU Usage     TP        CPU usage
>
> TCP Rx      1241Mbps  80.89%        1270Mbps  73.50%
> UDP Rx      1433Mbps  81.77%        1472Mbps  72.80%
>
> Tested platform : IPQ8064
> hardware used   : QCA9984
> firmware ver    : ver 10.4-3.5.3-00057
>
> Signed-off-by: Kan Yan <kyan@chromium.org>
> Signed-off-by: Venkateswara Naralasetty <vnaralas@codeaurora.org>
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Patch applied to ath-next branch of ath.git, thanks.

9d740d6380e5 ath10k: Add wrapper function to ath10k debug
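For readers who want the throughput figures in the commit message as relative gains, the arithmetic can be sketched as below. The helper `gain_percent` is a hypothetical name introduced only for this illustration; the inputs are the TP values quoted in the tables, and the results come out at roughly 14% and 10% for mesh TCP and UDP, and roughly 2% and 3% for infra TCP Rx and UDP Rx.

```c
#include <assert.h>
#include <stdio.h>

/* Relative throughput gain in percent: (after - before) / before * 100. */
double gain_percent(double before_mbps, double after_mbps)
{
	assert(before_mbps > 0.0);
	return (after_mbps - before_mbps) / before_mbps * 100.0;
}

/* Print the gain for each row of the tables in the commit message. */
void print_gains(void)
{
	printf("mesh TCP:     %.2f%%\n", gain_percent(840.0, 960.0));
	printf("mesh UDP:     %.2f%%\n", gain_percent(1030.0, 1132.0));
	printf("infra TCP Rx: %.2f%%\n", gain_percent(1241.0, 1270.0));
	printf("infra UDP Rx: %.2f%%\n", gain_percent(1433.0, 1472.0));
}
```

This covers throughput only. The CPU usage columns move in different directions between the mesh and infra runs and are not folded into these numbers.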
diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index cfd7bb2..ab709bf 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -26,6 +26,8 @@
 #include "coredump.h"
 
 unsigned int ath10k_debug_mask;
+EXPORT_SYMBOL(ath10k_debug_mask);
+
 static unsigned int ath10k_cryptmode_param;
 static bool uart_print;
 static bool skip_otp;
diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c
index 32d967a..1b63929 100644
--- a/drivers/net/wireless/ath/ath10k/debug.c
+++ b/drivers/net/wireless/ath/ath10k/debug.c
@@ -2620,8 +2620,8 @@ void ath10k_debug_unregister(struct ath10k *ar)
 #endif /* CONFIG_ATH10K_DEBUGFS */
 
 #ifdef CONFIG_ATH10K_DEBUG
-void ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
-		const char *fmt, ...)
+void __ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
+		  const char *fmt, ...)
 {
 	struct va_format vaf;
 	va_list args;
@@ -2638,7 +2638,7 @@ void ath10k_dbg(struct ath10k *ar, enum ath10k_debug_mask mask,
 
 	va_end(args);
 }
-EXPORT_SYMBOL(ath10k_dbg);
+EXPORT_SYMBOL(__ath10k_dbg);
 
 void ath10k_dbg_dump(struct ath10k *ar,
 		     enum ath10k_debug_mask mask,
@@ -2651,7 +2651,7 @@ void ath10k_dbg_dump(struct ath10k *ar,
 
 	if (ath10k_debug_mask & mask) {
 		if (msg)
-			ath10k_dbg(ar, mask, "%s\n", msg);
+			__ath10k_dbg(ar, mask, "%s\n", msg);
 
 		for (ptr = buf; (ptr - buf) < len; ptr += 16) {
 			linebuflen = 0;
diff --git a/drivers/net/wireless/ath/ath10k/debug.h b/drivers/net/wireless/ath/ath10k/debug.h
index db78e85..a5b2039 100644
--- a/drivers/net/wireless/ath/ath10k/debug.h
+++ b/drivers/net/wireless/ath/ath10k/debug.h
@@ -240,18 +240,18 @@ void ath10k_sta_update_rx_tid_stats_ampdu(struct ath10k *ar,
 #endif /* CONFIG_MAC80211_DEBUGFS */
 
 #ifdef CONFIG_ATH10K_DEBUG
-__printf(3, 4) void ath10k_dbg(struct ath10k *ar,
-			       enum ath10k_debug_mask mask,
-			       const char *fmt, ...);
+__printf(3, 4) void __ath10k_dbg(struct ath10k *ar,
+				 enum ath10k_debug_mask mask,
+				 const char *fmt, ...);
 void ath10k_dbg_dump(struct ath10k *ar,
 		     enum ath10k_debug_mask mask,
 		     const char *msg, const char *prefix,
 		     const void *buf, size_t len);
 #else /* CONFIG_ATH10K_DEBUG */
 
-static inline int ath10k_dbg(struct ath10k *ar,
-			     enum ath10k_debug_mask dbg_mask,
-			     const char *fmt, ...)
+static inline int __ath10k_dbg(struct ath10k *ar,
+			       enum ath10k_debug_mask dbg_mask,
+			       const char *fmt, ...)
 {
 	return 0;
 }
@@ -263,4 +263,14 @@ static inline void ath10k_dbg_dump(struct ath10k *ar,
 {
 }
 #endif /* CONFIG_ATH10K_DEBUG */
+
+/* Avoid calling __ath10k_dbg() if debug_mask is not set and tracing
+ * disabled.
+ */
+#define ath10k_dbg(ar, dbg_mask, fmt, ...)			\
+do {								\
+	if ((ath10k_debug_mask & dbg_mask) ||			\
+	    trace_ath10k_log_dbg_enabled())			\
+		__ath10k_dbg(ar, dbg_mask, fmt, ##__VA_ARGS__); \
+} while (0)
 #endif /* _DEBUG_H_ */
diff --git a/drivers/net/wireless/ath/ath10k/trace.c b/drivers/net/wireless/ath/ath10k/trace.c
index 3ecdff1..c7d4c97 100644
--- a/drivers/net/wireless/ath/ath10k/trace.c
+++ b/drivers/net/wireless/ath/ath10k/trace.c
@@ -7,3 +7,4 @@
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
+EXPORT_SYMBOL(__tracepoint_ath10k_log_dbg);
diff --git a/drivers/net/wireless/ath/ath10k/trace.h b/drivers/net/wireless/ath/ath10k/trace.h
index ba977bb..ab91645 100644
--- a/drivers/net/wireless/ath/ath10k/trace.h
+++ b/drivers/net/wireless/ath/ath10k/trace.h
@@ -29,7 +29,11 @@ static inline u32 ath10k_frm_hdr_len(const void *buf, size_t len)
 #if !defined(CONFIG_ATH10K_TRACING)
 #undef TRACE_EVENT
 #define TRACE_EVENT(name, proto, ...) \
-static inline void trace_ ## name(proto) {}
+static inline void trace_ ## name(proto) {} \
+static inline bool trace_##name##_enabled(void) \
+{						\
+	return false;				\
+}
 #undef DECLARE_EVENT_CLASS
 #define DECLARE_EVENT_CLASS(...)
 #undef DEFINE_EVENT
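One plausible reading of why the macro wrapper helps: a C function call evaluates all of its arguments before the callee can test the mask, while a macro places the arguments inside the `if` body, so they are evaluated only when the condition holds. The userspace sketch below illustrates that difference under stated assumptions. `debug_mask`, `expensive_field`, `dbg_func` and `dbg_macro` are hypothetical names rather than driver code, and the sketch makes no claim about how much of the measured gain comes from this effect.

```c
#include <assert.h>
#include <stdio.h>

static unsigned int debug_mask;   /* hypothetical stand-in for ath10k_debug_mask */
static int evaluations;           /* how many times a log argument was computed */

/* Pretend this is a costly expression passed as a log argument. */
static int expensive_field(int v)
{
	evaluations++;
	return v * 2;
}

/* Plain function: the caller evaluates every argument before the call. */
static void dbg_func(unsigned int mask, int a, int b)
{
	if (debug_mask & mask)
		printf("rx a=%d b=%d\n", a, b);
}

/* Macro wrapper: arguments sit inside the if-body, so they are only
 * evaluated when the mask matches.
 */
#define dbg_macro(mask, a, b)                 \
do {                                          \
	if (debug_mask & (mask))              \
		dbg_func(mask, a, b);         \
} while (0)

/* Number of argument evaluations for the function form, mask clear. */
int count_evals_func(void)
{
	debug_mask = 0;
	evaluations = 0;
	dbg_func(0x1, expensive_field(1), expensive_field(2));
	return evaluations;
}

/* Number of argument evaluations for the macro form, mask clear. */
int count_evals_macro(void)
{
	debug_mask = 0;
	evaluations = 0;
	dbg_macro(0x1, expensive_field(1), expensive_field(2));
	return evaluations;
}

/* With the mask clear, the function form pays for both arguments; the macro form pays for none. */
void demo(void)
{
	assert(count_evals_func() == 2);
	assert(count_evals_macro() == 0);
}
```

As in the patch, the guarded macro still calls the real function once the condition holds, so enabled logging behaves as before. The sketch leaves out the tracing half of the condition to keep the argument-evaluation point isolated.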