Message ID | 20250207-cwnd_tracepoint-v1-1-13650f3ca96d@debian.org (mailing list archive) |
---|---|
State | Superseded |
Series | [net-next] trace: tcp: Add tracepoint for tcp_cwnd_reduction() |
On Fri, Feb 7, 2025 at 7:04 PM Breno Leitao <leitao@debian.org> wrote:
>
> Add a lightweight tracepoint to monitor TCP congestion window
> adjustments via tcp_cwnd_reduction(). This tracepoint enables tracking
> of:
> - TCP window size fluctuations
> - Active socket behavior
> - Congestion window reduction events
>
> Meta has been using BPF programs to monitor this function for years.
> Adding a proper tracepoint provides a stable API for all users who need
> to monitor TCP congestion window behavior.
>
> Use DECLARE_TRACE instead of TRACE_EVENT to avoid creating trace event
> infrastructure and exporting to tracefs, keeping the implementation
> minimal. (Thanks Steven Rostedt)
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---

I can give my +2 on this patch, although I have no way of testing it.

I will trust Steven on this :)

Reviewed-by: Eric Dumazet <edumazet@google.com>
On Tue, 11 Feb 2025 16:19:54 +0100
Eric Dumazet <edumazet@google.com> wrote:
> I can give my +2 on this patch, although I have no way of testing it.
If you want to test this, apply the below patch, enable
CONFIG_SAMPLE_TRACE_CUSTOM_EVENTS, and after you boot up, do the following:
# modprobe trace_custom_sched
# cd /sys/kernel/tracing
# echo 1 > events/custom/tcp_cwnd_reduction_tp/enable
[ do something to trigger it ]
# cat trace
-- Steve
diff --git a/samples/trace_events/trace_custom_sched.c b/samples/trace_events/trace_custom_sched.c
index dd409b704b35..35b3cfa6e91d 100644
--- a/samples/trace_events/trace_custom_sched.c
+++ b/samples/trace_events/trace_custom_sched.c
@@ -16,6 +16,7 @@
  * from the C file, and not in the custom header file.
  */
 #include <trace/events/sched.h>
+#include <trace/events/tcp.h>
 
 /* Declare CREATE_CUSTOM_TRACE_EVENTS before including custom header */
 #define CREATE_CUSTOM_TRACE_EVENTS
@@ -37,6 +38,7 @@
  */
 static void fct(struct tracepoint *tp, void *priv)
 {
+	trace_custom_event_tcp_cwnd_reduction_tp_update(tp);
 	trace_custom_event_sched_switch_update(tp);
 	trace_custom_event_sched_waking_update(tp);
 }
diff --git a/samples/trace_events/trace_custom_sched.h b/samples/trace_events/trace_custom_sched.h
index 951388334a3f..339957d692c0 100644
--- a/samples/trace_events/trace_custom_sched.h
+++ b/samples/trace_events/trace_custom_sched.h
@@ -74,6 +74,33 @@ TRACE_CUSTOM_EVENT(sched_waking,
TP_printk("pid=%d prio=%d", __entry->pid, __entry->prio)
)
+
+struct sock;
+
+TRACE_CUSTOM_EVENT(tcp_cwnd_reduction_tp,
+
+ TP_PROTO(const struct sock *sk, int newly_acked_sacked,
+ int newly_lost, int flag),
+
+ TP_ARGS(sk, newly_acked_sacked, newly_lost, flag),
+
+ TP_STRUCT__entry(
+ __field( unsigned long, sk )
+ __field( int, ack )
+ __field( int, lost )
+ __field( int, flag )
+ ),
+
+ TP_fast_assign(
+ __entry->sk = (unsigned long)sk;
+ __entry->ack = newly_acked_sacked;
+ __entry->lost = newly_lost;
+ __entry->flag = flag;
+ ),
+
+ TP_printk("sk=%lx ack=%d lost=%d flag=%d", __entry->sk,
+ __entry->ack, __entry->lost, __entry->flag)
+)
#endif
/*
* Just like the headers that create TRACE_EVENTs, the below must
On Fri, 07 Feb 2025 10:03:53 -0800 Breno Leitao wrote:
> +DECLARE_TRACE(tcp_cwnd_reduction_tp,
> +	TP_PROTO(const struct sock *sk, int newly_acked_sacked,
> +		 int newly_lost, int flag),
> +	TP_ARGS(sk, newly_acked_sacked, newly_lost, flag));

nit: I think that the ");" traditionally goes on a separate line?

regarding testing if the goal is the use in BPF perhaps you could add
a small sample/result to the commit message of using bpftrace against it?
diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index a27c4b619dffd7dcc72fffa71bf0fd5e34fe6681..d574e6151dc4f7430206f9ccefe0bf0d463aaa52 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -259,6 +259,11 @@ TRACE_EVENT(tcp_retransmit_synack,
 		  __entry->saddr_v6, __entry->daddr_v6)
 );
 
+DECLARE_TRACE(tcp_cwnd_reduction_tp,
+	TP_PROTO(const struct sock *sk, int newly_acked_sacked,
+		 int newly_lost, int flag),
+	TP_ARGS(sk, newly_acked_sacked, newly_lost, flag));
+
 #include <trace/events/net_probe_common.h>
 
 TRACE_EVENT(tcp_probe,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index eb82e01da911048b41ca380f913ef55566be79a7..1a667e67df6beacde9871a50d44e180c2943ded0 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2710,6 +2710,8 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
 	if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
 		return;
 
+	trace_tcp_cwnd_reduction_tp(sk, newly_acked_sacked, newly_lost, flag);
+
 	tp->prr_delivered += newly_acked_sacked;
 	if (delta < 0) {
 		u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
Add a lightweight tracepoint to monitor TCP congestion window
adjustments via tcp_cwnd_reduction(). This tracepoint enables tracking
of:
- TCP window size fluctuations
- Active socket behavior
- Congestion window reduction events

Meta has been using BPF programs to monitor this function for years.
Adding a proper tracepoint provides a stable API for all users who need
to monitor TCP congestion window behavior.

Use DECLARE_TRACE instead of TRACE_EVENT to avoid creating trace event
infrastructure and exporting to tracefs, keeping the implementation
minimal. (Thanks Steven Rostedt)

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Changes since RFC:
- Change from a full tracepoint to DECLARE_TRACE() as suggested by Steven
- Link to RFC: https://lore.kernel.org/r/20250120-cwnd_tracepoint-v1-1-36b0e0d643fa@debian.org
---
 include/trace/events/tcp.h | 5 +++++
 net/ipv4/tcp_input.c       | 2 ++
 2 files changed, 7 insertions(+)

---
base-commit: 09717c28b76c30b1dc8c261c855ffb2406abab2e
change-id: 20250120-cwnd_tracepoint-2e11c996a9cb

Best regards,