Message ID | 20230311151756.83302-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net-next] net-sysfs: display two backlog queue len separately |
On Sat, Mar 11, 2023 at 11:17:56PM +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Sometimes we need to know which one of backlog queue can be exactly
> long enough to cause some latency when debugging this part is needed.
> Thus, we can then separate the display of both.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
>  net/core/net-procfs.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
> index 1ec23bf8b05c..97a304e1957a 100644
> --- a/net/core/net-procfs.c
> +++ b/net/core/net-procfs.c
> @@ -115,10 +115,14 @@ static int dev_seq_show(struct seq_file *seq, void *v)
>      return 0;
>  }
>
> -static u32 softnet_backlog_len(struct softnet_data *sd)
> +static u32 softnet_input_pkt_queue_len(struct softnet_data *sd)
>  {
> -    return skb_queue_len_lockless(&sd->input_pkt_queue) +
> -           skb_queue_len_lockless(&sd->process_queue);
> +    return skb_queue_len_lockless(&sd->input_pkt_queue);
> +}
> +
> +static u32 softnet_process_queue_len(struct softnet_data *sd)
> +{
> +    return skb_queue_len_lockless(&sd->process_queue);
>  }
>
>  static struct softnet_data *softnet_get_online(loff_t *pos)
> @@ -169,12 +173,15 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
>       * mapping the data a specific CPU
>       */
>      seq_printf(seq,
> -               "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x\n",
> +               "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
> +               "%08x %08x\n",
>                 sd->processed, sd->dropped, sd->time_squeeze, 0,
>                 0, 0, 0, 0, /* was fastroute */
>                 0, /* was cpu_collision */
>                 sd->received_rps, flow_limit_count,
> -               softnet_backlog_len(sd), (int)seq->index);
> +               0, /* was len of two backlog queues */
> +               (int)seq->index,

nit: I think you could avoid this cast by using %llx as the format specifier.

> +               softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
>      return 0;
>  }
>
> --
> 2.37.3
>

On Mon, Mar 13, 2023 at 3:02 AM Simon Horman <simon.horman@corigine.com> wrote:
>
> On Sat, Mar 11, 2023 at 11:17:56PM +0800, Jason Xing wrote:
[...]
> >                 sd->received_rps, flow_limit_count,
> > -               softnet_backlog_len(sd), (int)seq->index);
> > +               0, /* was len of two backlog queues */
> > +               (int)seq->index,
>
> nit: I think you could avoid this cast by using %llx as the format specifier.

I'm not sure if I should change this format, since the line above was
introduced in commit 7d58e6555870d ('net-sysfs: add backlog len and
CPU id to softnet data').
The seq->index here indicates which CPU the line refers to, so it can be
displayed in 'int' format. Meanwhile, %08x output is much
cleaner when the user runs 'cat /proc/net/softnet_stat'.

What do you think about this?

Thanks,
Jason

On Mon, 13 Mar 2023 09:55:37 +0800 Jason Xing <kerneljasonxing@gmail.com> wrote:

> On Mon, Mar 13, 2023 at 3:02 AM Simon Horman <simon.horman@corigine.com> wrote:
[...]
> > nit: I think you could avoid this cast by using %llx as the format specifier.
>
> I'm not sure if I should change this format, since the line above was
> introduced in commit 7d58e6555870d ('net-sysfs: add backlog len and
> CPU id to softnet data').
> The seq->index here indicates which CPU the line refers to, so it can be
> displayed in 'int' format. Meanwhile, %08x output is much
> cleaner when the user runs 'cat /proc/net/softnet_stat'.
>
> What do you think about this?
>
> Thanks,
> Jason

I consider softnet_data a legacy API (ie don't change).
Why not add to real sysfs for network device with the one value per file?

On Mon, Mar 13, 2023 at 10:28 AM Stephen Hemminger <stephen@networkplumber.org> wrote:
[...]
> I consider softnet_data a legacy API (ie don't change).

Yeah, people seldom touch this file these days.

> Why not add to real sysfs for network device with the one value per file?

Thanks for your advice. It's worth thinking about what kind of output
could replace the softnet_stat file, because this file includes almost
everything, which makes it more user-friendly. I'm a little bit confused
about whether we can use a completely new way to replace the legacy file.

Do other maintainers have opinions on this?

Thanks,
Jason

On Mon, Mar 13, 2023 at 10:28 AM Stephen Hemminger <stephen@networkplumber.org> wrote:
[...]
> I consider softnet_data a legacy API (ie don't change).
[...]
> Why not add to real sysfs for network device with the one value per file?

Well, I suspect the approach you suggested is not a good fit, because
this structure is per-CPU, which means we would have
'num_cpus' * 'number of members to print' files. That's too many, I think.

/proc/net/softnet_stat is still a good choice :)

I need more advice on this.

Thanks,
Jason

On Mon, Mar 13, 2023 at 09:55:37AM +0800, Jason Xing wrote:
> On Mon, Mar 13, 2023 at 3:02 AM Simon Horman <simon.horman@corigine.com> wrote:
[...]
> > nit: I think you could avoid this cast by using %llx as the format specifier.
>
> I'm not sure if I should change this format, since the line above was
> introduced in commit 7d58e6555870d ('net-sysfs: add backlog len and
> CPU id to softnet data').
> The seq->index here indicates which CPU the line refers to, so it can be
> displayed in 'int' format. Meanwhile, %08x output is much
> cleaner when the user runs 'cat /proc/net/softnet_stat'.
>
> What do you think about this?

I think %08llx might be a good way to go.
But perhaps I'm missing something wrt changing user-facing output.

In any case, this is more a suggestion than a request for a change.

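The two formatting options discussed above can be compared in a small userspace sketch. It assumes that `seq->index` is a 64-bit signed value (a `loff_t` in the kernel, modelled here as `long long`); `mock_loff_t`, `format_index_cast()` and `format_index_ll()` are hypothetical names used only for illustration, and `snprintf()` stands in for `seq_printf()`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace stand-in: in the kernel, seq->index is a loff_t, which is
 * assumed here to be a 64-bit signed integer (long long).
 */
typedef long long mock_loff_t;

/* Variant used by the patch: narrow to int and print with %08x. */
static int format_index_cast(char *buf, size_t len, mock_loff_t index)
{
	return snprintf(buf, len, "%08x", (int)index);
}

/* Variant suggested in review: print the 64-bit value directly. */
static int format_index_ll(char *buf, size_t len, mock_loff_t index)
{
	return snprintf(buf, len, "%08llx", index);
}
```

For a small, non-negative CPU index both variants produce the same eight-digit column, so in this sketch the choice only affects whether the cast is needed, not what a reader of the file sees.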
On Mon, Mar 13, 2023 at 8:11 PM Simon Horman <simon.horman@corigine.com> wrote:
[...]
> I think %08llx might be a good way to go.
> But perhaps I'm missing something wrt changing user-facing output.
>
> In any case, this is more a suggestion than a request for a change.

Ah, now I see. Thanks again for your review and suggestion :)

Thanks,
Jason

On Sat, Mar 11, 2023 at 7:18 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
[...]
>                 sd->received_rps, flow_limit_count,
> -               softnet_backlog_len(sd), (int)seq->index);
> +               0, /* was len of two backlog queues */

You can not pretend the sum is zero, some user space tools out there
would be fooled.

> +               (int)seq->index,
> +               softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
>      return 0;
>  }
>
> --
> 2.37.3
>

In general I would prefer we no longer change this file.

Perhaps add a tracepoint instead?

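To illustrate the concern about existing tools, here is a minimal userspace sketch of a parser for one line of /proc/net/softnet_stat. It assumes the 13-column layout shown in the diff, where the backlog length is the twelfth column (index 11) and the CPU id the thirteenth (index 12); `parse_softnet_line()` and `SOFTNET_BACKLOG_FIELD` are hypothetical names, not taken from any real tool.

```c
#include <assert.h>
#include <stdlib.h>

/* Column index of the backlog length in the pre-patch layout (assumption
 * derived from the seq_printf() argument order in the diff). */
#define SOFTNET_BACKLOG_FIELD 11

/*
 * Parse whitespace-separated hexadecimal columns from one line.
 * Stores up to max_fields values and returns how many were parsed.
 */
static int parse_softnet_line(const char *line, unsigned int *fields,
			      int max_fields)
{
	const char *p = line;
	int n = 0;

	while (n < max_fields) {
		char *end;
		unsigned long v = strtoul(p, &end, 16);

		if (end == p)	/* no further number on this line */
			break;
		fields[n++] = (unsigned int)v;
		p = end;
	}
	return n;
}
```

A tool written this way keeps reading column 11 after the patch, where it would now always see 0, while the real queue lengths would sit in columns 13 and 14 that it never looks at.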
On Mon, Mar 13, 2023 at 8:34 PM Eric Dumazet <edumazet@google.com> wrote:
[...]
> > +               0, /* was len of two backlog queues */
>
> You can not pretend the sum is zero, some user space tools out there
> would be fooled.
[...]
> In general I would prefer we no longer change this file.

Fine. From now on, let this legacy file be part of history.

> Perhaps add a tracepoint instead?

Thanks, Eric. That's a good idea. It seems acceptable if we only need
to trace the two separate backlog queues at the point where they can
hit the limit, say, in enqueue_to_backlog().

Similarly, I plan to write two more tracepoints, for time_squeeze
and for the budget_squeeze counter that I introduced to distinguish it
from time_squeeze, as the link below shows:
https://lore.kernel.org/lkml/CAL+tcoAwodpnE2NjMLPhBbmHUvmKMgSykqx0EQ4YZaQHjrx0Hw@mail.gmail.com/
For that change, any suggestions are very welcome :)

Thanks,
Jason

On Mon, Mar 13, 2023 at 6:16 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
[...]
> > Perhaps add a tracepoint instead?
>
> Thanks, Eric. That's a good idea. It seems acceptable if we only need
> to trace the two separate backlog queues at the point where they can
> hit the limit, say, in enqueue_to_backlog().

Note that enqueue_to_backlog() already uses a specific kfree_skb_reason() reason
(SKB_DROP_REASON_CPU_BACKLOG) so existing infrastructure should work just fine.

> Similarly, I plan to write two more tracepoints, for time_squeeze
> and for the budget_squeeze counter that I introduced to distinguish it
> from time_squeeze, as the link below shows:
> https://lore.kernel.org/lkml/CAL+tcoAwodpnE2NjMLPhBbmHUvmKMgSykqx0EQ4YZaQHjrx0Hw@mail.gmail.com/
> For that change, any suggestions are very welcome :)

For your workloads to hit these limits enough for you to be worried,
it looks like you are not using any scaling stuff documented
in Documentation/networking/scaling.rst

On Mon, Mar 13, 2023 at 11:59 PM Eric Dumazet <edumazet@google.com> wrote:
[...]
> Note that enqueue_to_backlog() already uses a specific kfree_skb_reason() reason
> (SKB_DROP_REASON_CPU_BACKLOG) so existing infrastructure should work just fine.

Sure, I noticed that. It traces all the kfree_skb paths, not only
softnet_data. If that isn't the right fit, where would you recommend
putting the trace function? Now I'm thinking of resorting to the legacy
file we discussed above :(

[...]
> For your workloads to hit these limits enough for you to be worried,
> it looks like you are not using any scaling stuff documented
> in Documentation/networking/scaling.rst

Thanks for the guidance. Scaling really is a good way to go. But I
would just like to separate these two kinds of limits so that I can
watch them closely. Often we cannot tell which one should be adjusted.
Time squeeze alone may not make that clear, and we cannot just write a
larger number into both proc files, which may do harm to some external
customers, unless we can show them some proof.

Maybe I got something wrong. If adding some tracepoints for those
limits in softnet_data is not elegant, please enlighten me :)

Thanks,
Jason


On Mon, Mar 13, 2023 at 10:16 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> Thanks for the guidance. Scaling is a good way to go really. But I
> just would like to separate these two kinds of limits to watch them
> closely. More often we cannot decide to adjust accurately which one
> should be adjusted. Time squeeze may not be clear and we cannot
> randomly write a larger number into both proc files which may do harm
> to some external customers unless we can show some proof to them.
>
> Maybe I got something wrong. If adding some tracepoints for those
> limits in softnet_data is not elegant, please enlighten me :)
>

I dunno, but it really looks like you are re-discovering things that
we dealt with about 10 years ago.

I wonder why new ways of tracing stuff are needed nowadays, while ~10
years ago nothing
officially put and maintained forever in the kernel was needed.

On Tue, Mar 14, 2023 at 1:34 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Mar 13, 2023 at 10:16 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
>
> > Thanks for the guidance. Scaling is a good way to go really. But I
> > just would like to separate these two kinds of limits to watch them
> > closely. More often we cannot decide to adjust accurately which one
> > should be adjusted. Time squeeze may not be clear and we cannot
> > randomly write a larger number into both proc files which may do harm
> > to some external customers unless we can show some proof to them.
> >
> > Maybe I got something wrong. If adding some tracepoints for those
> > limits in softnet_data is not elegant, please enlighten me :)
> >
>
[...]
> I dunno, but it really looks like you are re-discovering things that
> we dealt with about 10 years ago.
>
> I wonder why new ways of tracing stuff are needed nowadays, while ~10
> years ago nothing
> officially put and maintained forever in the kernel was needed.

Well, that's not my original intention. All I want to do is show more
important members in softnet_data to help users know more about this
part and decide which one to tune.

I think what you said (which is "You can not pretend the sum is zero,
some user space tools out there would be fooled.") is quite right, I
can keep this softnet_backlog_len() untouched as the old days.

Thanks,
Jason

diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index 1ec23bf8b05c..97a304e1957a 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -115,10 +115,14 @@ static int dev_seq_show(struct seq_file *seq, void *v)
 	return 0;
 }
 
-static u32 softnet_backlog_len(struct softnet_data *sd)
+static u32 softnet_input_pkt_queue_len(struct softnet_data *sd)
 {
-	return skb_queue_len_lockless(&sd->input_pkt_queue) +
-	       skb_queue_len_lockless(&sd->process_queue);
+	return skb_queue_len_lockless(&sd->input_pkt_queue);
+}
+
+static u32 softnet_process_queue_len(struct softnet_data *sd)
+{
+	return skb_queue_len_lockless(&sd->process_queue);
 }
 
 static struct softnet_data *softnet_get_online(loff_t *pos)
@@ -169,12 +173,15 @@ static int softnet_seq_show(struct seq_file *seq, void *v)
 	 * mapping the data a specific CPU
 	 */
 	seq_printf(seq,
-		   "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x\n",
+		   "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x "
+		   "%08x %08x\n",
 		   sd->processed, sd->dropped, sd->time_squeeze, 0,
 		   0, 0, 0, 0, /* was fastroute */
 		   0,	/* was cpu_collision */
 		   sd->received_rps, flow_limit_count,
-		   softnet_backlog_len(sd), (int)seq->index);
+		   0,	/* was len of two backlog queues */
+		   (int)seq->index,
+		   softnet_input_pkt_queue_len(sd), softnet_process_queue_len(sd));
 	return 0;
 }
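
To make the review concern in this thread concrete ("You can not pretend the sum is zero, some user space tools out there would be fooled"), here is a rough userspace sketch of how a tool might read one line of the output produced by softnet_seq_show(). The column positions are taken only from the seq_printf() call in the diff above; the struct softnet_row type and the softnet_parse_row() helper are hypothetical names invented for this illustration and do not correspond to any existing tool.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SOFTNET_MAX_FIELDS 16

/* Hypothetical container for the columns discussed in this thread. */
struct softnet_row {
	uint32_t processed;        /* column 1 */
	uint32_t dropped;          /* column 2 */
	uint32_t time_squeeze;     /* column 3 */
	uint32_t backlog_len;      /* column 12: sum of both queues before the patch */
	uint32_t cpu;              /* column 13: seq->index */
	int has_split;             /* 1 if columns 14 and 15 were present */
	uint32_t input_pkt_queue;  /* column 14, only with the proposed patch */
	uint32_t process_queue;    /* column 15, only with the proposed patch */
};

/*
 * Parse one whitespace-separated line of hex fields.
 * Returns the number of fields read, or -1 if fewer than 13 were found.
 */
static int softnet_parse_row(const char *line, struct softnet_row *row)
{
	uint32_t f[SOFTNET_MAX_FIELDS];
	const char *p = line;
	int n = 0;

	while (n < SOFTNET_MAX_FIELDS) {
		char *end;
		unsigned long v = strtoul(p, &end, 16);

		if (end == p)   /* no more hex digits to convert */
			break;
		f[n++] = (uint32_t)v;
		p = end;
	}
	if (n < 13)
		return -1;

	memset(row, 0, sizeof(*row));
	row->processed = f[0];
	row->dropped = f[1];
	row->time_squeeze = f[2];
	row->backlog_len = f[11];
	row->cpu = f[12];
	if (n >= 15) {
		row->has_split = 1;
		row->input_pkt_queue = f[13];
		row->process_queue = f[14];
	}
	return n;
}
```

A tool written against the 13-column layout reads column 12 as the backlog length and never looks at columns 14 and 15. With the patch as posted it would report an empty backlog even when both queues hold packets, which is the breakage pointed out in review. Keeping softnet_backlog_len() in column 12 avoids that.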