Message ID | 20211123163955.154512-1-alexandr.lobakin@intel.com (mailing list archive) |
---|---|
Headers | show |
Series | net: introduce and use generic XDP stats | expand |
On 11/23/21 9:39 AM, Alexander Lobakin wrote: > This is an almost complete rework of [0]. > > This series introduces generic XDP statistics infra based on rtnl > xstats (Ethtool standard stats previously), and wires up the drivers > which collect appropriate statistics to this new interface. Finally, > it introduces XDP/XSK statistics to all XDP-capable Intel drivers. > > Those counters are: > * packets: number of frames passed to bpf_prog_run_xdp(). > * bytes: number of bytes went through bpf_prog_run_xdp(). > * errors: number of general XDP errors, if driver has one unified > counter. > * aborted: number of XDP_ABORTED returns. > * drop: number of XDP_DROP returns. > * invalid: number of returns of unallowed values (i.e. not XDP_*). > * pass: number of XDP_PASS returns. > * redirect: number of successfully performed XDP_REDIRECT requests. > * redirect_errors: number of failed XDP_REDIRECT requests. > * tx: number of successfully performed XDP_TX requests. > * tx_errors: number of failed XDP_TX requests. > * xmit_packets: number of successfully transmitted XDP/XSK frames. > * xmit_bytes: number of successfully transmitted XDP/XSK frames. > * xmit_errors: of XDP/XSK frames failed to transmit. > * xmit_full: number of XDP/XSK queue being full at the moment of > transmission. > > To provide them, developers need to implement .ndo_get_xdp_stats() > and, if they want to expose stats on a per-channel basis, > .ndo_get_xdp_stats_nch(). include/net/xdp.h contains some helper Why the tie to a channel? There are Rx queues and Tx queues and no requirement to link them into a channel. It would be better (more flexible) to allow them to be independent. Rather than ask the driver "how many channels", ask 'how many Rx queues' and 'how many Tx queues' for which xdp stats are reported. From there, allow queue numbers or queue id's to be non-consecutive and add a queue id or number as an attribute. e.g., [XDP stats] [ Rx queue N] counters [ Tx queue N] counters This would allow a follow on patch set to do something like "Give me XDP stats for Rx queue N" instead of doing a full dump.
From: Alexander Lobakin <alexandr.lobakin@intel.com>
Date: Tue, 23 Nov 2021 17:39:29 +0100
Ok, open questions:
1. Channels vs queues vs global.
Jakub: no per-channel.
David (Ahern): it's worth it to separate as Rx/Tx.
Toke is fine with globals at the end I think?
My point was that for most of the systems we have 1:1 Rx:Tx
(usually num_online_cpus()), so asking drivers separately for
the number of RQs and then SQs would end up asking for the same
number twice.
But the main reason TBH was that most of the drivers store stats
on a per-channel basis and I didn't want them to regress in
functionality. I'm fine with reporting only netdev-wide if
everyone are.
In case if we keep per-channel: report per-channel only by request
and cumulative globals by default to not flood the output?
2. Count all errors as "drops" vs separately.
Daniel: account everything as drops, plus errors should be
reported as exceptions for tracing sub.
Jesper: we shouldn't mix drops and errors.
My point: we shouldn't, that's why there are patches for 2 drivers
to give errors a separate counter.
I provided an option either to report all errors together ('errors'
in stats structure) or to provide individual counters for each of
them (sonamed ctrs), but personally prefer detailed errors. However,
they might "go detailed" under trace_xdp_exception() only, sound
fine (OTOH in RTNL stats we have both "general" errors and detailed
error counters).
3. XDP and XSK ctrs separately or not.
My PoV is that those are two quite different worlds.
However, stats for actions on XSK really make a little sense since
99% of time we have xskmap redirect. So I think it'd be fine to just
expand stats structure with xsk_{rx,tx}_{packets,bytes} and count
the rest (actions, errors) together with XDP.
Rest:
- don't create a separate `ip` command and report under `-s`;
- save some RTNL skb space by skipping zeroed counters.
Also, regarding that I count all on the stack and then add to the
storage once in a polling cycle -- most drivers don't do that and
just increment the values in the storage directly, but this can be
less performant for frequently updated stats (or it's just my
embedded past).
Re u64 vs u64_stats_t -- the latter is more universal and
architecture-friendly, the former is used directly in most of the
drivers primarily because those drivers and the corresponding HW
are being run on 64-bit systems in the vast majority of cases, and
Ethtools stats themselves are not so critical to guard them with
anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess?
Thanks,
Al
On Tue, 30 Nov 2021 16:56:12 +0100 Alexander Lobakin wrote: > 3. XDP and XSK ctrs separately or not. > > My PoV is that those are two quite different worlds. > However, stats for actions on XSK really make a little sense since > 99% of time we have xskmap redirect. So I think it'd be fine to just > expand stats structure with xsk_{rx,tx}_{packets,bytes} and count > the rest (actions, errors) together with XDP. > > > Rest: > - don't create a separate `ip` command and report under `-s`; > - save some RTNL skb space by skipping zeroed counters. Let me ruin this point of clarity for you. I think that stats should be skipped when they are not collected (see ETHTOOL_STAT_NOT_SET). If messages get large user should use the GETSTATS call and avoid the problem more effectively. > Also, regarding that I count all on the stack and then add to the > storage once in a polling cycle -- most drivers don't do that and > just increment the values in the storage directly, but this can be > less performant for frequently updated stats (or it's just my > embedded past). > Re u64 vs u64_stats_t -- the latter is more universal and > architecture-friendly, the former is used directly in most of the > drivers primarily because those drivers and the corresponding HW > are being run on 64-bit systems in the vast majority of cases, and > Ethtools stats themselves are not so critical to guard them with > anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess?
Alexander Lobakin <alexandr.lobakin@intel.com> writes: > From: Alexander Lobakin <alexandr.lobakin@intel.com> > Date: Tue, 23 Nov 2021 17:39:29 +0100 > > Ok, open questions: > > 1. Channels vs queues vs global. > > Jakub: no per-channel. > David (Ahern): it's worth it to separate as Rx/Tx. > Toke is fine with globals at the end I think? Well, I don't like throwing data away, so in that sense I do like per-queue stats, but it's not a very strong preference (i.e., I can live with either)... > My point was that for most of the systems we have 1:1 Rx:Tx > (usually num_online_cpus()), so asking drivers separately for > the number of RQs and then SQs would end up asking for the same > number twice. > But the main reason TBH was that most of the drivers store stats > on a per-channel basis and I didn't want them to regress in > functionality. I'm fine with reporting only netdev-wide if > everyone are. > > In case if we keep per-channel: report per-channel only by request > and cumulative globals by default to not flood the output? ... however if we do go with per-channel stats I do agree that they shouldn't be in the default output. I guess netlink could still split them out and iproute2 could just sum them before display? > 2. Count all errors as "drops" vs separately. > > Daniel: account everything as drops, plus errors should be > reported as exceptions for tracing sub. > Jesper: we shouldn't mix drops and errors. > > My point: we shouldn't, that's why there are patches for 2 drivers > to give errors a separate counter. > I provided an option either to report all errors together ('errors' > in stats structure) or to provide individual counters for each of > them (sonamed ctrs), but personally prefer detailed errors. However, > they might "go detailed" under trace_xdp_exception() only, sound > fine (OTOH in RTNL stats we have both "general" errors and detailed > error counters). I agree it would be nice to have a separate error counter, but a single counter is enough when combined with the tracepoints. > 3. XDP and XSK ctrs separately or not. > > My PoV is that those are two quite different worlds. > However, stats for actions on XSK really make a little sense since > 99% of time we have xskmap redirect. So I think it'd be fine to just > expand stats structure with xsk_{rx,tx}_{packets,bytes} and count > the rest (actions, errors) together with XDP. A whole set of separate counters for XSK is certainly overkill. No strong preference as to whether they need a separate counter at all... > Rest: > - don't create a separate `ip` command and report under `-s`; > - save some RTNL skb space by skipping zeroed counters. > > Also, regarding that I count all on the stack and then add to the > storage once in a polling cycle -- most drivers don't do that and > just increment the values in the storage directly, but this can be > less performant for frequently updated stats (or it's just my > embedded past). > Re u64 vs u64_stats_t -- the latter is more universal and > architecture-friendly, the former is used directly in most of the > drivers primarily because those drivers and the corresponding HW > are being run on 64-bit systems in the vast majority of cases, and > Ethtools stats themselves are not so critical to guard them with > anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess? I'm generally a fan of correctness first, so since you're touching all the drivers anyway why I'd say go for u64_stats_t :) -Toke
From: Jakub Kicinski <kuba@kernel.org> Date: Tue, 30 Nov 2021 08:12:07 -0800 > On Tue, 30 Nov 2021 16:56:12 +0100 Alexander Lobakin wrote: > > 3. XDP and XSK ctrs separately or not. > > > > My PoV is that those are two quite different worlds. > > However, stats for actions on XSK really make a little sense since > > 99% of time we have xskmap redirect. So I think it'd be fine to just > > expand stats structure with xsk_{rx,tx}_{packets,bytes} and count > > the rest (actions, errors) together with XDP. > > > > > > Rest: > > - don't create a separate `ip` command and report under `-s`; > > - save some RTNL skb space by skipping zeroed counters. > > Let me ruin this point of clarity for you. I think that stats should > be skipped when they are not collected (see ETHTOOL_STAT_NOT_SET). > If messages get large user should use the GETSTATS call and avoid > the problem more effectively. Well, it was Dave's thought here: [0] > Another thought on this patch: with individual attributes you could save > some overhead by not sending 0 counters to userspace. e.g., define a > helper that does: I know about ETHTOOL_STAT_NOT_SET, but RTNL xstats doesn't use this, does it? GETSTATS is another thing, and I'll use it, thanks. > > > Also, regarding that I count all on the stack and then add to the > > storage once in a polling cycle -- most drivers don't do that and > > just increment the values in the storage directly, but this can be > > less performant for frequently updated stats (or it's just my > > embedded past). > > Re u64 vs u64_stats_t -- the latter is more universal and > > architecture-friendly, the former is used directly in most of the > > drivers primarily because those drivers and the corresponding HW > > are being run on 64-bit systems in the vast majority of cases, and > > Ethtools stats themselves are not so critical to guard them with > > anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess? [0] https://lore.kernel.org/netdev/a4602b15-25b1-c388-73b4-1f97f6f0e555@gmail.com/ Al
On Tue, 30 Nov 2021 17:34:54 +0100 Alexander Lobakin wrote: > > Another thought on this patch: with individual attributes you could save > > some overhead by not sending 0 counters to userspace. e.g., define a > > helper that does: > > I know about ETHTOOL_STAT_NOT_SET, but RTNL xstats doesn't use this, > does it? Not sure if you're asking me or Dave but no, to my knowledge RTNL does not use such semantics today. But the reason is mostly because there weren't many driver stats added there. Knowing if an error counter is not supported or supporter and 0 is important for monitoring. Even if XDP stats don't have a counter which may not be supported today it's not a good precedent to make IMO.
On Tue, 30 Nov 2021 17:17:24 +0100 Toke Høiland-Jørgensen wrote: > > 1. Channels vs queues vs global. > > > > Jakub: no per-channel. > > David (Ahern): it's worth it to separate as Rx/Tx. > > Toke is fine with globals at the end I think? > > Well, I don't like throwing data away, so in that sense I do like > per-queue stats, but it's not a very strong preference (i.e., I can live > with either)... We don't even have a clear definition of a queue in Linux. As I said, adding this API today without a strong user and letting drivers diverge in behavior would be a mistake.
On 11/30/21 10:04 AM, Jakub Kicinski wrote: > On Tue, 30 Nov 2021 17:34:54 +0100 Alexander Lobakin wrote: >>> Another thought on this patch: with individual attributes you could save >>> some overhead by not sending 0 counters to userspace. e.g., define a >>> helper that does: >> >> I know about ETHTOOL_STAT_NOT_SET, but RTNL xstats doesn't use this, >> does it? > > Not sure if you're asking me or Dave but no, to my knowledge RTNL does > not use such semantics today. But the reason is mostly because there > weren't many driver stats added there. Knowing if an error counter is > not supported or supporter and 0 is important for monitoring. Even if > XDP stats don't have a counter which may not be supported today it's > not a good precedent to make IMO. > Today, stats are sent as a struct so skipping stats whose value is 0 is not an option. When using individual attributes for the counters this becomes an option. Given there is no value in sending '0' why do it? Is your pushback that there should be a uapi to opt-in to this behavior?
On 11/30/21 8:56 AM, Alexander Lobakin wrote: > Rest: > - don't create a separate `ip` command and report under `-s`; Reporting XDP stats under 'ip -s' is not going to be scalable from a readability perspective. ifstat (misc/ifstat.c) has support for extended stats which is where you are adding these.
On 11/30/21 10:07 AM, Jakub Kicinski wrote: > On Tue, 30 Nov 2021 17:17:24 +0100 Toke Høiland-Jørgensen wrote: >>> 1. Channels vs queues vs global. >>> >>> Jakub: no per-channel. >>> David (Ahern): it's worth it to separate as Rx/Tx. >>> Toke is fine with globals at the end I think? >> >> Well, I don't like throwing data away, so in that sense I do like >> per-queue stats, but it's not a very strong preference (i.e., I can live >> with either)... > > We don't even have a clear definition of a queue in Linux. > The summary above says "Jakub: no per-channel", and then you have this comment about a clear definition of a queue. What is your preference here, Jakub? I think I have gotten lost in all of the coments. My request was just to not lump Rx and Tx together under a 'channel' definition as a new API. Proposals like zctap and 'queues as a first class citizen' are examples of intentions / desires to move towards Rx and Tx queues beyond what exists today.
On Tue, 30 Nov 2021 10:38:14 -0700 David Ahern wrote: > On 11/30/21 10:04 AM, Jakub Kicinski wrote: > > On Tue, 30 Nov 2021 17:34:54 +0100 Alexander Lobakin wrote: > >> I know about ETHTOOL_STAT_NOT_SET, but RTNL xstats doesn't use this, > >> does it? > > > > Not sure if you're asking me or Dave but no, to my knowledge RTNL does > > not use such semantics today. But the reason is mostly because there > > weren't many driver stats added there. Knowing if an error counter is > > not supported or supporter and 0 is important for monitoring. Even if > > XDP stats don't have a counter which may not be supported today it's > > not a good precedent to make IMO. > > Today, stats are sent as a struct so skipping stats whose value is 0 is > not an option. When using individual attributes for the counters this > becomes an option. Given there is no value in sending '0' why do it? To establish semantics of what it means that the statistic is not reported. If we need to save space we'd need an extra attr with a bitmap of "these stats were skipped because they were zero". Or conversely some way of querying supported stats. > Is your pushback that there should be a uapi to opt-in to this behavior? Not where I was going with it, but it is an option. If skipping 0s was controlled by a flag a dump without such flag set would basically serve as a way to query supported stats.
On Tue, 30 Nov 2021 10:56:26 -0700 David Ahern wrote: > On 11/30/21 10:07 AM, Jakub Kicinski wrote: > > On Tue, 30 Nov 2021 17:17:24 +0100 Toke Høiland-Jørgensen wrote: > >> Well, I don't like throwing data away, so in that sense I do like > >> per-queue stats, but it's not a very strong preference (i.e., I can live > >> with either)... > > > > We don't even have a clear definition of a queue in Linux. > > > > The summary above says "Jakub: no per-channel", and then you have this > comment about a clear definition of a queue. What is your preference > here, Jakub? I think I have gotten lost in all of the coments. I'm against per-channel and against per-queue stats. I'm not saying "do one instead of the other". Hope that makes it clear. > My request was just to not lump Rx and Tx together under a 'channel' > definition as a new API. Proposals like zctap and 'queues as a first > class citizen' are examples of intentions / desires to move towards Rx > and Tx queues beyond what exists today. Right, and when we have the objects to control those we'll hang the stats off them. Right now half of the NICs will destroy queue stats on random reconfiguration requests, others will mix the stats between queue instantiations.. mlx5 does it's shadow queue thing. It's a mess. uAPI which is not portable and not usable in production is pure burden.
On 2021-11-30 12:38, David Ahern wrote: > Today, stats are sent as a struct so skipping stats whose value is 0 is > not an option. When using individual attributes for the counters this > becomes an option. Given there is no value in sending '0' why do it? > > Is your pushback that there should be a uapi to opt-in to this behavior? A filter in the netlink request should help pick what is user-preferred. You can default to not sending zeros. cheers, jamal