Message ID | 04b7b9d0b480158eb3ab4366ec80aa2ab7e41fcb.1725031794.git.gnault@redhat.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 4963d2343af81f493519f9c3ea9f2169eaa7353a |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] bareudp: Fix device stats updates. | expand |
Guillaume Nault wrote: > Bareudp devices update their stats concurrently. > Therefore they need proper atomic increments. > > Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") > Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com>
On Fri, 30 Aug 2024 17:31:07 +0200 Guillaume Nault wrote: > Bareudp devices update their stats concurrently. > Therefore they need proper atomic increments. The driver already uses struct pcpu_sw_netstats, would it make sense to bump it up to struct pcpu_dstats and have per CPU rx drops as well?
On Tue, Sep 03, 2024 at 11:34:02AM -0700, Jakub Kicinski wrote: > On Fri, 30 Aug 2024 17:31:07 +0200 Guillaume Nault wrote: > > Bareudp devices update their stats concurrently. > > Therefore they need proper atomic increments. > > The driver already uses struct pcpu_sw_netstats, would it make sense to > bump it up to struct pcpu_dstats and have per CPU rx drops as well? Long term, I was considering moving bareudp to use dev->tstats for packets/bytes and dev->core_stats for drops. It looks like dev->dstats is only used for VRF, so I didn't consider it. Should we favour dev->dstats for tunnels instead of combining ->tstats and ->core_stats? (vxlan uses the later for example).
On Wed, 4 Sep 2024 14:29:44 +0200 Guillaume Nault wrote: > > The driver already uses struct pcpu_sw_netstats, would it make sense to > > bump it up to struct pcpu_dstats and have per CPU rx drops as well? > > Long term, I was considering moving bareudp to use dev->tstats for > packets/bytes and dev->core_stats for drops. It looks like dev->dstats > is only used for VRF, so I didn't consider it. Right, d stands for dummy so I guess they also were used by dummy at some stage? Mostly I think it's a matter of the other stats being less recent. > Should we favour dev->dstats for tunnels instead of combining ->tstats > and ->core_stats? (vxlan uses the later for example). Seems reasonable to me. Not important enough to convert existing drivers, maybe, unless someone sees contention. But in new code, or if we're touching the relevant lines I reckon we should consider it? No strong feelings tho, LMK if you want to send v2 or keep this patch as is.
[Adding David Ahern for the vrf/dstats discussion] On Wed, Sep 04, 2024 at 07:57:32AM -0700, Jakub Kicinski wrote: > On Wed, 4 Sep 2024 14:29:44 +0200 Guillaume Nault wrote: > > > The driver already uses struct pcpu_sw_netstats, would it make sense to > > > bump it up to struct pcpu_dstats and have per CPU rx drops as well? > > > > Long term, I was considering moving bareudp to use dev->tstats for > > packets/bytes and dev->core_stats for drops. It looks like dev->dstats > > is only used for VRF, so I didn't consider it. > > Right, d stands for dummy so I guess they also were used by dummy > at some stage? Mostly I think it's a matter of the other stats being > less recent. Looks like dummy had its own dstats, yes. But those dstats were really like the current lstats (packets and bytes counters, nothing for drops). Dummy was later converted to lstats by commit 4a43b1f96b1d ("net: dummy: use standard dev_lstats_add() and dev_lstats_read()"). The dstats we have now really come from vrf (different counters for tx and rx and counters for packet drops), which had its own implementation at that time. My understanding is that vrf implemented its own dstats in order to have per-cpu counters for regular bytes/packets counters and also for packet drops. But when vrf's dstats got moved to the core (commits 79e0c5be8c73 ("net, vrf: Move dstats structure to core") and 34d21de99cea ("net: Move {l,t,d}stats allocation to core and convert veth & vrf")), the networking core had caught up and had also gained support for pcpu drop counters (commit 625788b58445 ("net: add per-cpu storage and net->core_stats")). In this context, I feel that dstats is now just a mix of tstats and core_stats. > > Should we favour dev->dstats for tunnels instead of combining ->tstats > > and ->core_stats? (vxlan uses the later for example). > > Seems reasonable to me. Not important enough to convert existing > drivers, maybe, unless someone sees contention. But in new code, > or if we're touching the relevant lines I reckon we should consider it? Given that we now have pcpu stats for packet drops anyway, what does dstats bring compared to tstats? Shouldn't we go the other way around and convert vrf to tstats and core_stats? Then we could drop dstats entirely. Back to bareudp, for the moment, I'd prefer to convert it to tstats rather than dstats. The reason is that vxlan (and geneve to a lesser extent) use tstats and I'd like to ease potential future code consolidation between those three modules. > No strong feelings tho, LMK if you want to send v2 or keep this patch > as is. I'd prefer to have this patch merged as is in -net. I have other patches pending that have to update stats and I'd like to do that correctly (that is, in a non-racy way) and consistently with existing code. I feel that converting bareudp to either tstats or dstats is something for net-next. After -net will merge into net-next, I'll can convert bareudp to either dstats or tstats, depending on the outcome of this conversation.
On Wed, 4 Sep 2024 19:54:40 +0200 Guillaume Nault wrote: > In this context, I feel that dstats is now just a mix of tstats and > core_stats. I don't know the full background but: * @core_stats: core networking counters, * do not use this in drivers bareudp is a driver. > After -net will merge into net-next, I'll can convert bareudp to either > dstats or tstats, depending on the outcome of this conversation. Sure.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Fri, 30 Aug 2024 17:31:07 +0200 you wrote: > Bareudp devices update their stats concurrently. > Therefore they need proper atomic increments. > > Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") > Signed-off-by: Guillaume Nault <gnault@redhat.com> > --- > drivers/net/bareudp.c | 22 +++++++++++----------- > 1 file changed, 11 insertions(+), 11 deletions(-) Here is the summary with links: - [net] bareudp: Fix device stats updates. https://git.kernel.org/netdev/net/c/4963d2343af8 You are awesome, thank you!
On 9/4/24 11:54 AM, Guillaume Nault wrote: > [Adding David Ahern for the vrf/dstats discussion] > > On Wed, Sep 04, 2024 at 07:57:32AM -0700, Jakub Kicinski wrote: >> On Wed, 4 Sep 2024 14:29:44 +0200 Guillaume Nault wrote: >>>> The driver already uses struct pcpu_sw_netstats, would it make sense to >>>> bump it up to struct pcpu_dstats and have per CPU rx drops as well? >>> >>> Long term, I was considering moving bareudp to use dev->tstats for >>> packets/bytes and dev->core_stats for drops. It looks like dev->dstats >>> is only used for VRF, so I didn't consider it. >> >> Right, d stands for dummy so I guess they also were used by dummy >> at some stage? Mostly I think it's a matter of the other stats being >> less recent. > > Looks like dummy had its own dstats, yes. But those dstats were really > like the current lstats (packets and bytes counters, nothing for > drops). Dummy was later converted to lstats by commit 4a43b1f96b1d > ("net: dummy: use standard dev_lstats_add() and dev_lstats_read()"). > > The dstats we have now really come from vrf (different counters for tx > and rx and counters for packet drops), which had its own implementation > at that time. > > My understanding is that vrf implemented its own dstats in order to > have per-cpu counters for regular bytes/packets counters and also for > packet drops. VRF was following other per-cpu counters that existed in 2015-2016 timeframe. I have no preference on the naming; just wanted per-cpu counters.
On Wed, Sep 04, 2024 at 07:50:55PM -0600, David Ahern wrote: > On 9/4/24 11:54 AM, Guillaume Nault wrote: > > [Adding David Ahern for the vrf/dstats discussion] > > > > On Wed, Sep 04, 2024 at 07:57:32AM -0700, Jakub Kicinski wrote: > >> On Wed, 4 Sep 2024 14:29:44 +0200 Guillaume Nault wrote: > >>>> The driver already uses struct pcpu_sw_netstats, would it make sense to > >>>> bump it up to struct pcpu_dstats and have per CPU rx drops as well? > >>> > >>> Long term, I was considering moving bareudp to use dev->tstats for > >>> packets/bytes and dev->core_stats for drops. It looks like dev->dstats > >>> is only used for VRF, so I didn't consider it. > >> > >> Right, d stands for dummy so I guess they also were used by dummy > >> at some stage? Mostly I think it's a matter of the other stats being > >> less recent. > > > > Looks like dummy had its own dstats, yes. But those dstats were really > > like the current lstats (packets and bytes counters, nothing for > > drops). Dummy was later converted to lstats by commit 4a43b1f96b1d > > ("net: dummy: use standard dev_lstats_add() and dev_lstats_read()"). > > > > The dstats we have now really come from vrf (different counters for tx > > and rx and counters for packet drops), which had its own implementation > > at that time. > > > > My understanding is that vrf implemented its own dstats in order to > > have per-cpu counters for regular bytes/packets counters and also for > > packet drops. > > VRF was following other per-cpu counters that existed in 2015-2016 > timeframe. Thanks. That was my impression as well. > I have no preference on the naming; just wanted per-cpu counters. >
On Wed, Sep 04, 2024 at 02:48:39PM -0700, Jakub Kicinski wrote: > On Wed, 4 Sep 2024 19:54:40 +0200 Guillaume Nault wrote: > > In this context, I feel that dstats is now just a mix of tstats and > > core_stats. > > I don't know the full background but: > > * @core_stats: core networking counters, > * do not use this in drivers Hum, I didn't realise that :/. I'd really like to understand why drivers shouldn't use core_stats. I mean, what makes driver and core networking counters so different that they need to be handled in two different ways (but finally merged together when exporting stats to user space)? Does that prevent any contention on the counters or optimise cache line access? I can't see how, so I'm probably missing something important here. > bareudp is a driver. > > > After -net will merge into net-next, I'll can convert bareudp to either > > dstats or tstats, depending on the outcome of this conversation. > > Sure. >
On Fri, Sep 6, 2024 at 12:42 PM Guillaume Nault <gnault@redhat.com> wrote: > > On Wed, Sep 04, 2024 at 02:48:39PM -0700, Jakub Kicinski wrote: > > On Wed, 4 Sep 2024 19:54:40 +0200 Guillaume Nault wrote: > > > In this context, I feel that dstats is now just a mix of tstats and > > > core_stats. > > > > I don't know the full background but: > > > > * @core_stats: core networking counters, > > * do not use this in drivers > > Hum, I didn't realise that :/. > > I'd really like to understand why drivers shouldn't use core_stats. > > I mean, what makes driver and core networking counters so different > that they need to be handled in two different ways (but finally merged > together when exporting stats to user space)? > > Does that prevent any contention on the counters or optimise cache line > access? I can't see how, so I'm probably missing something important > here. Some archeology might help. Before we had tracing, having separate fields could help for diagnostics. commit caf586e5f23cebb2a68cbaf288d59dbbf2d74052 Author: Eric Dumazet <eric.dumazet@gmail.com> Date: Thu Sep 30 21:06:55 2010 +0000 net: add a core netdev->rx_dropped counter In various situations, a device provides a packet to our stack and we drop it before it enters protocol stack : - softnet backlog full (accounted in /proc/net/softnet_stat) - bad vlan tag (not accounted) - unknown/unregistered protocol (not accounted) We can handle a per-device counter of such dropped frames at core level, and automatically adds it to the device provided stats (rx_dropped), so that standard tools can be used (ifconfig, ip link, cat /proc/net/dev) This is a generalization of commit 8990f468a (net: rx_dropped accounting), thus reverting it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
On Fri, Sep 06, 2024 at 02:47:15PM +0200, Eric Dumazet wrote: > On Fri, Sep 6, 2024 at 12:42 PM Guillaume Nault <gnault@redhat.com> wrote: > > > > On Wed, Sep 04, 2024 at 02:48:39PM -0700, Jakub Kicinski wrote: > > > On Wed, 4 Sep 2024 19:54:40 +0200 Guillaume Nault wrote: > > > > In this context, I feel that dstats is now just a mix of tstats and > > > > core_stats. > > > > > > I don't know the full background but: > > > > > > * @core_stats: core networking counters, > > > * do not use this in drivers > > > > Hum, I didn't realise that :/. > > > > I'd really like to understand why drivers shouldn't use core_stats. > > > > I mean, what makes driver and core networking counters so different > > that they need to be handled in two different ways (but finally merged > > together when exporting stats to user space)? > > > > Does that prevent any contention on the counters or optimise cache line > > access? I can't see how, so I'm probably missing something important > > here. > > Some archeology might help. > > Before we had tracing, having separate fields could help for diagnostics. Interesting, I didn't think about it from a diagnostic perspective. Considering that we now have tracing and skb_drop_reason, does it still make sense to keep this distinction between core and driver stats? I find this approach elegant, but since no UDP tunnel respects it and that dstats are only used by one driver (vrf) I wonder what's the best path forward. I'm restricting this discussion to UDP tunnels, as I'd like them to keep their implementation as consistent as possible (to hopefully ease code consolidation in the future). But feel free to broaden the scope if necessary. I can think of two possibilities: 1- Follow the core/driver stats policy and convert vxlan, geneve and bareudp to dstats (NETDEV_PCPU_STAT_DSTATS). 2- Give up on that policy, let vxlan and geneve as is and convert bareudp to tstats (NETDEV_PCPU_STAT_TSTATS). Then convert vrf to tstats too and drop NETDEV_PCPU_STAT_DSTATS which becomes unused. Any opinion? > commit caf586e5f23cebb2a68cbaf288d59dbbf2d74052 > Author: Eric Dumazet <eric.dumazet@gmail.com> > Date: Thu Sep 30 21:06:55 2010 +0000 > > net: add a core netdev->rx_dropped counter > > In various situations, a device provides a packet to our stack and we > drop it before it enters protocol stack : > - softnet backlog full (accounted in /proc/net/softnet_stat) > - bad vlan tag (not accounted) > - unknown/unregistered protocol (not accounted) > > We can handle a per-device counter of such dropped frames at core level, > and automatically adds it to the device provided stats (rx_dropped), so > that standard tools can be used (ifconfig, ip link, cat /proc/net/dev) > > This is a generalization of commit 8990f468a (net: rx_dropped > accounting), thus reverting it. > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > Signed-off-by: David S. Miller <davem@davemloft.net> >
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index d5c56ca91b77..7aca0544fb29 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -83,7 +83,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) if (skb_copy_bits(skb, BAREUDP_BASE_HLEN, &ipversion, sizeof(ipversion))) { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } ipversion >>= 4; @@ -93,7 +93,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) } else if (ipversion == 6 && bareudp->multi_proto_mode) { proto = htons(ETH_P_IPV6); } else { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } } else if (bareudp->ethertype == htons(ETH_P_MPLS_UC)) { @@ -107,7 +107,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) ipv4_is_multicast(tunnel_hdr->daddr)) { proto = htons(ETH_P_MPLS_MC); } else { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } } else { @@ -123,7 +123,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) (addr_type & IPV6_ADDR_MULTICAST)) { proto = htons(ETH_P_MPLS_MC); } else { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } } @@ -135,7 +135,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) proto, !net_eq(bareudp->net, dev_net(bareudp->dev)))) { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } @@ -143,7 +143,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) tun_dst = udp_tun_rx_dst(skb, family, key, 0, 0); if (!tun_dst) { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } skb_dst_set(skb, &tun_dst->dst); @@ -169,8 +169,8 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) &((struct ipv6hdr *)oiph)->saddr); } if (err > 1) { - ++bareudp->dev->stats.rx_frame_errors; - ++bareudp->dev->stats.rx_errors; + DEV_STATS_INC(bareudp->dev, rx_frame_errors); + DEV_STATS_INC(bareudp->dev, rx_errors); goto drop; } } @@ -467,11 +467,11 @@ static netdev_tx_t bareudp_xmit(struct sk_buff *skb, struct net_device *dev) dev_kfree_skb(skb); if (err == -ELOOP) - dev->stats.collisions++; + DEV_STATS_INC(dev, collisions); else if (err == -ENETUNREACH) - dev->stats.tx_carrier_errors++; + DEV_STATS_INC(dev, tx_carrier_errors); - dev->stats.tx_errors++; + DEV_STATS_INC(dev, tx_errors); return NETDEV_TX_OK; }
Bareudp devices update their stats concurrently. Therefore they need proper atomic increments. Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Signed-off-by: Guillaume Nault <gnault@redhat.com> --- drivers/net/bareudp.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)