Message ID | 20221111153505.434398-1-shenwei.wang@nxp.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v3,1/1] net: fec: add xdp and page pool statistics | expand |
From: Shenwei Wang <shenwei.wang@nxp.com> Date: Fri, 11 Nov 2022 09:35:05 -0600 > Added xdp and page pool statistics. > In order to make the implementation simple and compatible, the patch > uses the 32bit integer to record the XDP statistics. > > Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> > Reported-by: kernel test robot <lkp@intel.com> I was this went upstream[0], I think it was quite premature. First of all, there was a non-acked reply to me in the v2 thread, but okay, I can live with that. More serious issues in the inline comments below. > --- > Changes in v3: > - change memcpy to strncpy to fix the warning reported by Paolo Abeni > - fix the compile errors on powerpc > > Changes in v2: > - clean up and restructure the codes per Andrew Lunn's review comments > - clear the statistics when the adaptor is down > > drivers/net/ethernet/freescale/Kconfig | 1 + > drivers/net/ethernet/freescale/fec.h | 14 ++++ > drivers/net/ethernet/freescale/fec_main.c | 85 +++++++++++++++++++++-- > 3 files changed, 95 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig > index ce866ae3df03..f1e80d6996ef 100644 > --- a/drivers/net/ethernet/freescale/Kconfig > +++ b/drivers/net/ethernet/freescale/Kconfig > @@ -29,6 +29,7 @@ config FEC > select CRC32 > select PHYLIB > select PAGE_POOL > + select PAGE_POOL_STATS Drivers should never select PAGE_POOL_STATS. This Kconfig option was made to allow user to choose whether he wants stats or better performance on slower systems. It's pure user choice, if something doesn't build or link, it must be guarded with IS_ENABLED(CONFIG_PAGE_POOL_STATS). > imply NET_SELFTESTS > help > Say Y here if you want to use the built-in 10/100 Fast ethernet > diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h > index 61e847b18343..5ba1e0d71c68 100644 > --- a/drivers/net/ethernet/freescale/fec.h > +++ b/drivers/net/ethernet/freescale/fec.h > @@ -526,6 +526,19 @@ struct fec_enet_priv_txrx_info { > struct sk_buff *skb; > }; > > +enum { > + RX_XDP_REDIRECT = 0, > + RX_XDP_PASS, > + RX_XDP_DROP, > + RX_XDP_TX, > + RX_XDP_TX_ERRORS, > + TX_XDP_XMIT, > + TX_XDP_XMIT_ERRORS, > + > + /* The following must be the last one */ > + XDP_STATS_TOTAL, > +}; > + > struct fec_enet_priv_tx_q { > struct bufdesc_prop bd; > unsigned char *tx_bounce[TX_RING_SIZE]; > @@ -546,6 +559,7 @@ struct fec_enet_priv_rx_q { > /* page_pool */ > struct page_pool *page_pool; > struct xdp_rxq_info xdp_rxq; > + u32 stats[XDP_STATS_TOTAL]; Still not convinced it is okay to deliberately provoke overflows here, maybe we need some more reviewers to help us agree on what is better? > > /* rx queue number, in the range 0-7 */ > u8 id; [...] > case ETH_SS_STATS: > - for (i = 0; i < ARRAY_SIZE(fec_stats); i++) > - memcpy(data + i * ETH_GSTRING_LEN, > - fec_stats[i].name, ETH_GSTRING_LEN); > + for (i = 0; i < ARRAY_SIZE(fec_stats); i++) { > + memcpy(data, fec_stats[i].name, ETH_GSTRING_LEN); > + data += ETH_GSTRING_LEN; > + } > + for (i = 0; i < ARRAY_SIZE(fec_xdp_stat_strs); i++) { > + strncpy(data, fec_xdp_stat_strs[i], ETH_GSTRING_LEN); strncpy() is deprecated in favor of strscpy(), there were tons of commits which replace the former with the latter across the whole tree. > + data += ETH_GSTRING_LEN; > + } > + page_pool_ethtool_stats_get_strings(data); > + > break; > case ETH_SS_TEST: > net_selftest_get_strings(data); [...] > + for (i = fep->num_rx_queues - 1; i >= 0; i--) { > + rxq = fep->rx_queue[i]; > + for (j = 0; j < XDP_STATS_TOTAL; j++) > + rxq->stats[j] = 0; (not critical) Just memset(&rxq->stats)? > + } > + > /* Don't disable MIB statistics counters */ > writel(0, fep->hwp + FEC_MIB_CTRLSTAT); > } > @@ -3084,6 +3156,9 @@ static void fec_enet_free_buffers(struct net_device *ndev) > for (i = 0; i < rxq->bd.ring_size; i++) > page_pool_release_page(rxq->page_pool, rxq->rx_skb_info[i].page); > > + for (i = 0; i < XDP_STATS_TOTAL; i++) > + rxq->stats[i] = 0; > + > if (xdp_rxq_info_is_reg(&rxq->xdp_rxq)) > xdp_rxq_info_unreg(&rxq->xdp_rxq); > page_pool_destroy(rxq->page_pool); > -- > 2.34.1 Could you please send a follow-up maybe, fixing at least that PAGE_POOL_STATS select and strncpy()? [0] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?h=main&id=6970ef27ff7fd1ce3455b2c696081503d0c0f8ac Thanks, Olek
> Drivers should never select PAGE_POOL_STATS. This Kconfig option was > made to allow user to choose whether he wants stats or better > performance on slower systems. It's pure user choice, if something > doesn't build or link, it must be guarded with > IS_ENABLED(CONFIG_PAGE_POOL_STATS). Given how simple the API is, and the stubs for when CONFIG_PAGE_POOL_STATS is disabled, i doubt there is any need for the driver to do anything. > > struct page_pool *page_pool; > > struct xdp_rxq_info xdp_rxq; > > + u32 stats[XDP_STATS_TOTAL]; > > Still not convinced it is okay to deliberately provoke overflows > here, maybe we need some more reviewers to help us agree on what is > better? You will find that many embedded drivers only have 32 bit hardware stats and do wrap around. And the hardware does not have atomic read and clear so you can accumulate into a u64. The FEC is from the times of MIB 2 ifTable, which only requires 32 bit counters. ifXtable is modern compared to the FEC. Software counters like this are a different matter. The overhead of a u64 on a 32 bit system is probably in the noise, so i think there is strong argument for using u64. Andrew
> -----Original Message----- > From: Alexander Lobakin <alexandr.lobakin@intel.com> > Sent: Monday, November 14, 2022 7:46 AM > To: Shenwei Wang <shenwei.wang@nxp.com> > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; David S. Miller > <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub > Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei > Starovoitov <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; > > @@ -29,6 +29,7 @@ config FEC > > select CRC32 > > select PHYLIB > > select PAGE_POOL > > + select PAGE_POOL_STATS > > Drivers should never select PAGE_POOL_STATS. This Kconfig option was made to > allow user to choose whether he wants stats or better performance on slower > systems. It's pure user choice, if something doesn't build or link, it must be > guarded with IS_ENABLED(CONFIG_PAGE_POOL_STATS). As the PAGE_POOL_STATS is becoming the infrastructure codes for many drivers, it is redundant for every driver to implement the stub function in case it is not selected. These stub functions should be provided by PAGE_POOL_STATS itself if the option is not selected. > > > imply NET_SELFTESTS > > help > > Say Y here if you want to use the built-in 10/100 Fast > > ethernet diff --git a/drivers/net/ethernet/freescale/fec.h > > b/drivers/net/ethernet/freescale/fec.h > > index 61e847b18343..5ba1e0d71c68 100644 > > --- a/drivers/net/ethernet/freescale/fec.h > > +++ b/drivers/net/ethernet/freescale/fec.h > > @@ -526,6 +526,19 @@ struct fec_enet_priv_txrx_info { > > struct sk_buff *skb; > > }; > > > > +enum { > > + RX_XDP_REDIRECT = 0, > > + RX_XDP_PASS, > > + RX_XDP_DROP, > > + RX_XDP_TX, > > + RX_XDP_TX_ERRORS, > > + TX_XDP_XMIT, > > + TX_XDP_XMIT_ERRORS, > > + > > + /* The following must be the last one */ > > + XDP_STATS_TOTAL, > > +}; > > + > > struct fec_enet_priv_tx_q { > > struct bufdesc_prop bd; > > unsigned char *tx_bounce[TX_RING_SIZE]; @@ -546,6 +559,7 @@ > > struct fec_enet_priv_rx_q { > > /* page_pool */ > > struct page_pool *page_pool; > > struct xdp_rxq_info xdp_rxq; > > + u32 stats[XDP_STATS_TOTAL]; > > Still not convinced it is okay to deliberately provoke overflows here, maybe we > need some more reviewers to help us agree on what is better? > > > > > /* rx queue number, in the range 0-7 */ > > u8 id; > > [...] > > > case ETH_SS_STATS: > > - for (i = 0; i < ARRAY_SIZE(fec_stats); i++) > > - memcpy(data + i * ETH_GSTRING_LEN, > > - fec_stats[i].name, ETH_GSTRING_LEN); > > + for (i = 0; i < ARRAY_SIZE(fec_stats); i++) { > > + memcpy(data, fec_stats[i].name, ETH_GSTRING_LEN); > > + data += ETH_GSTRING_LEN; > > + } > > + for (i = 0; i < ARRAY_SIZE(fec_xdp_stat_strs); i++) { > > + strncpy(data, fec_xdp_stat_strs[i], > > + ETH_GSTRING_LEN); > > strncpy() is deprecated in favor of strscpy(), there were tons of commits which > replace the former with the latter across the whole tree. > Got it. Thanks. Shenwei > > + data += ETH_GSTRING_LEN; > > + } > > + page_pool_ethtool_stats_get_strings(data); > > + > > break; > > case ETH_SS_TEST: > > net_selftest_get_strings(data); > > [...] > > > + for (i = fep->num_rx_queues - 1; i >= 0; i--) { > > + rxq = fep->rx_queue[i]; > > + for (j = 0; j < XDP_STATS_TOTAL; j++) > > + rxq->stats[j] = 0; > > (not critical) Just memset(&rxq->stats)? > > > + } > > + > > /* Don't disable MIB statistics counters */ > > writel(0, fep->hwp + FEC_MIB_CTRLSTAT); } @@ -3084,6 +3156,9 @@ > > static void fec_enet_free_buffers(struct net_device *ndev) > > for (i = 0; i < rxq->bd.ring_size; i++) > > page_pool_release_page(rxq->page_pool, > > rxq->rx_skb_info[i].page); > > > > + for (i = 0; i < XDP_STATS_TOTAL; i++) > > + rxq->stats[i] = 0; > > + > > if (xdp_rxq_info_is_reg(&rxq->xdp_rxq)) > > xdp_rxq_info_unreg(&rxq->xdp_rxq); > > page_pool_destroy(rxq->page_pool); > > -- > > 2.34.1 > > Could you please send a follow-up maybe, fixing at least that > PAGE_POOL_STATS select and strncpy()? > > [0] > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel > .org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fnetdev%2Fnet- > next.git%2Fcommit%2F%3Fh%3Dmain%26id%3D6970ef27ff7fd1ce3455b2c6960 > 81503d0c0f8ac&data=05%7C01%7Cshenwei.wang%40nxp.com%7C0f6bfc > 73c869426e59fc08dac64692d2%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0 > %7C0%7C638040303661645473%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4 > wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C > %7C%7C&sdata=8NbrJmoDnsbyb8WXU85OIq6BOYCOrXLBm1mjbTi%2Fam > Q%3D&reserved=0 > > Thanks, > Olek
> -----Original Message----- > From: Andrew Lunn <andrew@lunn.ch> > Sent: Monday, November 14, 2022 8:08 AM > To: Alexander Lobakin <alexandr.lobakin@intel.com> > Cc: Shenwei Wang <shenwei.wang@nxp.com>; David S. Miller > <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub > Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei > Starovoitov <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; > Jesper Dangaard Brouer <hawk@kernel.org>; John Fastabend > <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev; > kernel test robot <lkp@intel.com> > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool statistics > > Caution: EXT Email > > > Drivers should never select PAGE_POOL_STATS. This Kconfig option was > > made to allow user to choose whether he wants stats or better > > performance on slower systems. It's pure user choice, if something > > doesn't build or link, it must be guarded with > > IS_ENABLED(CONFIG_PAGE_POOL_STATS). > > Given how simple the API is, and the stubs for when CONFIG_PAGE_POOL_STATS > is disabled, i doubt there is any need for the driver to do anything. > > > > struct page_pool *page_pool; > > > struct xdp_rxq_info xdp_rxq; > > > + u32 stats[XDP_STATS_TOTAL]; > > > > Still not convinced it is okay to deliberately provoke overflows here, > > maybe we need some more reviewers to help us agree on what is better? > > You will find that many embedded drivers only have 32 bit hardware stats and do > wrap around. And the hardware does not have atomic read and clear so you can > accumulate into a u64. The FEC is from the times of MIB 2 ifTable, which only > requires 32 bit counters. ifXtable is modern compared to the FEC. > > Software counters like this are a different matter. The overhead of a > u64 on a 32 bit system is probably in the noise, so i think there is strong > argument for using u64. If it is required to support u64 counters, the code logic need to change to record the counter locally per packet, and then update the counters for the fec instance when the napi receive loop is complete. In this way we can reduce the performance overhead. Thanks, Shenwei > > Andrew
From: Shenwei Wang <shenwei.wang@nxp.com> Date: Mon, 14 Nov 2022 15:06:04 +0000 > > -----Original Message----- > > From: Andrew Lunn <andrew@lunn.ch> > > Sent: Monday, November 14, 2022 8:08 AM > > To: Alexander Lobakin <alexandr.lobakin@intel.com> > > Cc: Shenwei Wang <shenwei.wang@nxp.com>; David S. Miller > > <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub > > Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei > > Starovoitov <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; > > Jesper Dangaard Brouer <hawk@kernel.org>; John Fastabend > > <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev; > > kernel test robot <lkp@intel.com> > > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool statistics > > > > Caution: EXT Email > > > >> Drivers should never select PAGE_POOL_STATS. This Kconfig option was > >> made to allow user to choose whether he wants stats or better > >> performance on slower systems. It's pure user choice, if something > >> doesn't build or link, it must be guarded with > >> IS_ENABLED(CONFIG_PAGE_POOL_STATS). > > > > Given how simple the API is, and the stubs for when CONFIG_PAGE_POOL_STATS > > is disabled, i doubt there is any need for the driver to do anything. > > > >>> struct page_pool *page_pool; > >>> struct xdp_rxq_info xdp_rxq; > >>> + u32 stats[XDP_STATS_TOTAL]; > >> > >> Still not convinced it is okay to deliberately provoke overflows here, > >> maybe we need some more reviewers to help us agree on what is better? > > > > You will find that many embedded drivers only have 32 bit hardware stats and do > > wrap around. And the hardware does not have atomic read and clear so you can > > accumulate into a u64. The FEC is from the times of MIB 2 ifTable, which only > > requires 32 bit counters. ifXtable is modern compared to the FEC. > > > > Software counters like this are a different matter. The overhead of a > > u64 on a 32 bit system is probably in the noise, so i think there is strong > > argument for using u64. > > If it is required to support u64 counters, the code logic need to change to record > the counter locally per packet, and then update the counters for the fec instance > when the napi receive loop is complete. In this way we can reduce the performance > overhead. That's how it is usually done in the drivers. You put u32 counters on the stack, it's impossible to overflow them in just one NAPI poll cycle. Then, after you're done with processing descriptors, you just increment the 64-bit on-ring counters at once. > > Thanks, > Shenwei > > > > > Andrew Thanks, Olek
From: Shenwei Wang <shenwei.wang@nxp.com> Date: Mon, 14 Nov 2022 14:53:00 +0000 > > -----Original Message----- > > From: Alexander Lobakin <alexandr.lobakin@intel.com> > > Sent: Monday, November 14, 2022 7:46 AM > > To: Shenwei Wang <shenwei.wang@nxp.com> > > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; David S. Miller > > <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub > > Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei > > Starovoitov <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; > >> @@ -29,6 +29,7 @@ config FEC > >> select CRC32 > >> select PHYLIB > >> select PAGE_POOL > >> + select PAGE_POOL_STATS > > > > Drivers should never select PAGE_POOL_STATS. This Kconfig option was made to > > allow user to choose whether he wants stats or better performance on slower > > systems. It's pure user choice, if something doesn't build or link, it must be > > guarded with IS_ENABLED(CONFIG_PAGE_POOL_STATS). > > As the PAGE_POOL_STATS is becoming the infrastructure codes for many drivers, it is > redundant for every driver to implement the stub function in case it is not selected. These > stub functions should be provided by PAGE_POOL_STATS itself if the option is not selected. Correct, but I think you added 'select PAGE_POOL_STATS' due to some build issues on PPC64, or not? So if there are any when !PAGE_POOL_STATS, it's always better to handle this at the Page Pool API level in a separate patch. > > > > >> imply NET_SELFTESTS > >> help > >> Say Y here if you want to use the built-in 10/100 Fast Thanks, Olek
> -----Original Message----- > From: Alexander Lobakin <alexandr.lobakin@intel.com> > Sent: Monday, November 14, 2022 9:28 AM > To: Shenwei Wang <shenwei.wang@nxp.com> > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; David S. Miller > <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub > Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei > Starovoitov <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; > Jesper Dangaard Brouer <hawk@kernel.org>; John Fastabend > <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev; > kernel test robot <lkp@intel.com> > > > <daniel@iogearbox.net>; > > >> @@ -29,6 +29,7 @@ config FEC > > >> select CRC32 > > >> select PHYLIB > > >> select PAGE_POOL > > >> + select PAGE_POOL_STATS > > > > > > Drivers should never select PAGE_POOL_STATS. This Kconfig option was > > > made to allow user to choose whether he wants stats or better > > > performance on slower systems. It's pure user choice, if something > > > doesn't build or link, it must be guarded with > IS_ENABLED(CONFIG_PAGE_POOL_STATS). > > > > As the PAGE_POOL_STATS is becoming the infrastructure codes for many > > drivers, it is redundant for every driver to implement the stub > > function in case it is not selected. These stub functions should be provided by > PAGE_POOL_STATS itself if the option is not selected. > > Correct, but I think you added 'select PAGE_POOL_STATS' due to some build > issues on PPC64, or not? So if there are any when !PAGE_POOL_STATS, it's > always better to handle this at the Page Pool API level in a separate patch. Yes, the 'select PAGE_POOL_STATS' was added because of the building failure found on PPC64 platform. Thanks, Shenwei > > > > > > > > >> imply NET_SELFTESTS > > >> help > > >> Say Y here if you want to use the built-in 10/100 Fast > > Thanks, > Olek
> -----Original Message----- > From: Alexander Lobakin <alexandr.lobakin@intel.com> > Sent: Monday, November 14, 2022 9:23 AM > To: Shenwei Wang <shenwei.wang@nxp.com> > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; Andrew Lunn > <andrew@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric Dumazet > <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni > <pabeni@redhat.com>; Alexei Starovoitov <ast@kernel.org>; Daniel Borkmann > <daniel@iogearbox.net>; Jesper Dangaard Brouer <hawk@kernel.org>; John > Fastabend <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev; > kernel test robot <lkp@intel.com> > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool statistics > > Caution: EXT Email > > From: Shenwei Wang <shenwei.wang@nxp.com> > Date: Mon, 14 Nov 2022 15:06:04 +0000 > > > > -----Original Message----- > > > From: Andrew Lunn <andrew@lunn.ch> > > > Sent: Monday, November 14, 2022 8:08 AM > > > To: Alexander Lobakin <alexandr.lobakin@intel.com> > > > Cc: Shenwei Wang <shenwei.wang@nxp.com>; David S. Miller > > > <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub > > > Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Alexei > > > Starovoitov <ast@kernel.org>; Daniel Borkmann > > > <daniel@iogearbox.net>; Jesper Dangaard Brouer <hawk@kernel.org>; > > > John Fastabend <john.fastabend@gmail.com>; Wei Fang > > > <wei.fang@nxp.com>; netdev@vger.kernel.org; > > > linux-kernel@vger.kernel.org; imx@lists.linux.dev; kernel test robot > > > <lkp@intel.com> > > > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool > > > statistics > > > > > > Caution: EXT Email > > > > > >> Drivers should never select PAGE_POOL_STATS. This Kconfig option > > >> was made to allow user to choose whether he wants stats or better > > >> performance on slower systems. It's pure user choice, if something > > >> doesn't build or link, it must be guarded with > > >> IS_ENABLED(CONFIG_PAGE_POOL_STATS). > > > > > > Given how simple the API is, and the stubs for when > > > CONFIG_PAGE_POOL_STATS is disabled, i doubt there is any need for the > driver to do anything. > > > > > >>> struct page_pool *page_pool; > > >>> struct xdp_rxq_info xdp_rxq; > > >>> + u32 stats[XDP_STATS_TOTAL]; > > >> > > >> Still not convinced it is okay to deliberately provoke overflows > > >> here, maybe we need some more reviewers to help us agree on what is > better? > > > > > > You will find that many embedded drivers only have 32 bit hardware > > > stats and do wrap around. And the hardware does not have atomic read > > > and clear so you can accumulate into a u64. The FEC is from the > > > times of MIB 2 ifTable, which only requires 32 bit counters. ifXtable is > modern compared to the FEC. > > > > > > Software counters like this are a different matter. The overhead of > > > a > > > u64 on a 32 bit system is probably in the noise, so i think there is > > > strong argument for using u64. > > > > If it is required to support u64 counters, the code logic need to > > change to record the counter locally per packet, and then update the > > counters for the fec instance when the napi receive loop is complete. > > In this way we can reduce the performance overhead. > > That's how it is usually done in the drivers. You put u32 counters on the stack, > it's impossible to overflow them in just one NAPI poll cycle. Then, after you're > done with processing descriptors, you just increment the 64-bit on-ring counters > at once. > Did some testing with the atomic64_t counter, with the following codes to update the u64 counter in the end of every NAPI poll cycle. @@ -1764,7 +1768,13 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) if (xdp_result & FEC_ENET_XDP_REDIR) xdp_do_flush_map(); +#if 1 + if (xdp_prog) { + int i; + for(i = 0; i < XDP_STATS_TOTAL; i++) + atomic64_add(xdp_stats[i], &rxq->stats[i]); + } +#endif return pkt_received; } With the codes above, the testing result is below: root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 349399 1035008 tx 0 0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 349407 1384640 tx 0 0 Without the atomic_add codes above, the testing result is below: root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 350109 1989130 tx 0 0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 350425 2339786 tx 0 0 And regarding the u32 counter solution, the testing result is below: root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 361347 2637796 tx 0 0 There are about 10K pkts/s difference here. Do we really want the u64 counters? Regards, Shenwei > > > > Thanks, > > Shenwei > > > > > > > > Andrew > > Thanks, > Olek
> @@ -1764,7 +1768,13 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) > > if (xdp_result & FEC_ENET_XDP_REDIR) > xdp_do_flush_map(); > +#if 1 > + if (xdp_prog) { > + int i; > + for(i = 0; i < XDP_STATS_TOTAL; i++) > + atomic64_add(xdp_stats[i], &rxq->stats[i]); > + } > +#endif Atomic operations are expensive. You should not use them unless you really do need them. What driver are you copying here? There is nothing particularly new here, so you should just be copying code from another driver, and hope you picked a good example. Andrew
From: Shenwei Wang <shenwei.wang@nxp.com> Date: Mon, 14 Nov 2022 21:17:48 +0000 > > -----Original Message----- > > From: Alexander Lobakin <alexandr.lobakin@intel.com> > > Sent: Monday, November 14, 2022 9:23 AM > > To: Shenwei Wang <shenwei.wang@nxp.com> > > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; Andrew Lunn > > <andrew@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric Dumazet > > <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni > > <pabeni@redhat.com>; Alexei Starovoitov <ast@kernel.org>; Daniel Borkmann > > <daniel@iogearbox.net>; Jesper Dangaard Brouer <hawk@kernel.org>; John > > Fastabend <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev; > > kernel test robot <lkp@intel.com> > > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool statistics [...] > Did some testing with the atomic64_t counter, with the following codes to update > the u64 counter in the end of every NAPI poll cycle. > > @@ -1764,7 +1768,13 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) > > if (xdp_result & FEC_ENET_XDP_REDIR) > xdp_do_flush_map(); > +#if 1 > + if (xdp_prog) { > + int i; > + for(i = 0; i < XDP_STATS_TOTAL; i++) > + atomic64_add(xdp_stats[i], &rxq->stats[i]); > + } > +#endif > return pkt_received; > } > > With the codes above, the testing result is below: > root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 > sock0@eth0:0 rxdrop xdp-drv > pps pkts 1.00 > rx 349399 1035008 > tx 0 0 > > sock0@eth0:0 rxdrop xdp-drv > pps pkts 1.00 > rx 349407 1384640 > tx 0 0 > > Without the atomic_add codes above, the testing result is below: > root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 > sock0@eth0:0 rxdrop xdp-drv > pps pkts 1.00 > rx 350109 1989130 > tx 0 0 > > sock0@eth0:0 rxdrop xdp-drv > pps pkts 1.00 > rx 350425 2339786 > tx 0 0 > > And regarding the u32 counter solution, the testing result is below: > root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 > sock0@eth0:0 rxdrop xdp-drv > pps pkts 1.00 > rx 361347 2637796 > tx 0 0 > > There are about 10K pkts/s difference here. Do we really want the u64 counters? Where did those atomic64_t come from? u64_stats_t use either plain u64 for 32-bit platforms or local64_t for 64-bit ones. Take a look at [0] for the example of how x86_64 does this, it is far from atomic64_t. > > Regards, > Shenwei > > >> > >> Thanks, > >> Shenwei > >> > >>> > >>> Andrew > > > > Thanks, > > Olek [0] https://elixir.bootlin.com/linux/v6.1-rc5/source/arch/x86/include/asm/local.h#L31 Thanks, Olek
> -----Original Message----- > From: Alexander Lobakin <alexandr.lobakin@intel.com> > Sent: Wednesday, November 16, 2022 8:34 AM > To: Shenwei Wang <shenwei.wang@nxp.com> > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; Andrew Lunn > <andrew@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric Dumazet > <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni > <pabeni@redhat.com>; Alexei Starovoitov <ast@kernel.org>; Daniel Borkmann > <daniel@iogearbox.net>; Jesper Dangaard Brouer <hawk@kernel.org>; John > Fastabend <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; imx@lists.linux.dev; > kernel test robot <lkp@intel.com> > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool statistics > > Caution: EXT Email > > From: Shenwei Wang <shenwei.wang@nxp.com> > Date: Mon, 14 Nov 2022 21:17:48 +0000 > > > > -----Original Message----- > > > From: Alexander Lobakin <alexandr.lobakin@intel.com> > > > Sent: Monday, November 14, 2022 9:23 AM > > > To: Shenwei Wang <shenwei.wang@nxp.com> > > > Cc: Alexander Lobakin <alexandr.lobakin@intel.com>; Andrew Lunn > > > <andrew@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric > > > Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; > > > Paolo Abeni <pabeni@redhat.com>; Alexei Starovoitov > > > <ast@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Jesper > > > Dangaard Brouer <hawk@kernel.org>; John Fastabend > > > <john.fastabend@gmail.com>; Wei Fang <wei.fang@nxp.com>; > > > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; > > > imx@lists.linux.dev; kernel test robot <lkp@intel.com> > > > Subject: [EXT] Re: [PATCH v3 1/1] net: fec: add xdp and page pool > > > statistics > > [...] > > > Did some testing with the atomic64_t counter, with the following codes > > to update the u64 counter in the end of every NAPI poll cycle. > > > > @@ -1764,7 +1768,13 @@ fec_enet_rx_queue(struct net_device *ndev, int > > budget, u16 queue_id) > > > > if (xdp_result & FEC_ENET_XDP_REDIR) > > xdp_do_flush_map(); > > +#if 1 > > + if (xdp_prog) { > > + int i; > > + for(i = 0; i < XDP_STATS_TOTAL; i++) > > + atomic64_add(xdp_stats[i], &rxq->stats[i]); > > + } > > +#endif > > return pkt_received; > > } > > > > With the codes above, the testing result is below: > > root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 > > sock0@eth0:0 rxdrop xdp-drv > > pps pkts 1.00 > > rx 349399 1035008 > > tx 0 0 > > > > sock0@eth0:0 rxdrop xdp-drv > > pps pkts 1.00 > > rx 349407 1384640 > > tx 0 0 > > > > Without the atomic_add codes above, the testing result is below: > > root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 > > sock0@eth0:0 rxdrop xdp-drv > > pps pkts 1.00 > > rx 350109 1989130 > > tx 0 0 > > > > sock0@eth0:0 rxdrop xdp-drv > > pps pkts 1.00 > > rx 350425 2339786 > > tx 0 0 > > > > And regarding the u32 counter solution, the testing result is below: > > root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 > > sock0@eth0:0 rxdrop xdp-drv > > pps pkts 1.00 > > rx 361347 2637796 > > tx 0 0 > > > > There are about 10K pkts/s difference here. Do we really want the u64 > counters? > > Where did those atomic64_t come from? u64_stats_t use either plain > u64 for 32-bit platforms or local64_t for 64-bit ones. Take a look at [0] for the > example of how x86_64 does this, it is far from atomic64_t. The v5 patch has changed it to u64. Thanks, Shenwei > > > > > Regards, > > Shenwei > > > > >> > > >> Thanks, > > >> Shenwei > > >> > > >>> > > >>> Andrew > > > > > > Thanks, > > > Olek > > [0] > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.boo > tlin.com%2Flinux%2Fv6.1- > rc5%2Fsource%2Farch%2Fx86%2Finclude%2Fasm%2Flocal.h%23L31&data > =05%7C01%7Cshenwei.wang%40nxp.com%7Caf8d6634fcfc4d77e07308dac7dfa > 4a8%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C638042060598040 > 501%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiL > CJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=sWQ4Ts > ayZvfs4pkCpC4tWvT51Cv89c4N6D5w2J%2FDC84%3D&reserved=0 > > Thanks, > Olek
diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig index ce866ae3df03..f1e80d6996ef 100644 --- a/drivers/net/ethernet/freescale/Kconfig +++ b/drivers/net/ethernet/freescale/Kconfig @@ -29,6 +29,7 @@ config FEC select CRC32 select PHYLIB select PAGE_POOL + select PAGE_POOL_STATS imply NET_SELFTESTS help Say Y here if you want to use the built-in 10/100 Fast ethernet diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 61e847b18343..5ba1e0d71c68 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -526,6 +526,19 @@ struct fec_enet_priv_txrx_info { struct sk_buff *skb; }; +enum { + RX_XDP_REDIRECT = 0, + RX_XDP_PASS, + RX_XDP_DROP, + RX_XDP_TX, + RX_XDP_TX_ERRORS, + TX_XDP_XMIT, + TX_XDP_XMIT_ERRORS, + + /* The following must be the last one */ + XDP_STATS_TOTAL, +}; + struct fec_enet_priv_tx_q { struct bufdesc_prop bd; unsigned char *tx_bounce[TX_RING_SIZE]; @@ -546,6 +559,7 @@ struct fec_enet_priv_rx_q { /* page_pool */ struct page_pool *page_pool; struct xdp_rxq_info xdp_rxq; + u32 stats[XDP_STATS_TOTAL]; /* rx queue number, in the range 0-7 */ u8 id; diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 3fb870340c22..d3bf3ffe8349 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1523,10 +1523,12 @@ fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog, switch (act) { case XDP_PASS: + rxq->stats[RX_XDP_PASS]++; ret = FEC_ENET_XDP_PASS; break; case XDP_REDIRECT: + rxq->stats[RX_XDP_REDIRECT]++; err = xdp_do_redirect(fep->netdev, xdp, prog); if (!err) { ret = FEC_ENET_XDP_REDIR; @@ -1549,6 +1551,7 @@ fec_enet_run_xdp(struct fec_enet_private *fep, struct bpf_prog *prog, fallthrough; /* handle aborts by dropping packet */ case XDP_DROP: + rxq->stats[RX_XDP_DROP]++; ret = FEC_ENET_XDP_CONSUMED; page = virt_to_head_page(xdp->data); page_pool_put_page(rxq->page_pool, page, sync, true); @@ -2659,6 +2662,16 @@ static const struct fec_stat { #define FEC_STATS_SIZE (ARRAY_SIZE(fec_stats) * sizeof(u64)) +static const char *fec_xdp_stat_strs[XDP_STATS_TOTAL] = { + "rx_xdp_redirect", /* RX_XDP_REDIRECT = 0, */ + "rx_xdp_pass", /* RX_XDP_PASS, */ + "rx_xdp_drop", /* RX_XDP_DROP, */ + "rx_xdp_tx", /* RX_XDP_TX, */ + "rx_xdp_tx_errors", /* RX_XDP_TX_ERRORS, */ + "tx_xdp_xmit", /* TX_XDP_XMIT, */ + "tx_xdp_xmit_errors", /* TX_XDP_XMIT_ERRORS, */ +}; + static void fec_enet_update_ethtool_stats(struct net_device *dev) { struct fec_enet_private *fep = netdev_priv(dev); @@ -2668,6 +2681,40 @@ static void fec_enet_update_ethtool_stats(struct net_device *dev) fep->ethtool_stats[i] = readl(fep->hwp + fec_stats[i].offset); } +static void fec_enet_get_xdp_stats(struct fec_enet_private *fep, u64 *data) +{ + u64 xdp_stats[XDP_STATS_TOTAL] = { 0 }; + struct fec_enet_priv_rx_q *rxq; + int i, j; + + for (i = fep->num_rx_queues - 1; i >= 0; i--) { + rxq = fep->rx_queue[i]; + + for (j = 0; j < XDP_STATS_TOTAL; j++) + xdp_stats[j] += rxq->stats[j]; + } + + memcpy(data, xdp_stats, sizeof(xdp_stats)); +} + +static void fec_enet_page_pool_stats(struct fec_enet_private *fep, u64 *data) +{ + struct page_pool_stats stats = {}; + struct fec_enet_priv_rx_q *rxq; + int i; + + for (i = fep->num_rx_queues - 1; i >= 0; i--) { + rxq = fep->rx_queue[i]; + + if (!rxq->page_pool) + continue; + + page_pool_get_stats(rxq->page_pool, &stats); + } + + page_pool_ethtool_stats_get(data, &stats); +} + static void fec_enet_get_ethtool_stats(struct net_device *dev, struct ethtool_stats *stats, u64 *data) { @@ -2677,6 +2724,12 @@ static void fec_enet_get_ethtool_stats(struct net_device *dev, fec_enet_update_ethtool_stats(dev); memcpy(data, fep->ethtool_stats, FEC_STATS_SIZE); + data += FEC_STATS_SIZE / sizeof(u64); + + fec_enet_get_xdp_stats(fep, data); + data += XDP_STATS_TOTAL; + + fec_enet_page_pool_stats(fep, data); } static void fec_enet_get_strings(struct net_device *netdev, @@ -2685,9 +2738,16 @@ static void fec_enet_get_strings(struct net_device *netdev, int i; switch (stringset) { case ETH_SS_STATS: - for (i = 0; i < ARRAY_SIZE(fec_stats); i++) - memcpy(data + i * ETH_GSTRING_LEN, - fec_stats[i].name, ETH_GSTRING_LEN); + for (i = 0; i < ARRAY_SIZE(fec_stats); i++) { + memcpy(data, fec_stats[i].name, ETH_GSTRING_LEN); + data += ETH_GSTRING_LEN; + } + for (i = 0; i < ARRAY_SIZE(fec_xdp_stat_strs); i++) { + strncpy(data, fec_xdp_stat_strs[i], ETH_GSTRING_LEN); + data += ETH_GSTRING_LEN; + } + page_pool_ethtool_stats_get_strings(data); + break; case ETH_SS_TEST: net_selftest_get_strings(data); @@ -2697,9 +2757,14 @@ static void fec_enet_get_strings(struct net_device *netdev, static int fec_enet_get_sset_count(struct net_device *dev, int sset) { + int count; + switch (sset) { case ETH_SS_STATS: - return ARRAY_SIZE(fec_stats); + count = ARRAY_SIZE(fec_stats) + XDP_STATS_TOTAL; + count += page_pool_ethtool_stats_get_count(); + return count; + case ETH_SS_TEST: return net_selftest_get_count(); default: @@ -2710,7 +2775,8 @@ static int fec_enet_get_sset_count(struct net_device *dev, int sset) static void fec_enet_clear_ethtool_stats(struct net_device *dev) { struct fec_enet_private *fep = netdev_priv(dev); - int i; + struct fec_enet_priv_rx_q *rxq; + int i, j; /* Disable MIB statistics counters */ writel(FEC_MIB_CTRLSTAT_DISABLE, fep->hwp + FEC_MIB_CTRLSTAT); @@ -2718,6 +2784,12 @@ static void fec_enet_clear_ethtool_stats(struct net_device *dev) for (i = 0; i < ARRAY_SIZE(fec_stats); i++) writel(0, fep->hwp + fec_stats[i].offset); + for (i = fep->num_rx_queues - 1; i >= 0; i--) { + rxq = fep->rx_queue[i]; + for (j = 0; j < XDP_STATS_TOTAL; j++) + rxq->stats[j] = 0; + } + /* Don't disable MIB statistics counters */ writel(0, fep->hwp + FEC_MIB_CTRLSTAT); } @@ -3084,6 +3156,9 @@ static void fec_enet_free_buffers(struct net_device *ndev) for (i = 0; i < rxq->bd.ring_size; i++) page_pool_release_page(rxq->page_pool, rxq->rx_skb_info[i].page); + for (i = 0; i < XDP_STATS_TOTAL; i++) + rxq->stats[i] = 0; + if (xdp_rxq_info_is_reg(&rxq->xdp_rxq)) xdp_rxq_info_unreg(&rxq->xdp_rxq); page_pool_destroy(rxq->page_pool);
Added xdp and page pool statistics. In order to make the implementation simple and compatible, the patch uses the 32bit integer to record the XDP statistics. Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Reported-by: kernel test robot <lkp@intel.com> --- Changes in v3: - change memcpy to strncpy to fix the warning reported by Paolo Abeni - fix the compile errors on powerpc Changes in v2: - clean up and restructure the codes per Andrew Lunn's review comments - clear the statistics when the adaptor is down drivers/net/ethernet/freescale/Kconfig | 1 + drivers/net/ethernet/freescale/fec.h | 14 ++++ drivers/net/ethernet/freescale/fec_main.c | 85 +++++++++++++++++++++-- 3 files changed, 95 insertions(+), 5 deletions(-) -- 2.34.1