diff mbox series

[bpf,v2,2/8] net: Move {l,t,d}stats allocation to core and convert veth & vrf

Message ID 20231112203009.26073-3-daniel@iogearbox.net (mailing list archive)
State Superseded
Delegated to: BPF
Series bpf_redirect_peer fixes

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf, async
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5309 this patch: 5309
netdev/cc_maintainers warning 2 maintainers not CCed: edumazet@google.com pabeni@redhat.com
netdev/build_clang success Errors and warnings before: 1388 this patch: 1388
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5646 this patch: 5646
netdev/checkpatch warning CHECK: multiple assignments should be avoided WARNING: line length of 83 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc fail Errors and warnings before: 0 this patch: 1
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-2 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-3 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-9 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-15 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-16 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-17 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-18 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-19 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-VM_Test-13 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-12 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-11 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-10 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Commit Message

Daniel Borkmann Nov. 12, 2023, 8:30 p.m. UTC
Move {l,t,d}stats allocation to the core and let netdevs pick the stats
type they need. That way the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc) - all happening in the core.

Co-developed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@kernel.org>
---
 drivers/net/veth.c        | 16 ++-----------
 drivers/net/vrf.c         | 14 +++---------
 include/linux/netdevice.h |  8 +++++++
 net/core/dev.c            | 47 ++++++++++++++++++++++++++++++++++++++-
 4 files changed, 59 insertions(+), 26 deletions(-)

Comments

Nikolay Aleksandrov Nov. 13, 2023, 8:52 a.m. UTC | #1
On 11/12/23 22:30, Daniel Borkmann wrote:
> Move {l,t,d}stats allocation to the core and let netdevs pick the stats
> type they need. That way the driver doesn't have to bother with error
> handling (allocation failure checking, making sure free happens in the
> right spot, etc) - all happening in the core.
> 
> Co-developed-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@kernel.org>
> ---
>   drivers/net/veth.c        | 16 ++-----------
>   drivers/net/vrf.c         | 14 +++---------
>   include/linux/netdevice.h |  8 +++++++
>   net/core/dev.c            | 47 ++++++++++++++++++++++++++++++++++++++-
>   4 files changed, 59 insertions(+), 26 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Simon Horman Nov. 13, 2023, 9:57 a.m. UTC | #2
On Sun, Nov 12, 2023 at 09:30:03PM +0100, Daniel Borkmann wrote:
> Move {l,t,d}stats allocation to the core and let netdevs pick the stats
> type they need. That way the driver doesn't have to bother with error
> handling (allocation failure checking, making sure free happens in the
> right spot, etc) - all happening in the core.
> 
> Co-developed-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@kernel.org>

...

> @@ -2354,6 +2361,7 @@ struct net_device {
>  	void				*ml_priv;
>  	enum netdev_ml_priv_type	ml_priv_type;
>  
> +	enum netdev_stat_type		pcpu_stat_type:8;

Hi Daniel,

nit: Please consider adding documentation for this new field to
     the kernel doc for net_device.

...
Simon Horman Nov. 13, 2023, 10:03 a.m. UTC | #3
On Sun, Nov 12, 2023 at 09:30:03PM +0100, Daniel Borkmann wrote:
> Move {l,t,d}stats allocation to the core and let netdevs pick the stats
> type they need. That way the driver doesn't have to bother with error
> handling (allocation failure checking, making sure free happens in the
> right spot, etc) - all happening in the core.
> 
> Co-developed-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: David Ahern <dsahern@kernel.org>

Hi Daniel,

sorry, I was a bit too hasty in hitting the send button for my previous
message. I have some more minor feedback on this series.

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0d548431f3fa..75db81496db5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10049,6 +10049,44 @@ void netif_tx_stop_all_queues(struct net_device *dev)
>  }
>  EXPORT_SYMBOL(netif_tx_stop_all_queues);
>  
> +static int netdev_do_alloc_pcpu_stats(struct net_device *dev)
> +{
> +	void __percpu *v;
> +
> +	switch (dev->pcpu_stat_type) {
> +	case NETDEV_PCPU_STAT_NONE:
> +		return 0;
> +	case NETDEV_PCPU_STAT_LSTATS:
> +		v = dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
> +		break;
> +	case NETDEV_PCPU_STAT_TSTATS:
> +		v = dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
> +		break;
> +	case NETDEV_PCPU_STAT_DSTATS:
> +		v = dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
> +		break;
> +	}
> +
> +	return v ? 0 : -ENOMEM;

Perhaps this cannot happen, but if none of the cases in the switch
statement are met, then v will be uninitialised here.

As flagged by Smatch.

> +}
> +

...

> @@ -10469,6 +10513,7 @@ void netdev_run_todo(void)
>  		WARN_ON(rcu_access_pointer(dev->ip_ptr));
>  		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
>  
> +		netdev_do_free_pcpu_stats(dev);
>  		if (dev->priv_destructor)
>  			dev->priv_destructor(dev);
>  		if (dev->needs_free_netdev)

nit: the hunk above seems unnecessary; one blank line is enough.
Daniel Borkmann Nov. 13, 2023, 1:04 p.m. UTC | #4
On 11/13/23 10:57 AM, Simon Horman wrote:
> On Sun, Nov 12, 2023 at 09:30:03PM +0100, Daniel Borkmann wrote:
>> Move {l,t,d}stats allocation to the core and let netdevs pick the stats
>> type they need. That way the driver doesn't have to bother with error
>> handling (allocation failure checking, making sure free happens in the
>> right spot, etc) - all happening in the core.
>>
>> Co-developed-by: Jakub Kicinski <kuba@kernel.org>
>> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@kernel.org>
> 
> ...
> 
>> @@ -2354,6 +2361,7 @@ struct net_device {
>>   	void				*ml_priv;
>>   	enum netdev_ml_priv_type	ml_priv_type;
>>   
>> +	enum netdev_stat_type		pcpu_stat_type:8;
> 
> Hi Daniel,
> 
> nit: Please consider adding documentation for this new field to
>       the kernel doc for net_device.
> 

Will add, thanks Simon!
Daniel Borkmann Nov. 13, 2023, 1:05 p.m. UTC | #5
On 11/13/23 11:03 AM, Simon Horman wrote:
> On Sun, Nov 12, 2023 at 09:30:03PM +0100, Daniel Borkmann wrote:
>> Move {l,t,d}stats allocation to the core and let netdevs pick the stats
>> type they need. That way the driver doesn't have to bother with error
>> handling (allocation failure checking, making sure free happens in the
>> right spot, etc) - all happening in the core.
>>
>> Co-developed-by: Jakub Kicinski <kuba@kernel.org>
>> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> Cc: David Ahern <dsahern@kernel.org>
> 
> Hi Daniel,
> 
> sorry, I was a bit too hasty in hitting the send button for my previous
> message. I have some more minor feedback on this series.
> 
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 0d548431f3fa..75db81496db5 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -10049,6 +10049,44 @@ void netif_tx_stop_all_queues(struct net_device *dev)
>>   }
>>   EXPORT_SYMBOL(netif_tx_stop_all_queues);
>>   
>> +static int netdev_do_alloc_pcpu_stats(struct net_device *dev)
>> +{
>> +	void __percpu *v;
>> +
>> +	switch (dev->pcpu_stat_type) {
>> +	case NETDEV_PCPU_STAT_NONE:
>> +		return 0;
>> +	case NETDEV_PCPU_STAT_LSTATS:
>> +		v = dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
>> +		break;
>> +	case NETDEV_PCPU_STAT_TSTATS:
>> +		v = dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
>> +		break;
>> +	case NETDEV_PCPU_STAT_DSTATS:
>> +		v = dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
>> +		break;
>> +	}
>> +
>> +	return v ? 0 : -ENOMEM;
> 
> Perhaps this cannot happen, but if none of the cases in the switch
> statement are met, then v will be uninitialised here.
> 
> As flagged by Smatch.

Good point, I'll add a guard in case someone tries to set an invalid value
outside of the enum.

>> +}
>> +
> 
> ...
> 
>> @@ -10469,6 +10513,7 @@ void netdev_run_todo(void)
>>   		WARN_ON(rcu_access_pointer(dev->ip_ptr));
>>   		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
>>   
>> +		netdev_do_free_pcpu_stats(dev);
>>   		if (dev->priv_destructor)
>>   			dev->priv_destructor(dev);
>>   		if (dev->needs_free_netdev)
> 
> nit: the hunk above seems unnecessary; one blank line is enough.

I'm not sure which one you mean?

Thanks,
Daniel
Simon Horman Nov. 13, 2023, 4:15 p.m. UTC | #6
On Mon, Nov 13, 2023 at 02:05:36PM +0100, Daniel Borkmann wrote:
> On 11/13/23 11:03 AM, Simon Horman wrote:
> > On Sun, Nov 12, 2023 at 09:30:03PM +0100, Daniel Borkmann wrote:

...

> > > @@ -10469,6 +10513,7 @@ void netdev_run_todo(void)
> > >   		WARN_ON(rcu_access_pointer(dev->ip_ptr));
> > >   		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
> > > +		netdev_do_free_pcpu_stats(dev);
> > >   		if (dev->priv_destructor)
> > >   			dev->priv_destructor(dev);
> > >   		if (dev->needs_free_netdev)
> > 
> > nit: the hunk above seems unnecessary; one blank line is enough.
> 
> I'm not sure which one you mean?

It seems that I was confused for some reason,
please ignore my previous comment.
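
The uninitialised-variable concern raised in comment #3 can be illustrated
outside the kernel. The following is a minimal userspace sketch, not the code
from any later revision of this series: struct stats_dev, model_alloc_stats()
and the use of calloc() in place of netdev_alloc_pcpu_stats() are assumptions
made for illustration only. It shows one possible guard, a default case that
rejects values outside the enum, so that v is never read without having been
assigned.

```c
#include <errno.h>
#include <stdlib.h>

/* Same enumerators as the patch; everything else here is a stand-in. */
enum netdev_stat_type {
	NETDEV_PCPU_STAT_NONE,
	NETDEV_PCPU_STAT_LSTATS,
	NETDEV_PCPU_STAT_TSTATS,
	NETDEV_PCPU_STAT_DSTATS,
};

/* Hypothetical model of the relevant net_device fields. */
struct stats_dev {
	enum netdev_stat_type pcpu_stat_type;
	void *stats;	/* stands in for the lstats/tstats/dstats union */
};

static int model_alloc_stats(struct stats_dev *dev)
{
	void *v;

	switch (dev->pcpu_stat_type) {
	case NETDEV_PCPU_STAT_NONE:
		return 0;
	case NETDEV_PCPU_STAT_LSTATS:
	case NETDEV_PCPU_STAT_TSTATS:
	case NETDEV_PCPU_STAT_DSTATS:
		/* calloc() replaces the per-CPU allocator in this model. */
		v = dev->stats = calloc(1, 64);
		break;
	default:
		/* Assumed guard: reject anything outside the enum. */
		return -EINVAL;
	}

	return v ? 0 : -ENOMEM;
}
```

With the default case in place every path either returns early or assigns v
before the final check, which is the property Smatch could not establish for
the posted version. The choice of -EINVAL as the error code is an assumption
of this sketch.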

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 9980517ed8b0..ac030c241d1a 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1506,25 +1506,12 @@  static void veth_free_queues(struct net_device *dev)
 
 static int veth_dev_init(struct net_device *dev)
 {
-	int err;
-
-	dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
-	if (!dev->lstats)
-		return -ENOMEM;
-
-	err = veth_alloc_queues(dev);
-	if (err) {
-		free_percpu(dev->lstats);
-		return err;
-	}
-
-	return 0;
+	return veth_alloc_queues(dev);
 }
 
 static void veth_dev_free(struct net_device *dev)
 {
 	veth_free_queues(dev);
-	free_percpu(dev->lstats);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
@@ -1796,6 +1783,7 @@  static void veth_setup(struct net_device *dev)
 			       NETIF_F_HW_VLAN_STAG_RX);
 	dev->needs_free_netdev = true;
 	dev->priv_destructor = veth_dev_free;
+	dev->pcpu_stat_type = NETDEV_PCPU_STAT_LSTATS;
 	dev->max_mtu = ETH_MAX_MTU;
 
 	dev->hw_features = VETH_FEATURES;
diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 3e6e0fdc3ba7..bb95ce43cd97 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1164,22 +1164,15 @@  static void vrf_dev_uninit(struct net_device *dev)
 
 	vrf_rtable_release(dev, vrf);
 	vrf_rt6_release(dev, vrf);
-
-	free_percpu(dev->dstats);
-	dev->dstats = NULL;
 }
 
 static int vrf_dev_init(struct net_device *dev)
 {
 	struct net_vrf *vrf = netdev_priv(dev);
 
-	dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
-	if (!dev->dstats)
-		goto out_nomem;
-
 	/* create the default dst which points back to us */
 	if (vrf_rtable_create(dev) != 0)
-		goto out_stats;
+		goto out_nomem;
 
 	if (vrf_rt6_create(dev) != 0)
 		goto out_rth;
@@ -1193,9 +1186,6 @@  static int vrf_dev_init(struct net_device *dev)
 
 out_rth:
 	vrf_rtable_release(dev, vrf);
-out_stats:
-	free_percpu(dev->dstats);
-	dev->dstats = NULL;
 out_nomem:
 	return -ENOMEM;
 }
@@ -1694,6 +1684,8 @@  static void vrf_setup(struct net_device *dev)
 	dev->min_mtu = IPV6_MIN_MTU;
 	dev->max_mtu = IP6_MAX_MTU;
 	dev->mtu = dev->max_mtu;
+
+	dev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
 }
 
 static int vrf_validate(struct nlattr *tb[], struct nlattr *data[],
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 98082113156e..eccb00a4572f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1797,6 +1797,13 @@  enum netdev_ml_priv_type {
 	ML_PRIV_CAN,
 };
 
+enum netdev_stat_type {
+	NETDEV_PCPU_STAT_NONE,
+	NETDEV_PCPU_STAT_LSTATS, /* struct pcpu_lstats */
+	NETDEV_PCPU_STAT_TSTATS, /* struct pcpu_sw_netstats */
+	NETDEV_PCPU_STAT_DSTATS, /* struct pcpu_dstats */
+};
+
 /**
  *	struct net_device - The DEVICE structure.
  *
@@ -2354,6 +2361,7 @@  struct net_device {
 	void				*ml_priv;
 	enum netdev_ml_priv_type	ml_priv_type;
 
+	enum netdev_stat_type		pcpu_stat_type:8;
 	union {
 		struct pcpu_lstats __percpu		*lstats;
 		struct pcpu_sw_netstats __percpu	*tstats;
diff --git a/net/core/dev.c b/net/core/dev.c
index 0d548431f3fa..75db81496db5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10049,6 +10049,44 @@  void netif_tx_stop_all_queues(struct net_device *dev)
 }
 EXPORT_SYMBOL(netif_tx_stop_all_queues);
 
+static int netdev_do_alloc_pcpu_stats(struct net_device *dev)
+{
+	void __percpu *v;
+
+	switch (dev->pcpu_stat_type) {
+	case NETDEV_PCPU_STAT_NONE:
+		return 0;
+	case NETDEV_PCPU_STAT_LSTATS:
+		v = dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
+		break;
+	case NETDEV_PCPU_STAT_TSTATS:
+		v = dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+		break;
+	case NETDEV_PCPU_STAT_DSTATS:
+		v = dev->dstats = netdev_alloc_pcpu_stats(struct pcpu_dstats);
+		break;
+	}
+
+	return v ? 0 : -ENOMEM;
+}
+
+static void netdev_do_free_pcpu_stats(struct net_device *dev)
+{
+	switch (dev->pcpu_stat_type) {
+	case NETDEV_PCPU_STAT_NONE:
+		return;
+	case NETDEV_PCPU_STAT_LSTATS:
+		free_percpu(dev->lstats);
+		break;
+	case NETDEV_PCPU_STAT_TSTATS:
+		free_percpu(dev->tstats);
+		break;
+	case NETDEV_PCPU_STAT_DSTATS:
+		free_percpu(dev->dstats);
+		break;
+	}
+}
+
 /**
  * register_netdevice() - register a network device
  * @dev: device to register
@@ -10109,9 +10147,13 @@  int register_netdevice(struct net_device *dev)
 		goto err_uninit;
 	}
 
+	ret = netdev_do_alloc_pcpu_stats(dev);
+	if (ret)
+		goto err_uninit;
+
 	ret = dev_index_reserve(net, dev->ifindex);
 	if (ret < 0)
-		goto err_uninit;
+		goto err_free_pcpu;
 	dev->ifindex = ret;
 
 	/* Transfer changeable features to wanted_features and enable
@@ -10217,6 +10259,8 @@  int register_netdevice(struct net_device *dev)
 	call_netdevice_notifiers(NETDEV_PRE_UNINIT, dev);
 err_ifindex_release:
 	dev_index_release(net, dev->ifindex);
+err_free_pcpu:
+	netdev_do_free_pcpu_stats(dev);
 err_uninit:
 	if (dev->netdev_ops->ndo_uninit)
 		dev->netdev_ops->ndo_uninit(dev);
@@ -10469,6 +10513,7 @@  void netdev_run_todo(void)
 		WARN_ON(rcu_access_pointer(dev->ip_ptr));
 		WARN_ON(rcu_access_pointer(dev->ip6_ptr));
 
+		netdev_do_free_pcpu_stats(dev);
 		if (dev->priv_destructor)
 			dev->priv_destructor(dev);
 		if (dev->needs_free_netdev)
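
The error unwinding that the register_netdevice() hunks above introduce can be
modelled in userspace. This is a rough sketch under stated assumptions, not
kernel code: struct reg_dev, model_trace(), model_register() and the
fail_ifindex test knob are hypothetical names, and calloc()/free() stand in
for the per-CPU allocator. It only demonstrates the ordering: stats are
allocated after the driver's init step, and a later failure frees them before
the uninit step runs.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a device going through registration. */
struct reg_dev {
	int want_stats;		/* models pcpu_stat_type != NETDEV_PCPU_STAT_NONE */
	int fail_ifindex;	/* test knob: make the ifindex step fail */
	void *stats;
	char trace[64];		/* records the order of the steps taken */
};

static void model_trace(struct reg_dev *dev, const char *step)
{
	strncat(dev->trace, step, sizeof(dev->trace) - strlen(dev->trace) - 1);
}

static int model_register(struct reg_dev *dev)
{
	int ret;

	model_trace(dev, "init;");	/* the driver's ndo_init() succeeded */

	if (dev->want_stats) {
		dev->stats = calloc(1, 64);
		if (!dev->stats) {
			ret = -ENOMEM;
			goto err_uninit;
		}
		model_trace(dev, "alloc;");
	}

	if (dev->fail_ifindex) {
		ret = -EBUSY;		/* arbitrary error for the model */
		goto err_free_pcpu;
	}
	model_trace(dev, "ifindex;");
	return 0;

err_free_pcpu:
	free(dev->stats);		/* free(NULL) is a no-op */
	dev->stats = NULL;
	model_trace(dev, "free;");
err_uninit:
	model_trace(dev, "uninit;");
	return ret;
}
```

Because the labels fall through in reverse order of setup, a driver such as
veth no longer needs its own free_percpu() call on the init error path, which
matches the lines the patch removes from veth_dev_init() and vrf_dev_init().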