diff mbox series

[net-next] net: create a dummy net_device allocator

Message ID 20240327200809.512867-1-leitao@debian.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: create a dummy net_device allocator | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 4891 this patch: 4891
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 4 of 4 maintainers
netdev/build_clang success Errors and warnings before: 1058 this patch: 1058
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 5169 this patch: 5169
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 85 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Breno Leitao March 27, 2024, 8:08 p.m. UTC
It is impossible to use init_dummy_netdev together with alloc_netdev()
as the 'setup' argument.

This is because alloc_netdev() initializes some fields in the net_device
structure, and later init_dummy_netdev() memzero them all. This causes
some problems as reported here:

	https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/

Split the init_dummy_netdev() function in two. Create a new function called
init_dummy_netdev_core() that does not memzero the net_device structure.
Then have init_dummy_netdev() memzero-ing and calling
init_dummy_netdev_core(), keeping the old behaviour.

init_dummy_netdev_core() is the new function that could be called as an
argument for alloc_netdev().

Also, create a helper to allocate and initialize dummy net devices,
leveraging init_dummy_netdev_core() as the setup argument. This function
basically simplify the allocation of dummy devices, by allocating and
initializing it. Freeing the device continue to be done through
free_netdev()

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 include/linux/netdevice.h |  3 +++
 net/core/dev.c            | 55 ++++++++++++++++++++++++++-------------
 2 files changed, 40 insertions(+), 18 deletions(-)

Comments

Jakub Kicinski March 27, 2024, 11:17 p.m. UTC | #1
On Wed, 27 Mar 2024 13:08:04 -0700 Breno Leitao wrote:
> It is impossible to use init_dummy_netdev together with alloc_netdev()
> as the 'setup' argument.
> 
> This is because alloc_netdev() initializes some fields in the net_device
> structure, and later init_dummy_netdev() memzero them all. This causes
> some problems as reported here:
> 
> 	https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/
> 
> Split the init_dummy_netdev() function in two. Create a new function called
> init_dummy_netdev_core() that does not memzero the net_device structure.
> Then have init_dummy_netdev() memzero-ing and calling
> init_dummy_netdev_core(), keeping the old behaviour.
> 
> init_dummy_netdev_core() is the new function that could be called as an
> argument for alloc_netdev().
> 
> Also, create a helper to allocate and initialize dummy net devices,
> leveraging init_dummy_netdev_core() as the setup argument. This function
> basically simplify the allocation of dummy devices, by allocating and
> initializing it. Freeing the device continue to be done through
> free_netdev()

Ah, but you need to make it part of the series with some caller.
Maybe convert all the ethernet ones?

$ git grep 'struct net_device [^*]*;' -- drivers/net/ethernet/
drivers/net/ethernet/cavium/thunder/thunder_bgx.c:      struct net_device       netdev;
drivers/net/ethernet/marvell/prestera/prestera_rxtx.c:  struct net_device napi_dev;
drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c:static struct net_device test_netdev = {};
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:static struct net_device test_netdev = {};
Alexander Lobakin March 28, 2024, 2:57 p.m. UTC | #2
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 27 Mar 2024 16:17:31 -0700

> On Wed, 27 Mar 2024 13:08:04 -0700 Breno Leitao wrote:
>> It is impossible to use init_dummy_netdev together with alloc_netdev()
>> as the 'setup' argument.
>>
>> This is because alloc_netdev() initializes some fields in the net_device
>> structure, and later init_dummy_netdev() memzero them all. This causes
>> some problems as reported here:
>>
>> 	https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/
>>
>> Split the init_dummy_netdev() function in two. Create a new function called
>> init_dummy_netdev_core() that does not memzero the net_device structure.
>> Then have init_dummy_netdev() memzero-ing and calling
>> init_dummy_netdev_core(), keeping the old behaviour.
>>
>> init_dummy_netdev_core() is the new function that could be called as an
>> argument for alloc_netdev().
>>
>> Also, create a helper to allocate and initialize dummy net devices,
>> leveraging init_dummy_netdev_core() as the setup argument. This function
>> basically simplify the allocation of dummy devices, by allocating and
>> initializing it. Freeing the device continue to be done through
>> free_netdev()
> 
> Ah, but you need to make it part of the series with some caller.
> Maybe convert all the ethernet ones?
> 
> $ git grep 'struct net_device [^*]*;' -- drivers/net/ethernet/
> drivers/net/ethernet/cavium/thunder/thunder_bgx.c:      struct net_device       netdev;
> drivers/net/ethernet/marvell/prestera/prestera_rxtx.c:  struct net_device napi_dev;
> drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c:static struct net_device test_netdev = {};
> drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:static struct net_device test_netdev = {};

There are much more of them unfortunately. Pretty much every user of
init_dummy_netdev()[0].

I would prefer *replacing* init_dummy_netdev() with
alloc_dummy_netdev(), so that we'll be sure there are no embedded
&net_devices and we can use a proper flex array there.

[0] https://elixir.bootlin.com/linux/v6.9-rc1/A/ident/init_dummy_netdev

Thanks,
Olek
Alexander Lobakin March 28, 2024, 3:02 p.m. UTC | #3
From: Breno Leitao <leitao@debian.org>
Date: Wed, 27 Mar 2024 13:08:04 -0700

> It is impossible to use init_dummy_netdev together with alloc_netdev()
> as the 'setup' argument.
> 
> This is because alloc_netdev() initializes some fields in the net_device
> structure, and later init_dummy_netdev() memzero them all. This causes
> some problems as reported here:
> 
> 	https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/
> 
> Split the init_dummy_netdev() function in two. Create a new function called
> init_dummy_netdev_core() that does not memzero the net_device structure.
> Then have init_dummy_netdev() memzero-ing and calling
> init_dummy_netdev_core(), keeping the old behaviour.
> 
> init_dummy_netdev_core() is the new function that could be called as an
> argument for alloc_netdev().
> 
> Also, create a helper to allocate and initialize dummy net devices,
> leveraging init_dummy_netdev_core() as the setup argument. This function
> basically simplify the allocation of dummy devices, by allocating and
> initializing it. Freeing the device continue to be done through
> free_netdev()
> 
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  include/linux/netdevice.h |  3 +++
>  net/core/dev.c            | 55 ++++++++++++++++++++++++++-------------
>  2 files changed, 40 insertions(+), 18 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index c6f6ac779b34..f4226a99a146 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -4545,6 +4545,9 @@ static inline void netif_addr_unlock_bh(struct net_device *dev)
>  
>  void ether_setup(struct net_device *dev);
>  
> +/* Allocate dummy net_device */
> +struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name);
> +
>  /* Support for loadable net-drivers */
>  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
>  				    unsigned char name_assign_type,
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0766a245816b..df2484bbc041 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10340,25 +10340,12 @@ int register_netdevice(struct net_device *dev)
>  }
>  EXPORT_SYMBOL(register_netdevice);
>  
> -/**
> - *	init_dummy_netdev	- init a dummy network device for NAPI
> - *	@dev: device to init
> - *
> - *	This takes a network device structure and initialize the minimum
> - *	amount of fields so it can be used to schedule NAPI polls without
> - *	registering a full blown interface. This is to be used by drivers
> - *	that need to tie several hardware interfaces to a single NAPI
> - *	poll scheduler due to HW limitations.
> +/* Initialize the core of a dummy net device.
> + * This is useful if you are calling this function after alloc_netdev(),
> + * since it does not memset the net_device fields.
>   */
> -void init_dummy_netdev(struct net_device *dev)
> +static void init_dummy_netdev_core(struct net_device *dev)
>  {
> -	/* Clear everything. Note we don't initialize spinlocks
> -	 * are they aren't supposed to be taken by any of the
> -	 * NAPI code and this dummy netdev is supposed to be
> -	 * only ever used for NAPI polls
> -	 */
> -	memset(dev, 0, sizeof(struct net_device));
> -
>  	/* make sure we BUG if trying to hit standard
>  	 * register/unregister code path
>  	 */
> @@ -10379,8 +10366,28 @@ void init_dummy_netdev(struct net_device *dev)
>  	 * its refcount.
>  	 */
>  }
> -EXPORT_SYMBOL_GPL(init_dummy_netdev);
>  
> +/**
> + *	init_dummy_netdev	- init a dummy network device for NAPI
> + *	@dev: device to init
> + *
> + *	This takes a network device structure and initialize the minimum
> + *	amount of fields so it can be used to schedule NAPI polls without
> + *	registering a full blown interface. This is to be used by drivers
> + *	that need to tie several hardware interfaces to a single NAPI
> + *	poll scheduler due to HW limitations.
> + */
> +void init_dummy_netdev(struct net_device *dev)
> +{
> +	/* Clear everything. Note we don't initialize spinlocks
> +	 * are they aren't supposed to be taken by any of the
> +	 * NAPI code and this dummy netdev is supposed to be
> +	 * only ever used for NAPI polls
> +	 */
> +	memset(dev, 0, sizeof(struct net_device));
> +	init_dummy_netdev_core(dev);
> +}
> +EXPORT_SYMBOL_GPL(init_dummy_netdev);
>  
>  /**
>   *	register_netdev	- register a network device
> @@ -10991,6 +10998,18 @@ void free_netdev(struct net_device *dev)
>  }
>  EXPORT_SYMBOL(free_netdev);
>  
> +/**
> + * alloc_netdev_dummy - Allocate and initialize a dummy net device.
> + * @sizeof_priv: size of private data to allocate space for
> + * @name: device name format string
> + */
> +struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name)

Since the users of init_dummy_netdev embed &net_device into their
private structures, do we need sizeof_priv here at all? Or maybe we
could unconditionally pass 0?

> +{
> +	return alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN,
> +			    init_dummy_netdev_core);
> +}
> +EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
> +
>  /**
>   *	synchronize_net -  Synchronize with packet receive processing
>   *

As Jakub mentioned, you need to introduce consumers of the functionality
you add within the same series. Personally, I'd like to see a series
with agressive conversion of all the affected drivers from
init_dummy_netdev() to alloc_dummy_netdev() and final removal of
init_dummy_netdev() :D

(and then a followup which converts &net_device to proper flex arrays)

Thanks,
Olek
Jakub Kicinski March 28, 2024, 5:10 p.m. UTC | #4
On Thu, 28 Mar 2024 16:02:12 +0100 Alexander Lobakin wrote:
> > +/**
> > + * alloc_netdev_dummy - Allocate and initialize a dummy net device.
> > + * @sizeof_priv: size of private data to allocate space for
> > + * @name: device name format string
> > + */
> > +struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name)  
> 
> Since the users of init_dummy_netdev embed &net_device into their
> private structures, do we need sizeof_priv here at all? Or maybe we
> could unconditionally pass 0?

FWIW similar thing could be said about @name, if we never intend to
register the device - it will never have a legitimate (user visible)
name. So we may be better off naming them "dummy#" or some such.
No strong preference, tho. Adding params back later may be a bit
of a pain.

> > +{
> > +	return alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN,
> > +			    init_dummy_netdev_core);
> > +}
> > +EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
> > +
> >  /**
> >   *	synchronize_net -  Synchronize with packet receive processing
> >   *  
> 
> As Jakub mentioned, you need to introduce consumers of the functionality
> you add within the same series. Personally, I'd like to see a series
> with agressive conversion of all the affected drivers from
> init_dummy_netdev() to alloc_dummy_netdev() and final removal of
> init_dummy_netdev() :D

We can, and put it on a shared branch so other trees can also pull in
the conversions. No preference on my side, tho. I think that Breno
doesn't have any of the HW in question, so starting with one and making
sure it works could be more "work conserving", than redoing all
patches..
Breno Leitao March 28, 2024, 5:40 p.m. UTC | #5
On Thu, Mar 28, 2024 at 04:02:12PM +0100, Alexander Lobakin wrote:
> From: Breno Leitao <leitao@debian.org>
> Date: Wed, 27 Mar 2024 13:08:04 -0700

> > @@ -10991,6 +10998,18 @@ void free_netdev(struct net_device *dev)
> >  }
> >  EXPORT_SYMBOL(free_netdev);
> >  
> > +/**
> > + * alloc_netdev_dummy - Allocate and initialize a dummy net device.
> > + * @sizeof_priv: size of private data to allocate space for
> > + * @name: device name format string
> > + */
> > +struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name)
> 
> Since the users of init_dummy_netdev embed &net_device into their
> private structures, do we need sizeof_priv here at all? Or maybe we
> could unconditionally pass 0?

We need to have this private size for cases where we need to get the
pointer to their private structure given the dummy device.  in the
embedded case you can use container_of(), but, with a pointer to
net_device, that is not the case anymore, and we should use the dummy private
data pointer back to the private data.

For instance, see the example of iwlwifi: 
https://lore.kernel.org/all/20240307174843.1719130-1-leitao@debian.org/

> > +{
> > +	return alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN,
> > +			    init_dummy_netdev_core);
> > +}
> > +EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
> > +
> >  /**
> >   *	synchronize_net -  Synchronize with packet receive processing
> >   *
> 
> As Jakub mentioned, you need to introduce consumers of the functionality
> you add within the same series. Personally, I'd like to see a series
> with agressive conversion of all the affected drivers from
> init_dummy_netdev() to alloc_dummy_netdev() and final removal of
> init_dummy_netdev() :D
> 
> (and then a followup which converts &net_device to proper flex arrays)

That is the goal in fact, but I personally think this will take a while,
given that we need to rely on maintainers to test the changes to be able
to move forward.
Breno Leitao March 28, 2024, 5:46 p.m. UTC | #6
On Thu, Mar 28, 2024 at 10:10:53AM -0700, Jakub Kicinski wrote:
> On Thu, 28 Mar 2024 16:02:12 +0100 Alexander Lobakin wrote:
> > > +/**
> > > + * alloc_netdev_dummy - Allocate and initialize a dummy net device.
> > > + * @sizeof_priv: size of private data to allocate space for
> > > + * @name: device name format string
> > > + */
> > > +struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name)  
> > 
> > Since the users of init_dummy_netdev embed &net_device into their
> > private structures, do we need sizeof_priv here at all? Or maybe we
> > could unconditionally pass 0?
> 
> FWIW similar thing could be said about @name, if we never intend to
> register the device - it will never have a legitimate (user visible)
> name. So we may be better off naming them "dummy#" or some such.
> No strong preference, tho. Adding params back later may be a bit
> of a pain.

Removing the @name seems to be safer than @sizeof_priv. I can remove it
in v2 if any one has any strong preference.

Unfortunately removing @sizeof_priv might not be possible given cases as
iwlwifi.

> > > +{
> > > +	return alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN,
> > > +			    init_dummy_netdev_core);
> > > +}
> > > +EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
> > > +
> > >  /**
> > >   *	synchronize_net -  Synchronize with packet receive processing
> > >   *  
> > 
> > As Jakub mentioned, you need to introduce consumers of the functionality
> > you add within the same series. Personally, I'd like to see a series
> > with agressive conversion of all the affected drivers from
> > init_dummy_netdev() to alloc_dummy_netdev() and final removal of
> > init_dummy_netdev() :D
> 
> We can, and put it on a shared branch so other trees can also pull in
> the conversions. No preference on my side, tho. I think that Breno
> doesn't have any of the HW in question, so starting with one and making
> sure it works could be more "work conserving", than redoing all
> patches..

I would prefer to do the more conservative approach first and make sure
there is no major regression, and then complete the work once the risk
is low.
Leon Romanovsky April 2, 2024, 6:01 p.m. UTC | #7
On Wed, Mar 27, 2024 at 01:08:04PM -0700, Breno Leitao wrote:
> It is impossible to use init_dummy_netdev together with alloc_netdev()
> as the 'setup' argument.

<...>

> Also, create a helper to allocate and initialize dummy net devices,
> leveraging init_dummy_netdev_core() as the setup argument. 

<...>

> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Exciting read for people who remember this conversation:
"""
> I prefer to see some new wrapper over plain alloc_netdev, which will
> create this dummy netdevice. For example, alloc_dummy_netdev(...).

Nope, no bona fide APIs for hacky uses.
"""
https://lore.kernel.org/linux-rdma/20240311112532.71f1cb35@kernel.org/

Thanks
Jakub Kicinski April 2, 2024, 6:17 p.m. UTC | #8
On Tue, 2 Apr 2024 21:01:55 +0300 Leon Romanovsky wrote:
> > Suggested-by: Jakub Kicinski <kuba@kernel.org>
> > Signed-off-by: Breno Leitao <leitao@debian.org>  
> 
> Exciting read for people who remember this conversation:
> """
> > I prefer to see some new wrapper over plain alloc_netdev, which will
> > create this dummy netdevice. For example, alloc_dummy_netdev(...).  
> 
> Nope, no bona fide APIs for hacky uses.
> """
> https://lore.kernel.org/linux-rdma/20240311112532.71f1cb35@kernel.org/

Still my preference, but there's only so many hours in the day
to keep explaining things. I'd rather we made some progress.
diff mbox series

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index c6f6ac779b34..f4226a99a146 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4545,6 +4545,9 @@  static inline void netif_addr_unlock_bh(struct net_device *dev)
 
 void ether_setup(struct net_device *dev);
 
+/* Allocate dummy net_device */
+struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name);
+
 /* Support for loadable net-drivers */
 struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 				    unsigned char name_assign_type,
diff --git a/net/core/dev.c b/net/core/dev.c
index 0766a245816b..df2484bbc041 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10340,25 +10340,12 @@  int register_netdevice(struct net_device *dev)
 }
 EXPORT_SYMBOL(register_netdevice);
 
-/**
- *	init_dummy_netdev	- init a dummy network device for NAPI
- *	@dev: device to init
- *
- *	This takes a network device structure and initialize the minimum
- *	amount of fields so it can be used to schedule NAPI polls without
- *	registering a full blown interface. This is to be used by drivers
- *	that need to tie several hardware interfaces to a single NAPI
- *	poll scheduler due to HW limitations.
+/* Initialize the core of a dummy net device.
+ * This is useful if you are calling this function after alloc_netdev(),
+ * since it does not memset the net_device fields.
  */
-void init_dummy_netdev(struct net_device *dev)
+static void init_dummy_netdev_core(struct net_device *dev)
 {
-	/* Clear everything. Note we don't initialize spinlocks
-	 * are they aren't supposed to be taken by any of the
-	 * NAPI code and this dummy netdev is supposed to be
-	 * only ever used for NAPI polls
-	 */
-	memset(dev, 0, sizeof(struct net_device));
-
 	/* make sure we BUG if trying to hit standard
 	 * register/unregister code path
 	 */
@@ -10379,8 +10366,28 @@  void init_dummy_netdev(struct net_device *dev)
 	 * its refcount.
 	 */
 }
-EXPORT_SYMBOL_GPL(init_dummy_netdev);
 
+/**
+ *	init_dummy_netdev	- init a dummy network device for NAPI
+ *	@dev: device to init
+ *
+ *	This takes a network device structure and initialize the minimum
+ *	amount of fields so it can be used to schedule NAPI polls without
+ *	registering a full blown interface. This is to be used by drivers
+ *	that need to tie several hardware interfaces to a single NAPI
+ *	poll scheduler due to HW limitations.
+ */
+void init_dummy_netdev(struct net_device *dev)
+{
+	/* Clear everything. Note we don't initialize spinlocks
+	 * are they aren't supposed to be taken by any of the
+	 * NAPI code and this dummy netdev is supposed to be
+	 * only ever used for NAPI polls
+	 */
+	memset(dev, 0, sizeof(struct net_device));
+	init_dummy_netdev_core(dev);
+}
+EXPORT_SYMBOL_GPL(init_dummy_netdev);
 
 /**
  *	register_netdev	- register a network device
@@ -10991,6 +10998,18 @@  void free_netdev(struct net_device *dev)
 }
 EXPORT_SYMBOL(free_netdev);
 
+/**
+ * alloc_netdev_dummy - Allocate and initialize a dummy net device.
+ * @sizeof_priv: size of private data to allocate space for
+ * @name: device name format string
+ */
+struct net_device *alloc_netdev_dummy(int sizeof_priv, const char *name)
+{
+	return alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN,
+			    init_dummy_netdev_core);
+}
+EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
+
 /**
  *	synchronize_net -  Synchronize with packet receive processing
  *