Message ID | 1344871635-1052-2-git-send-email-jiri@resnulli.us (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Mon, 2012-08-13 at 17:27 +0200, Jiri Pirko wrote: > This lists are supposed to serve for storing pointers to all upper devices. > Eventually it will replace dev->master pointer which is used for > bonding, bridge, team but it cannot be used for vlan, macvlan where > there might be multiple "masters" present. > > New upper device list resolves this limitation. Also, the information > stored in lists is used for preventing looping setups like > "bond->somethingelse->samebond" > > Signed-off-by: Jiri Pirko <jiri@resnulli.us> [...] > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -4425,6 +4425,229 @@ static int __init dev_proc_init(void) > #endif /* CONFIG_PROC_FS */ > > > +struct netdev_upper { > + struct net_device *dev; > + bool unique; This needs a better name. It doesn't really have anything to do with uniqueness and doesn't ensure exclusivity. I think that it would be fine to keep the 'master' term. > + struct list_head list; > + struct rcu_head rcu; > +}; [...] > +static int __netdev_upper_dev_link(struct net_device *dev, > + struct net_device *upper_dev, bool unique) > +{ > + struct netdev_upper *upper; > + > + ASSERT_RTNL(); > + > + if (dev == upper_dev) > + return -EBUSY; > + /* > + * To prevent loops, check if dev is not upper device to upper_dev. > + */ > + if (__netdev_has_upper_dev(upper_dev, dev, true)) > + return -EBUSY; > + > + if (__netdev_find_upper(dev, upper_dev)) > + return -EEXIST; > + > + if (unique && netdev_unique_upper_dev_get(dev)) > + return -EBUSY; > + > + upper = kmalloc(sizeof(*upper), GFP_KERNEL); > + if (!upper) > + return -ENOMEM; > + > + upper->dev = upper_dev; > + upper->unique = unique; > + > + /* > + * Ensure that unique upper link is always the first item in the list. > + */ > + if (unique) > + list_add_rcu(&upper->list, &dev->upper_dev_list); > + else > + list_add_tail_rcu(&upper->list, &dev->upper_dev_list); > + dev_hold(upper_dev); This behaviour (calling dev_hold()) matches netdev_set_master(). But it's oddly asymmetric: generally the administrator can remove either the upper device or the lower device (rtnl_link_ops or unbinding a physical device) and the upper device driver must then unlink itself from the lower device (using a notifier to catch lower device removal). If the upper device driver fails to unlink when the upper device is unregistered, then this extra reference causes netdev_wait_allrefs() to hang... is that the intent? Or should there be a more explicit counter and check on unregistration, e.g. WARN_ON(dev->num_lower_devs != 0)? If it fails to unlink when the lower device is removed, this warning in rollback_registered_many() may be triggered: /* Notifier chain MUST detach us from master device. */ WARN_ON(dev->master); I think that needs to become WARN_ON(netdev_has_upper_dev(dev)). > + return 0; > +} [...]
Mon, Aug 13, 2012 at 07:04:11PM CEST, bhutchings@solarflare.com wrote: >On Mon, 2012-08-13 at 17:27 +0200, Jiri Pirko wrote: >> This lists are supposed to serve for storing pointers to all upper devices. >> Eventually it will replace dev->master pointer which is used for >> bonding, bridge, team but it cannot be used for vlan, macvlan where >> there might be multiple "masters" present. >> >> New upper device list resolves this limitation. Also, the information >> stored in lists is used for preventing looping setups like >> "bond->somethingelse->samebond" >> >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >[...] >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -4425,6 +4425,229 @@ static int __init dev_proc_init(void) >> #endif /* CONFIG_PROC_FS */ >> >> >> +struct netdev_upper { >> + struct net_device *dev; >> + bool unique; > >This needs a better name. It doesn't really have anything to do with >uniqueness and doesn't ensure exclusivity. I think that it would be >fine to keep the 'master' term. Hmm. I admit that "unique" I do not like too much as well. But "master" I like even less. This flag should ensure exclusivity. Only one upper device with this flag can be present at a time. > >> + struct list_head list; >> + struct rcu_head rcu; >> +}; >[...] >> +static int __netdev_upper_dev_link(struct net_device *dev, >> + struct net_device *upper_dev, bool unique) >> +{ >> + struct netdev_upper *upper; >> + >> + ASSERT_RTNL(); >> + >> + if (dev == upper_dev) >> + return -EBUSY; >> + /* >> + * To prevent loops, check if dev is not upper device to upper_dev. >> + */ >> + if (__netdev_has_upper_dev(upper_dev, dev, true)) >> + return -EBUSY; >> + >> + if (__netdev_find_upper(dev, upper_dev)) >> + return -EEXIST; >> + >> + if (unique && netdev_unique_upper_dev_get(dev)) >> + return -EBUSY; >> + >> + upper = kmalloc(sizeof(*upper), GFP_KERNEL); >> + if (!upper) >> + return -ENOMEM; >> + >> + upper->dev = upper_dev; >> + upper->unique = unique; >> + >> + /* >> + * Ensure that unique upper link is always the first item in the list. >> + */ >> + if (unique) >> + list_add_rcu(&upper->list, &dev->upper_dev_list); >> + else >> + list_add_tail_rcu(&upper->list, &dev->upper_dev_list); >> + dev_hold(upper_dev); > >This behaviour (calling dev_hold()) matches netdev_set_master(). But >it's oddly asymmetric: generally the administrator can remove either the >upper device or the lower device (rtnl_link_ops or unbinding a physical >device) and the upper device driver must then unlink itself from the >lower device (using a notifier to catch lower device removal). > >If the upper device driver fails to unlink when the upper device is >unregistered, then this extra reference causes netdev_wait_allrefs() to >hang... is that the intent? Or should there be a more explicit counter >and check on unregistration, e.g. WARN_ON(dev->num_lower_devs != 0)? > I'm not sure I understand you. I believe that upper device notifier should take care of the unlink. This behaviour is unchanged by the patch. >If it fails to unlink when the lower device is removed, this warning in >rollback_registered_many() may be triggered: > > /* Notifier chain MUST detach us from master device. */ > WARN_ON(dev->master); > >I think that needs to become WARN_ON(netdev_has_upper_dev(dev)). Patch 15 > >> + return 0; >> +} >[...] > >-- >Ben Hutchings, Staff Engineer, Solarflare >Not speaking for my employer; that's the marketing department's job. >They asked us to note that Solarflare product names are trademarked. > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 13 Aug 2012 17:27:00 +0200 Jiri Pirko <jiri@resnulli.us> wrote: > This lists are supposed to serve for storing pointers to all upper devices. > Eventually it will replace dev->master pointer which is used for > bonding, bridge, team but it cannot be used for vlan, macvlan where > there might be multiple "masters" present. > > New upper device list resolves this limitation. Also, the information > stored in lists is used for preventing looping setups like > "bond->somethingelse->samebond" > > Signed-off-by: Jiri Pirko <jiri@resnulli.us> > --- > include/linux/netdevice.h | 14 +++ > net/core/dev.c | 232 ++++++++++++++++++++++++++++++++++++++++++++- > 2 files changed, 244 insertions(+), 2 deletions(-) > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index a9db4f3..e7a07f8 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -1173,6 +1173,8 @@ struct net_device { > * which this device is member of. > */ > > + struct list_head upper_dev_list; /* List of upper devices */ > + > /* Interface address info used in eth_type_trans() */ > unsigned char *dev_addr; /* hw address, (before bcast > because most packets are > @@ -2611,6 +2613,18 @@ extern int netdev_max_backlog; > extern int netdev_tstamp_prequeue; > extern int weight_p; > extern int bpf_jit_enable; > + > +extern bool netdev_has_upper_dev(struct net_device *dev, > + struct net_device *upper_dev); > +extern bool netdev_has_any_upper_dev(struct net_device *dev); > +extern struct net_device *netdev_unique_upper_dev_get(struct net_device *dev); > +extern struct net_device *netdev_unique_upper_dev_get_rcu(struct net_device *dev); > +extern int netdev_upper_dev_link(struct net_device *dev, > + struct net_device *upper_dev); > +extern int netdev_unique_upper_dev_link(struct net_device *dev, > + struct net_device *upper_dev); > +extern void netdev_upper_dev_unlink(struct net_device *dev, > + struct net_device *upper_dev); > extern int netdev_set_master(struct net_device *dev, struct net_device *master); > extern int netdev_set_bond_master(struct net_device *dev, > struct net_device *master); > diff --git a/net/core/dev.c b/net/core/dev.c > index 1f06df8..68db1ac 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -4425,6 +4425,229 @@ static int __init dev_proc_init(void) > #endif /* CONFIG_PROC_FS */ > > > +struct netdev_upper { > + struct net_device *dev; > + bool unique; unique is quite confusing here. I see that it is possible to have one unique and many non-unique linked in the list, so maybe rename to 'main_dev', 'master' or 'principal'... > + struct list_head list; > + struct rcu_head rcu; > +}; > + > +static bool __netdev_has_upper_dev(struct net_device *dev, > + struct net_device *upper_dev, > + bool deep) > +{ > + struct netdev_upper *upper; > + > + list_for_each_entry(upper, &dev->upper_dev_list, list) { > + if (upper->dev == upper_dev) > + return true; > + if (deep && __netdev_has_upper_dev(upper->dev, upper_dev, deep)) > + return true; > + } > + return false; > +} > + > +static struct netdev_upper *__netdev_find_upper(struct net_device *dev, > + struct net_device *upper_dev) > +{ > + struct netdev_upper *upper; > + > + list_for_each_entry(upper, &dev->upper_dev_list, list) { > + if (upper->dev == upper_dev) > + return upper; > + } > + return NULL; > +} > + > +/** > + * netdev_has_upper_dev - Check if device is linked to an upper device > + * @dev: device > + * @upper_dev: upper device to check > + * > + * Find out if a device is linked to specified upper device and return true > + * in case it is. The caller must hold the RTNL semaphore. > + */ > +bool netdev_has_upper_dev(struct net_device *dev, > + struct net_device *upper_dev) > +{ > + ASSERT_RTNL(); > + > + return __netdev_has_upper_dev(dev, upper_dev, false); > +} > +EXPORT_SYMBOL(netdev_has_upper_dev); > + > +/** > + * netdev_has_any_upper_dev - Check if device is linked to some device > + * @dev: device > + * > + * Find out if a device is linked to an upper device and return true in case > + * it is. The caller must hold the RTNL semaphore. > + */ > +bool netdev_has_any_upper_dev(struct net_device *dev) > +{ > + ASSERT_RTNL(); > + > + return !list_empty(&dev->upper_dev_list); > +} > +EXPORT_SYMBOL(netdev_has_any_upper_dev); > + > +/** > + * netdev_unique_upper_dev_get - Get unique upper device > + * @dev: device > + * > + * Find a unique upper device and return pointer to it or NULL in case > + * it's not there. The caller must hold the RTNL semaphore. > + */ > +struct net_device *netdev_unique_upper_dev_get(struct net_device *dev) > +{ > + struct netdev_upper *upper; > + > + ASSERT_RTNL(); > + > + if (list_empty(&dev->upper_dev_list)) > + return NULL; > + > + upper = list_first_entry(&dev->upper_dev_list, > + struct netdev_upper, list); > + if (likely(upper->unique)) > + return upper->dev; > + return NULL; > +} > +EXPORT_SYMBOL(netdev_unique_upper_dev_get); > + > +/** > + * netdev_unique_upper_dev_get_rcu - Get unique upper device > + * @dev: device > + * > + * Find a unique upper device and return pointer to it or NULL in case > + * it's not there. The caller must hold the RCU read lock. > + */ > +struct net_device *netdev_unique_upper_dev_get_rcu(struct net_device *dev) > +{ > + struct netdev_upper *upper; > + > + upper = list_first_or_null_rcu(&dev->upper_dev_list, > + struct netdev_upper, list); > + if (likely(upper->unique)) It will oopses here if 'upper' is NULL (i.e. no upper devices). > + return upper->dev; > + return NULL; > +} > +EXPORT_SYMBOL(netdev_unique_upper_dev_get_rcu); > + > +static int __netdev_upper_dev_link(struct net_device *dev, > + struct net_device *upper_dev, bool unique) > +{ > + struct netdev_upper *upper; > + > + ASSERT_RTNL(); > + > + if (dev == upper_dev) > + return -EBUSY; > + /* > + * To prevent loops, check if dev is not upper device to upper_dev. > + */ > + if (__netdev_has_upper_dev(upper_dev, dev, true)) > + return -EBUSY; > + > + if (__netdev_find_upper(dev, upper_dev)) > + return -EEXIST; __netdev_has_upper_dev() can go all the way up finding the device and the __netdev_find_upper() just check the first level. I think it would be better to use: __netdev_find_upper_dev(,,deep=true/false) __netdev_has_upper(,) thanks, fbl > + if (unique && netdev_unique_upper_dev_get(dev)) > + return -EBUSY; > + > + upper = kmalloc(sizeof(*upper), GFP_KERNEL); > + if (!upper) > + return -ENOMEM; > + > + upper->dev = upper_dev; > + upper->unique = unique; > + > + /* > + * Ensure that unique upper link is always the first item in the list. > + */ > + if (unique) > + list_add_rcu(&upper->list, &dev->upper_dev_list); > + else > + list_add_tail_rcu(&upper->list, &dev->upper_dev_list); > + dev_hold(upper_dev); > + > + return 0; > +} > +/** > + * netdev_upper_dev_link - Add a link to the upper device > + * @dev: device > + * @upper_dev: new upper device > + * > + * Adds a link to device which is upper to this one. The caller must hold > + * the RTNL semaphore. On a failure a negative errno code is returned. > + * On success the reference counts are adjusted and the function > + * returns zero. > + */ > +int netdev_upper_dev_link(struct net_device *dev, > + struct net_device *upper_dev) > +{ > + return __netdev_upper_dev_link(dev, upper_dev, false); > +} > +EXPORT_SYMBOL(netdev_upper_dev_link); > + > +/** > + * netdev_unique_upper_dev_link - Add a unique link to the upper device > + * @dev: device > + * @upper_dev: new upper device > + * > + * Adds a link to device which is upper to this one. In this case, only > + * one unique upper device can be linked, although other non-unique devices > + * might be linked as well. The caller must hold the RTNL semaphore. > + * On a failure a negative errno code is returned. On success the reference > + * counts are adjusted and the function returns zero. > + */ > +int netdev_unique_upper_dev_link(struct net_device *dev, > + struct net_device *upper_dev) > +{ > + return __netdev_upper_dev_link(dev, upper_dev, true); > +} > +EXPORT_SYMBOL(netdev_unique_upper_dev_link); > + > +/** > + * netdev_upper_free_rcu - Frees a upper device list item via the RCU pointer > + * @entry: the entry's RCU field > + * > + * This function is designed to be used as a callback to the call_rcu() > + * function so that the memory allocated to the netdev upper device list item > + * can be released safely. > + */ > +static void netdev_upper_free_rcu(struct rcu_head *entry) > +{ > + struct netdev_upper *upper; > + > + upper = container_of(entry, struct netdev_upper, rcu); > + kfree(upper); > +} > + > +/** > + * netdev_upper_dev_unlink - Removes a link to upper device > + * @dev: device > + * @upper_dev: new upper device > + * > + * Removes a link to device which is upper to this one. The caller must hold > + * the RTNL semaphore. > + */ > +void netdev_upper_dev_unlink(struct net_device *dev, > + struct net_device *upper_dev) > +{ > + struct netdev_upper *upper; > + > + ASSERT_RTNL(); > + > + upper = __netdev_find_upper(dev, upper_dev); > + if (!upper) > + return; > + list_del_rcu(&upper->list); > + dev_put(upper_dev); > + call_rcu(&upper->rcu, netdev_upper_free_rcu); > +} > +EXPORT_SYMBOL(netdev_upper_dev_unlink); > + > /** > * netdev_set_master - set up master pointer > * @slave: slave device > @@ -4438,19 +4661,23 @@ static int __init dev_proc_init(void) > int netdev_set_master(struct net_device *slave, struct net_device *master) > { > struct net_device *old = slave->master; > + int err; > > ASSERT_RTNL(); > > if (master) { > if (old) > return -EBUSY; > - dev_hold(master); > + err = netdev_unique_upper_dev_link(slave, master); > + if (err) > + return err; > } > > slave->master = master; > > if (old) > - dev_put(old); > + netdev_upper_dev_unlink(slave, master); > + > return 0; > } > EXPORT_SYMBOL(netdev_set_master); > @@ -5999,6 +6226,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, > INIT_LIST_HEAD(&dev->napi_list); > INIT_LIST_HEAD(&dev->unreg_list); > INIT_LIST_HEAD(&dev->link_watch_list); > + INIT_LIST_HEAD(&dev->upper_dev_list); > dev->priv_flags = IFF_XMIT_DST_RELEASE; > setup(dev); > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/13/2012 11:27 PM, Jiri Pirko wrote: > This lists are supposed to serve for storing pointers to all upper devices. > Eventually it will replace dev->master pointer which is used for > bonding, bridge, team but it cannot be used for vlan, macvlan where > there might be multiple "masters" present. > > New upper device list resolves this limitation. Also, the information > stored in lists is used for preventing looping setups like > "bond->somethingelse->samebond" Hi, Jiri, I have some difficulty to understand this patch description, so take bridged bonding as an example, IIUC, both the bridge dev and bonding dev will be in the upper dev list of eth0 and eth1? And br0 will be in the upper dev list of bond0 too? I think it would be nice to describe the hierarchy after your patches. Thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Tue, Aug 14, 2012 at 11:02:33AM CEST, xiyou.wangcong@gmail.com wrote: >On 08/13/2012 11:27 PM, Jiri Pirko wrote: >>This lists are supposed to serve for storing pointers to all upper devices. >>Eventually it will replace dev->master pointer which is used for >>bonding, bridge, team but it cannot be used for vlan, macvlan where >>there might be multiple "masters" present. >> >>New upper device list resolves this limitation. Also, the information >>stored in lists is used for preventing looping setups like >>"bond->somethingelse->samebond" > >Hi, Jiri, > >I have some difficulty to understand this patch description, so take >bridged bonding as an example, IIUC, both the bridge dev and bonding >dev will be in the upper dev list of eth0 and eth1? And br0 will be >in the upper dev list of bond0 too? No. Only direct upper device is in the list. > >I think it would be nice to describe the hierarchy after your patches. Hierarchy is the same after. > >Thanks! > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Mon, Aug 13, 2012 at 07:52:17PM CEST, fbl@redhat.com wrote: >On Mon, 13 Aug 2012 17:27:00 +0200 >Jiri Pirko <jiri@resnulli.us> wrote: > >> This lists are supposed to serve for storing pointers to all upper devices. >> Eventually it will replace dev->master pointer which is used for >> bonding, bridge, team but it cannot be used for vlan, macvlan where >> there might be multiple "masters" present. >> >> New upper device list resolves this limitation. Also, the information >> stored in lists is used for preventing looping setups like >> "bond->somethingelse->samebond" >> >> Signed-off-by: Jiri Pirko <jiri@resnulli.us> >> --- >> include/linux/netdevice.h | 14 +++ >> net/core/dev.c | 232 ++++++++++++++++++++++++++++++++++++++++++++- >> 2 files changed, 244 insertions(+), 2 deletions(-) >> >> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h >> index a9db4f3..e7a07f8 100644 >> --- a/include/linux/netdevice.h >> +++ b/include/linux/netdevice.h >> @@ -1173,6 +1173,8 @@ struct net_device { >> * which this device is member of. >> */ >> >> + struct list_head upper_dev_list; /* List of upper devices */ >> + >> /* Interface address info used in eth_type_trans() */ >> unsigned char *dev_addr; /* hw address, (before bcast >> because most packets are >> @@ -2611,6 +2613,18 @@ extern int netdev_max_backlog; >> extern int netdev_tstamp_prequeue; >> extern int weight_p; >> extern int bpf_jit_enable; >> + >> +extern bool netdev_has_upper_dev(struct net_device *dev, >> + struct net_device *upper_dev); >> +extern bool netdev_has_any_upper_dev(struct net_device *dev); >> +extern struct net_device *netdev_unique_upper_dev_get(struct net_device *dev); >> +extern struct net_device *netdev_unique_upper_dev_get_rcu(struct net_device *dev); >> +extern int netdev_upper_dev_link(struct net_device *dev, >> + struct net_device *upper_dev); >> +extern int netdev_unique_upper_dev_link(struct net_device *dev, >> + struct net_device *upper_dev); >> +extern void netdev_upper_dev_unlink(struct net_device *dev, >> + struct net_device *upper_dev); >> extern int netdev_set_master(struct net_device *dev, struct net_device *master); >> extern int netdev_set_bond_master(struct net_device *dev, >> struct net_device *master); >> diff --git a/net/core/dev.c b/net/core/dev.c >> index 1f06df8..68db1ac 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -4425,6 +4425,229 @@ static int __init dev_proc_init(void) >> #endif /* CONFIG_PROC_FS */ >> >> >> +struct netdev_upper { >> + struct net_device *dev; >> + bool unique; > >unique is quite confusing here. I see that it is possible to have >one unique and many non-unique linked in the list, so maybe rename >to 'main_dev', 'master' or 'principal'... > > >> + struct list_head list; >> + struct rcu_head rcu; >> +}; >> + >> +static bool __netdev_has_upper_dev(struct net_device *dev, >> + struct net_device *upper_dev, >> + bool deep) >> +{ >> + struct netdev_upper *upper; >> + >> + list_for_each_entry(upper, &dev->upper_dev_list, list) { >> + if (upper->dev == upper_dev) >> + return true; >> + if (deep && __netdev_has_upper_dev(upper->dev, upper_dev, deep)) >> + return true; >> + } >> + return false; >> +} >> + >> +static struct netdev_upper *__netdev_find_upper(struct net_device *dev, >> + struct net_device *upper_dev) >> +{ >> + struct netdev_upper *upper; >> + >> + list_for_each_entry(upper, &dev->upper_dev_list, list) { >> + if (upper->dev == upper_dev) >> + return upper; >> + } >> + return NULL; >> +} >> + >> +/** >> + * netdev_has_upper_dev - Check if device is linked to an upper device >> + * @dev: device >> + * @upper_dev: upper device to check >> + * >> + * Find out if a device is linked to specified upper device and return true >> + * in case it is. The caller must hold the RTNL semaphore. >> + */ >> +bool netdev_has_upper_dev(struct net_device *dev, >> + struct net_device *upper_dev) >> +{ >> + ASSERT_RTNL(); >> + >> + return __netdev_has_upper_dev(dev, upper_dev, false); >> +} >> +EXPORT_SYMBOL(netdev_has_upper_dev); >> + >> +/** >> + * netdev_has_any_upper_dev - Check if device is linked to some device >> + * @dev: device >> + * >> + * Find out if a device is linked to an upper device and return true in case >> + * it is. The caller must hold the RTNL semaphore. >> + */ >> +bool netdev_has_any_upper_dev(struct net_device *dev) >> +{ >> + ASSERT_RTNL(); >> + >> + return !list_empty(&dev->upper_dev_list); >> +} >> +EXPORT_SYMBOL(netdev_has_any_upper_dev); >> + >> +/** >> + * netdev_unique_upper_dev_get - Get unique upper device >> + * @dev: device >> + * >> + * Find a unique upper device and return pointer to it or NULL in case >> + * it's not there. The caller must hold the RTNL semaphore. >> + */ >> +struct net_device *netdev_unique_upper_dev_get(struct net_device *dev) >> +{ >> + struct netdev_upper *upper; >> + >> + ASSERT_RTNL(); >> + >> + if (list_empty(&dev->upper_dev_list)) >> + return NULL; >> + >> + upper = list_first_entry(&dev->upper_dev_list, >> + struct netdev_upper, list); >> + if (likely(upper->unique)) >> + return upper->dev; >> + return NULL; >> +} >> +EXPORT_SYMBOL(netdev_unique_upper_dev_get); >> + >> +/** >> + * netdev_unique_upper_dev_get_rcu - Get unique upper device >> + * @dev: device >> + * >> + * Find a unique upper device and return pointer to it or NULL in case >> + * it's not there. The caller must hold the RCU read lock. >> + */ >> +struct net_device *netdev_unique_upper_dev_get_rcu(struct net_device *dev) >> +{ >> + struct netdev_upper *upper; >> + >> + upper = list_first_or_null_rcu(&dev->upper_dev_list, >> + struct netdev_upper, list); >> + if (likely(upper->unique)) > >It will oopses here if 'upper' is NULL (i.e. no upper devices). Fixed > > >> + return upper->dev; >> + return NULL; >> +} >> +EXPORT_SYMBOL(netdev_unique_upper_dev_get_rcu); >> + >> +static int __netdev_upper_dev_link(struct net_device *dev, >> + struct net_device *upper_dev, bool unique) >> +{ >> + struct netdev_upper *upper; >> + >> + ASSERT_RTNL(); >> + >> + if (dev == upper_dev) >> + return -EBUSY; >> + /* >> + * To prevent loops, check if dev is not upper device to upper_dev. >> + */ >> + if (__netdev_has_upper_dev(upper_dev, dev, true)) >> + return -EBUSY; >> + >> + if (__netdev_find_upper(dev, upper_dev)) >> + return -EEXIST; > >__netdev_has_upper_dev() can go all the way up finding the device and >the __netdev_find_upper() just check the first level. I do not think this ordering is somewhat inportant. > >I think it would be better to use: >__netdev_find_upper_dev(,,deep=true/false) >__netdev_has_upper(,) > >thanks, >fbl > >> + if (unique && netdev_unique_upper_dev_get(dev)) >> + return -EBUSY; >> + >> + upper = kmalloc(sizeof(*upper), GFP_KERNEL); >> + if (!upper) >> + return -ENOMEM; >> + >> + upper->dev = upper_dev; >> + upper->unique = unique; >> + >> + /* >> + * Ensure that unique upper link is always the first item in the list. >> + */ >> + if (unique) >> + list_add_rcu(&upper->list, &dev->upper_dev_list); >> + else >> + list_add_tail_rcu(&upper->list, &dev->upper_dev_list); >> + dev_hold(upper_dev); >> + >> + return 0; >> +} >> +/** >> + * netdev_upper_dev_link - Add a link to the upper device >> + * @dev: device >> + * @upper_dev: new upper device >> + * >> + * Adds a link to device which is upper to this one. The caller must hold >> + * the RTNL semaphore. On a failure a negative errno code is returned. >> + * On success the reference counts are adjusted and the function >> + * returns zero. >> + */ >> +int netdev_upper_dev_link(struct net_device *dev, >> + struct net_device *upper_dev) >> +{ >> + return __netdev_upper_dev_link(dev, upper_dev, false); >> +} >> +EXPORT_SYMBOL(netdev_upper_dev_link); >> + >> +/** >> + * netdev_unique_upper_dev_link - Add a unique link to the upper device >> + * @dev: device >> + * @upper_dev: new upper device >> + * >> + * Adds a link to device which is upper to this one. In this case, only >> + * one unique upper device can be linked, although other non-unique devices >> + * might be linked as well. The caller must hold the RTNL semaphore. >> + * On a failure a negative errno code is returned. On success the reference >> + * counts are adjusted and the function returns zero. >> + */ >> +int netdev_unique_upper_dev_link(struct net_device *dev, >> + struct net_device *upper_dev) >> +{ >> + return __netdev_upper_dev_link(dev, upper_dev, true); >> +} >> +EXPORT_SYMBOL(netdev_unique_upper_dev_link); >> + >> +/** >> + * netdev_upper_free_rcu - Frees a upper device list item via the RCU pointer >> + * @entry: the entry's RCU field >> + * >> + * This function is designed to be used as a callback to the call_rcu() >> + * function so that the memory allocated to the netdev upper device list item >> + * can be released safely. >> + */ >> +static void netdev_upper_free_rcu(struct rcu_head *entry) >> +{ >> + struct netdev_upper *upper; >> + >> + upper = container_of(entry, struct netdev_upper, rcu); >> + kfree(upper); >> +} >> + >> +/** >> + * netdev_upper_dev_unlink - Removes a link to upper device >> + * @dev: device >> + * @upper_dev: new upper device >> + * >> + * Removes a link to device which is upper to this one. The caller must hold >> + * the RTNL semaphore. >> + */ >> +void netdev_upper_dev_unlink(struct net_device *dev, >> + struct net_device *upper_dev) >> +{ >> + struct netdev_upper *upper; >> + >> + ASSERT_RTNL(); >> + >> + upper = __netdev_find_upper(dev, upper_dev); >> + if (!upper) >> + return; >> + list_del_rcu(&upper->list); >> + dev_put(upper_dev); >> + call_rcu(&upper->rcu, netdev_upper_free_rcu); >> +} >> +EXPORT_SYMBOL(netdev_upper_dev_unlink); >> + >> /** >> * netdev_set_master - set up master pointer >> * @slave: slave device >> @@ -4438,19 +4661,23 @@ static int __init dev_proc_init(void) >> int netdev_set_master(struct net_device *slave, struct net_device *master) >> { >> struct net_device *old = slave->master; >> + int err; >> >> ASSERT_RTNL(); >> >> if (master) { >> if (old) >> return -EBUSY; >> - dev_hold(master); >> + err = netdev_unique_upper_dev_link(slave, master); >> + if (err) >> + return err; >> } >> >> slave->master = master; >> >> if (old) >> - dev_put(old); >> + netdev_upper_dev_unlink(slave, master); >> + >> return 0; >> } >> EXPORT_SYMBOL(netdev_set_master); >> @@ -5999,6 +6226,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, >> INIT_LIST_HEAD(&dev->napi_list); >> INIT_LIST_HEAD(&dev->unreg_list); >> INIT_LIST_HEAD(&dev->link_watch_list); >> + INIT_LIST_HEAD(&dev->upper_dev_list); >> dev->priv_flags = IFF_XMIT_DST_RELEASE; >> setup(dev); >> > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 14 Aug 2012 14:24:33 +0200 Jiri Pirko <jiri@resnulli.us> wrote: > Mon, Aug 13, 2012 at 07:52:17PM CEST, fbl@redhat.com wrote: > >On Mon, 13 Aug 2012 17:27:00 +0200 > >Jiri Pirko <jiri@resnulli.us> wrote: > >> + /* > >> + * To prevent loops, check if dev is not upper device to upper_dev. > >> + */ > >> + if (__netdev_has_upper_dev(upper_dev, dev, true)) > >> + return -EBUSY; > >> + > >> + if (__netdev_find_upper(dev, upper_dev)) > >> + return -EEXIST; > > > >__netdev_has_upper_dev() can go all the way up finding the device and > >the __netdev_find_upper() just check the first level. > > > I do not think this ordering is somewhat inportant. it's not the order, see below: > >I think it would be better to use: > >__netdev_find_upper_dev(,,deep=true/false) > >__netdev_has_upper(,) It's their names. Currently, the function ..._find_... look at one level only, while the function ..._has_... does one or more levels. I think it's better to swap 'has' and 'find' in their names: __netdev_find_upper_dev(,,deep=true/false) <-- find in all levels __netdev_has_upper(,) <-- check only the one level. fbl -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Tue, Aug 14, 2012 at 03:14:00PM CEST, fbl@redhat.com wrote: >On Tue, 14 Aug 2012 14:24:33 +0200 >Jiri Pirko <jiri@resnulli.us> wrote: > >> Mon, Aug 13, 2012 at 07:52:17PM CEST, fbl@redhat.com wrote: >> >On Mon, 13 Aug 2012 17:27:00 +0200 >> >Jiri Pirko <jiri@resnulli.us> wrote: >> >> + /* >> >> + * To prevent loops, check if dev is not upper device to upper_dev. >> >> + */ >> >> + if (__netdev_has_upper_dev(upper_dev, dev, true)) >> >> + return -EBUSY; >> >> + >> >> + if (__netdev_find_upper(dev, upper_dev)) >> >> + return -EEXIST; >> > >> >__netdev_has_upper_dev() can go all the way up finding the device and >> >the __netdev_find_upper() just check the first level. >> >> >> I do not think this ordering is somewhat inportant. > >it's not the order, see below: > >> >I think it would be better to use: >> >__netdev_find_upper_dev(,,deep=true/false) >> >__netdev_has_upper(,) > >It's their names. Currently, the function ..._find_... look at >one level only, while the function ..._has_... does one or more >levels. I think it's better to swap 'has' and 'find' in their names: > >__netdev_find_upper_dev(,,deep=true/false) <-- find in all levels >__netdev_has_upper(,) <-- check only the one level. Oh, now I think I see your point. But realise this: The main reason for __netdev_find_upper() is to find "struct upper" for netdev_upper_dev_unlink(). Therefore the name is not "__netdev_find_upper_dev" and there's no need to go deep here. On the orher hand, __netdev_has_upper_dev() only says whether device is lower to specified upper device. In this case I think the name is quite convenient as well. > >fbl -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le 13/08/2012 19:31, Jiri Pirko a écrit : > Mon, Aug 13, 2012 at 07:04:11PM CEST, bhutchings@solarflare.com wrote: >>> +struct netdev_upper { >>> + struct net_device *dev; >>> + bool unique; >> >> This needs a better name. It doesn't really have anything to do with >> uniqueness and doesn't ensure exclusivity. I think that it would be >> fine to keep the 'master' term. > > Hmm. I admit that "unique" I do not like too much as well. But "master" > I like even less. > > This flag should ensure exclusivity. Only one upper device with this > flag can be present at a time. Well, can't we simply call it "upper_device"? And as we only have a single field, this is exclusive by design. Nicolas. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index a9db4f3..e7a07f8 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1173,6 +1173,8 @@ struct net_device { * which this device is member of. */ + struct list_head upper_dev_list; /* List of upper devices */ + /* Interface address info used in eth_type_trans() */ unsigned char *dev_addr; /* hw address, (before bcast because most packets are @@ -2611,6 +2613,18 @@ extern int netdev_max_backlog; extern int netdev_tstamp_prequeue; extern int weight_p; extern int bpf_jit_enable; + +extern bool netdev_has_upper_dev(struct net_device *dev, + struct net_device *upper_dev); +extern bool netdev_has_any_upper_dev(struct net_device *dev); +extern struct net_device *netdev_unique_upper_dev_get(struct net_device *dev); +extern struct net_device *netdev_unique_upper_dev_get_rcu(struct net_device *dev); +extern int netdev_upper_dev_link(struct net_device *dev, + struct net_device *upper_dev); +extern int netdev_unique_upper_dev_link(struct net_device *dev, + struct net_device *upper_dev); +extern void netdev_upper_dev_unlink(struct net_device *dev, + struct net_device *upper_dev); extern int netdev_set_master(struct net_device *dev, struct net_device *master); extern int netdev_set_bond_master(struct net_device *dev, struct net_device *master); diff --git a/net/core/dev.c b/net/core/dev.c index 1f06df8..68db1ac 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4425,6 +4425,229 @@ static int __init dev_proc_init(void) #endif /* CONFIG_PROC_FS */ +struct netdev_upper { + struct net_device *dev; + bool unique; + struct list_head list; + struct rcu_head rcu; +}; + +static bool __netdev_has_upper_dev(struct net_device *dev, + struct net_device *upper_dev, + bool deep) +{ + struct netdev_upper *upper; + + list_for_each_entry(upper, &dev->upper_dev_list, list) { + if (upper->dev == upper_dev) + return true; + if (deep && __netdev_has_upper_dev(upper->dev, upper_dev, deep)) + return true; + } + return false; +} + +static struct netdev_upper *__netdev_find_upper(struct net_device *dev, + struct net_device *upper_dev) +{ + struct netdev_upper *upper; + + list_for_each_entry(upper, &dev->upper_dev_list, list) { + if (upper->dev == upper_dev) + return upper; + } + return NULL; +} + +/** + * netdev_has_upper_dev - Check if device is linked to an upper device + * @dev: device + * @upper_dev: upper device to check + * + * Find out if a device is linked to specified upper device and return true + * in case it is. The caller must hold the RTNL semaphore. + */ +bool netdev_has_upper_dev(struct net_device *dev, + struct net_device *upper_dev) +{ + ASSERT_RTNL(); + + return __netdev_has_upper_dev(dev, upper_dev, false); +} +EXPORT_SYMBOL(netdev_has_upper_dev); + +/** + * netdev_has_any_upper_dev - Check if device is linked to some device + * @dev: device + * + * Find out if a device is linked to an upper device and return true in case + * it is. The caller must hold the RTNL semaphore. + */ +bool netdev_has_any_upper_dev(struct net_device *dev) +{ + ASSERT_RTNL(); + + return !list_empty(&dev->upper_dev_list); +} +EXPORT_SYMBOL(netdev_has_any_upper_dev); + +/** + * netdev_unique_upper_dev_get - Get unique upper device + * @dev: device + * + * Find a unique upper device and return pointer to it or NULL in case + * it's not there. The caller must hold the RTNL semaphore. + */ +struct net_device *netdev_unique_upper_dev_get(struct net_device *dev) +{ + struct netdev_upper *upper; + + ASSERT_RTNL(); + + if (list_empty(&dev->upper_dev_list)) + return NULL; + + upper = list_first_entry(&dev->upper_dev_list, + struct netdev_upper, list); + if (likely(upper->unique)) + return upper->dev; + return NULL; +} +EXPORT_SYMBOL(netdev_unique_upper_dev_get); + +/** + * netdev_unique_upper_dev_get_rcu - Get unique upper device + * @dev: device + * + * Find a unique upper device and return pointer to it or NULL in case + * it's not there. The caller must hold the RCU read lock. + */ +struct net_device *netdev_unique_upper_dev_get_rcu(struct net_device *dev) +{ + struct netdev_upper *upper; + + upper = list_first_or_null_rcu(&dev->upper_dev_list, + struct netdev_upper, list); + if (likely(upper->unique)) + return upper->dev; + return NULL; +} +EXPORT_SYMBOL(netdev_unique_upper_dev_get_rcu); + +static int __netdev_upper_dev_link(struct net_device *dev, + struct net_device *upper_dev, bool unique) +{ + struct netdev_upper *upper; + + ASSERT_RTNL(); + + if (dev == upper_dev) + return -EBUSY; + /* + * To prevent loops, check if dev is not upper device to upper_dev. + */ + if (__netdev_has_upper_dev(upper_dev, dev, true)) + return -EBUSY; + + if (__netdev_find_upper(dev, upper_dev)) + return -EEXIST; + + if (unique && netdev_unique_upper_dev_get(dev)) + return -EBUSY; + + upper = kmalloc(sizeof(*upper), GFP_KERNEL); + if (!upper) + return -ENOMEM; + + upper->dev = upper_dev; + upper->unique = unique; + + /* + * Ensure that unique upper link is always the first item in the list. + */ + if (unique) + list_add_rcu(&upper->list, &dev->upper_dev_list); + else + list_add_tail_rcu(&upper->list, &dev->upper_dev_list); + dev_hold(upper_dev); + + return 0; +} +/** + * netdev_upper_dev_link - Add a link to the upper device + * @dev: device + * @upper_dev: new upper device + * + * Adds a link to device which is upper to this one. The caller must hold + * the RTNL semaphore. On a failure a negative errno code is returned. + * On success the reference counts are adjusted and the function + * returns zero. + */ +int netdev_upper_dev_link(struct net_device *dev, + struct net_device *upper_dev) +{ + return __netdev_upper_dev_link(dev, upper_dev, false); +} +EXPORT_SYMBOL(netdev_upper_dev_link); + +/** + * netdev_unique_upper_dev_link - Add a unique link to the upper device + * @dev: device + * @upper_dev: new upper device + * + * Adds a link to device which is upper to this one. In this case, only + * one unique upper device can be linked, although other non-unique devices + * might be linked as well. The caller must hold the RTNL semaphore. + * On a failure a negative errno code is returned. On success the reference + * counts are adjusted and the function returns zero. + */ +int netdev_unique_upper_dev_link(struct net_device *dev, + struct net_device *upper_dev) +{ + return __netdev_upper_dev_link(dev, upper_dev, true); +} +EXPORT_SYMBOL(netdev_unique_upper_dev_link); + +/** + * netdev_upper_free_rcu - Frees a upper device list item via the RCU pointer + * @entry: the entry's RCU field + * + * This function is designed to be used as a callback to the call_rcu() + * function so that the memory allocated to the netdev upper device list item + * can be released safely. + */ +static void netdev_upper_free_rcu(struct rcu_head *entry) +{ + struct netdev_upper *upper; + + upper = container_of(entry, struct netdev_upper, rcu); + kfree(upper); +} + +/** + * netdev_upper_dev_unlink - Removes a link to upper device + * @dev: device + * @upper_dev: new upper device + * + * Removes a link to device which is upper to this one. The caller must hold + * the RTNL semaphore. + */ +void netdev_upper_dev_unlink(struct net_device *dev, + struct net_device *upper_dev) +{ + struct netdev_upper *upper; + + ASSERT_RTNL(); + + upper = __netdev_find_upper(dev, upper_dev); + if (!upper) + return; + list_del_rcu(&upper->list); + dev_put(upper_dev); + call_rcu(&upper->rcu, netdev_upper_free_rcu); +} +EXPORT_SYMBOL(netdev_upper_dev_unlink); + /** * netdev_set_master - set up master pointer * @slave: slave device @@ -4438,19 +4661,23 @@ static int __init dev_proc_init(void) int netdev_set_master(struct net_device *slave, struct net_device *master) { struct net_device *old = slave->master; + int err; ASSERT_RTNL(); if (master) { if (old) return -EBUSY; - dev_hold(master); + err = netdev_unique_upper_dev_link(slave, master); + if (err) + return err; } slave->master = master; if (old) - dev_put(old); + netdev_upper_dev_unlink(slave, master); + return 0; } EXPORT_SYMBOL(netdev_set_master); @@ -5999,6 +6226,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->napi_list); INIT_LIST_HEAD(&dev->unreg_list); INIT_LIST_HEAD(&dev->link_watch_list); + INIT_LIST_HEAD(&dev->upper_dev_list); dev->priv_flags = IFF_XMIT_DST_RELEASE; setup(dev);
This lists are supposed to serve for storing pointers to all upper devices. Eventually it will replace dev->master pointer which is used for bonding, bridge, team but it cannot be used for vlan, macvlan where there might be multiple "masters" present. New upper device list resolves this limitation. Also, the information stored in lists is used for preventing looping setups like "bond->somethingelse->samebond" Signed-off-by: Jiri Pirko <jiri@resnulli.us> --- include/linux/netdevice.h | 14 +++ net/core/dev.c | 232 ++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 244 insertions(+), 2 deletions(-)