Message ID | 20240402001137.2980589-8-Joseph.Huang@garmin.com (mailing list archive)
---|---
State | RFC
Delegated to: | Netdev Maintainers
Series | MC Flood disable and snooping
On Mon, Apr 01, 2024 at 08:11:06PM -0400, Joseph Huang wrote:
> Keep track of bridge mdb objects in the driver.
>
> Similar to the previous patch, since the driver doesn't get explicit
> notifications about mdb group creation or destruction, just create
> the mdb group when the first port joins the group via
> mv88e6xxx_port_mdb_add(), and destroy the group when the last port
> leaves the group via mv88e6xxx_port_mdb_del().
>
> Use the group's L2 address together with the VLAN ID as the key to
> the list. Port membership is again stored in a bitmask.
>
> Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
> ---

Can you comment on the feasibility/infeasibility of Tobias' proposal of:
"The bridge could just provide some MDB iterator to save us from having
to cache all the configured groups."?
https://lore.kernel.org/netdev/87sg31n04a.fsf@waldekranz.com/

What is done here will have to be scaled to many drivers - potentially
all existing DSA ones, as far as I'm aware.
Hi Vladimir,

On 4/2/2024 8:23 AM, Vladimir Oltean wrote:
> Can you comment on the feasibility/infeasibility of Tobias' proposal of:
> "The bridge could just provide some MDB iterator to save us from having
> to cache all the configured groups."?
> https://lore.kernel.org/netdev/87sg31n04a.fsf@waldekranz.com/
>
> What is done here will have to be scaled to many drivers - potentially
> all existing DSA ones, as far as I'm aware.

I thought about implementing an MDB iterator as suggested by Tobias,
but I'm a bit concerned about the coherence of these MDB objects. In
theory, when the device driver is trying to act on an event, the source
of the trigger may have changed its state in the bridge already.

If, upon receiving an event in the device driver, we iterate over what
the bridge has at that instant, the differences between the worlds as
seen by the bridge and the device driver might lead to some unexpected
results.

However, if we cache the MDB objects in the device driver, at least the
order in which the events took place will be coherent, and at any given
time the state of the MDB objects in the device driver can be
guaranteed to be sane. This is also the approach the prestera device
driver took.

Thanks,
Joseph
On Thu, Apr 04, 2024 at 04:43:38PM -0400, Joseph Huang wrote:
> Hi Vladimir,
>
> On 4/2/2024 8:23 AM, Vladimir Oltean wrote:
> > Can you comment on the feasibility/infeasibility of Tobias' proposal of:
> > "The bridge could just provide some MDB iterator to save us from having
> > to cache all the configured groups."?
> > https://lore.kernel.org/netdev/87sg31n04a.fsf@waldekranz.com/
> >
> > What is done here will have to be scaled to many drivers - potentially
> > all existing DSA ones, as far as I'm aware.
>
> I thought about implementing an MDB iterator as suggested by Tobias,
> but I'm a bit concerned about the coherence of these MDB objects. In
> theory, when the device driver is trying to act on an event, the
> source of the trigger may have changed its state in the bridge already.

Yes, this is the result of SWITCHDEV_F_DEFER, used by both
SWITCHDEV_ATTR_ID_PORT_MROUTER and SWITCHDEV_OBJ_ID_PORT_MDB.

> If, upon receiving an event in the device driver, we iterate over what
> the bridge has at that instant, the differences between the worlds as
> seen by the bridge and the device driver might lead to some unexpected
> results.

Translated: iterating over bridge MDB objects needs to be serialized
with new switchdev events by acquiring rtnl_lock(). Then, once switchdev
events are temporarily blocked, the pending ones need to be flushed
using switchdev_deferred_process(), to resync the bridge state with the
driver state. Once the resync is done, the iteration is safe until
rtnl_unlock().

Applied to our case, the MDB iterator is needed in mv88e6xxx_port_mrouter().
This is already called with rtnl_lock() acquired. The resync procedure
will indirectly call mv88e6xxx_port_mdb_add()/mv88e6xxx_port_mdb_del()
through switchdev_deferred_process(), and then the walk is consistent
for the remainder of the mv88e6xxx_port_mrouter() function.

A helper which does this is what would be required - an iterator
function which calls an
int (*cb)(struct net_device *brport, const struct switchdev_obj_port_mdb *mdb)
for each MDB entry. The DSA core could then offer some post-processing
services over this API, to recover the struct dsa_port associated with
the bridge port (in the LAG case they aren't the same) and the address
database associated with the bridge.

Do you think there would be unexpected results even if we did this?
br_switchdev_mdb_replay() needs to handle a similarly complicated
situation of synchronizing with deferred MDB events.

> However, if we cache the MDB objects in the device driver, at least
> the order in which the events took place will be coherent and at any
> given time the state of the MDB objects in the device driver can be
> guaranteed to be sane. This is also the approach the prestera device
> driver took.

Not contesting this, but I wouldn't like to see MDBs cached in each
device driver just for this. Switchdev is not very high on the list of
APIs which are easy to use, and making MDB caching a requirement
(for the common case that MDB entry destinations need software fixups
with the mrouter ports) isn't exactly going to make that any better.
Others' opinion may differ, but mine is that core offload APIs need to
consider what hardware is available in the real world, make the common
case easy, and the advanced cases possible. Rather than make every case
"advanced" :)
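[Editor's note: for concreteness, a minimal sketch of the iterator
described above. Every name here (br_mdb_for_each(), br_mdb_cb_t,
mv88e6xxx_mdb_mrouter_fixup()) is a hypothetical illustration of the
proposal, not existing kernel API.]

	/* Hypothetical bridge-provided MDB iterator. Caller must hold
	 * rtnl_lock() and must first flush pending deferred switchdev
	 * events, so the walk stays consistent until rtnl_unlock().
	 */
	typedef int (*br_mdb_cb_t)(struct net_device *brport,
				   const struct switchdev_obj_port_mdb *mdb,
				   void *priv);

	int br_mdb_for_each(struct net_device *br_dev, br_mdb_cb_t cb,
			    void *priv);

	/* Hypothetical per-entry fixup done by the driver; body elided */
	static int mv88e6xxx_mdb_mrouter_fixup(struct net_device *brport,
					       const struct switchdev_obj_port_mdb *mdb,
					       void *priv);

	/* Sketch of the call site in the driver's mrouter hook */
	static int mv88e6xxx_port_mrouter(struct dsa_switch *ds, int port,
					  bool mrouter,
					  struct netlink_ext_ack *extack)
	{
		struct net_device *br_dev;

		ASSERT_RTNL();

		/* resync: run the queued PORT_MDB/MROUTER events now */
		switchdev_deferred_process();

		br_dev = dsa_port_bridge_dev_get(dsa_to_port(ds, port));
		if (!br_dev)
			return 0;

		/* the walk is now coherent with the bridge's MDB */
		return br_mdb_for_each(br_dev, mv88e6xxx_mdb_mrouter_fixup,
				       ds->priv);
	}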
Hi Vladimir,

On 4/5/2024 7:07 AM, Vladimir Oltean wrote:
[...]
> Not contesting this, but I wouldn't like to see MDBs cached in each
> device driver just for this. Switchdev is not very high on the list of
> APIs which are easy to use, and making MDB caching a requirement
> (for the common case that MDB entry destinations need software fixups
> with the mrouter ports) isn't exactly going to make that any better.
> Others' opinion may differ, but mine is that core offload APIs need to
> consider what hardware is available in the real world, make the common
> case easy, and the advanced cases possible. Rather than make every case
> "advanced" :)

Just throwing some random ideas out there. Do you think it would make
more sense if this whole solution (rtnl_lock, iterator cb, etc.) were
moved up to DSA so that other DSA drivers could benefit from it? I
thought about implementing it (not the iterator, the current form) in
DSA at first, but I'm not sure how other drivers would behave, so I did
it in mv88e6xxx instead.

I guess the question is, is the current limitation (mrouter not
properly offloaded) an issue specific to mv88e6xxx, or is it a
limitation of hardware offloading in general? I tend to think it's the
latter.

But then again, if we move it to DSA, we would lose the benefit of the
optimization of consolidating multiple register writes into just one
(as done in patch 10 currently), unless we add a new switch op which
takes a portvec instead of a port when modifying mdb's.
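[Editor's note: for illustration, a hypothetical portvec-flavored op
could look like the sketch below. The name port_mdb_set and its
placement in dsa_switch_ops are assumptions, not existing API.]

	/* Hypothetical addition to struct dsa_switch_ops: program the
	 * whole member set of an MDB entry in one call, letting drivers
	 * like mv88e6xxx do a single ATU write instead of one write per
	 * member port.
	 */
	int	(*port_mdb_set)(struct dsa_switch *ds,
				const struct switchdev_obj_port_mdb *mdb,
				u16 portvec, struct dsa_db db);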
On 4/5/2024 2:58 PM, Joseph Huang wrote:
> Hi Vladimir,
>
> On 4/5/2024 7:07 AM, Vladimir Oltean wrote:
[...]
>> Translated: iterating over bridge MDB objects needs to be serialized
>> with new switchdev events by acquiring rtnl_lock(). Then, once switchdev
>> events are temporarily blocked, the pending ones need to be flushed
>> using switchdev_deferred_process(), to resync the bridge state with the
>> driver state. Once the resync is done, the iteration is safe until
>> rtnl_unlock().
>>
>> Applied to our case, the MDB iterator is needed in
>> mv88e6xxx_port_mrouter().
>> This is already called with rtnl_lock() acquired. The resync procedure
>> will indirectly call mv88e6xxx_port_mdb_add()/mv88e6xxx_port_mdb_del()
>> through switchdev_deferred_process(), and then the walk is consistent
>> for the remainder of the mv88e6xxx_port_mrouter() function.
>>
>> A helper which does this is what would be required - an iterator
>> function which calls an
>> int (*cb)(struct net_device *brport, const struct switchdev_obj_port_mdb *mdb)
>> for each MDB entry. The DSA core could then offer some post-processing
>> services over this API, to recover the struct dsa_port associated with
>> the bridge port (in the LAG case they aren't the same) and the address
>> database associated with the bridge.

Something like this (some layers omitted for brevity)?

                                      +br_iterator
                                      | for each mdb
                                      |   _br_switchdev_mdb_notify
rtnl_lock                             |   without F_DEFER flag
  |                                   |     |
  +switchdev_port_attr_set_deferred   |     +switchdev_port_obj_notify
  |                                   |     |
  +dsa_port_mrouter                   |     +dsa_user_port_obj_a/d
  |                                   |     |
  +mv88e6xxx_port_mrouter-------------+     +mv88e6xxx_port_obj_a/d
  |                                         |
  +-----------------------------------------+
  |
rtnl_unlock

Note that on the system I tested, each register read/write takes about
100us to complete. For 100's of mdb groups, this would mean that we
will be holding the rtnl lock for 10's of ms. I don't know if that's
considered too long.

>> Do you think there would be unexpected results even if we did this?
>> br_switchdev_mdb_replay() needs to handle a similarly complicated
>> situation of synchronizing with deferred MDB events.
>>
>>> However, if we cache the MDB objects in the device driver, at least
>>> the order in which the events took place will be coherent and at any
>>> given time the state of the MDB objects in the device driver can be
>>> guaranteed to be sane. This is also the approach the prestera device
>>> driver took.
[...]
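[Editor's note: for scale, taking the figures above at face value and
assuming, purely for illustration, two register accesses per group:
100 us/access x 2 accesses/group x 250 groups = 50 ms spent under
rtnl_lock - consistent with the "10's of ms" estimate in the email.]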
On Mon, Apr 29, 2024 at 06:07:25PM -0400, Joseph Huang wrote:
> Something like this (some layers omitted for brevity)?
>
>                                       +br_iterator
>                                       | for each mdb
>                                       |   _br_switchdev_mdb_notify
> rtnl_lock                             |   without F_DEFER flag
>   |                                   |     |
>   +switchdev_port_attr_set_deferred   |     +switchdev_port_obj_notify
>   |                                   |     |
>   +dsa_port_mrouter                   |     +dsa_user_port_obj_a/d
>   |                                   |     |
>   +mv88e6xxx_port_mrouter-------------+     +mv88e6xxx_port_obj_a/d
>   |                                         |
>   +-----------------------------------------+
>   |
> rtnl_unlock

At a _very_ superficial glance, I don't think you are properly
accounting for the fact that even with rtnl_lock() held, there are still
SWITCHDEV_OBJ_ID_PORT_MDB events which may be pending on the switchdev
chain. Without a switchdev_deferred_process() flush call, you won't be
getting rid of them, so when you rtnl_unlock(), they will still run.

Even worse, holding rtnl_lock() will not stop the bridge multicast layer
from modifying its br->mdb_list; only br->multicast_lock will.

So you may be better off also acquiring br->multicast_lock, and
notifying the MDB entries to the switchdev chain _with_ the F_DEFER
flag.

> Note that on the system I tested, each register read/write takes about
> 100us to complete. For 100's of mdb groups, this would mean that we
> will be holding the rtnl lock for 10's of ms. I don't know if that's
> considered too long.

Not sure how this is going to be any better if the iteration over MDB
entries is done 100% in the driver, though - since its hook,
dsa_port_mrouter(), runs entirely under rtnl_lock().

Anyway, with the SWITCHDEV_F_DEFER flag, maybe the mdb object
notifications can be made to run by switchdev only a few at a time, to
give the network stack time to do other things as well.
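[Editor's note: a rough sketch of what the bridge side of that could
look like, under the assumptions of this thread. The helper name
br_mdb_replay_deferred() is hypothetical, and the per-entry
notification call is elided rather than guessed; br->mdb_list and
net_bridge_mdb_entry are real bridge internals.]

	/* Hypothetical bridge-side walk: queue one deferred switchdev
	 * notification per MDB entry. Because SWITCHDEV_F_DEFER only
	 * queues work, this is safe under the br->multicast_lock
	 * spinlock; drivers then see the objects later, under rtnl,
	 * via switchdev_deferred_process().
	 */
	static void br_mdb_replay_deferred(struct net_bridge *br)
	{
		struct net_bridge_mdb_entry *mp;

		spin_lock_bh(&br->multicast_lock);
		hlist_for_each_entry(mp, &br->mdb_list, mdb_node) {
			/* queue a SWITCHDEV_OBJ_ID_PORT_MDB add with
			 * SWITCHDEV_F_DEFER for each port group of this
			 * entry (details omitted)
			 */
		}
		spin_unlock_bh(&br->multicast_lock);
	}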
On 4/29/2024 8:59 PM, Vladimir Oltean wrote:
> On Mon, Apr 29, 2024 at 06:07:25PM -0400, Joseph Huang wrote:
>> Something like this (some layers omitted for brevity)?
[...]
> At a _very_ superficial glance, I don't think you are properly
> accounting for the fact that even with rtnl_lock() held, there are still
> SWITCHDEV_OBJ_ID_PORT_MDB events which may be pending on the switchdev
> chain. Without a switchdev_deferred_process() flush call, you won't be
> getting rid of them, so when you rtnl_unlock(), they will still run.
>
> Even worse, holding rtnl_lock() will not stop the bridge multicast layer
> from modifying its br->mdb_list; only br->multicast_lock will.
>
> So you may be better off also acquiring br->multicast_lock, and
> notifying the MDB entries to the switchdev chain _with_ the F_DEFER
> flag.

Like this?

                                      +br_iterator(dsa_cb)
                                      | lock br->multicast_lock
                                      | for each mdb
                                      |   br_switchdev_mdb_notify
rtnl_lock                             |     |
  |                                   |     +switchdev_port_obj_a/d_defer
  +switchdev_port_attr_set_deferred   | unlock br->multicast_lock
  |                                   |
  +dsa_port_mrouter                   |
  |                                   |
  +mv88e6xxx_port_mrouter-------------+
  |
  +-----------------------------------+
  |
rtnl_unlock

(potential task change)

rtnl_lock
  |
  +switchdev_deferred_process
  |   flush all queued dfitems in queuing order
  |
rtnl_unlock

I'm not that familiar with the bridge code, but is there any concern
with potential deadlock here (between rtnl_lock and br->multicast_lock)?

>> Note that on the system I tested, each register read/write takes about
>> 100us to complete. For 100's of mdb groups, this would mean that we
>> will be holding the rtnl lock for 10's of ms. I don't know if that's
>> considered too long.
>
> Not sure how this is going to be any better if the iteration over MDB
> entries is done 100% in the driver, though - since its hook,
> dsa_port_mrouter(), runs entirely under rtnl_lock().
>
> Anyway, with the SWITCHDEV_F_DEFER flag, maybe the mdb object
> notifications can be made to run by switchdev only a few at a time, to
> give the network stack time to do other things as well.
On 4/30/2024 12:27 PM, Joseph Huang wrote:
> On 4/29/2024 8:59 PM, Vladimir Oltean wrote:
[...]
>> So you may be better off also acquiring br->multicast_lock, and
>> notifying the MDB entries to the switchdev chain _with_ the F_DEFER
>> flag.
>
> Like this?
>
>                                       +br_iterator(dsa_cb)
>                                       | lock br->multicast_lock
>                                       | for each mdb
>                                       |   br_switchdev_mdb_notify
> rtnl_lock                             |     |
>   |                                   |     +switchdev_port_obj_a/d_defer
>   +switchdev_port_attr_set_deferred   | unlock br->multicast_lock
>   |                                   |
>   +dsa_port_mrouter                   |
>   |                                   |
>   +mv88e6xxx_port_mrouter-------------+
>   |
>   +-----------------------------------+
>   |
> rtnl_unlock
>
> (potential task change)
>
> rtnl_lock
>   |
>   +switchdev_deferred_process
>   |   flush all queued dfitems in queuing order
>   |
> rtnl_unlock
>
> I'm not that familiar with the bridge code, but is there any concern
> with potential deadlock here (between rtnl_lock and br->multicast_lock)?

Hi Nik,

Do you know if it's safe to acquire rtnl_lock and br->multicast_lock in
the following sequence? Is there any possibility of a deadlock?

rtnl_lock
  spin_lock(br->multicast_lock)
  spin_unlock(br->multicast_lock)
rtnl_unlock

If the operation is safe, the next question is should it be spin_lock
or spin_lock_bh?

Thanks,
Joseph
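[Editor's note: the nesting being asked about, expressed as a code
fragment. This merely restates the question, not an answer; the choice
between spin_lock() and spin_lock_bh() is exactly the open point. One
observation hedging toward _bh: bridge control paths that already hold
rtnl take multicast_lock with the _bh variant, since the lock is also
taken from packet-processing (softirq) context.]

	rtnl_lock();
	spin_lock_bh(&br->multicast_lock);	/* or plain spin_lock()? */
	/* walk br->mdb_list, queue SWITCHDEV_F_DEFER notifications */
	spin_unlock_bh(&br->multicast_lock);
	rtnl_unlock();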
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index f66ddde484dc..32a613c965b1 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -47,6 +47,14 @@ struct mv88e6xxx_bridge {
 	struct list_head head;
 	struct net_device *br_dev;
 	u16 ports;
+	struct list_head br_mdb_list;
+};
+
+struct mv88e6xxx_br_mdb {
+	struct list_head head;
+	unsigned char addr[ETH_ALEN];
+	u16 vid;
+	u16 portvec;
 };
 
 static void assert_reg_lock(struct mv88e6xxx_chip *chip)
@@ -2974,6 +2982,7 @@ mv88e6xxx_bridge_create(struct mv88e6xxx_chip *chip, struct net_device *br_dev)
 		return ERR_PTR(-ENOMEM);
 
 	mv_bridge->br_dev = br_dev;
+	INIT_LIST_HEAD(&mv_bridge->br_mdb_list);
 	list_add(&mv_bridge->head, &chip->bridge_list);
 
 	return mv_bridge;
@@ -2984,6 +2993,7 @@ static void mv88e6xxx_bridge_destroy(struct mv88e6xxx_bridge *mv_bridge)
 	list_del(&mv_bridge->head);
 
 	WARN_ON(mv_bridge->ports);
+	WARN_ON(!list_empty(&mv_bridge->br_mdb_list));
 	kfree(mv_bridge);
 }
 
@@ -6583,16 +6593,101 @@ static int mv88e6xxx_change_tag_protocol(struct dsa_switch *ds,
 	return err;
 }
 
+static struct mv88e6xxx_br_mdb *
+mv88e6xxx_br_mdb_create(struct mv88e6xxx_bridge *mv_bridge,
+			const struct switchdev_obj_port_mdb *mdb)
+{
+	struct mv88e6xxx_br_mdb *mv_br_mdb;
+
+	mv_br_mdb = kzalloc(sizeof(*mv_br_mdb), GFP_KERNEL);
+	if (!mv_br_mdb)
+		return ERR_PTR(-ENOMEM);
+
+	ether_addr_copy(mv_br_mdb->addr, mdb->addr);
+	mv_br_mdb->vid = mdb->vid;
+	list_add(&mv_br_mdb->head, &mv_bridge->br_mdb_list);
+
+	return mv_br_mdb;
+}
+
+static void mv88e6xxx_br_mdb_destroy(struct mv88e6xxx_br_mdb *mv_br_mdb)
+{
+	list_del(&mv_br_mdb->head);
+
+	WARN_ON(mv_br_mdb->portvec);
+	kfree(mv_br_mdb);
+}
+
+static struct mv88e6xxx_br_mdb *
+mv88e6xxx_br_mdb_find(struct mv88e6xxx_bridge *mv_bridge,
+		      const struct switchdev_obj_port_mdb *mdb)
+{
+	struct mv88e6xxx_br_mdb *mv_br_mdb;
+
+	list_for_each_entry(mv_br_mdb, &mv_bridge->br_mdb_list, head)
+		if (ether_addr_equal(mv_br_mdb->addr, mdb->addr) &&
+		    mv_br_mdb->vid == mdb->vid)
+			return mv_br_mdb;
+
+	return NULL;
+}
+
+static struct mv88e6xxx_br_mdb *
+mv88e6xxx_br_mdb_get(struct mv88e6xxx_bridge *mv_bridge,
+		     const struct switchdev_obj_port_mdb *mdb)
+{
+	struct mv88e6xxx_br_mdb *mv_br_mdb;
+
+	mv_br_mdb = mv88e6xxx_br_mdb_find(mv_bridge, mdb);
+	if (!mv_br_mdb)
+		mv_br_mdb = mv88e6xxx_br_mdb_create(mv_bridge, mdb);
+
+	return mv_br_mdb;
+}
+
+static void mv88e6xxx_br_mdb_put(struct mv88e6xxx_br_mdb *mv_br_mdb)
+{
+	if (!mv_br_mdb->portvec)
+		mv88e6xxx_br_mdb_destroy(mv_br_mdb);
+}
+
 static int mv88e6xxx_port_mdb_add(struct dsa_switch *ds, int port,
 				  const struct switchdev_obj_port_mdb *mdb,
 				  struct dsa_db db)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
+	struct mv88e6xxx_bridge *mv_bridge;
+	struct mv88e6xxx_br_mdb *mv_br_mdb;
+	struct net_device *orig_dev;
+	struct net_device *br_dev;
 	int err;
 
+	orig_dev = mdb->obj.orig_dev;
+	br_dev = netdev_master_upper_dev_get(orig_dev);
+	if (!br_dev)
+		br_dev = orig_dev;
+
+	mv_bridge = mv88e6xxx_bridge_by_dev(chip, br_dev);
+	if (!mv_bridge)
+		return -EINVAL;
+
+	mv_br_mdb = mv88e6xxx_br_mdb_get(mv_bridge, mdb);
+	if (IS_ERR(mv_br_mdb))
+		return PTR_ERR(mv_br_mdb);
+
+	if (mv_br_mdb->portvec & BIT(port))
+		return -EEXIST;
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid,
 					   MV88E6XXX_G1_ATU_DATA_STATE_MC_STATIC);
+
+	if (err)
+		goto out;
+
+	mv_br_mdb->portvec |= BIT(port);
+
+out:
 	mv88e6xxx_reg_unlock(chip);
 
 	return err;
@@ -6603,10 +6698,32 @@ static int mv88e6xxx_port_mdb_del(struct dsa_switch *ds, int port,
 				  struct dsa_db db)
 {
 	struct mv88e6xxx_chip *chip = ds->priv;
+	struct mv88e6xxx_bridge *mv_bridge;
+	struct mv88e6xxx_br_mdb *mv_br_mdb;
+	struct net_device *orig_dev;
+	struct net_device *br_dev;
 	int err;
 
+	orig_dev = mdb->obj.orig_dev;
+	br_dev = netdev_master_upper_dev_get(orig_dev);
+	if (!br_dev)
+		br_dev = orig_dev;
+
+	mv_bridge = mv88e6xxx_bridge_by_dev(chip, br_dev);
+	if (!mv_bridge)
+		return -EINVAL;
+
+	mv_br_mdb = mv88e6xxx_br_mdb_find(mv_bridge, mdb);
+	if (!mv_br_mdb)
+		return -ENOENT;
+
+	if (!(mv_br_mdb->portvec & BIT(port)))
+		return -ENOENT;
+
 	mv88e6xxx_reg_lock(chip);
 	err = mv88e6xxx_port_db_load_purge(chip, port, mdb->addr, mdb->vid, 0);
+	mv_br_mdb->portvec &= ~BIT(port);
+	mv88e6xxx_br_mdb_put(mv_br_mdb);
 	mv88e6xxx_reg_unlock(chip);
 
 	return err;
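[Editor's note: to make the entry lifetime implemented by the get/put
helpers above concrete, a worked example; the ports and group address
are made up for illustration.]

	/* port 1 joins 01:00:5e:00:00:01 vid 10: entry created by
	 *   mv88e6xxx_br_mdb_get(), portvec = BIT(1)
	 * port 3 joins the same group:  portvec |= BIT(3)
	 * port 1 leaves:                portvec &= ~BIT(1)
	 * port 3 leaves:                portvec reaches 0, so
	 *   mv88e6xxx_br_mdb_put() frees the entry
	 */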
Keep track of bridge mdb objects in the driver.

Similar to the previous patch, since the driver doesn't get explicit
notifications about mdb group creation or destruction, just create the
mdb group when the first port joins the group via
mv88e6xxx_port_mdb_add(), and destroy the group when the last port
leaves the group via mv88e6xxx_port_mdb_del().

Use the group's L2 address together with the VLAN ID as the key to the
list. Port membership is again stored in a bitmask.

Signed-off-by: Joseph Huang <Joseph.Huang@garmin.com>
---
 drivers/net/dsa/mv88e6xxx/chip.c | 117 +++++++++++++++++++++++++++++++
 1 file changed, 117 insertions(+)