Message ID | 20240816114813.326645-5-razor@blackwall.org (mailing list archive) |
---|---|
State | Accepted |
Commit | c4c5c5d2ef40a9f67a9241dc5422eac9ffe19547 |
Delegated to: | Netdev Maintainers |
Series | bonding: fix xfrm offload bugs |
On Fri, Aug 16, 2024 at 02:48:13PM +0300, Nikolay Aleksandrov wrote:
> If the active slave is cleared manually the xfrm state is not flushed.
> This leads to xfrm add/del imbalance and adding the same state multiple
> times. For example when the device cannot handle anymore states we get:
>  [ 1169.884811] bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
> because it's filled with the same state after multiple active slave
> clearings. This change also has a few nice side effects: user-space
> gets a notification for the change, the old device gets its mac address
> and promisc/mcast adjusted properly.
>
> Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
> ---
> Please review this one more carefully. I plan to add a selftest with
> netdevsim for this as well.
>
>  drivers/net/bonding/bond_options.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
> index bc80fb6397dc..95d59a18c022 100644
> --- a/drivers/net/bonding/bond_options.c
> +++ b/drivers/net/bonding/bond_options.c
> @@ -936,7 +936,7 @@ static int bond_option_active_slave_set(struct bonding *bond,
>  	/* check to see if we are clearing active */
>  	if (!slave_dev) {
>  		netdev_dbg(bond->dev, "Clearing current active slave\n");
> -		RCU_INIT_POINTER(bond->curr_active_slave, NULL);
> +		bond_change_active_slave(bond, NULL);

The good part of this is we can do bond_ipsec_del_sa_all and
bond_ipsec_add_sa_all. I'm not sure if we should do promisc/mcast adjustment
when set active_slave to null.

Jay should know better.

Thanks
Hangbin
>  		bond_select_active_slave(bond);
>  	} else {
>  		struct slave *old_active = rtnl_dereference(bond->curr_active_slave);
> --
> 2.44.0
>
On 19/08/2024 06:05, Hangbin Liu wrote:
> On Fri, Aug 16, 2024 at 02:48:13PM +0300, Nikolay Aleksandrov wrote:
>> If the active slave is cleared manually the xfrm state is not flushed.
>> This leads to xfrm add/del imbalance and adding the same state multiple
>> times. For example when the device cannot handle anymore states we get:
>>  [ 1169.884811] bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
>> because it's filled with the same state after multiple active slave
>> clearings. This change also has a few nice side effects: user-space
>> gets a notification for the change, the old device gets its mac address
>> and promisc/mcast adjusted properly.
>>
>> Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
>> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
>> ---
>> Please review this one more carefully. I plan to add a selftest with
>> netdevsim for this as well.
>>
>>  drivers/net/bonding/bond_options.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
>> index bc80fb6397dc..95d59a18c022 100644
>> --- a/drivers/net/bonding/bond_options.c
>> +++ b/drivers/net/bonding/bond_options.c
>> @@ -936,7 +936,7 @@ static int bond_option_active_slave_set(struct bonding *bond,
>>  	/* check to see if we are clearing active */
>>  	if (!slave_dev) {
>>  		netdev_dbg(bond->dev, "Clearing current active slave\n");
>> -		RCU_INIT_POINTER(bond->curr_active_slave, NULL);
>> +		bond_change_active_slave(bond, NULL);
>
> The good part of this is we can do bond_ipsec_del_sa_all and
> bond_ipsec_add_sa_all. I'm not sure if we should do promisc/mcast adjustment
> when set active_slave to null.
>
> Jay should know better.
>
> Thanks
> Hangbin

Jay please correct me, but I'm pretty sure we should adjust them. They get adjusted on
every active slave change, this is no different. In fact I'd argue that it's a long
standing bug because they don't get adjusted when the active slave is cleared
manually and if a new one is chosen (we call bond_select_active_slave() right after)
then the old one would still have them set. During normal operations and automatic
curr active slave changes, it is always adjusted.

>>  		bond_select_active_slave(bond);
>>  	} else {
>>  		struct slave *old_active = rtnl_dereference(bond->curr_active_slave);
>> --
>> 2.44.0
>>
On Mon, Aug 19, 2024 at 10:38:01AM +0300, Nikolay Aleksandrov wrote:
> >> diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
> >> index bc80fb6397dc..95d59a18c022 100644
> >> --- a/drivers/net/bonding/bond_options.c
> >> +++ b/drivers/net/bonding/bond_options.c
> >> @@ -936,7 +936,7 @@ static int bond_option_active_slave_set(struct bonding *bond,
> >>  	/* check to see if we are clearing active */
> >>  	if (!slave_dev) {
> >>  		netdev_dbg(bond->dev, "Clearing current active slave\n");
> >> -		RCU_INIT_POINTER(bond->curr_active_slave, NULL);
> >> +		bond_change_active_slave(bond, NULL);
> >
> > The good part of this is we can do bond_ipsec_del_sa_all and
> > bond_ipsec_add_sa_all. I'm not sure if we should do promisc/mcast adjustment
> > when set active_slave to null.
> >
> > Jay should know better.
> >
> > Thanks
> > Hangbin
>
> Jay please correct me, but I'm pretty sure we should adjust them. They get adjusted on
> every active slave change, this is no different. In fact I'd argue that it's a long
> standing bug because they don't get adjusted when the active slave is cleared
> manually and if a new one is chosen (we call bond_select_active_slave() right after)
> then the old one would still have them set. During normal operations and automatic
> curr active slave changes, it is always adjusted.

OK, I rechecked the code. The mcast resend only happens when there is a new
new_active or in rr mode. But bond_option_active_slave_set() is only called in
active-backup/alb/tlb mode. So this should be safe.

Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
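The reasoning in the reply above can be checked with a small stand-alone model. This is only a sketch of the condition as Hangbin describes it, not kernel code: every `toy_*` name is hypothetical, and other preconditions the real driver may apply (for example whether the bond device is up) are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bonding modes; not the kernel's enum. */
enum toy_mode {
	TOY_ROUNDROBIN,
	TOY_ACTIVEBACKUP,
	TOY_XOR,
	TOY_BROADCAST,
	TOY_8023AD,
	TOY_TLB,
	TOY_ALB,
};

/* Modes in which the active_slave option can be set, per the thread. */
static bool toy_active_slave_settable(enum toy_mode m)
{
	return m == TOY_ACTIVEBACKUP || m == TOY_TLB || m == TOY_ALB;
}

/*
 * Model of "the mcast resend only happens when there is a new
 * new_active or in rr mode".
 */
static bool toy_should_resend_mcast(enum toy_mode m, bool have_new_active)
{
	return have_new_active || m == TOY_ROUNDROBIN;
}

/*
 * Clearing the active slave means have_new_active == false. Returns true
 * if no mode that allows setting active_slave would trigger a resend.
 */
static bool toy_clear_never_resends(void)
{
	for (int m = TOY_ROUNDROBIN; m <= TOY_ALB; m++) {
		if (toy_active_slave_settable((enum toy_mode)m) &&
		    toy_should_resend_mcast((enum toy_mode)m, false))
			return false;
	}
	return true;
}
```

Under these assumptions `toy_clear_never_resends()` returns true: round-robin is the only mode that resends without a new active slave, and it is not one of the modes in which the option can be set.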
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index bc80fb6397dc..95d59a18c022 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -936,7 +936,7 @@ static int bond_option_active_slave_set(struct bonding *bond,
 	/* check to see if we are clearing active */
 	if (!slave_dev) {
 		netdev_dbg(bond->dev, "Clearing current active slave\n");
-		RCU_INIT_POINTER(bond->curr_active_slave, NULL);
+		bond_change_active_slave(bond, NULL);
 		bond_select_active_slave(bond);
 	} else {
 		struct slave *old_active = rtnl_dereference(bond->curr_active_slave);
If the active slave is cleared manually the xfrm state is not flushed.
This leads to xfrm add/del imbalance and adding the same state multiple
times. For example when the device cannot handle anymore states we get:
 [ 1169.884811] bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
because it's filled with the same state after multiple active slave
clearings. This change also has a few nice side effects: user-space
gets a notification for the change, the old device gets its mac address
and promisc/mcast adjusted properly.

Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
---
Please review this one more carefully. I plan to add a selftest with
netdevsim for this as well.

 drivers/net/bonding/bond_options.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
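The add/del imbalance described in the commit message can be illustrated with a toy user-space model. It is a sketch under stated assumptions, not the driver: all `toy_*` names and the `TOY_MAX_SA` capacity are hypothetical, and the only behaviour taken from the thread is that the full active-slave change path deletes the offloaded states from the old device and adds them to the new one, while a bare pointer clear does neither.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SA 4	/* hypothetical device capacity */

struct toy_dev {
	int sa_count;		/* states currently programmed in the device */
};

struct toy_bond {
	struct toy_dev *active;	/* models the current active slave */
	int n_states;		/* xfrm states tracked by the bond */
};

/* Remove the bond's states from the current active device, if any. */
static void toy_del_sa_all(struct toy_bond *b)
{
	assert(b != NULL);
	if (!b->active)
		return;
	for (int i = 0; i < b->n_states && b->active->sa_count > 0; i++)
		b->active->sa_count--;
}

/* Add the bond's states to the active device; returns failed adds. */
static int toy_add_sa_all(struct toy_bond *b)
{
	int failed = 0;

	assert(b != NULL);
	if (!b->active)
		return 0;
	for (int i = 0; i < b->n_states; i++) {
		if (b->active->sa_count >= TOY_MAX_SA)
			failed++;	/* the "failed to add SA" case */
		else
			b->active->sa_count++;
	}
	return failed;
}

/* Full change path: delete from the old device, switch, add to the new. */
static int toy_change_active(struct toy_bond *b, struct toy_dev *new_active)
{
	toy_del_sa_all(b);
	b->active = new_active;
	return toy_add_sa_all(b);
}

/* Before the patch: bare pointer clear, then reselect the same device. */
static int toy_clear_old(struct toy_bond *b, struct toy_dev *dev)
{
	b->active = NULL;	/* nothing is deleted from dev */
	return toy_change_active(b, dev);
}

/* After the patch: the clear itself goes through the full change path. */
static int toy_clear_new(struct toy_bond *b, struct toy_dev *dev)
{
	toy_change_active(b, NULL);
	return toy_change_active(b, dev);
}
```

In this model, repeated calls to `toy_clear_old()` keep adding the same states to the device until it is full and adds start failing, whereas `toy_clear_new()` leaves the device's state count unchanged across clearings.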