net: bonding: do not set force_primary if reselect is set to failure

Message ID 20240912064043.36956-1-suresh2514@gmail.com (mailing list archive)
State Rejected
Delegated to: Netdev Maintainers
Series: net: bonding: do not set force_primary if reselect is set to failure

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 16 this patch: 16
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 16 this patch: 16
netdev/checkpatch fail ERROR: code indent should use tabs where possible; WARNING: please, no spaces at the start of a line; WARNING: suspect code indent for conditional statements (12, 28)
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 13 this patch: 13
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-09-12--15-00 (tests: 764)

Commit Message

suresh ks Sept. 12, 2024, 6:40 a.m. UTC
When bond_enslave() is called, it sets bond->force_primary to true
without checking whether primary_reselect is set to 'failure' or 'better'.
This can result in the primary becoming active again when link is back,
which is not what we want when primary_reselect is set to 'failure'.

Test
====
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp1s0 (primary_reselect failure)
Currently Active Slave: enp1s0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp1s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:d7:a7:2a
Slave queue ID: 0

Slave Interface: enp9s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:da:9a:f9
Slave queue ID: 0


After primary link failure:

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: enp9s0 <---- secondary is active now
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp9s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:da:9a:f9
Slave queue ID: 0


Now add primary link back and check bond status:

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp1s0 (primary_reselect failure)
Currently Active Slave: enp1s0  <------------- primary is active again
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp9s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:da:9a:f9
Slave queue ID: 0

Slave Interface: enp1s0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 52:54:00:d7:a7:2a
Slave queue ID: 0

Signed-off-by: Suresh Kumar <suresh2514@gmail.com>
---
 drivers/net/bonding/bond_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Jay Vosburgh Sept. 13, 2024, 12:06 a.m. UTC | #1
Suresh Kumar <suresh2514@gmail.com> wrote:

>when bond_enslave() is called, it sets bond->force_primary to true
>without checking if primary_reselect is set to 'failure' or 'better'.
>This can result in primary becoming active again when link is back which
>is not what we want when primary_reselect is set to 'failure'

	The current behavior is by design, and is documented in
Documentation/networking/bonding.rst:


	The primary_reselect setting is ignored in two cases:

		If no slaves are active, the first slave to recover is
		made the active slave.

		When initially enslaved, the primary slave is always made
		the active slave.


	Your proposed change would cause the primary to never be made
the active interface when added to the bond for the primary_reselect
"better" and "failure" settings, unless the primary interface is added
to the bond first or all other interfaces are down.

	Also, your description above and the test example below use the
phrases "link is back" and "primary link failure" but the patch and test
context suggest that the primary interface is being removed from the
bond and then later added back to the bond, which is not the same thing
as a link failure.

	-J

>[ quoted test output and patch snipped; reproduced in full above ]

---
	-Jay Vosburgh, jv@jvosburgh.net
suresh ks Sept. 13, 2024, 2:36 a.m. UTC | #2
Hi,

Thanks a lot for reviewing.

I was working on a customer issue where they removed a primary NIC for switch
maintenance, and when it was added back it caused an iSCSI storage outage
because they did not expect the bond to do a failover.

So bonding is behaving as designed. But then I thought maybe we can do
something to cater for such scenarios and came up with this idea. But I agree
my testing was not a real link failure, as I see "Link Failure Count: 0"
there. I used the command below from my KVM host to simulate a link down and
up:

     virsh detach-interface testvm1 --type network --mac 52:54:00:d7:a7:2a

and attached it back with:

    virsh attach-interface testvm1 --type network --source default \
        --mac 52:54:00:d7:a7:2a --model e1000e --config --live

So what would be the best solution here if I want to take out a primary NIC
for maintenance and then add it back? I was also trying with 'ifenslave' to
first make the secondary NIC active and then remove the primary NIC:

   ifenslave -d bond0 enp1s0

The interface changed to 'down', but immediately it came back up and became
active again. I don't know why. The journal logs suggest my NetworkManager is
autoactivating it again :)

Thanks a lot for your time again.

- Suresh

On Fri, Sep 13, 2024 at 5:36 AM Jay Vosburgh <jv@jvosburgh.net> wrote:
> [ quoted message snipped; reproduced in full above ]
Jay Vosburgh Sept. 13, 2024, 5:07 p.m. UTC | #3
suresh ks <suresh2514@gmail.com> wrote:

>Hi,
>
>Thanks a lot for reviewing.
>
>I was working on a customer issue where they removed a primary NIC for switch
>maintenance, and when it was added back it caused an iSCSI storage outage
>because they did not expect the bond to do a failover.
>
>So bonding is behaving as designed. But then I thought maybe we can do
>something to cater for such scenarios and came up with this idea. But I agree
>my testing was not a real link failure, as I see "Link Failure Count: 0"
>there. I used the command below from my KVM host to simulate a link down and
>up:
>
>     virsh detach-interface testvm1 --type network --mac 52:54:00:d7:a7:2a
>
>and attached it back with:
>
>    virsh attach-interface testvm1 --type network --source default \
>        --mac 52:54:00:d7:a7:2a --model e1000e --config --live

	This removes the interface entirely, so it's not just a link
failure.  You could probably simulate a link failure via something like
"ip link set dev ${INTERFACE} down", followed by a later "up" to restore
link state.  This would happen in the testvm1, and ${INTERFACE} is the
name of the bond member interface.
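
	For example, a minimal sketch from inside testvm1 (assuming the bond
member interface is enp1s0):

	ip link set dev enp1s0 down    # simulate a link failure on the primary
	cat /proc/net/bonding/bond0    # the backup slave should now be active
	ip link set dev enp1s0 up      # restore the link
	cat /proc/net/bonding/bond0    # with primary_reselect=failure, the
	                               # backup should remain active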

>So what would be the best solution here if I want to take out a primary NIC
>for maintenance and then add it back?

	You could clear the primary option on the bond for the
maintenance window, and later set primary to the desired interface when
everything is ready for that interface to become active again.
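
	As a rough sketch via the bonding sysfs interface (assuming the bond
is bond0, the primary is enp1s0, and that writing an empty value clears the
option):

	echo "" > /sys/class/net/bond0/bonding/primary      # clear primary for maintenance
	# ... remove the NIC, do the maintenance, add it back ...
	echo enp1s0 > /sys/class/net/bond0/bonding/primary  # set primary again when ready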

	-J

> I was also trying with 'ifenslave' to first make the secondary NIC active
>and then remove the primary NIC:
>
>   ifenslave -d bond0 enp1s0
>
>The interface changed to 'down', but immediately it came back up and became
>active again. I don't know why. The journal logs suggest my NetworkManager is
>autoactivating it again :)
>
>Thanks a lot for your time again.
>
>- Suresh
>
>On Fri, Sep 13, 2024 at 5:36 AM Jay Vosburgh <jv@jvosburgh.net> wrote:
>> [ quoted thread snipped; reproduced in full above ]

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index bb9c3d6ef435..731256fbb996 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2146,7 +2146,9 @@  int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 		/* if there is a primary slave, remember it */
 		if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
 			rcu_assign_pointer(bond->primary_slave, new_slave);
-			bond->force_primary = true;
+            if (bond->params.primary_reselect != BOND_PRI_RESELECT_FAILURE  &&
+                bond->params.primary_reselect != BOND_PRI_RESELECT_BETTER)
+			    bond->force_primary = true;
 		}
 	}