diff mbox series

[net] bonding: fix lockdep splat in bond_miimon_commit()

Message ID 20221220130831.1480888-1-edumazet@google.com (mailing list archive)
State Accepted
Commit 42c7ded0eeacd2ba5db599205c71c279dc715de7
Delegated to: Netdev Maintainers
Headers show
Series [net] bonding: fix lockdep splat in bond_miimon_commit() | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 3 this patch: 3
netdev/checkpatch warning WARNING: line length of 86 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eric Dumazet Dec. 20, 2022, 1:08 p.m. UTC
bond_miimon_commit() is run while RTNL is held, not RCU.

WARNING: suspicious RCU usage
6.1.0-syzkaller-09671-g89529367293c #0 Not tainted
-----------------------------
drivers/net/bonding/bond_main.c:2704 suspicious rcu_dereference_check() usage!

Fixes: e95cc44763a4 ("bonding: do failover when high prio link up")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
---
 drivers/net/bonding/bond_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Hangbin Liu Dec. 21, 2022, 3:33 a.m. UTC | #1
On Tue, Dec 20, 2022 at 01:08:31PM +0000, Eric Dumazet wrote:
> bond_miimon_commit() is run while RTNL is held, not RCU.
> 
> WARNING: suspicious RCU usage
> 6.1.0-syzkaller-09671-g89529367293c #0 Not tainted
> -----------------------------
> drivers/net/bonding/bond_main.c:2704 suspicious rcu_dereference_check() usage!
> 
> Fixes: e95cc44763a4 ("bonding: do failover when high prio link up")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> Cc: Hangbin Liu <liuhangbin@gmail.com>
> Cc: Jay Vosburgh <j.vosburgh@gmail.com>
> Cc: Veaceslav Falico <vfalico@gmail.com>
> Cc: Andy Gospodarek <andy@greyhouse.net>
> ---
>  drivers/net/bonding/bond_main.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index b4c65783960a5aa14de5d64aeea190f02a04be44..0363ce597661422b82a7d33ef001151b275f9ada 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2654,10 +2654,12 @@ static void bond_miimon_link_change(struct bonding *bond,
>  
>  static void bond_miimon_commit(struct bonding *bond)
>  {
> -	struct slave *slave, *primary;
> +	struct slave *slave, *primary, *active;
>  	bool do_failover = false;
>  	struct list_head *iter;
>  
> +	ASSERT_RTNL();
> +
>  	bond_for_each_slave(bond, slave, iter) {
>  		switch (slave->link_new_state) {
>  		case BOND_LINK_NOCHANGE:
> @@ -2700,8 +2702,8 @@ static void bond_miimon_commit(struct bonding *bond)
>  
>  			bond_miimon_link_change(bond, slave, BOND_LINK_UP);
>  
> -			if (!rcu_access_pointer(bond->curr_active_slave) || slave == primary ||
> -			    slave->prio > rcu_dereference(bond->curr_active_slave)->prio)
> +			active = rtnl_dereference(bond->curr_active_slave);
> +			if (!active || slave == primary || slave->prio > active->prio)
>  				do_failover = true;

Hi Eric,

Thanks for the fix. I have some silly questions.

Is there an easy way or tool that could find if the functions is holding via
RTNL lock or RCU lock, except review all the call chains? I have faced
this issue in commit 9b80ccda233f ("bonding: fix missed rcu protection"),
which we though the function is under RTNL, while there is a call chain that
not hold rcu lock. Adding ASSERT_RTNL() could find it during running. I just
want to know if there is another way that we could find it in code review.


Another questions is, I'm still a little confused with the mixing usage of
rcu_access_pointer() and rtnl_dereference() under RTNL. e.g.

In bond_miimon_commit() we use rcu_access_pointer() to check the pointers.
                case BOND_LINK_DOWN:
                        if (slave == rcu_access_pointer(bond->curr_active_slave))
                                do_failover = true;

In bond_ab_arp_commit() we use rtnl_dereference() to check the pointer

                case BOND_LINK_DOWN:
                        if (slave == rtnl_dereference(bond->curr_active_slave)) {
                                RCU_INIT_POINTER(bond->current_arp_slave, NULL);
                                do_failover = true;
                        }
                case BOND_LINK_FAIL:
                        if (rtnl_dereference(bond->curr_active_slave))
                                RCU_INIT_POINTER(bond->current_arp_slave, NULL);

Does it matter to use which one? Should we change to rcu_access_pointer()
if there is no dereference?

Thanks
Hangbin
Paolo Abeni Dec. 22, 2022, 9:39 a.m. UTC | #2
On Wed, 2022-12-21 at 11:33 +0800, Hangbin Liu wrote:
> Another questions is, I'm still a little confused with the mixing usage of
> rcu_access_pointer() and rtnl_dereference() under RTNL. e.g.
> 
> In bond_miimon_commit() we use rcu_access_pointer() to check the pointers.
>                 case BOND_LINK_DOWN:
>                         if (slave == rcu_access_pointer(bond->curr_active_slave))
>                                 do_failover = true;
> 
> In bond_ab_arp_commit() we use rtnl_dereference() to check the pointer
> 
>                 case BOND_LINK_DOWN:
>                         if (slave == rtnl_dereference(bond->curr_active_slave)) {
>                                 RCU_INIT_POINTER(bond->current_arp_slave, NULL);
>                                 do_failover = true;
>                         }
>                 case BOND_LINK_FAIL:
>                         if (rtnl_dereference(bond->curr_active_slave))
>                                 RCU_INIT_POINTER(bond->current_arp_slave, NULL);
> 
> Does it matter to use which one? Should we change to rcu_access_pointer()
> if there is no dereference?

You can use rcu_access_pointer() every time the code does not actually
use the RCU pointer, just checks for NULL value.

rtnl_dereference() needs stronger guarantees (the caller must hold the
RTNL lock at call time). As such it adds additional lockdep-safety, and
should be preferred _when_ the call site meets the requirement.

In the above bond_miimon_commit() example the rcu_access_pointer()
could be replaced with rtnl_dereference() for extra safety and
consistency, but it will be mostly a cosmetic change.

Cheers,

Paolo
patchwork-bot+netdevbpf@kernel.org Dec. 22, 2022, 10 a.m. UTC | #3
Hello:

This patch was applied to netdev/net.git (master)
by Paolo Abeni <pabeni@redhat.com>:

On Tue, 20 Dec 2022 13:08:31 +0000 you wrote:
> bond_miimon_commit() is run while RTNL is held, not RCU.
> 
> WARNING: suspicious RCU usage
> 6.1.0-syzkaller-09671-g89529367293c #0 Not tainted
> -----------------------------
> drivers/net/bonding/bond_main.c:2704 suspicious rcu_dereference_check() usage!
> 
> [...]

Here is the summary with links:
  - [net] bonding: fix lockdep splat in bond_miimon_commit()
    https://git.kernel.org/netdev/net/c/42c7ded0eeac

You are awesome, thank you!
diff mbox series

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b4c65783960a5aa14de5d64aeea190f02a04be44..0363ce597661422b82a7d33ef001151b275f9ada 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2654,10 +2654,12 @@  static void bond_miimon_link_change(struct bonding *bond,
 
 static void bond_miimon_commit(struct bonding *bond)
 {
-	struct slave *slave, *primary;
+	struct slave *slave, *primary, *active;
 	bool do_failover = false;
 	struct list_head *iter;
 
+	ASSERT_RTNL();
+
 	bond_for_each_slave(bond, slave, iter) {
 		switch (slave->link_new_state) {
 		case BOND_LINK_NOCHANGE:
@@ -2700,8 +2702,8 @@  static void bond_miimon_commit(struct bonding *bond)
 
 			bond_miimon_link_change(bond, slave, BOND_LINK_UP);
 
-			if (!rcu_access_pointer(bond->curr_active_slave) || slave == primary ||
-			    slave->prio > rcu_dereference(bond->curr_active_slave)->prio)
+			active = rtnl_dereference(bond->curr_active_slave);
+			if (!active || slave == primary || slave->prio > active->prio)
 				do_failover = true;
 
 			continue;