Message ID | 1618537982-454-1-git-send-email-jinyiting@huawei.com
---|---
State | Changes Requested
Delegated to: | Netdev Maintainers
Series | bonding: 3ad: update slave arr after initialize
Context | Check | Description
---|---|---
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Guessed tree name to be net-next |
netdev/subject_prefix | warning | Target tree name not specified in the subject |
netdev/cc_maintainers | success | CCed 6 of 6 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: line length of 89 exceeds 80 columns |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
jinyiting <jinyiting@huawei.com> wrote:

>From: jin yiting <jinyiting@huawei.com>
>
>The bond works in mode 4, and performs down/up operations on the bond
>that is normally negotiated. There is a probability that bond->slave_arr
>is NULL.
>
>Test commands:
>   ifconfig bond1 down
>   ifconfig bond1 up
>
>The conflict occurs in the following process:
>
>__dev_open (CPU A)
>  --bond_open
>    --queue_delayed_work(bond->wq,&bond->ad_work,0);
>    --bond_update_slave_arr
>      --bond_3ad_get_active_agg_info
>
>ad_work (CPU B)
>  --bond_3ad_state_machine_handler
>    --ad_agg_selection_logic
>
>ad_work runs on CPU B. In the function ad_agg_selection_logic, all
>agg->is_active will be cleared. Before the new active aggregator is
>selected on CPU B, bond_3ad_get_active_agg_info failed on CPU A,
>bond->slave_arr will be set to NULL. The best aggregator in
>ad_agg_selection_logic has not changed, no need to update slave arr.
>
>Signed-off-by: jin yiting <jinyiting@huawei.com>
>---
> drivers/net/bonding/bond_3ad.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index 6908822..d100079 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -2327,6 +2327,12 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
>
> 		aggregator = __get_first_agg(port);
> 		ad_agg_selection_logic(aggregator, &update_slave_arr);
>+		if (!update_slave_arr) {
>+			struct aggregator *active = __get_active_agg(aggregator);
>+
>+			if (active && active->is_active)
>+				update_slave_arr = true;
>+		}
> 	}
> 	bond_3ad_set_carrier(bond);
> }

The described issue is a race condition (in that
ad_agg_selection_logic clears agg->is_active under mode_lock, but
bond_open -> bond_update_slave_arr is inspecting agg->is_active outside
the lock). I don't see how the above change will reliably manage this;
the real issue looks to be that bond_update_slave_arr is committing
changes to the array (via bond_reset_slave_arr) based on a racy
inspection of the active aggregator state while it is in flux.

Also, the description of the issue says "The best aggregator in
ad_agg_selection_logic has not changed, no need to update slave arr,"
but the change above does the opposite, and will set update_slave_arr
when the aggregator has not changed (update_slave_arr remains false at
return of ad_agg_selection_logic).

I believe I understand the described problem, but I don't see
how the patch fixes it. I suspect (but haven't tested) that the proper
fix is to acquire mode_lock in bond_update_slave_arr while calling
bond_3ad_get_active_agg_info to avoid conflict with the state machine.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com
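[For illustration only: a stand-alone user-space sketch of the check-then-act race described above, not the bonding code. The pthread mutex stands in for bond->mode_lock, and the two flags (agg_is_active, slave_arr_valid) are hypothetical stand-ins for agg->is_active and the validity of bond->slave_arr; if the unlocked reader samples the flag inside the writer's clear-then-reselect window, it throws away state that is about to become valid again.]

/* Toy analogue of the race, not kernel code.  Build with: cc -pthread race.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t mode_lock = PTHREAD_MUTEX_INITIALIZER;
static bool agg_is_active = true;	/* stand-in for agg->is_active        */
static bool slave_arr_valid = true;	/* stand-in for bond->slave_arr state */

/* "CPU B": the state machine clears and re-selects under the lock. */
static void *state_machine(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&mode_lock);
	agg_is_active = false;		/* all aggregators marked inactive   */
	/* ... selection work; transient window with no active aggregator ... */
	agg_is_active = true;		/* new active aggregator chosen      */
	pthread_mutex_unlock(&mode_lock);
	return NULL;
}

/* "CPU A": samples the flag without taking the lock (the racy part). */
static void *update_slave_arr(void *arg)
{
	(void)arg;
	if (!agg_is_active)
		slave_arr_valid = false;	/* analogue of bond_reset_slave_arr() */
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&b, NULL, state_machine, NULL);
	pthread_create(&a, NULL, update_slave_arr, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* Usually prints 1, but can print 0 if the reader hit the window. */
	printf("slave_arr_valid=%d\n", slave_arr_valid);
	return 0;
}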
On 2021/4/16 12:28, Jay Vosburgh wrote:
> jinyiting <jinyiting@huawei.com> wrote:
>
>> From: jin yiting <jinyiting@huawei.com>
>>
>> The bond works in mode 4, and performs down/up operations on the bond
>> that is normally negotiated. There is a probability that bond->slave_arr
>> is NULL.
>>
>> Test commands:
>>    ifconfig bond1 down
>>    ifconfig bond1 up
>>
>> The conflict occurs in the following process:
>>
>> __dev_open (CPU A)
>>   --bond_open
>>     --queue_delayed_work(bond->wq,&bond->ad_work,0);
>>     --bond_update_slave_arr
>>       --bond_3ad_get_active_agg_info
>>
>> ad_work (CPU B)
>>   --bond_3ad_state_machine_handler
>>     --ad_agg_selection_logic
>>
>> ad_work runs on CPU B. In the function ad_agg_selection_logic, all
>> agg->is_active will be cleared. Before the new active aggregator is
>> selected on CPU B, bond_3ad_get_active_agg_info failed on CPU A,
>> bond->slave_arr will be set to NULL. The best aggregator in
>> ad_agg_selection_logic has not changed, no need to update slave arr.
>>
>> Signed-off-by: jin yiting <jinyiting@huawei.com>
>> ---
>> drivers/net/bonding/bond_3ad.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index 6908822..d100079 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -2327,6 +2327,12 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
>>
>>  		aggregator = __get_first_agg(port);
>>  		ad_agg_selection_logic(aggregator, &update_slave_arr);
>> +		if (!update_slave_arr) {
>> +			struct aggregator *active = __get_active_agg(aggregator);
>> +
>> +			if (active && active->is_active)
>> +				update_slave_arr = true;
>> +		}
>>  	}
>>  	bond_3ad_set_carrier(bond);
>>  }
>
> The described issue is a race condition (in that
> ad_agg_selection_logic clears agg->is_active under mode_lock, but
> bond_open -> bond_update_slave_arr is inspecting agg->is_active outside
> the lock). I don't see how the above change will reliably manage this;
> the real issue looks to be that bond_update_slave_arr is committing
> changes to the array (via bond_reset_slave_arr) based on a racy
> inspection of the active aggregator state while it is in flux.
>
> Also, the description of the issue says "The best aggregator in
> ad_agg_selection_logic has not changed, no need to update slave arr,"
> but the change above does the opposite, and will set update_slave_arr
> when the aggregator has not changed (update_slave_arr remains false at
> return of ad_agg_selection_logic).
>
> I believe I understand the described problem, but I don't see
> how the patch fixes it. I suspect (but haven't tested) that the proper
> fix is to acquire mode_lock in bond_update_slave_arr while calling
> bond_3ad_get_active_agg_info to avoid conflict with the state machine.
>
> -J
>
> ---
> -Jay Vosburgh, jay.vosburgh@canonical.com
> .
>

Thank you for your reply. The previous patch does indeed make a redundant
slave arr update; thank you for the correction.

As you said, holding mode_lock in bond_update_slave_arr while calling
bond_3ad_get_active_agg_info can avoid the conflict with the state
machine. I have tested this patch with ifdown/ifup operations on the bond
and on its slaves.

But bond_update_slave_arr is expected to hold RTNL only and NO other
lock, and it has WARN_ON(lockdep_is_held(&bond->mode_lock)) in
bond_update_slave_arr. I'm not sure that holding mode_lock in
bond_update_slave_arr while calling bond_3ad_get_active_agg_info is the
correct approach.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 74cbbb2..db988e5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4406,7 +4406,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
 		struct ad_info ad_info;
 
+		spin_lock_bh(&bond->mode_lock);
 		if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
+			spin_unlock_bh(&bond->mode_lock);
 			pr_debug("bond_3ad_get_active_agg_info failed\n");
 			/* No active aggragator means it's not safe to use
 			 * the previous array.
@@ -4414,6 +4416,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 			bond_reset_slave_arr(bond);
 			goto out;
 		}
+		spin_unlock_bh(&bond->mode_lock);
 		agg_id = ad_info.aggregator_id;
 	}
 	bond_for_each_slave(bond, slave, iter) {
jin yiting <jinyiting@huawei.com> wrote:

[...]
>> The described issue is a race condition (in that
>> ad_agg_selection_logic clears agg->is_active under mode_lock, but
>> bond_open -> bond_update_slave_arr is inspecting agg->is_active outside
>> the lock). I don't see how the above change will reliably manage this;
>> the real issue looks to be that bond_update_slave_arr is committing
>> changes to the array (via bond_reset_slave_arr) based on a racy
>> inspection of the active aggregator state while it is in flux.
>>
>> Also, the description of the issue says "The best aggregator in
>> ad_agg_selection_logic has not changed, no need to update slave arr,"
>> but the change above does the opposite, and will set update_slave_arr
>> when the aggregator has not changed (update_slave_arr remains false at
>> return of ad_agg_selection_logic).
>>
>> I believe I understand the described problem, but I don't see
>> how the patch fixes it. I suspect (but haven't tested) that the proper
>> fix is to acquire mode_lock in bond_update_slave_arr while calling
>> bond_3ad_get_active_agg_info to avoid conflict with the state machine.
>>
>> -J
>>
>> ---
>> -Jay Vosburgh, jay.vosburgh@canonical.com
>> .
>>
>
> Thank you for your reply. The previous patch does indeed make a redundant
> slave arr update; thank you for the correction.
>
> As you said, holding mode_lock in bond_update_slave_arr while calling
> bond_3ad_get_active_agg_info can avoid the conflict with the state
> machine. I have tested this patch with ifdown/ifup operations on the bond
> and on its slaves.
>
> But bond_update_slave_arr is expected to hold RTNL only and NO other
> lock, and it has WARN_ON(lockdep_is_held(&bond->mode_lock)) in
> bond_update_slave_arr. I'm not sure that holding mode_lock in
> bond_update_slave_arr while calling bond_3ad_get_active_agg_info is the
> correct approach.

That WARN_ON came up in discussion recently, and my opinion is that it's
incorrect, and is trying to ensure bond_update_slave_arr is safe for a
potential sleep when allocating memory.

https://lore.kernel.org/netdev/20210322123846.3024549-1-maximmi@nvidia.com/

The original authors haven't replied, so I would suggest you remove the
WARN_ON and the surrounding CONFIG_LOCKDEP ifdefs as part of your patch
and replace it with a call to might_sleep.

The other callers of bond_3ad_get_active_agg_info are generally obtaining
the state in order to report it to user space, so I think it's safe to
leave those calls not holding the mode_lock. The race is still there, but
the data returned to user space is a snapshot and so may reflect an
incomplete state during a transition. Further, having the inspection
functions acquire the mode_lock permits user space to spam the lock with
little effort.

-J

> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 74cbbb2..db988e5 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4406,7 +4406,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>  	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>  		struct ad_info ad_info;
>
> +		spin_lock_bh(&bond->mode_lock);
>  		if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
> +			spin_unlock_bh(&bond->mode_lock);
>  			pr_debug("bond_3ad_get_active_agg_info failed\n");
>  			/* No active aggragator means it's not safe to use
>  			 * the previous array.
> @@ -4414,6 +4416,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>  			bond_reset_slave_arr(bond);
>  			goto out;
>  		}
> +		spin_unlock_bh(&bond->mode_lock);
>  		agg_id = ad_info.aggregator_id;
>  	}
>  	bond_for_each_slave(bond, slave, iter) {

---
-Jay Vosburgh, jay.vosburgh@canonical.com
On 2021/4/20 13:04, Jay Vosburgh wrote:
> jin yiting <jinyiting@huawei.com> wrote:
> [...]
>>> The described issue is a race condition (in that
>>> ad_agg_selection_logic clears agg->is_active under mode_lock, but
>>> bond_open -> bond_update_slave_arr is inspecting agg->is_active outside
>>> the lock). I don't see how the above change will reliably manage this;
>>> the real issue looks to be that bond_update_slave_arr is committing
>>> changes to the array (via bond_reset_slave_arr) based on a racy
>>> inspection of the active aggregator state while it is in flux.
>>>
>>> Also, the description of the issue says "The best aggregator in
>>> ad_agg_selection_logic has not changed, no need to update slave arr,"
>>> but the change above does the opposite, and will set update_slave_arr
>>> when the aggregator has not changed (update_slave_arr remains false at
>>> return of ad_agg_selection_logic).
>>>
>>> I believe I understand the described problem, but I don't see
>>> how the patch fixes it. I suspect (but haven't tested) that the proper
>>> fix is to acquire mode_lock in bond_update_slave_arr while calling
>>> bond_3ad_get_active_agg_info to avoid conflict with the state machine.
>>>
>>> -J
>>>
>>> ---
>>> -Jay Vosburgh, jay.vosburgh@canonical.com
>>> .
>>>
>>
>> Thank you for your reply. The previous patch does indeed make a redundant
>> slave arr update; thank you for the correction.
>>
>> As you said, holding mode_lock in bond_update_slave_arr while calling
>> bond_3ad_get_active_agg_info can avoid the conflict with the state
>> machine. I have tested this patch with ifdown/ifup operations on the bond
>> and on its slaves.
>>
>> But bond_update_slave_arr is expected to hold RTNL only and NO other
>> lock, and it has WARN_ON(lockdep_is_held(&bond->mode_lock)) in
>> bond_update_slave_arr. I'm not sure that holding mode_lock in
>> bond_update_slave_arr while calling bond_3ad_get_active_agg_info is the
>> correct approach.
>
> That WARN_ON came up in discussion recently, and my opinion is that it's
> incorrect, and is trying to ensure bond_update_slave_arr is safe for a
> potential sleep when allocating memory.
>
> https://lore.kernel.org/netdev/20210322123846.3024549-1-maximmi@nvidia.com/
>
> The original authors haven't replied, so I would suggest you remove the
> WARN_ON and the surrounding CONFIG_LOCKDEP ifdefs as part of your patch
> and replace it with a call to might_sleep.
>
> The other callers of bond_3ad_get_active_agg_info are generally obtaining
> the state in order to report it to user space, so I think it's safe to
> leave those calls not holding the mode_lock. The race is still there, but
> the data returned to user space is a snapshot and so may reflect an
> incomplete state during a transition. Further, having the inspection
> functions acquire the mode_lock permits user space to spam the lock with
> little effort.
>
> -J
>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 74cbbb2..db988e5 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -4406,7 +4406,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>>  	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
>>  		struct ad_info ad_info;
>>
>> +		spin_lock_bh(&bond->mode_lock);
>>  		if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
>> +			spin_unlock_bh(&bond->mode_lock);
>>  			pr_debug("bond_3ad_get_active_agg_info failed\n");
>>  			/* No active aggragator means it's not safe to use
>>  			 * the previous array.
>> @@ -4414,6 +4416,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
>>  			bond_reset_slave_arr(bond);
>>  			goto out;
>>  		}
>> +		spin_unlock_bh(&bond->mode_lock);
>>  		agg_id = ad_info.aggregator_id;
>>  	}
>>  	bond_for_each_slave(bond, slave, iter) {
>
> ---
> -Jay Vosburgh, jay.vosburgh@canonical.com
> .
>

I have removed the WARN_ON and the surrounding CONFIG_LOCKDEP ifdefs in
the new patch and replaced them with a call to might_sleep, and I will
send a new patch. Thank you for your guidance.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 74cbbb2..83ef62d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4391,9 +4391,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	int agg_id = 0;
 	int ret = 0;
 
-#ifdef CONFIG_LOCKDEP
-	WARN_ON(lockdep_is_held(&bond->mode_lock));
-#endif
+	might_sleep();
 
 	usable_slaves = kzalloc(struct_size(usable_slaves, arr,
 					    bond->slave_cnt), GFP_KERNEL);
@@ -4406,7 +4404,9 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
 		struct ad_info ad_info;
 
+		spin_lock_bh(&bond->mode_lock);
 		if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
+			spin_unlock_bh(&bond->mode_lock);
 			pr_debug("bond_3ad_get_active_agg_info failed\n");
 			/* No active aggragator means it's not safe to use
 			 * the previous array.
@@ -4414,6 +4414,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 			bond_reset_slave_arr(bond);
 			goto out;
 		}
+		spin_unlock_bh(&bond->mode_lock);
 		agg_id = ad_info.aggregator_id;
 	}
 	bond_for_each_slave(bond, slave, iter) {
--
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 6908822..d100079 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2327,6 +2327,12 @@ void bond_3ad_state_machine_handler(struct work_struct *work)
 
 		aggregator = __get_first_agg(port);
 		ad_agg_selection_logic(aggregator, &update_slave_arr);
+		if (!update_slave_arr) {
+			struct aggregator *active = __get_active_agg(aggregator);
+
+			if (active && active->is_active)
+				update_slave_arr = true;
+		}
 	}
 	bond_3ad_set_carrier(bond);
 }