diff mbox series

ipv6: fix locking issues with loops over idev->addr_list

Message ID 20220322213406.55977-1-dossche.niels@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series ipv6: fix locking issues with loops over idev->addr_list | expand

Checks

Context Check Description
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2097 this patch: 2097
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 305 this patch: 305
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 2226 this patch: 2226
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Niels Dossche March 22, 2022, 9:34 p.m. UTC
idev->addr_list needs to be protected by idev->lock. However, it is not
always possible to do so while iterating and performing actions on
inet6_ifaddr instances. For example, multiple functions (like
addrconf_{join,leave}_anycast) eventually call down to other functions
that acquire the idev->lock. The current code temporarily unlocked the
idev->lock during the loops, which can cause race conditions. Moving the
locks up is also not an appropriate solution as the ordering of lock
acquisition will be inconsistent with for example mc_lock.

This solution adds an additional field to inet6_ifaddr that is used
to temporarily add the instances to a temporary list while holding
idev->lock. The temporary list can then be traversed without holding
idev->lock. This change was done in two places. In addrconf_ifdown, the
list_for_each_entry_safe variant of the list loop is also no longer
necessary as there is no deletion within that specific loop.

The remaining loop in addrconf_ifdown that unlocks idev->lock in its
loop body is of no issue. This is because that loop always gets the
first entry and performs the delete and condition check under the
idev->lock.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
---

This was previously discussed in the mailing thread of
[PATCH v2] ipv6: acquire write lock for addr_list in dev_forward_change

 include/net/if_inet6.h |  7 +++++++
 net/ipv6/addrconf.c    | 29 +++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 6 deletions(-)

Comments

Paolo Abeni March 23, 2022, 8:20 a.m. UTC | #1
On Tue, 2022-03-22 at 22:34 +0100, Niels Dossche wrote:
> idev->addr_list needs to be protected by idev->lock. However, it is not
> always possible to do so while iterating and performing actions on
> inet6_ifaddr instances. For example, multiple functions (like
> addrconf_{join,leave}_anycast) eventually call down to other functions
> that acquire the idev->lock. The current code temporarily unlocked the
> idev->lock during the loops, which can cause race conditions. Moving the
> locks up is also not an appropriate solution as the ordering of lock
> acquisition will be inconsistent with for example mc_lock.
> 
> This solution adds an additional field to inet6_ifaddr that is used
> to temporarily add the instances to a temporary list while holding
> idev->lock. The temporary list can then be traversed without holding
> idev->lock. This change was done in two places. In addrconf_ifdown, the
> list_for_each_entry_safe variant of the list loop is also no longer
> necessary as there is no deletion within that specific loop.
> 
> The remaining loop in addrconf_ifdown that unlocks idev->lock in its
> loop body is of no issue. This is because that loop always gets the
> first entry and performs the delete and condition check under the
> idev->lock.
> 
> Suggested-by: Paolo Abeni <pabeni@redhat.com>
> Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
> ---
> 
> This was previously discussed in the mailing thread of
> [PATCH v2] ipv6: acquire write lock for addr_list in dev_forward_change
> 
>  include/net/if_inet6.h |  7 +++++++
>  net/ipv6/addrconf.c    | 29 +++++++++++++++++++++++------
>  2 files changed, 30 insertions(+), 6 deletions(-)
> 
> diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
> index f026cf08a8e8..a17f29f75e9a 100644
> --- a/include/net/if_inet6.h
> +++ b/include/net/if_inet6.h
> @@ -64,6 +64,13 @@ struct inet6_ifaddr {
>  
>  	struct hlist_node	addr_lst;
>  	struct list_head	if_list;
> +	/*
> +	 * Used to safely traverse idev->addr_list in process context
> +	 * if the idev->lock needed to protect idev->addr_list cannot be held.
> +	 * In that case, add the items to this list temporarily and iterate
> +	 * without holding idev->lock. See addrconf_ifdown and dev_forward_change.
> +	 */
> +	struct list_head	if_list_aux;
>  
>  	struct list_head	tmp_list;
>  	struct inet6_ifaddr	*ifpub;
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index f908e2fd30b2..72790d1934c7 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -800,6 +800,7 @@ static void dev_forward_change(struct inet6_dev *idev)
>  {
>  	struct net_device *dev;
>  	struct inet6_ifaddr *ifa;
> +	LIST_HEAD(tmp_addr_list);
>  
>  	if (!idev)
>  		return;
> @@ -818,14 +819,23 @@ static void dev_forward_change(struct inet6_dev *idev)
>  		}
>  	}
>  
> +	read_lock_bh(&idev->lock);
>  	list_for_each_entry(ifa, &idev->addr_list, if_list) {
>  		if (ifa->flags&IFA_F_TENTATIVE)
>  			continue;
> +		list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
> +	}
> +	read_unlock_bh(&idev->lock);
> +
> +	while (!list_empty(&tmp_addr_list)) {
> +		ifa = list_first_entry(&tmp_addr_list, struct inet6_ifaddr, if_list_aux);
> +		list_del(&ifa->if_list_aux);
>  		if (idev->cnf.forwarding)
>  			addrconf_join_anycast(ifa);
>  		else
>  			addrconf_leave_anycast(ifa);
>  	}
> +
>  	inet6_netconf_notify_devconf(dev_net(dev), RTM_NEWNETCONF,
>  				     NETCONFA_FORWARDING,
>  				     dev->ifindex, &idev->cnf);
> @@ -3730,10 +3740,11 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>  	unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
>  	struct net *net = dev_net(dev);
>  	struct inet6_dev *idev;
> -	struct inet6_ifaddr *ifa, *tmp;
> +	struct inet6_ifaddr *ifa;
>  	bool keep_addr = false;
>  	bool was_ready;
>  	int state, i;
> +	LIST_HEAD(tmp_addr_list);

Very minot nit: I guess it's better to try to enforce the reverse x-mas
tree order for newly added variables - that is: this declaration should
me moved up, just after 'ifa'.

>  	ASSERT_RTNL();
>  
> @@ -3822,16 +3833,23 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>  		write_lock_bh(&idev->lock);
>  	}
>  
> -	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
> +	list_for_each_entry(ifa, &idev->addr_list, if_list) {
> +		list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
> +	}


Other minor nit: the braces are not required here.

Otherwise LGTM:

Acked-by: Paolo Abeni <pabeni@redhat.com>

However this looks like net-next material, and we are in the merge
window right now. I think you should re-post in (slighly less) than 2w.
Please add the target tree into the subj.

Side note: AFAICS after this patch, there is still the suspicious
tempaddr_list usage in addrconf_ifdown to be handled.

Cheers,

Paolo
Niels Dossche March 23, 2022, 12:56 p.m. UTC | #2
On 23/03/2022 09:20, Paolo Abeni wrote:
> On Tue, 2022-03-22 at 22:34 +0100, Niels Dossche wrote:
>> idev->addr_list needs to be protected by idev->lock. However, it is not
>> always possible to do so while iterating and performing actions on
>> inet6_ifaddr instances. For example, multiple functions (like
>> addrconf_{join,leave}_anycast) eventually call down to other functions
>> that acquire the idev->lock. The current code temporarily unlocked the
>> idev->lock during the loops, which can cause race conditions. Moving the
>> locks up is also not an appropriate solution as the ordering of lock
>> acquisition will be inconsistent with for example mc_lock.
>>
>> This solution adds an additional field to inet6_ifaddr that is used
>> to temporarily add the instances to a temporary list while holding
>> idev->lock. The temporary list can then be traversed without holding
>> idev->lock. This change was done in two places. In addrconf_ifdown, the
>> list_for_each_entry_safe variant of the list loop is also no longer
>> necessary as there is no deletion within that specific loop.
>>
>> The remaining loop in addrconf_ifdown that unlocks idev->lock in its
>> loop body is of no issue. This is because that loop always gets the
>> first entry and performs the delete and condition check under the
>> idev->lock.
>>
>> Suggested-by: Paolo Abeni <pabeni@redhat.com>
>> Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
>> ---
>>
>> This was previously discussed in the mailing thread of
>> [PATCH v2] ipv6: acquire write lock for addr_list in dev_forward_change
>>
>>  include/net/if_inet6.h |  7 +++++++
>>  net/ipv6/addrconf.c    | 29 +++++++++++++++++++++++------
>>  2 files changed, 30 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
>> index f026cf08a8e8..a17f29f75e9a 100644
>> --- a/include/net/if_inet6.h
>> +++ b/include/net/if_inet6.h
>> @@ -64,6 +64,13 @@ struct inet6_ifaddr {
>>  
>>  	struct hlist_node	addr_lst;
>>  	struct list_head	if_list;
>> +	/*
>> +	 * Used to safely traverse idev->addr_list in process context
>> +	 * if the idev->lock needed to protect idev->addr_list cannot be held.
>> +	 * In that case, add the items to this list temporarily and iterate
>> +	 * without holding idev->lock. See addrconf_ifdown and dev_forward_change.
>> +	 */
>> +	struct list_head	if_list_aux;
>>  
>>  	struct list_head	tmp_list;
>>  	struct inet6_ifaddr	*ifpub;
>> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
>> index f908e2fd30b2..72790d1934c7 100644
>> --- a/net/ipv6/addrconf.c
>> +++ b/net/ipv6/addrconf.c
>> @@ -800,6 +800,7 @@ static void dev_forward_change(struct inet6_dev *idev)
>>  {
>>  	struct net_device *dev;
>>  	struct inet6_ifaddr *ifa;
>> +	LIST_HEAD(tmp_addr_list);
>>  
>>  	if (!idev)
>>  		return;
>> @@ -818,14 +819,23 @@ static void dev_forward_change(struct inet6_dev *idev)
>>  		}
>>  	}
>>  
>> +	read_lock_bh(&idev->lock);
>>  	list_for_each_entry(ifa, &idev->addr_list, if_list) {
>>  		if (ifa->flags&IFA_F_TENTATIVE)
>>  			continue;
>> +		list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
>> +	}
>> +	read_unlock_bh(&idev->lock);
>> +
>> +	while (!list_empty(&tmp_addr_list)) {
>> +		ifa = list_first_entry(&tmp_addr_list, struct inet6_ifaddr, if_list_aux);
>> +		list_del(&ifa->if_list_aux);
>>  		if (idev->cnf.forwarding)
>>  			addrconf_join_anycast(ifa);
>>  		else
>>  			addrconf_leave_anycast(ifa);
>>  	}
>> +
>>  	inet6_netconf_notify_devconf(dev_net(dev), RTM_NEWNETCONF,
>>  				     NETCONFA_FORWARDING,
>>  				     dev->ifindex, &idev->cnf);
>> @@ -3730,10 +3740,11 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>>  	unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
>>  	struct net *net = dev_net(dev);
>>  	struct inet6_dev *idev;
>> -	struct inet6_ifaddr *ifa, *tmp;
>> +	struct inet6_ifaddr *ifa;
>>  	bool keep_addr = false;
>>  	bool was_ready;
>>  	int state, i;
>> +	LIST_HEAD(tmp_addr_list);
> 
> Very minot nit: I guess it's better to try to enforce the reverse x-mas
> tree order for newly added variables - that is: this declaration should
> me moved up, just after 'ifa'.
> 
>>  	ASSERT_RTNL();
>>  
>> @@ -3822,16 +3833,23 @@ static int addrconf_ifdown(struct net_device *dev, bool unregister)
>>  		write_lock_bh(&idev->lock);
>>  	}
>>  
>> -	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
>> +	list_for_each_entry(ifa, &idev->addr_list, if_list) {
>> +		list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
>> +	}
> 
> 
> Other minor nit: the braces are not required here.
> 
> Otherwise LGTM:
> 
> Acked-by: Paolo Abeni <pabeni@redhat.com>
> 

I will apply the suggestions to the patch, thanks for the review.

> However this looks like net-next material, and we are in the merge
> window right now. I think you should re-post in (slighly less) than 2w.
> Please add the target tree into the subj.
> 

Okay, I'll send it for net-next when the merge window is closed.

> Side note: AFAICS after this patch, there is still the suspicious
> tempaddr_list usage in addrconf_ifdown to be handled.
> 
> Cheers,
> 
> Paolo
> 

I have taken a look at that loop. I'm not sure that it is really problematic since the iteration and deletion is at least atomic under idev->lock.
But I can write a patch for that as well.

Cheers
Niels
diff mbox series

Patch

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index f026cf08a8e8..a17f29f75e9a 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -64,6 +64,13 @@  struct inet6_ifaddr {
 
 	struct hlist_node	addr_lst;
 	struct list_head	if_list;
+	/*
+	 * Used to safely traverse idev->addr_list in process context
+	 * if the idev->lock needed to protect idev->addr_list cannot be held.
+	 * In that case, add the items to this list temporarily and iterate
+	 * without holding idev->lock. See addrconf_ifdown and dev_forward_change.
+	 */
+	struct list_head	if_list_aux;
 
 	struct list_head	tmp_list;
 	struct inet6_ifaddr	*ifpub;
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f908e2fd30b2..72790d1934c7 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -800,6 +800,7 @@  static void dev_forward_change(struct inet6_dev *idev)
 {
 	struct net_device *dev;
 	struct inet6_ifaddr *ifa;
+	LIST_HEAD(tmp_addr_list);
 
 	if (!idev)
 		return;
@@ -818,14 +819,23 @@  static void dev_forward_change(struct inet6_dev *idev)
 		}
 	}
 
+	read_lock_bh(&idev->lock);
 	list_for_each_entry(ifa, &idev->addr_list, if_list) {
 		if (ifa->flags&IFA_F_TENTATIVE)
 			continue;
+		list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
+	}
+	read_unlock_bh(&idev->lock);
+
+	while (!list_empty(&tmp_addr_list)) {
+		ifa = list_first_entry(&tmp_addr_list, struct inet6_ifaddr, if_list_aux);
+		list_del(&ifa->if_list_aux);
 		if (idev->cnf.forwarding)
 			addrconf_join_anycast(ifa);
 		else
 			addrconf_leave_anycast(ifa);
 	}
+
 	inet6_netconf_notify_devconf(dev_net(dev), RTM_NEWNETCONF,
 				     NETCONFA_FORWARDING,
 				     dev->ifindex, &idev->cnf);
@@ -3730,10 +3740,11 @@  static int addrconf_ifdown(struct net_device *dev, bool unregister)
 	unsigned long event = unregister ? NETDEV_UNREGISTER : NETDEV_DOWN;
 	struct net *net = dev_net(dev);
 	struct inet6_dev *idev;
-	struct inet6_ifaddr *ifa, *tmp;
+	struct inet6_ifaddr *ifa;
 	bool keep_addr = false;
 	bool was_ready;
 	int state, i;
+	LIST_HEAD(tmp_addr_list);
 
 	ASSERT_RTNL();
 
@@ -3822,16 +3833,23 @@  static int addrconf_ifdown(struct net_device *dev, bool unregister)
 		write_lock_bh(&idev->lock);
 	}
 
-	list_for_each_entry_safe(ifa, tmp, &idev->addr_list, if_list) {
+	list_for_each_entry(ifa, &idev->addr_list, if_list) {
+		list_add_tail(&ifa->if_list_aux, &tmp_addr_list);
+	}
+	write_unlock_bh(&idev->lock);
+
+	while (!list_empty(&tmp_addr_list)) {
 		struct fib6_info *rt = NULL;
 		bool keep;
 
+		ifa = list_first_entry(&tmp_addr_list, struct inet6_ifaddr, if_list_aux);
+		list_del(&ifa->if_list_aux);
+
 		addrconf_del_dad_work(ifa);
 
 		keep = keep_addr && (ifa->flags & IFA_F_PERMANENT) &&
 			!addr_is_local(&ifa->addr);
 
-		write_unlock_bh(&idev->lock);
 		spin_lock_bh(&ifa->lock);
 
 		if (keep) {
@@ -3862,15 +3880,14 @@  static int addrconf_ifdown(struct net_device *dev, bool unregister)
 			addrconf_leave_solict(ifa->idev, &ifa->addr);
 		}
 
-		write_lock_bh(&idev->lock);
 		if (!keep) {
+			write_lock_bh(&idev->lock);
 			list_del_rcu(&ifa->if_list);
+			write_unlock_bh(&idev->lock);
 			in6_ifa_put(ifa);
 		}
 	}
 
-	write_unlock_bh(&idev->lock);
-
 	/* Step 5: Discard anycast and multicast list */
 	if (unregister) {
 		ipv6_ac_destroy_dev(idev);