diff mbox series

neighbour: Prevent a dead entry from updating gc_list

Message ID 20210125195927.GA26972@chinagar-linux.qualcomm.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series neighbour: Prevent a dead entry from updating gc_list

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cc_maintainers           warning  9 maintainers not CCed: weichen.chen@linux.alibaba.com lirongqing@baidu.com liuhangbin@gmail.com nikolay@cumulusnetworks.com mrv@mojatatu.com davem@davemloft.net dsahern@kernel.org kuba@kernel.org jdike@akamai.com
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 3 this patch: 3
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               fail     ERROR: spaces required around that '=' (ctx:VxV)
netdev/build_allmodconfig_warn  success  Errors and warnings before: 3 this patch: 3
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Chinmay Agarwal Jan. 25, 2021, 7:59 p.m. UTC
The following race condition was detected:
<CPU A, t0> - neigh_flush_dev() is executing and calls neigh_mark_dead(n),
marking the neighbour entry 'n' as dead.

<CPU B, t1> - Executing: __netif_receive_skb() -> __netif_receive_skb_core()
-> arp_rcv() -> arp_process(). arp_process() calls __neigh_lookup(), which
takes a reference on neighbour entry 'n'.

<CPU A, t2> - Moves further along neigh_flush_dev() and calls
neigh_cleanup_and_release(n), but since the reference count was increased
at t1, 'n' cannot be destroyed.

<CPU B, t3> - Moves further along arp_process() and calls
neigh_update() -> __neigh_update() -> neigh_update_gc_list(), which adds
the neighbour entry back to gc_list (neigh_mark_dead() had removed it
from gc_list earlier, at t0).

<CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
the neighbour entry.

This leaves 'n' on gc_list even though the actual neighbour structure
has been freed.

This can be prevented by never allowing a dead entry to update gc_list,
which is what this patch does.

Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>
---
 net/core/neighbour.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--
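
The t0..t4 interleaving described above can be replayed sequentially in a small userspace model. This is only a sketch: struct toy_neigh, toy_release(), toy_update() and replay_race() are hypothetical names invented for illustration, not kernel APIs, and the model has no locking or real list handling. It only tracks whether the entry ends up freed while still marked as being on gc_list.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct neighbour; all fields are illustrative. */
struct toy_neigh {
	int  refcnt;     /* references currently held on the entry */
	bool dead;       /* set by the flush path (t0) */
	bool on_gc_list; /* whether the entry is linked into gc_list */
	bool freed;      /* set once the last reference is dropped */
};

/* Drop one reference; the entry is "freed" when the count hits zero. */
void toy_release(struct toy_neigh *n)
{
	assert(n->refcnt > 0);
	if (--n->refcnt == 0)
		n->freed = true;
}

/* Model of the update path; guard_dead selects the patched behaviour. */
void toy_update(struct toy_neigh *n, bool guard_dead)
{
	if (guard_dead && n->dead)
		return;           /* a dead entry never touches gc_list */
	n->on_gc_list = true;     /* unguarded path re-adds the entry */
}

/* Replay t0..t4; returns true if a freed entry is left on gc_list. */
bool replay_race(bool guard_dead)
{
	struct toy_neigh n = { .refcnt = 1, .on_gc_list = true };

	n.dead = true;              /* t0, CPU A: mark dead ...        */
	n.on_gc_list = false;       /*     ... and unlink from gc_list */
	n.refcnt++;                 /* t1, CPU B: lookup takes a ref   */
	toy_release(&n);            /* t2, CPU A: 2 -> 1, not freed    */
	toy_update(&n, guard_dead); /* t3, CPU B: update path          */
	toy_release(&n);            /* t4, CPU B: 1 -> 0, freed        */

	return n.freed && n.on_gc_list;
}
```

In this model, replay_race(false) reports the dangling gc_list membership and replay_race(true) does not, which is the property the patch is aiming for.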

Comments

Cong Wang Jan. 25, 2021, 9:07 p.m. UTC | #1
On Mon, Jan 25, 2021 at 11:59 AM Chinmay Agarwal
<chinagar@codeaurora.org> wrote:
>
> Following race condition was detected:
> <CPU A, t0> - neigh_flush_dev() is under execution and calls neigh_mark_dead(n),
> marking the neighbour entry 'n' as dead.
>
> <CPU B, t1> - Executing: __netif_receive_skb() -> __netif_receive_skb_core()
> -> arp_rcv() -> arp_process().arp_process() calls __neigh_lookup() which takes
> a reference on neighbour entry 'n'.
>
> <CPU A, t2> - Moves further along neigh_flush_dev() and calls
> neigh_cleanup_and_release(n), but since reference count increased in t2,
> 'n' couldn't be destroyed.
>
> <CPU B, t3> - Moves further along, arp_process() and calls
> neigh_update()-> __neigh_update() -> neigh_update_gc_list(), which adds
> the neighbour entry back in gc_list(neigh_mark_dead(), removed it
> earlier in t0 from gc_list)
>
> <CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
> the neighbour entry.
>
> This leads to 'n' still being part of gc_list, but the actual
> neighbour structure has been freed.
>
> The situation can be prevented from happening if we disallow a dead
> entry to have any possibility of updating gc_list. This is what the
> patch intends to achieve.
>
> Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>

Please add a Fixes tag for bug fixes; in this case it is probably:

Fixes: 9c29a2f55ec0 ("neighbor: Fix locking order for gc_list changes")

Also, make sure you run checkpatch.pl before sending. For your
patch, it will definitely complain about the missing spaces around the
assignment "new=old;".

Thanks.

Jakub Kicinski Jan. 25, 2021, 9:59 p.m. UTC | #2
On Tue, 26 Jan 2021 01:29:37 +0530 Chinmay Agarwal wrote:
> Following race condition was detected:
> <CPU A, t0> - neigh_flush_dev() is under execution and calls neigh_mark_dead(n),
> marking the neighbour entry 'n' as dead.
> 
> <CPU B, t1> - Executing: __netif_receive_skb() -> __netif_receive_skb_core()
> -> arp_rcv() -> arp_process().arp_process() calls __neigh_lookup() which takes  
> a reference on neighbour entry 'n'.
> 
> <CPU A, t2> - Moves further along neigh_flush_dev() and calls
> neigh_cleanup_and_release(n), but since reference count increased in t2,
> 'n' couldn't be destroyed.
> 
> <CPU B, t3> - Moves further along, arp_process() and calls
> neigh_update()-> __neigh_update() -> neigh_update_gc_list(), which adds
> the neighbour entry back in gc_list(neigh_mark_dead(), removed it
> earlier in t0 from gc_list)
> 
> <CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
> the neighbour entry.
> 
> This leads to 'n' still being part of gc_list, but the actual
> neighbour structure has been freed.
> 
> The situation can be prevented from happening if we disallow a dead
> entry to have any possibility of updating gc_list. This is what the
> patch intends to achieve.
> 
> Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>
> ---
>  net/core/neighbour.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index ff07358..cf8e3076 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1244,13 +1244,14 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
>  	old    = neigh->nud_state;
>  	err    = -EPERM;
>  
> -	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
> -	    (old & (NUD_NOARP | NUD_PERMANENT)))
> -		goto out;
>  	if (neigh->dead) {
>  		NL_SET_ERR_MSG(extack, "Neighbor entry is now dead");
> +		new=old;
>  		goto out;
>  	}
> +	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
> +	    (old & (NUD_NOARP | NUD_PERMANENT)))
> +		goto out;
>  
>  	ext_learn_change = neigh_update_ext_learned(neigh, flags, &notify);
>  

Please run checkpatch on your patches:

ERROR: spaces required around that '=' (ctx:VxV)
#52: FILE: net/core/neighbour.c:1249:
+		new=old;
 		   ^

Patch

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ff07358..cf8e3076 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1244,13 +1244,14 @@  static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 	old    = neigh->nud_state;
 	err    = -EPERM;
 
-	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
-	    (old & (NUD_NOARP | NUD_PERMANENT)))
-		goto out;
 	if (neigh->dead) {
 		NL_SET_ERR_MSG(extack, "Neighbor entry is now dead");
+		new=old;
 		goto out;
 	}
+	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
+	    (old & (NUD_NOARP | NUD_PERMANENT)))
+		goto out;
 
 	ext_learn_change = neigh_update_ext_learned(neigh, flags, &notify);
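
The effect of moving the dead check first and resetting the new state can be sketched in isolation. The model below rests on an assumption that is not visible in this hunk: that the code after the out: label decides whether to call neigh_update_gc_list() by comparing the old and new states for a change in NUD_PERMANENT. The function toy_gc_list_touched() and the TOY_* constants are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative state bits standing in for NUD_NOARP / NUD_PERMANENT. */
enum { TOY_NOARP = 0x40, TOY_PERMANENT = 0x80 };

/*
 * Returns true if the (assumed) check after out: would go on to update
 * gc_list. 'patched' selects between the ordering before and after
 * this patch.
 */
bool toy_gc_list_touched(unsigned old_st, unsigned new_st,
			 bool dead, bool admin, bool patched)
{
	if (patched) {
		if (dead) {
			new_st = old_st; /* nothing appears to change */
			goto out;
		}
		if (!admin && (old_st & (TOY_NOARP | TOY_PERMANENT)))
			goto out;
	} else {
		/* Old ordering: this check can exit before 'dead' is seen. */
		if (!admin && (old_st & (TOY_NOARP | TOY_PERMANENT)))
			goto out;
		if (dead)
			goto out;        /* new_st may still differ */
	}
	/* The normal state update would happen here; omitted. */
out:
	/* Assumed trigger: a change in the PERMANENT bit. */
	return ((new_st ^ old_st) & TOY_PERMANENT) != 0;
}
```

Under that assumption, a dead entry can reach the gc_list update in the old ordering through either early exit, while in the patched ordering both the reordering and the new_st = old_st reset are needed to rule it out.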