[net] ipv6: make ip6_rt_gc_expire an atomic_t

Message ID: 20220413181333.649424-1-eric.dumazet@gmail.com (mailing list archive)
State: Accepted
Commit: 9cb7c013420f98fa6fd12fc6a5dc055170c108db
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 5032 this patch: 5032
netdev/cc_maintainers           warning  2 maintainers not CCed: yoshfuji@linux-ipv6.org fw@strlen.de
netdev/build_clang              success  Errors and warnings before: 1130 this patch: 1130
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             fail     Problems with Fixes tag: 1
netdev/build_allmodconfig_warn  success  Errors and warnings before: 5172 this patch: 5172
netdev/checkpatch               warning  CHECK: spaces preferred around that '*' (ctx:VxV); WARNING: Possible repeated word: 'Google'
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Eric Dumazet April 13, 2022, 6:13 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

Reads and writes to ip6_rt_gc_expire have always been racy,
as syzbot recently reported [1].

There is a possible risk of underflow, leading
to an unexpectedly high value being passed to fib6_run_gc(),
although I have not observed this in the field.

Hosts hitting ip6_dst_gc() very hard are in a pretty bad
state anyway.

[1]
BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc

read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1:
 ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
 dst_alloc+0x9b/0x160 net/core/dst.c:86
 ip6_dst_alloc net/ipv6/route.c:344 [inline]
 icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
 mld_send_cr net/ipv6/mcast.c:2119 [inline]
 mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
 worker_thread+0x618/0xa70 kernel/workqueue.c:2436
 kthread+0x1a9/0x1e0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30

read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0:
 ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
 dst_alloc+0x9b/0x160 net/core/dst.c:86
 ip6_dst_alloc net/ipv6/route.c:344 [inline]
 icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
 mld_send_cr net/ipv6/mcast.c:2119 [inline]
 mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
 worker_thread+0x618/0xa70 kernel/workqueue.c:2436
 kthread+0x1a9/0x1e0 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30

value changed: 0x00000bb3 -> 0x00000ba9

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: mld mld_ifc_work

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
---
 include/net/netns/ipv6.h |  4 ++--
 net/ipv6/route.c         | 11 ++++++-----
 2 files changed, 8 insertions(+), 7 deletions(-)

Comments

David Ahern April 14, 2022, 12:39 a.m. UTC | #1
On 4/13/22 12:13 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Reads and writes to ip6_rt_gc_expire have always been racy,
> as syzbot recently reported [1].
> 
> There is a possible risk of underflow, leading
> to an unexpectedly high value being passed to fib6_run_gc(),
> although I have not observed this in the field.
> 
> Hosts hitting ip6_dst_gc() very hard are in a pretty bad
> state anyway.
> 
> [1]
> BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc
> 
> read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1:
>  ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
>  dst_alloc+0x9b/0x160 net/core/dst.c:86
>  ip6_dst_alloc net/ipv6/route.c:344 [inline]
>  icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
>  mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
>  mld_send_cr net/ipv6/mcast.c:2119 [inline]
>  mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
>  process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
>  worker_thread+0x618/0xa70 kernel/workqueue.c:2436
>  kthread+0x1a9/0x1e0 kernel/kthread.c:376
>  ret_from_fork+0x1f/0x30
> 
> read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0:
>  ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
>  dst_alloc+0x9b/0x160 net/core/dst.c:86
>  ip6_dst_alloc net/ipv6/route.c:344 [inline]
>  icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
>  mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
>  mld_send_cr net/ipv6/mcast.c:2119 [inline]
>  mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
>  process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
>  worker_thread+0x618/0xa70 kernel/workqueue.c:2436
>  kthread+0x1a9/0x1e0 kernel/kthread.c:376
>  ret_from_fork+0x1f/0x30
> 
> value changed: 0x00000bb3 -> 0x00000ba9
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Workqueue: mld mld_ifc_work
> 
> Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> ---
>  include/net/netns/ipv6.h |  4 ++--
>  net/ipv6/route.c         | 11 ++++++-----
>  2 files changed, 8 insertions(+), 7 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>

patchwork-bot+netdevbpf@kernel.org April 15, 2022, 9:50 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 13 Apr 2022 11:13:33 -0700 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Reads and writes to ip6_rt_gc_expire have always been racy,
> as syzbot recently reported [1].
> 
> There is a possible risk of underflow, leading
> to an unexpectedly high value being passed to fib6_run_gc(),
> although I have not observed this in the field.
> 
> [...]

Here is the summary with links:
  - [net] ipv6: make ip6_rt_gc_expire an atomic_t
    https://git.kernel.org/netdev/net/c/9cb7c013420f

You are awesome, thank you!

Patch

diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 3d83b64471d32391fb632e8c25e12a8ec7d1b42e..b4af4837d80b4ed47d05474432d5b8ebb42322e7 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -75,8 +75,8 @@  struct netns_ipv6 {
 	struct list_head	fib6_walkers;
 	rwlock_t		fib6_walker_lock;
 	spinlock_t		fib6_gc_lock;
-	unsigned int		 ip6_rt_gc_expire;
-	unsigned long		 ip6_rt_last_gc;
+	atomic_t		ip6_rt_gc_expire;
+	unsigned long		ip6_rt_last_gc;
 	unsigned char		flowlabel_has_excl;
 #ifdef CONFIG_IPV6_MULTIPLE_TABLES
 	bool			fib6_has_custom_rules;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 169e9df6d172ead607cc20a108c3371a20dbc632..c4b6ce017d5e3bf63c66a53df3d46c08370aed23 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3292,6 +3292,7 @@  static int ip6_dst_gc(struct dst_ops *ops)
 	int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
 	int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
 	unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
+	unsigned int val;
 	int entries;
 
 	entries = dst_entries_get_fast(ops);
@@ -3302,13 +3303,13 @@  static int ip6_dst_gc(struct dst_ops *ops)
 	    entries <= rt_max_size)
 		goto out;
 
-	net->ipv6.ip6_rt_gc_expire++;
-	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, true);
+	fib6_run_gc(atomic_inc_return(&net->ipv6.ip6_rt_gc_expire), net, true);
 	entries = dst_entries_get_slow(ops);
 	if (entries < ops->gc_thresh)
-		net->ipv6.ip6_rt_gc_expire = rt_gc_timeout>>1;
+		atomic_set(&net->ipv6.ip6_rt_gc_expire, rt_gc_timeout >> 1);
 out:
-	net->ipv6.ip6_rt_gc_expire -= net->ipv6.ip6_rt_gc_expire>>rt_elasticity;
+	val = atomic_read(&net->ipv6.ip6_rt_gc_expire);
+	atomic_set(&net->ipv6.ip6_rt_gc_expire, val - (val >> rt_elasticity));
 	return entries > rt_max_size;
 }
 
@@ -6509,7 +6510,7 @@  static int __net_init ip6_route_net_init(struct net *net)
 	net->ipv6.sysctl.ip6_rt_min_advmss = IPV6_MIN_MTU - 20 - 40;
 	net->ipv6.sysctl.skip_notify_on_dev_down = 0;
 
-	net->ipv6.ip6_rt_gc_expire = 30*HZ;
+	atomic_set(&net->ipv6.ip6_rt_gc_expire, 30*HZ);
 
 	ret = 0;
 out: