diff mbox series

[net] tcp: Add READ_ONCE() to read tcp_orphan_count

Message ID 20220512103322.380405-1-liujian56@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] tcp: Add READ_ONCE() to read tcp_orphan_count | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Liu Jian May 12, 2022, 10:33 a.m. UTC
The tcp_orphan_count per-CPU variable is read locklessly, so this commit
add the READ_ONCE() to a load in order to avoid below KCSAN warnning:

BUG: KCSAN: data-race in tcp_orphan_count_sum net/ipv4/tcp.c:2476 [inline]
BUG: KCSAN: data-race in tcp_orphan_update+0x64/0x100 net/ipv4/tcp.c:2487

race at unknown origin, with read to 0xffff9c63bbdac7a8 of 4 bytes by interrupt on cpu 2:
 tcp_orphan_count_sum net/ipv4/tcp.c:2476 [inline]
 tcp_orphan_update+0x64/0x100 net/ipv4/tcp.c:2487
 call_timer_fn+0x33/0x210 kernel/time/timer.c:1414

Fixes: 19757cebf0c5 ("tcp: switch orphan_count to bare per-cpu counters")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
 net/ipv4/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Dumazet May 12, 2022, 8:06 p.m. UTC | #1
On Thu, May 12, 2022 at 3:32 AM Liu Jian <liujian56@huawei.com> wrote:
>
> The tcp_orphan_count per-CPU variable is read locklessly, so this commit
> add the READ_ONCE() to a load in order to avoid below KCSAN warnning:
>
> BUG: KCSAN: data-race in tcp_orphan_count_sum net/ipv4/tcp.c:2476 [inline]
> BUG: KCSAN: data-race in tcp_orphan_update+0x64/0x100 net/ipv4/tcp.c:2487
>
> race at unknown origin, with read to 0xffff9c63bbdac7a8 of 4 bytes by interrupt on cpu 2:
>  tcp_orphan_count_sum net/ipv4/tcp.c:2476 [inline]
>  tcp_orphan_update+0x64/0x100 net/ipv4/tcp.c:2487
>  call_timer_fn+0x33/0x210 kernel/time/timer.c:1414
>
> Fixes: 19757cebf0c5 ("tcp: switch orphan_count to bare per-cpu counters")
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
>  net/ipv4/tcp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index cf18fbcbf123..7245609f41e6 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2718,7 +2718,7 @@ int tcp_orphan_count_sum(void)
>         int i, total = 0;
>
>         for_each_possible_cpu(i)
> -               total += per_cpu(tcp_orphan_count, i);
> +               total += READ_ONCE(per_cpu(tcp_orphan_count, i));

We might raise the discussion to lkml and/or KCSAN supporters.

Presumably, all per_cpu() uses in the kernel will have the same issue ?

By definition per-cpu data can be changed by other cpus.

So maybe per_cpu() should contain the annotation, instead of having to
annotate all users.


>
>         return max(total, 0);
>  }
> --
> 2.17.1
>
Marco Elver May 12, 2022, 9:18 p.m. UTC | #2
On Thu, 12 May 2022 at 22:06, Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, May 12, 2022 at 3:32 AM Liu Jian <liujian56@huawei.com> wrote:
> >
> > The tcp_orphan_count per-CPU variable is read locklessly, so this commit
> > add the READ_ONCE() to a load in order to avoid below KCSAN warnning:
> >
> > BUG: KCSAN: data-race in tcp_orphan_count_sum net/ipv4/tcp.c:2476 [inline]
> > BUG: KCSAN: data-race in tcp_orphan_update+0x64/0x100 net/ipv4/tcp.c:2487
> >
> > race at unknown origin, with read to 0xffff9c63bbdac7a8 of 4 bytes by interrupt on cpu 2:
> >  tcp_orphan_count_sum net/ipv4/tcp.c:2476 [inline]
> >  tcp_orphan_update+0x64/0x100 net/ipv4/tcp.c:2487
> >  call_timer_fn+0x33/0x210 kernel/time/timer.c:1414
> >
> > Fixes: 19757cebf0c5 ("tcp: switch orphan_count to bare per-cpu counters")
> > Signed-off-by: Liu Jian <liujian56@huawei.com>
> > ---
> >  net/ipv4/tcp.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index cf18fbcbf123..7245609f41e6 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -2718,7 +2718,7 @@ int tcp_orphan_count_sum(void)
> >         int i, total = 0;
> >
> >         for_each_possible_cpu(i)
> > -               total += per_cpu(tcp_orphan_count, i);
> > +               total += READ_ONCE(per_cpu(tcp_orphan_count, i));
>
> We might raise the discussion to lkml and/or KCSAN supporters.
>
> Presumably, all per_cpu() uses in the kernel will have the same issue ?
>
> By definition per-cpu data can be changed by other cpus.
>
> So maybe per_cpu() should contain the annotation, instead of having to
> annotate all users.

I guess the question is, is it the norm that per_cpu() retrieves data
that can legally be modified concurrently, or not. If not, and in most
cases it's a bug, the annotations should be here.

Paul, was there any guidance/documentation on this, but I fail to find
it right now? (access-marking.txt doesn't say much about per-CPU
data.)

Thanks,
-- Marco
Eric Dumazet May 12, 2022, 9:31 p.m. UTC | #3
On Thu, May 12, 2022 at 2:18 PM Marco Elver <elver@google.com> wrote:

>
> I guess the question is, is it the norm that per_cpu() retrieves data
> that can legally be modified concurrently, or not. If not, and in most
> cases it's a bug, the annotations should be here.
>
> Paul, was there any guidance/documentation on this, but I fail to find
> it right now? (access-marking.txt doesn't say much about per-CPU
> data.)

Normally, whenever we add a READ_ONCE(), we are supposed to add a comment.

We could make an exception for per_cpu_once(), because the comment
would be centralized
at per_cpu_once() definition.

We will be stuck with READ_ONCE() in places we are using
per_cpu_ptr(), for example
in dev_fetch_sw_netstats()

diff --git a/net/core/dev.c b/net/core/dev.c
index 1461c2d9dec8099a9a2d43a704b4c6cb0375f480..b66470291d7b7e6c33161093d71e40587f9ed838
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10381,10 +10381,13 @@ void dev_fetch_sw_netstats(struct
rtnl_link_stats64 *s,
                stats = per_cpu_ptr(netstats, cpu);
                do {
                        start = u64_stats_fetch_begin_irq(&stats->syncp);
-                       tmp.rx_packets = stats->rx_packets;
-                       tmp.rx_bytes   = stats->rx_bytes;
-                       tmp.tx_packets = stats->tx_packets;
-                       tmp.tx_bytes   = stats->tx_bytes;
+                       /* These values can change under us.
+                        * READ_ONCE() pair with too many write sides...
+                        */
+                       tmp.rx_packets = READ_ONCE(stats->rx_packets);
+                       tmp.rx_bytes   = READ_ONCE(stats->rx_bytes);
+                       tmp.tx_packets = READ_ONCE(stats->tx_packets);
+                       tmp.tx_bytes   = READ_ONCE(stats->tx_bytes);
                } while (u64_stats_fetch_retry_irq(&stats->syncp, start));

                s->rx_packets += tmp.rx_packets;
Paul E. McKenney May 12, 2022, 11:10 p.m. UTC | #4
On Thu, May 12, 2022 at 02:31:48PM -0700, Eric Dumazet wrote:
> On Thu, May 12, 2022 at 2:18 PM Marco Elver <elver@google.com> wrote:
> 
> >
> > I guess the question is, is it the norm that per_cpu() retrieves data
> > that can legally be modified concurrently, or not. If not, and in most
> > cases it's a bug, the annotations should be here.
> >
> > Paul, was there any guidance/documentation on this, but I fail to find
> > it right now? (access-marking.txt doesn't say much about per-CPU
> > data.)
> 
> Normally, whenever we add a READ_ONCE(), we are supposed to add a comment.

I am starting to think that comments are even more necessary for unmarked
accesses to shared variables, with the comments setting out why the
compiler cannot mess things up.  ;-)

> We could make an exception for per_cpu_once(), because the comment
> would be centralized
> at per_cpu_once() definition.

This makes a lot of sense to me.

> We will be stuck with READ_ONCE() in places we are using
> per_cpu_ptr(), for example
> in dev_fetch_sw_netstats()

If this is strictly statistics, data_race() is another possibility.
But it does not constrain the compiler at all.

							Thanx, Paul

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 1461c2d9dec8099a9a2d43a704b4c6cb0375f480..b66470291d7b7e6c33161093d71e40587f9ed838
> 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10381,10 +10381,13 @@ void dev_fetch_sw_netstats(struct
> rtnl_link_stats64 *s,
>                 stats = per_cpu_ptr(netstats, cpu);
>                 do {
>                         start = u64_stats_fetch_begin_irq(&stats->syncp);
> -                       tmp.rx_packets = stats->rx_packets;
> -                       tmp.rx_bytes   = stats->rx_bytes;
> -                       tmp.tx_packets = stats->tx_packets;
> -                       tmp.tx_bytes   = stats->tx_bytes;
> +                       /* These values can change under us.
> +                        * READ_ONCE() pair with too many write sides...
> +                        */
> +                       tmp.rx_packets = READ_ONCE(stats->rx_packets);
> +                       tmp.rx_bytes   = READ_ONCE(stats->rx_bytes);
> +                       tmp.tx_packets = READ_ONCE(stats->tx_packets);
> +                       tmp.tx_bytes   = READ_ONCE(stats->tx_bytes);
>                 } while (u64_stats_fetch_retry_irq(&stats->syncp, start));
> 
>                 s->rx_packets += tmp.rx_packets;
Eric Dumazet May 12, 2022, 11:43 p.m. UTC | #5
On Thu, May 12, 2022 at 4:10 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, May 12, 2022 at 02:31:48PM -0700, Eric Dumazet wrote:
> > On Thu, May 12, 2022 at 2:18 PM Marco Elver <elver@google.com> wrote:
> >
> > >
> > > I guess the question is, is it the norm that per_cpu() retrieves data
> > > that can legally be modified concurrently, or not. If not, and in most
> > > cases it's a bug, the annotations should be here.
> > >
> > > Paul, was there any guidance/documentation on this, but I fail to find
> > > it right now? (access-marking.txt doesn't say much about per-CPU
> > > data.)
> >
> > Normally, whenever we add a READ_ONCE(), we are supposed to add a comment.
>
> I am starting to think that comments are even more necessary for unmarked
> accesses to shared variables, with the comments setting out why the
> compiler cannot mess things up.  ;-)
>
> > We could make an exception for per_cpu_once(), because the comment
> > would be centralized
> > at per_cpu_once() definition.
>
> This makes a lot of sense to me.
>
> > We will be stuck with READ_ONCE() in places we are using
> > per_cpu_ptr(), for example
> > in dev_fetch_sw_netstats()
>
> If this is strictly statistics, data_race() is another possibility.
> But it does not constrain the compiler at all.

Statistics are supposed to be monotonically increasing ;)

Some SNMP agents would be very confused if they could observe 'garbage' there.

I sense that we are going to add thousands of READ_ONCE() soon :/
David Laight May 13, 2022, 11:08 a.m. UTC | #6
> Statistics are supposed to be monotonically increasing ;)
> 
> Some SNMP agents would be very confused if they could observe 'garbage' there.

Don't look inside tg3 :-)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Paul E. McKenney May 17, 2022, 8:28 p.m. UTC | #7
On Thu, May 12, 2022 at 04:43:20PM -0700, Eric Dumazet wrote:
> On Thu, May 12, 2022 at 4:10 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, May 12, 2022 at 02:31:48PM -0700, Eric Dumazet wrote:
> > > On Thu, May 12, 2022 at 2:18 PM Marco Elver <elver@google.com> wrote:
> > >
> > > >
> > > > I guess the question is, is it the norm that per_cpu() retrieves data
> > > > that can legally be modified concurrently, or not. If not, and in most
> > > > cases it's a bug, the annotations should be here.
> > > >
> > > > Paul, was there any guidance/documentation on this, but I fail to find
> > > > it right now? (access-marking.txt doesn't say much about per-CPU
> > > > data.)
> > >
> > > Normally, whenever we add a READ_ONCE(), we are supposed to add a comment.
> >
> > I am starting to think that comments are even more necessary for unmarked
> > accesses to shared variables, with the comments setting out why the
> > compiler cannot mess things up.  ;-)
> >
> > > We could make an exception for per_cpu_once(), because the comment
> > > would be centralized
> > > at per_cpu_once() definition.
> >
> > This makes a lot of sense to me.
> >
> > > We will be stuck with READ_ONCE() in places we are using
> > > per_cpu_ptr(), for example
> > > in dev_fetch_sw_netstats()
> >
> > If this is strictly statistics, data_race() is another possibility.
> > But it does not constrain the compiler at all.
> 
> Statistics are supposed to be monotonically increasing ;)
> 
> Some SNMP agents would be very confused if they could observe 'garbage' there.
> 
> I sense that we are going to add thousands of READ_ONCE() soon :/

Indeed, adding READ_ONCE() instances can be annoying.  Then again, it
can also be annoying to have to debug the problems that sometimes arise
from omitting them where they are needed.

							Thanx, Paul
diff mbox series

Patch

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index cf18fbcbf123..7245609f41e6 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2718,7 +2718,7 @@  int tcp_orphan_count_sum(void)
 	int i, total = 0;
 
 	for_each_possible_cpu(i)
-		total += per_cpu(tcp_orphan_count, i);
+		total += READ_ONCE(per_cpu(tcp_orphan_count, i));
 
 	return max(total, 0);
 }