diff mbox series

[RFC] net, sock.h: Make sure accesses to a fullsock when it is indeed one

Message ID 20240228074308.3623984-2-zegao@tencent.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Headers show
Series [RFC] net, sock.h: Make sure accesses to a fullsock when it is indeed one | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3486 this patch: 3486
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 4 of 4 maintainers
netdev/build_clang success Errors and warnings before: 1008 this patch: 1008
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 3700 this patch: 3700
netdev/checkpatch warning WARNING: From:/Signed-off-by: email address mismatch: 'From: Ze Gao <zegao2021@gmail.com>' != 'Signed-off-by: Ze Gao <zegao@tencent.com>'
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Ze Gao Feb. 28, 2024, 7:43 a.m. UTC
We know a pointer that has type struct sock* can actually points to
one of some different sock types which have different memory layouts,
take req_to_sk() for example, and whether a sock is full or not
depends upon ->sk_state which is a shared field among them so that we
see some repeated code pattern similar to this:

	if (sk && sk fullsock(sk) && sk->field_not_shared)

which seems to have no problem at the first glance, but it is actually
unsound in a way that ->field_not_shared is likely uninitialized (or
unmapped) when it's not a full sock, and a compiler is free to reorder
accesses to fields of a struct sock when it can, that is, it could
reorder accesses to ->field_not_shared across ->sk_state or load them
all before the branch test, which leads to unexpected behavior, although
most of them won't do this.

So leave a barrier() in between and force the compiler to keep the
obvious program order.

Cc: Honglin Li <honglinli@tencent.com>
Signed-off-by: Ze Gao <zegao@tencent.com>
---

IIUC, casting a pointer to refer to a bigger object in size is
technically UB, which may lead to unsound code. From the POV of
a compiler, when it is allowed to assume that one struct member
is valid, they all are through a pointer, and thus it's likely
for the compiler to do such optimizations and reorder what we
want to keep in order.

Note this is not a typical way to use barrier(), which only
acts an ok fix to what's already unsound, at least IMO.

Comments are welcome, since I'm not an expert in C and I know
most of compilers won't do this reorder, but I'm being pessimistic
here.

Happy to learn from your sage insights and better solutions (or
no solutions at all if this is indeed not a problem in the first
place)

Regards,
        -- Ze

 include/net/sock.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Eric Dumazet Feb. 28, 2024, 8:34 a.m. UTC | #1
On Wed, Feb 28, 2024 at 8:46 AM Ze Gao <zegao2021@gmail.com> wrote:
>
> We know a pointer that has type struct sock* can actually points to
> one of some different sock types which have different memory layouts,
> take req_to_sk() for example, and whether a sock is full or not
> depends upon ->sk_state which is a shared field among them so that we
> see some repeated code pattern similar to this:
>
>         if (sk && sk fullsock(sk) && sk->field_not_shared)
>
> which seems to have no problem at the first glance, but it is actually
> unsound in a way that ->field_not_shared is likely uninitialized (or
> unmapped) when it's not a full sock, and a compiler is free to reorder
> accesses to fields of a struct sock when it can, that is, it could
> reorder accesses to ->field_not_shared across ->sk_state or load them
> all before the branch test, which leads to unexpected behavior, although
> most of them won't do this.
>
> So leave a barrier() in between and force the compiler to keep the
> obvious program order.
>
> Cc: Honglin Li <honglinli@tencent.com>
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
>
> IIUC, casting a pointer to refer to a bigger object in size is
> technically UB, which may lead to unsound code. From the POV of
> a compiler, when it is allowed to assume that one struct member
> is valid, they all are through a pointer, and thus it's likely
> for the compiler to do such optimizations and reorder what we
> want to keep in order.
>
> Note this is not a typical way to use barrier(), which only
> acts an ok fix to what's already unsound, at least IMO.
>
> Comments are welcome, since I'm not an expert in C and I know
> most of compilers won't do this reorder, but I'm being pessimistic
> here.

Well, my suggestion is to have evidence first...

We are not going to add barriers just because we do not trust
compilers handling of sequence points.
Ze Gao Feb. 28, 2024, 8:58 a.m. UTC | #2
On Wed, Feb 28, 2024 at 4:34 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Feb 28, 2024 at 8:46 AM Ze Gao <zegao2021@gmail.com> wrote:
> >
> > We know a pointer that has type struct sock* can actually points to
> > one of some different sock types which have different memory layouts,
> > take req_to_sk() for example, and whether a sock is full or not
> > depends upon ->sk_state which is a shared field among them so that we
> > see some repeated code pattern similar to this:
> >
> >         if (sk && sk fullsock(sk) && sk->field_not_shared)
> >
> > which seems to have no problem at the first glance, but it is actually
> > unsound in a way that ->field_not_shared is likely uninitialized (or
> > unmapped) when it's not a full sock, and a compiler is free to reorder
> > accesses to fields of a struct sock when it can, that is, it could
> > reorder accesses to ->field_not_shared across ->sk_state or load them
> > all before the branch test, which leads to unexpected behavior, although
> > most of them won't do this.
> >
> > So leave a barrier() in between and force the compiler to keep the
> > obvious program order.
> >
> > Cc: Honglin Li <honglinli@tencent.com>
> > Signed-off-by: Ze Gao <zegao@tencent.com>
> > ---
> >
> > IIUC, casting a pointer to refer to a bigger object in size is
> > technically UB, which may lead to unsound code. From the POV of
> > a compiler, when it is allowed to assume that one struct member
> > is valid, they all are through a pointer, and thus it's likely
> > for the compiler to do such optimizations and reorder what we
> > want to keep in order.
> >
> > Note this is not a typical way to use barrier(), which only
> > acts an ok fix to what's already unsound, at least IMO.
> >
> > Comments are welcome, since I'm not an expert in C and I know
> > most of compilers won't do this reorder, but I'm being pessimistic
> > here.
>
> Well, my suggestion is to have evidence first...

Fair point!  my initial purpose is to raise my question here to check if
there is UB here and see if C experts have insights on this.

> We are not going to add barriers just because we do not trust
> compilers handling of sequence points.

Makes sense to me as well if this is indeed not an issue here.

Thanks,
        -- Ze
diff mbox series

Patch

diff --git a/include/net/sock.h b/include/net/sock.h
index 92f7ea62a915..f7e3960cb5fc 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2815,7 +2815,14 @@  skb_sk_is_prefetched(struct sk_buff *skb)
  */
 static inline bool sk_fullsock(const struct sock *sk)
 {
-	return (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
+	bool ret = (1 << sk->sk_state) & ~(TCPF_TIME_WAIT | TCPF_NEW_SYN_RECV);
+
+	/*
+	 * Make sure all accesses to a full sock happens right
+	 * after ->sk_state.
+	 */
+	barrier();
+	return ret;
 }
 
 static inline bool