[net] af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash

Message ID 20240620203009.2610301-1-mhal@rbox.co (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net] af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 859 this patch: 859
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           fail     1 blamed authors not CCed: Rao.Shoaib@oracle.com; 1 maintainers not CCed: Rao.Shoaib@oracle.com
netdev/build_clang              success  Errors and warnings before: 863 this patch: 863
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 867 this patch: 867
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 48 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-06-21--03-00 (tests: 659)

Commit Message

Michal Luczaj June 20, 2024, 8:20 p.m. UTC
AF_UNIX socket tracks the most recent OOB packet (in its receive queue)
with an `oob_skb` pointer. BPF redirecting does not account for that: when
an OOB packet is moved between sockets, `oob_skb` is left outdated. This
results in a single skb that may be accessed from two different sockets.

Take the easy way out: silently drop MSG_OOB data targeting any socket that
is in a sockmap or a sockhash. Note that such silent drop is akin to the
fate of redirected skb's scm_fp_list (SCM_RIGHTS, SCM_CREDENTIALS).

For symmetry, forbid MSG_OOB in unix_bpf_recvmsg().

Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
 net/unix/af_unix.c  | 30 +++++++++++++++++++++++++++++-
 net/unix/unix_bpf.c |  3 +++
 2 files changed, 32 insertions(+), 1 deletion(-)
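
The stale-pointer problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_skb`, `toy_sock` and every `toy_*` function are hypothetical stand-ins invented for illustration, the "receive queue" holds a single packet, and nothing here is kernel code or the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these are kernel types. */
struct toy_skb {
	int id;
};

struct toy_sock {
	struct toy_skb *queue;   /* single-slot "receive queue" */
	struct toy_skb *oob_skb; /* most recent OOB packet, if any */
};

/* Queue an OOB packet: it sits in the queue and is also tracked by oob_skb. */
static void toy_queue_oob(struct toy_sock *sk, struct toy_skb *skb)
{
	assert(sk->queue == NULL); /* the model holds one packet at a time */
	sk->queue = skb;
	sk->oob_skb = skb;
}

/* Naive redirect: moves the packet but leaves src->oob_skb untouched. */
static void toy_redirect_naive(struct toy_sock *src, struct toy_sock *dst)
{
	dst->queue = src->queue;
	src->queue = NULL;
}

/*
 * Redirect that silently drops OOB data, in the spirit of the patch.
 * Returns true if a packet was handed to dst. The model does not free
 * anything; it only forgets the dropped packet.
 */
static bool toy_redirect_drop_oob(struct toy_sock *src, struct toy_sock *dst)
{
	struct toy_skb *skb = src->queue;

	src->queue = NULL;
	if (!skb)
		return false;
	if (skb == src->oob_skb) {
		src->oob_skb = NULL;
		return false;
	}
	dst->queue = skb;
	return true;
}

/* True when one packet is reachable from both sockets. */
static bool toy_is_shared(const struct toy_sock *a, const struct toy_sock *b)
{
	return a->oob_skb != NULL && a->oob_skb == b->queue;
}
```

In this model, `toy_redirect_naive()` after `toy_queue_oob()` leaves `toy_is_shared()` true, which corresponds to the "single skb accessed from two different sockets" case. With `toy_redirect_drop_oob()` the packet never reaches the destination and the source's `oob_skb` is cleared.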

Comments

Kuniyuki Iwashima June 20, 2024, 10:12 p.m. UTC | #1
Sorry for not mentioning this before, but could you replace "net" with
"bpf" in Subject and rebase the patch on bpf.git so that we can trigger
the patchwork's CI ?

https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git


From: Michal Luczaj <mhal@rbox.co>
Date: Thu, 20 Jun 2024 22:20:05 +0200
> AF_UNIX socket tracks the most recent OOB packet (in its receive queue)
> with an `oob_skb` pointer. BPF redirecting does not account for that: when
> an OOB packet is moved between sockets, `oob_skb` is left outdated. This
> results in a single skb that may be accessed from two different sockets.
> 
> Take the easy way out: silently drop MSG_OOB data targeting any socket that
> is in a sockmap or a sockhash. Note that such silent drop is akin to the
> fate of redirected skb's scm_fp_list (SCM_RIGHTS, SCM_CREDENTIALS).
> 
> For symmetry, forbid MSG_OOB in unix_bpf_recvmsg().
> 
> Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> Fixes: 314001f0bf92 ("af_unix: Add OOB support")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
> ---
>  net/unix/af_unix.c  | 30 +++++++++++++++++++++++++++++-
>  net/unix/unix_bpf.c |  3 +++
>  2 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 5e695a9a609c..3a55d075f199 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2653,10 +2653,38 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
>  
>  static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
>  {
> +	struct unix_sock *u = unix_sk(sk);
> +	struct sk_buff *skb;
> +	int err;
> +
>  	if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED))
>  		return -ENOTCONN;
>  
> -	return unix_read_skb(sk, recv_actor);
> +	mutex_lock(&u->iolock);
> +	skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);

	mutex_unlock(&u->iolock);

I think we can drop mutex here as the skb is already unlinked
and no receiver can touch it.

and the below part can be like the following not to slow down
the common case:

	if (!skb)
		return err;

> +
> +#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
> +	if (skb) {

	if (unlikely(skb == READ_ONCE(u->oob_skb))) {


> +		bool drop = false;
> +
> +		spin_lock(&sk->sk_receive_queue.lock);
> +		if (skb == u->oob_skb) {

		if (likely(skb == u->oob_skb)) {

> +			WRITE_ONCE(u->oob_skb, NULL);
> +			drop = true;
> +		}
> +		spin_unlock(&sk->sk_receive_queue.lock);
> +
> +		if (drop) {
> +			WARN_ON_ONCE(skb_unref(skb));
> +			kfree_skb(skb);
> +			skb = NULL;
> +			err = -EAGAIN;
			return -EAGAIN;

> +		}
> +	}
> +#endif

	return recv_actor(sk, skb);

Thanks!

> +
> +	mutex_unlock(&u->iolock);
> +	return skb ? recv_actor(sk, skb) : err;
>  }
>  
>  static int unix_stream_read_generic(struct unix_stream_read_state *state,
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index bd84785bf8d6..bca2d86ba97d 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -54,6 +54,9 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
>  	struct sk_psock *psock;
>  	int copied;
>  
> +	if (flags & MSG_OOB)
> +		return -EOPNOTSUPP;
> +
>  	if (!len)
>  		return 0;
>  
> -- 
> 2.45.1

Jakub Kicinski June 22, 2024, 12:02 a.m. UTC | #2
On Thu, 20 Jun 2024 15:12:23 -0700 Kuniyuki Iwashima wrote:
> Sorry for not mentioning this before, but could you replace "net" with
> "bpf" in Subject and rebase the patch on bpf.git so that we can trigger
> the patchwork's CI ?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

netdev runs the BPF CI, too, FWIW.

Open the patch in patchwork:
https://patchwork.kernel.org/project/netdevbpf/patch/20240620203009.2610301-1-mhal@rbox.co/
Click on contest in "checks".
Select Executor = "gh-bpf-ci".
Click on "outputs", you should get to:
https://github.com/kernel-patches/bpf/actions/runs/9607623089
If you click in context on the branch name it will take you to
the tested branch:
https://github.com/linux-netdev/testing/commits/net-next-2024-06-21--03-00
which had:
  af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
applied, 5th from the top.

Michal Luczaj June 22, 2024, 10:38 p.m. UTC | #3
On 6/21/24 00:12, Kuniyuki Iwashima wrote:
> Sorry for not mentioning this before, but could you replace "net" with
> "bpf" in Subject and rebase the patch on bpf.git so that we can trigger
> the patchwork's CI ?

No problem, will do.

>> ...
>>  static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
>>  {
>> +	struct unix_sock *u = unix_sk(sk);
>> +	struct sk_buff *skb;
>> +	int err;
>> +
>>  	if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED))
>>  		return -ENOTCONN;
>>  
>> -	return unix_read_skb(sk, recv_actor);
>> +	mutex_lock(&u->iolock);
>> +	skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);
> 
> 	mutex_unlock(&u->iolock);
> 
> I think we can drop mutex here as the skb is already unlinked
> and no receiver can touch it.

I guess you're right about the mutex. That said, double mea culpa, lack of
state lock makes things racy:

unix_stream_read_skb
  mutex_lock
  skb = skb_recv_datagram
  mutex_unlock
  spin_lock
  if (oob_skb == skb) {
				unix_release_sock
				  if (u->oob_skb) {
				    kfree_skb(u->oob_skb)
				    u->oob_skb = NULL
				  }
    oob_skb = NULL
    drop = true
  }
  spin_unlock
  if (drop) {
    skb_unref(skb)
    kfree_skb(skb)
  }

In v2 I'll do what unix_stream_read_generic() does: take state lock and
check for SOCK_DEAD.
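
The interleaving above, and the "take state lock and check for SOCK_DEAD" idea, can be sketched as a userspace model. Everything here is an assumption made for illustration: the `toy_*` names are hypothetical, a `pthread_mutex_t` stands in for the state lock, a plain `dead` flag stands in for SOCK_DEAD, the `-ECONNRESET` return value is a guess, and the two-reference accounting (one held by the queue, one by `oob_skb`) is inferred from the `skb_unref()` plus `kfree_skb()` pair in the patch. It is not the v2 code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy packet: refs counts outstanding references (hypothetical model). */
struct toy_skb {
	int refs;
};

struct toy_sock {
	pthread_mutex_t state_lock; /* stand-in for the state lock */
	bool dead;                  /* stand-in for SOCK_DEAD */
	struct toy_skb *oob_skb;    /* holds one reference when non-NULL */
};

/* Drop one reference; the assert fires on a double free. */
static void toy_put(struct toy_skb *skb)
{
	assert(skb->refs > 0);
	skb->refs--;
}

/* Release path from the diagram: mark dead, drop the OOB reference. */
static void toy_release(struct toy_sock *sk)
{
	pthread_mutex_lock(&sk->state_lock);
	sk->dead = true;
	if (sk->oob_skb) {
		toy_put(sk->oob_skb);
		sk->oob_skb = NULL;
	}
	pthread_mutex_unlock(&sk->state_lock);
}

/*
 * Reader path: the caller already dequeued skb and owns the queue
 * reference. Returns 0 if skb may be passed on, a negative errno if
 * it was dropped here.
 */
static int toy_read(struct toy_sock *sk, struct toy_skb *skb)
{
	int ret = 0;

	pthread_mutex_lock(&sk->state_lock);
	if (sk->dead) {
		/* Release already handled oob_skb; drop only our reference. */
		toy_put(skb);
		ret = -ECONNRESET;
	} else if (skb == sk->oob_skb) {
		sk->oob_skb = NULL;
		toy_put(skb); /* reference held via oob_skb */
		toy_put(skb); /* reference held by the dequeuer */
		ret = -EAGAIN;
	}
	pthread_mutex_unlock(&sk->state_lock);
	return ret;
}
```

Because both paths check and update `oob_skb` under the same lock, whichever runs first drops the OOB reference and the other one sees either `dead` or a NULL `oob_skb`. In either order the model's reference count ends at exactly zero.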

> and the below part can be like the following not to slow down
> the common case:
> 
> 	if (!skb)
> 		return err;
> 
>> +
>> +#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
>> +	if (skb) {
> 
> 	if (unlikely(skb == READ_ONCE(u->oob_skb))) {
> 
> 
>> +		bool drop = false;
>> +
>> +		spin_lock(&sk->sk_receive_queue.lock);
>> +		if (skb == u->oob_skb) {
> 
> 		if (likely(skb == u->oob_skb)) {
> 
>> +			WRITE_ONCE(u->oob_skb, NULL);
>> +			drop = true;
>> +		}
>> +		spin_unlock(&sk->sk_receive_queue.lock);
>> +
>> +		if (drop) {
>> +			WARN_ON_ONCE(skb_unref(skb));
>> +			kfree_skb(skb);
>> +			skb = NULL;
>> +			err = -EAGAIN;
> 			return -EAGAIN;
> 
>> +		}
>> +	}
>> +#endif
> 
> 	return recv_actor(sk, skb);

All right, thanks. So here's v2:
https://lore.kernel.org/netdev/20240622223324.3337956-1-mhal@rbox.co/

Patch

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 5e695a9a609c..3a55d075f199 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2653,10 +2653,38 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
 
 static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
 {
+	struct unix_sock *u = unix_sk(sk);
+	struct sk_buff *skb;
+	int err;
+
 	if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED))
 		return -ENOTCONN;
 
-	return unix_read_skb(sk, recv_actor);
+	mutex_lock(&u->iolock);
+	skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);
+
+#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
+	if (skb) {
+		bool drop = false;
+
+		spin_lock(&sk->sk_receive_queue.lock);
+		if (skb == u->oob_skb) {
+			WRITE_ONCE(u->oob_skb, NULL);
+			drop = true;
+		}
+		spin_unlock(&sk->sk_receive_queue.lock);
+
+		if (drop) {
+			WARN_ON_ONCE(skb_unref(skb));
+			kfree_skb(skb);
+			skb = NULL;
+			err = -EAGAIN;
+		}
+	}
+#endif
+
+	mutex_unlock(&u->iolock);
+	return skb ? recv_actor(sk, skb) : err;
 }
 
 static int unix_stream_read_generic(struct unix_stream_read_state *state,
diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
index bd84785bf8d6..bca2d86ba97d 100644
--- a/net/unix/unix_bpf.c
+++ b/net/unix/unix_bpf.c
@@ -54,6 +54,9 @@ static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
 	struct sk_psock *psock;
 	int copied;
 
+	if (flags & MSG_OOB)
+		return -EOPNOTSUPP;
+
 	if (!len)
 		return 0;