[v3,8/9] io_uring/cmd: BPF hook for getsockopt cmd

Message ID	20230817145554.892543-9-leitao@debian.org (mailing list archive)
State	New
Headers	show Return-Path: <io-uring-owner@vger.kernel.org> From: Breno Leitao <leitao@debian.org> To: sdf@google.com, axboe@kernel.dk, asml.silence@gmail.com, willemdebruijn.kernel@gmail.com, martin.lau@linux.dev Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, io-uring@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, krisman@suse.de Subject: [PATCH v3 8/9] io_uring/cmd: BPF hook for getsockopt cmd Date: Thu, 17 Aug 2023 07:55:53 -0700 Message-Id: <20230817145554.892543-9-leitao@debian.org> In-Reply-To: <20230817145554.892543-1-leitao@debian.org> References: <20230817145554.892543-1-leitao@debian.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	io_uring: Initial support for {s,g}etsockopt commands \| expand [v3,0/9] io_uring: Initial support for {s,g}etsockopt commands [v3,1/9] bpf: Leverage sockptr_t in BPF getsockopt hook [v3,2/9] bpf: Leverage sockptr_t in BPF setsockopt hook [v3,3/9] net/socket: Break down __sys_setsockopt [v3,4/9] io_uring/cmd: Pass compat mode in issue_flags [v3,5/9] selftests/net: Extract uring helpers to be reusable [v3,6/9] io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT [v3,7/9] io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT [v3,8/9] io_uring/cmd: BPF hook for getsockopt cmd [v3,9/9] selftests/bpf/sockopt: Add io_uring support

Breno Leitao Aug. 17, 2023, 2:55 p.m. UTC

Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
through io_uring.

This implementation follows a similar approach to what
__sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
kernel pointer.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 io_uring/uring_cmd.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

Gabriel Krisman Bertazi Aug. 17, 2023, 7:08 p.m. UTC | #1

Breno Leitao <leitao@debian.org> writes:

> Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
> programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
> through io_uring.
>
> This implementation follows a similar approach to what
> __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
> kernel pointer.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  io_uring/uring_cmd.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> index a567dd32df00..9e08a14760c3 100644
> --- a/io_uring/uring_cmd.c
> +++ b/io_uring/uring_cmd.c
> @@ -5,6 +5,8 @@
>  #include <linux/io_uring.h>
>  #include <linux/security.h>
>  #include <linux/nospec.h>
> +#include <linux/compat.h>
> +#include <linux/bpf-cgroup.h>
>  
>  #include <uapi/linux/io_uring.h>
>  #include <uapi/asm-generic/ioctls.h>
> @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock,
>  	if (err)
>  		return err;
>  
> -	if (level == SOL_SOCKET) {
> +	err = -EOPNOTSUPP;
> +	if (level == SOL_SOCKET)
>  		err = sk_getsockopt(sock->sk, level, optname,
>  				    USER_SOCKPTR(optval),
>  				    KERNEL_SOCKPTR(&optlen));
> -		if (err)
> -			return err;
>  
> +	if (!(issue_flags & IO_URING_F_COMPAT))
> +		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level,
> +						     optname,
> +						     USER_SOCKPTR(optval),
> +						     KERNEL_SOCKPTR(&optlen),
> +						     optlen, err);
> +
> +	if (!err)
>  		return optlen;
> -	}

Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
running the hook?  Before this patch, it would bail out with EOPNOTSUPP,
but now the bpf hook gets called even for level!=SOL_SOCKET, which
doesn't fit __sys_getsockopt. Am I misreading the code?

Breno Leitao Aug. 21, 2023, 9:14 a.m. UTC | #2

On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote:
> Breno Leitao <leitao@debian.org> writes:
> 
> > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
> > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
> > through io_uring.
> >
> > This implementation follows a similar approach to what
> > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
> > kernel pointer.
> >
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> > ---
> >  io_uring/uring_cmd.c | 18 +++++++++++++-----
> >  1 file changed, 13 insertions(+), 5 deletions(-)
> >
> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> > index a567dd32df00..9e08a14760c3 100644
> > --- a/io_uring/uring_cmd.c
> > +++ b/io_uring/uring_cmd.c
> > @@ -5,6 +5,8 @@
> >  #include <linux/io_uring.h>
> >  #include <linux/security.h>
> >  #include <linux/nospec.h>
> > +#include <linux/compat.h>
> > +#include <linux/bpf-cgroup.h>
> >  
> >  #include <uapi/linux/io_uring.h>
> >  #include <uapi/asm-generic/ioctls.h>
> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock,
> >  	if (err)
> >  		return err;
> >  
> > -	if (level == SOL_SOCKET) {
> > +	err = -EOPNOTSUPP;
> > +	if (level == SOL_SOCKET)
> >  		err = sk_getsockopt(sock->sk, level, optname,
> >  				    USER_SOCKPTR(optval),
> >  				    KERNEL_SOCKPTR(&optlen));
> > -		if (err)
> > -			return err;
> >  
> > +	if (!(issue_flags & IO_URING_F_COMPAT))
> > +		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level,
> > +						     optname,
> > +						     USER_SOCKPTR(optval),
> > +						     KERNEL_SOCKPTR(&optlen),
> > +						     optlen, err);
> > +
> > +	if (!err)
> >  		return optlen;
> > -	}
> 
> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
> running the hook?
> Before this patch, it would bail out with EOPNOTSUPP,
> but now the bpf hook gets called even for level!=SOL_SOCKET, which
> doesn't fit __sys_getsockopt. Am I misreading the code?

Not really, sock->ops->getsockopt() does not suport sockptr_t, but
__user addresses, differently from setsockopt()

          int             (*setsockopt)(struct socket *sock, int level,
                                        int optname, sockptr_t optval,
                                        unsigned int optlen);
          int             (*getsockopt)(struct socket *sock, int level,
                                        int optname, char __user *optval, int __user *optlen);

In order to be able to call sock->ops->getsockopt(), the callback
function will need to accepted sockptr.

Gabriel Krisman Bertazi Aug. 21, 2023, 5:03 p.m. UTC | #3

Breno Leitao <leitao@debian.org> writes:

> On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote:
>> Breno Leitao <leitao@debian.org> writes:
>> 
>> > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
>> > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
>> > through io_uring.
>> >
>> > This implementation follows a similar approach to what
>> > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
>> > kernel pointer.
>> >
>> > Signed-off-by: Breno Leitao <leitao@debian.org>
>> > ---
>> >  io_uring/uring_cmd.c | 18 +++++++++++++-----
>> >  1 file changed, 13 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
>> > index a567dd32df00..9e08a14760c3 100644
>> > --- a/io_uring/uring_cmd.c
>> > +++ b/io_uring/uring_cmd.c
>> > @@ -5,6 +5,8 @@
>> >  #include <linux/io_uring.h>
>> >  #include <linux/security.h>
>> >  #include <linux/nospec.h>
>> > +#include <linux/compat.h>
>> > +#include <linux/bpf-cgroup.h>
>> >  
>> >  #include <uapi/linux/io_uring.h>
>> >  #include <uapi/asm-generic/ioctls.h>
>> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock,
>> >  	if (err)
>> >  		return err;
>> >  
>> > -	if (level == SOL_SOCKET) {
>> > +	err = -EOPNOTSUPP;
>> > +	if (level == SOL_SOCKET)
>> >  		err = sk_getsockopt(sock->sk, level, optname,
>> >  				    USER_SOCKPTR(optval),
>> >  				    KERNEL_SOCKPTR(&optlen));
>> > -		if (err)
>> > -			return err;
>> >  
>> > +	if (!(issue_flags & IO_URING_F_COMPAT))
>> > +		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level,
>> > +						     optname,
>> > +						     USER_SOCKPTR(optval),
>> > +						     KERNEL_SOCKPTR(&optlen),
>> > +						     optlen, err);
>> > +
>> > +	if (!err)
>> >  		return optlen;
>> > -	}
>> 
>> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
>> running the hook?
>> Before this patch, it would bail out with EOPNOTSUPP,
>> but now the bpf hook gets called even for level!=SOL_SOCKET, which
>> doesn't fit __sys_getsockopt. Am I misreading the code?
>
> Not really, sock->ops->getsockopt() does not suport sockptr_t, but
> __user addresses, differently from setsockopt()
>
>           int             (*setsockopt)(struct socket *sock, int level,
>                                         int optname, sockptr_t optval,
>                                         unsigned int optlen);
>           int             (*getsockopt)(struct socket *sock, int level,
>                                         int optname, char __user *optval, int __user *optlen);
>
> In order to be able to call sock->ops->getsockopt(), the callback
> function will need to accepted sockptr.

So, it seems you won't support !SOL_SOCKETs here.  Then, I think you
shouldn't call the hook for those sockets. My main concern is that we
remain compatible to __sys_getsockopt when invoking the hook.

I think you should just have the following as the very first thing in
the function (but after the security_ check).

if (level != SOL_SOCKET)
   return -EOPNOTSUPP;

Martin KaFai Lau Aug. 21, 2023, 8:25 p.m. UTC | #4

On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote:
> Breno Leitao <leitao@debian.org> writes:
> 
>> Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
>> programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
>> through io_uring.
>>
>> This implementation follows a similar approach to what
>> __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
>> kernel pointer.
>>
>> Signed-off-by: Breno Leitao <leitao@debian.org>
>> ---
>>   io_uring/uring_cmd.c | 18 +++++++++++++-----
>>   1 file changed, 13 insertions(+), 5 deletions(-)
>>
>> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
>> index a567dd32df00..9e08a14760c3 100644
>> --- a/io_uring/uring_cmd.c
>> +++ b/io_uring/uring_cmd.c
>> @@ -5,6 +5,8 @@
>>   #include <linux/io_uring.h>
>>   #include <linux/security.h>
>>   #include <linux/nospec.h>
>> +#include <linux/compat.h>
>> +#include <linux/bpf-cgroup.h>
>>   
>>   #include <uapi/linux/io_uring.h>
>>   #include <uapi/asm-generic/ioctls.h>
>> @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock,
>>   	if (err)
>>   		return err;
>>   
>> -	if (level == SOL_SOCKET) {
>> +	err = -EOPNOTSUPP;
>> +	if (level == SOL_SOCKET)
>>   		err = sk_getsockopt(sock->sk, level, optname,
>>   				    USER_SOCKPTR(optval),
>>   				    KERNEL_SOCKPTR(&optlen));
>> -		if (err)
>> -			return err;
>>   
>> +	if (!(issue_flags & IO_URING_F_COMPAT))
>> +		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level,
>> +						     optname,
>> +						     USER_SOCKPTR(optval),
>> +						     KERNEL_SOCKPTR(&optlen),
>> +						     optlen, err);
>> +
>> +	if (!err)
>>   		return optlen;
>> -	}
> 
> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
> running the hook?  Before this patch, it would bail out with EOPNOTSUPP,
> but now the bpf hook gets called even for level!=SOL_SOCKET, which
> doesn't fit __sys_getsockopt. Am I misreading the code?
I agree it should not call into bpf if the io_uring cannot support non 
SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and 
optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in regular 
_sys_getsockopt(SOL_TCP), bpf expects the optval returned from tcp_getsockopt).

I think __sys_getsockopt can also be refactored similar to __sys_setsockopt in 
patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and __user 
*optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval)) can be added 
before calling ops->getsockopt()? Then this details can be hidden away from the 
io_uring.

David Laight Aug. 22, 2023, 1:50 p.m. UTC | #5

...
> Not really, sock->ops->getsockopt() does not suport sockptr_t, but
> __user addresses, differently from setsockopt()
> 
...
>           int             (*getsockopt)(struct socket *sock, int level,
>                                         int optname, char __user *optval, int __user *optlen);
> 
> In order to be able to call sock->ops->getsockopt(), the callback
> function will need to accepted sockptr.

It is also worth looking at whether 'optlen' can be passed in
as a numeric parameter and then returned on success.
That would move the user copy into the syscall wrapper.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Breno Leitao Aug. 23, 2023, 1:48 p.m. UTC | #6

On Mon, Aug 21, 2023 at 01:03:08PM -0400, Gabriel Krisman Bertazi wrote:
> Breno Leitao <leitao@debian.org> writes:
> 
> > On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote:
> >> Breno Leitao <leitao@debian.org> writes:
> >> 
> >> > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
> >> > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
> >> > through io_uring.
> >> >
> >> > This implementation follows a similar approach to what
> >> > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
> >> > kernel pointer.
> >> >
> >> > Signed-off-by: Breno Leitao <leitao@debian.org>
> >> > ---
> >> >  io_uring/uring_cmd.c | 18 +++++++++++++-----
> >> >  1 file changed, 13 insertions(+), 5 deletions(-)
> >> >
> >> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
> >> > index a567dd32df00..9e08a14760c3 100644
> >> > --- a/io_uring/uring_cmd.c
> >> > +++ b/io_uring/uring_cmd.c
> >> > @@ -5,6 +5,8 @@
> >> >  #include <linux/io_uring.h>
> >> >  #include <linux/security.h>
> >> >  #include <linux/nospec.h>
> >> > +#include <linux/compat.h>
> >> > +#include <linux/bpf-cgroup.h>
> >> >  
> >> >  #include <uapi/linux/io_uring.h>
> >> >  #include <uapi/asm-generic/ioctls.h>
> >> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock,
> >> >  	if (err)
> >> >  		return err;
> >> >  
> >> > -	if (level == SOL_SOCKET) {
> >> > +	err = -EOPNOTSUPP;
> >> > +	if (level == SOL_SOCKET)
> >> >  		err = sk_getsockopt(sock->sk, level, optname,
> >> >  				    USER_SOCKPTR(optval),
> >> >  				    KERNEL_SOCKPTR(&optlen));
> >> > -		if (err)
> >> > -			return err;
> >> >  
> >> > +	if (!(issue_flags & IO_URING_F_COMPAT))
> >> > +		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level,
> >> > +						     optname,
> >> > +						     USER_SOCKPTR(optval),
> >> > +						     KERNEL_SOCKPTR(&optlen),
> >> > +						     optlen, err);
> >> > +
> >> > +	if (!err)
> >> >  		return optlen;
> >> > -	}
> >> 
> >> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
> >> running the hook?
> >> Before this patch, it would bail out with EOPNOTSUPP,
> >> but now the bpf hook gets called even for level!=SOL_SOCKET, which
> >> doesn't fit __sys_getsockopt. Am I misreading the code?
> >
> > Not really, sock->ops->getsockopt() does not suport sockptr_t, but
> > __user addresses, differently from setsockopt()
> >
> >           int             (*setsockopt)(struct socket *sock, int level,
> >                                         int optname, sockptr_t optval,
> >                                         unsigned int optlen);
> >           int             (*getsockopt)(struct socket *sock, int level,
> >                                         int optname, char __user *optval, int __user *optlen);
> >
> > In order to be able to call sock->ops->getsockopt(), the callback
> > function will need to accepted sockptr.
> 
> So, it seems you won't support !SOL_SOCKETs here.  Then, I think you
> shouldn't call the hook for those sockets. My main concern is that we
> remain compatible to __sys_getsockopt when invoking the hook.
> 
> I think you should just have the following as the very first thing in
> the function (but after the security_ check).
> 
> if (level != SOL_SOCKET)
>    return -EOPNOTSUPP;

Gotcha. I will update. Thanks!

Breno Leitao Aug. 25, 2023, 4:53 p.m. UTC | #7

On Mon, Aug 21, 2023 at 01:25:25PM -0700, Martin KaFai Lau wrote:
> On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote:
> > Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
> > running the hook?  Before this patch, it would bail out with EOPNOTSUPP,
> > but now the bpf hook gets called even for level!=SOL_SOCKET, which
> > doesn't fit __sys_getsockopt. Am I misreading the code?
> I agree it should not call into bpf if the io_uring cannot support non
> SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and
> optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in
> regular _sys_getsockopt(SOL_TCP), bpf expects the optval returned from
> tcp_getsockopt).
> 
> I think __sys_getsockopt can also be refactored similar to __sys_setsockopt
> in patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and
> __user *optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval))
> can be added before calling ops->getsockopt()? Then this details can be
> hidden away from the io_uring.


Right, I've spent some time thinking about it, and this could be done.
This is a draft I have. Is it what you had in mind?

--

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 5e3419eb267a..e39743f4ce5e 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -378,7 +378,7 @@ static inline bool cgroup_bpf_sock_enabled(struct sock *sk,
 ({									       \
 	int __ret = 0;							       \
 	if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))			       \
-		get_user(__ret, optlen);				       \
+		copy_from_sockptr(&__ret, optlen, sizeof(int));		       \
 	__ret;								       \
 })
 
diff --git a/include/net/sock.h b/include/net/sock.h
index 2a0324275347..24ea1719fd02 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1855,6 +1855,8 @@ int sock_setsockopt(struct socket *sock, int level, int op,
 		    sockptr_t optval, unsigned int optlen);
 int do_sock_setsockopt(struct socket *sock, bool compat, int level,
 		       int optname, sockptr_t optval, int optlen);
+int do_sock_getsockopt(struct socket *sock, bool compat, int level,
+		       int optname, sockptr_t optval, sockptr_t optlen);
 
 int sk_getsockopt(struct sock *sk, int level, int optname,
 		  sockptr_t optval, sockptr_t optlen);
diff --git a/net/core/sock.c b/net/core/sock.c
index 9370fd50aa2c..2a5f30f14f5c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1997,14 +1997,6 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
 	return 0;
 }
 
-int sock_getsockopt(struct socket *sock, int level, int optname,
-		    char __user *optval, int __user *optlen)
-{
-	return sk_getsockopt(sock->sk, level, optname,
-			     USER_SOCKPTR(optval),
-			     USER_SOCKPTR(optlen));
-}
-
 /*
  * Initialize an sk_lock.
  *
diff --git a/net/socket.c b/net/socket.c
index b5e4398a6b4d..f0d6b6b1f75e 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2290,6 +2290,40 @@ SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname,
 INDIRECT_CALLABLE_DECLARE(bool tcp_bpf_bypass_getsockopt(int level,
 							 int optname));
 
+int do_sock_getsockopt(struct socket *sock, bool compat, int level,
+		       int optname, sockptr_t optval, sockptr_t optlen)
+{
+	int max_optlen __maybe_unused;
+	int err;
+
+	err = security_socket_getsockopt(sock, level, optname);
+	if (err)
+		return err;
+
+	if (level == SOL_SOCKET) {
+		err = sk_getsockopt(sock->sk, level, optname, optval, optlen);
+	} else if (unlikely(!sock->ops->getsockopt)) {
+		err = -EOPNOTSUPP;
+	} else {
+		if (WARN_ONCE(optval.is_kernel || optlen.is_kernel,
+			      "Invalid argument type"))
+			return -EOPNOTSUPP;
+
+		err = sock->ops->getsockopt(sock, level, optname, optval.user,
+					    optlen.user);
+	}
+
+	if (!compat) {
+		max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
+		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
+						     optval, optlen, max_optlen,
+						     err);
+	}
+
+	return err;
+}
+EXPORT_SYMBOL(do_sock_getsockopt);
+
 /*
  *	Get a socket option. Because we don't know the option lengths we have
  *	to pass a user mode parameter for the protocols to sort out.
@@ -2297,35 +2331,17 @@ INDIRECT_CALLABLE_DECLARE(bool tcp_bpf_bypass_getsockopt(int level,
 int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
 		int __user *optlen)
 {
-	int max_optlen __maybe_unused;
 	int err, fput_needed;
+	bool compat = in_compat_syscall();
 	struct socket *sock;
 
 	sock = sockfd_lookup_light(fd, &err, &fput_needed);
 	if (!sock)
 		return err;
 
-	err = security_socket_getsockopt(sock, level, optname);
-	if (err)
-		goto out_put;
-
-	if (!in_compat_syscall())
-		max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
-
-	if (level == SOL_SOCKET)
-		err = sock_getsockopt(sock, level, optname, optval, optlen);
-	else if (unlikely(!sock->ops->getsockopt))
-		err = -EOPNOTSUPP;
-	else
-		err = sock->ops->getsockopt(sock, level, optname, optval,
-					    optlen);
+	err = do_sock_getsockopt(sock, compat, level, optname,
+				 USER_SOCKPTR(optval), USER_SOCKPTR(optlen));
 
-	if (!in_compat_syscall())
-		err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname,
-						     USER_SOCKPTR(optval),
-						     USER_SOCKPTR(optlen),
-						     max_optlen, err);
-out_put:
 	fput_light(sock->file, fput_needed);
 	return err;
 }

Martin KaFai Lau Aug. 26, 2023, 12:45 a.m. UTC | #8

On 8/25/23 9:53 AM, Breno Leitao wrote:
> On Mon, Aug 21, 2023 at 01:25:25PM -0700, Martin KaFai Lau wrote:
>> On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote:
>>> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to
>>> running the hook?  Before this patch, it would bail out with EOPNOTSUPP,
>>> but now the bpf hook gets called even for level!=SOL_SOCKET, which
>>> doesn't fit __sys_getsockopt. Am I misreading the code?
>> I agree it should not call into bpf if the io_uring cannot support non
>> SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and
>> optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in
>> regular _sys_getsockopt(SOL_TCP), bpf expects the optval returned from
>> tcp_getsockopt).
>>
>> I think __sys_getsockopt can also be refactored similar to __sys_setsockopt
>> in patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and
>> __user *optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval))
>> can be added before calling ops->getsockopt()? Then this details can be
>> hidden away from the io_uring.
> 
> 
> Right, I've spent some time thinking about it, and this could be done.
> This is a draft I have. Is it what you had in mind?

Yes. lgtm. Thanks.

[v3,8/9] io_uring/cmd: BPF hook for getsockopt cmd

Commit Message

Comments

Patch