Message ID | 20230817145554.892543-9-leitao@debian.org (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | io_uring: Initial support for {s,g}etsockopt commands | expand |
Breno Leitao <leitao@debian.org> writes: > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed > through io_uring. > > This implementation follows a similar approach to what > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of > kernel pointer. > > Signed-off-by: Breno Leitao <leitao@debian.org> > --- > io_uring/uring_cmd.c | 18 +++++++++++++----- > 1 file changed, 13 insertions(+), 5 deletions(-) > > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c > index a567dd32df00..9e08a14760c3 100644 > --- a/io_uring/uring_cmd.c > +++ b/io_uring/uring_cmd.c > @@ -5,6 +5,8 @@ > #include <linux/io_uring.h> > #include <linux/security.h> > #include <linux/nospec.h> > +#include <linux/compat.h> > +#include <linux/bpf-cgroup.h> > > #include <uapi/linux/io_uring.h> > #include <uapi/asm-generic/ioctls.h> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, > if (err) > return err; > > - if (level == SOL_SOCKET) { > + err = -EOPNOTSUPP; > + if (level == SOL_SOCKET) > err = sk_getsockopt(sock->sk, level, optname, > USER_SOCKPTR(optval), > KERNEL_SOCKPTR(&optlen)); > - if (err) > - return err; > > + if (!(issue_flags & IO_URING_F_COMPAT)) > + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, > + optname, > + USER_SOCKPTR(optval), > + KERNEL_SOCKPTR(&optlen), > + optlen, err); > + > + if (!err) > return optlen; > - } Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to running the hook? Before this patch, it would bail out with EOPNOTSUPP, but now the bpf hook gets called even for level!=SOL_SOCKET, which doesn't fit __sys_getsockopt. Am I misreading the code?
On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote: > Breno Leitao <leitao@debian.org> writes: > > > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups > > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed > > through io_uring. > > > > This implementation follows a similar approach to what > > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of > > kernel pointer. > > > > Signed-off-by: Breno Leitao <leitao@debian.org> > > --- > > io_uring/uring_cmd.c | 18 +++++++++++++----- > > 1 file changed, 13 insertions(+), 5 deletions(-) > > > > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c > > index a567dd32df00..9e08a14760c3 100644 > > --- a/io_uring/uring_cmd.c > > +++ b/io_uring/uring_cmd.c > > @@ -5,6 +5,8 @@ > > #include <linux/io_uring.h> > > #include <linux/security.h> > > #include <linux/nospec.h> > > +#include <linux/compat.h> > > +#include <linux/bpf-cgroup.h> > > > > #include <uapi/linux/io_uring.h> > > #include <uapi/asm-generic/ioctls.h> > > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, > > if (err) > > return err; > > > > - if (level == SOL_SOCKET) { > > + err = -EOPNOTSUPP; > > + if (level == SOL_SOCKET) > > err = sk_getsockopt(sock->sk, level, optname, > > USER_SOCKPTR(optval), > > KERNEL_SOCKPTR(&optlen)); > > - if (err) > > - return err; > > > > + if (!(issue_flags & IO_URING_F_COMPAT)) > > + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, > > + optname, > > + USER_SOCKPTR(optval), > > + KERNEL_SOCKPTR(&optlen), > > + optlen, err); > > + > > + if (!err) > > return optlen; > > - } > > Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to > running the hook? > Before this patch, it would bail out with EOPNOTSUPP, > but now the bpf hook gets called even for level!=SOL_SOCKET, which > doesn't fit __sys_getsockopt. Am I misreading the code? Not really, sock->ops->getsockopt() does not suport sockptr_t, but __user addresses, differently from setsockopt() int (*setsockopt)(struct socket *sock, int level, int optname, sockptr_t optval, unsigned int optlen); int (*getsockopt)(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen); In order to be able to call sock->ops->getsockopt(), the callback function will need to accepted sockptr.
Breno Leitao <leitao@debian.org> writes: > On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote: >> Breno Leitao <leitao@debian.org> writes: >> >> > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups >> > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed >> > through io_uring. >> > >> > This implementation follows a similar approach to what >> > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of >> > kernel pointer. >> > >> > Signed-off-by: Breno Leitao <leitao@debian.org> >> > --- >> > io_uring/uring_cmd.c | 18 +++++++++++++----- >> > 1 file changed, 13 insertions(+), 5 deletions(-) >> > >> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c >> > index a567dd32df00..9e08a14760c3 100644 >> > --- a/io_uring/uring_cmd.c >> > +++ b/io_uring/uring_cmd.c >> > @@ -5,6 +5,8 @@ >> > #include <linux/io_uring.h> >> > #include <linux/security.h> >> > #include <linux/nospec.h> >> > +#include <linux/compat.h> >> > +#include <linux/bpf-cgroup.h> >> > >> > #include <uapi/linux/io_uring.h> >> > #include <uapi/asm-generic/ioctls.h> >> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, >> > if (err) >> > return err; >> > >> > - if (level == SOL_SOCKET) { >> > + err = -EOPNOTSUPP; >> > + if (level == SOL_SOCKET) >> > err = sk_getsockopt(sock->sk, level, optname, >> > USER_SOCKPTR(optval), >> > KERNEL_SOCKPTR(&optlen)); >> > - if (err) >> > - return err; >> > >> > + if (!(issue_flags & IO_URING_F_COMPAT)) >> > + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, >> > + optname, >> > + USER_SOCKPTR(optval), >> > + KERNEL_SOCKPTR(&optlen), >> > + optlen, err); >> > + >> > + if (!err) >> > return optlen; >> > - } >> >> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to >> running the hook? >> Before this patch, it would bail out with EOPNOTSUPP, >> but now the bpf hook gets called even for level!=SOL_SOCKET, which >> doesn't fit __sys_getsockopt. Am I misreading the code? > > Not really, sock->ops->getsockopt() does not suport sockptr_t, but > __user addresses, differently from setsockopt() > > int (*setsockopt)(struct socket *sock, int level, > int optname, sockptr_t optval, > unsigned int optlen); > int (*getsockopt)(struct socket *sock, int level, > int optname, char __user *optval, int __user *optlen); > > In order to be able to call sock->ops->getsockopt(), the callback > function will need to accepted sockptr. So, it seems you won't support !SOL_SOCKETs here. Then, I think you shouldn't call the hook for those sockets. My main concern is that we remain compatible to __sys_getsockopt when invoking the hook. I think you should just have the following as the very first thing in the function (but after the security_ check). if (level != SOL_SOCKET) return -EOPNOTSUPP;
On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote: > Breno Leitao <leitao@debian.org> writes: > >> Add BPF hook support for getsockopts io_uring command. So, BPF cgroups >> programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed >> through io_uring. >> >> This implementation follows a similar approach to what >> __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of >> kernel pointer. >> >> Signed-off-by: Breno Leitao <leitao@debian.org> >> --- >> io_uring/uring_cmd.c | 18 +++++++++++++----- >> 1 file changed, 13 insertions(+), 5 deletions(-) >> >> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c >> index a567dd32df00..9e08a14760c3 100644 >> --- a/io_uring/uring_cmd.c >> +++ b/io_uring/uring_cmd.c >> @@ -5,6 +5,8 @@ >> #include <linux/io_uring.h> >> #include <linux/security.h> >> #include <linux/nospec.h> >> +#include <linux/compat.h> >> +#include <linux/bpf-cgroup.h> >> >> #include <uapi/linux/io_uring.h> >> #include <uapi/asm-generic/ioctls.h> >> @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, >> if (err) >> return err; >> >> - if (level == SOL_SOCKET) { >> + err = -EOPNOTSUPP; >> + if (level == SOL_SOCKET) >> err = sk_getsockopt(sock->sk, level, optname, >> USER_SOCKPTR(optval), >> KERNEL_SOCKPTR(&optlen)); >> - if (err) >> - return err; >> >> + if (!(issue_flags & IO_URING_F_COMPAT)) >> + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, >> + optname, >> + USER_SOCKPTR(optval), >> + KERNEL_SOCKPTR(&optlen), >> + optlen, err); >> + >> + if (!err) >> return optlen; >> - } > > Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to > running the hook? Before this patch, it would bail out with EOPNOTSUPP, > but now the bpf hook gets called even for level!=SOL_SOCKET, which > doesn't fit __sys_getsockopt. Am I misreading the code? I agree it should not call into bpf if the io_uring cannot support non SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in regular _sys_getsockopt(SOL_TCP), bpf expects the optval returned from tcp_getsockopt). I think __sys_getsockopt can also be refactored similar to __sys_setsockopt in patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and __user *optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval)) can be added before calling ops->getsockopt()? Then this details can be hidden away from the io_uring.
... > Not really, sock->ops->getsockopt() does not suport sockptr_t, but > __user addresses, differently from setsockopt() > ... > int (*getsockopt)(struct socket *sock, int level, > int optname, char __user *optval, int __user *optlen); > > In order to be able to call sock->ops->getsockopt(), the callback > function will need to accepted sockptr. It is also worth looking at whether 'optlen' can be passed in as a numeric parameter and then returned on success. That would move the user copy into the syscall wrapper. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Mon, Aug 21, 2023 at 01:03:08PM -0400, Gabriel Krisman Bertazi wrote: > Breno Leitao <leitao@debian.org> writes: > > > On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote: > >> Breno Leitao <leitao@debian.org> writes: > >> > >> > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups > >> > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed > >> > through io_uring. > >> > > >> > This implementation follows a similar approach to what > >> > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of > >> > kernel pointer. > >> > > >> > Signed-off-by: Breno Leitao <leitao@debian.org> > >> > --- > >> > io_uring/uring_cmd.c | 18 +++++++++++++----- > >> > 1 file changed, 13 insertions(+), 5 deletions(-) > >> > > >> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c > >> > index a567dd32df00..9e08a14760c3 100644 > >> > --- a/io_uring/uring_cmd.c > >> > +++ b/io_uring/uring_cmd.c > >> > @@ -5,6 +5,8 @@ > >> > #include <linux/io_uring.h> > >> > #include <linux/security.h> > >> > #include <linux/nospec.h> > >> > +#include <linux/compat.h> > >> > +#include <linux/bpf-cgroup.h> > >> > > >> > #include <uapi/linux/io_uring.h> > >> > #include <uapi/asm-generic/ioctls.h> > >> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, > >> > if (err) > >> > return err; > >> > > >> > - if (level == SOL_SOCKET) { > >> > + err = -EOPNOTSUPP; > >> > + if (level == SOL_SOCKET) > >> > err = sk_getsockopt(sock->sk, level, optname, > >> > USER_SOCKPTR(optval), > >> > KERNEL_SOCKPTR(&optlen)); > >> > - if (err) > >> > - return err; > >> > > >> > + if (!(issue_flags & IO_URING_F_COMPAT)) > >> > + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, > >> > + optname, > >> > + USER_SOCKPTR(optval), > >> > + KERNEL_SOCKPTR(&optlen), > >> > + optlen, err); > >> > + > >> > + if (!err) > >> > return optlen; > >> > - } > >> > >> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to > >> running the hook? > >> Before this patch, it would bail out with EOPNOTSUPP, > >> but now the bpf hook gets called even for level!=SOL_SOCKET, which > >> doesn't fit __sys_getsockopt. Am I misreading the code? > > > > Not really, sock->ops->getsockopt() does not suport sockptr_t, but > > __user addresses, differently from setsockopt() > > > > int (*setsockopt)(struct socket *sock, int level, > > int optname, sockptr_t optval, > > unsigned int optlen); > > int (*getsockopt)(struct socket *sock, int level, > > int optname, char __user *optval, int __user *optlen); > > > > In order to be able to call sock->ops->getsockopt(), the callback > > function will need to accepted sockptr. > > So, it seems you won't support !SOL_SOCKETs here. Then, I think you > shouldn't call the hook for those sockets. My main concern is that we > remain compatible to __sys_getsockopt when invoking the hook. > > I think you should just have the following as the very first thing in > the function (but after the security_ check). > > if (level != SOL_SOCKET) > return -EOPNOTSUPP; Gotcha. I will update. Thanks!
On Mon, Aug 21, 2023 at 01:25:25PM -0700, Martin KaFai Lau wrote: > On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote: > > Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to > > running the hook? Before this patch, it would bail out with EOPNOTSUPP, > > but now the bpf hook gets called even for level!=SOL_SOCKET, which > > doesn't fit __sys_getsockopt. Am I misreading the code? > I agree it should not call into bpf if the io_uring cannot support non > SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and > optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in > regular _sys_getsockopt(SOL_TCP), bpf expects the optval returned from > tcp_getsockopt). > > I think __sys_getsockopt can also be refactored similar to __sys_setsockopt > in patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and > __user *optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval)) > can be added before calling ops->getsockopt()? Then this details can be > hidden away from the io_uring. Right, I've spent some time thinking about it, and this could be done. This is a draft I have. Is it what you had in mind? -- diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h index 5e3419eb267a..e39743f4ce5e 100644 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@ -378,7 +378,7 @@ static inline bool cgroup_bpf_sock_enabled(struct sock *sk, ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT)) \ - get_user(__ret, optlen); \ + copy_from_sockptr(&__ret, optlen, sizeof(int)); \ __ret; \ }) diff --git a/include/net/sock.h b/include/net/sock.h index 2a0324275347..24ea1719fd02 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1855,6 +1855,8 @@ int sock_setsockopt(struct socket *sock, int level, int op, sockptr_t optval, unsigned int optlen); int do_sock_setsockopt(struct socket *sock, bool compat, int level, int optname, sockptr_t optval, int optlen); +int do_sock_getsockopt(struct socket *sock, bool compat, int level, + int optname, sockptr_t optval, sockptr_t optlen); int sk_getsockopt(struct sock *sk, int level, int optname, sockptr_t optval, sockptr_t optlen); diff --git a/net/core/sock.c b/net/core/sock.c index 9370fd50aa2c..2a5f30f14f5c 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1997,14 +1997,6 @@ int sk_getsockopt(struct sock *sk, int level, int optname, return 0; } -int sock_getsockopt(struct socket *sock, int level, int optname, - char __user *optval, int __user *optlen) -{ - return sk_getsockopt(sock->sk, level, optname, - USER_SOCKPTR(optval), - USER_SOCKPTR(optlen)); -} - /* * Initialize an sk_lock. * diff --git a/net/socket.c b/net/socket.c index b5e4398a6b4d..f0d6b6b1f75e 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2290,6 +2290,40 @@ SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname, INDIRECT_CALLABLE_DECLARE(bool tcp_bpf_bypass_getsockopt(int level, int optname)); +int do_sock_getsockopt(struct socket *sock, bool compat, int level, + int optname, sockptr_t optval, sockptr_t optlen) +{ + int max_optlen __maybe_unused; + int err; + + err = security_socket_getsockopt(sock, level, optname); + if (err) + return err; + + if (level == SOL_SOCKET) { + err = sk_getsockopt(sock->sk, level, optname, optval, optlen); + } else if (unlikely(!sock->ops->getsockopt)) { + err = -EOPNOTSUPP; + } else { + if (WARN_ONCE(optval.is_kernel || optlen.is_kernel, + "Invalid argument type")) + return -EOPNOTSUPP; + + err = sock->ops->getsockopt(sock, level, optname, optval.user, + optlen.user); + } + + if (!compat) { + max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen); + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname, + optval, optlen, max_optlen, + err); + } + + return err; +} +EXPORT_SYMBOL(do_sock_getsockopt); + /* * Get a socket option. Because we don't know the option lengths we have * to pass a user mode parameter for the protocols to sort out. @@ -2297,35 +2331,17 @@ INDIRECT_CALLABLE_DECLARE(bool tcp_bpf_bypass_getsockopt(int level, int __sys_getsockopt(int fd, int level, int optname, char __user *optval, int __user *optlen) { - int max_optlen __maybe_unused; int err, fput_needed; + bool compat = in_compat_syscall(); struct socket *sock; sock = sockfd_lookup_light(fd, &err, &fput_needed); if (!sock) return err; - err = security_socket_getsockopt(sock, level, optname); - if (err) - goto out_put; - - if (!in_compat_syscall()) - max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen); - - if (level == SOL_SOCKET) - err = sock_getsockopt(sock, level, optname, optval, optlen); - else if (unlikely(!sock->ops->getsockopt)) - err = -EOPNOTSUPP; - else - err = sock->ops->getsockopt(sock, level, optname, optval, - optlen); + err = do_sock_getsockopt(sock, compat, level, optname, + USER_SOCKPTR(optval), USER_SOCKPTR(optlen)); - if (!in_compat_syscall()) - err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname, - USER_SOCKPTR(optval), - USER_SOCKPTR(optlen), - max_optlen, err); -out_put: fput_light(sock->file, fput_needed); return err; }
On 8/25/23 9:53 AM, Breno Leitao wrote: > On Mon, Aug 21, 2023 at 01:25:25PM -0700, Martin KaFai Lau wrote: >> On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote: >>> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to >>> running the hook? Before this patch, it would bail out with EOPNOTSUPP, >>> but now the bpf hook gets called even for level!=SOL_SOCKET, which >>> doesn't fit __sys_getsockopt. Am I misreading the code? >> I agree it should not call into bpf if the io_uring cannot support non >> SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and >> optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in >> regular _sys_getsockopt(SOL_TCP), bpf expects the optval returned from >> tcp_getsockopt). >> >> I think __sys_getsockopt can also be refactored similar to __sys_setsockopt >> in patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and >> __user *optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval)) >> can be added before calling ops->getsockopt()? Then this details can be >> hidden away from the io_uring. > > > Right, I've spent some time thinking about it, and this could be done. > This is a draft I have. Is it what you had in mind? Yes. lgtm. Thanks.
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index a567dd32df00..9e08a14760c3 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -5,6 +5,8 @@ #include <linux/io_uring.h> #include <linux/security.h> #include <linux/nospec.h> +#include <linux/compat.h> +#include <linux/bpf-cgroup.h> #include <uapi/linux/io_uring.h> #include <uapi/asm-generic/ioctls.h> @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, if (err) return err; - if (level == SOL_SOCKET) { + err = -EOPNOTSUPP; + if (level == SOL_SOCKET) err = sk_getsockopt(sock->sk, level, optname, USER_SOCKPTR(optval), KERNEL_SOCKPTR(&optlen)); - if (err) - return err; + if (!(issue_flags & IO_URING_F_COMPAT)) + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, + optname, + USER_SOCKPTR(optval), + KERNEL_SOCKPTR(&optlen), + optlen, err); + + if (!err) return optlen; - } - return -EOPNOTSUPP; + return err; } static inline int io_uring_cmd_setsockopt(struct socket *sock,
Add BPF hook support for getsockopts io_uring command. So, BPF cgroups programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed through io_uring. This implementation follows a similar approach to what __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of kernel pointer. Signed-off-by: Breno Leitao <leitao@debian.org> --- io_uring/uring_cmd.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)