mbox series

[v4,00/10] io_uring: Initial support for {s,g}etsockopt commands

Message ID 20230904162504.1356068-1-leitao@debian.org (mailing list archive)
Headers show
Series io_uring: Initial support for {s,g}etsockopt commands | expand

Message

Breno Leitao Sept. 4, 2023, 4:24 p.m. UTC
This patchset adds support for getsockopt (SOCKET_URING_OP_GETSOCKOPT)
and setsockopt (SOCKET_URING_OP_SETSOCKOPT) in io_uring commands.
SOCKET_URING_OP_SETSOCKOPT implements generic case, covering all levels
and optnames. SOCKET_URING_OP_GETSOCKOPT is limited, for now, to
SOL_SOCKET level, which seems to be the most common level parameter for
get/setsockopt(2).

In order to keep the implementation (and tests) simple, some refactors
were done prior to the changes, as follows:

Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
become flexible enough to accept user or kernel pointers for optval/optlen.

Patch 3-4:  Remove the core {s,g}etsockopt() core function from
__sys_{g,s}etsockopt, so, the code could be reused by other callers,
such as io_uring.

Patch 5: Pass compat mode to the file/socket callbacks

Patch 6: Move io_uring helpers from io_uring_zerocopy_tx to a generic
io_uring headers. This simplify the test case (last patch)

Patch 7: Protect io_uring_cmd_sock() to not be called if CONFIG_NET is
disabled.

PS1: For getsockopt command, the optlen field is not a userspace
pointers, but an absolute value, so this is slightly different from
getsockopt(2) behaviour. The new optlen value is returned in cqe->res.

PS2: The userspace pointers need to be alive until the operation is
completed.

These changes were tested with a new test[1] in liburing, LTP sockopt*
tests, as also with bpf/progs/sockopt test case, which is now adapted to
run using both system calls and io_uring commands.

[1] Link: https://github.com/leitao/liburing/blob/getsockopt/test/socket-getsetsock-cmd.c

RFC -> V1:
	* Copy user memory at io_uring subsystem, and call proto_ops
	  callbacks using kernel memory
	* Implement all the cases for SOCKET_URING_OP_SETSOCKOPT

V1 -> V2
	* Implemented the BPF part
	* Using user pointers from optval to avoid kmalloc in io_uring part.

V2 -> V3:
	* Break down __sys_setsockopt and reuse the core code, avoiding
	  duplicated code. This removed the requirement to expose
	  sock_use_custom_sol_socket().
	* Added io_uring test to selftests/bpf/sockopt.
	* Fixed compat argument, by passing it to the issue_flags.

V3 -> V4:
	* Rebase on top of commit 1ded5e5a5931b ("net: annotate data-races around sock->ops")
	* Also broke down __sys_setsockopt() to reuse the core function
	  from io_uring.
	* Create a new patch to return -EOPNOTSUPP if CONFIG_NET is
	  disabled
	* Added two SOL_SOCKET tests in bpf/prog_tests/sockopt.c

Breno Leitao (10):
  bpf: Leverage sockptr_t in BPF getsockopt hook
  bpf: Leverage sockptr_t in BPF setsockopt hook
  net/socket: Break down __sys_setsockopt
  net/socket: Break down __sys_getsockopt
  io_uring/cmd: Pass compat mode in issue_flags
  selftests/net: Extract uring helpers to be reusable
  io_uring/cmd: return -EOPNOTSUPP if net is disabled
  io_uring/cmd: Introduce SOCKET_URING_OP_GETSOCKOPT
  io_uring/cmd: Introduce SOCKET_URING_OP_SETSOCKOPT
  selftests/bpf/sockopt: Add io_uring support

 include/linux/bpf-cgroup.h                    |   9 +-
 include/linux/io_uring.h                      |   1 +
 include/net/sock.h                            |   4 +
 include/uapi/linux/io_uring.h                 |   8 +
 io_uring/uring_cmd.c                          |  55 ++++
 kernel/bpf/cgroup.c                           |  25 +-
 net/core/sock.c                               |   8 -
 net/socket.c                                  | 102 ++++---
 tools/include/io_uring/mini_liburing.h        | 282 ++++++++++++++++++
 .../selftests/bpf/prog_tests/sockopt.c        | 113 ++++++-
 tools/testing/selftests/net/Makefile          |   1 +
 .../selftests/net/io_uring_zerocopy_tx.c      | 268 +----------------
 12 files changed, 544 insertions(+), 332 deletions(-)
 create mode 100644 tools/include/io_uring/mini_liburing.h

Comments

Jakub Kicinski Sept. 5, 2023, 10:49 p.m. UTC | #1
On Mon,  4 Sep 2023 09:24:53 -0700 Breno Leitao wrote:
> Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
> become flexible enough to accept user or kernel pointers for optval/optlen.

Have you seen:

https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/

? I wasn't aware that Linus felt this way, now I wonder if having
sockptr_t spread will raise any red flags as this code flows back
to him.
Breno Leitao Sept. 8, 2023, 4:55 p.m. UTC | #2
On Tue, Sep 05, 2023 at 03:49:51PM -0700, Jakub Kicinski wrote:
> On Mon,  4 Sep 2023 09:24:53 -0700 Breno Leitao wrote:
> > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
> > become flexible enough to accept user or kernel pointers for optval/optlen.
> 
> Have you seen:
> 
> https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/

I haven't but I think it will not affect *much* this patchset.

> ? I wasn't aware that Linus felt this way, now I wonder if having
> sockptr_t spread will raise any red flags as this code flows back
> to him.

I can change the io_uring API in a way that we can avoid these
sockptr_t changes completely.

My plan is to mimic what getsockopt(2) is doing in io_uring cmd path, in
regard to optlen being an userpointer, instead of a value - which is
then translated to a KERNEL_SOCKPTR.

In this way, this change don't need to touch any sockptr field.

Thanks for the heads-up
Breno Leitao Oct. 6, 2023, 3:45 p.m. UTC | #3
Hello Jakub,

On Tue, Sep 05, 2023 at 03:49:51PM -0700, Jakub Kicinski wrote:
> On Mon,  4 Sep 2023 09:24:53 -0700 Breno Leitao wrote:
> > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
> > become flexible enough to accept user or kernel pointers for optval/optlen.
> 
> Have you seen:
> 
> https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/
> 
> ? I wasn't aware that Linus felt this way, now I wonder if having
> sockptr_t spread will raise any red flags as this code flows back
> to him.

Thanks for the heads-up. I've been thinking about it for a while and I'd
like to hear what are the next steps here.

Let me first back up and state where we are, and what is the current
situation:

1) __sys_getsockopt() uses __user pointers for both optval and optlen
2) For io_uring command, Jens[1] suggested we get optlen from the io_uring
sqe, which is a kernel pointer/value.

Thus, we need to make the common code (callbacks) able to handle __user
and kernel pointers (for optlen, at least).

From a proto_ops callback perspective, ->setsockopt() uses sockptr.

          int             (*setsockopt)(struct socket *sock, int level,
                                        int optname, sockptr_t optval,
                                        unsigned int optlen);

Getsockopt() uses sockptr() for level=SOL_SOCKET:

	int sk_getsockopt(struct sock *sk, int level, int optname,
                    sockptr_t optval, sockptr_t optlen)

But not for the other levels:

	int             (*getsockopt)(struct socket *sock, int level,
				      int optname, char __user *optval, int __user *optlen);


That said, if this patchset shouldn't use sockptr anymore, what is the
recommendation?

If we move this patchset to use iov_iter instead of sockptr, then I
understand we want to move *all* these callbacks to use iov_vec. Is this
the right direction?

Thanks for the guidance!

[1] https://lore.kernel.org/all/efe602f1-8e72-466c-b796-0083fd1c6d82@kernel.dk/
Willem de Bruijn Oct. 9, 2023, 10:11 a.m. UTC | #4
On Fri, Oct 6, 2023 at 10:45 AM Breno Leitao <leitao@debian.org> wrote:
>
> Hello Jakub,
>
> On Tue, Sep 05, 2023 at 03:49:51PM -0700, Jakub Kicinski wrote:
> > On Mon,  4 Sep 2023 09:24:53 -0700 Breno Leitao wrote:
> > > Patches 1-2: Modify the BPF hooks to support sockptr_t, so, these functions
> > > become flexible enough to accept user or kernel pointers for optval/optlen.
> >
> > Have you seen:
> >
> > https://lore.kernel.org/all/CAHk-=wgGV61xrG=gO0=dXH64o2TDWWrXn1mx-CX885JZ7h84Og@mail.gmail.com/
> >
> > ? I wasn't aware that Linus felt this way, now I wonder if having
> > sockptr_t spread will raise any red flags as this code flows back
> > to him.
>
> Thanks for the heads-up. I've been thinking about it for a while and I'd
> like to hear what are the next steps here.
>
> Let me first back up and state where we are, and what is the current
> situation:
>
> 1) __sys_getsockopt() uses __user pointers for both optval and optlen
> 2) For io_uring command, Jens[1] suggested we get optlen from the io_uring
> sqe, which is a kernel pointer/value.
>
> Thus, we need to make the common code (callbacks) able to handle __user
> and kernel pointers (for optlen, at least).
>
> From a proto_ops callback perspective, ->setsockopt() uses sockptr.
>
>           int             (*setsockopt)(struct socket *sock, int level,
>                                         int optname, sockptr_t optval,
>                                         unsigned int optlen);
>
> Getsockopt() uses sockptr() for level=SOL_SOCKET:
>
>         int sk_getsockopt(struct sock *sk, int level, int optname,
>                     sockptr_t optval, sockptr_t optlen)
>
> But not for the other levels:
>
>         int             (*getsockopt)(struct socket *sock, int level,
>                                       int optname, char __user *optval, int __user *optlen);
>
>
> That said, if this patchset shouldn't use sockptr anymore, what is the
> recommendation?
>
> If we move this patchset to use iov_iter instead of sockptr, then I
> understand we want to move *all* these callbacks to use iov_vec. Is this
> the right direction?
>
> Thanks for the guidance!
>
> [1] https://lore.kernel.org/all/efe602f1-8e72-466c-b796-0083fd1c6d82@kernel.dk/

Since sockptr_t is already used by __sys_setsockopt and
__sys_setsockopt, patches 1 and 2 don't introduce any new sockptr code
paths.

setsockopt callbacks also already use sockptr as of commit
a7b75c5a8c41 ("net: pass a sockptr_t into ->setsockopt").

getsockopt callbacks do take user pointers, just not sockptr.

Is the only issue right now the optlen kernel pointer?
Breno Leitao Oct. 9, 2023, 1:28 p.m. UTC | #5
On Mon, Oct 09, 2023 at 03:11:05AM -0700, Willem de Bruijn wrote:
> On Fri, Oct 6, 2023 at 10:45 AM Breno Leitao <leitao@debian.org> wrote:
> > Let me first back up and state where we are, and what is the current
> > situation:
> >
> > 1) __sys_getsockopt() uses __user pointers for both optval and optlen
> > 2) For io_uring command, Jens[1] suggested we get optlen from the io_uring
> > sqe, which is a kernel pointer/value.
> >
> > Thus, we need to make the common code (callbacks) able to handle __user
> > and kernel pointers (for optlen, at least).
> >
> > From a proto_ops callback perspective, ->setsockopt() uses sockptr.
> >
> >           int             (*setsockopt)(struct socket *sock, int level,
> >                                         int optname, sockptr_t optval,
> >                                         unsigned int optlen);
> >
> > Getsockopt() uses sockptr() for level=SOL_SOCKET:
> >
> >         int sk_getsockopt(struct sock *sk, int level, int optname,
> >                     sockptr_t optval, sockptr_t optlen)
> >
> > But not for the other levels:
> >
> >         int             (*getsockopt)(struct socket *sock, int level,
> >                                       int optname, char __user *optval, int __user *optlen);
> >
> >
> > That said, if this patchset shouldn't use sockptr anymore, what is the
> > recommendation?
> >
> > If we move this patchset to use iov_iter instead of sockptr, then I
> > understand we want to move *all* these callbacks to use iov_vec. Is this
> > the right direction?
> >
> > Thanks for the guidance!
> >
> > [1] https://lore.kernel.org/all/efe602f1-8e72-466c-b796-0083fd1c6d82@kernel.dk/
> 
> Since sockptr_t is already used by __sys_setsockopt and
> __sys_setsockopt, patches 1 and 2 don't introduce any new sockptr code
> paths.
> 
> setsockopt callbacks also already use sockptr as of commit
> a7b75c5a8c41 ("net: pass a sockptr_t into ->setsockopt").
> 
> getsockopt callbacks do take user pointers, just not sockptr.
> 
> Is the only issue right now the optlen kernel pointer?

Correct. The current discussion is only related to optlen in the
getsockopt() callbacks (invoked when level != SOL_SOCKET). Everything
else (getsockopt(level=SOL_SOCKET..) and setsockopt) is using sockptr.

Is it bad if we review/merge this code as is (using sockptr), and start
the iov_iter/getsockopt() refactor in a follow-up thread?

Thanks!
Jakub Kicinski Oct. 9, 2023, 4:55 p.m. UTC | #6
On Mon, 9 Oct 2023 06:28:00 -0700 Breno Leitao wrote:
> Correct. The current discussion is only related to optlen in the
> getsockopt() callbacks (invoked when level != SOL_SOCKET). Everything
> else (getsockopt(level=SOL_SOCKET..) and setsockopt) is using sockptr.
> 
> Is it bad if we review/merge this code as is (using sockptr), and start
> the iov_iter/getsockopt() refactor in a follow-up thread?

Sorry for the delay, I only looked at the code now :S
Agreed, that there's no need to worry about the sockptr spread
in this series. It looks good to go in.