Message ID | 20230510131527.1244929-1-aleksandr.mikhalitsyn@canonical.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] sctp: add bpf_bypass_getsockopt proto callback | expand |
On Wed, May 10, 2023 at 6:15 AM Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> wrote: > > Add bpf_bypass_getsockopt proto callback and filter out > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > from running eBPF hook on them. > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > hook returns an error after success of the original handler > sctp_getsockopt(...), userspace will receive an error from getsockopt > syscall and will be not aware that fd was successfully installed into fdtable. > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") > Cc: Daniel Borkmann <daniel@iogearbox.net> > Cc: Christian Brauner <brauner@kernel.org> > Cc: Stanislav Fomichev <sdf@google.com> > Cc: Neil Horman <nhorman@tuxdriver.com> > Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> > Cc: Xin Long <lucien.xin@gmail.com> > Cc: linux-sctp@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: netdev@vger.kernel.org > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Stanislav Fomichev <sdf@google.com> with a small nit below > --- > net/sctp/socket.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > index cda8c2874691..a9a0ababea90 100644 > --- a/net/sctp/socket.c > +++ b/net/sctp/socket.c > @@ -8281,6 +8281,29 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname, > return retval; > } > [...] > +bool sctp_bpf_bypass_getsockopt(int level, int optname) static bool ... ? You're not making it indirect-callable, so seems fine to keep private to this compilation unit? > +{ > + /* > + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > + * hook returns an error after success of the original handler > + * sctp_getsockopt(...), userspace will receive an error from getsockopt > + * syscall and will be not aware that fd was successfully installed into fdtable. > + * > + * Let's prevent bpf cgroup hook from running on them. > + */ > + if (level == SOL_SCTP) { > + switch (optname) { > + case SCTP_SOCKOPT_PEELOFF: > + case SCTP_SOCKOPT_PEELOFF_FLAGS: > + return true; > + default: > + return false; > + } > + } > + > + return false; > +} > + > static int sctp_hash(struct sock *sk) > { > /* STUB */ > @@ -9650,6 +9673,7 @@ struct proto sctp_prot = { > .shutdown = sctp_shutdown, > .setsockopt = sctp_setsockopt, > .getsockopt = sctp_getsockopt, > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > .sendmsg = sctp_sendmsg, > .recvmsg = sctp_recvmsg, > .bind = sctp_bind, > @@ -9705,6 +9729,7 @@ struct proto sctpv6_prot = { > .shutdown = sctp_shutdown, > .setsockopt = sctp_setsockopt, > .getsockopt = sctp_getsockopt, > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > .sendmsg = sctp_sendmsg, > .recvmsg = sctp_recvmsg, > .bind = sctp_bind, > -- > 2.34.1 >
On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > Add bpf_bypass_getsockopt proto callback and filter out > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > from running eBPF hook on them. > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > hook returns an error after success of the original handler > sctp_getsockopt(...), userspace will receive an error from getsockopt > syscall and will be not aware that fd was successfully installed into fdtable. > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ I read some of the emails in there but I don't get why the fd leak is special here. I mean, I get that it leaks, but masking the error return like this can lead to several other problems in the application as well. For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it failed, and the hook returns success, the user app will at least log a wrong "connection successful". If the hook can't be responsible for cleaning up before returning a different value, then maybe we want to extend the list of sockopts in here. AFAICT these would be the 3 most critical sockopts.
On Wed, May 10, 2023 at 4:32 PM Stanislav Fomichev <sdf@google.com> wrote: > > On Wed, May 10, 2023 at 6:15 AM Alexander Mikhalitsyn > <aleksandr.mikhalitsyn@canonical.com> wrote: > > > > Add bpf_bypass_getsockopt proto callback and filter out > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > from running eBPF hook on them. > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > hook returns an error after success of the original handler > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") > > Cc: Daniel Borkmann <daniel@iogearbox.net> > > Cc: Christian Brauner <brauner@kernel.org> > > Cc: Stanislav Fomichev <sdf@google.com> > > Cc: Neil Horman <nhorman@tuxdriver.com> > > Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> > > Cc: Xin Long <lucien.xin@gmail.com> > > Cc: linux-sctp@vger.kernel.org > > Cc: linux-kernel@vger.kernel.org > > Cc: netdev@vger.kernel.org > > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> > > Acked-by: Stanislav Fomichev <sdf@google.com> > > with a small nit below Hi Stanislav! Thanks for your review. I've also added a Suggested-by tag with your name in -v2, because you've given me this idea to use bpf_bypass_getsockopt. Hope that you are comfortable with it. > > > --- > > net/sctp/socket.c | 25 +++++++++++++++++++++++++ > > 1 file changed, 25 insertions(+) > > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > > index cda8c2874691..a9a0ababea90 100644 > > --- a/net/sctp/socket.c > > +++ b/net/sctp/socket.c > > @@ -8281,6 +8281,29 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname, > > return retval; > > } > > > > [...] > > > +bool sctp_bpf_bypass_getsockopt(int level, int optname) > > static bool ... ? You're not making it indirect-callable, so seems > fine to keep private to this compilation unit? Sure, my bad. Fixed in v2. Kind regards, Alex > > > +{ > > + /* > > + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > + * hook returns an error after success of the original handler > > + * sctp_getsockopt(...), userspace will receive an error from getsockopt > > + * syscall and will be not aware that fd was successfully installed into fdtable. > > + * > > + * Let's prevent bpf cgroup hook from running on them. > > + */ > > + if (level == SOL_SCTP) { > > + switch (optname) { > > + case SCTP_SOCKOPT_PEELOFF: > > + case SCTP_SOCKOPT_PEELOFF_FLAGS: > > + return true; > > + default: > > + return false; > > + } > > + } > > + > > + return false; > > +} > > + > > static int sctp_hash(struct sock *sk) > > { > > /* STUB */ > > @@ -9650,6 +9673,7 @@ struct proto sctp_prot = { > > .shutdown = sctp_shutdown, > > .setsockopt = sctp_setsockopt, > > .getsockopt = sctp_getsockopt, > > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > > .sendmsg = sctp_sendmsg, > > .recvmsg = sctp_recvmsg, > > .bind = sctp_bind, > > @@ -9705,6 +9729,7 @@ struct proto sctpv6_prot = { > > .shutdown = sctp_shutdown, > > .setsockopt = sctp_setsockopt, > > .getsockopt = sctp_getsockopt, > > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > > .sendmsg = sctp_sendmsg, > > .recvmsg = sctp_recvmsg, > > .bind = sctp_bind, > > -- > > 2.34.1 > >
On Wed, May 10, 2023 at 4:39 PM Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > > On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > > Add bpf_bypass_getsockopt proto callback and filter out > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > from running eBPF hook on them. > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > hook returns an error after success of the original handler > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > I read some of the emails in there but I don't get why the fd leak is > special here. I mean, I get that it leaks, but masking the error > return like this can lead to several other problems in the application > as well. > > For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it > failed, and the hook returns success, the user app will at least log a > wrong "connection successful". > > If the hook can't be responsible for cleaning up before returning a > different value, then maybe we want to extend the list of sockopts in > here. AFAICT these would be the 3 most critical sockopts. > Dear Marcelo, Thanks for pointing this out. Initially this problem was discovered by Christian Brauner and for SO_PEERPIDFD (a new SOL_SOCKET option that we want to add), after this I decided to check if we do fd_install in any other socket options in the kernel and found that we have 2 cases in SCTP. It was an accidental finding. :) So, this patch isn't specific to fd_install things and probably we should filter out bpf hook from being called for other socket options as well. So, I need to filter out SCTP_SOCKOPT_CONNECTX3 and SCTP_SOCKOPT_PEELOFF* for SCTP, right? Kind regards, Alex
Hi Alexander, kernel test robot noticed the following build warnings: [auto build test WARNING on net-next/main] url: https://github.com/intel-lab-lkp/linux/commits/Alexander-Mikhalitsyn/sctp-add-bpf_bypass_getsockopt-proto-callback/20230510-211646 base: net-next/main patch link: https://lore.kernel.org/r/20230510131527.1244929-1-aleksandr.mikhalitsyn%40canonical.com patch subject: [PATCH net-next] sctp: add bpf_bypass_getsockopt proto callback config: m68k-defconfig (https://download.01.org/0day-ci/archive/20230510/202305102234.u4T0ut0T-lkp@intel.com/config) compiler: m68k-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/8ad9818b4b74026fe549b2aa34ea800ab6c8e66d git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Alexander-Mikhalitsyn/sctp-add-bpf_bypass_getsockopt-proto-callback/20230510-211646 git checkout 8ad9818b4b74026fe549b2aa34ea800ab6c8e66d # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash net/sctp/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202305102234.u4T0ut0T-lkp@intel.com/ All warnings (new ones prefixed by >>): >> net/sctp/socket.c:8284:6: warning: no previous prototype for 'sctp_bpf_bypass_getsockopt' [-Wmissing-prototypes] 8284 | bool sctp_bpf_bypass_getsockopt(int level, int optname) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ vim +/sctp_bpf_bypass_getsockopt +8284 net/sctp/socket.c 8283 > 8284 bool sctp_bpf_bypass_getsockopt(int level, int optname) 8285 { 8286 /* 8287 * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT 8288 * hook returns an error after success of the original handler 8289 * sctp_getsockopt(...), userspace will receive an error from getsockopt 8290 * syscall and will be not aware that fd was successfully installed into fdtable. 8291 * 8292 * Let's prevent bpf cgroup hook from running on them. 8293 */ 8294 if (level == SOL_SCTP) { 8295 switch (optname) { 8296 case SCTP_SOCKOPT_PEELOFF: 8297 case SCTP_SOCKOPT_PEELOFF_FLAGS: 8298 return true; 8299 default: 8300 return false; 8301 } 8302 } 8303 8304 return false; 8305 } 8306
On Wed, May 10, 2023 at 04:55:37PM +0200, Aleksandr Mikhalitsyn wrote: > On Wed, May 10, 2023 at 4:39 PM Marcelo Ricardo Leitner > <marcelo.leitner@gmail.com> wrote: > > > > On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > > > Add bpf_bypass_getsockopt proto callback and filter out > > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > > from running eBPF hook on them. > > > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > > hook returns an error after success of the original handler > > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > > > I read some of the emails in there but I don't get why the fd leak is > > special here. I mean, I get that it leaks, but masking the error > > return like this can lead to several other problems in the application > > as well. > > > > For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it > > failed, and the hook returns success, the user app will at least log a > > wrong "connection successful". > > > > If the hook can't be responsible for cleaning up before returning a > > different value, then maybe we want to extend the list of sockopts in > > here. AFAICT these would be the 3 most critical sockopts. > > > > Dear Marcelo, Hello! > > Thanks for pointing this out. Initially this problem was discovered by > Christian Brauner and for SO_PEERPIDFD (a new SOL_SOCKET option that > we want to add), > after this I decided to check if we do fd_install in any other socket > options in the kernel and found that we have 2 cases in SCTP. It was > an accidental finding. :) > > So, this patch isn't specific to fd_install things and probably we > should filter out bpf hook from being called for other socket options > as well. Understood. > > So, I need to filter out SCTP_SOCKOPT_CONNECTX3 and > SCTP_SOCKOPT_PEELOFF* for SCTP, right? Gotta say, it seems weird that it will filter out some of the most important sockopts. But I'm not acquainted to bpf hooks so I won't question (much? :) ) that. Considering that filtering is needed, then yes, on getsock those are ones I'm seeing that needs filtering. Otherwise they will either trigger leakage or will confuse the application. Should we care about setsock as well? We have SCTP_SOCKOPT_CONNECTX and SCTP_SOCKOPT_CONNECTX_OLD in there, and well, I guess any of those would misbehave if they failed and yet the hook returns success. Thanks, Marcelo
On Wed, May 10, 2023 at 8:18 AM Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > > On Wed, May 10, 2023 at 04:55:37PM +0200, Aleksandr Mikhalitsyn wrote: > > On Wed, May 10, 2023 at 4:39 PM Marcelo Ricardo Leitner > > <marcelo.leitner@gmail.com> wrote: > > > > > > On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > > > > Add bpf_bypass_getsockopt proto callback and filter out > > > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > > > from running eBPF hook on them. > > > > > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > > > hook returns an error after success of the original handler > > > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > > > > > I read some of the emails in there but I don't get why the fd leak is > > > special here. I mean, I get that it leaks, but masking the error > > > return like this can lead to several other problems in the application > > > as well. > > > > > > For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it > > > failed, and the hook returns success, the user app will at least log a > > > wrong "connection successful". > > > > > > If the hook can't be responsible for cleaning up before returning a > > > different value, then maybe we want to extend the list of sockopts in > > > here. AFAICT these would be the 3 most critical sockopts. > > > > > > > Dear Marcelo, > > Hello! > > > > > Thanks for pointing this out. Initially this problem was discovered by > > Christian Brauner and for SO_PEERPIDFD (a new SOL_SOCKET option that > > we want to add), > > after this I decided to check if we do fd_install in any other socket > > options in the kernel and found that we have 2 cases in SCTP. It was > > an accidental finding. :) > > > > So, this patch isn't specific to fd_install things and probably we > > should filter out bpf hook from being called for other socket options > > as well. > > Understood. > > > > > So, I need to filter out SCTP_SOCKOPT_CONNECTX3 and > > SCTP_SOCKOPT_PEELOFF* for SCTP, right? > > Gotta say, it seems weird that it will filter out some of the most > important sockopts. But I'm not acquainted to bpf hooks so I won't > question (much? :) ) that. Thanks for raising this. Alexander, maybe you can respin your v2 to include these as well? > Considering that filtering is needed, then yes, on getsock those are > ones I'm seeing that needs filtering. Otherwise they will either > trigger leakage or will confuse the application. [..] > Should we care about setsock as well? We have SCTP_SOCKOPT_CONNECTX > and SCTP_SOCKOPT_CONNECTX_OLD in there, and well, I guess any of those > would misbehave if they failed and yet the hook returns success. For setsockopt, the bpf program runs before the kernel, so setsockopt shouldn't have those issues we're observing with getsockopt (which runs after the kernel and has an option to ignore kernel value). > Thanks, > Marcelo
diff --git a/net/sctp/socket.c b/net/sctp/socket.c index cda8c2874691..a9a0ababea90 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -8281,6 +8281,29 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname, return retval; } +bool sctp_bpf_bypass_getsockopt(int level, int optname) +{ + /* + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT + * hook returns an error after success of the original handler + * sctp_getsockopt(...), userspace will receive an error from getsockopt + * syscall and will be not aware that fd was successfully installed into fdtable. + * + * Let's prevent bpf cgroup hook from running on them. + */ + if (level == SOL_SCTP) { + switch (optname) { + case SCTP_SOCKOPT_PEELOFF: + case SCTP_SOCKOPT_PEELOFF_FLAGS: + return true; + default: + return false; + } + } + + return false; +} + static int sctp_hash(struct sock *sk) { /* STUB */ @@ -9650,6 +9673,7 @@ struct proto sctp_prot = { .shutdown = sctp_shutdown, .setsockopt = sctp_setsockopt, .getsockopt = sctp_getsockopt, + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, .sendmsg = sctp_sendmsg, .recvmsg = sctp_recvmsg, .bind = sctp_bind, @@ -9705,6 +9729,7 @@ struct proto sctpv6_prot = { .shutdown = sctp_shutdown, .setsockopt = sctp_setsockopt, .getsockopt = sctp_getsockopt, + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, .sendmsg = sctp_sendmsg, .recvmsg = sctp_recvmsg, .bind = sctp_bind,
Add bpf_bypass_getsockopt proto callback and filter out SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options from running eBPF hook on them. These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT hook returns an error after success of the original handler sctp_getsockopt(...), userspace will receive an error from getsockopt syscall and will be not aware that fd was successfully installed into fdtable. This patch was born as a result of discussion around a new SCM_PIDFD interface: https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Christian Brauner <brauner@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: linux-sctp@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> --- net/sctp/socket.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)