net: bpf: handle return value of BPF_CGROUP_RUN_PROG_INET4_POST_BIND()

Message ID 20211227062035.3224982-1-imagedong@tencent.com (mailing list archive)
State Changes Requested
Delegated to: BPF

Checks

Context                         Check    Description
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 3 this patch: 3
netdev/cc_maintainers           success  CCed 14 of 14 maintainers
netdev/build_clang              success  Errors and warnings before: 20 this patch: 20
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 5 this patch: 5
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 14 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
bpf/vmtest-bpf-next-PR          success  PR summary
bpf/vmtest-bpf-next             success  VM_Test
netdev/tree_selection           success  Guessing tree name failed - patch did not apply

Commit Message

Menglong Dong Dec. 27, 2021, 6:20 a.m. UTC
From: Menglong Dong <imagedong@tencent.com>

The return value of BPF_CGROUP_RUN_PROG_INET4_POST_BIND() in
__inet_bind() is not handled properly. When the return value
is non-zero, __inet_bind() sets inet_saddr and inet_rcv_saddr to 0
and exits:

	err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
	if (err) {
		inet->inet_saddr = inet->inet_rcv_saddr = 0;
		goto out_release_sock;
	}

Let's take UDP as an example and see what happens. A UDP
socket is added to 'udp_prot.h.udp_table->hash' and
'udp_prot.h.udp_table->hash2' once sk->sk_prot->get_port()
succeeds. If 'inet->inet_rcv_saddr' was specified, 'sk' is left
in an 'hslot2' of 'hash2' that it no longer belongs to (because
the address was reset to 0), and received UDP packets are not
passed to this sock. If 'inet->inet_rcv_saddr' was not specified,
the sock works fine, as it can still receive packets properly,
which is weird, as the 'bind()' has already failed.

I'm not sure what we should do here. Maybe we should unhash the
sock for UDP, so that the user can try to bind another port?

Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 net/ipv4/af_inet.c | 7 +++++++
 1 file changed, 7 insertions(+)
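
The slot mismatch described in the commit message can be illustrated with a
small userspace model. This is only a sketch under stated assumptions: the
toy_* names, the 8-slot table and the hash function are hypothetical and are
not the kernel's UDP hashing code. It only shows that a slot computed from
(address, port) at get_port() time no longer matches once the address is
cleared, while a wildcard (address 0) socket is unaffected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NSLOTS 8u /* toy table size, not the kernel's */

/* Toy stand-in for an (address, port) hash; NOT the kernel's function. */
static unsigned int toy_hash2(uint32_t rcv_saddr, uint16_t port)
{
	return ((rcv_saddr * 2654435761u) ^ port) % TOY_NSLOTS;
}

struct toy_sock {
	uint32_t rcv_saddr;  /* models inet->inet_rcv_saddr */
	uint16_t port;       /* models the bound local port */
	unsigned int slot;   /* slot the socket was inserted into */
};

/* Models a successful get_port(): the slot is derived from the current address. */
static void toy_get_port(struct toy_sock *sk, uint16_t port)
{
	assert(sk != NULL);
	sk->port = port;
	sk->slot = toy_hash2(sk->rcv_saddr, port);
}

/* Models the existing error path: the address is cleared, the slot is kept. */
static void toy_post_bind_failed(struct toy_sock *sk)
{
	assert(sk != NULL);
	sk->rcv_saddr = 0;
}

/* Returns 1 if a lookup computed from the socket's fields would hit its slot. */
static int toy_slot_consistent(const struct toy_sock *sk)
{
	return sk->slot == toy_hash2(sk->rcv_saddr, sk->port);
}
```

In this model, a socket bound to 0x7f000001 and port 8080 becomes
inconsistent after toy_post_bind_failed(), whereas a socket that started
with address 0 stays consistent, which mirrors the two cases above.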

Comments

Jakub Kicinski Dec. 29, 2021, 9:09 p.m. UTC | #1
On Mon, 27 Dec 2021 14:20:35 +0800 menglong8.dong@gmail.com wrote:
> From: Menglong Dong <imagedong@tencent.com>
> 
> The return value of BPF_CGROUP_RUN_PROG_INET4_POST_BIND() in
> __inet_bind() is not handled properly. When the return value
> is non-zero, __inet_bind() sets inet_saddr and inet_rcv_saddr to 0
> and exits:
> 
> 	err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
> 	if (err) {
> 		inet->inet_saddr = inet->inet_rcv_saddr = 0;
> 		goto out_release_sock;
> 	}
> 
> Let's take UDP as an example and see what happens. A UDP
> socket is added to 'udp_prot.h.udp_table->hash' and
> 'udp_prot.h.udp_table->hash2' once sk->sk_prot->get_port()
> succeeds. If 'inet->inet_rcv_saddr' was specified, 'sk' is left
> in an 'hslot2' of 'hash2' that it no longer belongs to (because
> the address was reset to 0), and received UDP packets are not
> passed to this sock. If 'inet->inet_rcv_saddr' was not specified,
> the sock works fine, as it can still receive packets properly,
> which is weird, as the 'bind()' has already failed.
>
> I'm not sure what we should do here. Maybe we should unhash the
> sock for UDP, so that the user can try to bind another port?

Enumerating the L4 unwind paths in L3 code seems like a fairly clear
layering violation. A new callback to undo ->sk_prot->get_port() may
be better.

Does IPv6 not need a similar change?

You need to provide a selftest to validate the expected behavior.

> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 04067b249bf3..9e5710f40a39 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -530,7 +530,14 @@ int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
>  		if (!(flags & BIND_FROM_BPF)) {
>  			err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
>  			if (err) {
> +				if (sk->sk_prot == &udp_prot)
> +					sk->sk_prot->unhash(sk);
> +				else if (sk->sk_prot == &tcp_prot)
> +					inet_put_port(sk);
> +
>  				inet->inet_saddr = inet->inet_rcv_saddr = 0;
> +				err = -EPERM;
> +
>  				goto out_release_sock;
>  			}
>  		}
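
As an illustration of the callback idea raised in this reply (not code from
the thread), here is a minimal userspace model. All names are hypothetical:
struct toy_proto stands in for the protocol ops table, put_port is an
assumed name for the undo hook, and post_bind_err stands in for the result
of the BPF post-bind hook. The point is only that the generic bind path
calls one optional per-protocol hook instead of comparing against udp_prot
and tcp_prot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_sock;

/* Toy counterpart of a protocol ops table: only the two hooks relevant here. */
struct toy_proto {
	int  (*get_port)(struct toy_sock *sk, unsigned short port);
	void (*put_port)(struct toy_sock *sk); /* hypothetical undo hook, may be NULL */
};

struct toy_sock {
	const struct toy_proto *prot;
	unsigned int rcv_saddr; /* models inet->inet_rcv_saddr */
	unsigned short port;    /* models the bound local port */
	int hashed;             /* 1 while the socket is in the port table */
};

/* Toy UDP get_port(): always succeeds and marks the socket as hashed. */
static int toy_udp_get_port(struct toy_sock *sk, unsigned short port)
{
	sk->port = port;
	sk->hashed = 1;
	return 0;
}

/* Toy UDP undo: drop the socket from the table and release the port. */
static void toy_udp_put_port(struct toy_sock *sk)
{
	sk->hashed = 0;
	sk->port = 0;
}

static const struct toy_proto toy_udp_prot = {
	.get_port = toy_udp_get_port,
	.put_port = toy_udp_put_port,
};

/* Models the tail of a generic bind path with a protocol-agnostic unwind. */
static int toy_bind(struct toy_sock *sk, unsigned int addr,
		    unsigned short port, int post_bind_err)
{
	int err;

	assert(sk != NULL && sk->prot != NULL && sk->prot->get_port != NULL);

	sk->rcv_saddr = addr;
	err = sk->prot->get_port(sk, port);
	if (err) {
		sk->rcv_saddr = 0;
		return err;
	}

	if (post_bind_err) {
		/* Undo get_port() through the protocol's own hook, if it has one. */
		if (sk->prot->put_port)
			sk->prot->put_port(sk);
		sk->rcv_saddr = 0;
		return post_bind_err;
	}
	return 0;
}
```

With this shape, toy_bind(&sk, addr, port, -EPERM) leaves the socket
unhashed with its address and port cleared, so a later toy_bind() to another
port starts from a clean state. Whether the real hook should also cover TCP
and IPv6, and what it should be called, is left open here.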
Menglong Dong Dec. 30, 2021, 2:17 a.m. UTC | #2
On Thu, Dec 30, 2021 at 5:09 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 27 Dec 2021 14:20:35 +0800 menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <imagedong@tencent.com>
> >
> > The return value of BPF_CGROUP_RUN_PROG_INET4_POST_BIND() in
> > __inet_bind() is not handled properly. When the return value
> > is non-zero, __inet_bind() sets inet_saddr and inet_rcv_saddr to 0
> > and exits:
> >
> >       err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
> >       if (err) {
> >               inet->inet_saddr = inet->inet_rcv_saddr = 0;
> >               goto out_release_sock;
> >       }
> >
> > Let's take UDP as an example and see what happens. A UDP
> > socket is added to 'udp_prot.h.udp_table->hash' and
> > 'udp_prot.h.udp_table->hash2' once sk->sk_prot->get_port()
> > succeeds. If 'inet->inet_rcv_saddr' was specified, 'sk' is left
> > in an 'hslot2' of 'hash2' that it no longer belongs to (because
> > the address was reset to 0), and received UDP packets are not
> > passed to this sock. If 'inet->inet_rcv_saddr' was not specified,
> > the sock works fine, as it can still receive packets properly,
> > which is weird, as the 'bind()' has already failed.
> >
> > I'm not sure what we should do here. Maybe we should unhash the
> > sock for UDP, so that the user can try to bind another port?
>
> Enumerating the L4 unwind paths in L3 code seems like a fairly clear
> layering violation. A new callback to undo ->sk_prot->get_port() may
> be better.

Yeah, it seems there isn't an easier way to solve this problem; a new
callback is needed.

>
> Does IPv6 not need a similar change?
>

IPv6 needs the change too. This patch is just to get some suggestions :/

> You need to provide a selftest to validate the expected behavior.

I'll add it.

Thanks!
Menglong Dong

>
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index 04067b249bf3..9e5710f40a39 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -530,7 +530,14 @@ int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
> >               if (!(flags & BIND_FROM_BPF)) {
> >                       err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
> >                       if (err) {
> > +                             if (sk->sk_prot == &udp_prot)
> > +                                     sk->sk_prot->unhash(sk);
> > +                             else if (sk->sk_prot == &tcp_prot)
> > +                                     inet_put_port(sk);
> > +
> >                               inet->inet_saddr = inet->inet_rcv_saddr = 0;
> > +                             err = -EPERM;
> > +
> >                               goto out_release_sock;
> >                       }
> >               }
>

Patch

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 04067b249bf3..9e5710f40a39 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -530,7 +530,14 @@  int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
 		if (!(flags & BIND_FROM_BPF)) {
 			err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
 			if (err) {
+				if (sk->sk_prot == &udp_prot)
+					sk->sk_prot->unhash(sk);
+				else if (sk->sk_prot == &tcp_prot)
+					inet_put_port(sk);
+
 				inet->inet_saddr = inet->inet_rcv_saddr = 0;
+				err = -EPERM;
+
 				goto out_release_sock;
 			}
 		}