diff mbox series

[v2,net-next,1/2] net: bpf: handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()

Message ID 20211230080305.1068950-2-imagedong@tencent.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series net: bpf: handle return value of post_bind{4,6} and add selftests for it | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2955 this patch: 2955
netdev/cc_maintainers warning 1 maintainers not CCed: christian@brauner.io
netdev/build_clang success Errors and warnings before: 360 this patch: 360
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 3088 this patch: 3088
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 65 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next success VM_Test

Commit Message

Menglong Dong Dec. 30, 2021, 8:03 a.m. UTC
From: Menglong Dong <imagedong@tencent.com>

The return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND() in
__inet_bind() is not handled properly. While the return value
is non-zero, it will set inet_saddr and inet_rcv_saddr to 0 and
exit:

	err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
	if (err) {
		inet->inet_saddr = inet->inet_rcv_saddr = 0;
		goto out_release_sock;
	}

Let's take UDP for example and see what will happen. For UDP
socket, it will be added to 'udp_prot.h.udp_table->hash' and
'udp_prot.h.udp_table->hash2' after the sk->sk_prot->get_port()
called success. If 'inet->inet_rcv_saddr' is specified here,
then 'sk' will be in the 'hslot2' of 'hash2' that it don't belong
to (because inet_saddr is changed to 0), and UDP packet received
will not be passed to this sock. If 'inet->inet_rcv_saddr' is not
specified here, the sock will work fine, as it can receive packet
properly, which is wired, as the 'bind()' is already failed.

To undo the get_port() operation, introduce the 'put_port' field
for 'struct proto'. For TCP proto, it is inet_put_port(); For UDP
proto, it is udp_lib_unhash(); For icmp proto, it is
ping_unhash().

Therefore, after sys_bind() fail caused by
BPF_CGROUP_RUN_PROG_INET4_POST_BIND(), it will be unbinded, which
means that it can try to be binded to another port.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
v2:
- introduce 'put_port' field for 'struct proto'
---
 include/net/sock.h  | 1 +
 net/ipv4/af_inet.c  | 2 ++
 net/ipv4/ping.c     | 1 +
 net/ipv4/tcp_ipv4.c | 1 +
 net/ipv4/udp.c      | 1 +
 net/ipv6/af_inet6.c | 2 ++
 net/ipv6/ping.c     | 1 +
 net/ipv6/tcp_ipv6.c | 1 +
 net/ipv6/udp.c      | 1 +
 9 files changed, 11 insertions(+)

Comments

Daniel Borkmann Jan. 5, 2022, 1:01 p.m. UTC | #1
On 12/30/21 9:03 AM, menglong8.dong@gmail.com wrote:
[...]
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 44cc25f0bae7..f5fc0432374e 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1209,6 +1209,7 @@ struct proto {
>   	void			(*unhash)(struct sock *sk);
>   	void			(*rehash)(struct sock *sk);
>   	int			(*get_port)(struct sock *sk, unsigned short snum);
> +	void			(*put_port)(struct sock *sk);
>   #ifdef CONFIG_BPF_SYSCALL
>   	int			(*psock_update_sk_prot)(struct sock *sk,
>   							struct sk_psock *psock,
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 5d18d32557d2..8784e72d4b8b 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -531,6 +531,8 @@ int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
>   			err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
>   			if (err) {
>   				inet->inet_saddr = inet->inet_rcv_saddr = 0;
> +				if (sk->sk_prot->get_port)
> +					sk->sk_prot->put_port(sk);
>   				goto out_release_sock;
>   			}
>   		}
[...]
> diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> index d1636425654e..ddfc6821e682 100644
> --- a/net/ipv6/af_inet6.c
> +++ b/net/ipv6/af_inet6.c
> @@ -413,6 +413,8 @@ static int __inet6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
>   			if (err) {
>   				sk->sk_ipv6only = saved_ipv6only;
>   				inet_reset_saddr(sk);
> +				if (sk->sk_prot->get_port)
> +					sk->sk_prot->put_port(sk);
>   				goto out;
>   			}
>   		}

I presume both tests above should test for non-zero sk->sk_prot->put_port
callback given that is what they end up calling when true, no?

Thanks,
Daniel
Menglong Dong Jan. 5, 2022, 1:03 p.m. UTC | #2
On Wed, Jan 5, 2022 at 9:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/30/21 9:03 AM, menglong8.dong@gmail.com wrote:
> [...]
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 44cc25f0bae7..f5fc0432374e 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -1209,6 +1209,7 @@ struct proto {
> >       void                    (*unhash)(struct sock *sk);
> >       void                    (*rehash)(struct sock *sk);
> >       int                     (*get_port)(struct sock *sk, unsigned short snum);
> > +     void                    (*put_port)(struct sock *sk);
> >   #ifdef CONFIG_BPF_SYSCALL
> >       int                     (*psock_update_sk_prot)(struct sock *sk,
> >                                                       struct sk_psock *psock,
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index 5d18d32557d2..8784e72d4b8b 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -531,6 +531,8 @@ int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
> >                       err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
> >                       if (err) {
> >                               inet->inet_saddr = inet->inet_rcv_saddr = 0;
> > +                             if (sk->sk_prot->get_port)
> > +                                     sk->sk_prot->put_port(sk);
> >                               goto out_release_sock;
> >                       }
> >               }
> [...]
> > diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
> > index d1636425654e..ddfc6821e682 100644
> > --- a/net/ipv6/af_inet6.c
> > +++ b/net/ipv6/af_inet6.c
> > @@ -413,6 +413,8 @@ static int __inet6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
> >                       if (err) {
> >                               sk->sk_ipv6only = saved_ipv6only;
> >                               inet_reset_saddr(sk);
> > +                             if (sk->sk_prot->get_port)
> > +                                     sk->sk_prot->put_port(sk);
> >                               goto out;
> >                       }
> >               }
>
> I presume both tests above should test for non-zero sk->sk_prot->put_port
> callback given that is what they end up calling when true, no?
>

You are right, I think that I made a big mistake here...:/

Thanks!
Menglong Dong
diff mbox series

Patch

diff --git a/include/net/sock.h b/include/net/sock.h
index 44cc25f0bae7..f5fc0432374e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1209,6 +1209,7 @@  struct proto {
 	void			(*unhash)(struct sock *sk);
 	void			(*rehash)(struct sock *sk);
 	int			(*get_port)(struct sock *sk, unsigned short snum);
+	void			(*put_port)(struct sock *sk);
 #ifdef CONFIG_BPF_SYSCALL
 	int			(*psock_update_sk_prot)(struct sock *sk,
 							struct sk_psock *psock,
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 5d18d32557d2..8784e72d4b8b 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -531,6 +531,8 @@  int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
 			err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
 			if (err) {
 				inet->inet_saddr = inet->inet_rcv_saddr = 0;
+				if (sk->sk_prot->get_port)
+					sk->sk_prot->put_port(sk);
 				goto out_release_sock;
 			}
 		}
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index e540b0dcf085..0e56df3a45e2 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -994,6 +994,7 @@  struct proto ping_prot = {
 	.hash =		ping_hash,
 	.unhash =	ping_unhash,
 	.get_port =	ping_get_port,
+	.put_port =	ping_unhash,
 	.obj_size =	sizeof(struct inet_sock),
 };
 EXPORT_SYMBOL(ping_prot);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 03dc4c79b84b..0ffb5b5779c0 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3082,6 +3082,7 @@  struct proto tcp_prot = {
 	.hash			= inet_hash,
 	.unhash			= inet_unhash,
 	.get_port		= inet_csk_get_port,
+	.put_port		= inet_put_port,
 #ifdef CONFIG_BPF_SYSCALL
 	.psock_update_sk_prot	= tcp_bpf_update_proto,
 #endif
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index f376c777e8fc..c87e3888c8f8 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2926,6 +2926,7 @@  struct proto udp_prot = {
 	.unhash			= udp_lib_unhash,
 	.rehash			= udp_v4_rehash,
 	.get_port		= udp_v4_get_port,
+	.put_port		= udp_lib_unhash,
 #ifdef CONFIG_BPF_SYSCALL
 	.psock_update_sk_prot	= udp_bpf_update_proto,
 #endif
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index d1636425654e..ddfc6821e682 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -413,6 +413,8 @@  static int __inet6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len,
 			if (err) {
 				sk->sk_ipv6only = saved_ipv6only;
 				inet_reset_saddr(sk);
+				if (sk->sk_prot->get_port)
+					sk->sk_prot->put_port(sk);
 				goto out;
 			}
 		}
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 6ac88fe24a8e..9256f6ba87ef 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -177,6 +177,7 @@  struct proto pingv6_prot = {
 	.hash =		ping_hash,
 	.unhash =	ping_unhash,
 	.get_port =	ping_get_port,
+	.put_port =	ping_unhash,
 	.obj_size =	sizeof(struct raw6_sock),
 };
 EXPORT_SYMBOL_GPL(pingv6_prot);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1ac243d18c2b..075ee8a2df3b 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2181,6 +2181,7 @@  struct proto tcpv6_prot = {
 	.hash			= inet6_hash,
 	.unhash			= inet_unhash,
 	.get_port		= inet_csk_get_port,
+	.put_port		= inet_put_port,
 #ifdef CONFIG_BPF_SYSCALL
 	.psock_update_sk_prot	= tcp_bpf_update_proto,
 #endif
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 01e53eb4875a..cd402bdf9eed 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1731,6 +1731,7 @@  struct proto udpv6_prot = {
 	.unhash			= udp_lib_unhash,
 	.rehash			= udp_v6_rehash,
 	.get_port		= udp_v6_get_port,
+	.put_port		= udp_lib_unhash,
 #ifdef CONFIG_BPF_SYSCALL
 	.psock_update_sk_prot	= udp_bpf_update_proto,
 #endif