diff mbox series

[v2] net-next: Fix IP_UNICAST_IF option behavior for connected sockets

Message ID 20220829111554.GA1771@debian (mailing list archive)
State Accepted
Commit 0e4d354762cefd3e16b4cff8988ff276e45effc4
Delegated to: Netdev Maintainers
Headers show
Series [v2] net-next: Fix IP_UNICAST_IF option behavior for connected sockets | expand

Checks

Context Check Description
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers success CCed 9 of 9 maintainers
netdev/build_clang success Errors and warnings before: 5 this patch: 5
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success net selftest script(s) already in Makefile
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch warning WARNING: line length of 110 exceeds 80 columns WARNING: line length of 114 exceeds 80 columns WARNING: line length of 122 exceeds 80 columns WARNING: line length of 91 exceeds 80 columns WARNING: line length of 95 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Richard Gobert Aug. 29, 2022, 11:18 a.m. UTC
The IP_UNICAST_IF socket option is used to set the outgoing interface
for outbound packets.

The IP_UNICAST_IF socket option was added as it was needed by the
Wine project, since no other existing option (SO_BINDTODEVICE socket
option, IP_PKTINFO socket option or the bind function) provided the
needed characteristics needed by the IP_UNICAST_IF socket option. [1]
The IP_UNICAST_IF socket option works well for unconnected sockets,
that is, the interface specified by the IP_UNICAST_IF socket option
is taken into consideration in the route lookup process when a packet
is being sent. However, for connected sockets, the outbound interface
is chosen when connecting the socket, and in the route lookup process
which is done when a packet is being sent, the interface specified by
the IP_UNICAST_IF socket option is being ignored.

This inconsistent behavior was reported and discussed in an issue
opened on systemd's GitHub project [2]. Also, a bug report was
submitted in the kernel's bugzilla [3].

To understand the problem in more detail, we can look at what happens
for UDP packets over IPv4 (The same analysis was done separately in
the referenced systemd issue).
When a UDP packet is sent the udp_sendmsg function gets called and
the following happens:

1. The oif member of the struct ipcm_cookie ipc (which stores the
output interface of the packet) is initialized by the ipcm_init_sk
function to inet->sk.sk_bound_dev_if (the device set by the
SO_BINDTODEVICE socket option).

2. If the IP_PKTINFO socket option was set, the oif member gets
overridden by the call to the ip_cmsg_send function.

3. If no output interface was selected yet, the interface specified
by the IP_UNICAST_IF socket option is used.

4. If the socket is connected and no destination address is
specified in the send function, the struct ipcm_cookie ipc is not
taken into consideration and the cached route, that was calculated in
the connect function is being used.

Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken
into consideration.

This patch corrects the behavior of the IP_UNICAST_IF socket option
for connect()ed sockets by taking into consideration the
IP_UNICAST_IF sockopt when connecting the socket.

In order to avoid reconnecting the socket, this option is still
ignored when applied on an already connected socket until connect()
is called again by the Richard Gobert.

Change the __ip4_datagram_connect function, which is called during
socket connection, to take into consideration the interface set by
the IP_UNICAST_IF socket option, in a similar way to what is done in
the udp_sendmsg function.

[1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/
[2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018
[3] https://bugzilla.kernel.org/show_bug.cgi?id=210255

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
---
v1 -> v2: Added self-tests and targeted to net-next.

 net/ipv4/datagram.c                       |  2 ++
 tools/testing/selftests/net/fcnal-test.sh | 30 +++++++++++++++++++++++
 tools/testing/selftests/net/nettest.c     | 16 ++++++++++--
 3 files changed, 46 insertions(+), 2 deletions(-)

Comments

David Ahern Sept. 1, 2022, 2:29 a.m. UTC | #1
On 8/29/22 5:18 AM, Richard Gobert wrote:
> The IP_UNICAST_IF socket option is used to set the outgoing interface
> for outbound packets.
> 
> The IP_UNICAST_IF socket option was added as it was needed by the
> Wine project, since no other existing option (SO_BINDTODEVICE socket
> option, IP_PKTINFO socket option or the bind function) provided the
> needed characteristics needed by the IP_UNICAST_IF socket option. [1]
> The IP_UNICAST_IF socket option works well for unconnected sockets,
> that is, the interface specified by the IP_UNICAST_IF socket option
> is taken into consideration in the route lookup process when a packet
> is being sent. However, for connected sockets, the outbound interface
> is chosen when connecting the socket, and in the route lookup process
> which is done when a packet is being sent, the interface specified by
> the IP_UNICAST_IF socket option is being ignored.
> 
> This inconsistent behavior was reported and discussed in an issue
> opened on systemd's GitHub project [2]. Also, a bug report was
> submitted in the kernel's bugzilla [3].
> 
> To understand the problem in more detail, we can look at what happens
> for UDP packets over IPv4 (The same analysis was done separately in
> the referenced systemd issue).
> When a UDP packet is sent the udp_sendmsg function gets called and
> the following happens:
> 
> 1. The oif member of the struct ipcm_cookie ipc (which stores the
> output interface of the packet) is initialized by the ipcm_init_sk
> function to inet->sk.sk_bound_dev_if (the device set by the
> SO_BINDTODEVICE socket option).
> 
> 2. If the IP_PKTINFO socket option was set, the oif member gets
> overridden by the call to the ip_cmsg_send function.
> 
> 3. If no output interface was selected yet, the interface specified
> by the IP_UNICAST_IF socket option is used.
> 
> 4. If the socket is connected and no destination address is
> specified in the send function, the struct ipcm_cookie ipc is not
> taken into consideration and the cached route, that was calculated in
> the connect function is being used.
> 
> Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken
> into consideration.
> 
> This patch corrects the behavior of the IP_UNICAST_IF socket option
> for connect()ed sockets by taking into consideration the
> IP_UNICAST_IF sockopt when connecting the socket.
> 
> In order to avoid reconnecting the socket, this option is still
> ignored when applied on an already connected socket until connect()
> is called again by the Richard Gobert.
> 
> Change the __ip4_datagram_connect function, which is called during
> socket connection, to take into consideration the interface set by
> the IP_UNICAST_IF socket option, in a similar way to what is done in
> the udp_sendmsg function.
> 
> [1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/
> [2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=210255
> 
> Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
> ---
> v1 -> v2: Added self-tests and targeted to net-next.
> 
>  net/ipv4/datagram.c                       |  2 ++
>  tools/testing/selftests/net/fcnal-test.sh | 30 +++++++++++++++++++++++
>  tools/testing/selftests/net/nettest.c     | 16 ++++++++++--
>  3 files changed, 46 insertions(+), 2 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>

> diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
> index 03b586760164..31c3b6ebd388 100755
> --- a/tools/testing/selftests/net/fcnal-test.sh
> +++ b/tools/testing/selftests/net/fcnal-test.sh
> @@ -1466,6 +1466,13 @@ ipv4_udp_novrf()
>  		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -0 ${NSA_IP}
>  		log_test_addr ${a} $? 0 "Client, device bind via IP_UNICAST_IF"
>  
> +		log_start
> +		run_cmd_nsb nettest -D -s &
> +		sleep 1
> +		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -0 ${NSA_IP} -U
> +		log_test_addr ${a} $? 0 "Client, device bind via IP_UNICAST_IF, with connect()"
> +
> +
>  		log_start
>  		show_hint "Should fail 'Connection refused'"
>  		run_cmd nettest -D -r ${a}
> @@ -1525,6 +1532,13 @@ ipv4_udp_novrf()
>  	run_cmd nettest -D -d ${NSA_DEV} -S -r ${a}
>  	log_test_addr ${a} $? 0 "Global server, device client via IP_UNICAST_IF, local connection"
>  
> +	log_start
> +	run_cmd nettest -s -D &
> +	sleep 1
> +	run_cmd nettest -D -d ${NSA_DEV} -S -r ${a} -U
> +	log_test_addr ${a} $? 0 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
> +
> +
>  	# IPv4 with device bind has really weird behavior - it overrides the
>  	# fib lookup, generates an rtable and tries to send the packet. This
>  	# causes failures for local traffic at different places
> @@ -1550,6 +1564,15 @@ ipv4_udp_novrf()
>  		sleep 1
>  		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S
>  		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection"
> +
> +		log_start
> +		show_hint "Should fail since addresses on loopback are out of device scope"
> +		run_cmd nettest -D -s &
> +		sleep 1
> +		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -U
> +		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
> +
> +
>  	done
>  
>  	a=${NSA_IP}
> @@ -3157,6 +3180,13 @@ ipv6_udp_novrf()
>  		sleep 1
>  		run_cmd nettest -6 -D -r ${a} -d ${NSA_DEV} -S
>  		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection"
> +
> +		log_start
> +		show_hint "Should fail 'No route to host' since addresses on loopback are out of device scope"
> +		run_cmd nettest -6 -D -s &
> +		sleep 1
> +		run_cmd nettest -6 -D -r ${a} -d ${NSA_DEV} -S -U
> +		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
>  	done
>  
>  	a=${NSA_IP6}

All of the tests are 'novrf' but this should work with VRF too. The
device is in a VRF, so binding it to should. Please add these tests
again for the VRF cases. It can be a followup patch.
patchwork-bot+netdevbpf@kernel.org Sept. 1, 2022, 3 a.m. UTC | #2
Hello:

This patch was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 29 Aug 2022 13:18:51 +0200 you wrote:
> The IP_UNICAST_IF socket option is used to set the outgoing interface
> for outbound packets.
> 
> The IP_UNICAST_IF socket option was added as it was needed by the
> Wine project, since no other existing option (SO_BINDTODEVICE socket
> option, IP_PKTINFO socket option or the bind function) provided the
> needed characteristics needed by the IP_UNICAST_IF socket option. [1]
> The IP_UNICAST_IF socket option works well for unconnected sockets,
> that is, the interface specified by the IP_UNICAST_IF socket option
> is taken into consideration in the route lookup process when a packet
> is being sent. However, for connected sockets, the outbound interface
> is chosen when connecting the socket, and in the route lookup process
> which is done when a packet is being sent, the interface specified by
> the IP_UNICAST_IF socket option is being ignored.
> 
> [...]

Here is the summary with links:
  - [v2] net-next: Fix IP_UNICAST_IF option behavior for connected sockets
    https://git.kernel.org/netdev/net-next/c/0e4d354762ce

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index ffd57523331f..405a8c2aea64 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -42,6 +42,8 @@  int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 			oif = inet->mc_index;
 		if (!saddr)
 			saddr = inet->mc_addr;
+	} else if (!oif) {
+		oif = inet->uc_index;
 	}
 	fl4 = &inet->cork.fl.u.ip4;
 	rt = ip_route_connect(fl4, usin->sin_addr.s_addr, saddr, oif,
diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
index 03b586760164..31c3b6ebd388 100755
--- a/tools/testing/selftests/net/fcnal-test.sh
+++ b/tools/testing/selftests/net/fcnal-test.sh
@@ -1466,6 +1466,13 @@  ipv4_udp_novrf()
 		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -0 ${NSA_IP}
 		log_test_addr ${a} $? 0 "Client, device bind via IP_UNICAST_IF"
 
+		log_start
+		run_cmd_nsb nettest -D -s &
+		sleep 1
+		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -0 ${NSA_IP} -U
+		log_test_addr ${a} $? 0 "Client, device bind via IP_UNICAST_IF, with connect()"
+
+
 		log_start
 		show_hint "Should fail 'Connection refused'"
 		run_cmd nettest -D -r ${a}
@@ -1525,6 +1532,13 @@  ipv4_udp_novrf()
 	run_cmd nettest -D -d ${NSA_DEV} -S -r ${a}
 	log_test_addr ${a} $? 0 "Global server, device client via IP_UNICAST_IF, local connection"
 
+	log_start
+	run_cmd nettest -s -D &
+	sleep 1
+	run_cmd nettest -D -d ${NSA_DEV} -S -r ${a} -U
+	log_test_addr ${a} $? 0 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
+
+
 	# IPv4 with device bind has really weird behavior - it overrides the
 	# fib lookup, generates an rtable and tries to send the packet. This
 	# causes failures for local traffic at different places
@@ -1550,6 +1564,15 @@  ipv4_udp_novrf()
 		sleep 1
 		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S
 		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection"
+
+		log_start
+		show_hint "Should fail since addresses on loopback are out of device scope"
+		run_cmd nettest -D -s &
+		sleep 1
+		run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -U
+		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
+
+
 	done
 
 	a=${NSA_IP}
@@ -3157,6 +3180,13 @@  ipv6_udp_novrf()
 		sleep 1
 		run_cmd nettest -6 -D -r ${a} -d ${NSA_DEV} -S
 		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection"
+
+		log_start
+		show_hint "Should fail 'No route to host' since addresses on loopback are out of device scope"
+		run_cmd nettest -6 -D -s &
+		sleep 1
+		run_cmd nettest -6 -D -r ${a} -d ${NSA_DEV} -S -U
+		log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
 	done
 
 	a=${NSA_IP6}
diff --git a/tools/testing/selftests/net/nettest.c b/tools/testing/selftests/net/nettest.c
index d9a6fd2cd9d3..7900fa98eccb 100644
--- a/tools/testing/selftests/net/nettest.c
+++ b/tools/testing/selftests/net/nettest.c
@@ -127,6 +127,9 @@  struct sock_args {
 
 	/* ESP in UDP encap test */
 	int use_xfrm;
+
+	/* use send() and connect() instead of sendto */
+	int datagram_connect;
 };
 
 static int server_mode;
@@ -979,6 +982,11 @@  static int send_msg(int sd, void *addr, socklen_t alen, struct sock_args *args)
 			log_err_errno("write failed sending msg to peer");
 			return 1;
 		}
+	} else if (args->datagram_connect) {
+		if (send(sd, msg, msglen, 0) < 0) {
+			log_err_errno("send failed sending msg to peer");
+			return 1;
+		}
 	} else if (args->ifindex && args->use_cmsg) {
 		if (send_msg_cmsg(sd, addr, alen, args->ifindex, args->version))
 			return 1;
@@ -1659,7 +1667,7 @@  static int connectsock(void *addr, socklen_t alen, struct sock_args *args)
 	if (args->has_local_ip && bind_socket(sd, args))
 		goto err;
 
-	if (args->type != SOCK_STREAM)
+	if (args->type != SOCK_STREAM && !args->datagram_connect)
 		goto out;
 
 	if (args->password && tcp_md5sig(sd, addr, alen, args))
@@ -1854,7 +1862,7 @@  static int ipc_parent(int cpid, int fd, struct sock_args *args)
 	return client_status;
 }
 
-#define GETOPT_STR  "sr:l:c:p:t:g:P:DRn:M:X:m:d:I:BN:O:SCi6xL:0:1:2:3:Fbqf"
+#define GETOPT_STR  "sr:l:c:p:t:g:P:DRn:M:X:m:d:I:BN:O:SUCi6xL:0:1:2:3:Fbqf"
 #define OPT_FORCE_BIND_KEY_IFINDEX 1001
 #define OPT_NO_BIND_KEY_IFINDEX 1002
 
@@ -1891,6 +1899,7 @@  static void print_usage(char *prog)
 	"    -I dev        bind socket to given device name - server mode\n"
 	"    -S            use setsockopt (IP_UNICAST_IF or IP_MULTICAST_IF)\n"
 	"                  to set device binding\n"
+	"    -U            Use connect() and send() for datagram sockets\n"
 	"    -f            bind socket with the IP[V6]_FREEBIND option\n"
 	"    -C            use cmsg and IP_PKTINFO to specify device binding\n"
 	"\n"
@@ -2074,6 +2083,9 @@  int main(int argc, char *argv[])
 		case 'x':
 			args.use_xfrm = 1;
 			break;
+		case 'U':
+			args.datagram_connect = 1;
+			break;
 		default:
 			print_usage(argv[0]);
 			return 1;