diff mbox series

[net,1/2] ip: fix dflt addr selection for connected nexthop

Message ID 20220706160526.31711-1-nicolas.dichtel@6wind.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net,1/2] ip: fix dflt addr selection for connected nexthop | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers warning 1 maintainers not CCed: yoshfuji@linux-ipv6.org
netdev/build_clang success Errors and warnings before: 6 this patch: 6
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Nicolas Dichtel July 6, 2022, 4:05 p.m. UTC
When a nexthop is added, without a gw address, the default scope was set
to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
but rejected when the route is used.

When using a route without a nexthop id, the scope can be configured in the
route, thus the problem doesn't exist.

To explain more deeply: when a user creates a nexthop, it cannot specify
the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was
setting scope to 'host'. Then, nh_create_ipv4() calls
fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
chosen before the route is inserted.

When a 'standard' route (ie without a reference to a nexthop) is added,
fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
the user. iproute2 set the scope to 'link' by default.

Here is a way to reproduce the problem:
ip netns add foo
ip -n foo link set lo up
ip netns add bar
ip -n bar link set lo up
sleep 1

ip -n foo link add name eth0 type dummy
ip -n foo link set eth0 up
ip -n foo address add 192.168.0.1/24 dev eth0

ip -n foo link add name veth0 type veth peer name veth1 netns bar
ip -n foo link set veth0 address 00:09:c0:26:05:82
ip -n foo link set veth0 arp off
ip -n foo link set veth0 up
ip -n bar link set veth1 address 00:09:c0:26:05:82
ip -n bar link set veth1 arp off
ip -n bar link set veth1 up

ip -n bar address add 192.168.1.1/32 dev veth1
ip -n bar route add default dev veth1

ip -n foo nexthop add id 1 dev veth0
ip -n foo route add 192.168.1.1 nhid 1

Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> RTNETLINK answers: Invalid argument
> $ ip netns exec foo ping -c1 192.168.1.1
> ping: connect: Invalid argument

Try without nexthop group (iproute2 sets scope to 'link' by dflt):
ip -n foo route del 192.168.1.1
ip -n foo route add 192.168.1.1 dev veth0

Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
>     cache
> $ ip netns exec foo ping -c1 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
>
> --- 192.168.1.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms

CC: stable@vger.kernel.org
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Reported-by: Edwin Brossette <edwin.brossette@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv4/fib_semantics.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paolo Abeni July 12, 2022, 8:11 a.m. UTC | #1
On Wed, 2022-07-06 at 18:05 +0200, Nicolas Dichtel wrote:
> When a nexthop is added, without a gw address, the default scope was set
> to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
> but rejected when the route is used.
> 
> When using a route without a nexthop id, the scope can be configured in the
> route, thus the problem doesn't exist.
> 
> To explain more deeply: when a user creates a nexthop, it cannot specify
> the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
> with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was
> setting scope to 'host'. Then, nh_create_ipv4() calls
> fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
> chosen before the route is inserted.
> 
> When a 'standard' route (ie without a reference to a nexthop) is added,
> fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
> the user. iproute2 set the scope to 'link' by default.
> 
> Here is a way to reproduce the problem:
> ip netns add foo
> ip -n foo link set lo up
> ip netns add bar
> ip -n bar link set lo up
> sleep 1
> 
> ip -n foo link add name eth0 type dummy
> ip -n foo link set eth0 up
> ip -n foo address add 192.168.0.1/24 dev eth0
> 
> ip -n foo link add name veth0 type veth peer name veth1 netns bar
> ip -n foo link set veth0 address 00:09:c0:26:05:82
> ip -n foo link set veth0 arp off

It looks like the 'arp off'/fixed mac address is not relevant for the
test case, could you please drop it, so that the example and the self-
test are more clean?

> ip -n foo link set veth0 up
> ip -n bar link set veth1 address 00:09:c0:26:05:82
> ip -n bar link set veth1 arp off
> ip -n bar link set veth1 up
> 
> ip -n bar address add 192.168.1.1/32 dev veth1
> ip -n bar route add default dev veth1
> 
> ip -n foo nexthop add id 1 dev veth0
> ip -n foo route add 192.168.1.1 nhid 1
> 
> Try to get/use the route:
> > $ ip -n foo route get 192.168.1.1
> > RTNETLINK answers: Invalid argument
> > $ ip netns exec foo ping -c1 192.168.1.1
> > ping: connect: Invalid argument
> 
> Try without nexthop group (iproute2 sets scope to 'link' by dflt):
> ip -n foo route del 192.168.1.1
> ip -n foo route add 192.168.1.1 dev veth0
> 
> Try to get/use the route:
> > $ ip -n foo route get 192.168.1.1
> > 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
> >     cache
> > $ ip netns exec foo ping -c1 192.168.1.1
> > PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> > 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
> > 
> > --- 192.168.1.1 ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
> 
> CC: stable@vger.kernel.org
> Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")

Why that commit? It looks like fib_check_nh() used SCOPE_HOST for nongw
next hop since well before ?!?

Otherwise LGTM.

Paolo
Nicolas Dichtel July 12, 2022, 8:45 a.m. UTC | #2
Le 12/07/2022 à 10:11, Paolo Abeni a écrit :
[snip]
>>
>> ip -n foo link add name veth0 type veth peer name veth1 netns bar
>> ip -n foo link set veth0 address 00:09:c0:26:05:82
>> ip -n foo link set veth0 arp off
> 
> It looks like the 'arp off'/fixed mac address is not relevant for the
> test case, could you please drop it, so that the example and the self-
> test are more clean?Good point, I will remove these lines.

[snip]
>> Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
> 
> Why that commit? It looks like fib_check_nh() used SCOPE_HOST for nongw
> next hop since well before ?!?
Yes, but with "standard" route, ie when the nexthop objects are not used, the
scope used, at the end, is the one specified in the netlink message.
With nexthop object, the user cannot specify the scope and we end up in this
problem.
Thus, the problem exists only with nexthop object only.

I tried to explain this difference in the commit log, but maybe it's not clear
enough.

> 
> Otherwise LGTM.


Thank you,
Nicolas
diff mbox series

Patch

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index a57ba23571c9..20177ecf5bdd 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1230,7 +1230,7 @@  static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
 
 	nh->fib_nh_dev = in_dev->dev;
 	dev_hold_track(nh->fib_nh_dev, &nh->fib_nh_dev_tracker, GFP_ATOMIC);
-	nh->fib_nh_scope = RT_SCOPE_HOST;
+	nh->fib_nh_scope = RT_SCOPE_LINK;
 	if (!netif_carrier_ok(nh->fib_nh_dev))
 		nh->fib_nh_flags |= RTNH_F_LINKDOWN;
 	err = 0;