Message ID | 20230720065941.3294051-1-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [PATCHv3,net] ipv6: do not match device when remove source route | expand |
On Thu, Jul 20, 2023 at 02:59:41PM +0800, Hangbin Liu wrote: > After deleting an IPv6 address on an interface and cleaning up the > related preferred source entries, it is important to ensure that all > routes associated with the deleted address are properly cleared. The > current implementation of rt6_remove_prefsrc() only checks the preferred > source addresses bound to the current device. However, there may be > routes that are bound to other devices but still utilize the same > preferred source address. > > To address this issue, it is necessary to also delete entries that are > bound to other interfaces but share the same source address with the > current device. Failure to delete these entries would leave routes that > are bound to the deleted address unclear. Here is an example reproducer > (I have omitted unrelated routes): [...] > Ido notified that there is a commit 5a56a0b3a45d ("net: Don't delete > routes in different VRFs") to not affect the route in different VRFs. > So let's remove the rt dev checking and add an table id checking. > Also remove the !rt-nh checking to clear the IPv6 routes that are using > a nexthop object. This would be consistent with IPv4. > > A ipv6_del_addr test is added for fib_tests.sh. Note that instead > of removing the whole route for IPv4, IPv6 only remove the preferred > source address for source routing. So in the testing use > "grep -q src $src_ipv6_address" instead of "grep -q $dst_ipv6_subnet/64" > when checking if the source route deleted. > > Here is the fib_tests.sh ipv6_del_addr test result. [...] > > Reported-by: Thomas Haller <thaller@redhat.com> > Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection") > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> One nit below [...] > @@ -1869,6 +1869,96 @@ ipv4_del_addr_test() > cleanup > } > > +ipv6_del_addr_test() > +{ [...] > +} > + > Double blank line
On Thu, Jul 20, 2023 at 04:22:11PM +0300, Ido Schimmel wrote: > On Thu, Jul 20, 2023 at 02:59:41PM +0800, Hangbin Liu wrote: > > After deleting an IPv6 address on an interface and cleaning up the > > related preferred source entries, it is important to ensure that all > > routes associated with the deleted address are properly cleared. The > > current implementation of rt6_remove_prefsrc() only checks the preferred > > source addresses bound to the current device. However, there may be > > routes that are bound to other devices but still utilize the same > > preferred source address. > > > > To address this issue, it is necessary to also delete entries that are > > bound to other interfaces but share the same source address with the > > current device. Failure to delete these entries would leave routes that > > are bound to the deleted address unclear. Here is an example reproducer > > (I have omitted unrelated routes): > > [...] > > > Ido notified that there is a commit 5a56a0b3a45d ("net: Don't delete > > routes in different VRFs") to not affect the route in different VRFs. > > So let's remove the rt dev checking and add an table id checking. > > Also remove the !rt-nh checking to clear the IPv6 routes that are using > > a nexthop object. This would be consistent with IPv4. > > > > A ipv6_del_addr test is added for fib_tests.sh. Note that instead > > of removing the whole route for IPv4, IPv6 only remove the preferred > > source address for source routing. So in the testing use > > "grep -q src $src_ipv6_address" instead of "grep -q $dst_ipv6_subnet/64" > > when checking if the source route deleted. > > > > Here is the fib_tests.sh ipv6_del_addr test result. > > [...] > > > > > Reported-by: Thomas Haller <thaller@redhat.com> > > Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection") > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> > > Reviewed-by: Ido Schimmel <idosch@nvidia.com> Actually, there is another problem here. IPv4 checks that the address is indeed gone (it can be assigned to more than one interface): + ip link add name dummy1 up type dummy + ip link add name dummy2 up type dummy + ip link add name dummy3 up type dummy + ip address add 192.0.2.1/24 dev dummy1 + ip address add 192.0.2.1/24 dev dummy2 + ip route add 198.51.100.0/24 dev dummy3 src 192.0.2.1 + ip address del 192.0.2.1/24 dev dummy2 + ip -4 r s 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 198.51.100.0/24 dev dummy3 scope link src 192.0.2.1 But it doesn't happen for IPv6: + ip link add name dummy1 up type dummy + ip link add name dummy2 up type dummy + ip link add name dummy3 up type dummy + ip address add 2001:db8:1::1/64 dev dummy1 + ip address add 2001:db8:1::1/64 dev dummy2 + ip route add 2001:db8:2::/64 dev dummy3 src 2001:db8:1::1 + ip address del 2001:db8:1::1/64 dev dummy2 + ip -6 r s 2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium 2001:db8:2::/64 dev dummy3 metric 1024 pref medium fe80::/64 dev dummy1 proto kernel metric 256 pref medium fe80::/64 dev dummy2 proto kernel metric 256 pref medium fe80::/64 dev dummy3 proto kernel metric 256 pref medium
Hi Ido, On Thu, Jul 20, 2023 at 05:49:47PM +0300, Ido Schimmel wrote: > Actually, there is another problem here. IPv4 checks that the address is > indeed gone (it can be assigned to more than one interface): > > + ip link add name dummy1 up type dummy > + ip link add name dummy2 up type dummy > + ip link add name dummy3 up type dummy > + ip address add 192.0.2.1/24 dev dummy1 > + ip address add 192.0.2.1/24 dev dummy2 > + ip route add 198.51.100.0/24 dev dummy3 src 192.0.2.1 > + ip address del 192.0.2.1/24 dev dummy2 > + ip -4 r s > 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 > 198.51.100.0/24 dev dummy3 scope link src 192.0.2.1 > > But it doesn't happen for IPv6: > > + ip link add name dummy1 up type dummy > + ip link add name dummy2 up type dummy > + ip link add name dummy3 up type dummy > + ip address add 2001:db8:1::1/64 dev dummy1 > + ip address add 2001:db8:1::1/64 dev dummy2 > + ip route add 2001:db8:2::/64 dev dummy3 src 2001:db8:1::1 > + ip address del 2001:db8:1::1/64 dev dummy2 > + ip -6 r s > 2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium > 2001:db8:2::/64 dev dummy3 metric 1024 pref medium > fe80::/64 dev dummy1 proto kernel metric 256 pref medium > fe80::/64 dev dummy2 proto kernel metric 256 pref medium > fe80::/64 dev dummy3 proto kernel metric 256 pref medium Hmm, what kind of usage that need to add same address on different interfaces? BTW, to fix it, how about check if the IPv6 addr still exist. e.g. --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -4590,10 +4590,11 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg) struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev; struct net *net = ((struct arg_dev_net_ip *)arg)->net; struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr; + u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN; - if (!rt->nh && - ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) && - rt != net->ipv6.fib6_null_entry && + if (rt != net->ipv6.fib6_null_entry && + rt->fib6_table->tb6_id == tb6_id && + !ipv6_chk_addr_and_flags(net, addr, NULL, true, 0, IFA_F_TENTATIVE) && ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) { spin_lock_bh(&rt6_exception_lock); /* remove prefsrc entry */ Thanks Hangbin
On Fri, Jul 21, 2023 at 04:59:21PM +0800, Hangbin Liu wrote: > Hi Ido, > > On Thu, Jul 20, 2023 at 05:49:47PM +0300, Ido Schimmel wrote: > > Actually, there is another problem here. IPv4 checks that the address is > > indeed gone (it can be assigned to more than one interface): > > > > + ip link add name dummy1 up type dummy > > + ip link add name dummy2 up type dummy > > + ip link add name dummy3 up type dummy > > + ip address add 192.0.2.1/24 dev dummy1 > > + ip address add 192.0.2.1/24 dev dummy2 > > + ip route add 198.51.100.0/24 dev dummy3 src 192.0.2.1 > > + ip address del 192.0.2.1/24 dev dummy2 > > + ip -4 r s > > 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 > > 198.51.100.0/24 dev dummy3 scope link src 192.0.2.1 > > > > But it doesn't happen for IPv6: > > > > + ip link add name dummy1 up type dummy > > + ip link add name dummy2 up type dummy > > + ip link add name dummy3 up type dummy > > + ip address add 2001:db8:1::1/64 dev dummy1 > > + ip address add 2001:db8:1::1/64 dev dummy2 > > + ip route add 2001:db8:2::/64 dev dummy3 src 2001:db8:1::1 > > + ip address del 2001:db8:1::1/64 dev dummy2 > > + ip -6 r s > > 2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium > > 2001:db8:2::/64 dev dummy3 metric 1024 pref medium > > fe80::/64 dev dummy1 proto kernel metric 256 pref medium > > fe80::/64 dev dummy2 proto kernel metric 256 pref medium > > fe80::/64 dev dummy3 proto kernel metric 256 pref medium > > Hmm, what kind of usage that need to add same address on different interfaces? I don't know, but when I checked the code and tested it I noticed that the kernel doesn't care on which interface the address is configured. Therefore, in order for deletion to be consistent with addition and with IPv4, the preferred source address shouldn't be removed from routes in the VRF table as long as the address is configured on one of the interfaces in the VRF. > BTW, to fix it, how about check if the IPv6 addr still exist. e.g. > > --- a/net/ipv6/route.c > +++ b/net/ipv6/route.c > @@ -4590,10 +4590,11 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg) > struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev; > struct net *net = ((struct arg_dev_net_ip *)arg)->net; > struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr; > + u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN; > > - if (!rt->nh && > - ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) && > - rt != net->ipv6.fib6_null_entry && > + if (rt != net->ipv6.fib6_null_entry && > + rt->fib6_table->tb6_id == tb6_id && > + !ipv6_chk_addr_and_flags(net, addr, NULL, true, 0, IFA_F_TENTATIVE) && > ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) { ipv6_chk_addr_and_flags() is more expensive than ipv6_addr_equal() so better to first check that route indeed uses the address as the preferred source address and only then check if it exists. Maybe you can even do it in rt6_remove_prefsrc(). That would be similar to what IPv4 is doing. > spin_lock_bh(&rt6_exception_lock); > /* remove prefsrc entry */ > > Thanks > Hangbin
On 7/23/23 2:13 AM, Ido Schimmel wrote: > > I don't know, but when I checked the code and tested it I noticed that > the kernel doesn't care on which interface the address is configured. > Therefore, in order for deletion to be consistent with addition and with > IPv4, the preferred source address shouldn't be removed from routes in > the VRF table as long as the address is configured on one of the > interfaces in the VRF. > Deleting routes associated with device 2 when an address is deleted from device 1 is going to introduce as many problems as it solves. The VRF use case is one example.
On Sun, Jul 23, 2023 at 11:13:36AM +0300, Ido Schimmel wrote: > > BTW, to fix it, how about check if the IPv6 addr still exist. e.g. > > > > --- a/net/ipv6/route.c > > +++ b/net/ipv6/route.c > > @@ -4590,10 +4590,11 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg) > > struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev; > > struct net *net = ((struct arg_dev_net_ip *)arg)->net; > > struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr; > > + u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN; > > > > - if (!rt->nh && > > - ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) && > > - rt != net->ipv6.fib6_null_entry && > > + if (rt != net->ipv6.fib6_null_entry && > > + rt->fib6_table->tb6_id == tb6_id && > > + !ipv6_chk_addr_and_flags(net, addr, NULL, true, 0, IFA_F_TENTATIVE) && > > ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) { > > ipv6_chk_addr_and_flags() is more expensive than ipv6_addr_equal() so > better to first check that route indeed uses the address as the > preferred source address and only then check if it exists. OK. > > Maybe you can even do it in rt6_remove_prefsrc(). That would be similar > to what IPv4 is doing. Do you mean call ipv6_chk_addr_and_flags() in rt6_remove_prefsrc()? Thanks Hangbin
On Mon, Jul 24, 2023 at 05:42:34PM +0800, Hangbin Liu wrote: > On Sun, Jul 23, 2023 at 11:13:36AM +0300, Ido Schimmel wrote: > > > BTW, to fix it, how about check if the IPv6 addr still exist. e.g. > > > > > > --- a/net/ipv6/route.c > > > +++ b/net/ipv6/route.c > > > @@ -4590,10 +4590,11 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg) > > > struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev; > > > struct net *net = ((struct arg_dev_net_ip *)arg)->net; > > > struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr; > > > + u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN; > > > > > > - if (!rt->nh && > > > - ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) && > > > - rt != net->ipv6.fib6_null_entry && > > > + if (rt != net->ipv6.fib6_null_entry && > > > + rt->fib6_table->tb6_id == tb6_id && > > > + !ipv6_chk_addr_and_flags(net, addr, NULL, true, 0, IFA_F_TENTATIVE) && > > > ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) { > > > > ipv6_chk_addr_and_flags() is more expensive than ipv6_addr_equal() so > > better to first check that route indeed uses the address as the > > preferred source address and only then check if it exists. > > OK. > > > > Maybe you can even do it in rt6_remove_prefsrc(). That would be similar > > to what IPv4 is doing. > > Do you mean call ipv6_chk_addr_and_flags() in rt6_remove_prefsrc()? Yes.
On Sun, Jul 23, 2023 at 12:12:00PM -0600, David Ahern wrote: > On 7/23/23 2:13 AM, Ido Schimmel wrote: > > > > I don't know, but when I checked the code and tested it I noticed that > > the kernel doesn't care on which interface the address is configured. > > Therefore, in order for deletion to be consistent with addition and with > > IPv4, the preferred source address shouldn't be removed from routes in > > the VRF table as long as the address is configured on one of the > > interfaces in the VRF. > > > > Deleting routes associated with device 2 when an address is deleted from > device 1 is going to introduce as many problems as it solves. The VRF > use case is one example. This already happens in IPv4: # ip link add name dummy1 up type dummy # ip link add name dummy2 up type dummy # ip address add 192.0.2.1/24 dev dummy1 # ip route add 198.51.100.0/24 dev dummy2 src 192.0.2.1 # ip -4 r s 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 198.51.100.0/24 dev dummy2 scope link src 192.0.2.1 # ip address del 192.0.2.1/24 dev dummy1 # ip -4 r s IPv6 only removes the preferred source address from routes, but doesn't delete them. The patch doesn't change that. Another difference from IPv4 is that IPv6 only removes the preferred source address from routes whose first nexthop device matches the device from which the address was removed: # ip link add name dummy1 up type dummy # ip link add name dummy2 up type dummy # ip address add 2001:db8:1::1/64 dev dummy1 # ip route add 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 # ip -6 r s 2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 metric 1024 pref medium fe80::/64 dev dummy1 proto kernel metric 256 pref medium fe80::/64 dev dummy2 proto kernel metric 256 pref medium # ip address del 2001:db8:1::1/64 dev dummy1 # ip -6 r s 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 metric 1024 pref medium fe80::/64 dev dummy1 proto kernel metric 256 pref medium fe80::/64 dev dummy2 proto kernel metric 256 pref medium And this is despite the fact that the kernel only allowed the route to be programmed because the preferred source address was present on another interface in the same L3 domain / VRF: # ip link add name dummy1 up type dummy # ip link add name dummy2 up type dummy # ip route add 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 Error: Invalid source address. The intent of the patch (at least with the changes I proposed) is to remove the preferred source address from routes in a VRF when the address is no longer configured on any interface in the VRF. Note that the above is true for addresses with a global scope. The removal of a link-local address from a device should not affect other devices. This restriction also applies when a route is added: # ip link add name dummy1 up type dummy # ip link add name dummy2 up type dummy # ip -6 address add fe80::1/64 dev dummy1 # ip -6 route add 2001:db8:2::/64 dev dummy2 src fe80::1 Error: Invalid source address. # ip -6 address add fe80::1/64 dev dummy2 # ip -6 route add 2001:db8:2::/64 dev dummy2 src fe80::1
On 7/25/23 4:06 AM, Ido Schimmel wrote: > On Sun, Jul 23, 2023 at 12:12:00PM -0600, David Ahern wrote: >> On 7/23/23 2:13 AM, Ido Schimmel wrote: >>> >>> I don't know, but when I checked the code and tested it I noticed that >>> the kernel doesn't care on which interface the address is configured. >>> Therefore, in order for deletion to be consistent with addition and with >>> IPv4, the preferred source address shouldn't be removed from routes in >>> the VRF table as long as the address is configured on one of the >>> interfaces in the VRF. >>> >> >> Deleting routes associated with device 2 when an address is deleted from >> device 1 is going to introduce as many problems as it solves. The VRF >> use case is one example. > > This already happens in IPv4: > > # ip link add name dummy1 up type dummy > # ip link add name dummy2 up type dummy > # ip address add 192.0.2.1/24 dev dummy1 > # ip route add 198.51.100.0/24 dev dummy2 src 192.0.2.1 > # ip -4 r s > 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 > 198.51.100.0/24 dev dummy2 scope link src 192.0.2.1 > # ip address del 192.0.2.1/24 dev dummy1 > # ip -4 r s > > IPv6 only removes the preferred source address from routes, but doesn't > delete them. The patch doesn't change that. > > Another difference from IPv4 is that IPv6 only removes the preferred > source address from routes whose first nexthop device matches the device > from which the address was removed: > > # ip link add name dummy1 up type dummy > # ip link add name dummy2 up type dummy > # ip address add 2001:db8:1::1/64 dev dummy1 > # ip route add 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 > # ip -6 r s > 2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium > 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 metric 1024 pref medium > fe80::/64 dev dummy1 proto kernel metric 256 pref medium > fe80::/64 dev dummy2 proto kernel metric 256 pref medium > # ip address del 2001:db8:1::1/64 dev dummy1 > # ip -6 r s > 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 metric 1024 pref medium > fe80::/64 dev dummy1 proto kernel metric 256 pref medium > fe80::/64 dev dummy2 proto kernel metric 256 pref medium > > And this is despite the fact that the kernel only allowed the route to > be programmed because the preferred source address was present on > another interface in the same L3 domain / VRF: > > # ip link add name dummy1 up type dummy > # ip link add name dummy2 up type dummy > # ip route add 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 > Error: Invalid source address. > > The intent of the patch (at least with the changes I proposed) is to > remove the preferred source address from routes in a VRF when the > address is no longer configured on any interface in the VRF. > > Note that the above is true for addresses with a global scope. The > removal of a link-local address from a device should not affect other > devices. This restriction also applies when a route is added: > > # ip link add name dummy1 up type dummy > # ip link add name dummy2 up type dummy > # ip -6 address add fe80::1/64 dev dummy1 > # ip -6 route add 2001:db8:2::/64 dev dummy2 src fe80::1 > Error: Invalid source address. > # ip -6 address add fe80::1/64 dev dummy2 > # ip -6 route add 2001:db8:2::/64 dev dummy2 src fe80::1 Lot of permutations. It would be good to get these in a test script along with other variations - e.g., # 2 devices with the same source address ip link add name dummy1 up type dummy ip link add name dummy2 up type dummy ip link add name dummy3 up type dummy ip address add 192.0.2.1/24 dev dummy1 ip address add 192.0.2.1/24 dev dummy3 ip route add 198.51.100.0/24 dev dummy2 src 192.0.2.1 ip address del 192.0.2.1/24 dev dummy1 --> src route should stay # VRF with single device using src address ip li add name red up type vrf table 123 ip link add name dummy4 up type dummy vrf red ip link add name dummy5 up type dummy vrf red ip address add 192.0.2.1/24 dev dummy4 ip route add 198.51.100.0/24 dev dummy5 src 192.0.2.1 ip address del 192.0.2.1/24 dev dummy4 ip ro ls vrf red # VRF with two devices using src address ip li add name red up type vrf table 123 ip link add name dummy4 up vrf red type dummy ip link add name dummy5 up vrf red type dummy ip link add name dummy6 up vrf red type dummy ip address add 192.0.2.1/24 dev dummy4 ip address add 192.0.2.1/24 dev dummy6 ip route add 198.51.100.0/24 dev dummy5 src 192.0.2.1 vrf red ip address del 192.0.2.1/24 dev dummy4 I can not find my notes but I recall Donald raised a ticket at Cumulus when FRR tripped over a scenario like this or a related one (something about routes and address delete). CC'ed Donald in case he recalls the details
On Tue, Jul 25, 2023 at 04:37:15PM -0600, David Ahern wrote: > > This already happens in IPv4: > > > > # ip link add name dummy1 up type dummy > > # ip link add name dummy2 up type dummy > > # ip address add 192.0.2.1/24 dev dummy1 > > # ip route add 198.51.100.0/24 dev dummy2 src 192.0.2.1 > > # ip -4 r s > > 192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1 > > 198.51.100.0/24 dev dummy2 scope link src 192.0.2.1 > > # ip address del 192.0.2.1/24 dev dummy1 > > # ip -4 r s > > > > IPv6 only removes the preferred source address from routes, but doesn't > > delete them. The patch doesn't change that. > > > > Another difference from IPv4 is that IPv6 only removes the preferred > > source address from routes whose first nexthop device matches the device > > from which the address was removed: > > > > # ip link add name dummy1 up type dummy > > # ip link add name dummy2 up type dummy > > # ip address add 2001:db8:1::1/64 dev dummy1 > > # ip route add 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 > > # ip -6 r s > > 2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium > > 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 metric 1024 pref medium > > fe80::/64 dev dummy1 proto kernel metric 256 pref medium > > fe80::/64 dev dummy2 proto kernel metric 256 pref medium > > # ip address del 2001:db8:1::1/64 dev dummy1 > > # ip -6 r s > > 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 metric 1024 pref medium > > fe80::/64 dev dummy1 proto kernel metric 256 pref medium > > fe80::/64 dev dummy2 proto kernel metric 256 pref medium > > > > And this is despite the fact that the kernel only allowed the route to > > be programmed because the preferred source address was present on > > another interface in the same L3 domain / VRF: > > > > # ip link add name dummy1 up type dummy > > # ip link add name dummy2 up type dummy > > # ip route add 2001:db8:2::/64 dev dummy2 src 2001:db8:1::1 > > Error: Invalid source address. > > > > The intent of the patch (at least with the changes I proposed) is to > > remove the preferred source address from routes in a VRF when the > > address is no longer configured on any interface in the VRF. > > > > Note that the above is true for addresses with a global scope. The > > removal of a link-local address from a device should not affect other > > devices. This restriction also applies when a route is added: > > > > # ip link add name dummy1 up type dummy > > # ip link add name dummy2 up type dummy > > # ip -6 address add fe80::1/64 dev dummy1 > > # ip -6 route add 2001:db8:2::/64 dev dummy2 src fe80::1 > > Error: Invalid source address. > > # ip -6 address add fe80::1/64 dev dummy2 > > # ip -6 route add 2001:db8:2::/64 dev dummy2 src fe80::1 > > Lot of permutations. It would be good to get these in a test script > along with other variations - e.g., > > # 2 devices with the same source address > ip link add name dummy1 up type dummy > ip link add name dummy2 up type dummy > ip link add name dummy3 up type dummy > ip address add 192.0.2.1/24 dev dummy1 > ip address add 192.0.2.1/24 dev dummy3 > ip route add 198.51.100.0/24 dev dummy2 src 192.0.2.1 > ip address del 192.0.2.1/24 dev dummy1 > --> src route should stay > > # VRF with single device using src address > ip li add name red up type vrf table 123 > ip link add name dummy4 up type dummy vrf red > ip link add name dummy5 up type dummy vrf red > ip address add 192.0.2.1/24 dev dummy4 > ip route add 198.51.100.0/24 dev dummy5 src 192.0.2.1 > ip address del 192.0.2.1/24 dev dummy4 > ip ro ls vrf red > > # VRF with two devices using src address > ip li add name red up type vrf table 123 > ip link add name dummy4 up vrf red type dummy > ip link add name dummy5 up vrf red type dummy > ip link add name dummy6 up vrf red type dummy > ip address add 192.0.2.1/24 dev dummy4 > ip address add 192.0.2.1/24 dev dummy6 > ip route add 198.51.100.0/24 dev dummy5 src 192.0.2.1 vrf red > ip address del 192.0.2.1/24 dev dummy4 I can add these to fib_tests later. Hangbin
diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 64e873f5895f..773285adc17c 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -4590,10 +4590,10 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg) struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev; struct net *net = ((struct arg_dev_net_ip *)arg)->net; struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr; + u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN; - if (!rt->nh && - ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) && - rt != net->ipv6.fib6_null_entry && + if (rt != net->ipv6.fib6_null_entry && + rt->fib6_table->tb6_id == tb6_id && ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) { spin_lock_bh(&rt6_exception_lock); /* remove prefsrc entry */ diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh index 35d89dfa6f11..bf11edc17dfc 100755 --- a/tools/testing/selftests/net/fib_tests.sh +++ b/tools/testing/selftests/net/fib_tests.sh @@ -9,7 +9,7 @@ ret=0 ksft_skip=4 # all tests in this script. Can be overridden with -t option -TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify ipv6_rt ipv4_rt ipv6_addr_metric ipv4_addr_metric ipv6_route_metrics ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh" +TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify ipv6_rt ipv4_rt ipv6_addr_metric ipv4_addr_metric ipv6_route_metrics ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr ipv6_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh" VERBOSE=0 PAUSE_ON_FAIL=no @@ -1869,6 +1869,96 @@ ipv4_del_addr_test() cleanup } +ipv6_del_addr_test() +{ + echo + echo "IPv6 delete address route tests" + + setup + + set -e + $IP li add dummy1 type dummy + $IP li set dummy1 up + $IP li add dummy2 type dummy + $IP li set dummy2 up + $IP li add red type vrf table 1111 + $IP li set red up + $IP ro add vrf red unreachable default + $IP li set dummy2 vrf red + + $IP addr add dev dummy1 2001:db8:104::1/64 + $IP addr add dev dummy1 2001:db8:104::11/64 + $IP addr add dev dummy1 2001:db8:104::12/64 + $IP addr add dev dummy1 2001:db8:104::13/64 + $IP addr add dev dummy2 2001:db8:104::1/64 + $IP addr add dev dummy2 2001:db8:104::11/64 + $IP addr add dev dummy2 2001:db8:104::12/64 + $IP route add 2001:db8:105::/64 via 2001:db8:104::2 src 2001:db8:104::11 + $IP route add 2001:db8:106::/64 dev lo src 2001:db8:104::12 + $IP route add table 0 2001:db8:107::/64 via 2001:db8:104::2 src 2001:db8:104::13 + $IP route add vrf red 2001:db8:105::/64 via 2001:db8:104::2 src 2001:db8:104::11 + $IP route add vrf red 2001:db8:106::/64 dev lo src 2001:db8:104::12 + set +e + + # removing address from device in vrf should only remove it as a + # preferred source address from routes in vrf table + echo " Regular FIB info" + + $IP addr del dev dummy2 2001:db8:104::11/64 + # Checking if the source address exists instead of the dest subnet + # as IPv6 only removes the preferred source address, not whole route. + $IP -6 ro ls vrf red | grep -q "src 2001:db8:104::11" + log_test $? 1 "Prefsrc removed from VRF when source address deleted" + + $IP -6 ro ls | grep -q " src 2001:db8:104::11" + log_test $? 0 "Prefsrc in default VRF not removed" + + $IP addr add dev dummy2 2001:db8:104::11/64 + $IP route replace vrf red 2001:db8:105::/64 via 2001:db8:104::2 src 2001:db8:104::11 + + $IP addr del dev dummy1 2001:db8:104::11/64 + $IP -6 ro ls | grep -q "src 2001:db8:104::11" + log_test $? 1 "Prefsrc removed in default VRF when source address deleted" + + $IP -6 ro ls vrf red | grep -q "src 2001:db8:104::11" + log_test $? 0 "Prefsrc in VRF is not removed by address delete" + + # removing address from device in vrf should only remove preferred + # source address from vrf table even when the associated fib info + # only differs in table ID + echo " Identical FIB info with different table ID" + + $IP addr del dev dummy2 2001:db8:104::12/64 + $IP -6 ro ls vrf red | grep -q "src 2001:db8:104::12" + log_test $? 1 "Prefsrc removed from VRF when source address deleted" + + $IP -6 ro ls | grep -q "src 2001:db8:104::12" + log_test $? 0 "Prefsrc in default VRF not removed" + + $IP addr add dev dummy2 2001:db8:104::12/64 + $IP route replace vrf red 2001:db8:106::/64 dev lo src 2001:db8:104::12 + + $IP addr del dev dummy1 2001:db8:104::12/64 + $IP -6 ro ls | grep -q "src 2001:db8:104::12" + log_test $? 1 "Prefsrc removed in default VRF when source address deleted" + + $IP -6 ro ls vrf red | grep -q "src 2001:db8:104::12" + log_test $? 0 "Prefsrc in VRF is not removed by address delete" + + # removing address from device in default vrf should remove preferred + # source address from the default vrf even when route was inserted + # with a table ID of 0. + echo " Table ID 0" + + $IP addr del dev dummy1 2001:db8:104::13/64 + $IP -6 ro ls | grep -q "src 2001:db8:104::13" + log_test $? 1 "Prefsrc removed in default VRF when source address deleted" + + $IP li del dummy1 + $IP li del dummy2 + cleanup +} + ipv4_route_v6_gw_test() { @@ -2211,6 +2301,7 @@ do ipv6_addr_metric) ipv6_addr_metric_test;; ipv4_addr_metric) ipv4_addr_metric_test;; ipv4_del_addr) ipv4_del_addr_test;; + ipv6_del_addr) ipv6_del_addr_test;; ipv6_route_metrics) ipv6_route_metrics_test;; ipv4_route_metrics) ipv4_route_metrics_test;; ipv4_route_v6_gw) ipv4_route_v6_gw_test;;
After deleting an IPv6 address on an interface and cleaning up the related preferred source entries, it is important to ensure that all routes associated with the deleted address are properly cleared. The current implementation of rt6_remove_prefsrc() only checks the preferred source addresses bound to the current device. However, there may be routes that are bound to other devices but still utilize the same preferred source address. To address this issue, it is necessary to also delete entries that are bound to other interfaces but share the same source address with the current device. Failure to delete these entries would leave routes that are bound to the deleted address unclear. Here is an example reproducer (I have omitted unrelated routes): + ip link add dummy1 type dummy + ip link add dummy2 type dummy + ip link set dummy1 up + ip link set dummy2 up + ip addr add 1:2:3:4::5/64 dev dummy1 + ip route add 7:7:7:0::1 dev dummy1 src 1:2:3:4::5 + ip route add 7:7:7:0::2 dev dummy2 src 1:2:3:4::5 + ip -6 route show 1:2:3:4::/64 dev dummy1 proto kernel metric 256 pref medium 7:7:7::1 dev dummy1 src 1:2:3:4::5 metric 1024 pref medium 7:7:7::2 dev dummy2 src 1:2:3:4::5 metric 1024 pref medium + ip addr del 1:2:3:4::5/64 dev dummy1 + ip -6 route show 7:7:7::1 dev dummy1 metric 1024 pref medium 7:7:7::2 dev dummy2 src 1:2:3:4::5 metric 1024 pref medium After fix: + ip addr del 1:2:3:4::5/64 dev dummy1 + ip -6 route show 7:7:7::1 dev dummy1 metric 1024 pref medium 7:7:7::2 dev dummy2 metric 1024 pref medium Ido notified that there is a commit 5a56a0b3a45d ("net: Don't delete routes in different VRFs") to not affect the route in different VRFs. So let's remove the rt dev checking and add an table id checking. Also remove the !rt-nh checking to clear the IPv6 routes that are using a nexthop object. This would be consistent with IPv4. A ipv6_del_addr test is added for fib_tests.sh. Note that instead of removing the whole route for IPv4, IPv6 only remove the preferred source address for source routing. So in the testing use "grep -q src $src_ipv6_address" instead of "grep -q $dst_ipv6_subnet/64" when checking if the source route deleted. Here is the fib_tests.sh ipv6_del_addr test result. Before fix: IPv6 delete address route tests Regular FIB info TEST: Prefsrc removed from VRF when source address deleted [ OK ] TEST: Prefsrc in default VRF not removed [FAIL] TEST: Prefsrc removed in default VRF when source address deleted [ OK ] TEST: Prefsrc in VRF is not removed by address delete [FAIL] Identical FIB info with different table ID TEST: Prefsrc removed from VRF when source address deleted [ OK ] TEST: Prefsrc in default VRF not removed [FAIL] TEST: Prefsrc removed in default VRF when source address deleted [ OK ] TEST: Prefsrc in VRF is not removed by address delete [FAIL] Table ID 0 TEST: Prefsrc removed in default VRF when source address deleted [ OK ] After fix: IPv6 delete address route tests Regular FIB info TEST: Prefsrc removed from VRF when source address deleted [ OK ] TEST: Prefsrc in default VRF not removed [ OK ] TEST: Prefsrc removed in default VRF when source address deleted [ OK ] TEST: Prefsrc in VRF is not removed by address delete [ OK ] Identical FIB info with different table ID TEST: Prefsrc removed from VRF when source address deleted [ OK ] TEST: Prefsrc in default VRF not removed [ OK ] TEST: Prefsrc removed in default VRF when source address deleted [ OK ] TEST: Prefsrc in VRF is not removed by address delete [ OK ] Table ID 0 TEST: Prefsrc removed in default VRF when source address deleted [ OK ] Reported-by: Thomas Haller <thaller@redhat.com> Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> --- v3: remove rt nh checking. update the ipv6_del_addr test descriptions v2: checking table id and update fib_test.sh --- net/ipv6/route.c | 6 +- tools/testing/selftests/net/fib_tests.sh | 93 +++++++++++++++++++++++- 2 files changed, 95 insertions(+), 4 deletions(-)