Message ID | 20230811095308.242489-1-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [PATCHv5,net-next] ipv6: do not match device when remove source route |
On Fri, Aug 11, 2023 at 05:53:08PM +0800, Hangbin Liu wrote:
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 64e873f5895f..0f981cc5bed1 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4590,11 +4590,12 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg)
>  	struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev;
>  	struct net *net = ((struct arg_dev_net_ip *)arg)->net;
>  	struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr;
> +	u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN;
>
> -	if (!rt->nh &&
> -	    ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) &&
> -	    rt != net->ipv6.fib6_null_entry &&
> -	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) {
> +	if (rt != net->ipv6.fib6_null_entry &&
> +	    rt->fib6_table->tb6_id == tb6_id &&
> +	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr) &&
> +	    !ipv6_chk_addr(net, addr, rt->fib6_nh->fib_nh_dev, 0)) {
>  		spin_lock_bh(&rt6_exception_lock);
>  		/* remove prefsrc entry */
>  		rt->fib6_prefsrc.plen = 0;

The table check is incorrect which is what I was trying to explain here
[1]. The route insertion code does not check that the preferred source
is accessible from the VRF where the route is installed, but instead
that it is accessible from the VRF of the first nexthop device. I'm not
going to debate whether it is correct or not. I'm going to say that the
logic should be consistent between the route insertion and deletion
paths. That is, if I'm only able to insert a route with a preferred
source address because some address exists, then when this address is
removed the preferred source address should be removed from the route.

Here is an example with your patch applied:

 + ip link add name dummy1 up type dummy
 + ip link add name vrf1 up type vrf table 1111
 + ip link set dev dummy1 master vrf1
 + ip -6 route add 2001:db8:2::/64 src 2001:db8:1::1 dev dummy1
 Error: Invalid source address.
 + ip address add 2001:db8:1::1/64 dev dummy1
 + ip -6 route add 2001:db8:2::/64 src 2001:db8:1::1 dev dummy1
 + ip -6 route show 2001:db8:2::/64
 2001:db8:2::/64 dev dummy1 src 2001:db8:1::1 metric 1024 pref medium
 + ip address del 2001:db8:1::1/64 dev dummy1
 + ip -6 route show 2001:db8:2::/64
 2001:db8:2::/64 dev dummy1 src 2001:db8:1::1 metric 1024 pref medium

Note how it is not possible to add the route to the main table because
the address does not exist, but then after the address is deleted the
route still exists with the preferred source address.

And this is the same example, but with the patch from [1]:

 + ip link add name dummy1 up type dummy
 + ip link add name vrf1 up type vrf table 1111
 + ip link set dev dummy1 master vrf1
 + ip -6 route add 2001:db8:2::/64 src 2001:db8:1::1 dev dummy1
 Error: Invalid source address.
 + ip address add 2001:db8:1::1/64 dev dummy1
 + ip -6 route add 2001:db8:2::/64 src 2001:db8:1::1 dev dummy1
 + ip -6 route show 2001:db8:2::/64
 2001:db8:2::/64 dev dummy1 src 2001:db8:1::1 metric 1024 pref medium
 + ip address del 2001:db8:1::1/64 dev dummy1
 + ip -6 route show 2001:db8:2::/64
 2001:db8:2::/64 dev dummy1 metric 1024 pref medium

[1] https://lore.kernel.org/netdev/ZNSol%2F7x5oI6amEB@shredder/
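
The two runs above differ only in whether the final `ip -6 route show` line still carries a `src` token. As a minimal sketch, a helper like the one below could make that comparison explicit. `prefsrc_of` is a hypothetical name (it is not part of iproute2 or fib_tests.sh), and the sample lines are copied from the outputs above so the snippet runs without root or a VRF setup.

```shell
#!/bin/sh
# prefsrc_of PREFIX: read `ip -6 route show` style output on stdin and print
# the preferred source address of the first route to PREFIX, or nothing if
# that route has no "src" token. Hypothetical helper, for illustration only.
prefsrc_of() {
	awk -v prefix="$1" '
		$1 == prefix {
			for (i = 2; i < NF; i++)
				if ($i == "src") { print $(i + 1); exit }
			exit
		}'
}

# Last output line of each run above, taken after the address was deleted.
with_v5='2001:db8:2::/64 dev dummy1 src 2001:db8:1::1 metric 1024 pref medium'
with_ref1='2001:db8:2::/64 dev dummy1 metric 1024 pref medium'

v5_src=$(printf '%s\n' "$with_v5" | prefsrc_of 2001:db8:2::/64)
ref1_src=$(printf '%s\n' "$with_ref1" | prefsrc_of 2001:db8:2::/64)

echo "v5 patch:       prefsrc='${v5_src}'"
echo "patch from [1]: prefsrc='${ref1_src}'"
```

On a live system the same function could presumably be fed from `ip -6 route show 2001:db8:2::/64` directly; an empty result after the address delete is the behaviour the reply asks for.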
Hi Ido,

On Sun, Aug 13, 2023 at 07:09:46PM +0300, Ido Schimmel wrote:
> On Fri, Aug 11, 2023 at 05:53:08PM +0800, Hangbin Liu wrote:
> > diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> > index 64e873f5895f..0f981cc5bed1 100644
> > --- a/net/ipv6/route.c
> > +++ b/net/ipv6/route.c
> > @@ -4590,11 +4590,12 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg)
> >  	struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev;
> >  	struct net *net = ((struct arg_dev_net_ip *)arg)->net;
> >  	struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr;
> > +	u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN;
> >
> > -	if (!rt->nh &&
> > -	    ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) &&
> > -	    rt != net->ipv6.fib6_null_entry &&
> > -	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) {
> > +	if (rt != net->ipv6.fib6_null_entry &&
> > +	    rt->fib6_table->tb6_id == tb6_id &&
> > +	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr) &&
> > +	    !ipv6_chk_addr(net, addr, rt->fib6_nh->fib_nh_dev, 0)) {
> >  		spin_lock_bh(&rt6_exception_lock);
> >  		/* remove prefsrc entry */
> >  		rt->fib6_prefsrc.plen = 0;
>
> The table check is incorrect which is what I was trying to explain here
> [1]. The route insertion code does not check that the preferred source
> is accessible from the VRF where the route is installed, but instead
> that it is accessible from the VRF of the first nexthop device. I'm not

Sorry for my bad understanding and thanks a lot for your patient response!

Now I finally get what you mean of "In IPv6, the preferred source address is
looked up in the same VRF as the first nexthop device." Which is not same with
the IPv4 commit f96a3d74554d ipv4: Fix incorrect route flushing when source
address is deleted

I will remove the tb id checking in next version. Another thing to confirm.
We need remove the "!rt->nh" checking, right. Because I saw you kept it in you
reply.

Thanks and Best regards
Hangbin
On 8/14/23 2:33 AM, Hangbin Liu wrote:
> Hi Ido,
> On Sun, Aug 13, 2023 at 07:09:46PM +0300, Ido Schimmel wrote:
>> On Fri, Aug 11, 2023 at 05:53:08PM +0800, Hangbin Liu wrote:
>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>> index 64e873f5895f..0f981cc5bed1 100644
>>> --- a/net/ipv6/route.c
>>> +++ b/net/ipv6/route.c
>>> @@ -4590,11 +4590,12 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg)
>>>  	struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev;
>>>  	struct net *net = ((struct arg_dev_net_ip *)arg)->net;
>>>  	struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr;
>>> +	u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN;
>>>
>>> -	if (!rt->nh &&
>>> -	    ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) &&
>>> -	    rt != net->ipv6.fib6_null_entry &&
>>> -	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) {
>>> +	if (rt != net->ipv6.fib6_null_entry &&
>>> +	    rt->fib6_table->tb6_id == tb6_id &&
>>> +	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr) &&
>>> +	    !ipv6_chk_addr(net, addr, rt->fib6_nh->fib_nh_dev, 0)) {
>>>  		spin_lock_bh(&rt6_exception_lock);
>>>  		/* remove prefsrc entry */
>>>  		rt->fib6_prefsrc.plen = 0;
>>
>> The table check is incorrect which is what I was trying to explain here
>> [1]. The route insertion code does not check that the preferred source
>> is accessible from the VRF where the route is installed, but instead
>> that it is accessible from the VRF of the first nexthop device. I'm not
>
> Sorry for my bad understanding and thanks a lot for your patient response!
>
> Now I finally get what you mean of "In IPv6, the preferred source address is
> looked up in the same VRF as the first nexthop device." Which is not same with
> the IPv4 commit f96a3d74554d ipv4: Fix incorrect route flushing when source
> address is deleted
>
> I will remove the tb id checking in next version. Another thing to confirm.
> We need remove the "!rt->nh" checking, right. Because I saw you kept it in you
> reply.
>

Make sure Ido's test cases for the various cases are added to the test
scripts. Lot of permutations here and we do not want to regress
On Mon, Aug 14, 2023 at 04:33:37PM +0800, Hangbin Liu wrote:
> I will remove the tb id checking in next version. Another thing to confirm.
> We need remove the "!rt->nh" checking, right. Because I saw you kept it in you
> reply.

My understanding is that when the route uses a nexthop object (i.e.,
rt->nh is not NULL), then rt->fib6_nh is invalid. So I think we need the
check for now. Maybe it can be removed once the function learns to use
nexthop_fib6_nh() for routes with a nexthop object, but that's another
patch. Let's finish with the current problem first.
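
Putting the replies together, the condition the thread appears to be converging on can be sketched as a toy truth table. This is only a model in shell, not kernel code. It assumes the next version drops the table ID comparison, keeps the `!rt->nh` guard, and keeps the other three tests from the v5 hunk unchanged. The function names and the 0/1 flag arguments are hypothetical.

```shell
#!/bin/sh
# Toy model of the condition discussed in this thread; this is NOT kernel
# code. Each argument is a 0/1 flag, and all names are hypothetical:
#   $1 uses_nh_object   route references a nexthop object (rt->nh != NULL)
#   $2 is_null_entry    route is net->ipv6.fib6_null_entry
#   $3 prefsrc_matches  route's prefsrc equals the deleted address
#   $4 addr_remains     ipv6_chk_addr() still finds the address via the
#                       route's nexthop device
should_clear_prefsrc() {
	[ "$1" -eq 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 1 ] && [ "$4" -eq 0 ]
}

# Print "clear" or "keep" for one combination of flags.
describe() {
	if should_clear_prefsrc "$@"; then echo clear; else echo keep; fi
}

echo "plain route, address gone:         $(describe 0 0 1 0)"
echo "plain route, address on other dev: $(describe 0 0 1 1)"
echo "route with nexthop object:         $(describe 1 0 1 0)"
```

The third case is the one raised in the reply above: while the function reads `rt->fib6_nh` directly, a route that uses a nexthop object has to be skipped rather than inspected.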
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 64e873f5895f..0f981cc5bed1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4590,11 +4590,12 @@ static int fib6_remove_prefsrc(struct fib6_info *rt, void *arg)
 	struct net_device *dev = ((struct arg_dev_net_ip *)arg)->dev;
 	struct net *net = ((struct arg_dev_net_ip *)arg)->net;
 	struct in6_addr *addr = ((struct arg_dev_net_ip *)arg)->addr;
+	u32 tb6_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN;
 
-	if (!rt->nh &&
-	    ((void *)rt->fib6_nh->fib_nh_dev == dev || !dev) &&
-	    rt != net->ipv6.fib6_null_entry &&
-	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr)) {
+	if (rt != net->ipv6.fib6_null_entry &&
+	    rt->fib6_table->tb6_id == tb6_id &&
+	    ipv6_addr_equal(addr, &rt->fib6_prefsrc.addr) &&
+	    !ipv6_chk_addr(net, addr, rt->fib6_nh->fib_nh_dev, 0)) {
 		spin_lock_bh(&rt6_exception_lock);
 		/* remove prefsrc entry */
 		rt->fib6_prefsrc.plen = 0;
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 35d89dfa6f11..0b5c99d80d56 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -9,7 +9,7 @@ ret=0
 ksft_skip=4
 
 # all tests in this script. Can be overridden with -t option
-TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify ipv6_rt ipv4_rt ipv6_addr_metric ipv4_addr_metric ipv6_route_metrics ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh"
+TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify ipv6_rt ipv4_rt ipv6_addr_metric ipv4_addr_metric ipv6_route_metrics ipv4_route_metrics ipv4_route_v6_gw rp_filter ipv4_del_addr ipv6_del_addr ipv4_mangle ipv6_mangle ipv4_bcast_neigh"
 
 VERBOSE=0
 PAUSE_ON_FAIL=no
@@ -1796,6 +1796,8 @@ ipv4_del_addr_test()
 	$IP li set dummy1 up
 	$IP li add dummy2 type dummy
 	$IP li set dummy2 up
+	$IP li add dummy3 type dummy
+	$IP li set dummy3 up
 	$IP li add red type vrf table 1111
 	$IP li set red up
 	$IP ro add vrf red unreachable default
@@ -1808,11 +1810,13 @@ ipv4_del_addr_test()
 	$IP addr add dev dummy2 172.16.104.1/24
 	$IP addr add dev dummy2 172.16.104.11/24
 	$IP addr add dev dummy2 172.16.104.12/24
+	$IP addr add dev dummy3 172.16.104.1/24
 	$IP route add 172.16.105.0/24 via 172.16.104.2 src 172.16.104.11
 	$IP route add 172.16.106.0/24 dev lo src 172.16.104.12
 	$IP route add table 0 172.16.107.0/24 via 172.16.104.2 src 172.16.104.13
 	$IP route add vrf red 172.16.105.0/24 via 172.16.104.2 src 172.16.104.11
 	$IP route add vrf red 172.16.106.0/24 dev lo src 172.16.104.12
+	$IP route add 172.16.108.0/24 via 172.16.104.2 src 172.16.104.1
 	set +e
 
 	# removing address from device in vrf should only remove route from vrf table
@@ -1864,11 +1868,121 @@ ipv4_del_addr_test()
 	$IP ro ls | grep -q 172.16.107.0/24
 	log_test $? 1 "Route removed in default VRF when source address deleted"
 
+	# removing address from one device while other device still has this
+	# address should not remove the source route
+	echo " Identical address on different device"
+	$IP addr del dev dummy3 172.16.104.1/24
+	$IP ro ls | grep -q 172.16.108.0/24
+	log_test $? 0 "Route not removed when source address exists on other device"
+
 	$IP li del dummy1
 	$IP li del dummy2
+	$IP li del dummy3
 	cleanup
 }
 
+ipv6_del_addr_test()
+{
+	echo
+	echo "IPv6 delete address route tests"
+
+	setup
+
+	set -e
+	$IP li add dummy1 up type dummy
+	$IP li add dummy2 up type dummy
+	$IP li add dummy3 up type dummy
+	$IP li add red type vrf table 1111
+	$IP li set red up
+	$IP ro add vrf red unreachable default
+	$IP li set dummy2 vrf red
+
+	$IP addr add dev dummy1 2001:db8:104::1/64
+	$IP addr add dev dummy1 2001:db8:104::11/64
+	$IP addr add dev dummy1 2001:db8:104::12/64
+	$IP addr add dev dummy1 2001:db8:104::13/64
+	$IP addr add dev dummy2 2001:db8:104::1/64
+	$IP addr add dev dummy2 2001:db8:104::11/64
+	$IP addr add dev dummy2 2001:db8:104::12/64
+	$IP addr add dev dummy3 2001:db8:104::1/64
+	$IP route add 2001:db8:105::/64 via 2001:db8:104::2 src 2001:db8:104::11
+	$IP route add 2001:db8:106::/64 dev lo src 2001:db8:104::12
+	$IP route add table 0 2001:db8:107::/64 via 2001:db8:104::2 src 2001:db8:104::13
+	$IP route add vrf red 2001:db8:105::/64 via 2001:db8:104::2 src 2001:db8:104::11
+	$IP route add vrf red 2001:db8:106::/64 dev lo src 2001:db8:104::12
+	$IP route add 2001:db8:108::/64 via 2001:db8:104::2 src 2001:db8:104::1
+	set +e
+
+	# removing address from device in vrf should only remove it as a
+	# preferred source address from routes in vrf table
+	echo " Regular FIB info"
+
+	$IP addr del dev dummy2 2001:db8:104::11/64
+	# Checking if the source address exists instead of the dest subnet
+	# as IPv6 only removes the preferred source address, not whole route.
+	$IP -6 ro ls vrf red | grep -q "src 2001:db8:104::11"
+	log_test $? 1 "Prefsrc removed from VRF when source address deleted"
+
+	$IP -6 ro ls | grep -q " src 2001:db8:104::11"
+	log_test $? 0 "Prefsrc in default VRF not removed"
+
+	$IP addr add dev dummy2 2001:db8:104::11/64
+	$IP route replace vrf red 2001:db8:105::/64 via 2001:db8:104::2 src 2001:db8:104::11
+
+	$IP addr del dev dummy1 2001:db8:104::11/64
+	$IP -6 ro ls | grep -q "src 2001:db8:104::11"
+	log_test $? 1 "Prefsrc removed in default VRF when source address deleted"
+
+	$IP -6 ro ls vrf red | grep -q "src 2001:db8:104::11"
+	log_test $? 0 "Prefsrc in VRF is not removed by address delete"
+
+	# removing address from device in vrf should only remove preferred
+	# source address from vrf table even when the associated fib info
+	# only differs in table ID
+	echo " Identical FIB info with different table ID"
+
+	# IPv6 works different with IPv4 when the nexthop device is in a
+	# different VRF.
+	$IP addr del dev dummy2 2001:db8:104::12/64
+	$IP -6 ro ls vrf red | grep -q "src 2001:db8:104::12"
+	log_test $? 0 "Prefsrc not removed from VRF when nexthop dev in other VRF"
+
+	$IP -6 ro ls | grep -q "src 2001:db8:104::12"
+	log_test $? 0 "Prefsrc in default VRF not removed"
+
+	$IP addr add dev dummy2 2001:db8:104::12/64
+	$IP addr del dev dummy1 2001:db8:104::12/64
+	$IP -6 ro ls | grep -q "src 2001:db8:104::12"
+	log_test $? 1 "Prefsrc removed in default VRF when source address deleted"
+
+	$IP -6 ro ls vrf red | grep -q "src 2001:db8:104::12"
+	log_test $? 0 "Prefsrc in VRF is not removed by address delete"
+
+	$IP addr del dev dummy2 2001:db8:104::12/64
+	$IP -6 ro ls vrf red | grep -q "src 2001:db8:104::12"
+	log_test $? 1 "Prefsrc in VRF is removed by address delete"
+
+	# removing address from device in default vrf should remove preferred
+	# source address from the default vrf even when route was inserted
+	# with a table ID of 0.
+	echo " Table ID 0"
+
+	$IP addr del dev dummy1 2001:db8:104::13/64
+	$IP -6 ro ls | grep -q "src 2001:db8:104::13"
+	log_test $? 1 "Prefsrc removed in default VRF when source address deleted"
+
+	# removing address from one device while other device still has this
+	# address should not remove the source route
+	echo " Identical address on different devices"
+	$IP addr del dev dummy3 2001:db8:104::1/64
+	$IP -6 ro ls | grep -q "src 2001:db8:104::1 "
+	log_test $? 0 "Prefsrc not removed when src address exists on other device"
+
+	$IP li del dummy1
+	$IP li del dummy2
+	$IP li del dummy3
+	cleanup
+}
 
 ipv4_route_v6_gw_test()
 {
@@ -2211,6 +2325,7 @@ do
 	ipv6_addr_metric)		ipv6_addr_metric_test;;
 	ipv4_addr_metric)		ipv4_addr_metric_test;;
 	ipv4_del_addr)			ipv4_del_addr_test;;
+	ipv6_del_addr)			ipv6_del_addr_test;;
 	ipv6_route_metrics)		ipv6_route_metrics_test;;
 	ipv4_route_metrics)		ipv4_route_metrics_test;;
 	ipv4_route_v6_gw)		ipv4_route_v6_gw_test;;
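
The last check in ipv6_del_addr_test greps for "src 2001:db8:104::1 " with a trailing space. Presumably this is because 2001:db8:104::1 is a textual prefix of 2001:db8:104::11 and 2001:db8:104::12, so a bare substring match would also hit routes that use those addresses. The sketch below illustrates the difference. Its sample lines are only shaped like `ip -6 route show` output and were not captured from a real run of fib_tests.sh.

```shell
#!/bin/sh
# Sample lines shaped like `ip -6 route show` output; they are illustrative
# and were not captured from a real run of fib_tests.sh.
routes='2001:db8:105::/64 via 2001:db8:104::2 dev dummy1 src 2001:db8:104::11 metric 1024 pref medium
2001:db8:106::/64 dev lo src 2001:db8:104::12 metric 1024 pref medium'

# Substring match: "::1" also matches the start of "::11" and "::12".
if printf '%s\n' "$routes" | grep -q "src 2001:db8:104::1"; then
	loose=match
else
	loose=nomatch
fi

# Trailing space, as in the selftest: only the exact address matches.
if printf '%s\n' "$routes" | grep -q "src 2001:db8:104::1 "; then
	strict=match
else
	strict=nomatch
fi

echo "without trailing space: $loose"
echo "with trailing space:    $strict"
```

With these sample lines the loose pattern reports a match even though no route uses 2001:db8:104::1 as its preferred source, while the pattern with the trailing space does not.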
After deleting an IPv6 address on an interface and cleaning up the
related preferred source entries, it is important to ensure that all
routes associated with the deleted address are properly cleared. The
current implementation of rt6_remove_prefsrc() only checks the preferred
source addresses bound to the current device. However, there may be
routes that are bound to other devices but still use the same preferred
source address.

To address this issue, it is necessary to also clear entries that are
bound to other interfaces but share the same source address with the
current device. Failing to do so would leave routes that are bound to
the deleted address uncleared. Here is an example reproducer (I have
omitted unrelated routes):

+ ip link add dummy1 type dummy
+ ip link add dummy2 type dummy
+ ip link set dummy1 up
+ ip link set dummy2 up
+ ip addr add 1:2:3:4::5/64 dev dummy1
+ ip route add 7:7:7:0::1 dev dummy1 src 1:2:3:4::5
+ ip route add 7:7:7:0::2 dev dummy2 src 1:2:3:4::5
+ ip -6 route show
1:2:3:4::/64 dev dummy1 proto kernel metric 256 pref medium
7:7:7::1 dev dummy1 src 1:2:3:4::5 metric 1024 pref medium
7:7:7::2 dev dummy2 src 1:2:3:4::5 metric 1024 pref medium
+ ip addr del 1:2:3:4::5/64 dev dummy1
+ ip -6 route show
7:7:7::1 dev dummy1 metric 1024 pref medium
7:7:7::2 dev dummy2 src 1:2:3:4::5 metric 1024 pref medium

Ido pointed out that commit 5a56a0b3a45d ("net: Don't delete routes in
different VRFs") exists to avoid affecting routes in different VRFs. To
fix all these issues, we will:

1. Remove the !rt->nh check so that IPv6 routes using a nexthop object
   are also cleared. This would be consistent with IPv4.
2. Remove the rt dev check and add a table ID check so that routes in
   different VRFs are not touched.
3. Add a check so that the src route is not cleared if the address still
   exists on another device (in the same VRF).

After fix:
+ ip addr del 1:2:3:4::5/64 dev dummy1
+ ip -6 route show
7:7:7::1 dev dummy1 metric 1024 pref medium
7:7:7::2 dev dummy2 metric 1024 pref medium

An ipv6_del_addr test is added to fib_tests.sh. Here is the result.

IPv6 delete address route tests
    Regular FIB info
    TEST: Prefsrc removed from VRF when source address deleted          [ OK ]
    TEST: Prefsrc in default VRF not removed                            [ OK ]
    TEST: Prefsrc removed in default VRF when source address deleted    [ OK ]
    TEST: Prefsrc in VRF is not removed by address delete               [ OK ]
    Identical FIB info with different table ID
    TEST: Prefsrc removed from VRF when source address deleted          [ OK ]
    TEST: Prefsrc in default VRF not removed                            [ OK ]
    TEST: Prefsrc removed in default VRF when source address deleted    [ OK ]
    TEST: Prefsrc in VRF is not removed by address delete               [ OK ]
    Table ID 0
    TEST: Prefsrc removed in default VRF when source address deleted    [ OK ]
    Identical address on different devices
    TEST: Prefsrc not removed when src address exists on other device   [ OK ]

Reported-by: Thomas Haller <thaller@redhat.com>
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2170513
Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
v5: Move the addr check back to fib6_remove_prefsrc.
v4: check if the prefsrc address still exists on other device
v3: remove rt nh checking. update the ipv6_del_addr test descriptions
v2: checking table id and update fib_test.sh
---
 net/ipv6/route.c                         |   9 +-
 tools/testing/selftests/net/fib_tests.sh | 117 ++++++++++++++++++++++-
 2 files changed, 121 insertions(+), 5 deletions(-)