Message ID | 20250103054413.31581-1-shiming.cheng@mediatek.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] ipv6: socket SO_BINDTODEVICE lookup routing fail without IPv6 rule. | expand |
Hi David Test cases will be provided later, below are the corresponding IP rule configurations for IPv4 and IPv6 that i provided, as well as the differences in ping results, the IPv4 result passed, but the IPv6 result failed, after adding this patch, the IPv6 result passed. Thanks bind out interface ccmni2 address: ccmni2 Link encap:UNSPEC inet addr:10.233.33.169 Mask:255.0.0.0 inet6 addr: fe80::1817:1da4:19a1:dc42/64 Scope: Link inet6 addr: 2409:8900:26f3:2541:1817:1da4:19a1:dc42/64 Scope: 1.There is no ip rule for ccmni2 and also pass. k6991v1_64:/ # ip rule list 0: from all lookup local 10000: from all fwmark 0xc0000/0xd0000 lookup legacy_system 11000: from all iif lo oif dummy0 uidrange 0-0 lookup dummy0 11000: from all iif lo oif ccmni1 uidrange 0-0 lookup ccmni1 16000: from all fwmark 0x10063/0x1ffff iif lo lookup local_network 16000: from all fwmark 0xd0071/0xdffff iif lo lookup ccmni1 17000: from all iif lo oif dummy0 lookup dummy0 17000: from all fwmark 0xc0000/0xc0000 iif lo oif ccmni1 lookup ccmni1 18000: from all fwmark 0x0/0x10000 lookup legacy_system 19000: from all fwmark 0x0/0x10000 lookup legacy_network 20000: from all fwmark 0x0/0x10000 lookup local_network 32000: from all unreachable k6991v1_64:/ # k6991v1_64:/ # k6991v1_64:/ # k6991v1_64:/ # ping -I ccmni2 8.8.8.8 PING 8.8.8.8 (8.8.8.8) from 10.233.33.169 ccmni2: 56(84) bytes of data. 64 bytes from 8.8.8.8: icmp_seq=1 ttl=50 time=179 ms 64 bytes from 8.8.8.8: icmp_seq=2 ttl=50 time=74.3 ms 64 bytes from 8.8.8.8: icmp_seq=3 ttl=50 time=72.7 ms 64 bytes from 8.8.8.8: icmp_seq=4 ttl=50 time=67.1 ms 64 bytes from 8.8.8.8: icmp_seq=5 ttl=50 time=63.3 ms 64 bytes from 8.8.8.8: icmp_seq=6 ttl=50 time=90.8 ms ^C --- 8.8.8.8 ping statistics --- 6 packets transmitted, 6 received, 0% packet loss, time 5017ms rtt min/avg/max/mdev = 63.323/91.253/179.082/40.213 ms 2.There is no ipv6 rule for ccmni2 and fail. k6991v1_64:/ # ip -6 rule list 0: from all lookup local 10000: from all fwmark 0xc0000/0xd0000 lookup legacy_system 11000: from all iif lo oif dummy0 uidrange 0-0 lookup dummy0 11000: from all iif lo oif ccmni1 uidrange 0-0 lookup ccmni1 16000: from all fwmark 0x10063/0x1ffff iif lo lookup local_network 16000: from all fwmark 0xd0071/0xdffff iif lo lookup ccmni1 17000: from all iif lo oif dummy0 lookup dummy0 17000: from all fwmark 0xc0000/0xc0000 iif lo oif ccmni1 lookup ccmni1 18000: from all fwmark 0x0/0x10000 lookup legacy_system 19000: from all fwmark 0x0/0x10000 lookup legacy_network 20000: from all fwmark 0x0/0x10000 lookup local_network 32000: from all unreachable k6991v1_64:/ # ping6 -I ccmni2 2001:4860:4860::8888 connect: Network is unreachable 3.After merging the submitted patch. There is no ipv6 rule for ccmni2 and pass. ccmni2 Link encap:UNSPEC inet addr:10.185.46.236 Mask:255.0.0.0 inet6 addr: 2409:8900:24d9:404e:1817:25c4:e1c7:4e52/64 Scope: k6897v1_64:/ # ip -6 rule list 0: from all lookup local 10000: from all fwmark 0xc0000/0xd0000 lookup legacy_system 11000: from all iif lo oif dummy0 uidrange 0-0 lookup dummy0 11000: from all iif lo oif ccmni1 uidrange 0-0 lookup ccmni1 16000: from all fwmark 0x10063/0x1ffff iif lo lookup local_network 16000: from all fwmark 0xd0070/0xdffff iif lo lookup ccmni1 17000: from all iif lo oif dummy0 lookup dummy0 17000: from all fwmark 0xc0000/0xc0000 iif lo oif ccmni1 lookup ccmni1 18000: from all fwmark 0x0/0x10000 lookup legacy_system 19000: from all fwmark 0x0/0x10000 lookup legacy_network 20000: from all fwmark 0x0/0x10000 lookup local_network 32000: from all unreachable k6897v1_64:/ # ping6 -I ccmni2 2001:4860:4860::8888 PING 2001:4860:4860::8888(2001:4860:4860::8888) from 2409:8900:24d9:404e:1817:25c4:e1c7:4e52 ccmni2: 56 data bytes 64 bytes from 2001:4860:4860::8888: icmp_seq=1 ttl=51 time=167 ms 64 bytes from 2001:4860:4860::8888: icmp_seq=2 ttl=51 time=73.9 ms 64 bytes from 2001:4860:4860::8888: icmp_seq=3 ttl=51 time=101 ms 64 bytes from 2001:4860:4860::8888: icmp_seq=4 ttl=51 time=62.3 ms ^C --- 2001:4860:4860::8888 ping statistics --- 5 packets transmitted, 4 received, 20% packet loss, time 4011ms On Fri, 2025-01-03 at 13:43 +0800, Shiming Cheng wrote: > When using socket IPv6 with SO_BINDTODEVICE, if IPv6 rule is not > matched, it will return ENETUNREACH. In fact, IPv4 does not behave > this way. IPv4 prioritizes looking up IP rules for routing and > forwarding, if not matched it will use socket-bound out interface > to send packets. The modification here is to make IPv6 behave the > same as IPv4. If IP rule is not found, it will also use socket-bound > out interface to send packts. > > Fixes: 6f21c96a78b8 ("ipv6: enforce flowi6_oif usage in > ip6_dst_lookup_tail()") > Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com> > --- > include/net/ip6_route.h | 2 ++ > net/ipv6/ip6_output.c | 7 ++++++- > net/ipv6/route.c | 34 ++++++++++++++++++++++++++++++++++ > 3 files changed, 42 insertions(+), 1 deletion(-) > > diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h > index 6dbdf60b342f..0625597def6f 100644 > --- a/include/net/ip6_route.h > +++ b/include/net/ip6_route.h > @@ -214,6 +214,8 @@ void rt6_multipath_rebalance(struct fib6_info > *f6i); > > void rt6_uncached_list_add(struct rt6_info *rt); > void rt6_uncached_list_del(struct rt6_info *rt); > +struct rt6_info *ip6_create_rt_oif_rcu(struct net *net, const struct > sock *sk, > + struct flowi6 *fl6, int flags); > > static inline const struct rt6_info *skb_rt6_info(const struct > sk_buff *skb) > { > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c > index f7b4608bb316..95728c8921cb 100644 > --- a/net/ipv6/ip6_output.c > +++ b/net/ipv6/ip6_output.c > @@ -1156,8 +1156,13 @@ static int ip6_dst_lookup_tail(struct net > *net, const struct sock *sk, > *dst = ip6_route_output_flags(net, sk, fl6, flags); > > err = (*dst)->error; > - if (err) > + if (err && (flags & RT6_LOOKUP_F_IFACE)) { > + *dst = (struct dst_entry *)ip6_create_rt_oif_rcu(net, > sk, fl6, flags); > + if (!*dst) > + goto out_err_release; > + } else if (err) { > goto out_err_release; > + } > > #ifdef CONFIG_IPV6_OPTIMISTIC_DAD > /* > diff --git a/net/ipv6/route.c b/net/ipv6/route.c > index 67ff16c04718..7d7450fab44f 100644 > --- a/net/ipv6/route.c > +++ b/net/ipv6/route.c > @@ -1214,6 +1214,40 @@ static struct rt6_info > *ip6_create_rt_rcu(const struct fib6_result *res) > return nrt; > } > > +struct rt6_info *ip6_create_rt_oif_rcu(struct net *net, const struct > sock *sk, > + struct flowi6 *fl6, int flags) > +{ > + struct rt6_info *rt; > + unsigned int prefs; > + int err; > + struct net_device *dev = dev_get_by_index_rcu(net, fl6- > >flowi6_oif); > + > + if (!dev) > + return NULL; > + rt = ip6_dst_alloc(dev_net(dev), dev, flags); > + > + if (!rt) > + return NULL; > + rt->dst.error = 0; > + rt->dst.output = ip6_output; > + rt->dst.lastuse = jiffies; > + prefs = sk ? inet6_sk(sk)->srcprefs : 0; > + err = ipv6_dev_get_saddr(net, dev, &fl6->daddr, prefs, &fl6- > >saddr); > + > + if (err) { > + dst_release(&rt->dst); > + return NULL; > + } > + rt->rt6i_dst.addr = fl6->daddr; > + rt->rt6i_dst.plen = 128; > + rt->rt6i_src.addr = fl6->saddr; > + rt->rt6i_dst.plen = 128; > + rt->rt6i_idev = in6_dev_get(dev); > + rt->rt6i_flags = flags; > + return rt; > +} > +EXPORT_SYMBOL_GPL(ip6_create_rt_oif_rcu); > + > INDIRECT_CALLABLE_SCOPE struct rt6_info *ip6_pol_route_lookup(struct > net *net, > struct fib6_table *table, > struct flowi6 *fl6,
On 1/3/25 3:27 AM, Shiming Cheng (成诗明) wrote: > Test cases will be provided later, below are the corresponding IP rule > configurations for IPv4 and IPv6 that i provided, as well as the > differences in ping results, the IPv4 result passed, but the IPv6 result > failed, after adding this patch, the IPv6 result passed. I do not want the output of a complicated stack of ip rules with a ping a command. Provide a simplistic set of commands that configure the stack and show what you believe is a problem. Anyone on this list should be able to quickly reproduce the setup to verify it is a problem and investigate what is happening.
On Fri, 3 Jan 2025 13:43:49 +0800 Shiming Cheng wrote: > When using socket IPv6 with SO_BINDTODEVICE, if IPv6 rule is not > matched, it will return ENETUNREACH. In fact, IPv4 does not behave > this way. IPv4 prioritizes looking up IP rules for routing and > forwarding, if not matched it will use socket-bound out interface > to send packets. The modification here is to make IPv6 behave the > same as IPv4. If IP rule is not found, it will also use socket-bound > out interface to send packts. CI shows failures in tools/testing/selftests/net/fcnal-test.sh and tools/testing/selftests/net/fib_nexthops.sh with this patch applied. Could be a flake but please double check before sending v3
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 6dbdf60b342f..0625597def6f 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -214,6 +214,8 @@ void rt6_multipath_rebalance(struct fib6_info *f6i); void rt6_uncached_list_add(struct rt6_info *rt); void rt6_uncached_list_del(struct rt6_info *rt); +struct rt6_info *ip6_create_rt_oif_rcu(struct net *net, const struct sock *sk, + struct flowi6 *fl6, int flags); static inline const struct rt6_info *skb_rt6_info(const struct sk_buff *skb) { diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index f7b4608bb316..95728c8921cb 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1156,8 +1156,13 @@ static int ip6_dst_lookup_tail(struct net *net, const struct sock *sk, *dst = ip6_route_output_flags(net, sk, fl6, flags); err = (*dst)->error; - if (err) + if (err && (flags & RT6_LOOKUP_F_IFACE)) { + *dst = (struct dst_entry *)ip6_create_rt_oif_rcu(net, sk, fl6, flags); + if (!*dst) + goto out_err_release; + } else if (err) { goto out_err_release; + } #ifdef CONFIG_IPV6_OPTIMISTIC_DAD /* diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 67ff16c04718..7d7450fab44f 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1214,6 +1214,40 @@ static struct rt6_info *ip6_create_rt_rcu(const struct fib6_result *res) return nrt; } +struct rt6_info *ip6_create_rt_oif_rcu(struct net *net, const struct sock *sk, + struct flowi6 *fl6, int flags) +{ + struct rt6_info *rt; + unsigned int prefs; + int err; + struct net_device *dev = dev_get_by_index_rcu(net, fl6->flowi6_oif); + + if (!dev) + return NULL; + rt = ip6_dst_alloc(dev_net(dev), dev, flags); + + if (!rt) + return NULL; + rt->dst.error = 0; + rt->dst.output = ip6_output; + rt->dst.lastuse = jiffies; + prefs = sk ? inet6_sk(sk)->srcprefs : 0; + err = ipv6_dev_get_saddr(net, dev, &fl6->daddr, prefs, &fl6->saddr); + + if (err) { + dst_release(&rt->dst); + return NULL; + } + rt->rt6i_dst.addr = fl6->daddr; + rt->rt6i_dst.plen = 128; + rt->rt6i_src.addr = fl6->saddr; + rt->rt6i_dst.plen = 128; + rt->rt6i_idev = in6_dev_get(dev); + rt->rt6i_flags = flags; + return rt; +} +EXPORT_SYMBOL_GPL(ip6_create_rt_oif_rcu); + INDIRECT_CALLABLE_SCOPE struct rt6_info *ip6_pol_route_lookup(struct net *net, struct fib6_table *table, struct flowi6 *fl6,
When using socket IPv6 with SO_BINDTODEVICE, if IPv6 rule is not matched, it will return ENETUNREACH. In fact, IPv4 does not behave this way. IPv4 prioritizes looking up IP rules for routing and forwarding, if not matched it will use socket-bound out interface to send packets. The modification here is to make IPv6 behave the same as IPv4. If IP rule is not found, it will also use socket-bound out interface to send packts. Fixes: 6f21c96a78b8 ("ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()") Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com> --- include/net/ip6_route.h | 2 ++ net/ipv6/ip6_output.c | 7 ++++++- net/ipv6/route.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 42 insertions(+), 1 deletion(-)