[net-next,0/3] ipv4: First steps toward removing RTO_ONLINK

Message ID cover.1650470610.git.gnault@redhat.com (mailing list archive)

Message

Guillaume Nault April 20, 2022, 11:21 p.m. UTC
RTO_ONLINK is a flag that restricts the scope of route lookups.
It's stored in a normally unused bit of the ->flowi4_tos field, in
struct flowi4. However, it has several problems:

 * This bit is also used by ECN. Although ECN bits are supposed to be
   cleared before doing a route lookup, some code paths have failed to
   properly sanitise their ->flowi4_tos. So this mechanism is fragile:
   we have had bugs where ECN bits slipped in and could end up being
   erroneously interpreted as RTO_ONLINK.

 * A dscp_t type was recently introduced to ensure ECN bits are cleared
   during route lookups. ->flowi4_tos is the most important structure
   field to convert, but RTO_ONLINK prevents such conversion, as dscp_t
   mandates that ECN bits (where RTO_ONLINK is stored) be zero.
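
To make the collision described in the first point concrete, here is a
minimal userspace sketch (not kernel code). The TOY_* constants and
toy_* functions are hypothetical names made up for this example; the
values 0x01 for the on-link flag and 0x03 for the ECN mask are assumed
to match the kernel's RTO_ONLINK and INET_ECN_MASK definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; values are assumed to mirror the kernel's. */
#define TOY_RTO_ONLINK 0x01u /* on-link flag, carried in the low tos bit */
#define TOY_ECN_MASK   0x03u /* the two low bits of the TOS byte are ECN */
#define TOY_ECN_ECT_1  0x01u /* ECT(1) codepoint */
#define TOY_ECN_ECT_0  0x02u /* ECT(0) codepoint */
#define TOY_ECN_CE     0x03u /* Congestion Experienced codepoint */

/* Returns nonzero when a tos value would be read as "on-link". */
static int toy_wants_onlink(uint8_t tos)
{
	return (tos & TOY_RTO_ONLINK) != 0;
}

/* What callers are expected to do before a lookup: drop the ECN bits. */
static uint8_t toy_sanitise_tos(uint8_t tos)
{
	return tos & (uint8_t)~TOY_ECN_MASK;
}

/* Small self-check showing the collision; call it from main(). */
static void toy_demo_collision(void)
{
	uint8_t tos = 0x10 | TOY_ECN_ECT_1; /* TOS bits plus a stray ECN bit */

	assert(toy_wants_onlink(tos));                    /* misread as on-link */
	assert(!toy_wants_onlink(toy_sanitise_tos(tos))); /* fine once cleared */
}
```

Any code path that forgets the toy_sanitise_tos() step hits the first
assertion's case: a packet marked ECT(1) or CE looks like an on-link
request, while ECT(0) happens not to collide.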

Therefore, we need to stop using RTO_ONLINK altogether. Fortunately,
RTO_ONLINK isn't a necessity. Instead of passing a flag in ->flowi4_tos
to tell the route lookup function to restrict the scope, we can simply
initialise the scope correctly.

Patch 1 does some preparatory work: it stops resetting ->flowi4_scope
automatically before a route lookup, thus allowing callers to set their
desired scope without having to rely on the RTO_ONLINK flag.
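
The effect of that preparatory change can be modelled in userspace as
follows. This is only a sketch inferred from the description above, not
the code from the patch: struct toy_flow, toy_fix_tos_old() and
toy_fix_tos_new() are hypothetical names, and the constant values
(0x1C for the routing TOS mask, 0 and 253 for universe and link scope)
are assumed to match the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; values are assumed to mirror the kernel's. */
#define TOY_RTO_ONLINK     0x01u
#define TOY_TOS_RT_MASK    0x1Cu /* TOS bits kept for the lookup */
#define TOY_SCOPE_UNIVERSE 0u
#define TOY_SCOPE_LINK     253u

/* Toy model of the two struct flowi4 fields discussed here. */
struct toy_flow {
	uint8_t tos;
	uint8_t scope;
};

/* Before: scope is always overwritten, so a caller-supplied scope is lost. */
static void toy_fix_tos_old(struct toy_flow *fl)
{
	uint8_t tos = fl->tos;

	fl->tos = tos & TOY_TOS_RT_MASK;
	fl->scope = (tos & TOY_RTO_ONLINK) ? TOY_SCOPE_LINK : TOY_SCOPE_UNIVERSE;
}

/* After: scope is only narrowed when the legacy flag is present. */
static void toy_fix_tos_new(struct toy_flow *fl)
{
	uint8_t tos = fl->tos;

	fl->tos = tos & TOY_TOS_RT_MASK;
	if (tos & TOY_RTO_ONLINK)
		fl->scope = TOY_SCOPE_LINK;
}
```

With the old behaviour, a caller that sets scope to TOY_SCOPE_LINK but
leaves the flag out of tos gets its scope reset to universe. With the
new behaviour that scope survives, while callers still passing the flag
keep working during the transition.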

Patches 2-3 convert a few code paths to avoid relying on RTO_ONLINK.

More conversions will have to take place before we can eventually
remove this flag.

Guillaume Nault (3):
  ipv4: Don't reset ->flowi4_scope in ip_rt_fix_tos().
  ipv4: Avoid using RTO_ONLINK with ip_route_connect().
  ipv4: Initialise ->flowi4_scope properly in ICMP handlers.

 include/net/route.h | 36 ++++++++++++++++++++++++------------
 net/dccp/ipv4.c     |  5 ++---
 net/ipv4/af_inet.c  |  6 +++---
 net/ipv4/datagram.c |  7 +++----
 net/ipv4/route.c    | 41 +++++++++++++++++++----------------------
 net/ipv4/tcp_ipv4.c |  5 ++---
 6 files changed, 53 insertions(+), 47 deletions(-)

Comments

David Ahern April 22, 2022, 3:10 a.m. UTC | #1
I believe the set looks ok. I think the fib test coverage in selftests
could use more tests to cover tos.

Guillaume Nault April 22, 2022, 11:02 a.m. UTC | #2
On Thu, Apr 21, 2022 at 09:10:21PM -0600, David Ahern wrote:
> I believe the set looks ok. I think the fib test coverage in selftests
> could use more tests to cover tos.

Yes, this is on my todo list. I also plan to review existing tests that
cover route lookups with link scope, and extend them if necessary.

Thanks for the review.

patchwork-bot+netdevbpf@kernel.org April 22, 2022, 12:50 p.m. UTC | #3
Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

Here is the summary with links:
  - [net-next,1/3] ipv4: Don't reset ->flowi4_scope in ip_rt_fix_tos().
    https://git.kernel.org/netdev/net-next/c/16a28267774c
  - [net-next,2/3] ipv4: Avoid using RTO_ONLINK with ip_route_connect().
    https://git.kernel.org/netdev/net-next/c/67e1e2f4854b
  - [net-next,3/3] ipv4: Initialise ->flowi4_scope properly in ICMP handlers.
    https://git.kernel.org/netdev/net-next/c/b1ad41384866

You are awesome, thank you!