[net-next,0/6] net: fib_rules: Add DSCP selector support

Message ID 20240911093748.3662015-1-idosch@nvidia.com (mailing list archive)

Ido Schimmel Sept. 11, 2024, 9:37 a.m. UTC
Currently, the kernel rejects IPv4 FIB rules that try to match on the
upper three DSCP bits:

 # ip -4 rule add tos 0x1c table 100
 # ip -4 rule add tos 0x3c table 100
 Error: Invalid tos.

The reason for that is that historically users of the FIB lookup API
only populated the lower three DSCP bits in the TOS field of the IPv4
flow key ('flowi4_tos'), which fits the TOS definition from the initial
IPv4 specification (RFC 791).

This is not very useful nowadays and instead some users want to be able
to match on the six-bit DSCP field, which replaced the TOS and IP
precedence fields over 25 years ago (RFC 2474). In addition, the current
behavior differs between IPv4 and IPv6, which does allow users to match
on the entire DSCP field using the TOS selector.
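
As a rough illustration of the bit layout described above, the following
Python sketch models why 0x1c is accepted and 0x3c is rejected. It is not
kernel code: the helper names are hypothetical, and the mask value is an
assumption that mirrors the kernel's IPTOS_TOS_MASK (0x1e).

```python
# Illustrative model only; names are hypothetical, not kernel symbols.
LEGACY_TOS_MASK = 0x1E  # assumption: mirrors IPTOS_TOS_MASK (legacy TOS bits)


def tos_to_dscp(tos: int) -> int:
    """Return the six-bit DSCP carried in the upper bits of the TOS byte."""
    return (tos & 0xFC) >> 2


def legacy_tos_accepted(tos: int) -> bool:
    """Model of the old IPv4 'tos' check: no bits outside the legacy mask."""
    return (tos & ~LEGACY_TOS_MASK & 0xFF) == 0


for tos in (0x1C, 0x3C):
    print(f"tos={tos:#04x} dscp={tos_to_dscp(tos):06b} "
          f"legacy_ok={legacy_tos_accepted(tos)}")
```

Under these assumptions, 0x1c only sets the lower three DSCP bits (000111),
while 0x3c also sets one of the upper three (001111), which is what the old
selector refuses.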

Recent patchsets made sure that callers of the FIB lookup API now
populate the entire DSCP field in the IPv4 flow key. Therefore, it is
now possible to extend FIB rules to match on DSCP.

This is done by adding a new DSCP attribute, implemented for both IPv4
and IPv6 so that user space programs get consistent behavior across
both address families.

The behavior of the old TOS selector is unchanged and IPv4 FIB rules
using it will only match on the lower three DSCP bits. The kernel will
reject rules that try to use both selectors.
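
The difference between the two selectors can be sketched with a small
Python model. This is only a sketch under stated assumptions, not the
kernel implementation: the Rule class, its field names, and rule_matches()
are hypothetical, and "no selector" is modelled as None.

```python
# Illustrative model only; field and function names are hypothetical.
from dataclasses import dataclass
from typing import Optional

LEGACY_DSCP_BITS = 0b000111  # the lower three DSCP bits the old selector sees


@dataclass
class Rule:
    tos: Optional[int] = None   # legacy selector, given as a TOS byte value
    dscp: Optional[int] = None  # new selector, given as a six-bit DSCP value

    def __post_init__(self) -> None:
        # Model of "the kernel will reject rules that use both selectors".
        if self.tos is not None and self.dscp is not None:
            raise ValueError("cannot use both TOS and DSCP selectors")


def rule_matches(rule: Rule, packet_tos: int) -> bool:
    """Return True if the rule's selector matches the packet's TOS byte."""
    packet_dscp = (packet_tos & 0xFC) >> 2
    if rule.dscp is not None:
        # New selector: compare all six DSCP bits.
        return rule.dscp == packet_dscp
    if rule.tos is not None:
        # Old selector: compare only the lower three DSCP bits.
        rule_dscp = (rule.tos & 0xFC) >> 2
        return (rule_dscp & LEGACY_DSCP_BITS) == (packet_dscp & LEGACY_DSCP_BITS)
    return True  # no selector configured: any DSCP matches
```

In this model a legacy rule with tos 0x1c still matches a packet whose TOS
byte is 0x3c, because only the lower three DSCP bits are compared, whereas
a rule with dscp 7 does not match that packet.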

Patch #1 adds the new DSCP attribute but rejects its usage.

Patches #2-#3 implement IPv4 and IPv6 support.

Patch #4 allows user space to use the new attribute.

Patches #5-#6 add selftests.

iproute2 changes can be found here [1].

[1] https://github.com/idosch/iproute2/tree/submit/dscp_rfc_v1

Ido Schimmel (6):
  net: fib_rules: Add DSCP selector attribute
  ipv4: fib_rules: Add DSCP selector support
  ipv6: fib_rules: Add DSCP selector support
  net: fib_rules: Enable DSCP selector usage
  selftests: fib_rule_tests: Add DSCP selector match tests
  selftests: fib_rule_tests: Add DSCP selector connect tests

 include/uapi/linux/fib_rules.h                |  1 +
 net/core/fib_rules.c                          |  4 +-
 net/ipv4/fib_rules.c                          | 54 ++++++++++-
 net/ipv6/fib6_rules.c                         | 43 ++++++++-
 tools/testing/selftests/net/fib_rule_tests.sh | 90 +++++++++++++++++++
 5 files changed, 184 insertions(+), 8 deletions(-)

Comments

Guillaume Nault Sept. 13, 2024, 1:08 p.m. UTC | #1
On Wed, Sep 11, 2024 at 12:37:42PM +0300, Ido Schimmel wrote:
> [...]
> 
> iproute2 changes can be found here [1].
> 
> [1] https://github.com/idosch/iproute2/tree/submit/dscp_rfc_v1

Any reason for always printing numbers in the json output of this
iproute2 RFC? Why can't json users just use the -N parameter?

I haven't checked all the /etc/iproute2/rt_* aliases, but the general
behaviour seems to print the human readable name for both json and
normal outputs, unless -N is given on the command line.

David Ahern Sept. 13, 2024, 2:31 p.m. UTC | #2
On 9/11/24 3:37 AM, Ido Schimmel wrote:
> [...]

For the set:
Reviewed-by: David Ahern <dsahern@kernel.org>
patchwork-bot+netdevbpf@kernel.org Sept. 14, 2024, 4:30 a.m. UTC | #3
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 11 Sep 2024 12:37:42 +0300 you wrote:
> Currently, the kernel rejects IPv4 FIB rules that try to match on the
> upper three DSCP bits:
> 
>  # ip -4 rule add tos 0x1c table 100
>  # ip -4 rule add tos 0x3c table 100
>  Error: Invalid tos.
> 
> [...]

Here is the summary with links:
  - [net-next,1/6] net: fib_rules: Add DSCP selector attribute
    https://git.kernel.org/netdev/net-next/c/c951a29f6ba5
  - [net-next,2/6] ipv4: fib_rules: Add DSCP selector support
    https://git.kernel.org/netdev/net-next/c/b9455fef8b1f
  - [net-next,3/6] ipv6: fib_rules: Add DSCP selector support
    https://git.kernel.org/netdev/net-next/c/2cf630034e4e
  - [net-next,4/6] net: fib_rules: Enable DSCP selector usage
    https://git.kernel.org/netdev/net-next/c/4b041d286e91
  - [net-next,5/6] selftests: fib_rule_tests: Add DSCP selector match tests
    https://git.kernel.org/netdev/net-next/c/ac6ad3f3b5b1
  - [net-next,6/6] selftests: fib_rule_tests: Add DSCP selector connect tests
    https://git.kernel.org/netdev/net-next/c/2bf1259a6ea1

You are awesome, thank you!
Ido Schimmel Sept. 30, 2024, 1:45 p.m. UTC | #4
Hi Guillaume,

Sorry for the delay. Was OOO / sick. Thanks for reviewing the patches.

On Fri, Sep 13, 2024 at 03:08:36PM +0200, Guillaume Nault wrote:
> On Wed, Sep 11, 2024 at 12:37:42PM +0300, Ido Schimmel wrote:
[...]
> > iproute2 changes can be found here [1].
> > 
> > [1] https://github.com/idosch/iproute2/tree/submit/dscp_rfc_v1
> 
> Any reason for always printing numbers in the json output of this
> iproute2 RFC? Why can't json users just use the -N parameter?

Because then the JSON output is always printed as a string. Example with
the old "tos" keyword:

# ip -6 rule add tos CS1 table 100
# ip -6 -j -p rule show tos CS1
[ {
        "priority": 32765,
        "src": "all",
        "tos": "CS1",
        "table": "100"
    } ]
# ip -6 -j -p -N rule show tos CS1
[ {
        "priority": 32765,
        "src": "all",
        "tos": "0x20",
        "table": "100"
    } ]

Plus, JSON output should be consumed by scripts and it doesn't make
sense to me to use symbolic names there.
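
To illustrate the point about script consumers, here is a minimal Python
sketch that parses the two JSON outputs shown above (condensed onto one
line each). The tos_as_int() helper is hypothetical; it only shows that
even with -N the value arrives as a string that the script has to convert
itself, and that the symbolic form cannot be converted without a name
table.

```python
import json

# Output copied from the two "ip -6 -j -p rule show" invocations above.
symbolic = '[ { "priority": 32765, "src": "all", "tos": "CS1", "table": "100" } ]'
numeric = '[ { "priority": 32765, "src": "all", "tos": "0x20", "table": "100" } ]'


def tos_as_int(rule: dict) -> int:
    """Hypothetical helper: parse a hex-string TOS value into an integer."""
    return int(rule["tos"], 16)


rule = json.loads(numeric)[0]
tos = tos_as_int(rule)
# JSON type of the field, the TOS byte, and the DSCP value derived from it.
print(type(rule["tos"]).__name__, tos, tos >> 2)  # prints: str 32 8
```

Calling tos_as_int() on the symbolic output raises ValueError, since "CS1"
is not a number.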

> I haven't checked all the /etc/iproute2/rt_* aliases, but the general
> behaviour seems to print the human readable name for both json and
> normal outputs, unless -N is given on the command line.

dcb is also always using numeric output for JSON:

# dcb app add dev swp1 dscp-prio CS1:0 CS2:1
# dcb -j -p app show dev swp1 dscp-prio
{
    "dscp_prio": [ [ 8,0," " ],[ 16,1," " ] ]
}
# dcb -j -p -N app show dev swp1 dscp-prio
{
    "dscp_prio": [ [ 8,0," " ],[ 16,1," " ] ]
}

So there is already inconsistency in iproute2. I chose the approach that
seemed correct to me. I don't think much thought went into always
printing strings in JSON output other than that it was easy to
implement.

David, what is your preference?
David Ahern Sept. 30, 2024, 6:18 p.m. UTC | #5
On 9/30/24 7:45 AM, Ido Schimmel wrote:
> So there is already inconsistency in iproute2. I chose the approach that
> seemed correct to me. I don't think much thought went into always
> printing strings in JSON output other than that it was easy to implement.

In general I agree with human strings unless -N is used.

While there might be inconsistencies across commands in iproute2
package, we can strive for consistency within a command such as ip.
Guillaume Nault Oct. 1, 2024, 8:08 p.m. UTC | #6
On Mon, Sep 30, 2024 at 04:45:19PM +0300, Ido Schimmel wrote:
> Hi Guillaume,
> 
> Sorry for the delay. Was OOO / sick. Thanks for reviewing the patches.
> 
> On Fri, Sep 13, 2024 at 03:08:36PM +0200, Guillaume Nault wrote:
> > On Wed, Sep 11, 2024 at 12:37:42PM +0300, Ido Schimmel wrote:
> [...]
> > > iproute2 changes can be found here [1].
> > > 
> > > [1] https://github.com/idosch/iproute2/tree/submit/dscp_rfc_v1
> > 
> > Any reason for always printing numbers in the json output of this
> > iproute2 RFC? Why can't json users just use the -N parameter?
> 
> Because then the JSON output is always printed as a string. Example with
> the old "tos" keyword:
> 
> # ip -6 rule add tos CS1 table 100
> # ip -6 -j -p rule show tos CS1
> [ {
>         "priority": 32765,
>         "src": "all",
>         "tos": "CS1",
>         "table": "100"
>     } ]
> # ip -6 -j -p -N rule show tos CS1
> [ {
>         "priority": 32765,
>         "src": "all",
>         "tos": "0x20",
>         "table": "100"
>     } ]
> 
> Plus, JSON output should be consumed by scripts and it doesn't make
> sense to me to use symbolic names there.

I guess that's a matter of taste then. I personally wouldn't try to
imagine what the scripts' expectations are, and I'd rather let them
explicitly tell what kind of output they want. I mean, I agree that
scripts would generally want to get numbers instead of symbolic names,
but I can't see why they would _always_ want that. By forcing a numeric
value, scripts have no possibility to report symbolic names, although
that could make sense if the output isn't processed further and just
displayed to the user.

But anyway, if you really prefer the numeric-only approach, I can live
with it :).

> > I haven't checked all the /etc/iproute2/rt_* aliases, but the general
> > behaviour seems to print the human readable name for both json and
> > normal outputs, unless -N is given on the command line.
> 
> dcb is also always using numeric output for JSON:
> 
> # dcb app add dev swp1 dscp-prio CS1:0 CS2:1
> # dcb -j -p app show dev swp1 dscp-prio
> {
>     "dscp_prio": [ [ 8,0," " ],[ 16,1," " ] ]
> }
> # dcb -j -p -N app show dev swp1 dscp-prio
> {
>     "dscp_prio": [ [ 8,0," " ],[ 16,1," " ] ]
> }
> 
> So there is already inconsistency in iproute2. I chose the approach that
> seemed correct to me. I don't think much thought went into always
> printing strings in JSON output other than that it was easy to
> implement.
> 
> David, what is your preference?
>