Message ID | 20240725131729.1729103-3-idosch@nvidia.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Preparations for FIB rule DSCP selector | expand |
On Thu, Jul 25, 2024 at 04:17:28PM +0300, Ido Schimmel wrote: > As part of its functionality, the nftables FIB expression module > performs a FIB lookup, but unlike other users of the FIB lookup API, it > does so without masking the upper DSCP bits. In particular, this differs > from the equivalent iptables match ("rpfilter") that does mask the upper > DSCP bits before the FIB lookup. > > Align the module to other users of the FIB lookup API and mask the upper > DSCP bits using IPTOS_RT_MASK before the lookup. If Florian and Pablo are okay with this change and the long term plan to allow full DSCP match, then I'm all for it. Reviewed-by: Guillaume Nault <gnault@redhat.com>
Ido Schimmel <idosch@nvidia.com> wrote: > @@ -110,7 +108,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, > if (priv->flags & NFTA_FIB_F_MARK) > fl4.flowi4_mark = pkt->skb->mark; > > - fl4.flowi4_tos = iph->tos & DSCP_BITS; > + fl4.flowi4_tos = iph->tos & IPTOS_RT_MASK; If this is supposed to get centralised, wouldn't it make more sense to not mask it, or will that happen later? I thought plan was to ditch RT_MASK...
On Fri, Jul 26, 2024 at 03:32:48PM +0200, Florian Westphal wrote: > Ido Schimmel <idosch@nvidia.com> wrote: > > @@ -110,7 +108,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, > > if (priv->flags & NFTA_FIB_F_MARK) > > fl4.flowi4_mark = pkt->skb->mark; > > > > - fl4.flowi4_tos = iph->tos & DSCP_BITS; > > + fl4.flowi4_tos = iph->tos & IPTOS_RT_MASK; > > If this is supposed to get centralised, wouldn't it > make more sense to not mask it, or will that happen later? I think Ido prefers to have this behaviour introduced in a dedicated patch, rather than as a side effect of the centralisation done in patch 3/3. Once patch 3/3 is applied, the next step would be to remove all those redundant masks (including this new nft_fib4_eval() one), so that fib4_rule_match(), fib_table_lookup() and fib_select_default() could see the full DSCP. This will allow the final step of allowing IPv4 routes and fib-rules to be configured for matching either the DSCP bits or only the old TOS ones. > I thought plan was to ditch RT_MASK... That was my preference too. But Ido's affraid of potential users who may depend on fib-rules with tos=0x1c matching packets with dsfield=0xfc. Centralising the mask would allow us to configure this behaviour upon route or fib-rule creation. Here's the original thread that lead to this RFC, if anyone wants more details on the current plan: https://lore.kernel.org/netdev/ZnwCWejSuOTqriJc@shredder.mtl.com/
Ido Schimmel <idosch@nvidia.com> wrote: > void nft_fib4_eval_type(const struct nft_expr *expr, struct nft_regs *regs, > const struct nft_pktinfo *pkt) > { > @@ -110,7 +108,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, > if (priv->flags & NFTA_FIB_F_MARK) > fl4.flowi4_mark = pkt->skb->mark; > > - fl4.flowi4_tos = iph->tos & DSCP_BITS; > + fl4.flowi4_tos = iph->tos & IPTOS_RT_MASK; I was confused because cover letter talks about allowing both tos or dscp depending on new nlattr for ipv4, but then this patch makes that impossible because dscp bits get masked. patch 3 says: ---- A prerequisite for allowing FIB rules to match on DSCP is to adjust all the call sites to initialize the high order DSCP bits and remove their masking along the path to the core where the field is matched on. ---- But nft_fib_ipv4.c already does that. So I would suggest to just drop this patch and then get rid of '& DSCP_BITS' once everything is in place. But feel free to handle this as you prefer.
On Sun, Jul 28, 2024 at 04:30:40AM +0200, Florian Westphal wrote: > Ido Schimmel <idosch@nvidia.com> wrote: > > void nft_fib4_eval_type(const struct nft_expr *expr, struct nft_regs *regs, > > const struct nft_pktinfo *pkt) > > { > > @@ -110,7 +108,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, > > if (priv->flags & NFTA_FIB_F_MARK) > > fl4.flowi4_mark = pkt->skb->mark; > > > > - fl4.flowi4_tos = iph->tos & DSCP_BITS; > > + fl4.flowi4_tos = iph->tos & IPTOS_RT_MASK; > > I was confused because cover letter talks about allowing both tos or dscp depending on > new nlattr for ipv4, but then this patch makes that impossible because dscp bits get masked. > > patch 3 says: > ---- > A prerequisite for allowing FIB rules to match on DSCP is to adjust all > the call sites to initialize the high order DSCP bits and remove their > masking along the path to the core where the field is matched on. > ---- > > But nft_fib_ipv4.c already does that. > > So I would suggest to just drop this patch and then get rid of '& > DSCP_BITS' once everything is in place. > > But feel free to handle this as you prefer. My preference is to first align all the users to mask the upper DSCP bits (patches #1-#2), then move the masking to the core (patch #3) and finally remove the masking from the various call sites (future patchsets). Will make it clearer in the cover letter. Thanks!
diff --git a/net/ipv4/netfilter/nft_fib_ipv4.c b/net/ipv4/netfilter/nft_fib_ipv4.c index 9eee535c64dd..df94bc28c3d7 100644 --- a/net/ipv4/netfilter/nft_fib_ipv4.c +++ b/net/ipv4/netfilter/nft_fib_ipv4.c @@ -22,8 +22,6 @@ static __be32 get_saddr(__be32 addr) return addr; } -#define DSCP_BITS 0xfc - void nft_fib4_eval_type(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) { @@ -110,7 +108,7 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs, if (priv->flags & NFTA_FIB_F_MARK) fl4.flowi4_mark = pkt->skb->mark; - fl4.flowi4_tos = iph->tos & DSCP_BITS; + fl4.flowi4_tos = iph->tos & IPTOS_RT_MASK; if (priv->flags & NFTA_FIB_F_DADDR) { fl4.daddr = iph->daddr;
As part of its functionality, the nftables FIB expression module performs a FIB lookup, but unlike other users of the FIB lookup API, it does so without masking the upper DSCP bits. In particular, this differs from the equivalent iptables match ("rpfilter") that does mask the upper DSCP bits before the FIB lookup. Align the module to other users of the FIB lookup API and mask the upper DSCP bits using IPTOS_RT_MASK before the lookup. No regressions in nft_fib.sh: # ./nft_fib.sh PASS: fib expression did not cause unwanted packet drops PASS: fib expression did drop packets for 1.1.1.1 PASS: fib expression did drop packets for 1c3::c01d PASS: fib expression forward check with policy based routing Signed-off-by: Ido Schimmel <idosch@nvidia.com> --- net/ipv4/netfilter/nft_fib_ipv4.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)