Message ID | cover.1643981839.git.gnault@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | inet: Separate DSCP from ECN bits using new dscp_t type | expand |
On 2/4/22 5:58 AM, Guillaume Nault wrote: > The networking stack currently doesn't clearly distinguish between DSCP > and ECN bits. The entire DSCP+ECN bits are stored in u8 variables (or > structure fields), and each part of the stack handles them in their own > way, using different macros. This has created several bugs in the past > and some uncommon code paths are still unfixed. > > Such bugs generally manifest by selecting invalid routes because of ECN > bits interfering with FIB routes and rules lookups (more details in the > LPC 2021 talk[1] and in the RFC of this series[2]). > > This patch series aims at preventing the introduction of such bugs (and > detecting existing ones), by introducing a dscp_t type, representing > "sanitised" DSCP values (that is, with no ECN information), as opposed > to plain u8 values that contain both DSCP and ECN information. dscp_t > makes it clear for the reader what we're working on, and Sparse can > flag invalid interactions between dscp_t and plain u8. > > This series converts only a few variables and structures: > > * Patch 1 converts the tclass field of struct fib6_rule. It > effectively forbids the use of ECN bits in the tos/dsfield option > of ip -6 rule. Rules now match packets solely based on their DSCP > bits, so ECN doesn't influence the result any more. This contrasts > with the previous behaviour where all 8 bits of the Traffic Class > field were used. It is believed that this change is acceptable as > matching ECN bits wasn't usable for IPv4, so only IPv6-only > deployments could be depending on it. Also the previous behaviour > made DSCP-based ip6-rules fail for packets with both a DSCP and an > ECN mark, which is another reason why any such deploy is unlikely. > > * Patch 2 converts the tos field of struct fib4_rule. This one too > effectively forbids defining ECN bits, this time in ip -4 rule. > Before that, setting ECN bit 1 was accepted, while ECN bit 0 was > rejected. But even when accepted, the rule would never match, as > the packets would have their ECN bits cleared before doing the > rule lookup. > > * Patch 3 converts the fc_tos field of struct fib_config. This is > equivalent to patch 2, but for IPv4 routes. Routes using a > tos/dsfield option with any ECN bit set is now rejected. Before > this patch, they were accepted but, as with ip4 rules, these routes > couldn't match any packet, since their ECN bits are cleared before > the lookup. > > * Patch 4 converts the fa_tos field of struct fib_alias. This one is > pure internal u8 to dscp_t conversion. While patches 1-3 had user > facing consequences, this patch shouldn't have any side effect and > is there to give an overview of what future conversion patches will > look like. Conversions are quite mechanical, but imply some code > churn, which is the price for the extra clarity a possibility of > type checking. > > To summarise, all the behaviour changes required for the dscp_t type > approach to work should be contained in patches 1-3. These changes are > edge cases of ip-route and ip-rule that don't currently work properly. > So they should be safe. Also, a kernel selftest is added for each of > them. > seems like the right directions to me. Acked-by: David Ahern <dsahern@kernel.org>
Guillaume Nault <gnault@redhat.com> writes: > The networking stack currently doesn't clearly distinguish between DSCP > and ECN bits. The entire DSCP+ECN bits are stored in u8 variables (or > structure fields), and each part of the stack handles them in their own > way, using different macros. This has created several bugs in the past > and some uncommon code paths are still unfixed. > > Such bugs generally manifest by selecting invalid routes because of ECN > bits interfering with FIB routes and rules lookups (more details in the > LPC 2021 talk[1] and in the RFC of this series[2]). > > This patch series aims at preventing the introduction of such bugs (and > detecting existing ones), by introducing a dscp_t type, representing > "sanitised" DSCP values (that is, with no ECN information), as opposed > to plain u8 values that contain both DSCP and ECN information. dscp_t > makes it clear for the reader what we're working on, and Sparse can > flag invalid interactions between dscp_t and plain u8. > > This series converts only a few variables and structures: > > * Patch 1 converts the tclass field of struct fib6_rule. It > effectively forbids the use of ECN bits in the tos/dsfield option > of ip -6 rule. Rules now match packets solely based on their DSCP > bits, so ECN doesn't influence the result any more. This contrasts > with the previous behaviour where all 8 bits of the Traffic Class > field were used. It is believed that this change is acceptable as > matching ECN bits wasn't usable for IPv4, so only IPv6-only > deployments could be depending on it. Also the previous behaviour > made DSCP-based ip6-rules fail for packets with both a DSCP and an > ECN mark, which is another reason why any such deploy is unlikely. > > * Patch 2 converts the tos field of struct fib4_rule. This one too > effectively forbids defining ECN bits, this time in ip -4 rule. > Before that, setting ECN bit 1 was accepted, while ECN bit 0 was > rejected. But even when accepted, the rule would never match, as > the packets would have their ECN bits cleared before doing the > rule lookup. > > * Patch 3 converts the fc_tos field of struct fib_config. This is > equivalent to patch 2, but for IPv4 routes. Routes using a > tos/dsfield option with any ECN bit set is now rejected. Before > this patch, they were accepted but, as with ip4 rules, these routes > couldn't match any packet, since their ECN bits are cleared before > the lookup. > > * Patch 4 converts the fa_tos field of struct fib_alias. This one is > pure internal u8 to dscp_t conversion. While patches 1-3 had user > facing consequences, this patch shouldn't have any side effect and > is there to give an overview of what future conversion patches will > look like. Conversions are quite mechanical, but imply some code > churn, which is the price for the extra clarity a possibility of > type checking. > > To summarise, all the behaviour changes required for the dscp_t type > approach to work should be contained in patches 1-3. These changes are > edge cases of ip-route and ip-rule that don't currently work properly. > So they should be safe. Also, a kernel selftest is added for each of > them. > > Finally, this work also paves the way for allowing the usage of the 3 > high order DSCP bits in IPv4 (a few call paths already handle them, but > in general the stack clears them before IPv4 rule and route lookups). LGTM; thanks again for doing this! Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Hello: This series was applied to netdev/net-next.git (master) by Jakub Kicinski <kuba@kernel.org>: On Fri, 4 Feb 2022 14:58:09 +0100 you wrote: > The networking stack currently doesn't clearly distinguish between DSCP > and ECN bits. The entire DSCP+ECN bits are stored in u8 variables (or > structure fields), and each part of the stack handles them in their own > way, using different macros. This has created several bugs in the past > and some uncommon code paths are still unfixed. > > Such bugs generally manifest by selecting invalid routes because of ECN > bits interfering with FIB routes and rules lookups (more details in the > LPC 2021 talk[1] and in the RFC of this series[2]). > > [...] Here is the summary with links: - [net-next,1/4] ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules (no matching commit) - [net-next,2/4] ipv4: Stop taking ECN bits into account in fib4-rules https://git.kernel.org/netdev/net-next/c/563f8e97e054 - [net-next,3/4] ipv4: Reject routes specifying ECN bits in rtm_tos https://git.kernel.org/netdev/net-next/c/f55fbb6afb8d - [net-next,4/4] ipv4: Use dscp_t in struct fib_alias https://git.kernel.org/netdev/net-next/c/32ccf1107980 You are awesome, thank you!