From patchwork Thu Mar 17 12:45:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Guillaume Nault X-Patchwork-Id: 12783985 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50FDCC433F5 for ; Thu, 17 Mar 2022 12:45:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233662AbiCQMqg (ORCPT ); Thu, 17 Mar 2022 08:46:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60004 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233660AbiCQMqb (ORCPT ); Thu, 17 Mar 2022 08:46:31 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 460B710507D for ; Thu, 17 Mar 2022 05:45:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1647521114; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=cyl9Y9Shs7l/iWslbHqnNb9Omf5IPpBek4AvRueMKPY=; b=f3uQGqKlgEy1DRcznLHHDmq0OCKwpWHl8VCLDAT970WSTF8Egvwl67lxsD9bv6Z8Tvm+PE 11LGR7wWUiukqXlgNNy/sTrNZPBe52DeAtVV7xtACmAz8QQKolScraidWqW7wv3RCipsey tS+MSrh5LkDgR5LE9xR0UOHhPsWCwfQ= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-620-cUZE7Al0MsiKvNgyoQOUsw-1; Thu, 17 Mar 2022 08:45:13 -0400 X-MC-Unique: cUZE7Al0MsiKvNgyoQOUsw-1 Received: by mail-wm1-f72.google.com with SMTP id q65-20020a1c4344000000b0038c7c65e120so394838wma.7 for ; Thu, 17 Mar 2022 05:45:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=cyl9Y9Shs7l/iWslbHqnNb9Omf5IPpBek4AvRueMKPY=; b=RUhO2RfjkcS1D9k8G4wEYpNRKxP/H9MOw1PnO2FOlhqNNaGDQOZ3TT0KTjgGH4X83s QZiXnXVYcXPIR9FuTu3maIkk19PqwchE+pLdd3uGF4zA4z5r+r9IIQnMn4Zd5NKwLndA GdDZAEX3SvWIo5GE4zwX8EI8muJPM8a68svL6wHWgmX4wUytS9BwoLbCQHQle5BqaECP KoJixrVyLeT7KkRWjqYrf7rLRUvSnaRzXPI9caFgLHPrKeUDesbtwmztAAjHADVgkieY zTYdz1KnIfeLBrnpuEt56NwHa/Rrs9Eg4zVYMsPrqoIGRRXGi63UA+4Ougw1jkvCov3Y zv+g== X-Gm-Message-State: AOAM532f/fldKneh5VoHy9GtOxdDXxWin8j612Y7TglY9TgENuswg8CQ gatYPI0LrdvTD9R968oQzIY06Sl45E1v4k2E0qeVnx/lcWDx9tOomTpbY7mr8kENxYaS1TZPzZu MIuq/rdYTfUZ0sk/f X-Received: by 2002:a05:6000:1d87:b0:203:e5c8:7cf3 with SMTP id bk7-20020a0560001d8700b00203e5c87cf3mr3659197wrb.704.1647521111756; Thu, 17 Mar 2022 05:45:11 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyJXVyeP6NJrPm38k6vTxaOflpgRklkNQqqypM22SvD73rr5I2fY0wDR+0nWHO9WXlVdQ9BDg== X-Received: by 2002:a05:6000:1d87:b0:203:e5c8:7cf3 with SMTP id bk7-20020a0560001d8700b00203e5c87cf3mr3659180wrb.704.1647521111487; Thu, 17 Mar 2022 05:45:11 -0700 (PDT) Received: from pc-4.home (2a01cb058918ce00dd1a5a4f9908f2d5.ipv6.abo.wanadoo.fr. [2a01:cb05:8918:ce00:dd1a:5a4f:9908:f2d5]) by smtp.gmail.com with ESMTPSA id 10-20020adf808a000000b001edd413a952sm3854392wrl.95.2022.03.17.05.45.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 17 Mar 2022 05:45:11 -0700 (PDT) Date: Thu, 17 Mar 2022 13:45:09 +0100 From: Guillaume Nault To: David Miller , Jakub Kicinski , Paolo Abeni Cc: netdev@vger.kernel.org, Hideaki YOSHIFUJI , David Ahern , Shuah Khan , linux-kselftest@vger.kernel.org Subject: [PATCH net v2 1/2] ipv4: Fix route lookups when handling ICMP redirects and PMTU updates Message-ID: <8cbc1e6f2319dd50d4289bec6604174484ca615c.1647519748.git.gnault@redhat.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org The PMTU update and ICMP redirect helper functions initialise their fl4 variable with either __build_flow_key() or build_sk_flow_key(). These initialisation functions always set ->flowi4_scope with RT_SCOPE_UNIVERSE and might set the ECN bits of ->flowi4_tos. This is not a problem when the route lookup is later done via ip_route_output_key_hash(), which properly clears the ECN bits from ->flowi4_tos and initialises ->flowi4_scope based on the RTO_ONLINK flag. However, some helpers call fib_lookup() directly, without sanitising the tos and scope fields, so the route lookup can fail and, as a result, the ICMP redirect or PMTU update aren't taken into account. Fix this by extracting the ->flowi4_tos and ->flowi4_scope sanitisation code into ip_rt_fix_tos(), then use this function in handlers that call fib_lookup() directly. Note 1: We can't sanitise ->flowi4_tos and ->flowi4_scope in a central place (like __build_flow_key() or flowi4_init_output()), because ip_route_output_key_hash() expects non-sanitised values. When called with sanitised values, it can erroneously overwrite RT_SCOPE_LINK with RT_SCOPE_UNIVERSE in ->flowi4_scope. Therefore we have to be careful to sanitise the values only for those paths that don't call ip_route_output_key_hash(). Note 2: The problem is mostly about sanitising ->flowi4_tos. Having ->flowi4_scope initialised with RT_SCOPE_UNIVERSE instead of RT_SCOPE_LINK probably wasn't really a problem: sockets with the SOCK_LOCALROUTE flag set (those that'd result in RTO_ONLINK being set) normally shouldn't receive ICMP redirects or PMTU updates. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: Guillaume Nault Reviewed-by: David Ahern --- net/ipv4/route.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index f33ad1f383b6..d5d058de3664 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -499,6 +499,15 @@ void __ip_select_ident(struct net *net, struct iphdr *iph, int segs) } EXPORT_SYMBOL(__ip_select_ident); +static void ip_rt_fix_tos(struct flowi4 *fl4) +{ + __u8 tos = RT_FL_TOS(fl4); + + fl4->flowi4_tos = tos & IPTOS_RT_MASK; + fl4->flowi4_scope = tos & RTO_ONLINK ? + RT_SCOPE_LINK : RT_SCOPE_UNIVERSE; +} + static void __build_flow_key(const struct net *net, struct flowi4 *fl4, const struct sock *sk, const struct iphdr *iph, @@ -824,6 +833,7 @@ static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buf rt = (struct rtable *) dst; __build_flow_key(net, &fl4, sk, iph, oif, tos, prot, mark, 0); + ip_rt_fix_tos(&fl4); __ip_do_redirect(rt, skb, &fl4, true); } @@ -1048,6 +1058,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk, struct flowi4 fl4; ip_rt_build_flow_key(&fl4, sk, skb); + ip_rt_fix_tos(&fl4); /* Don't make lookup fail for bridged encapsulations */ if (skb && netif_is_any_bridge_port(skb->dev)) @@ -1122,6 +1133,8 @@ void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu) goto out; new = true; + } else { + ip_rt_fix_tos(&fl4); } __ip_rt_update_pmtu((struct rtable *)xfrm_dst_path(&rt->dst), &fl4, mtu); @@ -2603,7 +2616,6 @@ static struct rtable *__mkroute_output(const struct fib_result *res, struct rtable *ip_route_output_key_hash(struct net *net, struct flowi4 *fl4, const struct sk_buff *skb) { - __u8 tos = RT_FL_TOS(fl4); struct fib_result res = { .type = RTN_UNSPEC, .fi = NULL, @@ -2613,9 +2625,7 @@ struct rtable *ip_route_output_key_hash(struct net *net, struct flowi4 *fl4, struct rtable *rth; fl4->flowi4_iif = LOOPBACK_IFINDEX; - fl4->flowi4_tos = tos & IPTOS_RT_MASK; - fl4->flowi4_scope = ((tos & RTO_ONLINK) ? - RT_SCOPE_LINK : RT_SCOPE_UNIVERSE); + ip_rt_fix_tos(fl4); rcu_read_lock(); rth = ip_route_output_key_hash_rcu(net, fl4, &res, skb); From patchwork Thu Mar 17 12:45:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Guillaume Nault X-Patchwork-Id: 12783986 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9AE3C433EF for ; Thu, 17 Mar 2022 12:45:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233665AbiCQMqh (ORCPT ); Thu, 17 Mar 2022 08:46:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60406 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233673AbiCQMqg (ORCPT ); Thu, 17 Mar 2022 08:46:36 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 42284106635 for ; Thu, 17 Mar 2022 05:45:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1647521116; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=88CAxKMWn5FORVv6LBMsjE9krM2M0gkaKC7IljIFvus=; b=BU7zGEfdNy3bEvd+74Vnu57wJr5KKOxIXALbAiaiCLEjW31aQcF0hIT/xLEfD+FFM2c9Tv ms5eXJdaFblRG5aRepHLCkLeR3SDWt4sI13oz+QninI1ELLw2KaoXLBTxDIClKrwbf3GXx UcySbnkei5OAmEX9Y2NgmfUH0Jw2F5k= Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-553-d8xZmqEqOIaA2IUsyBeuGg-1; Thu, 17 Mar 2022 08:45:15 -0400 X-MC-Unique: d8xZmqEqOIaA2IUsyBeuGg-1 Received: by mail-wm1-f70.google.com with SMTP id v67-20020a1cac46000000b00383e71bb26fso1601091wme.1 for ; Thu, 17 Mar 2022 05:45:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=88CAxKMWn5FORVv6LBMsjE9krM2M0gkaKC7IljIFvus=; b=tyPwtNL+G4sh9DJ3k/WDQaLkihfw+4vuqtc20en2Gi30Uqqa00fSjc2X5/PfWYhrN6 oXmczqe2LAnmt45aNXf8b9vYy7qqDb8cxAiYH5S78AXz0U+v5X0ZYTSkzs2iI4JCRkOF 5c9lvqC7NwXYEaaaAlBV8h9jxSxTsj6swfWQxC0XtG2E+jo38E/j1PFLpC54zf6tRrm2 ON98TSgICEVG6qrJjrsUWZhtzYQEAshBMOvmcK2FVuHxRGDRntZICsc9FsyJDToXJX/x A0WSqA4xEgxqG/qTAZ/xaABBeWI9putLm+OCwiA0wqHONsHCKvCPqviHDuQHIm64DAWA vrRg== X-Gm-Message-State: AOAM531OM0fUL3WcLqHEQV/H39WE4BVqkpdzMYd/ArP0A1ak3W4dMgdG M8suw6JyGuXYTPegYqxhpCHMJcejbIn/r6nV9IhPBRmoqzxNp7GzBnJLeZFj1/H5olYguWU3qFK /IJA/AOfTH3pFV7af X-Received: by 2002:a5d:4151:0:b0:203:d18a:7df4 with SMTP id c17-20020a5d4151000000b00203d18a7df4mr3837253wrq.1.1647521114072; Thu, 17 Mar 2022 05:45:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx/k3xeWC9ZAwjVi4GDj8PwpRoPPK1ra7rZRb30/NwCI7gnDhigvx1HNfoN6lYB5ozmD1ZdcQ== X-Received: by 2002:a5d:4151:0:b0:203:d18a:7df4 with SMTP id c17-20020a5d4151000000b00203d18a7df4mr3837230wrq.1.1647521113836; Thu, 17 Mar 2022 05:45:13 -0700 (PDT) Received: from pc-4.home (2a01cb058918ce00dd1a5a4f9908f2d5.ipv6.abo.wanadoo.fr. [2a01:cb05:8918:ce00:dd1a:5a4f:9908:f2d5]) by smtp.gmail.com with ESMTPSA id p12-20020a5d48cc000000b001e6114938a8sm3966966wrs.56.2022.03.17.05.45.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 17 Mar 2022 05:45:13 -0700 (PDT) Date: Thu, 17 Mar 2022 13:45:11 +0100 From: Guillaume Nault To: David Miller , Jakub Kicinski , Paolo Abeni Cc: netdev@vger.kernel.org, Hideaki YOSHIFUJI , David Ahern , Shuah Khan , linux-kselftest@vger.kernel.org Subject: [PATCH net v2 2/2] selftest: net: Test IPv4 PMTU exceptions with DSCP and ECN Message-ID: <6f3853ab347422044d71f394bb991548d30992d3.1647519748.git.gnault@redhat.com> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Add two tests to pmtu.sh, for verifying that PMTU exceptions get properly created for routes that don't belong to the main table. A fib-rule based on the packet's DSCP field is used to jump to the correct table. ECN shouldn't interfere with this process, so each test has two components: one that only sets DSCP and one that sets both DSCP and ECN. One of the test triggers PMTU exceptions using ICMP Echo Requests, the other using UDP packets (to test different handlers in the kernel). A few adjustments are necessary in the rest of the script to allow policy routing scenarios: * Add global variable rt_table that allows setup_routing_*() to add routes to a specific routing table. By default rt_table is set to "main", so existing tests don't need to be modified. * Another global variable, policy_mark, is used to define which dsfield value is used for policy routing. This variable has no effect on tests that don't use policy routing. * The UDP version of the test uses socat. So cleanup() now also need to kill socat PIDs. * route_get_dst_pmtu_from_exception() and route_get_dst_exception() now take an optional third argument specifying the dsfield. If not specified, 0 is used, so existing users don't need to be modified. Signed-off-by: Guillaume Nault Reviewed-by: David Ahern --- tools/testing/selftests/net/pmtu.sh | 141 +++++++++++++++++++++++++++- 1 file changed, 137 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/net/pmtu.sh b/tools/testing/selftests/net/pmtu.sh index 694732e4b344..736e358dc549 100755 --- a/tools/testing/selftests/net/pmtu.sh +++ b/tools/testing/selftests/net/pmtu.sh @@ -26,6 +26,15 @@ # - pmtu_ipv6 # Same as pmtu_ipv4, except for locked PMTU tests, using IPv6 # +# - pmtu_ipv4_dscp_icmp_exception +# Set up the same network topology as pmtu_ipv4, but use non-default +# routing table in A. A fib-rule is used to jump to this routing table +# based on DSCP. Send ICMPv4 packets with the expected DSCP value and +# verify that ECN doesn't interfere with the creation of PMTU exceptions. +# +# - pmtu_ipv4_dscp_udp_exception +# Same as pmtu_ipv4_dscp_icmp_exception, but use UDP instead of ICMP. +# # - pmtu_ipv4_vxlan4_exception # Set up the same network topology as pmtu_ipv4, create a VXLAN tunnel # over IPv4 between A and B, routed via R1. On the link between R1 and B, @@ -203,6 +212,8 @@ which ping6 > /dev/null 2>&1 && ping6=$(which ping6) || ping6=$(which ping) tests=" pmtu_ipv4_exception ipv4: PMTU exceptions 1 pmtu_ipv6_exception ipv6: PMTU exceptions 1 + pmtu_ipv4_dscp_icmp_exception ICMPv4 with DSCP and ECN: PMTU exceptions 1 + pmtu_ipv4_dscp_udp_exception UDPv4 with DSCP and ECN: PMTU exceptions 1 pmtu_ipv4_vxlan4_exception IPv4 over vxlan4: PMTU exceptions 1 pmtu_ipv6_vxlan4_exception IPv6 over vxlan4: PMTU exceptions 1 pmtu_ipv4_vxlan6_exception IPv4 over vxlan6: PMTU exceptions 1 @@ -323,6 +334,9 @@ routes_nh=" B 6 default 61 " +policy_mark=0x04 +rt_table=main + veth4_a_addr="192.168.1.1" veth4_b_addr="192.168.1.2" veth4_c_addr="192.168.2.10" @@ -346,6 +360,7 @@ dummy6_mask="64" err_buf= tcpdump_pids= nettest_pids= +socat_pids= err() { err_buf="${err_buf}${1} @@ -723,7 +738,7 @@ setup_routing_old() { ns_name="$(nsname ${ns})" - ip -n ${ns_name} route add ${addr} via ${gw} + ip -n "${ns_name}" route add "${addr}" table "${rt_table}" via "${gw}" ns=""; addr=""; gw="" done @@ -753,7 +768,7 @@ setup_routing_new() { ns_name="$(nsname ${ns})" - ip -n ${ns_name} -${fam} route add ${addr} nhid ${nhid} + ip -n "${ns_name}" -"${fam}" route add "${addr}" table "${rt_table}" nhid "${nhid}" ns=""; fam=""; addr=""; nhid="" done @@ -798,6 +813,24 @@ setup_routing() { return 0 } +setup_policy_routing() { + setup_routing + + ip -netns "${NS_A}" -4 rule add dsfield "${policy_mark}" \ + table "${rt_table}" + + # Set the IPv4 Don't Fragment bit with tc, since socat doesn't seem to + # have an option do to it. + tc -netns "${NS_A}" qdisc replace dev veth_A-R1 root prio + tc -netns "${NS_A}" qdisc replace dev veth_A-R2 root prio + tc -netns "${NS_A}" filter add dev veth_A-R1 \ + protocol ipv4 flower ip_proto udp \ + action pedit ex munge ip df set 0x40 pipe csum ip and udp + tc -netns "${NS_A}" filter add dev veth_A-R2 \ + protocol ipv4 flower ip_proto udp \ + action pedit ex munge ip df set 0x40 pipe csum ip and udp +} + setup_bridge() { run_cmd ${ns_a} ip link add br0 type bridge || return $ksft_skip run_cmd ${ns_a} ip link set br0 up @@ -903,6 +936,11 @@ cleanup() { done nettest_pids= + for pid in ${socat_pids}; do + kill "${pid}" + done + socat_pids= + for n in ${NS_A} ${NS_B} ${NS_C} ${NS_R1} ${NS_R2}; do ip netns del ${n} 2> /dev/null done @@ -950,15 +988,21 @@ link_get_mtu() { route_get_dst_exception() { ns_cmd="${1}" dst="${2}" + dsfield="${3}" - ${ns_cmd} ip route get "${dst}" + if [ -z "${dsfield}" ]; then + dsfield=0 + fi + + ${ns_cmd} ip route get "${dst}" dsfield "${dsfield}" } route_get_dst_pmtu_from_exception() { ns_cmd="${1}" dst="${2}" + dsfield="${3}" - mtu_parse "$(route_get_dst_exception "${ns_cmd}" ${dst})" + mtu_parse "$(route_get_dst_exception "${ns_cmd}" "${dst}" "${dsfield}")" } check_pmtu_value() { @@ -1068,6 +1112,95 @@ test_pmtu_ipv6_exception() { test_pmtu_ipvX 6 } +test_pmtu_ipv4_dscp_icmp_exception() { + rt_table=100 + + setup namespaces policy_routing || return $ksft_skip + trace "${ns_a}" veth_A-R1 "${ns_r1}" veth_R1-A \ + "${ns_r1}" veth_R1-B "${ns_b}" veth_B-R1 \ + "${ns_a}" veth_A-R2 "${ns_r2}" veth_R2-A \ + "${ns_r2}" veth_R2-B "${ns_b}" veth_B-R2 + + # Set up initial MTU values + mtu "${ns_a}" veth_A-R1 2000 + mtu "${ns_r1}" veth_R1-A 2000 + mtu "${ns_r1}" veth_R1-B 1400 + mtu "${ns_b}" veth_B-R1 1400 + + mtu "${ns_a}" veth_A-R2 2000 + mtu "${ns_r2}" veth_R2-A 2000 + mtu "${ns_r2}" veth_R2-B 1500 + mtu "${ns_b}" veth_B-R2 1500 + + len=$((2000 - 20 - 8)) # Fills MTU of veth_A-R1 + + dst1="${prefix4}.${b_r1}.1" + dst2="${prefix4}.${b_r2}.1" + + # Create route exceptions + dsfield=${policy_mark} # No ECN bit set (Not-ECT) + run_cmd "${ns_a}" ping -q -M want -Q "${dsfield}" -c 1 -w 1 -s "${len}" "${dst1}" + + dsfield=$(printf "%#x" $((policy_mark + 0x02))) # ECN=2 (ECT(0)) + run_cmd "${ns_a}" ping -q -M want -Q "${dsfield}" -c 1 -w 1 -s "${len}" "${dst2}" + + # Check that exceptions have been created with the correct PMTU + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 + + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 +} + +test_pmtu_ipv4_dscp_udp_exception() { + rt_table=100 + + if ! which socat > /dev/null 2>&1; then + echo "'socat' command not found; skipping tests" + return $ksft_skip + fi + + setup namespaces policy_routing || return $ksft_skip + trace "${ns_a}" veth_A-R1 "${ns_r1}" veth_R1-A \ + "${ns_r1}" veth_R1-B "${ns_b}" veth_B-R1 \ + "${ns_a}" veth_A-R2 "${ns_r2}" veth_R2-A \ + "${ns_r2}" veth_R2-B "${ns_b}" veth_B-R2 + + # Set up initial MTU values + mtu "${ns_a}" veth_A-R1 2000 + mtu "${ns_r1}" veth_R1-A 2000 + mtu "${ns_r1}" veth_R1-B 1400 + mtu "${ns_b}" veth_B-R1 1400 + + mtu "${ns_a}" veth_A-R2 2000 + mtu "${ns_r2}" veth_R2-A 2000 + mtu "${ns_r2}" veth_R2-B 1500 + mtu "${ns_b}" veth_B-R2 1500 + + len=$((2000 - 20 - 8)) # Fills MTU of veth_A-R1 + + dst1="${prefix4}.${b_r1}.1" + dst2="${prefix4}.${b_r2}.1" + + # Create route exceptions + run_cmd_bg "${ns_b}" socat UDP-LISTEN:50000 OPEN:/dev/null,wronly=1 + socat_pids="${socat_pids} $!" + + dsfield=${policy_mark} # No ECN bit set (Not-ECT) + run_cmd "${ns_a}" socat OPEN:/dev/zero,rdonly=1,readbytes="${len}" \ + UDP:"${dst1}":50000,tos="${dsfield}" + + dsfield=$(printf "%#x" $((policy_mark + 0x02))) # ECN=2 (ECT(0)) + run_cmd "${ns_a}" socat OPEN:/dev/zero,rdonly=1,readbytes="${len}" \ + UDP:"${dst2}":50000,tos="${dsfield}" + + # Check that exceptions have been created with the correct PMTU + pmtu_1="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst1}" "${policy_mark}")" + check_pmtu_value "1400" "${pmtu_1}" "exceeding MTU" || return 1 + pmtu_2="$(route_get_dst_pmtu_from_exception "${ns_a}" "${dst2}" "${policy_mark}")" + check_pmtu_value "1500" "${pmtu_2}" "exceeding MTU" || return 1 +} + test_pmtu_ipvX_over_vxlanY_or_geneveY_exception() { type=${1} family=${2}