From patchwork Sun May 2 16:22:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235289 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92347C433B4 for ; Sun, 2 May 2021 16:24:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 65BF86134F for ; Sun, 2 May 2021 16:24:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232166AbhEBQZg (ORCPT ); Sun, 2 May 2021 12:25:36 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:34943 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZf (ORCPT ); Sun, 2 May 2021 12:25:35 -0400 Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.nyi.internal (Postfix) with ESMTP id 903215C011D; Sun, 2 May 2021 12:24:43 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Sun, 02 May 2021 12:24:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=A3sH6QRmOFJww5oDye1PHFbjTLC1JchZ/MTPAnasEVE=; b=bRjsFvbA 8bS1vuijg6JhD/IYnvwDuYhX1pg2ogTZLaK3Z2lVYV4vEduFsQl658DiiLDTnoYi MLgQijBbVExbl9reZqmlHwck0/ZHVTLyAYZr66vNGczBPcO0lKp6ZhsPm45f5E4r E3l5yqlDDV/pGqqCjNvX0NXWgQhGOYFYiLA8Ouin5IjWK5HSAO81BSxbp6ZIEj2p He+UfsgKgTYJHqNSpYNpW2ZUaNQpvXFazojFuCBd54REmEmbS+Bg0hs3aPfRVxEq 6GE0E4SX6A43bZXrRety7HMs2fHdoo+4f9lVqZldsF9sH/4380j4tn08zzNzF57V OU0N9RPBH818tg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:41 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 01/10] ipv4: Calculate multipath hash inside switch statement Date: Sun, 2 May 2021 19:22:48 +0300 Message-Id: <20210502162257.3472453-2-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel A subsequent patch will add another multipath hash policy where the multipath hash is calculated directly by the policy specific code and not outside of the switch statement. Prepare for this change by moving the multipath hash calculation inside the switch statement. No functional changes intended. Signed-off-by: Ido Schimmel --- net/ipv4/route.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index f6787c55f6ab..9d61e969446e 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1912,7 +1912,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, { u32 multipath_hash = fl4 ? fl4->flowi4_multipath_hash : 0; struct flow_keys hash_keys; - u32 mhash; + u32 mhash = 0; switch (net->ipv4.sysctl_fib_multipath_hash_policy) { case 0: @@ -1924,6 +1924,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, hash_keys.addrs.v4addrs.src = fl4->saddr; hash_keys.addrs.v4addrs.dst = fl4->daddr; } + mhash = flow_hash_from_keys(&hash_keys); break; case 1: /* skb is currently provided only when forwarding */ @@ -1957,6 +1958,7 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, hash_keys.ports.dst = fl4->fl4_dport; hash_keys.basic.ip_proto = fl4->flowi4_proto; } + mhash = flow_hash_from_keys(&hash_keys); break; case 2: memset(&hash_keys, 0, sizeof(hash_keys)); @@ -1987,9 +1989,9 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, hash_keys.addrs.v4addrs.src = fl4->saddr; hash_keys.addrs.v4addrs.dst = fl4->daddr; } + mhash = flow_hash_from_keys(&hash_keys); break; } - mhash = flow_hash_from_keys(&hash_keys); if (multipath_hash) mhash = jhash_2words(mhash, multipath_hash, 0); From patchwork Sun May 2 16:22:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235291 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D605FC43460 for ; Sun, 2 May 2021 16:24:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A5FC36134F for ; Sun, 2 May 2021 16:24:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232230AbhEBQZi (ORCPT ); Sun, 2 May 2021 12:25:38 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:42179 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZh (ORCPT ); Sun, 2 May 2021 12:25:37 -0400 Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id 074DA5C0117; Sun, 2 May 2021 12:24:46 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute1.internal (MEProxy); Sun, 02 May 2021 12:24:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=ZWV2q+ggA1lFN/AxHMibF/3tMOAN0JjkmhcxXLjxAh0=; b=Il+z+EBz 67Q3tgpmanGFHANCxbwy81X1XoSQIo8vgIbBGT6PHn4BpDKS1MoEvNnd+3rhxH3A OcThaee6nylbvYergzTvlBnONhS2HV3zJ2OPdgIYXIHJzLOJhge70QGqP+02zdyS 8hqClFfgszuH7+8mde8FqgBA7th3cl0TwVgFBKD0cidjJ83ZKxHrE1gTFF4avPz2 EHnlPuCd7qf7ULov5jwwMBsKS78CVtc9EJUp3/Cl2tKJIgIvQDPOD6KsSOzqXICX 6z/eOSP7aTOwLF5L6udxs8JuymtGxsRil8p2T1BA2G0IDh4JGx5nj/vIIE6aOKvn me6LOllvwMAhAw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:43 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 02/10] ipv4: Add a sysctl to control multipath hash fields Date: Sun, 2 May 2021 19:22:49 +0300 Message-Id: <20210502162257.3472453-3-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel A subsequent patch will add a new multipath hash policy where the packet fields used for multipath hash calculation are determined by user space. This patch adds a sysctl that allows user space to set these fields. The packet fields are represented using a bitmap and are common between IPv4 and IPv6 to allow user space to use the same numbering across both protocols. For example, to hash based on standard 5-tuple: # sysctl -w net.ipv4.fib_multipath_hash_fields=0-2,4-5 net.ipv4.fib_multipath_hash_fields = 0-2,4-5 More fields can be added in the future, if needed. The 'need_outer' and 'need_inner' variables are set in the control path to indicate whether dissection of the outer or inner flow is needed. They will be used by a subsequent patch to allow the data path to avoid dissection of the outer or inner flow when not needed. Signed-off-by: Ido Schimmel --- Documentation/networking/ip-sysctl.rst | 29 ++++++++++++++++ include/net/ip_fib.h | 46 ++++++++++++++++++++++++++ include/net/netns/ipv4.h | 4 +++ net/ipv4/fib_frontend.c | 24 ++++++++++++++ net/ipv4/sysctl_net_ipv4.c | 32 ++++++++++++++++++ 5 files changed, 135 insertions(+) diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index c2ecc9894fd0..8ab61f4edf02 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -100,6 +100,35 @@ fib_multipath_hash_policy - INTEGER - 1 - Layer 4 - 2 - Layer 3 or inner Layer 3 if present +fib_multipath_hash_fields - list of comma separated ranges + When fib_multipath_hash_policy is set to 3 (custom multipath hash), the + fields used for multipath hash calculation are determined by this + sysctl. + + The format used for both input and output is a comma separated list of + ranges (e.g., "0-2" for source IP, destination IP and IP protocol). + Writing to the file will clear all previous ranges and update the + current list with the input. + + Possible fields are: + + == ============================ + 0 Source IP address + 1 Destination IP address + 2 IP protocol + 3 Unused + 4 Source port + 5 Destination port + 6 Inner source IP address + 7 Inner destination IP address + 8 Inner IP protocol + 9 Inner Flow Label + 10 Inner source port + 11 Inner destination port + == ============================ + + Default: 0-2 (source IP, destination IP and IP protocol) + fib_sync_mem - UNSIGNED INTEGER Amount of dirty memory from fib entries that can be backlogged before synchronize_rcu is forced. diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index a914f33f3ed5..d70a4c524bef 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -466,6 +466,52 @@ int fib_sync_up(struct net_device *dev, unsigned char nh_flags); void fib_sync_mtu(struct net_device *dev, u32 orig_mtu); void fib_nhc_update_mtu(struct fib_nh_common *nhc, u32 new, u32 orig); +/* Fields used for sysctl_fib_multipath_hash_fields. + * Common to IPv4 and IPv6. + */ +enum { + FIB_MULTIPATH_HASH_FIELD_SRC_IP, + FIB_MULTIPATH_HASH_FIELD_DST_IP, + FIB_MULTIPATH_HASH_FIELD_IP_PROTO, + FIB_MULTIPATH_HASH_FIELD_FLOWLABEL, + FIB_MULTIPATH_HASH_FIELD_SRC_PORT, + FIB_MULTIPATH_HASH_FIELD_DST_PORT, + FIB_MULTIPATH_HASH_FIELD_INNER_SRC_IP, + FIB_MULTIPATH_HASH_FIELD_INNER_DST_IP, + FIB_MULTIPATH_HASH_FIELD_INNER_IP_PROTO, + FIB_MULTIPATH_HASH_FIELD_INNER_FLOWLABEL, + FIB_MULTIPATH_HASH_FIELD_INNER_SRC_PORT, + FIB_MULTIPATH_HASH_FIELD_INNER_DST_PORT, + + /* Add new fields above. This is user API. */ + __FIB_MULTIPATH_HASH_FIELD_CNT, +}; + +#define FIB_MULTIPATH_HASH_TEST_FIELD(_field, _hash_fields) \ + test_bit(FIB_MULTIPATH_HASH_FIELD_##_field, _hash_fields) + +static inline bool +fib_multipath_hash_need_outer(const unsigned long *hash_fields) +{ + return FIB_MULTIPATH_HASH_TEST_FIELD(SRC_IP, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(DST_IP, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(IP_PROTO, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(FLOWLABEL, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(SRC_PORT, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(DST_PORT, hash_fields); +} + +static inline bool +fib_multipath_hash_need_inner(const unsigned long *hash_fields) +{ + return FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_IP, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_IP, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(INNER_IP_PROTO, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(INNER_FLOWLABEL, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_PORT, hash_fields) || + FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_PORT, hash_fields); +} + #ifdef CONFIG_IP_ROUTE_MULTIPATH int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, const struct sk_buff *skb, struct flow_keys *flkeys); diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index f6af8d96d3c6..d0fcd968be44 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -210,6 +210,10 @@ struct netns_ipv4 { #endif #endif #ifdef CONFIG_IP_ROUTE_MULTIPATH + unsigned long *sysctl_fib_multipath_hash_fields; + u8 fib_multipath_hash_fields_need_outer:1, + fib_multipath_hash_fields_need_inner:1, + unused:6; u8 sysctl_fib_multipath_use_neigh; u8 sysctl_fib_multipath_hash_policy; #endif diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 84bb707bd88d..f685e84b03b6 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -1516,6 +1516,23 @@ static int __net_init ip_fib_net_init(struct net *net) if (err) return err; +#ifdef CONFIG_IP_ROUTE_MULTIPATH + net->ipv4.sysctl_fib_multipath_hash_fields = + bitmap_zalloc(__FIB_MULTIPATH_HASH_FIELD_CNT, GFP_KERNEL); + if (!net->ipv4.sysctl_fib_multipath_hash_fields) + goto err_hash_fields_alloc; + + /* Default to 3-tuple */ + set_bit(FIB_MULTIPATH_HASH_FIELD_SRC_IP, + net->ipv4.sysctl_fib_multipath_hash_fields); + set_bit(FIB_MULTIPATH_HASH_FIELD_DST_IP, + net->ipv4.sysctl_fib_multipath_hash_fields); + set_bit(FIB_MULTIPATH_HASH_FIELD_IP_PROTO, + net->ipv4.sysctl_fib_multipath_hash_fields); + net->ipv4.fib_multipath_hash_fields_need_outer = 1; + net->ipv4.fib_multipath_hash_fields_need_inner = 0; +#endif + /* Avoid false sharing : Use at least a full cache line */ size = max_t(size_t, size, L1_CACHE_BYTES); @@ -1533,6 +1550,10 @@ static int __net_init ip_fib_net_init(struct net *net) err_rules_init: kfree(net->ipv4.fib_table_hash); err_table_hash_alloc: +#ifdef CONFIG_IP_ROUTE_MULTIPATH + bitmap_free(net->ipv4.sysctl_fib_multipath_hash_fields); +err_hash_fields_alloc: +#endif fib4_notifier_exit(net); return err; } @@ -1568,6 +1589,9 @@ static void ip_fib_net_exit(struct net *net) #endif rtnl_unlock(); kfree(net->ipv4.fib_table_hash); +#ifdef CONFIG_IP_ROUTE_MULTIPATH + bitmap_free(net->ipv4.sysctl_fib_multipath_hash_fields); +#endif fib4_notifier_exit(net); } diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index a62934b9f15a..0db7e68c38cd 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include @@ -461,6 +462,30 @@ static int proc_fib_multipath_hash_policy(struct ctl_table *table, int write, return ret; } + +static int proc_fib_multipath_hash_fields(struct ctl_table *table, int write, + void *buffer, size_t *lenp, + loff_t *ppos) +{ + unsigned long *hash_fields; + struct net *net; + int ret; + + net = container_of(table->data, struct net, + ipv4.sysctl_fib_multipath_hash_fields); + ret = proc_do_large_bitmap(table, write, buffer, lenp, ppos); + if (!write || ret) + goto out; + + hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields; + net->ipv4.fib_multipath_hash_fields_need_outer = + fib_multipath_hash_need_outer(hash_fields); + net->ipv4.fib_multipath_hash_fields_need_inner = + fib_multipath_hash_need_inner(hash_fields); + +out: + return ret; +} #endif static struct ctl_table ipv4_table[] = { @@ -1052,6 +1077,13 @@ static struct ctl_table ipv4_net_table[] = { .extra1 = SYSCTL_ZERO, .extra2 = &two, }, + { + .procname = "fib_multipath_hash_fields", + .data = &init_net.ipv4.sysctl_fib_multipath_hash_fields, + .maxlen = __FIB_MULTIPATH_HASH_FIELD_CNT, + .mode = 0644, + .proc_handler = proc_fib_multipath_hash_fields, + }, #endif { .procname = "ip_unprivileged_port_start", From patchwork Sun May 2 16:22:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235293 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3FE2AC433B4 for ; Sun, 2 May 2021 16:24:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 29D1C611B0 for ; Sun, 2 May 2021 16:24:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232301AbhEBQZk (ORCPT ); Sun, 2 May 2021 12:25:40 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:51983 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZk (ORCPT ); Sun, 2 May 2021 12:25:40 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 1BDDC5C0117; Sun, 2 May 2021 12:24:48 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sun, 02 May 2021 12:24:48 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=+yGDr1o/AdqfdazCVVHpxRhJv7r9IzDpMacLAbNiGM0=; b=ES/TV+Ii MxYuVIlkDEEHQPrXqhYWquawjQndVlDJSuJE4Iz3r03gR5W13HagM2ANX9cPhzAJ 9ZlrqNVidT1xaPOlO7LLjl2+is3PdzaAphRtXHPQwG1Bk7JM6jUOd5APG/5y/tHA YBsYyHrNCB8fSGZO0fpXYNTkOk8F95g9MLjhEsig8hliX+hMbyMKLLLhe7n5UL10 Z/nBcvvt9MeQSsgBYgI+7XIHdkl6gL59/E0t6zEZ/IXsG72RuPLF0XGLoFT682wn z+s/q1yNEfApnc5s2ZZrxIusG/UBuM87K8K7izxSkXzhwru7OJ32dmt7agBuJOLS hu7nEPmFM7OVXw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:46 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 03/10] ipv4: Add custom multipath hash policy Date: Sun, 2 May 2021 19:22:50 +0300 Message-Id: <20210502162257.3472453-4-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel Add a new multipath hash policy where the packet fields used for hash calculation are determined by user space via the fib_multipath_hash_fields sysctl that was introduced in the previous patch. The current set of available packet fields includes both outer and inner fields, which requires two invocations of the flow dissector. Avoid unnecessary dissection of the outer or inner flows by skipping dissection if the 'need_outer' or 'need_inner' variables are not set. These fields are set / cleared as part of the control path when the fib_multipath_hash_fields sysctl is configured. In accordance with the existing policies, when an skb is not available, packet fields are extracted from the provided flow key. In which case, only outer fields are considered. Signed-off-by: Ido Schimmel --- Documentation/networking/ip-sysctl.rst | 2 + net/ipv4/route.c | 121 +++++++++++++++++++++++++ net/ipv4/sysctl_net_ipv4.c | 3 +- 3 files changed, 125 insertions(+), 1 deletion(-) diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 8ab61f4edf02..549601494694 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -99,6 +99,8 @@ fib_multipath_hash_policy - INTEGER - 0 - Layer 3 - 1 - Layer 4 - 2 - Layer 3 or inner Layer 3 if present + - 3 - Custom multipath hash. Fields used for multipath hash calculation + are determined by fib_multipath_hash_fields sysctl fib_multipath_hash_fields - list of comma separated ranges When fib_multipath_hash_policy is set to 3 (custom multipath hash), the diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 9d61e969446e..995799a6e06f 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1906,6 +1906,121 @@ static void ip_multipath_l3_keys(const struct sk_buff *skb, hash_keys->addrs.v4addrs.dst = key_iph->daddr; } +static u32 fib_multipath_custom_hash_outer(const struct net *net, + const struct sk_buff *skb, + bool *p_has_inner) +{ + unsigned long *hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields; + struct flow_keys keys, hash_keys; + + if (!net->ipv4.fib_multipath_hash_fields_need_outer) + return 0; + + memset(&hash_keys, 0, sizeof(hash_keys)); + skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_ENCAP); + + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_IP, hash_fields)) + hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_IP, hash_fields)) + hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst; + if (FIB_MULTIPATH_HASH_TEST_FIELD(IP_PROTO, hash_fields)) + hash_keys.basic.ip_proto = keys.basic.ip_proto; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_PORT, hash_fields)) + hash_keys.ports.src = keys.ports.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_PORT, hash_fields)) + hash_keys.ports.dst = keys.ports.dst; + + *p_has_inner = !!(keys.control.flags & FLOW_DIS_ENCAPSULATION); + return flow_hash_from_keys(&hash_keys); +} + +static u32 fib_multipath_custom_hash_inner(const struct net *net, + const struct sk_buff *skb, + bool has_inner) +{ + unsigned long *hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields; + struct flow_keys keys, hash_keys; + + /* We assume the packet carries an encapsulation, but if none was + * encountered during dissection of the outer flow, then there is no + * point in calling the flow dissector again. + */ + if (!has_inner) + return 0; + + if (!net->ipv4.fib_multipath_hash_fields_need_inner) + return 0; + + memset(&hash_keys, 0, sizeof(hash_keys)); + skb_flow_dissect_flow_keys(skb, &keys, 0); + + if (!(keys.control.flags & FLOW_DIS_ENCAPSULATION)) + return 0; + + if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) { + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_IP, hash_fields)) + hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_IP, hash_fields)) + hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst; + } else if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) { + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_IP, hash_fields)) + hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_IP, hash_fields)) + hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_FLOWLABEL, hash_fields)) + hash_keys.tags.flow_label = keys.tags.flow_label; + } + + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_IP_PROTO, hash_fields)) + hash_keys.basic.ip_proto = keys.basic.ip_proto; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_PORT, hash_fields)) + hash_keys.ports.src = keys.ports.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_PORT, hash_fields)) + hash_keys.ports.dst = keys.ports.dst; + + return flow_hash_from_keys(&hash_keys); +} + +static u32 fib_multipath_custom_hash_skb(const struct net *net, + const struct sk_buff *skb) +{ + u32 mhash, mhash_inner; + bool has_inner = true; + + mhash = fib_multipath_custom_hash_outer(net, skb, &has_inner); + mhash_inner = fib_multipath_custom_hash_inner(net, skb, has_inner); + + return jhash_2words(mhash, mhash_inner, 0); +} + +static u32 fib_multipath_custom_hash_fl4(const struct net *net, + const struct flowi4 *fl4) +{ + unsigned long *hash_fields = net->ipv4.sysctl_fib_multipath_hash_fields; + struct flow_keys hash_keys; + + if (!net->ipv4.fib_multipath_hash_fields_need_outer) + return 0; + + memset(&hash_keys, 0, sizeof(hash_keys)); + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_IP, hash_fields)) + hash_keys.addrs.v4addrs.src = fl4->saddr; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_IP, hash_fields)) + hash_keys.addrs.v4addrs.dst = fl4->daddr; + if (FIB_MULTIPATH_HASH_TEST_FIELD(IP_PROTO, hash_fields)) + hash_keys.basic.ip_proto = fl4->flowi4_proto; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_PORT, hash_fields)) + hash_keys.ports.src = fl4->fl4_sport; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_PORT, hash_fields)) + hash_keys.ports.dst = fl4->fl4_dport; + + return flow_hash_from_keys(&hash_keys); +} + /* if skb is set it will be used and fl4 can be NULL */ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, const struct sk_buff *skb, struct flow_keys *flkeys) @@ -1991,6 +2106,12 @@ int fib_multipath_hash(const struct net *net, const struct flowi4 *fl4, } mhash = flow_hash_from_keys(&hash_keys); break; + case 3: + if (skb) + mhash = fib_multipath_custom_hash_skb(net, skb); + else + mhash = fib_multipath_custom_hash_fl4(net, fl4); + break; } if (multipath_hash) diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 0db7e68c38cd..988defcd8ae3 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -30,6 +30,7 @@ #include static int two = 2; +static int three __maybe_unused = 3; static int four = 4; static int thousand = 1000; static int tcp_retr1_max = 255; @@ -1075,7 +1076,7 @@ static struct ctl_table ipv4_net_table[] = { .mode = 0644, .proc_handler = proc_fib_multipath_hash_policy, .extra1 = SYSCTL_ZERO, - .extra2 = &two, + .extra2 = &three, }, { .procname = "fib_multipath_hash_fields", From patchwork Sun May 2 16:22:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235295 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A6FAC43460 for ; Sun, 2 May 2021 16:24:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5638B611B0 for ; Sun, 2 May 2021 16:24:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232331AbhEBQZn (ORCPT ); Sun, 2 May 2021 12:25:43 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:60007 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZm (ORCPT ); Sun, 2 May 2021 12:25:42 -0400 Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id D33CF5C011D; Sun, 2 May 2021 12:24:50 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Sun, 02 May 2021 12:24:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=ITr0G8uhtUCwuwYSwWJ1AQ4TdX9N04JeQxH9SYTVO2w=; b=OlN4c5vN Qc4V5Oi1SdSPGtomuC5+ZWURsQppCe9P+RSmKWIfvlNuxjvTSdkbeqJOINcbewUP zg1tSo8rVGgqxNNQtyTSd17upN0nhJhqT3rRFEp30QQ5SCkax/uimu2uUN4U3feh Rk9JyyskPJGEVoyuYUWCVn5PLdfRmcSU3Bibf6CxTXByF8VjxdfqDDZwt1jsdQSo HPv0DLY0ELvVVkC4XSpl+ldvESdu8Qrbrw6srnO8WHpUij7QTIUB+LrkN4oCqhA5 Yt1uaKh2n+SKnrL0DqN6XlWeGfbBx/iDJdIDqDzJrG0EKhC1OoHq19w09ZksA3IO 8fwajdoAyc1QJg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:48 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 04/10] ipv6: Use a more suitable label name Date: Sun, 2 May 2021 19:22:51 +0300 Message-Id: <20210502162257.3472453-5-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel The 'out_timer' label was added in commit 63152fc0de4d ("[NETNS][IPV6] ip6_fib - gc timer per namespace") when the timer was allocated on the heap. Commit 417f28bb3407 ("netns: dont alloc ipv6 fib timer list") removed the allocation, but kept the label name. Rename it to a more suitable name. Signed-off-by: Ido Schimmel --- net/ipv6/ip6_fib.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 679699e953f1..33d2d6a4e28c 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -2362,7 +2362,7 @@ static int __net_init fib6_net_init(struct net *net) net->ipv6.rt6_stats = kzalloc(sizeof(*net->ipv6.rt6_stats), GFP_KERNEL); if (!net->ipv6.rt6_stats) - goto out_timer; + goto out_notifier; /* Avoid false sharing : Use at least a full cache line */ size = max_t(size_t, size, L1_CACHE_BYTES); @@ -2407,7 +2407,7 @@ static int __net_init fib6_net_init(struct net *net) kfree(net->ipv6.fib_table_hash); out_rt6_stats: kfree(net->ipv6.rt6_stats); -out_timer: +out_notifier: fib6_notifier_exit(net); return -ENOMEM; } From patchwork Sun May 2 16:22:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235297 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1A1BC433B4 for ; Sun, 2 May 2021 16:24:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8D45C61352 for ; Sun, 2 May 2021 16:24:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232360AbhEBQZq (ORCPT ); Sun, 2 May 2021 12:25:46 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:51159 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZp (ORCPT ); Sun, 2 May 2021 12:25:45 -0400 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id AA8AF5C0135; Sun, 2 May 2021 12:24:53 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Sun, 02 May 2021 12:24:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=5EzNwX5cPveEgm3tihSJJobXk46uSK4/sZLomhcpN1s=; b=BUwHq+LN VTMZNrz2si4i+uv1iFC8n1GgYVaduZ6+m0JnxunhMI1QsdO6d9oskDQp40gWv7sE 2MPSok1OBxBakfsP9jO2mgtvT1TkLO7vGxSFH/EwubZ7Er0uFq28KExN6FJpnO/V 7GCMiGmekBo8m4kt2ARYiRK43rt5wLrp1eIQMTLVqQz+sfen8THu6mtloLonlPv1 bDAMquHw/xe9ItHCDXOFFv6+rk+nmTjeWSPlNw/e1hT1H7atJpmGie6Dm25Bqab8 MiIu4FwBfocuPAW4W9FV7CmDthRGOrbL3GtAeaijeXWaVbGBkVp6cwkvebRFUqyL SHg6RcVTdpC9WQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:51 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 05/10] ipv6: Calculate multipath hash inside switch statement Date: Sun, 2 May 2021 19:22:52 +0300 Message-Id: <20210502162257.3472453-6-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel A subsequent patch will add another multipath hash policy where the multipath hash is calculated directly by the policy specific code and not outside of the switch statement. Prepare for this change by moving the multipath hash calculation inside the switch statement. No functional changes intended. Signed-off-by: Ido Schimmel --- net/ipv6/route.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index a22822bdbf39..9935e18146e5 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2331,7 +2331,7 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6, const struct sk_buff *skb, struct flow_keys *flkeys) { struct flow_keys hash_keys; - u32 mhash; + u32 mhash = 0; switch (ip6_multipath_hash_policy(net)) { case 0: @@ -2345,6 +2345,7 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6, hash_keys.tags.flow_label = (__force u32)flowi6_get_flowlabel(fl6); hash_keys.basic.ip_proto = fl6->flowi6_proto; } + mhash = flow_hash_from_keys(&hash_keys); break; case 1: if (skb) { @@ -2376,6 +2377,7 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6, hash_keys.ports.dst = fl6->fl6_dport; hash_keys.basic.ip_proto = fl6->flowi6_proto; } + mhash = flow_hash_from_keys(&hash_keys); break; case 2: memset(&hash_keys, 0, sizeof(hash_keys)); @@ -2412,9 +2414,9 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6, hash_keys.tags.flow_label = (__force u32)flowi6_get_flowlabel(fl6); hash_keys.basic.ip_proto = fl6->flowi6_proto; } + mhash = flow_hash_from_keys(&hash_keys); break; } - mhash = flow_hash_from_keys(&hash_keys); return mhash >> 1; } From patchwork Sun May 2 16:22:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235299 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E4C3C433ED for ; Sun, 2 May 2021 16:24:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2DDD8611B0 for ; Sun, 2 May 2021 16:24:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232358AbhEBQZs (ORCPT ); Sun, 2 May 2021 12:25:48 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:52945 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZs (ORCPT ); Sun, 2 May 2021 12:25:48 -0400 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id 518115C012B; Sun, 2 May 2021 12:24:56 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Sun, 02 May 2021 12:24:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=/oxXd2ccNvebZFYyMu6Dmty9S3Uu+511EGTvu/wp9Ng=; b=c2T+2y0w E6cDurwXPafzXfcEBAnoELbepwvsqbtKERNIls3Tsx7Zznw1yV5/hJ4ttHxtoFvE R14D8jIrvWOOwLk5HCZ3iNLgL4Gt2sqWtWCnpP/a2RGi6aiQeP9rZqwAojsd7r5U uNC3eDGOj/B9mEkDZuTYehXpZZuwvpSZFRnAI1z+haKrkln4fL1YTkF0mc/N2TV1 ViXYTNVx2CLVi0gaJEjUNXbCJYHiCVJ3nKPIwjTx/0fZGUiSUx7NQBIXd6voG5k3 rN5om+zw0/w9/ra6bPEXECF1K4Hmyq3bRfJKi6dRDyyU6qIJZ2Rs+9PsWFaHimb/ L+9tdWqO46QwUA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedunecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:53 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 06/10] ipv6: Add a sysctl to control multipath hash fields Date: Sun, 2 May 2021 19:22:53 +0300 Message-Id: <20210502162257.3472453-7-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel A subsequent patch will add a new multipath hash policy where the packet fields used for multipath hash calculation are determined by user space. This patch adds a sysctl that allows user space to set these fields. The packet fields are represented using a bitmap and are common between IPv4 and IPv6 to allow user space to use the same numbering across both protocols. For example, to hash based on standard 5-tuple: # sysctl -w net.ipv6.fib_multipath_hash_fields=0-2,4-5 net.ipv6.fib_multipath_hash_fields = 0-2,4-5 More fields can be added in the future, if needed. The 'need_outer' and 'need_inner' variables are set in the control path to indicate whether dissection of the outer or inner flow is needed. They will be used by a subsequent patch to allow the data path to avoid dissection of the outer or inner flow when not needed. To avoid introducing holes in 'struct netns_sysctl_ipv6', move the 'bindv6only' field after the multipath hash fields. Signed-off-by: Ido Schimmel --- Documentation/networking/ip-sysctl.rst | 29 +++++++++++++++++++++++ include/net/ipv6.h | 8 +++++++ include/net/netns/ipv6.h | 6 ++++- net/ipv6/ip6_fib.c | 21 ++++++++++++++++- net/ipv6/sysctl_net_ipv6.c | 32 ++++++++++++++++++++++++++ 5 files changed, 94 insertions(+), 2 deletions(-) diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 549601494694..5289336227b3 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1775,6 +1775,35 @@ fib_multipath_hash_policy - INTEGER - 1 - Layer 4 (standard 5-tuple) - 2 - Layer 3 or inner Layer 3 if present +fib_multipath_hash_fields - list of comma separated ranges + When fib_multipath_hash_policy is set to 3 (custom multipath hash), the + fields used for multipath hash calculation are determined by this + sysctl. + + The format used for both input and output is a comma separated list of + ranges (e.g., "0-2" for source IP, destination IP and IP protocol). + Writing to the file will clear all previous ranges and update the + current list with the input. + + Possible fields are: + + == ============================ + 0 Source IP address + 1 Destination IP address + 2 IP protocol + 3 Flow Label + 4 Source port + 5 Destination port + 6 Inner source IP address + 7 Inner destination IP address + 8 Inner IP protocol + 9 Inner Flow Label + 10 Inner source port + 11 Inner destination port + == ============================ + + Default: 0-2 (source IP, destination IP and IP protocol) + anycast_src_echo_reply - BOOLEAN Controls the use of anycast addresses as source addresses for ICMPv6 echo reply diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 448bf2b34759..61cce23f628e 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -926,11 +926,19 @@ static inline int ip6_multipath_hash_policy(const struct net *net) { return net->ipv6.sysctl.multipath_hash_policy; } +static inline unsigned long *ip6_multipath_hash_fields(const struct net *net) +{ + return net->ipv6.sysctl.multipath_hash_fields; +} #else static inline int ip6_multipath_hash_policy(const struct net *net) { return 0; } +static inline unsigned long *ip6_multipath_hash_fields(const struct net *net) +{ + return NULL; +} #endif /* diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h index 6153c8067009..81f78e07e2d5 100644 --- a/include/net/netns/ipv6.h +++ b/include/net/netns/ipv6.h @@ -28,8 +28,12 @@ struct netns_sysctl_ipv6 { int ip6_rt_gc_elasticity; int ip6_rt_mtu_expires; int ip6_rt_min_advmss; - u8 bindv6only; + unsigned long *multipath_hash_fields; + u8 multipath_hash_fields_need_outer:1, + multipath_hash_fields_need_inner:1, + unused:6; u8 multipath_hash_policy; + u8 bindv6only; u8 flowlabel_consistency; u8 auto_flowlabels; int icmpv6_time; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 33d2d6a4e28c..3aa6b6532343 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -32,6 +32,7 @@ #include #include +#include #include #include @@ -2355,6 +2356,21 @@ static int __net_init fib6_net_init(struct net *net) if (err) return err; + net->ipv6.sysctl.multipath_hash_fields = + bitmap_zalloc(__FIB_MULTIPATH_HASH_FIELD_CNT, GFP_KERNEL); + if (!net->ipv6.sysctl.multipath_hash_fields) + goto out_notifier; + + /* Default to 3-tuple */ + set_bit(FIB_MULTIPATH_HASH_FIELD_SRC_IP, + net->ipv6.sysctl.multipath_hash_fields); + set_bit(FIB_MULTIPATH_HASH_FIELD_DST_IP, + net->ipv6.sysctl.multipath_hash_fields); + set_bit(FIB_MULTIPATH_HASH_FIELD_IP_PROTO, + net->ipv6.sysctl.multipath_hash_fields); + net->ipv6.sysctl.multipath_hash_fields_need_outer = 1; + net->ipv6.sysctl.multipath_hash_fields_need_inner = 0; + spin_lock_init(&net->ipv6.fib6_gc_lock); rwlock_init(&net->ipv6.fib6_walker_lock); INIT_LIST_HEAD(&net->ipv6.fib6_walkers); @@ -2362,7 +2378,7 @@ static int __net_init fib6_net_init(struct net *net) net->ipv6.rt6_stats = kzalloc(sizeof(*net->ipv6.rt6_stats), GFP_KERNEL); if (!net->ipv6.rt6_stats) - goto out_notifier; + goto out_hash_fields; /* Avoid false sharing : Use at least a full cache line */ size = max_t(size_t, size, L1_CACHE_BYTES); @@ -2407,6 +2423,8 @@ static int __net_init fib6_net_init(struct net *net) kfree(net->ipv6.fib_table_hash); out_rt6_stats: kfree(net->ipv6.rt6_stats); +out_hash_fields: + bitmap_free(net->ipv6.sysctl.multipath_hash_fields); out_notifier: fib6_notifier_exit(net); return -ENOMEM; @@ -2431,6 +2449,7 @@ static void fib6_net_exit(struct net *net) kfree(net->ipv6.fib_table_hash); kfree(net->ipv6.rt6_stats); + bitmap_free(net->ipv6.sysctl.multipath_hash_fields); fib6_notifier_exit(net); } diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c index 27102c3d6e1d..8d94a1d621d0 100644 --- a/net/ipv6/sysctl_net_ipv6.c +++ b/net/ipv6/sysctl_net_ipv6.c @@ -17,6 +17,7 @@ #include #include #include +#include #ifdef CONFIG_NETLABEL #include #endif @@ -40,6 +41,30 @@ static int proc_rt6_multipath_hash_policy(struct ctl_table *table, int write, return ret; } +static int +proc_rt6_multipath_hash_fields(struct ctl_table *table, int write, void *buffer, + size_t *lenp, loff_t *ppos) +{ + unsigned long *hash_fields; + struct net *net; + int ret; + + net = container_of(table->data, struct net, + ipv6.sysctl.multipath_hash_fields); + ret = proc_do_large_bitmap(table, write, buffer, lenp, ppos); + if (!write || ret) + goto out; + + hash_fields = net->ipv6.sysctl.multipath_hash_fields; + net->ipv6.sysctl.multipath_hash_fields_need_outer = + fib_multipath_hash_need_outer(hash_fields); + net->ipv6.sysctl.multipath_hash_fields_need_inner = + fib_multipath_hash_need_inner(hash_fields); + +out: + return ret; +} + static struct ctl_table ipv6_table_template[] = { { .procname = "bindv6only", @@ -151,6 +176,13 @@ static struct ctl_table ipv6_table_template[] = { .extra1 = SYSCTL_ZERO, .extra2 = &two, }, + { + .procname = "fib_multipath_hash_fields", + .data = &init_net.ipv6.sysctl.multipath_hash_fields, + .maxlen = __FIB_MULTIPATH_HASH_FIELD_CNT, + .mode = 0644, + .proc_handler = proc_rt6_multipath_hash_fields, + }, { .procname = "seg6_flowlabel", .data = &init_net.ipv6.sysctl.seg6_flowlabel, From patchwork Sun May 2 16:22:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235301 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46F69C433B4 for ; Sun, 2 May 2021 16:25:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2996161352 for ; Sun, 2 May 2021 16:25:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232371AbhEBQZv (ORCPT ); Sun, 2 May 2021 12:25:51 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:56579 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230110AbhEBQZu (ORCPT ); Sun, 2 May 2021 12:25:50 -0400 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id 02C7A5C0139; Sun, 2 May 2021 12:24:59 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Sun, 02 May 2021 12:24:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=rvJu9s7cYpH3BPPQQUlSTGl8+7ztPDbXW/o2v3PzryY=; b=pBBHKj6Y fP7lzarn+5DOicQoqOqAU981brpaUq+IYWdTwTrhgr+9p2ajChYn7aS4q+Driyze v10CDeZlGf9jkeUJ3DnTPEmmbg0j3cQPLPptlpCR1lpIUd4LLBTq30IuS/qoGuw7 iYMo8KuglIo13MwWZ5WDfyy6NUg28wjh+duhAfv9mPwKIaXTfuR9LpYP7cotu+Jn UmFADJ1oOsldgDI6lDdbU3I4iZkGZwRXL1oN4K9RDTOug2O1UvW9QgmVTQEWnO/q IWhGXt1KZoI3/S5e57WCQP2TQtd6deL1LptNCZKSWC7eYLjUwJIokTRkZN+pBMWR zZpPsUabnb3C3g== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedunecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:56 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 07/10] ipv6: Add custom multipath hash policy Date: Sun, 2 May 2021 19:22:54 +0300 Message-Id: <20210502162257.3472453-8-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel Add a new multipath hash policy where the packet fields used for hash calculation are determined by user space via the fib_multipath_hash_fields sysctl that was introduced in the previous patch. The current set of available packet fields includes both outer and inner fields, which requires two invocations of the flow dissector. Avoid unnecessary dissection of the outer or inner flows by skipping dissection if the 'need_outer' or 'need_inner' variables are not set. These fields are set / cleared as part of the control path when the fib_multipath_hash_fields sysctl is configured. In accordance with the existing policies, when an skb is not available, packet fields are extracted from the provided flow key. In which case, only outer fields are considered. Signed-off-by: Ido Schimmel --- Documentation/networking/ip-sysctl.rst | 2 + net/ipv6/route.c | 125 +++++++++++++++++++++++++ net/ipv6/sysctl_net_ipv6.c | 3 +- 3 files changed, 129 insertions(+), 1 deletion(-) diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 5289336227b3..8b88499fe555 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1774,6 +1774,8 @@ fib_multipath_hash_policy - INTEGER - 0 - Layer 3 (source and destination addresses plus flow label) - 1 - Layer 4 (standard 5-tuple) - 2 - Layer 3 or inner Layer 3 if present + - 3 - Custom multipath hash. Fields used for multipath hash calculation + are determined by fib_multipath_hash_fields sysctl fib_multipath_hash_fields - list of comma separated ranges When fib_multipath_hash_policy is set to 3 (custom multipath hash), the diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 9935e18146e5..b4c65c5baf35 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2326,6 +2326,125 @@ static void ip6_multipath_l3_keys(const struct sk_buff *skb, } } +static u32 rt6_multipath_custom_hash_outer(const struct net *net, + const struct sk_buff *skb, + bool *p_has_inner) +{ + unsigned long *hash_fields = ip6_multipath_hash_fields(net); + struct flow_keys keys, hash_keys; + + if (!net->ipv6.sysctl.multipath_hash_fields_need_outer) + return 0; + + memset(&hash_keys, 0, sizeof(hash_keys)); + skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_ENCAP); + + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_IP, hash_fields)) + hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_IP, hash_fields)) + hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst; + if (FIB_MULTIPATH_HASH_TEST_FIELD(IP_PROTO, hash_fields)) + hash_keys.basic.ip_proto = keys.basic.ip_proto; + if (FIB_MULTIPATH_HASH_TEST_FIELD(FLOWLABEL, hash_fields)) + hash_keys.tags.flow_label = keys.tags.flow_label; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_PORT, hash_fields)) + hash_keys.ports.src = keys.ports.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_PORT, hash_fields)) + hash_keys.ports.dst = keys.ports.dst; + + *p_has_inner = !!(keys.control.flags & FLOW_DIS_ENCAPSULATION); + return flow_hash_from_keys(&hash_keys); +} + +static u32 rt6_multipath_custom_hash_inner(const struct net *net, + const struct sk_buff *skb, + bool has_inner) +{ + unsigned long *hash_fields = ip6_multipath_hash_fields(net); + struct flow_keys keys, hash_keys; + + /* We assume the packet carries an encapsulation, but if none was + * encountered during dissection of the outer flow, then there is no + * point in calling the flow dissector again. + */ + if (!has_inner) + return 0; + + if (!net->ipv6.sysctl.multipath_hash_fields_need_inner) + return 0; + + memset(&hash_keys, 0, sizeof(hash_keys)); + skb_flow_dissect_flow_keys(skb, &keys, 0); + + if (!(keys.control.flags & FLOW_DIS_ENCAPSULATION)) + return 0; + + if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) { + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_IP, hash_fields)) + hash_keys.addrs.v4addrs.src = keys.addrs.v4addrs.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_IP, hash_fields)) + hash_keys.addrs.v4addrs.dst = keys.addrs.v4addrs.dst; + } else if (keys.control.addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) { + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_IP, hash_fields)) + hash_keys.addrs.v6addrs.src = keys.addrs.v6addrs.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_IP, hash_fields)) + hash_keys.addrs.v6addrs.dst = keys.addrs.v6addrs.dst; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_FLOWLABEL, hash_fields)) + hash_keys.tags.flow_label = keys.tags.flow_label; + } + + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_IP_PROTO, hash_fields)) + hash_keys.basic.ip_proto = keys.basic.ip_proto; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_SRC_PORT, hash_fields)) + hash_keys.ports.src = keys.ports.src; + if (FIB_MULTIPATH_HASH_TEST_FIELD(INNER_DST_PORT, hash_fields)) + hash_keys.ports.dst = keys.ports.dst; + + return flow_hash_from_keys(&hash_keys); +} + +static u32 rt6_multipath_custom_hash_skb(const struct net *net, + const struct sk_buff *skb) +{ + u32 mhash, mhash_inner; + bool has_inner = true; + + mhash = rt6_multipath_custom_hash_outer(net, skb, &has_inner); + mhash_inner = rt6_multipath_custom_hash_inner(net, skb, has_inner); + + return jhash_2words(mhash, mhash_inner, 0); +} + +static u32 rt6_multipath_custom_hash_fl6(const struct net *net, + const struct flowi6 *fl6) +{ + unsigned long *hash_fields = ip6_multipath_hash_fields(net); + struct flow_keys hash_keys; + + if (!net->ipv6.sysctl.multipath_hash_fields_need_outer) + return 0; + + memset(&hash_keys, 0, sizeof(hash_keys)); + hash_keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS; + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_IP, hash_fields)) + hash_keys.addrs.v6addrs.src = fl6->saddr; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_IP, hash_fields)) + hash_keys.addrs.v6addrs.dst = fl6->daddr; + if (FIB_MULTIPATH_HASH_TEST_FIELD(IP_PROTO, hash_fields)) + hash_keys.basic.ip_proto = fl6->flowi6_proto; + if (FIB_MULTIPATH_HASH_TEST_FIELD(FLOWLABEL, hash_fields)) + hash_keys.tags.flow_label = (__force u32)flowi6_get_flowlabel(fl6); + if (FIB_MULTIPATH_HASH_TEST_FIELD(SRC_PORT, hash_fields)) + hash_keys.ports.src = fl6->fl6_sport; + if (FIB_MULTIPATH_HASH_TEST_FIELD(DST_PORT, hash_fields)) + hash_keys.ports.dst = fl6->fl6_dport; + + return flow_hash_from_keys(&hash_keys); +} + /* if skb is set it will be used and fl6 can be NULL */ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6, const struct sk_buff *skb, struct flow_keys *flkeys) @@ -2416,6 +2535,12 @@ u32 rt6_multipath_hash(const struct net *net, const struct flowi6 *fl6, } mhash = flow_hash_from_keys(&hash_keys); break; + case 3: + if (skb) + mhash = rt6_multipath_custom_hash_skb(net, skb); + else + mhash = rt6_multipath_custom_hash_fl6(net, fl6); + break; } return mhash >> 1; diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c index 8d94a1d621d0..38d444b1bb60 100644 --- a/net/ipv6/sysctl_net_ipv6.c +++ b/net/ipv6/sysctl_net_ipv6.c @@ -23,6 +23,7 @@ #endif static int two = 2; +static int three = 3; static int flowlabel_reflect_max = 0x7; static int auto_flowlabels_max = IP6_AUTO_FLOW_LABEL_MAX; @@ -174,7 +175,7 @@ static struct ctl_table ipv6_table_template[] = { .mode = 0644, .proc_handler = proc_rt6_multipath_hash_policy, .extra1 = SYSCTL_ZERO, - .extra2 = &two, + .extra2 = &three, }, { .procname = "fib_multipath_hash_fields", From patchwork Sun May 2 16:22:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235303 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FC86C433ED for ; Sun, 2 May 2021 16:25:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 80D2761221 for ; Sun, 2 May 2021 16:25:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232391AbhEBQZ4 (ORCPT ); Sun, 2 May 2021 12:25:56 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:53589 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232374AbhEBQZx (ORCPT ); Sun, 2 May 2021 12:25:53 -0400 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id A6F7D5C013A; Sun, 2 May 2021 12:25:01 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Sun, 02 May 2021 12:25:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=p/XT9y9DQqL0Mv3w3jy5WV+m3xOS1V9/y8ZvzRYvkFI=; b=CVl1XYej Le4j/pkYQ4+9qvu0XMxtfFranaLq26/+XDMIx+zAM19EaT17HXmvny7snyTRvJWJ cwLZOupOB75MQej4jVuzqTi8J7M66ZED4nYBnixkXAwgOzQBK5qI/5+yAgAXFsSj kq5sjiv8mlNGX/rZlElZVfyZ/hYZPDmCnguvuB8nv3kpkgW8EUymnhCGBdWtsZoh 1zZjKwMion60y96ARi7ZcFMtx9Rip0VCeDxhEEKJ2CPBtsw86/vGYpasN7okXc5m 5q58dlUgD1NebJeFj5IsV2ntxsYR1s0Uo4WesVpTh0K5DEuqG/0I83C4f/Z5A55L MBzb58Zc7DsGhw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpeefnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:24:59 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 08/10] selftests: forwarding: Add test for custom multipath hash Date: Sun, 2 May 2021 19:22:55 +0300 Message-Id: <20210502162257.3472453-9-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel Test that when the hash policy is set to custom, traffic is distributed only according to the outer fields set in the fib_multipath_hash_fields sysctl. Each time set a different field and make sure traffic is only distributed when the field is changed in the packet stream. The test only verifies the behavior with non-encapsulated IPv4 and IPv6 packets. Subsequent patches will add tests for IPv4/IPv6 overlays on top of IPv4/IPv6 underlay networks. Example output: # ./custom_multipath_hash.sh TEST: ping [ OK ] TEST: ping6 [ OK ] INFO: Running IPv4 custom multipath hash tests TEST: Multipath hash field: Source IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6353 / 6254 TEST: Multipath hash field: Source IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 12600 TEST: Multipath hash field: Destination IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6102 / 6502 TEST: Multipath hash field: Destination IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 1 / 12601 TEST: Multipath hash field: Source port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16428 / 16345 TEST: Multipath hash field: Source port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 32770 / 2 TEST: Multipath hash field: Destination port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16428 / 16345 TEST: Multipath hash field: Destination port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 32770 / 2 INFO: Running IPv6 custom multipath hash tests TEST: Multipath hash field: Source IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6704 / 5903 TEST: Multipath hash field: Source IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12600 / 0 TEST: Multipath hash field: Destination IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 5551 / 7052 TEST: Multipath hash field: Destination IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12603 / 0 TEST: Multipath hash field: Flowlabel (balanced) [ OK ] INFO: Packets sent on path1 / path2: 8378 / 8080 TEST: Multipath hash field: Flowlabel (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 2 / 12603 TEST: Multipath hash field: Source port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16385 / 16388 TEST: Multipath hash field: Source port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 32774 TEST: Multipath hash field: Destination port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16386 / 16390 TEST: Multipath hash field: Destination port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 32771 / 2 Signed-off-by: Ido Schimmel --- .../net/forwarding/custom_multipath_hash.sh | 364 ++++++++++++++++++ 1 file changed, 364 insertions(+) create mode 100755 tools/testing/selftests/net/forwarding/custom_multipath_hash.sh diff --git a/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh new file mode 100755 index 000000000000..210ebb0b94e4 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/custom_multipath_hash.sh @@ -0,0 +1,364 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Test traffic distribution between two paths when using custom hash policy. +# +# +--------------------------------+ +# | H1 | +# | $h1 + | +# | 198.51.100.{2-253}/24 | | +# | 2001:db8:1::{2-fd}/64 | | +# +-------------------------|------+ +# | +# +-------------------------|-------------------------+ +# | SW1 | | +# | $rp1 + | +# | 198.51.100.1/24 | +# | 2001:db8:1::1/64 | +# | | +# | | +# | $rp11 + + $rp12 | +# | 192.0.2.1/28 | | 192.0.2.17/28 | +# | 2001:db8:2::1/64 | | 2001:db8:3::1/64 | +# +------------------|-------------|------------------+ +# | | +# +------------------|-------------|------------------+ +# | SW2 | | | +# | | | | +# | $rp21 + + $rp22 | +# | 192.0.2.2/28 192.0.2.18/28 | +# | 2001:db8:2::2/64 2001:db8:3::2/64 | +# | | +# | | +# | $rp2 + | +# | 203.0.113.1/24 | | +# | 2001:db8:4::1/64 | | +# +-------------------------|-------------------------+ +# | +# +-------------------------|------+ +# | H2 | | +# | $h2 + | +# | 203.0.113.{2-253}/24 | +# | 2001:db8:4::{2-fd}/64 | +# +--------------------------------+ + +ALL_TESTS=" + ping_ipv4 + ping_ipv6 + custom_hash +" + +NUM_NETIFS=8 +source lib.sh + +h1_create() +{ + simple_if_init $h1 198.51.100.2/24 2001:db8:1::2/64 + ip route add vrf v$h1 default via 198.51.100.1 dev $h1 + ip -6 route add vrf v$h1 default via 2001:db8:1::1 dev $h1 +} + +h1_destroy() +{ + ip -6 route del vrf v$h1 default + ip route del vrf v$h1 default + simple_if_fini $h1 198.51.100.2/24 2001:db8:1::2/64 +} + +sw1_create() +{ + simple_if_init $rp1 198.51.100.1/24 2001:db8:1::1/64 + __simple_if_init $rp11 v$rp1 192.0.2.1/28 2001:db8:2::1/64 + __simple_if_init $rp12 v$rp1 192.0.2.17/28 2001:db8:3::1/64 + + ip route add vrf v$rp1 203.0.113.0/24 \ + nexthop via 192.0.2.2 dev $rp11 \ + nexthop via 192.0.2.18 dev $rp12 + + ip -6 route add vrf v$rp1 2001:db8:4::/64 \ + nexthop via 2001:db8:2::2 dev $rp11 \ + nexthop via 2001:db8:3::2 dev $rp12 +} + +sw1_destroy() +{ + ip -6 route del vrf v$rp1 2001:db8:4::/64 + + ip route del vrf v$rp1 203.0.113.0/24 + + __simple_if_fini $rp12 192.0.2.17/28 2001:db8:3::1/64 + __simple_if_fini $rp11 192.0.2.1/28 2001:db8:2::1/64 + simple_if_fini $rp1 198.51.100.1/24 2001:db8:1::1/64 +} + +sw2_create() +{ + simple_if_init $rp2 203.0.113.1/24 2001:db8:4::1/64 + __simple_if_init $rp21 v$rp2 192.0.2.2/28 2001:db8:2::2/64 + __simple_if_init $rp22 v$rp2 192.0.2.18/28 2001:db8:3::2/64 + + ip route add vrf v$rp2 198.51.100.0/24 \ + nexthop via 192.0.2.1 dev $rp21 \ + nexthop via 192.0.2.17 dev $rp22 + + ip -6 route add vrf v$rp2 2001:db8:1::/64 \ + nexthop via 2001:db8:2::1 dev $rp21 \ + nexthop via 2001:db8:3::1 dev $rp22 +} + +sw2_destroy() +{ + ip -6 route del vrf v$rp2 2001:db8:1::/64 + + ip route del vrf v$rp2 198.51.100.0/24 + + __simple_if_fini $rp22 192.0.2.18/28 2001:db8:3::2/64 + __simple_if_fini $rp21 192.0.2.2/28 2001:db8:2::2/64 + simple_if_fini $rp2 203.0.113.1/24 2001:db8:4::1/64 +} + +h2_create() +{ + simple_if_init $h2 203.0.113.2/24 2001:db8:4::2/64 + ip route add vrf v$h2 default via 203.0.113.1 dev $h2 + ip -6 route add vrf v$h2 default via 2001:db8:4::1 dev $h2 +} + +h2_destroy() +{ + ip -6 route del vrf v$h2 default + ip route del vrf v$h2 default + simple_if_fini $h2 203.0.113.2/24 2001:db8:4::2/64 +} + +setup_prepare() +{ + h1=${NETIFS[p1]} + + rp1=${NETIFS[p2]} + + rp11=${NETIFS[p3]} + rp21=${NETIFS[p4]} + + rp12=${NETIFS[p5]} + rp22=${NETIFS[p6]} + + rp2=${NETIFS[p7]} + + h2=${NETIFS[p8]} + + vrf_prepare + h1_create + sw1_create + sw2_create + h2_create + + forwarding_enable +} + +cleanup() +{ + pre_cleanup + + forwarding_restore + + h2_destroy + sw2_destroy + sw1_destroy + h1_destroy + vrf_cleanup +} + +ping_ipv4() +{ + ping_test $h1 203.0.113.2 +} + +ping_ipv6() +{ + ping6_test $h1 2001:db8:4::2 +} + +send_src_ipv4() +{ + $MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_dst_ipv4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_src_udp4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + -d 1msec -t udp "sp=0-32768,dp=30000" +} + +send_dst_udp4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + -d 1msec -t udp "sp=20000,dp=0-32768" +} + +send_src_ipv6() +{ + $MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:4::2 \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_dst_ipv6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:4::2-2001:db8:4::fd" \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_flowlabel() +{ + # Generate 16384 echo requests, each with a random flow label. + for _ in $(seq 1 16384); do + ip vrf exec v$h1 \ + $PING6 2001:db8:4::2 -F 0 -c 1 -q >/dev/null 2>&1 + done +} + +send_src_udp6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:4::2 \ + -d 1msec -t udp "sp=0-32768,dp=30000" +} + +send_dst_udp6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:4::2 \ + -d 1msec -t udp "sp=20000,dp=0-32768" +} + +custom_hash_test() +{ + local field="$1"; shift + local balanced="$1"; shift + local send_flows="$@" + + RET=0 + + local t0_rp11=$(link_stats_tx_packets_get $rp11) + local t0_rp12=$(link_stats_tx_packets_get $rp12) + + $send_flows + + local t1_rp11=$(link_stats_tx_packets_get $rp11) + local t1_rp12=$(link_stats_tx_packets_get $rp12) + + local d_rp11=$((t1_rp11 - t0_rp11)) + local d_rp12=$((t1_rp12 - t0_rp12)) + + local diff=$((d_rp12 - d_rp11)) + local sum=$((d_rp11 + d_rp12)) + + local pct=$(echo "$diff / $sum * 100" | bc -l) + local is_balanced=$(echo "-20 <= $pct && $pct <= 20" | bc) + + [[ ( $is_balanced -eq 1 && $balanced == "balanced" ) || + ( $is_balanced -eq 0 && $balanced == "unbalanced" ) ]] + check_err $? "Expected traffic to be $balanced, but it is not" + + log_test "Multipath hash field: $field ($balanced)" + log_info "Packets sent on path1 / path2: $d_rp11 / $d_rp12" +} + +custom_hash_v4() +{ + log_info "Running IPv4 custom multipath hash tests" + + sysctl_set net.ipv4.fib_multipath_hash_policy 3 + + # Prevent the neighbour table from overflowing, as different neighbour + # entries will be created on $ol4 when using different destination IPs. + sysctl_set net.ipv4.neigh.default.gc_thresh1 1024 + sysctl_set net.ipv4.neigh.default.gc_thresh2 1024 + sysctl_set net.ipv4.neigh.default.gc_thresh3 1024 + + sysctl_set net.ipv4.fib_multipath_hash_fields 0 + custom_hash_test "Source IP" "balanced" send_src_ipv4 + custom_hash_test "Source IP" "unbalanced" send_dst_ipv4 + + sysctl_set net.ipv4.fib_multipath_hash_fields 1 + custom_hash_test "Destination IP" "balanced" send_dst_ipv4 + custom_hash_test "Destination IP" "unbalanced" send_src_ipv4 + + sysctl_set net.ipv4.fib_multipath_hash_fields 4 + custom_hash_test "Source port" "balanced" send_src_udp4 + custom_hash_test "Source port" "unbalanced" send_dst_udp4 + + sysctl_set net.ipv4.fib_multipath_hash_fields 5 + custom_hash_test "Destination port" "balanced" send_dst_udp4 + custom_hash_test "Destination port" "unbalanced" send_src_udp4 + + sysctl_restore net.ipv4.neigh.default.gc_thresh3 + sysctl_restore net.ipv4.neigh.default.gc_thresh2 + sysctl_restore net.ipv4.neigh.default.gc_thresh1 + + sysctl_restore net.ipv4.fib_multipath_hash_policy +} + +custom_hash_v6() +{ + log_info "Running IPv6 custom multipath hash tests" + + sysctl_set net.ipv6.fib_multipath_hash_policy 3 + + # Prevent the neighbour table from overflowing, as different neighbour + # entries will be created on $ol4 when using different destination IPs. + sysctl_set net.ipv6.neigh.default.gc_thresh1 1024 + sysctl_set net.ipv6.neigh.default.gc_thresh2 1024 + sysctl_set net.ipv6.neigh.default.gc_thresh3 1024 + + sysctl_set net.ipv6.fib_multipath_hash_fields 0 + custom_hash_test "Source IP" "balanced" send_src_ipv6 + custom_hash_test "Source IP" "unbalanced" send_dst_ipv6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 1 + custom_hash_test "Destination IP" "balanced" send_dst_ipv6 + custom_hash_test "Destination IP" "unbalanced" send_src_ipv6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 3 + custom_hash_test "Flowlabel" "balanced" send_flowlabel + custom_hash_test "Flowlabel" "unbalanced" send_src_ipv6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 4 + custom_hash_test "Source port" "balanced" send_src_udp6 + custom_hash_test "Source port" "unbalanced" send_dst_udp6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 5 + custom_hash_test "Destination port" "balanced" send_dst_udp6 + custom_hash_test "Destination port" "unbalanced" send_src_udp6 + + sysctl_restore net.ipv6.neigh.default.gc_thresh3 + sysctl_restore net.ipv6.neigh.default.gc_thresh2 + sysctl_restore net.ipv6.neigh.default.gc_thresh1 + + sysctl_restore net.ipv6.fib_multipath_hash_policy +} + +custom_hash() +{ + # Test that when the hash policy is set to custom, traffic is + # distributed only according to the fields set in the + # fib_multipath_hash_fields sysctl. + # + # Each time set a different field and make sure traffic is only + # distributed when the field is changed in the packet stream. + custom_hash_v4 + custom_hash_v6 +} + +trap cleanup EXIT + +setup_prepare +setup_wait +tests_run + +exit $EXIT_STATUS From patchwork Sun May 2 16:22:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235305 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93E19C433B4 for ; Sun, 2 May 2021 16:25:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6B1D7611B0 for ; Sun, 2 May 2021 16:25:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232401AbhEBQZ6 (ORCPT ); Sun, 2 May 2021 12:25:58 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:37879 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232372AbhEBQZ4 (ORCPT ); Sun, 2 May 2021 12:25:56 -0400 Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id 71F305C0139; Sun, 2 May 2021 12:25:04 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Sun, 02 May 2021 12:25:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=bpqUHuy2SMOV6yzr3J1D6H0SgL+BModvT7aDQWC3IYE=; b=lkHRZvZU +muQdpmsNwxqmK7P5tAWf2rEIzp2Qaf0kM5LA1JWXYsHYWaAHS5DY0fjobq8o+p5 AJqU/00K1Fb/g5qp5gB8QSlM/O228vJiWCqqPZQs2vS2XrzYBRTgO7gpSdgTIMx6 AZtCn1tZsnOjcqf/4bjn/SnEnlGt8C7lPPf9rn8ISre0IuF3n2TudqFtO+GWyH9K Xpw8TKYEABTE2Dhmm79J/6UMUoHiVKb+kNSTD35QXfZz2RfwgosMisyEAHV/Pvyu 9JAOiIdCIKUk844+uIgK5RuvlJEtsdnyz8cWi8ihC+0+hm/LlMwvxKrZdKVJnHfn mdNQaoZWE8OOEw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpeefnecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:25:01 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 09/10] selftests: forwarding: Add test for custom multipath hash with IPv4 GRE Date: Sun, 2 May 2021 19:22:56 +0300 Message-Id: <20210502162257.3472453-10-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel Test that when the hash policy is set to custom, traffic is distributed only according to the inner fields set in the fib_multipath_hash_fields sysctl. Each time set a different field and make sure traffic is only distributed when the field is changed in the packet stream. The test only verifies the behavior of IPv4/IPv6 overlays on top of an IPv4 underlay network. A subsequent patch will do the same with an IPv6 underlay network. Example output: # ./gre_custom_multipath_hash.sh TEST: ping [ OK ] TEST: ping6 [ OK ] INFO: Running IPv4 overlay custom multipath hash tests TEST: Multipath hash field: Inner source IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6601 / 6001 TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 12600 TEST: Multipath hash field: Inner destination IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6802 / 5802 TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12601 / 1 TEST: Multipath hash field: Inner source port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16430 / 16344 TEST: Multipath hash field: Inner source port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 32772 TEST: Multipath hash field: Inner destination port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16430 / 16343 TEST: Multipath hash field: Inner destination port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 32772 INFO: Running IPv6 overlay custom multipath hash tests TEST: Multipath hash field: Inner source IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6702 / 5900 TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 12601 TEST: Multipath hash field: Inner destination IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 5751 / 6851 TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12602 / 1 TEST: Multipath hash field: Inner flowlabel (balanced) [ OK ] INFO: Packets sent on path1 / path2: 8364 / 8065 TEST: Multipath hash field: Inner flowlabel (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12601 / 0 TEST: Multipath hash field: Inner source port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16425 / 16349 TEST: Multipath hash field: Inner source port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 1 / 32770 TEST: Multipath hash field: Inner destination port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16425 / 16349 TEST: Multipath hash field: Inner destination port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 2 / 32770 Signed-off-by: Ido Schimmel --- .../forwarding/gre_custom_multipath_hash.sh | 456 ++++++++++++++++++ 1 file changed, 456 insertions(+) create mode 100755 tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh diff --git a/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh new file mode 100755 index 000000000000..e381083d8609 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/gre_custom_multipath_hash.sh @@ -0,0 +1,456 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Test traffic distribution when there are multiple paths between an IPv4 GRE +# tunnel. The tunnel carries IPv4 and IPv6 traffic between multiple hosts. +# Multiple routes are in the underlay network. With the default multipath +# policy, SW2 will only look at the outer IP addresses, hence only a single +# route would be used. +# +# +--------------------------------+ +# | H1 | +# | $h1 + | +# | 198.51.100.{2-253}/24 | | +# | 2001:db8:1::{2-fd}/64 | | +# +-------------------------|------+ +# | +# +-------------------------|------------------+ +# | SW1 | | +# | $ol1 + | +# | 198.51.100.1/24 | +# | 2001:db8:1::1/64 | +# | | +# | + g1 (gre) | +# | loc=192.0.2.1 | +# | rem=192.0.2.2 --. | +# | tos=inherit | | +# | v | +# | + $ul1 | +# | | 192.0.2.17/28 | +# +---------------------|----------------------+ +# | +# +---------------------|----------------------+ +# | SW2 | | +# | $ul21 + | +# | 192.0.2.18/28 | | +# | | | +# ! __________________+___ | +# | / \ | +# | | | | +# | + $ul22.111 (vlan) + $ul22.222 (vlan) | +# | | 192.0.2.33/28 | 192.0.2.49/28 | +# | | | | +# +--|----------------------|------------------+ +# | | +# +--|----------------------|------------------+ +# | | | | +# | + $ul32.111 (vlan) + $ul32.222 (vlan) | +# | | 192.0.2.34/28 | 192.0.2.50/28 | +# | | | | +# | \__________________+___/ | +# | | | +# | | | +# | $ul31 + | +# | 192.0.2.65/28 | SW3 | +# +---------------------|----------------------+ +# | +# +---------------------|----------------------+ +# | + $ul4 | +# | ^ 192.0.2.66/28 | +# | | | +# | + g2 (gre) | | +# | loc=192.0.2.2 | | +# | rem=192.0.2.1 --' | +# | tos=inherit | +# | | +# | $ol4 + | +# | 203.0.113.1/24 | | +# | 2001:db8:2::1/64 | SW4 | +# +-------------------------|------------------+ +# | +# +-------------------------|------+ +# | | | +# | $h2 + | +# | 203.0.113.{2-253}/24 | +# | 2001:db8:2::{2-fd}/64 H2 | +# +--------------------------------+ + +ALL_TESTS=" + ping_ipv4 + ping_ipv6 + custom_hash +" + +NUM_NETIFS=10 +source lib.sh + +h1_create() +{ + simple_if_init $h1 198.51.100.2/24 2001:db8:1::2/64 + ip route add vrf v$h1 default via 198.51.100.1 dev $h1 + ip -6 route add vrf v$h1 default via 2001:db8:1::1 dev $h1 +} + +h1_destroy() +{ + ip -6 route del vrf v$h1 default + ip route del vrf v$h1 default + simple_if_fini $h1 198.51.100.2/24 2001:db8:1::2/64 +} + +sw1_create() +{ + simple_if_init $ol1 198.51.100.1/24 2001:db8:1::1/64 + __simple_if_init $ul1 v$ol1 192.0.2.17/28 + + tunnel_create g1 gre 192.0.2.1 192.0.2.2 tos inherit dev v$ol1 + __simple_if_init g1 v$ol1 192.0.2.1/32 + ip route add vrf v$ol1 192.0.2.2/32 via 192.0.2.18 + + ip route add vrf v$ol1 203.0.113.0/24 dev g1 + ip -6 route add vrf v$ol1 2001:db8:2::/64 dev g1 +} + +sw1_destroy() +{ + ip -6 route del vrf v$ol1 2001:db8:2::/64 + ip route del vrf v$ol1 203.0.113.0/24 + + ip route del vrf v$ol1 192.0.2.2/32 + __simple_if_fini g1 192.0.2.1/32 + tunnel_destroy g1 + + __simple_if_fini $ul1 192.0.2.17/28 + simple_if_fini $ol1 198.51.100.1/24 2001:db8:1::1/64 +} + +sw2_create() +{ + simple_if_init $ul21 192.0.2.18/28 + __simple_if_init $ul22 v$ul21 + vlan_create $ul22 111 v$ul21 192.0.2.33/28 + vlan_create $ul22 222 v$ul21 192.0.2.49/28 + + ip route add vrf v$ul21 192.0.2.1/32 via 192.0.2.17 + ip route add vrf v$ul21 192.0.2.2/32 \ + nexthop via 192.0.2.34 \ + nexthop via 192.0.2.50 +} + +sw2_destroy() +{ + ip route del vrf v$ul21 192.0.2.2/32 + ip route del vrf v$ul21 192.0.2.1/32 + + vlan_destroy $ul22 222 + vlan_destroy $ul22 111 + __simple_if_fini $ul22 + simple_if_fini $ul21 192.0.2.18/28 +} + +sw3_create() +{ + simple_if_init $ul31 192.0.2.65/28 + __simple_if_init $ul32 v$ul31 + vlan_create $ul32 111 v$ul31 192.0.2.34/28 + vlan_create $ul32 222 v$ul31 192.0.2.50/28 + + ip route add vrf v$ul31 192.0.2.2/32 via 192.0.2.66 + ip route add vrf v$ul31 192.0.2.1/32 \ + nexthop via 192.0.2.33 \ + nexthop via 192.0.2.49 + + tc qdisc add dev $ul32 clsact + tc filter add dev $ul32 ingress pref 111 prot 802.1Q \ + flower vlan_id 111 action pass + tc filter add dev $ul32 ingress pref 222 prot 802.1Q \ + flower vlan_id 222 action pass +} + +sw3_destroy() +{ + tc qdisc del dev $ul32 clsact + + ip route del vrf v$ul31 192.0.2.1/32 + ip route del vrf v$ul31 192.0.2.2/32 + + vlan_destroy $ul32 222 + vlan_destroy $ul32 111 + __simple_if_fini $ul32 + simple_if_fini $ul31 192.0.2.65/28 +} + +sw4_create() +{ + simple_if_init $ol4 203.0.113.1/24 2001:db8:2::1/64 + __simple_if_init $ul4 v$ol4 192.0.2.66/28 + + tunnel_create g2 gre 192.0.2.2 192.0.2.1 tos inherit dev v$ol4 + __simple_if_init g2 v$ol4 192.0.2.2/32 + ip route add vrf v$ol4 192.0.2.1/32 via 192.0.2.65 + + ip route add vrf v$ol4 198.51.100.0/24 dev g2 + ip -6 route add vrf v$ol4 2001:db8:1::/64 dev g2 +} + +sw4_destroy() +{ + ip -6 route del vrf v$ol4 2001:db8:1::/64 + ip route del vrf v$ol4 198.51.100.0/24 + + ip route del vrf v$ol4 192.0.2.1/32 + __simple_if_fini g2 192.0.2.2/32 + tunnel_destroy g2 + + __simple_if_fini $ul4 192.0.2.66/28 + simple_if_fini $ol4 203.0.113.1/24 2001:db8:2::1/64 +} + +h2_create() +{ + simple_if_init $h2 203.0.113.2/24 2001:db8:2::2/64 + ip route add vrf v$h2 default via 203.0.113.1 dev $h2 + ip -6 route add vrf v$h2 default via 2001:db8:2::1 dev $h2 +} + +h2_destroy() +{ + ip -6 route del vrf v$h2 default + ip route del vrf v$h2 default + simple_if_fini $h2 203.0.113.2/24 2001:db8:2::2/64 +} + +setup_prepare() +{ + h1=${NETIFS[p1]} + + ol1=${NETIFS[p2]} + ul1=${NETIFS[p3]} + + ul21=${NETIFS[p4]} + ul22=${NETIFS[p5]} + + ul32=${NETIFS[p6]} + ul31=${NETIFS[p7]} + + ul4=${NETIFS[p8]} + ol4=${NETIFS[p9]} + + h2=${NETIFS[p10]} + + vrf_prepare + h1_create + sw1_create + sw2_create + sw3_create + sw4_create + h2_create + + forwarding_enable +} + +cleanup() +{ + pre_cleanup + + forwarding_restore + + h2_destroy + sw4_destroy + sw3_destroy + sw2_destroy + sw1_destroy + h1_destroy + vrf_cleanup +} + +ping_ipv4() +{ + ping_test $h1 203.0.113.2 +} + +ping_ipv6() +{ + ping6_test $h1 2001:db8:2::2 +} + +send_src_ipv4() +{ + $MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_dst_ipv4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_src_udp4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + -d 1msec -t udp "sp=0-32768,dp=30000" +} + +send_dst_udp4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + -d 1msec -t udp "sp=20000,dp=0-32768" +} + +send_src_ipv6() +{ + $MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_dst_ipv6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_flowlabel() +{ + # Generate 16384 echo requests, each with a random flow label. + for _ in $(seq 1 16384); do + ip vrf exec v$h1 \ + $PING6 2001:db8:2::2 -F 0 -c 1 -q >/dev/null 2>&1 + done +} + +send_src_udp6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + -d 1msec -t udp "sp=0-32768,dp=30000" +} + +send_dst_udp6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + -d 1msec -t udp "sp=20000,dp=0-32768" +} + +custom_hash_test() +{ + local field="$1"; shift + local balanced="$1"; shift + local send_flows="$@" + + RET=0 + + local t0_111=$(tc_rule_stats_get $ul32 111 ingress) + local t0_222=$(tc_rule_stats_get $ul32 222 ingress) + + $send_flows + + local t1_111=$(tc_rule_stats_get $ul32 111 ingress) + local t1_222=$(tc_rule_stats_get $ul32 222 ingress) + + local d111=$((t1_111 - t0_111)) + local d222=$((t1_222 - t0_222)) + + local diff=$((d222 - d111)) + local sum=$((d111 + d222)) + + local pct=$(echo "$diff / $sum * 100" | bc -l) + local is_balanced=$(echo "-20 <= $pct && $pct <= 20" | bc) + + [[ ( $is_balanced -eq 1 && $balanced == "balanced" ) || + ( $is_balanced -eq 0 && $balanced == "unbalanced" ) ]] + check_err $? "Expected traffic to be $balanced, but it is not" + + log_test "Multipath hash field: $field ($balanced)" + log_info "Packets sent on path1 / path2: $d111 / $d222" +} + +custom_hash_v4() +{ + log_info "Running IPv4 overlay custom multipath hash tests" + + # Prevent the neighbour table from overflowing, as different neighbour + # entries will be created on $ol4 when using different destination IPs. + sysctl_set net.ipv4.neigh.default.gc_thresh1 1024 + sysctl_set net.ipv4.neigh.default.gc_thresh2 1024 + sysctl_set net.ipv4.neigh.default.gc_thresh3 1024 + + sysctl_set net.ipv4.fib_multipath_hash_fields 6 + custom_hash_test "Inner source IP" "balanced" send_src_ipv4 + custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv4 + + sysctl_set net.ipv4.fib_multipath_hash_fields 7 + custom_hash_test "Inner destination IP" "balanced" send_dst_ipv4 + custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv4 + + sysctl_set net.ipv4.fib_multipath_hash_fields 10 + custom_hash_test "Inner source port" "balanced" send_src_udp4 + custom_hash_test "Inner source port" "unbalanced" send_dst_udp4 + + sysctl_set net.ipv4.fib_multipath_hash_fields 11 + custom_hash_test "Inner destination port" "balanced" send_dst_udp4 + custom_hash_test "Inner destination port" "unbalanced" send_src_udp4 + + sysctl_restore net.ipv4.neigh.default.gc_thresh3 + sysctl_restore net.ipv4.neigh.default.gc_thresh2 + sysctl_restore net.ipv4.neigh.default.gc_thresh1 +} + +custom_hash_v6() +{ + log_info "Running IPv6 overlay custom multipath hash tests" + + # Prevent the neighbour table from overflowing, as different neighbour + # entries will be created on $ol4 when using different destination IPs. + sysctl_set net.ipv6.neigh.default.gc_thresh1 1024 + sysctl_set net.ipv6.neigh.default.gc_thresh2 1024 + sysctl_set net.ipv6.neigh.default.gc_thresh3 1024 + + sysctl_set net.ipv4.fib_multipath_hash_fields 6 + custom_hash_test "Inner source IP" "balanced" send_src_ipv6 + custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv6 + + sysctl_set net.ipv4.fib_multipath_hash_fields 7 + custom_hash_test "Inner destination IP" "balanced" send_dst_ipv6 + custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv6 + + sysctl_set net.ipv4.fib_multipath_hash_fields 9 + custom_hash_test "Inner flowlabel" "balanced" send_flowlabel + custom_hash_test "Inner flowlabel" "unbalanced" send_src_ipv6 + + sysctl_set net.ipv4.fib_multipath_hash_fields 10 + custom_hash_test "Inner source port" "balanced" send_src_udp6 + custom_hash_test "Inner source port" "unbalanced" send_dst_udp6 + + sysctl_set net.ipv4.fib_multipath_hash_fields 11 + custom_hash_test "Inner destination port" "balanced" send_dst_udp6 + custom_hash_test "Inner destination port" "unbalanced" send_src_udp6 + + sysctl_restore net.ipv6.neigh.default.gc_thresh3 + sysctl_restore net.ipv6.neigh.default.gc_thresh2 + sysctl_restore net.ipv6.neigh.default.gc_thresh1 +} + +custom_hash() +{ + # Test that when the hash policy is set to custom, traffic is + # distributed only according to the fields set in the + # fib_multipath_hash_fields sysctl. + # + # Each time set a different field and make sure traffic is only + # distributed when the field is changed in the packet stream. + + sysctl_set net.ipv4.fib_multipath_hash_policy 3 + + custom_hash_v4 + custom_hash_v6 + + sysctl_restore net.ipv4.fib_multipath_hash_policy +} + +trap cleanup EXIT + +setup_prepare +setup_wait +tests_run + +exit $EXIT_STATUS From patchwork Sun May 2 16:22:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ido Schimmel X-Patchwork-Id: 12235307 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5340FC43461 for ; Sun, 2 May 2021 16:25:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2B60A61359 for ; Sun, 2 May 2021 16:25:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232421AbhEBQ0B (ORCPT ); Sun, 2 May 2021 12:26:01 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:60527 "EHLO out5-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232405AbhEBQZ7 (ORCPT ); Sun, 2 May 2021 12:25:59 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 8F3825C0130; Sun, 2 May 2021 12:25:07 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Sun, 02 May 2021 12:25:07 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=zFhWYlNdj4CYC0fKjhu+t1LBc1kg5QzS9fgNiP0VKMo=; b=ILNa6o92 j1DSn0+49bczp3dDuDbNh0RC0wgp+B0A4YLA3n8NFyKQHw4Wr/inxp8pTog2OjF9 ZRXk18D+awoMZc0h0JOfxzQ2KBS1fK7MitVMU/zc6km95PXfgcoSlJ5jwlA0XXzy l3vrqAbRbWHtQqgzgSNhQ/LmspycQILqA3XUEYZeYu9U4QQPvLV5Q0kxgWoWtsQ/ UnJ1D5LvHmeSRzbS6WGroGP7ZByvJFvvgbJNWlye1oSLMYh0rju7qhUOHnAzrqlS DrEtQilTN64zYrDhv8J9mturmERJ+BYgmQzCwo7hcW2hgJ2yj+a0VFzWZecCP5B2 qKi9vpwPiERskQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeduledrvdefvddgtdejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpefhvffufffkofgjfhgggfestdekre dtredttdenucfhrhhomhepkfguohcuufgthhhimhhmvghluceoihguohhstghhsehiugho shgthhdrohhrgheqnecuggftrfgrthhtvghrnhepudetieevffffveelkeeljeffkefhke ehgfdtffethfelvdejgffghefgveejkefhnecukfhppeduleefrdegjedrudeihedrvdeh udenucevlhhushhtvghrufhiiigvpedunecurfgrrhgrmhepmhgrihhlfhhrohhmpehiug hoshgthhesihguohhstghhrdhorhhg X-ME-Proxy: Received: from shredder.mellanox.com (unknown [193.47.165.251]) by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 2 May 2021 12:25:04 -0400 (EDT) From: Ido Schimmel To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, dsahern@gmail.com, petrm@nvidia.com, roopa@nvidia.com, nikolay@nvidia.com, ssuryaextr@gmail.com, mlxsw@nvidia.com, Ido Schimmel Subject: [RFC PATCH net-next 10/10] selftests: forwarding: Add test for custom multipath hash with IPv6 GRE Date: Sun, 2 May 2021 19:22:57 +0300 Message-Id: <20210502162257.3472453-11-idosch@idosch.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210502162257.3472453-1-idosch@idosch.org> References: <20210502162257.3472453-1-idosch@idosch.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org X-Patchwork-State: RFC From: Ido Schimmel Test that when the hash policy is set to custom, traffic is distributed only according to the inner fields set in the fib_multipath_hash_fields sysctl. Each time set a different field and make sure traffic is only distributed when the field is changed in the packet stream. The test only verifies the behavior of IPv4/IPv6 overlays on top of an IPv6 underlay network. The previous patch verified the same with an IPv4 underlay network. Example output: # ./ip6gre_custom_multipath_hash.sh TEST: ping [ OK ] TEST: ping6 [ OK ] INFO: Running IPv4 overlay custom multipath hash tests TEST: Multipath hash field: Inner source IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6602 / 6002 TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 1 / 12601 TEST: Multipath hash field: Inner destination IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6802 / 5801 TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12602 / 3 TEST: Multipath hash field: Inner source port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16431 / 16344 TEST: Multipath hash field: Inner source port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 0 / 32773 TEST: Multipath hash field: Inner destination port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16431 / 16344 TEST: Multipath hash field: Inner destination port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 2 / 32772 INFO: Running IPv6 overlay custom multipath hash tests TEST: Multipath hash field: Inner source IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 6704 / 5902 TEST: Multipath hash field: Inner source IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 1 / 12600 TEST: Multipath hash field: Inner destination IP (balanced) [ OK ] INFO: Packets sent on path1 / path2: 5751 / 6852 TEST: Multipath hash field: Inner destination IP (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 12602 / 0 TEST: Multipath hash field: Inner flowlabel (balanced) [ OK ] INFO: Packets sent on path1 / path2: 8272 / 8181 TEST: Multipath hash field: Inner flowlabel (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 3 / 12602 TEST: Multipath hash field: Inner source port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16424 / 16351 TEST: Multipath hash field: Inner source port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 3 / 32774 TEST: Multipath hash field: Inner destination port (balanced) [ OK ] INFO: Packets sent on path1 / path2: 16425 / 16350 TEST: Multipath hash field: Inner destination port (unbalanced) [ OK ] INFO: Packets sent on path1 / path2: 2 / 32773 Signed-off-by: Ido Schimmel --- .../ip6gre_custom_multipath_hash.sh | 458 ++++++++++++++++++ 1 file changed, 458 insertions(+) create mode 100755 tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh diff --git a/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh b/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh new file mode 100755 index 000000000000..8436f182d6a6 --- /dev/null +++ b/tools/testing/selftests/net/forwarding/ip6gre_custom_multipath_hash.sh @@ -0,0 +1,458 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Test traffic distribution when there are multiple paths between an IPv6 GRE +# tunnel. The tunnel carries IPv4 and IPv6 traffic between multiple hosts. +# Multiple routes are in the underlay network. With the default multipath +# policy, SW2 will only look at the outer IP addresses, hence only a single +# route would be used. +# +# +--------------------------------+ +# | H1 | +# | $h1 + | +# | 198.51.100.{2-253}/24 | | +# | 2001:db8:1::{2-fd}/64 | | +# +-------------------------|------+ +# | +# +-------------------------|-------------------+ +# | SW1 | | +# | $ol1 + | +# | 198.51.100.1/24 | +# | 2001:db8:1::1/64 | +# | | +# |+ g1 (ip6gre) | +# | loc=2001:db8:3::1 | +# | rem=2001:db8:3::2 -. | +# | tos=inherit | | +# | v | +# | + $ul1 | +# | | 2001:db8:10::1/64 | +# +---------------------|-----------------------+ +# | +# +---------------------|-----------------------+ +# | SW2 | | +# | $ul21 + | +# | 2001:db8:10::2/64 | | +# | | | +# ! __________________+___ | +# | / \ | +# | | | | +# | + $ul22.111 (vlan) + $ul22.222 (vlan) | +# | | 2001:db8:11::1/64 | 2001:db8:12::1/64 | +# | | | | +# +--|----------------------|-------------------+ +# | | +# +--|----------------------|-------------------+ +# | | | | +# | + $ul32.111 (vlan) + $ul32.222 (vlan) | +# | | 2001:db8:11::2/64 | 2001:db8:12::2/64 | +# | | | | +# | \__________________+___/ | +# | | | +# | | | +# | $ul31 + | +# | 2001:db8:13::1/64 | SW3 | +# +---------------------|-----------------------+ +# | +# +---------------------|-----------------------+ +# | + $ul4 | +# | ^ 2001:db8:13::2/64 | +# | | | +# |+ g2 (ip6gre) | | +# | loc=2001:db8:3::2 | | +# | rem=2001:db8:3::1 -' | +# | tos=inherit | +# | | +# | $ol4 + | +# | 203.0.113.1/24 | | +# | 2001:db8:2::1/64 | SW4 | +# +-------------------------|-------------------+ +# | +# +-------------------------|------+ +# | | | +# | $h2 + | +# | 203.0.113.{2-253}/24 | +# | 2001:db8:2::{2-fd}/64 H2 | +# +--------------------------------+ + +ALL_TESTS=" + ping_ipv4 + ping_ipv6 + custom_hash +" + +NUM_NETIFS=10 +source lib.sh + +h1_create() +{ + simple_if_init $h1 198.51.100.2/24 2001:db8:1::2/64 + ip route add vrf v$h1 default via 198.51.100.1 dev $h1 + ip -6 route add vrf v$h1 default via 2001:db8:1::1 dev $h1 +} + +h1_destroy() +{ + ip -6 route del vrf v$h1 default + ip route del vrf v$h1 default + simple_if_fini $h1 198.51.100.2/24 2001:db8:1::2/64 +} + +sw1_create() +{ + simple_if_init $ol1 198.51.100.1/24 2001:db8:1::1/64 + __simple_if_init $ul1 v$ol1 2001:db8:10::1/64 + + tunnel_create g1 ip6gre 2001:db8:3::1 2001:db8:3::2 tos inherit \ + dev v$ol1 + __simple_if_init g1 v$ol1 2001:db8:3::1/128 + ip route add vrf v$ol1 2001:db8:3::2/128 via 2001:db8:10::2 + + ip route add vrf v$ol1 203.0.113.0/24 dev g1 + ip -6 route add vrf v$ol1 2001:db8:2::/64 dev g1 +} + +sw1_destroy() +{ + ip -6 route del vrf v$ol1 2001:db8:2::/64 + ip route del vrf v$ol1 203.0.113.0/24 + + ip route del vrf v$ol1 2001:db8:3::2/128 + __simple_if_fini g1 2001:db8:3::1/128 + tunnel_destroy g1 + + __simple_if_fini $ul1 2001:db8:10::1/64 + simple_if_fini $ol1 198.51.100.1/24 2001:db8:1::1/64 +} + +sw2_create() +{ + simple_if_init $ul21 2001:db8:10::2/64 + __simple_if_init $ul22 v$ul21 + vlan_create $ul22 111 v$ul21 2001:db8:11::1/64 + vlan_create $ul22 222 v$ul21 2001:db8:12::1/64 + + ip -6 route add vrf v$ul21 2001:db8:3::1/128 via 2001:db8:10::1 + ip -6 route add vrf v$ul21 2001:db8:3::2/128 \ + nexthop via 2001:db8:11::2 \ + nexthop via 2001:db8:12::2 +} + +sw2_destroy() +{ + ip -6 route del vrf v$ul21 2001:db8:3::2/128 + ip -6 route del vrf v$ul21 2001:db8:3::1/128 + + vlan_destroy $ul22 222 + vlan_destroy $ul22 111 + __simple_if_fini $ul22 + simple_if_fini $ul21 2001:db8:10::2/64 +} + +sw3_create() +{ + simple_if_init $ul31 2001:db8:13::1/64 + __simple_if_init $ul32 v$ul31 + vlan_create $ul32 111 v$ul31 2001:db8:11::2/64 + vlan_create $ul32 222 v$ul31 2001:db8:12::2/64 + + ip -6 route add vrf v$ul31 2001:db8:3::2/128 via 2001:db8:13::2 + ip -6 route add vrf v$ul31 2001:db8:3::1/128 \ + nexthop via 2001:db8:11::1 \ + nexthop via 2001:db8:12::1 + + tc qdisc add dev $ul32 clsact + tc filter add dev $ul32 ingress pref 111 prot 802.1Q \ + flower vlan_id 111 action pass + tc filter add dev $ul32 ingress pref 222 prot 802.1Q \ + flower vlan_id 222 action pass +} + +sw3_destroy() +{ + tc qdisc del dev $ul32 clsact + + ip -6 route del vrf v$ul31 2001:db8:3::1/128 + ip -6 route del vrf v$ul31 2001:db8:3::2/128 + + vlan_destroy $ul32 222 + vlan_destroy $ul32 111 + __simple_if_fini $ul32 + simple_if_fini $ul31 2001:db8:13::1/64 +} + +sw4_create() +{ + simple_if_init $ol4 203.0.113.1/24 2001:db8:2::1/64 + __simple_if_init $ul4 v$ol4 2001:db8:13::2/64 + + tunnel_create g2 ip6gre 2001:db8:3::2 2001:db8:3::1 tos inherit \ + dev v$ol4 + __simple_if_init g2 v$ol4 2001:db8:3::2/128 + ip -6 route add vrf v$ol4 2001:db8:3::1/128 via 2001:db8:13::1 + + ip route add vrf v$ol4 198.51.100.0/24 dev g2 + ip -6 route add vrf v$ol4 2001:db8:1::/64 dev g2 +} + +sw4_destroy() +{ + ip -6 route del vrf v$ol4 2001:db8:1::/64 + ip route del vrf v$ol4 198.51.100.0/24 + + ip -6 route del vrf v$ol4 2001:db8:3::1/128 + __simple_if_fini g2 2001:db8:3::2/128 + tunnel_destroy g2 + + __simple_if_fini $ul4 2001:db8:13::2/64 + simple_if_fini $ol4 203.0.113.1/24 2001:db8:2::1/64 +} + +h2_create() +{ + simple_if_init $h2 203.0.113.2/24 2001:db8:2::2/64 + ip route add vrf v$h2 default via 203.0.113.1 dev $h2 + ip -6 route add vrf v$h2 default via 2001:db8:2::1 dev $h2 +} + +h2_destroy() +{ + ip -6 route del vrf v$h2 default + ip route del vrf v$h2 default + simple_if_fini $h2 203.0.113.2/24 2001:db8:2::2/64 +} + +setup_prepare() +{ + h1=${NETIFS[p1]} + + ol1=${NETIFS[p2]} + ul1=${NETIFS[p3]} + + ul21=${NETIFS[p4]} + ul22=${NETIFS[p5]} + + ul32=${NETIFS[p6]} + ul31=${NETIFS[p7]} + + ul4=${NETIFS[p8]} + ol4=${NETIFS[p9]} + + h2=${NETIFS[p10]} + + vrf_prepare + h1_create + sw1_create + sw2_create + sw3_create + sw4_create + h2_create + + forwarding_enable +} + +cleanup() +{ + pre_cleanup + + forwarding_restore + + h2_destroy + sw4_destroy + sw3_destroy + sw2_destroy + sw1_destroy + h1_destroy + vrf_cleanup +} + +ping_ipv4() +{ + ping_test $h1 203.0.113.2 +} + +ping_ipv6() +{ + ping6_test $h1 2001:db8:2::2 +} + +send_src_ipv4() +{ + $MZ $h1 -q -p 64 -A "198.51.100.2-198.51.100.253" -B 203.0.113.2 \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_dst_ipv4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B "203.0.113.2-203.0.113.253" \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_src_udp4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + -d 1msec -t udp "sp=0-32768,dp=30000" +} + +send_dst_udp4() +{ + $MZ $h1 -q -p 64 -A 198.51.100.2 -B 203.0.113.2 \ + -d 1msec -t udp "sp=20000,dp=0-32768" +} + +send_src_ipv6() +{ + $MZ -6 $h1 -q -p 64 -A "2001:db8:1::2-2001:db8:1::fd" -B 2001:db8:2::2 \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_dst_ipv6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B "2001:db8:2::2-2001:db8:2::fd" \ + -d 1msec -c 50 -t udp "sp=20000,dp=30000" +} + +send_flowlabel() +{ + # Generate 16384 echo requests, each with a random flow label. + for _ in $(seq 1 16384); do + ip vrf exec v$h1 \ + $PING6 2001:db8:2::2 -F 0 -c 1 -q >/dev/null 2>&1 + done +} + +send_src_udp6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + -d 1msec -t udp "sp=0-32768,dp=30000" +} + +send_dst_udp6() +{ + $MZ -6 $h1 -q -p 64 -A 2001:db8:1::2 -B 2001:db8:2::2 \ + -d 1msec -t udp "sp=20000,dp=0-32768" +} + +custom_hash_test() +{ + local field="$1"; shift + local balanced="$1"; shift + local send_flows="$@" + + RET=0 + + local t0_111=$(tc_rule_stats_get $ul32 111 ingress) + local t0_222=$(tc_rule_stats_get $ul32 222 ingress) + + $send_flows + + local t1_111=$(tc_rule_stats_get $ul32 111 ingress) + local t1_222=$(tc_rule_stats_get $ul32 222 ingress) + + local d111=$((t1_111 - t0_111)) + local d222=$((t1_222 - t0_222)) + + local diff=$((d222 - d111)) + local sum=$((d111 + d222)) + + local pct=$(echo "$diff / $sum * 100" | bc -l) + local is_balanced=$(echo "-20 <= $pct && $pct <= 20" | bc) + + [[ ( $is_balanced -eq 1 && $balanced == "balanced" ) || + ( $is_balanced -eq 0 && $balanced == "unbalanced" ) ]] + check_err $? "Expected traffic to be $balanced, but it is not" + + log_test "Multipath hash field: $field ($balanced)" + log_info "Packets sent on path1 / path2: $d111 / $d222" +} + +custom_hash_v4() +{ + log_info "Running IPv4 overlay custom multipath hash tests" + + # Prevent the neighbour table from overflowing, as different neighbour + # entries will be created on $ol4 when using different destination IPs. + sysctl_set net.ipv4.neigh.default.gc_thresh1 1024 + sysctl_set net.ipv4.neigh.default.gc_thresh2 1024 + sysctl_set net.ipv4.neigh.default.gc_thresh3 1024 + + sysctl_set net.ipv6.fib_multipath_hash_fields 6 + custom_hash_test "Inner source IP" "balanced" send_src_ipv4 + custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv4 + + sysctl_set net.ipv6.fib_multipath_hash_fields 7 + custom_hash_test "Inner destination IP" "balanced" send_dst_ipv4 + custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv4 + + sysctl_set net.ipv6.fib_multipath_hash_fields 10 + custom_hash_test "Inner source port" "balanced" send_src_udp4 + custom_hash_test "Inner source port" "unbalanced" send_dst_udp4 + + sysctl_set net.ipv6.fib_multipath_hash_fields 11 + custom_hash_test "Inner destination port" "balanced" send_dst_udp4 + custom_hash_test "Inner destination port" "unbalanced" send_src_udp4 + + sysctl_restore net.ipv4.neigh.default.gc_thresh3 + sysctl_restore net.ipv4.neigh.default.gc_thresh2 + sysctl_restore net.ipv4.neigh.default.gc_thresh1 +} + +custom_hash_v6() +{ + log_info "Running IPv6 overlay custom multipath hash tests" + + # Prevent the neighbour table from overflowing, as different neighbour + # entries will be created on $ol4 when using different destination IPs. + sysctl_set net.ipv6.neigh.default.gc_thresh1 1024 + sysctl_set net.ipv6.neigh.default.gc_thresh2 1024 + sysctl_set net.ipv6.neigh.default.gc_thresh3 1024 + + sysctl_set net.ipv6.fib_multipath_hash_fields 6 + custom_hash_test "Inner source IP" "balanced" send_src_ipv6 + custom_hash_test "Inner source IP" "unbalanced" send_dst_ipv6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 7 + custom_hash_test "Inner destination IP" "balanced" send_dst_ipv6 + custom_hash_test "Inner destination IP" "unbalanced" send_src_ipv6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 9 + custom_hash_test "Inner flowlabel" "balanced" send_flowlabel + custom_hash_test "Inner flowlabel" "unbalanced" send_src_ipv6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 10 + custom_hash_test "Inner source port" "balanced" send_src_udp6 + custom_hash_test "Inner source port" "unbalanced" send_dst_udp6 + + sysctl_set net.ipv6.fib_multipath_hash_fields 11 + custom_hash_test "Inner destination port" "balanced" send_dst_udp6 + custom_hash_test "Inner destination port" "unbalanced" send_src_udp6 + + sysctl_restore net.ipv6.neigh.default.gc_thresh3 + sysctl_restore net.ipv6.neigh.default.gc_thresh2 + sysctl_restore net.ipv6.neigh.default.gc_thresh1 +} + +custom_hash() +{ + # Test that when the hash policy is set to custom, traffic is + # distributed only according to the fields set in the + # fib_multipath_hash_fields sysctl. + # + # Each time set a different field and make sure traffic is only + # distributed when the field is changed in the packet stream. + + sysctl_set net.ipv6.fib_multipath_hash_policy 3 + + custom_hash_v4 + custom_hash_v6 + + sysctl_restore net.ipv6.fib_multipath_hash_policy +} + +trap cleanup EXIT + +setup_prepare +setup_wait +tests_run + +exit $EXIT_STATUS