From patchwork Wed May 26 17:16:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Justin Iurman X-Patchwork-Id: 12282445 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 520AEC47082 for ; Wed, 26 May 2021 17:27:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2E6EE613F4 for ; Wed, 26 May 2021 17:27:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233526AbhEZR26 (ORCPT ); Wed, 26 May 2021 13:28:58 -0400 Received: from serv108.segi.ulg.ac.be ([139.165.32.111]:46307 "EHLO serv108.segi.ulg.ac.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232884AbhEZR25 (ORCPT ); Wed, 26 May 2021 13:28:57 -0400 Received: from ubuntu18.home (135.19-200-80.adsl-dyn.isp.belgacom.be [80.200.19.135]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by serv108.segi.ulg.ac.be (Postfix) with ESMTPSA id BFF07200DFBB; Wed, 26 May 2021 19:17:10 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 serv108.segi.ulg.ac.be BFF07200DFBB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uliege.be; s=ulg20190529; t=1622049430; bh=zd6T88m0ywsXml7q9DhEmUr6rv4zabWc1VlRCgydCI8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Fvg4a+8tFQwI7xEUagavhe8dAMbUVbDC4O6goZYSRT1wodi4WO2X4WqX0po75wHzW zfg9zH73wQi9ihLc+fHtP44uyfvxV7hm6xzHnzQhzZyX0vgQhdzwCtKg+Om7QXAe2C NvHiAWmw/9+0STvG5rDUrVKH3L43ufcUOFThpweXhm32mNVCmVTzob2I9SjkktJWZE kLc68BpRJjAbgemAycxKHotOxNgZU2/4Wi+6wBdlw9YbPUANqZLjwxMilH1mazmEXQ YGkpSnjngRxjviM4CQHRD0m0sq9omhd1affPl6G3zjbHGEDQGEcIN5icx12WoAkOgF Vq5d1gosbktyg== From: Justin Iurman To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, tom@herbertland.com, justin.iurman@uliege.be Subject: [RESEND PATCH net-next v3 1/5] uapi: IPv6 IOAM headers definition Date: Wed, 26 May 2021 19:16:36 +0200 Message-Id: <20210526171640.9722-2-justin.iurman@uliege.be> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210526171640.9722-1-justin.iurman@uliege.be> References: <20210526171640.9722-1-justin.iurman@uliege.be> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org This patch provides the IPv6 IOAM option header [1] as well as the IOAM Trace header [2]. An IOAM option must be 4n-aligned. Here is an overview of a Hop-by-Hop with an IOAM Trace option: +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Next header | Hdr Ext Len | Padding | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Option Type | Opt Data Len | Reserved | IOAM Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Namespace-ID | NodeLen | Flags | RemainingLen| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IOAM-Trace-Type | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ | | | | node data [0] | | | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ D | | a | node data [1] | t | | a +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ~ ... ~ S +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ p | | a | node data [n-1] | c | | e +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | | | node data [n] | | | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ The IOAM option header starts at "Option Type" and ends after "IOAM Type". The IOAM Trace header starts at "Namespace-ID" and ends after "IOAM-Trace-Type/Reserved". IOAM Type: either Pre-allocated Trace (=0), Incremental Trace (=1), Proof-of-Transit (=2) or Edge-to-Edge (=3). Note that both the Pre-allocated Trace and the Incremental Trace look the same. The two others are not implemented. Namespace-ID: IOAM namespace identifier, not to be confused with network namespaces. It adds further context to IOAM options and associated data, and allows devices which are IOAM capable to determine whether IOAM options must be processed or ignored. It can also be used by an operator to distinguish different operational domains or to identify different sets of devices. NodeLen: Length of data added by each node. It depends on the Trace Type. Flags: Only the Overflow (O) flag for now. The O flag is set by a transit node when there are not enough octets left to record its data. RemainingLen: Remaining free space to record data. IOAM-Trace-Type: Bit field where each bit corresponds to a specific kind of IOAM data. See [2] for a detailed list. [1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options [2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data Signed-off-by: Justin Iurman --- include/uapi/linux/ioam6.h | 123 +++++++++++++++++++++++++++++++++++++ 1 file changed, 123 insertions(+) create mode 100644 include/uapi/linux/ioam6.h diff --git a/include/uapi/linux/ioam6.h b/include/uapi/linux/ioam6.h new file mode 100644 index 000000000000..2177e4e49566 --- /dev/null +++ b/include/uapi/linux/ioam6.h @@ -0,0 +1,123 @@ +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ +/* + * IPv6 IOAM implementation + * + * Author: + * Justin Iurman + */ + +#ifndef _UAPI_LINUX_IOAM6_H +#define _UAPI_LINUX_IOAM6_H + +#include +#include + +/* + * IPv6 IOAM Option Header + */ +struct ioam6_hdr { + __u8 opt_type; + __u8 opt_len; + __u8 :8; /* reserved */ +#define IOAM6_TYPE_PREALLOC 0 + __u8 type; +} __attribute__((packed)); + +/* + * IOAM Trace Header + */ +struct ioam6_trace_hdr { + __be16 namespace_id; + +#if defined(__LITTLE_ENDIAN_BITFIELD) + + __u8 :1, /* unused */ + :1, /* unused */ + overflow:1, + nodelen:5; + + __u8 remlen:7, + :1; /* unused */ + + union { + __be32 type_be32; + + struct { + __u32 bit7:1, + bit6:1, + bit5:1, + bit4:1, + bit3:1, + bit2:1, + bit1:1, + bit0:1, + bit15:1, /* unused */ + bit14:1, /* unused */ + bit13:1, /* unused */ + bit12:1, /* unused */ + bit11:1, + bit10:1, + bit9:1, + bit8:1, + bit23:1, /* reserved */ + bit22:1, + bit21:1, /* unused */ + bit20:1, /* unused */ + bit19:1, /* unused */ + bit18:1, /* unused */ + bit17:1, /* unused */ + bit16:1, /* unused */ + :8; /* reserved */ + } type; + }; + +#elif defined(__BIG_ENDIAN_BITFIELD) + + __u8 nodelen:5, + overflow:1, + :1, /* unused */ + :1; /* unused */ + + __u8 :1, /* unused */ + remlen:7; + + union { + __be32 type_be32; + + struct { + __u32 bit0:1, + bit1:1, + bit2:1, + bit3:1, + bit4:1, + bit5:1, + bit6:1, + bit7:1, + bit8:1, + bit9:1, + bit10:1, + bit11:1, + bit12:1, /* unused */ + bit13:1, /* unused */ + bit14:1, /* unused */ + bit15:1, /* unused */ + bit16:1, /* unused */ + bit17:1, /* unused */ + bit18:1, /* unused */ + bit19:1, /* unused */ + bit20:1, /* unused */ + bit21:1, /* unused */ + bit22:1, + bit23:1, /* reserved */ + :8; /* reserved */ + } type; + }; + +#else +#error "Please fix " +#endif + + __u8 data[0]; +} __attribute__((packed)); + +#endif /* _UAPI_LINUX_IOAM6_H */ From patchwork Wed May 26 17:16:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Justin Iurman X-Patchwork-Id: 12282449 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ACA91C4708A for ; Wed, 26 May 2021 17:27:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 89384613C3 for ; Wed, 26 May 2021 17:27:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233517AbhEZR3B (ORCPT ); Wed, 26 May 2021 13:29:01 -0400 Received: from serv108.segi.ulg.ac.be ([139.165.32.111]:46304 "EHLO serv108.segi.ulg.ac.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232869AbhEZR25 (ORCPT ); Wed, 26 May 2021 13:28:57 -0400 Received: from ubuntu18.home (135.19-200-80.adsl-dyn.isp.belgacom.be [80.200.19.135]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by serv108.segi.ulg.ac.be (Postfix) with ESMTPSA id E3290200DFBE; Wed, 26 May 2021 19:17:10 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 serv108.segi.ulg.ac.be E3290200DFBE DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uliege.be; s=ulg20190529; t=1622049431; bh=P4F5qBoeFpf3XABE7pJhpMn7YmD3awxR3ffQRSIIbVA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SCN5opGHkJeg89MCuefQZ2YDHEXnUJTk0GH8ObgbLI6cmJ5X9uMq0II26BIkLNEwI 7OmILv078II0Sxj/pjyQQD66AXKj8JYstONd5WjNC5pogcDs81WccocGCQw67MMFuK eUyzEYgGl6jHfTeJGbv6f/kIyEz/sRlk06LAE/n/H/xNdw2UrX3ma8Cadv4/PmImZu 0qCQRte5cZXwZWxyy9R0EUkzUhWi1/14pki/3ET9Ya9ruCFEZWXoAw7fXxgbG+BaXT SDvlVQGuPgHukolQo9OKnPj7gZXfVz3P1Me3WkeNYXxUqbrwLLON2xWzqaAWilfjwF 1rYqABGxWNPog== From: Justin Iurman To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, tom@herbertland.com, justin.iurman@uliege.be Subject: [RESEND PATCH net-next v3 2/5] ipv6: ioam: Data plane support for Pre-allocated Trace Date: Wed, 26 May 2021 19:16:37 +0200 Message-Id: <20210526171640.9722-3-justin.iurman@uliege.be> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210526171640.9722-1-justin.iurman@uliege.be> References: <20210526171640.9722-1-justin.iurman@uliege.be> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Implement support for processing the IOAM Pre-allocated Trace with IPv6, see [1] and [2]. Introduce a new IPv6 Hop-by-Hop TLV option, see IANA [3]. A per-interface sysctl ioam6_enabled is provided to process/ignore IOAM headers. Default is ignore (= disabled). Another per-interface sysctl ioam6_id is provided to define the IOAM (unique) identifier of the interface. Default is 0. A per-namespace sysctl ioam6_id is provided to define the IOAM (unique) identifier of the node. Default is 0. Documentation is provided at the end of this patchset. Two relativistic hash tables: one for IOAM namespaces, the other for IOAM schemas. A namespace can only have a single active schema and a schema can only be attached to a single namespace (1:1 relationship). [1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options [2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data [3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2 Signed-off-by: Justin Iurman --- include/linux/ioam6.h | 13 ++ include/linux/ipv6.h | 2 + include/net/ioam6.h | 62 +++++++ include/net/netns/ipv6.h | 2 + include/uapi/linux/in6.h | 1 + include/uapi/linux/ipv6.h | 2 + net/ipv6/Makefile | 2 +- net/ipv6/addrconf.c | 20 +++ net/ipv6/af_inet6.c | 7 + net/ipv6/exthdrs.c | 51 ++++++ net/ipv6/ioam6.c | 357 +++++++++++++++++++++++++++++++++++++ net/ipv6/sysctl_net_ipv6.c | 7 + 12 files changed, 525 insertions(+), 1 deletion(-) create mode 100644 include/linux/ioam6.h create mode 100644 include/net/ioam6.h create mode 100644 net/ipv6/ioam6.c diff --git a/include/linux/ioam6.h b/include/linux/ioam6.h new file mode 100644 index 000000000000..94a24b36998f --- /dev/null +++ b/include/linux/ioam6.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * IPv6 IOAM + * + * Author: + * Justin Iurman + */ +#ifndef _LINUX_IOAM6_H +#define _LINUX_IOAM6_H + +#include + +#endif /* _LINUX_IOAM6_H */ diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index 70b2ad3b9884..6cc372af2319 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -76,6 +76,8 @@ struct ipv6_devconf { __s32 disable_policy; __s32 ndisc_tclass; __s32 rpl_seg_enabled; + __u32 ioam6_enabled; + __u32 ioam6_id; struct ctl_table_header *sysctl_header; }; diff --git a/include/net/ioam6.h b/include/net/ioam6.h new file mode 100644 index 000000000000..828b83c70721 --- /dev/null +++ b/include/net/ioam6.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * IPv6 IOAM implementation + * + * Author: + * Justin Iurman + */ + +#ifndef _NET_IOAM6_H +#define _NET_IOAM6_H + +#include +#include +#include +#include + +struct ioam6_namespace { + struct rhash_head head; + struct rcu_head rcu; + + __be16 id; + __be64 data; + + struct ioam6_schema *schema; +}; + +struct ioam6_schema { + struct rhash_head head; + struct rcu_head rcu; + + u32 id; + int len; + __be32 hdr; + u8 *data; + + struct ioam6_namespace *ns; +}; + +struct ioam6_pernet_data { + struct mutex lock; + struct rhashtable namespaces; + struct rhashtable schemas; +}; + +static inline struct ioam6_pernet_data *ioam6_pernet(struct net *net) +{ +#if IS_ENABLED(CONFIG_IPV6) + return net->ipv6.ioam6_data; +#else + return NULL; +#endif +} + +extern struct ioam6_namespace *ioam6_namespace(struct net *net, __be16 id); +extern void ioam6_fill_trace_data(struct sk_buff *skb, + struct ioam6_namespace *ns, + struct ioam6_trace_hdr *trace); + +extern int ioam6_init(void); +extern void ioam6_exit(void); + +#endif /* _NET_IOAM6_H */ diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h index bde0b7adb4a3..a0d61a8fcfe1 100644 --- a/include/net/netns/ipv6.h +++ b/include/net/netns/ipv6.h @@ -53,6 +53,7 @@ struct netns_sysctl_ipv6 { int seg6_flowlabel; bool skip_notify_on_dev_down; u8 fib_notify_on_flag_change; + unsigned int ioam6_id; }; struct netns_ipv6 { @@ -110,6 +111,7 @@ struct netns_ipv6 { spinlock_t lock; u32 seq; } ip6addrlbl_table; + struct ioam6_pernet_data *ioam6_data; }; #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6) diff --git a/include/uapi/linux/in6.h b/include/uapi/linux/in6.h index 5ad396a57eb3..c4c53a9ab959 100644 --- a/include/uapi/linux/in6.h +++ b/include/uapi/linux/in6.h @@ -145,6 +145,7 @@ struct in6_flowlabel_req { #define IPV6_TLV_PADN 1 #define IPV6_TLV_ROUTERALERT 5 #define IPV6_TLV_CALIPSO 7 /* RFC 5570 */ +#define IPV6_TLV_IOAM 49 /* TEMPORARY IANA allocation for IOAM */ #define IPV6_TLV_JUMBO 194 #define IPV6_TLV_HAO 201 /* home address option */ diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h index 70603775fe91..885c29e3a8d6 100644 --- a/include/uapi/linux/ipv6.h +++ b/include/uapi/linux/ipv6.h @@ -190,6 +190,8 @@ enum { DEVCONF_NDISC_TCLASS, DEVCONF_RPL_SEG_ENABLED, DEVCONF_RA_DEFRTR_METRIC, + DEVCONF_IOAM6_ENABLED, + DEVCONF_IOAM6_ID, DEVCONF_MAX }; diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index cf7b47bdb9b3..b7ef10d417d6 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -10,7 +10,7 @@ ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \ route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \ raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \ exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \ - udp_offload.o seg6.o fib6_notifier.o rpl.o + udp_offload.o seg6.o fib6_notifier.o rpl.o ioam6.o ipv6-offload := ip6_offload.o tcpv6_offload.o exthdrs_offload.o diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index b0ef65eb9bd2..c068d3b683c7 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -237,6 +237,8 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = { .addr_gen_mode = IN6_ADDR_GEN_MODE_EUI64, .disable_policy = 0, .rpl_seg_enabled = 0, + .ioam6_enabled = 0, + .ioam6_id = 0, }; static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { @@ -293,6 +295,8 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { .addr_gen_mode = IN6_ADDR_GEN_MODE_EUI64, .disable_policy = 0, .rpl_seg_enabled = 0, + .ioam6_enabled = 0, + .ioam6_id = 0, }; /* Check if link is ready: is it up and is a valid qdisc available */ @@ -5526,6 +5530,8 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf, array[DEVCONF_DISABLE_POLICY] = cnf->disable_policy; array[DEVCONF_NDISC_TCLASS] = cnf->ndisc_tclass; array[DEVCONF_RPL_SEG_ENABLED] = cnf->rpl_seg_enabled; + array[DEVCONF_IOAM6_ENABLED] = cnf->ioam6_enabled; + array[DEVCONF_IOAM6_ID] = cnf->ioam6_id; } static inline size_t inet6_ifla6_size(void) @@ -6932,6 +6938,20 @@ static const struct ctl_table addrconf_sysctl[] = { .mode = 0644, .proc_handler = proc_dointvec, }, + { + .procname = "ioam6_enabled", + .data = &ipv6_devconf.ioam6_enabled, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, + }, + { + .procname = "ioam6_id", + .data = &ipv6_devconf.ioam6_id, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, + }, { /* sentinel */ } diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 2389ff702f51..aec9664ec909 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -62,6 +62,7 @@ #include #include #include +#include #include #include @@ -1191,6 +1192,10 @@ static int __init inet6_init(void) if (err) goto rpl_fail; + err = ioam6_init(); + if (err) + goto ioam6_fail; + err = igmp6_late_init(); if (err) goto igmp6_late_err; @@ -1214,6 +1219,8 @@ static int __init inet6_init(void) #endif igmp6_late_err: rpl_exit(); +ioam6_fail: + ioam6_exit(); rpl_fail: seg6_exit(); seg6_fail: diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 56e479d158b7..93c4e7a409e3 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -49,6 +49,9 @@ #include #endif #include +#include +#include +#include #include @@ -929,6 +932,50 @@ static bool ipv6_hop_ra(struct sk_buff *skb, int optoff) return false; } +/* IOAM */ + +static bool ipv6_hop_ioam(struct sk_buff *skb, int optoff) +{ + struct ioam6_trace_hdr *trace; + struct ioam6_namespace *ns; + struct ioam6_hdr *hdr; + + /* Must be 4n-aligned */ + if (optoff & 3) + goto drop; + + /* Ignore if IOAM is not enabled on ingress */ + if (!__in6_dev_get(skb->dev)->cnf.ioam6_enabled) + goto ignore; + + hdr = (struct ioam6_hdr *)(skb_network_header(skb) + optoff); + + switch (hdr->type) { + case IOAM6_TYPE_PREALLOC: + trace = (struct ioam6_trace_hdr *)((u8 *)hdr + sizeof(*hdr)); + ns = ioam6_namespace(ipv6_skb_net(skb), trace->namespace_id); + + /* Ignore if the IOAM namespace is unknown */ + if (!ns) + goto ignore; + + if (!skb_valid_dst(skb)) + ip6_route_input(skb); + + ioam6_fill_trace_data(skb, ns, trace); + break; + default: + break; + } + +ignore: + return true; + +drop: + kfree_skb(skb); + return false; +} + /* Jumbo payload */ static bool ipv6_hop_jumbo(struct sk_buff *skb, int optoff) @@ -1000,6 +1047,10 @@ static const struct tlvtype_proc tlvprochopopt_lst[] = { .type = IPV6_TLV_ROUTERALERT, .func = ipv6_hop_ra, }, + { + .type = IPV6_TLV_IOAM, + .func = ipv6_hop_ioam, + }, { .type = IPV6_TLV_JUMBO, .func = ipv6_hop_jumbo, diff --git a/net/ipv6/ioam6.c b/net/ipv6/ioam6.c new file mode 100644 index 000000000000..dcec24e09e99 --- /dev/null +++ b/net/ipv6/ioam6.c @@ -0,0 +1,357 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * IPv6 IOAM implementation + * + * Author: + * Justin Iurman + */ + +#include +#include +#include +#include +#include +#include + +#include +#include + +#define IOAM6_EMPTY_u16 0xffff +#define IOAM6_EMPTY_u24 0x00ffffff +#define IOAM6_EMPTY_u32 0xffffffff +#define IOAM6_EMPTY_u56 0x00ffffffffffffff + +#define IOAM6_MASK_u24 IOAM6_EMPTY_u24 +#define IOAM6_MASK_u56 IOAM6_EMPTY_u56 + +static inline void ioam6_ns_release(struct ioam6_namespace *ns) +{ + kfree_rcu(ns, rcu); +} + +static inline void ioam6_sc_release(struct ioam6_schema *sc) +{ + kfree_rcu(sc, rcu); +} + +static void ioam6_free_ns(void *ptr, void *arg) +{ + struct ioam6_namespace *ns = (struct ioam6_namespace *)ptr; + + if (ns) + ioam6_ns_release(ns); +} + +static void ioam6_free_sc(void *ptr, void *arg) +{ + struct ioam6_schema *sc = (struct ioam6_schema *)ptr; + + if (sc) + ioam6_sc_release(sc); +} + +static int ioam6_ns_cmpfn(struct rhashtable_compare_arg *arg, const void *obj) +{ + const struct ioam6_namespace *ns = obj; + + return (ns->id != *(__be16 *)arg->key); +} + +static int ioam6_sc_cmpfn(struct rhashtable_compare_arg *arg, const void *obj) +{ + const struct ioam6_schema *sc = obj; + + return (sc->id != *(u32 *)arg->key); +} + +static const struct rhashtable_params rht_ns_params = { + .key_len = sizeof(__be16), + .key_offset = offsetof(struct ioam6_namespace, id), + .head_offset = offsetof(struct ioam6_namespace, head), + .automatic_shrinking = true, + .obj_cmpfn = ioam6_ns_cmpfn, +}; + +static const struct rhashtable_params rht_sc_params = { + .key_len = sizeof(u32), + .key_offset = offsetof(struct ioam6_schema, id), + .head_offset = offsetof(struct ioam6_schema, head), + .automatic_shrinking = true, + .obj_cmpfn = ioam6_sc_cmpfn, +}; + +struct ioam6_namespace *ioam6_namespace(struct net *net, __be16 id) +{ + struct ioam6_pernet_data *nsdata = ioam6_pernet(net); + + return rhashtable_lookup_fast(&nsdata->namespaces, &id, rht_ns_params); +} + +static void __ioam6_fill_trace_data(struct sk_buff *skb, + struct ioam6_namespace *ns, + struct ioam6_trace_hdr *trace, + u8 sclen) +{ + struct __kernel_sock_timeval ts; + u64 raw64; + u32 raw32; + u16 raw16; + u8 *data; + u8 byte; + + data = trace->data + trace->remlen*4 - trace->nodelen*4 - sclen*4; + + /* hop_lim and node_id */ + if (trace->type.bit0) { + byte = ipv6_hdr(skb)->hop_limit; + if (skb->dev) + byte--; + + raw32 = dev_net(skb->dev)->ipv6.sysctl.ioam6_id; + if (!raw32) + raw32 = IOAM6_EMPTY_u24; + else + raw32 &= IOAM6_MASK_u24; + + *(__be32 *)data = cpu_to_be32((byte << 24) | raw32); + data += sizeof(__be32); + } + + /* ingress_if_id and egress_if_id */ + if (trace->type.bit1) { + if (!skb->dev) { + raw16 = IOAM6_EMPTY_u16; + } else { + raw16 = __in6_dev_get(skb->dev)->cnf.ioam6_id; + if (!raw16) + raw16 = IOAM6_EMPTY_u16; + } + + *(__be16 *)data = cpu_to_be16(raw16); + data += sizeof(__be16); + + if (skb_dst(skb)->dev->flags & IFF_LOOPBACK) { + raw16 = IOAM6_EMPTY_u16; + } else { + raw16 = __in6_dev_get(skb_dst(skb)->dev)->cnf.ioam6_id; + if (!raw16) + raw16 = IOAM6_EMPTY_u16; + } + + *(__be16 *)data = cpu_to_be16(raw16); + data += sizeof(__be16); + } + + /* timestamp seconds */ + if (trace->type.bit2) { + if (!skb->tstamp) + __net_timestamp(skb); + + skb_get_new_timestamp(skb, &ts); + + *(__be32 *)data = cpu_to_be32((u32)ts.tv_sec); + data += sizeof(__be32); + } + + /* timestamp subseconds */ + if (trace->type.bit3) { + if (!skb->tstamp) + __net_timestamp(skb); + + if (!trace->type.bit2) + skb_get_new_timestamp(skb, &ts); + + *(__be32 *)data = cpu_to_be32((u32)ts.tv_usec); + data += sizeof(__be32); + } + + /* transit delay */ + if (trace->type.bit4) { + *(__be32 *)data = cpu_to_be32(IOAM6_EMPTY_u32); + data += sizeof(__be32); + } + + /* namespace data */ + if (trace->type.bit5) { + *(__be32 *)data = (__force __be32)ns->data; + data += sizeof(__be32); + } + + /* queue depth */ + if (trace->type.bit6) { + *(__be32 *)data = cpu_to_be32(IOAM6_EMPTY_u32); + data += sizeof(__be32); + } + + /* checksum complement */ + if (trace->type.bit7) { + *(__be32 *)data = cpu_to_be32(IOAM6_EMPTY_u32); + data += sizeof(__be32); + } + + /* hop_lim and node_id (wide) */ + if (trace->type.bit8) { + byte = ipv6_hdr(skb)->hop_limit; + if (skb->dev) + byte--; + + raw64 = dev_net(skb->dev)->ipv6.sysctl.ioam6_id; + if (!raw64) + raw64 = IOAM6_EMPTY_u56; + else + raw64 &= IOAM6_MASK_u56; + + *(__be64 *)data = cpu_to_be64(((u64)byte << 56) | raw64); + data += sizeof(__be64); + } + + /* ingress_if_id and egress_if_id (wide) */ + if (trace->type.bit9) { + if (!skb->dev) { + raw32 = IOAM6_EMPTY_u32; + } else { + raw32 = __in6_dev_get(skb->dev)->cnf.ioam6_id; + if (!raw32) + raw32 = IOAM6_EMPTY_u32; + } + + *(__be32 *)data = cpu_to_be32(raw32); + data += sizeof(__be32); + + if (skb_dst(skb)->dev->flags & IFF_LOOPBACK) { + raw32 = IOAM6_EMPTY_u32; + } else { + raw32 = __in6_dev_get(skb_dst(skb)->dev)->cnf.ioam6_id; + if (!raw32) + raw32 = IOAM6_EMPTY_u32; + } + + *(__be32 *)data = cpu_to_be32(raw32); + data += sizeof(__be32); + } + + /* namespace data (wide) */ + if (trace->type.bit10) { + *(__be64 *)data = ns->data; + data += sizeof(__be64); + } + + /* buffer occupancy */ + if (trace->type.bit11) { + *(__be32 *)data = cpu_to_be32(IOAM6_EMPTY_u32); + data += sizeof(__be32); + } + + /* opaque state snapshot */ + if (trace->type.bit22) { + if (!ns->schema) { + *(__be32 *)data = cpu_to_be32(IOAM6_EMPTY_u24); + } else { + *(__be32 *)data = ns->schema->hdr; + data += sizeof(__be32); + + memcpy(data, ns->schema->data, ns->schema->len); + } + } +} + +void ioam6_fill_trace_data(struct sk_buff *skb, + struct ioam6_namespace *ns, + struct ioam6_trace_hdr *trace) +{ + u8 sclen = 0; + + /* Skip if Overflow flag is set OR + * if an unknown type (bit 12-21) is set + */ + if (trace->overflow || + (trace->type.bit12 | trace->type.bit13 | trace->type.bit14 | + trace->type.bit15 | trace->type.bit16 | trace->type.bit17 | + trace->type.bit18 | trace->type.bit19 | trace->type.bit20 | + trace->type.bit21)) { + return; + } + + /* NodeLen does not include Opaque State Snapshot length. We need to + * take it into account if the corresponding bit is set (bit 22) and + * if the current IOAM namespace has an active schema attached to it + */ + if (trace->type.bit22) { + sclen = sizeof_field(struct ioam6_schema, hdr) / 4; + + if (ns->schema) + sclen += ns->schema->len / 4; + } + + /* If there is no space remaining, we set the Overflow flag and we + * skip without filling the trace + */ + if (!trace->remlen || trace->remlen < (trace->nodelen + sclen)) { + trace->overflow = 1; + return; + } + + __ioam6_fill_trace_data(skb, ns, trace, sclen); + trace->remlen -= trace->nodelen + sclen; +} + +static int __net_init ioam6_net_init(struct net *net) +{ + struct ioam6_pernet_data *nsdata; + int err = -ENOMEM; + + nsdata = kzalloc(sizeof(*nsdata), GFP_KERNEL); + if (!nsdata) + goto out; + + mutex_init(&nsdata->lock); + net->ipv6.ioam6_data = nsdata; + + err = rhashtable_init(&nsdata->namespaces, &rht_ns_params); + if (err) + goto free_nsdata; + + err = rhashtable_init(&nsdata->schemas, &rht_sc_params); + if (err) + goto free_rht_ns; + +out: + return err; +free_rht_ns: + rhashtable_destroy(&nsdata->namespaces); +free_nsdata: + kfree(nsdata); + net->ipv6.ioam6_data = NULL; + goto out; +} + +static void __net_exit ioam6_net_exit(struct net *net) +{ + struct ioam6_pernet_data *nsdata = ioam6_pernet(net); + + rhashtable_free_and_destroy(&nsdata->namespaces, ioam6_free_ns, NULL); + rhashtable_free_and_destroy(&nsdata->schemas, ioam6_free_sc, NULL); + + kfree(nsdata); +} + +static struct pernet_operations ioam6_net_ops = { + .init = ioam6_net_init, + .exit = ioam6_net_exit, +}; + +int __init ioam6_init(void) +{ + int err = register_pernet_subsys(&ioam6_net_ops); + + if (err) + return err; + + pr_info("In-situ OAM (IOAM) with IPv6\n"); + return 0; +} + +void ioam6_exit(void) +{ + unregister_pernet_subsys(&ioam6_net_ops); +} diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c index d7cf26f730d7..b97aad7b6aca 100644 --- a/net/ipv6/sysctl_net_ipv6.c +++ b/net/ipv6/sysctl_net_ipv6.c @@ -196,6 +196,13 @@ static struct ctl_table ipv6_table_template[] = { .extra1 = SYSCTL_ZERO, .extra2 = &two, }, + { + .procname = "ioam6_id", + .data = &init_net.ipv6.sysctl.ioam6_id, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec + }, { } }; From patchwork Wed May 26 17:16:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Justin Iurman X-Patchwork-Id: 12282453 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B13AC47088 for ; Wed, 26 May 2021 17:27:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6E716613D2 for ; Wed, 26 May 2021 17:27:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234032AbhEZR3E (ORCPT ); Wed, 26 May 2021 13:29:04 -0400 Received: from serv108.segi.ulg.ac.be ([139.165.32.111]:46312 "EHLO serv108.segi.ulg.ac.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232922AbhEZR26 (ORCPT ); Wed, 26 May 2021 13:28:58 -0400 Received: from ubuntu18.home (135.19-200-80.adsl-dyn.isp.belgacom.be [80.200.19.135]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by serv108.segi.ulg.ac.be (Postfix) with ESMTPSA id 1E162200E1CD; Wed, 26 May 2021 19:17:11 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 serv108.segi.ulg.ac.be 1E162200E1CD DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uliege.be; s=ulg20190529; t=1622049431; bh=Kzdh0j7UH1DAJGpf0pqMkqgf6GaAhNci4sP3IDi9YBQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Z6c8lpkXBYtjo3gcU/LaK0EPnHGbFgpU8bXjLbtmd7vWGGC3RxRBQR3i26NBE9GYm Tl1pifK+XOuqh9YG8xAvb2+Zv0EAmJ1f4nm1tm7KcuaK+qToa1oMtQ2qc6tETr7BNj fr+4jXzgw+Id0EHjcICacRBFBQhVLSvEQay/3avCVNPBEDzhbEow07zP/RFKd9ud9E uytYTLRuPdkn32yiEw1AWC3QQDcXlkSj35+8mp/0dB5NcTvaWrCWQ66v1YlhWPt7fH HFIUGzX9n81RFdUh7K5YGJy4vfZlMiu72UAB4Wgw0UI1qK5c54AS+c7lbQ0/OlKePa k0T6Mc6lBgVvQ== From: Justin Iurman To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, tom@herbertland.com, justin.iurman@uliege.be Subject: [RESEND PATCH net-next v3 3/5] ipv6: ioam: IOAM Generic Netlink API Date: Wed, 26 May 2021 19:16:38 +0200 Message-Id: <20210526171640.9722-4-justin.iurman@uliege.be> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210526171640.9722-1-justin.iurman@uliege.be> References: <20210526171640.9722-1-justin.iurman@uliege.be> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Add Generic Netlink commands to allow userspace to configure IOAM namespaces and schemas. The target is iproute2 and the patch is ready. It will be posted as soon as this patchset is merged. Here is an overview: $ ip ioam Usage: ip ioam { COMMAND | help } ip ioam namespace show ip ioam namespace add ID [ DATA ] ip ioam namespace del ID ip ioam schema show ip ioam schema add ID DATA ip ioam schema del ID ip ioam namespace set ID schema { ID | none } Signed-off-by: Justin Iurman --- include/linux/ioam6_genl.h | 13 + include/uapi/linux/ioam6_genl.h | 49 ++++ net/ipv6/ioam6.c | 506 +++++++++++++++++++++++++++++++- 3 files changed, 566 insertions(+), 2 deletions(-) create mode 100644 include/linux/ioam6_genl.h create mode 100644 include/uapi/linux/ioam6_genl.h diff --git a/include/linux/ioam6_genl.h b/include/linux/ioam6_genl.h new file mode 100644 index 000000000000..176e67919de3 --- /dev/null +++ b/include/linux/ioam6_genl.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * IPv6 IOAM Generic Netlink API + * + * Author: + * Justin Iurman + */ +#ifndef _LINUX_IOAM6_GENL_H +#define _LINUX_IOAM6_GENL_H + +#include + +#endif /* _LINUX_IOAM6_GENL_H */ diff --git a/include/uapi/linux/ioam6_genl.h b/include/uapi/linux/ioam6_genl.h new file mode 100644 index 000000000000..0dd94b26448d --- /dev/null +++ b/include/uapi/linux/ioam6_genl.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ +/* + * IPv6 IOAM Generic Netlink API + * + * Author: + * Justin Iurman + */ + +#ifndef _UAPI_LINUX_IOAM6_GENL_H +#define _UAPI_LINUX_IOAM6_GENL_H + +#define IOAM6_GENL_NAME "IOAM6" +#define IOAM6_GENL_VERSION 0x1 + +enum { + IOAM6_ATTR_UNSPEC, + + IOAM6_ATTR_NS_ID, /* u16 */ + IOAM6_ATTR_NS_DATA, /* u64 */ + +#define IOAM6_MAX_SCHEMA_DATA_LEN (255 * 4) + IOAM6_ATTR_SC_ID, /* u32 */ + IOAM6_ATTR_SC_DATA, /* Binary */ + IOAM6_ATTR_SC_NONE, /* Flag */ + + IOAM6_ATTR_PAD, + + __IOAM6_ATTR_MAX, +}; +#define IOAM6_ATTR_MAX (__IOAM6_ATTR_MAX - 1) + +enum { + IOAM6_CMD_UNSPEC, + + IOAM6_CMD_ADD_NAMESPACE, + IOAM6_CMD_DEL_NAMESPACE, + IOAM6_CMD_DUMP_NAMESPACES, + + IOAM6_CMD_ADD_SCHEMA, + IOAM6_CMD_DEL_SCHEMA, + IOAM6_CMD_DUMP_SCHEMAS, + + IOAM6_CMD_NS_SET_SCHEMA, + + __IOAM6_CMD_MAX, +}; +#define IOAM6_CMD_MAX (__IOAM6_CMD_MAX - 1) + +#endif /* _UAPI_LINUX_IOAM6_GENL_H */ diff --git a/net/ipv6/ioam6.c b/net/ipv6/ioam6.c index dcec24e09e99..09fb93f4cf1f 100644 --- a/net/ipv6/ioam6.c +++ b/net/ipv6/ioam6.c @@ -11,15 +11,18 @@ #include #include #include +#include #include #include +#include #include #define IOAM6_EMPTY_u16 0xffff #define IOAM6_EMPTY_u24 0x00ffffff #define IOAM6_EMPTY_u32 0xffffffff #define IOAM6_EMPTY_u56 0x00ffffffffffffff +#define IOAM6_EMPTY_u64 0xffffffffffffffff #define IOAM6_MASK_u24 IOAM6_EMPTY_u24 #define IOAM6_MASK_u56 IOAM6_EMPTY_u56 @@ -80,6 +83,496 @@ static const struct rhashtable_params rht_sc_params = { .obj_cmpfn = ioam6_sc_cmpfn, }; +static struct genl_family ioam6_genl_family; + +static const struct nla_policy ioam6_genl_policy[IOAM6_ATTR_MAX + 1] = { + [IOAM6_ATTR_NS_ID] = { .type = NLA_U16 }, + [IOAM6_ATTR_NS_DATA] = { .type = NLA_U64 }, + [IOAM6_ATTR_SC_ID] = { .type = NLA_U32 }, + [IOAM6_ATTR_SC_DATA] = { .type = NLA_BINARY, + .len = IOAM6_MAX_SCHEMA_DATA_LEN }, + [IOAM6_ATTR_SC_NONE] = { .type = NLA_FLAG }, +}; + +static int ioam6_genl_addns(struct sk_buff *skb, struct genl_info *info) +{ + struct ioam6_pernet_data *nsdata; + struct ioam6_namespace *ns; + __be16 ns_id; + int err; + + if (!info->attrs[IOAM6_ATTR_NS_ID]) + return -EINVAL; + + ns_id = cpu_to_be16(nla_get_u16(info->attrs[IOAM6_ATTR_NS_ID])); + nsdata = ioam6_pernet(genl_info_net(info)); + + mutex_lock(&nsdata->lock); + + ns = rhashtable_lookup_fast(&nsdata->namespaces, &ns_id, rht_ns_params); + if (ns) { + err = -EEXIST; + goto out_unlock; + } + + ns = kzalloc(sizeof(*ns), GFP_KERNEL); + if (!ns) { + err = -ENOMEM; + goto out_unlock; + } + + ns->id = ns_id; + + if (!info->attrs[IOAM6_ATTR_NS_DATA]) { + ns->data = cpu_to_be64(IOAM6_EMPTY_u64); + } else { + ns->data = cpu_to_be64( + nla_get_u64(info->attrs[IOAM6_ATTR_NS_DATA])); + } + + err = rhashtable_lookup_insert_fast(&nsdata->namespaces, &ns->head, + rht_ns_params); + if (err) + kfree(ns); + +out_unlock: + mutex_unlock(&nsdata->lock); + return err; +} + +static int ioam6_genl_delns(struct sk_buff *skb, struct genl_info *info) +{ + struct ioam6_pernet_data *nsdata; + struct ioam6_namespace *ns; + struct ioam6_schema *sc; + __be16 ns_id; + int err; + + if (!info->attrs[IOAM6_ATTR_NS_ID]) + return -EINVAL; + + ns_id = cpu_to_be16(nla_get_u16(info->attrs[IOAM6_ATTR_NS_ID])); + nsdata = ioam6_pernet(genl_info_net(info)); + + mutex_lock(&nsdata->lock); + + ns = rhashtable_lookup_fast(&nsdata->namespaces, &ns_id, rht_ns_params); + if (!ns) { + err = -ENOENT; + goto out_unlock; + } + + sc = ns->schema; + err = rhashtable_remove_fast(&nsdata->namespaces, &ns->head, + rht_ns_params); + if (err) + goto out_unlock; + + if (sc) + sc->ns = NULL; + + ioam6_ns_release(ns); + +out_unlock: + mutex_unlock(&nsdata->lock); + return err; +} + +static int __ioam6_genl_dumpns_element(struct ioam6_namespace *ns, + u32 portid, + u32 seq, + u32 flags, + struct sk_buff *skb, + u8 cmd) +{ + void *hdr; + u64 data; + + hdr = genlmsg_put(skb, portid, seq, &ioam6_genl_family, flags, cmd); + if (!hdr) + return -ENOMEM; + + data = be64_to_cpu(ns->data); + + if (nla_put_u16(skb, IOAM6_ATTR_NS_ID, be16_to_cpu(ns->id)) || + (data != IOAM6_EMPTY_u64 && + nla_put_u64_64bit(skb, IOAM6_ATTR_NS_DATA, data, IOAM6_ATTR_PAD)) || + (ns->schema && nla_put_u32(skb, IOAM6_ATTR_SC_ID, ns->schema->id))) + goto nla_put_failure; + + genlmsg_end(skb, hdr); + return 0; + +nla_put_failure: + genlmsg_cancel(skb, hdr); + return -EMSGSIZE; +} + +static int ioam6_genl_dumpns_start(struct netlink_callback *cb) +{ + struct ioam6_pernet_data *nsdata = ioam6_pernet(sock_net(cb->skb->sk)); + struct rhashtable_iter *iter = (struct rhashtable_iter *)cb->args[0]; + + if (!iter) { + iter = kmalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) + return -ENOMEM; + + cb->args[0] = (long)iter; + } + + rhashtable_walk_enter(&nsdata->namespaces, iter); + + return 0; +} + +static int ioam6_genl_dumpns_done(struct netlink_callback *cb) +{ + struct rhashtable_iter *iter = (struct rhashtable_iter *)cb->args[0]; + + rhashtable_walk_exit(iter); + kfree(iter); + + return 0; +} + +static int ioam6_genl_dumpns(struct sk_buff *skb, struct netlink_callback *cb) +{ + struct rhashtable_iter *iter; + struct ioam6_namespace *ns; + int err; + + iter = (struct rhashtable_iter *)cb->args[0]; + rhashtable_walk_start(iter); + + for (;;) { + ns = rhashtable_walk_next(iter); + + if (IS_ERR(ns)) { + if (PTR_ERR(ns) == -EAGAIN) + continue; + err = PTR_ERR(ns); + goto done; + } else if (!ns) { + break; + } + + err = __ioam6_genl_dumpns_element(ns, + NETLINK_CB(cb->skb).portid, + cb->nlh->nlmsg_seq, + NLM_F_MULTI, + skb, + IOAM6_CMD_DUMP_NAMESPACES); + if (err) + goto done; + } + + err = skb->len; + +done: + rhashtable_walk_stop(iter); + return err; +} + +static int ioam6_genl_addsc(struct sk_buff *skb, struct genl_info *info) +{ + struct ioam6_pernet_data *nsdata; + int len, len_aligned, err; + struct ioam6_schema *sc; + u32 sc_id; + + if (!info->attrs[IOAM6_ATTR_SC_ID] || !info->attrs[IOAM6_ATTR_SC_DATA]) + return -EINVAL; + + sc_id = nla_get_u32(info->attrs[IOAM6_ATTR_SC_ID]); + nsdata = ioam6_pernet(genl_info_net(info)); + + mutex_lock(&nsdata->lock); + + sc = rhashtable_lookup_fast(&nsdata->schemas, &sc_id, rht_sc_params); + if (sc) { + err = -EEXIST; + goto out_unlock; + } + + sc = kzalloc(sizeof(*sc), GFP_KERNEL); + if (!sc) { + err = -ENOMEM; + goto out_unlock; + } + + len = nla_len(info->attrs[IOAM6_ATTR_SC_DATA]); + len_aligned = ALIGN(len, 4); + + sc->data = kzalloc(len_aligned, GFP_KERNEL); + if (!sc->data) { + err = -ENOMEM; + goto free_sc; + } + + sc->id = sc_id; + sc->len = len_aligned; + sc->hdr = cpu_to_be32(sc->id | ((u8)(sc->len / 4) << 24)); + + nla_memcpy(sc->data, info->attrs[IOAM6_ATTR_SC_DATA], len); + + err = rhashtable_lookup_insert_fast(&nsdata->schemas, &sc->head, + rht_sc_params); + if (err) + goto free_data; + +out_unlock: + mutex_unlock(&nsdata->lock); + return err; +free_data: + kfree(sc->data); +free_sc: + kfree(sc); + goto out_unlock; +} + +static int ioam6_genl_delsc(struct sk_buff *skb, struct genl_info *info) +{ + struct ioam6_pernet_data *nsdata; + struct ioam6_namespace *ns; + struct ioam6_schema *sc; + u32 sc_id; + int err; + + if (!info->attrs[IOAM6_ATTR_SC_ID]) + return -EINVAL; + + sc_id = nla_get_u32(info->attrs[IOAM6_ATTR_SC_ID]); + nsdata = ioam6_pernet(genl_info_net(info)); + + mutex_lock(&nsdata->lock); + + sc = rhashtable_lookup_fast(&nsdata->schemas, &sc_id, rht_sc_params); + if (!sc) { + err = -ENOENT; + goto out_unlock; + } + + ns = sc->ns; + err = rhashtable_remove_fast(&nsdata->schemas, &sc->head, + rht_sc_params); + if (err) + goto out_unlock; + + if (ns) + ns->schema = NULL; + + ioam6_sc_release(sc); + +out_unlock: + mutex_unlock(&nsdata->lock); + return err; +} + +static int __ioam6_genl_dumpsc_element(struct ioam6_schema *sc, + u32 portid, u32 seq, u32 flags, + struct sk_buff *skb, u8 cmd) +{ + void *hdr; + + hdr = genlmsg_put(skb, portid, seq, &ioam6_genl_family, flags, cmd); + if (!hdr) + return -ENOMEM; + + if (nla_put_u32(skb, IOAM6_ATTR_SC_ID, sc->id) || + nla_put(skb, IOAM6_ATTR_SC_DATA, sc->len, sc->data) || + (sc->ns && nla_put_u16(skb, IOAM6_ATTR_NS_ID, + be16_to_cpu(sc->ns->id)))) + goto nla_put_failure; + + genlmsg_end(skb, hdr); + return 0; + +nla_put_failure: + genlmsg_cancel(skb, hdr); + return -EMSGSIZE; +} + +static int ioam6_genl_dumpsc_start(struct netlink_callback *cb) +{ + struct ioam6_pernet_data *nsdata = ioam6_pernet(sock_net(cb->skb->sk)); + struct rhashtable_iter *iter = (struct rhashtable_iter *)cb->args[0]; + + if (!iter) { + iter = kmalloc(sizeof(*iter), GFP_KERNEL); + if (!iter) + return -ENOMEM; + + cb->args[0] = (long)iter; + } + + rhashtable_walk_enter(&nsdata->schemas, iter); + + return 0; +} + +static int ioam6_genl_dumpsc_done(struct netlink_callback *cb) +{ + struct rhashtable_iter *iter = (struct rhashtable_iter *)cb->args[0]; + + rhashtable_walk_exit(iter); + kfree(iter); + + return 0; +} + +static int ioam6_genl_dumpsc(struct sk_buff *skb, struct netlink_callback *cb) +{ + struct rhashtable_iter *iter; + struct ioam6_schema *sc; + int err; + + iter = (struct rhashtable_iter *)cb->args[0]; + rhashtable_walk_start(iter); + + for (;;) { + sc = rhashtable_walk_next(iter); + + if (IS_ERR(sc)) { + if (PTR_ERR(sc) == -EAGAIN) + continue; + err = PTR_ERR(sc); + goto done; + } else if (!sc) { + break; + } + + err = __ioam6_genl_dumpsc_element(sc, + NETLINK_CB(cb->skb).portid, + cb->nlh->nlmsg_seq, + NLM_F_MULTI, + skb, + IOAM6_CMD_DUMP_SCHEMAS); + if (err) + goto done; + } + + err = skb->len; + +done: + rhashtable_walk_stop(iter); + return err; +} + +static int ioam6_genl_ns_set_schema(struct sk_buff *skb, struct genl_info *info) +{ + struct ioam6_pernet_data *nsdata; + struct ioam6_namespace *ns; + struct ioam6_schema *sc; + __be16 ns_id; + int err = 0; + u32 sc_id; + + if (!info->attrs[IOAM6_ATTR_NS_ID] || + (!info->attrs[IOAM6_ATTR_SC_ID] && + !info->attrs[IOAM6_ATTR_SC_NONE])) + return -EINVAL; + + ns_id = cpu_to_be16(nla_get_u16(info->attrs[IOAM6_ATTR_NS_ID])); + nsdata = ioam6_pernet(genl_info_net(info)); + + mutex_lock(&nsdata->lock); + + ns = rhashtable_lookup_fast(&nsdata->namespaces, &ns_id, rht_ns_params); + if (!ns) { + err = -ENOENT; + goto out_unlock; + } + + if (info->attrs[IOAM6_ATTR_SC_NONE]) { + sc = NULL; + } else { + sc_id = nla_get_u32(info->attrs[IOAM6_ATTR_SC_ID]); + sc = rhashtable_lookup_fast(&nsdata->schemas, &sc_id, + rht_sc_params); + if (!sc) { + err = -ENOENT; + goto out_unlock; + } + } + + if (ns->schema) + ns->schema->ns = NULL; + ns->schema = sc; + + if (sc) { + if (sc->ns) + sc->ns->schema = NULL; + sc->ns = ns; + } + +out_unlock: + mutex_unlock(&nsdata->lock); + return err; +} + +static const struct genl_ops ioam6_genl_ops[] = { + { + .cmd = IOAM6_CMD_ADD_NAMESPACE, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = ioam6_genl_addns, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = IOAM6_CMD_DEL_NAMESPACE, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = ioam6_genl_delns, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = IOAM6_CMD_DUMP_NAMESPACES, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .start = ioam6_genl_dumpns_start, + .dumpit = ioam6_genl_dumpns, + .done = ioam6_genl_dumpns_done, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = IOAM6_CMD_ADD_SCHEMA, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = ioam6_genl_addsc, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = IOAM6_CMD_DEL_SCHEMA, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = ioam6_genl_delsc, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = IOAM6_CMD_DUMP_SCHEMAS, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .start = ioam6_genl_dumpsc_start, + .dumpit = ioam6_genl_dumpsc, + .done = ioam6_genl_dumpsc_done, + .flags = GENL_ADMIN_PERM, + }, + { + .cmd = IOAM6_CMD_NS_SET_SCHEMA, + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, + .doit = ioam6_genl_ns_set_schema, + .flags = GENL_ADMIN_PERM, + }, +}; + +static struct genl_family ioam6_genl_family __ro_after_init = { + .hdrsize = 0, + .name = IOAM6_GENL_NAME, + .version = IOAM6_GENL_VERSION, + .maxattr = IOAM6_ATTR_MAX, + .policy = ioam6_genl_policy, + .netnsok = true, + .parallel_ops = true, + .ops = ioam6_genl_ops, + .n_ops = ARRAY_SIZE(ioam6_genl_ops), + .module = THIS_MODULE, +}; + struct ioam6_namespace *ioam6_namespace(struct net *net, __be16 id) { struct ioam6_pernet_data *nsdata = ioam6_pernet(net); @@ -343,15 +836,24 @@ static struct pernet_operations ioam6_net_ops = { int __init ioam6_init(void) { int err = register_pernet_subsys(&ioam6_net_ops); + if (err) + goto out; + err = genl_register_family(&ioam6_genl_family); if (err) - return err; + goto out_unregister_pernet_subsys; pr_info("In-situ OAM (IOAM) with IPv6\n"); - return 0; + +out: + return err; +out_unregister_pernet_subsys: + unregister_pernet_subsys(&ioam6_net_ops); + goto out; } void ioam6_exit(void) { + genl_unregister_family(&ioam6_genl_family); unregister_pernet_subsys(&ioam6_net_ops); } From patchwork Wed May 26 17:16:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Justin Iurman X-Patchwork-Id: 12282447 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E58CDC47088 for ; Wed, 26 May 2021 17:27:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C1B4A613D2 for ; Wed, 26 May 2021 17:27:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233221AbhEZR26 (ORCPT ); Wed, 26 May 2021 13:28:58 -0400 Received: from serv108.segi.ulg.ac.be ([139.165.32.111]:46296 "EHLO serv108.segi.ulg.ac.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229958AbhEZR25 (ORCPT ); Wed, 26 May 2021 13:28:57 -0400 Received: from ubuntu18.home (135.19-200-80.adsl-dyn.isp.belgacom.be [80.200.19.135]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by serv108.segi.ulg.ac.be (Postfix) with ESMTPSA id 4A012200E1D9; Wed, 26 May 2021 19:17:11 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 serv108.segi.ulg.ac.be 4A012200E1D9 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uliege.be; s=ulg20190529; t=1622049431; bh=l6AnhrAY5jrfK/igHxxskO47gxkaP/lY24O2HfbMavg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nBFNEALiSrkIs9P/O0eW7AwT5dTMekA7DIPX81jDdTPkuJp/gc1/N1Mz7wN+/kPzP ogHbRa/7gpWjf6vbvSfuG8wh5BJ3/kD1nKCVkpGaF/ZZUZDS44GmoIX6xf/hvaES+r R1luk6cVo8NnOIoD61/8iq6xbDgwx+yppekDcKLLInpfwe6ueiXcjanD3UTADSMANs wFW2JARaVMGQHmIGRO2X7veRyrhR75up7wv4bqLmZbsLhAPZB2L/LB2IulyxvO2o29 5fyOl0CcnQz1h70Vd8WoX8IHa3tLydJ9A6j1eCEwQU03sdcnZY8yyNPXBn6I4EOIah FceC433YLcxqQ== From: Justin Iurman To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, tom@herbertland.com, justin.iurman@uliege.be Subject: [RESEND PATCH net-next v3 4/5] ipv6: ioam: Support for IOAM injection with lwtunnels Date: Wed, 26 May 2021 19:16:39 +0200 Message-Id: <20210526171640.9722-5-justin.iurman@uliege.be> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210526171640.9722-1-justin.iurman@uliege.be> References: <20210526171640.9722-1-justin.iurman@uliege.be> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Add support for the IOAM inline insertion (only for the host-to-host use case) which is per-route configured with lightweight tunnels. The target is iproute2 and the patch is ready. It will be posted as soon as this patchset is merged. Here is an overview: $ ip -6 ro ad fc00::1/128 encap ioam6 trace type 0x800000 ns 1 size 12 dev eth0 This example configures an IOAM Pre-allocated Trace option attached to the fc00::1/128 prefix. The IOAM namespace (ns) is 1, the size of the pre-allocated trace data block is 12 octets (size) and only the first IOAM data (bit 0: hop_limit + node id) is included in the trace (type) represented as a bitfield. The reason why the in-transit (IPv6-in-IPv6 encapsulation) use case is not implemented is explained on the patchset cover. Signed-off-by: Justin Iurman --- include/linux/ioam6_iptunnel.h | 13 ++ include/net/ioam6.h | 3 + include/uapi/linux/ioam6.h | 1 + include/uapi/linux/ioam6_iptunnel.h | 19 ++ include/uapi/linux/lwtunnel.h | 1 + net/core/lwtunnel.c | 2 + net/ipv6/Kconfig | 11 ++ net/ipv6/Makefile | 1 + net/ipv6/ioam6.c | 17 +- net/ipv6/ioam6_iptunnel.c | 273 ++++++++++++++++++++++++++++ 10 files changed, 339 insertions(+), 2 deletions(-) create mode 100644 include/linux/ioam6_iptunnel.h create mode 100644 include/uapi/linux/ioam6_iptunnel.h create mode 100644 net/ipv6/ioam6_iptunnel.c diff --git a/include/linux/ioam6_iptunnel.h b/include/linux/ioam6_iptunnel.h new file mode 100644 index 000000000000..07d9dfedd29d --- /dev/null +++ b/include/linux/ioam6_iptunnel.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * IPv6 IOAM Lightweight Tunnel API + * + * Author: + * Justin Iurman + */ +#ifndef _LINUX_IOAM6_IPTUNNEL_H +#define _LINUX_IOAM6_IPTUNNEL_H + +#include + +#endif /* _LINUX_IOAM6_IPTUNNEL_H */ diff --git a/include/net/ioam6.h b/include/net/ioam6.h index 828b83c70721..9eab3817e90b 100644 --- a/include/net/ioam6.h +++ b/include/net/ioam6.h @@ -59,4 +59,7 @@ extern void ioam6_fill_trace_data(struct sk_buff *skb, extern int ioam6_init(void); extern void ioam6_exit(void); +extern int ioam6_iptunnel_init(void); +extern void ioam6_iptunnel_exit(void); + #endif /* _NET_IOAM6_H */ diff --git a/include/uapi/linux/ioam6.h b/include/uapi/linux/ioam6.h index 2177e4e49566..0a1e09e43c28 100644 --- a/include/uapi/linux/ioam6.h +++ b/include/uapi/linux/ioam6.h @@ -117,6 +117,7 @@ struct ioam6_trace_hdr { #error "Please fix " #endif +#define IOAM6_TRACE_DATA_SIZE_MAX 244 __u8 data[0]; } __attribute__((packed)); diff --git a/include/uapi/linux/ioam6_iptunnel.h b/include/uapi/linux/ioam6_iptunnel.h new file mode 100644 index 000000000000..ed4ba9d523d6 --- /dev/null +++ b/include/uapi/linux/ioam6_iptunnel.h @@ -0,0 +1,19 @@ +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ +/* + * IPv6 IOAM Lightweight Tunnel API + * + * Author: + * Justin Iurman + */ + +#ifndef _UAPI_LINUX_IOAM6_IPTUNNEL_H +#define _UAPI_LINUX_IOAM6_IPTUNNEL_H + +enum { + IOAM6_IPTUNNEL_UNSPEC, + IOAM6_IPTUNNEL_TRACE, /* struct ioam6_trace_hdr */ + __IOAM6_IPTUNNEL_MAX, +}; +#define IOAM6_IPTUNNEL_MAX (__IOAM6_IPTUNNEL_MAX - 1) + +#endif /* _UAPI_LINUX_IOAM6_IPTUNNEL_H */ diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h index 568a4303ccce..2e206919125c 100644 --- a/include/uapi/linux/lwtunnel.h +++ b/include/uapi/linux/lwtunnel.h @@ -14,6 +14,7 @@ enum lwtunnel_encap_types { LWTUNNEL_ENCAP_BPF, LWTUNNEL_ENCAP_SEG6_LOCAL, LWTUNNEL_ENCAP_RPL, + LWTUNNEL_ENCAP_IOAM6, __LWTUNNEL_ENCAP_MAX, }; diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c index 8ec7d13d2860..d0ae987d2de9 100644 --- a/net/core/lwtunnel.c +++ b/net/core/lwtunnel.c @@ -43,6 +43,8 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type) return "SEG6LOCAL"; case LWTUNNEL_ENCAP_RPL: return "RPL"; + case LWTUNNEL_ENCAP_IOAM6: + return "IOAM6"; case LWTUNNEL_ENCAP_IP6: case LWTUNNEL_ENCAP_IP: case LWTUNNEL_ENCAP_NONE: diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 747f56e0c636..e504204bca92 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -328,4 +328,15 @@ config IPV6_RPL_LWTUNNEL If unsure, say N. +config IPV6_IOAM6_LWTUNNEL + bool "IPv6: IOAM Pre-allocated Trace insertion support" + depends on IPV6 + select LWTUNNEL + help + Support for the inline insertion of IOAM Pre-allocated + Trace Header (only on locally generated packets), using + the lightweight tunnels mechanism. + + If unsure, say N. + endif # IPV6 diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index b7ef10d417d6..1bc7e143217b 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -27,6 +27,7 @@ ipv6-$(CONFIG_NETLABEL) += calipso.o ipv6-$(CONFIG_IPV6_SEG6_LWTUNNEL) += seg6_iptunnel.o seg6_local.o ipv6-$(CONFIG_IPV6_SEG6_HMAC) += seg6_hmac.o ipv6-$(CONFIG_IPV6_RPL_LWTUNNEL) += rpl_iptunnel.o +ipv6-$(CONFIG_IPV6_IOAM6_LWTUNNEL) += ioam6_iptunnel.o ipv6-objs += $(ipv6-y) diff --git a/net/ipv6/ioam6.c b/net/ipv6/ioam6.c index 09fb93f4cf1f..57dcde80cda4 100644 --- a/net/ipv6/ioam6.c +++ b/net/ipv6/ioam6.c @@ -600,7 +600,7 @@ static void __ioam6_fill_trace_data(struct sk_buff *skb, if (skb->dev) byte--; - raw32 = dev_net(skb->dev)->ipv6.sysctl.ioam6_id; + raw32 = dev_net(skb_dst(skb)->dev)->ipv6.sysctl.ioam6_id; if (!raw32) raw32 = IOAM6_EMPTY_u24; else @@ -688,7 +688,7 @@ static void __ioam6_fill_trace_data(struct sk_buff *skb, if (skb->dev) byte--; - raw64 = dev_net(skb->dev)->ipv6.sysctl.ioam6_id; + raw64 = dev_net(skb_dst(skb)->dev)->ipv6.sysctl.ioam6_id; if (!raw64) raw64 = IOAM6_EMPTY_u56; else @@ -843,10 +843,20 @@ int __init ioam6_init(void) if (err) goto out_unregister_pernet_subsys; +#ifdef CONFIG_IPV6_IOAM6_LWTUNNEL + err = ioam6_iptunnel_init(); + if (err) + goto out_unregister_genl; +#endif + pr_info("In-situ OAM (IOAM) with IPv6\n"); out: return err; +#ifdef CONFIG_IPV6_IOAM6_LWTUNNEL +out_unregister_genl: + genl_unregister_family(&ioam6_genl_family); +#endif out_unregister_pernet_subsys: unregister_pernet_subsys(&ioam6_net_ops); goto out; @@ -854,6 +864,9 @@ int __init ioam6_init(void) void ioam6_exit(void) { +#ifdef CONFIG_IPV6_IOAM6_LWTUNNEL + ioam6_iptunnel_exit(); +#endif genl_unregister_family(&ioam6_genl_family); unregister_pernet_subsys(&ioam6_net_ops); } diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c new file mode 100644 index 000000000000..e19b5c3aa792 --- /dev/null +++ b/net/ipv6/ioam6_iptunnel.c @@ -0,0 +1,273 @@ +/* SPDX-License-Identifier: GPL-2.0+ */ +/* + * IPv6 IOAM Lightweight Tunnel implementation + * + * Author: + * Justin Iurman + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct ioam6_lwt_encap { + struct ipv6_hopopt_hdr eh; + u8 pad[2]; /* 2-octet padding for 4n-alignment */ + struct ioam6_hdr ioamh; + struct ioam6_trace_hdr traceh; +} __attribute__((packed)); + +struct ioam6_lwt { + struct ioam6_lwt_encap tuninfo; +}; + +static inline struct ioam6_lwt *ioam6_lwt_state(struct lwtunnel_state *lwt) +{ + return (struct ioam6_lwt *)lwt->data; +} + +static inline struct ioam6_lwt_encap *ioam6_lwt_info(struct lwtunnel_state *lwt) +{ + return &ioam6_lwt_state(lwt)->tuninfo; +} + +static inline struct ioam6_trace_hdr *ioam6_trace(struct lwtunnel_state *lwt) +{ + return &(ioam6_lwt_state(lwt)->tuninfo.traceh); +} + +static const struct nla_policy ioam6_iptunnel_policy[IOAM6_IPTUNNEL_MAX + 1] = { + [IOAM6_IPTUNNEL_TRACE] = { .type = NLA_BINARY }, +}; + +static int nla_put_ioam6_trace(struct sk_buff *skb, int attrtype, + struct ioam6_trace_hdr *trace) +{ + struct ioam6_trace_hdr *data; + struct nlattr *nla; + int len; + + len = sizeof(*trace); + + nla = nla_reserve(skb, attrtype, len); + if (!nla) + return -EMSGSIZE; + + data = nla_data(nla); + memcpy(data, trace, len); + + return 0; +} + +static bool ioam6_validate_trace_hdr(struct ioam6_trace_hdr *trace) +{ + if (!trace->type_be32 || !trace->remlen || + trace->remlen > IOAM6_TRACE_DATA_SIZE_MAX / 4) + return false; + + trace->nodelen = 0; + if (trace->type.bit0) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit1) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit2) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit3) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit4) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit5) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit6) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit7) trace->nodelen += sizeof(__be32) / 4; + if (trace->type.bit8) trace->nodelen += sizeof(__be64) / 4; + if (trace->type.bit9) trace->nodelen += sizeof(__be64) / 4; + if (trace->type.bit10) trace->nodelen += sizeof(__be64) / 4; + if (trace->type.bit11) trace->nodelen += sizeof(__be32) / 4; + + return true; +} + +static int ioam6_build_state(struct net *net, struct nlattr *nla, + unsigned int family, const void *cfg, + struct lwtunnel_state **ts, + struct netlink_ext_ack *extack) +{ + struct nlattr *tb[IOAM6_IPTUNNEL_MAX + 1]; + struct ioam6_lwt_encap *tuninfo; + struct ioam6_trace_hdr *trace; + struct lwtunnel_state *s; + int len_aligned; + int len, err; + + if (family != AF_INET6) + return -EINVAL; + + err = nla_parse_nested(tb, IOAM6_IPTUNNEL_MAX, nla, + ioam6_iptunnel_policy, extack); + if (err < 0) + return err; + + if (!tb[IOAM6_IPTUNNEL_TRACE]) + return -EINVAL; + + trace = nla_data(tb[IOAM6_IPTUNNEL_TRACE]); + if (nla_len(tb[IOAM6_IPTUNNEL_TRACE]) != sizeof(*trace)) + return -EINVAL; + + if (!ioam6_validate_trace_hdr(trace)) + return -EINVAL; + + len = sizeof(*tuninfo) + trace->remlen * 4; + len_aligned = ALIGN(len, 8); + + s = lwtunnel_state_alloc(len_aligned); + if (!s) + return -ENOMEM; + + tuninfo = ioam6_lwt_info(s); + tuninfo->eh.hdrlen = (len_aligned >> 3) - 1; + tuninfo->pad[0] = IPV6_TLV_PADN; + tuninfo->ioamh.type = IOAM6_TYPE_PREALLOC; + tuninfo->ioamh.opt_type = IPV6_TLV_IOAM; + tuninfo->ioamh.opt_len = sizeof(tuninfo->ioamh) - 2 + sizeof(*trace) + + trace->remlen * 4; + + memcpy(&tuninfo->traceh, trace, sizeof(*trace)); + + len = len_aligned - len; + if (len == 1) { + tuninfo->traceh.data[trace->remlen * 4] = IPV6_TLV_PAD1; + } else if (len > 0) { + tuninfo->traceh.data[trace->remlen * 4] = IPV6_TLV_PADN; + tuninfo->traceh.data[trace->remlen * 4 + 1] = len - 2; + } + + s->type = LWTUNNEL_ENCAP_IOAM6; + s->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT; + + *ts = s; + + return 0; +} + +static int ioam6_do_inline(struct sk_buff *skb, struct ioam6_lwt_encap *tuninfo) +{ + struct ioam6_trace_hdr *trace; + struct ipv6hdr *oldhdr, *hdr; + struct ioam6_namespace *ns; + int hdrlen, err; + + hdrlen = (tuninfo->eh.hdrlen + 1) << 3; + + err = skb_cow_head(skb, hdrlen + skb->mac_len); + if (unlikely(err)) + return err; + + oldhdr = ipv6_hdr(skb); + skb_pull(skb, sizeof(*oldhdr)); + skb_postpull_rcsum(skb, skb_network_header(skb), sizeof(*oldhdr)); + + skb_push(skb, sizeof(*oldhdr) + hdrlen); + skb_reset_network_header(skb); + skb_mac_header_rebuild(skb); + + hdr = ipv6_hdr(skb); + memmove(hdr, oldhdr, sizeof(*oldhdr)); + tuninfo->eh.nexthdr = hdr->nexthdr; + + skb_set_transport_header(skb, sizeof(*hdr)); + skb_postpush_rcsum(skb, hdr, sizeof(*hdr) + hdrlen); + + memcpy(skb_transport_header(skb), (u8*)tuninfo, hdrlen); + + hdr->nexthdr = NEXTHDR_HOP; + hdr->payload_len = cpu_to_be16(skb->len - sizeof(*hdr)); + + trace = (struct ioam6_trace_hdr *)(skb_transport_header(skb) + + sizeof(struct ipv6_hopopt_hdr) + 2 + + sizeof(struct ioam6_hdr)); + + ns = ioam6_namespace(dev_net(skb_dst(skb)->dev), trace->namespace_id); + if (ns) + ioam6_fill_trace_data(skb, ns, trace); + + return 0; +} + +static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb) +{ + struct lwtunnel_state *lwt = skb_dst(skb)->lwtstate; + int err = -EINVAL; + + if (skb->protocol != htons(ETH_P_IPV6)) + goto drop; + + /* Only for packets we send and + * that do not contain a Hop-by-Hop yet + */ + if (skb->dev || ipv6_hdr(skb)->nexthdr == NEXTHDR_HOP) + goto out; + + err = ioam6_do_inline(skb, ioam6_lwt_info(lwt)); + if (unlikely(err)) + goto drop; + + err = skb_cow_head(skb, LL_RESERVED_SPACE(skb_dst(skb)->dev)); + if (unlikely(err)) + goto drop; + +out: + return lwt->orig_output(net, sk, skb); + +drop: + kfree_skb(skb); + return err; +} + +static int ioam6_fill_encap_info(struct sk_buff *skb, + struct lwtunnel_state *lwtstate) +{ + struct ioam6_trace_hdr *trace = ioam6_trace(lwtstate); + + if (nla_put_ioam6_trace(skb, IOAM6_IPTUNNEL_TRACE, trace)) + return -EMSGSIZE; + + return 0; +} + +static int ioam6_encap_nlsize(struct lwtunnel_state *lwtstate) +{ + struct ioam6_trace_hdr *trace = ioam6_trace(lwtstate); + + return nla_total_size(sizeof(*trace)); +} + +static int ioam6_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b) +{ + struct ioam6_trace_hdr *a_hdr = ioam6_trace(a); + struct ioam6_trace_hdr *b_hdr = ioam6_trace(b); + + return (a_hdr->namespace_id != b_hdr->namespace_id); +} + +static const struct lwtunnel_encap_ops ioam6_iptun_ops = { + .build_state = ioam6_build_state, + .output = ioam6_output, + .fill_encap = ioam6_fill_encap_info, + .get_encap_size = ioam6_encap_nlsize, + .cmp_encap = ioam6_encap_cmp, + .owner = THIS_MODULE, +}; + +int __init ioam6_iptunnel_init(void) +{ + return lwtunnel_encap_add_ops(&ioam6_iptun_ops, LWTUNNEL_ENCAP_IOAM6); +} + +void ioam6_iptunnel_exit(void) +{ + lwtunnel_encap_del_ops(&ioam6_iptun_ops, LWTUNNEL_ENCAP_IOAM6); +} From patchwork Wed May 26 17:16:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Justin Iurman X-Patchwork-Id: 12282451 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9D49C47089 for ; Wed, 26 May 2021 17:27:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ADE9E613C7 for ; Wed, 26 May 2021 17:27:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234359AbhEZR3I (ORCPT ); Wed, 26 May 2021 13:29:08 -0400 Received: from serv108.segi.ulg.ac.be ([139.165.32.111]:46315 "EHLO serv108.segi.ulg.ac.be" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232956AbhEZR26 (ORCPT ); Wed, 26 May 2021 13:28:58 -0400 Received: from ubuntu18.home (135.19-200-80.adsl-dyn.isp.belgacom.be [80.200.19.135]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by serv108.segi.ulg.ac.be (Postfix) with ESMTPSA id 74CF3200E1E2; Wed, 26 May 2021 19:17:11 +0200 (CEST) DKIM-Filter: OpenDKIM Filter v2.11.0 serv108.segi.ulg.ac.be 74CF3200E1E2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=uliege.be; s=ulg20190529; t=1622049431; bh=ii4SLSbpT9CZ0dU9Nx7vvFLYxOrx1q09Gt9o0V6koaU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=o9f3vdP8e6O+vrt5zTUrWPX6p9chHDWnTFtn4EIUUPwkdXtAoYsXFwPQe6G9T+W6t 2VCLsIZ6pHiApv3EZhyY9obgtOoamXt0XT8hoGI+offc8Rk7vXXM0AjvPpSaTX6H1G tCOD2ECG8Oq5NTAV9wjROFIKoit2cLnXBhrMYI8706/qtViyxbzaVDrBypej0gFNrS I/GCNLFxx6DNl8P/6UH0hmxa2+M0IN2C7g+C4wtEWK875TQcngvbyqP7b7XHnXXgX1 0SyQTHSIB22aNj+jusHDDg4s1zhqGZ70DU1qm2ZNACm4SU/CHgzun4s87QzP8IITVK LPiNmgNwu/qXQ== From: Justin Iurman To: netdev@vger.kernel.org Cc: davem@davemloft.net, kuba@kernel.org, tom@herbertland.com, justin.iurman@uliege.be Subject: [RESEND PATCH net-next v3 5/5] ipv6: ioam: Documentation for new IOAM sysctls Date: Wed, 26 May 2021 19:16:40 +0200 Message-Id: <20210526171640.9722-6-justin.iurman@uliege.be> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20210526171640.9722-1-justin.iurman@uliege.be> References: <20210526171640.9722-1-justin.iurman@uliege.be> Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org Add documentation for new IOAM sysctls: - ioam6_id: a namespace sysctl - ioam6_enabled and ioam6_id: two per-interface sysctls Example of IOAM configuration based on the following simple topology: _____ _____ _____ | | eth0 eth0 | | eth1 eth0 | | | A |.----------.| B |.----------.| C | |_____| |_____| |_____| 1) Node and interface IDs can be configured for IOAM: # IOAM ID of A = 1, IOAM ID of A.eth0 = 11 (A) sysctl -w net.ipv6.ioam6_id=1 (A) sysctl -w net.ipv6.conf.eth0.ioam6_id=11 # IOAM ID of B = 2, IOAM ID of B.eth0 = 21, IOAM ID of B.eth1 = 22 (B) sysctl -w net.ipv6.ioam6_id=2 (B) sysctl -w net.ipv6.conf.eth0.ioam6_id=21 (B) sysctl -w net.ipv6.conf.eth1.ioam6_id=22 # IOAM ID of C = 3, IOAM ID of C.eth0 = 31 (C) sysctl -w net.ipv6.ioam6_id=3 (C) sysctl -w net.ipv6.conf.eth0.ioam6_id=31 2) Each node can be configured to form an IOAM domain. For instance, we allow IOAM from A to C, i.e. enable IOAM on ingress for B.eth0 and C.eth0: (B) sysctl -w net.ipv6.conf.eth0.ioam6_enabled=1 (C) sysctl -w net.ipv6.conf.eth0.ioam6_enabled=1 3) An IOAM domain (e.g. ID=123) is defined and made known to each node: (A) ip ioam namespace add 123 (B) ip ioam namespace add 123 (C) ip ioam namespace add 123 4) Finally, an IOAM Pre-allocated Trace can be inserted in traffic sent by A when C (e.g. db02::2) is the destination: (A) ip -6 route add db02::2/128 encap ioam6 trace type 0x800000 ns 123 size 12 dev eth0 Signed-off-by: Justin Iurman --- Documentation/networking/ioam6-sysctl.rst | 20 ++++++++++++++++++++ Documentation/networking/ip-sysctl.rst | 5 +++++ 2 files changed, 25 insertions(+) create mode 100644 Documentation/networking/ioam6-sysctl.rst diff --git a/Documentation/networking/ioam6-sysctl.rst b/Documentation/networking/ioam6-sysctl.rst new file mode 100644 index 000000000000..37a9b4e731a0 --- /dev/null +++ b/Documentation/networking/ioam6-sysctl.rst @@ -0,0 +1,20 @@ +.. SPDX-License-Identifier: GPL-2.0 + +===================== +IOAM6 Sysfs variables +===================== + + +/proc/sys/net/conf//ioam6_* variables: +============================================= + +ioam6_enabled - BOOL + Accept or ignore IPv6 IOAM options for ingress on this interface. + + * 0 - disabled (default) + * not 0 - enabled + +ioam6_id - INTEGER + Define the IOAM id of this interface. + + Default is 0. diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index a5c250044500..d472c4f0972e 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1901,6 +1901,11 @@ fib_notify_on_flag_change - INTEGER - 1 - Emit notifications. - 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change. +ioam6_id - INTEGER + Define the IOAM id of this node. + + Default: 0 + IPv6 Fragmentation: ip6frag_high_thresh - INTEGER