diff mbox series

[ipsec-next,v7,07/16] xfrm: iptfs: add new iptfs xfrm mode impl

Message ID 20240801080314.169715-8-chopps@chopps.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Add IP-TFS mode to xfrm | expand

Checks

Context Check Description
netdev/series_format fail Series longer than 15 patches
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 42 this patch: 42
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 4 maintainers not CCed: herbert@gondor.apana.org.au edumazet@google.com pabeni@redhat.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 44 this patch: 44
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 49 this patch: 49
netdev/checkpatch warning WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? WARNING: line length of 90 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc fail Errors and warnings before: 0 this patch: 1
netdev/source_inline success Was 0 now: 0

Commit Message

Christian Hopps Aug. 1, 2024, 8:03 a.m. UTC
From: Christian Hopps <chopps@labn.net>

Add a new xfrm mode implementing AggFrag/IP-TFS from RFC9347.

This utilizes the new xfrm_mode_cbs to implement demand-driven IP-TFS
functionality. This functionality can be used to increase bandwidth
utilization through small packet aggregation, as well as help solve PMTU
issues through it's efficient use of fragmentation.

  Link: https://www.rfc-editor.org/rfc/rfc9347.txt

Multiple commits follow to build the functionality into xfrm_iptfs.c

Signed-off-by: Christian Hopps <chopps@labn.net>
---
 net/xfrm/Makefile     |   1 +
 net/xfrm/xfrm_iptfs.c | 223 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 224 insertions(+)
 create mode 100644 net/xfrm/xfrm_iptfs.c

Comments

Florian Westphal Aug. 1, 2024, 12:13 p.m. UTC | #1
Christian Hopps <chopps@chopps.org> wrote:
> +static int __iptfs_init_state(struct xfrm_state *x,
> +			      struct xfrm_iptfs_data *xtfs)
> +{
> +	/* Modify type (esp) adjustment values */
> +
> +	if (x->props.family == AF_INET)
> +		x->props.header_len += sizeof(struct iphdr) + sizeof(struct ip_iptfs_hdr);
> +	else if (x->props.family == AF_INET6)
> +		x->props.header_len += sizeof(struct ipv6hdr) + sizeof(struct ip_iptfs_hdr);
> +	x->props.enc_hdr_len = sizeof(struct ip_iptfs_hdr);
> +
> +	/* Always have a module reference if x->mode_data is set */
> +	if (!try_module_get(x->mode_cbs->owner))
> +		return -EINVAL;

If the comment means that we already have a module owner ref taken
before this try_module_get, then this should use __module_get and
a mention where the first ref was taken.

If not, then this needs an explanation as to what prevents another cpu to
rmmod the owning module between the lookup in xfrm_init_state and the
module reference in __iptfs_init_state.

cpu0					cpu1
 xfrm_init_state
   -> xfrm_get_mode_cbs                 rmmod
     -> __iptfs_init_state		  xfrm_iptfs_fini
       <interrupt>                         xfrm_unregister_mode_cbs
                                            release memory
       <resume>
	try_module_get -> UaF
Christian Hopps Aug. 1, 2024, 12:36 p.m. UTC | #2
Florian Westphal <fw@strlen.de> writes:

> Christian Hopps <chopps@chopps.org> wrote:
>> +static int __iptfs_init_state(struct xfrm_state *x,
>> +			      struct xfrm_iptfs_data *xtfs)
>> +{
>> +	/* Modify type (esp) adjustment values */
>> +
>> +	if (x->props.family == AF_INET)
>> +		x->props.header_len += sizeof(struct iphdr) + sizeof(struct ip_iptfs_hdr);
>> +	else if (x->props.family == AF_INET6)
>> +		x->props.header_len += sizeof(struct ipv6hdr) + sizeof(struct ip_iptfs_hdr);
>> +	x->props.enc_hdr_len = sizeof(struct ip_iptfs_hdr);
>> +
>> +	/* Always have a module reference if x->mode_data is set */
>> +	if (!try_module_get(x->mode_cbs->owner))
>> +		return -EINVAL;
>
> If the comment means that we already have a module owner ref taken
> before this try_module_get, then this should use __module_get and
> a mention where the first ref was taken.
>
> If not, then this needs an explanation as to what prevents another cpu to
> rmmod the owning module between the lookup in xfrm_init_state and the
> module reference in __iptfs_init_state.
>
> cpu0					cpu1
>  xfrm_init_state
>    -> xfrm_get_mode_cbs                 rmmod
>      -> __iptfs_init_state		  xfrm_iptfs_fini
>        <interrupt>                         xfrm_unregister_mode_cbs
>                                             release memory
>        <resume>
> 	try_module_get -> UaF

You are correct the code is not sufficiently protective.

I need to use rcu to keep `xfrm_mode_cbs_map[mode]` around long enough to do a `try_module_get(&xfrm_mode_cbs_map[mode]->owner)` and return from `xfrm_get_mode_cbs()` with that ref count held. The caller (xfrm_init_state()) will then need do a `module_put()` after it calls `mode_cbs->create_state()` to give the reference back.

Thanks,
Chris.
Florian Westphal Aug. 1, 2024, 1:09 p.m. UTC | #3
Christian Hopps <chopps@chopps.org> wrote:
> You are correct the code is not sufficiently protective.
> 
> I need to use rcu to keep `xfrm_mode_cbs_map[mode]` around long enough to do a `try_module_get(&xfrm_mode_cbs_map[mode]->owner)` and return from `xfrm_get_mode_cbs()` with that ref count held. The caller (xfrm_init_state()) will then need do a `module_put()` after it calls `mode_cbs->create_state()` to give the reference back.

Yes, that should work.
diff mbox series

Patch

diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile
index 512e0b2f8514..5a1787587cb3 100644
--- a/net/xfrm/Makefile
+++ b/net/xfrm/Makefile
@@ -21,5 +21,6 @@  obj-$(CONFIG_XFRM_USER) += xfrm_user.o
 obj-$(CONFIG_XFRM_USER_COMPAT) += xfrm_compat.o
 obj-$(CONFIG_XFRM_IPCOMP) += xfrm_ipcomp.o
 obj-$(CONFIG_XFRM_INTERFACE) += xfrm_interface.o
+obj-$(CONFIG_XFRM_IPTFS) += xfrm_iptfs.o
 obj-$(CONFIG_XFRM_ESPINTCP) += espintcp.o
 obj-$(CONFIG_DEBUG_INFO_BTF) += xfrm_state_bpf.o
diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
new file mode 100644
index 000000000000..a522e5998020
--- /dev/null
+++ b/net/xfrm/xfrm_iptfs.c
@@ -0,0 +1,223 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/* xfrm_iptfs: IPTFS encapsulation support
+ *
+ * April 21 2022, Christian Hopps <chopps@labn.net>
+ *
+ * Copyright (c) 2022, LabN Consulting, L.L.C.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/icmpv6.h>
+#include <net/gro.h>
+#include <net/icmp.h>
+#include <net/ip6_route.h>
+#include <net/inet_ecn.h>
+#include <net/xfrm.h>
+
+#include <crypto/aead.h>
+
+#include "xfrm_inout.h"
+
+/**
+ * struct xfrm_iptfs_config - configuration for the IPTFS tunnel.
+ * @pkt_size: size of the outer IP packet. 0 to use interface and MTU discovery,
+ *	otherwise the user specified value.
+ */
+struct xfrm_iptfs_config {
+	u32 pkt_size;	    /* outer_packet_size or 0 */
+};
+
+/**
+ * struct xfrm_iptfs_data - mode specific xfrm state.
+ * @cfg: IPTFS tunnel config.
+ * @x: owning SA (xfrm_state).
+ * @payload_mtu: max payload size.
+ */
+struct xfrm_iptfs_data {
+	struct xfrm_iptfs_config cfg;
+
+	/* Ingress User Input */
+	struct xfrm_state *x;	    /* owning state */
+	u32 payload_mtu;	    /* max payload size */
+};
+
+/* ========================== */
+/* State Management Functions */
+/* ========================== */
+
+/**
+ * iptfs_get_inner_mtu() - return inner MTU with no fragmentation.
+ * @x: xfrm state.
+ * @outer_mtu: the outer mtu
+ */
+static u32 iptfs_get_inner_mtu(struct xfrm_state *x, int outer_mtu)
+{
+	struct crypto_aead *aead;
+	u32 blksize;
+
+	aead = x->data;
+	blksize = ALIGN(crypto_aead_blocksize(aead), 4);
+	return ((outer_mtu - x->props.header_len - crypto_aead_authsize(aead)) &
+		~(blksize - 1)) - 2;
+}
+
+/**
+ * iptfs_user_init() - initialize the SA with IPTFS options from netlink.
+ * @net: the net data
+ * @x: xfrm state
+ * @attrs: netlink attributes
+ * @extack: extack return data
+ *
+ * Return: 0 on success or a negative error code on failure
+ */
+static int iptfs_user_init(struct net *net, struct xfrm_state *x,
+			   struct nlattr **attrs,
+			   struct netlink_ext_ack *extack)
+{
+	struct xfrm_iptfs_data *xtfs = x->mode_data;
+	struct xfrm_iptfs_config *xc;
+
+	xc = &xtfs->cfg;
+
+	if (attrs[XFRMA_IPTFS_PKT_SIZE]) {
+		xc->pkt_size = nla_get_u32(attrs[XFRMA_IPTFS_PKT_SIZE]);
+		if (!xc->pkt_size) {
+			xtfs->payload_mtu = 0;
+		} else if (xc->pkt_size > x->props.header_len) {
+			xtfs->payload_mtu = xc->pkt_size - x->props.header_len;
+		} else {
+			NL_SET_ERR_MSG(extack,
+				       "Packet size must be 0 or greater than IPTFS/ESP header length");
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static unsigned int iptfs_sa_len(const struct xfrm_state *x)
+{
+	struct xfrm_iptfs_data *xtfs = x->mode_data;
+	struct xfrm_iptfs_config *xc = &xtfs->cfg;
+	unsigned int l = 0;
+
+	if (x->dir == XFRM_SA_DIR_OUT)
+		l += nla_total_size(sizeof(xc->pkt_size));
+
+	return l;
+}
+
+static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct xfrm_iptfs_data *xtfs = x->mode_data;
+	struct xfrm_iptfs_config *xc = &xtfs->cfg;
+	int ret = 0;
+
+	if (x->dir == XFRM_SA_DIR_OUT)
+		ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
+
+	return ret;
+}
+
+static int __iptfs_init_state(struct xfrm_state *x,
+			      struct xfrm_iptfs_data *xtfs)
+{
+	/* Modify type (esp) adjustment values */
+
+	if (x->props.family == AF_INET)
+		x->props.header_len += sizeof(struct iphdr) + sizeof(struct ip_iptfs_hdr);
+	else if (x->props.family == AF_INET6)
+		x->props.header_len += sizeof(struct ipv6hdr) + sizeof(struct ip_iptfs_hdr);
+	x->props.enc_hdr_len = sizeof(struct ip_iptfs_hdr);
+
+	/* Always have a module reference if x->mode_data is set */
+	if (!try_module_get(x->mode_cbs->owner))
+		return -EINVAL;
+
+	x->mode_data = xtfs;
+	xtfs->x = x;
+
+	return 0;
+}
+
+static int iptfs_clone(struct xfrm_state *x, struct xfrm_state *orig)
+{
+	struct xfrm_iptfs_data *xtfs;
+	int err;
+
+	xtfs = kmemdup(orig->mode_data, sizeof(*xtfs), GFP_KERNEL);
+	if (!xtfs)
+		return -ENOMEM;
+
+	err = __iptfs_init_state(x, xtfs);
+	if (err) {
+		kfree_sensitive(xtfs);
+		return err;
+	}
+
+	return 0;
+}
+
+static int iptfs_create_state(struct xfrm_state *x)
+{
+	struct xfrm_iptfs_data *xtfs;
+	int err;
+
+	xtfs = kzalloc(sizeof(*xtfs), GFP_KERNEL);
+	if (!xtfs)
+		return -ENOMEM;
+
+	err = __iptfs_init_state(x, xtfs);
+	if (err) {
+		kfree_sensitive(xtfs);
+		return err;
+	}
+
+	return 0;
+}
+
+static void iptfs_delete_state(struct xfrm_state *x)
+{
+	struct xfrm_iptfs_data *xtfs = x->mode_data;
+
+	if (!xtfs)
+		return;
+
+	kfree_sensitive(xtfs);
+
+	module_put(x->mode_cbs->owner);
+}
+
+static const struct xfrm_mode_cbs iptfs_mode_cbs = {
+	.owner = THIS_MODULE,
+	.create_state = iptfs_create_state,
+	.delete_state = iptfs_delete_state,
+	.user_init = iptfs_user_init,
+	.copy_to_user = iptfs_copy_to_user,
+	.sa_len = iptfs_sa_len,
+	.clone = iptfs_clone,
+	.get_inner_mtu = iptfs_get_inner_mtu,
+};
+
+static int __init xfrm_iptfs_init(void)
+{
+	int err;
+
+	pr_info("xfrm_iptfs: IPsec IP-TFS tunnel mode module\n");
+
+	err = xfrm_register_mode_cbs(XFRM_MODE_IPTFS, &iptfs_mode_cbs);
+	if (err < 0)
+		pr_info("%s: can't register IP-TFS\n", __func__);
+
+	return err;
+}
+
+static void __exit xfrm_iptfs_fini(void)
+{
+	xfrm_unregister_mode_cbs(XFRM_MODE_IPTFS);
+}
+
+module_init(xfrm_iptfs_init);
+module_exit(xfrm_iptfs_fini);
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("IP-TFS support for xfrm ipsec tunnels");