Message ID | 20231113035219.920136-1-chopps@chopps.org (mailing list archive) |
---|---|
Headers | show |
Series | Add IP-TFS mode to xfrm | expand |
On Sun, Nov 12, 2023 at 10:52:11PM -0500, Christian Hopps wrote: > From: Christian Hopps <chopps@labn.net> > > This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS > (AggFrag encapsulation) has been standardized in RFC9347. > > Link: https://www.rfc-editor.org/rfc/rfc9347.txt > > This feature supports demand driven (i.e., non-constant send rate) IP-TFS to > take advantage of the AGGFRAG ESP payload encapsulation. This payload type > supports aggregation and fragmentation of the inner IP packet stream which in > turn yields higher small-packet bandwidth as well as reducing MTU/PMTU issues. > Congestion control is unimplementated as the send rate is demand driven rather > than constant. > > In order to allow loading this fucntionality as a module a set of callbacks > xfrm_mode_cbs has been added to xfrm as well. I did a multiple days peer review with Chris on this pachset. So my concerns are already addressed. Further reviews are welcome! This is a bigger change and it would be nice if more people could look at it. Thanks!
> I did a multiple days peer review with Chris on this pachset. So my > concerns are already addressed. > > Further reviews are welcome! This is a bigger change and it would > be nice if more people could look at it. I have a usability question. What name should appear when a user interacts with and sees log messages from this feature? ip-tfs, IP-TFS, IP_TFS or: iptfs, IPTFS, ...
On Sun, Nov 12, 2023 at 10:52:11PM -0500, Christian Hopps via Devel wrote: > From: Christian Hopps <chopps@labn.net> > > This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS > (AggFrag encapsulation) has been standardized in RFC9347. > > Link: https://www.rfc-editor.org/rfc/rfc9347.txt > > This feature supports demand driven (i.e., non-constant send rate) IP-TFS to > take advantage of the AGGFRAG ESP payload encapsulation. This payload type > supports aggregation and fragmentation of the inner IP packet stream which in > turn yields higher small-packet bandwidth as well as reducing MTU/PMTU issues. > Congestion control is unimplementated as the send rate is demand driven rather > than constant. > > In order to allow loading this fucntionality as a module a set of callbacks > xfrm_mode_cbs has been added to xfrm as well. Hi Chris, I have further reviewed the code and have a few minor questions, mainly related to handling of XFRM_MODE_IPTFS. It appears to me be either some case missing support or/and in a few places it should be prohibited. I have three types of questions: 1. missing XFRM_MODE_IPTFS support? 2. Will XFRM_MODE_IPTFS be supported this combination? 3. Should be prohibited combination with XFRM_MODE_IPTFS 1. Missing: a. wouldn't xfrm_sa_len() need extending? I could not find support for XFRM_MODE_IPTFS explicitly. However, I'm not sure how the following code is working when xfrm_sa_len is missing IP-TFS xfrm_sa_len: copy_to_user_state_extra() { if (x->mode_cbs && x->mode_cbs->copy_to_user) ret = x->mode_cbs->copy_to_user(x, skb); } I have attached what I imagine is a fix. Check with Steffen or others if this is necessary. b. esp6_init_state() and esp_init_state(): These functions do not seem to handle XFRM_MODE_IPTFS. Would they default to work with it? 2. Would xfrm4_outer_mode_gso_segment() xfrm6_outer_mode_gso_segment(): support XFRM_MODE_IPTFS? These functions appear to be missing. Is it possible that you don't support GSO and GRO? 3: Shouldn't these combinations return error? a. ipcomp and XFRM_MODE_IPTFS I'm guessing that ipcomp would generate an error when userspace tries to add an SA with XFRM_MODE_IPTFS. ipcomp6_init_state() ipcomp4_init_state() b: In xfrm_state_construct(): if (attrs[XFRMA_TFCPAD]) x->tfcpad = nla_get_u32(attrs[XFRMA_TFCPAD]); -antony >From 3d9ad9fab35f4efd6c2fb17853aaca986e7daaf3 Mon Sep 17 00:00:00 2001 From: Antony Antony <antony.antony@secunet.com> Date: Thu, 16 Nov 2023 16:46:54 +0100 Subject: [PATCH] xfrm: iptfs extend xfrm_sa_len think this extension is necessary. I wonder how it works now:) May it is alreday handled:) Signed-off-by: Antony Antony <antony.antony@secunet.com> --- include/net/xfrm.h | 1 + net/xfrm/xfrm_iptfs.c | 18 ++++++++++++++++++ net/xfrm/xfrm_user.c | 3 +++ 3 files changed, 22 insertions(+) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 176ab5ac436e..7ca508b0c46c 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -474,6 +474,7 @@ struct xfrm_mode_cbs { * mac_header should point at protocol/nexthdr of the outer IP */ int (*prepare_output)(struct xfrm_state *x, struct sk_buff *skb); + unsigned int (*xfrm_sa_len)(const struct xfrm_state *x); }; int xfrm_register_mode_cbs(u8 mode, const struct xfrm_mode_cbs *mode_cbs); diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c index 910c5e060931..d8b02cc3073b 100644 --- a/net/xfrm/xfrm_iptfs.c +++ b/net/xfrm/xfrm_iptfs.c @@ -2613,6 +2613,23 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb) return ret; } +static unsigned int iptfs_xfrm_sa_len(const struct xfrm_state *x) +{ + struct xfrm_iptfs_data *xtfs = x->mode_data; + struct xfrm_iptfs_config *xc = &xtfs->cfg; + unsigned int l = 0; + + if (xc->dont_frag) + l += nla_total_size(sizeof(xc->reorder_win_size)); + l += nla_total_size(sizeof(xc->reorder_win_size)); + l += nla_total_size(sizeof(xc->pkt_size)); + l += nla_total_size(sizeof(xc->max_queue_size)); + l += nla_total_size(sizeof(xc->drop_time_us)); + l += nla_total_size(sizeof(xc->init_delay_us)); + + return l; +} + static int iptfs_create_state(struct xfrm_state *x) { struct xfrm_iptfs_data *xtfs; @@ -2701,6 +2718,7 @@ static const struct xfrm_mode_cbs iptfs_mode_cbs = { .input = iptfs_input, .output = iptfs_output_collect, .prepare_output = iptfs_prepare_output, + .xfrm_sa_len = iptfs_xfrm_sa_len, }; static int __init xfrm_iptfs_init(void) diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 4b0e462d5e31..737636b2c267 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -3350,6 +3350,9 @@ static inline unsigned int xfrm_sa_len(struct xfrm_state *x) if (x->mapping_maxage) l += nla_total_size(sizeof(x->mapping_maxage)); + if (x->mode_cbs && x->mode_cbs->copy_to_user) + l += x->mode_cbs->xfrm_sa_len(x); + return l; }
Andrew Cagney <andrew.cagney@gmail.com> writes: >> I did a multiple days peer review with Chris on this pachset. So my >> concerns are already addressed. >> >> Further reviews are welcome! This is a bigger change and it would >> be nice if more people could look at it. > > I have a usability question. What name should appear when a user > interacts with and sees log messages from this feature? > ip-tfs, IP-TFS, IP_TFS > or: > iptfs, IPTFS, ... I think no `-` or `_` in the code/api. For documentation it is probably better to hew closer to the RFC and use `IP-TFS`. Thanks, Chris.
On Mon, Nov 20, 2023 at 01:39:50PM -0500, Christian Hopps via Devel wrote: > > Andrew Cagney <andrew.cagney@gmail.com> writes: > > > > I did a multiple days peer review with Chris on this pachset. So my > > > concerns are already addressed. > > > > > > Further reviews are welcome! This is a bigger change and it would > > > be nice if more people could look at it. > > > > I have a usability question. What name should appear when a user > > interacts with and sees log messages from this feature? > > ip-tfs, IP-TFS, IP_TFS > > or: > > iptfs, IPTFS, ... > > I think no `-` or `_` in the code/api. For documentation it is probably better to hew closer to the RFC and use `IP-TFS`. That sounds good. However, iproute2 output, ip xfrm state, or "ip xfrm policy" is that documentation or code? current unsubmitted patch shows: "iptfs" src 192.1.2.23 dst 192.1.2.45 proto esp spi 0x76ee6b87(1995336583) reqid 16389(0x00004005) mode iptfs root@west:/testing/pluto/ikev2-74-iptfs-01 (iptfs-aa-20231120)# ip x p src 192.0.1.0/24 dst 192.0.2.0/24 dir out priority 1757393 ptype main tmpl src 192.1.2.45 dst 192.1.2.23 proto esp reqid 16389 mode iptfs -antony
On Mon, Nov 20, 2023 at 09:00:45PM +0100, Antony Antony wrote: > On Mon, Nov 20, 2023 at 01:39:50PM -0500, Christian Hopps via Devel wrote: > > > > Andrew Cagney <andrew.cagney@gmail.com> writes: > > > > > > I did a multiple days peer review with Chris on this pachset. So my > > > > concerns are already addressed. > > > > > > > > Further reviews are welcome! This is a bigger change and it would > > > > be nice if more people could look at it. > > > > > > I have a usability question. What name should appear when a user > > > interacts with and sees log messages from this feature? > > > ip-tfs, IP-TFS, IP_TFS > > > or: > > > iptfs, IPTFS, ... > > > > I think no `-` or `_` in the code/api. For documentation it is probably better to hew closer to the RFC and use `IP-TFS`. > > That sounds good. However, > iproute2 output, ip xfrm state, or "ip xfrm policy" is that documentation or code? > > current unsubmitted patch shows: "iptfs" > > src 192.1.2.23 dst 192.1.2.45 > proto esp spi 0x76ee6b87(1995336583) reqid 16389(0x00004005) mode iptfs there also the following line further down in ip x s iptfs-opts pkt-size 0 max-queue-size 1048576 drop-time 1000000 reorder-window 3 init-delay 0 > > root@west:/testing/pluto/ikev2-74-iptfs-01 (iptfs-aa-20231120)# ip x p > src 192.0.1.0/24 dst 192.0.2.0/24 > dir out priority 1757393 ptype main > tmpl src 192.1.2.45 dst 192.1.2.23 > proto esp reqid 16389 mode iptfs > > -antony
Antony Antony <antony@phenome.org> writes: > On Mon, Nov 20, 2023 at 01:39:50PM -0500, Christian Hopps via Devel wrote: >> >> Andrew Cagney <andrew.cagney@gmail.com> writes: >> >> > > I did a multiple days peer review with Chris on this pachset. So my >> > > concerns are already addressed. >> > > >> > > Further reviews are welcome! This is a bigger change and it would >> > > be nice if more people could look at it. >> > >> > I have a usability question. What name should appear when a user >> > interacts with and sees log messages from this feature? >> > ip-tfs, IP-TFS, IP_TFS >> > or: >> > iptfs, IPTFS, ... >> >> I think no `-` or `_` in the code/api. For documentation it is probably better to hew closer to the RFC and use `IP-TFS`. > > That sounds good. However, > iproute2 output, ip xfrm state, or "ip xfrm policy" is that documentation or code? > > current unsubmitted patch shows: "iptfs" That's code/api. I was thinking ui/api when I said api. Documentation would be something ending in a .rst or within a comment as that might be referring to the RFC too. I will go through and make sure things are consistent in the next version. Thanks, Chris. > > src 192.1.2.23 dst 192.1.2.45 > proto esp spi 0x76ee6b87(1995336583) reqid 16389(0x00004005) mode iptfs > > root@west:/testing/pluto/ikev2-74-iptfs-01 (iptfs-aa-20231120)# ip x p > src 192.0.1.0/24 dst 192.0.2.0/24 > dir out priority 1757393 ptype main > tmpl src 192.1.2.45 dst 192.1.2.23 > proto esp reqid 16389 mode iptfs > > -antony
On Mon, Nov 13, 2023 at 08:15:47AM +0100, Steffen Klassert via Devel wrote: > On Sun, Nov 12, 2023 at 10:52:11PM -0500, Christian Hopps wrote: > > From: Christian Hopps <chopps@labn.net> > > > > This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS > > (AggFrag encapsulation) has been standardized in RFC9347. > > > > Link: https://www.rfc-editor.org/rfc/rfc9347.txt > > > > This feature supports demand driven (i.e., non-constant send rate) IP-TFS to > > take advantage of the AGGFRAG ESP payload encapsulation. This payload type > > supports aggregation and fragmentation of the inner IP packet stream which in > > turn yields higher small-packet bandwidth as well as reducing MTU/PMTU issues. > > Congestion control is unimplementated as the send rate is demand driven rather > > than constant. > > > > In order to allow loading this fucntionality as a module a set of callbacks > > xfrm_mode_cbs has been added to xfrm as well. > > I did a multiple days peer review with Chris on this pachset. So my > concerns are already addressed. > > Further reviews are welcome! This is a bigger change and it would > be nice if more people could look at it. I'd like to pose a basic question to understand the new IP-TFS config options for an SA. When configuring IP-TFS parameters on an SA, are all parameters actively used by that SA? or does their usage vary based on the direction of the traffic? Currently, each IP-TFS SA includes both init-delay and drop-time parameters. I would like to understand whether both parameters are necessary for every SA. ip xfrm state src 2001:db8:1:2::45 dst 2001:db8:1:2::23 proto esp spi 0x32af1f6d reqid 16389 mode iptfs replay-window 0 flag af-unspec esn aead rfc4106(gcm(aes)) 0x0... 128 anti-replay esn context: seq-hi 0x0, seq 0x0, oseq-hi 0x0, oseq 0x0 replay_window 128, bitmap-length 4 00000000 00000000 00000000 00000000 iptfs-opts pkt-size 0 max-queue-size 1048576 drop-time 1000000 reorder-window 3 init-delay 0 Chris, would like to you share which of iptfs-opts are for input and which are for output? My current understanding suggests that, depending on the traffic direction, onlysome of these parameters might be in active use for each SA. If this is indeed the case, I propose we discuss the possibility of refining our configuration approach. Could we consider making these parameters mutually exclusive and dependent on the traffic direction? Furthermore, given that the xfrm community has previously discussed the idea of adding direction to the SA — a concept also used in hardware offload scenarios — why not explore adding direction to the SA? We currently have DIR with offload. ip xfrm state { add | update } ... offload [ crypto|packet ] dev DEV dir DIR Implementing direction could imply that an IP-TFS SA would require only fewer parameters – init-delay or drop-time, depending on the specified direction. This would make ip x s output simple and comprehensible, thereby reducing potential confusion. Also avoid confusing when creating state say input, why add output parameters to an input SA?"
Hi Chris, I got a crash when running inside a namespace using veth. I got twice. It is not 100% reproducable. Note: I have KASAN enabled. [ 226.323824] BUG: KASAN: null-ptr-deref in iptfs_output_collect+0x1f5/0x61a [ 226.325155] Read of size 8 at addr 0000000000000478 by task ping/5276 [ 226.325980] [ 226.326220] CPU: 2 PID: 5276 Comm: ping Not tainted 6.6.0-rc1-00528-gadd9146753b7 #95 [ 226.327219] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 226.328396] Call Trace: [ 226.328791] <TASK> [ 226.329099] dump_stack_lvl+0x33/0x42 [ 226.329597] kasan_report+0xa0/0xc3 [ 226.330074] ? iptfs_output_collect+0x1f5/0x61a [ 226.330678] iptfs_output_collect+0x1f5/0x61a [ 226.331259] ip_send_skb+0x25/0x58 [ 226.331724] raw_sendmsg+0xec4/0xfee [ 226.332223] ? raw_hash_sk+0x224/0x224 [ 226.332741] ? kasan_unpoison+0x44/0x52 [ 226.333258] ? kernel_init_pages+0x42/0x51 [ 226.333810] ? post_alloc_hook+0xba/0xe0 [ 226.334339] ? prep_new_page+0x44/0x51 [ 226.334849] ? ma_data_end+0x6e/0x88 [ 226.335338] ? preempt_count_sub+0x14/0xb5 [ 226.335894] ? zone_watermark_fast.isra.0+0x12b/0x12b [ 226.336577] ? __might_resched+0x8d/0x248 [ 226.337116] ? __might_sleep+0x27/0xa2 [ 226.337626] ? first_zones_zonelist+0x2c/0x43 [ 226.338211] ? do_raw_spin_lock+0x72/0xbb [ 226.338752] ? __might_resched+0x8d/0x248 [ 226.339291] ? __might_sleep+0x27/0xa2 [ 226.339801] ? inet_send_prepare+0x59/0x59 [ 226.340356] ? sock_sendmsg_nosec+0x42/0x6c [ 226.340924] sock_sendmsg_nosec+0x42/0x6c [ 226.341465] __sys_sendto+0x172/0x1e1 [ 226.341964] ? __x64_sys_getpeername+0x44/0x44 [ 226.342558] ? folio_batch_add_and_move+0x6a/0x7d [ 226.343189] ? find_vma+0x6b/0x8b [ 226.343646] ? find_vma_intersection+0x8a/0x8a [ 226.344246] ? handle_mm_fault+0xfa/0x163 [ 226.344788] ? preempt_latency_start+0x41/0x4c [ 226.345385] ? preempt_count_sub+0x14/0xb5 [ 226.345937] __x64_sys_sendto+0x76/0x82 [ 226.346456] do_syscall_64+0x58/0x78 [ 226.346946] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 226.347611] RIP: 0033:0x7f378b00ba73 [ 226.348095] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24 [ 226.350433] RSP: 002b:00007fff58f9c148 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [ 226.351400] RAX: ffffffffffffffda RBX: 00005591b05fa340 RCX: 00007f378b00ba73 [ 226.352318] RDX: 0000000000000040 RSI: 00005591b06003c0 RDI: 0000000000000003 [ 226.353233] RBP: 00005591b06003c0 R08: 00005591b05fc5c0 R09: 0000000000000010 [ 226.354147] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000040 [ 226.355061] R13: 00007fff58f9d830 R14: 0000001d00000001 R15: 00005591b05fd680 [ 226.355978] </TASK> [ 226.356301] ================================================================== [ 226.357274] Disabling lock debugging due to kernel taint [ 226.357973] BUG: kernel NULL pointer dereference, address: 0000000000000478 [ 226.358865] #PF: supervisor read access in kernel mode [ 226.359540] #PF: error_code(0x0000) - not-present page [ 226.360220] PGD 0 P4D 0 [ 226.360594] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN [ 226.361346] CPU: 2 PID: 5276 Comm: ping Tainted: G B 6.6.0-rc1-00528-gadd9146753b7 #95 [ 226.362520] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 226.363693] RIP: 0010:iptfs_output_collect+0x1f5/0x61a [ 226.364376] Code: c1 ff ff 65 ff 0d 12 29 35 7e 75 05 e8 c3 f0 31 ff 48 8d 7b 10 e8 4b 16 6b ff 4c 8b 73 10 49 8d be 78 04 00 00 e8 3b 16 6b ff <4d> 8b b6 78 04 00 00 49 8d be c0 01 00 00 e8 28 16 6b ff 49 8b 86 [ 226.366704] RSP: 0018:ffffc9000a357998 EFLAGS: 00010296 [ 226.367389] RAX: 0000000000000001 RBX: ffff8881123a7a00 RCX: fffffbfff075e9f5 [ 226.368308] RDX: fffffbfff075e9f5 RSI: fffffbfff075e9f5 RDI: ffffffff83af4fa0 [ 226.369237] RBP: ffff88810f45ac00 R08: 0000000000000008 R09: 0000000000000001 [ 226.370154] R10: ffffffff83af4fa7 R11: fffffbfff075e9f4 R12: 0000000000000000 [ 226.371070] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888114cbeb01 [ 226.371984] FS: 00007f378ad4fc40(0000) GS:ffff88815ad00000(0000) knlGS:0000000000000000 [ 226.373029] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 226.373777] CR2: 0000000000000478 CR3: 00000001083e9000 CR4: 0000000000350ee0 [ 226.374693] Call Trace: [ 226.375042] <TASK> [ 226.375352] ? __die_body+0x1a/0x58 [ 226.375828] ? page_fault_oops+0x459/0x4c7 [ 226.376387] ? irq_work_queue+0x2c/0x41 [ 226.376915] ? dump_pagetable+0x1db/0x1db [ 226.377453] ? vprintk_emit+0x187/0x195 [ 226.377972] ? iptfs_output_collect+0x1f5/0x61a [ 226.378576] ? _printk+0xb2/0xe1 [ 226.379021] ? syslog_print+0x340/0x340 [ 226.379541] ? do_user_addr_fault+0x156/0x59f [ 226.380713] ? exc_page_fault+0xaa/0xc3 [ 226.381234] ? asm_exc_page_fault+0x22/0x30 [ 226.381795] ? iptfs_output_collect+0x1f5/0x61a [ 226.382399] ? iptfs_output_collect+0x1f5/0x61a [ 226.383002] ip_send_skb+0x25/0x58 [ 226.383469] raw_sendmsg+0xec4/0xfee [ 226.383958] ? raw_hash_sk+0x224/0x224 [ 226.385083] ? kasan_unpoison+0x44/0x52 [ 226.385866] ? kernel_init_pages+0x42/0x51 [ 226.386687] ? post_alloc_hook+0xba/0xe0 [ 226.387559] ? prep_new_page+0x44/0x51 [ 226.388522] ? ma_data_end+0x6e/0x88 [ 226.389365] ? preempt_count_sub+0x14/0xb5 [ 226.390575] ? zone_watermark_fast.isra.0+0x12b/0x12b [ 226.391555] ? __might_resched+0x8d/0x248 [ 226.392127] ? __might_sleep+0x27/0xa2 [ 226.392794] ? first_zones_zonelist+0x2c/0x43 [ 226.393684] ? do_raw_spin_lock+0x72/0xbb [ 226.394501] ? __might_resched+0x8d/0x248 [ 226.395324] ? __might_sleep+0x27/0xa2 [ 226.396204] ? inet_send_prepare+0x59/0x59 [ 226.397042] ? sock_sendmsg_nosec+0x42/0x6c [ 226.397879] sock_sendmsg_nosec+0x42/0x6c [ 226.398704] __sys_sendto+0x172/0x1e1 [ 226.399465] ? __x64_sys_getpeername+0x44/0x44 [ 226.400367] ? folio_batch_add_and_move+0x6a/0x7d [ 226.401327] ? find_vma+0x6b/0x8b [ 226.401812] ? find_vma_intersection+0x8a/0x8a [ 226.402442] ? handle_mm_fault+0xfa/0x163 [ 226.403014] ? preempt_latency_start+0x41/0x4c [ 226.403641] ? preempt_count_sub+0x14/0xb5 [ 226.404224] __x64_sys_sendto+0x76/0x82 [ 226.404790] do_syscall_64+0x58/0x78 [ 226.405306] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 226.406010] RIP: 0033:0x7f378b00ba73 [ 226.406524] Code: 8b 15 a9 83 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 80 3d 71 0b 0d 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24 [ 226.408990] RSP: 002b:00007fff58f9c148 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [ 226.410020] RAX: ffffffffffffffda RBX: 00005591b05fa340 RCX: 00007f378b00ba73 [ 226.410992] RDX: 0000000000000040 RSI: 00005591b06003c0 RDI: 0000000000000003 [ 226.411968] RBP: 00005591b06003c0 R08: 00005591b05fc5c0 R09: 0000000000000010 [ 226.412948] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000040 [ 226.413923] R13: 00007fff58f9d830 R14: 0000001d00000001 R15: 00005591b05fd680 [ 226.414904] </TASK> [ 226.415243] Modules linked in: [ 226.415697] CR2: 0000000000000478 [ 226.416181] ---[ end trace 0000000000000000 ]--- [ 226.416848] RIP: 0010:iptfs_output_collect+0x1f5/0x61a [ 226.417569] Code: c1 ff ff 65 ff 0d 12 29 35 7e 75 05 e8 c3 f0 31 ff 48 8d 7b 10 e8 4b 16 6b ff 4c 8b 73 10 49 8d be 78 04 00 00 e8 3b 16 6b ff <4d> 8b b6 78 04 00 00 49 8d be c0 01 00 00 e8 28 16 6b ff 49 8b 86 [ 226.420031] RSP: 0018:ffffc9000a357998 EFLAGS: 00010296 [ 226.420764] RAX: 0000000000000001 RBX: ffff8881123a7a00 RCX: fffffbfff075e9f5 [ 226.421733] RDX: fffffbfff075e9f5 RSI: fffffbfff075e9f5 RDI: ffffffff83af4fa0 [ 226.422704] RBP: ffff88810f45ac00 R08: 0000000000000008 R09: 0000000000000001 [ 226.423671] R10: ffffffff83af4fa7 R11: fffffbfff075e9f4 R12: 0000000000000000 [ 226.424646] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888114cbeb01 [ 226.425616] FS: 00007f378ad4fc40(0000) GS:ffff88815ad00000(0000) knlGS:0000000000000000 [ 226.426709] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 226.427501] CR2: 0000000000000478 CR3: 00000001083e9000 CR4: 0000000000350ee0 [ 226.428492] Kernel panic - not syncing: Fatal exception in interrupt [ 226.429587] Kernel Offset: disabled [ 226.430096] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- (gdb) list *iptfs_output_collect+0x1f5 0xffffffff81ce40d6 is in iptfs_output_collect (./include/net/net_namespace.h:385). 380 } 381 382 static inline struct net *read_pnet(const possible_net_t *pnet) 383 { 384 #ifdef CONFIG_NET_NS 385 return pnet->net; 386 #else 387 return &init_net; 388 #endif 389 } On Sun, Nov 12, 2023 at 10:52:11PM -0500, Christian Hopps via Devel wrote: > From: Christian Hopps <chopps@labn.net> > > This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS > (AggFrag encapsulation) has been standardized in RFC9347. > > Link: https://www.rfc-editor.org/rfc/rfc9347.txt > > This feature supports demand driven (i.e., non-constant send rate) IP-TFS to > take advantage of the AGGFRAG ESP payload encapsulation. This payload type > supports aggregation and fragmentation of the inner IP packet stream which in > turn yields higher small-packet bandwidth as well as reducing MTU/PMTU issues. > Congestion control is unimplementated as the send rate is demand driven rather > than constant. > > In order to allow loading this fucntionality as a module a set of callbacks > xfrm_mode_cbs has been added to xfrm as well. > -- > Devel mailing list > Devel@linux-ipsec.org > https://linux-ipsec.org/mailman/listinfo/devel
Antony Antony <antony@phenome.org> writes: > On Sun, Nov 12, 2023 at 10:52:11PM -0500, Christian Hopps via Devel wrote: >> From: Christian Hopps <chopps@labn.net> >> >> This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS >> (AggFrag encapsulation) has been standardized in RFC9347. >> >> Link: https://www.rfc-editor.org/rfc/rfc9347.txt >> >> This feature supports demand driven (i.e., non-constant send rate) IP-TFS to >> take advantage of the AGGFRAG ESP payload encapsulation. This payload type >> supports aggregation and fragmentation of the inner IP packet stream which in >> turn yields higher small-packet bandwidth as well as reducing MTU/PMTU issues. >> Congestion control is unimplementated as the send rate is demand driven rather >> than constant. >> >> In order to allow loading this fucntionality as a module a set of callbacks >> xfrm_mode_cbs has been added to xfrm as well. > > Hi Chris, > > I have further reviewed the code and have a few minor questions, mainly > related to handling of XFRM_MODE_IPTFS. It appears to me be either some case > missing support or/and in a few places it should be prohibited. I have three > types of questions: > > 1. missing XFRM_MODE_IPTFS support? > 2. Will XFRM_MODE_IPTFS be supported this combination? > 3. Should be prohibited combination with XFRM_MODE_IPTFS > > 1. Missing: > > a. wouldn't xfrm_sa_len() need extending? > > I could not find support for XFRM_MODE_IPTFS explicitly. > > However, I'm not sure how the following code is working when xfrm_sa_len is > missing IP-TFS xfrm_sa_len: > > copy_to_user_state_extra() { > if (x->mode_cbs && x->mode_cbs->copy_to_user) > ret = x->mode_cbs->copy_to_user(x, skb); > } > > I have attached what I imagine is a fix. Check with Steffen or others if > this is necessary. I have adopted this change with a couple small changes, thanks! > b. esp6_init_state() and esp_init_state(): > These functions do not seem to handle XFRM_MODE_IPTFS. Would they default to work with it? That's b/c IPTFS uses ESP w/o modification. IP-TFS makes its own mode based changes (similar to what `esp_init_state()` does) in it's `create_state` callback which like `esp_init_state()` is called from `__xfrm_init_state()`. > 2. Would xfrm4_outer_mode_gso_segment() xfrm6_outer_mode_gso_segment(): > support XFRM_MODE_IPTFS? > These functions appear to be missing. Is it possible that you don't support GSO and GRO? Correct. > 3: Shouldn't these combinations return error? > > a. ipcomp and XFRM_MODE_IPTFS > I'm guessing that ipcomp would generate an error when userspace tries to add an SA with XFRM_MODE_IPTFS. > ipcomp6_init_state() > ipcomp4_init_state() Correct. > b: In xfrm_state_construct(): > > if (attrs[XFRMA_TFCPAD]) > x->tfcpad = nla_get_u32(attrs[XFRMA_TFCPAD]); Can you explain this more? Thanks, Chris. > > -antony > > [2. text/plain; 0001-xfrm-iptfs-extend-xfrm_sa_len.patch]...
On Thu, Mar 07, 2024 at 06:05:37AM -0500, Christian Hopps wrote: > > Antony Antony <antony@phenome.org> writes: > > > On Sun, Nov 12, 2023 at 10:52:11PM -0500, Christian Hopps via Devel wrote: > > > From: Christian Hopps <chopps@labn.net> > > > > > b: In xfrm_state_construct(): > > > > if (attrs[XFRMA_TFCPAD]) > > x->tfcpad = nla_get_u32(attrs[XFRMA_TFCPAD]); > > Can you explain this more? I haven't used it. I just know it from *swan code. And I wondered how it would work with IP-TFS. Looking in the kernel code, I noticed, it is not allowed in xfrm_user.c 270 if (attrs[XFRMA_TFCPAD] && 271 p->mode != XFRM_MODE_TUNNEL) { 272 NL_SET_ERR_MSG(extack, "TFC padding can only be used in tunnel mode"); 273 goto out; 274 }
From: Christian Hopps <chopps@labn.net> This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS (AggFrag encapsulation) has been standardized in RFC9347. Link: https://www.rfc-editor.org/rfc/rfc9347.txt This feature supports demand driven (i.e., non-constant send rate) IP-TFS to take advantage of the AGGFRAG ESP payload encapsulation. This payload type supports aggregation and fragmentation of the inner IP packet stream which in turn yields higher small-packet bandwidth as well as reducing MTU/PMTU issues. Congestion control is unimplementated as the send rate is demand driven rather than constant. In order to allow loading this fucntionality as a module a set of callbacks xfrm_mode_cbs has been added to xfrm as well.