mbox series

[ipsec-next,v9,00/17] Add IP-TFS mode to xfrm

Message ID 20240807211331.1081038-1-chopps@chopps.org (mailing list archive)
Headers show
Series Add IP-TFS mode to xfrm | expand

Message

Christian Hopps Aug. 7, 2024, 9:13 p.m. UTC
* Summary of Changes:

This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS
(AggFrag encapsulation) has been standardized in RFC9347.

  Link: https://www.rfc-editor.org/rfc/rfc9347.txt

This feature supports demand driven (i.e., non-constant send rate)
IP-TFS to take advantage of the AGGFRAG ESP payload encapsulation. This
payload type supports aggregation and fragmentation of the inner IP
packet stream which in turn yields higher small-packet bandwidth as well
as reducing MTU/PMTU issues. Congestion control is unimplementated as
the send rate is demand driven rather than constant.

In order to allow loading this fucntionality as a module a set of
callbacks xfrm_mode_cbs has been added to xfrm as well.

Patchset Changes:
-----------------

  include/linux/skbuff.h     |    1 +
  include/net/xfrm.h         |   44 +
  include/uapi/linux/in.h    |    2 +
  include/uapi/linux/ip.h    |   16 +
  include/uapi/linux/ipsec.h |    3 +-
  include/uapi/linux/snmp.h  |    2 +
  include/uapi/linux/xfrm.h  |    9 +-
  net/core/skbuff.c          |    7 +-
  net/ipv4/esp4.c            |    3 +-
  net/ipv6/esp6.c            |    3 +-
  net/netfilter/nft_xfrm.c   |    3 +-
  net/xfrm/Kconfig           |   16 +
  net/xfrm/Makefile          |    1 +
  net/xfrm/trace_iptfs.h     |  218 ++++
  net/xfrm/xfrm_compat.c     |   10 +-
  net/xfrm/xfrm_device.c     |    4 +-
  net/xfrm/xfrm_input.c      |   18 +-
  net/xfrm/xfrm_iptfs.c      | 2822 ++++++++++++++++++++++++++++++++++++++++++++
  net/xfrm/xfrm_output.c     |    6 +
  net/xfrm/xfrm_policy.c     |   26 +-
  net/xfrm/xfrm_proc.c       |    2 +
  net/xfrm/xfrm_state.c      |   84 ++
  net/xfrm/xfrm_user.c       |   77 ++
  23 files changed, 3357 insertions(+), 20 deletions(-)

Patchset Structure:
-------------------

The first 7 commits are changes to the xfrm infrastructure to support
the callbacks as well as more generic IP-TFS additions that may be used
outside the actual IP-TFS implementation.

  - net: refactor common skb header copy code for re-use
  - xfrm: config: add CONFIG_XFRM_IPTFS
  - include: uapi: add ip_tfs_*_hdr packet formats
  - include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP
  - xfrm: netlink: add config (netlink) options
  - xfrm: add mode_cbs module functionality
  - xfrm: add generic iptfs defines and functionality

The last 10 commits constitute the IP-TFS implementation constructed in
layers to make review easier. The first 9 commits all apply to a single
file `net/xfrm/xfrm_iptfs.c`, the last commit adds a new tracepoint
header file along with the use of these new tracepoint calls.

  - xfrm: iptfs: add new iptfs xfrm mode impl
  - xfrm: iptfs: add user packet (tunnel ingress) handling
  - xfrm: iptfs: share page fragments of inner packets
  - xfrm: iptfs: add fragmenting of larger than MTU user packets
  - xfrm: iptfs: add basic receive packet (tunnel egress) handling
  - xfrm: iptfs: handle received fragmented inner packets
  - xfrm: iptfs: add reusing received skb for the tunnel egress packet
  - xfrm: iptfs: add skb-fragment sharing code
  - xfrm: iptfs: handle reordering of received packets
  - xfrm: iptfs: add tracepoint functionality

Patchset History:
-----------------

RFCv1 (11/10/2023)

RFCv1 -> RFCv2 (11/12/2023)

  Updates based on feedback from Simon Horman, Antony,
  Michael Richardson, and kernel test robot.

RFCv2 -> v1 (2/19/2024)

  Updates based on feedback from Sabrina Dubroca, kernel test robot

v1 -> v2 (5/19/2024)

  Updates based on feedback from Sabrina Dubroca, Simon Horman, Antony.

  o Add handling of new netlink SA direction attribute (Antony).
  o Split single patch/commit of xfrm_iptfs.c (the actual IP-TFS impl)
    into 9+1 distinct layered functionality commits for aiding review.
  - xfrm: fix return check on clone() callback
  - xfrm: add sa_len() callback in xfrm_mode_cbs for copy to user
  - iptfs: remove unneeded skb free count variable
  - iptfs: remove unused variable and "breadcrumb" for future code.
  - iptfs: use do_div() to avoid "__udivd13 missing" link failure.
  - iptfs: remove some BUG_ON() assertions questioned in review.

v2->v3
  - Git User Glitch

v2->v4 (6/17/2024)

  - iptfs: copy only the netlink attributes to user based on the
    direction of the SA.

  - xfrm: stats: in the output path check for skb->dev == NULL prior to
    setting xfrm statistics on dev_net(skb->dev) as skb->dev may be NULL
    for locally generated packets.

  - xfrm: stats: fix an input use case where dev_net(skb->dev) is used
    to inc stats after skb is possibly NULL'd earlier. Switch to using
    existing saved `net` pointer.

v4->v5 (7/14/2024)
  - uapi: add units to doc comments
  - iptfs: add MODULE_DESCRIPTION()
  - squash nl-direction-update commit

v5->v6 (7/31/2024)
  * sysctl: removed IPTFS sysctl additions
  - xfrm: use array of pointers vs structs for mode callbacks
  - iptfs: eliminate a memleak during state alloc failure
  - iptfs: free send queue content on SA delete
  - add some kdoc and comments
  - cleanup a couple formatting choices per Steffen

v6->v7 (8/1/2024)
  - Rebased on latest ipsec-next

v7->v8 (8/4/2024)
  - Use lock and rcu to load iptfs module -- copy existing use pattern
  - fix 2 warnings from the kernel bot

v8->v9 (8/7/2024)
  - factor common code from skbuff.c:__copy_skb_header into ___copy_skb_header
    and use in iptfs rather that copying any code.
  - change all BUG_ON to WARN_ON_ONCE
  - remove unwanted new NOSKB xfrm MIB error counter
  - remove unneeded copy or share choice function
  - ifdef CONFIG_IPV6 around IPv6 function

Comments

Antony Antony Aug. 8, 2024, 9:43 a.m. UTC | #1
Hi Chris,

On Wed, Aug 07, 2024 at 05:13:14PM -0400, Christian Hopps wrote:
> * Summary of Changes:
> 
> This patchset adds a new xfrm mode implementing on-demand IP-TFS. IP-TFS
> (AggFrag encapsulation) has been standardized in RFC9347.
> 
>   Link: https://www.rfc-editor.org/rfc/rfc9347.txt
> 
> This feature supports demand driven (i.e., non-constant send rate)
> IP-TFS to take advantage of the AGGFRAG ESP payload encapsulation. This
> payload type supports aggregation and fragmentation of the inner IP
> packet stream which in turn yields higher small-packet bandwidth as well
> as reducing MTU/PMTU issues. Congestion control is unimplementated as
> the send rate is demand driven rather than constant.
> 
> In order to allow loading this fucntionality as a module a set of
> callbacks xfrm_mode_cbs has been added to xfrm as well.
> 
> Patchset Changes:
> -----------------
> 
>   include/linux/skbuff.h     |    1 +
>   include/net/xfrm.h         |   44 +
>   include/uapi/linux/in.h    |    2 +
>   include/uapi/linux/ip.h    |   16 +
>   include/uapi/linux/ipsec.h |    3 +-
>   include/uapi/linux/snmp.h  |    2 +
>   include/uapi/linux/xfrm.h  |    9 +-
>   net/core/skbuff.c          |    7 +-
>   net/ipv4/esp4.c            |    3 +-
>   net/ipv6/esp6.c            |    3 +-
>   net/netfilter/nft_xfrm.c   |    3 +-
>   net/xfrm/Kconfig           |   16 +
>   net/xfrm/Makefile          |    1 +
>   net/xfrm/trace_iptfs.h     |  218 ++++
>   net/xfrm/xfrm_compat.c     |   10 +-
>   net/xfrm/xfrm_device.c     |    4 +-
>   net/xfrm/xfrm_input.c      |   18 +-
>   net/xfrm/xfrm_iptfs.c      | 2822 ++++++++++++++++++++++++++++++++++++++++++++
>   net/xfrm/xfrm_output.c     |    6 +
>   net/xfrm/xfrm_policy.c     |   26 +-
>   net/xfrm/xfrm_proc.c       |    2 +
>   net/xfrm/xfrm_state.c      |   84 ++
>   net/xfrm/xfrm_user.c       |   77 ++
>   23 files changed, 3357 insertions(+), 20 deletions(-)
> 
> Patchset Structure:
> -------------------
> 
> The first 7 commits are changes to the xfrm infrastructure to support
> the callbacks as well as more generic IP-TFS additions that may be used
> outside the actual IP-TFS implementation.
> 
>   - net: refactor common skb header copy code for re-use
>   - xfrm: config: add CONFIG_XFRM_IPTFS
>   - include: uapi: add ip_tfs_*_hdr packet formats
>   - include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP
>   - xfrm: netlink: add config (netlink) options
>   - xfrm: add mode_cbs module functionality
>   - xfrm: add generic iptfs defines and functionality
> 
> The last 10 commits constitute the IP-TFS implementation constructed in
> layers to make review easier. The first 9 commits all apply to a single
> file `net/xfrm/xfrm_iptfs.c`, the last commit adds a new tracepoint
> header file along with the use of these new tracepoint calls.
> 
>   - xfrm: iptfs: add new iptfs xfrm mode impl
>   - xfrm: iptfs: add user packet (tunnel ingress) handling
>   - xfrm: iptfs: share page fragments of inner packets
>   - xfrm: iptfs: add fragmenting of larger than MTU user packets
>   - xfrm: iptfs: add basic receive packet (tunnel egress) handling
>   - xfrm: iptfs: handle received fragmented inner packets
>   - xfrm: iptfs: add reusing received skb for the tunnel egress packet
>   - xfrm: iptfs: add skb-fragment sharing code
>   - xfrm: iptfs: handle reordering of received packets
>   - xfrm: iptfs: add tracepoint functionality
> 
> Patchset History:
> -----------------
> 
> RFCv1 (11/10/2023)
> 
> RFCv1 -> RFCv2 (11/12/2023)
> 
>   Updates based on feedback from Simon Horman, Antony,
>   Michael Richardson, and kernel test robot.
> 
> RFCv2 -> v1 (2/19/2024)
> 
>   Updates based on feedback from Sabrina Dubroca, kernel test robot
> 
> v1 -> v2 (5/19/2024)
> 
>   Updates based on feedback from Sabrina Dubroca, Simon Horman, Antony.
> 
>   o Add handling of new netlink SA direction attribute (Antony).
>   o Split single patch/commit of xfrm_iptfs.c (the actual IP-TFS impl)
>     into 9+1 distinct layered functionality commits for aiding review.
>   - xfrm: fix return check on clone() callback
>   - xfrm: add sa_len() callback in xfrm_mode_cbs for copy to user
>   - iptfs: remove unneeded skb free count variable
>   - iptfs: remove unused variable and "breadcrumb" for future code.
>   - iptfs: use do_div() to avoid "__udivd13 missing" link failure.
>   - iptfs: remove some BUG_ON() assertions questioned in review.
> 
> v2->v3
>   - Git User Glitch
> 
> v2->v4 (6/17/2024)
> 
>   - iptfs: copy only the netlink attributes to user based on the
>     direction of the SA.
> 
>   - xfrm: stats: in the output path check for skb->dev == NULL prior to
>     setting xfrm statistics on dev_net(skb->dev) as skb->dev may be NULL
>     for locally generated packets.
> 
>   - xfrm: stats: fix an input use case where dev_net(skb->dev) is used
>     to inc stats after skb is possibly NULL'd earlier. Switch to using
>     existing saved `net` pointer.
> 
> v4->v5 (7/14/2024)
>   - uapi: add units to doc comments
>   - iptfs: add MODULE_DESCRIPTION()
>   - squash nl-direction-update commit
> 
> v5->v6 (7/31/2024)
>   * sysctl: removed IPTFS sysctl additions
>   - xfrm: use array of pointers vs structs for mode callbacks
>   - iptfs: eliminate a memleak during state alloc failure
>   - iptfs: free send queue content on SA delete
>   - add some kdoc and comments
>   - cleanup a couple formatting choices per Steffen
> 
> v6->v7 (8/1/2024)
>   - Rebased on latest ipsec-next
> 
> v7->v8 (8/4/2024)
>   - Use lock and rcu to load iptfs module -- copy existing use pattern
>   - fix 2 warnings from the kernel bot
> 
> v8->v9 (8/7/2024)
>   - factor common code from skbuff.c:__copy_skb_header into ___copy_skb_header
>     and use in iptfs rather that copying any code.
>   - change all BUG_ON to WARN_ON_ONCE
>   - remove unwanted new NOSKB xfrm MIB error counter
>   - remove unneeded copy or share choice function
>   - ifdef CONFIG_IPV6 around IPv6 function

I noticed a build error with CONFIG_XFRM_IPTFS=m. This error also shows up 
in in NetDev NIPA tester. However, it kernel builds with CONFIG_XFRM_IPTFS=y

a@laya:~/git/linux (iptfs-patchset-v9-20240808)$ make
  CALL    scripts/checksyscalls.sh
  DESCEND objtool
  INSTALL libsubcmd_headers
  MODPOST Module.symvers
ERROR: modpost: "___copy_skb_header" [net/xfrm/xfrm_iptfs.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
make[1]: *** [/home/a/git/linux/Makefile:1878: modpost] Error 2
make: *** [Makefile:224: __sub-make] Error 2

NIPA tester also noticed the build error.
https://netdev.bots.linux.dev/static/nipa/877576/13756764/build_clang/stderr
I wonder why the  kernel test robot hasn't caught this yet.

Also note a few minor issues with the patches, specifically for the patch 
ipsec-next,v9,14/17
https://netdev.bots.linux.dev/static/nipa/877576/13756763/build_clang/stderr

-antony
Christian Hopps Aug. 8, 2024, 2:22 p.m. UTC | #2
Antony Antony <antony@phenome.org> writes:

> Hi Chris,
>
> On Wed, Aug 07, 2024 at 05:13:14PM -0400, Christian Hopps wrote:
>> * Summary of Changes:
...
>> v8->v9 (8/7/2024)
>>   - factor common code from skbuff.c:__copy_skb_header into ___copy_skb_header
>>     and use in iptfs rather that copying any code.
>>   - change all BUG_ON to WARN_ON_ONCE
>>   - remove unwanted new NOSKB xfrm MIB error counter
>>   - remove unneeded copy or share choice function
>>   - ifdef CONFIG_IPV6 around IPv6 function
>
> I noticed a build error with CONFIG_XFRM_IPTFS=m. This error also shows up
> in in NetDev NIPA tester. However, it kernel builds with CONFIG_XFRM_IPTFS=y

Indeed. Added the missing EXPORT_SYMBOL_GPL(___copy_skb_header) now..


> a@laya:~/git/linux (iptfs-patchset-v9-20240808)$ make
>   CALL    scripts/checksyscalls.sh
>   DESCEND objtool
>   INSTALL libsubcmd_headers
>   MODPOST Module.symvers
> ERROR: modpost: "___copy_skb_header" [net/xfrm/xfrm_iptfs.ko] undefined!
> make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
> make[1]: *** [/home/a/git/linux/Makefile:1878: modpost] Error 2
> make: *** [Makefile:224: __sub-make] Error 2
>
> NIPA tester also noticed the build error.
> https://netdev.bots.linux.dev/static/nipa/877576/13756764/build_clang/stderr
> I wonder why the  kernel test robot hasn't caught this yet.

It just hasn't gotten there yet. :)

> Also note a few minor issues with the patches, specifically for the patch
> ipsec-next,v9,14/17
> https://netdev.bots.linux.dev/static/nipa/877576/13756763/build_clang/stderr

I'll fix these, they are transient warnings based on how I split the patchset.

Thanks,
Chris.


>
> -antony