diff mbox series

[net] xfrm: Preserve vlan tags for transport mode software GRO

Message ID 20240422025711.145577-1-paul.davey@alliedtelesis.co.nz (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net] xfrm: Preserve vlan tags for transport mode software GRO | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present fail Series targets non-next tree, but doesn't contain any Fixes tags
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 5780 this patch: 5780
netdev/build_tools success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 4 maintainers not CCed: pabeni@redhat.com dsahern@kernel.org edumazet@google.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 1058 this patch: 1058
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 6064 this patch: 6064
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 60 this patch: 60
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2024-04-22--06-00 (tests: 1000)

Commit Message

Paul Davey April 22, 2024, 2:56 a.m. UTC
The software GRO path for esp transport mode uses skb_mac_header_rebuild
prior to re-injecting the packet via the xfrm_napi_dev.  This only
copies skb->mac_len bytes of header which may not be sufficient if the
packet contains 802.1Q tags or other VLAN tags.  Worse copying only the
initial header will leave a packet marked as being VLAN tagged but
without the corresponding tag leading to mangling when it is later
untagged.

The VLAN tags are important when receiving the decrypted esp transport
mode packet after GRO processing to ensure it is received on the correct
interface.

Therefore record the full mac header length in xfrm*_transport_input for
later use in correpsonding xfrm*_transport_finish to copy the entire mac
header when rebuilding the mac header for GRO.  The skb->data pointer is
left pointing skb->mac_header bytes after the start of the mac header as
is expected by the network stack and network and transport header
offsets reset to this location.

Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>
---

I discovered that incomplete mac header rebuilding was the cause of an issue I
noticed with l2tp tunnels with IPSec transport mode protection not working
properly when the tunnel was running over a VLAN. This issue was described in
the linked mailing list post.
    
I am not certain if the tunnel encapsulation modes would require similar
adjustment though they do not seem to have an issue with packets passing.
I am also uncertain of how this sort of problem may interact with interfaces
which are bridged.
    
Link: https://lore.kernel.org/r/ecf16770431c8b30782e3443085641eb685aa5f9.camel@alliedtelesis.co.nz/

 include/linux/skbuff.h | 15 +++++++++++++++
 include/net/xfrm.h     |  3 +++
 net/ipv4/xfrm4_input.c |  6 +++++-
 net/ipv6/xfrm6_input.c |  6 +++++-
 net/xfrm/xfrm_input.c  |  4 ++++
 5 files changed, 32 insertions(+), 2 deletions(-)

Comments

Steffen Klassert April 22, 2024, 9:56 a.m. UTC | #1
On Mon, Apr 22, 2024 at 02:56:20PM +1200, Paul Davey wrote:
> The software GRO path for esp transport mode uses skb_mac_header_rebuild
> prior to re-injecting the packet via the xfrm_napi_dev.  This only
> copies skb->mac_len bytes of header which may not be sufficient if the
> packet contains 802.1Q tags or other VLAN tags.  Worse copying only the
> initial header will leave a packet marked as being VLAN tagged but
> without the corresponding tag leading to mangling when it is later
> untagged.
> 
> The VLAN tags are important when receiving the decrypted esp transport
> mode packet after GRO processing to ensure it is received on the correct
> interface.
> 
> Therefore record the full mac header length in xfrm*_transport_input for
> later use in correpsonding xfrm*_transport_finish to copy the entire mac
> header when rebuilding the mac header for GRO.  The skb->data pointer is
> left pointing skb->mac_header bytes after the start of the mac header as
> is expected by the network stack and network and transport header
> offsets reset to this location.
> 
> Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>

Please add a 'Fixes:' tag so it can be backported to stable.

> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 57c743b7e4fe..0331cfecb28b 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
> @@ -675,6 +675,9 @@ struct xfrm_mode_skb_cb {
>  
>  	/* Used by IPv6 only, zero for IPv4. */
>  	u8 flow_lbl[3];
> +
> +	/* Used to keep whole l2 header for transport mode GRO */
> +	u32 orig_mac_len;

xfrm_mode_skb_cb has already reached the maximum size of 48 bytes.
Adding more will overwrite data in the 'struct sk_buff'.

Try to store this in 'struct xfrm_offload'.

Thanks!
Sabrina Dubroca April 22, 2024, 12:49 p.m. UTC | #2
2024-04-22, 11:56:09 +0200, Steffen Klassert wrote:
> On Mon, Apr 22, 2024 at 02:56:20PM +1200, Paul Davey wrote:
> > The software GRO path for esp transport mode uses skb_mac_header_rebuild
> > prior to re-injecting the packet via the xfrm_napi_dev.  This only
> > copies skb->mac_len bytes of header which may not be sufficient if the
> > packet contains 802.1Q tags or other VLAN tags.  Worse copying only the
> > initial header will leave a packet marked as being VLAN tagged but
> > without the corresponding tag leading to mangling when it is later
> > untagged.
> > 
> > The VLAN tags are important when receiving the decrypted esp transport
> > mode packet after GRO processing to ensure it is received on the correct
> > interface.
> > 
> > Therefore record the full mac header length in xfrm*_transport_input for
> > later use in correpsonding xfrm*_transport_finish to copy the entire mac
> > header when rebuilding the mac header for GRO.  The skb->data pointer is
> > left pointing skb->mac_header bytes after the start of the mac header as
> > is expected by the network stack and network and transport header
> > offsets reset to this location.
> > 
> > Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>
> 
> Please add a 'Fixes:' tag so it can be backported to stable.
> 
> > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > index 57c743b7e4fe..0331cfecb28b 100644
> > --- a/include/net/xfrm.h
> > +++ b/include/net/xfrm.h
> > @@ -675,6 +675,9 @@ struct xfrm_mode_skb_cb {
> >  
> >  	/* Used by IPv6 only, zero for IPv4. */
> >  	u8 flow_lbl[3];
> > +
> > +	/* Used to keep whole l2 header for transport mode GRO */
> > +	u32 orig_mac_len;
> 
> xfrm_mode_skb_cb has already reached the maximum size of 48 bytes.
> Adding more will overwrite data in the 'struct sk_buff'.

I thought we already had a BUILD_BUG_ON(sizeof(struct
xfrm_mode_skb_cb) > sizeof_field(struct sk_buff, cb)) somewhere, but
apparently not. I guess it's time to add one? (and xfrm_spi_skb_cb, xfrm_skb_cb)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 161f535c8b94..afc8b3c881e2 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -793,6 +793,8 @@ void __init xfrm_input_init(void)
 	int err;
 	int i;
 
+	BUILD_BUG_ON(sizeof(struct xfrm_mode_skb_cb) > sizeof_field(struct sk_buff, cb));
+
 	init_dummy_netdev(&xfrm_napi_dev);
 	err = gro_cells_init(&gro_cells, &xfrm_napi_dev);
 	if (err)


Actually it looks like we still have 4B in xfrm_mode_skb_cb:

struct xfrm_mode_skb_cb {
	struct xfrm_tunnel_skb_cb  header;               /*     0    32 */
	__be16                     id;                   /*    32     2 */
	__be16                     frag_off;             /*    34     2 */
	u8                         ihl;                  /*    36     1 */
	u8                         tos;                  /*    37     1 */
	u8                         ttl;                  /*    38     1 */
	u8                         protocol;             /*    39     1 */
	u8                         optlen;               /*    40     1 */
	u8                         flow_lbl[3];          /*    41     3 */

	/* size: 48, cachelines: 1, members: 9 */
	/* padding: 4 */
	/* last cacheline: 48 bytes */
};

flow_lbl ends at 44, so adding orig_mac_len should be fine. I don't
see any config options that would increase the size of
xfrm_mode_skb_cb compared to what I already have.
Steffen Klassert April 22, 2024, 4:11 p.m. UTC | #3
On Mon, Apr 22, 2024 at 02:49:44PM +0200, Sabrina Dubroca wrote:
> 
> Actually it looks like we still have 4B in xfrm_mode_skb_cb:
> 
> struct xfrm_mode_skb_cb {
> 	struct xfrm_tunnel_skb_cb  header;               /*     0    32 */
> 	__be16                     id;                   /*    32     2 */
> 	__be16                     frag_off;             /*    34     2 */
> 	u8                         ihl;                  /*    36     1 */
> 	u8                         tos;                  /*    37     1 */
> 	u8                         ttl;                  /*    38     1 */
> 	u8                         protocol;             /*    39     1 */
> 	u8                         optlen;               /*    40     1 */
> 	u8                         flow_lbl[3];          /*    41     3 */
> 
> 	/* size: 48, cachelines: 1, members: 9 */
> 	/* padding: 4 */
> 	/* last cacheline: 48 bytes */
> };
> 
> flow_lbl ends at 44, so adding orig_mac_len should be fine. I don't
> see any config options that would increase the size of
> xfrm_mode_skb_cb compared to what I already have.

Right, I overlooked the 4 byte padding in the pahole output.
diff mbox series

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9d24aec064e8..4ff48eda3f64 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3031,6 +3031,21 @@  static inline void skb_mac_header_rebuild(struct sk_buff *skb)
 	}
 }
 
+/* Move the full mac header up to current network_header.
+ * Leaves skb->data pointing at offset skb->mac_len into the mac_header.
+ * Must be provided the complete mac header length.
+ */
+static inline void skb_mac_header_rebuild_full(struct sk_buff *skb, u32 full_mac_len)
+{
+	if (skb_mac_header_was_set(skb)) {
+		const unsigned char *old_mac = skb_mac_header(skb);
+
+		skb_set_mac_header(skb, -full_mac_len);
+		memmove(skb_mac_header(skb), old_mac, full_mac_len);
+		__skb_push(skb, full_mac_len - skb->mac_len);
+	}
+}
+
 static inline int skb_checksum_start_offset(const struct sk_buff *skb)
 {
 	return skb->csum_start - skb_headroom(skb);
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 57c743b7e4fe..0331cfecb28b 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -675,6 +675,9 @@  struct xfrm_mode_skb_cb {
 
 	/* Used by IPv6 only, zero for IPv4. */
 	u8 flow_lbl[3];
+
+	/* Used to keep whole l2 header for transport mode GRO */
+	u32 orig_mac_len;
 };
 
 #define XFRM_MODE_SKB_CB(__skb) ((struct xfrm_mode_skb_cb *)&((__skb)->cb[0]))
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index dae35101d189..322f9dfee0c4 100644
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -63,7 +63,11 @@  int xfrm4_transport_finish(struct sk_buff *skb, int async)
 	ip_send_check(iph);
 
 	if (xo && (xo->flags & XFRM_GRO)) {
-		skb_mac_header_rebuild(skb);
+		/* The full l2 header needs to be preserved so that re-injecting the packet at l2
+		 * works correctly in the presence of vlan tags.
+		 */
+		skb_mac_header_rebuild_full(skb, XFRM_MODE_SKB_CB(skb)->orig_mac_len);
+		skb_reset_network_header(skb);
 		skb_reset_transport_header(skb);
 		return 0;
 	}
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index a17d783dc7c0..000981d4ca02 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -58,7 +58,11 @@  int xfrm6_transport_finish(struct sk_buff *skb, int async)
 	skb_postpush_rcsum(skb, skb_network_header(skb), nhlen);
 
 	if (xo && (xo->flags & XFRM_GRO)) {
-		skb_mac_header_rebuild(skb);
+		/* The full l2 header needs to be preserved so that re-injecting the packet at l2
+		 * works correctly in the presence of vlan tags.
+		 */
+		skb_mac_header_rebuild_full(skb, XFRM_MODE_SKB_CB(skb)->orig_mac_len);
+		skb_reset_network_header(skb);
 		skb_reset_transport_header(skb);
 		return 0;
 	}
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 161f535c8b94..d06f1b3cd322 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -394,6 +394,8 @@  static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb)
 	if (skb->transport_header != skb->network_header) {
 		memmove(skb_transport_header(skb),
 			skb_network_header(skb), ihl);
+		XFRM_MODE_SKB_CB(skb)->orig_mac_len =
+			skb_mac_header_was_set(skb) ? skb_mac_header_len(skb) : 0;
 		skb->network_header = skb->transport_header;
 	}
 	ip_hdr(skb)->tot_len = htons(skb->len + ihl);
@@ -409,6 +411,8 @@  static int xfrm6_transport_input(struct xfrm_state *x, struct sk_buff *skb)
 	if (skb->transport_header != skb->network_header) {
 		memmove(skb_transport_header(skb),
 			skb_network_header(skb), ihl);
+		XFRM_MODE_SKB_CB(skb)->orig_mac_len =
+			skb_mac_header_was_set(skb) ? skb_mac_header_len(skb) : 0;
 		skb->network_header = skb->transport_header;
 	}
 	ipv6_hdr(skb)->payload_len = htons(skb->len + ihl -