Message ID | 20240422025711.145577-1-paul.davey@alliedtelesis.co.nz (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] xfrm: Preserve vlan tags for transport mode software GRO | expand |
On Mon, Apr 22, 2024 at 02:56:20PM +1200, Paul Davey wrote: > The software GRO path for esp transport mode uses skb_mac_header_rebuild > prior to re-injecting the packet via the xfrm_napi_dev. This only > copies skb->mac_len bytes of header which may not be sufficient if the > packet contains 802.1Q tags or other VLAN tags. Worse copying only the > initial header will leave a packet marked as being VLAN tagged but > without the corresponding tag leading to mangling when it is later > untagged. > > The VLAN tags are important when receiving the decrypted esp transport > mode packet after GRO processing to ensure it is received on the correct > interface. > > Therefore record the full mac header length in xfrm*_transport_input for > later use in correpsonding xfrm*_transport_finish to copy the entire mac > header when rebuilding the mac header for GRO. The skb->data pointer is > left pointing skb->mac_header bytes after the start of the mac header as > is expected by the network stack and network and transport header > offsets reset to this location. > > Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> Please add a 'Fixes:' tag so it can be backported to stable. > diff --git a/include/net/xfrm.h b/include/net/xfrm.h > index 57c743b7e4fe..0331cfecb28b 100644 > --- a/include/net/xfrm.h > +++ b/include/net/xfrm.h > @@ -675,6 +675,9 @@ struct xfrm_mode_skb_cb { > > /* Used by IPv6 only, zero for IPv4. */ > u8 flow_lbl[3]; > + > + /* Used to keep whole l2 header for transport mode GRO */ > + u32 orig_mac_len; xfrm_mode_skb_cb has already reached the maximum size of 48 bytes. Adding more will overwrite data in the 'struct sk_buff'. Try to store this in 'struct xfrm_offload'. Thanks!
2024-04-22, 11:56:09 +0200, Steffen Klassert wrote: > On Mon, Apr 22, 2024 at 02:56:20PM +1200, Paul Davey wrote: > > The software GRO path for esp transport mode uses skb_mac_header_rebuild > > prior to re-injecting the packet via the xfrm_napi_dev. This only > > copies skb->mac_len bytes of header which may not be sufficient if the > > packet contains 802.1Q tags or other VLAN tags. Worse copying only the > > initial header will leave a packet marked as being VLAN tagged but > > without the corresponding tag leading to mangling when it is later > > untagged. > > > > The VLAN tags are important when receiving the decrypted esp transport > > mode packet after GRO processing to ensure it is received on the correct > > interface. > > > > Therefore record the full mac header length in xfrm*_transport_input for > > later use in correpsonding xfrm*_transport_finish to copy the entire mac > > header when rebuilding the mac header for GRO. The skb->data pointer is > > left pointing skb->mac_header bytes after the start of the mac header as > > is expected by the network stack and network and transport header > > offsets reset to this location. > > > > Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> > > Please add a 'Fixes:' tag so it can be backported to stable. > > > diff --git a/include/net/xfrm.h b/include/net/xfrm.h > > index 57c743b7e4fe..0331cfecb28b 100644 > > --- a/include/net/xfrm.h > > +++ b/include/net/xfrm.h > > @@ -675,6 +675,9 @@ struct xfrm_mode_skb_cb { > > > > /* Used by IPv6 only, zero for IPv4. */ > > u8 flow_lbl[3]; > > + > > + /* Used to keep whole l2 header for transport mode GRO */ > > + u32 orig_mac_len; > > xfrm_mode_skb_cb has already reached the maximum size of 48 bytes. > Adding more will overwrite data in the 'struct sk_buff'. I thought we already had a BUILD_BUG_ON(sizeof(struct xfrm_mode_skb_cb) > sizeof_field(struct sk_buff, cb)) somewhere, but apparently not. I guess it's time to add one? (and xfrm_spi_skb_cb, xfrm_skb_cb) diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c index 161f535c8b94..afc8b3c881e2 100644 --- a/net/xfrm/xfrm_input.c +++ b/net/xfrm/xfrm_input.c @@ -793,6 +793,8 @@ void __init xfrm_input_init(void) int err; int i; + BUILD_BUG_ON(sizeof(struct xfrm_mode_skb_cb) > sizeof_field(struct sk_buff, cb)); + init_dummy_netdev(&xfrm_napi_dev); err = gro_cells_init(&gro_cells, &xfrm_napi_dev); if (err) Actually it looks like we still have 4B in xfrm_mode_skb_cb: struct xfrm_mode_skb_cb { struct xfrm_tunnel_skb_cb header; /* 0 32 */ __be16 id; /* 32 2 */ __be16 frag_off; /* 34 2 */ u8 ihl; /* 36 1 */ u8 tos; /* 37 1 */ u8 ttl; /* 38 1 */ u8 protocol; /* 39 1 */ u8 optlen; /* 40 1 */ u8 flow_lbl[3]; /* 41 3 */ /* size: 48, cachelines: 1, members: 9 */ /* padding: 4 */ /* last cacheline: 48 bytes */ }; flow_lbl ends at 44, so adding orig_mac_len should be fine. I don't see any config options that would increase the size of xfrm_mode_skb_cb compared to what I already have.
On Mon, Apr 22, 2024 at 02:49:44PM +0200, Sabrina Dubroca wrote: > > Actually it looks like we still have 4B in xfrm_mode_skb_cb: > > struct xfrm_mode_skb_cb { > struct xfrm_tunnel_skb_cb header; /* 0 32 */ > __be16 id; /* 32 2 */ > __be16 frag_off; /* 34 2 */ > u8 ihl; /* 36 1 */ > u8 tos; /* 37 1 */ > u8 ttl; /* 38 1 */ > u8 protocol; /* 39 1 */ > u8 optlen; /* 40 1 */ > u8 flow_lbl[3]; /* 41 3 */ > > /* size: 48, cachelines: 1, members: 9 */ > /* padding: 4 */ > /* last cacheline: 48 bytes */ > }; > > flow_lbl ends at 44, so adding orig_mac_len should be fine. I don't > see any config options that would increase the size of > xfrm_mode_skb_cb compared to what I already have. Right, I overlooked the 4 byte padding in the pahole output.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 9d24aec064e8..4ff48eda3f64 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -3031,6 +3031,21 @@ static inline void skb_mac_header_rebuild(struct sk_buff *skb) } } +/* Move the full mac header up to current network_header. + * Leaves skb->data pointing at offset skb->mac_len into the mac_header. + * Must be provided the complete mac header length. + */ +static inline void skb_mac_header_rebuild_full(struct sk_buff *skb, u32 full_mac_len) +{ + if (skb_mac_header_was_set(skb)) { + const unsigned char *old_mac = skb_mac_header(skb); + + skb_set_mac_header(skb, -full_mac_len); + memmove(skb_mac_header(skb), old_mac, full_mac_len); + __skb_push(skb, full_mac_len - skb->mac_len); + } +} + static inline int skb_checksum_start_offset(const struct sk_buff *skb) { return skb->csum_start - skb_headroom(skb); diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 57c743b7e4fe..0331cfecb28b 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -675,6 +675,9 @@ struct xfrm_mode_skb_cb { /* Used by IPv6 only, zero for IPv4. */ u8 flow_lbl[3]; + + /* Used to keep whole l2 header for transport mode GRO */ + u32 orig_mac_len; }; #define XFRM_MODE_SKB_CB(__skb) ((struct xfrm_mode_skb_cb *)&((__skb)->cb[0])) diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c index dae35101d189..322f9dfee0c4 100644 --- a/net/ipv4/xfrm4_input.c +++ b/net/ipv4/xfrm4_input.c @@ -63,7 +63,11 @@ int xfrm4_transport_finish(struct sk_buff *skb, int async) ip_send_check(iph); if (xo && (xo->flags & XFRM_GRO)) { - skb_mac_header_rebuild(skb); + /* The full l2 header needs to be preserved so that re-injecting the packet at l2 + * works correctly in the presence of vlan tags. + */ + skb_mac_header_rebuild_full(skb, XFRM_MODE_SKB_CB(skb)->orig_mac_len); + skb_reset_network_header(skb); skb_reset_transport_header(skb); return 0; } diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c index a17d783dc7c0..000981d4ca02 100644 --- a/net/ipv6/xfrm6_input.c +++ b/net/ipv6/xfrm6_input.c @@ -58,7 +58,11 @@ int xfrm6_transport_finish(struct sk_buff *skb, int async) skb_postpush_rcsum(skb, skb_network_header(skb), nhlen); if (xo && (xo->flags & XFRM_GRO)) { - skb_mac_header_rebuild(skb); + /* The full l2 header needs to be preserved so that re-injecting the packet at l2 + * works correctly in the presence of vlan tags. + */ + skb_mac_header_rebuild_full(skb, XFRM_MODE_SKB_CB(skb)->orig_mac_len); + skb_reset_network_header(skb); skb_reset_transport_header(skb); return 0; } diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c index 161f535c8b94..d06f1b3cd322 100644 --- a/net/xfrm/xfrm_input.c +++ b/net/xfrm/xfrm_input.c @@ -394,6 +394,8 @@ static int xfrm4_transport_input(struct xfrm_state *x, struct sk_buff *skb) if (skb->transport_header != skb->network_header) { memmove(skb_transport_header(skb), skb_network_header(skb), ihl); + XFRM_MODE_SKB_CB(skb)->orig_mac_len = + skb_mac_header_was_set(skb) ? skb_mac_header_len(skb) : 0; skb->network_header = skb->transport_header; } ip_hdr(skb)->tot_len = htons(skb->len + ihl); @@ -409,6 +411,8 @@ static int xfrm6_transport_input(struct xfrm_state *x, struct sk_buff *skb) if (skb->transport_header != skb->network_header) { memmove(skb_transport_header(skb), skb_network_header(skb), ihl); + XFRM_MODE_SKB_CB(skb)->orig_mac_len = + skb_mac_header_was_set(skb) ? skb_mac_header_len(skb) : 0; skb->network_header = skb->transport_header; } ipv6_hdr(skb)->payload_len = htons(skb->len + ihl -
The software GRO path for esp transport mode uses skb_mac_header_rebuild prior to re-injecting the packet via the xfrm_napi_dev. This only copies skb->mac_len bytes of header which may not be sufficient if the packet contains 802.1Q tags or other VLAN tags. Worse copying only the initial header will leave a packet marked as being VLAN tagged but without the corresponding tag leading to mangling when it is later untagged. The VLAN tags are important when receiving the decrypted esp transport mode packet after GRO processing to ensure it is received on the correct interface. Therefore record the full mac header length in xfrm*_transport_input for later use in correpsonding xfrm*_transport_finish to copy the entire mac header when rebuilding the mac header for GRO. The skb->data pointer is left pointing skb->mac_header bytes after the start of the mac header as is expected by the network stack and network and transport header offsets reset to this location. Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz> --- I discovered that incomplete mac header rebuilding was the cause of an issue I noticed with l2tp tunnels with IPSec transport mode protection not working properly when the tunnel was running over a VLAN. This issue was described in the linked mailing list post. I am not certain if the tunnel encapsulation modes would require similar adjustment though they do not seem to have an issue with packets passing. I am also uncertain of how this sort of problem may interact with interfaces which are bridged. Link: https://lore.kernel.org/r/ecf16770431c8b30782e3443085641eb685aa5f9.camel@alliedtelesis.co.nz/ include/linux/skbuff.h | 15 +++++++++++++++ include/net/xfrm.h | 3 +++ net/ipv4/xfrm4_input.c | 6 +++++- net/ipv6/xfrm6_input.c | 6 +++++- net/xfrm/xfrm_input.c | 4 ++++ 5 files changed, 32 insertions(+), 2 deletions(-)