[ipsec,v2] xfrm: check MAC header is shown with both skb->mac_len and skb_mac_header_was_set()

Message ID 20240912071702.221128-1-en-wei.wu@canonical.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [ipsec,v2] xfrm: check MAC header is shown with both skb->mac_len and skb_mac_header_was_set()

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 16 this patch: 16
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 6 of 7 maintainers
netdev/build_clang              success  Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 19 this patch: 19
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

En-Wei WU Sept. 12, 2024, 7:17 a.m. UTC
When we use Intel WWAN with xfrm, our system always hangs after
browsing websites for a few seconds. The error message shows that
it is a slab-out-of-bounds error:

[ 67.162014] BUG: KASAN: slab-out-of-bounds in xfrm_input+0x426e/0x6740
[ 67.162030] Write of size 2 at addr ffff888156cb814b by task ksoftirqd/2/26

[ 67.162043] CPU: 2 UID: 0 PID: 26 Comm: ksoftirqd/2 Not tainted 6.11.0-rc6-c763c4339688+ #2
[ 67.162053] Hardware name: Dell Inc. Latitude 5340/0SG010, BIOS 1.15.0 07/15/2024
[ 67.162058] Call Trace:
[ 67.162062] <TASK>
[ 67.162068] dump_stack_lvl+0x76/0xa0
[ 67.162079] print_report+0xce/0x5f0
[ 67.162088] ? xfrm_input+0x426e/0x6740
[ 67.162096] ? kasan_complete_mode_report_info+0x26/0x200
[ 67.162105] ? xfrm_input+0x426e/0x6740
[ 67.162112] kasan_report+0xbe/0x110
[ 67.162119] ? xfrm_input+0x426e/0x6740
[ 67.162129] __asan_report_store_n_noabort+0x12/0x30
[ 67.162138] xfrm_input+0x426e/0x6740
[ 67.162149] ? __pfx_xfrm_input+0x10/0x10
[ 67.162160] ? __kasan_check_read+0x11/0x20
[ 67.162168] ? __call_rcu_common+0x3e7/0x15b0
[ 67.162178] xfrm4_rcv_encap+0x214/0x470
[ 67.162186] ? __xfrm4_udp_encap_rcv.part.0+0x3cd/0x560
[ 67.162195] xfrm4_udp_encap_rcv+0xdd/0xf0
[ 67.162203] udp_queue_rcv_one_skb+0x880/0x12f0
[ 67.162212] udp_queue_rcv_skb+0x139/0xa90
[ 67.162221] udp_unicast_rcv_skb+0x116/0x350
[ 67.162229] __udp4_lib_rcv+0x213b/0x3410
[ 67.162237] ? ldsem_down_write+0x211/0x4ed
[ 67.162246] ? __pfx___udp4_lib_rcv+0x10/0x10
[ 67.162254] ? __pfx_raw_local_deliver+0x10/0x10
[ 67.162262] ? __pfx_cache_tag_flush_range_np+0x10/0x10
[ 67.162273] udp_rcv+0x86/0xb0
[ 67.162280] ip_protocol_deliver_rcu+0x152/0x380
[ 67.162289] ip_local_deliver_finish+0x282/0x370
[ 67.162296] ip_local_deliver+0x1a8/0x380
[ 67.162303] ? __pfx_ip_local_deliver+0x10/0x10
[ 67.162310] ? ip_rcv_finish_core.constprop.0+0x481/0x1ce0
[ 67.162317] ? ip_rcv_core+0x5df/0xd60
[ 67.162325] ip_rcv+0x2fc/0x380
[ 67.162332] ? __pfx_ip_rcv+0x10/0x10
[ 67.162338] ? __pfx_dma_map_page_attrs+0x10/0x10
[ 67.162346] ? __kasan_check_write+0x14/0x30
[ 67.162354] ? __build_skb_around+0x23a/0x350
[ 67.162363] ? __pfx_ip_rcv+0x10/0x10
[ 67.162369] __netif_receive_skb_one_core+0x173/0x1d0
[ 67.162377] ? __pfx___netif_receive_skb_one_core+0x10/0x10
[ 67.162386] ? __kasan_check_write+0x14/0x30
[ 67.162394] ? _raw_spin_lock_irq+0x8b/0x100
[ 67.162402] __netif_receive_skb+0x21/0x160
[ 67.162409] process_backlog+0x1c0/0x590
[ 67.162417] __napi_poll+0xab/0x550
[ 67.162425] net_rx_action+0x53e/0xd10
[ 67.162434] ? __pfx_net_rx_action+0x10/0x10
[ 67.162443] ? __pfx_wake_up_var+0x10/0x10
[ 67.162453] ? tasklet_action_common.constprop.0+0x22c/0x670
[ 67.162463] handle_softirqs+0x18f/0x5d0
[ 67.162472] ? __pfx_run_ksoftirqd+0x10/0x10
[ 67.162480] run_ksoftirqd+0x3c/0x60
[ 67.162487] smpboot_thread_fn+0x2f3/0x700
[ 67.162497] kthread+0x2b5/0x390
[ 67.162505] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 67.162512] ? __pfx_kthread+0x10/0x10
[ 67.162519] ret_from_fork+0x43/0x90
[ 67.162527] ? __pfx_kthread+0x10/0x10
[ 67.162534] ret_from_fork_asm+0x1a/0x30
[ 67.162544] </TASK>

[ 67.162551] The buggy address belongs to the object at ffff888156cb8000
                which belongs to the cache kmalloc-rnd-09-8k of size 8192
[ 67.162557] The buggy address is located 331 bytes inside of
                allocated 8192-byte region [ffff888156cb8000, ffff888156cba000)

[ 67.162566] The buggy address belongs to the physical page:
[ 67.162570] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x156cb8
[ 67.162578] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 67.162583] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[ 67.162591] page_type: 0xfdffffff(slab)
[ 67.162599] raw: 0017ffffc0000040 ffff888100056780 dead000000000122 0000000000000000
[ 67.162605] raw: 0000000000000000 0000000080020002 00000001fdffffff 0000000000000000
[ 67.162611] head: 0017ffffc0000040 ffff888100056780 dead000000000122 0000000000000000
[ 67.162616] head: 0000000000000000 0000000080020002 00000001fdffffff 0000000000000000
[ 67.162621] head: 0017ffffc0000003 ffffea00055b2e01 ffffffffffffffff 0000000000000000
[ 67.162626] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
[ 67.162630] page dumped because: kasan: bad access detected

[ 67.162636] Memory state around the buggy address:
[ 67.162640] ffff888156cb8000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162645] ffff888156cb8080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162650] >ffff888156cb8100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162653] ^
[ 67.162658] ffff888156cb8180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162663] ffff888156cb8200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

The reason is that eth_hdr(skb) inside the if statement evaluates
to an unexpected address when skb->mac_header = ~0U (indicating that
there is no MAC header). Because skb->mac_len is unreliable, the if
condition can be true even when there is no MAC header inside the
skb data buffer.

Checking both skb->mac_len and skb_mac_header_was_set(skb) fixes this
issue.

Fixes: 87cdf3148b11 ("xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto")
Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
---
Changes in v2:
* Change the title from "xfrm: avoid using skb->mac_len to decide if mac header is shown"
* Keep the skb->mac_len check
* Apply the fix to the IPv6 path too
---
 net/xfrm/xfrm_input.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
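
The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct fake_skb, the fake_* helpers, old_check(), new_check() and the offset 0x60 are hypothetical stand-ins, and it assumes 16-bit header offsets with 0xffff as the "not set" sentinel. It shows how a plain subtraction produces a non-zero mac_len when mac_header was never set, so a mac_len-only test passes while the combined test does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct sk_buff fields involved. */
struct fake_skb {
	uint16_t mac_header;      /* offset from head; 0xffff means "not set" */
	uint16_t network_header;  /* offset from head */
	uint16_t mac_len;
};

/* Models the sentinel comparison: mac_header != ~0U. */
static inline int fake_mac_header_was_set(const struct fake_skb *skb)
{
	return skb->mac_header != (uint16_t)~0U;
}

/* Models computing mac_len by plain subtraction, with no guard. */
static inline void fake_reset_mac_len(struct fake_skb *skb)
{
	skb->mac_len = (uint16_t)(skb->network_header - skb->mac_header);
}

/* Condition before the patch: mac_len only. */
static inline int old_check(const struct fake_skb *skb)
{
	return skb->mac_len != 0;
}

/* Condition from the patch: mac_len and a valid mac_header. */
static inline int new_check(const struct fake_skb *skb)
{
	return skb->mac_len && fake_mac_header_was_set(skb);
}

/* Walk through the failing case: mac_header unset, made-up network offset. */
static inline void demo_unset_mac_header(void)
{
	struct fake_skb skb = { .mac_header = (uint16_t)~0U, .network_header = 0x60 };

	fake_reset_mac_len(&skb);
	assert(skb.mac_len != 0);   /* wrapped subtraction: garbage but non-zero */
	assert(old_check(&skb));    /* old test would go on to write via eth_hdr() */
	assert(!new_check(&skb));   /* patched test skips the write */
}
```

With these made-up offsets the wrapped subtraction gives mac_len = 0x61, which is why a mac_len-only condition cannot tell a real MAC header from a missing one.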

Comments

Eric Dumazet Sept. 12, 2024, 7:35 a.m. UTC | #1
On Thu, Sep 12, 2024 at 9:17 AM En-Wei Wu <en-wei.wu@canonical.com> wrote:
>
> When we use Intel WWAN with xfrm, our system always hangs after
> browsing websites for a few seconds. The error message shows that
> it is a slab-out-of-bounds error:
>
> [ 67.162014] BUG: KASAN: slab-out-of-bounds in xfrm_input+0x426e/0x6740
> [ 67.162030] Write of size 2 at addr ffff888156cb814b by task ksoftirqd/2/26
>
> The reason is that the eth_hdr(skb) inside if statement evaluated
> to an unexpected address with skb->mac_header = ~0U (indicating there
> is no MAC header). The unreliability of skb->mac_len causes the if
> statement to become true even if there is no MAC header inside the
> skb data buffer.
>
> Check both the skb->mac_len and skb_mac_header_was_set(skb) fixes this issue.
>
> Fixes: 87cdf3148b11 ("xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto")
> Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
> ---
> Changes in v2:
> * Change the title from "xfrm: avoid using skb->mac_len to decide if mac header is shown"
> * Remain skb->mac_len check
> * Apply fix on ipv6 path too
> ---
>  net/xfrm/xfrm_input.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 749e7eea99e4..eef0145c73a7 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -251,7 +251,7 @@ static int xfrm4_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
>
>         skb_reset_network_header(skb);
>         skb_mac_header_rebuild(skb);
> -       if (skb->mac_len)
> +       if (skb->mac_len && skb_mac_header_was_set(skb))
>                 eth_hdr(skb)->h_proto = skb->protocol;

I would swap the two conditions:
in the future, debug kernels might leave mac_len uninitialized if
mac_header was never set.

It would be nice to catch the issue sooner.
Is something calling skb_reset_mac_len() while the mac_header was not set?
Considering the stack trace, I cannot see why mac_header is not set.
Could you try the following patch, and compile your test kernel with
CONFIG_DEBUG_NET=y ?

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 39f1d16f362887821caa022464695c4045461493..fb06dc81039253bafeb49f0b7228748e898f480f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2909,9 +2909,19 @@ static inline void skb_reset_inner_headers(struct sk_buff *skb)
        skb->inner_transport_header = skb->transport_header;
 }

+static inline int skb_mac_header_was_set(const struct sk_buff *skb)
+{
+       return skb->mac_header != (typeof(skb->mac_header))~0U;
+}
+
 static inline void skb_reset_mac_len(struct sk_buff *skb)
 {
-       skb->mac_len = skb->network_header - skb->mac_header;
+       if (!skb_mac_header_was_set(skb)) {
+               DEBUG_NET_WARN_ON_ONCE(1);
+               skb->mac_len = 0;
+       } else {
+               skb->mac_len = skb->network_header - skb->mac_header;
+       }
 }

 static inline unsigned char *skb_inner_transport_header(const struct sk_buff
@@ -3014,11 +3024,6 @@ static inline void skb_set_network_header(struct sk_buff *skb, const int offset)
        skb->network_header += offset;
 }

-static inline int skb_mac_header_was_set(const struct sk_buff *skb)
-{
-       return skb->mac_header != (typeof(skb->mac_header))~0U;
-}
-
 static inline unsigned char *skb_mac_header(const struct sk_buff *skb)
 {
        DEBUG_NET_WARN_ON_ONCE(!skb_mac_header_was_set(skb));
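
The effect of the guarded skb_reset_mac_len() in the debug patch above can be sketched in userspace. The names here (struct fake_skb, fake_warn_count, the fake_* helpers) and the offsets are hypothetical, and a plain counter stands in for DEBUG_NET_WARN_ON_ONCE(); the sketch assumes 16-bit offsets with 0xffff as the "not set" sentinel. It only aims to show that, with the guard, an unset mac_header leaves mac_len at 0 instead of a wrapped, non-zero value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the relevant sk_buff fields. */
struct fake_skb {
	uint16_t mac_header;      /* 0xffff means "not set" */
	uint16_t network_header;
	uint16_t mac_len;
};

/* Stand-in for the debug warning: counts how often the guard fired. */
static unsigned int fake_warn_count;

static inline int fake_mac_header_was_set(const struct fake_skb *skb)
{
	return skb->mac_header != (uint16_t)~0U;
}

/* Guarded variant, following the shape of the debug patch. */
static inline void fake_reset_mac_len_guarded(struct fake_skb *skb)
{
	if (!fake_mac_header_was_set(skb)) {
		fake_warn_count++;   /* the kernel would emit a one-time warning here */
		skb->mac_len = 0;
	} else {
		skb->mac_len = (uint16_t)(skb->network_header - skb->mac_header);
	}
}

/* An unset mac_header now yields mac_len == 0, even if it held garbage before. */
static inline void demo_guarded_reset(void)
{
	struct fake_skb skb = {
		.mac_header = (uint16_t)~0U,
		.network_header = 0x60,
		.mac_len = 0x61,
	};

	fake_reset_mac_len_guarded(&skb);
	assert(skb.mac_len == 0);
}
```

In this model a later `if (skb->mac_len)` test is false for an skb without a MAC header, and the counter records where the unexpected call came from, which is the information the debug build is meant to surface.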
Peter Seiderer Sept. 12, 2024, 9:35 a.m. UTC | #2
Hello *,

On Thu, 12 Sep 2024 15:17:02 +0800, En-Wei Wu <en-wei.wu@canonical.com> wrote:

> When we use Intel WWAN with xfrm, our system always hangs after
> browsing websites for a few seconds. The error message shows that
> it is a slab-out-of-bounds error:
>
> [ 67.162014] BUG: KASAN: slab-out-of-bounds in xfrm_input+0x426e/0x6740
> [ 67.162030] Write of size 2 at addr ffff888156cb814b by task ksoftirqd/2/26
>
> The reason is that the eth_hdr(skb) inside if statement evaluated
> to an unexpected address with skb->mac_header = ~0U (indicating there
> is no MAC header). The unreliability of skb->mac_len causes the if
> statement to become true even if there is no MAC header inside the
> skb data buffer.
>
> Check both the skb->mac_len and skb_mac_header_was_set(skb) fixes this issue.
>
> Fixes: 87cdf3148b11 ("xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto")
> Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com>
> ---
> Changes in v2:
> * Change the title from "xfrm: avoid using skb->mac_len to decide if mac header is shown"
> * Remain skb->mac_len check
> * Apply fix on ipv6 path too
> ---
>  net/xfrm/xfrm_input.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 749e7eea99e4..eef0145c73a7 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -251,7 +251,7 @@ static int xfrm4_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
>
>  	skb_reset_network_header(skb);
>  	skb_mac_header_rebuild(skb);
> -	if (skb->mac_len)
> +	if (skb->mac_len && skb_mac_header_was_set(skb))
>  		eth_hdr(skb)->h_proto = skb->protocol;
>
>  	err = 0;
> @@ -288,7 +288,7 @@ static int xfrm6_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
>
>  	skb_reset_network_header(skb);
>  	skb_mac_header_rebuild(skb);
> -	if (skb->mac_len)
> +	if (skb->mac_len && skb_mac_header_was_set(skb))
>  		eth_hdr(skb)->h_proto = skb->protocol;
>
>  	err = 0;

Same change (and request for more debugging) already suggested in 2023, see [1]...

Regards,
Peter

[1] https://lore.kernel.org/netdev/d1cf5a66-03e1-44b8-929d-ac123b1bbd7b@sylv.io/T/
Eric Dumazet Sept. 12, 2024, 10:53 a.m. UTC | #3
On Thu, Sep 12, 2024 at 11:35 AM Peter Seiderer <ps.report@gmx.net> wrote:
>

> Same change (and request for more debugging) already suggested in 2023, see [1]...
>
> Regards,
> Peter
>
> [1] https://lore.kernel.org/netdev/d1cf5a66-03e1-44b8-929d-ac123b1bbd7b@sylv.io/T/

Indeed !
Nice to see some consistency among us :)
En-Wei WU Sept. 13, 2024, 5:29 a.m. UTC | #4
> Could you try the following patch, and compile your test kernel with
> CONFIG_DEBUG_NET=y ?
[  323.870221] ------------[ cut here ]------------
[  323.870226] WARNING: CPU: 2 PID: 26 at include/linux/skbuff.h:2904 __netif_receive_skb_core.constprop.0+0x201/0x39d0
[  323.870369] CPU: 2 UID: 0 PID: 26 Comm: ksoftirqd/2 Not tainted 6.11.0-rc6-c763c4339688+ #12
[  323.870372] Hardware name: Dell Inc. Latitude 5340/0SG010, BIOS 1.15.0 07/15/2024
[  323.870373] RIP: 0010:__netif_receive_skb_core.constprop.0+0x201/0x39d0
[  323.870376] Code: 03 0f b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 b4 24 00 00 41 0f b7 87 ba 00 00 00 29 c3 66 83 f8 ff 75 04 <0f> 0b 31 db 48 b8 00 00 00 00 00 fc ff df 49 8d 7f 78 48 89 fa 48
[  323.870378] RSP: 0018:ffffc90000377838 EFLAGS: 00010246
[  323.870380] RAX: 000000000000ffff RBX: 00000000ffff0061 RCX: ffff88876cf48090
[  323.870381] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8881756b2e7a
[  323.870382] RBP: ffffc90000377a88 R08: ffff88876cf48184 R09: 0000000000000000
[  323.870383] R10: 0000000000000000 R11: 1ffff1102ead65b9 R12: ffff8881756b2dc0
[  323.870384] R13: ffffc90000377b20 R14: ffff8881635ca000 R15: ffff8881756b2dc0
[  323.870385] FS:  0000000000000000(0000) GS:ffff88876cf00000(0000) knlGS:0000000000000000
[  323.870387] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  323.870388] CR2: 0000769acfa9d080 CR3: 0000000712498000 CR4: 0000000000f50ef0
[  323.870389] PKRU: 55555554
[  323.870390] Call Trace:
[  323.870391]  <TASK>
[  323.870393]  ? show_regs+0x71/0x90
[  323.870397]  ? __warn+0xce/0x270
[  323.870399]  ? __netif_receive_skb_core.constprop.0+0x201/0x39d0
[  323.870401]  ? report_bug+0x2ad/0x300
[  323.870404]  ? handle_bug+0x46/0x90
[  323.870407]  ? exc_invalid_op+0x19/0x50
[  323.870409]  ? asm_exc_invalid_op+0x1b/0x20
[  323.870413]  ? __netif_receive_skb_core.constprop.0+0x201/0x39d0
[  323.870415]  ? intel_iommu_iotlb_sync_map+0x1a/0x30
[  323.870418]  ? iommu_map+0xab/0x140
[  323.870421]  ? __pfx___netif_receive_skb_core.constprop.0+0x10/0x10
[  323.870423]  ? iommu_dma_map_page+0x159/0x720
[  323.870425]  ? dma_map_page_attrs+0x568/0xdc0
[  323.870427]  ? __kasan_slab_alloc+0x9d/0xa0
[  323.870430]  ? __pfx_dma_map_page_attrs+0x10/0x10
[  323.870431]  ? __kasan_check_write+0x14/0x30
[  323.870434]  ? __build_skb_around+0x23a/0x350
[  323.870437]  __netif_receive_skb_one_core+0xb4/0x1d0
[  323.870439]  ? __pfx___netif_receive_skb_one_core+0x10/0x10
[  323.870441]  ? __kasan_check_write+0x14/0x30
[  323.870443]  ? _raw_spin_lock_irq+0x8b/0x100
[  323.870445]  __netif_receive_skb+0x21/0x160
[  323.870447]  process_backlog+0x1c0/0x590
[  323.870449]  __napi_poll+0xab/0x560
[  323.870451]  net_rx_action+0x53e/0xd10
[  323.870453]  ? __pfx_net_rx_action+0x10/0x10
[  323.870455]  ? __pfx_wake_up_var+0x10/0x10
[  323.870457]  ? tasklet_action_common.constprop.0+0x22c/0x670
[  323.870461]  handle_softirqs+0x18f/0x5d0
[  323.870463]  ? __pfx_run_ksoftirqd+0x10/0x10
[  323.870465]  run_ksoftirqd+0x3c/0x60
[  323.870467]  smpboot_thread_fn+0x2f3/0x700
[  323.870470]  kthread+0x2b5/0x390
[  323.870472]  ? __pfx_smpboot_thread_fn+0x10/0x10
[  323.870474]  ? __pfx_kthread+0x10/0x10
[  323.870476]  ret_from_fork+0x43/0x90
[  323.870478]  ? __pfx_kthread+0x10/0x10
[  323.870480]  ret_from_fork_asm+0x1a/0x30
[  323.870483]  </TASK>
[  323.870484] ---[ end trace 0000000000000000 ]---
[  350.300485] Initializing XFRM netlink socket
[  351.586993] ------------[ cut here ]------------
[  351.586999] WARNING: CPU: 2 PID: 26 at include/linux/skbuff.h:2904 dev_gro_receive+0x172c/0x2860
[  351.587141] CPU: 2 UID: 0 PID: 26 Comm: ksoftirqd/2 Tainted: G   W          6.11.0-rc6-c763c4339688+ #12
[  351.587144] Tainted: [W]=WARN
[  351.587145] Hardware name: Dell Inc. Latitude 5340/0SG010, BIOS 1.15.0 07/15/2024
[  351.587147] RIP: 0010:dev_gro_receive+0x172c/0x2860
[  351.587149] Code: 07 83 c2 01 38 ca 7c 08 84 c9 0f 85 d2 09 00 00 8d 14 c5 00 00 00 00 41 0f b6 45 46 83 e0 c7 09 d0 41 88 45 46 e9 ee f9 ff ff <0f> 0b 45 31 f6 e9 64 f7 ff ff 45 31 e4 81 e3 c0 00 00 00 41 0f 95
[  351.587151] RSP: 0018:ffffc90000377aa8 EFLAGS: 00010246
[  351.587153] RAX: ffff888128d72840 RBX: ffffffff95a0d9c0 RCX: 0000000000000000
[  351.587154] RDX: 000000000000ffff RSI: ffff88876cf52418 RDI: ffff88815880ad3a
[  351.587155] RBP: ffffc90000377b48 R08: 0000000000000000 R09: 0000000000000000
[  351.587156] R10: 1ffff110ed9ea481 R11: 0000000000000000 R12: ffffffff95a0d9d0
[  351.587157] R13: ffff88815880ac80 R14: 00000000ffff008d R15: ffff88815880acb8
[  351.587159] FS:  0000000000000000(0000) GS:ffff88876cf00000(0000) knlGS:0000000000000000
[  351.587160] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  351.587161] CR2: 000078e9ea9e25b0 CR3: 0000000712498000 CR4: 0000000000f50ef0
[  351.587163] PKRU: 55555554
[  351.587163] Call Trace:
[  351.587164]  <TASK>
[  351.587167]  ? show_regs+0x71/0x90
[  351.587171]  ? __warn+0xce/0x270
[  351.587173]  ? dev_gro_receive+0x172c/0x2860
[  351.587175]  ? report_bug+0x2ad/0x300
[  351.587178]  ? handle_bug+0x46/0x90
[  351.587181]  ? exc_invalid_op+0x19/0x50
[  351.587182]  ? asm_exc_invalid_op+0x1b/0x20
[  351.587187]  ? dev_gro_receive+0x172c/0x2860
[  351.587188]  ? dev_gro_receive+0xcdd/0x2860
[  351.587190]  ? __pfx___netif_receive_skb_one_core+0x10/0x10
[  351.587192]  ? __mutex_lock.constprop.0+0x150/0x1180
[  351.587195]  napi_gro_receive+0x3a2/0x900
[  351.587197]  gro_cell_poll+0xe5/0x1d0
[  351.587200]  __napi_poll+0xab/0x560
[  351.587202]  net_rx_action+0x53e/0xd10
[  351.587204]  ? __pfx_net_rx_action+0x10/0x10
[  351.587206]  ? __pfx_wake_up_var+0x10/0x10
[  351.587209]  ? tasklet_action_common.constprop.0+0x22c/0x670
[  351.587212]  handle_softirqs+0x18f/0x5d0
[  351.587214]  ? __pfx_run_ksoftirqd+0x10/0x10
[  351.587216]  run_ksoftirqd+0x3c/0x60
[  351.587218]  smpboot_thread_fn+0x2f3/0x700
[  351.587220]  kthread+0x2b5/0x390
[  351.587223]  ? __pfx_smpboot_thread_fn+0x10/0x10
[  351.587224]  ? __pfx_kthread+0x10/0x10
[  351.587226]  ret_from_fork+0x43/0x90
[  351.587229]  ? __pfx_kthread+0x10/0x10
[  351.587231]  ret_from_fork_asm+0x1a/0x30
[  351.587234]  </TASK>
[  351.587235] ---[ end trace 0000000000000000 ]---

It seems that __netif_receive_skb_core() and dev_gro_receive() are
the places where skb_reset_mac_len() is called with skb->mac_header =
~0U.

Eric Dumazet Sept. 13, 2024, 7:04 a.m. UTC | #5
On Fri, Sep 13, 2024 at 7:29 AM En-Wei WU <en-wei.wu@canonical.com> wrote:
>
> > Could you try the following patch, and compile your test kernel with
> > CONFIG_DEBUG_NET=y ?
>
> Seems like the __netif_receive_skb_core() and dev_gro_receive() are
> the places where it calls skb_reset_mac_len() with skb->mac_header =
> ~0U.

Ouch, let me take a look.

Patch

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 749e7eea99e4..eef0145c73a7 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -251,7 +251,7 @@  static int xfrm4_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
 
 	skb_reset_network_header(skb);
 	skb_mac_header_rebuild(skb);
-	if (skb->mac_len)
+	if (skb->mac_len && skb_mac_header_was_set(skb))
 		eth_hdr(skb)->h_proto = skb->protocol;
 
 	err = 0;
@@ -288,7 +288,7 @@  static int xfrm6_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)
 
 	skb_reset_network_header(skb);
 	skb_mac_header_rebuild(skb);
-	if (skb->mac_len)
+	if (skb->mac_len && skb_mac_header_was_set(skb))
 		eth_hdr(skb)->h_proto = skb->protocol;
 
 	err = 0;