diff mbox series

[bpf] lwt_bpf: fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook

Message ID 20220420165219.1755407-1-eyal.birger@gmail.com (mailing list archive)
State Accepted
Commit b02d196c44ead1a5949729be9ff08fe781c3e48a
Delegated to: BPF
Headers show
Series [bpf] lwt_bpf: fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 2 this patch: 2
netdev/cc_maintainers warning 2 maintainers not CCed: f.fainelli@gmail.com nikolay@nvidia.com
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes fail Problems with Fixes tag: 1
netdev/build_allmodconfig_warn success Errors and warnings before: 2 this patch: 2
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 26 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-2 success Logs for Kernel LATEST on z15 + selftests
bpf/vmtest-bpf-VM_Test-1 fail Logs for Kernel LATEST on ubuntu-latest + selftests

Commit Message

Eyal Birger April 20, 2022, 4:52 p.m. UTC
xmit_check_hhlen() observes the dst for getting the device hard header
length to make sure a modified packet can fit. When a helper which changes
the dst - such as bpf_skb_set_tunnel_key() - is called as part of the xmit
program the accessed dst is no longer valid.

This leads to the following splat:

 BUG: kernel NULL pointer dereference, address: 00000000000000de
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 0 PID: 798 Comm: ping Not tainted 5.18.0-rc2+ #103
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 RIP: 0010:bpf_xmit+0xfb/0x17f
 Code: c6 c0 4d cd 8e 48 c7 c7 7d 33 f0 8e e8 42 09 fb ff 48 8b 45 58 48 8b 95 c8 00 00 00 48 2b 95 c0 00 00 00 48 83 e0 fe 48 8b 00 <0f> b7 80 de 00 00 00 39 c2 73 22 29 d0 b9 20 0a 00 00 31 d2 48 89
 RSP: 0018:ffffb148c0bc7b98 EFLAGS: 00010282
 RAX: 0000000000000000 RBX: 0000000000240008 RCX: 0000000000000000
 RDX: 0000000000000010 RSI: 00000000ffffffea RDI: 00000000ffffffff
 RBP: ffff922a828a4e00 R08: ffffffff8f1350e8 R09: 00000000ffffdfff
 R10: ffffffff8f055100 R11: ffffffff8f105100 R12: 0000000000000000
 R13: ffff922a828a4e00 R14: 0000000000000040 R15: 0000000000000000
 FS:  00007f414e8f0080(0000) GS:ffff922afdc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00000000000000de CR3: 0000000002d80006 CR4: 0000000000370ef0
 Call Trace:
  <TASK>
  lwtunnel_xmit.cold+0x71/0xc8
  ip_finish_output2+0x279/0x520
  ? __ip_finish_output.part.0+0x21/0x130

Fix by fetching the device hard header length before running the bpf code.

Cc: stable@vger.kernel.org
Fixes: commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
---
 net/core/lwt_bpf.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Comments

Daniel Borkmann April 22, 2022, 3:48 p.m. UTC | #1
On 4/20/22 6:52 PM, Eyal Birger wrote:
> xmit_check_hhlen() observes the dst for getting the device hard header
> length to make sure a modified packet can fit. When a helper which changes
> the dst - such as bpf_skb_set_tunnel_key() - is called as part of the xmit
> program the accessed dst is no longer valid.
> 
> This leads to the following splat:
> 
>   BUG: kernel NULL pointer dereference, address: 00000000000000de
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 [#1] PREEMPT SMP PTI
>   CPU: 0 PID: 798 Comm: ping Not tainted 5.18.0-rc2+ #103
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
>   RIP: 0010:bpf_xmit+0xfb/0x17f
>   Code: c6 c0 4d cd 8e 48 c7 c7 7d 33 f0 8e e8 42 09 fb ff 48 8b 45 58 48 8b 95 c8 00 00 00 48 2b 95 c0 00 00 00 48 83 e0 fe 48 8b 00 <0f> b7 80 de 00 00 00 39 c2 73 22 29 d0 b9 20 0a 00 00 31 d2 48 89
>   RSP: 0018:ffffb148c0bc7b98 EFLAGS: 00010282
>   RAX: 0000000000000000 RBX: 0000000000240008 RCX: 0000000000000000
>   RDX: 0000000000000010 RSI: 00000000ffffffea RDI: 00000000ffffffff
>   RBP: ffff922a828a4e00 R08: ffffffff8f1350e8 R09: 00000000ffffdfff
>   R10: ffffffff8f055100 R11: ffffffff8f105100 R12: 0000000000000000
>   R13: ffff922a828a4e00 R14: 0000000000000040 R15: 0000000000000000
>   FS:  00007f414e8f0080(0000) GS:ffff922afdc00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00000000000000de CR3: 0000000002d80006 CR4: 0000000000370ef0
>   Call Trace:
>    <TASK>
>    lwtunnel_xmit.cold+0x71/0xc8
>    ip_finish_output2+0x279/0x520
>    ? __ip_finish_output.part.0+0x21/0x130
> 
> Fix by fetching the device hard header length before running the bpf code.
> 
> Cc: stable@vger.kernel.org
> Fixes: commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
> ---
>   net/core/lwt_bpf.c | 7 +++----
>   1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
> index 349480ef68a5..8b6b5e72b217 100644
> --- a/net/core/lwt_bpf.c
> +++ b/net/core/lwt_bpf.c
> @@ -159,10 +159,8 @@ static int bpf_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>   	return dst->lwtstate->orig_output(net, sk, skb);
>   }
>   
> -static int xmit_check_hhlen(struct sk_buff *skb)
> +static int xmit_check_hhlen(struct sk_buff *skb, int hh_len)
>   {
> -	int hh_len = skb_dst(skb)->dev->hard_header_len;
> -
>   	if (skb_headroom(skb) < hh_len) {
>   		int nhead = HH_DATA_ALIGN(hh_len - skb_headroom(skb));
>   
> @@ -274,6 +272,7 @@ static int bpf_xmit(struct sk_buff *skb)
>   
>   	bpf = bpf_lwt_lwtunnel(dst->lwtstate);
>   	if (bpf->xmit.prog) {
> +		int hh_len = dst->dev->hard_header_len;
>   		__be16 proto = skb->protocol;
>   		int ret;
>   
> @@ -291,7 +290,7 @@ static int bpf_xmit(struct sk_buff *skb)
>   			/* If the header was expanded, headroom might be too
>   			 * small for L2 header to come, expand as needed.
>   			 */
> -			ret = xmit_check_hhlen(skb);
> +			ret = xmit_check_hhlen(skb, hh_len);

Ok, makes sense given for BPF_OK the dst->dev shouldn't change here (e.g. as opposed
to BPF_REDIRECT). Applied, please also follow-up with a BPF selftest for test_progs
so that this won't break in future when it's running as part of BPF CI.

>   			if (unlikely(ret))
>   				return ret;
>   
> 

Thanks,
Daniel
patchwork-bot+netdevbpf@kernel.org April 22, 2022, 3:50 p.m. UTC | #2
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Wed, 20 Apr 2022 19:52:19 +0300 you wrote:
> xmit_check_hhlen() observes the dst for getting the device hard header
> length to make sure a modified packet can fit. When a helper which changes
> the dst - such as bpf_skb_set_tunnel_key() - is called as part of the xmit
> program the accessed dst is no longer valid.
> 
> This leads to the following splat:
> 
> [...]

Here is the summary with links:
  - [bpf] lwt_bpf: fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook
    https://git.kernel.org/bpf/bpf/c/b02d196c44ea

You are awesome, thank you!
Daniel Borkmann April 22, 2022, 3:55 p.m. UTC | #3
On 4/22/22 5:48 PM, Daniel Borkmann wrote:
> On 4/20/22 6:52 PM, Eyal Birger wrote:
[...]
> 
> Ok, makes sense given for BPF_OK the dst->dev shouldn't change here (e.g. as opposed
> to BPF_REDIRECT). Applied, please also follow-up with a BPF selftest for test_progs
> so that this won't break in future when it's running as part of BPF CI.

(Coverage for lwt BPF flavor from test_progs is afaik non-existent aside from some section
name tests. Would be great in general to have runtime tests asserting lwt behavior there if
you have a chance.)
diff mbox series

Patch

diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 349480ef68a5..8b6b5e72b217 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -159,10 +159,8 @@  static int bpf_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 	return dst->lwtstate->orig_output(net, sk, skb);
 }
 
-static int xmit_check_hhlen(struct sk_buff *skb)
+static int xmit_check_hhlen(struct sk_buff *skb, int hh_len)
 {
-	int hh_len = skb_dst(skb)->dev->hard_header_len;
-
 	if (skb_headroom(skb) < hh_len) {
 		int nhead = HH_DATA_ALIGN(hh_len - skb_headroom(skb));
 
@@ -274,6 +272,7 @@  static int bpf_xmit(struct sk_buff *skb)
 
 	bpf = bpf_lwt_lwtunnel(dst->lwtstate);
 	if (bpf->xmit.prog) {
+		int hh_len = dst->dev->hard_header_len;
 		__be16 proto = skb->protocol;
 		int ret;
 
@@ -291,7 +290,7 @@  static int bpf_xmit(struct sk_buff *skb)
 			/* If the header was expanded, headroom might be too
 			 * small for L2 header to come, expand as needed.
 			 */
-			ret = xmit_check_hhlen(skb);
+			ret = xmit_check_hhlen(skb, hh_len);
 			if (unlikely(ret))
 				return ret;