diff mbox series

[bpf] bpf: check negative offsets in __bpf_skb_min_len()

Message ID 20241008053350.123205-1-xiyou.wangcong@gmail.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series [bpf] bpf: check negative offsets in __bpf_skb_min_len() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 12 maintainers not CCed: song@kernel.org haoluo@google.com andrii@kernel.org pabeni@redhat.com kuba@kernel.org edumazet@google.com sdf@fomichev.me martin.lau@linux.dev kpsingh@kernel.org yonghong.song@linux.dev eddyz87@gmail.com jolsa@kernel.org
netdev/build_clang success Errors and warnings before: 7 this patch: 7
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 33 this patch: 33
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 28 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 6 this patch: 6
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-13 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-15 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-41 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc

Commit Message

Cong Wang Oct. 8, 2024, 5:33 a.m. UTC
From: Cong Wang <cong.wang@bytedance.com>

skb_transport_offset() and skb_transport_offset() can be negative when
they are called after we pull the transport header, for example, when
we use eBPF sockmap (aka at the point of ->sk_data_ready()).

__bpf_skb_min_len() uses an unsigned int to get these offsets, this
leads to a very large number which causes bpf_skb_change_tail() failed
unexpectedly.

Fix this by using a signed int to get these offsets and test them
against zero.

Fixes: 5293efe62df8 ("bpf: add bpf_skb_change_tail helper")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 net/core/filter.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

Comments

Daniel Borkmann Oct. 22, 2024, 8:52 p.m. UTC | #1
On 10/8/24 7:33 AM, Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> skb_transport_offset() and skb_transport_offset() can be negative when

nit: I presume the 2nd one is skb_network_offset?

> they are called after we pull the transport header, for example, when
> we use eBPF sockmap (aka at the point of ->sk_data_ready()).
> 
> __bpf_skb_min_len() uses an unsigned int to get these offsets, this
> leads to a very large number which causes bpf_skb_change_tail() failed
> unexpectedly.
> 
> Fix this by using a signed int to get these offsets and test them
> against zero.
> 
> Fixes: 5293efe62df8 ("bpf: add bpf_skb_change_tail helper")
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>

Is there any chance you could also extend the sockmap BPF selftest with
this case you're hitting so that BPF CI can run this regularly?

> ---
>   net/core/filter.c | 21 +++++++++++++++------
>   1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 4e3f42cc6611..10ef27639a5d 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3737,13 +3737,22 @@ static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
>   
>   static u32 __bpf_skb_min_len(const struct sk_buff *skb)
>   {
> -	u32 min_len = skb_network_offset(skb);
> +	int offset = skb_network_offset(skb);
> +	u32 min_len = 0;
>   
> -	if (skb_transport_header_was_set(skb))
> -		min_len = skb_transport_offset(skb);
> -	if (skb->ip_summed == CHECKSUM_PARTIAL)
> -		min_len = skb_checksum_start_offset(skb) +
> -			  skb->csum_offset + sizeof(__sum16);
> +	if (offset > 0)
> +		min_len = offset;
> +	if (skb_transport_header_was_set(skb)) {
> +		offset = skb_transport_offset(skb);
> +		if (offset > 0)
> +			min_len = offset;
> +	}
> +	if (skb->ip_summed == CHECKSUM_PARTIAL) {
> +		offset = skb_checksum_start_offset(skb) +
> +			 skb->csum_offset + sizeof(__sum16);
> +		if (offset > 0)
> +			min_len = offset;
> +	}
>   	return min_len;

I'll let John chime in, but does this mean in case of sockmap min_len always ends
up at 0? I just wonder whether we should pass a custom __bpf_skb_min_len to
__bpf_skb_change_tail for bpf_skb_change_tail vs sk_skb_change_tail assuming the
compiler is able to inlining all this (instead of indirect call).
Cong Wang Oct. 26, 2024, 3:33 p.m. UTC | #2
On Tue, Oct 22, 2024 at 10:52:31PM +0200, Daniel Borkmann wrote:
> On 10/8/24 7:33 AM, Cong Wang wrote:
> > From: Cong Wang <cong.wang@bytedance.com>
> > 
> > skb_transport_offset() and skb_transport_offset() can be negative when
> 
> nit: I presume the 2nd one is skb_network_offset?
> 
> > they are called after we pull the transport header, for example, when
> > we use eBPF sockmap (aka at the point of ->sk_data_ready()).
> > 
> > __bpf_skb_min_len() uses an unsigned int to get these offsets, this
> > leads to a very large number which causes bpf_skb_change_tail() failed
> > unexpectedly.
> > 
> > Fix this by using a signed int to get these offsets and test them
> > against zero.
> > 
> > Fixes: 5293efe62df8 ("bpf: add bpf_skb_change_tail helper")
> > Cc: Daniel Borkmann <daniel@iogearbox.net>
> > Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> 
> Is there any chance you could also extend the sockmap BPF selftest with
> this case you're hitting so that BPF CI can run this regularly?

Yes, my colleague Zijian (Cc'ed) is working on a selftest to cover this case.

Please let me know if you prefer to send it together with the selftest,
technically it would make backporting this fix harder, but I am open to
any suggestion here.

> 
> > ---
> >   net/core/filter.c | 21 +++++++++++++++------
> >   1 file changed, 15 insertions(+), 6 deletions(-)
> > 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 4e3f42cc6611..10ef27639a5d 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -3737,13 +3737,22 @@ static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
> >   static u32 __bpf_skb_min_len(const struct sk_buff *skb)
> >   {
> > -	u32 min_len = skb_network_offset(skb);
> > +	int offset = skb_network_offset(skb);
> > +	u32 min_len = 0;
> > -	if (skb_transport_header_was_set(skb))
> > -		min_len = skb_transport_offset(skb);
> > -	if (skb->ip_summed == CHECKSUM_PARTIAL)
> > -		min_len = skb_checksum_start_offset(skb) +
> > -			  skb->csum_offset + sizeof(__sum16);
> > +	if (offset > 0)
> > +		min_len = offset;
> > +	if (skb_transport_header_was_set(skb)) {
> > +		offset = skb_transport_offset(skb);
> > +		if (offset > 0)
> > +			min_len = offset;
> > +	}
> > +	if (skb->ip_summed == CHECKSUM_PARTIAL) {
> > +		offset = skb_checksum_start_offset(skb) +
> > +			 skb->csum_offset + sizeof(__sum16);
> > +		if (offset > 0)
> > +			min_len = offset;
> > +	}
> >   	return min_len;
> 
> I'll let John chime in, but does this mean in case of sockmap min_len always ends
> up at 0? I just wonder whether we should pass a custom __bpf_skb_min_len to
> __bpf_skb_change_tail for bpf_skb_change_tail vs sk_skb_change_tail assuming the
> compiler is able to inlining all this (instead of indirect call).

Yes, in case of sockmap skb->data is already past TCP header, so all the
offsets here are negative. And since the 'new_len' of bpf_skb_change_tail()
is unsigned (too late to change), min_len should be zero.

Thanks.
Daniel Borkmann Oct. 28, 2024, 12:32 p.m. UTC | #3
On 10/26/24 5:33 PM, Cong Wang wrote:
> On Tue, Oct 22, 2024 at 10:52:31PM +0200, Daniel Borkmann wrote:
>> On 10/8/24 7:33 AM, Cong Wang wrote:
>>> From: Cong Wang <cong.wang@bytedance.com>
>>>
>>> skb_transport_offset() and skb_transport_offset() can be negative when
>>
>> nit: I presume the 2nd one is skb_network_offset?
>>
>>> they are called after we pull the transport header, for example, when
>>> we use eBPF sockmap (aka at the point of ->sk_data_ready()).
>>>
>>> __bpf_skb_min_len() uses an unsigned int to get these offsets, this
>>> leads to a very large number which causes bpf_skb_change_tail() failed
>>> unexpectedly.
>>>
>>> Fix this by using a signed int to get these offsets and test them
>>> against zero.
>>>
>>> Fixes: 5293efe62df8 ("bpf: add bpf_skb_change_tail helper")
>>> Cc: Daniel Borkmann <daniel@iogearbox.net>
>>> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
>>
>> Is there any chance you could also extend the sockmap BPF selftest with
>> this case you're hitting so that BPF CI can run this regularly?
> 
> Yes, my colleague Zijian (Cc'ed) is working on a selftest to cover this case.
> 
> Please let me know if you prefer to send it together with the selftest,
> technically it would make backporting this fix harder, but I am open to
> any suggestion here.

Ideally this is a 2-patch series, first one is the fix itself and the second
one contains the BPF selftest so that CI can run it too, and this way it also
does not hurt backporting the fix.

>>>    net/core/filter.c | 21 +++++++++++++++------
>>>    1 file changed, 15 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 4e3f42cc6611..10ef27639a5d 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -3737,13 +3737,22 @@ static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
>>>    static u32 __bpf_skb_min_len(const struct sk_buff *skb)
>>>    {
>>> -	u32 min_len = skb_network_offset(skb);
>>> +	int offset = skb_network_offset(skb);
>>> +	u32 min_len = 0;
>>> -	if (skb_transport_header_was_set(skb))
>>> -		min_len = skb_transport_offset(skb);
>>> -	if (skb->ip_summed == CHECKSUM_PARTIAL)
>>> -		min_len = skb_checksum_start_offset(skb) +
>>> -			  skb->csum_offset + sizeof(__sum16);
>>> +	if (offset > 0)
>>> +		min_len = offset;
>>> +	if (skb_transport_header_was_set(skb)) {
>>> +		offset = skb_transport_offset(skb);
>>> +		if (offset > 0)
>>> +			min_len = offset;
>>> +	}
>>> +	if (skb->ip_summed == CHECKSUM_PARTIAL) {
>>> +		offset = skb_checksum_start_offset(skb) +
>>> +			 skb->csum_offset + sizeof(__sum16);
>>> +		if (offset > 0)
>>> +			min_len = offset;
>>> +	}
>>>    	return min_len;
>>
>> I'll let John chime in, but does this mean in case of sockmap min_len always ends
>> up at 0? I just wonder whether we should pass a custom __bpf_skb_min_len to
>> __bpf_skb_change_tail for bpf_skb_change_tail vs sk_skb_change_tail assuming the
>> compiler is able to inlining all this (instead of indirect call).
> 
> Yes, in case of sockmap skb->data is already past TCP header, so all the
> offsets here are negative. And since the 'new_len' of bpf_skb_change_tail()
> is unsigned (too late to change), min_len should be zero.

Ok, so hopefully this can be further cleaned up/simplified a bit more then.

Thanks,
Daniel
diff mbox series

Patch

diff --git a/net/core/filter.c b/net/core/filter.c
index 4e3f42cc6611..10ef27639a5d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3737,13 +3737,22 @@  static const struct bpf_func_proto bpf_skb_adjust_room_proto = {
 
 static u32 __bpf_skb_min_len(const struct sk_buff *skb)
 {
-	u32 min_len = skb_network_offset(skb);
+	int offset = skb_network_offset(skb);
+	u32 min_len = 0;
 
-	if (skb_transport_header_was_set(skb))
-		min_len = skb_transport_offset(skb);
-	if (skb->ip_summed == CHECKSUM_PARTIAL)
-		min_len = skb_checksum_start_offset(skb) +
-			  skb->csum_offset + sizeof(__sum16);
+	if (offset > 0)
+		min_len = offset;
+	if (skb_transport_header_was_set(skb)) {
+		offset = skb_transport_offset(skb);
+		if (offset > 0)
+			min_len = offset;
+	}
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		offset = skb_checksum_start_offset(skb) +
+			 skb->csum_offset + sizeof(__sum16);
+		if (offset > 0)
+			min_len = offset;
+	}
 	return min_len;
 }