
[bpf-next] selftests/bpf: Fix selftest test_global_funcs/global_func1 failure with latest clang

Message ID 20230425174744.1758515-1-yhs@fb.com (mailing list archive)
State Accepted
Commit f1f5553d91a11663a5761b78e61f70c1db0bbd2f
Delegated to: BPF
Series [bpf-next] selftests/bpf: Fix selftest test_global_funcs/global_func1 failure with latest clang

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 14 maintainers not CCed: kpsingh@kernel.org martin.lau@linux.dev john.fastabend@gmail.com sdf@google.com trix@redhat.com shuah@kernel.org song@kernel.org jolsa@kernel.org mykolal@fb.com nathan@kernel.org llvm@lists.linux.dev linux-kselftest@vger.kernel.org haoluo@google.com ndesaulniers@google.com
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 fail Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc

Commit Message

Yonghong Song April 25, 2023, 5:47 p.m. UTC
The selftest test_global_funcs/global_func1 failed with the latest clang17.
The cause is the upstream ArgumentPromotionPass ([1]),
which may rewrite static function parameters and cause inlining
although the function is marked as noinline.

The original code:
  static __attribute__ ((noinline))
  int f0(int var, struct __sk_buff *skb)
  {
        return skb->len;
  }

  __attribute__ ((noinline))
  int f1(struct __sk_buff *skb)
  {
	...
        return f0(0, skb) + skb->len;
  }

  ...

  SEC("tc")
  __failure __msg("combined stack size of 4 calls is 544")
  int global_func1(struct __sk_buff *skb)
  {
        return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
  }

After ArgumentPromotionPass, the code is translated to
  static __attribute__ ((noinline))
  int f0(int var, int skb_len)
  {
        return skb_len;
  }

  __attribute__ ((noinline))
  int f1(struct __sk_buff *skb)
  {
	...
        return f0(0, skb->len) + skb->len;
  }

  ...

  SEC("tc")
  __failure __msg("combined stack size of 4 calls is 544")
  int global_func1(struct __sk_buff *skb)
  {
        return f0(1, skb->len) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
  }

Later, the llvm InstCombine phase recognized that f0()
simply returns the value of its second argument and removed f0()
completely, so the final code looks like:
  __attribute__ ((noinline))
  int f1(struct __sk_buff *skb)
  {
	...
        return skb->len + skb->len;
  }

  ...

  SEC("tc")
  __failure __msg("combined stack size of 4 calls is 544")
  int global_func1(struct __sk_buff *skb)
  {
        return skb->len + f1(skb) + f2(2, skb) + f3(3, skb, 4);
  }

If f0() is not inlined, verification fails with a combined stack size
of 544 for a particular callchain. With f0() inlined, the maximum
stack size is 512, which is within the limit.
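
The 544 versus 512 arithmetic can be reproduced with a small sketch. It rests on assumptions that are not stated in this message: that the verifier charges every frame in a call chain at least one byte, rounded up to a multiple of 32, that the limit is 512 bytes, that the failing chain is global_func1 -> f2 -> f1 -> f0, and that f1 holds a 424-byte buffer while the other functions use no stack. The names below are illustrative and are not taken from the kernel sources.

```c
#include <stddef.h>

/* Assumption: each frame costs max(depth, 1) rounded up to 32 bytes,
 * and a call chain may use at most 512 bytes in total. */
#define FRAME_ALIGN 32u
#define STACK_LIMIT 512u

/* Cost charged for one frame with the given stack depth. */
static unsigned int frame_cost(unsigned int depth)
{
	if (depth == 0)
		depth = 1;
	return (depth + FRAME_ALIGN - 1) / FRAME_ALIGN * FRAME_ALIGN;
}

/* Sum of the frame costs along one call chain. */
unsigned int combined_stack(const unsigned int *depths, size_t n)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		total += frame_cost(depths[i]);
	return total;
}

/* Hypothetical depths for global_func1, f2, f1, f0 (f1 assumed to
 * hold a 424-byte buffer; the others assumed to use no stack). */
const unsigned int chain_with_f0[] = { 0, 0, 424, 0 };
/* Same chain once f0() has been folded into its callers. */
const unsigned int chain_without_f0[] = { 0, 0, 424 };
```

Under these assumptions the four-frame chain costs 32 + 32 + 448 + 32 = 544 bytes and is rejected, while the three-frame chain costs 512 bytes and does not exceed the limit, so the expected failure no longer triggers.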

Let us add an `asm volatile ("")` to f0() to prevent ArgumentPromotionPass
from hoisting the code to its caller; this fixes the test failure.
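
A minimal standalone sketch of the same pattern, outside of BPF, may help when applying it elsewhere. `struct pkt`, `get_len()` and `caller()` are hypothetical stand-ins for `struct __sk_buff`, f0() and f1(). Whether a given compiler version keeps the call is an assumption to be checked in the generated code; the sketch only shows where the empty asm statement goes.

```c
/* Hypothetical stand-in for struct __sk_buff. */
struct pkt {
	int len;
};

/* An empty volatile asm statement has side effects the optimizer
 * cannot see through, so the body is no longer a plain load that
 * can be moved into the callers. */
static __attribute__((noinline))
int get_len(int var, struct pkt *p)
{
	(void)var;
	__asm__ __volatile__("");

	return p->len;
}

/* Mirrors f1(): one call to the helper plus a direct field read. */
int caller(struct pkt *p)
{
	return get_len(0, p) + p->len;
}
```

The `__asm__ __volatile__` spelling is used here only so the sketch also builds in strict ISO C modes; the patch itself uses `asm volatile`.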

  [1] https://reviews.llvm.org/D148269

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/progs/test_global_func1.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Daniel Borkmann April 27, 2023, 8:23 p.m. UTC | #1
On 4/25/23 7:47 PM, Yonghong Song wrote:
> The selftest test_global_funcs/global_func1 failed with the latest clang17.
> The cause is the upstream ArgumentPromotionPass ([1]),
> which may rewrite static function parameters and cause inlining
> although the function is marked as noinline.
> 
> The original code:
>    static __attribute__ ((noinline))
>    int f0(int var, struct __sk_buff *skb)
>    {
>          return skb->len;
>    }
> 
>    __attribute__ ((noinline))
>    int f1(struct __sk_buff *skb)
>    {
> 	...
>          return f0(0, skb) + skb->len;
>    }
> 
>    ...
> 
>    SEC("tc")
>    __failure __msg("combined stack size of 4 calls is 544")
>    int global_func1(struct __sk_buff *skb)
>    {
>          return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
>    }
> 
> After ArgumentPromotionPass, the code is translated to
>    static __attribute__ ((noinline))
>    int f0(int var, int skb_len)
>    {
>          return skb_len;
>    }
> 
>    __attribute__ ((noinline))
>    int f1(struct __sk_buff *skb)
>    {
> 	...
>          return f0(0, skb->len) + skb->len;
>    }
> 
>    ...
> 
>    SEC("tc")
>    __failure __msg("combined stack size of 4 calls is 544")
>    int global_func1(struct __sk_buff *skb)
>    {
>          return f0(1, skb->len) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
>    }
> 
> Later, the llvm InstCombine phase recognized that f0()
> simply returns the value of its second argument and removed f0()
> completely, so the final code looks like:
>    __attribute__ ((noinline))
>    int f1(struct __sk_buff *skb)
>    {
> 	...
>          return skb->len + skb->len;
>    }
> 
>    ...
> 
>    SEC("tc")
>    __failure __msg("combined stack size of 4 calls is 544")
>    int global_func1(struct __sk_buff *skb)
>    {
>          return skb->len + f1(skb) + f2(2, skb) + f3(3, skb, 4);
>    }
> 
> If f0() is not inlined, verification fails with a combined stack size
> of 544 for a particular callchain. With f0() inlined, the maximum
> stack size is 512, which is within the limit.
> 
> Let us add an `asm volatile ("")` to f0() to prevent ArgumentPromotionPass
> from hoisting the code to its caller; this fixes the test failure.
> 
>    [1] https://reviews.llvm.org/D148269
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>   tools/testing/selftests/bpf/progs/test_global_func1.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/tools/testing/selftests/bpf/progs/test_global_func1.c b/tools/testing/selftests/bpf/progs/test_global_func1.c
> index b85fc8c423ba..17a9f59bf5f3 100644
> --- a/tools/testing/selftests/bpf/progs/test_global_func1.c
> +++ b/tools/testing/selftests/bpf/progs/test_global_func1.c
> @@ -10,6 +10,8 @@
>   static __attribute__ ((noinline))
>   int f0(int var, struct __sk_buff *skb)
>   {
> +	asm volatile ("");
> +
>   	return skb->len;

Is it planned to get this reverted before the final llvm/clang 17 is
officially released (you mentioned the TTI hook in [1])?

Thanks,
Daniel
Yonghong Song April 27, 2023, 9:03 p.m. UTC | #2
On 4/27/23 1:23 PM, Daniel Borkmann wrote:
> On 4/25/23 7:47 PM, Yonghong Song wrote:
>> The selftest test_global_funcs/global_func1 failed with the latest clang17.
>> The cause is the upstream ArgumentPromotionPass ([1]),
>> which may rewrite static function parameters and cause inlining
>> although the function is marked as noinline.
>>
>> The original code:
>>    static __attribute__ ((noinline))
>>    int f0(int var, struct __sk_buff *skb)
>>    {
>>          return skb->len;
>>    }
>>
>>    __attribute__ ((noinline))
>>    int f1(struct __sk_buff *skb)
>>    {
>>     ...
>>          return f0(0, skb) + skb->len;
>>    }
>>
>>    ...
>>
>>    SEC("tc")
>>    __failure __msg("combined stack size of 4 calls is 544")
>>    int global_func1(struct __sk_buff *skb)
>>    {
>>          return f0(1, skb) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
>>    }
>>
>> After ArgumentPromotionPass, the code is translated to
>>    static __attribute__ ((noinline))
>>    int f0(int var, int skb_len)
>>    {
>>          return skb_len;
>>    }
>>
>>    __attribute__ ((noinline))
>>    int f1(struct __sk_buff *skb)
>>    {
>>     ...
>>          return f0(0, skb->len) + skb->len;
>>    }
>>
>>    ...
>>
>>    SEC("tc")
>>    __failure __msg("combined stack size of 4 calls is 544")
>>    int global_func1(struct __sk_buff *skb)
>>    {
>>          return f0(1, skb->len) + f1(skb) + f2(2, skb) + f3(3, skb, 4);
>>    }
>>
>> Later, the llvm InstCombine phase recognized that f0()
>> simply returns the value of its second argument and removed f0()
>> completely, so the final code looks like:
>>    __attribute__ ((noinline))
>>    int f1(struct __sk_buff *skb)
>>    {
>>     ...
>>          return skb->len + skb->len;
>>    }
>>
>>    ...
>>
>>    SEC("tc")
>>    __failure __msg("combined stack size of 4 calls is 544")
>>    int global_func1(struct __sk_buff *skb)
>>    {
>>          return skb->len + f1(skb) + f2(2, skb) + f3(3, skb, 4);
>>    }
>>
>> If f0() is not inlined, verification fails with a combined stack size
>> of 544 for a particular callchain. With f0() inlined, the maximum
>> stack size is 512, which is within the limit.
>>
>> Let us add an `asm volatile ("")` to f0() to prevent ArgumentPromotionPass
>> from hoisting the code to its caller; this fixes the test failure.
>>
>>    [1] https://reviews.llvm.org/D148269
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   tools/testing/selftests/bpf/progs/test_global_func1.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/tools/testing/selftests/bpf/progs/test_global_func1.c b/tools/testing/selftests/bpf/progs/test_global_func1.c
>> index b85fc8c423ba..17a9f59bf5f3 100644
>> --- a/tools/testing/selftests/bpf/progs/test_global_func1.c
>> +++ b/tools/testing/selftests/bpf/progs/test_global_func1.c
>> @@ -10,6 +10,8 @@
>>   static __attribute__ ((noinline))
>>   int f0(int var, struct __sk_buff *skb)
>>   {
>> +    asm volatile ("");
>> +
>>       return skb->len;
> 
> Is it planned to get this reverted before the final llvm/clang 17 is
> officially released (you mentioned the TTI hook in [1])?

No. This fix will not be reverted, even with the final clang17.

The TTI hook I used (https://reviews.llvm.org/D148551) is meant
to prevent the optimization from increasing the number of parameters
beyond 5. In this particular case, the number of arguments
remains at 2, so the BPF backend TTI hook has no effect.
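
To illustrate the case the hook is described as guarding against, here is a hand-written sketch, not compiler output. BPF passes at most five arguments in registers, so promoting every field read of a six-field struct into its own scalar parameter would produce a signature the BPF backend cannot call. `struct six`, `sum_by_pointer()` and `sum_promoted()` are hypothetical names.

```c
/* Hypothetical struct with six fields that the callee reads. */
struct six {
	int a, b, c, d, e, f;
};

/* Original form: one pointer argument, well within the limit. */
static __attribute__((noinline))
int sum_by_pointer(const struct six *s)
{
	return s->a + s->b + s->c + s->d + s->e + s->f;
}

/* What promoting every loaded field to a scalar would look like:
 * six arguments, one more than BPF can pass in registers. This is
 * valid on a host compiler but is the shape to avoid for BPF. */
static __attribute__((noinline))
int sum_promoted(int a, int b, int c, int d, int e, int f)
{
	return a + b + c + d + e + f;
}
```

In the f0() case the promoted function still takes two arguments (`var` and `skb_len`), which is why a limit on the argument count does not stop the transformation there.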

> 
> Thanks,
> Daniel
patchwork-bot+netdevbpf@kernel.org April 27, 2023, 9:50 p.m. UTC | #3
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Tue, 25 Apr 2023 10:47:44 -0700 you wrote:
> The selftest test_global_funcs/global_func1 failed with the latest clang17.
> The cause is the upstream ArgumentPromotionPass ([1]),
> which may rewrite static function parameters and cause inlining
> although the function is marked as noinline.
> 
> The original code:
>   static __attribute__ ((noinline))
>   int f0(int var, struct __sk_buff *skb)
>   {
>         return skb->len;
>   }
> 
> [...]

Here is the summary with links:
  - [bpf-next] selftests/bpf: Fix selftest test_global_funcs/global_func1 failure with latest clang
    https://git.kernel.org/bpf/bpf-next/c/f1f5553d91a1

You are awesome, thank you!

Patch

diff --git a/tools/testing/selftests/bpf/progs/test_global_func1.c b/tools/testing/selftests/bpf/progs/test_global_func1.c
index b85fc8c423ba..17a9f59bf5f3 100644
--- a/tools/testing/selftests/bpf/progs/test_global_func1.c
+++ b/tools/testing/selftests/bpf/progs/test_global_func1.c
@@ -10,6 +10,8 @@ 
 static __attribute__ ((noinline))
 int f0(int var, struct __sk_buff *skb)
 {
+	asm volatile ("");
+
 	return skb->len;
 }