[bpf-next] selftests/bpf: Fix pyperf180 compilation failure with llvm18

Message ID 20231109053029.1403552-1-yonghong.song@linux.dev (mailing list archive)
State Changes Requested
Delegated to: BPF

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers warning 14 maintainers not CCed: shuah@kernel.org jolsa@kernel.org linux-kselftest@vger.kernel.org john.fastabend@gmail.com martin.lau@linux.dev trix@redhat.com mykolal@fb.com llvm@lists.linux.dev nathan@kernel.org song@kernel.org haoluo@google.com ndesaulniers@google.com sdf@google.com kpsingh@kernel.org
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 9 this patch: 9
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-3 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-15 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 fail Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Commit Message

Yonghong Song Nov. 9, 2023, 5:30 a.m. UTC
With the latest llvm18 (main branch of the llvm-project repo), building the bpf selftests with
    [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j

fails with the following compilation error:
    fatal error: error in backend: Branch target out of insn range
    ...
    Stack dump:
    0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
      -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
      -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
      -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
      /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
      -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
      -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
    1.      <eof> parser at end of file
    2.      Code generation
    ...

The compilation failure only happens with cpu=v2 and cpu=v3. cpu=v4 is okay
since cpu=v4 supports 32-bit branch target offsets.

The above failure is caused by upstream llvm patch [1], which changed some
inlining behavior in llvm18.

Previously, all 180 loop iterations were fully unrolled. To work around the
issue, the unroll count is now reduced to 90 for llvm18 and later. This shortens
some otherwise long branch target distances and fixes the compilation failure.

  [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 tools/testing/selftests/bpf/progs/pyperf180.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)
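
To illustrate the offset limit described in the commit message, here is a minimal sketch in C. It assumes branch distances are counted in instructions and that cpu v1 through v3 encode them as a signed 16-bit value while cpu v4 allows a signed 32-bit value, as the commit message states. The helper name branch_distance_fits is hypothetical and exists in neither LLVM nor the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): reports whether a branch
 * distance, counted in instructions, fits the branch offset width that
 * the commit message attributes to the given BPF cpu version.
 */
static bool branch_distance_fits(int64_t distance, int cpu_version)
{
	/* Assumption from the commit message: cpu v4 has 32-bit offsets. */
	if (cpu_version >= 4)
		return distance >= INT32_MIN && distance <= INT32_MAX;

	/* cpu v1/v2/v3: signed 16-bit offsets. */
	return distance >= INT16_MIN && distance <= INT16_MAX;
}
```

Under these assumptions, a distance of 32768 instructions is rejected for cpu v3 but accepted for cpu v4, which is why halving the unroll count matters only for the older cpu versions.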

Comments

Eduard Zingerman Nov. 9, 2023, 11:47 a.m. UTC | #1
On Wed, 2023-11-08 at 21:30 -0800, Yonghong Song wrote:
> With latest llvm18 (main branch of llvm-project repo), when building bpf selftests,
>     [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
> 
> The following compilation error happens:
>     fatal error: error in backend: Branch target out of insn range
>     ...
>     Stack dump:
>     0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
>       -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
>       -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
>       -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
>       /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
>       -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
>       -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
>     1.      <eof> parser at end of file
>     2.      Code generation
>     ...
> 
> The compilation failure only happens to cpu=v2 and cpu=v3. cpu=v4 is okay
> since cpu=v4 supports 32-bit branch target offset.
> 
> The above failure is due to upstream llvm patch [1] where some inlining behavior
> are changed in llvm18.
> 
> To workaround the issue, previously all 180 loop iterations are fully unrolled.
> Now, the fully unrolling count is changed to 90 for llvm18 and later. This reduced
> some otherwise long branch target distance, and fixed the compilation failure.
> 
>   [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
> 
> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>

Can confirm, the issue is present on clang main w/o this patch and
disappears after this patch.

Yonghong, is there a way to keep original UNROLL_COUNT if cpuv4 is used?

Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Yonghong Song Nov. 9, 2023, 7:54 p.m. UTC | #2
On 11/9/23 3:47 AM, Eduard Zingerman wrote:
> On Wed, 2023-11-08 at 21:30 -0800, Yonghong Song wrote:
>> With latest llvm18 (main branch of llvm-project repo), when building bpf selftests,
>>      [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
>>
>> The following compilation error happens:
>>      fatal error: error in backend: Branch target out of insn range
>>      ...
>>      Stack dump:
>>      0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
>>        -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
>>        -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
>>        -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
>>        /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
>>        -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
>>        -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
>>      1.      <eof> parser at end of file
>>      2.      Code generation
>>      ...
>>
>> The compilation failure only happens to cpu=v2 and cpu=v3. cpu=v4 is okay
>> since cpu=v4 supports 32-bit branch target offset.
>>
>> The above failure is due to upstream llvm patch [1] where some inlining behavior
>> are changed in llvm18.
>>
>> To workaround the issue, previously all 180 loop iterations are fully unrolled.
>> Now, the fully unrolling count is changed to 90 for llvm18 and later. This reduced
>> some otherwise long branch target distance, and fixed the compilation failure.
>>
>>    [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
>>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> Can confirm, the issue is present on clang main w/o this patch and
> disappears after this patch.
>
> Yonghong, is there a way to keep original UNROLL_COUNT if cpuv4 is used?

I thought about this but was a little bit lazy and did not give it enough thought.
But since you mentioned it, I think having llvm provide a macro that indicates the
cpu version is a good idea. This will give bpf developers some flexibility to
add new features (for a new cpu variant) or work around bugs (for a particular cpu
variant, without impacting others that are fine), etc.

So here is the llvm patch: https://github.com/llvm/llvm-project/pull/71856

With the above llvm patch, the following code change should work:

diff --git a/tools/testing/selftests/bpf/progs/pyperf180.c b/tools/testing/selftests/bpf/progs/pyperf180.c
index c39f559d3100..2473845d1ee2 100644
--- a/tools/testing/selftests/bpf/progs/pyperf180.c
+++ b/tools/testing/selftests/bpf/progs/pyperf180.c
@@ -1,4 +1,18 @@
  // SPDX-License-Identifier: GPL-2.0
  // Copyright (c) 2019 Facebook
  #define STACK_MAX_LEN 180
+
+/* llvm upstream commit at llvm18
+ *   https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
+ * changed inlining behavior and caused compilation failure as some branch
+ * target distance exceeded 16bit representation which is the maximum for
+ * cpu v1/v2/v3. Macro __bpf_cpu_version__ is implemented in llvm18 to specify
+ * which cpu version is used for compilation. So we can set a smaller
+ * unroll_count if __bpf_cpu_version__ is less than 4, which reduced
+ * some branch target distances and resolved the compilation failure.
+ */
+#if defined(__bpf_cpu_version__) && __bpf_cpu_version__ < 4
+#define UNROLL_COUNT 90
+#endif
+
  #include "pyperf.h"


>
> Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Eduard Zingerman Nov. 9, 2023, 8:45 p.m. UTC | #3
On Thu, 2023-11-09 at 11:54 -0800, Yonghong Song wrote:
> On 11/9/23 3:47 AM, Eduard Zingerman wrote:
> > On Wed, 2023-11-08 at 21:30 -0800, Yonghong Song wrote:
> > > With latest llvm18 (main branch of llvm-project repo), when building bpf selftests,
> > >      [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
> > > 
> > > The following compilation error happens:
> > >      fatal error: error in backend: Branch target out of insn range
> > >      ...
> > >      Stack dump:
> > >      0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
> > >        -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
> > >        -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
> > >        -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
> > >        /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
> > >        -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
> > >        -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
> > >      1.      <eof> parser at end of file
> > >      2.      Code generation
> > >      ...
> > > 
> > > The compilation failure only happens to cpu=v2 and cpu=v3. cpu=v4 is okay
> > > since cpu=v4 supports 32-bit branch target offset.
> > > 
> > > The above failure is due to upstream llvm patch [1] where some inlining behavior
> > > are changed in llvm18.
> > > 
> > > To workaround the issue, previously all 180 loop iterations are fully unrolled.
> > > Now, the fully unrolling count is changed to 90 for llvm18 and later. This reduced
> > > some otherwise long branch target distance, and fixed the compilation failure.
> > > 
> > >    [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
> > > 
> > > Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> > Can confirm, the issue is present on clang main w/o this patch and
> > disappears after this patch.
> > 
> > Yonghong, is there a way to keep original UNROLL_COUNT if cpuv4 is used?
> 
> I thought about this but a little bit lazy so not giving it enough throught.
> But since you mentioned this, I think adding a macro to indicate cpu version
> by llvm is a good idea. This will give bpf developers some flexibility to
> add new features (new cpu variant) or workaround bugs (for a particular cpu variant
> but not impacting others if they are fine), etc.
> 
> So here is the llvm patch: https://github.com/llvm/llvm-project/pull/71856

Thank you, tried it locally, works as expected.
Alexei Starovoitov Nov. 9, 2023, 9:09 p.m. UTC | #4
On Thu, Nov 9, 2023 at 11:55 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>
>
> On 11/9/23 3:47 AM, Eduard Zingerman wrote:
> > On Wed, 2023-11-08 at 21:30 -0800, Yonghong Song wrote:
> >> With latest llvm18 (main branch of llvm-project repo), when building bpf selftests,
> >>      [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
> >>
> >> The following compilation error happens:
> >>      fatal error: error in backend: Branch target out of insn range
> >>      ...
> >>      Stack dump:
> >>      0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
> >>        -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
> >>        -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
> >>        -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
> >>        /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
> >>        -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
> >>        -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
> >>      1.      <eof> parser at end of file
> >>      2.      Code generation
> >>      ...
> >>
> >> The compilation failure only happens to cpu=v2 and cpu=v3. cpu=v4 is okay
> >> since cpu=v4 supports 32-bit branch target offset.
> >>
> >> The above failure is due to upstream llvm patch [1] where some inlining behavior
> >> are changed in llvm18.
> >>
> >> To workaround the issue, previously all 180 loop iterations are fully unrolled.
> >> Now, the fully unrolling count is changed to 90 for llvm18 and later. This reduced
> >> some otherwise long branch target distance, and fixed the compilation failure.
> >>
> >>    [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
> >>
> >> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
> > Can confirm, the issue is present on clang main w/o this patch and
> > disappears after this patch.
> >
> > Yonghong, is there a way to keep original UNROLL_COUNT if cpuv4 is used?
>
> I thought about this but a little bit lazy so not giving it enough throught.
> But since you mentioned this, I think adding a macro to indicate cpu version
> by llvm is a good idea. This will give bpf developers some flexibility to
> add new features (new cpu variant) or workaround bugs (for a particular cpu variant
> but not impacting others if they are fine), etc.
>
> So here is the llvm patch: https://github.com/llvm/llvm-project/pull/71856

Great idea. Commented on the diff.

> With the above llvm patch, the following code change should work:
>
> diff --git a/tools/testing/selftests/bpf/progs/pyperf180.c b/tools/testing/selftests/bpf/progs/pyperf180.c
> index c39f559d3100..2473845d1ee2 100644
> --- a/tools/testing/selftests/bpf/progs/pyperf180.c
> +++ b/tools/testing/selftests/bpf/progs/pyperf180.c
> @@ -1,4 +1,18 @@
>   // SPDX-License-Identifier: GPL-2.0
>   // Copyright (c) 2019 Facebook
>   #define STACK_MAX_LEN 180
> +
> +/* llvm upstream commit at llvm18
> + *   https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
> + * changed inlining behavior and caused compilation failure as some branch
> + * target distance exceeded 16bit representation which is the maximum for
> + * cpu v1/v2/v3. Macro __bpf_cpu_version__ is implemented in llvm18 to specify
> + * which cpu version is used for compilation. So we can set a smaller
> + * unroll_count if __bpf_cpu_version__ is less than 4, which reduced
> + * some branch target distances and resolved the compilation failure.
> + */
> +#if defined(__bpf_cpu_version__) && __bpf_cpu_version__ < 4

probably should be combined with __clang_major__ >= 18 check too.

> +#define UNROLL_COUNT 90
> +#endif
> +
>   #include "pyperf.h"
>
>
> >
> > Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Yonghong Song Nov. 9, 2023, 9:53 p.m. UTC | #5
On 11/9/23 1:09 PM, Alexei Starovoitov wrote:
> On Thu, Nov 9, 2023 at 11:55 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> On 11/9/23 3:47 AM, Eduard Zingerman wrote:
>>> On Wed, 2023-11-08 at 21:30 -0800, Yonghong Song wrote:
>>>> With latest llvm18 (main branch of llvm-project repo), when building bpf selftests,
>>>>       [~/work/bpf-next (master)]$ make -C tools/testing/selftests/bpf LLVM=1 -j
>>>>
>>>> The following compilation error happens:
>>>>       fatal error: error in backend: Branch target out of insn range
>>>>       ...
>>>>       Stack dump:
>>>>       0.      Program arguments: clang -g -Wall -Werror -D__TARGET_ARCH_x86 -mlittle-endian
>>>>         -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include
>>>>         -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf -I/home/yhs/work/bpf-next/tools/include/uapi
>>>>         -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter
>>>>         /home/yhs/work/llvm-project/llvm/build.18/install/lib/clang/18/include -idirafter /usr/local/include
>>>>         -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 --target=bpf
>>>>         -c progs/pyperf180.c -mcpu=v3 -o /home/yhs/work/bpf-next/tools/testing/selftests/bpf/pyperf180.bpf.o
>>>>       1.      <eof> parser at end of file
>>>>       2.      Code generation
>>>>       ...
>>>>
>>>> The compilation failure only happens to cpu=v2 and cpu=v3. cpu=v4 is okay
>>>> since cpu=v4 supports 32-bit branch target offset.
>>>>
>>>> The above failure is due to upstream llvm patch [1] where some inlining behavior
>>>> are changed in llvm18.
>>>>
>>>> To workaround the issue, previously all 180 loop iterations are fully unrolled.
>>>> Now, the fully unrolling count is changed to 90 for llvm18 and later. This reduced
>>>> some otherwise long branch target distance, and fixed the compilation failure.
>>>>
>>>>     [1] https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
>>>>
>>>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>>> Can confirm, the issue is present on clang main w/o this patch and
>>> disappears after this patch.
>>>
>>> Yonghong, is there a way to keep original UNROLL_COUNT if cpuv4 is used?
>> I thought about this but a little bit lazy so not giving it enough throught.
>> But since you mentioned this, I think adding a macro to indicate cpu version
>> by llvm is a good idea. This will give bpf developers some flexibility to
>> add new features (new cpu variant) or workaround bugs (for a particular cpu variant
>> but not impacting others if they are fine), etc.
>>
>> So here is the llvm patch: https://github.com/llvm/llvm-project/pull/71856
> Great idea. Commented on the diff.
>
>> With the above llvm patch, the following code change should work:
>>
>> diff --git a/tools/testing/selftests/bpf/progs/pyperf180.c b/tools/testing/selftests/bpf/progs/pyperf180.c
>> index c39f559d3100..2473845d1ee2 100644
>> --- a/tools/testing/selftests/bpf/progs/pyperf180.c
>> +++ b/tools/testing/selftests/bpf/progs/pyperf180.c
>> @@ -1,4 +1,18 @@
>>    // SPDX-License-Identifier: GPL-2.0
>>    // Copyright (c) 2019 Facebook
>>    #define STACK_MAX_LEN 180
>> +
>> +/* llvm upstream commit at llvm18
>> + *   https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
>> + * changed inlining behavior and caused compilation failure as some branch
>> + * target distance exceeded 16bit representation which is the maximum for
>> + * cpu v1/v2/v3. Macro __bpf_cpu_version__ is implemented in llvm18 to specify
>> + * which cpu version is used for compilation. So we can set a smaller
>> + * unroll_count if __bpf_cpu_version__ is less than 4, which reduced
>> + * some branch target distances and resolved the compilation failure.
>> + */
>> +#if defined(__bpf_cpu_version__) && __bpf_cpu_version__ < 4
> probably should be combined with __clang_major__ >= 18 check too.

Okay, I could do this to catch the case where somebody uses a development
llvm18 that has this regression but in which __bpf_cpu_version__ has not
been introduced yet.

>
>> +#define UNROLL_COUNT 90
>> +#endif
>> +
>>    #include "pyperf.h"
>>
>>
>>> Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Alexei Starovoitov Nov. 9, 2023, 10:07 p.m. UTC | #6
On Thu, Nov 9, 2023 at 1:53 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> >> + */
> >> +#if defined(__bpf_cpu_version__) && __bpf_cpu_version__ < 4
> > probably should be combined with __clang_major__ >= 18 check too.
>
> Okay, I could do this to catch the case where somebody uses development
> llvm18 which has this regression but __bpf_cpu_version__ is not
> introduced yet.

Exactly. That's what I tried to say.
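
The combined check agreed on above might look like the sketch below. It is not the final patch, which was not posted in this thread. It assumes the macro keeps the name __bpf_cpu_version__ proposed in the llvm pull request, and needs_reduced_unroll is a hypothetical helper added only to spell out the truth table.

```c
#include <stdbool.h>

/* Hypothetical runtime mirror of the preprocessor condition below;
 * it is illustrative and appears in neither LLVM nor the selftests.
 */
static bool needs_reduced_unroll(int clang_major, bool has_cpu_macro,
				 int cpu_version)
{
	/* Compilers older than llvm18 do not have the inlining change. */
	if (clang_major < 18)
		return false;

	/* Development llvm18 without the macro: assume the regression. */
	if (!has_cpu_macro)
		return true;

	/* Only cpu versions below 4 are limited to 16-bit offsets. */
	return cpu_version < 4;
}

/* The same decision expressed in the preprocessor (sketch only). */
#if defined(__clang_major__) && __clang_major__ >= 18 && \
    (!defined(__bpf_cpu_version__) || __bpf_cpu_version__ < 4)
#define UNROLL_COUNT 90
#endif
```

Treating a missing macro as "reduce the unroll count" is the conservative choice: it costs some unrolling on a compiler that might not need the workaround, but it never reintroduces the compilation failure.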

Patch

diff --git a/tools/testing/selftests/bpf/progs/pyperf180.c b/tools/testing/selftests/bpf/progs/pyperf180.c
index c39f559d3100..3c38f3e12836 100644
--- a/tools/testing/selftests/bpf/progs/pyperf180.c
+++ b/tools/testing/selftests/bpf/progs/pyperf180.c
@@ -1,4 +1,17 @@ 
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2019 Facebook
 #define STACK_MAX_LEN 180
+
+/* llvm upstream commit at llvm18
+ *   https://github.com/llvm/llvm-project/commit/1a2e77cf9e11dbf56b5720c607313a566eebb16e
+ * changed inlining behavior and caused compilation failure as some branch
+ * target distance exceeded 16bit representation which is the maximum for
+ * cpu v1/v2/v3. To work around this, for llvm18 and later, set UNROLL_COUNT
+ * to 90, which reduces some branch target distances and resolves the
+ * compilation failure.
+ */
+#if __clang_major__ >= 18
+#define UNROLL_COUNT 90
+#endif
+
 #include "pyperf.h"