diff mbox series

bpf: use r constraint instead of p constraint in selftests

Message ID 20240123181309.19853-1-jose.marchesi@oracle.com (mailing list archive)
State Accepted
Commit bbc094b3052647c188d6f155f5c09cb9492ce106
Delegated to: BPF
Headers show
Series bpf: use r constraint instead of p constraint in selftests | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
netdev/tree_selection success Not a local patch
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-13 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-15 success Logs for x86_64-gcc / build-release

Commit Message

Jose E. Marchesi Jan. 23, 2024, 6:13 p.m. UTC
Some of the BPF selftests use the "p" constraint in inline assembly
snippets, for input operands for MOV (rN = rM) instructions.

This is mainly done via the __imm_ptr macro defined in
tools/testing/selftests/bpf/progs/bpf_misc.h:

  #define __imm_ptr(name) [name]"p"(&name)

Example:

  int consume_first_item_only(void *ctx)
  {
        struct bpf_iter_num iter;
        asm volatile (
                /* create iterator */
                "r1 = %[iter];"
                [...]
                :
                : __imm_ptr(iter)
                : CLOBBERS);
        [...]
  }

The "p" constraint is a tricky one.  It is documented in the GCC manual
section "Simple Constraints":

  An operand that is a valid memory address is allowed.  This is for
  ``load address'' and ``push address'' instructions.

  p in the constraint must be accompanied by address_operand as the
  predicate in the match_operand.  This predicate interprets the mode
  specified in the match_operand as the mode of the memory reference for
  which the address would be valid.

There are two problems:

1. It is questionable whether that constraint was ever intended to be
   used in inline assembly templates, because its behavior really
   depends on compiler internals.  A "memory address" is not the same
   than a "memory operand" or a "memory reference" (constraint "m"), and
   in fact its usage in the template above results in an error in both
   x86_64-linux-gnu and bpf-unkonwn-none:

     foo.c: In function ‘bar’:
     foo.c:6:3: error: invalid 'asm': invalid expression as operand
        6 |   asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
          |   ^~~

   I would assume the same happens with aarch64, riscv, and most/all
   other targets in GCC, that do not accept operands of the form A + B
   that are not wrapped either in a const or in a memory reference.

   To avoid that error, the usage of the "p" constraint in internal GCC
   instruction templates is supposed to be complemented by the 'a'
   modifier, like in:

     asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));

   Internally documented (in GCC's final.cc) as:

     %aN means expect operand N to be a memory address
        (not a memory reference!) and print a reference
        to that address.

   That works because when the modifier 'a' is found, GCC prints an
   "operand address", which is not the same than an "operand".

   But...

2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN =
   rM' instruction really requires a register argument.  In cases
   involving automatics, like in the examples above, we easily end with:

     bar:
        #APP
            r1 = r10-4
        #NO_APP

   In other cases we could conceibly also end with a 64-bit label that
   may overflow the 32-bit immediate operand of `rN = imm32'
   instructions:

        r1 = foo

   All of which is clearly wrong.

clang happens to do "the right thing" in the current usage of __imm_ptr
in the BPF tests, because even with -O2 it seems to "reload" the
fp-relative address of the automatic to a register like in:

  bar:
	r1 = r10
	r1 += -4
	#APP
	r1 = r1
	#NO_APP

Which is what GCC would generate with -O0.  Whether this is by chance
or by design, the compiler shouln't be expected to do that reload
driven by the "p" constraint.

This patch changes the usage of the "p" constraint in the BPF
selftests macros to use the "r" constraint instead.  If a register is
what is required, we should let the compiler know.

Previous discussion in bpf@vger:
https://lore.kernel.org/bpf/87h6p5ebpb.fsf@oracle.com/T/#ef0df83d6975c34dff20bf0dd52e078f5b8ca2767

Tested in bpf-next master.
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
---
 tools/testing/selftests/bpf/progs/bpf_misc.h | 2 +-
 tools/testing/selftests/bpf/progs/iters.c    | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

Comments

Yonghong Song Jan. 23, 2024, 7:07 p.m. UTC | #1
On 1/23/24 10:13 AM, Jose E. Marchesi wrote:
> Some of the BPF selftests use the "p" constraint in inline assembly
> snippets, for input operands for MOV (rN = rM) instructions.
>
> This is mainly done via the __imm_ptr macro defined in
> tools/testing/selftests/bpf/progs/bpf_misc.h:
>
>    #define __imm_ptr(name) [name]"p"(&name)
>
> Example:
>
>    int consume_first_item_only(void *ctx)
>    {
>          struct bpf_iter_num iter;
>          asm volatile (
>                  /* create iterator */
>                  "r1 = %[iter];"
>                  [...]
>                  :
>                  : __imm_ptr(iter)
>                  : CLOBBERS);
>          [...]
>    }
>
> The "p" constraint is a tricky one.  It is documented in the GCC manual
> section "Simple Constraints":
>
>    An operand that is a valid memory address is allowed.  This is for
>    ``load address'' and ``push address'' instructions.
>
>    p in the constraint must be accompanied by address_operand as the
>    predicate in the match_operand.  This predicate interprets the mode
>    specified in the match_operand as the mode of the memory reference for
>    which the address would be valid.
>
> There are two problems:
>
> 1. It is questionable whether that constraint was ever intended to be
>     used in inline assembly templates, because its behavior really
>     depends on compiler internals.  A "memory address" is not the same
>     than a "memory operand" or a "memory reference" (constraint "m"), and
>     in fact its usage in the template above results in an error in both
>     x86_64-linux-gnu and bpf-unkonwn-none:
>
>       foo.c: In function ‘bar’:
>       foo.c:6:3: error: invalid 'asm': invalid expression as operand
>          6 |   asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
>            |   ^~~
>
>     I would assume the same happens with aarch64, riscv, and most/all
>     other targets in GCC, that do not accept operands of the form A + B
>     that are not wrapped either in a const or in a memory reference.
>
>     To avoid that error, the usage of the "p" constraint in internal GCC
>     instruction templates is supposed to be complemented by the 'a'
>     modifier, like in:
>
>       asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));
>
>     Internally documented (in GCC's final.cc) as:
>
>       %aN means expect operand N to be a memory address
>          (not a memory reference!) and print a reference
>          to that address.
>
>     That works because when the modifier 'a' is found, GCC prints an
>     "operand address", which is not the same than an "operand".
>
>     But...
>
> 2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN =
>     rM' instruction really requires a register argument.  In cases
>     involving automatics, like in the examples above, we easily end with:
>
>       bar:
>          #APP
>              r1 = r10-4
>          #NO_APP
>
>     In other cases we could conceibly also end with a 64-bit label that
>     may overflow the 32-bit immediate operand of `rN = imm32'
>     instructions:
>
>          r1 = foo
>
>     All of which is clearly wrong.
>
> clang happens to do "the right thing" in the current usage of __imm_ptr
> in the BPF tests, because even with -O2 it seems to "reload" the
> fp-relative address of the automatic to a register like in:
>
>    bar:
> 	r1 = r10
> 	r1 += -4
> 	#APP
> 	r1 = r1
> 	#NO_APP
>
> Which is what GCC would generate with -O0.  Whether this is by chance
> or by design, the compiler shouln't be expected to do that reload
> driven by the "p" constraint.
>
> This patch changes the usage of the "p" constraint in the BPF
> selftests macros to use the "r" constraint instead.  If a register is
> what is required, we should let the compiler know.
>
> Previous discussion in bpf@vger:
> https://lore.kernel.org/bpf/87h6p5ebpb.fsf@oracle.com/T/#ef0df83d6975c34dff20bf0dd52e078f5b8ca2767
>
> Tested in bpf-next master.
> No regressions.
>
> Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: Eduard Zingerman <eddyz87@gmail.com>

Acked-by: Yonghong Song <yonghong.song@linux.dev>
patchwork-bot+netdevbpf@kernel.org Jan. 24, 2024, midnight UTC | #2
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Tue, 23 Jan 2024 19:13:09 +0100 you wrote:
> Some of the BPF selftests use the "p" constraint in inline assembly
> snippets, for input operands for MOV (rN = rM) instructions.
> 
> This is mainly done via the __imm_ptr macro defined in
> tools/testing/selftests/bpf/progs/bpf_misc.h:
> 
>   #define __imm_ptr(name) [name]"p"(&name)
> 
> [...]

Here is the summary with links:
  - bpf: use r constraint instead of p constraint in selftests
    https://git.kernel.org/bpf/bpf-next/c/bbc094b30526

You are awesome, thank you!
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index 2fd59970c43a..fb2f5513e29e 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -80,7 +80,7 @@ 
 #define __imm(name) [name]"i"(name)
 #define __imm_const(name, expr) [name]"i"(expr)
 #define __imm_addr(name) [name]"i"(&name)
-#define __imm_ptr(name) [name]"p"(&name)
+#define __imm_ptr(name) [name]"r"(&name)
 #define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))
 
 /* Magic constants used with __retval() */
diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index fe971992e635..225f02dd66d0 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -78,8 +78,8 @@  int iter_err_unsafe_asm_loop(const void *ctx)
 		"*(u32 *)(r1 + 0) = r6;" /* invalid */
 		:
 		: [it]"r"(&it),
-		  [small_arr]"p"(small_arr),
-		  [zero]"p"(zero),
+		  [small_arr]"r"(small_arr),
+		  [zero]"r"(zero),
 		  __imm(bpf_iter_num_new),
 		  __imm(bpf_iter_num_next),
 		  __imm(bpf_iter_num_destroy)