
riscv, bpf: Make BPF_CMPXCHG fully ordered

Message ID 20241017143628.2673894-1-parri.andrea@gmail.com (mailing list archive)
State Accepted
Commit 98cd61955771c16f43c671f0ee47213b75416350
Delegated to: BPF
Series riscv, bpf: Make BPF_CMPXCHG fully ordered

Checks

Context Check Description
netdev/tree_selection success Not a local patch
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-17 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-18 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / veristat

Commit Message

Andrea Parri Oct. 17, 2024, 2:36 p.m. UTC
According to the prototype formal BPF memory consistency model
discussed e.g. in [1] and following the ordering properties of
the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
with the BPF_CMPXCHG modifier is fully ordered.  However, the
current RISC-V JIT lowerings fail to meet such memory ordering
property.  This is illustrated by the following litmus test:

BPF BPF__MP+success_cmpxchg+fence
{
 0:r1=x; 0:r3=y; 0:r5=1;
 1:r2=y; 1:r4=f; 1:r7=x;
}
 P0                               | P1                                         ;
 *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
 r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
                                  | r6 = *(u64 *)(r7 + 0)                      ;
exists (1:r1=1 /\ 1:r6=0)

whose "exists" clause is not satisfiable according to the BPF
memory model.  Using the current RISC-V JIT lowerings, the test
can be mapped to the following RISC-V litmus test:

RISCV RISCV__MP+success_cmpxchg+fence
{
 0:x1=x; 0:x3=y; 0:x5=1;
 1:x2=y; 1:x4=f; 1:x7=x;
}
 P0                 | P1                          ;
 sd x5, 0(x1)       | ld x1, 0(x2)                ;
 L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
 lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
 bne x2, x4, L01    |                             ;
 sc.d x6, x5, 0(x3) |                             ;
 bne x6, x4, L00    |                             ;
 fence rw, rw       |                             ;
 L01:               |                             ;
exists (1:x1=1 /\ 1:x6=0)

where the two stores in P0 can be reordered.  Update the RISC-V
JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
RELEASE ("rl") annotation in order to meet the expected memory
ordering guarantees.  The resulting RISC-V JIT lowerings of
BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().

Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1]
---
 arch/riscv/net/bpf_jit_comp64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
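
With the fix applied, P0 of the RISC-V litmus test above becomes the following
(only the store-conditional changes; this listing is an editorial illustration
of the resulting lowering, not part of the committed test):

 sd x5, 0(x1)
 L00:
 lr.d x2, 0(x3)
 bne x2, x4, L01
 sc.d.rl x6, x5, 0(x3)
 bne x6, x4, L00
 fence rw, rw
 L01:

With the "rl" (RELEASE) annotation on the SC, P0's earlier store to x is
ordered before its store to y, so the "exists" clause above is no longer
satisfiable, matching what the BPF memory model requires of BPF_CMPXCHG.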

Comments

Puranjay Mohan Oct. 17, 2024, 2:46 p.m. UTC | #1
Andrea Parri <parri.andrea@gmail.com> writes:

> According to the prototype formal BPF memory consistency model
> discussed e.g. in [1] and following the ordering properties of
> the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
> with the BPF_CMPXCHG modifier is fully ordered.  However, the
> current RISC-V JIT lowerings fail to meet such memory ordering
> property.  This is illustrated by the following litmus test:
>
> BPF BPF__MP+success_cmpxchg+fence
> {
>  0:r1=x; 0:r3=y; 0:r5=1;
>  1:r2=y; 1:r4=f; 1:r7=x;
> }
>  P0                               | P1                                         ;
>  *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
>  r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
>                                   | r6 = *(u64 *)(r7 + 0)                      ;
> exists (1:r1=1 /\ 1:r6=0)
>
> whose "exists" clause is not satisfiable according to the BPF
> memory model.  Using the current RISC-V JIT lowerings, the test
> can be mapped to the following RISC-V litmus test:
>
> RISCV RISCV__MP+success_cmpxchg+fence
> {
>  0:x1=x; 0:x3=y; 0:x5=1;
>  1:x2=y; 1:x4=f; 1:x7=x;
> }
>  P0                 | P1                          ;
>  sd x5, 0(x1)       | ld x1, 0(x2)                ;
>  L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
>  lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
>  bne x2, x4, L01    |                             ;
>  sc.d x6, x5, 0(x3) |                             ;
>  bne x6, x4, L00    |                             ;
>  fence rw, rw       |                             ;
>  L01:               |                             ;
> exists (1:x1=1 /\ 1:x6=0)
>
> where the two stores in P0 can be reordered.  Update the RISC-V
> JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
> RELEASE ("rl") annotation in order to meet the expected memory
> ordering guarantees.  The resulting RISC-V JIT lowerings of
> BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().

Thanks for fixing this, I fixed all others in:

20a759df3bba ("riscv, bpf: make some atomic operations fully ordered")

> Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
> Signed-off-by: Andrea Parri <parri.andrea@gmail.com>

Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

> Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1]
> ---
>  arch/riscv/net/bpf_jit_comp64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
> index 99f34409fb60f..c207aa33c980b 100644
> --- a/arch/riscv/net/bpf_jit_comp64.c
> +++ b/arch/riscv/net/bpf_jit_comp64.c
> @@ -548,8 +548,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
>  		     rv_lr_w(r0, 0, rd, 0, 0), ctx);
>  		jmp_offset = ninsns_rvoff(8);
>  		emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);
> -		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :
> -		     rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);
> +		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :
> +		     rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);
>  		jmp_offset = ninsns_rvoff(-6);
>  		emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);
>  		emit(rv_fence(0x3, 0x3), ctx);
> -- 
> 2.43.0
patchwork-bot+netdevbpf@kernel.org Oct. 17, 2024, 3:10 p.m. UTC | #2
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Thu, 17 Oct 2024 17:36:28 +0300 you wrote:
> According to the prototype formal BPF memory consistency model
> discussed e.g. in [1] and following the ordering properties of
> the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
> with the BPF_CMPXCHG modifier is fully ordered.  However, the
> current RISC-V JIT lowerings fail to meet such memory ordering
> property.  This is illustrated by the following litmus test:
> 
> [...]

Here is the summary with links:
  - riscv, bpf: Make BPF_CMPXCHG fully ordered
    https://git.kernel.org/bpf/bpf/c/98cd61955771

You are awesome, thank you!
Björn Töpel Oct. 17, 2024, 3:11 p.m. UTC | #3
Thanks, Andrea!

Puranjay Mohan <puranjay@kernel.org> writes:

> Andrea Parri <parri.andrea@gmail.com> writes:
>
>> According to the prototype formal BPF memory consistency model
>> discussed e.g. in [1] and following the ordering properties of
>> the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
>> with the BPF_CMPXCHG modifier is fully ordered.  However, the
>> current RISC-V JIT lowerings fail to meet such memory ordering
>> property.  This is illustrated by the following litmus test:
>>
>> BPF BPF__MP+success_cmpxchg+fence
>> {
>>  0:r1=x; 0:r3=y; 0:r5=1;
>>  1:r2=y; 1:r4=f; 1:r7=x;
>> }
>>  P0                               | P1                                         ;
>>  *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
>>  r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
>>                                   | r6 = *(u64 *)(r7 + 0)                      ;
>> exists (1:r1=1 /\ 1:r6=0)
>>
>> whose "exists" clause is not satisfiable according to the BPF
>> memory model.  Using the current RISC-V JIT lowerings, the test
>> can be mapped to the following RISC-V litmus test:
>>
>> RISCV RISCV__MP+success_cmpxchg+fence
>> {
>>  0:x1=x; 0:x3=y; 0:x5=1;
>>  1:x2=y; 1:x4=f; 1:x7=x;
>> }
>>  P0                 | P1                          ;
>>  sd x5, 0(x1)       | ld x1, 0(x2)                ;
>>  L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
>>  lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
>>  bne x2, x4, L01    |                             ;
>>  sc.d x6, x5, 0(x3) |                             ;
>>  bne x6, x4, L00    |                             ;
>>  fence rw, rw       |                             ;
>>  L01:               |                             ;
>> exists (1:x1=1 /\ 1:x6=0)
>>
>> where the two stores in P0 can be reordered.  Update the RISC-V
>> JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
>> RELEASE ("rl") annotation in order to meet the expected memory
>> ordering guarantees.  The resulting RISC-V JIT lowerings of
>> BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().
>
> Thanks for fixing this, I fixed all others in:
>
> 20a759df3bba ("riscv, bpf: make some atomic operations fully ordered")
>
>> Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
>> Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
>
> Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

Acked-by: Björn Töpel <bjorn@kernel.org>

Patch

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 99f34409fb60f..c207aa33c980b 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -548,8 +548,8 @@  static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
 		     rv_lr_w(r0, 0, rd, 0, 0), ctx);
 		jmp_offset = ninsns_rvoff(8);
 		emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);
-		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :
-		     rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :
+		     rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);
 		jmp_offset = ninsns_rvoff(-6);
 		emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);
 		emit(rv_fence(0x3, 0x3), ctx);
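
For readers unfamiliar with the RISC-V encoding: the final argument of
rv_sc_d()/rv_sc_w() in the hunk above is the "rl" bit of the instruction (per
the commit message), and the argument before it is presumably "aq", so the
0 -> 1 change flips a single bit in the emitted machine word.  The following
self-contained sketch shows the instruction layout involved; the amo_insn()
helper, its signature, and the register choices are illustrative only and are
not the kernel's actual rv_*() implementation from arch/riscv/net/bpf_jit.h:

	#include <stdint.h>
	#include <stdio.h>

	/*
	 * Per the RISC-V "A" extension, LR/SC/AMO instructions use the AMO
	 * major opcode (0x2f), funct3 selects the width (0x2 = .w, 0x3 = .d),
	 * bits 31:27 select the operation (LR = 0x02, SC = 0x03, AMOADD = 0x00,
	 * ...), and bits 26/25 are the acquire ("aq") and release ("rl") bits.
	 */
	static uint32_t amo_insn(uint32_t funct5, uint32_t aq, uint32_t rl,
				 uint32_t rs2, uint32_t rs1, uint32_t funct3,
				 uint32_t rd)
	{
		uint32_t funct7 = (funct5 << 2) | (aq << 1) | rl;

		return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
		       (funct3 << 12) | (rd << 7) | 0x2f;
	}

	int main(void)
	{
		/* sc.d t3, a1, (a0): rd = x28 (t3), rs2 = x11 (a1), rs1 = x10 (a0) */
		uint32_t sc_d    = amo_insn(0x03, 0, 0, 11, 10, 0x3, 28);
		uint32_t sc_d_rl = amo_insn(0x03, 0, 1, 11, 10, 0x3, 28);

		/* The two encodings differ only in bit 25, the RELEASE bit. */
		printf("sc.d    = 0x%08x\n", sc_d);
		printf("sc.d.rl = 0x%08x (diff = 0x%08x)\n", sc_d_rl, sc_d ^ sc_d_rl);
		return 0;
	}

Running the sketch prints two words that differ only in bit 25 (0x02000000),
which is exactly the one-bit change this patch makes to the generated
store-conditional.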