
[bpf,v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH

Message ID 20240513100248.110535-1-puranjay@kernel.org (mailing list archive)
State Handled Elsewhere
Delegated to: BPF
Series [bpf,v3] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH

Checks

Context Check Description
bpf/vmtest-bpf-VM_Test-43 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-44 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-46 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-VM_Test-45 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 20 of 20 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: line length of 82 exceeds 80 columns WARNING: line length of 99 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-42 success Logs for x86_64-llvm-18 / veristat

Commit Message

Puranjay Mohan May 13, 2024, 10:02 a.m. UTC
The Linux Kernel Memory Model [1][2] requires RMW operations that have a
return value to be fully ordered.

BPF atomic operations with BPF_FETCH (including BPF_XCHG and
BPF_CMPXCHG) return a value, so they need to be JITed to fully
ordered operations. The powerpc JIT currently emits relaxed operations
for these.
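
As a rough illustration, the kind of BPF program affected is one where the
fetched value is actually consumed: with a recent clang (-mcpu=v3 or later,
as far as I understand the BPF backend) using the result of
__sync_fetch_and_add() is what makes the compiler emit an atomic add with
BPF_FETCH set. The program below is a made-up minimal sketch, not part of
this patch; names are hypothetical:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

long counter = 0;

SEC("tracepoint/syscalls/sys_enter_getpid")
int count_getpid(void *ctx)
{
	/* The return value is used, so this lowers to an atomic add with
	 * BPF_FETCH; per the LKMM the JITed code must be fully ordered. */
	long old = __sync_fetch_and_add(&counter, 1);

	return old > 100;
}

char LICENSE[] SEC("license") = "GPL";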

We can show this by running the following litmus test:

PPC SB+atomic_add+fetch

{
0:r0=x;  (* dst reg assuming offset is 0 *)
0:r1=2;  (* src reg *)
0:r2=1;
0:r4=y;  (* P0 writes to this, P1 reads this *)
0:r5=z;  (* P1 writes to this, P0 reads this *)
0:r6=0;

1:r2=1;
1:r4=y;
1:r5=z;
}

P0                      | P1            ;
stw         r2, 0(r4)   | stw  r2,0(r5) ;
                        |               ;
loop:lwarx  r3, r6, r0  |               ;
mr          r8, r3      |               ;
add         r3, r3, r1  | sync          ;
stwcx.      r3, r6, r0  |               ;
bne         loop        |               ;
mr          r1, r8      |               ;
                        |               ;
lwa         r7, 0(r5)   | lwa  r7,0(r4) ;

~exists(0:r7=0 /\ 1:r7=0)

Witnesses
Positive: 9 Negative: 3
Condition ~exists (0:r7=0 /\ 1:r7=0)
Observation SB+atomic_add+fetch Sometimes 3 9

This test shows that the older store in P0 is reordered with a newer
load to a different address, even though there is an RMW operation with
fetch between them. Adding a sync before and after the RMW fixes the issue:

Witnesses
Positive: 9 Negative: 0
Condition ~exists (0:r7=0 /\ 1:r7=0)
Observation SB+atomic_add+fetch Never 0 9
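
For reference, the equivalent of the fixed sequence in plain C with inline
assembly looks roughly like the sketch below (a made-up, freestanding helper,
not kernel code; see __cmpxchg_u32() in asm/cmpxchg.h for the real pattern).
It mirrors the barrier placement this patch makes the JIT emit: a 'sync'
before and after the RMW, both kept outside the ll/sc retry loop:

static inline unsigned int fetch_add_fully_ordered(unsigned int *p, unsigned int val)
{
	unsigned int old, tmp;

	__asm__ __volatile__(
	"	sync		\n"	/* full barrier before the RMW	*/
	"1:	lwarx	%0,0,%3	\n"	/* load-reserve the old value	*/
	"	add	%1,%0,%4\n"	/* old + val			*/
	"	stwcx.	%1,0,%3	\n"	/* store-conditional		*/
	"	bne-	1b	\n"	/* retry if reservation lost	*/
	"	sync		\n"	/* full barrier after the RMW	*/
	: "=&r" (old), "=&r" (tmp), "+m" (*p)
	: "r" (p), "r" (val)
	: "cc", "memory");

	return old;
}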

[1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
[2] https://www.kernel.org/doc/Documentation/atomic_t.txt

Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
---
Changes in v2 -> v3:
v2: https://lore.kernel.org/all/20240508115404.74823-1-puranjay@kernel.org/
- Emit the sync outside the loop so it doesn't get executed every time the loop retries.
- Minor coding style changes.

Changes in v1 -> v2:
v1: https://lore.kernel.org/all/20240507175439.119467-1-puranjay@kernel.org/
- Don't emit `sync` for non-SMP kernels as that adds unnecessary overhead.
---

 arch/powerpc/net/bpf_jit_comp32.c | 12 ++++++++++++
 arch/powerpc/net/bpf_jit_comp64.c | 12 ++++++++++++
 2 files changed, 24 insertions(+)

Comments

Christophe Leroy May 13, 2024, 10:06 a.m. UTC | #1
Le 13/05/2024 à 12:02, Puranjay Mohan a écrit :
> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
> return value to be fully ordered.
> 
> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
> BPF_CMPXCHG) return a value back so they need to be JITed to fully
> ordered operations. POWERPC currently emits relaxed operations for
> these.
> 
> We can show this by running the following litmus-test:
> 
> PPC SB+atomic_add+fetch
> 
> {
> 0:r0=x;  (* dst reg assuming offset is 0 *)
> 0:r1=2;  (* src reg *)
> 0:r2=1;
> 0:r4=y;  (* P0 writes to this, P1 reads this *)
> 0:r5=z;  (* P1 writes to this, P0 reads this *)
> 0:r6=0;
> 
> 1:r2=1;
> 1:r4=y;
> 1:r5=z;
> }
> 
> P0                      | P1            ;
> stw         r2, 0(r4)   | stw  r2,0(r5) ;
>                          |               ;
> loop:lwarx  r3, r6, r0  |               ;
> mr          r8, r3      |               ;
> add         r3, r3, r1  | sync          ;
> stwcx.      r3, r6, r0  |               ;
> bne         loop        |               ;
> mr          r1, r8      |               ;
>                          |               ;
> lwa         r7, 0(r5)   | lwa  r7,0(r4) ;
> 
> ~exists(0:r7=0 /\ 1:r7=0)
> 
> Witnesses
> Positive: 9 Negative: 3
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Sometimes 3 9
> 
> This test shows that the older store in P0 is reordered with a newer
> load to a different address. Although there is a RMW operation with
> fetch between them. Adding a sync before and after RMW fixes the issue:
> 
> Witnesses
> Positive: 9 Negative: 0
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Never 0 9
> 
> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
> 
> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>

> ---
> Changes in v2 -> v3:
> v2: https://lore.kernel.org/all/20240508115404.74823-1-puranjay@kernel.org/
> - Emit the sync outside the loop so it doesn't get executed everytime.
> - Minor coding style changes.
> 
> Changes in v1 -> v2:
> v1: https://lore.kernel.org/all/20240507175439.119467-1-puranjay@kernel.org/
> - Don't emit `sync` for non-SMP kernels as that adds unessential overhead.
> ---
> 
> arch/powerpc/net/bpf_jit_comp32.c | 12 ++++++++++++
>   arch/powerpc/net/bpf_jit_comp64.c | 12 ++++++++++++
>   2 files changed, 24 insertions(+)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
> index 2f39c50ca729..35f64dcfa68e 100644
> --- a/arch/powerpc/net/bpf_jit_comp32.c
> +++ b/arch/powerpc/net/bpf_jit_comp32.c
> @@ -852,6 +852,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>   
>   			/* Get offset into TMP_REG */
>   			EMIT(PPC_RAW_LI(tmp_reg, off));
> +			/*
> +			 * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
> +			 * before and after the operation.
> +			 *
> +			 * This is a requirement in the Linux Kernel Memory Model.
> +			 * See __cmpxchg_u32() in asm/cmpxchg.h as an example.
> +			 */
> +			if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
> +				EMIT(PPC_RAW_SYNC());
>   			tmp_idx = ctx->idx * 4;
>   			/* load value from memory into r0 */
>   			EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
> @@ -905,6 +914,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>   
>   			/* For the BPF_FETCH variant, get old data into src_reg */
>   			if (imm & BPF_FETCH) {
> +				/* Emit 'sync' to enforce full ordering */
> +				if (IS_ENABLED(CONFIG_SMP))
> +					EMIT(PPC_RAW_SYNC());
>   				EMIT(PPC_RAW_MR(ret_reg, ax_reg));
>   				if (!fp->aux->verifier_zext)
>   					EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 79f23974a320..884eef1b3973 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -803,6 +803,15 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>   
>   			/* Get offset into TMP_REG_1 */
>   			EMIT(PPC_RAW_LI(tmp1_reg, off));
> +			/*
> +			 * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
> +			 * before and after the operation.
> +			 *
> +			 * This is a requirement in the Linux Kernel Memory Model.
> +			 * See __cmpxchg_u64() in asm/cmpxchg.h as an example.
> +			 */
> +			if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
> +				EMIT(PPC_RAW_SYNC());
>   			tmp_idx = ctx->idx * 4;
>   			/* load value from memory into TMP_REG_2 */
>   			if (size == BPF_DW)
> @@ -865,6 +874,9 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
>   			PPC_BCC_SHORT(COND_NE, tmp_idx);
>   
>   			if (imm & BPF_FETCH) {
> +				/* Emit 'sync' to enforce full ordering */
> +				if (IS_ENABLED(CONFIG_SMP))
> +					EMIT(PPC_RAW_SYNC());
>   				EMIT(PPC_RAW_MR(ret_reg, _R0));
>   				/*
>   				 * Skip unnecessary zero-extension for 32-bit cmpxchg.
Naveen N Rao May 13, 2024, 1:44 p.m. UTC | #2
On Mon, May 13, 2024 at 10:02:48AM GMT, Puranjay Mohan wrote:
> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
> return value to be fully ordered.
> 
> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
> BPF_CMPXCHG) return a value back so they need to be JITed to fully
> ordered operations. POWERPC currently emits relaxed operations for
> these.
> 
> We can show this by running the following litmus-test:
> 
> PPC SB+atomic_add+fetch
> 
> {
> 0:r0=x;  (* dst reg assuming offset is 0 *)
> 0:r1=2;  (* src reg *)
> 0:r2=1;
> 0:r4=y;  (* P0 writes to this, P1 reads this *)
> 0:r5=z;  (* P1 writes to this, P0 reads this *)
> 0:r6=0;
> 
> 1:r2=1;
> 1:r4=y;
> 1:r5=z;
> }
> 
> P0                      | P1            ;
> stw         r2, 0(r4)   | stw  r2,0(r5) ;
>                         |               ;
> loop:lwarx  r3, r6, r0  |               ;
> mr          r8, r3      |               ;
> add         r3, r3, r1  | sync          ;
> stwcx.      r3, r6, r0  |               ;
> bne         loop        |               ;
> mr          r1, r8      |               ;
>                         |               ;
> lwa         r7, 0(r5)   | lwa  r7,0(r4) ;
> 
> ~exists(0:r7=0 /\ 1:r7=0)
> 
> Witnesses
> Positive: 9 Negative: 3
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Sometimes 3 9
> 
> This test shows that the older store in P0 is reordered with a newer
> load to a different address. Although there is a RMW operation with
> fetch between them. Adding a sync before and after RMW fixes the issue:
> 
> Witnesses
> Positive: 9 Negative: 0
> Condition ~exists (0:r7=0 /\ 1:r7=0)
> Observation SB+atomic_add+fetch Never 0 9
> 
> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
> 
> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")

As I noted in v2, I think that is the wrong commit. This fixes the below 
four commits in mainline:
Fixes: aea7ef8a82c0 ("powerpc/bpf/32: add support for BPF_ATOMIC bitwise operations")
Fixes: 2d9206b22743 ("powerpc/bpf/32: Add instructions for atomic_[cmp]xchg")
Fixes: dbe6e2456fb0 ("powerpc/bpf/64: add support for atomic fetch operations")
Fixes: 1e82dfaa7819 ("powerpc/bpf/64: Add instructions for atomic_[cmp]xchg")

> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>

Cc: stable@vger.kernel.org # v6.0+

I have tested this with test_bpf and test_progs.
Reviewed-by: Naveen N Rao <naveen@kernel.org>


- Naveen
Puranjay Mohan May 13, 2024, 3:59 p.m. UTC | #3
Naveen N Rao <naveen@kernel.org> writes:

> On Mon, May 13, 2024 at 10:02:48AM GMT, Puranjay Mohan wrote:
>> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
>> return value to be fully ordered.
>> 
>> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
>> BPF_CMPXCHG) return a value back so they need to be JITed to fully
>> ordered operations. POWERPC currently emits relaxed operations for
>> these.
>> 
>> We can show this by running the following litmus-test:
>> 
>> PPC SB+atomic_add+fetch
>> 
>> {
>> 0:r0=x;  (* dst reg assuming offset is 0 *)
>> 0:r1=2;  (* src reg *)
>> 0:r2=1;
>> 0:r4=y;  (* P0 writes to this, P1 reads this *)
>> 0:r5=z;  (* P1 writes to this, P0 reads this *)
>> 0:r6=0;
>> 
>> 1:r2=1;
>> 1:r4=y;
>> 1:r5=z;
>> }
>> 
>> P0                      | P1            ;
>> stw         r2, 0(r4)   | stw  r2,0(r5) ;
>>                         |               ;
>> loop:lwarx  r3, r6, r0  |               ;
>> mr          r8, r3      |               ;
>> add         r3, r3, r1  | sync          ;
>> stwcx.      r3, r6, r0  |               ;
>> bne         loop        |               ;
>> mr          r1, r8      |               ;
>>                         |               ;
>> lwa         r7, 0(r5)   | lwa  r7,0(r4) ;
>> 
>> ~exists(0:r7=0 /\ 1:r7=0)
>> 
>> Witnesses
>> Positive: 9 Negative: 3
>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>> Observation SB+atomic_add+fetch Sometimes 3 9
>> 
>> This test shows that the older store in P0 is reordered with a newer
>> load to a different address. Although there is a RMW operation with
>> fetch between them. Adding a sync before and after RMW fixes the issue:
>> 
>> Witnesses
>> Positive: 9 Negative: 0
>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>> Observation SB+atomic_add+fetch Never 0 9
>> 
>> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
>> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
>> 
>> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
>
> As I noted in v2, I think that is the wrong commit. This fixes the below 

Sorry for missing this. Would this need another version, or will your
message below make it work with the stable process?

> four commits in mainline:
> Fixes: aea7ef8a82c0 ("powerpc/bpf/32: add support for BPF_ATOMIC bitwise operations")
> Fixes: 2d9206b22743 ("powerpc/bpf/32: Add instructions for atomic_[cmp]xchg")
> Fixes: dbe6e2456fb0 ("powerpc/bpf/64: add support for atomic fetch operations")
> Fixes: 1e82dfaa7819 ("powerpc/bpf/64: Add instructions for atomic_[cmp]xchg")
>
> Cc: stable@vger.kernel.org # v6.0+

Thanks,
Puranjay
Michael Ellerman May 14, 2024, 1:09 a.m. UTC | #4
Puranjay Mohan <puranjay@kernel.org> writes:
> Naveen N Rao <naveen@kernel.org> writes:
>> On Mon, May 13, 2024 at 10:02:48AM GMT, Puranjay Mohan wrote:
>>> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
>>> return value to be fully ordered.
>>> 
>>> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
>>> BPF_CMPXCHG) return a value back so they need to be JITed to fully
>>> ordered operations. POWERPC currently emits relaxed operations for
>>> these.
>>> 
>>> We can show this by running the following litmus-test:
>>> 
>>> PPC SB+atomic_add+fetch
>>> 
>>> {
>>> 0:r0=x;  (* dst reg assuming offset is 0 *)
>>> 0:r1=2;  (* src reg *)
>>> 0:r2=1;
>>> 0:r4=y;  (* P0 writes to this, P1 reads this *)
>>> 0:r5=z;  (* P1 writes to this, P0 reads this *)
>>> 0:r6=0;
>>> 
>>> 1:r2=1;
>>> 1:r4=y;
>>> 1:r5=z;
>>> }
>>> 
>>> P0                      | P1            ;
>>> stw         r2, 0(r4)   | stw  r2,0(r5) ;
>>>                         |               ;
>>> loop:lwarx  r3, r6, r0  |               ;
>>> mr          r8, r3      |               ;
>>> add         r3, r3, r1  | sync          ;
>>> stwcx.      r3, r6, r0  |               ;
>>> bne         loop        |               ;
>>> mr          r1, r8      |               ;
>>>                         |               ;
>>> lwa         r7, 0(r5)   | lwa  r7,0(r4) ;
>>> 
>>> ~exists(0:r7=0 /\ 1:r7=0)
>>> 
>>> Witnesses
>>> Positive: 9 Negative: 3
>>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>>> Observation SB+atomic_add+fetch Sometimes 3 9
>>> 
>>> This test shows that the older store in P0 is reordered with a newer
>>> load to a different address. Although there is a RMW operation with
>>> fetch between them. Adding a sync before and after RMW fixes the issue:
>>> 
>>> Witnesses
>>> Positive: 9 Negative: 0
>>> Condition ~exists (0:r7=0 /\ 1:r7=0)
>>> Observation SB+atomic_add+fetch Never 0 9
>>> 
>>> [1] https://www.kernel.org/doc/Documentation/memory-barriers.txt
>>> [2] https://www.kernel.org/doc/Documentation/atomic_t.txt
>>> 
>>> Fixes: 65112709115f ("powerpc/bpf/64: add support for BPF_ATOMIC bitwise operations")
>>
>> As I noted in v2, I think that is the wrong commit. This fixes the below 
>
> Sorry for missing this. Would this need another version or your message
> below will make it work with the stable process?

No need for another version. b4 should pick up those tags, or if not
I'll add them by hand.

cheers
Michael Ellerman June 3, 2024, 2:45 a.m. UTC | #5
On Mon, 13 May 2024 10:02:48 +0000, Puranjay Mohan wrote:
> The Linux Kernel Memory Model [1][2] requires RMW operations that have a
> return value to be fully ordered.
> 
> BPF atomic operations with BPF_FETCH (including BPF_XCHG and
> BPF_CMPXCHG) return a value back so they need to be JITed to fully
> ordered operations. POWERPC currently emits relaxed operations for
> these.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/bpf: enforce full ordering for ATOMIC operations with BPF_FETCH
      https://git.kernel.org/powerpc/c/b1e7cee96127468c2483cf10c2899c9b5cf79bf8

cheers

Patch

diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c
index 2f39c50ca729..35f64dcfa68e 100644
--- a/arch/powerpc/net/bpf_jit_comp32.c
+++ b/arch/powerpc/net/bpf_jit_comp32.c
@@ -852,6 +852,15 @@  int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
 
 			/* Get offset into TMP_REG */
 			EMIT(PPC_RAW_LI(tmp_reg, off));
+			/*
+			 * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
+			 * before and after the operation.
+			 *
+			 * This is a requirement in the Linux Kernel Memory Model.
+			 * See __cmpxchg_u32() in asm/cmpxchg.h as an example.
+			 */
+			if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
+				EMIT(PPC_RAW_SYNC());
 			tmp_idx = ctx->idx * 4;
 			/* load value from memory into r0 */
 			EMIT(PPC_RAW_LWARX(_R0, tmp_reg, dst_reg, 0));
@@ -905,6 +914,9 @@  int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
 
 			/* For the BPF_FETCH variant, get old data into src_reg */
 			if (imm & BPF_FETCH) {
+				/* Emit 'sync' to enforce full ordering */
+				if (IS_ENABLED(CONFIG_SMP))
+					EMIT(PPC_RAW_SYNC());
 				EMIT(PPC_RAW_MR(ret_reg, ax_reg));
 				if (!fp->aux->verifier_zext)
 					EMIT(PPC_RAW_LI(ret_reg - 1, 0)); /* higher 32-bit */
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 79f23974a320..884eef1b3973 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -803,6 +803,15 @@  int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
 
 			/* Get offset into TMP_REG_1 */
 			EMIT(PPC_RAW_LI(tmp1_reg, off));
+			/*
+			 * Enforce full ordering for operations with BPF_FETCH by emitting a 'sync'
+			 * before and after the operation.
+			 *
+			 * This is a requirement in the Linux Kernel Memory Model.
+			 * See __cmpxchg_u64() in asm/cmpxchg.h as an example.
+			 */
+			if ((imm & BPF_FETCH) && IS_ENABLED(CONFIG_SMP))
+				EMIT(PPC_RAW_SYNC());
 			tmp_idx = ctx->idx * 4;
 			/* load value from memory into TMP_REG_2 */
 			if (size == BPF_DW)
@@ -865,6 +874,9 @@  int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
 			PPC_BCC_SHORT(COND_NE, tmp_idx);
 
 			if (imm & BPF_FETCH) {
+				/* Emit 'sync' to enforce full ordering */
+				if (IS_ENABLED(CONFIG_SMP))
+					EMIT(PPC_RAW_SYNC());
 				EMIT(PPC_RAW_MR(ret_reg, _R0));
 				/*
 				 * Skip unnecessary zero-extension for 32-bit cmpxchg.