[bpf-next,v1,8/8] bpf, docs: Update instruction-set.rst for load-acquire and store-release instructions

Message ID e2072e24a6773b346f2a71c80b6a28d5b98e6194.1737763916.git.yepeilin@google.com (mailing list archive)
State New
Series Introduce load-acquire and store-release BPF instructions

Commit Message

Peilin Ye Jan. 25, 2025, 2:19 a.m. UTC
Update documentation for the new load-acquire and store-release
instructions.  Rename existing atomic operations as "atomic
read-modify-write (RMW) operations".

Following RFC 9669, section 7.3. "Adding Instructions", create new
conformance groups "atomic32v2" and "atomic64v2", where:

  * atomic32v2: includes all instructions in "atomic32", plus the new
                8-bit, 16-bit and 32-bit atomic load-acquire and
                store-release instructions

  * atomic64v2: includes all instructions in "atomic64" and
                "atomic32v2", plus the new 64-bit atomic load-acquire
                and store-release instructions

Cc: bpf@ietf.org
Signed-off-by: Peilin Ye <yepeilin@google.com>
---
 .../bpf/standardization/instruction-set.rst   | 114 +++++++++++++++---
 1 file changed, 98 insertions(+), 16 deletions(-)

Comments

Alexei Starovoitov Jan. 30, 2025, 12:44 a.m. UTC | #1
On Fri, Jan 24, 2025 at 6:19 PM Peilin Ye <yepeilin@google.com> wrote:
>
> Update documentation for the new load-acquire and store-release
> instructions.  Rename existing atomic operations as "atomic
> read-modify-write (RMW) operations".
>
> Following RFC 9669, section 7.3. "Adding Instructions", create new
> conformance groups "atomic32v2" and "atomic64v2", where:
>
>   * atomic32v2: includes all instructions in "atomic32", plus the new
>                 8-bit, 16-bit and 32-bit atomic load-acquire and
>                 store-release instructions
>
>   * atomic64v2: includes all instructions in "atomic64" and
>                 "atomic32v2", plus the new 64-bit atomic load-acquire
>                 and store-release instructions
>
> Cc: bpf@ietf.org
> Signed-off-by: Peilin Ye <yepeilin@google.com>
> ---
>  .../bpf/standardization/instruction-set.rst   | 114 +++++++++++++++---
>  1 file changed, 98 insertions(+), 16 deletions(-)
>
> diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
> index ab820d565052..86917932e9ef 100644
> --- a/Documentation/bpf/standardization/instruction-set.rst
> +++ b/Documentation/bpf/standardization/instruction-set.rst
> @@ -139,8 +139,14 @@ This document defines the following conformance groups:
>    specification unless otherwise noted.
>  * base64: includes base32, plus instructions explicitly noted
>    as being in the base64 conformance group.
> -* atomic32: includes 32-bit atomic operation instructions (see `Atomic operations`_).
> -* atomic64: includes atomic32, plus 64-bit atomic operation instructions.
> +* atomic32: includes 32-bit atomic read-modify-write instructions (see
> +  `Atomic operations`_).
> +* atomic32v2: includes atomic32, plus 8-bit, 16-bit and 32-bit atomic
> +  load-acquire and store-release instructions.
> +* atomic64: includes atomic32, plus 64-bit atomic read-modify-write
> +  instructions.
> +* atomic64v2: unifies atomic32v2 and atomic64, plus 64-bit atomic load-acquire
> +  and store-release instructions.
>  * divmul32: includes 32-bit division, multiplication, and modulo instructions.
>  * divmul64: includes divmul32, plus 64-bit division, multiplication,
>    and modulo instructions.
> @@ -653,20 +659,31 @@ Atomic operations are operations that operate on memory and can not be
>  interrupted or corrupted by other access to the same memory region
>  by other BPF programs or means outside of this specification.
>
> -All atomic operations supported by BPF are encoded as store operations
> -that use the ``ATOMIC`` mode modifier as follows:
> +All atomic operations supported by BPF are encoded as ``STX`` instructions
> +that use the ``ATOMIC`` mode modifier, with the 'imm' field encoding the
> +actual atomic operation.  These operations are categorized based on the second
> +lowest nibble (bits 4-7) of the 'imm' field:
>
> -* ``{ATOMIC, W, STX}`` for 32-bit operations, which are
> +* ``ATOMIC_LOAD`` and ``ATOMIC_STORE`` indicate atomic load and store
> +  operations, respectively (see `Atomic load and store operations`_).
> +* All other defined values indicate an atomic read-modify-write operation, as
> +  described in the following section.
> +
> +Atomic read-modify-write operations
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The atomic read-modify-write (RMW) operations are encoded as follows:
> +
> +* ``{ATOMIC, W, STX}`` for 32-bit RMW operations, which are
>    part of the "atomic32" conformance group.
> -* ``{ATOMIC, DW, STX}`` for 64-bit operations, which are
> +* ``{ATOMIC, DW, STX}`` for 64-bit RMW operations, which are
>    part of the "atomic64" conformance group.
> -* 8-bit and 16-bit wide atomic operations are not supported.
> +* 8-bit and 16-bit wide atomic RMW operations are not supported.
>
> -The 'imm' field is used to encode the actual atomic operation.
> -Simple atomic operation use a subset of the values defined to encode
> -arithmetic operations in the 'imm' field to encode the atomic operation:
> +Simple atomic RMW operation use a subset of the values defined to encode
> +arithmetic operations in the 'imm' field to encode the atomic RMW operation:
>
> -.. table:: Simple atomic operations
> +.. table:: Simple atomic read-modify-write operations
>
>    ========  =====  ===========
>    imm       value  description
> @@ -686,10 +703,10 @@ arithmetic operations in the 'imm' field to encode the atomic operation:
>
>    *(u64 *)(dst + offset) += src
>
> -In addition to the simple atomic operations, there also is a modifier and
> -two complex atomic operations:
> +In addition to the simple atomic RMW operations, there also is a modifier and
> +two complex atomic RMW operations:
>
> -.. table:: Complex atomic operations
> +.. table:: Complex atomic read-modify-write operations
>
>    ===========  ================  ===========================
>    imm          value             description
> @@ -699,8 +716,8 @@ two complex atomic operations:
>    CMPXCHG      0xf0 | FETCH      atomic compare and exchange
>    ===========  ================  ===========================
>
> -The ``FETCH`` modifier is optional for simple atomic operations, and
> -always set for the complex atomic operations.  If the ``FETCH`` flag
> +The ``FETCH`` modifier is optional for simple atomic RMW operations, and
> +always set for the complex atomic RMW operations.  If the ``FETCH`` flag
>  is set, then the operation also overwrites ``src`` with the value that
>  was in memory before it was modified.
>
> @@ -713,6 +730,71 @@ The ``CMPXCHG`` operation atomically compares the value addressed by
>  value that was at ``dst + offset`` before the operation is zero-extended
>  and loaded back to ``R0``.
>
> +Atomic load and store operations
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +To encode an atomic load or store operation, the lowest 8 bits of the 'imm'
> +field are divided as follows::
> +
> +  +-+-+-+-+-+-+-+-+
> +  | type  | order |
> +  +-+-+-+-+-+-+-+-+
> +
> +**type**
> +  The operation type is one of:
> +
> +.. table:: Atomic load and store operation types
> +
> +  ============  =====  ============
> +  type          value  description
> +  ============  =====  ============
> +  ATOMIC_LOAD   0x1    atomic load
> +  ATOMIC_STORE  0x2    atomic store
> +  ============  =====  ============
> +
> +**order**
> +  The memory order is one of:
> +
> +.. table:: Memory orders
> +
> +  =======  =====  =======================
> +  order    value  description
> +  =======  =====  =======================
> +  RELAXED  0x0    relaxed
> +  ACQUIRE  0x1    acquire
> +  RELEASE  0x2    release
> +  ACQ_REL  0x3    acquire and release
> +  SEQ_CST  0x4    sequentially consistent
> +  =======  =====  =======================

I understand that this is inspired by C,
but what are the chances this will map meaningfully to hw?
What are JITs supposed to do with all the other combinations?

> +Currently the following combinations of ``type`` and ``order`` are allowed:
> +
> +.. table:: Atomic load and store operations
> +
> +  ========= =====  ====================
> +  imm       value  description
> +  ========= =====  ====================
> +  LOAD_ACQ  0x11   atomic load-acquire
> +  STORE_REL 0x22   atomic store-release
> +  ========= =====  ====================

Should we do LOAD_ACQ=1 and STORE_REL=2 and
not add anything else?

Peilin Ye Jan. 30, 2025, 7:33 a.m. UTC | #2
+Cc: Yingchi Long

On Wed, Jan 29, 2025 at 04:44:02PM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 24, 2025 at 6:19 PM Peilin Ye <yepeilin@google.com> wrote:
> > +Atomic load and store operations
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +To encode an atomic load or store operation, the lowest 8 bits of the 'imm'
> > +field are divided as follows::
> > +
> > +  +-+-+-+-+-+-+-+-+
> > +  | type  | order |
> > +  +-+-+-+-+-+-+-+-+
> > +
> > +**type**
> > +  The operation type is one of:
> > +
> > +.. table:: Atomic load and store operation types
> > +
> > +  ============  =====  ============
> > +  type          value  description
> > +  ============  =====  ============
> > +  ATOMIC_LOAD   0x1    atomic load
> > +  ATOMIC_STORE  0x2    atomic store
> > +  ============  =====  ============
> > +
> > +**order**
> > +  The memory order is one of:
> > +
> > +.. table:: Memory orders
> > +
> > +  =======  =====  =======================
> > +  order    value  description
> > +  =======  =====  =======================
> > +  RELAXED  0x0    relaxed
> > +  ACQUIRE  0x1    acquire
> > +  RELEASE  0x2    release
> > +  ACQ_REL  0x3    acquire and release
> > +  SEQ_CST  0x4    sequentially consistent
> > +  =======  =====  =======================
> 
> I understand that this is inspired by C,
> but what are the chances this will map meaningfully to hw?
> What are JITs supposed to do with all the other combinations?

For context, those memorder flags were added after a discussion about
the SEQ_CST case on GitHub [1].

Do you anticipate we'll ever need BPF atomic seq_cst load/store
instructions?

If yes, I think we can either:

  (a) add more flags to imm<4-7>: maybe LOAD_SEQ_CST (0x3) and
      STORE_SEQ_CST (0x6); need to skip OR (0x4) and AND (0x5) used by
      RMW atomics
  (b) specify memorder in imm<0-3>

I chose (b) for fewer "What would be a good numerical value so that RMW
atomics won't need to use it in imm<4-7>?" questions to answer.

If we're having dedicated fields for memorder, I think it's better to
define all possible values once and for all, just so that e.g. 0x2 will
always mean RELEASE in a memorder field.  Initially I defined all six of
them [2], then Yonghong suggested dropping CONSUME [3].

[1] https://github.com/llvm/llvm-project/pull/108636#discussion_r1817555681
[2] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf#page=1817
[3] https://github.com/llvm/llvm-project/pull/108636#discussion_r1819380536

Thanks,
Peilin Ye
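
Editorial sketch of the encoding discussed in this thread: a minimal C
example, assuming the v1 layout proposed in the patch (operation type in
'imm' bits 4-7, memory order in bits 0-3). The helper names are
hypothetical and are not kernel or libbpf API, and the layout may change
in later revisions of the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the "type" and "order" tables in the v1 patch. */
enum atomic_type { ATOMIC_LOAD = 0x1, ATOMIC_STORE = 0x2 };
enum mem_order {
	RELAXED = 0x0, ACQUIRE = 0x1, RELEASE = 0x2,
	ACQ_REL = 0x3, SEQ_CST = 0x4,
};

/* Hypothetical helper: second-lowest nibble (bits 4-7) of 'imm'. */
static inline uint8_t imm_type(int32_t imm)
{
	return ((uint32_t)imm >> 4) & 0xf;
}

/* Hypothetical helper: lowest nibble (bits 0-3) of 'imm'. */
static inline uint8_t imm_order(int32_t imm)
{
	return (uint32_t)imm & 0xf;
}

/*
 * True if 'imm' selects an atomic load or store rather than an RMW
 * operation. The thread notes that RMW operations already use type
 * values such as OR (0x4) and AND (0x5), so 0x1 and 0x2 do not collide.
 */
static inline bool is_atomic_load_store(int32_t imm)
{
	uint8_t type = imm_type(imm);

	return type == ATOMIC_LOAD || type == ATOMIC_STORE;
}

/*
 * True only for the two combinations the v1 patch allows:
 * LOAD_ACQ (0x11) and STORE_REL (0x22). Assumes all bits above the
 * lowest 8 are zero, which the patch text does not state explicitly.
 */
static inline bool is_allowed_load_store(int32_t imm)
{
	return imm == ((ATOMIC_LOAD << 4) | ACQUIRE) ||
	       imm == ((ATOMIC_STORE << 4) | RELEASE);
}
```

Under this layout a JIT or verifier would reject, for example, 0x12
(load with release order): it decodes as a load/store type but is not
one of the allowed combinations, which is the case the review comment
asks about.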

Patch

diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index ab820d565052..86917932e9ef 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -139,8 +139,14 @@  This document defines the following conformance groups:
   specification unless otherwise noted.
 * base64: includes base32, plus instructions explicitly noted
   as being in the base64 conformance group.
-* atomic32: includes 32-bit atomic operation instructions (see `Atomic operations`_).
-* atomic64: includes atomic32, plus 64-bit atomic operation instructions.
+* atomic32: includes 32-bit atomic read-modify-write instructions (see
+  `Atomic operations`_).
+* atomic32v2: includes atomic32, plus 8-bit, 16-bit and 32-bit atomic
+  load-acquire and store-release instructions.
+* atomic64: includes atomic32, plus 64-bit atomic read-modify-write
+  instructions.
+* atomic64v2: unifies atomic32v2 and atomic64, plus 64-bit atomic load-acquire
+  and store-release instructions.
 * divmul32: includes 32-bit division, multiplication, and modulo instructions.
 * divmul64: includes divmul32, plus 64-bit division, multiplication,
   and modulo instructions.
@@ -653,20 +659,31 @@  Atomic operations are operations that operate on memory and can not be
 interrupted or corrupted by other access to the same memory region
 by other BPF programs or means outside of this specification.
 
-All atomic operations supported by BPF are encoded as store operations
-that use the ``ATOMIC`` mode modifier as follows:
+All atomic operations supported by BPF are encoded as ``STX`` instructions
+that use the ``ATOMIC`` mode modifier, with the 'imm' field encoding the
+actual atomic operation.  These operations are categorized based on the second
+lowest nibble (bits 4-7) of the 'imm' field:
 
-* ``{ATOMIC, W, STX}`` for 32-bit operations, which are
+* ``ATOMIC_LOAD`` and ``ATOMIC_STORE`` indicate atomic load and store
+  operations, respectively (see `Atomic load and store operations`_).
+* All other defined values indicate an atomic read-modify-write operation, as
+  described in the following section.
+
+Atomic read-modify-write operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The atomic read-modify-write (RMW) operations are encoded as follows:
+
+* ``{ATOMIC, W, STX}`` for 32-bit RMW operations, which are
   part of the "atomic32" conformance group.
-* ``{ATOMIC, DW, STX}`` for 64-bit operations, which are
+* ``{ATOMIC, DW, STX}`` for 64-bit RMW operations, which are
   part of the "atomic64" conformance group.
-* 8-bit and 16-bit wide atomic operations are not supported.
+* 8-bit and 16-bit wide atomic RMW operations are not supported.
 
-The 'imm' field is used to encode the actual atomic operation.
-Simple atomic operation use a subset of the values defined to encode
-arithmetic operations in the 'imm' field to encode the atomic operation:
+Simple atomic RMW operation use a subset of the values defined to encode
+arithmetic operations in the 'imm' field to encode the atomic RMW operation:
 
-.. table:: Simple atomic operations
+.. table:: Simple atomic read-modify-write operations
 
   ========  =====  ===========
   imm       value  description
@@ -686,10 +703,10 @@  arithmetic operations in the 'imm' field to encode the atomic operation:
 
   *(u64 *)(dst + offset) += src
 
-In addition to the simple atomic operations, there also is a modifier and
-two complex atomic operations:
+In addition to the simple atomic RMW operations, there also is a modifier and
+two complex atomic RMW operations:
 
-.. table:: Complex atomic operations
+.. table:: Complex atomic read-modify-write operations
 
   ===========  ================  ===========================
   imm          value             description
@@ -699,8 +716,8 @@  two complex atomic operations:
   CMPXCHG      0xf0 | FETCH      atomic compare and exchange
   ===========  ================  ===========================
 
-The ``FETCH`` modifier is optional for simple atomic operations, and
-always set for the complex atomic operations.  If the ``FETCH`` flag
+The ``FETCH`` modifier is optional for simple atomic RMW operations, and
+always set for the complex atomic RMW operations.  If the ``FETCH`` flag
 is set, then the operation also overwrites ``src`` with the value that
 was in memory before it was modified.
 
@@ -713,6 +730,71 @@  The ``CMPXCHG`` operation atomically compares the value addressed by
 value that was at ``dst + offset`` before the operation is zero-extended
 and loaded back to ``R0``.
 
+Atomic load and store operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To encode an atomic load or store operation, the lowest 8 bits of the 'imm'
+field are divided as follows::
+
+  +-+-+-+-+-+-+-+-+
+  | type  | order |
+  +-+-+-+-+-+-+-+-+
+
+**type**
+  The operation type is one of:
+
+.. table:: Atomic load and store operation types
+
+  ============  =====  ============
+  type          value  description
+  ============  =====  ============
+  ATOMIC_LOAD   0x1    atomic load
+  ATOMIC_STORE  0x2    atomic store
+  ============  =====  ============
+
+**order**
+  The memory order is one of:
+
+.. table:: Memory orders
+
+  =======  =====  =======================
+  order    value  description
+  =======  =====  =======================
+  RELAXED  0x0    relaxed
+  ACQUIRE  0x1    acquire
+  RELEASE  0x2    release
+  ACQ_REL  0x3    acquire and release
+  SEQ_CST  0x4    sequentially consistent
+  =======  =====  =======================
+
+Currently the following combinations of ``type`` and ``order`` are allowed:
+
+.. table:: Atomic load and store operations
+
+  ========= =====  ====================
+  imm       value  description
+  ========= =====  ====================
+  LOAD_ACQ  0x11   atomic load-acquire
+  STORE_REL 0x22   atomic store-release
+  ========= =====  ====================
+
+``{ATOMIC, <size>, STX}`` with 'imm' = LOAD_ACQ means::
+
+  dst = load_acquire((unsigned size *)(src + offset))
+
+``{ATOMIC, <size>, STX}`` with 'imm' = STORE_REL means::
+
+  store_release((unsigned size *)(dst + offset), src)
+
+Where '<size>' is one of: ``B``, ``H``, ``W``, or ``DW``, and 'unsigned size'
+is one of: u8, u16, u32, or u64.
+
+8-bit, 16-bit and 32-bit atomic load-acquire and store-release instructions
+are part of the "atomic32v2" conformance group.
+
+64-bit atomic load-acquire and store-release instructions are part of the
+"atomic64v2" conformance group.
+
 64-bit immediate instructions
 -----------------------------
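
Editorial sketch of the pseudocode in the patch above: the patch writes
the semantics as ``load_acquire()`` and ``store_release()`` without
defining them. The example below assumes they behave like C11 acquire
loads and release stores; that mapping is an assumption, not something
the patch states, and the ``mailbox`` type and function names are
hypothetical. It shows the usual message-passing pattern these
instructions are meant for, using the 32-bit size.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared structure: a payload guarded by a ready flag. */
struct mailbox {
	uint32_t payload;
	_Atomic uint32_t ready;
};

/* Assumed analogue of: dst = load_acquire((u32 *)(src + offset)) */
static inline uint32_t load_acquire_u32(_Atomic uint32_t *p)
{
	return atomic_load_explicit(p, memory_order_acquire);
}

/* Assumed analogue of: store_release((u32 *)(dst + offset), src) */
static inline void store_release_u32(_Atomic uint32_t *p, uint32_t v)
{
	atomic_store_explicit(p, v, memory_order_release);
}

/* Producer: write the payload, then publish it with a store-release. */
static inline void publish(struct mailbox *m, uint32_t value)
{
	m->payload = value;
	store_release_u32(&m->ready, 1);
}

/*
 * Consumer: a load-acquire that observes ready == 1 guarantees the
 * payload write made before the matching store-release is visible.
 */
static inline bool try_consume(struct mailbox *m, uint32_t *out)
{
	if (load_acquire_u32(&m->ready) == 0)
		return false;
	*out = m->payload;
	return true;
}
```

A relaxed load of ``ready`` would not give the consumer that guarantee
about ``payload``, which is why the acquire/release pairing matters
here.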