diff mbox series

[bpf-next,v2,15/15] docs/bpf: Add documentation for new instructions

Message ID 20230713060847.397969-1-yhs@fb.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series bpf: Support new insns from cpu v4 | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain_full }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 fail Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-8 success Logs for veristat
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 16 maintainers not CCed: void@manifault.com ndesaulniers@google.com kpsingh@kernel.org martin.lau@linux.dev john.fastabend@gmail.com sdf@google.com trix@redhat.com song@kernel.org corbet@lwn.net bpf@ietf.org nathan@kernel.org llvm@lists.linux.dev jolsa@kernel.org haoluo@google.com dthaler@microsoft.com linux-doc@vger.kernel.org
netdev/build_clang fail Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: 'devision' may be misspelled - perhaps 'division'?
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Yonghong Song July 13, 2023, 6:08 a.m. UTC
Add documentation in instruction-set.rst for new instruction encoding
and their corresponding operations. Also removed the question
related to 'no BPF_SDIV' in bpf_design_QA.rst since we have
BPF_SDIV insn now.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 Documentation/bpf/bpf_design_QA.rst           |   5 -
 .../bpf/standardization/instruction-set.rst   | 100 ++++++++++++------
 2 files changed, 66 insertions(+), 39 deletions(-)

Comments

Alexei Starovoitov July 14, 2023, 6:28 p.m. UTC | #1
On Wed, Jul 12, 2023 at 11:08:47PM -0700, Yonghong Song wrote:
> diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
> index 751e657973f0..367f426d09a1 100644
> --- a/Documentation/bpf/standardization/instruction-set.rst
> +++ b/Documentation/bpf/standardization/instruction-set.rst
> @@ -154,24 +154,27 @@ otherwise identical operations.
>  The 'code' field encodes the operation as below, where 'src' and 'dst' refer
>  to the values of the source and destination registers, respectively.
>  
> -========  =====  ==========================================================
> -code      value  description
> -========  =====  ==========================================================
> -BPF_ADD   0x00   dst += src
> -BPF_SUB   0x10   dst -= src
> -BPF_MUL   0x20   dst \*= src
> -BPF_DIV   0x30   dst = (src != 0) ? (dst / src) : 0
> -BPF_OR    0x40   dst \|= src
> -BPF_AND   0x50   dst &= src
> -BPF_LSH   0x60   dst <<= (src & mask)
> -BPF_RSH   0x70   dst >>= (src & mask)
> -BPF_NEG   0x80   dst = -src
> -BPF_MOD   0x90   dst = (src != 0) ? (dst % src) : dst
> -BPF_XOR   0xa0   dst ^= src
> -BPF_MOV   0xb0   dst = src
> -BPF_ARSH  0xc0   sign extending dst >>= (src & mask)
> -BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_ below)
> -========  =====  ==========================================================
> +========  =====  ============  ==========================================================
> +code      value  offset value  description

How about just 'offset' ?

> +========  =====  ============  ==========================================================
> +BPF_ADD   0x00   0             dst += src
> +BPF_SUB   0x10   0             dst -= src
> +BPF_MUL   0x20   0             dst \*= src
> +BPF_DIV   0x30   0             dst = (src != 0) ? (dst / src) : 0
> +BPF_SDIV  0x30   1             dst = (src != 0) ? (dst s/ src) : 0
> +BPF_OR    0x40   0             dst \|= src
> +BPF_AND   0x50   0             dst &= src
> +BPF_LSH   0x60   0             dst <<= (src & mask)
> +BPF_RSH   0x70   0             dst >>= (src & mask)
> +BPF_NEG   0x80   0             dst = -src
> +BPF_MOD   0x90   0             dst = (src != 0) ? (dst % src) : dst
> +BPF_SMOD  0x90   1             dst = (src != 0) ? (dst s% src) : dst
> +BPF_XOR   0xa0   0             dst ^= src
> +BPF_MOV   0xb0   0             dst = src
> +BPF_MOVSX 0xb0   8/16/32       dst = (s8,16,s32)src
> +BPF_ARSH  0xc0   0             sign extending dst >>= (src & mask)
> +BPF_END   0xd0   0             byte swap operations (see `Byte swap instructions`_ below)
> +========  =====  ============  ==========================================================
>  
>  Underflow and overflow are allowed during arithmetic operations, meaning
>  the 64-bit or 32-bit value will wrap. If eBPF program execution would
> @@ -198,11 +201,19 @@ where '(u32)' indicates that the upper 32 bits are zeroed.
>  
>    dst = dst ^ imm32
>  
> -Also note that the division and modulo operations are unsigned. Thus, for
> -``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
> -for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
> -interpreted as an unsigned 64-bit value. There are no instructions for
> -signed division or modulo.
> +Note that most instructions have instruction offset of 0. But three instructions
> +(BPF_SDIV, BPF_SMOD, BPF_MOVSX) have non-zero offset.
> +
> +The devision and modulo operations support both unsigned and signed flavors.
> +For unsigned operation (BPF_DIV and BPF_MOD), for ``BPF_ALU``, 'imm' is first
> +interpreted as an unsigned 32-bit value, whereas for ``BPF_ALU64``, 'imm' is
> +first sign extended to 64 bits and the result interpreted as an unsigned 64-bit
> +value.  For signed operation (BPF_SDIV and BPF_SMOD), for both ``BPF_ALU`` and
> +``BPF_ALU64``, 'imm' is interpreted as a signed value.

Probably worth clarifying that in case of S[DIV|MOD] | ALU64 the imm is sign
extended from 32 to 64 and interpreted as signed 64-bit.

> +
> +Instruction BPF_MOVSX does move operation with sign extension. For ``BPF_ALU``
> +mode, 8-bit and 16-bit sign extensions to 32-bit are supported. For ``BPF_ALU64``,
> +8-bit, 16-bit and 32-bit sign extenstions to 64-bit are supported.

How about:

Instruction BPF_MOVSX does move operation with sign extension. 
BPF_ALU | MOVSX sign extendes 8-bit and 16-bit into 32-bit and upper 32-bit are zeroed.
BPF_ALU64 | MOVSX sign extends 8-bit, 16-bit and 32-bit into 64-bit.

>  
> +``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16 means::
> +
> +  dst = bswap16(dst)

Worth spelling out imm 32 and 64 too ?

>  
> +The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
> +instructions that transfer data between a register and memory.
> +
> +``BPF_MEMSX | <size> | BPF_LDX`` means::
> +
> +  dst = *(sign-extension size *) (src + offset)
> +

How about:

``BPF_MEM | <size> | BPF_LDX`` means::

  dst = *(unsigned size *) (src + offset)

``BPF_MEMSX | <size> | BPF_LDX`` means::

  dst = *(signed size *) (src + offset)
Yonghong Song July 14, 2023, 11:26 p.m. UTC | #2
On 7/14/23 11:28 AM, Alexei Starovoitov wrote:
> On Wed, Jul 12, 2023 at 11:08:47PM -0700, Yonghong Song wrote:
>> diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
>> index 751e657973f0..367f426d09a1 100644
>> --- a/Documentation/bpf/standardization/instruction-set.rst
>> +++ b/Documentation/bpf/standardization/instruction-set.rst
>> @@ -154,24 +154,27 @@ otherwise identical operations.
>>   The 'code' field encodes the operation as below, where 'src' and 'dst' refer
>>   to the values of the source and destination registers, respectively.
>>   
>> -========  =====  ==========================================================
>> -code      value  description
>> -========  =====  ==========================================================
>> -BPF_ADD   0x00   dst += src
>> -BPF_SUB   0x10   dst -= src
>> -BPF_MUL   0x20   dst \*= src
>> -BPF_DIV   0x30   dst = (src != 0) ? (dst / src) : 0
>> -BPF_OR    0x40   dst \|= src
>> -BPF_AND   0x50   dst &= src
>> -BPF_LSH   0x60   dst <<= (src & mask)
>> -BPF_RSH   0x70   dst >>= (src & mask)
>> -BPF_NEG   0x80   dst = -src
>> -BPF_MOD   0x90   dst = (src != 0) ? (dst % src) : dst
>> -BPF_XOR   0xa0   dst ^= src
>> -BPF_MOV   0xb0   dst = src
>> -BPF_ARSH  0xc0   sign extending dst >>= (src & mask)
>> -BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_ below)
>> -========  =====  ==========================================================
>> +========  =====  ============  ==========================================================
>> +code      value  offset value  description
> 
> How about just 'offset' ?

Ack.

> 
>> +========  =====  ============  ==========================================================
>> +BPF_ADD   0x00   0             dst += src
>> +BPF_SUB   0x10   0             dst -= src
>> +BPF_MUL   0x20   0             dst \*= src
>> +BPF_DIV   0x30   0             dst = (src != 0) ? (dst / src) : 0
>> +BPF_SDIV  0x30   1             dst = (src != 0) ? (dst s/ src) : 0
>> +BPF_OR    0x40   0             dst \|= src
>> +BPF_AND   0x50   0             dst &= src
>> +BPF_LSH   0x60   0             dst <<= (src & mask)
>> +BPF_RSH   0x70   0             dst >>= (src & mask)
>> +BPF_NEG   0x80   0             dst = -src
>> +BPF_MOD   0x90   0             dst = (src != 0) ? (dst % src) : dst
>> +BPF_SMOD  0x90   1             dst = (src != 0) ? (dst s% src) : dst
>> +BPF_XOR   0xa0   0             dst ^= src
>> +BPF_MOV   0xb0   0             dst = src
>> +BPF_MOVSX 0xb0   8/16/32       dst = (s8,16,s32)src
>> +BPF_ARSH  0xc0   0             sign extending dst >>= (src & mask)
>> +BPF_END   0xd0   0             byte swap operations (see `Byte swap instructions`_ below)
>> +========  =====  ============  ==========================================================
>>   
>>   Underflow and overflow are allowed during arithmetic operations, meaning
>>   the 64-bit or 32-bit value will wrap. If eBPF program execution would
>> @@ -198,11 +201,19 @@ where '(u32)' indicates that the upper 32 bits are zeroed.
>>   
>>     dst = dst ^ imm32
>>   
>> -Also note that the division and modulo operations are unsigned. Thus, for
>> -``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
>> -for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
>> -interpreted as an unsigned 64-bit value. There are no instructions for
>> -signed division or modulo.
>> +Note that most instructions have instruction offset of 0. But three instructions
>> +(BPF_SDIV, BPF_SMOD, BPF_MOVSX) have non-zero offset.
>> +
>> +The devision and modulo operations support both unsigned and signed flavors.
>> +For unsigned operation (BPF_DIV and BPF_MOD), for ``BPF_ALU``, 'imm' is first
>> +interpreted as an unsigned 32-bit value, whereas for ``BPF_ALU64``, 'imm' is
>> +first sign extended to 64 bits and the result interpreted as an unsigned 64-bit
>> +value.  For signed operation (BPF_SDIV and BPF_SMOD), for both ``BPF_ALU`` and
>> +``BPF_ALU64``, 'imm' is interpreted as a signed value.
> 
> Probably worth clarifying that in case of S[DIV|MOD] | ALU64 the imm is sign
> extended from 32 to 64 and interpreted as signed 64-bit.

I thought this before but think "'imm' is interpreted as a signed value"
might be good enough since in this case BPF_ALU and BPF_ALU64 should 
have the same signed 'imm' value. But what you suggested will make it
more clearer. Will do.

> 
>> +
>> +Instruction BPF_MOVSX does move operation with sign extension. For ``BPF_ALU``
>> +mode, 8-bit and 16-bit sign extensions to 32-bit are supported. For ``BPF_ALU64``,
>> +8-bit, 16-bit and 32-bit sign extenstions to 64-bit are supported.
> 
> How about:
> 
> Instruction BPF_MOVSX does move operation with sign extension.
> BPF_ALU | MOVSX sign extendes 8-bit and 16-bit into 32-bit and upper 32-bit are zeroed.
> BPF_ALU64 | MOVSX sign extends 8-bit, 16-bit and 32-bit into 64-bit.

In previous doc, we already says that BPF_ALU operation has upper
32-bit zeroed. But since this sign extension is special, agree it
is worthwhile to mention upper 32-bit zeroed explictly.

> 
>>   
>> +``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16 means::
>> +
>> +  dst = bswap16(dst)
> 
> Worth spelling out imm 32 and 64 too ?

Will do.

> 
>>   
>> +The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
>> +instructions that transfer data between a register and memory.
>> +
>> +``BPF_MEMSX | <size> | BPF_LDX`` means::
>> +
>> +  dst = *(sign-extension size *) (src + offset)
>> +
> 
> How about:
> 
> ``BPF_MEM | <size> | BPF_LDX`` means::
> 
>    dst = *(unsigned size *) (src + offset)
> 
> ``BPF_MEMSX | <size> | BPF_LDX`` means::
> 
>    dst = *(signed size *) (src + offset)

Will do.
>
Dave Thaler July 14, 2023, 11:33 p.m. UTC | #3
Yonghong Song <yhs@fb.com> wrote:
> Add documentation in instruction-set.rst for new instruction encoding and
> their corresponding operations. Also removed the question related to 'no
> BPF_SDIV' in bpf_design_QA.rst since we have BPF_SDIV insn now.

Why did you choose to differentiate the instruction by offset instead of using a separate
opcode value?  I don't think there's any other instructions that do so, and there's spare
opcode values as far as I can see.

Using a separate offset works but would end up requiring another column in the IANA
registry assuming we have one.  So why the extra complexity and inconsistency
introduced now?

Dave
Dave Thaler July 14, 2023, 11:34 p.m. UTC | #4
> -----Original Message-----
> From: Yonghong Song <yhs@fb.com>
> Sent: Wednesday, July 12, 2023 11:09 PM
> To: bpf@vger.kernel.org
> Cc: Alexei Starovoitov <ast@kernel.org>; Andrii Nakryiko
> <andrii@kernel.org>; Daniel Borkmann <daniel@iogearbox.net>; Fangrui
> Song <maskray@google.com>; kernel-team@fb.com
> Subject: [PATCH bpf-next v2 15/15] docs/bpf: Add documentation for new
> instructions
> 
> Add documentation in instruction-set.rst for new instruction encoding and
> their corresponding operations. Also removed the question related to 'no
> BPF_SDIV' in bpf_design_QA.rst since we have BPF_SDIV insn now.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  Documentation/bpf/bpf_design_QA.rst           |   5 -
>  .../bpf/standardization/instruction-set.rst   | 100 ++++++++++++------
>  2 files changed, 66 insertions(+), 39 deletions(-)
> 
> diff --git a/Documentation/bpf/bpf_design_QA.rst
> b/Documentation/bpf/bpf_design_QA.rst
> index 38372a956d65..eb19c945f4d5 100644
> --- a/Documentation/bpf/bpf_design_QA.rst
> +++ b/Documentation/bpf/bpf_design_QA.rst
> @@ -140,11 +140,6 @@ A: Because if we picked one-to-one relationship to
> x64 it would have made  it more complicated to support on arm64 and other
> archs. Also it  needs div-by-zero runtime check.
> 
> -Q: Why there is no BPF_SDIV for signed divide operation?
> -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> -A: Because it would be rarely used. llvm errors in such case and -prints a
> suggestion to use unsigned divide instead.
> -
>  Q: Why BPF has implicit prologue and epilogue?
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  A: Because architectures like sparc have register windows and in general diff
> --git a/Documentation/bpf/standardization/instruction-set.rst
> b/Documentation/bpf/standardization/instruction-set.rst
> index 751e657973f0..367f426d09a1 100644
> --- a/Documentation/bpf/standardization/instruction-set.rst
> +++ b/Documentation/bpf/standardization/instruction-set.rst
> @@ -154,24 +154,27 @@ otherwise identical operations.
>  The 'code' field encodes the operation as below, where 'src' and 'dst' refer  to
> the values of the source and destination registers, respectively.
> 
> -========  =====
> ==========================================================
> -code      value  description
> -========  =====
> ==========================================================
> -BPF_ADD   0x00   dst += src
> -BPF_SUB   0x10   dst -= src
> -BPF_MUL   0x20   dst \*= src
> -BPF_DIV   0x30   dst = (src != 0) ? (dst / src) : 0
> -BPF_OR    0x40   dst \|= src
> -BPF_AND   0x50   dst &= src
> -BPF_LSH   0x60   dst <<= (src & mask)
> -BPF_RSH   0x70   dst >>= (src & mask)
> -BPF_NEG   0x80   dst = -src
> -BPF_MOD   0x90   dst = (src != 0) ? (dst % src) : dst
> -BPF_XOR   0xa0   dst ^= src
> -BPF_MOV   0xb0   dst = src
> -BPF_ARSH  0xc0   sign extending dst >>= (src & mask)
> -BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_
> below)
> -========  =====
> ==========================================================
> +========  =====  ============
> ==========================================================
> +code      value  offset value  description
> +========  =====  ============
> ==========================================================
> +BPF_ADD   0x00   0             dst += src
> +BPF_SUB   0x10   0             dst -= src
> +BPF_MUL   0x20   0             dst \*= src
> +BPF_DIV   0x30   0             dst = (src != 0) ? (dst / src) : 0
> +BPF_SDIV  0x30   1             dst = (src != 0) ? (dst s/ src) : 0
> +BPF_OR    0x40   0             dst \|= src
> +BPF_AND   0x50   0             dst &= src
> +BPF_LSH   0x60   0             dst <<= (src & mask)
> +BPF_RSH   0x70   0             dst >>= (src & mask)
> +BPF_NEG   0x80   0             dst = -src
> +BPF_MOD   0x90   0             dst = (src != 0) ? (dst % src) : dst
> +BPF_SMOD  0x90   1             dst = (src != 0) ? (dst s% src) : dst
> +BPF_XOR   0xa0   0             dst ^= src
> +BPF_MOV   0xb0   0             dst = src
> +BPF_MOVSX 0xb0   8/16/32       dst = (s8,16,s32)src
> +BPF_ARSH  0xc0   0             sign extending dst >>= (src & mask)
> +BPF_END   0xd0   0             byte swap operations (see `Byte swap
> instructions`_ below)
> +========  =====  ============
> +==========================================================
> 
>  Underflow and overflow are allowed during arithmetic operations, meaning
> the 64-bit or 32-bit value will wrap. If eBPF program execution would @@ -
> 198,11 +201,19 @@ where '(u32)' indicates that the upper 32 bits are
> zeroed.
> 
>    dst = dst ^ imm32
> 
> -Also note that the division and modulo operations are unsigned. Thus, for -
> ``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas -
> for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result -
> interpreted as an unsigned 64-bit value. There are no instructions for -signed
> division or modulo.
> +Note that most instructions have instruction offset of 0. But three
> +instructions (BPF_SDIV, BPF_SMOD, BPF_MOVSX) have non-zero offset.
> +
> +The devision and modulo operations support both unsigned and signed
> flavors.
> +For unsigned operation (BPF_DIV and BPF_MOD), for ``BPF_ALU``, 'imm' is
> +first interpreted as an unsigned 32-bit value, whereas for
> +``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
> +interpreted as an unsigned 64-bit value.  For signed operation
> +(BPF_SDIV and BPF_SMOD), for both ``BPF_ALU`` and ``BPF_ALU64``, 'imm'
> is interpreted as a signed value.
> +
> +Instruction BPF_MOVSX does move operation with sign extension. For
> +``BPF_ALU`` mode, 8-bit and 16-bit sign extensions to 32-bit are
> +supported. For ``BPF_ALU64``, 8-bit, 16-bit and 32-bit sign extenstions to
> 64-bit are supported.
> 
>  Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
> for 32-bit operations.
> @@ -210,21 +221,23 @@ for 32-bit operations.
>  Byte swap instructions
>  ~~~~~~~~~~~~~~~~~~~~~~
> 
> -The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
> -'code' field of ``BPF_END``.
> +The byte swap instructions use instruction classes of ``BPF_ALU`` and
> +``BPF_ALU64`` and a 4-bit 'code' field of ``BPF_END``.
> 
>  The byte swap instructions operate on the destination register  only and do
> not use a separate source register or immediate value.
> 
> -The 1-bit source operand field in the opcode is used to select what byte -
> order the operation convert from or to:
> +For ``BPF_ALU``, the 1-bit source operand field in the opcode is used
> +to select what byte order the operation convert from or to. For
> +``BPF_ALU64``, the 1-bit source operand field in the opcode is not used.
> 
> -=========  =====
> =================================================
> -source     value  description
> -=========  =====
> =================================================
> -BPF_TO_LE  0x00   convert between host byte order and little endian
> -BPF_TO_BE  0x08   convert between host byte order and big endian
> -=========  =====
> =================================================
> +=========  =========  =====
> =================================================
> +class      source     value  description
> +=========  =========  =====
> =================================================
> +BPF_ALU    BPF_TO_LE  0x00   convert between host byte order and little
> endian
> +BPF_ALU    BPF_TO_BE  0x08   convert between host byte order and big
> endian
> +BPF_ALU64  BPF_TO_LE  0x00   do byte swap unconditionally
> +=========  =========  =====
> +=================================================
> 
>  The 'imm' field encodes the width of the swap operations.  The following
> widths  are supported: 16, 32 and 64.
> @@ -239,6 +252,10 @@ Examples:
> 
>    dst = htobe64(dst)
> 
> +``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16 means::
> +
> +  dst = bswap16(dst)
> +
>  Jump instructions
>  -----------------
> 
> @@ -249,7 +266,8 @@ The 'code' field encodes the operation as below:
>  ========  =====  ===  ===========================================
> =========================================
>  code      value  src  description                                  notes
>  ========  =====  ===  ===========================================
> =========================================
> -BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP only
> +BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP class
> +BPF_JA    0x0    0x0  PC += imm                                    BPF_JMP32 class
>  BPF_JEQ   0x1    any  PC += offset if dst == src
>  BPF_JGT   0x2    any  PC += offset if dst > src                    unsigned
>  BPF_JGE   0x3    any  PC += offset if dst >= src                   unsigned
> @@ -278,6 +296,10 @@ Example:
> 
>  where 's>=' indicates a signed '>=' comparison.
> 
> +Note there are two flavors of BPF_JA instrions. BPF_JMP class permits
> +16-bit jump offset while
> +BPF_JMP32 permits 32-bit jump offset. A >16bit conditional jmp can be
> +converted to a <16bit conditional jmp plus a 32-bit unconditional jump.
> +
>  Helper functions
>  ~~~~~~~~~~~~~~~~
> 
> @@ -320,6 +342,7 @@ The mode modifier is one of:
>    BPF_ABS        0x20   legacy BPF packet access (absolute)   `Legacy BPF
> Packet access instructions`_
>    BPF_IND        0x40   legacy BPF packet access (indirect)   `Legacy BPF Packet
> access instructions`_
>    BPF_MEM        0x60   regular load and store operations     `Regular load and
> store operations`_
> +  BPF_MEMSX      0x80   sign-extension load operations        `Sign-extension
> load operations`_
>    BPF_ATOMIC     0xc0   atomic operations                     `Atomic operations`_
>    =============  =====  ====================================
> =============
> 
> @@ -354,6 +377,15 @@ instructions that transfer data between a register
> and memory.
> 
>  Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
> 
> +The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
> +instructions that transfer data between a register and memory.
> +
> +``BPF_MEMSX | <size> | BPF_LDX`` means::
> +
> +  dst = *(sign-extension size *) (src + offset)
> +
> +Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``.
> +
>  Atomic operations
>  -----------------
> 
> --
> 2.34.1
> 

Adding bpf@ietf.org to the To line since this is proposing a chance to the ISA draft.

Dave
Alexei Starovoitov July 15, 2023, 12:23 a.m. UTC | #5
On Fri, Jul 14, 2023 at 4:33 PM Dave Thaler <dthaler@microsoft.com> wrote:
>
> Yonghong Song <yhs@fb.com> wrote:
> > Add documentation in instruction-set.rst for new instruction encoding and
> > their corresponding operations. Also removed the question related to 'no
> > BPF_SDIV' in bpf_design_QA.rst since we have BPF_SDIV insn now.
>
> Why did you choose to differentiate the instruction by offset instead of using a separate
> opcode value?  I don't think there's any other instructions that do so, and there's spare
> opcode values as far as I can see.
>
> Using a separate offset works but would end up requiring another column in the IANA
> registry assuming we have one.  So why the extra complexity and inconsistency
> introduced now?

"another column in IANA" is the last thing to worry about.
diff mbox series

Patch

diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst
index 38372a956d65..eb19c945f4d5 100644
--- a/Documentation/bpf/bpf_design_QA.rst
+++ b/Documentation/bpf/bpf_design_QA.rst
@@ -140,11 +140,6 @@  A: Because if we picked one-to-one relationship to x64 it would have made
 it more complicated to support on arm64 and other archs. Also it
 needs div-by-zero runtime check.
 
-Q: Why there is no BPF_SDIV for signed divide operation?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-A: Because it would be rarely used. llvm errors in such case and
-prints a suggestion to use unsigned divide instead.
-
 Q: Why BPF has implicit prologue and epilogue?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 A: Because architectures like sparc have register windows and in general
diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
index 751e657973f0..367f426d09a1 100644
--- a/Documentation/bpf/standardization/instruction-set.rst
+++ b/Documentation/bpf/standardization/instruction-set.rst
@@ -154,24 +154,27 @@  otherwise identical operations.
 The 'code' field encodes the operation as below, where 'src' and 'dst' refer
 to the values of the source and destination registers, respectively.
 
-========  =====  ==========================================================
-code      value  description
-========  =====  ==========================================================
-BPF_ADD   0x00   dst += src
-BPF_SUB   0x10   dst -= src
-BPF_MUL   0x20   dst \*= src
-BPF_DIV   0x30   dst = (src != 0) ? (dst / src) : 0
-BPF_OR    0x40   dst \|= src
-BPF_AND   0x50   dst &= src
-BPF_LSH   0x60   dst <<= (src & mask)
-BPF_RSH   0x70   dst >>= (src & mask)
-BPF_NEG   0x80   dst = -src
-BPF_MOD   0x90   dst = (src != 0) ? (dst % src) : dst
-BPF_XOR   0xa0   dst ^= src
-BPF_MOV   0xb0   dst = src
-BPF_ARSH  0xc0   sign extending dst >>= (src & mask)
-BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_ below)
-========  =====  ==========================================================
+========  =====  ============  ==========================================================
+code      value  offset value  description
+========  =====  ============  ==========================================================
+BPF_ADD   0x00   0             dst += src
+BPF_SUB   0x10   0             dst -= src
+BPF_MUL   0x20   0             dst \*= src
+BPF_DIV   0x30   0             dst = (src != 0) ? (dst / src) : 0
+BPF_SDIV  0x30   1             dst = (src != 0) ? (dst s/ src) : 0
+BPF_OR    0x40   0             dst \|= src
+BPF_AND   0x50   0             dst &= src
+BPF_LSH   0x60   0             dst <<= (src & mask)
+BPF_RSH   0x70   0             dst >>= (src & mask)
+BPF_NEG   0x80   0             dst = -src
+BPF_MOD   0x90   0             dst = (src != 0) ? (dst % src) : dst
+BPF_SMOD  0x90   1             dst = (src != 0) ? (dst s% src) : dst
+BPF_XOR   0xa0   0             dst ^= src
+BPF_MOV   0xb0   0             dst = src
+BPF_MOVSX 0xb0   8/16/32       dst = (s8,16,s32)src
+BPF_ARSH  0xc0   0             sign extending dst >>= (src & mask)
+BPF_END   0xd0   0             byte swap operations (see `Byte swap instructions`_ below)
+========  =====  ============  ==========================================================
 
 Underflow and overflow are allowed during arithmetic operations, meaning
 the 64-bit or 32-bit value will wrap. If eBPF program execution would
@@ -198,11 +201,19 @@  where '(u32)' indicates that the upper 32 bits are zeroed.
 
   dst = dst ^ imm32
 
-Also note that the division and modulo operations are unsigned. Thus, for
-``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
-for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
-interpreted as an unsigned 64-bit value. There are no instructions for
-signed division or modulo.
+Note that most instructions have instruction offset of 0. But three instructions
+(BPF_SDIV, BPF_SMOD, BPF_MOVSX) have non-zero offset.
+
+The devision and modulo operations support both unsigned and signed flavors.
+For unsigned operation (BPF_DIV and BPF_MOD), for ``BPF_ALU``, 'imm' is first
+interpreted as an unsigned 32-bit value, whereas for ``BPF_ALU64``, 'imm' is
+first sign extended to 64 bits and the result interpreted as an unsigned 64-bit
+value.  For signed operation (BPF_SDIV and BPF_SMOD), for both ``BPF_ALU`` and
+``BPF_ALU64``, 'imm' is interpreted as a signed value.
+
+Instruction BPF_MOVSX does move operation with sign extension. For ``BPF_ALU``
+mode, 8-bit and 16-bit sign extensions to 32-bit are supported. For ``BPF_ALU64``,
+8-bit, 16-bit and 32-bit sign extenstions to 64-bit are supported.
 
 Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
 for 32-bit operations.
@@ -210,21 +221,23 @@  for 32-bit operations.
 Byte swap instructions
 ~~~~~~~~~~~~~~~~~~~~~~
 
-The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
-'code' field of ``BPF_END``.
+The byte swap instructions use instruction classes of ``BPF_ALU`` and ``BPF_ALU64``
+and a 4-bit 'code' field of ``BPF_END``.
 
 The byte swap instructions operate on the destination register
 only and do not use a separate source register or immediate value.
 
-The 1-bit source operand field in the opcode is used to select what byte
-order the operation convert from or to:
+For ``BPF_ALU``, the 1-bit source operand field in the opcode is used to select what byte
+order the operation convert from or to. For ``BPF_ALU64``, the 1-bit source operand
+field in the opcode is not used.
 
-=========  =====  =================================================
-source     value  description
-=========  =====  =================================================
-BPF_TO_LE  0x00   convert between host byte order and little endian
-BPF_TO_BE  0x08   convert between host byte order and big endian
-=========  =====  =================================================
+=========  =========  =====  =================================================
+class      source     value  description
+=========  =========  =====  =================================================
+BPF_ALU    BPF_TO_LE  0x00   convert between host byte order and little endian
+BPF_ALU    BPF_TO_BE  0x08   convert between host byte order and big endian
+BPF_ALU64  BPF_TO_LE  0x00   do byte swap unconditionally
+=========  =========  =====  =================================================
 
 The 'imm' field encodes the width of the swap operations.  The following widths
 are supported: 16, 32 and 64.
@@ -239,6 +252,10 @@  Examples:
 
   dst = htobe64(dst)
 
+``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16 means::
+
+  dst = bswap16(dst)
+
 Jump instructions
 -----------------
 
@@ -249,7 +266,8 @@  The 'code' field encodes the operation as below:
 ========  =====  ===  ===========================================  =========================================
 code      value  src  description                                  notes
 ========  =====  ===  ===========================================  =========================================
-BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP only
+BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP class
+BPF_JA    0x0    0x0  PC += imm                                    BPF_JMP32 class
 BPF_JEQ   0x1    any  PC += offset if dst == src
 BPF_JGT   0x2    any  PC += offset if dst > src                    unsigned
 BPF_JGE   0x3    any  PC += offset if dst >= src                   unsigned
@@ -278,6 +296,10 @@  Example:
 
 where 's>=' indicates a signed '>=' comparison.
 
+Note there are two flavors of BPF_JA instrions. BPF_JMP class permits 16-bit jump offset while
+BPF_JMP32 permits 32-bit jump offset. A >16bit conditional jmp can be converted to a <16bit
+conditional jmp plus a 32-bit unconditional jump.
+
 Helper functions
 ~~~~~~~~~~~~~~~~
 
@@ -320,6 +342,7 @@  The mode modifier is one of:
   BPF_ABS        0x20   legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
   BPF_IND        0x40   legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
   BPF_MEM        0x60   regular load and store operations     `Regular load and store operations`_
+  BPF_MEMSX      0x80   sign-extension load operations        `Sign-extension load operations`_
   BPF_ATOMIC     0xc0   atomic operations                     `Atomic operations`_
   =============  =====  ====================================  =============
 
@@ -354,6 +377,15 @@  instructions that transfer data between a register and memory.
 
 Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
 
+The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
+instructions that transfer data between a register and memory.
+
+``BPF_MEMSX | <size> | BPF_LDX`` means::
+
+  dst = *(sign-extension size *) (src + offset)
+
+Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``.
+
 Atomic operations
 -----------------