diff mbox series

[bpf-next,v2] bpf, docs: Add explanation of endianness

Message ID 20230220223742.1347-1-dthaler1968@googlemail.com (mailing list archive)
State Accepted
Commit 746ce767128598711a00d8df5713d4c3b3d9e9a7
Delegated to: BPF
Headers show
Series [bpf-next,v2] bpf, docs: Add explanation of endianness | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 13 maintainers not CCed: john.fastabend@gmail.com daniel@iogearbox.net andrii@kernel.org sdf@google.com corbet@lwn.net linux-doc@vger.kernel.org jolsa@kernel.org song@kernel.org martin.lau@linux.dev haoluo@google.com yhs@fb.com ast@kernel.org kpsingh@kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 28 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-PR fail merge-conflict
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix

Commit Message

Dave Thaler Feb. 20, 2023, 10:37 p.m. UTC
From: Dave Thaler <dthaler@microsoft.com>

Document the discussion from the email thread on the IETF bpf list,
where it was explained that the raw format varies by endianness
of the processor.

Signed-off-by: Dave Thaler <dthaler@microsoft.com>

Acked-by: David Vernet <void@manifault.com>
---

V1 -> V2: rebased on top of latest master
---
 Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org Feb. 22, 2023, 10:10 p.m. UTC | #1
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 20 Feb 2023 22:37:42 +0000 you wrote:
> From: Dave Thaler <dthaler@microsoft.com>
> 
> Document the discussion from the email thread on the IETF bpf list,
> where it was explained that the raw format varies by endianness
> of the processor.
> 
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> 
> [...]

Here is the summary with links:
  - [bpf-next,v2] bpf, docs: Add explanation of endianness
    https://git.kernel.org/bpf/bpf-next/c/746ce7671285

You are awesome, thank you!
Alexei Starovoitov Feb. 22, 2023, 10:10 p.m. UTC | #2
On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
<dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
>
> From: Dave Thaler <dthaler@microsoft.com>
>
> Document the discussion from the email thread on the IETF bpf list,
> where it was explained that the raw format varies by endianness
> of the processor.
>
> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
>
> Acked-by: David Vernet <void@manifault.com>
> ---
>
> V1 -> V2: rebased on top of latest master
> ---
>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index af515de5fc3..1d473f060fa 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>    constant) value after the basic instruction for a total of 128 bits.
>
> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
> -bits and least significant bits, respectively:
> +The basic instruction encoding looks as follows for a little-endian processor,
> +where MSB and LSB mean the most significant bits and least significant bits,
> +respectively:
>
>  =============  =======  =======  =======  ============
>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
>  **opcode**
>    operation to perform
>
> +and as follows for a big-endian processor:
> +
> +=============  =======  ====================  ===============  ============
> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
> +=============  =======  ====================  ===============  ============
> +immediate      offset   destination register  source register  opcode
> +=============  =======  ====================  ===============  ============

I've changed it to:
imm            offset   dst_reg  src_reg  opcode

to match the little endian table,
but now one of the tables feels wrong.
The encoding is always done by applying C standard to the struct:
struct bpf_insn {
        __u8    code;           /* opcode */
        __u8    dst_reg:4;      /* dest register */
        __u8    src_reg:4;      /* source register */
        __s16   off;            /* signed offset */
        __s32   imm;            /* signed immediate constant */
};
I'm not sure how to express this clearly in the table.
Jose E. Marchesi Feb. 22, 2023, 11:23 p.m. UTC | #3
> On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
> <dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
>>
>> From: Dave Thaler <dthaler@microsoft.com>
>>
>> Document the discussion from the email thread on the IETF bpf list,
>> where it was explained that the raw format varies by endianness
>> of the processor.
>>
>> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
>>
>> Acked-by: David Vernet <void@manifault.com>
>> ---
>>
>> V1 -> V2: rebased on top of latest master
>> ---
>>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
>> index af515de5fc3..1d473f060fa 100644
>> --- a/Documentation/bpf/instruction-set.rst
>> +++ b/Documentation/bpf/instruction-set.rst
>> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
>>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>>    constant) value after the basic instruction for a total of 128 bits.
>>
>> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
>> -bits and least significant bits, respectively:
>> +The basic instruction encoding looks as follows for a little-endian processor,
>> +where MSB and LSB mean the most significant bits and least significant bits,
>> +respectively:
>>
>>  =============  =======  =======  =======  ============
>>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
>>  **opcode**
>>    operation to perform
>>
>> +and as follows for a big-endian processor:
>> +
>> +=============  =======  ====================  ===============  ============
>> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
>> +=============  =======  ====================  ===============  ============
>> +immediate      offset   destination register  source register  opcode
>> +=============  =======  ====================  ===============  ============
>
> I've changed it to:
> imm            offset   dst_reg  src_reg  opcode
>
> to match the little endian table,
> but now one of the tables feels wrong.
> The encoding is always done by applying C standard to the struct:
> struct bpf_insn {
>         __u8    code;           /* opcode */
>         __u8    dst_reg:4;      /* dest register */
>         __u8    src_reg:4;      /* source register */
>         __s16   off;            /* signed offset */
>         __s32   imm;            /* signed immediate constant */
> };
> I'm not sure how to express this clearly in the table.

Perhaps it would be simpler to document how the instruction bytes are
stored (be it in an ELF file or as bytes in a memory buffer to be loaded
into the kernel or some other BPF consumer) as opposed to how the
instructions look like once loaded (as a 64-bit word) by a little-endian
or big-endian kernel?

Stored little-endian BPF instructions:

  code src_reg dst_reg off imm

  foo-le.o:     file format elf64-bpfle

  0000000000000000 <.text>:
     0:   07 01 00 00 ef be ad de         r1 += 0xdeadbeef

Stored big-endian BPF instructions:

  code dst_reg src_reg off imm

  foo-be.o:     file format elf64-bpfbe

  0000000000000000 <.text>:
     0:   07 10 00 00 de ad be ef         r1 += 0xdeadbeef

i.e. in the stored bytes the code always comes first, then the
registers, then the offset, then the immediate, regardless of
endianness.

This may be easier to understand by implementors looking to generate
and/or consume bytes conforming BPF instructions.
Alexei Starovoitov Feb. 23, 2023, 1:56 a.m. UTC | #4
On Wed, Feb 22, 2023 at 3:23 PM Jose E. Marchesi
<jose.marchesi@oracle.com> wrote:
>
>
> > On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
> > <dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
> >>
> >> From: Dave Thaler <dthaler@microsoft.com>
> >>
> >> Document the discussion from the email thread on the IETF bpf list,
> >> where it was explained that the raw format varies by endianness
> >> of the processor.
> >>
> >> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> >>
> >> Acked-by: David Vernet <void@manifault.com>
> >> ---
> >>
> >> V1 -> V2: rebased on top of latest master
> >> ---
> >>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
> >>  1 file changed, 14 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> >> index af515de5fc3..1d473f060fa 100644
> >> --- a/Documentation/bpf/instruction-set.rst
> >> +++ b/Documentation/bpf/instruction-set.rst
> >> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
> >>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
> >>    constant) value after the basic instruction for a total of 128 bits.
> >>
> >> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
> >> -bits and least significant bits, respectively:
> >> +The basic instruction encoding looks as follows for a little-endian processor,
> >> +where MSB and LSB mean the most significant bits and least significant bits,
> >> +respectively:
> >>
> >>  =============  =======  =======  =======  ============
> >>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
> >> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
> >>  **opcode**
> >>    operation to perform
> >>
> >> +and as follows for a big-endian processor:
> >> +
> >> +=============  =======  ====================  ===============  ============
> >> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
> >> +=============  =======  ====================  ===============  ============
> >> +immediate      offset   destination register  source register  opcode
> >> +=============  =======  ====================  ===============  ============
> >
> > I've changed it to:
> > imm            offset   dst_reg  src_reg  opcode
> >
> > to match the little endian table,
> > but now one of the tables feels wrong.
> > The encoding is always done by applying C standard to the struct:
> > struct bpf_insn {
> >         __u8    code;           /* opcode */
> >         __u8    dst_reg:4;      /* dest register */
> >         __u8    src_reg:4;      /* source register */
> >         __s16   off;            /* signed offset */
> >         __s32   imm;            /* signed immediate constant */
> > };
> > I'm not sure how to express this clearly in the table.
>
> Perhaps it would be simpler to document how the instruction bytes are
> stored (be it in an ELF file or as bytes in a memory buffer to be loaded
> into the kernel or some other BPF consumer) as opposed to how the
> instructions look like once loaded (as a 64-bit word) by a little-endian
> or big-endian kernel?
>
> Stored little-endian BPF instructions:
>
>   code src_reg dst_reg off imm
>
>   foo-le.o:     file format elf64-bpfle
>
>   0000000000000000 <.text>:
>      0:   07 01 00 00 ef be ad de         r1 += 0xdeadbeef
>
> Stored big-endian BPF instructions:
>
>   code dst_reg src_reg off imm
>
>   foo-be.o:     file format elf64-bpfbe
>
>   0000000000000000 <.text>:
>      0:   07 10 00 00 de ad be ef         r1 += 0xdeadbeef
>
> i.e. in the stored bytes the code always comes first, then the
> registers, then the offset, then the immediate, regardless of
> endianness.
>
> This may be easier to understand by implementors looking to generate
> and/or consume bytes conforming BPF instructions.

+1
I like this format more as well.
Maybe we can drop the table and use a diagram of a kind ?

opcode src dst offset imm          assembly
07     0   1   00 00  ef be ad de  r1 += 0xdeadbeef // little
07     1   0   00 00  de ad be ef  r1 += 0xdeadbeef // big
Jose E. Marchesi Feb. 23, 2023, 1:18 p.m. UTC | #5
> On Wed, Feb 22, 2023 at 3:23 PM Jose E. Marchesi
> <jose.marchesi@oracle.com> wrote:
>>
>>
>> > On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
>> > <dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
>> >>
>> >> From: Dave Thaler <dthaler@microsoft.com>
>> >>
>> >> Document the discussion from the email thread on the IETF bpf list,
>> >> where it was explained that the raw format varies by endianness
>> >> of the processor.
>> >>
>> >> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
>> >>
>> >> Acked-by: David Vernet <void@manifault.com>
>> >> ---
>> >>
>> >> V1 -> V2: rebased on top of latest master
>> >> ---
>> >>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
>> >>  1 file changed, 14 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
>> >> index af515de5fc3..1d473f060fa 100644
>> >> --- a/Documentation/bpf/instruction-set.rst
>> >> +++ b/Documentation/bpf/instruction-set.rst
>> >> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
>> >>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>> >>    constant) value after the basic instruction for a total of 128 bits.
>> >>
>> >> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
>> >> -bits and least significant bits, respectively:
>> >> +The basic instruction encoding looks as follows for a little-endian processor,
>> >> +where MSB and LSB mean the most significant bits and least significant bits,
>> >> +respectively:
>> >>
>> >>  =============  =======  =======  =======  ============
>> >>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> >> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
>> >>  **opcode**
>> >>    operation to perform
>> >>
>> >> +and as follows for a big-endian processor:
>> >> +
>> >> +=============  =======  ====================  ===============  ============
>> >> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
>> >> +=============  =======  ====================  ===============  ============
>> >> +immediate      offset   destination register  source register  opcode
>> >> +=============  =======  ====================  ===============  ============
>> >
>> > I've changed it to:
>> > imm            offset   dst_reg  src_reg  opcode
>> >
>> > to match the little endian table,
>> > but now one of the tables feels wrong.
>> > The encoding is always done by applying C standard to the struct:
>> > struct bpf_insn {
>> >         __u8    code;           /* opcode */
>> >         __u8    dst_reg:4;      /* dest register */
>> >         __u8    src_reg:4;      /* source register */
>> >         __s16   off;            /* signed offset */
>> >         __s32   imm;            /* signed immediate constant */
>> > };
>> > I'm not sure how to express this clearly in the table.
>>
>> Perhaps it would be simpler to document how the instruction bytes are
>> stored (be it in an ELF file or as bytes in a memory buffer to be loaded
>> into the kernel or some other BPF consumer) as opposed to how the
>> instructions look like once loaded (as a 64-bit word) by a little-endian
>> or big-endian kernel?
>>
>> Stored little-endian BPF instructions:
>>
>>   code src_reg dst_reg off imm
>>
>>   foo-le.o:     file format elf64-bpfle
>>
>>   0000000000000000 <.text>:
>>      0:   07 01 00 00 ef be ad de         r1 += 0xdeadbeef
>>
>> Stored big-endian BPF instructions:
>>
>>   code dst_reg src_reg off imm
>>
>>   foo-be.o:     file format elf64-bpfbe
>>
>>   0000000000000000 <.text>:
>>      0:   07 10 00 00 de ad be ef         r1 += 0xdeadbeef
>>
>> i.e. in the stored bytes the code always comes first, then the
>> registers, then the offset, then the immediate, regardless of
>> endianness.
>>
>> This may be easier to understand by implementors looking to generate
>> and/or consume bytes conforming BPF instructions.
>
> +1
> I like this format more as well.
> Maybe we can drop the table and use a diagram of a kind ?
>
> opcode src dst offset imm          assembly
> 07     0   1   00 00  ef be ad de  r1 += 0xdeadbeef // little
> 07     1   0   00 00  de ad be ef  r1 += 0xdeadbeef // big

Good idea.  What about something like this:

opcode         offset imm          assembly
       src dst
07     0   1   00 00  44 33 22 11  r1 += 0x11223344 // little
       dst src
07     1   0   00 00  11 22 33 44  r1 += 0x11223344 // big

I changed the immediate because 0xdeadbeef is negative and it may be
confusing in the assembly part: strictly it would be r1 += -559038737.
Alexei Starovoitov Feb. 23, 2023, 4:40 p.m. UTC | #6
On Thu, Feb 23, 2023 at 5:19 AM Jose E. Marchesi <jemarch@gnu.org> wrote:
>
>
> > On Wed, Feb 22, 2023 at 3:23 PM Jose E. Marchesi
> > <jose.marchesi@oracle.com> wrote:
> >>
> >>
> >> > On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
> >> > <dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
> >> >>
> >> >> From: Dave Thaler <dthaler@microsoft.com>
> >> >>
> >> >> Document the discussion from the email thread on the IETF bpf list,
> >> >> where it was explained that the raw format varies by endianness
> >> >> of the processor.
> >> >>
> >> >> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
> >> >>
> >> >> Acked-by: David Vernet <void@manifault.com>
> >> >> ---
> >> >>
> >> >> V1 -> V2: rebased on top of latest master
> >> >> ---
> >> >>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
> >> >>  1 file changed, 14 insertions(+), 2 deletions(-)
> >> >>
> >> >> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> >> >> index af515de5fc3..1d473f060fa 100644
> >> >> --- a/Documentation/bpf/instruction-set.rst
> >> >> +++ b/Documentation/bpf/instruction-set.rst
> >> >> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
> >> >>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
> >> >>    constant) value after the basic instruction for a total of 128 bits.
> >> >>
> >> >> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
> >> >> -bits and least significant bits, respectively:
> >> >> +The basic instruction encoding looks as follows for a little-endian processor,
> >> >> +where MSB and LSB mean the most significant bits and least significant bits,
> >> >> +respectively:
> >> >>
> >> >>  =============  =======  =======  =======  ============
> >> >>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
> >> >> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
> >> >>  **opcode**
> >> >>    operation to perform
> >> >>
> >> >> +and as follows for a big-endian processor:
> >> >> +
> >> >> +=============  =======  ====================  ===============  ============
> >> >> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
> >> >> +=============  =======  ====================  ===============  ============
> >> >> +immediate      offset   destination register  source register  opcode
> >> >> +=============  =======  ====================  ===============  ============
> >> >
> >> > I've changed it to:
> >> > imm            offset   dst_reg  src_reg  opcode
> >> >
> >> > to match the little endian table,
> >> > but now one of the tables feels wrong.
> >> > The encoding is always done by applying C standard to the struct:
> >> > struct bpf_insn {
> >> >         __u8    code;           /* opcode */
> >> >         __u8    dst_reg:4;      /* dest register */
> >> >         __u8    src_reg:4;      /* source register */
> >> >         __s16   off;            /* signed offset */
> >> >         __s32   imm;            /* signed immediate constant */
> >> > };
> >> > I'm not sure how to express this clearly in the table.
> >>
> >> Perhaps it would be simpler to document how the instruction bytes are
> >> stored (be it in an ELF file or as bytes in a memory buffer to be loaded
> >> into the kernel or some other BPF consumer) as opposed to how the
> >> instructions look like once loaded (as a 64-bit word) by a little-endian
> >> or big-endian kernel?
> >>
> >> Stored little-endian BPF instructions:
> >>
> >>   code src_reg dst_reg off imm
> >>
> >>   foo-le.o:     file format elf64-bpfle
> >>
> >>   0000000000000000 <.text>:
> >>      0:   07 01 00 00 ef be ad de         r1 += 0xdeadbeef
> >>
> >> Stored big-endian BPF instructions:
> >>
> >>   code dst_reg src_reg off imm
> >>
> >>   foo-be.o:     file format elf64-bpfbe
> >>
> >>   0000000000000000 <.text>:
> >>      0:   07 10 00 00 de ad be ef         r1 += 0xdeadbeef
> >>
> >> i.e. in the stored bytes the code always comes first, then the
> >> registers, then the offset, then the immediate, regardless of
> >> endianness.
> >>
> >> This may be easier to understand by implementors looking to generate
> >> and/or consume bytes conforming BPF instructions.
> >
> > +1
> > I like this format more as well.
> > Maybe we can drop the table and use a diagram of a kind ?
> >
> > opcode src dst offset imm          assembly
> > 07     0   1   00 00  ef be ad de  r1 += 0xdeadbeef // little
> > 07     1   0   00 00  de ad be ef  r1 += 0xdeadbeef // big
>
> Good idea.  What about something like this:
>
> opcode         offset imm          assembly
>        src dst
> 07     0   1   00 00  44 33 22 11  r1 += 0x11223344 // little
>        dst src
> 07     1   0   00 00  11 22 33 44  r1 += 0x11223344 // big
>
> I changed the immediate because 0xdeadbeef is negative and it may be
> confusing in the assembly part: strictly it would be r1 += -559038737.

Looks great to me. Do you want to send your first kernel patch? :)
Jose E. Marchesi Feb. 23, 2023, 4:42 p.m. UTC | #7
> On Thu, Feb 23, 2023 at 5:19 AM Jose E. Marchesi <jemarch@gnu.org> wrote:
>>
>>
>> > On Wed, Feb 22, 2023 at 3:23 PM Jose E. Marchesi
>> > <jose.marchesi@oracle.com> wrote:
>> >>
>> >>
>> >> > On Mon, Feb 20, 2023 at 2:37 PM Dave Thaler
>> >> > <dthaler1968=40googlemail.com@dmarc.ietf.org> wrote:
>> >> >>
>> >> >> From: Dave Thaler <dthaler@microsoft.com>
>> >> >>
>> >> >> Document the discussion from the email thread on the IETF bpf list,
>> >> >> where it was explained that the raw format varies by endianness
>> >> >> of the processor.
>> >> >>
>> >> >> Signed-off-by: Dave Thaler <dthaler@microsoft.com>
>> >> >>
>> >> >> Acked-by: David Vernet <void@manifault.com>
>> >> >> ---
>> >> >>
>> >> >> V1 -> V2: rebased on top of latest master
>> >> >> ---
>> >> >>  Documentation/bpf/instruction-set.rst | 16 ++++++++++++++--
>> >> >>  1 file changed, 14 insertions(+), 2 deletions(-)
>> >> >>
>> >> >> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
>> >> >> index af515de5fc3..1d473f060fa 100644
>> >> >> --- a/Documentation/bpf/instruction-set.rst
>> >> >> +++ b/Documentation/bpf/instruction-set.rst
>> >> >> @@ -38,8 +38,9 @@ eBPF has two instruction encodings:
>> >> >>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>> >> >>    constant) value after the basic instruction for a total of 128 bits.
>> >> >>
>> >> >> -The basic instruction encoding is as follows, where MSB and LSB mean the most significant
>> >> >> -bits and least significant bits, respectively:
>> >> >> +The basic instruction encoding looks as follows for a little-endian processor,
>> >> >> +where MSB and LSB mean the most significant bits and least significant bits,
>> >> >> +respectively:
>> >> >>
>> >> >>  =============  =======  =======  =======  ============
>> >> >>  32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> >> >> @@ -63,6 +64,17 @@ imm            offset   src_reg  dst_reg  opcode
>> >> >>  **opcode**
>> >> >>    operation to perform
>> >> >>
>> >> >> +and as follows for a big-endian processor:
>> >> >> +
>> >> >> +=============  =======  ====================  ===============  ============
>> >> >> +32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
>> >> >> +=============  =======  ====================  ===============  ============
>> >> >> +immediate      offset   destination register  source register  opcode
>> >> >> +=============  =======  ====================  ===============  ============
>> >> >
>> >> > I've changed it to:
>> >> > imm            offset   dst_reg  src_reg  opcode
>> >> >
>> >> > to match the little endian table,
>> >> > but now one of the tables feels wrong.
>> >> > The encoding is always done by applying C standard to the struct:
>> >> > struct bpf_insn {
>> >> >         __u8    code;           /* opcode */
>> >> >         __u8    dst_reg:4;      /* dest register */
>> >> >         __u8    src_reg:4;      /* source register */
>> >> >         __s16   off;            /* signed offset */
>> >> >         __s32   imm;            /* signed immediate constant */
>> >> > };
>> >> > I'm not sure how to express this clearly in the table.
>> >>
>> >> Perhaps it would be simpler to document how the instruction bytes are
>> >> stored (be it in an ELF file or as bytes in a memory buffer to be loaded
>> >> into the kernel or some other BPF consumer) as opposed to how the
>> >> instructions look like once loaded (as a 64-bit word) by a little-endian
>> >> or big-endian kernel?
>> >>
>> >> Stored little-endian BPF instructions:
>> >>
>> >>   code src_reg dst_reg off imm
>> >>
>> >>   foo-le.o:     file format elf64-bpfle
>> >>
>> >>   0000000000000000 <.text>:
>> >>      0:   07 01 00 00 ef be ad de         r1 += 0xdeadbeef
>> >>
>> >> Stored big-endian BPF instructions:
>> >>
>> >>   code dst_reg src_reg off imm
>> >>
>> >>   foo-be.o:     file format elf64-bpfbe
>> >>
>> >>   0000000000000000 <.text>:
>> >>      0:   07 10 00 00 de ad be ef         r1 += 0xdeadbeef
>> >>
>> >> i.e. in the stored bytes the code always comes first, then the
>> >> registers, then the offset, then the immediate, regardless of
>> >> endianness.
>> >>
>> >> This may be easier to understand by implementors looking to generate
>> >> and/or consume bytes conforming BPF instructions.
>> >
>> > +1
>> > I like this format more as well.
>> > Maybe we can drop the table and use a diagram of a kind ?
>> >
>> > opcode src dst offset imm          assembly
>> > 07     0   1   00 00  ef be ad de  r1 += 0xdeadbeef // little
>> > 07     1   0   00 00  de ad be ef  r1 += 0xdeadbeef // big
>>
>> Good idea.  What about something like this:
>>
>> opcode         offset imm          assembly
>>        src dst
>> 07     0   1   00 00  44 33 22 11  r1 += 0x11223344 // little
>>        dst src
>> 07     1   0   00 00  11 22 33 44  r1 += 0x11223344 // big
>>
>> I changed the immediate because 0xdeadbeef is negative and it may be
>> confusing in the assembly part: strictly it would be r1 += -559038737.
>
> Looks great to me. Do you want to send your first kernel patch? :)

Sure, will prepare it :)
diff mbox series

Patch

diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
index af515de5fc3..1d473f060fa 100644
--- a/Documentation/bpf/instruction-set.rst
+++ b/Documentation/bpf/instruction-set.rst
@@ -38,8 +38,9 @@  eBPF has two instruction encodings:
 * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
   constant) value after the basic instruction for a total of 128 bits.
 
-The basic instruction encoding is as follows, where MSB and LSB mean the most significant
-bits and least significant bits, respectively:
+The basic instruction encoding looks as follows for a little-endian processor,
+where MSB and LSB mean the most significant bits and least significant bits,
+respectively:
 
 =============  =======  =======  =======  ============
 32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
@@ -63,6 +64,17 @@  imm            offset   src_reg  dst_reg  opcode
 **opcode**
   operation to perform
 
+and as follows for a big-endian processor:
+
+=============  =======  ====================  ===============  ============
+32 bits (MSB)  16 bits  4 bits                4 bits           8 bits (LSB)
+=============  =======  ====================  ===============  ============
+immediate      offset   destination register  source register  opcode
+=============  =======  ====================  ===============  ============
+
+Multi-byte fields ('immediate' and 'offset') are similarly stored in
+the byte order of the processor.
+
 Note that most instructions do not use all of the fields.
 Unused fields shall be cleared to zero.