
[bpf-next,v4] bpf/scripts: assert helper enum value is aligned with comment order

Message ID 20220824181043.1601429-1-eyal.birger@gmail.com (mailing list archive)
State Accepted
Commit 0a0d55ef3e61d9f14e803cacb644fcc890f16774
Delegated to: BPF
Series [bpf-next,v4] bpf/scripts: assert helper enum value is aligned with comment order

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 83 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-12 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-5 success Logs for set-matrix

Commit Message

Eyal Birger Aug. 24, 2022, 6:10 p.m. UTC
The helper value is ABI as defined by enum bpf_func_id.
As bpf_helper_defs.h is used for the userspace part, it must be consistent
with this enum.

Before this change the comments order was used by the bpf_doc script in
order to set the helper values defined in the helpers file.

When adding new helpers it is very puzzling when the userspace application
breaks in weird places if the comment is inserted instead of appended -
because the generated helper ABI is incorrect and shifted.

This commit sets the helper value to the enum value.

In addition it is currently the practice to have the comments appended
and kept in the same order as the enum. As such, add an assertion
validating the comment order is consistent with enum value.

In case a different comments ordering is desired, this assertion can
be lifted.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

---
v4: fix variable name typo
v3: based on feedback from Quentin Monnet:
- move assertion to parser
- avoid using define_unique_helpers as elem_number_check() relies on
  it being an array
- set enum_val in helper object instead of passing as a dict to the
  printer

v2: based on feedback from Quentin Monnet:
- assert the current comment ordering
- match only one FN in each line
---
 scripts/bpf_doc.py | 39 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 5 deletions(-)
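
To illustrate the failure mode described in the commit message, here is a minimal, simplified model of the numbering logic. It is a sketch only and not the code in scripts/bpf_doc.py: the function names old_numbering() and new_numbering() and the helper bpf_new_helper are hypothetical, while the three real helper names and their values follow enum bpf_func_id.

```python
# Simplified model of the numbering logic; NOT the real scripts/bpf_doc.py.
# old_numbering(), new_numbering() and bpf_new_helper are hypothetical names.

def old_numbering(comment_order):
    """Pre-patch behaviour: a helper's value is its position among the comments."""
    return {name: pos for pos, name in enumerate(comment_order, start=1)}


def new_numbering(comment_order, enum_order):
    """Post-patch behaviour: use the enum value and assert the comment order."""
    # 'unspec' holds value 0, so real helpers start at 1.
    enum_vals = {name: val for val, name in enumerate(enum_order, start=1)}
    result = {}
    for pos, name in enumerate(comment_order, start=1):
        if name not in enum_vals:
            raise Exception("Helper %s is missing from enum bpf_func_id" % name)
        if pos != enum_vals[name]:
            raise Exception(
                "Helper %s comment order (#%d) must be aligned with its "
                "position (#%d) in enum bpf_func_id"
                % (name, pos, enum_vals[name]))
        result[name] = enum_vals[name]
    return result


enum_order = ["bpf_map_lookup_elem", "bpf_map_update_elem", "bpf_map_delete_elem"]

# A new helper is appended to the enum, but its comment is inserted in the middle.
enum_after = enum_order + ["bpf_new_helper"]
comments_after = ["bpf_map_lookup_elem", "bpf_new_helper",
                  "bpf_map_update_elem", "bpf_map_delete_elem"]

shifted = old_numbering(comments_after)
print(shifted["bpf_map_update_elem"])  # 3, although the enum (ABI) value is 2

try:
    new_numbering(comments_after, enum_after)
except Exception as err:
    print(err)  # the misplaced comment is now reported at generation time
```

With the old scheme the generated header silently assigns shifted values; with the new scheme the same input fails immediately and names the misplaced helper.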

Comments

Quentin Monnet Aug. 24, 2022, 9:50 p.m. UTC | #1
On Wed, 24 Aug 2022 at 19:11, Eyal Birger <eyal.birger@gmail.com> wrote:
>
> The helper value is ABI as defined by enum bpf_func_id.
> As bpf_helper_defs.h is used for the userspace part, it must be consistent
> with this enum.
>
> Before this change the comments order was used by the bpf_doc script in
> order to set the helper values defined in the helpers file.
>
> When adding new helpers it is very puzzling when the userspace application
> breaks in weird places if the comment is inserted instead of appended -
> because the generated helper ABI is incorrect and shifted.
>
> This commit sets the helper value to the enum value.
>
> In addition it is currently the practice to have the comments appended
> and kept in the same order as the enum. As such, add an assertion
> validating the comment order is consistent with enum value.
>
> In case a different comments ordering is desired, this assertion can
> be lifted.
>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>

Reviewed-by: Quentin Monnet <quentin@isovalent.com>

Thanks!
Andrii Nakryiko Aug. 25, 2022, 6:53 p.m. UTC | #2
On Wed, Aug 24, 2022 at 11:11 AM Eyal Birger <eyal.birger@gmail.com> wrote:
>
> The helper value is ABI as defined by enum bpf_func_id.
> As bpf_helper_defs.h is used for the userspace part, it must be consistent
> with this enum.

I think the way we implicitly define the value of those BPF_FUNC_
enums is also suboptimal. It makes it much harder to cherry-pick and
backport only a few of the latest helpers onto old kernels (there was a
case of backporting one of the pretty trivial timestamp-fetching helpers
without backporting other stuff). It's also quite hard to work out which
helper a bare `call 123;` instruction in llvm-objdump output refers to.

If each FN(xxx) definition in __BPF_FUNC_MAPPER was taking explicit
integer number, I think it would be a big win and make things better
all around.

Is there any opposition to doing that?


But regardless, applied this patch to bpf-next as well as an improvement.


>
> Before this change the comments order was used by the bpf_doc script in
> order to set the helper values defined in the helpers file.
>
> When adding new helpers it is very puzzling when the userspace application
> breaks in weird places if the comment is inserted instead of appended -
> because the generated helper ABI is incorrect and shifted.
>
> This commit sets the helper value to the enum value.
>
> In addition it is currently the practice to have the comments appended
> and kept in the same order as the enum. As such, add an assertion
> validating the comment order is consistent with enum value.
>
> In case a different comments ordering is desired, this assertion can
> be lifted.
>
> Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
>
> ---
> v4: fix variable name typo
> v3: based on feedback from Quentin Monnet:
> - move assertion to parser
> - avoid using define_unique_helpers as elem_number_check() relies on
>   it being an array
> - set enum_val in helper object instead of passing as a dict to the
>   printer
>
> v2: based on feedback from Quentin Monnet:
> - assert the current comment ordering
> - match only one FN in each line
> ---
>  scripts/bpf_doc.py | 39 ++++++++++++++++++++++++++++++++++-----
>  1 file changed, 34 insertions(+), 5 deletions(-)
>

[...]
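
The explicit numbering proposed in this reply could be consumed by a script roughly as follows. This is a sketch under stated assumptions: the two-argument FN(name, number) layout is hypothetical and only illustrates the idea, and parse_explicit() is an invented name, not part of scripts/bpf_doc.py.

```python
import re

# Hypothetical mapper layout in which every FN() entry carries its own number.
# The two-argument form is an assumption made for illustration only.
MAPPER = r"""
#define __BPF_FUNC_MAPPER(FN)		\
	FN(unspec, 0)			\
	FN(map_lookup_elem, 1)		\
	FN(map_update_elem, 2)		\
	FN(trace_printk, 6)		\
"""

FN_RE = re.compile(r'FN\((\w+),\s*(\d+)\)')


def parse_explicit(text):
    """Return {helper name: id}, rejecting duplicate names or ids."""
    by_name = {}
    seen_ids = set()
    for name, num in FN_RE.findall(text):
        num = int(num)
        if name in by_name or num in seen_ids:
            raise ValueError("duplicate entry: FN(%s, %d)" % (name, num))
        by_name[name] = num
        seen_ids.add(num)
    return by_name


ids = parse_explicit(MAPPER)
print(ids["trace_printk"])  # 6
```

Because each value is read from the entry itself, a tree that carries only a subset of the helpers (as in the backport case mentioned above) still yields the same numbers, and no ordering assertion is needed.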
patchwork-bot+netdevbpf@kernel.org Aug. 25, 2022, 7 p.m. UTC | #3
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Wed, 24 Aug 2022 21:10:43 +0300 you wrote:
> The helper value is ABI as defined by enum bpf_func_id.
> As bpf_helper_defs.h is used for the userspace part, it must be consistent
> with this enum.
> 
> Before this change the comments order was used by the bpf_doc script in
> order to set the helper values defined in the helpers file.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v4] bpf/scripts: assert helper enum value is aligned with comment order
    https://git.kernel.org/bpf/bpf-next/c/0a0d55ef3e61

You are awesome, thank you!
Quentin Monnet Aug. 26, 2022, 10:11 a.m. UTC | #4
On 25/08/2022 19:53, Andrii Nakryiko wrote:
> On Wed, Aug 24, 2022 at 11:11 AM Eyal Birger <eyal.birger@gmail.com> wrote:
>>
>> The helper value is ABI as defined by enum bpf_func_id.
>> As bpf_helper_defs.h is used for the userspace part, it must be consistent
>> with this enum.
> 
> I think the way we implicitly define the value of those BPF_FUNC_
> enums is also suboptimal. It makes it much harder to cherry-pick and
> backport only few latest helpers onto old kernels (there was a case
> backporting one of the pretty trivial timestamp fetching helpers
> without backporting other stuff). It's also quite hard to correlate
> llvm-objdump output with just `call 123;` instruction into which
> helper it is.
> 
> If each FN(xxx) definition in __BPF_FUNC_MAPPER was taking explicit
> integer number, I think it would be a big win and make things better
> all around.
> 
> Is there any opposition to doing that?

No objection from my side, for what it's worth.

As a side note, and in case it's useful to anyone, I've played a bit in
the past with clang from Python to parse the UAPI header:


    #!/usr/bin/env python3

    from clang.cindex import Index, CursorKind

    index = Index.create()
    translation_unit = index.parse(None, ['include/uapi/linux/bpf.h'])
    if not translation_unit:
        raise Exception("unable to load input")

    elements = []
    for node in translation_unit.cursor.get_children():
        if node.type.spelling == "enum bpf_func_id":
            for val in node.get_children():
                elements.append(val.spelling)

    print(elements)
    print(elements.index('BPF_FUNC_trace_printk'))


    $ python3 script.py
    ['BPF_FUNC_unspec', 'BPF_FUNC_map_lookup_elem', [...],
    'BPF_FUNC_ktime_get_tai_ns', '__BPF_FUNC_MAX_ID']
    6

I'd love to use something like this to make scripts/bpf_doc.py more
robust, but I've refrained because of the dependency on the clang library.

Quentin
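
For comparison, the same lookup can be sketched without the clang dependency by matching FN(...) lines with a regular expression, which is close in spirit to what scripts/bpf_doc.py does. The sketch below assumes the implicit one-name-per-line mapper layout and works on an abbreviated excerpt rather than the real header; helper_names() and HEADER_EXCERPT are hypothetical names.

```python
import re

# Abbreviated excerpt in the style of include/uapi/linux/bpf.h.
# The real list is much longer; only the first entries are reproduced here.
HEADER_EXCERPT = r"""
#define __BPF_FUNC_MAPPER(FN)		\
	FN(unspec),			\
	FN(map_lookup_elem),		\
	FN(map_update_elem),		\
	FN(map_delete_elem),		\
	FN(probe_read),			\
	FN(ktime_get_ns),		\
	FN(trace_printk),		\
"""


def helper_names(header_text):
    """Return the BPF_FUNC_* names in enum order, so list index == helper id."""
    start = header_text.index('#define __BPF_FUNC_MAPPER(FN)')
    names = []
    # Skip the #define line itself, then read FN(...) lines until one fails.
    for line in header_text[start:].splitlines()[1:]:
        match = re.match(r'\s*FN\((\w+)\)', line)
        if not match:
            break
        names.append('BPF_FUNC_' + match.group(1))
    return names


names = helper_names(HEADER_EXCERPT)
print(names.index('BPF_FUNC_trace_printk'))  # 6
print(names[6])  # reverse lookup, e.g. for a `call 6` seen in a disassembly
```

The trade-off is robustness: a regular expression depends on the exact textual layout of the macro, whereas the clang-based approach reads the enum as the compiler sees it.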

Patch

diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py
index f4f3e7ec6d44..d5c389df6045 100755
--- a/scripts/bpf_doc.py
+++ b/scripts/bpf_doc.py
@@ -50,6 +50,10 @@  class Helper(APIElement):
     @desc: textual description of the helper function
     @ret: description of the return value of the helper function
     """
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.enum_val = None
+
     def proto_break_down(self):
         """
         Break down helper function protocol into smaller chunks: return type,
@@ -92,6 +96,7 @@  class HeaderParser(object):
         self.commands = []
         self.desc_unique_helpers = set()
         self.define_unique_helpers = []
+        self.helper_enum_vals = {}
         self.desc_syscalls = []
         self.enum_syscalls = []
 
@@ -248,30 +253,54 @@  class HeaderParser(object):
                 break
 
     def parse_define_helpers(self):
-        # Parse the number of FN(...) in #define __BPF_FUNC_MAPPER to compare
-        # later with the number of unique function names present in description.
+        # Parse FN(...) in #define __BPF_FUNC_MAPPER to compare later with the
+        # number of unique function names present in description and use the
+        # correct enumeration value.
         # Note: seek_to(..) discards the first line below the target search text,
         # resulting in FN(unspec) being skipped and not added to self.define_unique_helpers.
         self.seek_to('#define __BPF_FUNC_MAPPER(FN)',
                      'Could not find start of eBPF helper definition list')
-        # Searches for either one or more FN(\w+) defines or a backslash for newline
-        p = re.compile('\s*(FN\(\w+\))+|\\\\')
+        # Searches for one FN(\w+) define or a backslash for newline
+        p = re.compile('\s*FN\((\w+)\)|\\\\')
         fn_defines_str = ''
+        i = 1  # 'unspec' is skipped as mentioned above
         while True:
             capture = p.match(self.line)
             if capture:
                 fn_defines_str += self.line
+                self.helper_enum_vals[capture.expand(r'bpf_\1')] = i
+                i += 1
             else:
                 break
             self.line = self.reader.readline()
         # Find the number of occurences of FN(\w+)
         self.define_unique_helpers = re.findall('FN\(\w+\)', fn_defines_str)
 
+    def assign_helper_values(self):
+        seen_helpers = set()
+        for helper in self.helpers:
+            proto = helper.proto_break_down()
+            name = proto['name']
+            try:
+                enum_val = self.helper_enum_vals[name]
+            except KeyError:
+                raise Exception("Helper %s is missing from enum bpf_func_id" % name)
+
+            # Enforce current practice of having the descriptions ordered
+            # by enum value.
+            seen_helpers.add(name)
+            desc_val = len(seen_helpers)
+            if desc_val != enum_val:
+                raise Exception("Helper %s comment order (#%d) must be aligned with its position (#%d) in enum bpf_func_id" % (name, desc_val, enum_val))
+
+            helper.enum_val = enum_val
+
     def run(self):
         self.parse_desc_syscall()
         self.parse_enum_syscall()
         self.parse_desc_helpers()
         self.parse_define_helpers()
+        self.assign_helper_values()
         self.reader.close()
 
 ###############################################################################
@@ -796,7 +825,7 @@  class PrinterHelpers(Printer):
             comma = ', '
             print(one_arg, end='')
 
-        print(') = (void *) %d;' % len(self.seen_helpers))
+        print(') = (void *) %d;' % helper.enum_val)
         print('')
 
 ###############################################################################