diff mbox series

[v2,bpf-next,1/9] btf: add kind layout encoding, crcs to UAPI

Message ID 20230616171728.530116-2-alan.maguire@oracle.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series bpf: support BTF kind layout info, CRCs | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1448 this patch: 1448
netdev/cc_maintainers success CCed 13 of 13 maintainers
netdev/build_clang success Errors and warnings before: 176 this patch: 176
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1443 this patch: 1443
netdev/checkpatch warning CHECK: Prefer using the BIT macro WARNING: line length of 86 exceeds 80 columns WARNING: line length of 96 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 fail Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-11 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 fail Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-12 fail Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on s390x with gcc

Commit Message

Alan Maguire June 16, 2023, 5:17 p.m. UTC
BTF kind layouts provide information to parse BTF kinds.
By separating parsing BTF from using all the information
it provides, we allow BTF to encode new features even if
they cannot be used.  This is helpful in particular for
cases where newer tools for BTF generation run on an
older kernel; BTF kinds may be present that the kernel
cannot yet use, but at least it can parse the BTF
provided.  Meanwhile userspace tools with newer libbpf
may be able to use the newer information.

The intent is to support encoding of kind layouts
optionally so that tools like pahole can add this
information.  So for each kind we record

- kind-related flags
- length of singular element following struct btf_type
- length of each of the btf_vlen() elements following

In addition we make space in the BTF header for
CRC32s computed over the BTF along with a CRC for
the base BTF; this allows split BTF to identify
a mismatch explicitly.

The ideas here were discussed at [1], [2]; hence

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>

[1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
---
 include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
 tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

Comments

Alexei Starovoitov June 17, 2023, 12:39 a.m. UTC | #1
On Fri, Jun 16, 2023 at 06:17:19PM +0100, Alan Maguire wrote:
> BTF kind layouts provide information to parse BTF kinds.
> By separating parsing BTF from using all the information
> it provides, we allow BTF to encode new features even if
> they cannot be used.  This is helpful in particular for
> cases where newer tools for BTF generation run on an
> older kernel; BTF kinds may be present that the kernel
> cannot yet use, but at least it can parse the BTF
> provided.  Meanwhile userspace tools with newer libbpf
> may be able to use the newer information.
> 
> The intent is to support encoding of kind layouts
> optionally so that tools like pahole can add this
> information.  So for each kind we record
> 
> - kind-related flags
> - length of singular element following struct btf_type
> - length of each of the btf_vlen() elements following
> 
> In addition we make space in the BTF header for
> CRC32s computed over the BTF along with a CRC for
> the base BTF; this allows split BTF to identify
> a mismatch explicitly.
> 
> The ideas here were discussed at [1], [2]; hence
> 
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> 
> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
> ---
>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
> index ec1798b6d3ff..cea9125ed953 100644
> --- a/include/uapi/linux/btf.h
> +++ b/include/uapi/linux/btf.h
> @@ -8,6 +8,22 @@
>  #define BTF_MAGIC	0xeB9F
>  #define BTF_VERSION	1
>  
> +/* is this information required? If so it cannot be sanitized safely. */
> +#define BTF_KIND_LAYOUT_OPTIONAL		(1 << 0)
> +
> +/* kind layout section consists of a struct btf_kind_layout for each known
> + * kind at BTF encoding time.
> + */
> +struct btf_kind_layout {
> +	__u16 flags;		/* see BTF_KIND_LAYOUT_* values above */
> +	__u8 info_sz;		/* size of singular element after btf_type */
> +	__u8 elem_sz;		/* size of each of btf_vlen(t) elements */
> +};
> +
> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> +#define BTF_FLAG_CRC_SET		(1 << 0)
> +#define BTF_FLAG_BASE_CRC_SET		(1 << 1)
> +
>  struct btf_header {
>  	__u16	magic;
>  	__u8	version;
> @@ -19,8 +35,16 @@ struct btf_header {
>  	__u32	type_len;	/* length of type section	*/
>  	__u32	str_off;	/* offset of string section	*/
>  	__u32	str_len;	/* length of string section	*/
> +	__u32	kind_layout_off;/* offset of kind layout section */
> +	__u32	kind_layout_len;/* length of kind layout section */
> +
> +	__u32	crc;		/* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
> +	__u32	base_crc;	/* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */

typo ? should be BTF_FLAG_CRC_SET ?
Andrii Nakryiko June 22, 2023, 10:02 p.m. UTC | #2
On Fri, Jun 16, 2023 at 10:17 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> BTF kind layouts provide information to parse BTF kinds.
> By separating parsing BTF from using all the information
> it provides, we allow BTF to encode new features even if
> they cannot be used.  This is helpful in particular for
> cases where newer tools for BTF generation run on an
> older kernel; BTF kinds may be present that the kernel
> cannot yet use, but at least it can parse the BTF
> provided.  Meanwhile userspace tools with newer libbpf
> may be able to use the newer information.
>
> The intent is to support encoding of kind layouts
> optionally so that tools like pahole can add this
> information.  So for each kind we record
>
> - kind-related flags
> - length of singular element following struct btf_type
> - length of each of the btf_vlen() elements following
>
> In addition we make space in the BTF header for
> CRC32s computed over the BTF along with a CRC for
> the base BTF; this allows split BTF to identify
> a mismatch explicitly.
>
> The ideas here were discussed at [1], [2]; hence
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>
> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
> ---
>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
>
> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
> index ec1798b6d3ff..cea9125ed953 100644
> --- a/include/uapi/linux/btf.h
> +++ b/include/uapi/linux/btf.h
> @@ -8,6 +8,22 @@
>  #define BTF_MAGIC      0xeB9F
>  #define BTF_VERSION    1
>
> +/* is this information required? If so it cannot be sanitized safely. */
> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)

hm.. I thought we agreed to not have OPTIONAL flag last time, no? From
kernel's perspective nothing is optional. From libbpf perspective
everything should be optional, unless we get type_id reference to
something that we don't recognize. So why the flag and extra code to
handle it?

We can always add it later, if necessary.

> +
> +/* kind layout section consists of a struct btf_kind_layout for each known
> + * kind at BTF encoding time.
> + */
> +struct btf_kind_layout {
> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
> +       __u8 info_sz;           /* size of singular element after btf_type */
> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
> +};
> +
> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> +#define BTF_FLAG_CRC_SET               (1 << 0)
> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
> +
>  struct btf_header {
>         __u16   magic;
>         __u8    version;
> @@ -19,8 +35,16 @@ struct btf_header {
>         __u32   type_len;       /* length of type section       */
>         __u32   str_off;        /* offset of string section     */
>         __u32   str_len;        /* length of string section     */
> +       __u32   kind_layout_off;/* offset of kind layout section */
> +       __u32   kind_layout_len;/* length of kind layout section */
> +
> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>  };
>
> +/* required minimum BTF header length */
> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)

offsetof(struct btf_header, kind_layout_off) ?

but actually why this needs to be a part of UAPI?

> +
>  /* Max # of type identifier */
>  #define BTF_MAX_TYPE   0x000fffff
>  /* Max offset into the string section */
> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
> index ec1798b6d3ff..cea9125ed953 100644
> --- a/tools/include/uapi/linux/btf.h
> +++ b/tools/include/uapi/linux/btf.h
> @@ -8,6 +8,22 @@
>  #define BTF_MAGIC      0xeB9F
>  #define BTF_VERSION    1
>
> +/* is this information required? If so it cannot be sanitized safely. */
> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
> +
> +/* kind layout section consists of a struct btf_kind_layout for each known
> + * kind at BTF encoding time.
> + */
> +struct btf_kind_layout {
> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
> +       __u8 info_sz;           /* size of singular element after btf_type */
> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
> +};
> +
> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> +#define BTF_FLAG_CRC_SET               (1 << 0)
> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
> +
>  struct btf_header {
>         __u16   magic;
>         __u8    version;
> @@ -19,8 +35,16 @@ struct btf_header {
>         __u32   type_len;       /* length of type section       */
>         __u32   str_off;        /* offset of string section     */
>         __u32   str_len;        /* length of string section     */
> +       __u32   kind_layout_off;/* offset of kind layout section */
> +       __u32   kind_layout_len;/* length of kind layout section */
> +
> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */

why are we making crc optional? shouldn't we just say that crc is
always filled out?

> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */

here it would be nice if we could just rely on zero meaning not set,
but I suspect not everyone will be happy about this, as technically
crc 0 is a valid crc :(


>  };
>
> +/* required minimum BTF header length */
> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
> +
>  /* Max # of type identifier */
>  #define BTF_MAX_TYPE   0x000fffff
>  /* Max offset into the string section */
> --
> 2.39.3
>
Alan Maguire July 3, 2023, 1:42 p.m. UTC | #3
On 22/06/2023 23:02, Andrii Nakryiko wrote:
> On Fri, Jun 16, 2023 at 10:17 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> BTF kind layouts provide information to parse BTF kinds.
>> By separating parsing BTF from using all the information
>> it provides, we allow BTF to encode new features even if
>> they cannot be used.  This is helpful in particular for
>> cases where newer tools for BTF generation run on an
>> older kernel; BTF kinds may be present that the kernel
>> cannot yet use, but at least it can parse the BTF
>> provided.  Meanwhile userspace tools with newer libbpf
>> may be able to use the newer information.
>>
>> The intent is to support encoding of kind layouts
>> optionally so that tools like pahole can add this
>> information.  So for each kind we record
>>
>> - kind-related flags
>> - length of singular element following struct btf_type
>> - length of each of the btf_vlen() elements following
>>
>> In addition we make space in the BTF header for
>> CRC32s computed over the BTF along with a CRC for
>> the base BTF; this allows split BTF to identify
>> a mismatch explicitly.
>>
>> The ideas here were discussed at [1], [2]; hence
>>
>> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>
>> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
>> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
>> ---
>>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
>>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
>>  2 files changed, 48 insertions(+)
>>
>> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
>> index ec1798b6d3ff..cea9125ed953 100644
>> --- a/include/uapi/linux/btf.h
>> +++ b/include/uapi/linux/btf.h
>> @@ -8,6 +8,22 @@
>>  #define BTF_MAGIC      0xeB9F
>>  #define BTF_VERSION    1
>>
>> +/* is this information required? If so it cannot be sanitized safely. */
>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
> 
> hm.. I thought we agreed to not have OPTIONAL flag last time, no? From
> kernel's perspective nothing is optional. From libbpf perspective
> everything should be optional, unless we get type_id reference to
> something that we don't recognize. So why the flag and extra code to
> handle it?
> 
> We can always add it later, if necessary.
>

I totally agree we need to reject any BTF that contains references
to unknown objects if these references are made via known ones;
so for example an enum64 in a struct (in the case we didn't know
about an enum64). However, it's possible a BTF kind could point
_at_ other BTF kinds but not be pointed _to_ by any known kinds;
in such a case wouldn't optional make sense even for the kernel
to say "ignore any kinds that we don't know about that aren't
participating in any known BTF relationships"? Default assumption
without the optional flag would be to reject such BTF.

>> +
>> +/* kind layout section consists of a struct btf_kind_layout for each known
>> + * kind at BTF encoding time.
>> + */
>> +struct btf_kind_layout {
>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>> +       __u8 info_sz;           /* size of singular element after btf_type */
>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>> +};
>> +
>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>> +
>>  struct btf_header {
>>         __u16   magic;
>>         __u8    version;
>> @@ -19,8 +35,16 @@ struct btf_header {
>>         __u32   type_len;       /* length of type section       */
>>         __u32   str_off;        /* offset of string section     */
>>         __u32   str_len;        /* length of string section     */
>> +       __u32   kind_layout_off;/* offset of kind layout section */
>> +       __u32   kind_layout_len;/* length of kind layout section */
>> +
>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>  };
>>
>> +/* required minimum BTF header length */
>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
> 
> offsetof(struct btf_header, kind_layout_off) ?
> 
> but actually why this needs to be a part of UAPI?
> 

no not really. I was trying to come up with a more elegant
way of differentiating between the old and new header formats
on the basis of size and eventually just gave up and added
a #define. It can absolutely be removed.

>> +
>>  /* Max # of type identifier */
>>  #define BTF_MAX_TYPE   0x000fffff
>>  /* Max offset into the string section */
>> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
>> index ec1798b6d3ff..cea9125ed953 100644
>> --- a/tools/include/uapi/linux/btf.h
>> +++ b/tools/include/uapi/linux/btf.h
>> @@ -8,6 +8,22 @@
>>  #define BTF_MAGIC      0xeB9F
>>  #define BTF_VERSION    1
>>
>> +/* is this information required? If so it cannot be sanitized safely. */
>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>> +
>> +/* kind layout section consists of a struct btf_kind_layout for each known
>> + * kind at BTF encoding time.
>> + */
>> +struct btf_kind_layout {
>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>> +       __u8 info_sz;           /* size of singular element after btf_type */
>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>> +};
>> +
>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>> +
>>  struct btf_header {
>>         __u16   magic;
>>         __u8    version;
>> @@ -19,8 +35,16 @@ struct btf_header {
>>         __u32   type_len;       /* length of type section       */
>>         __u32   str_off;        /* offset of string section     */
>>         __u32   str_len;        /* length of string section     */
>> +       __u32   kind_layout_off;/* offset of kind layout section */
>> +       __u32   kind_layout_len;/* length of kind layout section */
>> +
>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
> 
> why are we making crc optional? shouldn't we just say that crc is
> always filled out?
> 

The approach I took was to have libbpf/pahole be flexible about
specification of crcs and kind layout; neither, one of these or both
are supported. When neither are specified we'll still generate
a larger header, but it will be zeros for the new fields so parseable
by older libbpf/kernel. I think we probably need to make it optional
for a while to support new pahole on older kernels.


>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
> 
> here it would be nice if we could just rely on zero meaning not set,
> but I suspect not everyone will be happy about this, as technically
> crc 0 is a valid crc :(
>

Right, I think we're stuck with the flags unfortunately.
Thanks for the review (and apologies for the belated response!)

Alan

> 
>>  };
>>
>> +/* required minimum BTF header length */
>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>> +
>>  /* Max # of type identifier */
>>  #define BTF_MAX_TYPE   0x000fffff
>>  /* Max offset into the string section */
>> --
>> 2.39.3
>>
Andrii Nakryiko July 5, 2023, 11:48 p.m. UTC | #4
On Mon, Jul 3, 2023 at 6:42 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 22/06/2023 23:02, Andrii Nakryiko wrote:
> > On Fri, Jun 16, 2023 at 10:17 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>
> >> BTF kind layouts provide information to parse BTF kinds.
> >> By separating parsing BTF from using all the information
> >> it provides, we allow BTF to encode new features even if
> >> they cannot be used.  This is helpful in particular for
> >> cases where newer tools for BTF generation run on an
> >> older kernel; BTF kinds may be present that the kernel
> >> cannot yet use, but at least it can parse the BTF
> >> provided.  Meanwhile userspace tools with newer libbpf
> >> may be able to use the newer information.
> >>
> >> The intent is to support encoding of kind layouts
> >> optionally so that tools like pahole can add this
> >> information.  So for each kind we record
> >>
> >> - kind-related flags
> >> - length of singular element following struct btf_type
> >> - length of each of the btf_vlen() elements following
> >>
> >> In addition we make space in the BTF header for
> >> CRC32s computed over the BTF along with a CRC for
> >> the base BTF; this allows split BTF to identify
> >> a mismatch explicitly.
> >>
> >> The ideas here were discussed at [1], [2]; hence
> >>
> >> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> >>
> >> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
> >> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
> >> ---
> >>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
> >>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
> >>  2 files changed, 48 insertions(+)
> >>
> >> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
> >> index ec1798b6d3ff..cea9125ed953 100644
> >> --- a/include/uapi/linux/btf.h
> >> +++ b/include/uapi/linux/btf.h
> >> @@ -8,6 +8,22 @@
> >>  #define BTF_MAGIC      0xeB9F
> >>  #define BTF_VERSION    1
> >>
> >> +/* is this information required? If so it cannot be sanitized safely. */
> >> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
> >
> > hm.. I thought we agreed to not have OPTIONAL flag last time, no? From
> > kernel's perspective nothing is optional. From libbpf perspective
> > everything should be optional, unless we get type_id reference to
> > something that we don't recognize. So why the flag and extra code to
> > handle it?
> >
> > We can always add it later, if necessary.
> >
>
> I totally agree we need to reject any BTF that contains references
> to unknown objects if these references are made via known ones;
> so for example an enum64 in a struct (in the case we didn't know
> about an enum64). However, it's possible a BTF kind could point
> _at_ other BTF kinds but not be pointed _to_ by any known kinds;
> in such a case wouldn't optional make sense even for the kernel
> to say "ignore any kinds that we don't know about that aren't
> participating in any known BTF relationships"? Default assumption
> without the optional flag would be to reject such BTF.

I think it's simpler (and would follow what we've been doing with
kernel-side strict validation of everything) to reject everything
unknown. "Being pointed to" isn't always contained within BTF itself.
E.g., for line and func info, type_id comes during BPF_PROG_LOAD. So
at the point of BTF validation you don't know that some FUNCs will be
pointed to (as an example). So I'd avoid making unnecessarily
dangerous assumptions that some pieces of information can be ignored.

And in general, kernel doesn't trust user-space data without
validation, so we'd have to double-check all this OPTIONAL flags
somehow anyways.

>
> >> +
> >> +/* kind layout section consists of a struct btf_kind_layout for each known
> >> + * kind at BTF encoding time.
> >> + */
> >> +struct btf_kind_layout {
> >> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
> >> +       __u8 info_sz;           /* size of singular element after btf_type */
> >> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
> >> +};
> >> +
> >> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> >> +#define BTF_FLAG_CRC_SET               (1 << 0)
> >> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
> >> +
> >>  struct btf_header {
> >>         __u16   magic;
> >>         __u8    version;
> >> @@ -19,8 +35,16 @@ struct btf_header {
> >>         __u32   type_len;       /* length of type section       */
> >>         __u32   str_off;        /* offset of string section     */
> >>         __u32   str_len;        /* length of string section     */
> >> +       __u32   kind_layout_off;/* offset of kind layout section */
> >> +       __u32   kind_layout_len;/* length of kind layout section */
> >> +
> >> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
> >> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
> >>  };
> >>
> >> +/* required minimum BTF header length */
> >> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
> >
> > offsetof(struct btf_header, kind_layout_off) ?
> >
> > but actually why this needs to be a part of UAPI?
> >
>
> no not really. I was trying to come up with a more elegant
> way of differentiating between the old and new header formats
> on the basis of size and eventually just gave up and added
> a #define. It can absolutely be removed.

right, so that's why just checking if field is present based on
btf_header.len and field offset is a good approach? Let's drop
unnecessary constants from UAPI header

>
> >> +
> >>  /* Max # of type identifier */
> >>  #define BTF_MAX_TYPE   0x000fffff
> >>  /* Max offset into the string section */
> >> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
> >> index ec1798b6d3ff..cea9125ed953 100644
> >> --- a/tools/include/uapi/linux/btf.h
> >> +++ b/tools/include/uapi/linux/btf.h
> >> @@ -8,6 +8,22 @@
> >>  #define BTF_MAGIC      0xeB9F
> >>  #define BTF_VERSION    1
> >>
> >> +/* is this information required? If so it cannot be sanitized safely. */
> >> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
> >> +
> >> +/* kind layout section consists of a struct btf_kind_layout for each known
> >> + * kind at BTF encoding time.
> >> + */
> >> +struct btf_kind_layout {
> >> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
> >> +       __u8 info_sz;           /* size of singular element after btf_type */
> >> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
> >> +};
> >> +
> >> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
> >> +#define BTF_FLAG_CRC_SET               (1 << 0)
> >> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
> >> +
> >>  struct btf_header {
> >>         __u16   magic;
> >>         __u8    version;
> >> @@ -19,8 +35,16 @@ struct btf_header {
> >>         __u32   type_len;       /* length of type section       */
> >>         __u32   str_off;        /* offset of string section     */
> >>         __u32   str_len;        /* length of string section     */
> >> +       __u32   kind_layout_off;/* offset of kind layout section */
> >> +       __u32   kind_layout_len;/* length of kind layout section */
> >> +
> >> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
> >
> > why are we making crc optional? shouldn't we just say that crc is
> > always filled out?
> >
>
> The approach I took was to have libbpf/pahole be flexible about
> specification of crcs and kind layout; neither, one of these or both
> are supported. When neither are specified we'll still generate
> a larger header, but it will be zeros for the new fields so parseable
> by older libbpf/kernel. I think we probably need to make it optional
> for a while to support new pahole on older kernels.

I'm not sure how this "optional for a while" will turn to
"non-optional", tbh, and who and when will decide that. I think the
"new pahole on old kernel" problem is solvable easily by making all
this new stuff opt-in. New kernel Makefiles will request pahole to
emit them, if pahole is new enough. Old kernels won't know about this
feature and even new pahole won't emit it.

I don't feel too strongly about it, just generally feeling like
minimizing all the different supportable variations.

>
>
> >> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
> >
> > here it would be nice if we could just rely on zero meaning not set,
> > but I suspect not everyone will be happy about this, as technically
> > crc 0 is a valid crc :(
> >
>
> Right, I think we're stuck with the flags unfortunately.

yep. one extra reason why I like the idea of this being string offset,
but whatever, small thing


> Thanks for the review (and apologies for the belated response!)
>
> Alan
>
> >
> >>  };
> >>
> >> +/* required minimum BTF header length */
> >> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
> >> +
> >>  /* Max # of type identifier */
> >>  #define BTF_MAX_TYPE   0x000fffff
> >>  /* Max offset into the string section */
> >> --
> >> 2.39.3
> >>
Alan Maguire July 20, 2023, 8:18 p.m. UTC | #5
On 06/07/2023 00:48, Andrii Nakryiko wrote:
> On Mon, Jul 3, 2023 at 6:42 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> On 22/06/2023 23:02, Andrii Nakryiko wrote:
>>> On Fri, Jun 16, 2023 at 10:17 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>
>>>> BTF kind layouts provide information to parse BTF kinds.
>>>> By separating parsing BTF from using all the information
>>>> it provides, we allow BTF to encode new features even if
>>>> they cannot be used.  This is helpful in particular for
>>>> cases where newer tools for BTF generation run on an
>>>> older kernel; BTF kinds may be present that the kernel
>>>> cannot yet use, but at least it can parse the BTF
>>>> provided.  Meanwhile userspace tools with newer libbpf
>>>> may be able to use the newer information.
>>>>
>>>> The intent is to support encoding of kind layouts
>>>> optionally so that tools like pahole can add this
>>>> information.  So for each kind we record
>>>>
>>>> - kind-related flags
>>>> - length of singular element following struct btf_type
>>>> - length of each of the btf_vlen() elements following
>>>>
>>>> In addition we make space in the BTF header for
>>>> CRC32s computed over the BTF along with a CRC for
>>>> the base BTF; this allows split BTF to identify
>>>> a mismatch explicitly.
>>>>
>>>> The ideas here were discussed at [1], [2]; hence
>>>>
>>>> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>
>>>> [1] https://lore.kernel.org/bpf/CAEf4BzYjWHRdNNw4B=eOXOs_ONrDwrgX4bn=Nuc1g8JPFC34MA@mail.gmail.com/
>>>> [2] https://lore.kernel.org/bpf/20230531201936.1992188-1-alan.maguire@oracle.com/
>>>> ---
>>>>  include/uapi/linux/btf.h       | 24 ++++++++++++++++++++++++
>>>>  tools/include/uapi/linux/btf.h | 24 ++++++++++++++++++++++++
>>>>  2 files changed, 48 insertions(+)
>>>>
>>>> diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
>>>> index ec1798b6d3ff..cea9125ed953 100644
>>>> --- a/include/uapi/linux/btf.h
>>>> +++ b/include/uapi/linux/btf.h
>>>> @@ -8,6 +8,22 @@
>>>>  #define BTF_MAGIC      0xeB9F
>>>>  #define BTF_VERSION    1
>>>>
>>>> +/* is this information required? If so it cannot be sanitized safely. */
>>>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>>>
>>> hm.. I thought we agreed to not have OPTIONAL flag last time, no? From
>>> kernel's perspective nothing is optional. From libbpf perspective
>>> everything should be optional, unless we get type_id reference to
>>> something that we don't recognize. So why the flag and extra code to
>>> handle it?
>>>
>>> We can always add it later, if necessary.
>>>
>>
>> I totally agree we need to reject any BTF that contains references
>> to unknown objects if these references are made via known ones;
>> so for example an enum64 in a struct (in the case we didn't know
>> about an enum64). However, it's possible a BTF kind could point
>> _at_ other BTF kinds but not be pointed _to_ by any known kinds;
>> in such a case wouldn't optional make sense even for the kernel
>> to say "ignore any kinds that we don't know about that aren't
>> participating in any known BTF relationships"? Default assumption
>> without the optional flag would be to reject such BTF.
> 
> I think it's simpler (and would follow what we've been doing with
> kernel-side strict validation of everything) to reject everything
> unknown. "Being pointed to" isn't always contained within BTF itself.
> E.g., for line and func info, type_id comes during BPF_PROG_LOAD. So
> at the point of BTF validation you don't know that some FUNCs will be
> pointed to (as an example). So I'd avoid making unnecessarily
> dangerous assumptions that some pieces of information can be ignored.
> 
> And in general, kernel doesn't trust user-space data without
> validation, so we'd have to double-check all this OPTIONAL flagsole
> somehow anyways.
> 

makes sense. I think I'd been hoping the BTF kind layout would help
address the "newer pahole runs on older kernel" problem, but it
doesn't really. More on that below...

>>
>>>> +
>>>> +/* kind layout section consists of a struct btf_kind_layout for each known
>>>> + * kind at BTF encoding time.
>>>> + */
>>>> +struct btf_kind_layout {
>>>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>>>> +       __u8 info_sz;           /* size of singular element after btf_type */
>>>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>>>> +};
>>>> +
>>>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>>>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>>>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>>>> +
>>>>  struct btf_header {
>>>>         __u16   magic;
>>>>         __u8    version;
>>>> @@ -19,8 +35,16 @@ struct btf_header {
>>>>         __u32   type_len;       /* length of type section       */
>>>>         __u32   str_off;        /* offset of string section     */
>>>>         __u32   str_len;        /* length of string section     */
>>>> +       __u32   kind_layout_off;/* offset of kind layout section */
>>>> +       __u32   kind_layout_len;/* length of kind layout section */
>>>> +
>>>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>>>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>>>  };
>>>>
>>>> +/* required minimum BTF header length */
>>>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>>>
>>> offsetof(struct btf_header, kind_layout_off) ?
>>>
>>> but actually why this needs to be a part of UAPI?
>>>
>>
>> no not really. I was trying to come up with a more elegant
>> way of differentiating between the old and new header formats
>> on the basis of size and eventually just gave up and added
>> a #define. It can absolutely be removed.
> 
> right, so that's why just checking if field is present based on
> btf_header.len and field offset is a good approach? Let's drop
> unnecessary constants from UAPI header
> 
>>
>>>> +
>>>>  /* Max # of type identifier */
>>>>  #define BTF_MAX_TYPE   0x000fffff
>>>>  /* Max offset into the string section */
>>>> diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
>>>> index ec1798b6d3ff..cea9125ed953 100644
>>>> --- a/tools/include/uapi/linux/btf.h
>>>> +++ b/tools/include/uapi/linux/btf.h
>>>> @@ -8,6 +8,22 @@
>>>>  #define BTF_MAGIC      0xeB9F
>>>>  #define BTF_VERSION    1
>>>>
>>>> +/* is this information required? If so it cannot be sanitized safely. */
>>>> +#define BTF_KIND_LAYOUT_OPTIONAL               (1 << 0)
>>>> +
>>>> +/* kind layout section consists of a struct btf_kind_layout for each known
>>>> + * kind at BTF encoding time.
>>>> + */
>>>> +struct btf_kind_layout {
>>>> +       __u16 flags;            /* see BTF_KIND_LAYOUT_* values above */
>>>> +       __u8 info_sz;           /* size of singular element after btf_type */
>>>> +       __u8 elem_sz;           /* size of each of btf_vlen(t) elements */
>>>> +};
>>>> +
>>>> +/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
>>>> +#define BTF_FLAG_CRC_SET               (1 << 0)
>>>> +#define BTF_FLAG_BASE_CRC_SET          (1 << 1)
>>>> +
>>>>  struct btf_header {
>>>>         __u16   magic;
>>>>         __u8    version;
>>>> @@ -19,8 +35,16 @@ struct btf_header {
>>>>         __u32   type_len;       /* length of type section       */
>>>>         __u32   str_off;        /* offset of string section     */
>>>>         __u32   str_len;        /* length of string section     */
>>>> +       __u32   kind_layout_off;/* offset of kind layout section */
>>>> +       __u32   kind_layout_len;/* length of kind layout section */
>>>> +
>>>> +       __u32   crc;            /* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
>>>
>>> why are we making crc optional? shouldn't we just say that crc is
>>> always filled out?
>>>
>>
>> The approach I took was to have libbpf/pahole be flexible about
>> specification of crcs and kind layout; neither, one of these or both
>> are supported. When neither are specified we'll still generate
>> a larger header, but it will be zeros for the new fields so parseable
>> by older libbpf/kernel. I think we probably need to make it optional
>> for a while to support new pahole on older kernels.
> 
> I'm not sure how this "optional for a while" will turn to
> "non-optional", tbh, and who and when will decide that. I think the
> "new pahole on old kernel" problem is solvable easily by making all
> this new stuff opt-in. New kernel Makefiles will request pahole to
> emit them, if pahole is new enough. Old kernels won't know about this
> feature and even new pahole won't emit it.
>

Another approach would be if we could auto-detect the kinds supported
by an older kernel, and not emit anything > BTF_KIND_MAX for that
kernel. I've put together a proof-of-concept for that, see [1].

> I don't feel too strongly about it, just generally feeling like
> minimizing all the different supportable variations.
> 

Yeah, hopefully some of these options can go away eventually.
The auto-detection scheme in [1], or something like it, might
make things easier in future. Thanks!

Alan

[1]
https://lore.kernel.org/bpf/20230720201443.224040-1-alan.maguire@oracle.com/
>>
>>
>>>> +       __u32   base_crc;       /* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
>>>
>>> here it would be nice if we could just rely on zero meaning not set,
>>> but I suspect not everyone will be happy about this, as technically
>>> crc 0 is a valid crc :(
>>>
>>
>> Right, I think we're stuck with the flags unfortunately.
> 
> yep. one extra reason why I like the idea of this being string offset,
> but whatever, small thing
> 
> 
>> Thanks for the review (and apologies for the belated response!)
>>
>> Alan
>>
>>>
>>>>  };
>>>>
>>>> +/* required minimum BTF header length */
>>>> +#define BTF_HEADER_MIN_LEN     (sizeof(struct btf_header) - 16)
>>>> +
>>>>  /* Max # of type identifier */
>>>>  #define BTF_MAX_TYPE   0x000fffff
>>>>  /* Max offset into the string section */
>>>> --
>>>> 2.39.3
>>>>
>
diff mbox series

Patch

diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index ec1798b6d3ff..cea9125ed953 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -8,6 +8,22 @@ 
 #define BTF_MAGIC	0xeB9F
 #define BTF_VERSION	1
 
+/* is this information required? If so it cannot be sanitized safely. */
+#define BTF_KIND_LAYOUT_OPTIONAL		(1 << 0)
+
+/* kind layout section consists of a struct btf_kind_layout for each known
+ * kind at BTF encoding time.
+ */
+struct btf_kind_layout {
+	__u16 flags;		/* see BTF_KIND_LAYOUT_* values above */
+	__u8 info_sz;		/* size of singular element after btf_type */
+	__u8 elem_sz;		/* size of each of btf_vlen(t) elements */
+};
+
+/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
+#define BTF_FLAG_CRC_SET		(1 << 0)
+#define BTF_FLAG_BASE_CRC_SET		(1 << 1)
+
 struct btf_header {
 	__u16	magic;
 	__u8	version;
@@ -19,8 +35,16 @@  struct btf_header {
 	__u32	type_len;	/* length of type section	*/
 	__u32	str_off;	/* offset of string section	*/
 	__u32	str_len;	/* length of string section	*/
+	__u32	kind_layout_off;/* offset of kind layout section */
+	__u32	kind_layout_len;/* length of kind layout section */
+
+	__u32	crc;		/* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
+	__u32	base_crc;	/* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
 };
 
+/* required minimum BTF header length */
+#define BTF_HEADER_MIN_LEN	(sizeof(struct btf_header) - 16)
+
 /* Max # of type identifier */
 #define BTF_MAX_TYPE	0x000fffff
 /* Max offset into the string section */
diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
index ec1798b6d3ff..cea9125ed953 100644
--- a/tools/include/uapi/linux/btf.h
+++ b/tools/include/uapi/linux/btf.h
@@ -8,6 +8,22 @@ 
 #define BTF_MAGIC	0xeB9F
 #define BTF_VERSION	1
 
+/* is this information required? If so it cannot be sanitized safely. */
+#define BTF_KIND_LAYOUT_OPTIONAL		(1 << 0)
+
+/* kind layout section consists of a struct btf_kind_layout for each known
+ * kind at BTF encoding time.
+ */
+struct btf_kind_layout {
+	__u16 flags;		/* see BTF_KIND_LAYOUT_* values above */
+	__u8 info_sz;		/* size of singular element after btf_type */
+	__u8 elem_sz;		/* size of each of btf_vlen(t) elements */
+};
+
+/* for CRCs for BTF, base BTF to be considered usable, flags must be set. */
+#define BTF_FLAG_CRC_SET		(1 << 0)
+#define BTF_FLAG_BASE_CRC_SET		(1 << 1)
+
 struct btf_header {
 	__u16	magic;
 	__u8	version;
@@ -19,8 +35,16 @@  struct btf_header {
 	__u32	type_len;	/* length of type section	*/
 	__u32	str_off;	/* offset of string section	*/
 	__u32	str_len;	/* length of string section	*/
+	__u32	kind_layout_off;/* offset of kind layout section */
+	__u32	kind_layout_len;/* length of kind layout section */
+
+	__u32	crc;		/* crc of BTF; used if flags set BTF_FLAG_CRC_VALID */
+	__u32	base_crc;	/* crc of base BTF; used if flags set BTF_FLAG_BASE_CRC_VALID */
 };
 
+/* required minimum BTF header length */
+#define BTF_HEADER_MIN_LEN	(sizeof(struct btf_header) - 16)
+
 /* Max # of type identifier */
 #define BTF_MAX_TYPE	0x000fffff
 /* Max offset into the string section */