diff mbox series

[dwarves] btf: Generate btf for functions in the .BTF_ids section

Message ID 20210423213728.3538141-1-kafai@fb.com (mailing list archive)
State Not Applicable
Delegated to: BPF
Headers show
Series [dwarves] btf: Generate btf for functions in the .BTF_ids section | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Martin KaFai Lau April 23, 2021, 9:37 p.m. UTC
BTF is currently generated for functions that are in ftrace list
or extern.

A recent use case also needs BTF generated for functions included in
allowlist.  In particular, the kernel
commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
allows bpf program to directly call a few tcp cc kernel functions.  Those
functions are specified under an ELF section .BTF_ids.  The symbols
in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
set to ensure they are in the ftrace list but this kconfig dependency
is unnecessary.

pahole can generate BTF for those kernel functions if it knows they
are in the allowlist.  This patch is to capture those symbols
in the .BTF_ids section and generate BTF for them.

Cc: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
---
 btf_encoder.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++---
 libbtf.c      |  10 ++++
 libbtf.h      |   2 +
 3 files changed, 142 insertions(+), 6 deletions(-)

Comments

Andrii Nakryiko April 26, 2021, 11:26 p.m. UTC | #1
On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> BTF is currently generated for functions that are in ftrace list
> or extern.
>
> A recent use case also needs BTF generated for functions included in
> allowlist.  In particular, the kernel
> commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> allows bpf program to directly call a few tcp cc kernel functions.  Those
> functions are specified under an ELF section .BTF_ids.  The symbols
> in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> set to ensure they are in the ftrace list but this kconfig dependency
> is unnecessary.
>
> pahole can generate BTF for those kernel functions if it knows they
> are in the allowlist.  This patch is to capture those symbols
> in the .BTF_ids section and generate BTF for them.
>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> ---

I wonder if we just record all functions how bad that would be. Jiri,
do you remember from the time you were experimenting with static
functions how much more functions we'd be recording if we didn't do
ftrace filtering?

>  btf_encoder.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++---
>  libbtf.c      |  10 ++++
>  libbtf.h      |   2 +
>  3 files changed, 142 insertions(+), 6 deletions(-)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 80e896961d4e..48c183915ddd 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -106,6 +106,121 @@ static int collect_function(struct btf_elf *btfe, GElf_Sym *sym,
>         return 0;
>  }
>
> +#define BTF_ID_FUNC_PREFIX "__BTF_ID__func__"
> +#define BTF_ID_FUNC_PREFIX_LEN (sizeof(BTF_ID_FUNC_PREFIX) - 1)
> +
> +static char **listed_functions;
> +static int listed_functions_alloc;
> +static int listed_functions_cnt;

maybe just use struct btf as a container of strings, which is what you
need here? You can do btf__add_str() and btf__find_str(), which are
both fast and memory-efficient, and you won't have to manage all the
memory and do sorting, etc, etc.

[...]
Jiri Olsa April 27, 2021, 11:38 a.m. UTC | #2
On Mon, Apr 26, 2021 at 04:26:11PM -0700, Andrii Nakryiko wrote:
> On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > BTF is currently generated for functions that are in ftrace list
> > or extern.
> >
> > A recent use case also needs BTF generated for functions included in
> > allowlist.  In particular, the kernel
> > commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> > allows bpf program to directly call a few tcp cc kernel functions.  Those
> > functions are specified under an ELF section .BTF_ids.  The symbols
> > in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> > For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> > functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> > set to ensure they are in the ftrace list but this kconfig dependency
> > is unnecessary.
> >
> > pahole can generate BTF for those kernel functions if it knows they
> > are in the allowlist.  This patch is to capture those symbols
> > in the .BTF_ids section and generate BTF for them.
> >
> > Cc: Andrii Nakryiko <andrii@kernel.org>
> > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > ---
> 
> I wonder if we just record all functions how bad that would be. Jiri,
> do you remember from the time you were experimenting with static
> functions how much more functions we'd be recording if we didn't do
> ftrace filtering?

hum, I can't find that.. but should be just matter of removing
that is_ftrace_func check

if we decided to do that, maybe we could add some bit indicating
that the function is traceable? it would save us check with
available_filter_functions file

jirka

> 
> >  btf_encoder.c | 136 +++++++++++++++++++++++++++++++++++++++++++++++---
> >  libbtf.c      |  10 ++++
> >  libbtf.h      |   2 +
> >  3 files changed, 142 insertions(+), 6 deletions(-)
> >
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 80e896961d4e..48c183915ddd 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -106,6 +106,121 @@ static int collect_function(struct btf_elf *btfe, GElf_Sym *sym,
> >         return 0;
> >  }
> >
> > +#define BTF_ID_FUNC_PREFIX "__BTF_ID__func__"
> > +#define BTF_ID_FUNC_PREFIX_LEN (sizeof(BTF_ID_FUNC_PREFIX) - 1)
> > +
> > +static char **listed_functions;
> > +static int listed_functions_alloc;
> > +static int listed_functions_cnt;
> 
> maybe just use struct btf as a container of strings, which is what you
> need here? You can do btf__add_str() and btf__find_str(), which are
> both fast and memory-efficient, and you won't have to manage all the
> memory and do sorting, etc, etc.
> 
> [...]
>
Jiri Olsa April 27, 2021, 11:42 a.m. UTC | #3
On Fri, Apr 23, 2021 at 02:37:28PM -0700, Martin KaFai Lau wrote:

SNIP

> +static int collect_listed_functions(struct btf_elf *btfe, GElf_Sym *sym,
> +				    size_t sym_sec_idx)
> +{
> +	int len, digits = 0, underscores = 0;
> +	const char *name;
> +	char *func_name;
> +
> +	if (!btfe->btf_ids_shndx ||
> +	    btfe->btf_ids_shndx != sym_sec_idx)
> +		return 0;
> +
> +	/* The kernel function in the btf id list will have symbol like:
> +	 * __BTF_ID__func__<kernel_func_name>__[digit]+
> +	 */
> +	name = elf_sym__name(sym, btfe->symtab);
> +	if (strncmp(name, BTF_ID_FUNC_PREFIX, BTF_ID_FUNC_PREFIX_LEN))
> +		return 0;
> +
> +	name += BTF_ID_FUNC_PREFIX_LEN;
> +
> +	/* name: <kernel_func_name>__[digit]+
> +	 * Strip the ending __[digit]+
> +	 */
> +	for (len = strlen(name); len && underscores != 2; len--) {
> +		char c = name[len - 1];
> +
> +		if (c == '_') {
> +			if (!digits)
> +				return 0;
> +			underscores++;
> +		} else if (isdigit(c)) {
> +			if (underscores)
> +				return 0;
> +			digits++;
> +		} else {
> +			return 0;
> +		}
> +	}
> +
> +	if (!len)
> +		return 0;
> +
> +	func_name = strndup(name, len);
> +	if (!func_name) {
> +		fprintf(stderr,
> +			"Failed to alloc memory for listed function %s%s\n",
> +			BTF_ID_FUNC_PREFIX, name);
> +		return -1;
> +	}
> +
> +	if (is_listed_func(func_name)) {
> +		/* already captured */
> +		free(func_name);
> +		return 0;
> +	}
> +
> +	/* grow listed_functions */
> +	if (listed_functions_cnt == listed_functions_alloc) {
> +		char **new;
> +
> +		listed_functions_alloc = max(100,
> +					     listed_functions_alloc * 3 / 2);
> +		new = realloc(listed_functions,
> +			      listed_functions_alloc * sizeof(*listed_functions));
> +		if (!new) {
> +			fprintf(stderr,
> +				"Failed to alloc memory for listed function %s%s\n",
> +				BTF_ID_FUNC_PREFIX, name);
> +			free(func_name);
> +			return -1;
> +		}
> +		listed_functions = new;
> +	}
> +
> +	listed_functions[listed_functions_cnt++] = func_name;
> +	qsort(listed_functions, listed_functions_cnt,
> +	      sizeof(*listed_functions), listed_function_cmp);

I was thinking of doing this at the end in setup_functions,
but we need to do name lookups before adding.. also there are
not too many BTF_ID instances anyway

I'll run the test script with your change

jirka
Arnaldo Carvalho de Melo April 27, 2021, 12:34 p.m. UTC | #4
Em Tue, Apr 27, 2021 at 01:38:20PM +0200, Jiri Olsa escreveu:
> On Mon, Apr 26, 2021 at 04:26:11PM -0700, Andrii Nakryiko wrote:
> > On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > BTF is currently generated for functions that are in ftrace list
> > > or extern.
> > >
> > > A recent use case also needs BTF generated for functions included in
> > > allowlist.  In particular, the kernel
> > > commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> > > allows bpf program to directly call a few tcp cc kernel functions.  Those
> > > functions are specified under an ELF section .BTF_ids.  The symbols
> > > in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> > > For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> > > functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> > > set to ensure they are in the ftrace list but this kconfig dependency
> > > is unnecessary.
> > >
> > > pahole can generate BTF for those kernel functions if it knows they
> > > are in the allowlist.  This patch is to capture those symbols
> > > in the .BTF_ids section and generate BTF for them.

> > I wonder if we just record all functions how bad that would be. Jiri,
> > do you remember from the time you were experimenting with static
> > functions how much more functions we'd be recording if we didn't do
> > ftrace filtering?
 
> hum, I can't find that.. but should be just matter of removing
> that is_ftrace_func check
 
> if we decided to do that, maybe we could add some bit indicating
> that the function is traceable? it would save us check with
> available_filter_functions file

You mean encoding it in BTF, in 'struct btf_type'? Seems important to
have it, there are free bits there:

/* Max # of type identifier */
#define BTF_MAX_TYPE    0x000fffff
/* Max offset into the string section */
#define BTF_MAX_NAME_OFFSET     0x00ffffff
/* Max # of struct/union/enum members or func args */
#define BTF_MAX_VLEN    0xffff

struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-27: kind (e.g. int, ptr, array...etc)
         * bits 28-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union and fwd
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO and VAR.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
};

And tools that expect to trace a function can get that information from
the BTF info instead of getting some failure when trying to trace those
functions, right?

- Arnaldo
Jiri Olsa April 27, 2021, 12:43 p.m. UTC | #5
On Tue, Apr 27, 2021 at 09:34:30AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Apr 27, 2021 at 01:38:20PM +0200, Jiri Olsa escreveu:
> > On Mon, Apr 26, 2021 at 04:26:11PM -0700, Andrii Nakryiko wrote:
> > > On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > BTF is currently generated for functions that are in ftrace list
> > > > or extern.
> > > >
> > > > A recent use case also needs BTF generated for functions included in
> > > > allowlist.  In particular, the kernel
> > > > commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> > > > allows bpf program to directly call a few tcp cc kernel functions.  Those
> > > > functions are specified under an ELF section .BTF_ids.  The symbols
> > > > in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> > > > For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> > > > functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> > > > set to ensure they are in the ftrace list but this kconfig dependency
> > > > is unnecessary.
> > > >
> > > > pahole can generate BTF for those kernel functions if it knows they
> > > > are in the allowlist.  This patch is to capture those symbols
> > > > in the .BTF_ids section and generate BTF for them.
> 
> > > I wonder if we just record all functions how bad that would be. Jiri,
> > > do you remember from the time you were experimenting with static
> > > functions how much more functions we'd be recording if we didn't do
> > > ftrace filtering?
>  
> > hum, I can't find that.. but should be just matter of removing
> > that is_ftrace_func check
>  
> > if we decided to do that, maybe we could add some bit indicating
> > that the function is traceable? it would save us check with
> > available_filter_functions file
> 
> You mean encoding it in BTF, in 'struct btf_type'? Seems important to
> have it, there are free bits there:
> 
> /* Max # of type identifier */
> #define BTF_MAX_TYPE    0x000fffff
> /* Max offset into the string section */
> #define BTF_MAX_NAME_OFFSET     0x00ffffff
> /* Max # of struct/union/enum members or func args */
> #define BTF_MAX_VLEN    0xffff
> 
> struct btf_type {
>         __u32 name_off;
>         /* "info" bits arrangement
>          * bits  0-15: vlen (e.g. # of struct's members)
>          * bits 16-23: unused
>          * bits 24-27: kind (e.g. int, ptr, array...etc)
>          * bits 28-30: unused
>          * bit     31: kind_flag, currently used by
>          *             struct, union and fwd
>          */
>         __u32 info;
>         /* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC.
>          * "size" tells the size of the type it is describing.
>          *
>          * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
>          * FUNC, FUNC_PROTO and VAR.
>          * "type" is a type_id referring to another type.
>          */
>         union {
>                 __u32 size;
>                 __u32 type;
>         };
> };
> 
> And tools that expect to trace a function can get that information from
> the BTF info instead of getting some failure when trying to trace those
> functions, right?

yep, if it's possible to spare some bit for this info

jirka
Martin KaFai Lau April 27, 2021, 6:20 p.m. UTC | #6
On Tue, Apr 27, 2021 at 01:42:40PM +0200, Jiri Olsa wrote:
> On Fri, Apr 23, 2021 at 02:37:28PM -0700, Martin KaFai Lau wrote:
> 
> SNIP
> 
> > +static int collect_listed_functions(struct btf_elf *btfe, GElf_Sym *sym,
> > +				    size_t sym_sec_idx)
> > +{
> > +	int len, digits = 0, underscores = 0;
> > +	const char *name;
> > +	char *func_name;
> > +
> > +	if (!btfe->btf_ids_shndx ||
> > +	    btfe->btf_ids_shndx != sym_sec_idx)
> > +		return 0;
> > +
> > +	/* The kernel function in the btf id list will have symbol like:
> > +	 * __BTF_ID__func__<kernel_func_name>__[digit]+
> > +	 */
> > +	name = elf_sym__name(sym, btfe->symtab);
> > +	if (strncmp(name, BTF_ID_FUNC_PREFIX, BTF_ID_FUNC_PREFIX_LEN))
> > +		return 0;
> > +
> > +	name += BTF_ID_FUNC_PREFIX_LEN;
> > +
> > +	/* name: <kernel_func_name>__[digit]+
> > +	 * Strip the ending __[digit]+
> > +	 */
> > +	for (len = strlen(name); len && underscores != 2; len--) {
> > +		char c = name[len - 1];
> > +
> > +		if (c == '_') {
> > +			if (!digits)
> > +				return 0;
> > +			underscores++;
> > +		} else if (isdigit(c)) {
> > +			if (underscores)
> > +				return 0;
> > +			digits++;
> > +		} else {
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	if (!len)
> > +		return 0;
> > +
> > +	func_name = strndup(name, len);
> > +	if (!func_name) {
> > +		fprintf(stderr,
> > +			"Failed to alloc memory for listed function %s%s\n",
> > +			BTF_ID_FUNC_PREFIX, name);
> > +		return -1;
> > +	}
> > +
> > +	if (is_listed_func(func_name)) {
> > +		/* already captured */
> > +		free(func_name);
> > +		return 0;
> > +	}
> > +
> > +	/* grow listed_functions */
> > +	if (listed_functions_cnt == listed_functions_alloc) {
> > +		char **new;
> > +
> > +		listed_functions_alloc = max(100,
> > +					     listed_functions_alloc * 3 / 2);
> > +		new = realloc(listed_functions,
> > +			      listed_functions_alloc * sizeof(*listed_functions));
> > +		if (!new) {
> > +			fprintf(stderr,
> > +				"Failed to alloc memory for listed function %s%s\n",
> > +				BTF_ID_FUNC_PREFIX, name);
> > +			free(func_name);
> > +			return -1;
> > +		}
> > +		listed_functions = new;
> > +	}
> > +
> > +	listed_functions[listed_functions_cnt++] = func_name;
> > +	qsort(listed_functions, listed_functions_cnt,
> > +	      sizeof(*listed_functions), listed_function_cmp);
> 
> I was thinking of doing this at the end in setup_functions,
> but we need to do name lookups before adding..
I am planning to just use btf__add_str() as Andrii suggested.
Then most of these will go away.
Andrii Nakryiko April 27, 2021, 8:38 p.m. UTC | #7
On Tue, Apr 27, 2021 at 5:34 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Tue, Apr 27, 2021 at 01:38:20PM +0200, Jiri Olsa escreveu:
> > On Mon, Apr 26, 2021 at 04:26:11PM -0700, Andrii Nakryiko wrote:
> > > On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > > BTF is currently generated for functions that are in ftrace list
> > > > or extern.
> > > >
> > > > A recent use case also needs BTF generated for functions included in
> > > > allowlist.  In particular, the kernel
> > > > commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> > > > allows bpf program to directly call a few tcp cc kernel functions.  Those
> > > > functions are specified under an ELF section .BTF_ids.  The symbols
> > > > in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> > > > For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> > > > functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> > > > set to ensure they are in the ftrace list but this kconfig dependency
> > > > is unnecessary.
> > > >
> > > > pahole can generate BTF for those kernel functions if it knows they
> > > > are in the allowlist.  This patch is to capture those symbols
> > > > in the .BTF_ids section and generate BTF for them.
>
> > > I wonder if we just record all functions how bad that would be. Jiri,
> > > do you remember from the time you were experimenting with static
> > > functions how much more functions we'd be recording if we didn't do
> > > ftrace filtering?
>
> > hum, I can't find that.. but should be just matter of removing
> > that is_ftrace_func check
>
> > if we decided to do that, maybe we could add some bit indicating
> > that the function is traceable? it would save us check with
> > available_filter_functions file
>
> You mean encoding it in BTF, in 'struct btf_type'? Seems important to
> have it, there are free bits there:
>
> /* Max # of type identifier */
> #define BTF_MAX_TYPE    0x000fffff
> /* Max offset into the string section */
> #define BTF_MAX_NAME_OFFSET     0x00ffffff
> /* Max # of struct/union/enum members or func args */
> #define BTF_MAX_VLEN    0xffff
>
> struct btf_type {
>         __u32 name_off;
>         /* "info" bits arrangement
>          * bits  0-15: vlen (e.g. # of struct's members)
>          * bits 16-23: unused
>          * bits 24-27: kind (e.g. int, ptr, array...etc)
>          * bits 28-30: unused
>          * bit     31: kind_flag, currently used by
>          *             struct, union and fwd
>          */
>         __u32 info;
>         /* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC.
>          * "size" tells the size of the type it is describing.
>          *
>          * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
>          * FUNC, FUNC_PROTO and VAR.
>          * "type" is a type_id referring to another type.
>          */
>         union {
>                 __u32 size;
>                 __u32 type;
>         };
> };
>
> And tools that expect to trace a function can get that information from
> the BTF info instead of getting some failure when trying to trace those
> functions, right?

I don't think it belongs in BTF, though. Plus there are additional
limitations enforced by BPF verifier that will prevent some functions
to be attached. So just because the function is in BTF doesn't mean
it's 100% attachable.

>
> - Arnaldo
Arnaldo Carvalho de Melo April 28, 2021, 1:10 p.m. UTC | #8
Em Tue, Apr 27, 2021 at 01:38:51PM -0700, Andrii Nakryiko escreveu:
> On Tue, Apr 27, 2021 at 5:34 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> > And tools that expect to trace a function can get that information from
> > the BTF info instead of getting some failure when trying to trace those
> > functions, right?
 
> I don't think it belongs in BTF, though.

My thinking was that since BTF is used when tracing, one would be
interested in knowing if some functions can't be used for that.

> Plus there are additional limitations enforced by BPF verifier that
> will prevent some functions to be attached. So just because the
> function is in BTF doesn't mean it's 100% attachable.

Well, at least we would avoid some that can't for sure be used for
tracing. But, a bit in there is precious, so probably geting a NACK from
the kernel should be a good enough capability query. :-)

Tools should just silently prune things in wildcards provided by the
user that aren't traceable, silently, and provide an error message when
the user explicitely asks for tracing a verbotten function.

- Arnaldo
Andrii Nakryiko April 28, 2021, 7:45 p.m. UTC | #9
On Wed, Apr 28, 2021 at 6:10 AM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> Em Tue, Apr 27, 2021 at 01:38:51PM -0700, Andrii Nakryiko escreveu:
> > On Tue, Apr 27, 2021 at 5:34 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> > > And tools that expect to trace a function can get that information from
> > > the BTF info instead of getting some failure when trying to trace those
> > > functions, right?
>
> > I don't think it belongs in BTF, though.
>
> My thinking was that since BTF is used when tracing, one would be
> interested in knowing if some functions can't be used for that.
>
> > Plus there are additional limitations enforced by BPF verifier that
> > will prevent some functions to be attached. So just because the
> > function is in BTF doesn't mean it's 100% attachable.
>
> Well, at least we would avoid some that can't for sure be used for
> tracing. But, a bit in there is precious, so probably geting a NACK from
> the kernel should be a good enough capability query. :-)
>
> Tools should just silently prune things in wildcards provided by the
> user that aren't traceable, silently, and provide an error message when
> the user explicitely asks for tracing a verbotten function.

Yep, that's what I'm doing in my retsnoop tool (I both filter by
available kprobes [0], and have extra blacklist [1]). Loading kprobes
is pretty simple and fast enough, shouldn't be a problem.

  [0] https://github.com/anakryiko/retsnoop/blob/master/src/mass_attacher.c#L495
  [1] https://github.com/anakryiko/retsnoop/blob/master/src/mass_attacher.c#L41


>
> - Arnaldo
Arnaldo Carvalho de Melo April 28, 2021, 10:05 p.m. UTC | #10
Em Wed, Apr 28, 2021 at 12:45:10PM -0700, Andrii Nakryiko escreveu:
> On Wed, Apr 28, 2021 at 6:10 AM Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> wrote:
> > Em Tue, Apr 27, 2021 at 01:38:51PM -0700, Andrii Nakryiko escreveu:
> > > On Tue, Apr 27, 2021 at 5:34 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> > > > And tools that expect to trace a function can get that information from
> > > > the BTF info instead of getting some failure when trying to trace those
> > > > functions, right?

> > > I don't think it belongs in BTF, though.

> > My thinking was that since BTF is used when tracing, one would be
> > interested in knowing if some functions can't be used for that.

> > > Plus there are additional limitations enforced by BPF verifier that
> > > will prevent some functions to be attached. So just because the
> > > function is in BTF doesn't mean it's 100% attachable.

> > Well, at least we would avoid some that can't for sure be used for
> > tracing. But, a bit in there is precious, so probably geting a NACK from
> > the kernel should be a good enough capability query. :-)

> > Tools should just silently prune things in wildcards provided by the
> > user that aren't traceable, silently, and provide an error message when
> > the user explicitely asks for tracing a verbotten function.

> Yep, that's what I'm doing in my retsnoop tool (I both filter by
> available kprobes [0], and have extra blacklist [1]). Loading kprobes
> is pretty simple and fast enough, shouldn't be a problem.

>   [0] https://github.com/anakryiko/retsnoop/blob/master/src/mass_attacher.c#L495
>   [1] https://github.com/anakryiko/retsnoop/blob/master/src/mass_attacher.c#L41

Oh my, one more tool to download, build, try, have updated, etc. Will
look at it, thanks for the pointer.

- Arnaldo
Martin KaFai Lau May 1, 2021, 12:16 a.m. UTC | #11
On Tue, Apr 27, 2021 at 01:38:20PM +0200, Jiri Olsa wrote:
> On Mon, Apr 26, 2021 at 04:26:11PM -0700, Andrii Nakryiko wrote:
> > On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > > BTF is currently generated for functions that are in ftrace list
> > > or extern.
> > >
> > > A recent use case also needs BTF generated for functions included in
> > > allowlist.  In particular, the kernel
> > > commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> > > allows bpf program to directly call a few tcp cc kernel functions.  Those
> > > functions are specified under an ELF section .BTF_ids.  The symbols
> > > in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> > > For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> > > functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> > > set to ensure they are in the ftrace list but this kconfig dependency
> > > is unnecessary.
> > >
> > > pahole can generate BTF for those kernel functions if it knows they
> > > are in the allowlist.  This patch is to capture those symbols
> > > in the .BTF_ids section and generate BTF for them.
> > >
> > > Cc: Andrii Nakryiko <andrii@kernel.org>
> > > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > > ---
> > 
> > I wonder if we just record all functions how bad that would be. Jiri,
> > do you remember from the time you were experimenting with static
> > functions how much more functions we'd be recording if we didn't do
> > ftrace filtering?
> 
> hum, I can't find that.. but should be just matter of removing
> that is_ftrace_func check
In my kconfig, by ignoring is_ftrace_func(),
number of FUNC: 40643 vs 46225

I would say skip the ftrace filtering instead of my current patch.  Thoughts?
Jiri Olsa May 1, 2021, 10:23 p.m. UTC | #12
On Fri, Apr 30, 2021 at 05:16:53PM -0700, Martin KaFai Lau wrote:
> On Tue, Apr 27, 2021 at 01:38:20PM +0200, Jiri Olsa wrote:
> > On Mon, Apr 26, 2021 at 04:26:11PM -0700, Andrii Nakryiko wrote:
> > > On Fri, Apr 23, 2021 at 2:37 PM Martin KaFai Lau <kafai@fb.com> wrote:
> > > >
> > > > BTF is currently generated for functions that are in ftrace list
> > > > or extern.
> > > >
> > > > A recent use case also needs BTF generated for functions included in
> > > > allowlist.  In particular, the kernel
> > > > commit e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
> > > > allows bpf program to directly call a few tcp cc kernel functions.  Those
> > > > functions are specified under an ELF section .BTF_ids.  The symbols
> > > > in this ELF section is like __BTF_ID__func__<kernel_func>__[digit]+.
> > > > For example, __BTF_ID__func__cubictcp_init__1.  Those kernel
> > > > functions are currently allowed only if CONFIG_DYNAMIC_FTRACE is
> > > > set to ensure they are in the ftrace list but this kconfig dependency
> > > > is unnecessary.
> > > >
> > > > pahole can generate BTF for those kernel functions if it knows they
> > > > are in the allowlist.  This patch is to capture those symbols
> > > > in the .BTF_ids section and generate BTF for them.
> > > >
> > > > Cc: Andrii Nakryiko <andrii@kernel.org>
> > > > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > > > ---
> > > 
> > > I wonder if we just record all functions how bad that would be. Jiri,
> > > do you remember from the time you were experimenting with static
> > > functions how much more functions we'd be recording if we didn't do
> > > ftrace filtering?
> > 
> > hum, I can't find that.. but should be just matter of removing
> > that is_ftrace_func check
> In my kconfig, by ignoring is_ftrace_func(),
> number of FUNC: 40643 vs 46225
> 
> I would say skip the ftrace filtering instead of my current patch.  Thoughts?
> 

I tested on arm and got 25022 vs 55812 ;-)
and of course it fixes the compilation issue with cubictcp_state

I checked the pahole changelog and the original idea was to have only
traceable functions in BTF, but we need more than that now, so I'm ok
to skip the ftrace filter

jirka
diff mbox series

Patch

diff --git a/btf_encoder.c b/btf_encoder.c
index 80e896961d4e..48c183915ddd 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -106,6 +106,121 @@  static int collect_function(struct btf_elf *btfe, GElf_Sym *sym,
 	return 0;
 }
 
+#define BTF_ID_FUNC_PREFIX "__BTF_ID__func__"
+#define BTF_ID_FUNC_PREFIX_LEN (sizeof(BTF_ID_FUNC_PREFIX) - 1)
+
+static char **listed_functions;
+static int listed_functions_alloc;
+static int listed_functions_cnt;
+
+static void delete_listed_functions(void)
+{
+	int i;
+
+	for (i = 0; i < listed_functions_cnt; i++)
+		free(listed_functions[i]);
+
+	free(listed_functions);
+	/* In case btf_encoder__encode() will be called multiple times */
+	listed_functions_alloc = 0;
+	listed_functions_cnt = 0;
+	listed_functions = NULL;
+}
+
+static int listed_function_cmp(const void *_a, const void *_b)
+{
+	const char *a = *(const char **)_a;
+	const char *b = *(const char **)_b;
+
+	return strcmp(a, b);
+}
+
+static bool is_listed_func(const char *name)
+{
+	return !!bsearch(&name, listed_functions, listed_functions_cnt,
+			 sizeof(*listed_functions), listed_function_cmp);
+}
+
+static int collect_listed_functions(struct btf_elf *btfe, GElf_Sym *sym,
+				    size_t sym_sec_idx)
+{
+	int len, digits = 0, underscores = 0;
+	const char *name;
+	char *func_name;
+
+	if (!btfe->btf_ids_shndx ||
+	    btfe->btf_ids_shndx != sym_sec_idx)
+		return 0;
+
+	/* The kernel function in the btf id list will have symbol like:
+	 * __BTF_ID__func__<kernel_func_name>__[digit]+
+	 */
+	name = elf_sym__name(sym, btfe->symtab);
+	if (strncmp(name, BTF_ID_FUNC_PREFIX, BTF_ID_FUNC_PREFIX_LEN))
+		return 0;
+
+	name += BTF_ID_FUNC_PREFIX_LEN;
+
+	/* name: <kernel_func_name>__[digit]+
+	 * Strip the ending __[digit]+
+	 */
+	for (len = strlen(name); len && underscores != 2; len--) {
+		char c = name[len - 1];
+
+		if (c == '_') {
+			if (!digits)
+				return 0;
+			underscores++;
+		} else if (isdigit(c)) {
+			if (underscores)
+				return 0;
+			digits++;
+		} else {
+			return 0;
+		}
+	}
+
+	if (!len)
+		return 0;
+
+	func_name = strndup(name, len);
+	if (!func_name) {
+		fprintf(stderr,
+			"Failed to alloc memory for listed function %s%s\n",
+			BTF_ID_FUNC_PREFIX, name);
+		return -1;
+	}
+
+	if (is_listed_func(func_name)) {
+		/* already captured */
+		free(func_name);
+		return 0;
+	}
+
+	/* grow listed_functions */
+	if (listed_functions_cnt == listed_functions_alloc) {
+		char **new;
+
+		listed_functions_alloc = max(100,
+					     listed_functions_alloc * 3 / 2);
+		new = realloc(listed_functions,
+			      listed_functions_alloc * sizeof(*listed_functions));
+		if (!new) {
+			fprintf(stderr,
+				"Failed to alloc memory for listed function %s%s\n",
+				BTF_ID_FUNC_PREFIX, name);
+			free(func_name);
+			return -1;
+		}
+		listed_functions = new;
+	}
+
+	listed_functions[listed_functions_cnt++] = func_name;
+	qsort(listed_functions, listed_functions_cnt,
+	      sizeof(*listed_functions), listed_function_cmp);
+	return 0;
+}
+
 static int addrs_cmp(const void *_a, const void *_b)
 {
 	const __u64 *a = _a;
@@ -294,14 +409,15 @@  static int setup_functions(struct btf_elf *btfe, struct funcs_layout *fl)
 		kmod = true;
 	}
 
-	if (!addrs) {
+	if (!addrs && !listed_functions_cnt) {
 		if (btf_elf__verbose)
-			printf("ftrace symbols not detected, falling back to DWARF data\n");
+			printf("ftrace symbols and btf listed functions are not detected, falling back to DWARF data\n");
 		delete_functions();
 		return 0;
 	}
 
-	qsort(addrs, count, sizeof(addrs[0]), addrs_cmp);
+	if (addrs)
+		qsort(addrs, count, sizeof(addrs[0]), addrs_cmp);
 	qsort(functions, functions_cnt, sizeof(functions[0]), functions_cmp);
 
 	/*
@@ -321,8 +437,12 @@  static int setup_functions(struct btf_elf *btfe, struct funcs_layout *fl)
 		if (kmod)
 			func->addr += func->sh_addr;
 
-		/* Make sure function is within ftrace addresses. */
-		if (is_ftrace_func(func, addrs, count)) {
+		/*
+		 * Make sure function is within ftrace addresses or
+		 * is a listed function in the .BTF_ids section.
+		 */
+		if (is_ftrace_func(func, addrs, count) ||
+		    is_listed_func(func->name)) {
 			/*
 			 * We iterate over sorted array, so we can easily skip
 			 * not valid item and move following valid field into
@@ -533,6 +653,7 @@  int btf_encoder__encode()
 
 	err = btf_elf__encode(btfe, 0);
 	delete_functions();
+	delete_listed_functions();
 	btf_elf__delete(btfe);
 	btfe = NULL;
 
@@ -650,6 +771,8 @@  static int collect_symbols(struct btf_elf *btfe, bool collect_percpu_vars)
 			return -1;
 		if (collect_function(btfe, &sym, sym_sec_idx))
 			return -1;
+		if (collect_listed_functions(btfe, &sym, sym_sec_idx))
+			return -1;
 		collect_symbol(&sym, &fl, sym_sec_idx);
 	}
 
@@ -764,7 +887,8 @@  int cu__encode_btf(struct cu *cu, int verbose, bool force,
 		 *   - are marked as declarations
 		 *   - do not have full argument names
 		 *   - are not in ftrace list (if it's available)
-		 *   - are not external (in case ftrace filter is not available)
+		 *   - are not in the .BTF_ids section
+		 *   - are not external (in case ftrace and .BTF_ids filter are not available)
 		 */
 		if (fn->declaration)
 			continue;
diff --git a/libbtf.c b/libbtf.c
index c2360ff1f804..d9729bd4680d 100644
--- a/libbtf.c
+++ b/libbtf.c
@@ -168,6 +168,16 @@  try_as_raw_btf:
 		return btfe;
 	}
 
+	sec = elf_section_by_name(btfe->elf, &btfe->ehdr, &shdr,
+				  BTF_IDS_SECTION, NULL);
+	if (!sec) {
+		if (btf_elf__verbose)
+			printf("%s: '%s' doesn't have '%s' section\n", __func__,
+			       btfe->filename, BTF_IDS_SECTION);
+	} else {
+		btfe->btf_ids_shndx = elf_ndxscn(sec);
+	}
+
 	/* find percpu section's shndx */
 	sec = elf_section_by_name(btfe->elf, &btfe->ehdr, &shdr, PERCPU_SECTION,
 				  NULL);
diff --git a/libbtf.h b/libbtf.h
index c7cbe6e17ee7..6caaac306f77 100644
--- a/libbtf.h
+++ b/libbtf.h
@@ -24,6 +24,7 @@  struct btf_elf {
 	uint8_t		  wordsize;
 	bool		  is_big_endian;
 	bool		  raw_btf; // "/sys/kernel/btf/vmlinux"
+	uint32_t	  btf_ids_shndx;
 	uint32_t	  percpu_shndx;
 	uint64_t	  percpu_base_addr;
 	uint64_t	  percpu_sec_sz;
@@ -38,6 +39,7 @@  extern uint8_t btf_elf__force;
 extern bool btf_gen_floats;
 
 #define PERCPU_SECTION ".data..percpu"
+#define BTF_IDS_SECTION ".BTF_ids"
 
 struct cu;
 struct base_type;