diff mbox series

[dwarves,v3,5/5] pahole: add global_var BTF feature

Message ID 20241002235253.487251-6-stephen.s.brennan@oracle.com (mailing list archive)
State Not Applicable
Delegated to: BPF
Headers show
Series [dwarves,v3,1/5] btf_encoder: use bitfield to control var encoding | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Stephen Brennan Oct. 2, 2024, 11:52 p.m. UTC
So far, pahole has only encoded type information for percpu variables.
However, there are several reasons why type information for all global
variables would be useful in the kernel:

1. Runtime kernel debuggers like drgn could use the BTF to introspect
kernel data structures without needing to install heavyweight DWARF.

2. BPF programs using the "__ksym" annotation could declare the
variables using the correct type, rather than "void".

It makes sense to introduce a feature for this in pahole so that these
capabilities can be explored in the kernel. The feature is non-default:
when using "--btf-features=default", it is disabled. It must be
explicitly requested, e.g. with "--btf-features=+global_var".

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
---
 btf_encoder.c      | 5 +++++
 btf_encoder.h      | 1 +
 dwarves.h          | 1 +
 man-pages/pahole.1 | 7 +++++--
 pahole.c           | 3 ++-
 5 files changed, 14 insertions(+), 3 deletions(-)

Comments

Alan Maguire Oct. 3, 2024, 2:40 p.m. UTC | #1
On 03/10/2024 00:52, Stephen Brennan wrote:
> So far, pahole has only encoded type information for percpu variables.
> However, there are several reasons why type information for all global
> variables would be useful in the kernel:
> 
> 1. Runtime kernel debuggers like drgn could use the BTF to introspect
> kernel data structures without needing to install heavyweight DWARF.
> 
> 2. BPF programs using the "__ksym" annotation could declare the
> variables using the correct type, rather than "void".
> 
> It makes sense to introduce a feature for this in pahole so that these
> capabilities can be explored in the kernel. The feature is non-default:
> when using "--btf-features=default", it is disabled. It must be
> explicitly requested, e.g. with "--btf-features=+global_var".
>

I'm not totally sure switching global_var to a non-default feature is
the right answer here.

The --btf_features "default" set of options are meant to closely mirror
the upstream kernel options we enable when doing BTF encoding. However,
in scripts/Makefile.btf we don't use "default"; we explicitly call out
the set of features we want. We can't just use "default" in that context
since the meaning of "default" varies based upon whatever version of
pahole you have.

So "default" is simply a convenient shorthand for pahole testing which
corresponds to "give me the set of features that upstream kernels use".
It could have a better name that reflects that more clearly I suppose.

When we do switch this on in-kernel, we'll add the explicit "global_var"
to the list of features in scripts/Makefile.btf.

So with all this said, do we make global_vars a default or non-default
feature? It would seem to make sense to specify non-default, since it is
not switched on for the kernel yet, but looking ahead, what if the 1.28
pahole release is used to build vmlinux BTF and we add global_var to the
list of features? In such a case, our "default" set of values would be
out of step with the kernel. So it's not a huge deal, but I would
consider keeping this a default feature to facilitate testing; this
won't change what the kernel does, but it makes testing with full
variable generation easier (I can just do "--btf_features=default").

And on that subject, I tested with this series, and all looks good.
vmlinux BTF grew by 1.5Mb to 6.7Mb for me on a bpf-next kernel.
Following datasecs were seen:

[156581] DATASEC '.rodata' size=7379360 vlen=5472
[156582] DATASEC '__init_rodata' size=496 vlen=3
[156583] DATASEC '__param' size=15160 vlen=375
[156584] DATASEC '__modver' size=864 vlen=12
[156585] DATASEC '.data' size=5955041 vlen=15839
[156586] DATASEC '.vvar' size=656 vlen=2
[156587] DATASEC '.data..percpu' size=229632 vlen=389
[156588] DATASEC '.init.data' size=2057888 vlen=5565
[156589] DATASEC '.x86_cpu_dev.init' size=40 vlen=5
[156590] DATASEC '.apicdrivers' size=56 vlen=7
[156591] DATASEC '.data_nosave' size=4 vlen=1
[156592] DATASEC '.bss' size=3788800 vlen=4080
[156593] DATASEC '.brk' size=196608 vlen=2
[156594] DATASEC '.init.scratch' size=4194304 vlen=1

Biggest contributors in terms of BTF size appear to be

- .data (15839 vars)
- .init.data (5565 vars)
- .rodata (5472 vars)
- .bss (4080 vars)

> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>

Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>

> ---
>  btf_encoder.c      | 5 +++++
>  btf_encoder.h      | 1 +
>  dwarves.h          | 1 +
>  man-pages/pahole.1 | 7 +++++--
>  pahole.c           | 3 ++-
>  5 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index 2fd1648..2730ea8 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -2348,6 +2348,8 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
>  		encoder->encode_vars = 0;
>  		if (!conf_load->skip_encoding_btf_vars)
>  			encoder->encode_vars |= BTF_VAR_PERCPU;
> +		if (conf_load->encode_btf_global_vars)
> +			encoder->encode_vars |= BTF_VAR_GLOBAL;
>  
>  		GElf_Ehdr ehdr;
>  
> @@ -2400,6 +2402,9 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
>  			encoder->secinfo[shndx].name = secname;
>  			encoder->secinfo[shndx].type = shdr.sh_type;
>  
> +			if (encoder->encode_vars & BTF_VAR_GLOBAL)
> +				encoder->secinfo[shndx].include = true;
> +
>  			if (strcmp(secname, PERCPU_SECTION) == 0) {
>  				found_percpu = true;
>  				if (encoder->encode_vars & BTF_VAR_PERCPU)
> diff --git a/btf_encoder.h b/btf_encoder.h
> index 91e7947..824963b 100644
> --- a/btf_encoder.h
> +++ b/btf_encoder.h
> @@ -20,6 +20,7 @@ struct list_head;
>  enum btf_var_option {
>  	BTF_VAR_NONE = 0,
>  	BTF_VAR_PERCPU = 1,
> +	BTF_VAR_GLOBAL = 2,
>  };
>  
>  struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load);
> diff --git a/dwarves.h b/dwarves.h
> index 0fede91..fef881f 100644
> --- a/dwarves.h
> +++ b/dwarves.h
> @@ -92,6 +92,7 @@ struct conf_load {
>  	bool			btf_gen_optimized;
>  	bool			skip_encoding_btf_inconsistent_proto;
>  	bool			skip_encoding_btf_vars;
> +	bool			encode_btf_global_vars;
>  	bool			btf_gen_floats;
>  	bool			btf_encode_force;
>  	bool			reproducible_build;
> diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
> index b3e6632..7c1a69a 100644
> --- a/man-pages/pahole.1
> +++ b/man-pages/pahole.1
> @@ -238,7 +238,9 @@ the debugging information.
>  
>  .TP
>  .B \-\-skip_encoding_btf_vars
> -Do not encode VARs in BTF.
> +By default, VARs are encoded only for percpu variables. When specified, this
> +option prevents encoding any VARs. Note that this option can be overridden
> +by the feature "global_var".
>  
>  .TP
>  .B \-\-skip_encoding_btf_decl_tag
> @@ -304,7 +306,7 @@ Encode BTF using the specified feature list, or specify 'default' for all standa
>  	encode_force       Ignore invalid symbols when encoding BTF; for example
>  	                   if a symbol has an invalid name, it will be ignored
>  	                   and BTF encoding will continue.
> -	var                Encode variables using BTF_KIND_VAR in BTF.
> +	var                Encode percpu variables using BTF_KIND_VAR in BTF.
>  	float              Encode floating-point types in BTF.
>  	decl_tag           Encode declaration tags using BTF_KIND_DECL_TAG.
>  	type_tag           Encode type tags using BTF_KIND_TYPE_TAG.
> @@ -329,6 +331,7 @@ Supported non-standard features (not enabled for 'default')
>  	                   the associated base BTF to support later relocation
>  	                   of split BTF with a possibly changed base, storing
>  	                   it in a .BTF.base ELF section.
> +	global_var         Encode all global variables using BTF_KIND_VAR in BTF.
>  .fi
>  
>  So for example, specifying \-\-btf_encode=var,enum64 will result in a BTF encoding that (as well as encoding basic BTF information) will contain variables and enum64 values.
> diff --git a/pahole.c b/pahole.c
> index b21a7f2..9f0dc59 100644
> --- a/pahole.c
> +++ b/pahole.c
> @@ -1301,6 +1301,7 @@ struct btf_feature {
>  	BTF_DEFAULT_FEATURE(decl_tag_kfuncs, btf_decl_tag_kfuncs, false),
>  	BTF_NON_DEFAULT_FEATURE(reproducible_build, reproducible_build, false),
>  	BTF_NON_DEFAULT_FEATURE(distilled_base, btf_gen_distilled_base, false),
> +	BTF_NON_DEFAULT_FEATURE(global_var, encode_btf_global_vars, false),

see above, I'd suggest making this a BTF_DEFAULT_FEATURE() to make
testing easier.

>  };
>  
>  #define BTF_MAX_FEATURE_STR	1024
> @@ -1733,7 +1734,7 @@ static const struct argp_option pahole__options[] = {
>  	{
>  		.name = "skip_encoding_btf_vars",
>  		.key  = ARGP_skip_encoding_btf_vars,
> -		.doc  = "Do not encode VARs in BTF."
> +		.doc  = "Do not encode any VARs in BTF [if this is not specified, only percpu variables are encoded. To encode global variables too, use --encode_btf_global_vars]."
>  	},
>  	{
>  		.name = "btf_encode_force",
Arnaldo Carvalho de Melo Oct. 3, 2024, 2:53 p.m. UTC | #2
On Thu, Oct 03, 2024 at 03:40:35PM +0100, Alan Maguire wrote:
> On 03/10/2024 00:52, Stephen Brennan wrote:
> > So far, pahole has only encoded type information for percpu variables.
> > However, there are several reasons why type information for all global
> > variables would be useful in the kernel:

> > 1. Runtime kernel debuggers like drgn could use the BTF to introspect
> > kernel data structures without needing to install heavyweight DWARF.

> > 2. BPF programs using the "__ksym" annotation could declare the
> > variables using the correct type, rather than "void".

> > It makes sense to introduce a feature for this in pahole so that these
> > capabilities can be explored in the kernel. The feature is non-default:
> > when using "--btf-features=default", it is disabled. It must be
> > explicitly requested, e.g. with "--btf-features=+global_var".
 
> I'm not totally sure switching global_var to a non-default feature is
> the right answer here.
 
> The --btf_features "default" set of options are meant to closely mirror
> the upstream kernel options we enable when doing BTF encoding. However,
> in scripts/Makefile.btf we don't use "default"; we explicitly call out
> the set of features we want. We can't just use "default" in that context
> since the meaning of "default" varies based upon whatever version of
> pahole you have.
 
> So "default" is simply a convenient shorthand for pahole testing which
> corresponds to "give me the set of features that upstream kernels use".
> It could have a better name that reflects that more clearly I suppose.
 
> When we do switch this on in-kernel, we'll add the explicit "global_var"
> to the list of features in scripts/Makefile.btf.
 
> So with all this said, do we make global_vars a default or non-default
> feature? It would seem to make sense to specify non-default, since it is
> not switched on for the kernel yet, but looking ahead, what if the 1.28
> pahole release is used to build vmlinux BTF and we add global_var to the
> list of features? In such a case, our "default" set of values would be
> out of step with the kernel. So it's not a huge deal, but I would
> consider keeping this a default feature to facilitate testing; this
> won't change what the kernel does, but it makes testing with full
> variable generation easier (I can just do "--btf_features=default").

This "default" really is confusing, as you spelled out above :-\ When to
add something to it so that it reflects what the kernel has is tricky,
perhaps we should instead have a ~/.config/pahole file where developers
can add BTF features to add to --btf_features=default in the period
where something new was _really_ added to the kernel and before the next
version when it _have been added to the kernel set of BTF features_ thus
should be set into stone in the pahole sources?

So I think we should do as Stephen did, keep it out of
--btf_features=default, as it is not yet in the kernel set of options,
and have the config file, starting with being able to set those
features, i.e. we would have:

$ cat ~/.config/pahole
[btf_encoder]
	btf_features=+global_var

wdyt?

- Arnaldo
 
> And on that subject, I tested with this series, and all looks good.
> vmlinux BTF grew by 1.5Mb to 6.7Mb for me on a bpf-next kernel.
> Following datasecs were seen:
> 
> [156581] DATASEC '.rodata' size=7379360 vlen=5472
> [156582] DATASEC '__init_rodata' size=496 vlen=3
> [156583] DATASEC '__param' size=15160 vlen=375
> [156584] DATASEC '__modver' size=864 vlen=12
> [156585] DATASEC '.data' size=5955041 vlen=15839
> [156586] DATASEC '.vvar' size=656 vlen=2
> [156587] DATASEC '.data..percpu' size=229632 vlen=389
> [156588] DATASEC '.init.data' size=2057888 vlen=5565
> [156589] DATASEC '.x86_cpu_dev.init' size=40 vlen=5
> [156590] DATASEC '.apicdrivers' size=56 vlen=7
> [156591] DATASEC '.data_nosave' size=4 vlen=1
> [156592] DATASEC '.bss' size=3788800 vlen=4080
> [156593] DATASEC '.brk' size=196608 vlen=2
> [156594] DATASEC '.init.scratch' size=4194304 vlen=1
> 
> Biggest contributors in terms of BTF size appear to be
> 
> - .data (15839 vars)
> - .init.data (5565 vars)
> - .rodata (5472 vars)
> - .bss (4080 vars)
> 
> > Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> 
> Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
> Tested-by: Alan Maguire <alan.maguire@oracle.com>
> 
> > ---
> >  btf_encoder.c      | 5 +++++
> >  btf_encoder.h      | 1 +
> >  dwarves.h          | 1 +
> >  man-pages/pahole.1 | 7 +++++--
> >  pahole.c           | 3 ++-
> >  5 files changed, 14 insertions(+), 3 deletions(-)
> > 
> > diff --git a/btf_encoder.c b/btf_encoder.c
> > index 2fd1648..2730ea8 100644
> > --- a/btf_encoder.c
> > +++ b/btf_encoder.c
> > @@ -2348,6 +2348,8 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
> >  		encoder->encode_vars = 0;
> >  		if (!conf_load->skip_encoding_btf_vars)
> >  			encoder->encode_vars |= BTF_VAR_PERCPU;
> > +		if (conf_load->encode_btf_global_vars)
> > +			encoder->encode_vars |= BTF_VAR_GLOBAL;
> >  
> >  		GElf_Ehdr ehdr;
> >  
> > @@ -2400,6 +2402,9 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
> >  			encoder->secinfo[shndx].name = secname;
> >  			encoder->secinfo[shndx].type = shdr.sh_type;
> >  
> > +			if (encoder->encode_vars & BTF_VAR_GLOBAL)
> > +				encoder->secinfo[shndx].include = true;
> > +
> >  			if (strcmp(secname, PERCPU_SECTION) == 0) {
> >  				found_percpu = true;
> >  				if (encoder->encode_vars & BTF_VAR_PERCPU)
> > diff --git a/btf_encoder.h b/btf_encoder.h
> > index 91e7947..824963b 100644
> > --- a/btf_encoder.h
> > +++ b/btf_encoder.h
> > @@ -20,6 +20,7 @@ struct list_head;
> >  enum btf_var_option {
> >  	BTF_VAR_NONE = 0,
> >  	BTF_VAR_PERCPU = 1,
> > +	BTF_VAR_GLOBAL = 2,
> >  };
> >  
> >  struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load);
> > diff --git a/dwarves.h b/dwarves.h
> > index 0fede91..fef881f 100644
> > --- a/dwarves.h
> > +++ b/dwarves.h
> > @@ -92,6 +92,7 @@ struct conf_load {
> >  	bool			btf_gen_optimized;
> >  	bool			skip_encoding_btf_inconsistent_proto;
> >  	bool			skip_encoding_btf_vars;
> > +	bool			encode_btf_global_vars;
> >  	bool			btf_gen_floats;
> >  	bool			btf_encode_force;
> >  	bool			reproducible_build;
> > diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
> > index b3e6632..7c1a69a 100644
> > --- a/man-pages/pahole.1
> > +++ b/man-pages/pahole.1
> > @@ -238,7 +238,9 @@ the debugging information.
> >  
> >  .TP
> >  .B \-\-skip_encoding_btf_vars
> > -Do not encode VARs in BTF.
> > +By default, VARs are encoded only for percpu variables. When specified, this
> > +option prevents encoding any VARs. Note that this option can be overridden
> > +by the feature "global_var".
> >  
> >  .TP
> >  .B \-\-skip_encoding_btf_decl_tag
> > @@ -304,7 +306,7 @@ Encode BTF using the specified feature list, or specify 'default' for all standa
> >  	encode_force       Ignore invalid symbols when encoding BTF; for example
> >  	                   if a symbol has an invalid name, it will be ignored
> >  	                   and BTF encoding will continue.
> > -	var                Encode variables using BTF_KIND_VAR in BTF.
> > +	var                Encode percpu variables using BTF_KIND_VAR in BTF.
> >  	float              Encode floating-point types in BTF.
> >  	decl_tag           Encode declaration tags using BTF_KIND_DECL_TAG.
> >  	type_tag           Encode type tags using BTF_KIND_TYPE_TAG.
> > @@ -329,6 +331,7 @@ Supported non-standard features (not enabled for 'default')
> >  	                   the associated base BTF to support later relocation
> >  	                   of split BTF with a possibly changed base, storing
> >  	                   it in a .BTF.base ELF section.
> > +	global_var         Encode all global variables using BTF_KIND_VAR in BTF.
> >  .fi
> >  
> >  So for example, specifying \-\-btf_encode=var,enum64 will result in a BTF encoding that (as well as encoding basic BTF information) will contain variables and enum64 values.
> > diff --git a/pahole.c b/pahole.c
> > index b21a7f2..9f0dc59 100644
> > --- a/pahole.c
> > +++ b/pahole.c
> > @@ -1301,6 +1301,7 @@ struct btf_feature {
> >  	BTF_DEFAULT_FEATURE(decl_tag_kfuncs, btf_decl_tag_kfuncs, false),
> >  	BTF_NON_DEFAULT_FEATURE(reproducible_build, reproducible_build, false),
> >  	BTF_NON_DEFAULT_FEATURE(distilled_base, btf_gen_distilled_base, false),
> > +	BTF_NON_DEFAULT_FEATURE(global_var, encode_btf_global_vars, false),
> 
> see above, I'd suggest making this a BTF_DEFAULT_FEATURE() to make
> testing easier.
> 
> >  };
> >  
> >  #define BTF_MAX_FEATURE_STR	1024
> > @@ -1733,7 +1734,7 @@ static const struct argp_option pahole__options[] = {
> >  	{
> >  		.name = "skip_encoding_btf_vars",
> >  		.key  = ARGP_skip_encoding_btf_vars,
> > -		.doc  = "Do not encode VARs in BTF."
> > +		.doc  = "Do not encode any VARs in BTF [if this is not specified, only percpu variables are encoded. To encode global variables too, use --encode_btf_global_vars]."
> >  	},
> >  	{
> >  		.name = "btf_encode_force",
Alan Maguire Oct. 3, 2024, 3:21 p.m. UTC | #3
On 03/10/2024 15:53, Arnaldo Carvalho de Melo wrote:
> On Thu, Oct 03, 2024 at 03:40:35PM +0100, Alan Maguire wrote:
>> On 03/10/2024 00:52, Stephen Brennan wrote:
>>> So far, pahole has only encoded type information for percpu variables.
>>> However, there are several reasons why type information for all global
>>> variables would be useful in the kernel:
> 
>>> 1. Runtime kernel debuggers like drgn could use the BTF to introspect
>>> kernel data structures without needing to install heavyweight DWARF.
> 
>>> 2. BPF programs using the "__ksym" annotation could declare the
>>> variables using the correct type, rather than "void".
> 
>>> It makes sense to introduce a feature for this in pahole so that these
>>> capabilities can be explored in the kernel. The feature is non-default:
>>> when using "--btf-features=default", it is disabled. It must be
>>> explicitly requested, e.g. with "--btf-features=+global_var".
>  
>> I'm not totally sure switching global_var to a non-default feature is
>> the right answer here.
>  
>> The --btf_features "default" set of options are meant to closely mirror
>> the upstream kernel options we enable when doing BTF encoding. However,
>> in scripts/Makefile.btf we don't use "default"; we explicitly call out
>> the set of features we want. We can't just use "default" in that context
>> since the meaning of "default" varies based upon whatever version of
>> pahole you have.
>  
>> So "default" is simply a convenient shorthand for pahole testing which
>> corresponds to "give me the set of features that upstream kernels use".
>> It could have a better name that reflects that more clearly I suppose.
>  
>> When we do switch this on in-kernel, we'll add the explicit "global_var"
>> to the list of features in scripts/Makefile.btf.
>  
>> So with all this said, do we make global_vars a default or non-default
>> feature? It would seem to make sense to specify non-default, since it is
>> not switched on for the kernel yet, but looking ahead, what if the 1.28
>> pahole release is used to build vmlinux BTF and we add global_var to the
>> list of features? In such a case, our "default" set of values would be
>> out of step with the kernel. So it's not a huge deal, but I would
>> consider keeping this a default feature to facilitate testing; this
>> won't change what the kernel does, but it makes testing with full
>> variable generation easier (I can just do "--btf_features=default").
> 
> This "default" really is confusing, as you spelled out above :-\ When to
> add something to it so that it reflects what the kernel has is tricky,
> perhaps we should instead have a ~/.config/pahole file where developers
> can add BTF features to add to --btf_features=default in the period
> where something new was _really_ added to the kernel and before the next
> version when it _have been added to the kernel set of BTF features_ thus
> should be set into stone in the pahole sources?
> 

it's a nice idea; I suppose once we have more automated tests, this will
be less of an issue too. I'm looking at adding a BTF variable test
shortly, would be good to have coverage there too, especially since
we're growing the amount of info we encode in this area.

> So I think we should do as Stephen did, keep it out of
> --btf_features=default, as it is not yet in the kernel set of options,
> and have the config file, starting with being able to set those
> features, i.e. we would have:
> 
> $ cat ~/.config/pahole
> [btf_encoder]
> 	btf_features=+global_var
> 
> wdyt?
> 

I think that makes perfect sense, great idea!

Alan
Arnaldo Carvalho de Melo Oct. 3, 2024, 5:18 p.m. UTC | #4
On Thu, Oct 03, 2024 at 04:21:07PM +0100, Alan Maguire wrote:
> On 03/10/2024 15:53, Arnaldo Carvalho de Melo wrote:
> > On Thu, Oct 03, 2024 at 03:40:35PM +0100, Alan Maguire wrote:
> >> On 03/10/2024 00:52, Stephen Brennan wrote:
> >>> So far, pahole has only encoded type information for percpu variables.
> >>> However, there are several reasons why type information for all global
> >>> variables would be useful in the kernel:
> > 
> >>> 1. Runtime kernel debuggers like drgn could use the BTF to introspect
> >>> kernel data structures without needing to install heavyweight DWARF.
> > 
> >>> 2. BPF programs using the "__ksym" annotation could declare the
> >>> variables using the correct type, rather than "void".
> > 
> >>> It makes sense to introduce a feature for this in pahole so that these
> >>> capabilities can be explored in the kernel. The feature is non-default:
> >>> when using "--btf-features=default", it is disabled. It must be
> >>> explicitly requested, e.g. with "--btf-features=+global_var".
> >  
> >> I'm not totally sure switching global_var to a non-default feature is
> >> the right answer here.
> >  
> >> The --btf_features "default" set of options are meant to closely mirror
> >> the upstream kernel options we enable when doing BTF encoding. However,
> >> in scripts/Makefile.btf we don't use "default"; we explicitly call out
> >> the set of features we want. We can't just use "default" in that context
> >> since the meaning of "default" varies based upon whatever version of
> >> pahole you have.
> >  
> >> So "default" is simply a convenient shorthand for pahole testing which
> >> corresponds to "give me the set of features that upstream kernels use".
> >> It could have a better name that reflects that more clearly I suppose.
> >  
> >> When we do switch this on in-kernel, we'll add the explicit "global_var"
> >> to the list of features in scripts/Makefile.btf.
> >  
> >> So with all this said, do we make global_vars a default or non-default
> >> feature? It would seem to make sense to specify non-default, since it is
> >> not switched on for the kernel yet, but looking ahead, what if the 1.28
> >> pahole release is used to build vmlinux BTF and we add global_var to the
> >> list of features? In such a case, our "default" set of values would be
> >> out of step with the kernel. So it's not a huge deal, but I would
> >> consider keeping this a default feature to facilitate testing; this
> >> won't change what the kernel does, but it makes testing with full
> >> variable generation easier (I can just do "--btf_features=default").
> > 
> > This "default" really is confusing, as you spelled out above :-\ When to
> > add something to it so that it reflects what the kernel has is tricky,
> > perhaps we should instead have a ~/.config/pahole file where developers
> > can add BTF features to add to --btf_features=default in the period
> > where something new was _really_ added to the kernel and before the next
> > version when it _have been added to the kernel set of BTF features_ thus
> > should be set into stone in the pahole sources?
 
> it's a nice idea; I suppose once we have more automated tests, this will
> be less of an issue too. I'm looking at adding a BTF variable test
> shortly, would be good to have coverage there too, especially since
> we're growing the amount of info we encode in this area.

Sure thing, the more tests, the better!
 
> > So I think we should do as Stephen did, keep it out of
> > --btf_features=default, as it is not yet in the kernel set of options,
> > and have the config file, starting with being able to set those
> > features, i.e. we would have:

> > $ cat ~/.config/pahole
> > [btf_encoder]
> > 	btf_features=+global_var

> > wdyt?
 
> I think that makes perfect sense, great idea!

I was looking for a library to do that to avoid "stealing" the
perf-config code, but perhaps we should use an env var for that?

PAHOLE_BTF_FEATURES='+global_var'

To keep things simple?

- Arnaldo
Stephen Brennan Oct. 3, 2024, 5:48 p.m. UTC | #5
Arnaldo Carvalho de Melo <acme@kernel.org> writes:
> On Thu, Oct 03, 2024 at 04:21:07PM +0100, Alan Maguire wrote:
>> On 03/10/2024 15:53, Arnaldo Carvalho de Melo wrote:
>> > On Thu, Oct 03, 2024 at 03:40:35PM +0100, Alan Maguire wrote:
>> >> On 03/10/2024 00:52, Stephen Brennan wrote:
>> >>> So far, pahole has only encoded type information for percpu variables.
>> >>> However, there are several reasons why type information for all global
>> >>> variables would be useful in the kernel:
>> > 
>> >>> 1. Runtime kernel debuggers like drgn could use the BTF to introspect
>> >>> kernel data structures without needing to install heavyweight DWARF.
>> > 
>> >>> 2. BPF programs using the "__ksym" annotation could declare the
>> >>> variables using the correct type, rather than "void".
>> > 
>> >>> It makes sense to introduce a feature for this in pahole so that these
>> >>> capabilities can be explored in the kernel. The feature is non-default:
>> >>> when using "--btf-features=default", it is disabled. It must be
>> >>> explicitly requested, e.g. with "--btf-features=+global_var".
>> >  
>> >> I'm not totally sure switching global_var to a non-default feature is
>> >> the right answer here.
>> >  
>> >> The --btf_features "default" set of options are meant to closely mirror
>> >> the upstream kernel options we enable when doing BTF encoding. However,
>> >> in scripts/Makefile.btf we don't use "default"; we explicitly call out
>> >> the set of features we want. We can't just use "default" in that context
>> >> since the meaning of "default" varies based upon whatever version of
>> >> pahole you have.
>> >  
>> >> So "default" is simply a convenient shorthand for pahole testing which
>> >> corresponds to "give me the set of features that upstream kernels use".
>> >> It could have a better name that reflects that more clearly I suppose.
>> >  
>> >> When we do switch this on in-kernel, we'll add the explicit "global_var"
>> >> to the list of features in scripts/Makefile.btf.
>> >  
>> >> So with all this said, do we make global_vars a default or non-default
>> >> feature? It would seem to make sense to specify non-default, since it is
>> >> not switched on for the kernel yet, but looking ahead, what if the 1.28
>> >> pahole release is used to build vmlinux BTF and we add global_var to the
>> >> list of features? In such a case, our "default" set of values would be
>> >> out of step with the kernel. So it's not a huge deal, but I would
>> >> consider keeping this a default feature to facilitate testing; this
>> >> won't change what the kernel does, but it makes testing with full
>> >> variable generation easier (I can just do "--btf_features=default").
>> > 
>> > This "default" really is confusing, as you spelled out above :-\

Yeah, I spent a while staring at the comment and reading the code to
understand the nuance between the initial and default values. I don't
think I fully understood it until this v3 patch, and admittedly I still
didn't have the full context of how "default" was used.

One interesting point of comparison is the "-M" argument to
"qemu-system-$arch". For example:

  $ qemu-system-x86_64 -M ?
  Supported machines are:
  microvm              microvm (i386)
  pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-9.0)
  pc-i440fx-9.0        Standard PC (i440FX + PIIX, 1996) (default)
  pc-i440fx-8.2        Standard PC (i440FX + PIIX, 1996)
  pc-i440fx-8.1        Standard PC (i440FX + PIIX, 1996)
  pc-i440fx-8.0        Standard PC (i440FX + PIIX, 1996)
  [...]

So the default "pc" machine is simply an alias that gets updated to the
most recent machine (with potential new behaviors) every release, but
you can always select a specific machine that you care about.

Maybe it would make sense if there were versioned defaults, so that
"default" always picks whatever is relevant to the most recent upstream
kernel, but you could also select the default as of an older pahole
release.

That does sound like plenty of complexity added to an already somewhat
confusing system, so I'm not sold on it. The flexibility for adjusting
to new kernel defaults is appealing though.

>> >When to
>> > add something to it so that it reflects what the kernel has is tricky,
>> > perhaps we should instead have a ~/.config/pahole file where developers
>> > can add BTF features to add to --btf_features=default in the period
>> > where something new was _really_ added to the kernel and before the next
>> > version when it _have been added to the kernel set of BTF features_ thus
>> > should be set into stone in the pahole sources?
>  
>> it's a nice idea; I suppose once we have more automated tests, this will
>> be less of an issue too. I'm looking at adding a BTF variable test
>> shortly, would be good to have coverage there too, especially since
>> we're growing the amount of info we encode in this area.
>
> Sure thing, the more tests, the better!
>  
>> > So I think we should do as Stephen did, keep it out of
>> > --btf_features=default, as it is not yet in the kernel set of options,
>> > and have the config file, starting with being able to set those
>> > features, i.e. we would have:
>
>> > $ cat ~/.config/pahole
>> > [btf_encoder]
>> > 	btf_features=+global_var
>
>> > wdyt?
>  
>> I think that makes perfect sense, great idea!
>
> I was looking for a library to do that to avoid "stealing" the
> perf-config code, but perhaps we should use an env var for that?
>
> PAHOLE_BTF_FEATURES='+global_var'
>
> To keep things simple?

One concern with configuration files is that (at least on my system)
they tend to sit around and get forgotten, unless they're super well
known configs like ~/.bashrc. So at some point, I could see myself
setting a pahole config and then 6 months later wondering why pahole
behaves differently on two different systems.

Env vars are easy to set permanently if you want, but are also more
visible and centralized with your other configurations, so they're my
preference.

Thanks,
Stephen
Arnaldo Carvalho de Melo Oct. 3, 2024, 6:33 p.m. UTC | #6
On Thu, Oct 03, 2024 at 10:48:23AM -0700, Stephen Brennan wrote:
> Arnaldo Carvalho de Melo <acme@kernel.org> writes:
> > On Thu, Oct 03, 2024 at 04:21:07PM +0100, Alan Maguire wrote:
> >> On 03/10/2024 15:53, Arnaldo Carvalho de Melo wrote:
> >> > On Thu, Oct 03, 2024 at 03:40:35PM +0100, Alan Maguire wrote:
> >> >> On 03/10/2024 00:52, Stephen Brennan wrote:
> >> >>> So far, pahole has only encoded type information for percpu variables.
> >> >>> However, there are several reasons why type information for all global
> >> >>> variables would be useful in the kernel:
> >> > 
> >> >>> 1. Runtime kernel debuggers like drgn could use the BTF to introspect
> >> >>> kernel data structures without needing to install heavyweight DWARF.
> >> > 
> >> >>> 2. BPF programs using the "__ksym" annotation could declare the
> >> >>> variables using the correct type, rather than "void".
> >> > 
> >> >>> It makes sense to introduce a feature for this in pahole so that these
> >> >>> capabilities can be explored in the kernel. The feature is non-default:
> >> >>> when using "--btf-features=default", it is disabled. It must be
> >> >>> explicitly requested, e.g. with "--btf-features=+global_var".
> >> >  
> >> >> I'm not totally sure switching global_var to a non-default feature is
> >> >> the right answer here.
> >> >  
> >> >> The --btf_features "default" set of options are meant to closely mirror
> >> >> the upstream kernel options we enable when doing BTF encoding. However,
> >> >> in scripts/Makefile.btf we don't use "default"; we explicitly call out
> >> >> the set of features we want. We can't just use "default" in that context
> >> >> since the meaning of "default" varies based upon whatever version of
> >> >> pahole you have.
> >> >  
> >> >> So "default" is simply a convenient shorthand for pahole testing which
> >> >> corresponds to "give me the set of features that upstream kernels use".
> >> >> It could have a better name that reflects that more clearly I suppose.
> >> >  
> >> >> When we do switch this on in-kernel, we'll add the explicit "global_var"
> >> >> to the list of features in scripts/Makefile.btf.
> >> >  
> >> >> So with all this said, do we make global_vars a default or non-default
> >> >> feature? It would seem to make sense to specify non-default, since it is
> >> >> not switched on for the kernel yet, but looking ahead, what if the 1.28
> >> >> pahole release is used to build vmlinux BTF and we add global_var to the
> >> >> list of features? In such a case, our "default" set of values would be
> >> >> out of step with the kernel. So it's not a huge deal, but I would
> >> >> consider keeping this a default feature to facilitate testing; this
> >> >> won't change what the kernel does, but it makes testing with full
> >> >> variable generation easier (I can just do "--btf_features=default").
> >> > 
> >> > This "default" really is confusing, as you spelled out above :-\
> 
> Yeah, I spent a while staring at the comment and reading the code to
> understand the nuance between the initial and default values. I don't
> think I fully understood it until this v3 patch, and admittedly I still
> didn't have the full context of how "default" was used.
> 
> One interesting point of comparison is the "-M" argument to
> "qemu-system-$arch". For example:
> 
>   $ qemu-system-x86_64 -M ?
>   Supported machines are:
>   microvm              microvm (i386)
>   pc                   Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-9.0)
>   pc-i440fx-9.0        Standard PC (i440FX + PIIX, 1996) (default)
>   pc-i440fx-8.2        Standard PC (i440FX + PIIX, 1996)
>   pc-i440fx-8.1        Standard PC (i440FX + PIIX, 1996)
>   pc-i440fx-8.0        Standard PC (i440FX + PIIX, 1996)
>   [...]
> 
> So the default "pc" machine is simply an alias that gets updated to the
> most recent machine (with potential new behaviors) every release, but
> you can always select a specific machine that you care about.
> 
> Maybe it would make sense if there were versioned defaults, so that
> "default" always picks whatever is relevant to the most recent upstream
> kernel, but you could also select the default as of an older pahole
> release.
> 
> That does sound like plenty of complexity added to an already somewhat
> confusing system, so I'm not sold on it. The flexibility for adjusting
> to new kernel defaults is appealing though.
> 
> >> >When to
> >> > add something to it so that it reflects what the kernel has is tricky,
> >> > perhaps we should instead have a ~/.config/pahole file where developers
> >> > can add BTF features to add to --btf_features=default in the period
> >> > where something new was _really_ added to the kernel and before the next
> >> > version when it _have been added to the kernel set of BTF features_ thus
> >> > should be set into stone in the pahole sources?
> >  
> >> it's a nice idea; I suppose once we have more automated tests, this will
> >> be less of an issue too. I'm looking at adding a BTF variable test
> >> shortly, would be good to have coverage there too, especially since
> >> we're growing the amount of info we encode in this area.
> >
> > Sure thing, the more tests, the better!
> >  
> >> > So I think we should do as Stephen did, keep it out of
> >> > --btf_features=default, as it is not yet in the kernel set of options,
> >> > and have the config file, starting with being able to set those
> >> > features, i.e. we would have:
> >
> >> > $ cat ~/.config/pahole
> >> > [btf_encoder]
> >> > 	btf_features=+global_var
> >
> >> > wdyt?
> >  
> >> I think that makes perfect sense, great idea!
> >
> > I was looking for a library to do that to avoid "stealing" the
> > perf-config code, but perhaps we should use an env var for that?
> >
> > PAHOLE_BTF_FEATURES='+global_var'
> >
> > To keep things simple?
> 
> One concern with configuration files is that (at least on my system)
> they tend to sit around and get forgotten, unless they're super well
> known configs like ~/.bashrc. So at some point, I could see myself
> setting a pahole config and then 6 months later wondering why pahole
> behaves differently on two different systems.
> 
> Env vars are easy to set permanently if you want, but are also more
> visible and centralized with your other configurations, so they're my
> preference.

Agreed, lets go with an env var.

And this one is even "ephemeral", i.e. as we get new versions of pahole
it should match the most recent set of kernel btf_features set and thus
become unneeded.

At some point we'll stop adding features, right? 8-)

- Arnaldo
diff mbox series

Patch

diff --git a/btf_encoder.c b/btf_encoder.c
index 2fd1648..2730ea8 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -2348,6 +2348,8 @@  struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
 		encoder->encode_vars = 0;
 		if (!conf_load->skip_encoding_btf_vars)
 			encoder->encode_vars |= BTF_VAR_PERCPU;
+		if (conf_load->encode_btf_global_vars)
+			encoder->encode_vars |= BTF_VAR_GLOBAL;
 
 		GElf_Ehdr ehdr;
 
@@ -2400,6 +2402,9 @@  struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
 			encoder->secinfo[shndx].name = secname;
 			encoder->secinfo[shndx].type = shdr.sh_type;
 
+			if (encoder->encode_vars & BTF_VAR_GLOBAL)
+				encoder->secinfo[shndx].include = true;
+
 			if (strcmp(secname, PERCPU_SECTION) == 0) {
 				found_percpu = true;
 				if (encoder->encode_vars & BTF_VAR_PERCPU)
diff --git a/btf_encoder.h b/btf_encoder.h
index 91e7947..824963b 100644
--- a/btf_encoder.h
+++ b/btf_encoder.h
@@ -20,6 +20,7 @@  struct list_head;
 enum btf_var_option {
 	BTF_VAR_NONE = 0,
 	BTF_VAR_PERCPU = 1,
+	BTF_VAR_GLOBAL = 2,
 };
 
 struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filename, struct btf *base_btf, bool verbose, struct conf_load *conf_load);
diff --git a/dwarves.h b/dwarves.h
index 0fede91..fef881f 100644
--- a/dwarves.h
+++ b/dwarves.h
@@ -92,6 +92,7 @@  struct conf_load {
 	bool			btf_gen_optimized;
 	bool			skip_encoding_btf_inconsistent_proto;
 	bool			skip_encoding_btf_vars;
+	bool			encode_btf_global_vars;
 	bool			btf_gen_floats;
 	bool			btf_encode_force;
 	bool			reproducible_build;
diff --git a/man-pages/pahole.1 b/man-pages/pahole.1
index b3e6632..7c1a69a 100644
--- a/man-pages/pahole.1
+++ b/man-pages/pahole.1
@@ -238,7 +238,9 @@  the debugging information.
 
 .TP
 .B \-\-skip_encoding_btf_vars
-Do not encode VARs in BTF.
+By default, VARs are encoded only for percpu variables. When specified, this
+option prevents encoding any VARs. Note that this option can be overridden
+by the feature "global_var".
 
 .TP
 .B \-\-skip_encoding_btf_decl_tag
@@ -304,7 +306,7 @@  Encode BTF using the specified feature list, or specify 'default' for all standa
 	encode_force       Ignore invalid symbols when encoding BTF; for example
 	                   if a symbol has an invalid name, it will be ignored
 	                   and BTF encoding will continue.
-	var                Encode variables using BTF_KIND_VAR in BTF.
+	var                Encode percpu variables using BTF_KIND_VAR in BTF.
 	float              Encode floating-point types in BTF.
 	decl_tag           Encode declaration tags using BTF_KIND_DECL_TAG.
 	type_tag           Encode type tags using BTF_KIND_TYPE_TAG.
@@ -329,6 +331,7 @@  Supported non-standard features (not enabled for 'default')
 	                   the associated base BTF to support later relocation
 	                   of split BTF with a possibly changed base, storing
 	                   it in a .BTF.base ELF section.
+	global_var         Encode all global variables using BTF_KIND_VAR in BTF.
 .fi
 
 So for example, specifying \-\-btf_encode=var,enum64 will result in a BTF encoding that (as well as encoding basic BTF information) will contain variables and enum64 values.
diff --git a/pahole.c b/pahole.c
index b21a7f2..9f0dc59 100644
--- a/pahole.c
+++ b/pahole.c
@@ -1301,6 +1301,7 @@  struct btf_feature {
 	BTF_DEFAULT_FEATURE(decl_tag_kfuncs, btf_decl_tag_kfuncs, false),
 	BTF_NON_DEFAULT_FEATURE(reproducible_build, reproducible_build, false),
 	BTF_NON_DEFAULT_FEATURE(distilled_base, btf_gen_distilled_base, false),
+	BTF_NON_DEFAULT_FEATURE(global_var, encode_btf_global_vars, false),
 };
 
 #define BTF_MAX_FEATURE_STR	1024
@@ -1733,7 +1734,7 @@  static const struct argp_option pahole__options[] = {
 	{
 		.name = "skip_encoding_btf_vars",
 		.key  = ARGP_skip_encoding_btf_vars,
-		.doc  = "Do not encode VARs in BTF."
+		.doc  = "Do not encode any VARs in BTF [if this is not specified, only percpu variables are encoded. To encode global variables too, use --encode_btf_global_vars]."
 	},
 	{
 		.name = "btf_encode_force",