diff mbox series

[dwarves,v1] btf_encoder: handle .BTF_ids section endianness when cross-compiling

Message ID 20241122070218.3832680-1-eddyz87@gmail.com (mailing list archive)
State Not Applicable
Headers show
Series [dwarves,v1] btf_encoder: handle .BTF_ids section endianness when cross-compiling | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Eduard Zingerman Nov. 22, 2024, 7:02 a.m. UTC
btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
kfuncs present in the ELF being processed. This section consists of
records of the following shape:

  struct btf_id_and_flag {
      uint32_t id;
      uint32_t flags;
  };

When endianness of binary operated by pahole differs from the
host endianness these fields require byte swap before using.

At the moment such byte swap does not happen and kfuncs are not marked
with decl tags when e.g. s390 kernel is compiled on x86.
To reproduces the bug:
- follow instructions from [0] to build an s390 vmlinux;
- execute:
  pahole --btf_features_strict=decl_tag_kfuncs,decl_tag \
         --btf_encode_detached=test.btf vmlinux
- observe no kfuncs generated:
  bpftool btf dump test.btf format c | grep __ksym

This commit fixes the issue by adding an endianness conversion step
for .BTF_ids section data before main processing step, modifying the
Elf_Data object in-place.
The choice is such in order to:
- minimize changes;
- keep using Elf_Data, as it provides fields {d_size,d_off} used
  by kfunc processing routines;
- avoid sprinkling bswap_32 at each 'struct btf_id_and_flag' field
  access in fear of forgetting to add new ones when code is modified.

[0] https://docs.kernel.org/bpf/s390.html

Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Vadim Fedorenko <vadfed@meta.com>
Fixes: 72e88f29c6f7 ("pahole: Inject kfunc decl tags into BTF")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
---
 btf_encoder.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 lib/bpf       |  2 +-
 2 files changed, 43 insertions(+), 1 deletion(-)

Comments

Vadim Fedorenko Nov. 22, 2024, 3:03 p.m. UTC | #1
On 21/11/2024 23:02, Eduard Zingerman wrote:
> btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> kfuncs present in the ELF being processed. This section consists of
> records of the following shape:
> 
>    struct btf_id_and_flag {
>        uint32_t id;
>        uint32_t flags;
>    };
> 
> When endianness of binary operated by pahole differs from the
> host endianness these fields require byte swap before using.
> 
> At the moment such byte swap does not happen and kfuncs are not marked
> with decl tags when e.g. s390 kernel is compiled on x86.
> To reproduces the bug:
> - follow instructions from [0] to build an s390 vmlinux;
> - execute:
>    pahole --btf_features_strict=decl_tag_kfuncs,decl_tag \
>           --btf_encode_detached=test.btf vmlinux
> - observe no kfuncs generated:
>    bpftool btf dump test.btf format c | grep __ksym
> 
> This commit fixes the issue by adding an endianness conversion step
> for .BTF_ids section data before main processing step, modifying the
> Elf_Data object in-place.
> The choice is such in order to:
> - minimize changes;
> - keep using Elf_Data, as it provides fields {d_size,d_off} used
>    by kfunc processing routines;
> - avoid sprinkling bswap_32 at each 'struct btf_id_and_flag' field
>    access in fear of forgetting to add new ones when code is modified.
> 
> [0] https://docs.kernel.org/bpf/s390.html
> 

LGTM, Thanks!

Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Jiri Olsa Nov. 22, 2024, 3:11 p.m. UTC | #2
On Thu, Nov 21, 2024 at 11:02:18PM -0800, Eduard Zingerman wrote:
> btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> kfuncs present in the ELF being processed. This section consists of
> records of the following shape:
> 
>   struct btf_id_and_flag {
>       uint32_t id;
>       uint32_t flags;
>   };

it contains pairs like above and also just id arrays with no flags, but
that does not matter for the patch functionality, because you swap by
u32 values anyway

> 
> When endianness of binary operated by pahole differs from the
> host endianness these fields require byte swap before using.
> 
> At the moment such byte swap does not happen and kfuncs are not marked
> with decl tags when e.g. s390 kernel is compiled on x86.
> To reproduces the bug:
> - follow instructions from [0] to build an s390 vmlinux;
> - execute:
>   pahole --btf_features_strict=decl_tag_kfuncs,decl_tag \
>          --btf_encode_detached=test.btf vmlinux
> - observe no kfuncs generated:
>   bpftool btf dump test.btf format c | grep __ksym
> 
> This commit fixes the issue by adding an endianness conversion step
> for .BTF_ids section data before main processing step, modifying the
> Elf_Data object in-place.
> The choice is such in order to:
> - minimize changes;
> - keep using Elf_Data, as it provides fields {d_size,d_off} used
>   by kfunc processing routines;
> - avoid sprinkling bswap_32 at each 'struct btf_id_and_flag' field
>   access in fear of forgetting to add new ones when code is modified.

lgtm, some questions below

> 
> [0] https://docs.kernel.org/bpf/s390.html
> 
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Daniel Xu <dxu@dxuuu.xyz>
> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Cc: Vadim Fedorenko <vadfed@meta.com>
> Fixes: 72e88f29c6f7 ("pahole: Inject kfunc decl tags into BTF")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  btf_encoder.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  lib/bpf       |  2 +-
>  2 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index e1adddf..3bdb73b 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -33,6 +33,7 @@
>  #include <stdint.h>
>  #include <search.h> /* for tsearch(), tfind() and tdestroy() */
>  #include <pthread.h>
> +#include <byteswap.h>
>  
>  #define BTF_IDS_SECTION		".BTF_ids"
>  #define BTF_ID_FUNC_PFX		"__BTF_ID__func__"
> @@ -1847,11 +1848,47 @@ static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, struct gobuffer *
>  	return 0;
>  }
>  
> +/* If byte order of 'elf' differs from current byte order, convert the data->d_buf.
> + * ELF file is opened in a readonly mode, so data->d_buf cannot be modified in place.
> + * Instead, allocate a new buffer if modification is necessary.
> + */
> +static int convert_idlist_endianness(Elf *elf, Elf_Data *data, bool *copied)
> +{
> +	int byteorder, i;
> +	char *elf_ident;
> +	uint32_t *tmp;
> +
> +	*copied = false;
> +	elf_ident = elf_getident(elf, NULL);
> +	if (elf_ident == NULL) {
> +		fprintf(stderr, "Cannot get ELF identification from header\n");
> +		return -EINVAL;
> +	}
> +	byteorder = elf_ident[EI_DATA];
> +	if ((BYTE_ORDER == LITTLE_ENDIAN && byteorder == ELFDATA2LSB)
> +	    || (BYTE_ORDER == BIG_ENDIAN && byteorder == ELFDATA2MSB))
> +		return 0;
> +	tmp = malloc(data->d_size);
> +	if (tmp == NULL) {
> +		fprintf(stderr, "Cannot allocate %lu bytes of memory\n", data->d_size);
> +		return -ENOMEM;
> +	}
> +	memcpy(tmp, data->d_buf, data->d_size);
> +	data->d_buf = tmp;

will the original data->d_buf be leaked? are we allowed to assign d_buf like that? ;-)

> +	*copied = true;
> +
> +	/* .BTF_ids sections consist of u32 objects */
> +	for (i = 0; i < data->d_size / sizeof(uint32_t); i++)
> +		tmp[i] = bswap_32(tmp[i]);
> +	return 0;
> +}
> +
>  static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
>  {
>  	const char *filename = encoder->source_filename;
>  	struct gobuffer btf_kfunc_ranges = {};
>  	struct gobuffer btf_funcs = {};
> +	bool free_idlist = false;
>  	Elf_Data *symbols = NULL;
>  	Elf_Data *idlist = NULL;
>  	Elf_Scn *symscn = NULL;
> @@ -1919,6 +1956,9 @@ static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
>  			idlist_shndx = i;
>  			idlist_addr = shdr.sh_addr;
>  			idlist = data;
> +			err = convert_idlist_endianness(elf, idlist, &free_idlist);
> +			if (err < 0)
> +				goto out;
>  		}
>  	}
>  
> @@ -2031,6 +2071,8 @@ static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
>  out:
>  	__gobuffer__delete(&btf_funcs);
>  	__gobuffer__delete(&btf_kfunc_ranges);
> +	if (free_idlist)
> +		free(idlist->d_buf);
>  	if (elf)
>  		elf_end(elf);

curious, would elf_end try to free the d_buf at some point?

>  	if (fd != -1)
> diff --git a/lib/bpf b/lib/bpf
> index 09b9e83..caa17bd 160000
> --- a/lib/bpf
> +++ b/lib/bpf
> @@ -1 +1 @@
> -Subproject commit 09b9e83102eb8ab9e540d36b4559c55f3bcdb95d
> +Subproject commit caa17bdcbfc58e68eaf4d017c058e6577606bf56

I think this should not be part of the patch

thanks,
jirka
Eduard Zingerman Nov. 22, 2024, 6:08 p.m. UTC | #3
On Fri, 2024-11-22 at 16:11 +0100, Jiri Olsa wrote:
> On Thu, Nov 21, 2024 at 11:02:18PM -0800, Eduard Zingerman wrote:
> > btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> > kfuncs present in the ELF being processed. This section consists of
> > records of the following shape:
> > 
> >   struct btf_id_and_flag {
> >       uint32_t id;
> >       uint32_t flags;
> >   };
> 
> it contains pairs like above and also just id arrays with no flags, but
> that does not matter for the patch functionality, because you swap by
> u32 values anyway

Right, I'll update the description, thank you.

[...]

> > @@ -1847,11 +1848,47 @@ static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, struct gobuffer *
> >  	return 0;
> >  }
> >  
> > +/* If byte order of 'elf' differs from current byte order, convert the data->d_buf.
> > + * ELF file is opened in a readonly mode, so data->d_buf cannot be modified in place.
> > + * Instead, allocate a new buffer if modification is necessary.
> > + */
> > +static int convert_idlist_endianness(Elf *elf, Elf_Data *data, bool *copied)
> > +{
> > +	int byteorder, i;
> > +	char *elf_ident;
> > +	uint32_t *tmp;
> > +
> > +	*copied = false;
> > +	elf_ident = elf_getident(elf, NULL);
> > +	if (elf_ident == NULL) {
> > +		fprintf(stderr, "Cannot get ELF identification from header\n");
> > +		return -EINVAL;
> > +	}
> > +	byteorder = elf_ident[EI_DATA];
> > +	if ((BYTE_ORDER == LITTLE_ENDIAN && byteorder == ELFDATA2LSB)
> > +	    || (BYTE_ORDER == BIG_ENDIAN && byteorder == ELFDATA2MSB))
> > +		return 0;
> > +	tmp = malloc(data->d_size);
> > +	if (tmp == NULL) {
> > +		fprintf(stderr, "Cannot allocate %lu bytes of memory\n", data->d_size);
> > +		return -ENOMEM;
> > +	}
> > +	memcpy(tmp, data->d_buf, data->d_size);
> > +	data->d_buf = tmp;
> 
> will the original data->d_buf be leaked? are we allowed to assign d_buf like that? ;-)

Well, before sending I checked using address sanitizer, and it did not complain.
As far as I understand elfutils elf_getdata.c / elf_end.c [0]:
- elf_getdata() allocates memory for full section (elf_getdata.c:333),
  before setting d_buf field of Elf_Data;
- elf_end() frees memory for full section (elf_end.c:174).

So I assumed that this is hacky but not that bad.
Given that current patch depends on implementation details it is
probably better to switch to one of the alternatives:
a. allocate new Elf_Data object using elf_newdata() API;
b. just allocate a fake instance of Elf_Data on stack in btf_encoder__tag_kfuncs().

(a) seems to be an Ok option, wdyt?

[0] b2f225d6bff8 ("Consolidate and add files to clean target variables")
    git://sourceware.org/git/elfutils.git

[...]

> >  	if (fd != -1)
> > diff --git a/lib/bpf b/lib/bpf
> > index 09b9e83..caa17bd 160000
> > --- a/lib/bpf
> > +++ b/lib/bpf
> > @@ -1 +1 @@
> > -Subproject commit 09b9e83102eb8ab9e540d36b4559c55f3bcdb95d
> > +Subproject commit caa17bdcbfc58e68eaf4d017c058e6577606bf56
> 
> I think this should not be part of the patch

Sorry, didn't notice this thing.
Eduard Zingerman Nov. 22, 2024, 6:16 p.m. UTC | #4
On Fri, 2024-11-22 at 10:08 -0800, Eduard Zingerman wrote:

[...]

> So I assumed that this is hacky but not that bad.
> Given that current patch depends on implementation details it is
> probably better to switch to one of the alternatives:
> a. allocate new Elf_Data object using elf_newdata() API;
> b. just allocate a fake instance of Elf_Data on stack in btf_encoder__tag_kfuncs().
> 
> (a) seems to be an Ok option, wdyt?

Meh, any of these raise some questions about validity of API usage.
I'll just add a custom wrapper:

  struct converted_elf_data {
	void *d_buf;
        size_t d_size;
        size_t d_off;
        bool owns_buf;
  };

[...]
Andrii Nakryiko Nov. 26, 2024, 7:26 p.m. UTC | #5
On Thu, Nov 21, 2024 at 11:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> kfuncs present in the ELF being processed. This section consists of
> records of the following shape:
>
>   struct btf_id_and_flag {
>       uint32_t id;
>       uint32_t flags;
>   };
>

Can we just set data->d_type to ELF_T_WORD and let libelf handle the byte swap?

> When endianness of binary operated by pahole differs from the
> host endianness these fields require byte swap before using.
>
> At the moment such byte swap does not happen and kfuncs are not marked
> with decl tags when e.g. s390 kernel is compiled on x86.
> To reproduces the bug:
> - follow instructions from [0] to build an s390 vmlinux;
> - execute:
>   pahole --btf_features_strict=decl_tag_kfuncs,decl_tag \
>          --btf_encode_detached=test.btf vmlinux
> - observe no kfuncs generated:
>   bpftool btf dump test.btf format c | grep __ksym
>
> This commit fixes the issue by adding an endianness conversion step
> for .BTF_ids section data before main processing step, modifying the
> Elf_Data object in-place.
> The choice is such in order to:
> - minimize changes;
> - keep using Elf_Data, as it provides fields {d_size,d_off} used
>   by kfunc processing routines;
> - avoid sprinkling bswap_32 at each 'struct btf_id_and_flag' field
>   access in fear of forgetting to add new ones when code is modified.
>
> [0] https://docs.kernel.org/bpf/s390.html
>
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Daniel Xu <dxu@dxuuu.xyz>
> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Cc: Vadim Fedorenko <vadfed@meta.com>
> Fixes: 72e88f29c6f7 ("pahole: Inject kfunc decl tags into BTF")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  btf_encoder.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  lib/bpf       |  2 +-
>  2 files changed, 43 insertions(+), 1 deletion(-)
>

[...]
Eduard Zingerman Nov. 26, 2024, 7:31 p.m. UTC | #6
On Tue, 2024-11-26 at 11:26 -0800, Andrii Nakryiko wrote:
> On Thu, Nov 21, 2024 at 11:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > 
> > btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> > kfuncs present in the ELF being processed. This section consists of
> > records of the following shape:
> > 
> >   struct btf_id_and_flag {
> >       uint32_t id;
> >       uint32_t flags;
> >   };
> > 
> 
> Can we just set data->d_type to ELF_T_WORD and let libelf handle the byte swap?

When I tried 'data->d_type = ELF_T_WORD' + gelf_xlatetom() snippet
suggested by Tony Ambardar some time ago, I got a write protection error.
Concluded that this is so, because file is opened in O_RDONLY mode.

(Also please note v2 of this patch).

[...]
Andrii Nakryiko Nov. 26, 2024, 9:51 p.m. UTC | #7
On Tue, Nov 26, 2024 at 11:31 AM Eduard Zingerman <eddyz87@gmail.com> wrote:
>
> On Tue, 2024-11-26 at 11:26 -0800, Andrii Nakryiko wrote:
> > On Thu, Nov 21, 2024 at 11:02 PM Eduard Zingerman <eddyz87@gmail.com> wrote:
> > >
> > > btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> > > kfuncs present in the ELF being processed. This section consists of
> > > records of the following shape:
> > >
> > >   struct btf_id_and_flag {
> > >       uint32_t id;
> > >       uint32_t flags;
> > >   };
> > >
> >
> > Can we just set data->d_type to ELF_T_WORD and let libelf handle the byte swap?
>
> When I tried 'data->d_type = ELF_T_WORD' + gelf_xlatetom() snippet
> suggested by Tony Ambardar some time ago, I got a write protection error.
> Concluded that this is so, because file is opened in O_RDONLY mode.

Ok, maybe don't follow my words *exactly*, just in spirit ;) I see
there is elf_getdata_rawchunk() API in libelf, which seems useful:

/* Get data translated from a chunk of the file contents as section data
   would be for TYPE.  The resulting Elf_Data pointer is valid until
   elf_end (ELF) is called.  */
extern Elf_Data *elf_getdata_rawchunk (Elf *__elf,
                                       int64_t __offset, size_t __size,
                                       Elf_Type __type);

I don't know the surrounding code, I was just hoping to leverage
libelf's byte swapping support (which I think I learned from Tony as
well). But if it's too inconvenient, so be it.

>
> (Also please note v2 of this patch).
>
> [...]
>
Eduard Zingerman Nov. 27, 2024, 12:30 a.m. UTC | #8
On Tue, 2024-11-26 at 13:51 -0800, Andrii Nakryiko wrote:

[...]

> > When I tried 'data->d_type = ELF_T_WORD' + gelf_xlatetom() snippet
> > suggested by Tony Ambardar some time ago, I got a write protection error.
> > Concluded that this is so, because file is opened in O_RDONLY mode.
> 
> Ok, maybe don't follow my words *exactly*, just in spirit ;) I see
> there is elf_getdata_rawchunk() API in libelf, which seems useful:
> 
> /* Get data translated from a chunk of the file contents as section data
>    would be for TYPE.  The resulting Elf_Data pointer is valid until
>    elf_end (ELF) is called.  */
> extern Elf_Data *elf_getdata_rawchunk (Elf *__elf,
>                                        int64_t __offset, size_t __size,
>                                        Elf_Type __type);
> 
> I don't know the surrounding code, I was just hoping to leverage
> libelf's byte swapping support (which I think I learned from Tony as
> well). But if it's too inconvenient, so be it.

I missed this API, it works and makes the patch much smaller,
thank you for finding it.
diff mbox series

Patch

diff --git a/btf_encoder.c b/btf_encoder.c
index e1adddf..3bdb73b 100644
--- a/btf_encoder.c
+++ b/btf_encoder.c
@@ -33,6 +33,7 @@ 
 #include <stdint.h>
 #include <search.h> /* for tsearch(), tfind() and tdestroy() */
 #include <pthread.h>
+#include <byteswap.h>
 
 #define BTF_IDS_SECTION		".BTF_ids"
 #define BTF_ID_FUNC_PFX		"__BTF_ID__func__"
@@ -1847,11 +1848,47 @@  static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, struct gobuffer *
 	return 0;
 }
 
+/* If byte order of 'elf' differs from current byte order, convert the data->d_buf.
+ * ELF file is opened in a readonly mode, so data->d_buf cannot be modified in place.
+ * Instead, allocate a new buffer if modification is necessary.
+ */
+static int convert_idlist_endianness(Elf *elf, Elf_Data *data, bool *copied)
+{
+	int byteorder, i;
+	char *elf_ident;
+	uint32_t *tmp;
+
+	*copied = false;
+	elf_ident = elf_getident(elf, NULL);
+	if (elf_ident == NULL) {
+		fprintf(stderr, "Cannot get ELF identification from header\n");
+		return -EINVAL;
+	}
+	byteorder = elf_ident[EI_DATA];
+	if ((BYTE_ORDER == LITTLE_ENDIAN && byteorder == ELFDATA2LSB)
+	    || (BYTE_ORDER == BIG_ENDIAN && byteorder == ELFDATA2MSB))
+		return 0;
+	tmp = malloc(data->d_size);
+	if (tmp == NULL) {
+		fprintf(stderr, "Cannot allocate %lu bytes of memory\n", data->d_size);
+		return -ENOMEM;
+	}
+	memcpy(tmp, data->d_buf, data->d_size);
+	data->d_buf = tmp;
+	*copied = true;
+
+	/* .BTF_ids sections consist of u32 objects */
+	for (i = 0; i < data->d_size / sizeof(uint32_t); i++)
+		tmp[i] = bswap_32(tmp[i]);
+	return 0;
+}
+
 static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
 {
 	const char *filename = encoder->source_filename;
 	struct gobuffer btf_kfunc_ranges = {};
 	struct gobuffer btf_funcs = {};
+	bool free_idlist = false;
 	Elf_Data *symbols = NULL;
 	Elf_Data *idlist = NULL;
 	Elf_Scn *symscn = NULL;
@@ -1919,6 +1956,9 @@  static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
 			idlist_shndx = i;
 			idlist_addr = shdr.sh_addr;
 			idlist = data;
+			err = convert_idlist_endianness(elf, idlist, &free_idlist);
+			if (err < 0)
+				goto out;
 		}
 	}
 
@@ -2031,6 +2071,8 @@  static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
 out:
 	__gobuffer__delete(&btf_funcs);
 	__gobuffer__delete(&btf_kfunc_ranges);
+	if (free_idlist)
+		free(idlist->d_buf);
 	if (elf)
 		elf_end(elf);
 	if (fd != -1)
diff --git a/lib/bpf b/lib/bpf
index 09b9e83..caa17bd 160000
--- a/lib/bpf
+++ b/lib/bpf
@@ -1 +1 @@ 
-Subproject commit 09b9e83102eb8ab9e540d36b4559c55f3bcdb95d
+Subproject commit caa17bdcbfc58e68eaf4d017c058e6577606bf56