diff mbox series

[3/3] btf_encoder: Set .BTF section alignment to 16

Message ID 20210118160139.1971039-4-gprocida@google.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series Small fixes and improvements | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Giuliano Procida Jan. 18, 2021, 4:01 p.m. UTC
This is to avoid misaligned access when memory-mapping ELF sections.

Signed-off-by: Giuliano Procida <gprocida@google.com>
---
 libbtf.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Andrii Nakryiko Jan. 21, 2021, 7:16 a.m. UTC | #1
On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
>
> This is to avoid misaligned access when memory-mapping ELF sections.
>
> Signed-off-by: Giuliano Procida <gprocida@google.com>
> ---
>  libbtf.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/libbtf.c b/libbtf.c
> index 7552d8e..2f12d53 100644
> --- a/libbtf.c
> +++ b/libbtf.c
> @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
>                         goto unlink;
>                 }
>
> +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> +                        llvm_objcopy, filename);

does it align inside the ELF file to 16 bytes, or does it request the
linker to align it at 16 byte alignment in memory? Given .BTF section
is not loadable, trying to understand the implications.


> +               if (system(cmd)) {

Also curious, if objcopy emits error (saying that
--set-section-alignment argument is not recognized), will that error
be shown in stdout? or system() consumes it without redirecting it to
stdout?

> +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> +                               __func__, filename, errno);

Probably better to emit this warning only in verbose mode, otherwise
lots of people will start complaining that they get some new warnings
from pahole.


> +               }
> +
>                 err = 0;
>         unlink:
>                 unlink(tmp_fn);
> --
> 2.30.0.284.gd98b1dd5eaa7-goog
>
Andrii Nakryiko Jan. 21, 2021, 8:07 p.m. UTC | #2
On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
>
> Hi.
>
> On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>
>> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
>> >
>> > This is to avoid misaligned access when memory-mapping ELF sections.
>> >
>> > Signed-off-by: Giuliano Procida <gprocida@google.com>
>> > ---
>> >  libbtf.c | 8 ++++++++
>> >  1 file changed, 8 insertions(+)
>> >
>> > diff --git a/libbtf.c b/libbtf.c
>> > index 7552d8e..2f12d53 100644
>> > --- a/libbtf.c
>> > +++ b/libbtf.c
>> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
>> >                         goto unlink;
>> >                 }
>> >
>> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
>> > +                        llvm_objcopy, filename);
>>
>> does it align inside the ELF file to 16 bytes, or does it request the
>> linker to align it at 16 byte alignment in memory? Given .BTF section
>> is not loadable, trying to understand the implications.
>>
>
> We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
>

Right, ok, thanks for explaining!

> I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
>
>>
>>
>> > +               if (system(cmd)) {
>>
>> Also curious, if objcopy emits error (saying that
>> --set-section-alignment argument is not recognized), will that error
>> be shown in stdout? or system() consumes it without redirecting it to
>> stdout?
>>
>
> I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
>

You can use popen() to capture/hide output, this is a better
alternative to system() in this case. We don't want "expected
warnings" in kernel build process.

>>
>> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
>> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
>> > +                               __func__, filename, errno);
>>
>> Probably better to emit this warning only in verbose mode, otherwise
>> lots of people will start complaining that they get some new warnings
>> from pahole.
>>
>
> It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.

This would be great, yes. At some point I remember giving it a try,
but for some reason I couldn't make libelf flush data and update
section headers properly. Maybe you'll have better luck. Though I
think I was trying to mark section loadable, and eventually I probably
managed to do that, but still abandoned it (it's not enough to mark
section loadable, you have to assign it to ELF segment as well, which
libelf doesn't allow to do and you need linker support). Anyways, give
it a try, it should work.

>
>>
>>
>> > +               }
>> > +
>> >                 err = 0;
>> >         unlink:
>> >                 unlink(tmp_fn);
>> > --
>> > 2.30.0.284.gd98b1dd5eaa7-goog
>> >
>
>
> I'll see if I can spend a little time on this idea instead.
>
> Regards,
> Giuliano.
Giuliano Procida Jan. 25, 2021, 12:53 p.m. UTC | #3
Hi.

On Thu, 21 Jan 2021 at 20:08, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
> >
> > Hi.
> >
> > On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >>
> >> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
> >> >
> >> > This is to avoid misaligned access when memory-mapping ELF sections.
> >> >
> >> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> >> > ---
> >> >  libbtf.c | 8 ++++++++
> >> >  1 file changed, 8 insertions(+)
> >> >
> >> > diff --git a/libbtf.c b/libbtf.c
> >> > index 7552d8e..2f12d53 100644
> >> > --- a/libbtf.c
> >> > +++ b/libbtf.c
> >> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
> >> >                         goto unlink;
> >> >                 }
> >> >
> >> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> >> > +                        llvm_objcopy, filename);
> >>
> >> does it align inside the ELF file to 16 bytes, or does it request the
> >> linker to align it at 16 byte alignment in memory? Given .BTF section
> >> is not loadable, trying to understand the implications.
> >>
> >
> > We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
> >
>
> Right, ok, thanks for explaining!
>
> > I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
> >
> >>
> >>
> >> > +               if (system(cmd)) {
> >>
> >> Also curious, if objcopy emits error (saying that
> >> --set-section-alignment argument is not recognized), will that error
> >> be shown in stdout? or system() consumes it without redirecting it to
> >> stdout?
> >>
> >
> > I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
> >
>
> You can use popen() to capture/hide output, this is a better
> alternative to system() in this case. We don't want "expected
> warnings" in kernel build process.
>
> >>
> >> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> >> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> >> > +                               __func__, filename, errno);
> >>
> >> Probably better to emit this warning only in verbose mode, otherwise
> >> lots of people will start complaining that they get some new warnings
> >> from pahole.
> >>
> >
> > It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.
>
> This would be great, yes. At some point I remember giving it a try,
> but for some reason I couldn't make libelf flush data and update
> section headers properly. Maybe you'll have better luck. Though I
> think I was trying to mark section loadable, and eventually I probably
> managed to do that, but still abandoned it (it's not enough to mark
> section loadable, you have to assign it to ELF segment as well, which
> libelf doesn't allow to do and you need linker support). Anyways, give
> it a try, it should work.

I struggled for a day and a bit and have got this (ELF_F_LAYOUT etc.)
working. There are some caveats:

1. Laying out only the new / updated sections can leave gaps.

In practice, for vmlinux, it's a very small hole. To fix this, I'd
need to reposition .strtab as well as .BTF and .shstrtab.

2. vmlinux increases in size as llvm-objcopy was trimming down .strtab.

I know very little about this, but I'd guess that the kernel linker
scripts are leaving strings in .strtab that are not referenced by
.symtab.

I'll send a short series out for review soon.

Giuliano.

>
> >
> >>
> >>
> >> > +               }
> >> > +
> >> >                 err = 0;
> >> >         unlink:
> >> >                 unlink(tmp_fn);
> >> > --
> >> > 2.30.0.284.gd98b1dd5eaa7-goog
> >> >
> >
> >
> > I'll see if I can spend a little time on this idea instead.
> >
> > Regards,
> > Giuliano.
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
Andrii Nakryiko Jan. 26, 2021, 12:28 a.m. UTC | #4
On Mon, Jan 25, 2021 at 4:53 AM Giuliano Procida <gprocida@google.com> wrote:
>
> Hi.
>
> On Thu, 21 Jan 2021 at 20:08, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
> > >
> > > Hi.
> > >
> > > On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >>
> > >> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
> > >> >
> > >> > This is to avoid misaligned access when memory-mapping ELF sections.
> > >> >
> > >> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> > >> > ---
> > >> >  libbtf.c | 8 ++++++++
> > >> >  1 file changed, 8 insertions(+)
> > >> >
> > >> > diff --git a/libbtf.c b/libbtf.c
> > >> > index 7552d8e..2f12d53 100644
> > >> > --- a/libbtf.c
> > >> > +++ b/libbtf.c
> > >> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
> > >> >                         goto unlink;
> > >> >                 }
> > >> >
> > >> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> > >> > +                        llvm_objcopy, filename);
> > >>
> > >> does it align inside the ELF file to 16 bytes, or does it request the
> > >> linker to align it at 16 byte alignment in memory? Given .BTF section
> > >> is not loadable, trying to understand the implications.
> > >>
> > >
> > > We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
> > >
> >
> > Right, ok, thanks for explaining!
> >
> > > I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
> > >
> > >>
> > >>
> > >> > +               if (system(cmd)) {
> > >>
> > >> Also curious, if objcopy emits error (saying that
> > >> --set-section-alignment argument is not recognized), will that error
> > >> be shown in stdout? or system() consumes it without redirecting it to
> > >> stdout?
> > >>
> > >
> > > I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
> > >
> >
> > You can use popen() to capture/hide output, this is a better
> > alternative to system() in this case. We don't want "expected
> > warnings" in kernel build process.
> >
> > >>
> > >> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> > >> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> > >> > +                               __func__, filename, errno);
> > >>
> > >> Probably better to emit this warning only in verbose mode, otherwise
> > >> lots of people will start complaining that they get some new warnings
> > >> from pahole.
> > >>
> > >
> > > It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.
> >
> > This would be great, yes. At some point I remember giving it a try,
> > but for some reason I couldn't make libelf flush data and update
> > section headers properly. Maybe you'll have better luck. Though I
> > think I was trying to mark section loadable, and eventually I probably
> > managed to do that, but still abandoned it (it's not enough to mark
> > section loadable, you have to assign it to ELF segment as well, which
> > libelf doesn't allow to do and you need linker support). Anyways, give
> > it a try, it should work.
>
> I struggled for a day and a bit and have got this (ELF_F_LAYOUT etc.)
> working. There are some caveats:
>
> 1. Laying out only the new / updated sections can leave gaps.
>
> In practice, for vmlinux, it's a very small hole. To fix this, I'd
> need to reposition .strtab as well as .BTF and .shstrtab.
>
> 2. vmlinux increases in size as llvm-objcopy was trimming down .strtab.
>
> I know very little about this, but I'd guess that the kernel linker
> scripts are leaving strings in .strtab that are not referenced by
> .symtab.
>
> I'll send a short series out for review soon.

1. Hm.. I realized I don't get why you need 16-byte alignment. Can you
comment on why 8 doesn't work?

2. If you care about vmlinux BTF only, I think an easier way to ensure
proper alignment is to adjust include/asm-generic/vmlinux.lds.h and
add `. = ALIGN(8);` before .BTF (see how we ensure 4-byte alignment
for .BTF_ids)

If you have code ready to get rid of llvm-objcopy requirement for
pahole, please still post, but we'll need to test it very thoroughly
to ensure there are no regressions before landing in pahole.

>
> Giuliano.
>
> >
> > >
> > >>
> > >>
> > >> > +               }
> > >> > +
> > >> >                 err = 0;
> > >> >         unlink:
> > >> >                 unlink(tmp_fn);
> > >> > --
> > >> > 2.30.0.284.gd98b1dd5eaa7-goog
> > >> >
> > >
> > >
> > > I'll see if I can spend a little time on this idea instead.
> > >
> > > Regards,
> > > Giuliano.
> >
> > --
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> >
Giuliano Procida Jan. 26, 2021, 11:43 a.m. UTC | #5
Hi.

On Tue, 26 Jan 2021 at 00:28, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jan 25, 2021 at 4:53 AM Giuliano Procida <gprocida@google.com> wrote:
> >
> > Hi.
> >
> > On Thu, 21 Jan 2021 at 20:08, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
> > > >
> > > > Hi.
> > > >
> > > > On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > >>
> > > >> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
> > > >> >
> > > >> > This is to avoid misaligned access when memory-mapping ELF sections.
> > > >> >
> > > >> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> > > >> > ---
> > > >> >  libbtf.c | 8 ++++++++
> > > >> >  1 file changed, 8 insertions(+)
> > > >> >
> > > >> > diff --git a/libbtf.c b/libbtf.c
> > > >> > index 7552d8e..2f12d53 100644
> > > >> > --- a/libbtf.c
> > > >> > +++ b/libbtf.c
> > > >> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
> > > >> >                         goto unlink;
> > > >> >                 }
> > > >> >
> > > >> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> > > >> > +                        llvm_objcopy, filename);
> > > >>
> > > >> does it align inside the ELF file to 16 bytes, or does it request the
> > > >> linker to align it at 16 byte alignment in memory? Given .BTF section
> > > >> is not loadable, trying to understand the implications.
> > > >>
> > > >
> > > > We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
> > > >
> > >
> > > Right, ok, thanks for explaining!
> > >
> > > > I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
> > > >
> > > >>
> > > >>
> > > >> > +               if (system(cmd)) {
> > > >>
> > > >> Also curious, if objcopy emits error (saying that
> > > >> --set-section-alignment argument is not recognized), will that error
> > > >> be shown in stdout? or system() consumes it without redirecting it to
> > > >> stdout?
> > > >>
> > > >
> > > > I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
> > > >
> > >
> > > You can use popen() to capture/hide output, this is a better
> > > alternative to system() in this case. We don't want "expected
> > > warnings" in kernel build process.
> > >
> > > >>
> > > >> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> > > >> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> > > >> > +                               __func__, filename, errno);
> > > >>
> > > >> Probably better to emit this warning only in verbose mode, otherwise
> > > >> lots of people will start complaining that they get some new warnings
> > > >> from pahole.
> > > >>
> > > >
> > > > It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.
> > >
> > > This would be great, yes. At some point I remember giving it a try,
> > > but for some reason I couldn't make libelf flush data and update
> > > section headers properly. Maybe you'll have better luck. Though I
> > > think I was trying to mark section loadable, and eventually I probably
> > > managed to do that, but still abandoned it (it's not enough to mark
> > > section loadable, you have to assign it to ELF segment as well, which
> > > libelf doesn't allow to do and you need linker support). Anyways, give
> > > it a try, it should work.
> >
> > I struggled for a day and a bit and have got this (ELF_F_LAYOUT etc.)
> > working. There are some caveats:
> >
> > 1. Laying out only the new / updated sections can leave gaps.
> >
> > In practice, for vmlinux, it's a very small hole. To fix this, I'd
> > need to reposition .strtab as well as .BTF and .shstrtab.
> >
> > 2. vmlinux increases in size as llvm-objcopy was trimming down .strtab.
> >
> > I know very little about this, but I'd guess that the kernel linker
> > scripts are leaving strings in .strtab that are not referenced by
> > .symtab.
> >
> > I'll send a short series out for review soon.
>
> 1. Hm.. I realized I don't get why you need 16-byte alignment. Can you
> comment on why 8 doesn't work?
>

Sorry, 16 was a number that was guaranteed to work. However, after
looking in more detail, 8 seems fine too.

> 2. If you care about vmlinux BTF only, I think an easier way to ensure
> proper alignment is to adjust include/asm-generic/vmlinux.lds.h and
> add `. = ALIGN(8);` before .BTF (see how we ensure 4-byte alignment
> for .BTF_ids)
>

Firstly, while we do care mostly about vmlinux, we also care about
modules and even potentially plain userspace .o and .so files. In
terms of testing and development, working with simple objects is
simpler and faster.

Does the above mean vmlinux will always contain an empty .BTF, ready
to be populated?

If not, we and others may add .BTF with pahole -J and that's the code
I've been working on.

> If you have code ready to get rid of llvm-objcopy requirement for
> pahole, please still post, but we'll need to test it very thoroughly
> to ensure there are no regressions before landing in pahole.

It's stuck in a vger/kernel.org queue somewhere.

The updated vmlinux my code generates is I believe much closer to the
original vmlinux than what the existing code produces using
llvm-objcopy. But that's not necessarily a good thing.

Regards.

>
> >
> > Giuliano.
> >
> > >
> > > >
> > > >>
> > > >>
> > > >> > +               }
> > > >> > +
> > > >> >                 err = 0;
> > > >> >         unlink:
> > > >> >                 unlink(tmp_fn);
> > > >> > --
> > > >> > 2.30.0.284.gd98b1dd5eaa7-goog
> > > >> >
> > > >
> > > >
> > > > I'll see if I can spend a little time on this idea instead.
> > > >
> > > > Regards,
> > > > Giuliano.
> > >
> > > --
> > > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> > >
Giuliano Procida Jan. 27, 2021, 3:08 p.m. UTC | #6
Hi Andrii.

On Thu, 21 Jan 2021 at 20:08, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
> >
> > Hi.
> >
> > On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >>
> >> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
> >> >
> >> > This is to avoid misaligned access when memory-mapping ELF sections.
> >> >
> >> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> >> > ---
> >> >  libbtf.c | 8 ++++++++
> >> >  1 file changed, 8 insertions(+)
> >> >
> >> > diff --git a/libbtf.c b/libbtf.c
> >> > index 7552d8e..2f12d53 100644
> >> > --- a/libbtf.c
> >> > +++ b/libbtf.c
> >> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
> >> >                         goto unlink;
> >> >                 }
> >> >
> >> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> >> > +                        llvm_objcopy, filename);
> >>
> >> does it align inside the ELF file to 16 bytes, or does it request the
> >> linker to align it at 16 byte alignment in memory? Given .BTF section
> >> is not loadable, trying to understand the implications.
> >>
> >
> > We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
> >
>
> Right, ok, thanks for explaining!
>
> > I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
> >
> >>
> >>
> >> > +               if (system(cmd)) {
> >>
> >> Also curious, if objcopy emits error (saying that
> >> --set-section-alignment argument is not recognized), will that error
> >> be shown in stdout? or system() consumes it without redirecting it to
> >> stdout?
> >>
> >
> > I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
> >
>
> You can use popen() to capture/hide output, this is a better
> alternative to system() in this case. We don't want "expected
> warnings" in kernel build process.
>
> >>
> >> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> >> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> >> > +                               __func__, filename, errno);
> >>
> >> Probably better to emit this warning only in verbose mode, otherwise
> >> lots of people will start complaining that they get some new warnings
> >> from pahole.
> >>
> >
> > It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.
>
> This would be great, yes. At some point I remember giving it a try,
> but for some reason I couldn't make libelf flush data and update
> section headers properly. Maybe you'll have better luck. Though I
> think I was trying to mark section loadable, and eventually I probably
> managed to do that, but still abandoned it (it's not enough to mark
> section loadable, you have to assign it to ELF segment as well, which
> libelf doesn't allow to do and you need linker support). Anyways, give
> it a try, it should work.
>

I found 341dfcf8d78eaa3a2dc96dea06f0392eb2978364 ("btf: expose BTF
info through sysfs") and I now see what you mean.

Alignment of .BTF as produced by the linker script is currently not
down to pahole at all. The kernel link script has to add .BTF in a
rather roundabout way because it needs to be added as a loadable
segment and pahole only adds it as a plain section.

pahole's does this using llvm-objcopy (which I spotted has some
side-effects on our AOSP vmlinux). On vanilla kernels, while
llvm-objcopy doesn't rewrite (or at least, resize) .strtab, it does
renumber sections so that the offset order is monotonic.

We're working with .BTF in userspace and haven't needed .BTF as a
segment. If I managed to get pahole to make .BTF a loadable segment as
well, then the linker scripts could be simplified. I'll see if I can
do this part as well.

Regards,
Giuliano.

> >
> >>
> >>
> >> > +               }
> >> > +
> >> >                 err = 0;
> >> >         unlink:
> >> >                 unlink(tmp_fn);
> >> > --
> >> > 2.30.0.284.gd98b1dd5eaa7-goog
> >> >
> >
> >
> > I'll see if I can spend a little time on this idea instead.
> >
> > Regards,
> > Giuliano.
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
Giuliano Procida Jan. 27, 2021, 6:06 p.m. UTC | #7
Hi.

On Wed, 27 Jan 2021 at 15:08, Giuliano Procida <gprocida@google.com> wrote:
>
> Hi Andrii.
>
> On Thu, 21 Jan 2021 at 20:08, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
> > >
> > > Hi.
> > >
> > > On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >>
> > >> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
> > >> >
> > >> > This is to avoid misaligned access when memory-mapping ELF sections.
> > >> >
> > >> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> > >> > ---
> > >> >  libbtf.c | 8 ++++++++
> > >> >  1 file changed, 8 insertions(+)
> > >> >
> > >> > diff --git a/libbtf.c b/libbtf.c
> > >> > index 7552d8e..2f12d53 100644
> > >> > --- a/libbtf.c
> > >> > +++ b/libbtf.c
> > >> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
> > >> >                         goto unlink;
> > >> >                 }
> > >> >
> > >> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> > >> > +                        llvm_objcopy, filename);
> > >>
> > >> does it align inside the ELF file to 16 bytes, or does it request the
> > >> linker to align it at 16 byte alignment in memory? Given .BTF section
> > >> is not loadable, trying to understand the implications.
> > >>
> > >
> > > We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
> > >
> >
> > Right, ok, thanks for explaining!
> >
> > > I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
> > >
> > >>
> > >>
> > >> > +               if (system(cmd)) {
> > >>
> > >> Also curious, if objcopy emits error (saying that
> > >> --set-section-alignment argument is not recognized), will that error
> > >> be shown in stdout? or system() consumes it without redirecting it to
> > >> stdout?
> > >>
> > >
> > > I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
> > >
> >
> > You can use popen() to capture/hide output, this is a better
> > alternative to system() in this case. We don't want "expected
> > warnings" in kernel build process.
> >
> > >>
> > >> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> > >> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> > >> > +                               __func__, filename, errno);
> > >>
> > >> Probably better to emit this warning only in verbose mode, otherwise
> > >> lots of people will start complaining that they get some new warnings
> > >> from pahole.
> > >>
> > >
> > > It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.
> >
> > This would be great, yes. At some point I remember giving it a try,
> > but for some reason I couldn't make libelf flush data and update
> > section headers properly. Maybe you'll have better luck. Though I
> > think I was trying to mark section loadable, and eventually I probably
> > managed to do that, but still abandoned it (it's not enough to mark
> > section loadable, you have to assign it to ELF segment as well, which
> > libelf doesn't allow to do and you need linker support). Anyways, give
> > it a try, it should work.
> >
>
> I found 341dfcf8d78eaa3a2dc96dea06f0392eb2978364 ("btf: expose BTF
> info through sysfs") and I now see what you mean.
>
> Alignment of .BTF as produced by the linker script is currently not
> down to pahole at all. The kernel link script has to add .BTF in a
> rather roundabout way because it needs to be added as a loadable
> segment and pahole only adds it as a plain section.
>
> pahole's does this using llvm-objcopy (which I spotted has some
> side-effects on our AOSP vmlinux). On vanilla kernels, while
> llvm-objcopy doesn't rewrite (or at least, resize) .strtab, it does
> renumber sections so that the offset order is monotonic.
>
> We're working with .BTF in userspace and haven't needed .BTF as a
> segment. If I managed to get pahole to make .BTF a loadable segment as
> well, then the linker scripts could be simplified. I'll see if I can
> do this part as well.

OK...

$ readelf -lW /tmp/vmlinux

Elf file type is EXEC (Executable file)
Entry point 0x1000000
There are 5 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr
FileSiz  MemSiz   Flg Align
  LOAD           0x200000 0xffffffff81000000 0x0000000001000000
0x167be37 0x167be37 R E 0x200000
  LOAD           0x1a00000 0xffffffff82800000 0x0000000002800000
0x5a6000 0x5a6000 RW  0x200000
  LOAD           0x2000000 0x0000000000000000 0x0000000002da6000
0x02a258 0x02a258 RW  0x200000
  LOAD           0x21d1000 0xffffffff82dd1000 0x0000000002dd1000
0x104000 0x25b000 RWE 0x200000
  NOTE           0x152ac30 0xffffffff8232ac30 0x000000000232ac30
0x00003c 0x00003c     0x4

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata .pci_fixup .tracedata __ksymtab __ksymtab_gpl
__ksymtab_strings __param __modver __ex_table .notes .BTF
   01     .data __bug_table .orc_unwind_ip .orc_unwind .orc_lookup .vvar
   02     .data..percpu
   03     .init.text .altinstr_aux .init.data .x86_cpu_dev.init
.altinstructions .altinstr_replacement .iommu_table .apicdrivers
.exit.text .smp_locks .data_nosave .bss .brk
   04     .notes

This is the end result. The sausage factory (gen_btf / vmlinux_link -
which I've now read through) actually does:

* link a temporary vmlinux
* run pahole -J on this
* dump out the .BTF as a raw file (anything clever pahole does with
ELF is thrown away here)
* create an ELF file with arch and format to match vmlinux, containing
a single .BTF section
* link this ELF file with the other bits of the kernel.

As a DWARF to BTF converter, pahole's role is clear. At this point I'd
like to separate what's useful for the kernel and what's useful in
terms of generally packaging up .BTF as a kind of debug information
for general use.

Packaging up BTF as an ELF section or linking this into the kernel is
a lot of work to do properly in pahole and duplicates the role of the
linker. If I continue, I'll probably end up creating a disjoint R
segment just for .BTF and I don't know if that's OK. I'm not sure how
much more work is needed to get to the point where all the various
objcopy/objdump can be eliminated or whether this is a worthwhile
goal. Another way of getting rid of the objcopy/objdump dependencies
for the kernel would be to just emit an ELF file containing the .BTF
section only and let the linker do its thing.

For non-kernel use, I'm not sure of the implications of letting libelf
reorder all the sections, or if we'd ever want the .BTF to be in a
loadable segment. If anything, I'd advocate for having pahole just
generate raw BTF output. However, I know there's a big convenience
factor in having debug (type) info packaged into the ELF.

So I'm not sure if it's worth pursuing the line of work beyond
eliminating pahole's dependency on llvm-objcopy. In terms of my
follow-up series, this might mean dropping 3/4 (as preserving existing
ELF layout in the temporary vmlinux isn't needed) but keeping 4/4 (as
it's useful to us, even if it's currently useless for the kernel).

If we cannot get libelf to make the right kind of sausage, then I
agree that vmlinux .BTF alignment should probably follow your earlier
suggestion of `. = ALIGN(8)` in vmlinux.lds.h.

And we haven't even got to discussing modules and merging .BTF info. :-)

Regards,
Giuliano.


> Regards,
> Giuliano.
>
> > >
> > >>
> > >>
> > >> > +               }
> > >> > +
> > >> >                 err = 0;
> > >> >         unlink:
> > >> >                 unlink(tmp_fn);
> > >> > --
> > >> > 2.30.0.284.gd98b1dd5eaa7-goog
> > >> >
> > >
> > >
> > > I'll see if I can spend a little time on this idea instead.
> > >
> > > Regards,
> > > Giuliano.
> >
> > --
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> >
Andrii Nakryiko Jan. 27, 2021, 7:56 p.m. UTC | #8
On Wed, Jan 27, 2021 at 10:07 AM Giuliano Procida <gprocida@google.com> wrote:
>
> Hi.
>
> On Wed, 27 Jan 2021 at 15:08, Giuliano Procida <gprocida@google.com> wrote:
> >
> > Hi Andrii.
> >
> > On Thu, 21 Jan 2021 at 20:08, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Thu, Jan 21, 2021 at 3:07 AM Giuliano Procida <gprocida@google.com> wrote:
> > > >
> > > > Hi.
> > > >
> > > > On Thu, 21 Jan 2021 at 07:16, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > >>
> > > >> On Mon, Jan 18, 2021 at 8:01 AM Giuliano Procida <gprocida@google.com> wrote:
> > > >> >
> > > >> > This is to avoid misaligned access when memory-mapping ELF sections.
> > > >> >
> > > >> > Signed-off-by: Giuliano Procida <gprocida@google.com>
> > > >> > ---
> > > >> >  libbtf.c | 8 ++++++++
> > > >> >  1 file changed, 8 insertions(+)
> > > >> >
> > > >> > diff --git a/libbtf.c b/libbtf.c
> > > >> > index 7552d8e..2f12d53 100644
> > > >> > --- a/libbtf.c
> > > >> > +++ b/libbtf.c
> > > >> > @@ -797,6 +797,14 @@ static int btf_elf__write(const char *filename, struct btf *btf)
> > > >> >                         goto unlink;
> > > >> >                 }
> > > >> >
> > > >> > +               snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
> > > >> > +                        llvm_objcopy, filename);
> > > >>
> > > >> does it align inside the ELF file to 16 bytes, or does it request the
> > > >> linker to align it at 16 byte alignment in memory? Given .BTF section
> > > >> is not loadable, trying to understand the implications.
> > > >>
> > > >
> > > > We have a tool that loads BTF from ELF files. It uses mmap and "parses" the BTF as structs in memory. The ELF file is mapped with page alignment but the BTF section within it has no alignment at all. Using MSAN (IIRC) we get warnings about misaligned accesses. Everything within BTF itself is naturally aligned, so it makes sense to align the section within ELF as well. There are probably some architectures where this makes the difference between working and SIGBUS.
> > > >
> > >
> > > Right, ok, thanks for explaining!
> > >
> > > > I did try to get objcopy to set alignment at the point the section is added. However, this didn't work.
> > > >
> > > >>
> > > >>
> > > >> > +               if (system(cmd)) {
> > > >>
> > > >> Also curious, if objcopy emits error (saying that
> > > >> --set-section-alignment argument is not recognized), will that error
> > > >> be shown in stdout? or system() consumes it without redirecting it to
> > > >> stdout?
> > > >>
> > > >
> > > > I believe it goes to stderr. I would need to check. system() will not consume this. I'm not keen to write stderr (or stdout) post-processing code in plain C.
> > > >
> > >
> > > You can use popen() to capture/hide output, this is a better
> > > alternative to system() in this case. We don't want "expected
> > > warnings" in kernel build process.
> > >
> > > >>
> > > >> > +                       /* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
> > > >> > +                       fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
> > > >> > +                               __func__, filename, errno);
> > > >>
> > > >> Probably better to emit this warning only in verbose mode, otherwise
> > > >> lots of people will start complaining that they get some new warnings
> > > >> from pahole.
> > > >>
> > > >
> > > > It may be better to just use POSIX and ELF APIs directly instead of objcopy. This way the section can be added with the right alignment directly. pahole is already linked against libelf and if we could get rid of the external dependency on objcopy it would be a win in more than one way.
> > >
> > > This would be great, yes. At some point I remember giving it a try,
> > > but for some reason I couldn't make libelf flush data and update
> > > section headers properly. Maybe you'll have better luck. Though I
> > > think I was trying to mark section loadable, and eventually I probably
> > > managed to do that, but still abandoned it (it's not enough to mark
> > > section loadable, you have to assign it to ELF segment as well, which
> > > libelf doesn't allow to do and you need linker support). Anyways, give
> > > it a try, it should work.
> > >
> >
> > I found 341dfcf8d78eaa3a2dc96dea06f0392eb2978364 ("btf: expose BTF
> > info through sysfs") and I now see what you mean.
> >
> > Alignment of .BTF as produced by the linker script is currently not
> > down to pahole at all. The kernel link script has to add .BTF in a
> > rather roundabout way because it needs to be added as a loadable
> > segment and pahole only adds it as a plain section.
> >
> > pahole's does this using llvm-objcopy (which I spotted has some
> > side-effects on our AOSP vmlinux). On vanilla kernels, while
> > llvm-objcopy doesn't rewrite (or at least, resize) .strtab, it does
> > renumber sections so that the offset order is monotonic.
> >
> > We're working with .BTF in userspace and haven't needed .BTF as a
> > segment. If I managed to get pahole to make .BTF a loadable segment as
> > well, then the linker scripts could be simplified. I'll see if I can
> > do this part as well.
>
> OK...
>
> $ readelf -lW /tmp/vmlinux
>
> Elf file type is EXEC (Executable file)
> Entry point 0x1000000
> There are 5 program headers, starting at offset 64
>
> Program Headers:
>   Type           Offset   VirtAddr           PhysAddr
> FileSiz  MemSiz   Flg Align
>   LOAD           0x200000 0xffffffff81000000 0x0000000001000000
> 0x167be37 0x167be37 R E 0x200000
>   LOAD           0x1a00000 0xffffffff82800000 0x0000000002800000
> 0x5a6000 0x5a6000 RW  0x200000
>   LOAD           0x2000000 0x0000000000000000 0x0000000002da6000
> 0x02a258 0x02a258 RW  0x200000
>   LOAD           0x21d1000 0xffffffff82dd1000 0x0000000002dd1000
> 0x104000 0x25b000 RWE 0x200000
>   NOTE           0x152ac30 0xffffffff8232ac30 0x000000000232ac30
> 0x00003c 0x00003c     0x4
>
>  Section to Segment mapping:
>   Segment Sections...
>    00     .text .rodata .pci_fixup .tracedata __ksymtab __ksymtab_gpl
> __ksymtab_strings __param __modver __ex_table .notes .BTF
>    01     .data __bug_table .orc_unwind_ip .orc_unwind .orc_lookup .vvar
>    02     .data..percpu
>    03     .init.text .altinstr_aux .init.data .x86_cpu_dev.init
> .altinstructions .altinstr_replacement .iommu_table .apicdrivers
> .exit.text .smp_locks .data_nosave .bss .brk
>    04     .notes
>
> This is the end result. The sausage factory (gen_btf / vmlinux_link -
> which I've now read through) actually does:
>
> * link a temporary vmlinux
> * run pahole -J on this
> * dump out the .BTF as a raw file (anything clever pahole does with
> ELF is thrown away here)
> * create an ELF file with arch and format to match vmlinux, containing
> a single .BTF section
> * link this ELF file with the other bits of the kernel.
>
> As a DWARF to BTF converter, pahole's role is clear. At this point I'd
> like to separate what's useful for the kernel and what's useful in
> terms of generally packaging up .BTF as a kind of debug information
> for general use.
>
> Packaging up BTF as an ELF section or linking this into the kernel is
> a lot of work to do properly in pahole and duplicates the role of the
> linker. If I continue, I'll probably end up creating a disjoint R
> segment just for .BTF and I don't know if that's OK. I'm not sure how
> much more work is needed to get to the point where all the various
> objcopy/objdump can be eliminated or whether this is a worthwhile
> goal. Another way of getting rid of the objcopy/objdump dependencies
> for the kernel would be to just emit an ELF file containing the .BTF
> section only and let the linker do its thing.
>
> For non-kernel use, I'm not sure of the implications of letting libelf
> reorder all the sections, or if we'd ever want the .BTF to be in a
> loadable segment. If anything, I'd advocate for having pahole just
> generate raw BTF output. However, I know there's a big convenience
> factor in having debug (type) info packaged into the ELF.

At this point I don't think pahole can mark .BTF as loadable, at best
it should be an opt-in. Order of .BTF section relative to others
shouldn't matter, though. I, honestly, lost track of what's the latest
status of your work, but if it's dangerous to do with libelf, I'd use
llvm-objcopy, but would do better detection of
--set-section-alignment.

>
> So I'm not sure if it's worth pursuing the line of work beyond
> eliminating pahole's dependency on llvm-objcopy. In terms of my
> follow-up series, this might mean dropping 3/4 (as preserving existing
> ELF layout in the temporary vmlinux isn't needed) but keeping 4/4 (as
> it's useful to us, even if it's currently useless for the kernel).

Right, I'm not aware of any other use case beyond vmlinux and kernel
modules where BTF has to be in a loadable section.

>
> If we cannot get libelf to make the right kind of sausage, then I
> agree that vmlinux .BTF alignment should probably follow your earlier
> suggestion of `. = ALIGN(8)` in vmlinux.lds.h.

this can be done independent of all the other decisions, IMO

>
> And we haven't even got to discussing modules and merging .BTF info. :-)

that shouldn't matter at all, in the end it's a single .BTF blob that
needs to be put into ELF

>
> Regards,
> Giuliano.
>
>
> > Regards,
> > Giuliano.
> >
> > > >
> > > >>
> > > >>
> > > >> > +               }
> > > >> > +
> > > >> >                 err = 0;
> > > >> >         unlink:
> > > >> >                 unlink(tmp_fn);
> > > >> > --
> > > >> > 2.30.0.284.gd98b1dd5eaa7-goog
> > > >> >
> > > >
> > > >
> > > > I'll see if I can spend a little time on this idea instead.
> > > >
> > > > Regards,
> > > > Giuliano.
> > >
> > > --
> > > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> > >
diff mbox series

Patch

diff --git a/libbtf.c b/libbtf.c
index 7552d8e..2f12d53 100644
--- a/libbtf.c
+++ b/libbtf.c
@@ -797,6 +797,14 @@  static int btf_elf__write(const char *filename, struct btf *btf)
 			goto unlink;
 		}
 
+		snprintf(cmd, sizeof(cmd), "%s --set-section-alignment .BTF=16 %s",
+			 llvm_objcopy, filename);
+		if (system(cmd)) {
+			/* non-fatal, this is a nice-to-have and it's only supported from LLVM 10 */
+			fprintf(stderr, "%s: warning: failed to align .BTF section in '%s': %d!\n",
+				__func__, filename, errno);
+		}
+
 		err = 0;
 	unlink:
 		unlink(tmp_fn);