[bpf-next,0/3] libbpf: Make uprobe attachment APK aware

Message ID 20230217191908.1000004-1-deso@posteo.net (mailing list archive)

Message

Daniel Müller Feb. 17, 2023, 7:19 p.m. UTC
On Android, APKs (Android packages; zip packages with somewhat
prescriptive contents) are first-class citizens in the system: the
shared objects contained in them don't exist in unpacked form on the
file system. Rather, they are mmapped directly from within the archive
and the archive is also what the kernel is aware of.

For users, that complicates the process of attaching a uprobe to a
function contained in a shared object in one such APK: they'd have to
find the byte offset of said function from the beginning of the archive.
That is cumbersome to do manually and can be fragile, because various
changes could invalidate said offset.

That is why for uprobes inside ELF files (not inside an APK), commit
d112c9ce249b ("libbpf: Support function name-based attach uprobes") added
support for attaching to symbols by name. On Android, that mechanism
currently does not work, because this logic is not APK aware.

This patch set introduces first-class support for attaching uprobes to
functions inside ELF objects contained in APKs via function names. We
add support for recognizing the following syntax for a binary path:
  <archive>!/<binary-in-archive>

  (e.g., /system/app/test-app.apk!/lib/arm64-v8a/libc++.so)

This syntax is common in the Android ecosystem and is used by tools such
as simpleperf. It is also what is being proposed for bcc [0].
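
To illustrate the path syntax, here is a minimal sketch of how such a
path could be split at the "!/" separator. The helper name
split_apk_path() and its signature are hypothetical and chosen for this
example only; they are not taken from the patches.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch set): split a path of the form
 * "<archive>!/<binary-in-archive>" at the first "!/" separator.
 *
 * On success, copies the archive path into `archive` (NUL-terminated),
 * stores a pointer to the member path in `*member` and returns 0.
 * Returns -1 if there is no separator, if either side is empty, or if
 * the archive path does not fit into the buffer.
 */
static int split_apk_path(const char *path, char *archive, size_t archive_sz,
			  const char **member)
{
	const char *sep = strstr(path, "!/");
	size_t len;

	if (!sep)
		return -1;

	len = (size_t)(sep - path);
	if (len == 0 || len >= archive_sz || sep[2] == '\0')
		return -1;

	memcpy(archive, path, len);
	archive[len] = '\0';
	*member = sep + 2; /* skip over "!/" */
	return 0;
}
```

A path without the separator (a plain ELF file) makes the helper return
-1, so a caller could fall back to the existing non-APK logic.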

If the user provides such a binary path, we find <binary-in-archive>
(lib/arm64-v8a/libc++.so in the example) inside of <archive>
(/system/app/test-app.apk). We perform the regular ELF offset search
inside the binary and add the result to the binary's offset within the
archive, which yields the offset at which to attach the uprobe.
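
The offset arithmetic can be sketched roughly as follows. This is an
illustration under stated assumptions, not the code from the patches:
the function names are hypothetical, the offset of the entry's local
file header is assumed to be known already (e.g., from the zip central
directory), and the entry is assumed to be stored uncompressed, which it
has to be in order to be mmapped directly. The header layout (30 fixed
bytes followed by file name and extra field) follows the zip format.

```c
#include <stddef.h>
#include <stdint.h>

#define ZIP_LOCAL_HDR_MAGIC 0x04034b50u /* "PK\3\4", little endian */
#define ZIP_LOCAL_HDR_SIZE  30u         /* fixed part of local file header */

/* Read little-endian integers from a byte buffer. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: given the offset of an entry's local file header
 * inside the archive, compute where the entry's data (here: the ELF
 * file) starts. Returns 0 on success, -1 on malformed or compressed
 * input.
 */
static int zip_entry_data_offset(const uint8_t *zip, size_t zip_sz,
				 size_t hdr_off, size_t *data_off)
{
	const uint8_t *hdr;
	size_t off;

	if (hdr_off > zip_sz || zip_sz - hdr_off < ZIP_LOCAL_HDR_SIZE)
		return -1;

	hdr = zip + hdr_off;
	if (rd32(hdr) != ZIP_LOCAL_HDR_MAGIC)
		return -1;
	if (rd16(hdr + 8) != 0) /* compression method 0 == stored */
		return -1;

	/* Data follows the fixed header, the file name and the extra field. */
	off = hdr_off + ZIP_LOCAL_HDR_SIZE + rd16(hdr + 26) + rd16(hdr + 28);
	if (off > zip_sz)
		return -1;

	*data_off = off;
	return 0;
}

/* Attach offset = start of the ELF in the archive + function offset in the ELF. */
static size_t apk_uprobe_offset(size_t elf_off_in_archive, size_t func_off_in_elf)
{
	return elf_off_in_archive + func_off_in_elf;
}
```

The second argument to apk_uprobe_offset() stands for the result of the
regular ELF symbol lookup, performed on the ELF data found at the
computed offset.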

[0] https://github.com/iovisor/bcc/pull/4440

Daniel Müller (3):
  libbpf: Implement basic zip archive parsing support
  libbpf: Introduce elf_find_func_offset_from_elf_file() function
  libbpf: Add support for attaching uprobes to shared objects in APKs

 tools/lib/bpf/Build    |   2 +-
 tools/lib/bpf/libbpf.c | 137 ++++++++++++---
 tools/lib/bpf/zip.c    | 371 +++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/zip.h    |  47 ++++++
 4 files changed, 533 insertions(+), 24 deletions(-)
 create mode 100644 tools/lib/bpf/zip.c
 create mode 100644 tools/lib/bpf/zip.h

Comments

Alan Maguire Feb. 18, 2023, 8:29 p.m. UTC | #1
On 17/02/2023 19:19, Daniel Müller wrote:
> [...]

I have to look in a bit more depth here, but my first thought is whether
we need the APK specifics in libbpf itself. Would having additional
uprobe opts that specify ELF memory and some sort of "don't attach,
just figure out offset" flag work? Then you could perhaps do the work
in patch 3 outside of libbpf, calling attach once to get the
offset within the ELF (using the changes in patch 2 to support ELF
memory), then a second time to do the attach using the offset previously
computed.

Then you could implement the APK handling in a custom SEC() handler
which runs based on seeing an APK path or apk_uprobe/ prefix. Is that
approach feasible? I'm guessing there's something I'm missing, but it
would be good to understand what that is. Thanks!

Alan

> [...]
Daniel Müller Feb. 21, 2023, 11:34 p.m. UTC | #2
On Sat, Feb 18, 2023 at 08:29:32PM +0000, Alan Maguire wrote:
> On 17/02/2023 19:19, Daniel Müller wrote:
> > [...]
> 
> I have to look in a bit more depth here, but my first thought is if
> we need the APK specifics in libbpf itself? Would having additional
> uprobe opts that specify elf memory and some sort of "don't attach,
> just figure out offset" flag work? Then you could perhaps do the work
> in patch 3 outside of libbpf, calling attach once to get the
> offset within the elf (using the changes in patch 2 to support ELF
> memory), then a second time to do the attach using the offset previously
> computed.
> 
> Then you could implement the APK handling in a custom SEC() handler
> which runs based on seeing an APK path or apk_uprobe/ prefix. Is that
> approach feasible? I'm guessing there's something I'm missing, but it
> would be good to understand what that is. Thanks!

Thanks for taking a look! From what I understand, what you laid out could work as
well (though the devil may be in the details here; I am not particularly familiar
with custom SEC handlers and so unless it's being prototyped I can't say for
certain).
That being said, I am not sure I see how it is superior: it strikes me as more
complicated just from a control flow and orchestration point of view. It also
does not seem more user-friendly to work with. As mentioned in the description,
the proposed syntax addition is common in the ecosystem. I would think that
supporting it benefits users, which in turn helps with adoption of libbpf
on Android systems.

Thanks,
Daniel

[...]
Andrii Nakryiko Feb. 24, 2023, 12:18 a.m. UTC | #3
On Tue, Feb 21, 2023 at 3:35 PM Daniel Müller <deso@posteo.net> wrote:
>
> On Sat, Feb 18, 2023 at 08:29:32PM +0000, Alan Maguire wrote:
> > On 17/02/2023 19:19, Daniel Müller wrote:
> > > [...]
> >
> > I have to look in a bit more depth here, but my first thought is if
> > we need the APK specifics in libbpf itself? Would having additional
> > uprobe opts that specify elf memory and some sort of "don't attach,
> > just figure out offset" flag work? Then you could perhaps do the work
> > in patch 3 outside of libbpf, calling attach once to get the
> > offset within the elf (using the changes in patch 2 to support ELF
> > memory), then a second time to do the attach using the offset previously
> > computed.
> >
> > Then you could implement the APK handling in a custom SEC() handler
> > which runs based on seeing an APK path or apk_uprobe/ prefix. Is that
> > approach feasible? I'm guessing there's something I'm missing, but it
> > would be good to understand what that is. Thanks!
>
> Thanks for taking a look! From what I understand what you laid out could work as
> well (though the devil may be in the detail here; I am not particularly familiar
> with custom SEC handlers and so unless it's being prototyped I can't say for
> certain).
> That being said, I am not sure I see how it is superior: it strikes me as more
> complicated just from a control flow and orchestration point of view. It also
> does not seem more user friendly to work with. As mentioned in the description,
> the proposed syntax addition is common in the eco system. I would think that
> supporting it benefits users, which in turn helps with adoption of libbpf usage
> on Android systems.
>

+1. Yes, a lot of stuff could be implemented with a custom SEC()
handler, but here the point is to have good out-of-the-box declarative
support for very typical APK-based attachments.

> Thanks,
> Daniel
>
> [...]