mbox series

[v7,bpf-next,00/10] Harden and extend ELF build ID parsing logic

Message ID 20240829174232.3133883-1-andrii@kernel.org (mailing list archive)
Headers show
Series Harden and extend ELF build ID parsing logic | expand

Message

Andrii Nakryiko Aug. 29, 2024, 5:42 p.m. UTC
The goal of this patch set is to extend existing ELF build ID parsing logic,
currently mostly used by BPF subsystem, with support for working in sleepable
mode in which memory faults are allowed and can be relied upon to fetch
relevant parts of ELF file to find and fetch .note.gnu.build-id information.

This is useful and important for BPF subsystem itself, but also for
PROCMAP_QUERY ioctl(), built atop of /proc/<pid>/maps functionality (see [0]),
which makes use of the same build_id_parse() functionality. PROCMAP_QUERY is
always called from sleepable user process context, so it doesn't have to
suffer from current restrictions of build_id_parse() which are due to the NMI
context assumption.

Along the way, we harden the logic to avoid TOCTOU, overflow, out-of-bounds
access problems.  This is the very first patch, which can be backported to
older releases, if necessary.

We also lift existing limitations of only working as long as ELF program
headers and build ID note section is contained strictly within the very first
page of ELF file.

We achieve all of the above without duplication of logic between sleepable and
non-sleepable modes through freader abstraction that manages underlying folio
from page cache (on demand) and gives a simple to use direct memory access
interface. With that, single page restrictions and adding sleepable mode
support is rather straightforward.

We also extend existing set of BPF selftests with a few tests targeting build
ID logic across sleepable and non-sleepabe contexts (we utilize sleepable and
non-sleepable uprobes for that).

   [0] https://lore.kernel.org/linux-mm/20240627170900.1672542-4-andrii@kernel.org/

v6->v7:
  - added filemap_invalidate_{lock,unlock}_shared() around read_cache_folio
    and kept Eduard's Reviewed-by (Eduard);
v5->v6:
  - use local phnum variable in get_build_id_32() (Jann);
  - switch memcmp() instead of strcmp() in parse_build_id() (Jann);
v4->v5:
  - pass proper file reference to read_cache_folio() (Shakeel);
  - fix another potential overflow due to two u32 additions (Andi);
  - add PageUptodate() check to patch #1 (Jann);
v3->v4:
  - fix few more potential overflow and out-of-bounds access issues (Andi);
  - use purely folio-based implementation for freader (Matthew);
v2->v3:
  - remove unneeded READ_ONCE()s and force phoff to u64 for 32-bit mode (Andi);
  - moved hardening fixes to the front for easier backporting (Jann);
  - call freader_cleanup() from build_id_parse_buf() for consistency (Jiri);
v1->v2:
  - ensure MADV_PAGEOUT works reliably by paging data in first (Shakeel);
  - to fix BPF CI build optionally define MADV_POPULATE_READ in selftest.

Andrii Nakryiko (10):
  lib/buildid: harden build ID parsing logic
  lib/buildid: add single folio-based file reader abstraction
  lib/buildid: take into account e_phoff when fetching program headers
  lib/buildid: remove single-page limit for PHDR search
  lib/buildid: rename build_id_parse() into build_id_parse_nofault()
  lib/buildid: implement sleepable build_id_parse() API
  lib/buildid: don't limit .note.gnu.build-id to the first page in ELF
  bpf: decouple stack_map_get_build_id_offset() from
    perf_callchain_entry
  bpf: wire up sleepable bpf_get_stack() and bpf_get_task_stack()
    helpers
  selftests/bpf: add build ID tests

 include/linux/bpf.h                           |   2 +
 include/linux/buildid.h                       |   4 +-
 kernel/bpf/stackmap.c                         | 131 ++++--
 kernel/events/core.c                          |   2 +-
 kernel/trace/bpf_trace.c                      |   5 +-
 lib/buildid.c                                 | 397 +++++++++++++-----
 tools/testing/selftests/bpf/Makefile          |   5 +-
 .../selftests/bpf/prog_tests/build_id.c       | 118 ++++++
 .../selftests/bpf/progs/test_build_id.c       |  31 ++
 tools/testing/selftests/bpf/uprobe_multi.c    |  41 ++
 tools/testing/selftests/bpf/uprobe_multi.ld   |  11 +
 11 files changed, 605 insertions(+), 142 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/build_id.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_build_id.c
 create mode 100644 tools/testing/selftests/bpf/uprobe_multi.ld

Comments

Andrii Nakryiko Sept. 3, 2024, 10:38 p.m. UTC | #1
On Thu, Aug 29, 2024 at 10:42 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> The goal of this patch set is to extend existing ELF build ID parsing logic,
> currently mostly used by BPF subsystem, with support for working in sleepable
> mode in which memory faults are allowed and can be relied upon to fetch
> relevant parts of ELF file to find and fetch .note.gnu.build-id information.
>
> This is useful and important for BPF subsystem itself, but also for
> PROCMAP_QUERY ioctl(), built atop of /proc/<pid>/maps functionality (see [0]),
> which makes use of the same build_id_parse() functionality. PROCMAP_QUERY is
> always called from sleepable user process context, so it doesn't have to
> suffer from current restrictions of build_id_parse() which are due to the NMI
> context assumption.
>
> Along the way, we harden the logic to avoid TOCTOU, overflow, out-of-bounds
> access problems.  This is the very first patch, which can be backported to
> older releases, if necessary.
>
> We also lift existing limitations of only working as long as ELF program
> headers and build ID note section is contained strictly within the very first
> page of ELF file.
>
> We achieve all of the above without duplication of logic between sleepable and
> non-sleepable modes through freader abstraction that manages underlying folio
> from page cache (on demand) and gives a simple to use direct memory access
> interface. With that, single page restrictions and adding sleepable mode
> support is rather straightforward.
>
> We also extend existing set of BPF selftests with a few tests targeting build
> ID logic across sleepable and non-sleepabe contexts (we utilize sleepable and
> non-sleepable uprobes for that).
>
>    [0] https://lore.kernel.org/linux-mm/20240627170900.1672542-4-andrii@kernel.org/
>
> v6->v7:
>   - added filemap_invalidate_{lock,unlock}_shared() around read_cache_folio
>     and kept Eduard's Reviewed-by (Eduard);
> v5->v6:
>   - use local phnum variable in get_build_id_32() (Jann);
>   - switch memcmp() instead of strcmp() in parse_build_id() (Jann);
> v4->v5:
>   - pass proper file reference to read_cache_folio() (Shakeel);
>   - fix another potential overflow due to two u32 additions (Andi);
>   - add PageUptodate() check to patch #1 (Jann);
> v3->v4:
>   - fix few more potential overflow and out-of-bounds access issues (Andi);
>   - use purely folio-based implementation for freader (Matthew);

Ok, so I'm not sure what one needs to do to get Matthew's attention
nowadays, but hopefully yet another ping might do the trick.

Matthew,

Can you please take another look and provide your ack or nack? I did
the conversion to folio as you requested. It would be nice if you can
give me a courtesy of acking my patch set, if there is nothing wrong
with it, so it can finally go in.

Thank you.

> v2->v3:
>   - remove unneeded READ_ONCE()s and force phoff to u64 for 32-bit mode (Andi);
>   - moved hardening fixes to the front for easier backporting (Jann);
>   - call freader_cleanup() from build_id_parse_buf() for consistency (Jiri);
> v1->v2:
>   - ensure MADV_PAGEOUT works reliably by paging data in first (Shakeel);
>   - to fix BPF CI build optionally define MADV_POPULATE_READ in selftest.
>
> Andrii Nakryiko (10):
>   lib/buildid: harden build ID parsing logic
>   lib/buildid: add single folio-based file reader abstraction
>   lib/buildid: take into account e_phoff when fetching program headers
>   lib/buildid: remove single-page limit for PHDR search
>   lib/buildid: rename build_id_parse() into build_id_parse_nofault()
>   lib/buildid: implement sleepable build_id_parse() API
>   lib/buildid: don't limit .note.gnu.build-id to the first page in ELF
>   bpf: decouple stack_map_get_build_id_offset() from
>     perf_callchain_entry
>   bpf: wire up sleepable bpf_get_stack() and bpf_get_task_stack()
>     helpers
>   selftests/bpf: add build ID tests
>
>  include/linux/bpf.h                           |   2 +
>  include/linux/buildid.h                       |   4 +-
>  kernel/bpf/stackmap.c                         | 131 ++++--
>  kernel/events/core.c                          |   2 +-
>  kernel/trace/bpf_trace.c                      |   5 +-
>  lib/buildid.c                                 | 397 +++++++++++++-----
>  tools/testing/selftests/bpf/Makefile          |   5 +-
>  .../selftests/bpf/prog_tests/build_id.c       | 118 ++++++
>  .../selftests/bpf/progs/test_build_id.c       |  31 ++
>  tools/testing/selftests/bpf/uprobe_multi.c    |  41 ++
>  tools/testing/selftests/bpf/uprobe_multi.ld   |  11 +
>  11 files changed, 605 insertions(+), 142 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/build_id.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_build_id.c
>  create mode 100644 tools/testing/selftests/bpf/uprobe_multi.ld
>
> --
> 2.43.5
>
Alexei Starovoitov Sept. 11, 2024, 12:27 a.m. UTC | #2
On Tue, Sep 3, 2024 at 3:39 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Aug 29, 2024 at 10:42 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > The goal of this patch set is to extend existing ELF build ID parsing logic,
> > currently mostly used by BPF subsystem, with support for working in sleepable
> > mode in which memory faults are allowed and can be relied upon to fetch
> > relevant parts of ELF file to find and fetch .note.gnu.build-id information.
> >
> > This is useful and important for BPF subsystem itself, but also for
> > PROCMAP_QUERY ioctl(), built atop of /proc/<pid>/maps functionality (see [0]),
> > which makes use of the same build_id_parse() functionality. PROCMAP_QUERY is
> > always called from sleepable user process context, so it doesn't have to
> > suffer from current restrictions of build_id_parse() which are due to the NMI
> > context assumption.
> >
> > Along the way, we harden the logic to avoid TOCTOU, overflow, out-of-bounds
> > access problems.  This is the very first patch, which can be backported to
> > older releases, if necessary.
> >
> > We also lift existing limitations of only working as long as ELF program
> > headers and build ID note section is contained strictly within the very first
> > page of ELF file.
> >
> > We achieve all of the above without duplication of logic between sleepable and
> > non-sleepable modes through freader abstraction that manages underlying folio
> > from page cache (on demand) and gives a simple to use direct memory access
> > interface. With that, single page restrictions and adding sleepable mode
> > support is rather straightforward.
> >
> > We also extend existing set of BPF selftests with a few tests targeting build
> > ID logic across sleepable and non-sleepabe contexts (we utilize sleepable and
> > non-sleepable uprobes for that).
> >
> >    [0] https://lore.kernel.org/linux-mm/20240627170900.1672542-4-andrii@kernel.org/
> >
> > v6->v7:
> >   - added filemap_invalidate_{lock,unlock}_shared() around read_cache_folio
> >     and kept Eduard's Reviewed-by (Eduard);
> > v5->v6:
> >   - use local phnum variable in get_build_id_32() (Jann);
> >   - switch memcmp() instead of strcmp() in parse_build_id() (Jann);
> > v4->v5:
> >   - pass proper file reference to read_cache_folio() (Shakeel);
> >   - fix another potential overflow due to two u32 additions (Andi);
> >   - add PageUptodate() check to patch #1 (Jann);
> > v3->v4:
> >   - fix few more potential overflow and out-of-bounds access issues (Andi);
> >   - use purely folio-based implementation for freader (Matthew);
>
> Ok, so I'm not sure what one needs to do to get Matthew's attention
> nowadays, but hopefully yet another ping might do the trick.
>
> Matthew,
>
> Can you please take another look and provide your ack or nack? I did
> the conversion to folio as you requested. It would be nice if you can
> give me a courtesy of acking my patch set, if there is nothing wrong
> with it, so it can finally go in.

Looks like no further comments from Matthew or anyone else.

I'll take another look through the set before applying to bpf-next.
patchwork-bot+netdevbpf@kernel.org Sept. 11, 2024, 5:10 p.m. UTC | #3
Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu, 29 Aug 2024 10:42:22 -0700 you wrote:
> The goal of this patch set is to extend existing ELF build ID parsing logic,
> currently mostly used by BPF subsystem, with support for working in sleepable
> mode in which memory faults are allowed and can be relied upon to fetch
> relevant parts of ELF file to find and fetch .note.gnu.build-id information.
> 
> This is useful and important for BPF subsystem itself, but also for
> PROCMAP_QUERY ioctl(), built atop of /proc/<pid>/maps functionality (see [0]),
> which makes use of the same build_id_parse() functionality. PROCMAP_QUERY is
> always called from sleepable user process context, so it doesn't have to
> suffer from current restrictions of build_id_parse() which are due to the NMI
> context assumption.
> 
> [...]

Here is the summary with links:
  - [v7,bpf-next,01/10] lib/buildid: harden build ID parsing logic
    https://git.kernel.org/bpf/bpf-next/c/905415ff3ffb
  - [v7,bpf-next,02/10] lib/buildid: add single folio-based file reader abstraction
    https://git.kernel.org/bpf/bpf-next/c/de3ec364c3c3
  - [v7,bpf-next,03/10] lib/buildid: take into account e_phoff when fetching program headers
    https://git.kernel.org/bpf/bpf-next/c/d4deb8242341
  - [v7,bpf-next,04/10] lib/buildid: remove single-page limit for PHDR search
    https://git.kernel.org/bpf/bpf-next/c/4e9d360c4cdf
  - [v7,bpf-next,05/10] lib/buildid: rename build_id_parse() into build_id_parse_nofault()
    https://git.kernel.org/bpf/bpf-next/c/45b8fc309654
  - [v7,bpf-next,06/10] lib/buildid: implement sleepable build_id_parse() API
    https://git.kernel.org/bpf/bpf-next/c/ad41251c290d
  - [v7,bpf-next,07/10] lib/buildid: don't limit .note.gnu.build-id to the first page in ELF
    https://git.kernel.org/bpf/bpf-next/c/cdbb44f9a74f
  - [v7,bpf-next,08/10] bpf: decouple stack_map_get_build_id_offset() from perf_callchain_entry
    https://git.kernel.org/bpf/bpf-next/c/4f4c4fc0153f
  - [v7,bpf-next,09/10] bpf: wire up sleepable bpf_get_stack() and bpf_get_task_stack() helpers
    https://git.kernel.org/bpf/bpf-next/c/d4dd9775ec24
  - [v7,bpf-next,10/10] selftests/bpf: add build ID tests
    https://git.kernel.org/bpf/bpf-next/c/3c217a182018

You are awesome, thank you!