mbox series

[RFC,0/4] Make bpf_jit and kprobes work with CONFIG_MODULES=n

Message ID cover.1709676663.git.jcalvinowens@gmail.com (mailing list archive)
Headers show
Series Make bpf_jit and kprobes work with CONFIG_MODULES=n | expand

Message

Calvin Owens March 6, 2024, 8:05 p.m. UTC
Hello all,

This patchset makes it possible to use bpftrace with kprobes on kernels
built without loadable module support.

On a Raspberry Pi 4b, this saves about 700KB of memory where BPF is
needed but loadable module support is not. These two kernels had
identical configurations, except CONFIG_MODULE was off in the second:

   - Linux version 6.8.0-rc7
   - Memory: 3330672K/4050944K available (16576K kernel code, 2390K rwdata,
   - 12364K rodata, 5632K init, 675K bss, 195984K reserved, 524288K cma-reserved)
   + Linux version 6.8.0-rc7-00003-g2af01251ca21
   + Memory: 3331400K/4050944K available (16512K kernel code, 2384K rwdata,
   + 11728K rodata, 5632K init, 673K bss, 195256K reserved, 524288K cma-reserved)

I don't intend to present an exhaustive list of !MODULES usecases, since
I'm sure there are many I'm not aware of. Performance is a common one,
the primary justification being that static text is mapped on hugepages
and module text is not. Security is another, since rootkits are much
harder to implement without modules.

The first patch is the interesting one: it moves module_alloc() into its
own file with its own Kconfig option, so it can be utilized even when
loadable module support is disabled. I got the idea from an unmerged
patch from a few years ago I found on lkml (see [1/4] for details). I
think this also has value in its own right, since I suspect there are
potential users beyond bpf, hopefully we will hear from some.

Patches 2-3 are proofs of concept to demonstrate the first patch is
sufficient to achieve my goal (full ebpf functionality without modules).

Patch 4 adds a new "-n" argument to vmtest.sh to run the BPF selftests
without modules, so the prior three patches can be rigorously tested.

If something like the first patch were to eventually be merged, the rest
could go through the normal bpf-next process as I clean them up: I've
only based them on Linus' tree and combined them into a series here to
introduce the idea.

If you prefer to fetch the patches via git:

  [1/4] https://github.com/jcalvinowens/linux.git work/module-alloc
 +[2/4]+[3/4] https://github.com/jcalvinowens/linux.git work/nomodule-bpf
 +[4/4] https://github.com/jcalvinowens/linux.git testing/nomodule-bpf-ci

In addition to the automated BPF selftests, I've lightly tested this on
my laptop (x86_64), a Raspberry Pi 4b (arm64), and a Raspberry Pi Zero W
(arm). The other architectures have only been compile tested.

I didn't want to spam all the arch maintainers with what I expect will
be a discussion mostly about modules and bpf, so I've left them off this
first submission. I will be sure to add them on future submissions of
the first patch. Of course, feedback on the arch bits is welcome here.

In addition to feedback on the patches themselves, I'm interested in
hearing from anybody else who might find this functionality useful.

Thanks,
Calvin


Calvin Owens (4):
  module: mm: Make module_alloc() generally available
  bpf: Allow BPF_JIT with CONFIG_MODULES=n
  kprobes: Allow kprobes with CONFIG_MODULES=n
  selftests/bpf: Support testing the !MODULES case

 arch/Kconfig                                  |   4 +-
 arch/arm/kernel/module.c                      |  35 -----
 arch/arm/mm/Makefile                          |   2 +
 arch/arm/mm/module_alloc.c                    |  40 ++++++
 arch/arm64/kernel/module.c                    | 127 -----------------
 arch/arm64/mm/Makefile                        |   1 +
 arch/arm64/mm/module_alloc.c                  | 130 ++++++++++++++++++
 arch/loongarch/kernel/module.c                |   6 -
 arch/loongarch/mm/Makefile                    |   2 +
 arch/loongarch/mm/module_alloc.c              |  10 ++
 arch/mips/kernel/module.c                     |  10 --
 arch/mips/mm/Makefile                         |   2 +
 arch/mips/mm/module_alloc.c                   |  13 ++
 arch/nios2/kernel/module.c                    |  20 ---
 arch/nios2/mm/Makefile                        |   2 +
 arch/nios2/mm/module_alloc.c                  |  22 +++
 arch/parisc/kernel/module.c                   |  12 --
 arch/parisc/mm/Makefile                       |   1 +
 arch/parisc/mm/module_alloc.c                 |  15 ++
 arch/powerpc/kernel/module.c                  |  36 -----
 arch/powerpc/mm/Makefile                      |   1 +
 arch/powerpc/mm/module_alloc.c                |  41 ++++++
 arch/riscv/kernel/module.c                    |  11 --
 arch/riscv/mm/Makefile                        |   1 +
 arch/riscv/mm/module_alloc.c                  |  17 +++
 arch/s390/kernel/module.c                     |  37 -----
 arch/s390/mm/Makefile                         |   1 +
 arch/s390/mm/module_alloc.c                   |  42 ++++++
 arch/sparc/kernel/module.c                    |  31 -----
 arch/sparc/mm/Makefile                        |   2 +
 arch/sparc/mm/module_alloc.c                  |  31 +++++
 arch/x86/kernel/ftrace.c                      |   2 +-
 arch/x86/kernel/module.c                      |  56 --------
 arch/x86/mm/Makefile                          |   2 +
 arch/x86/mm/module_alloc.c                    |  59 ++++++++
 fs/proc/kcore.c                               |   2 +-
 include/trace/events/bpf_testmod.h            |   1 +
 kernel/bpf/Kconfig                            |  11 +-
 kernel/bpf/Makefile                           |   2 +
 kernel/bpf/bpf_struct_ops.c                   |  28 +++-
 kernel/bpf/bpf_testmod/Makefile               |   1 +
 kernel/bpf/bpf_testmod/bpf_testmod.c          |   1 +
 kernel/bpf/bpf_testmod/bpf_testmod.h          |   1 +
 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h    |   1 +
 kernel/kprobes.c                              |  22 +++
 kernel/module/Kconfig                         |   1 +
 kernel/module/main.c                          |  17 ---
 kernel/trace/trace_kprobe.c                   |  11 ++
 mm/Kconfig                                    |   3 +
 mm/Makefile                                   |   1 +
 mm/module_alloc.c                             |  21 +++
 mm/vmalloc.c                                  |   2 +-
 net/bpf/test_run.c                            |   2 +
 tools/testing/selftests/bpf/Makefile          |  28 ++--
 .../selftests/bpf/bpf_testmod/Makefile        |   2 +-
 .../bpf/bpf_testmod/bpf_testmod-events.h      |   6 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   4 +
 .../bpf/bpf_testmod/bpf_testmod_kfunc.h       |   2 +
 tools/testing/selftests/bpf/config            |   5 -
 tools/testing/selftests/bpf/config.mods       |   5 +
 tools/testing/selftests/bpf/config.nomods     |   1 +
 .../selftests/bpf/progs/btf_type_tag_percpu.c |   2 +
 .../selftests/bpf/progs/btf_type_tag_user.c   |   2 +
 tools/testing/selftests/bpf/progs/core_kern.c |   2 +
 .../selftests/bpf/progs/iters_testmod_seq.c   |   2 +
 .../bpf/progs/test_core_reloc_module.c        |   2 +
 .../selftests/bpf/progs/test_ldsx_insn.c      |   2 +
 .../selftests/bpf/progs/test_module_attach.c  |   3 +
 .../selftests/bpf/progs/tracing_struct.c      |   2 +
 tools/testing/selftests/bpf/testing_helpers.c |  14 ++
 tools/testing/selftests/bpf/vmtest.sh         |  24 +++-
 71 files changed, 636 insertions(+), 424 deletions(-)
 create mode 100644 arch/arm/mm/module_alloc.c
 create mode 100644 arch/arm64/mm/module_alloc.c
 create mode 100644 arch/loongarch/mm/module_alloc.c
 create mode 100644 arch/mips/mm/module_alloc.c
 create mode 100644 arch/nios2/mm/module_alloc.c
 create mode 100644 arch/parisc/mm/module_alloc.c
 create mode 100644 arch/powerpc/mm/module_alloc.c
 create mode 100644 arch/riscv/mm/module_alloc.c
 create mode 100644 arch/s390/mm/module_alloc.c
 create mode 100644 arch/sparc/mm/module_alloc.c
 create mode 100644 arch/x86/mm/module_alloc.c
 create mode 120000 include/trace/events/bpf_testmod.h
 create mode 100644 kernel/bpf/bpf_testmod/Makefile
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.c
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.h
 create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
 create mode 100644 mm/module_alloc.c
 create mode 100644 tools/testing/selftests/bpf/config.mods
 create mode 100644 tools/testing/selftests/bpf/config.nomods

Comments

Luis Chamberlain March 6, 2024, 9:34 p.m. UTC | #1
On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> Hello all,
> 
> This patchset makes it possible to use bpftrace with kprobes on kernels
> built without loadable module support.

This is a step in the right direction for another reason: clearly the
module_alloc() is not about modules, and we have special reasons for it
now beyond modules. The effort to share a generalize a huge page for
these things is also another reason for some of this but that is more
long term.

I'm all for minor changes here so to avoid regressions but it seems a
rename is in order -- if we're going to all this might as well do it
now. And for that I'd just like to ask you paint the bikeshed with
Song Liu as he's been the one slowly making way to help us get there
with the "module: replace module_layout with module_memory",
and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
the EXECMEM stuff would be what we use instead then. Mike kept the
module_alloc() and the execmem was just a wrapper but your move of the
arch stuff makes sense as well and I think would complement his series
nicely.

If you're gonna split code up to move to another place, it'd be nice
if you can add copyright headers as was done with the kernel/module.c
split into kernel/module/*.c

Can we start with some small basic stuff we can all agree on?

[0] https://lwn.net/Articles/944857/

  Luis
Calvin Owens March 6, 2024, 11:23 p.m. UTC | #2
On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > Hello all,
> > 
> > This patchset makes it possible to use bpftrace with kprobes on kernels
> > built without loadable module support.
> 
> This is a step in the right direction for another reason: clearly the
> module_alloc() is not about modules, and we have special reasons for it
> now beyond modules. The effort to share a generalize a huge page for
> these things is also another reason for some of this but that is more
> long term.
> 
> I'm all for minor changes here so to avoid regressions but it seems a
> rename is in order -- if we're going to all this might as well do it
> now. And for that I'd just like to ask you paint the bikeshed with
> Song Liu as he's been the one slowly making way to help us get there
> with the "module: replace module_layout with module_memory",
> and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> the EXECMEM stuff would be what we use instead then. Mike kept the
> module_alloc() and the execmem was just a wrapper but your move of the
> arch stuff makes sense as well and I think would complement his series
> nicely.

I apologize for missing that. I think these are the four most recent
versions of the different series referenced from that LWN link:

  a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
  b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
  c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
  d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/

Song and Mike, please correct me if I'm wrong, but I think what I've
done here (see [1], sorry for not adding you initially) is compatible
with everything both of you have recently proposed above. How do you
feel about this as a first step?

For naming, execmem_alloc() seems reasonable to me? I have no strong
feelings at all, I'll just use that going forward unless somebody else
expresses an opinion.

[1] https://lore.kernel.org/lkml/cover.1709676663.git.jcalvinowens@gmail.com/T/#m337096e158a5f771d0c7c2fb15a3b80a4443226a

> If you're gonna split code up to move to another place, it'd be nice
> if you can add copyright headers as was done with the kernel/module.c
> split into kernel/module/*.c

Silly question: should it be the same copyright header as the original
corresponding module.c, or a new one? I tried to preserve the license
header because I wasn't sure what to do about it.

Thanks,
Calvin

> Can we start with some small basic stuff we can all agree on?
> 
> [0] https://lwn.net/Articles/944857/
> 
>   Luis
Song Liu March 7, 2024, 1:58 a.m. UTC | #3
Hi Calvin,

It is great to hear from you! :)

On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
>
> On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > Hello all,
> > >
> > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > built without loadable module support.
> >
> > This is a step in the right direction for another reason: clearly the
> > module_alloc() is not about modules, and we have special reasons for it
> > now beyond modules. The effort to share a generalize a huge page for
> > these things is also another reason for some of this but that is more
> > long term.
> >
> > I'm all for minor changes here so to avoid regressions but it seems a
> > rename is in order -- if we're going to all this might as well do it
> > now. And for that I'd just like to ask you paint the bikeshed with
> > Song Liu as he's been the one slowly making way to help us get there
> > with the "module: replace module_layout with module_memory",
> > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > the EXECMEM stuff would be what we use instead then. Mike kept the
> > module_alloc() and the execmem was just a wrapper but your move of the
> > arch stuff makes sense as well and I think would complement his series
> > nicely.
>
> I apologize for missing that. I think these are the four most recent
> versions of the different series referenced from that LWN link:
>
>   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
>   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
>   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
>   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
>
> Song and Mike, please correct me if I'm wrong, but I think what I've
> done here (see [1], sorry for not adding you initially) is compatible
> with everything both of you have recently proposed above. How do you
> feel about this as a first step?

I agree that the work here is compatible with other efforts. I have no
objection to making this the first step.

>
> For naming, execmem_alloc() seems reasonable to me? I have no strong
> feelings at all, I'll just use that going forward unless somebody else
> expresses an opinion.

I am not good at naming things. No objection from me to "execmem_alloc".

Thanks,
Song
Mike Rapoport March 7, 2024, 7:13 a.m. UTC | #4
Hi Calvin,

On Wed, Mar 06, 2024 at 03:23:22PM -0800, Calvin Owens wrote:
> On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > Hello all,
> > > 
> > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > built without loadable module support.
> > 
> > This is a step in the right direction for another reason: clearly the
> > module_alloc() is not about modules, and we have special reasons for it
> > now beyond modules. The effort to share a generalize a huge page for
> > these things is also another reason for some of this but that is more
> > long term.
> > 
> > I'm all for minor changes here so to avoid regressions but it seems a
> > rename is in order -- if we're going to all this might as well do it
> > now. And for that I'd just like to ask you paint the bikeshed with
> > Song Liu as he's been the one slowly making way to help us get there
> > with the "module: replace module_layout with module_memory",
> > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > the EXECMEM stuff would be what we use instead then. Mike kept the
> > module_alloc() and the execmem was just a wrapper but your move of the
> > arch stuff makes sense as well and I think would complement his series
> > nicely.

Actually I've dropped module_alloc() in favor of execmem_alloc() ;-)
 
> I apologize for missing that. I think these are the four most recent
> versions of the different series referenced from that LWN link:
> 
>   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
>   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
>   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
>   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> 
> Song and Mike, please correct me if I'm wrong, but I think what I've
> done here (see [1], sorry for not adding you initially) is compatible
> with everything both of you have recently proposed above. How do you
> feel about this as a first step?

No objections from me.

> For naming, execmem_alloc() seems reasonable to me? I have no strong
> feelings at all, I'll just use that going forward unless somebody else
> expresses an opinion.

I like execmem_alloc() and CONFIG_EXECMEM.
Masami Hiramatsu (Google) March 8, 2024, 2:45 a.m. UTC | #5
Hi,

On Wed, 6 Mar 2024 13:34:40 -0800
Luis Chamberlain <mcgrof@kernel.org> wrote:

> On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > Hello all,
> > 
> > This patchset makes it possible to use bpftrace with kprobes on kernels
> > built without loadable module support.
> 
> This is a step in the right direction for another reason: clearly the
> module_alloc() is not about modules, and we have special reasons for it
> now beyond modules. The effort to share a generalize a huge page for
> these things is also another reason for some of this but that is more
> long term.

Indeed. If it works without CONFIG_MODULES, it may be exec_alloc() or
something like that. Anyway, thanks for great job on this item!

> 
> I'm all for minor changes here so to avoid regressions but it seems a
> rename is in order -- if we're going to all this might as well do it
> now. And for that I'd just like to ask you paint the bikeshed with
> Song Liu as he's been the one slowly making way to help us get there
> with the "module: replace module_layout with module_memory",
> and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> the EXECMEM stuff would be what we use instead then. Mike kept the
> module_alloc() and the execmem was just a wrapper but your move of the
> arch stuff makes sense as well and I think would complement his series
> nicely.

yeah, it is better to work with Mike.

Thank you,

> 
> If you're gonna split code up to move to another place, it'd be nice
> if you can add copyright headers as was done with the kernel/module.c
> split into kernel/module/*.c
> 
> Can we start with some small basic stuff we can all agree on?
> 
> [0] https://lwn.net/Articles/944857/
> 
>   Luis
Masami Hiramatsu (Google) March 8, 2024, 2:50 a.m. UTC | #6
On Wed, 6 Mar 2024 17:58:14 -0800
Song Liu <song@kernel.org> wrote:

> Hi Calvin,
> 
> It is great to hear from you! :)
> 
> On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
> >
> > On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > > Hello all,
> > > >
> > > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > > built without loadable module support.
> > >
> > > This is a step in the right direction for another reason: clearly the
> > > module_alloc() is not about modules, and we have special reasons for it
> > > now beyond modules. The effort to share a generalize a huge page for
> > > these things is also another reason for some of this but that is more
> > > long term.
> > >
> > > I'm all for minor changes here so to avoid regressions but it seems a
> > > rename is in order -- if we're going to all this might as well do it
> > > now. And for that I'd just like to ask you paint the bikeshed with
> > > Song Liu as he's been the one slowly making way to help us get there
> > > with the "module: replace module_layout with module_memory",
> > > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > > the EXECMEM stuff would be what we use instead then. Mike kept the
> > > module_alloc() and the execmem was just a wrapper but your move of the
> > > arch stuff makes sense as well and I think would complement his series
> > > nicely.
> >
> > I apologize for missing that. I think these are the four most recent
> > versions of the different series referenced from that LWN link:
> >
> >   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
> >   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
> >   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
> >   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> >
> > Song and Mike, please correct me if I'm wrong, but I think what I've
> > done here (see [1], sorry for not adding you initially) is compatible
> > with everything both of you have recently proposed above. How do you
> > feel about this as a first step?
> 
> I agree that the work here is compatible with other efforts. I have no
> objection to making this the first step.
> 
> >
> > For naming, execmem_alloc() seems reasonable to me? I have no strong
> > feelings at all, I'll just use that going forward unless somebody else
> > expresses an opinion.
> 
> I am not good at naming things. No objection from me to "execmem_alloc".

Hm, it sounds good to me too. I think we should add a patch which just
rename the module_alloc/module_memfree with execmem_alloc/free first.

Thanks,
Luis Chamberlain March 8, 2024, 2:55 a.m. UTC | #7
On Thu, Mar 7, 2024 at 6:50 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Wed, 6 Mar 2024 17:58:14 -0800
> Song Liu <song@kernel.org> wrote:
>
> > Hi Calvin,
> >
> > It is great to hear from you! :)
> >
> > On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
> > >
> > > On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > > > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > > > Hello all,
> > > > >
> > > > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > > > built without loadable module support.
> > > >
> > > > This is a step in the right direction for another reason: clearly the
> > > > module_alloc() is not about modules, and we have special reasons for it
> > > > now beyond modules. The effort to share a generalize a huge page for
> > > > these things is also another reason for some of this but that is more
> > > > long term.
> > > >
> > > > I'm all for minor changes here so to avoid regressions but it seems a
> > > > rename is in order -- if we're going to all this might as well do it
> > > > now. And for that I'd just like to ask you paint the bikeshed with
> > > > Song Liu as he's been the one slowly making way to help us get there
> > > > with the "module: replace module_layout with module_memory",
> > > > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > > > the EXECMEM stuff would be what we use instead then. Mike kept the
> > > > module_alloc() and the execmem was just a wrapper but your move of the
> > > > arch stuff makes sense as well and I think would complement his series
> > > > nicely.
> > >
> > > I apologize for missing that. I think these are the four most recent
> > > versions of the different series referenced from that LWN link:
> > >
> > >   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
> > >   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
> > >   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
> > >   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> > >
> > > Song and Mike, please correct me if I'm wrong, but I think what I've
> > > done here (see [1], sorry for not adding you initially) is compatible
> > > with everything both of you have recently proposed above. How do you
> > > feel about this as a first step?
> >
> > I agree that the work here is compatible with other efforts. I have no
> > objection to making this the first step.
> >
> > >
> > > For naming, execmem_alloc() seems reasonable to me? I have no strong
> > > feelings at all, I'll just use that going forward unless somebody else
> > > expresses an opinion.
> >
> > I am not good at naming things. No objection from me to "execmem_alloc".
>
> Hm, it sounds good to me too. I think we should add a patch which just
> rename the module_alloc/module_memfree with execmem_alloc/free first.

I think that would be cleaner, yes. Leaving the possible move to a
secondary patch and placing the testing more on the later part.

 Luis
Calvin Owens March 8, 2024, 8:27 p.m. UTC | #8
On Thursday 03/07 at 18:55 -0800, Luis Chamberlain wrote:
> On Thu, Mar 7, 2024 at 6:50 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Wed, 6 Mar 2024 17:58:14 -0800
> > Song Liu <song@kernel.org> wrote:
> >
> > > Hi Calvin,
> > >
> > > It is great to hear from you! :)
> > >
> > > On Wed, Mar 6, 2024 at 3:23 PM Calvin Owens <jcalvinowens@gmail.com> wrote:
> > > >
> > > > On Wednesday 03/06 at 13:34 -0800, Luis Chamberlain wrote:
> > > > > On Wed, Mar 06, 2024 at 12:05:07PM -0800, Calvin Owens wrote:
> > > > > > Hello all,
> > > > > >
> > > > > > This patchset makes it possible to use bpftrace with kprobes on kernels
> > > > > > built without loadable module support.
> > > > >
> > > > > This is a step in the right direction for another reason: clearly the
> > > > > module_alloc() is not about modules, and we have special reasons for it
> > > > > now beyond modules. The effort to share a generalize a huge page for
> > > > > these things is also another reason for some of this but that is more
> > > > > long term.
> > > > >
> > > > > I'm all for minor changes here so to avoid regressions but it seems a
> > > > > rename is in order -- if we're going to all this might as well do it
> > > > > now. And for that I'd just like to ask you paint the bikeshed with
> > > > > Song Liu as he's been the one slowly making way to help us get there
> > > > > with the "module: replace module_layout with module_memory",
> > > > > and Mike Rapoport as he's had some follow up attempts [0]. As I see it,
> > > > > the EXECMEM stuff would be what we use instead then. Mike kept the
> > > > > module_alloc() and the execmem was just a wrapper but your move of the
> > > > > arch stuff makes sense as well and I think would complement his series
> > > > > nicely.
> > > >
> > > > I apologize for missing that. I think these are the four most recent
> > > > versions of the different series referenced from that LWN link:
> > > >
> > > >   a) https://lore.kernel.org/all/20230918072955.2507221-1-rppt@kernel.org/
> > > >   b) https://lore.kernel.org/all/20230526051529.3387103-1-song@kernel.org/
> > > >   c) https://lore.kernel.org/all/20221107223921.3451913-1-song@kernel.org/
> > > >   d) https://lore.kernel.org/all/20201120202426.18009-1-rick.p.edgecombe@intel.com/
> > > >
> > > > Song and Mike, please correct me if I'm wrong, but I think what I've
> > > > done here (see [1], sorry for not adding you initially) is compatible
> > > > with everything both of you have recently proposed above. How do you
> > > > feel about this as a first step?
> > >
> > > I agree that the work here is compatible with other efforts. I have no
> > > objection to making this the first step.
> > >
> > > >
> > > > For naming, execmem_alloc() seems reasonable to me? I have no strong
> > > > feelings at all, I'll just use that going forward unless somebody else
> > > > expresses an opinion.
> > >
> > > I am not good at naming things. No objection from me to "execmem_alloc".
> >
> > Hm, it sounds good to me too. I think we should add a patch which just
> > rename the module_alloc/module_memfree with execmem_alloc/free first.
> 
> I think that would be cleaner, yes. Leaving the possible move to a
> secondary patch and placing the testing more on the later part.

Makes sense to me.
Jarkko Sakkinen March 25, 2024, 10:46 p.m. UTC | #9
On Wed Mar 6, 2024 at 10:05 PM EET, Calvin Owens wrote:
> Hello all,
>
> This patchset makes it possible to use bpftrace with kprobes on kernels
> built without loadable module support.
>
> On a Raspberry Pi 4b, this saves about 700KB of memory where BPF is
> needed but loadable module support is not. These two kernels had
> identical configurations, except CONFIG_MODULE was off in the second:
>
>    - Linux version 6.8.0-rc7
>    - Memory: 3330672K/4050944K available (16576K kernel code, 2390K rwdata,
>    - 12364K rodata, 5632K init, 675K bss, 195984K reserved, 524288K cma-reserved)
>    + Linux version 6.8.0-rc7-00003-g2af01251ca21
>    + Memory: 3331400K/4050944K available (16512K kernel code, 2384K rwdata,
>    + 11728K rodata, 5632K init, 673K bss, 195256K reserved, 524288K cma-reserved)
>
> I don't intend to present an exhaustive list of !MODULES usecases, since
> I'm sure there are many I'm not aware of. Performance is a common one,
> the primary justification being that static text is mapped on hugepages
> and module text is not. Security is another, since rootkits are much
> harder to implement without modules.
>
> The first patch is the interesting one: it moves module_alloc() into its
> own file with its own Kconfig option, so it can be utilized even when
> loadable module support is disabled. I got the idea from an unmerged
> patch from a few years ago I found on lkml (see [1/4] for details). I
> think this also has value in its own right, since I suspect there are
> potential users beyond bpf, hopefully we will hear from some.
>
> Patches 2-3 are proofs of concept to demonstrate the first patch is
> sufficient to achieve my goal (full ebpf functionality without modules).
>
> Patch 4 adds a new "-n" argument to vmtest.sh to run the BPF selftests
> without modules, so the prior three patches can be rigorously tested.
>
> If something like the first patch were to eventually be merged, the rest
> could go through the normal bpf-next process as I clean them up: I've
> only based them on Linus' tree and combined them into a series here to
> introduce the idea.
>
> If you prefer to fetch the patches via git:
>
>   [1/4] https://github.com/jcalvinowens/linux.git work/module-alloc
>  +[2/4]+[3/4] https://github.com/jcalvinowens/linux.git work/nomodule-bpf
>  +[4/4] https://github.com/jcalvinowens/linux.git testing/nomodule-bpf-ci
>
> In addition to the automated BPF selftests, I've lightly tested this on
> my laptop (x86_64), a Raspberry Pi 4b (arm64), and a Raspberry Pi Zero W
> (arm). The other architectures have only been compile tested.
>
> I didn't want to spam all the arch maintainers with what I expect will
> be a discussion mostly about modules and bpf, so I've left them off this
> first submission. I will be sure to add them on future submissions of
> the first patch. Of course, feedback on the arch bits is welcome here.
>
> In addition to feedback on the patches themselves, I'm interested in
> hearing from anybody else who might find this functionality useful.
>
> Thanks,
> Calvin
>
>
> Calvin Owens (4):
>   module: mm: Make module_alloc() generally available
>   bpf: Allow BPF_JIT with CONFIG_MODULES=n
>   kprobes: Allow kprobes with CONFIG_MODULES=n
>   selftests/bpf: Support testing the !MODULES case
>
>  arch/Kconfig                                  |   4 +-
>  arch/arm/kernel/module.c                      |  35 -----
>  arch/arm/mm/Makefile                          |   2 +
>  arch/arm/mm/module_alloc.c                    |  40 ++++++
>  arch/arm64/kernel/module.c                    | 127 -----------------
>  arch/arm64/mm/Makefile                        |   1 +
>  arch/arm64/mm/module_alloc.c                  | 130 ++++++++++++++++++
>  arch/loongarch/kernel/module.c                |   6 -
>  arch/loongarch/mm/Makefile                    |   2 +
>  arch/loongarch/mm/module_alloc.c              |  10 ++
>  arch/mips/kernel/module.c                     |  10 --
>  arch/mips/mm/Makefile                         |   2 +
>  arch/mips/mm/module_alloc.c                   |  13 ++
>  arch/nios2/kernel/module.c                    |  20 ---
>  arch/nios2/mm/Makefile                        |   2 +
>  arch/nios2/mm/module_alloc.c                  |  22 +++
>  arch/parisc/kernel/module.c                   |  12 --
>  arch/parisc/mm/Makefile                       |   1 +
>  arch/parisc/mm/module_alloc.c                 |  15 ++
>  arch/powerpc/kernel/module.c                  |  36 -----
>  arch/powerpc/mm/Makefile                      |   1 +
>  arch/powerpc/mm/module_alloc.c                |  41 ++++++
>  arch/riscv/kernel/module.c                    |  11 --
>  arch/riscv/mm/Makefile                        |   1 +
>  arch/riscv/mm/module_alloc.c                  |  17 +++
>  arch/s390/kernel/module.c                     |  37 -----
>  arch/s390/mm/Makefile                         |   1 +
>  arch/s390/mm/module_alloc.c                   |  42 ++++++
>  arch/sparc/kernel/module.c                    |  31 -----
>  arch/sparc/mm/Makefile                        |   2 +
>  arch/sparc/mm/module_alloc.c                  |  31 +++++
>  arch/x86/kernel/ftrace.c                      |   2 +-
>  arch/x86/kernel/module.c                      |  56 --------
>  arch/x86/mm/Makefile                          |   2 +
>  arch/x86/mm/module_alloc.c                    |  59 ++++++++
>  fs/proc/kcore.c                               |   2 +-
>  include/trace/events/bpf_testmod.h            |   1 +
>  kernel/bpf/Kconfig                            |  11 +-
>  kernel/bpf/Makefile                           |   2 +
>  kernel/bpf/bpf_struct_ops.c                   |  28 +++-
>  kernel/bpf/bpf_testmod/Makefile               |   1 +
>  kernel/bpf/bpf_testmod/bpf_testmod.c          |   1 +
>  kernel/bpf/bpf_testmod/bpf_testmod.h          |   1 +
>  kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h    |   1 +
>  kernel/kprobes.c                              |  22 +++
>  kernel/module/Kconfig                         |   1 +
>  kernel/module/main.c                          |  17 ---
>  kernel/trace/trace_kprobe.c                   |  11 ++
>  mm/Kconfig                                    |   3 +
>  mm/Makefile                                   |   1 +
>  mm/module_alloc.c                             |  21 +++
>  mm/vmalloc.c                                  |   2 +-
>  net/bpf/test_run.c                            |   2 +
>  tools/testing/selftests/bpf/Makefile          |  28 ++--
>  .../selftests/bpf/bpf_testmod/Makefile        |   2 +-
>  .../bpf/bpf_testmod/bpf_testmod-events.h      |   6 +
>  .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   4 +
>  .../bpf/bpf_testmod/bpf_testmod_kfunc.h       |   2 +
>  tools/testing/selftests/bpf/config            |   5 -
>  tools/testing/selftests/bpf/config.mods       |   5 +
>  tools/testing/selftests/bpf/config.nomods     |   1 +
>  .../selftests/bpf/progs/btf_type_tag_percpu.c |   2 +
>  .../selftests/bpf/progs/btf_type_tag_user.c   |   2 +
>  tools/testing/selftests/bpf/progs/core_kern.c |   2 +
>  .../selftests/bpf/progs/iters_testmod_seq.c   |   2 +
>  .../bpf/progs/test_core_reloc_module.c        |   2 +
>  .../selftests/bpf/progs/test_ldsx_insn.c      |   2 +
>  .../selftests/bpf/progs/test_module_attach.c  |   3 +
>  .../selftests/bpf/progs/tracing_struct.c      |   2 +
>  tools/testing/selftests/bpf/testing_helpers.c |  14 ++
>  tools/testing/selftests/bpf/vmtest.sh         |  24 +++-
>  71 files changed, 636 insertions(+), 424 deletions(-)
>  create mode 100644 arch/arm/mm/module_alloc.c
>  create mode 100644 arch/arm64/mm/module_alloc.c
>  create mode 100644 arch/loongarch/mm/module_alloc.c
>  create mode 100644 arch/mips/mm/module_alloc.c
>  create mode 100644 arch/nios2/mm/module_alloc.c
>  create mode 100644 arch/parisc/mm/module_alloc.c
>  create mode 100644 arch/powerpc/mm/module_alloc.c
>  create mode 100644 arch/riscv/mm/module_alloc.c
>  create mode 100644 arch/s390/mm/module_alloc.c
>  create mode 100644 arch/sparc/mm/module_alloc.c
>  create mode 100644 arch/x86/mm/module_alloc.c
>  create mode 120000 include/trace/events/bpf_testmod.h
>  create mode 100644 kernel/bpf/bpf_testmod/Makefile
>  create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.c
>  create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod.h
>  create mode 120000 kernel/bpf/bpf_testmod/bpf_testmod_kfunc.h
>  create mode 100644 mm/module_alloc.c
>  create mode 100644 tools/testing/selftests/bpf/config.mods
>  create mode 100644 tools/testing/selftests/bpf/config.nomods

I think with eBPF focus in the patch set should be only on arch's
that you use regulary, i.e. repeating same mistake I did couple of
years ago:

https://lore.kernel.org/all/20220608000014.3054333-1-jarkko@profian.com/

I don't see my patch set conflict with this work, and it adds needed
shenanigans to realize eBPF patches. I refined the shenanigans to match
Masami's suggestions:

https://lore.kernel.org/all/20240325215502.660-1-jarkko@kernel.org/

And it this requires properly working kprobes anyway.

BR, Jarkko