mbox series

[v4,00/19] Implement DWARF modversions

Message ID 20241008183823.36676-21-samitolvanen@google.com (mailing list archive)
Headers show
Series Implement DWARF modversions | expand

Message

Sami Tolvanen Oct. 8, 2024, 6:38 p.m. UTC
Hi,

Here's v4 of the DWARF modversions series. The main motivation is
modversions support for Rust, which is important for distributions
like Android that are about to ship Rust kernel modules. Per Luis'
request [1], v2 dropped the Rust specific bits from the series and
instead added the feature as an option for the entire kernel. Matt is
addressing Rust modversion_info compatibility issues in a separate
series [2], and we'll follow up with a patch to actually allow
CONFIG_MODVERSIONS with Rust once everything else has been sorted
out.

Short background: Unlike C, Rust source code doesn't have sufficient
information about the final ABI, as the compiler has considerable
freedom in adjusting structure layout, for example, which makes
using a source code parser like genksyms a non-starter. Based on
earlier feedback, this series uses DWARF debugging information for
computing versions. DWARF is an established and a relatively stable
format, which includes all the necessary ABI details, and adding a
CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
reasonable trade-off.

The first patch moves the genksyms CRC32 implementation to a shared
header file and the next 15 patches add gendwarfksyms, a tool for
computing symbol versions from DWARF. When passed a list of exported
symbols and object files, the tool generates an expanded type string
for each symbol and computes symbol CRCs similarly to genksyms.
gendwarfksyms is written in C and uses libdw to process DWARF. Patch
17 ensures that debugging information is present where we need it,
patch 18 adds gendwarfksyms as an alternative to genksyms, and the
last patch adds documentation.

v4 is based on v6.12-rc2 and for your convenience the series is also
available here:

https://github.com/samitolvanen/linux/commits/gendwarfksyms-v4

If you also want to test the series with Rust modules, this branch
adds Matt's latest modversion_info series and a small patch to enable
Rust modversions:

https://github.com/samitolvanen/linux/commits/rustmodversions-v4

Sami


[1] https://lore.kernel.org/lkml/ZnIZEtkkQWEIGf9n@bombadil.infradead.org/
[2] https://lore.kernel.org/lkml/20240925233854.90072-1-mmaurer@google.com/

---

v4:
- Rebased on v6.12-rc2, which now includes all the prerequisites.

- Dropped unnecessary name_only parameter for symbols.c::for_each
  and cleaned up error handling. (Patch 3)

- Fixed anonymous scope handling to ensure unnamed DIEs don't get
  names. (Patch 4)

- Added non-variant children to variant_type output, and included
  DW_AT_discr_value attributes for variants. (Patch 9)

- Added another symbol pointer test case. (Patch 16)

- Picked up (Acked|Reviewed)-by tags from v3.


v3: https://lore.kernel.org/lkml/20240923181846.549877-22-samitolvanen@google.com/
- Updated SPX license headers.

- Squashed the first two patches in v2 and tried to reduce churn as
  much as reasonable.

- Dropped patch 18 from v2 ("x86/asm-prototypes: Include
  <asm/ptrace.h>") as it's addressed by a separate patch.

- Changed the error handling code to immediately terminate instead
  of propagating the errors back to main, which cleaned up the code
  quite a bit.

- Switched to the list and hashtable implementations in scripts and
  dropped the remaining tools/include dependencies. Added a couple
  missing list macros. (patch 1)

- Moved the genksyms CRC32 implementation to scripts/include and
  dropped the duplicate code. (patches 2 and 14)

- Switched from ad-hoc command line parsing to getopt_long (patch 3).

- Added structure member and function parameter names to the DIE
  output to match genksyms behavior, and tweaked the symtypes format
  to be more parser-friendly in general based on Petr's suggestions.

- Replaced the declaration-only struct annotations with more generic
  kABI stability rules that allow source code annotations to be used
  where #ifndef __GENKSYMS__ was previously used.  Added support for
  rules that can be used to exclude enumerators from versioning.
  (patch 16)

- Per Miroslav's suggestion, added an option to hide structure
  members from versioning when they're added to existing alignment
  holes, for example. (patch 16)

- Per Greg's request, added documentation and example macros for the
  --stable features, and a couple of test cases. (patches 15, 16, and
  20)

- Fixed making symtypes files, which need to depend on .o files with
  gendwarfksyms. (patch 19)

- Addressed several other smaller issues that Petr and Masahiro
  kindly pointed out during the v2 review.


v2: https://lore.kernel.org/lkml/20240815173903.4172139-21-samitolvanen@google.com/
- Per Luis' request, dropped Rust-specific patches and added
  gendwarfksyms as an alternative to genksyms for the entire
  kernel.

- Added support for missing DWARF features needed to handle
  also non-Rust code.

- Changed symbol address matching to use the symbol table
  information instead of relying on addresses in DWARF.

- Added __gendwarfksyms_ptr patches to ensure the compiler emits
  the necessary type information in DWARF even for symbols that
  are defined in other TUs.

- Refactored debugging output and moved the more verbose output
  behind --dump* flags.

- Added a --symtypes flag for generating a genksyms-style
  symtypes output based on Petr's feedback, and refactored
  symbol version calculations to be based on symtypes instead
  of raw --dump-dies output.

- Based on feedback from Greg and Petr, added --stable flag and
  support for reserved data structure fields and declaration-onl
  structures. Also added examples for using these features.

- Added a GENDWARFKSYMS option and hooked up kbuild support
  for both C and assembly code. Note that with gendwarfksyms,
  we have to actually build a temporary .o file for calculating
  assembly modversions.

v1: https://lore.kernel.org/lkml/20240617175818.58219-17-samitolvanen@google.com/

---
Sami Tolvanen (19):
  scripts: move genksyms crc32 implementation to a common include
  tools: Add gendwarfksyms
  gendwarfksyms: Add address matching
  gendwarfksyms: Expand base_type
  gendwarfksyms: Add a cache for processed DIEs
  gendwarfksyms: Expand type modifiers and typedefs
  gendwarfksyms: Expand subroutine_type
  gendwarfksyms: Expand array_type
  gendwarfksyms: Expand structure types
  gendwarfksyms: Limit structure expansion
  gendwarfksyms: Add die_map debugging
  gendwarfksyms: Add symtypes output
  gendwarfksyms: Add symbol versioning
  gendwarfksyms: Add support for kABI rules
  gendwarfksyms: Add support for reserved and ignored fields
  gendwarfksyms: Add support for symbol type pointers
  export: Add __gendwarfksyms_ptr_ references to exported symbols
  kbuild: Add gendwarfksyms as an alternative to genksyms
  Documentation/kbuild: Add DWARF module versioning

 Documentation/kbuild/gendwarfksyms.rst      |  274 +++++
 Documentation/kbuild/index.rst              |    1 +
 include/linux/export.h                      |   15 +
 kernel/module/Kconfig                       |   31 +
 scripts/Makefile                            |    3 +-
 scripts/Makefile.build                      |   39 +-
 scripts/gendwarfksyms/.gitignore            |    2 +
 scripts/gendwarfksyms/Makefile              |   12 +
 scripts/gendwarfksyms/cache.c               |   44 +
 scripts/gendwarfksyms/die.c                 |  166 +++
 scripts/gendwarfksyms/dwarf.c               | 1088 +++++++++++++++++++
 scripts/gendwarfksyms/examples/kabi.h       |  141 +++
 scripts/gendwarfksyms/examples/kabi_ex0.c   |   86 ++
 scripts/gendwarfksyms/examples/kabi_ex1.c   |   89 ++
 scripts/gendwarfksyms/examples/kabi_ex2.c   |   98 ++
 scripts/gendwarfksyms/examples/kabi_rules.c |   56 +
 scripts/gendwarfksyms/examples/symbolptr.c  |   33 +
 scripts/gendwarfksyms/gendwarfksyms.c       |  187 ++++
 scripts/gendwarfksyms/gendwarfksyms.h       |  352 ++++++
 scripts/gendwarfksyms/kabi.c                |  214 ++++
 scripts/gendwarfksyms/symbols.c             |  319 ++++++
 scripts/gendwarfksyms/types.c               |  477 ++++++++
 scripts/genksyms/genksyms.c                 |   77 +-
 scripts/include/crc32.h                     |   93 ++
 24 files changed, 3812 insertions(+), 85 deletions(-)
 create mode 100644 Documentation/kbuild/gendwarfksyms.rst
 create mode 100644 scripts/gendwarfksyms/.gitignore
 create mode 100644 scripts/gendwarfksyms/Makefile
 create mode 100644 scripts/gendwarfksyms/cache.c
 create mode 100644 scripts/gendwarfksyms/die.c
 create mode 100644 scripts/gendwarfksyms/dwarf.c
 create mode 100644 scripts/gendwarfksyms/examples/kabi.h
 create mode 100644 scripts/gendwarfksyms/examples/kabi_ex0.c
 create mode 100644 scripts/gendwarfksyms/examples/kabi_ex1.c
 create mode 100644 scripts/gendwarfksyms/examples/kabi_ex2.c
 create mode 100644 scripts/gendwarfksyms/examples/kabi_rules.c
 create mode 100644 scripts/gendwarfksyms/examples/symbolptr.c
 create mode 100644 scripts/gendwarfksyms/gendwarfksyms.c
 create mode 100644 scripts/gendwarfksyms/gendwarfksyms.h
 create mode 100644 scripts/gendwarfksyms/kabi.c
 create mode 100644 scripts/gendwarfksyms/symbols.c
 create mode 100644 scripts/gendwarfksyms/types.c
 create mode 100644 scripts/include/crc32.h


base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b

Comments

Luis Chamberlain Oct. 11, 2024, 11:42 p.m. UTC | #1
On Tue, Oct 08, 2024 at 06:38:24PM +0000, Sami Tolvanen wrote:
> Hi,
> 
> Here's v4 of the DWARF modversions series. The main motivation is
> modversions support for Rust, which is important for distributions
> like Android that are about to ship Rust kernel modules. Per Luis'
> request [1], v2 dropped the Rust specific bits from the series and
> instead added the feature as an option for the entire kernel.

I think its important to mention *rationale* why I recommended it, it
let's you more broadly test coverage / support with tooling so that its
not just Rust modules which tooling will have to support. It gives the
option to have *one* unified way to do things. Not "rust is special",
oh no, do this instead. This also allows us to empirically evaluate
if this new solution also has benefits. If so what should we look out
for, metrics, etc. If there are proven benefits, then by all means
why not make the the default.

> Matt is
> addressing Rust modversion_info compatibility issues in a separate
> series [2], and we'll follow up with a patch to actually allow
> CONFIG_MODVERSIONS with Rust once everything else has been sorted
> out.

So this depends on that work? What's the ordering of things? That work
seems to be aimed at addressing long symbols, and that is also
generically useful, and there were odd old hacks for that for LTO.
Bring the patch reviewer with you, as they may not have all the
background.

> Short background: Unlike C, Rust source code doesn't have sufficient
> information about the final ABI, as the compiler has considerable
> freedom in adjusting structure layout, for example, which makes
> using a source code parser like genksyms a non-starter. Based on
> earlier feedback, this series uses DWARF debugging information for
> computing versions. DWARF is an established and a relatively stable
> format, which includes all the necessary ABI details, and adding a
> CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> reasonable trade-off.

I think its important to state that most distributions enable CONFIG_DEBUG_INFO
already, so this means Rust modules won't be asking much more from
distributions.

> The first patch moves the genksyms CRC32 implementation to a shared
> header file and the next 15 patches add gendwarfksyms, a tool for
> computing symbol versions from DWARF.

In case I find issues with this patch series, let's split up the patches
in the next iteration by two sets, one which is the cleanups,  moves,
and non functional changes, and then a seprate set with the actual
changes needed. This let's us carry on faster so I can apply the first
set.

> When passed a list of exported
> symbols and object files, the tool generates an expanded type string
> for each symbol and computes symbol CRCs similarly to genksyms.
> gendwarfksyms is written in C and uses libdw to process DWARF.

OK so a new lib dependency at build time. What if that's not present?

> Patch
> 17 ensures that debugging information is present where we need it,
> patch 18 adds gendwarfksyms as an alternative to genksyms, and the
> last patch adds documentation.
> 
> v4 is based on v6.12-rc2 and for your convenience the series is also
> available here:
> 
> https://github.com/samitolvanen/linux/commits/gendwarfksyms-v4

Thanks! OK so I'd like to see next test coverage.

The below is more modules maintainer homework we have to do, if you're
not a modules maintainer or not interested in CI stuff you can stop
reading now:

We don't have much but for testing of modules, we have lib/test_kmod and
tools/testing/selftests/kmod/kmod.sh. kdevops now has 'make
kdevops-seltests-kmod', a respective cli defconfig is available to
allow github actions to run kdevops on a self-hosted runner too. We
already have modules tree integration with kernel-patch-daemon (KPD)
[0] [1], there's two parts to this integration to patchwork. The first is the
stuff you need to run a github self-hosted runner to test modules with
kdevops [2], one can just take that into a random Linux kernel git tree
and merge push it into github to trigger your own self hosted runner.
That's what KPD does, and in effect its visible [3].

I'm currently working on the security policies on the linux-kdevops org
to ensure I can trust the pushes to the linux-modules-kpd tree, but
you should be able to now. Once I block random people from sending PR
or pushes to the tree we can run CI on that tree by you just having to push.

But the current coverage sucks.

Luca noted we can run the kmod git tree meson test but the preload
stuff needs to be fixed so we can actually run all the test on that
tree on a VM [4].

Given all the work you are doing, the next thing is to add tests. 
We should also add the kmod git tree to kdevops and just run the meson
test once we can get the module tests to actually run on the VM.

[0] https://github.com/facebookincubator/kernel-patches-daemon
[1] https://patchwork.kernel.org/project/linux-modules/list/
[2] https://github.com/linux-kdevops/kdevops-ci-modules
[3] https://github.com/linux-kdevops/linux-modules-kpd
[4] https://github.com/kmod-project/kmod/blob/master/.github/workflows/main.yml#L92-L98

  Luis
Sami Tolvanen Oct. 12, 2024, 12:30 a.m. UTC | #2
Hi Luis,

On Fri, Oct 11, 2024 at 4:42 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Tue, Oct 08, 2024 at 06:38:24PM +0000, Sami Tolvanen wrote:
> > Hi,
> >
> > Here's v4 of the DWARF modversions series. The main motivation is
> > modversions support for Rust, which is important for distributions
> > like Android that are about to ship Rust kernel modules. Per Luis'
> > request [1], v2 dropped the Rust specific bits from the series and
> > instead added the feature as an option for the entire kernel.
>
> I think its important to mention *rationale* why I recommended it, it
> let's you more broadly test coverage / support with tooling so that its
> not just Rust modules which tooling will have to support. It gives the
> option to have *one* unified way to do things. Not "rust is special",
> oh no, do this instead. This also allows us to empirically evaluate
> if this new solution also has benefits. If so what should we look out
> for, metrics, etc. If there are proven benefits, then by all means
> why not make the the default.

Sure, I can expand the cover letter in the next version to include
this. I linked to your original request, which gives the reader some
more background, but you're right, it doesn't hurt to mention it here
as well.

> > Matt is
> > addressing Rust modversion_info compatibility issues in a separate
> > series [2], and we'll follow up with a patch to actually allow
> > CONFIG_MODVERSIONS with Rust once everything else has been sorted
> > out.
>
> So this depends on that work? What's the ordering of things? That work
> seems to be aimed at addressing long symbols, and that is also
> generically useful, and there were odd old hacks for that for LTO.
> Bring the patch reviewer with you, as they may not have all the
> background.

These patch sets are fully independent, but they are both needed
before we can support Rust. I'll clarify this too.

> > Short background: Unlike C, Rust source code doesn't have sufficient
> > information about the final ABI, as the compiler has considerable
> > freedom in adjusting structure layout, for example, which makes
> > using a source code parser like genksyms a non-starter. Based on
> > earlier feedback, this series uses DWARF debugging information for
> > computing versions. DWARF is an established and a relatively stable
> > format, which includes all the necessary ABI details, and adding a
> > CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> > reasonable trade-off.
>
> I think its important to state that most distributions enable CONFIG_DEBUG_INFO
> already, so this means Rust modules won't be asking much more from
> distributions.

Ack.

> > The first patch moves the genksyms CRC32 implementation to a shared
> > header file and the next 15 patches add gendwarfksyms, a tool for
> > computing symbol versions from DWARF.
>
> In case I find issues with this patch series, let's split up the patches
> in the next iteration by two sets, one which is the cleanups,  moves,
> and non functional changes, and then a seprate set with the actual
> changes needed. This let's us carry on faster so I can apply the first
> set.

I think the first patch is the only one that matches your description,
but it's only needed for the gendwarfksyms tool we're adding, so I'm
not sure it makes sense to merge it separately. I did have a couple of
other patches in previous versions that would qualify, but they've
been merged to -rc2 already.

> > When passed a list of exported
> > symbols and object files, the tool generates an expanded type string
> > for each symbol and computes symbol CRCs similarly to genksyms.
> > gendwarfksyms is written in C and uses libdw to process DWARF.
>
> OK so a new lib dependency at build time. What if that's not present?

Then the build fails. We used to check for libelf (part of elfutils
like libdw) in Makefile before, but Masahiro explained in commit
0d989ac2c90b ("kbuild: remove libelf checks from top Makefile") why
it's not necessary to have separate checks for the library
dependencies:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d989ac2c90b5f51fe12102d3cddf54b959f2014

> > Patch
> > 17 ensures that debugging information is present where we need it,
> > patch 18 adds gendwarfksyms as an alternative to genksyms, and the
> > last patch adds documentation.
> >
> > v4 is based on v6.12-rc2 and for your convenience the series is also
> > available here:
> >
> > https://github.com/samitolvanen/linux/commits/gendwarfksyms-v4
>
> Thanks! OK so I'd like to see next test coverage.

Thanks for the links, I'll take a look and see what tests it makes sense to add.

Sami