diff mbox series

[2/3] modpost: Extended modversion support

Message ID 20231115185858.2110875-3-mmaurer@google.com (mailing list archive)
State New
Headers show
Series None | expand

Commit Message

Matthew Maurer Nov. 15, 2023, 6:50 p.m. UTC
Adds a new format for modversions which stores each field in a separate
elf section. This initially adds support for variable length names, but
could later be used to add additional fields to modversions in a
backwards compatible way if needed.

Adding support for variable length names makes it possible to enable
MODVERSIONS and RUST at the same time.

Signed-off-by: Matthew Maurer <mmaurer@google.com>
---
 arch/powerpc/kernel/module_64.c | 24 +++++++++-
 init/Kconfig                    |  1 -
 kernel/module/internal.h        | 16 ++++++-
 kernel/module/main.c            |  9 +++-
 kernel/module/version.c         | 77 +++++++++++++++++++++++++++++++++
 scripts/mod/modpost.c           | 33 ++++++++++++--
 6 files changed, 151 insertions(+), 9 deletions(-)

Comments

Luis Chamberlain Nov. 16, 2023, 5:12 p.m. UTC | #1
On Wed, Nov 15, 2023 at 06:50:10PM +0000, Matthew Maurer wrote:
> Adds a new format for modversions which stores each field in a separate
> elf section.

The "why" is critical and not mentioned. And I'd like to also see
documented this with foresight, if Rust needed could this be used
in the future for other things?

Also please include folks CC'd in *one* patch to *all* patches as
otherwise we have no context.

> This initially adds support for variable length names, but
> could later be used to add additional fields to modversions in a
> backwards compatible way if needed.
> 
> Adding support for variable length names makes it possible to enable
> MODVERSIONS and RUST at the same time.
> 
> Signed-off-by: Matthew Maurer <mmaurer@google.com>
> ---
>  arch/powerpc/kernel/module_64.c | 24 +++++++++-

Why was only powerpc modified? If the commit log explained this it would
make it easier for review.

> diff --git a/kernel/module/internal.h b/kernel/module/internal.h
> index c8b7b4dcf782..0c188c96a045 100644
> --- a/kernel/module/internal.h
> +++ b/kernel/module/internal.h
> @@ -80,7 +80,7 @@ struct load_info {
>  	unsigned int used_pages;
>  #endif
>  	struct {
> -		unsigned int sym, str, mod, vers, info, pcpu;
> +		unsigned int sym, str, mod, vers, info, pcpu, vers_ext_crc, vers_ext_name;

We might as well modify this in a preliminary patch to add each new
unsinged int in a new line, so that it is easier to blame when each new
entry gets added. It should not grow the size of the struct at all but
it would make futur extensions easier to review what is new and git
blame easier to spot when something was added.

Although we don't use this extensively today this can easily grow for
convenience and making code easier to read.

> diff --git a/kernel/module/version.c b/kernel/module/version.c
> index 53f43ac5a73e..93d97dad8c77 100644
> --- a/kernel/module/version.c
> +++ b/kernel/module/version.c
> @@ -19,11 +19,28 @@ int check_version(const struct load_info *info,
>  	unsigned int versindex = info->index.vers;
>  	unsigned int i, num_versions;
>  	struct modversion_info *versions;
> +	struct modversion_info_ext version_ext;
>  
>  	/* Exporting module didn't supply crcs?  OK, we're already tainted. */
>  	if (!crc)
>  		return 1;
>  
> +	/* If we have extended version info, rely on it */
> +	if (modversion_ext_start(info, &version_ext) >= 0) {

There are two things we need to do to make processing modules easier:

  1) ELF validation
  2) Once checked then process the information

We used to have this split up but also had a few places which did both
1) and 2) together. This was wrong and so I want to keep things tidy
and ensure we do things which validate the ELF separate. To that
end please put the checks to validate the ELF first so that we report
to users with a proper error/debug check in case the ELF is wrong,
this enables futher debug checks for that to be done instead of
confusing users who end up scratching their heads why something
failed.

So please split up the ELF validation check and put that into
elf_validity_cache_copy() which runs *earlier* than this.

Then *if* if has this, you just process it. Please take care to be
very pedantic in the elf_validity_cache_copy() and extend the checks
you have for validation in modversion_ext_start() and bring them to
elf_validity_cache_copy() with perhaps *more* stuff which does any
insane checks to verify it is 100% correct.

> +		do {
> +			if (strncmp(version_ext.name.value, symname,
> +				    version_ext.name.end - version_ext.name.value) != 0)
> +				continue;
> +
> +			if (*version_ext.crc.value == *crc)
> +				return 1;
> +			pr_debug("Found checksum %X vs module %X\n",
> +				 *crc, *version_ext.crc.value);
> +			goto bad_version;
> +		} while (modversion_ext_advance(&version_ext) == 0);

Can you do a for_each_foo()) type loop here instead after validation?
Because the validation would ensure your loop is bounded then. Look at
for_each_mod_mem_type() for inspiration.

> +		goto broken_toolchain;

The broken toolchain thing would then be an issue reported in the
ELF validation.

> @@ -87,6 +105,65 @@ int same_magic(const char *amagic, const char *bmagic,
>  	return strcmp(amagic, bmagic) == 0;
>  }
>  
> +#define MODVERSION_FIELD_START(sec, field) \
> +	field.value = (typeof(field.value))sec.sh_addr; \
> +	field.end = field.value + sec.sh_size
> +
> +ssize_t modversion_ext_start(const struct load_info *info,
> +			     struct modversion_info_ext *start)
> +{
> +	unsigned int crc_idx = info->index.vers_ext_crc;
> +	unsigned int name_idx = info->index.vers_ext_name;
> +	Elf_Shdr *sechdrs = info->sechdrs;
> +
> +	// Both of these fields are needed for this to be useful
> +	// Any future fields should be initialized to NULL if absent.

Curious, what gave you the impression // type style comments are
welcomed, please replace that with either a one line 

/* foo comment */

Or a multi-line:

/*
 * stuff and go into great deatils
 * more elaaborate explanation
 */

Of even better, since you are moving this to ELF Validation please add
undertand what elf_validity_cache_copy() does, and add kdoc style
comments for it and then extend it with why Rust needs these magical things.

> +	if ((crc_idx == 0) || (name_idx == 0))
> +		return -EINVAL;
> +
> +	MODVERSION_FIELD_START(sechdrs[crc_idx], start->crc);
> +	MODVERSION_FIELD_START(sechdrs[name_idx], start->name);
> +
> +	return (start->crc.end - start->crc.value) / sizeof(*start->crc.value);
> +}
> +
> +static int modversion_ext_s32_advance(struct modversion_info_ext_s32 *field)
> +{
> +	if (!field->value)
> +		return 0;
> +	if (field->value >= field->end)
> +		return -EINVAL;
> +	field->value++;
> +	return 0;
> +}
> +
> +static int modversion_ext_string_advance(struct modversion_info_ext_string *s)
> +{
> +	if (!s->value)
> +		return 0;
> +	if (s->value >= s->end)
> +		return -EINVAL;
> +	s->value += strnlen(s->value, s->end - s->value - 1) + 1;
> +	if (s->value >= s->end)
> +		return -EINVAL;
> +	return 0;
> +}
> +
> +int modversion_ext_advance(struct modversion_info_ext *start)
> +{
> +	int ret;
> +
> +	ret = modversion_ext_s32_advance(&start->crc);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = modversion_ext_string_advance(&start->name);
> +	if (ret < 0)
> +		return ret;
> +
> +	return 0;
> +}

Please add all the validation as part of the ELF validation sanity checks
and make sure you rant so toolchains get easily debugged and fixed.
That would make the processing of data a secodnary step and it is
easier to read and simpler code. The validation then becomes the part
which kicks issues out early.

>  /*
>   * Generate the signature for all relevant module structures here.
>   * If these change, we don't want to try to parse the module.
> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> index 973b5e5ae2dd..884860c2e833 100644
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -1910,15 +1910,42 @@ static void add_versions(struct buffer *b, struct module *mod)
>  			continue;
>  		}
>  		if (strlen(s->name) >= MODULE_NAME_LEN) {
> -			error("too long symbol \"%s\" [%s.ko]\n",
> -			      s->name, mod->name);
> -			break;
> +			/* this symbol will only be in the extended info */
> +			continue;

I cannot grok why this is being done, but hopefully in the next patch
series this will be easier to understand.

 Luis
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 7112adc597a8..2582353a2048 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -355,6 +355,24 @@  static void dedotify_versions(struct modversion_info *vers,
 		}
 }
 
+static void dedotify_ext_version_names(char *str_seq, unsigned long size)
+{
+	unsigned long out = 0;
+	unsigned long in;
+	char last = '\0';
+
+	for (in = 0; in < size; in++) {
+		if (last == '\0')
+			/* Skip all leading dots */
+			if (str_seq[in] == '.')
+				continue;
+		last = str_seq[in];
+		str_seq[out++] = last;
+	}
+	/* Zero the trailing portion of the names table for robustness */
+	bzero(&str_seq[out], size - out);
+}
+
 /*
  * Undefined symbols which refer to .funcname, hack to funcname. Make .TOC.
  * seem to be defined (value set later).
@@ -424,10 +442,12 @@  int module_frob_arch_sections(Elf64_Ehdr *hdr,
 			me->arch.toc_section = i;
 			if (sechdrs[i].sh_addralign < 8)
 				sechdrs[i].sh_addralign = 8;
-		}
-		else if (strcmp(secstrings+sechdrs[i].sh_name,"__versions")==0)
+		} else if (strcmp(secstrings + sechdrs[i].sh_name, "__versions") == 0)
 			dedotify_versions((void *)hdr + sechdrs[i].sh_offset,
 					  sechdrs[i].sh_size);
+		else if (strcmp(secstrings + sechdrs[i].sh_name, "__version_ext_names") == 0)
+			dedotify_ext_version_names((void *)hdr + sechdrs[i].sh_offset,
+						   sechdrs[i].sh_size);
 
 		if (sechdrs[i].sh_type == SHT_SYMTAB)
 			dedotify((void *)hdr + sechdrs[i].sh_offset,
diff --git a/init/Kconfig b/init/Kconfig
index 9ffb103fc927..6cac5b4db8f6 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1885,7 +1885,6 @@  config RUST
 	bool "Rust support"
 	depends on HAVE_RUST
 	depends on RUST_IS_AVAILABLE
-	depends on !MODVERSIONS
 	depends on !GCC_PLUGINS
 	depends on !RANDSTRUCT
 	depends on !DEBUG_INFO_BTF || PAHOLE_HAS_LANG_EXCLUDE
diff --git a/kernel/module/internal.h b/kernel/module/internal.h
index c8b7b4dcf782..0c188c96a045 100644
--- a/kernel/module/internal.h
+++ b/kernel/module/internal.h
@@ -80,7 +80,7 @@  struct load_info {
 	unsigned int used_pages;
 #endif
 	struct {
-		unsigned int sym, str, mod, vers, info, pcpu;
+		unsigned int sym, str, mod, vers, info, pcpu, vers_ext_crc, vers_ext_name;
 	} index;
 };
 
@@ -384,6 +384,20 @@  void module_layout(struct module *mod, struct modversion_info *ver, struct kerne
 		   struct kernel_symbol *ks, struct tracepoint * const *tp);
 int check_modstruct_version(const struct load_info *info, struct module *mod);
 int same_magic(const char *amagic, const char *bmagic, bool has_crcs);
+struct modversion_info_ext_s32 {
+	const s32 *value;
+	const s32 *end;
+};
+struct modversion_info_ext_string {
+	const char *value;
+	const char *end;
+};
+struct modversion_info_ext {
+	struct modversion_info_ext_s32 crc;
+	struct modversion_info_ext_string name;
+};
+ssize_t modversion_ext_start(const struct load_info *info, struct modversion_info_ext *ver);
+int modversion_ext_advance(struct modversion_info_ext *ver);
 #else /* !CONFIG_MODVERSIONS */
 static inline int check_version(const struct load_info *info,
 				const char *symname,
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 98fedfdb8db5..e69b2ae46161 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1886,10 +1886,15 @@  static int elf_validity_cache_copy(struct load_info *info, int flags)
 	if (!info->name)
 		info->name = info->mod->name;
 
-	if (flags & MODULE_INIT_IGNORE_MODVERSIONS)
+	if (flags & MODULE_INIT_IGNORE_MODVERSIONS) {
 		info->index.vers = 0; /* Pretend no __versions section! */
-	else
+		info->index.vers_ext_crc = 0;
+		info->index.vers_ext_name = 0;
+	} else {
 		info->index.vers = find_sec(info, "__versions");
+		info->index.vers_ext_crc = find_sec(info, "__version_ext_crcs");
+		info->index.vers_ext_name = find_sec(info, "__version_ext_names");
+	}
 
 	info->index.pcpu = find_pcpusec(info);
 
diff --git a/kernel/module/version.c b/kernel/module/version.c
index 53f43ac5a73e..93d97dad8c77 100644
--- a/kernel/module/version.c
+++ b/kernel/module/version.c
@@ -19,11 +19,28 @@  int check_version(const struct load_info *info,
 	unsigned int versindex = info->index.vers;
 	unsigned int i, num_versions;
 	struct modversion_info *versions;
+	struct modversion_info_ext version_ext;
 
 	/* Exporting module didn't supply crcs?  OK, we're already tainted. */
 	if (!crc)
 		return 1;
 
+	/* If we have extended version info, rely on it */
+	if (modversion_ext_start(info, &version_ext) >= 0) {
+		do {
+			if (strncmp(version_ext.name.value, symname,
+				    version_ext.name.end - version_ext.name.value) != 0)
+				continue;
+
+			if (*version_ext.crc.value == *crc)
+				return 1;
+			pr_debug("Found checksum %X vs module %X\n",
+				 *crc, *version_ext.crc.value);
+			goto bad_version;
+		} while (modversion_ext_advance(&version_ext) == 0);
+		goto broken_toolchain;
+	}
+
 	/* No versions at all?  modprobe --force does this. */
 	if (versindex == 0)
 		return try_to_force_load(mod, symname) == 0;
@@ -46,6 +63,7 @@  int check_version(const struct load_info *info,
 		goto bad_version;
 	}
 
+broken_toolchain:
 	/* Broken toolchain. Warn once, then let it go.. */
 	pr_warn_once("%s: no symbol version for %s\n", info->name, symname);
 	return 1;
@@ -87,6 +105,65 @@  int same_magic(const char *amagic, const char *bmagic,
 	return strcmp(amagic, bmagic) == 0;
 }
 
+#define MODVERSION_FIELD_START(sec, field) \
+	field.value = (typeof(field.value))sec.sh_addr; \
+	field.end = field.value + sec.sh_size
+
+ssize_t modversion_ext_start(const struct load_info *info,
+			     struct modversion_info_ext *start)
+{
+	unsigned int crc_idx = info->index.vers_ext_crc;
+	unsigned int name_idx = info->index.vers_ext_name;
+	Elf_Shdr *sechdrs = info->sechdrs;
+
+	// Both of these fields are needed for this to be useful
+	// Any future fields should be initialized to NULL if absent.
+	if ((crc_idx == 0) || (name_idx == 0))
+		return -EINVAL;
+
+	MODVERSION_FIELD_START(sechdrs[crc_idx], start->crc);
+	MODVERSION_FIELD_START(sechdrs[name_idx], start->name);
+
+	return (start->crc.end - start->crc.value) / sizeof(*start->crc.value);
+}
+
+static int modversion_ext_s32_advance(struct modversion_info_ext_s32 *field)
+{
+	if (!field->value)
+		return 0;
+	if (field->value >= field->end)
+		return -EINVAL;
+	field->value++;
+	return 0;
+}
+
+static int modversion_ext_string_advance(struct modversion_info_ext_string *s)
+{
+	if (!s->value)
+		return 0;
+	if (s->value >= s->end)
+		return -EINVAL;
+	s->value += strnlen(s->value, s->end - s->value - 1) + 1;
+	if (s->value >= s->end)
+		return -EINVAL;
+	return 0;
+}
+
+int modversion_ext_advance(struct modversion_info_ext *start)
+{
+	int ret;
+
+	ret = modversion_ext_s32_advance(&start->crc);
+	if (ret < 0)
+		return ret;
+
+	ret = modversion_ext_string_advance(&start->name);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
 /*
  * Generate the signature for all relevant module structures here.
  * If these change, we don't want to try to parse the module.
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 973b5e5ae2dd..884860c2e833 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1910,15 +1910,42 @@  static void add_versions(struct buffer *b, struct module *mod)
 			continue;
 		}
 		if (strlen(s->name) >= MODULE_NAME_LEN) {
-			error("too long symbol \"%s\" [%s.ko]\n",
-			      s->name, mod->name);
-			break;
+			/* this symbol will only be in the extended info */
+			continue;
 		}
 		buf_printf(b, "\t{ %#8x, \"%s\" },\n",
 			   s->crc, s->name);
 	}
 
 	buf_printf(b, "};\n");
+
+	buf_printf(b, "static const s32 ____version_ext_crcs[]\n");
+	buf_printf(b, "__used __section(\"__version_ext_crcs\") = {\n");
+	list_for_each_entry(s, &mod->unresolved_symbols, list) {
+		if (!s->module)
+			continue;
+		if (!s->crc_valid) {
+			// We already warned on this when producing the legacy
+			// modversions table.
+			continue;
+		}
+		buf_printf(b, "\t%#8x,\n", s->crc);
+	}
+	buf_printf(b, "};\n");
+
+	buf_printf(b, "static const char ____version_ext_names[]\n");
+	buf_printf(b, "__used __section(\"__version_ext_names\") =\n");
+	list_for_each_entry(s, &mod->unresolved_symbols, list) {
+		if (!s->module)
+			continue;
+		if (!s->crc_valid) {
+			// We already warned on this when producing the legacy
+			// modversions table.
+			continue;
+		}
+		buf_printf(b, "\t\"%s\\0\"\n", s->name);
+	}
+	buf_printf(b, ";\n");
 }
 
 static void add_depends(struct buffer *b, struct module *mod)