
[RFC] module: Introduce module unload taint tracking

Message ID 20211124173327.3878177-1-atomlin@redhat.com (mailing list archive)
State Superseded
Series [RFC] module: Introduce module unload taint tracking

Commit Message

Aaron Tomlin Nov. 24, 2021, 5:33 p.m. UTC
Currently, only the initial module that tainted the kernel is
recorded e.g. when an out-of-tree module is loaded.

So the purpose of this patch is to allow the kernel to maintain a record of
each unloaded module that taints the kernel. Now, in addition to displaying
a list of linked modules (see print_modules()) e.g. in the event of an
Oops, unloaded modules that carried a taint/or taints are also displayed.
If the previously unloaded module is loaded once again it will be removed
from the list only if the taints bitmask is the same.

The number of tracked modules is not fixed and can be modified accordingly.
This feature is disabled by default.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 include/linux/module.h |   5 ++
 init/Kconfig           |   9 ++++
 kernel/module.c        | 106 +++++++++++++++++++++++++++++++++++++++--
 kernel/sysctl.c        |  10 ++++
 4 files changed, 126 insertions(+), 4 deletions(-)

Comments

Luis Chamberlain Dec. 8, 2021, 8:47 p.m. UTC | #1
Hey Aaron thanks for your patch! Please Cc the folks I added in future
iterations. My review below.

Andrew,

just Cc'ing you in case the sysctl changes in here need to rely on
your changes. If so then we'll have these changes (if and once reviewed)
go through your tree.

On Wed, Nov 24, 2021 at 05:33:27PM +0000, Aaron Tomlin wrote:
> Currently, only the initial module that tainted the kernel is
> recorded e.g. when an out-of-tree module is loaded.
> 
> So the purpose of this patch is to allow the kernel to maintain a record of
> each unloaded module that taints the kernel. Now, in addition to displaying
> a list of linked modules (see print_modules()) e.g. in the event of an
> Oops, unloaded modules that carried a taint/or taints are also displayed.

This all does indeed seem useful to me.

> If the previously unloaded module is loaded once again it will be removed
> from the list only if the taints bitmask is the same.

That doesn't seem to be clear. What if, say, a user loads a module which
taints the kernel, then unloads it, and then tries to load a similar
module with the same name that does not taint the kernel?

Wouldn't we lose visibility that at one point the tainting module was
loaded? OK, I see after reviewing the patch that we keep track of each
module instance unloaded with an attached unsigned long taints. So if
a module was unloaded with a different taint, we'd see it twice. Is that
right?

> The number of tracked modules is not fixed and can be modified accordingly.

The commit should mention what happens if the limit is reached.

> This feature is disabled by default.
> 
> Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> ---
>  include/linux/module.h |   5 ++
>  init/Kconfig           |   9 ++++
>  kernel/module.c        | 106 +++++++++++++++++++++++++++++++++++++++--
>  kernel/sysctl.c        |  10 ++++
>  4 files changed, 126 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 8a298d820dbc..6f089953f28a 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -672,6 +672,11 @@ static inline bool is_livepatch_module(struct module *mod)
>  bool is_module_sig_enforced(void);
>  void set_module_sig_enforced(void);
>  
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +
> +extern int tainted_list_max_count; /* for sysctl */
> +
> +#endif
>  #else /* !CONFIG_MODULES... */
>  
>  static inline struct module *__module_address(unsigned long addr)
> diff --git a/init/Kconfig b/init/Kconfig
> index bb0d6e6262b1..699c6cf948d8 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2087,6 +2087,15 @@ config MODULE_FORCE_UNLOAD
>  	  rmmod).  This is mainly for kernel developers and desperate users.
>  	  If unsure, say N.
>  
> +config MODULE_UNLOAD_TAINT_TRACKING
> +	bool "Tainted module unload tracking"
> +	default n
> +	help
> +          This option allows you to maintain a record of each unloaded
> +          module that taints the kernel. Now in addition to displaying a
> +          list of linked modules e.g. in the event of an Oops, the
> +          aforementioned details are also displayed. If unsure, say N.
> +
>  config MODVERSIONS
>  	bool "Module versioning support"
>  	help
> diff --git a/kernel/module.c b/kernel/module.c
> index ed13917ea5f3..11e10b571d64 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -90,6 +90,17 @@
>   */
>  static DEFINE_MUTEX(module_mutex);
>  static LIST_HEAD(modules);
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING

wc -l kernel/*.c| sort -r -n -k 1| head
84550 total
6143 kernel/workqueue.c
4810 kernel/module.c
4789 kernel/signal.c
3170 kernel/fork.c
2997 kernel/auditsc.c
2902 kernel/kprobes.c
2857 kernel/sysctl.c
2760 kernel/sys.c
2712 kernel/cpu.c

I think it is time we start splitting module.c out into components,
and here we might have a good opportunity to do that. There are tons
of nasty cobwebs I'd like to start cleaning up from module.c. So
how about we start by moving module stuff out to kernel/modules/main.c
and then you can bring in your taint friend into that directory.

That way we can avoid the #ifdefs, which seem to attract huge spiders.
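
Something like this completely untested sketch is what I have in mind:
the tracking code lives in its own file and the main module code only
sees stubs via an internal header, so the call sites stay #ifdef free
(file and function names below are just illustrative):

/* kernel/module/internal.h */
#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
/* implemented in kernel/module/tracking.c */
int try_add_tainted_module(struct module *mod);
void print_unloaded_tainted_modules(void);
#else
static inline int try_add_tainted_module(struct module *mod)
{
	return 0;
}
static inline void print_unloaded_tainted_modules(void)
{
}
#endif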

Maybe the live patch stuff should go in its own file too?

> +static LIST_HEAD(unloaded_tainted_modules);
> +static int tainted_list_count;
> +int __read_mostly tainted_list_max_count = 20;

Please read the guidance for __read_mostly on include/linux/cache.h.
I don't see performance metrics on your commit log to justify this use.
We don't want people to just be using that for anything they think is
read often... but not really in the context of what it was originally
designed for.

Loading and unloading modules... to keep track of *which ones are
tainted*. I'd find it extremely hard to believe this is such a common
thing and hot path that we need this.

In any case, since a linked list is used, I'm curious why did you
decide to bound this to an arbitrary limit of say 20? If this
feature is enabled why not make this boundless?

> +struct mod_unloaded_taint {
> +	struct list_head list;
> +	char name[MODULE_NAME_LEN];
> +	unsigned long taints;
> +};
> +#endif
>  
>  /* Work queue for freeing init sections in success case */
>  static void do_free_init(struct work_struct *w);
> @@ -310,6 +321,47 @@ int unregister_module_notifier(struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL(unregister_module_notifier);
>  
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +
> +static int try_add_tainted_module(struct module *mod)
> +{
> +	struct mod_unload_taint *mod_taint;
> +
> +	module_assert_mutex_or_preempt();
> +
> +	if (tainted_list_max_count >= 0 && mod->taints) {
> +		if (!tainted_list_max_count &&
> +			tainted_list_count >= tainted_list_max_count) {
> +			pr_warn_once("%s: limit reached on the unloaded tainted modules list (count: %d).\n",
> +				     mod->name, tainted_list_count);
> +			goto out;
> +		}
> +
> +		mod_taint = kmalloc(sizeof(*mod_taint), GFP_KERNEL);
> +		if (unlikely(!mod_taint))
> +			return -ENOMEM;
> +		else {
> +			strlcpy(mod_taint->name, mod->name,
> +				MODULE_NAME_LEN);
> +			mod_taint->taints = mod->taints;
> +			list_add_rcu(&mod_taint->list,
> +				&unloaded_tainted_modules);
> +			tainted_list_count++;
> +		}
> +out:
> +	}
> +	return 0;
> +}
> +
> +#else /* MODULE_UNLOAD_TAINT_TRACKING */
> +
> +static int try_add_tainted_module(struct module *mod)
> +{
> +	return 0;
> +}
> +
> +#endif /* MODULE_UNLOAD_TAINT_TRACKING */
> +
>  /*
>   * We require a truly strong try_module_get(): 0 means success.
>   * Otherwise an error is returned due to ongoing or failed
> @@ -579,6 +631,23 @@ struct module *find_module(const char *name)
>  {
>  	return find_module_all(name, strlen(name), false);
>  }
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +struct mod_unload_taint *find_mod_unload_taint(const char *name, size_t len,
> +					    unsigned long taints)
> +{
> +	struct mod_unload_taint *mod_taint;
> +
> +	module_assert_mutex_or_preempt();
> +
> +	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules, list,
> +				lockdep_is_held(&module_mutex)) {
> +		if (strlen(mod_taint->name) == len && !memcmp(mod_taint->name,
> +			name, len) && mod_taint->taints & taints) {
> +			return mod_taint;
> +		}
> +	}
> +	return NULL;
> +#endif
>  
>  #ifdef CONFIG_SMP
>  
> @@ -1121,13 +1190,13 @@ static inline int module_unload_init(struct module *mod)
>  }
>  #endif /* CONFIG_MODULE_UNLOAD */
>  
> -static size_t module_flags_taint(struct module *mod, char *buf)
> +static size_t module_flags_taint(unsigned long taints, char *buf)
>  {
>  	size_t l = 0;
>  	int i;
>  
>  	for (i = 0; i < TAINT_FLAGS_COUNT; i++) {
> -		if (taint_flags[i].module && test_bit(i, &mod->taints))
> +		if (taint_flags[i].module && test_bit(i, &taints))
>  			buf[l++] = taint_flags[i].c_true;
>  	}

Please make this its own separate patch. This makes it easier to review
the other changes.

>  
> @@ -1194,7 +1263,7 @@ static ssize_t show_taint(struct module_attribute *mattr,
>  {
>  	size_t l;
>  
> -	l = module_flags_taint(mk->mod, buffer);
> +	l = module_flags_taint(mk->mod->taints, buffer);
>  	buffer[l++] = '\n';
>  	return l;
>  }
> @@ -2193,6 +2262,9 @@ static void free_module(struct module *mod)
>  	module_bug_cleanup(mod);
>  	/* Wait for RCU-sched synchronizing before releasing mod->list and buglist. */
>  	synchronize_rcu();
> +	if (try_add_tainted_module(mod))
> +		pr_error("%s: adding tainted module to the unloaded tainted modules list failed.\n",
> +			 mod->name);
>  	mutex_unlock(&module_mutex);
>  
>  	/* Clean up CFI for the module. */
> @@ -3670,6 +3742,9 @@ static noinline int do_init_module(struct module *mod)
>  {
>  	int ret = 0;
>  	struct mod_initfree *freeinit;
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +	struct mod_unload_taint *old;
> +#endif
>  
>  	freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
>  	if (!freeinit) {
> @@ -3703,6 +3778,16 @@ static noinline int do_init_module(struct module *mod)
>  	mod->state = MODULE_STATE_LIVE;
>  	blocking_notifier_call_chain(&module_notify_list,
>  				     MODULE_STATE_LIVE, mod);
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +	mutex_lock(&module_mutex);
> +	old = find_mod_unload_taint(mod->name, strlen(mod->name),
> +				mod->taints);
> +	if (old) {
> +		list_del_rcu(&old->list);
> +		synchronize_rcu();
> +	}
> +	mutex_unlock(&module_mutex);

But here we seem to delete an old instance of the module taint
history if it is loaded again and has the same taint properties.
Why?

I mean, if a taint happened once, and our goal is to keep track
of them, I'd imagine I'd want to know that this had happened
before, so instead how about just an incrementing counter for this,
so we know how many times this has happened? Please use u64 for that.
I have some test environments where module unloading happens *a lot*.
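
i.e. an untested sketch of what I mean; just extend the entry with a
counter and bump it instead of adding duplicate entries:

struct mod_unload_taint {
	struct list_head list;
	char name[MODULE_NAME_LEN];
	unsigned long taints;
	u64 count;	/* how many times this name/taints pair was unloaded */
};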

> +#endif
>  
>  	/* Delay uevent until module has finished its init routine */
>  	kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD);
> @@ -4511,7 +4596,7 @@ static char *module_flags(struct module *mod, char *buf)
>  	    mod->state == MODULE_STATE_GOING ||
>  	    mod->state == MODULE_STATE_COMING) {
>  		buf[bx++] = '(';
> -		bx += module_flags_taint(mod, buf + bx);
> +		bx += module_flags_taint(mod->taints, buf + bx);

This change can be its own separate patch.

>  		/* Show a - for module-is-being-unloaded */
>  		if (mod->state == MODULE_STATE_GOING)
>  			buf[bx++] = '-';
> @@ -4735,6 +4820,10 @@ void print_modules(void)
>  {
>  	struct module *mod;
>  	char buf[MODULE_FLAGS_BUF_SIZE];
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +	struct mod_unload_taint *mod_taint;
> +	size_t l;
> +#endif
>  
>  	printk(KERN_DEFAULT "Modules linked in:");
>  	/* Most callers should already have preempt disabled, but make sure */
> @@ -4744,6 +4833,15 @@ void print_modules(void)
>  			continue;
>  		pr_cont(" %s%s", mod->name, module_flags(mod, buf));
>  	}
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +	printk(KERN_DEFAULT "\nUnloaded tainted modules:");
> +	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules,
> +				list) {
> +		l = module_flags_taint(mod_taint->taints, buf);
> +		buf[l++] = '\0';
> +		pr_cont(" %s(%s)", mod_taint->name, buf);
> +	}
> +#endif

Ugh yeah no, this has to be in its own file. Reading this file
is just one huge effort right now. Please make this a helper so we
don't have to see this eye blinding code.
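
i.e. once this lives in its own file, print_modules() should only need to
call a helper along these lines (untested, the name is just a suggestion):

/* in the new tracking file */
void print_unloaded_tainted_modules(void)
{
	struct mod_unload_taint *mod_taint;
	char buf[MODULE_FLAGS_BUF_SIZE];
	size_t l;

	printk(KERN_DEFAULT "Unloaded tainted modules:");
	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules, list) {
		l = module_flags_taint(mod_taint->taints, buf);
		buf[l++] = '\0';
		pr_cont(" %s(%s)", mod_taint->name, buf);
	}
}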

>  	preempt_enable();
>  	if (last_unloaded_module[0])
>  		pr_cont(" [last unloaded: %s]", last_unloaded_module);
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 272f4a272f8c..290ffaa5b553 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -2078,6 +2078,16 @@ static struct ctl_table kern_table[] = {
>  		.extra1		= SYSCTL_ONE,
>  		.extra2		= SYSCTL_ONE,
>  	},
> +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> +	{
> +		.procname	= "tainted_list_max_count",
> +		.data		= &tainted_list_max_count,
> +		.maxlen		= sizeof(int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &neg_one,
> +	},
> +#endif
>  #endif
>  #ifdef CONFIG_UEVENT_HELPER

Please see the kernel/sysctl.c changes on linux-next; we're moving away
from everyone stuffing their sysctls into kernel/sysctl.c, and there
you can find helpers and examples of how *not* to do this. It's
on the kernel table so you should be able to just
register_sysctl_init("kernel", modules_sysctls) and, while at it,
if you spot any sysctls for modules under the kern_table, please
move those over; then your patch would be adding just one new
entry to that new local modules_sysctls table.
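
i.e. an untested sketch, with the table living next to the module code
rather than in kernel/sysctl.c:

static int neg_one = -1;

static struct ctl_table modules_sysctls[] = {
	{
		.procname	= "tainted_list_max_count",
		.data		= &tainted_list_max_count,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &neg_one,
	},
	{ }
};

static int __init modules_sysctl_init(void)
{
	register_sysctl_init("kernel", modules_sysctls);
	return 0;
}

subsys_initcall(modules_sysctl_init);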

We'll have to coordinate with Andrew given that if your changes
depend on those changes then we might as well get all your
changes through Andrew for the next release cycle.

  Luis
Aaron Tomlin Dec. 9, 2021, 4:49 p.m. UTC | #2
On Wed 2021-12-08 12:47 -0800, Luis Chamberlain wrote:
> Hey Aaron thanks for your patch!

Hi Luis,

Firstly, thank you for your review and feedback thus far.

> Please Cc the folks I added in future iterations.

All right.

> > If the previously unloaded module is loaded once again it will be removed
> > from the list only if the taints bitmask is the same.
> 
> That doesn't seem to be clear. What if, say, a user loads a module which
> taints the kernel, then unloads it, and then tries to load a similar
> module with the same name that does not taint the kernel?
> 
> Wouldn't we lose visibility that at one point the tainting module was
> loaded? OK, I see after reviewing the patch that we keep track of each
> module instance unloaded with an attached unsigned long taints. So if
> a module was unloaded with a different taint, we'd see it twice. Is that
> right?

Indeed - is this acceptable to you? I prefer this approach rather than
remove it from the aforementioned list solely based on the module name.

> > The number of tracked modules is not fixed and can be modified accordingly.
> 
> The commit should mention what happens if the limit is reached.

I will mention this accordingly.

> wc -l kernel/*.c| sort -r -n -k 1| head
> 84550 total
> 6143 kernel/workqueue.c
> 4810 kernel/module.c
> 4789 kernel/signal.c
> 3170 kernel/fork.c
> 2997 kernel/auditsc.c
> 2902 kernel/kprobes.c
> 2857 kernel/sysctl.c
> 2760 kernel/sys.c
> 2712 kernel/cpu.c
> 
> I think it is time we start splitting module.c out into components,
> and here we might have a good opportunity to do that. There are tons
> of nasty cobwebs I'd like to start cleaning up from module.c. So
> how about we start by moving module stuff out to kernel/modules/main.c
> and then you can bring in your taint friend into that directory.
> 
> That way we can avoid the #ifdefs, which seem to attract huge spiders.

Agreed. This makes sense. I'll work on it.

> Maybe the live patch stuff should go in its own file too?

At first glance, I believe this is possible too.

> 
> > +static LIST_HEAD(unloaded_tainted_modules);
> > +static int tainted_list_count;
> > +int __read_mostly tainted_list_max_count = 20;
> 
> Please read the guidance for __read_mostly on include/linux/cache.h.
> I don't see performance metrics on your commit log to justify this use.
> We don't want people to just be using that for anything they think is
> read often... but not really in the context of what it was originally
> designed for.

Understood.

> Loading and unloading modules... to keep track of *which ones are
> tainted*. I'd find it extremely hard to believe this is such a common
> thing and hot path that we need this.
> 
> In any case, since a linked list is used, I'm curious why did you
> decide to bound this to an arbitrary limit of say 20? If this
> feature is enabled why not make this boundless?

It can be, once set to 0. Indeed, the limit specified above is arbitrary.
Personally, I prefer to have some limit that can be controlled by the user.
In fact, if agreed, I can incorporate the limit [when specified] into the
output generated via print_modules().

> 
> > +struct mod_unloaded_taint {
> > +	struct list_head list;
> > +	char name[MODULE_NAME_LEN];
> > +	unsigned long taints;
> > +};
> > +#endif
> >  
> >  /* Work queue for freeing init sections in success case */
> >  static void do_free_init(struct work_struct *w);
> > @@ -310,6 +321,47 @@ int unregister_module_notifier(struct notifier_block *nb)
> >  }
> >  EXPORT_SYMBOL(unregister_module_notifier);
> >  
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +
> > +static int try_add_tainted_module(struct module *mod)
> > +{
> > +	struct mod_unload_taint *mod_taint;
> > +
> > +	module_assert_mutex_or_preempt();
> > +
> > +	if (tainted_list_max_count >= 0 && mod->taints) {
> > +		if (!tainted_list_max_count &&
> > +			tainted_list_count >= tainted_list_max_count) {
> > +			pr_warn_once("%s: limit reached on the unloaded tainted modules list (count: %d).\n",
> > +				     mod->name, tainted_list_count);
> > +			goto out;
> > +		}
> > +
> > +		mod_taint = kmalloc(sizeof(*mod_taint), GFP_KERNEL);
> > +		if (unlikely(!mod_taint))
> > +			return -ENOMEM;
> > +		else {
> > +			strlcpy(mod_taint->name, mod->name,
> > +				MODULE_NAME_LEN);
> > +			mod_taint->taints = mod->taints;
> > +			list_add_rcu(&mod_taint->list,
> > +				&unloaded_tainted_modules);
> > +			tainted_list_count++;
> > +		}
> > +out:
> > +	}
> > +	return 0;
> > +}
> > +
> > +#else /* MODULE_UNLOAD_TAINT_TRACKING */
> > +
> > +static int try_add_tainted_module(struct module *mod)
> > +{
> > +	return 0;
> > +}
> > +
> > +#endif /* MODULE_UNLOAD_TAINT_TRACKING */
> > +
> >  /*
> >   * We require a truly strong try_module_get(): 0 means success.
> >   * Otherwise an error is returned due to ongoing or failed
> > @@ -579,6 +631,23 @@ struct module *find_module(const char *name)
> >  {
> >  	return find_module_all(name, strlen(name), false);
> >  }
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +struct mod_unload_taint *find_mod_unload_taint(const char *name, size_t len,
> > +					    unsigned long taints)
> > +{
> > +	struct mod_unload_taint *mod_taint;
> > +
> > +	module_assert_mutex_or_preempt();
> > +
> > +	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules, list,
> > +				lockdep_is_held(&module_mutex)) {
> > +		if (strlen(mod_taint->name) == len && !memcmp(mod_taint->name,
> > +			name, len) && mod_taint->taints & taints) {
> > +			return mod_taint;
> > +		}
> > +	}
> > +	return NULL;
> > +#endif
> >  
> >  #ifdef CONFIG_SMP
> >  
> > @@ -1121,13 +1190,13 @@ static inline int module_unload_init(struct module *mod)
> >  }
> >  #endif /* CONFIG_MODULE_UNLOAD */
> >  
> > -static size_t module_flags_taint(struct module *mod, char *buf)
> > +static size_t module_flags_taint(unsigned long taints, char *buf)
> >  {
> >  	size_t l = 0;
> >  	int i;
> >  
> >  	for (i = 0; i < TAINT_FLAGS_COUNT; i++) {
> > -		if (taint_flags[i].module && test_bit(i, &mod->taints))
> > +		if (taint_flags[i].module && test_bit(i, &taints))
> >  			buf[l++] = taint_flags[i].c_true;
> >  	}
> 
> Please make this its own separate patch. This makes it easier to review
> the other changes.

No problem, will do.

> >  
> > @@ -1194,7 +1263,7 @@ static ssize_t show_taint(struct module_attribute *mattr,
> >  {
> >  	size_t l;
> >  
> > -	l = module_flags_taint(mk->mod, buffer);
> > +	l = module_flags_taint(mk->mod->taints, buffer);
> >  	buffer[l++] = '\n';
> >  	return l;
> >  }
> > @@ -2193,6 +2262,9 @@ static void free_module(struct module *mod)
> >  	module_bug_cleanup(mod);
> >  	/* Wait for RCU-sched synchronizing before releasing mod->list and buglist. */
> >  	synchronize_rcu();
> > +	if (try_add_tainted_module(mod))
> > +		pr_error("%s: adding tainted module to the unloaded tainted modules list failed.\n",
> > +			 mod->name);
> >  	mutex_unlock(&module_mutex);
> >  
> >  	/* Clean up CFI for the module. */
> > @@ -3670,6 +3742,9 @@ static noinline int do_init_module(struct module *mod)
> >  {
> >  	int ret = 0;
> >  	struct mod_initfree *freeinit;
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +	struct mod_unload_taint *old;
> > +#endif
> >  
> >  	freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
> >  	if (!freeinit) {
> > @@ -3703,6 +3778,16 @@ static noinline int do_init_module(struct module *mod)
> >  	mod->state = MODULE_STATE_LIVE;
> >  	blocking_notifier_call_chain(&module_notify_list,
> >  				     MODULE_STATE_LIVE, mod);
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +	mutex_lock(&module_mutex);
> > +	old = find_mod_unload_taint(mod->name, strlen(mod->name),
> > +				mod->taints);
> > +	if (old) {
> > +		list_del_rcu(&old->list);
> > +		synchronize_rcu();
> > +	}
> > +	mutex_unlock(&module_mutex);
> 
> But here we seem to delete an old instance of the module taint
> history if it is loaded again and has the same taint properties.
> Why?

At first glance, in this particular case, I believe this makes sense to
avoid duplication, i.e. the tainted module would be stored in the 'modules'
list and thus should be shown once via print_modules(). So, the initial
objective was to only track a "tainted" module when unloaded; once it is
added/or loaded again [with the same taint(s)], further tracking ceases.

> I mean, if a taint happened once, and our goal is to keep track
> of them, I'd imagine I'd want to know that this had happened
> before, so instead how about just an incrementing counter for this,
> so we know how many times this has happened? Please use u64 for that.
> I have some test environments where module unloading happens *a lot*.

If I understand correctly, I do not like this approach but indeed it could
work. Personally, I would like to incorporate the above idea, i.e. tracking
the unload count, into the initial goal.

> 
> > +#endif
> >  
> >  	/* Delay uevent until module has finished its init routine */
> >  	kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD);
> > @@ -4511,7 +4596,7 @@ static char *module_flags(struct module *mod, char *buf)
> >  	    mod->state == MODULE_STATE_GOING ||
> >  	    mod->state == MODULE_STATE_COMING) {
> >  		buf[bx++] = '(';
> > -		bx += module_flags_taint(mod, buf + bx);
> > +		bx += module_flags_taint(mod->taints, buf + bx);
> 
> This change can be its own separate patch.

Will do.

> 
> >  		/* Show a - for module-is-being-unloaded */
> >  		if (mod->state == MODULE_STATE_GOING)
> >  			buf[bx++] = '-';
> > @@ -4735,6 +4820,10 @@ void print_modules(void)
> >  {
> >  	struct module *mod;
> >  	char buf[MODULE_FLAGS_BUF_SIZE];
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +	struct mod_unload_taint *mod_taint;
> > +	size_t l;
> > +#endif
> >  
> >  	printk(KERN_DEFAULT "Modules linked in:");
> >  	/* Most callers should already have preempt disabled, but make sure */
> > @@ -4744,6 +4833,15 @@ void print_modules(void)
> >  			continue;
> >  		pr_cont(" %s%s", mod->name, module_flags(mod, buf));
> >  	}
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +	printk(KERN_DEFAULT "\nUnloaded tainted modules:");
> > +	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules,
> > +				list) {
> > +		l = module_flags_taint(mod_taint->taints, buf);
> > +		buf[l++] = '\0';
> > +		pr_cont(" %s(%s)", mod_taint->name, buf);
> > +	}
> > +#endif
> 
> Ugh yeah no, this has to be in its own file. Reading this file
> is just one huge effort right now. Please make this a helper so we
> don't have to see this eye blinding code.

Sure, no problem.

> 
> >  	preempt_enable();
> >  	if (last_unloaded_module[0])
> >  		pr_cont(" [last unloaded: %s]", last_unloaded_module);
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 272f4a272f8c..290ffaa5b553 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -2078,6 +2078,16 @@ static struct ctl_table kern_table[] = {
> >  		.extra1		= SYSCTL_ONE,
> >  		.extra2		= SYSCTL_ONE,
> >  	},
> > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > +	{
> > +		.procname	= "tainted_list_max_count",
> > +		.data		= &tainted_list_max_count,
> > +		.maxlen		= sizeof(int),
> > +		.mode		= 0644,
> > +		.proc_handler	= proc_dointvec_minmax,
> > +		.extra1		= &neg_one,
> > +	},
> > +#endif
> >  #endif
> >  #ifdef CONFIG_UEVENT_HELPER
> 
> Please see the kernel/sysctl.c changes on linux-next; we're moving away
> from everyone stuffing their sysctls into kernel/sysctl.c, and there
> you can find helpers and examples of how *not* to do this. It's
> on the kernel table so you should be able to just
> register_sysctl_init("kernel", modules_sysctls) and, while at it,
> if you spot any sysctls for modules under the kern_table, please
> move those over; then your patch would be adding just one new
> entry to that new local modules_sysctls table.
> 
> We'll have to coordinate with Andrew given that if your changes
> depend on those changes then we might as well get all your
> changes through Andrew for the next release cycle.

All right. I will make the required changes. Thanks once again.



Regards,
Luis Chamberlain Dec. 9, 2021, 11:42 p.m. UTC | #3
On Thu, Dec 09, 2021 at 04:49:17PM +0000, Aaron Tomlin wrote:
> On Wed 2021-12-08 12:47 -0800, Luis Chamberlain wrote:
> > > If the previously unloaded module is loaded once again it will be removed
> > > from the list only if the taints bitmask is the same.
> > 
> > That doesn't seem to be clear. What if, say, a user loads a module which
> > taints the kernel, then unloads it, and then tries to load a similar
> > module with the same name that does not taint the kernel?
> > 
> > Wouldn't we lose visibility that at one point the tainting module was
> > loaded? OK, I see after reviewing the patch that we keep track of each
> > module instance unloaded with an attached unsigned long taints. So if
> > a module was unloaded with a different taint, we'd see it twice. Is that
> > right?
> 
> Indeed - is this acceptable to you? I prefer this approach rather than
> remove it from the aforementioned list solely based on the module name.

Sure, it makes sense to keep a record of all the stupid ways we are harming
the kernel. The other point I made about the count though would still
be good, in case the taint was the same.

> > wc -l kernel/*.c| sort -r -n -k 1| head
> > 84550 total
> > 6143 kernel/workqueue.c
> > 4810 kernel/module.c
> > 4789 kernel/signal.c
> > 3170 kernel/fork.c
> > 2997 kernel/auditsc.c
> > 2902 kernel/kprobes.c
> > 2857 kernel/sysctl.c
> > 2760 kernel/sys.c
> > 2712 kernel/cpu.c
> > 
> > I think it is time we start splitting module.c out into components,
> > and here we might have a good opportunity to do that. There are tons
> > of nasty cob webs I'd like to start cleaning up from module.c. So
> > how about we start by moving module stuff out to kernel/modules/main.c
> > and then you can bring in your taint friend into that directory.
> > 
> > That way we can avoid the #ifdefs, which seem to attract huge spiders.
> 
> Agreed. This makes sense. I'll work on it.

Wonderful, thanks!

> > Maybe the live patch stuff should go in its own file too?
> 
> At first glance, I believe this is possible too.

Great! Thanks for being willing to doing this!

> > Loading and unloading modules... to keep track of *which ones are
> > tainted*. I'd find it extremely hard to believe this is such a common
> > thing and hot path that we need this.
> > 
> > In any case, since a linked list is used, I'm curious why did you
> > decide to bound this to an arbitrary limit of say 20? If this
> > feature is enabled why not make this boundless?
> 
> It can be, once set to 0. Indeed, the limit specified above is arbitrary.
> Personally, I prefer to have some limit that can be controlled by the user.
> In fact, if agreed, I can incorporate the limit [when specified] into the
> output generated via print_modules().

If someone enables this feature I can't think of a reason why they
would want to limit this to some arbitrary number. So my preference
is to remove that limitation completely. I see no point to it.

> > > @@ -3703,6 +3778,16 @@ static noinline int do_init_module(struct module *mod)
> > >  	mod->state = MODULE_STATE_LIVE;
> > >  	blocking_notifier_call_chain(&module_notify_list,
> > >  				     MODULE_STATE_LIVE, mod);
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +	mutex_lock(&module_mutex);
> > > +	old = find_mod_unload_taint(mod->name, strlen(mod->name),
> > > +				mod->taints);
> > > +	if (old) {
> > > +		list_del_rcu(&old->list);
> > > +		synchronize_rcu();
> > > +	}
> > > +	mutex_unlock(&module_mutex);
> > 
> > But here we seem to delete an old instance of the module taint
> > history if it is loaded again and has the same taint properties.
> > Why?
> 
> At first glance, in this particular case, I believe this makes sense to
> avoid duplication

If you just bump the count then it's not duplication; it just adds
more information that the same module name with the same taint flag
has been unloaded now more than once.

> i.e. the tainted module would be stored in the 'modules'
> list and thus should be shown once via print_modules(). So, the initial
> objective was to only track a "tainted" module when unloaded; once it is
> added/or loaded again [with the same taint(s)], further tracking ceases.

This makes me wonder, why not just grow the list at driver insertion
time, rather than removal.

> > I mean, if a taint happened once, and our goal is to keep track
> > of them, I'd imagine I'd want to know that this had happened
> > before, so instead how about just an incrementing counter for this,
> > so we know how many times this has happened? Please use u64 for that.
> > I have some test environments where module unloading happens *a lot*.
> 
> If I understand correctly, I do not like this approach but indeed it could
> work.

I'm a bit confused, because here you seem to suggest you don't like the
idea, and then...

> Personally, I would like to incorporate the above idea, i.e. tracking
> the unload count, into the initial goal.

Here you say you'd like to keep the unload count.

> > Please see the kernel/sysctl.c changes on linux-next; we're moving away
> > from everyone stuffing their sysctls into kernel/sysctl.c, and there
> > you can find helpers and examples of how *not* to do this. It's
> > on the kernel table so you should be able to just
> > register_sysctl_init("kernel", modules_sysctls) and, while at it,
> > if you spot any sysctls for modules under the kern_table, please
> > move those over; then your patch would be adding just one new
> > entry to that new local modules_sysctls table.
> > 
> > We'll have to coordinate with Andrew given that if your changes
> > depend on those changes then we might as well get all your
> > changes through Andrew for the next release cycle.
> 
> All right. I will make the required changes. Thanks once again.

Sure, so hey just one more thing. Can you add a simple selftest
lib/test_taint.c which can be used to test tainting and your new
tracker? You can add a new selftest under

tools/testing/selftests/module/
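
Something as simple as this untested sketch could be the helper module
(lib/test_taint.c is just my suggestion for the name); a non-GPL licence
alone is enough to set a module taint for your tracker to record on unload:

/* lib/test_taint.c */
#include <linux/module.h>
#include <linux/printk.h>

static int __init test_taint_init(void)
{
	pr_info("test_taint: loaded\n");
	return 0;
}

static void __exit test_taint_exit(void)
{
	pr_info("test_taint: unloaded\n");
}

module_init(test_taint_init);
module_exit(test_taint_exit);

/*
 * Deliberately not GPL-compatible: loading this sets
 * TAINT_PROPRIETARY_MODULE, so unloading it should leave an entry on the
 * unloaded tainted modules list for a selftest to check.
 */
MODULE_LICENSE("Proprietary");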

I had already written some module-based testing in
tools/testing/selftests/kmod/kmod.sh so you can borrow stuff
from there if you find it useful. But I think we need to start
doing basic testing for modules. I know Lucas has tons of tests
on kmod, so we should also look at what is there and what needs
testing outside of that.

Then there is the question of what should be tested using kunit and/or
selftests. From my experience, if you need a shell, use selftests.
Also, if you need parallelization, use selftests, given that kunit by
default uses a uniprocessor architecture, user-mode-linux. I'll let
you figure out what is the best place to add the test for this. It
could just be that it's a better place to add these tests to kmod upstream
as there are tons of tests there already. But kunit tests can't be
added there.

Live patching already has its own set of selftests.

  Luis
Petr Mladek Dec. 10, 2021, 10 a.m. UTC | #4
On Thu 2021-12-09 15:42:08, Luis Chamberlain wrote:
> On Thu, Dec 09, 2021 at 04:49:17PM +0000, Aaron Tomlin wrote:
> > On Wed 2021-12-08 12:47 -0800, Luis Chamberlain wrote:
> > > Loading and unloading modules... to keep track of *which ones are
> > > tainted*. I'd find it extremely hard to believe this is such a common
> > > thing and hot path that we need this.
> > > 
> > > In any case, since a linked list is used, I'm curious why did you
> > > decide to bound this to an arbitrary limit of say 20? If this
> > > feature is enabled why not make this boundless?
> > 
> > It can be, once set to 0. Indeed, the limit specified above is arbitrary.
> > Personally, I prefer to have some limit that can be controlled by the user.
> > In fact, if agreed, I can incorporate the limit [when specified] into the
> > output generated via print_modules().
> 
> If someone enables this feature I can't think of a reason why they
> would want to limit this to some arbitrary number. So my preference
> is to remove that limitation completely. I see no point to it.

I agree with Luis here. We could always add the limit later when
people report real-life problems with the list being too long. It is
always good to know that someone did some heavy lifting in
the system.

It might even be interesting to remember the timestamp of the removal
to match it with other events reported in the system log.
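
For example (just an idea, completely untested), the entry could remember
the same clock that the printk timestamps use:

	/* extra member in struct mod_unload_taint */
	u64 unload_ns;

	/* when the entry is added */
	mod_taint->unload_ns = local_clock();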

> > > > @@ -3703,6 +3778,16 @@ static noinline int do_init_module(struct module *mod)
> > > >  	mod->state = MODULE_STATE_LIVE;
> > > >  	blocking_notifier_call_chain(&module_notify_list,
> > > >  				     MODULE_STATE_LIVE, mod);
> > > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > > +	mutex_lock(&module_mutex);
> > > > +	old = find_mod_unload_taint(mod->name, strlen(mod->name),
> > > > +				mod->taints);
> > > > +	if (old) {
> > > > +		list_del_rcu(&old->list);
> > > > +		synchronize_rcu();
> > > > +	}
> > > > +	mutex_unlock(&module_mutex);
> > > 
> > > But here we seem to delete an old instance of the module taint
> > > history if it is loaded again and has the same taint properties.
> > > Why?
> > 
> > At first glance, in this particular case, I believe this makes sense to
> > avoid duplication
> 
> If you just bump the count then it's not duplication; it just adds
> more information that the same module name with the same taint flag
> has been unloaded now more than once.

Please, do not remove records that a module was removed. IMHO, it
might be useful to track all removed modules, including the non-tainted
ones. Module removal is always tricky and not much tested. The taint
flags might just be shown as extra information in the output.

Best Regards,
Petr
Aaron Tomlin Dec. 10, 2021, 3:48 p.m. UTC | #5
On Thu 2021-12-09 15:42 -0800, Luis Chamberlain wrote:
> > Indeed - is this acceptable to you? I prefer this approach rather than
> > remove it from the aforementioned list solely based on the module name.
> 
> Sure, it makes sense to keep a record of all the stupid ways we are harming
> the kernel. The other point I made about the count though would still
> be good, in case the taint was the same.

Agreed. So, just to confirm: you'd prefer not to remove any module that tainted
the kernel from the aforementioned list when the same module and taints
bitmask is reintroduced? If I understand correctly, we'd simply maintain a
list of modules that tainted the kernel during module deletion/or unload
and their respective unload count? If so, then this was not my original
objective, yet I'm happy with this approach too - I'll take on this
implementation in the next iteration.

> > It can be, once set to 0. Indeed, the limit specified above is arbitrary.
> > Personally, I prefer to have some limit that can be controlled by the user.
> > In fact, if agreed, I can incorporate the limit [when specified] into the
> > output generated via print_modules().
> 
> If someone enables this feature I can't think of a reason why they
> would want to limit this to some arbitrary number. So my preference
> is to remove that limitation completely. I see no point to it.

Fair enough. If necessary we could introduce the above later.

> > > > @@ -3703,6 +3778,16 @@ static noinline int do_init_module(struct module *mod)
> > > >  	mod->state = MODULE_STATE_LIVE;
> > > >  	blocking_notifier_call_chain(&module_notify_list,
> > > >  				     MODULE_STATE_LIVE, mod);
> > > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > > +	mutex_lock(&module_mutex);
> > > > +	old = find_mod_unload_taint(mod->name, strlen(mod->name),
> > > > +				mod->taints);
> > > > +	if (old) {
> > > > +		list_del_rcu(&old->list);
> > > > +		synchronize_rcu();
> > > > +	}
> > > > +	mutex_unlock(&module_mutex);
> > > 
> > > But here we seem to delete an old instance of the module taint
> > > history if it is loaded again and has the same taint properties.
> > > Why?

Yes, this was my original approach. Once the same module [with the same
taints bitmask] is reintroduced, it will be listed on the 'modules' list,
so there is no need to track it on the unloaded list anymore. That being said, as
per the above, let's now keep track of each removal and maintain an unload
count.

> > At first glance, in this particular case, I believe this makes sense to
> > avoid duplication
> 
> If you just bump the count then its not duplication, it just adds
> more information that the same module name with the same taint flag
> has been unloaded now more than once.

Agreed.

> > All right. I will make the required changes. Thanks once again.
> 
> Sure, so hey just one more thing. Can you add a simple selftest
> lib/test_taint.c which can be used to test tainting and your new
> tracker? You can add a new selftest under
> 
> tools/testing/selftests/module/
> 
> I had already written some module-based testing in
> tools/testing/selftests/kmod/kmod.sh so you can borrow stuff
> from there if you find it useful. But I think we need to start
> doing basic testing for modules. I know Lucas has tons of tests
> on kmod, so we should also look at what is there and what needs
> testing outside of that.
> 
> Then there is the question of what should be tested using kunit and/or
> selftests. From my experience, if you need a shell, use selftests.
> Also, if you need parallelization, use selftests, given that kunit by
> default uses a uniprocessor architecture, user-mode-linux. I'll let
> you figure out what is the best place to add the test for this. It
> could just be that it's a better place to add these tests to kmod upstream
> as there are tons of tests there already. But kunit tests can't be
> added there.
> 
> Live patching already has its own set of selftests.


Sure - will do. Thanks once again!
Aaron Tomlin Dec. 10, 2021, 4:09 p.m. UTC | #6
On Fri 2021-12-10 11:00 +0100, Petr Mladek wrote:
> > If someone enables this feature I can't think of a reason why they
> > would want to limit this to some arbitrary number. So my preference
> > is to remove that limitation completely. I see no point to it.
> 
> I agree with Luis here. We could always add the limit later when
> people report real-life problems with the list being too long. It is
> always good to know that someone did some heavy lifting in
> the system.

Fair enough.

> It might even be interesting to remember the timestamp of the removal
> to match it with other events reported in the system log.

I'm not so sure about this. We could gather such details already via Ftrace
(e.g. see load_module()). Personally, I'd prefer to maintain a simple list.

> > If you just bump the count then it's not duplication; it just adds
> > more information that the same module name with the same taint flag
> > has been unloaded now more than once.
> 
> Please, do not remove records that a module was removed. IMHO, it
> might be useful to track all removed module, including the non-tainted
> ones. Module removal is always tricky and not much tested. The tain
> flags might be just shown as extra information in the output.

This is an interesting suggestion. Albeit, as per the subject, I prefer to
just keep track of any module that tainted the kernel. That being said,
Petr, if you'd prefer to track each module unload/or deletion event, then I
would suggest for instance to remove a module once it has been reintroduced
or maintain an unload count as suggested by Luis.

Please let me know your thoughts.


Kind regards,
Luis Chamberlain Dec. 10, 2021, 5:03 p.m. UTC | #7
On Fri, Dec 10, 2021 at 11:00:52AM +0100, Petr Mladek wrote:
> Please, do not remove records that a module was removed. IMHO, it
> might be useful to track all removed modules, including the non-tainted
> ones.

Then we'd need two features: one for all modules removed, and another which
limits this to only tainted modules. On kernel-ci systems where I
try to reproduce issues with fstests or blktests I might be unloading
a module 10,000 times, and so for those systems I'd like to disable
the tracking of all modules removed, otherwise we'd end up with
-ENOMEM eventually.

> Module removal is always tricky and not much tested.

It is tricky but I have been trying to correct issues along that path
and given that fstests and blktests use it heavily I want to start
dispelling the false narrative that this is not a common use case.

> The taint flags might just be shown as extra information in the output.

Yes!

  Luis
Luis Chamberlain Dec. 10, 2021, 5:09 p.m. UTC | #8
On Fri, Dec 10, 2021 at 04:09:31PM +0000, Aaron Tomlin wrote:
> On Fri 2021-12-10 11:00 +0100, Petr Mladek wrote:
> This is an interesting suggestion. Albeit, as per the subject, I prefer to
> just keep track of any module that tainted the kernel. That being said,
> Petr, if you'd prefer to track each module unload/or deletion event, then I
> would suggest for instance to remove a module once it has been reintroduced
> or maintain an unload count as suggested by Luis.

Come to think of this again, although at first it might be enticing to
keep track of module unloads (without taint), we have to ask ourselves
who would need / enable such a feature. And I think Aaron is right that
this might be better tracked in userspace.

The taint though, that seems critical due to the potential harm.

Maybe how many unloads one has done though, that might be useful
debugging information and does not create a huge overhead like a
potential -ENOMEM.

  Luis
Allen Dec. 13, 2021, 1 p.m. UTC | #9
>
> Hi Luis,
>
> Firstly, thank you for your review and feedback thus far.
>
> > Please Cc the folks I added in future iterations.
>
> All right.
>
> > > If the previously unloaded module is loaded once again it will be removed
> > > from the list only if the taints bitmask is the same.
> >
> > That doesn't seem to be clear. What if, say, a user loads a module which
> > taints the kernel, then unloads it, and then tries to load a similar
> > module with the same name that does not taint the kernel?
> >
> > Wouldn't we lose visibility that at one point the tainting module was
> > loaded? OK, I see after reviewing the patch that we keep track of each
> > module instance unloaded with an attached unsigned long taints. So if
> > a module was unloaded with a different taint, we'd see it twice. Is that
> > right?
>
> Indeed - is this acceptable to you? I prefer this approach rather than
> remove it from the aforementioned list solely based on the module name.
>
> > > The number of tracked modules is not fixed and can be modified accordingly.
> >
> > The commit should mention what happens if the limit is reached.
>
> I will mention this accordingly.
>
> > wc -l kernel/*.c| sort -r -n -k 1| head
> > 84550 total
> > 6143 kernel/workqueue.c
> > 4810 kernel/module.c
> > 4789 kernel/signal.c
> > 3170 kernel/fork.c
> > 2997 kernel/auditsc.c
> > 2902 kernel/kprobes.c
> > 2857 kernel/sysctl.c
> > 2760 kernel/sys.c
> > 2712 kernel/cpu.c
> >
> > I think it is time we start splitting module.c out into components,
> > and here we might have a good opportunity to do that. There are tons
> > of nasty cobwebs I'd like to start cleaning up from module.c. So
> > how about we start by moving module stuff out to kernel/modules/main.c
> > and then you can bring in your taint friend into that directory.
> >
> > That way we can avoid the #ifdefs, which seem to attract huge spiders.
>
> Agreed. This makes sense. I'll work on it.

Aaron, Luis,

   I have some ideas and did some work on it. Let me know if we could
work together on this.

- Allen

>
> > Maybe the live patch stuff should go in its own file too?
>
> At first glance, I believe this is possible too.
>
> >
> > > +static LIST_HEAD(unloaded_tainted_modules);
> > > +static int tainted_list_count;
> > > +int __read_mostly tainted_list_max_count = 20;
> >
> > Please read the guidance for __read_mostly on include/linux/cache.h.
> > I don't see performance metrics on your commit log to justify this use.
> > We don't want people to just be using that for anything they think is
> > read often... but not really in the context of what it was originally
> > designed for.
>
> Understood.
>
> > Loading and unloading modules... to keep track of *which ones are
> > tainted*. I'd find it extremely hard to believe this is such a common
> > thing and hot path that we need this.
> >
> > In any case, since a linked list is used, I'm curious why did you
> > decide to bound this to an arbitrary limit of say 20? If this
> > feature is enabled why not make this boundless?
>
> It can be, once set to 0. Indeed, the limit specified above is arbitrary.
> Personally, I prefer to have some limit that can be controlled by the user.
> In fact, if agreed, I can incorporate the limit [when specified] into the
> output generated via print_modules().
>
> >
> > > +struct mod_unloaded_taint {
> > > +   struct list_head list;
> > > +   char name[MODULE_NAME_LEN];
> > > +   unsigned long taints;
> > > +};
> > > +#endif
> > >
> > >  /* Work queue for freeing init sections in success case */
> > >  static void do_free_init(struct work_struct *w);
> > > @@ -310,6 +321,47 @@ int unregister_module_notifier(struct notifier_block *nb)
> > >  }
> > >  EXPORT_SYMBOL(unregister_module_notifier);
> > >
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +
> > > +static int try_add_tainted_module(struct module *mod)
> > > +{
> > > +   struct mod_unload_taint *mod_taint;
> > > +
> > > +   module_assert_mutex_or_preempt();
> > > +
> > > +   if (tainted_list_max_count >= 0 && mod->taints) {
> > > +           if (!tainted_list_max_count &&
> > > +                   tainted_list_count >= tainted_list_max_count) {
> > > +                   pr_warn_once("%s: limit reached on the unloaded tainted modules list (count: %d).\n",
> > > +                                mod->name, tainted_list_count);
> > > +                   goto out;
> > > +           }
> > > +
> > > +           mod_taint = kmalloc(sizeof(*mod_taint), GFP_KERNEL);
> > > +           if (unlikely(!mod_taint))
> > > +                   return -ENOMEM;
> > > +           else {
> > > +                   strlcpy(mod_taint->name, mod->name,
> > > +                           MODULE_NAME_LEN);
> > > +                   mod_taint->taints = mod->taints;
> > > +                   list_add_rcu(&mod_taint->list,
> > > +                           &unloaded_tainted_modules);
> > > +                   tainted_list_count++;
> > > +           }
> > > +out:
> > > +   }
> > > +   return 0;
> > > +}
> > > +
> > > +#else /* MODULE_UNLOAD_TAINT_TRACKING */
> > > +
> > > +static int try_add_tainted_module(struct module *mod)
> > > +{
> > > +   return 0;
> > > +}
> > > +
> > > +#endif /* MODULE_UNLOAD_TAINT_TRACKING */
> > > +
> > >  /*
> > >   * We require a truly strong try_module_get(): 0 means success.
> > >   * Otherwise an error is returned due to ongoing or failed
> > > @@ -579,6 +631,23 @@ struct module *find_module(const char *name)
> > >  {
> > >     return find_module_all(name, strlen(name), false);
> > >  }
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +struct mod_unload_taint *find_mod_unload_taint(const char *name, size_t len,
> > > +                                       unsigned long taints)
> > > +{
> > > +   struct mod_unload_taint *mod_taint;
> > > +
> > > +   module_assert_mutex_or_preempt();
> > > +
> > > +   list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules, list,
> > > +                           lockdep_is_held(&module_mutex)) {
> > > +           if (strlen(mod_taint->name) == len && !memcmp(mod_taint->name,
> > > +                   name, len) && mod_taint->taints & taints) {
> > > +                   return mod_taint;
> > > +           }
> > > +   }
> > > +   return NULL;
> > > +#endif
> > >
> > >  #ifdef CONFIG_SMP
> > >
> > > @@ -1121,13 +1190,13 @@ static inline int module_unload_init(struct module *mod)
> > >  }
> > >  #endif /* CONFIG_MODULE_UNLOAD */
> > >
> > > -static size_t module_flags_taint(struct module *mod, char *buf)
> > > +static size_t module_flags_taint(unsigned long taints, char *buf)
> > >  {
> > >     size_t l = 0;
> > >     int i;
> > >
> > >     for (i = 0; i < TAINT_FLAGS_COUNT; i++) {
> > > -           if (taint_flags[i].module && test_bit(i, &mod->taints))
> > > +           if (taint_flags[i].module && test_bit(i, &taints))
> > >                     buf[l++] = taint_flags[i].c_true;
> > >     }
> >
> > Please make this its own separate patch. This makes it easier to review
> > the other changes.
>
> No problem, will do.
>
> > >
> > > @@ -1194,7 +1263,7 @@ static ssize_t show_taint(struct module_attribute *mattr,
> > >  {
> > >     size_t l;
> > >
> > > -   l = module_flags_taint(mk->mod, buffer);
> > > +   l = module_flags_taint(mk->mod->taints, buffer);
> > >     buffer[l++] = '\n';
> > >     return l;
> > >  }
> > > @@ -2193,6 +2262,9 @@ static void free_module(struct module *mod)
> > >     module_bug_cleanup(mod);
> > >     /* Wait for RCU-sched synchronizing before releasing mod->list and buglist. */
> > >     synchronize_rcu();
> > > +   if (try_add_tainted_module(mod))
> > > +           pr_error("%s: adding tainted module to the unloaded tainted modules list failed.\n",
> > > +                    mod->name);
> > >     mutex_unlock(&module_mutex);
> > >
> > >     /* Clean up CFI for the module. */
> > > @@ -3670,6 +3742,9 @@ static noinline int do_init_module(struct module *mod)
> > >  {
> > >     int ret = 0;
> > >     struct mod_initfree *freeinit;
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +   struct mod_unload_taint *old;
> > > +#endif
> > >
> > >     freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
> > >     if (!freeinit) {
> > > @@ -3703,6 +3778,16 @@ static noinline int do_init_module(struct module *mod)
> > >     mod->state = MODULE_STATE_LIVE;
> > >     blocking_notifier_call_chain(&module_notify_list,
> > >                                  MODULE_STATE_LIVE, mod);
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +   mutex_lock(&module_mutex);
> > > +   old = find_mod_unload_taint(mod->name, strlen(mod->name),
> > > +                           mod->taints);
> > > +   if (old) {
> > > +           list_del_rcu(&old->list);
> > > +           synchronize_rcu();
> > > +   }
> > > +   mutex_unlock(&module_mutex);
> >
> > But here we seem to delete an old instance of the module taint
> > history if it is loaded again and has the same taint properties.
> > Why?
>
> At first glance, in this particular case, I believe this makes sense to
> avoid duplication, i.e. the tainted module would be stored in the 'modules'
> list and thus should be shown once via print_modules(). So, the initial
> objective was to only track a "tainted" module when unloaded; once it is
> added/or loaded again [with the same taint(s)], further tracking ceases.
>
> > I mean, if a taint happened once, and our goal is to keep track
> > of them, I'd imagine I'd want to know that this had happened
> before, so instead how about just an incrementing counter for this,
> so we know how many times this has happened? Please use u64 for that.
> I have some test environments where module unloading happens *a lot*.
>
> If I understand correctly, I do not like this approach but indeed it could
> work. Personally, I would like to incorporate the above idea, i.e. tracking
> the unload count, into the initial goal.
>
> >
> > > +#endif
> > >
> > >     /* Delay uevent until module has finished its init routine */
> > >     kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD);
> > > @@ -4511,7 +4596,7 @@ static char *module_flags(struct module *mod, char *buf)
> > >         mod->state == MODULE_STATE_GOING ||
> > >         mod->state == MODULE_STATE_COMING) {
> > >             buf[bx++] = '(';
> > > -           bx += module_flags_taint(mod, buf + bx);
> > > +           bx += module_flags_taint(mod->taints, buf + bx);
> >
> > This change can be its own separate patch.
>
> Will do.
>
> >
> > >             /* Show a - for module-is-being-unloaded */
> > >             if (mod->state == MODULE_STATE_GOING)
> > >                     buf[bx++] = '-';
> > > @@ -4735,6 +4820,10 @@ void print_modules(void)
> > >  {
> > >     struct module *mod;
> > >     char buf[MODULE_FLAGS_BUF_SIZE];
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +   struct mod_unload_taint *mod_taint;
> > > +   size_t l;
> > > +#endif
> > >
> > >     printk(KERN_DEFAULT "Modules linked in:");
> > >     /* Most callers should already have preempt disabled, but make sure */
> > > @@ -4744,6 +4833,15 @@ void print_modules(void)
> > >                     continue;
> > >             pr_cont(" %s%s", mod->name, module_flags(mod, buf));
> > >     }
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +   printk(KERN_DEFAULT "\nUnloaded tainted modules:");
> > > +   list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules,
> > > +                           list) {
> > > +           l = module_flags_taint(mod_taint->taints, buf);
> > > +           buf[l++] = '\0';
> > > +           pr_cont(" %s(%s)", mod_taint->name, buf);
> > > +   }
> > > +#endif
> >
> > Ugh yeah no, this has to be in its own file. Reading this file
> > is just one huge effort right now. Please make this a helper so we
> > don't have to see this eye blinding code.
>
> Sure, no problem.
>
> >
> > >     preempt_enable();
> > >     if (last_unloaded_module[0])
> > >             pr_cont(" [last unloaded: %s]", last_unloaded_module);
> > > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > > index 272f4a272f8c..290ffaa5b553 100644
> > > --- a/kernel/sysctl.c
> > > +++ b/kernel/sysctl.c
> > > @@ -2078,6 +2078,16 @@ static struct ctl_table kern_table[] = {
> > >             .extra1         = SYSCTL_ONE,
> > >             .extra2         = SYSCTL_ONE,
> > >     },
> > > +#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
> > > +   {
> > > +           .procname       = "tainted_list_max_count",
> > > +           .data           = &tainted_list_max_count,
> > > +           .maxlen         = sizeof(int),
> > > +           .mode           = 0644,
> > > +           .proc_handler   = proc_dointvec_minmax,
> > > +           .extra1         = &neg_one,
> > > +   },
> > > +#endif
> > >  #endif
> > >  #ifdef CONFIG_UEVENT_HELPER
> >
> > Please see the kernel/sysctl.c changes on linux-next; we're moving away
> > from everyone stuffing their sysctls into kernel/sysctl.c, and there
> > you can find helpers and examples of how *not* to do this. It's
> > on the kernel table so you should be able to just
> > register_sysctl_init("kernel", modules_sysctls) and, while at it,
> > if you spot any sysctls for modules under the kern_table, please
> > move those over; then your patch would be adding just one new
> > entry to that new local modules_sysctls table.
> >
> > We'll have to coordinate with Andrew given that if your changes
> > depend on those changes then we might as well get all your
> > changes through Andrew for the next release cycle.
>
> All right. I will make the required changes. Thanks once again.
>
>
>
> Regards,
>
> --
> Aaron Tomlin
>
Petr Mladek Dec. 13, 2021, 3:16 p.m. UTC | #10
On Fri 2021-12-10 16:09:31, Aaron Tomlin wrote:
> On Fri 2021-12-10 11:00 +0100, Petr Mladek wrote:
> > > If someone enables this feature I can't think of a reason why they
> > > would want to limit this to some arbitrary number. So my preference
> > > is to remove that limitation completely. I see no point to it.
> > 
> > I agree with Luis here. We could always add the limit later if
> > people report some real-life problems with a list that grows too
> > long. It is always good to know that someone did some heavy lifting
> > in the system.
> 
> Fair enough.
> 
> > It might even be interesting to remember the timestamp of the removal,
> > to match it with other events reported in the system log.
> 
> I'm not so sure about this. We could gather such details already via Ftrace
> (e.g. see load_module()). Personally, I'd prefer to maintain a simple list.

Fair enough. It was just an idea. A simple list is a good start. We
could always add more details if people find it useful.


> > > If you just bump the count then it's not duplication; it just adds
> > > more information that the same module name with the same taint flag
> > > has now been unloaded more than once.
> > 
> > Please, do not remove the records that a module was removed. IMHO, it
> > might be useful to track all removed modules, including the non-tainted
> > ones. Module removal is always tricky and not much tested. The taint
> > flags might just be shown as extra information in the output.
> 
> This is an interesting suggestion. However, as per the subject, I prefer
> to just keep track of any module that tainted the kernel. That being said,
> Petr, if you'd prefer to track each module unload/deletion event, then I
> would suggest, for instance, removing a module's entry once it has been
> reintroduced, or maintaining an unload count as suggested by Luis.

I still have the following patchset fresh in mind:
https://lore.kernel.org/r/20211129034509.2646872-1-ming.lei@redhat.com
It shows that removing a sysfs interface is tricky and might lead to
use-after-free problems. I can imagine many other similar problems
that might happen with any module.

But I agree that the information about modules that tainted the kernel is
more important. I do not want to block the feature by adding more
requirements.


Also, we should keep in mind that the default panic() message should
be reasonably short. Only the last lines might be visible on the
screen, and serial consoles might be really slow.

It is perfectly fine to add a few lines, like the existing list of
loaded modules. Any potentially extensive output should be optional.
There is already support for optional info; see panic_print_sys_info().
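
(For illustration, assuming the existing interface were simply reused:
users already opt into those extra sections with the panic_print
bitmask, e.g. panic_print=0x3f on the kernel command line or via the
kernel.panic_print sysctl, so an unloaded-tainted-modules dump could be
gated the same way.)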

Best Regards,
Petr
Luis Chamberlain Dec. 20, 2021, 7:23 p.m. UTC | #11
On Mon, Dec 13, 2021 at 06:30:07PM +0530, Allen wrote:
> Aaron, Luis,
> 
>    I have some ideas and did some work on it. Let me know if we could
> work together on this.

Patches welcomed.

  Luis
Aaron Tomlin Dec. 21, 2021, 11:44 a.m. UTC | #12
On Mon 2021-12-13 18:30 +0530, Allen wrote:
> Aaron, Luis,

Hi Allen

> I have some ideas and did some work on it. Let me know if we could work
> together on this.

Yes, we can. What have you done so far?

Kind regards,
Aaron Tomlin Dec. 21, 2021, 11:58 a.m. UTC | #13
On Mon 2021-12-13 16:16 +0100, Petr Mladek wrote:
> > I'm not so sure about this. We could gather such details already via Ftrace
> > (e.g. see load_module()). Personally, I'd prefer to maintain a simple list.
> 
> Fair enough. It was just an idea. A simple list is a good start. We
> could always add more details if people find it useful.

Indeed we could.
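
(As a minimal illustration of the Ftrace angle: the existing module
trace events, e.g. module:module_load and module:module_free under
/sys/kernel/debug/tracing/events/module/, already provide timestamped
load/free records via trace_pipe, without any extra kernel-side state.)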

> Also, we should keep in mind that the default panic() message should
> be reasonably short. Only the last lines might be visible on the
> screen, and serial consoles might be really slow.

Absolutely, I agree. This feature should be entirely optional. In fact,
given the potential amount of data generated, it is likely only useful
when explicitly enabled and the data is then reviewed via /proc/vmcore,
in addition to what is already shown by panic_print_sys_info().
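
For illustration, with the current patch the tail of an Oops/panic
report would gain one extra line of the form (module names made up):

  Modules linked in: nf_tables nfnetlink
  Unloaded tainted modules: dummy_mod(OE) another_mod(P)

where the characters in parentheses are the per-module taint flags
produced by module_flags_taint().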


Kind regards,
diff mbox series

Patch

diff --git a/include/linux/module.h b/include/linux/module.h
index 8a298d820dbc..6f089953f28a 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -672,6 +672,11 @@  static inline bool is_livepatch_module(struct module *mod)
 bool is_module_sig_enforced(void);
 void set_module_sig_enforced(void);
 
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+
+extern int tainted_list_max_count; /* for sysctl */
+
+#endif
 #else /* !CONFIG_MODULES... */
 
 static inline struct module *__module_address(unsigned long addr)
diff --git a/init/Kconfig b/init/Kconfig
index bb0d6e6262b1..699c6cf948d8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2087,6 +2087,15 @@  config MODULE_FORCE_UNLOAD
 	  rmmod).  This is mainly for kernel developers and desperate users.
 	  If unsure, say N.
 
+config MODULE_UNLOAD_TAINT_TRACKING
+	bool "Tainted module unload tracking"
+	default n
+	help
+          This option allows you to maintain a record of each unloaded
+          module that taints the kernel. Now in addition to displaying a
+          list of linked modules e.g. in the event of an Oops, the
+          aforementioned details are also displayed. If unsure, say N.
+
 config MODVERSIONS
 	bool "Module versioning support"
 	help
diff --git a/kernel/module.c b/kernel/module.c
index ed13917ea5f3..11e10b571d64 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -90,6 +90,17 @@ 
  */
 static DEFINE_MUTEX(module_mutex);
 static LIST_HEAD(modules);
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+static LIST_HEAD(unloaded_tainted_modules);
+static int tainted_list_count;
+int __read_mostly tainted_list_max_count = 20;
+
+struct mod_unload_taint {
+	struct list_head list;
+	char name[MODULE_NAME_LEN];
+	unsigned long taints;
+};
+#endif
 
 /* Work queue for freeing init sections in success case */
 static void do_free_init(struct work_struct *w);
@@ -310,6 +321,47 @@  int unregister_module_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_module_notifier);
 
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+
+static int try_add_tainted_module(struct module *mod)
+{
+	struct mod_unload_taint *mod_taint;
+
+	module_assert_mutex_or_preempt();
+
+	if (tainted_list_max_count >= 0 && mod->taints) {
+		if (!tainted_list_max_count &&
+			tainted_list_count >= tainted_list_max_count) {
+			pr_warn_once("%s: limit reached on the unloaded tainted modules list (count: %d).\n",
+				     mod->name, tainted_list_count);
+			goto out;
+		}
+
+		mod_taint = kmalloc(sizeof(*mod_taint), GFP_KERNEL);
+		if (unlikely(!mod_taint))
+			return -ENOMEM;
+		else {
+			strlcpy(mod_taint->name, mod->name,
+				MODULE_NAME_LEN);
+			mod_taint->taints = mod->taints;
+			list_add_rcu(&mod_taint->list,
+				&unloaded_tainted_modules);
+			tainted_list_count++;
+		}
+	}
+out:
+	return 0;
+}
+
+#else /* MODULE_UNLOAD_TAINT_TRACKING */
+
+static int try_add_tainted_module(struct module *mod)
+{
+	return 0;
+}
+
+#endif /* MODULE_UNLOAD_TAINT_TRACKING */
+
 /*
  * We require a truly strong try_module_get(): 0 means success.
  * Otherwise an error is returned due to ongoing or failed
@@ -579,6 +631,24 @@  struct module *find_module(const char *name)
 {
 	return find_module_all(name, strlen(name), false);
 }
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+struct mod_unload_taint *find_mod_unload_taint(const char *name, size_t len,
+					    unsigned long taints)
+{
+	struct mod_unload_taint *mod_taint;
+
+	module_assert_mutex_or_preempt();
+
+	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules, list,
+				lockdep_is_held(&module_mutex)) {
+		if (strlen(mod_taint->name) == len && !memcmp(mod_taint->name,
+			name, len) && mod_taint->taints & taints) {
+			return mod_taint;
+		}
+	}
+	return NULL;
+}
+#endif
 
 #ifdef CONFIG_SMP
 
@@ -1121,13 +1190,13 @@  static inline int module_unload_init(struct module *mod)
 }
 #endif /* CONFIG_MODULE_UNLOAD */
 
-static size_t module_flags_taint(struct module *mod, char *buf)
+static size_t module_flags_taint(unsigned long taints, char *buf)
 {
 	size_t l = 0;
 	int i;
 
 	for (i = 0; i < TAINT_FLAGS_COUNT; i++) {
-		if (taint_flags[i].module && test_bit(i, &mod->taints))
+		if (taint_flags[i].module && test_bit(i, &taints))
 			buf[l++] = taint_flags[i].c_true;
 	}
 
@@ -1194,7 +1263,7 @@  static ssize_t show_taint(struct module_attribute *mattr,
 {
 	size_t l;
 
-	l = module_flags_taint(mk->mod, buffer);
+	l = module_flags_taint(mk->mod->taints, buffer);
 	buffer[l++] = '\n';
 	return l;
 }
@@ -2193,6 +2262,9 @@  static void free_module(struct module *mod)
 	module_bug_cleanup(mod);
 	/* Wait for RCU-sched synchronizing before releasing mod->list and buglist. */
 	synchronize_rcu();
+	if (try_add_tainted_module(mod))
+		pr_err("%s: adding tainted module to the unloaded tainted modules list failed.\n",
+			 mod->name);
 	mutex_unlock(&module_mutex);
 
 	/* Clean up CFI for the module. */
@@ -3670,6 +3742,9 @@  static noinline int do_init_module(struct module *mod)
 {
 	int ret = 0;
 	struct mod_initfree *freeinit;
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+	struct mod_unload_taint *old;
+#endif
 
 	freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
 	if (!freeinit) {
@@ -3703,6 +3778,16 @@  static noinline int do_init_module(struct module *mod)
 	mod->state = MODULE_STATE_LIVE;
 	blocking_notifier_call_chain(&module_notify_list,
 				     MODULE_STATE_LIVE, mod);
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+	mutex_lock(&module_mutex);
+	old = find_mod_unload_taint(mod->name, strlen(mod->name),
+				mod->taints);
+	if (old) {
+		list_del_rcu(&old->list);
+		synchronize_rcu();
+	}
+	mutex_unlock(&module_mutex);
+#endif
 
 	/* Delay uevent until module has finished its init routine */
 	kobject_uevent(&mod->mkobj.kobj, KOBJ_ADD);
@@ -4511,7 +4596,7 @@  static char *module_flags(struct module *mod, char *buf)
 	    mod->state == MODULE_STATE_GOING ||
 	    mod->state == MODULE_STATE_COMING) {
 		buf[bx++] = '(';
-		bx += module_flags_taint(mod, buf + bx);
+		bx += module_flags_taint(mod->taints, buf + bx);
 		/* Show a - for module-is-being-unloaded */
 		if (mod->state == MODULE_STATE_GOING)
 			buf[bx++] = '-';
@@ -4735,6 +4820,10 @@  void print_modules(void)
 {
 	struct module *mod;
 	char buf[MODULE_FLAGS_BUF_SIZE];
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+	struct mod_unload_taint *mod_taint;
+	size_t l;
+#endif
 
 	printk(KERN_DEFAULT "Modules linked in:");
 	/* Most callers should already have preempt disabled, but make sure */
@@ -4744,6 +4833,15 @@  void print_modules(void)
 			continue;
 		pr_cont(" %s%s", mod->name, module_flags(mod, buf));
 	}
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+	printk(KERN_DEFAULT "\nUnloaded tainted modules:");
+	list_for_each_entry_rcu(mod_taint, &unloaded_tainted_modules,
+				list) {
+		l = module_flags_taint(mod_taint->taints, buf);
+		buf[l++] = '\0';
+		pr_cont(" %s(%s)", mod_taint->name, buf);
+	}
+#endif
 	preempt_enable();
 	if (last_unloaded_module[0])
 		pr_cont(" [last unloaded: %s]", last_unloaded_module);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 272f4a272f8c..290ffaa5b553 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2078,6 +2078,16 @@  static struct ctl_table kern_table[] = {
 		.extra1		= SYSCTL_ONE,
 		.extra2		= SYSCTL_ONE,
 	},
+#ifdef CONFIG_MODULE_UNLOAD_TAINT_TRACKING
+	{
+		.procname	= "tainted_list_max_count",
+		.data		= &tainted_list_max_count,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &neg_one,
+	},
+#endif
 #endif
 #ifdef CONFIG_UEVENT_HELPER
 	{