diff mbox series

[17/17] module: Prevent module removal racing with text_poke()

Message ID 20190117003259.23141-18-rick.p.edgecombe@intel.com (mailing list archive)
State New, archived
Headers show
Series Merge text_poke fixes and executable lockdowns | expand

Commit Message

Edgecombe, Rick P Jan. 17, 2019, 12:32 a.m. UTC
From: Nadav Amit <namit@vmware.com>

It seems dangerous to allow code modifications to take place
concurrently with module unloading. So take the text_mutex while the
memory of the module is freed.

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 kernel/module.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Masami Hiramatsu (Google) Jan. 17, 2019, 7:54 a.m. UTC | #1
On Wed, 16 Jan 2019 16:32:59 -0800
Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:

> From: Nadav Amit <namit@vmware.com>
> 
> It seems dangerous to allow code modifications to take place
> concurrently with module unloading. So take the text_mutex while the
> memory of the module is freed.

At that point, since the module itself is removed from module list,
it seems no actual harm. Or would you have any concern?

Thank you,

> 
> Signed-off-by: Nadav Amit <namit@vmware.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  kernel/module.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/kernel/module.c b/kernel/module.c
> index 1af5c8e19086..90cfc4988d98 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -64,6 +64,7 @@
>  #include <linux/bsearch.h>
>  #include <linux/dynamic_debug.h>
>  #include <linux/audit.h>
> +#include <linux/memory.h>
>  #include <uapi/linux/module.h>
>  #include "module-internal.h"
>  
> @@ -2157,6 +2158,9 @@ static void free_module(struct module *mod)
>  	synchronize_rcu();
>  	mutex_unlock(&module_mutex);
>  
> +	/* Protect against patching of the module while it is being removed */
> +	mutex_lock(&text_mutex);
> +
>  	/* This may be empty, but that's OK */
>  	module_arch_freeing_init(mod);
>  	module_memfree(mod->init_layout.base);
> @@ -2168,6 +2172,7 @@ static void free_module(struct module *mod)
>  
>  	/* Finally, free the core (containing the module structure) */
>  	module_memfree(mod->core_layout.base);
> +	mutex_unlock(&text_mutex);
>  }
>  
>  void *__symbol_get(const char *symbol)
> -- 
> 2.17.1
>
Nadav Amit Jan. 17, 2019, 6:07 p.m. UTC | #2
> On Jan 16, 2019, at 11:54 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> On Wed, 16 Jan 2019 16:32:59 -0800
> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
> 
>> From: Nadav Amit <namit@vmware.com>
>> 
>> It seems dangerous to allow code modifications to take place
>> concurrently with module unloading. So take the text_mutex while the
>> memory of the module is freed.
> 
> At that point, since the module itself is removed from module list,
> it seems no actual harm. Or would you have any concern?

So it appears that you are right and all the users of text_poke() and
text_poke_bp() do install module notifiers, and remove the module from their
internal data structure when they are done (*). As long as they prevent
text_poke*() to be called concurrently (e.g., using jump_label_lock()),
everything is fine.

Having said that, the question is whether you “trust” text_poke*() users to
do so. text_poke() description does not day explicitly that you need to
prevent modules from being removed.

What do you say?


(*) I am not sure about kgdb, but it probably does not matter much
H. Peter Anvin Jan. 17, 2019, 11:44 p.m. UTC | #3
On 1/17/19 10:07 AM, Nadav Amit wrote:
>> On Jan 16, 2019, at 11:54 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>>
>> On Wed, 16 Jan 2019 16:32:59 -0800
>> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
>>
>>> From: Nadav Amit <namit@vmware.com>
>>>
>>> It seems dangerous to allow code modifications to take place
>>> concurrently with module unloading. So take the text_mutex while the
>>> memory of the module is freed.
>>
>> At that point, since the module itself is removed from module list,
>> it seems no actual harm. Or would you have any concern?
> 
> So it appears that you are right and all the users of text_poke() and
> text_poke_bp() do install module notifiers, and remove the module from their
> internal data structure when they are done (*). As long as they prevent
> text_poke*() to be called concurrently (e.g., using jump_label_lock()),
> everything is fine.
> 
> Having said that, the question is whether you “trust” text_poke*() users to
> do so. text_poke() description does not day explicitly that you need to
> prevent modules from being removed.
> 
> What do you say?
> 

Please make it explicit.

	-hpa
H. Peter Anvin Jan. 17, 2019, 11:58 p.m. UTC | #4
On 1/16/19 11:54 PM, Masami Hiramatsu wrote:
> On Wed, 16 Jan 2019 16:32:59 -0800
> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
> 
>> From: Nadav Amit <namit@vmware.com>
>>
>> It seems dangerous to allow code modifications to take place
>> concurrently with module unloading. So take the text_mutex while the
>> memory of the module is freed.
> 
> At that point, since the module itself is removed from module list,
> it seems no actual harm. Or would you have any concern?
> 

The issue isn't the module list, but rather when it is safe to free the
contents, so we don't clobber anything. We absolutely need to enforce
that we can't text_poke() something that might have already been freed.

That being said, we *also* really would prefer to enforce that we can't
text_poke() memory that doesn't actually contain code; as far as I can
tell we don't currently do that check.

This, again, is a good use for a separate mm context. We can enforce
that that context will only ever contain valid page mappings for actual
code pages.

(Note: in my proposed algorithm, with a separate mm, replace INVLPG with
switching CR3 if we have to do a rollback or roll forward in the
breakpoint handler.)

	-hpa
Nadav Amit Jan. 18, 2019, 1:15 a.m. UTC | #5
> On Jan 17, 2019, at 3:58 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> 
> On 1/16/19 11:54 PM, Masami Hiramatsu wrote:
>> On Wed, 16 Jan 2019 16:32:59 -0800
>> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
>> 
>>> From: Nadav Amit <namit@vmware.com>
>>> 
>>> It seems dangerous to allow code modifications to take place
>>> concurrently with module unloading. So take the text_mutex while the
>>> memory of the module is freed.
>> 
>> At that point, since the module itself is removed from module list,
>> it seems no actual harm. Or would you have any concern?
> 
> The issue isn't the module list, but rather when it is safe to free the
> contents, so we don't clobber anything. We absolutely need to enforce
> that we can't text_poke() something that might have already been freed.
> 
> That being said, we *also* really would prefer to enforce that we can't
> text_poke() memory that doesn't actually contain code; as far as I can
> tell we don't currently do that check.

Yes, that what the mutex was supposed to achieve. It’s not supposed just
to check whether it is a code page, but also that it is the same code
page that you wanted to patch. 

> This, again, is a good use for a separate mm context. We can enforce
> that that context will only ever contain valid page mappings for actual
> code pages.

This will not tell you that you have the *right* code-page. The module
notifiers help to do so, since they synchronize the text poking with
the module removal.

> (Note: in my proposed algorithm, with a separate mm, replace INVLPG with
> switching CR3 if we have to do a rollback or roll forward in the
> breakpoint handler.)

I really need to read your patches more carefully to see what you mean.

Anyhow, so what do you prefer? I’m ok with either one:
	1. Keep this patch
	2. Remove this patch and change into a comment on text_poke()
	3. Just drop the patch
Masami Hiramatsu (Google) Jan. 18, 2019, 8:23 a.m. UTC | #6
On Thu, 17 Jan 2019 18:07:03 +0000
Nadav Amit <namit@vmware.com> wrote:

> > On Jan 16, 2019, at 11:54 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > 
> > On Wed, 16 Jan 2019 16:32:59 -0800
> > Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
> > 
> >> From: Nadav Amit <namit@vmware.com>
> >> 
> >> It seems dangerous to allow code modifications to take place
> >> concurrently with module unloading. So take the text_mutex while the
> >> memory of the module is freed.
> > 
> > At that point, since the module itself is removed from module list,
> > it seems no actual harm. Or would you have any concern?
> 
> So it appears that you are right and all the users of text_poke() and
> text_poke_bp() do install module notifiers, and remove the module from their
> internal data structure when they are done (*). As long as they prevent
> text_poke*() to be called concurrently (e.g., using jump_label_lock()),
> everything is fine.
> 
> Having said that, the question is whether you “trust” text_poke*() users to
> do so. text_poke() description does not day explicitly that you need to
> prevent modules from being removed.
> 
> What do you say?

I agreed, but in that case, this is just a fool proof. I think we should
prevent this kind of bug by review, and should comment it on text_poke(),
instead of locking text_mutex.

What I thought was even if we take text_mutex here, such user can modify
the (released) module code right after we exit this section.

Maybe we'd better make text_poke() more smart?

> (*) I am not sure about kgdb, but it probably does not matter much

I think we don't need to care about kgdb. It is a tool which should be able
to shoot your feet and we can not prevent it. Only expert can avoid it. :)

Thank you,
Masami Hiramatsu (Google) Jan. 18, 2019, 1:32 p.m. UTC | #7
On Thu, 17 Jan 2019 17:15:27 -0800
Nadav Amit <nadav.amit@gmail.com> wrote:

> > On Jan 17, 2019, at 3:58 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> > 
> > On 1/16/19 11:54 PM, Masami Hiramatsu wrote:
> >> On Wed, 16 Jan 2019 16:32:59 -0800
> >> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
> >> 
> >>> From: Nadav Amit <namit@vmware.com>
> >>> 
> >>> It seems dangerous to allow code modifications to take place
> >>> concurrently with module unloading. So take the text_mutex while the
> >>> memory of the module is freed.
> >> 
> >> At that point, since the module itself is removed from module list,
> >> it seems no actual harm. Or would you have any concern?
> > 
> > The issue isn't the module list, but rather when it is safe to free the
> > contents, so we don't clobber anything. We absolutely need to enforce
> > that we can't text_poke() something that might have already been freed.
> > 
> > That being said, we *also* really would prefer to enforce that we can't
> > text_poke() memory that doesn't actually contain code; as far as I can
> > tell we don't currently do that check.
> 
> Yes, that what the mutex was supposed to achieve. It’s not supposed just
> to check whether it is a code page, but also that it is the same code
> page that you wanted to patch. 
> 
> > This, again, is a good use for a separate mm context. We can enforce
> > that that context will only ever contain valid page mappings for actual
> > code pages.
> 
> This will not tell you that you have the *right* code-page. The module
> notifiers help to do so, since they synchronize the text poking with
> the module removal.
> 
> > (Note: in my proposed algorithm, with a separate mm, replace INVLPG with
> > switching CR3 if we have to do a rollback or roll forward in the
> > breakpoint handler.)
> 
> I really need to read your patches more carefully to see what you mean.
> 
> Anyhow, so what do you prefer? I’m ok with either one:
> 	1. Keep this patch
> 	2. Remove this patch and change into a comment on text_poke()
> 	3. Just drop the patch

I would prefer 2. so at least we should add a comment to text_poke().

Thank you,
diff mbox series

Patch

diff --git a/kernel/module.c b/kernel/module.c
index 1af5c8e19086..90cfc4988d98 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -64,6 +64,7 @@ 
 #include <linux/bsearch.h>
 #include <linux/dynamic_debug.h>
 #include <linux/audit.h>
+#include <linux/memory.h>
 #include <uapi/linux/module.h>
 #include "module-internal.h"
 
@@ -2157,6 +2158,9 @@  static void free_module(struct module *mod)
 	synchronize_rcu();
 	mutex_unlock(&module_mutex);
 
+	/* Protect against patching of the module while it is being removed */
+	mutex_lock(&text_mutex);
+
 	/* This may be empty, but that's OK */
 	module_arch_freeing_init(mod);
 	module_memfree(mod->init_layout.base);
@@ -2168,6 +2172,7 @@  static void free_module(struct module *mod)
 
 	/* Finally, free the core (containing the module structure) */
 	module_memfree(mod->core_layout.base);
+	mutex_unlock(&text_mutex);
 }
 
 void *__symbol_get(const char *symbol)