Message ID | 20190117003259.23141-18-rick.p.edgecombe@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Merge text_poke fixes and executable lockdowns | expand |
On Wed, 16 Jan 2019 16:32:59 -0800 Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: > From: Nadav Amit <namit@vmware.com> > > It seems dangerous to allow code modifications to take place > concurrently with module unloading. So take the text_mutex while the > memory of the module is freed. At that point, since the module itself is removed from module list, it seems no actual harm. Or would you have any concern? Thank you, > > Signed-off-by: Nadav Amit <namit@vmware.com> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> > --- > kernel/module.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/kernel/module.c b/kernel/module.c > index 1af5c8e19086..90cfc4988d98 100644 > --- a/kernel/module.c > +++ b/kernel/module.c > @@ -64,6 +64,7 @@ > #include <linux/bsearch.h> > #include <linux/dynamic_debug.h> > #include <linux/audit.h> > +#include <linux/memory.h> > #include <uapi/linux/module.h> > #include "module-internal.h" > > @@ -2157,6 +2158,9 @@ static void free_module(struct module *mod) > synchronize_rcu(); > mutex_unlock(&module_mutex); > > + /* Protect against patching of the module while it is being removed */ > + mutex_lock(&text_mutex); > + > /* This may be empty, but that's OK */ > module_arch_freeing_init(mod); > module_memfree(mod->init_layout.base); > @@ -2168,6 +2172,7 @@ static void free_module(struct module *mod) > > /* Finally, free the core (containing the module structure) */ > module_memfree(mod->core_layout.base); > + mutex_unlock(&text_mutex); > } > > void *__symbol_get(const char *symbol) > -- > 2.17.1 >
> On Jan 16, 2019, at 11:54 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > On Wed, 16 Jan 2019 16:32:59 -0800 > Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: > >> From: Nadav Amit <namit@vmware.com> >> >> It seems dangerous to allow code modifications to take place >> concurrently with module unloading. So take the text_mutex while the >> memory of the module is freed. > > At that point, since the module itself is removed from module list, > it seems no actual harm. Or would you have any concern? So it appears that you are right and all the users of text_poke() and text_poke_bp() do install module notifiers, and remove the module from their internal data structure when they are done (*). As long as they prevent text_poke*() to be called concurrently (e.g., using jump_label_lock()), everything is fine. Having said that, the question is whether you “trust” text_poke*() users to do so. text_poke() description does not day explicitly that you need to prevent modules from being removed. What do you say? (*) I am not sure about kgdb, but it probably does not matter much
On 1/17/19 10:07 AM, Nadav Amit wrote: >> On Jan 16, 2019, at 11:54 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote: >> >> On Wed, 16 Jan 2019 16:32:59 -0800 >> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: >> >>> From: Nadav Amit <namit@vmware.com> >>> >>> It seems dangerous to allow code modifications to take place >>> concurrently with module unloading. So take the text_mutex while the >>> memory of the module is freed. >> >> At that point, since the module itself is removed from module list, >> it seems no actual harm. Or would you have any concern? > > So it appears that you are right and all the users of text_poke() and > text_poke_bp() do install module notifiers, and remove the module from their > internal data structure when they are done (*). As long as they prevent > text_poke*() to be called concurrently (e.g., using jump_label_lock()), > everything is fine. > > Having said that, the question is whether you “trust” text_poke*() users to > do so. text_poke() description does not day explicitly that you need to > prevent modules from being removed. > > What do you say? > Please make it explicit. -hpa
On 1/16/19 11:54 PM, Masami Hiramatsu wrote: > On Wed, 16 Jan 2019 16:32:59 -0800 > Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: > >> From: Nadav Amit <namit@vmware.com> >> >> It seems dangerous to allow code modifications to take place >> concurrently with module unloading. So take the text_mutex while the >> memory of the module is freed. > > At that point, since the module itself is removed from module list, > it seems no actual harm. Or would you have any concern? > The issue isn't the module list, but rather when it is safe to free the contents, so we don't clobber anything. We absolutely need to enforce that we can't text_poke() something that might have already been freed. That being said, we *also* really would prefer to enforce that we can't text_poke() memory that doesn't actually contain code; as far as I can tell we don't currently do that check. This, again, is a good use for a separate mm context. We can enforce that that context will only ever contain valid page mappings for actual code pages. (Note: in my proposed algorithm, with a separate mm, replace INVLPG with switching CR3 if we have to do a rollback or roll forward in the breakpoint handler.) -hpa
> On Jan 17, 2019, at 3:58 PM, H. Peter Anvin <hpa@zytor.com> wrote: > > On 1/16/19 11:54 PM, Masami Hiramatsu wrote: >> On Wed, 16 Jan 2019 16:32:59 -0800 >> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: >> >>> From: Nadav Amit <namit@vmware.com> >>> >>> It seems dangerous to allow code modifications to take place >>> concurrently with module unloading. So take the text_mutex while the >>> memory of the module is freed. >> >> At that point, since the module itself is removed from module list, >> it seems no actual harm. Or would you have any concern? > > The issue isn't the module list, but rather when it is safe to free the > contents, so we don't clobber anything. We absolutely need to enforce > that we can't text_poke() something that might have already been freed. > > That being said, we *also* really would prefer to enforce that we can't > text_poke() memory that doesn't actually contain code; as far as I can > tell we don't currently do that check. Yes, that what the mutex was supposed to achieve. It’s not supposed just to check whether it is a code page, but also that it is the same code page that you wanted to patch. > This, again, is a good use for a separate mm context. We can enforce > that that context will only ever contain valid page mappings for actual > code pages. This will not tell you that you have the *right* code-page. The module notifiers help to do so, since they synchronize the text poking with the module removal. > (Note: in my proposed algorithm, with a separate mm, replace INVLPG with > switching CR3 if we have to do a rollback or roll forward in the > breakpoint handler.) I really need to read your patches more carefully to see what you mean. Anyhow, so what do you prefer? I’m ok with either one: 1. Keep this patch 2. Remove this patch and change into a comment on text_poke() 3. Just drop the patch
On Thu, 17 Jan 2019 18:07:03 +0000 Nadav Amit <namit@vmware.com> wrote: > > On Jan 16, 2019, at 11:54 PM, Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > On Wed, 16 Jan 2019 16:32:59 -0800 > > Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: > > > >> From: Nadav Amit <namit@vmware.com> > >> > >> It seems dangerous to allow code modifications to take place > >> concurrently with module unloading. So take the text_mutex while the > >> memory of the module is freed. > > > > At that point, since the module itself is removed from module list, > > it seems no actual harm. Or would you have any concern? > > So it appears that you are right and all the users of text_poke() and > text_poke_bp() do install module notifiers, and remove the module from their > internal data structure when they are done (*). As long as they prevent > text_poke*() to be called concurrently (e.g., using jump_label_lock()), > everything is fine. > > Having said that, the question is whether you “trust” text_poke*() users to > do so. text_poke() description does not day explicitly that you need to > prevent modules from being removed. > > What do you say? I agreed, but in that case, this is just a fool proof. I think we should prevent this kind of bug by review, and should comment it on text_poke(), instead of locking text_mutex. What I thought was even if we take text_mutex here, such user can modify the (released) module code right after we exit this section. Maybe we'd better make text_poke() more smart? > (*) I am not sure about kgdb, but it probably does not matter much I think we don't need to care about kgdb. It is a tool which should be able to shoot your feet and we can not prevent it. Only expert can avoid it. :) Thank you,
On Thu, 17 Jan 2019 17:15:27 -0800 Nadav Amit <nadav.amit@gmail.com> wrote: > > On Jan 17, 2019, at 3:58 PM, H. Peter Anvin <hpa@zytor.com> wrote: > > > > On 1/16/19 11:54 PM, Masami Hiramatsu wrote: > >> On Wed, 16 Jan 2019 16:32:59 -0800 > >> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote: > >> > >>> From: Nadav Amit <namit@vmware.com> > >>> > >>> It seems dangerous to allow code modifications to take place > >>> concurrently with module unloading. So take the text_mutex while the > >>> memory of the module is freed. > >> > >> At that point, since the module itself is removed from module list, > >> it seems no actual harm. Or would you have any concern? > > > > The issue isn't the module list, but rather when it is safe to free the > > contents, so we don't clobber anything. We absolutely need to enforce > > that we can't text_poke() something that might have already been freed. > > > > That being said, we *also* really would prefer to enforce that we can't > > text_poke() memory that doesn't actually contain code; as far as I can > > tell we don't currently do that check. > > Yes, that what the mutex was supposed to achieve. It’s not supposed just > to check whether it is a code page, but also that it is the same code > page that you wanted to patch. > > > This, again, is a good use for a separate mm context. We can enforce > > that that context will only ever contain valid page mappings for actual > > code pages. > > This will not tell you that you have the *right* code-page. The module > notifiers help to do so, since they synchronize the text poking with > the module removal. > > > (Note: in my proposed algorithm, with a separate mm, replace INVLPG with > > switching CR3 if we have to do a rollback or roll forward in the > > breakpoint handler.) > > I really need to read your patches more carefully to see what you mean. > > Anyhow, so what do you prefer? I’m ok with either one: > 1. Keep this patch > 2. Remove this patch and change into a comment on text_poke() > 3. Just drop the patch I would prefer 2. so at least we should add a comment to text_poke(). Thank you,
diff --git a/kernel/module.c b/kernel/module.c index 1af5c8e19086..90cfc4988d98 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -64,6 +64,7 @@ #include <linux/bsearch.h> #include <linux/dynamic_debug.h> #include <linux/audit.h> +#include <linux/memory.h> #include <uapi/linux/module.h> #include "module-internal.h" @@ -2157,6 +2158,9 @@ static void free_module(struct module *mod) synchronize_rcu(); mutex_unlock(&module_mutex); + /* Protect against patching of the module while it is being removed */ + mutex_lock(&text_mutex); + /* This may be empty, but that's OK */ module_arch_freeing_init(mod); module_memfree(mod->init_layout.base); @@ -2168,6 +2172,7 @@ static void free_module(struct module *mod) /* Finally, free the core (containing the module structure) */ module_memfree(mod->core_layout.base); + mutex_unlock(&text_mutex); } void *__symbol_get(const char *symbol)