modules: wait do_free_init correctly

Message ID 20231219141231.2218215-1-changbin.du@huawei.com (mailing list archive)
State New

Commit Message

Changbin Du Dec. 19, 2023, 2:12 p.m. UTC
The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
do_free_init() into a global workqueue instead of call_rcu(). So now
we should wait for it via flush_work().

Fixes: 1a7b7d922081 ("modules: Use vmalloc special flag")
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Cc: Xiaoyi Su <suxiaoyi@huawei.com>
---
 include/linux/moduleloader.h | 2 ++
 init/main.c                  | 5 +++--
 kernel/module/main.c         | 5 +++++
 3 files changed, 10 insertions(+), 2 deletions(-)
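
As a rough illustration of the deferred-free pattern the commit message
describes, here is a simplified, single-threaded userspace model. It is a
sketch under stated assumptions, not kernel code: the names
initfree_model, queue_init_free(), do_free_init_model() and
flush_module_init_free_work_model() are hypothetical, and the kernel version
runs the free from a workqueue rather than inline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One pending "module init region" waiting to be freed (hypothetical model type). */
struct initfree_model {
	struct initfree_model *next;
	void *init_text;
};

static struct initfree_model *pending; /* stands in for the kernel's pending list */
static int work_queued;                /* stands in for a scheduled work item */
static int freed_count;                /* bookkeeping for the demo only */

/* Model of the work function: detach the whole list, then free each entry. */
static void do_free_init_model(void)
{
	struct initfree_model *list = pending;

	pending = NULL;
	work_queued = 0;
	while (list) {
		struct initfree_model *next = list->next;

		free(list->init_text);
		free(list);
		freed_count++;
		list = next;
	}
}

/* Model of module load completion: defer freeing the init region. */
static int queue_init_free(size_t size)
{
	struct initfree_model *f = malloc(sizeof(*f));

	if (!f)
		return -1;
	f->init_text = malloc(size);
	if (!f->init_text) {
		free(f);
		return -1;
	}
	f->next = pending;
	pending = f;
	work_queued = 1;
	return 0;
}

/* Model of the flush: run any queued work to completion before returning. */
static void flush_module_init_free_work_model(void)
{
	if (work_queued)
		do_free_init_model();
	assert(pending == NULL); /* nothing may remain deferred after a flush */
}
```

The point of the model is that a caller which must not observe stale init
mappings only has to wait for this one queue to drain, rather than for every
outstanding callback in the system.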

Comments

Andrew Morton Dec. 19, 2023, 8:51 p.m. UTC | #1
On Tue, 19 Dec 2023 22:12:31 +0800 Changbin Du <changbin.du@huawei.com> wrote:

> The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
> do_free_init() into a global workqueue instead of call_rcu(). So now
> we should wait for it via flush_work().

What are the runtime effects of this change?
Luis Chamberlain Dec. 19, 2023, 9:52 p.m. UTC | #2
On Tue, Dec 19, 2023 at 12:51:51PM -0800, Andrew Morton wrote:
> On Tue, 19 Dec 2023 22:12:31 +0800 Changbin Du <changbin.du@huawei.com> wrote:
> 
> > The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
> > do_free_init() into a global workqueue instead of call_rcu(). So now
> > we should wait for it via flush_work().
> 
> What are the runtime effects of this change?

Indeed that's needed given how old this culprit commit is:

git describe --contains 1a7b7d922081
v5.2-rc1~192^2~5

Who did this work and for what reason? What triggered this itch?

Is it perhaps for an out of tree driver that did something funky
on its module exit?

As per Documentation/RCU/rcubarrier.rst, rcu_barrier() will ensure the
callbacks complete, so in terms of determinism both mechanisms will
have waited for the free. It seems we're now just limiting the scope.

This could also mean initialization grew used to having RCU calls on
init complete at this point in time, even for modules, and so localizing
this wait may now also introduce other unexpected behaviour.

  Luis
Changbin Du Dec. 20, 2023, 5:27 a.m. UTC | #3
On Tue, Dec 19, 2023 at 01:52:03PM -0800, Luis Chamberlain wrote:
> On Tue, Dec 19, 2023 at 12:51:51PM -0800, Andrew Morton wrote:
> > On Tue, 19 Dec 2023 22:12:31 +0800 Changbin Du <changbin.du@huawei.com> wrote:
> > 
> > > The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
> > > do_free_init() into a global workqueue instead of call_rcu(). So now
> > > we should wait for it via flush_work().
> > 
> > What are the runtime effects of this change?
> 
> Indeed that's needed given how old this culprit commit is:
> 
> git describe --contains 1a7b7d922081
> v5.2-rc1~192^2~5
> 
> Who did this work and for what reason? What triggered this itch?
>
It seems the wait was introduced by commit ae646f0b9ca ("init: fix false positives
in W+X checking").

From what I have observed, mark_readonly() is only invoked by kernel_init(), the
thread function of the first user-mode thread, before userspace /init runs. So is
it really possible that we have loaded modules at this point?

Cc Jeffrey Hugo <jhugo@codeaurora.org>
> Is it perhaps for an out of tree driver that did something funky
> on its module exit?
> 
> As per Documentation/RCU/rcubarrier.rst, rcu_barrier() will ensure the
> callbacks complete, so in terms of determinism both mechanisms will
> have waited for the free. It seems we're now just limiting the scope.
> 
> This could also mean initialization grew used to having RCU calls on
> init complete at this point in time, even for modules, and so localizing
> this wait may now also introduce other unexpected behaviour.
> 
>   Luis
Luis Chamberlain Dec. 20, 2023, 2:32 p.m. UTC | #4
On Wed, Dec 20, 2023 at 01:27:51PM +0800, Changbin Du wrote:
> On Tue, Dec 19, 2023 at 01:52:03PM -0800, Luis Chamberlain wrote:
> > On Tue, Dec 19, 2023 at 12:51:51PM -0800, Andrew Morton wrote:
> > > On Tue, 19 Dec 2023 22:12:31 +0800 Changbin Du <changbin.du@huawei.com> wrote:
> > > 
> > > > The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
> > > > do_free_init() into a global workqueue instead of call_rcu(). So now
> > > > we should wait for it via flush_work().
> > > 
> > > What are the runtime effects of this change?
> > 
> > Indeed that's needed given how old this culprit commit is:
> > 
> > git describe --contains 1a7b7d922081
> > v5.2-rc1~192^2~5
> > 
> > Who did this work and for what reason? What triggered this itch?
> >
> It seems the wait was introduced by commit ae646f0b9ca ("init: fix false positives
> in W+X checking").
> 
> From what I have observed, mark_readonly() is only invoked by kernel_init(), the
> thread function of the first user-mode thread, before userspace /init runs. So is
> it really possible that we have loaded modules at this point?

Are you saying we don't free any module inits at all then? I asked a lot
of questions and your answers seem slim.

How did you find this?
What actual impact does this have without the patch?

The commit must document this.

  Luis
Changbin Du Dec. 21, 2023, 2:30 a.m. UTC | #5
On Wed, Dec 20, 2023 at 06:32:39AM -0800, Luis Chamberlain wrote:
> On Wed, Dec 20, 2023 at 01:27:51PM +0800, Changbin Du wrote:
> > On Tue, Dec 19, 2023 at 01:52:03PM -0800, Luis Chamberlain wrote:
> > > On Tue, Dec 19, 2023 at 12:51:51PM -0800, Andrew Morton wrote:
> > > > On Tue, 19 Dec 2023 22:12:31 +0800 Changbin Du <changbin.du@huawei.com> wrote:
> > > > 
> > > > > The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
> > > > > do_free_init() into a global workqueue instead of call_rcu(). So now
> > > > > we should wait for it via flush_work().
> > > > 
> > > > What are the runtime effects of this change?
> > > 
> > > Indeed that's needed given how old this culprit commit is:
> > > 
> > > git describe --contains 1a7b7d922081
> > > v5.2-rc1~192^2~5
> > > 
> > > Who did this work and for what reason? What triggered this itch?
> > >
> > It seems the wait was introduced by commit ae646f0b9ca ("init: fix false positives
> > in W+X checking").
> > 
> > From what I have observed, mark_readonly() is only invoked by kernel_init(), the
> > thread function of the first user-mode thread, before userspace /init runs. So is
> > it really possible that we have loaded modules at this point?
> 
> Are you saying we don't free any module inits at all then? I asked a lot
> of questions and your answers seem slim.
>
Yes, indeed no module is loaded at all before mark_readonly(), at least on my desktop.
So I think we can just delete this synchronization. I am not sure whether there are
any historical reasons for it.

> How did you find this?
> What actual impact does this have without the patch?
>
This was a coincidence. We encountered an RCU problem where the barrier takes a much
longer time to wait (that is another story). So we reviewed the code and
found this issue.

There is no functional problem without the patch. It's an unnecessary wait AFAIK,
and it does take a few cycles to wait for the RCU callbacks.

> The commit must document this.
> 
>   Luis
Changbin Du Dec. 25, 2023, 4:07 a.m. UTC | #6
On Thu, Dec 21, 2023 at 10:30:37AM +0800, Changbin Du wrote:
> On Wed, Dec 20, 2023 at 06:32:39AM -0800, Luis Chamberlain wrote:
> > On Wed, Dec 20, 2023 at 01:27:51PM +0800, Changbin Du wrote:
> > > On Tue, Dec 19, 2023 at 01:52:03PM -0800, Luis Chamberlain wrote:
> > > > On Tue, Dec 19, 2023 at 12:51:51PM -0800, Andrew Morton wrote:
> > > > > On Tue, 19 Dec 2023 22:12:31 +0800 Changbin Du <changbin.du@huawei.com> wrote:
> > > > > 
> > > > > > The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
> > > > > > do_free_init() into a global workqueue instead of call_rcu(). So now
> > > > > > we should wait for it via flush_work().
> > > > > 
> > > > > What are the runtime effects of this change?
> > > > 
> > > > Indeed that's needed given how old this culprit commit is:
> > > > 
> > > > git describe --contains 1a7b7d922081
> > > > v5.2-rc1~192^2~5
> > > > 
> > > > Who did this work and for what reason? What triggered this itch?
> > > >
> > > It seems the wait was introduced by commit ae646f0b9ca ("init: fix false positives
> > > in W+X checking").
> > > 
> > > From what I have observed, mark_readonly() is only invoked by kernel_init(), the
> > > thread function of the first user-mode thread, before userspace /init runs. So is
> > > it really possible that we have loaded modules at this point?
> > 
> > Are you saying we don't free any module inits at all then? I asked a lot
> > of questions and your answers seem slim.
> >
> Yes, indeed no module is loaded at all before mark_readonly(), at least on my desktop.
> So I think we can just delete this synchronization. I am not sure whether there are
> any historical reasons for it.
>
I thought about it again: the kernel doesn't prevent any drivers from calling
request_module() before init. So it's possible that some particular modules do
behave this way.

I will send an updated version to fix the compilation issue when CONFIG_MODULES
is not set.
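
One way the build issue without CONFIG_MODULES might be addressed is a static
inline stub next to the declaration. The following is a minimal userspace
sketch of that idea, assuming a stub approach; it is not the actual follow-up
patch, and mark_readonly_sketch() is a hypothetical stand-in for
mark_readonly() in init/main.c.

```c
/*
 * Assumption: CONFIG_MODULES is left undefined here to emulate a kernel
 * built without module support.
 */
#ifdef CONFIG_MODULES
void flush_module_init_free_work(void); /* defined in kernel/module/main.c */
#else
/* Hypothetical stub: without a module loader there is nothing to flush. */
static inline void flush_module_init_free_work(void)
{
}
#endif

/* Stand-in for mark_readonly() in init/main.c; returns 1 if it flushed. */
static int mark_readonly_sketch(int rodata_enabled)
{
	if (rodata_enabled) {
		flush_module_init_free_work();
		return 1;
	}
	return 0;
}
```

With a stub like this the caller in init/main.c would need no #ifdef of its
own, since the call compiles to nothing when modules are disabled.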

Patch

diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
index 001b2ce83832..f3d445d8ccd0 100644
--- a/include/linux/moduleloader.h
+++ b/include/linux/moduleloader.h
@@ -115,6 +115,8 @@  int module_finalize(const Elf_Ehdr *hdr,
 		    const Elf_Shdr *sechdrs,
 		    struct module *mod);
 
+void flush_module_init_free_work(void);
+
 /* Any cleanup needed when module leaves. */
 void module_arch_cleanup(struct module *mod);
 
diff --git a/init/main.c b/init/main.c
index e24b0780fdff..f0b7e21ac67f 100644
--- a/init/main.c
+++ b/init/main.c
@@ -99,6 +99,7 @@ 
 #include <linux/init_syscalls.h>
 #include <linux/stackdepot.h>
 #include <linux/randomize_kstack.h>
+#include <linux/moduleloader.h>
 #include <net/net_namespace.h>
 
 #include <asm/io.h>
@@ -1402,11 +1403,11 @@  static void mark_readonly(void)
 	if (rodata_enabled) {
 		/*
 		 * load_module() results in W+X mappings, which are cleaned
-		 * up with call_rcu().  Let's make sure that queued work is
+		 * up with init_free_wq. Let's make sure that queued work is
 		 * flushed so that we don't hit false positives looking for
 		 * insecure pages which are W+X.
 		 */
-		rcu_barrier();
+		flush_module_init_free_work();
 		mark_rodata_ro();
 		rodata_test();
 	} else
diff --git a/kernel/module/main.c b/kernel/module/main.c
index 98fedfdb8db5..1943ccb7414f 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2486,6 +2486,11 @@  static void do_free_init(struct work_struct *w)
 	}
 }
 
+void flush_module_init_free_work(void)
+{
+	flush_work(&init_free_wq);
+}
+
 #undef MODULE_PARAM_PREFIX
 #define MODULE_PARAM_PREFIX "module."
 /* Default value for module->async_probe_requested */