
[v4] RISC-V: Don't check text_mutex during stop_machine

Message ID 20230301223853.1444332-1-conor@kernel.org (mailing list archive)
State Superseded
Series [v4] RISC-V: Don't check text_mutex during stop_machine

Checks

Context Check Description
conchuod/cover_letter success Single patches do not need cover letters
conchuod/tree_selection success Guessed tree name to be fixes
conchuod/fixes_present success Fixes tag present in non-next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 13 and now 13
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 585 this patch: 585
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 5737 this patch: 5737
conchuod/alphanumeric_selects success Out of order selects before the patch: 730 and now 730
conchuod/build_rv32_defconfig fail Build failed
conchuod/dtb_warn_rv64 success Errors and warnings before: 2 this patch: 2
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch warning CHECK: Consider using #include <linux/ftrace.h> instead of <asm/ftrace.h>
conchuod/source_inline success Was 0 now: 0
conchuod/build_rv64_nommu_k210_defconfig fail Build failed
conchuod/verify_fixes success Fixes tag looks correct
conchuod/build_rv64_nommu_virt_defconfig fail Build failed

Commit Message

Conor Dooley March 1, 2023, 10:38 p.m. UTC
From: Palmer Dabbelt <palmerdabbelt@google.com>

We're currently using stop_machine() to update ftrace & kprobes, which
means that the thread that takes text_mutex may not be the same as the
thread that eventually patches the code.  This isn't actually a race
because the lock is still held (preventing any other concurrent
accesses) and there is only one thread running during stop_machine(),
but it does trigger a lockdep failure.

This patch just elides the lockdep check during stop_machine().
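
A minimal sketch (illustrative, not code from this patch) of the
sequence that trips lockdep:

	mutex_lock(&text_mutex);	/* taken by the thread driving the update */
	stop_machine(patch_text_cb, &patch, cpu_online_mask);
	/*
	 * patch_text_cb() runs in the cpu_stopper kthreads, so its
	 * lockdep_assert_held(&text_mutex) sees the mutex held by a
	 * different task and complains, even though the access is still
	 * serialised by the lock.
	 */
	mutex_unlock(&text_mutex);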

Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support")
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
Changes since v3 [<20230215164317.727657-1-conor@kernel.org>]:
* rename the flag to riscv_patch_in_stop_machine as it is being used for
  kprobes & ftrace, and just looked a bit odd.
* implement Changbin's suggestion of checking that the lock is held in
  patch_text(), rather than setting the flag in callers of patch_text().

Changes since v2 [<20220322022331.32136-1-palmer@rivosinc.com>]:
* rebase on riscv/for-next as it has been a year.
* incorporate Changbin's suggestion that init_nop should take the lock
  rather than call prepare() & post_process().

Changes since v1 [<20210506071041.417854-1-palmer@dabbelt.com>]:
* Use ftrace_arch_code_modify_{prepare,post_process}() to set the flag.
  I remember having a reason I wanted the function when I wrote the v1,
  but it's been almost a year and I forget what that was -- maybe I was
  just crazy, the patch was sent at midnight.
* Fix DYNAMIC_FTRACE=n builds.
---
 arch/riscv/include/asm/ftrace.h |  7 +++++++
 arch/riscv/kernel/ftrace.c      | 15 +++++++++++++--
 arch/riscv/kernel/patch.c       | 26 +++++++++++++++++++++++---
 3 files changed, 43 insertions(+), 5 deletions(-)

Comments

Changbin Du March 2, 2023, 3:34 a.m. UTC | #1
Reviewed-by: Changbin Du <changbin.du@huawei.com>

On Wed, Mar 01, 2023 at 10:38:53PM +0000, Conor Dooley wrote:
> [full patch quoted; trimmed]

Conor Dooley March 3, 2023, 12:21 a.m. UTC | #2
On Wed, Mar 01, 2023 at 10:38:53PM +0000, Conor Dooley wrote:
> [commit message and changelog quoted; trimmed]

> @@ -121,13 +129,25 @@ NOKPROBE_SYMBOL(patch_text_cb);
>  
>  int patch_text(void *addr, u32 insn)
>  {
> +	int ret;
>  	struct patch_insn patch = {
>  		.addr = addr,
>  		.insn = insn,
>  		.cpu_count = ATOMIC_INIT(0),
>  	};
>  
> -	return stop_machine_cpuslocked(patch_text_cb,
> -				       &patch, cpu_online_mask);
> +	/*
> +	 * kprobes takes text_mutex before calling patch_text(), but as we
> +	 * then call stop_machine(), the lockdep assertion in patch_insn_write()
> +	 * gets confused by the context in which the lock is taken.
> +	 * Instead, ensure the lock is held before calling stop_machine(), and
> +	 * set riscv_patch_in_stop_machine to skip the check in
> +	 * patch_insn_write().
> +	 */
> +	lockdep_assert_held(&text_mutex);
> +	riscv_patch_in_stop_machine = true;
> +	ret = stop_machine_cpuslocked(patch_text_cb, &patch, cpu_online_mask);
> +	riscv_patch_in_stop_machine = false;
> +	return ret;

*sigh*, automation reports that this is not possible:
../arch/riscv/kernel/patch.c:153:30: error: expression is not assignable
        riscv_patch_in_stop_machine = true;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
../arch/riscv/kernel/patch.c:155:30: error: expression is not assignable
        riscv_patch_in_stop_machine = false;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^

The rest of the patch only touches this variable from ftrace code, but
as this hunk is in patch.c it gets compiled in whether or not we have
ftrace. On the rv32 & nommu defconfigs, where DYNAMIC_FTRACE=n reduces
riscv_patch_in_stop_machine to a #define of 0, that results in a
build failure:
https://patchwork.kernel.org/project/linux-riscv/patch/20230301223853.1444332-1-conor@kernel.org/

A respin is in order...
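
To spell the failure out (a sketch, using the hunks above): with
DYNAMIC_FTRACE=n, asm/ftrace.h provides

	#define riscv_patch_in_stop_machine 0

so the new assignments in patch_text() preprocess to

	0 = true;	/* "error: expression is not assignable" */
	0 = false;

which is exactly what clang rejects.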

Patch

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 9e73922e1e2e..41e0f4aa5243 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -107,8 +107,15 @@  do {									\
 struct dyn_ftrace;
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 #define ftrace_init_nop ftrace_init_nop
+extern int riscv_patch_in_stop_machine;
 #endif
 
+#else /* CONFIG_DYNAMIC_FTRACE */
+
+#ifndef __ASSEMBLY__
+#define riscv_patch_in_stop_machine 0
 #endif
 
+#endif /* CONFIG_DYNAMIC_FTRACE */
+
 #endif /* _ASM_RISCV_FTRACE_H */
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 5bff37af4770..00cb8d51a0ec 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -11,14 +11,25 @@ 
 #include <asm/cacheflush.h>
 #include <asm/patch.h>
 
+int riscv_patch_in_stop_machine;
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
 {
 	mutex_lock(&text_mutex);
+
+	/*
+	 * The code sequences we use for ftrace can't be patched while the
+	 * kernel is running, so we need to use stop_machine() to modify them
+	 * for now.  This doesn't play nice with text_mutex, so we use this flag
+	 * to elide the check.
+	 */
+	riscv_patch_in_stop_machine = true;
 }
 
 void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
 {
+	riscv_patch_in_stop_machine = false;
 	mutex_unlock(&text_mutex);
 }
 
@@ -107,9 +118,9 @@  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
 {
 	int out;
 
-	ftrace_arch_code_modify_prepare();
+	mutex_lock(&text_mutex);
 	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
-	ftrace_arch_code_modify_post_process();
+	mutex_unlock(&text_mutex);
 
 	return out;
 }
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 765004b60513..eef1243f6844 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -11,6 +11,7 @@ 
 #include <asm/kprobes.h>
 #include <asm/cacheflush.h>
 #include <asm/fixmap.h>
+#include <asm/ftrace.h>
 #include <asm/patch.h>
 
 struct patch_insn {
@@ -59,8 +60,15 @@  static int patch_insn_write(void *addr, const void *insn, size_t len)
 	 * Before reaching here, it was expected to lock the text_mutex
 	 * already, so we don't need to give another lock here and could
 	 * ensure that it was safe between each cores.
+	 *
+	 * We're currently using stop_machine() for ftrace & kprobes, and while
+	 * that ensures text_mutex is held before installing the mappings it
+	 * does not ensure text_mutex is held by the calling thread.  That's
+	 * safe but triggers a lockdep failure, so just elide it for that
+	 * specific case.
 	 */
-	lockdep_assert_held(&text_mutex);
+	if (!riscv_patch_in_stop_machine)
+		lockdep_assert_held(&text_mutex);
 
 	if (across_pages)
 		patch_map(addr + len, FIX_TEXT_POKE1);
@@ -121,13 +129,25 @@  NOKPROBE_SYMBOL(patch_text_cb);
 
 int patch_text(void *addr, u32 insn)
 {
+	int ret;
 	struct patch_insn patch = {
 		.addr = addr,
 		.insn = insn,
 		.cpu_count = ATOMIC_INIT(0),
 	};
 
-	return stop_machine_cpuslocked(patch_text_cb,
-				       &patch, cpu_online_mask);
+	/*
+	 * kprobes takes text_mutex before calling patch_text(), but as we
+	 * then call stop_machine(), the lockdep assertion in patch_insn_write()
+	 * gets confused by the context in which the lock is taken.
+	 * Instead, ensure the lock is held before calling stop_machine(), and
+	 * set riscv_patch_in_stop_machine to skip the check in
+	 * patch_insn_write().
+	 */
+	lockdep_assert_held(&text_mutex);
+	riscv_patch_in_stop_machine = true;
+	ret = stop_machine_cpuslocked(patch_text_cb, &patch, cpu_online_mask);
+	riscv_patch_in_stop_machine = false;
+	return ret;
 }
 NOKPROBE_SYMBOL(patch_text);
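
For reference, the calling convention this version enforces on
patch_text() -- a sketch, not code from the series; in-tree, kprobes'
arm/disarm paths take the lock like this before patching:

	mutex_lock(&text_mutex);
	ret = patch_text(addr, insn);	/* lockdep_assert_held() is satisfied */
	mutex_unlock(&text_mutex);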