Message ID | 20240731133601.3073-1-will@kernel.org (mailing list archive)
---|---
State | New, archived |
Series | [RFC] arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
On Wed, Jul 31, 2024 at 02:36:01PM +0100, Will Deacon wrote:
> Although the Arm architecture permits concurrent modification and
> execution of NOP and branch instructions, it still requires some
> synchronisation to ensure that other CPUs consistently execute the newly
> written instruction:
>
> > When the modified instructions are observable, each PE that is
> > executing the modified instructions must execute an ISB or perform a
> > context synchronizing event to ensure execution of the modified
> > instructions
>
> Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
> patching jump labels"), the arm64 jump_label patching machinery
> performed synchronisation using stop_machine() after each modification,
> however this was problematic when flipping static keys from atomic
> contexts (namely, the arm_arch_timer CPU hotplug startup notifier) and
> so we switched to the _nosync() patching routines to avoid "scheduling
> while atomic" BUG()s during boot.
>
> In hindsight, the analysis of the issue in f6cc0c501649 isn't quite
> right: it cites the use of IPIs in the default patching routines as the
> cause of the lockup, whereas stop_machine() does not rely on IPIs and
> the I-cache invalidation is performed using __flush_icache_range(),
> which elides the call to kick_all_cpus_sync(). In fact, the blocking
> wait for other CPUs is what triggers the BUG() and the problem remains
> even after f6cc0c501649, for example because we could block on the
> jump_label_mutex. Eventually, the arm_arch_timer driver was fixed to
> avoid the static key entirely in commit a862fc2254bd
> ("clocksource/arm_arch_timer: Remove use of workaround static key").
>
> This all leaves the jump_label patching code in a funny situation on
> arm64 as we do not synchronise with other CPUs to reduce the likelihood
> of a bug which no longer exists. Consequently, toggling a static key on
> one CPU cannot be assumed to take effect on other CPUs, leading to
> potential issues, for example with missing preempt notifiers.
>
> Rather than revert f6cc0c501649 and go back to stop_machine() for each
> patch site, implement arch_jump_label_transform_apply() and kick all
> the other CPUs with an IPI at the end of patching.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
> Signed-off-by: Will Deacon <will@kernel.org>

We need to keep an eye out so that this patch is not picked up for 4.19
(which contains the commit being fixed), since 4.19 has neither the
arm_arch_timer fix nor batch jump label support. LTS 5.4 is fine,
though: it has both. If we want this in -stable, we can explicitly
mention the version in the Cc line.

The patch looks good to me.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
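For reference, the core code consumes the two batch hooks roughly as
follows when HAVE_JUMP_LABEL_BATCH is defined. This is a simplified
sketch of the update loop in kernel/jump_label.c (locking, module
handling and other details elided), not part of the patch itself:

static void __jump_label_update(struct static_key *key,
				struct jump_entry *entry,
				struct jump_entry *stop,
				bool init)
{
	for (; (entry < stop) && (jump_entry_key(entry) == key); entry++) {
		if (!jump_label_can_update(entry, init))
			continue;

		/* Queue this site; if the arch queue is full, flush and retry. */
		if (!arch_jump_label_transform_queue(entry,
						     jump_label_type(entry))) {
			arch_jump_label_transform_apply();
			BUG_ON(!arch_jump_label_transform_queue(entry,
						jump_label_type(entry)));
		}
	}

	/* A single synchronisation point covers the whole batch. */
	arch_jump_label_transform_apply();
}

On arm64, the queue hook in the patch below writes each site immediately
with aarch64_insn_patch_text_nosync() and always returns true, so the
apply hook reduces to one IPI broadcast per key update.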
On Wed, 31 Jul 2024 14:36:01 +0100, Will Deacon <will@kernel.org> wrote:
>
> Although the Arm architecture permits concurrent modification and
> execution of NOP and branch instructions, it still requires some
> synchronisation to ensure that other CPUs consistently execute the newly
> written instruction:
>
> [...]
>
> ---
>  arch/arm64/include/asm/jump_label.h |  1 +
>  arch/arm64/kernel/jump_label.c      | 11 +++++++++--
>  2 files changed, 10 insertions(+), 2 deletions(-)

Thanks for the detailed archaeology! This looks pretty reasonable to me.

Reviewed-by: Marc Zyngier <maz@kernel.org>

	M.
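The synchronisation itself is modest: kick_all_cpus_sync() in
kernel/smp.c looks roughly like the sketch below. It broadcasts a no-op
IPI and waits for every other online CPU to have run it; the exception
return from the IPI handler acts as the context synchronizing event the
architecture requires on each remote PE:

#include <linux/smp.h>

static void do_nothing(void *unused)
{
}

void kick_all_cpus_sync(void)
{
	/* Make sure the patched instructions are visible before we kick. */
	smp_mb();

	/* Run a no-op on every other online CPU and wait for completion. */
	smp_call_function(do_nothing, NULL, 1);
}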
On Wed, 31 Jul 2024 14:36:01 +0100, Will Deacon wrote:
> Although the Arm architecture permits concurrent modification and
> execution of NOP and branch instructions, it still requires some
> synchronisation to ensure that other CPUs consistently execute the newly
> written instruction:
>
> > When the modified instructions are observable, each PE that is
> > executing the modified instructions must execute an ISB or perform a
> > context synchronizing event to ensure execution of the modified
> > instructions
>
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
      https://git.kernel.org/arm64/c/cfb00a357864
diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
index 4e753908b801..a0a5bbae7229 100644
--- a/arch/arm64/include/asm/jump_label.h
+++ b/arch/arm64/include/asm/jump_label.h
@@ -13,6 +13,7 @@
 #include <linux/types.h>
 #include <asm/insn.h>
 
+#define HAVE_JUMP_LABEL_BATCH
 #define JUMP_LABEL_NOP_SIZE		AARCH64_INSN_SIZE
 
 #define JUMP_TABLE_ENTRY(key, label)	\
diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
index faf88ec9c48e..f63ea915d6ad 100644
--- a/arch/arm64/kernel/jump_label.c
+++ b/arch/arm64/kernel/jump_label.c
@@ -7,11 +7,12 @@
  */
 #include <linux/kernel.h>
 #include <linux/jump_label.h>
+#include <linux/smp.h>
 #include <asm/insn.h>
 #include <asm/patching.h>
 
-void arch_jump_label_transform(struct jump_entry *entry,
-			       enum jump_label_type type)
+bool arch_jump_label_transform_queue(struct jump_entry *entry,
+				     enum jump_label_type type)
 {
 	void *addr = (void *)jump_entry_code(entry);
 	u32 insn;
@@ -25,4 +26,10 @@ void arch_jump_label_transform(struct jump_entry *entry,
 	}
 
 	aarch64_insn_patch_text_nosync(addr, insn);
+	return true;
+}
+
+void arch_jump_label_transform_apply(void)
+{
+	kick_all_cpus_sync();
 }
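To illustrate the guarantee this restores for static key users, here is
a hypothetical sketch; my_feature_key, my_fast_path() and
do_feature_work() are illustrative names, not part of the patch:

#include <linux/jump_label.h>

static DEFINE_STATIC_KEY_FALSE(my_feature_key);	/* illustrative key */

extern void do_feature_work(void);			/* illustrative helper */

void my_fast_path(void)
{
	/* Compiled as a NOP until the key is enabled. */
	if (static_branch_unlikely(&my_feature_key))
		do_feature_work();
}

void my_enable_feature(void)
{
	/*
	 * Patches every my_feature_key branch site. With this fix, the
	 * final arch_jump_label_transform_apply() broadcasts an IPI via
	 * kick_all_cpus_sync(), so once this call returns, every other
	 * CPU has context-synchronised and will take the new branch.
	 */
	static_branch_enable(&my_feature_key);
}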