Message ID | 1607433270-9807-1-git-send-email-kernelfans@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64/irq: use NMI to send stop IPI | expand |
On 2020-12-08 13:14, Pingfan Liu wrote: > Like other arches, both 'kexec -l' and 'kexec -p' expect to use NMI to > interrupt other cpus in order to capture cpus tightly to a known state. > > So for both of the cases, try to use NMI IPI if irqchip supports NMI. There are at least 2 other competing series doing the same thing, and (ab)using NMIs for dubious purposes (kexec, kgdb...) (1) Please work with others instead of reinventing things from scratch, with similar problems (2) No additional NMI patches will make it until we have resolved the issues described in [1], which is likely to take some time Thanks, M. [1] https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/
On Tue, Dec 8, 2020 at 10:04 PM Marc Zyngier <maz@kernel.org> wrote: > > On 2020-12-08 13:14, Pingfan Liu wrote: > > Like other arches, both 'kexec -l' and 'kexec -p' expect to use NMI to > > interrupt other cpus in order to capture cpus tightly to a known state. > > > > So for both of the cases, try to use NMI IPI if irqchip supports NMI. > > There are at least 2 other competing series doing the same thing, > and (ab)using NMIs for dubious purposes (kexec, kgdb...) > I think kexec has a good justification, but start with (2) at present. > (1) Please work with others instead of reinventing things from scratch, > with similar problems > I will. > (2) No additional NMI patches will make it until we have resolved the > issues > described in [1], which is likely to take some time > After serval days of chasing this issue, I have a raw idea about it. What about re-structure the code like DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_reschedule_ipi) ? I will send a series soon afterward. Thanks, Pingfan > Thanks, > > M. > > [1] > https://lore.kernel.org/linux-arm-kernel/87lfewnmdz.fsf@nanos.tec.linutronix.de/ > -- > Jazz is not dead. It just smells funny...
On 2020-12-14 02:20, Pingfan Liu wrote: [...] > After serval days of chasing this issue, I have a raw idea about it. > What about re-structure the code like > DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_reschedule_ipi) ? I'm not sure how you want to mimic what x86 does, given that we have at least three different root interrupt controllers on arm64 (plus a fourth coming up), all of which have access methods that are not part of the CPU architecture, and that are shared with the 32bit part. > I will send a series soon afterward. Please wait until the merge window is over. Thanks, M.
On Mon, Dec 14, 2020 at 5:46 PM Marc Zyngier <maz@kernel.org> wrote: > > On 2020-12-14 02:20, Pingfan Liu wrote: > > [...] > > > After serval days of chasing this issue, I have a raw idea about it. > > What about re-structure the code like > > DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_reschedule_ipi) ? > > I'm not sure how you want to mimic what x86 does, given that we have > at least three different root interrupt controllers on arm64 (plus > a fourth coming up), all of which have access methods that are not > part of the CPU architecture, and that are shared with the 32bit > part. > I just found that Mark had implemented enter_el1_irq_or_nmi() for arm64, and think it is another way to handle the problem. I had thought if aligned to x86, then linux/kernel/entry/common.c can be used. It may bring benefits, e.g. shift off the rcu/lockdep/dyntick to core kernel. Anyway, I am a new beginner at this. If my opinion is not good enough, please just ignore it. > > I will send a series soon afterward. > > Please wait until the merge window is over. > Sorry that it has been sent out before your reply. If there is V2, I will do it after the merge window. Thanks, Pingfan
On 2020-12-14 13:04, Pingfan Liu wrote: > On Mon, Dec 14, 2020 at 5:46 PM Marc Zyngier <maz@kernel.org> wrote: >> >> On 2020-12-14 02:20, Pingfan Liu wrote: >> >> [...] >> >> > After serval days of chasing this issue, I have a raw idea about it. >> > What about re-structure the code like >> > DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_reschedule_ipi) ? >> >> I'm not sure how you want to mimic what x86 does, given that we have >> at least three different root interrupt controllers on arm64 (plus >> a fourth coming up), all of which have access methods that are not >> part of the CPU architecture, and that are shared with the 32bit >> part. >> > I just found that Mark had implemented enter_el1_irq_or_nmi() for > arm64, and think it is another way to handle the problem. > > I had thought if aligned to x86, then linux/kernel/entry/common.c can > be used. It may bring benefits, e.g. shift off the rcu/lockdep/dyntick > to core kernel. Using the entry/common.c stuff is indeed what we are planning to do, but the current arm64 code doesn't lend itself to that just yet, > > Anyway, I am a new beginner at this. If my opinion is not good enough, > please just ignore it. >> > I will send a series soon afterward. >> >> Please wait until the merge window is over. >> > Sorry that it has been sent out before your reply. If there is V2, I > will do it after the merge window. Please look at the series I pointed you too (and its follow-up), which has most of what you are trying to achieve already. Thanks, M.
On Mon, Dec 14, 2020 at 9:24 PM Marc Zyngier <maz@kernel.org> wrote: > > On 2020-12-14 13:04, Pingfan Liu wrote: > > On Mon, Dec 14, 2020 at 5:46 PM Marc Zyngier <maz@kernel.org> wrote: > >> > >> On 2020-12-14 02:20, Pingfan Liu wrote: > >> > >> [...] > >> > >> > After serval days of chasing this issue, I have a raw idea about it. > >> > What about re-structure the code like > >> > DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_reschedule_ipi) ? > >> > >> I'm not sure how you want to mimic what x86 does, given that we have > >> at least three different root interrupt controllers on arm64 (plus > >> a fourth coming up), all of which have access methods that are not > >> part of the CPU architecture, and that are shared with the 32bit > >> part. > >> > > I just found that Mark had implemented enter_el1_irq_or_nmi() for > > arm64, and think it is another way to handle the problem. > > > > I had thought if aligned to x86, then linux/kernel/entry/common.c can > > be used. It may bring benefits, e.g. shift off the rcu/lockdep/dyntick > > to core kernel. > > Using the entry/common.c stuff is indeed what we are planning to do, > but the current arm64 code doesn't lend itself to that just yet, > > > > > Anyway, I am a new beginner at this. If my opinion is not good enough, > > please just ignore it. > >> > I will send a series soon afterward. > >> > >> Please wait until the merge window is over. > >> > > Sorry that it has been sent out before your reply. If there is V2, I > > will do it after the merge window. > > Please look at the series I pointed you too (and its follow-up), which > has most of what you are trying to achieve already. > OK, and glad to know this issue is close to the end. Thanks, Pingfan
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c index 18e9727..ff61e2c 100644 --- a/arch/arm64/kernel/smp.c +++ b/arch/arm64/kernel/smp.c @@ -884,6 +884,31 @@ static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs) } /* + * Overlapped with the counterpart in do_handle_IPI(), only just for nmi purpose + */ +static irqreturn_t nmi_ipi_handler(int irq, void *data) +{ + unsigned int cpu = smp_processor_id(); + int ipinr = irq - ipi_irq_base; + + switch (ipinr) { + case IPI_CPU_STOP: + local_cpu_stop(); + unreachable(); + break; + + case IPI_CPU_CRASH_STOP: + if (IS_ENABLED(CONFIG_KEXEC_CORE)) { + ipi_cpu_crash_stop(cpu, get_irq_regs()); + + unreachable(); + } + break; + } + return IRQ_NONE; +} + +/* * Main handler for inter-processor interrupts */ static void do_handle_IPI(int ipinr) @@ -902,10 +927,12 @@ static void do_handle_IPI(int ipinr) generic_smp_call_function_interrupt(); break; + /* If NMI IPI is not supported */ case IPI_CPU_STOP: local_cpu_stop(); break; + /* If NMI IPI is not supported */ case IPI_CPU_CRASH_STOP: if (IS_ENABLED(CONFIG_KEXEC_CORE)) { ipi_cpu_crash_stop(cpu, get_irq_regs()); @@ -989,6 +1016,14 @@ void __init set_smp_ipi_range(int ipi_base, int n) for (i = 0; i < nr_ipi; i++) { int err; + if (i == IPI_CPU_CRASH_STOP || i == IPI_CPU_STOP) { + err = request_percpu_nmi(ipi_base + i, nmi_ipi_handler, + "NMI-IPI", &cpu_number); + if (!err) + continue; + + pr_warn("Fail to request nmi for NMI-IPI\n"); + } err = request_percpu_irq(ipi_base + i, ipi_handler, "IPI", &cpu_number); WARN_ON(err); @@ -1038,6 +1073,7 @@ void smp_send_stop(void) if (system_state <= SYSTEM_RUNNING) pr_crit("SMP: stopping secondary CPUs\n"); + /* NMI is expected, but it is subject to the irq line's ability */ smp_cross_call(&mask, IPI_CPU_STOP); } @@ -1084,6 +1120,7 @@ void crash_smp_send_stop(void) atomic_set(&waiting_for_crash_ipi, num_other_online_cpus()); pr_crit("SMP: stopping secondary CPUs\n"); + /* NMI is expected, but it is subject to the irq line's ability */ smp_cross_call(&mask, IPI_CPU_CRASH_STOP); /* Wait up to one second for other CPUs to stop */
Like other arches, both 'kexec -l' and 'kexec -p' expect to use NMI to interrupt other cpus in order to capture cpus tightly to a known state. So for both of the cases, try to use NMI IPI if irqchip supports NMI. Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Julien Thierry <julien.thierry@arm.com> To: linux-arm-kernel@lists.infradead.org --- arch/arm64/kernel/smp.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)