Message ID | 20230519101840.v5.15.Ic55cb6f90ef5967d8aaa2b503a4e67c753f64d3a@changeid (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | watchdog/hardlockup: Add the buddy hardlockup detector | expand |
On Fri 2023-05-19 10:18:39, Douglas Anderson wrote: > On arm64, NMI support needs to be detected at runtime. Add a weak > function to the perf hardlockup detector so that an architecture can > implement it to detect whether NMIs are available. > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > While I won't object to this patch landing, I consider it part of the > arm64 perf hardlockup effort. I would be OK with the earlier patches > in the series landing and then not landing ${SUBJECT} patch nor > anything else later. > > I'll also note that, as an alternative to this, it would be nice if we > could figure out how to make perf_event_create_kernel_counter() fail > on arm64 if NMIs aren't available. Maybe we could add a "must_use_nmi" > element to "struct perf_event_attr"? > > --- a/kernel/watchdog_perf.c > +++ b/kernel/watchdog_perf.c > @@ -234,12 +234,22 @@ void __init hardlockup_detector_perf_restart(void) > } > } > > +bool __weak __init arch_perf_nmi_is_available(void) > +{ > + return true; > +} > + > /** > * watchdog_hardlockup_probe - Probe whether NMI event is available at all > */ > int __init watchdog_hardlockup_probe(void) > { > - int ret = hardlockup_detector_event_create(); > + int ret; > + > + if (!arch_perf_nmi_is_available()) > + return -ENODEV; My understanding is that this would block the perf hardlockup detector at runtime. Does it work with the "nmi_watchdog" sysctl. I see that it is made read-only when it is not enabled at build time, see NMI_WATCHDOG_SYSCTL_PERM. > + > + ret = hardlockup_detector_event_create(); > > if (ret) { > pr_info("Perf NMI watchdog permanently disabled\n"); Best Regards, Petr
Mark, On Mon, Jun 12, 2023 at 3:33 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Fri, May 19, 2023 at 10:18:39AM -0700, Douglas Anderson wrote: > > On arm64, NMI support needs to be detected at runtime. Add a weak > > function to the perf hardlockup detector so that an architecture can > > implement it to detect whether NMIs are available. > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > --- > > While I won't object to this patch landing, I consider it part of the > > arm64 perf hardlockup effort. I would be OK with the earlier patches > > in the series landing and then not landing ${SUBJECT} patch nor > > anything else later. > > FWIW, everything prior to this looks fine to me, so I reckon it'd be worth > splitting the series here and getting the buddy lockup detector in first, to > avoid a log-jam on all the subsequent NMI bits. I think the whole series has already landed in Andrew's tree, including the arm64 "perf" lockup detector bits. I saw all the notifications from Andrew go through over the weekend that they were moved from an "unstable" branch to a "stable" one and I see them at: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/?h=mm-nonmm-stable When I first saw Anderw land the arm64 perf lockup detector bits in his unstable branch several weeks ago, I sent a private message to the arm64 maintainers (yourself included) to make sure you were aware of it and that it hadn't been caught in mail filters. I got the impression that everything was OK. Is that not the case? -Doug
On Mon, Jun 12, 2023 at 06:55:37AM -0700, Doug Anderson wrote: > Mark, > > On Mon, Jun 12, 2023 at 3:33 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > On Fri, May 19, 2023 at 10:18:39AM -0700, Douglas Anderson wrote: > > > On arm64, NMI support needs to be detected at runtime. Add a weak > > > function to the perf hardlockup detector so that an architecture can > > > implement it to detect whether NMIs are available. > > > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > > --- > > > While I won't object to this patch landing, I consider it part of the > > > arm64 perf hardlockup effort. I would be OK with the earlier patches > > > in the series landing and then not landing ${SUBJECT} patch nor > > > anything else later. > > > > FWIW, everything prior to this looks fine to me, so I reckon it'd be worth > > splitting the series here and getting the buddy lockup detector in first, to > > avoid a log-jam on all the subsequent NMI bits. > > I think the whole series has already landed in Andrew's tree, > including the arm64 "perf" lockup detector bits. I saw all the > notifications from Andrew go through over the weekend that they were > moved from an "unstable" branch to a "stable" one and I see them at: > > https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/?h=mm-nonmm-stable > > When I first saw Anderw land the arm64 perf lockup detector bits in > his unstable branch several weeks ago, I sent a private message to the > arm64 maintainers (yourself included) to make sure you were aware of > it and that it hadn't been caught in mail filters. I got the > impression that everything was OK. Is that not the case? Sorry; I'm slowly catching up with a backlog of email, and I'm just behind. Feel free to ignore this; sorry for the noise! If we spot anything going wrong in testing we can look at fixing those up. Thanks, Mark.
diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 47db14e7da52..eb616fc07c85 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -210,6 +210,7 @@ static inline bool trigger_single_cpu_backtrace(int cpu) #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF u64 hw_nmi_get_sample_period(int watchdog_thresh); +bool arch_perf_nmi_is_available(void); #endif #if defined(CONFIG_HARDLOCKUP_CHECK_TIMESTAMP) && \ diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c index 349fcd4d2abc..8ea00c4a24b2 100644 --- a/kernel/watchdog_perf.c +++ b/kernel/watchdog_perf.c @@ -234,12 +234,22 @@ void __init hardlockup_detector_perf_restart(void) } } +bool __weak __init arch_perf_nmi_is_available(void) +{ + return true; +} + /** * watchdog_hardlockup_probe - Probe whether NMI event is available at all */ int __init watchdog_hardlockup_probe(void) { - int ret = hardlockup_detector_event_create(); + int ret; + + if (!arch_perf_nmi_is_available()) + return -ENODEV; + + ret = hardlockup_detector_event_create(); if (ret) { pr_info("Perf NMI watchdog permanently disabled\n");
On arm64, NMI support needs to be detected at runtime. Add a weak function to the perf hardlockup detector so that an architecture can implement it to detect whether NMIs are available. Signed-off-by: Douglas Anderson <dianders@chromium.org> --- While I won't object to this patch landing, I consider it part of the arm64 perf hardlockup effort. I would be OK with the earlier patches in the series landing and then not landing ${SUBJECT} patch nor anything else later. I'll also note that, as an alternative to this, it would be nice if we could figure out how to make perf_event_create_kernel_counter() fail on arm64 if NMIs aren't available. Maybe we could add a "must_use_nmi" element to "struct perf_event_attr"? (no changes since v4) Changes in v4: - ("Add a weak function for an arch to detect ...") new for v4. include/linux/nmi.h | 1 + kernel/watchdog_perf.c | 12 +++++++++++- 2 files changed, 12 insertions(+), 1 deletion(-)