From patchwork Fri Mar 17 15:47:08 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 9631013
Message-ID: <1489765628.17202.59.camel@infradead.org>
Subject: Re: [PATCH v33 00/14] add kdump support
From: David Woodhouse
To: Mark Rutland
In-Reply-To: <20170317153358.GI5940@leverpostej>
References: <20170315095656.24992-1-takahiro.akashi@linaro.org>
 <1489750991.17202.40.camel@infradead.org>
 <1489759373.17202.44.camel@infradead.org>
 <20170317153358.GI5940@leverpostej>
Date: Fri, 17 Mar 2017 15:47:08 +0000
Cc: geoff@infradead.org, kexec@lists.infradead.org, will.deacon@arm.com,
 AKASHI Takahiro, james.morse@arm.com, catalin.marinas@arm.com,
 bauerman@linux.vnet.ibm.com, dyoung@redhat.com,
 linux-arm-kernel@lists.infradead.org

On Fri, 2017-03-17 at 15:33 +0000, Mark Rutland wrote:
> 
> We can certainly log a better message, e.g.
> 
>         bool kdump = (image == kexec_crash_image);
>         bool stuck_cpus = cpus_are_stuck_in_kernel() ||
>                           num_online_cpus() > 1;
> 
>         BUG_ON(stuck_cpus && !kdump);
>         WARN(stuck_cpus, "Unable to offline CPUs, kdump will be unreliable.\n");

No, in this case the CPUs *were* offlined correctly, or at least "as
designed", by smp_send_crash_stop(). And if that hadn't worked, as
verified by *its* synchronisation method based on the atomic_t
waiting_for_crash_ipi, then *it* would have complained for itself:

	if (atomic_read(&waiting_for_crash_ipi) > 0)
		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
			   cpumask_pr_args(cpu_online_mask));

It's just that smp_send_crash_stop() (or more specifically
ipi_cpu_crash_stop()) doesn't touch the online cpu mask. Unlike the
ARM32 equivalent function machine_crash_nonpanic_core(), which does.

It wasn't clear if that was *intentional*, to allow the original
contents of the online mask before the crash to be seen in the
resulting vmcore... or purely an accident.

FWIW if I trigger a crash on CPU 1 my kdump (still 4.9.8+v32) doesn't
work. I end up booting the kdump kernel on CPU#1 and then it gets
distinctly unhappy...

[    0.000000] Booting Linux on physical CPU 0x1
...
[    0.017125] Detected PIPT I-cache on CPU1
[    0.017138] GICv3: CPU1: found redistributor 0 region 0:0x00000000f0280000
[    0.017147] CPU1: Booted secondary processor [411fd073]
[    0.017339] Detected PIPT I-cache on CPU2
[    0.017347] GICv3: CPU2: found redistributor 2 region 0:0x00000000f02c0000
[    0.017354] CPU2: Booted secondary processor [411fd073]
[    0.017537] Detected PIPT I-cache on CPU3
[    0.017545] GICv3: CPU3: found redistributor 3 region 0:0x00000000f02e0000
[    0.017551] CPU3: Booted secondary processor [411fd073]
[    0.017576] Brought up 4 CPUs
[    0.017587] SMP: Total of 4 processors activated.
...
[   31.745809] INFO: rcu_sched detected stalls on CPUs/tasks:
[   31.751299]  1-...: (30 GPs behind) idle=c90/0/0 softirq=0/0 fqs=0
[   31.757557]  2-...: (30 GPs behind) idle=608/0/0 softirq=0/0 fqs=0
[   31.763814]  3-...: (30 GPs behind) idle=604/0/0 softirq=0/0 fqs=0
[   31.770069]  (detected by 0, t=5252 jiffies, g=-270, c=-271, q=0)
[   31.776161] Task dump for CPU 1:
[   31.779381] swapper/1       R  running task        0     0      1 0x00000080
[   31.786446] Task dump for CPU 2:
[   31.789666] swapper/2       R  running task        0     0      1 0x00000080
[   31.796725] Task dump for CPU 3:
[   31.799945] swapper/3       R  running task        0     0      1 0x00000080

Is some of that platform-specific?

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 701c085..41d238e 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -129,7 +129,7 @@ static struct sysrq_key_op sysrq_unraw_op = {
 #define sysrq_unraw_op (*(struct sysrq_key_op *)NULL)
 #endif /* CONFIG_VT */
 
-static void sysrq_handle_crash(int key)
+static void do_sysrq_handle_crash(int key)
 {
 	char *killer = NULL;
 
@@ -143,6 +143,12 @@ static void sysrq_handle_crash(int key)
 	wmb();
 	*killer = 1;
 }
+
+static void sysrq_handle_crash(int key)
+{
+	smp_call_on_cpu(1, (void *)do_sysrq_handle_crash, 0, 1);
+}
+
 static struct sysrq_key_op sysrq_crash_op = {
 	.handler = sysrq_handle_crash,
 	.help_msg = "crash(c)",
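
To illustrate the point about the online mask made earlier in this mail, here is a rough userspace sketch, not kernel code. Everything prefixed toy_ is hypothetical and only mimics the names discussed above. It assumes the crashing CPU arms waiting_for_crash_ipi with the number of other online CPUs and that each stopped CPU decrements it; the clear_online flag is an invented switch that selects between the arm64 behaviour (mask untouched) and the ARM32-like behaviour (mask cleared). CPUs are "stopped" sequentially in a loop, so there is no real concurrency here.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy stand-ins for kernel state; hypothetical, for illustration only. */
struct toy_state {
	unsigned int online_mask;	/* bit n set: CPU n is marked online */
	int waiting_for_crash_ipi;	/* an atomic_t in the kernel */
};

/* Count the CPUs still marked online. */
static int toy_num_online_cpus(const struct toy_state *s)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (s->online_mask & (1u << cpu))
			n++;
	return n;
}

/* What a secondary CPU does when it takes the crash-stop IPI. */
static void toy_ipi_cpu_crash_stop(struct toy_state *s, int cpu,
				   bool clear_online)
{
	if (clear_online)	/* ARM32-like: drop out of the online mask */
		s->online_mask &= ~(1u << cpu);
	s->waiting_for_crash_ipi--;	/* acknowledge the stop request */
	/* a real handler would now park the CPU and never return */
}

/*
 * Crashing CPU: ask every other online CPU to stop. Returns how many
 * CPUs failed to acknowledge (0 means all of them stopped).
 */
static int toy_smp_send_crash_stop(struct toy_state *s, int crashing_cpu,
				   bool clear_online)
{
	assert(crashing_cpu >= 0 && crashing_cpu < NR_CPUS);

	s->waiting_for_crash_ipi = toy_num_online_cpus(s) - 1;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu != crashing_cpu && (s->online_mask & (1u << cpu)))
			toy_ipi_cpu_crash_stop(s, cpu, clear_online);

	return s->waiting_for_crash_ipi;
}
```

With all four CPUs online and clear_online false, toy_smp_send_crash_stop() reports every CPU stopped, yet toy_num_online_cpus() still returns 4, so a check of the form num_online_cpus() > 1 would fire even though nothing is actually stuck. With clear_online true only the crashing CPU remains in the mask.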