diff mbox series

arm64: kdump: fix warning message about failed to stop secondary CPUs

Message ID 20190222014625.107640-1-chenzhou10@huawei.com (mailing list archive)
State New, archived
Headers show
Series arm64: kdump: fix warning message about failed to stop secondary CPUs | expand

Commit Message

chenzhou Feb. 22, 2019, 1:46 a.m. UTC
The local variable mask indicates non-boot cpus. Just print the num
of failing to stop secondary CPUs using varaible waiting_for_crash_ipi.

Also replace two pr_warning with the shorter pr_warn.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
---
 arch/arm64/kernel/smp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Will Deacon Feb. 22, 2019, 10:03 a.m. UTC | #1
On Fri, Feb 22, 2019 at 09:46:25AM +0800, Chen Zhou wrote:
> The local variable mask indicates non-boot cpus. Just print the num
> of failing to stop secondary CPUs using varaible waiting_for_crash_ipi.

Why? Also s/varaible/variable/

> Also replace two pr_warning with the shorter pr_warn.
> 
> Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> ---
>  arch/arm64/kernel/smp.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 824de70..a4a952e 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -989,7 +989,7 @@ void smp_send_stop(void)
>  		udelay(1);
>  
>  	if (num_online_cpus() > 1)
> -		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> +		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
>  			   cpumask_pr_args(cpu_online_mask));
>  
>  	sdei_mask_local_cpu();
> @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void)
>  		udelay(1);
>  
>  	if (atomic_read(&waiting_for_crash_ipi) > 0)
> -		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> -			   cpumask_pr_args(&mask));
> +		pr_warn("SMP: failed to stop secondary CPUs %d\n",
> +				atomic_read(&waiting_for_crash_ipi));

But now this message doesn't make any sense.

Will
Mark Rutland Feb. 22, 2019, 11:38 a.m. UTC | #2
On Fri, Feb 22, 2019 at 10:03:05AM +0000, Will Deacon wrote:
> On Fri, Feb 22, 2019 at 09:46:25AM +0800, Chen Zhou wrote:
> > The local variable mask indicates non-boot cpus. Just print the num
> > of failing to stop secondary CPUs using varaible waiting_for_crash_ipi.
> 
> Why? Also s/varaible/variable/

I think the point is that the mask is an over-estimate, since it's all
of the CPUs which were online when we sent the stop IPI, not only the
ones we failed to stop.

If we had 255 secondary CPUs, and one failed to offline, we log the IDs
of all 255 CPUs which we tried to offline.

> > Also replace two pr_warning with the shorter pr_warn.
> > 
> > Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> > ---
> >  arch/arm64/kernel/smp.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 824de70..a4a952e 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -989,7 +989,7 @@ void smp_send_stop(void)
> >  		udelay(1);
> >  
> >  	if (num_online_cpus() > 1)
> > -		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> > +		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
> >  			   cpumask_pr_args(cpu_online_mask));
> >  
> >  	sdei_mask_local_cpu();
> > @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void)
> >  		udelay(1);
> >  
> >  	if (atomic_read(&waiting_for_crash_ipi) > 0)
> > -		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> > -			   cpumask_pr_args(&mask));
> > +		pr_warn("SMP: failed to stop secondary CPUs %d\n",
> > +				atomic_read(&waiting_for_crash_ipi));

I think this should be:

	pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
		   cpumask_pr_args(cpu_online_mask));

... matching smp_send_stop().

In either case that includes the current processor, but that's not a new
bug, and not a big deal.

Thanks,
Mark.
chenzhou Feb. 22, 2019, 1:13 p.m. UTC | #3
Hi Mark,

> -----Original Message-----
> From: Mark Rutland [mailto:mark.rutland@arm.com]
> Sent: Friday, February 22, 2019 7:38 PM
> To: Will Deacon
> Cc: chenzhou; catalin.marinas@arm.com; chengjian (D);
> linux-arm-kernel@lists.infradead.org; takahiro.akashi@linaro.org
> Subject: Re: [PATCH] arm64: kdump: fix warning message about failed to
> stop secondary CPUs
> 
> On Fri, Feb 22, 2019 at 10:03:05AM +0000, Will Deacon wrote:
> > On Fri, Feb 22, 2019 at 09:46:25AM +0800, Chen Zhou wrote:
> > > The local variable mask indicates non-boot cpus. Just print the num
> > > of failing to stop secondary CPUs using varaible waiting_for_crash_ipi.
> >
> > Why? Also s/varaible/variable/
> 
> I think the point is that the mask is an over-estimate, since it's all
> of the CPUs which were online when we sent the stop IPI, not only the
> ones we failed to stop.
> 
> If we had 255 secondary CPUs, and one failed to offline, we log the IDs
> of all 255 CPUs which we tried to offline.
> 
> > > Also replace two pr_warning with the shorter pr_warn.
> > >
> > > Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
> > > ---
> > >  arch/arm64/kernel/smp.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > > index 824de70..a4a952e 100644
> > > --- a/arch/arm64/kernel/smp.c
> > > +++ b/arch/arm64/kernel/smp.c
> > > @@ -989,7 +989,7 @@ void smp_send_stop(void)
> > >  		udelay(1);
> > >
> > >  	if (num_online_cpus() > 1)
> > > -		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> > > +		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
> > >  			   cpumask_pr_args(cpu_online_mask));
> > >
> > >  	sdei_mask_local_cpu();
> > > @@ -1030,8 +1030,8 @@ void crash_smp_send_stop(void)
> > >  		udelay(1);
> > >
> > >  	if (atomic_read(&waiting_for_crash_ipi) > 0)
> > > -		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> > > -			   cpumask_pr_args(&mask));
> > > +		pr_warn("SMP: failed to stop secondary CPUs %d\n",
> > > +				atomic_read(&waiting_for_crash_ipi));
> 
> I think this should be:
> 
> 	pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
> 		   cpumask_pr_args(cpu_online_mask));
> 
> ... matching smp_send_stop().
> 
> In either case that includes the current processor, but that's not a new
> bug, and not a big deal.
> 

You are right. The mask and cpu_online_mask in kdump path are both the online CPUs 
before sending IPI stop. But the cpu_online_mask of the smp_send_stop() you mentioned
is the online CPUs after sending the IPI stop, which is the CPUs failed to stop, not equal to
the online CPUs before sending IPI stop.

The root cause is that the ipi_cpu_stop() will set related cpu_online_mask false, but the 
ipi_cpu_crash_stop() won't do this. Therefore, we can't get the correct online CPUs at 
the end of crash_smp_send_stop().

In kdump path, we introduce waiting_for_crash_ipi to record the number of CPUs failed to stop,
and we only have the information about the number of CPUs failed to stop.

Sorry, I don't know if I express clear.

Thanks
Chen Zhou

> Thanks,
> Mark.
diff mbox series

Patch

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 824de70..a4a952e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -989,7 +989,7 @@  void smp_send_stop(void)
 		udelay(1);
 
 	if (num_online_cpus() > 1)
-		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
+		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
 			   cpumask_pr_args(cpu_online_mask));
 
 	sdei_mask_local_cpu();
@@ -1030,8 +1030,8 @@  void crash_smp_send_stop(void)
 		udelay(1);
 
 	if (atomic_read(&waiting_for_crash_ipi) > 0)
-		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
-			   cpumask_pr_args(&mask));
+		pr_warn("SMP: failed to stop secondary CPUs %d\n",
+				atomic_read(&waiting_for_crash_ipi));
 
 	sdei_mask_local_cpu();
 }