
[RFC/PATCH] arm: do not skip SMP init calls on SMP_ON_UP case

Message ID 1448279946-19975-1-git-send-email-nyushchenko@dev.rtsoft.ru (mailing list archive)
State New, archived

Commit Message

Nikita Yushchenko Nov. 23, 2015, 11:59 a.m. UTC
From: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>

While running an imx6s board, I got the following message in the boot log:

[    0.032414] CPU1: failed to boot: -38

This looked strange: imx6s is single-core and the kernel knows that
perfectly well. However, for some reason it tries to initialize CPU 1?

I found this to be caused by
- CONFIG_SMP_ON_UP successfully detects that system is single core,
- this causes is_smp() to return false,
- this causes setup_arch() to skip smp_init_cpus() call,
- this skips board-specific code that sets cpu_possible mask.

By looking at the code, I don't understand why several initialization
routines are called only in is_smp() case - while other kernel
CONFIG_SMP code does not check is_smp() every time and uses what should
have been initialized by skipped routines.

Thus I propose making these init calls regardless of is_smp() check.
Calls are already conditional on CONFIG_SMP. This will make init and
usage sides consistent.

Signed-off-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
---
 arch/arm/kernel/setup.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

Comments

Russell King - ARM Linux Nov. 23, 2015, 12:03 p.m. UTC | #1
On Mon, Nov 23, 2015 at 02:59:06PM +0300, nyushchenko@dev.rtsoft.ru wrote:
> From: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
> 
> While running an imx6s board, I got the following message in the boot log:
> 
> [    0.032414] CPU1: failed to boot: -38
> 
> This looked strange: imx6s is single-core and the kernel knows that
> perfectly well. However, for some reason it tries to initialize CPU 1?
> 
> I found this to be caused by
> - CONFIG_SMP_ON_UP successfully detects that system is single core,
> - this causes is_smp() to return false,
> - this causes setup_arch() to skip smp_init_cpus() call,
> - this skips board-specific code that sets cpu_possible mask.

Right, so you should end up with the possible and present masks
containing just one CPU, which should prevent the kernel trying to
bring any secondary CPUs online.
Nikita Yushchenko Nov. 23, 2015, 12:06 p.m. UTC | #2
23.11.2015 15:03, Russell King - ARM Linux wrote:
> On Mon, Nov 23, 2015 at 02:59:06PM +0300, nyushchenko@dev.rtsoft.ru wrote:
>> From: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
>>
>> While running an imx6s board, I got the following message in the boot log:
>>
>> [    0.032414] CPU1: failed to boot: -38
>>
>> This looked strange: imx6s is single-core and the kernel knows that
>> perfectly well. However, for some reason it tries to initialize CPU 1?
>>
>> I found this to be caused by
>> - CONFIG_SMP_ON_UP successfully detects that system is single core,
>> - this causes is_smp() to return false,
>> - this causes setup_arch() to skip smp_init_cpus() call,
>> - this skips board-specific code that sets cpu_possible mask.
> 
> Right, so you should end up with the possible and present masks
> containing just one CPU, which should prevent the kernel trying to
> bring any secondary CPUs online.

The kernel that is running here still tries to init CPU 1 for some reason.

I will try to check mainline (although I am not sure that will be possible
on the available custom hardware).
Russell King - ARM Linux Nov. 23, 2015, 12:12 p.m. UTC | #3
On Mon, Nov 23, 2015 at 03:06:52PM +0300, Nikita Yushchenko wrote:
> 23.11.2015 15:03, Russell King - ARM Linux wrote:
> > On Mon, Nov 23, 2015 at 02:59:06PM +0300, nyushchenko@dev.rtsoft.ru wrote:
> >> From: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
> >>
> >> While running an imx6s board, I got the following message in the boot log:
> >>
> >> [    0.032414] CPU1: failed to boot: -38
> >>
> >> This looked strange: imx6s is single-core and the kernel knows that
> >> perfectly well. However, for some reason it tries to initialize CPU 1?
> >>
> >> I found this to be caused by
> >> - CONFIG_SMP_ON_UP successfully detects that system is single core,
> >> - this causes is_smp() to return false,
> >> - this causes setup_arch() to skip smp_init_cpus() call,
> >> - this skips board-specific code that sets cpu_possible mask.
> > 
> > Right, so you should end up with the possible and present masks
> > containing just one CPU, which should prevent the kernel trying to
> > bring any secondary CPUs online.
> 
> Kernel that is running here still tries to init CPU 1 for some reason.
> 
> Will try to check mainline (although not sure if that will be possible
> on available custom hardware)

iMX6 is fairly well supported in mainline.  The only reason to use a
custom kernel is if you want to use some feature which mainline does
not support (or support very well) such as video decode, the full IPU
facilities, GPUs or CEC (sorry, I don't have an expansive list.)

The GPU problem for the GC320/GC880/GC2000 is fairly close to being
solved in a functional (but maybe not yet performant) manner.
Russell King - ARM Linux Nov. 23, 2015, 12:19 p.m. UTC | #4
On Mon, Nov 23, 2015 at 12:12:16PM +0000, Russell King - ARM Linux wrote:
> iMX6 is fairly well supported in mainline.  The only reason to use a
> custom kernel is if you want to use some feature which mainline does
> not support (or support very well) such as video decode, the full IPU
> facilities, GPUs or CEC (sorry, I don't have an expansive list.)
> 
> The GPU problem for the GC320/GC880/GC2000 is fairly close to being
> solved in a functional (but maybe not yet performant) manner.

For reference, iMX6S in mainline behaves like this:

Calibrating delay loop (skipped), value calculated using timer frequency.. 6.00 BogoMIPS (lpj=12000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
Initializing cgroup subsys net_cls
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x100082c0 - 0x10008318
Brought up 1 CPUs
SMP: Total of 1 processors activated (6.00 BogoMIPS).
CPU: All CPU(s) started in SVC mode.
Vladimir Murzin Nov. 23, 2015, 12:32 p.m. UTC | #5
On 23/11/15 12:06, Nikita Yushchenko wrote:
> 23.11.2015 15:03, Russell King - ARM Linux wrote:
>> On Mon, Nov 23, 2015 at 02:59:06PM +0300, nyushchenko@dev.rtsoft.ru wrote:
>>> From: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
>>>
>>> While running an imx6s board, I got the following message in the boot log:
>>>
>>> [    0.032414] CPU1: failed to boot: -38
>>>
>>> This looked strange: imx6s is single-core and the kernel knows that
>>> perfectly well. However, for some reason it tries to initialize CPU 1?
>>>
>>> I found this to be caused by
>>> - CONFIG_SMP_ON_UP successfully detects that system is single core,
>>> - this causes is_smp() to return false,
>>> - this causes setup_arch() to skip smp_init_cpus() call,
>>> - this skips board-specific code that sets cpu_possible mask.
>>
>> Right, so you should end up with the possible and present masks
>> containing just one CPU, which should prevent the kernel trying to
>> bring any secondary CPUs online.
> 
> Kernel that is running here still tries to init CPU 1 for some reason.

I *guess* cpus node [1] in your dts has more than one cpu entry, could
you check please?

[1] Documentation/devicetree/bindings/arm/cpus.txt

Vladimir

> 
> Will try to check mainline (although not sure if that will be possible
> on available custom hardware)
Nikita Yushchenko Nov. 23, 2015, 12:42 p.m. UTC | #6
>>>> While running an imx6s board, I got the following message in the boot log:
>>>>
>>>> [    0.032414] CPU1: failed to boot: -38
>>>>
>>>> This looked strange: imx6s is single-core and the kernel knows that
>>>> perfectly well. However, for some reason it tries to initialize CPU 1?
>>>>
>>>> I found this to be caused by
>>>> - CONFIG_SMP_ON_UP successfully detects that system is single core,
>>>> - this causes is_smp() to return false,
>>>> - this causes setup_arch() to skip smp_init_cpus() call,
>>>> - this skips board-specific code that sets cpu_possible mask.
>>>
>>> Right, so you should end up with the possible and present masks
>>> containing just one CPU, which should prevent the kernel trying to
>>> bring any secondary CPUs online.
>>
>> Kernel that is running here still tries to init CPU 1 for some reason.
> 
> I *guess* cpus node [1] in your dts has more than one cpu entry, could
> you check please?

Indeed looks so:

# ls /proc/device-tree/cpus
#address-cells  #size-cells  cpu@0  cpu@1  name
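
The same check can be done from a program. The following is a minimal sketch, assuming the flattened device tree is exposed as a directory such as /proc/device-tree/cpus; count_cpu_nodes() is a hypothetical helper name, not an existing API, and it only counts entries whose names start with "cpu@".

```c
#include <dirent.h>
#include <string.h>

/*
 * Hypothetical helper: count directory entries named "cpu@...".
 * Returns -1 if the directory cannot be opened.
 */
static int count_cpu_nodes(const char *cpus_dir)
{
	DIR *dir = opendir(cpus_dir);
	struct dirent *entry;
	int count = 0;

	if (!dir)
		return -1;

	while ((entry = readdir(dir)) != NULL) {
		/* Only the cpu@N child nodes are of interest here. */
		if (strncmp(entry->d_name, "cpu@", 4) == 0)
			count++;
	}

	closedir(dir);
	return count;
}
```

Calling count_cpu_nodes("/proc/device-tree/cpus") on the board above would be expected to count cpu@0 and cpu@1; a result larger than the real core count points at the device tree rather than at the SMP code.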

But my custom device tree just includes imx6dl.dtsi

So is imx6dl.dtsi in the linux-imx tree broken?


Still, if I apply the change from the patch, the issue disappears, since in
this case imx_smp_init_cpus() gets called and initializes the cpu_possible
mask properly.
Nikita Yushchenko Nov. 23, 2015, 12:46 p.m. UTC | #7
23.11.2015 15:19, Russell King - ARM Linux wrote:
> On Mon, Nov 23, 2015 at 12:12:16PM +0000, Russell King - ARM Linux wrote:
>> iMX6 is fairly well supported in mainline.  The only reason to use a
>> custom kernel is if you want to use some feature which mainline does
>> not support (or support very well) such as video decode, the full IPU
>> facilities, GPUs or CEC (sorry, I don't have an expansive list.)
>>
>> The GPU problem for the GC320/GC880/GC2000 is fairly close to being
>> solved in a functional (but maybe not yet performant) manner.
> 
> For reference, iMX6S in mainline behaves like this:
> 
> Calibrating delay loop (skipped), value calculated using timer frequency.. 6.00 BogoMIPS (lpj=12000)
> pid_max: default: 32768 minimum: 301
> Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
> Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
> Initializing cgroup subsys net_cls
> CPU: Testing write buffer coherency: ok
> CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
> Setting up static identity map for 0x100082c0 - 0x10008318
> Brought up 1 CPUs
> SMP: Total of 1 processors activated (6.00 BogoMIPS).
> CPU: All CPU(s) started in SVC mode.

Indeed, I managed to boot mainline, and it does not try to initialize CPU 1.
Nikita Yushchenko Nov. 23, 2015, 12:47 p.m. UTC | #8
>>>>> While running an imx6s board, I got the following message in the boot log:
>>>>>
>>>>> [    0.032414] CPU1: failed to boot: -38
>>>>>
>>>>> This looked strange: imx6s is single-core and the kernel knows that
>>>>> perfectly well. However, for some reason it tries to initialize CPU 1?
>>>>>
>>>>> I found this to be caused by
>>>>> - CONFIG_SMP_ON_UP successfully detects that system is single core,
>>>>> - this causes is_smp() to return false,
>>>>> - this causes setup_arch() to skip smp_init_cpus() call,
>>>>> - this skips board-specific code that sets cpu_possible mask.
>>>>
>>>> Right, so you should end up with the possible and present masks
>>>> containing just one CPU, which should prevent the kernel trying to
>>>> bring any secondary CPUs online.
>>>
>>> Kernel that is running here still tries to init CPU 1 for some reason.
>>
>> I *guess* cpus node [1] in your dts has more than one cpu entry, could
>> you check please?
> 
> Indeed looks so:
> 
> # ls /proc/device-tree/cpus
> #address-cells  #size-cells  cpu@0  cpu@1  name
> 
> But my custom device tree just includes imx6dl.dtsi
> 
> So it is imx6dl.dtsi in linux-imx tree broken?..

Just booted mainline... unlike linux-imx, it does not try to init cpu1.

However, imx6dl.dtsi from mainline also has both cpu@0 and cpu@1.

So the missing piece in linux-imx is elsewhere :(
Russell King - ARM Linux Nov. 23, 2015, 1:04 p.m. UTC | #9
On Mon, Nov 23, 2015 at 03:47:34PM +0300, Nikita Yushchenko wrote:
> >>>>> While running an imx6s board, I got the following message in the boot log:
> >>>>>
> >>>>> [    0.032414] CPU1: failed to boot: -38
> >>>>>
> >>>>> This looked strange: imx6s is single-core and the kernel knows that
> >>>>> perfectly well. However, for some reason it tries to initialize CPU 1?
> >>>>>
> >>>>> I found this to be caused by
> >>>>> - CONFIG_SMP_ON_UP successfully detects that system is single core,
> >>>>> - this causes is_smp() to return false,
> >>>>> - this causes setup_arch() to skip smp_init_cpus() call,
> >>>>> - this skips board-specific code that sets cpu_possible mask.
> >>>>
> >>>> Right, so you should end up with the possible and present masks
> >>>> containing just one CPU, which should prevent the kernel trying to
> >>>> bring any secondary CPUs online.
> >>>
> >>> Kernel that is running here still tries to init CPU 1 for some reason.
> >>
> >> I *guess* cpus node [1] in your dts has more than one cpu entry, could
> >> you check please?
> > 
> > Indeed looks so:
> > 
> > # ls /proc/device-tree/cpus
> > #address-cells  #size-cells  cpu@0  cpu@1  name
> > 
> > But my custom device tree just includes imx6dl.dtsi
> > 
> > So it is imx6dl.dtsi in linux-imx tree broken?..
> 
> > Just booted mainline... unlike linux-imx, it does not try to init cpu1.
> 
> However, imx6dl.dtsi from mainline also has both cpu@0 and cpu@1
> 
> So missing piece in linux-imx is elsewhere :(

It works as you mentioned - and it relies upon the code you tried to
modify.

The early boot code detects that the boot CPU is not SMP capable, so
through SMP_ON_UP, it "turns off" SMP support by fixing up the code
and making is_smp() return false.

This prevents smp_init_cpus() being called, which in turn prevents
imx_smp_init_cpus() executing, which prevents the CPU possible mask
including any CPU but the boot CPU.

As only the boot CPU is possible, this prevents the SMP code trying
to bring any secondary CPUs online.

Applying your patch which removes the is_smp() check will break this
logic.
Nikita Yushchenko Nov. 24, 2015, 2:52 p.m. UTC | #10
>> Just booted mainline... unlike linux-imx, it does not try to init cpu1.
>>
>> However, imx6dl.dtsi from mainline also has both cpu@0 and cpu@1
>>
>> So missing piece in linux-imx is elsewhere :(
> 
> It works as you mentioned - and it relies upon the code you tried to
> modify.
> 
> The early boot code detects that the boot CPU is not SMP capable, so
> through SMP_ON_UP, it "turns off" SMP support by fixing up the code
> and making is_smp() return false.
> 
> This prevents smp_init_cpus() being called, which in turn prevents
> imx_smp_init_cpus() executing, which prevents the CPU possible mask
> including any CPU but the boot CPU.
> 
> As only the boot CPU is possible, this prevents the SMP code trying
> to bring any secondary CPUs online.

I'm still trying to understand what is going on, and my printk()s show
that this is not entirely true.

When smp_init() is entered on mainline on imx6s, cpu_possible_mask and
cpu_present_mask both contain two cpus. These get initialized in
arm_dt_init_cpu_maps() and stay unmodified since then.

But cpu_online() returns 1 for cpu0 and 0 for cpu1 - thus it is the
cpu_online() check, not possible_mask or present_mask, that prevents
the cpu1 initialization attempt.

I'm not sure I understand the logic behind this. With the current code,
the resulting cpu_possible_mask depends on CONFIG_SMP_ON_UP:
- if it is set, cpu_possible_mask contains (0 1), as initialized in
arm_dt_init_cpu_maps()
- if it is not set, cpu_possible_mask contains (0), since
imx_smp_init_cpus() removes 1 from there.

This does not seem to be an intended difference.
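
The dependence on CONFIG_SMP_ON_UP can be reproduced in a small userspace model. This is only a sketch under stated assumptions, not kernel code: masks are plain bitmasks, model_possible_mask() and its helpers are hypothetical names, and the only behaviour modelled is what is reported in this thread (arm_dt_init_cpu_maps() marks every cpu@N node possible, and imx_smp_init_cpus() trims the mask to the real core count only when is_smp() is true).

```c
#include <stdbool.h>

/* Hypothetical userspace model; one bit per CPU, bit 0 is the boot CPU. */

/* Stand-in for arm_dt_init_cpu_maps(): every cpu@N node becomes possible. */
static unsigned int dt_init_cpu_maps(unsigned int dt_cpu_nodes)
{
	return (1u << dt_cpu_nodes) - 1u;
}

/* Stand-in for imx_smp_init_cpus(): drop CPUs the SoC does not have. */
static unsigned int board_trim_mask(unsigned int mask, unsigned int real_cores)
{
	return mask & ((1u << real_cores) - 1u);
}

/*
 * Model of the gated call in setup_arch(). With SMP_ON_UP on a
 * single-core part, is_smp() is false and the board hook never runs.
 * Without SMP_ON_UP, an SMP kernel always treats the system as SMP.
 */
static unsigned int model_possible_mask(bool smp_on_up,
					unsigned int dt_cpu_nodes,
					unsigned int real_cores)
{
	unsigned int mask = dt_init_cpu_maps(dt_cpu_nodes);
	bool is_smp = !(smp_on_up && real_cores == 1);

	if (is_smp)
		mask = board_trim_mask(mask, real_cores);

	return mask;
}
```

With two cpu@N nodes and one real core, model_possible_mask(true, 2, 1) gives 0x3 (CPUs 0 and 1) and model_possible_mask(false, 2, 1) gives 0x1 (CPU 0 only), which is the asymmetry described above.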
Nikita Yushchenko Nov. 24, 2015, 3:05 p.m. UTC | #11
> I'm still trying to understand what is going on, and my printk()s show
> that this is not entirely true.
> 
> When smp_init() is entered on mainline on imx6s, cpu_possible_mask and
> cpu_present_mask both contain two cpus. These get initialized in
> arm_dt_init_cpu_maps() and stay unmodified since then.
> 
> But cpu_online() returns 1 for cpu0 and 0 for cpu1 - thus it is
> cpu_online() check, not possible_mask or present_mask, that prevents
> cpu1 initialization attempt.

Sorry, I was too quick to type.

cpu_online(0) is true and cpu_online(1) is false.
It is natural, since cpu0 is already running.
Thus cpu_up(1) is entered!
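
A sketch of why cpu_up(1) is reached, assuming the generic bring-up loop in smp_init() walks the present mask and tries every CPU that is not yet online. The masks, model_smp_init() and cpu_up_attempts are hypothetical stand-ins for illustration only; nothing here is the kernel's own code.

```c
#define MODEL_NR_CPUS 4

/* Hypothetical model state; one bit per CPU. */
static unsigned int present_mask;
static unsigned int online_mask;
static unsigned int cpu_up_attempts;	/* bit set for each CPU tried */

static int model_cpu_online(unsigned int cpu)
{
	return (online_mask >> cpu) & 1u;
}

/* Record the attempt; the secondary never comes online in this model. */
static void model_cpu_up(unsigned int cpu)
{
	cpu_up_attempts |= 1u << cpu;
}

static void model_smp_init(unsigned int present, unsigned int online)
{
	unsigned int cpu;

	present_mask = present;
	online_mask = online;
	cpu_up_attempts = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!((present_mask >> cpu) & 1u))
			continue;		/* not present: never tried */
		if (!model_cpu_online(cpu))
			model_cpu_up(cpu);	/* present but offline: tried */
	}
}
```

After model_smp_init(0x3, 0x1), cpu_up_attempts is 0x2: cpu1 is present but offline, so it is tried. Only a present mask of 0x1 keeps the loop from touching cpu1.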
Nikita Yushchenko Nov. 24, 2015, 3:28 p.m. UTC | #12
24.11.2015 18:05, Nikita Yushchenko wrote:
>> I'm still trying to understand what is going on, and my printk()s show
>> that this is not entirely true.
>>
>> When smp_init() is entered on mainline on imx6s, cpu_possible_mask and
>> cpu_present_mask both contain two cpus. These get initialized in
>> arm_dt_init_cpu_maps() and stay unmodified since then.
>>
>> But cpu_online() returns 1 for cpu0 and 0 for cpu1 - thus it is
>> cpu_online() check, not possible_mask or present_mask, that prevents
>> cpu1 initialization attempt.
> 
> Sorry was too quick to type.
> 
> cpu_online(0) is true and cpu_online(1) is false.
> It is natural, since cpu0 is already running.
> Thus cpu_up(1) is entered!

... and then code executes into __cpu_up() from arch/arm/kernel/smp.c,
and stops via

	if (!smp_ops.smp_boot_secondary)
		return -ENOSYS;


(smp_ops zeroed due to SMP_ON_UP, as far as I understand).


In the linux-imx 3.14.28 based tree, there is no such check in __cpu_up(),
thus boot_secondary() is called:

int boot_secondary(unsigned int cpu, struct task_struct *idle)
{
	if (smp_ops.smp_boot_secondary)
		return smp_ops.smp_boot_secondary(cpu, idle);
	return -ENOSYS;
}


At this point the zeroed smp_ops comes into play: -ENOSYS (-38) is returned,
and pr_err() in __cpu_up() prints the message that caused the entire analysis.
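
Both paths can be put side by side in a userspace sketch. It assumes only what is described above: smp_ops is zeroed, so its smp_boot_secondary hook is NULL. struct model_smp_ops, mainline_cpu_up(), vendor_cpu_up() and the logged flag are hypothetical names, the task_struct argument is dropped, and fprintf() to stderr stands in for pr_err().

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel's smp_operations. */
struct model_smp_ops {
	int (*smp_boot_secondary)(unsigned int cpu);
};

static struct model_smp_ops smp_ops;	/* zero-initialised: hook is NULL */
static int logged;			/* set when an error line is printed */

/* Mainline-like path: check the hook up front and return quietly. */
static int mainline_cpu_up(unsigned int cpu)
{
	if (!smp_ops.smp_boot_secondary)
		return -ENOSYS;
	return smp_ops.smp_boot_secondary(cpu);
}

/* Vendor-tree-like helper: the NULL check lives one level down. */
static int boot_secondary(unsigned int cpu)
{
	if (smp_ops.smp_boot_secondary)
		return smp_ops.smp_boot_secondary(cpu);
	return -ENOSYS;
}

/* Vendor-tree-like path: same error code, but the failure is logged. */
static int vendor_cpu_up(unsigned int cpu)
{
	int ret = boot_secondary(cpu);

	if (ret) {
		fprintf(stderr, "CPU%u: failed to boot: %d\n", cpu, ret);
		logged = 1;
	}
	return ret;
}
```

Both functions return -ENOSYS for cpu 1; only the second one reports it, which matches the observation that the two trees differ in logging rather than in whether the bring-up is attempted.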


So the conclusion is that
- the behaviour of the mainline and linux-imx trees is almost the same: there
is an attempt to bring up the non-existent cpu 1; the difference is only in
where the zeroed smp_ops is detected and whether the error is logged.

I'm not sure that my proposed patch was correct: it fixes the imx6s case but
can have a bad effect on other arm targets. But I think that something needs
to be done to make the cpu masks correct in the SMP_ON_UP case.

Patch

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 20edd34..8a14fce 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -980,16 +980,14 @@  void __init setup_arch(char **cmdline_p)
 	psci_dt_init();
 	xen_early_init();
 #ifdef CONFIG_SMP
-	if (is_smp()) {
-		if (!mdesc->smp_init || !mdesc->smp_init()) {
-			if (psci_smp_available())
-				smp_set_ops(&psci_smp_ops);
-			else if (mdesc->smp)
-				smp_set_ops(mdesc->smp);
-		}
-		smp_init_cpus();
-		smp_build_mpidr_hash();
+	if (!mdesc->smp_init || !mdesc->smp_init()) {
+		if (psci_smp_available())
+			smp_set_ops(&psci_smp_ops);
+		else if (mdesc->smp)
+			smp_set_ops(mdesc->smp);
 	}
+	smp_init_cpus();
+	smp_build_mpidr_hash();
 #endif
 
 	if (!is_smp())