diff mbox

ARM: smp: Remove CPU: shutdown notice

Message ID 20170117230714.6799-1-f.fainelli@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Florian Fainelli Jan. 17, 2017, 11:07 p.m. UTC
This message is not particularly informative, and is not paired with an
identical message when a CPU is brought online. Finally, it slows the
CPU hotplug path down, thus allowing less CPU hotplug operations per
second. Just remove it.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 arch/arm/kernel/smp.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Russell King (Oracle) Jan. 17, 2017, 11:23 p.m. UTC | #1
On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote:
> This message is not particularly informative, and is not paired with an
> identical message when a CPU is brought online. Finally, it slows the
> CPU hotplug path down, thus allowing less CPU hotplug operations per
> second. Just remove it.

CPU hotplug isn't a fast operation anyway - it's also fairly disruptive
in that it uses stop_machine() to halt activity everywhere while taking
the CPU offline.

If you're worried about a single printk slowing things down, it sounds
like you're doing something wrong here - maybe you should be using
cpuidle for power management rather than trying to make CPU hotplug do
that for you?

A rapidly changing CPU hotplug state has other side effects - such as
reading /proc/interrupts is meaningless, because as soon as you've read
it the CPUs could have changed.  It also makes a farce of spreading
interrupts over the available CPUs.

All in all, I don't think this is a good idea...
Florian Fainelli Jan. 17, 2017, 11:39 p.m. UTC | #2
On 01/17/2017 03:23 PM, Russell King - ARM Linux wrote:
> On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote:
>> This message is not particularly informative, and is not paired with an
>> identical message when a CPU is brought online. Finally, it slows the
>> CPU hotplug path down, thus allowing less CPU hotplug operations per
>> second. Just remove it.
> 
> CPU hotplug isn't a fast operation anyway - it's also fairly disruptive
> in that it uses stop_machine() to halt activity everywhere while taking
> the CPU offline.

We have a test that consists in shutting down all CPUs as frequently as
we can and do this for about 2 million iterations which takes roughly
24h, and this printk slows thing down by a reasonable amount. Here are
some numbers on 500 hotplug operations:

w/ printk:
real    0m9.997s
user    0m0.725s
sys     0m3.030s
#

w/o printk:
real    0m8.547s
user    0m0.436s
sys     0m1.838s

> 
> If you're worried about a single printk slowing things down, it sounds
> like you're doing something wrong here - maybe you should be using
> cpuidle for power management rather than trying to make CPU hotplug do
> that for you?

No, this is not about power management considerations here, just how
fast we can bang it.

> 
> A rapidly changing CPU hotplug state has other side effects - such as
> reading /proc/interrupts is meaningless, because as soon as you've read
> it the CPUs could have changed.  It also makes a farce of spreading
> interrupts over the available CPUs.
> 
> All in all, I don't think this is a good idea...

Well, for one it's inconsistent, and it also leaves room for subtle
timing/caching issues with some platforms (OK, maybe not so much).
Improving the speed and consistency was the primary motive.
Petr Mladek Jan. 18, 2017, 9:55 a.m. UTC | #3
On Tue 2017-01-17 15:39:45, Florian Fainelli wrote:
> On 01/17/2017 03:23 PM, Russell King - ARM Linux wrote:
> > On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote:
> >> This message is not particularly informative, and is not paired with an
> >> identical message when a CPU is brought online. Finally, it slows the
> >> CPU hotplug path down, thus allowing less CPU hotplug operations per
> >> second. Just remove it.
> > 
> > CPU hotplug isn't a fast operation anyway - it's also fairly disruptive
> > in that it uses stop_machine() to halt activity everywhere while taking
> > the CPU offline.
> 
> We have a test that consists in shutting down all CPUs as frequently as
> we can and do this for about 2 million iterations which takes roughly
> 24h, and this printk slows thing down by a reasonable amount. Here are
> some numbers on 500 hotplug operations:
> 
> w/ printk:
> real    0m9.997s
> user    0m0.725s
> sys     0m3.030s
> #
> 
> w/o printk:
> real    0m8.547s
> user    0m0.436s
> sys     0m1.838s

I am curious that a single printk() might make such a big difference.

One reason might be that the messages are pushed to a "slow" console.

Another reason might be that there are many other messages printed
on the system and there is a contention on logbuf_lock or other
console related locks.

There might be also the opposite problem. The messages are also read
by userspace tools that store them into /var/log/messages or systemd
logs. If these are the only messages printed to the log and if there
is no other activity on the system. Then the waken loggers might make
a difference, especially if all CPUs are getting disabled and only
one is available at some point.

Well, I am not sure what other operations are needed to do the
CPU hotplug operation.

I cannot judge how the message is useful and if the speed up
is worth removing it.

Best Regards,
Petr
Florian Fainelli Jan. 18, 2017, 7:56 p.m. UTC | #4
On 01/18/2017 01:55 AM, Petr Mladek wrote:
> On Tue 2017-01-17 15:39:45, Florian Fainelli wrote:
>> On 01/17/2017 03:23 PM, Russell King - ARM Linux wrote:
>>> On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote:
>>>> This message is not particularly informative, and is not paired with an
>>>> identical message when a CPU is brought online. Finally, it slows the
>>>> CPU hotplug path down, thus allowing less CPU hotplug operations per
>>>> second. Just remove it.
>>>
>>> CPU hotplug isn't a fast operation anyway - it's also fairly disruptive
>>> in that it uses stop_machine() to halt activity everywhere while taking
>>> the CPU offline.
>>
>> We have a test that consists in shutting down all CPUs as frequently as
>> we can and do this for about 2 million iterations which takes roughly
>> 24h, and this printk slows thing down by a reasonable amount. Here are
>> some numbers on 500 hotplug operations:
>>
>> w/ printk:
>> real    0m9.997s
>> user    0m0.725s
>> sys     0m3.030s
>> #
>>
>> w/o printk:
>> real    0m8.547s
>> user    0m0.436s
>> sys     0m1.838s
> 
> I am curious that a single printk() might make such a big difference.

It does, because of how printk() is implemented (there is nothing wrong
with it, just slow by nature and how the UART gets written to as well).

> 
> One reason might be that the messages are pushed to a "slow" console.

115200 UART, yes that's slow, but not unusual.

> 
> Another reason might be that there are many other messages printed
> on the system and there is a contention on logbuf_lock or other
> console related locks.

The other messages being printed are those from the hotplug script that
I run which just checkpoints its running every 50 instances, so it does
not occur that often, the console really is not busy, which really
extracts the overhead of printing "CPU: shutdown".

> 
> There might be also the opposite problem. The messages are also read
> by userspace tools that store them into /var/log/messages or systemd
> logs. If these are the only messages printed to the log and if there
> is no other activity on the system. Then the waken loggers might make
> a difference, especially if all CPUs are getting disabled and only
> one is available at some point.

There is none of that, no systemd, no syslog, just minimal userspace
running.

> 
> Well, I am not sure what other operations are needed to do the
> CPU hotplug operation.
> 
> I cannot judge how the message is useful and if the speed up
> is worth removing it.
> 
> Best Regards,
> Petr
>
Russell King (Oracle) Jan. 18, 2017, 10:39 p.m. UTC | #5
On Tue, Jan 17, 2017 at 03:39:45PM -0800, Florian Fainelli wrote:
> Well, for one it's inconsistent, and it also leaves room for subtle
> timing/caching issues with some platforms (OK, maybe not so much).
> Improving the speed and consistency was the primary motive.

Actually, it's not as inconsistent as you claim.

c68b0274fb3c ("ARM: reduce "Booted secondary processor" message to debug level")
dropped the corresponding pr_info() in the bringup path to debug level,
so to be consistent, this one should be dropped to debug level too.

Removing it arguably makes it more inconsistent!

So, I'd be willing to accept a patch lowering this to a pr_debug()
to make it more consistent with the bringup path, but otherwise the
consistency argument doesn't really stand.
Florian Fainelli Jan. 18, 2017, 10:49 p.m. UTC | #6
On 01/18/2017 02:39 PM, Russell King - ARM Linux wrote:
> On Tue, Jan 17, 2017 at 03:39:45PM -0800, Florian Fainelli wrote:
>> Well, for one it's inconsistent, and it also leaves room for subtle
>> timing/caching issues with some platforms (OK, maybe not so much).
>> Improving the speed and consistency was the primary motive.
> 
> Actually, it's not as inconsistent as you claim.
> 
> c68b0274fb3c ("ARM: reduce "Booted secondary processor" message to debug level")
> dropped the corresponding pr_info() in the bringup path to debug level,
> so to be consistent, this one should be dropped to debug level too.
> 
> Removing it arguably makes it more inconsistent!

OK, fair point, I missed that one.

> 
> So, I'd be willing to accept a patch lowering this to a pr_debug()
> to make it more consistent with the bringup path, but otherwise the
> consistency argument doesn't really stand.

Sounds good, thanks!
diff mbox

Patch

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 7dd14e8395e6..b3e02751ab64 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -251,7 +251,6 @@  void __cpu_die(unsigned int cpu)
 		pr_err("CPU%u: cpu didn't die\n", cpu);
 		return;
 	}
-	pr_notice("CPU%u: shutdown\n", cpu);
 
 	/*
 	 * platform_cpu_kill() is generally expected to do the powering off