Message ID | 20170117230714.6799-1-f.fainelli@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote: > This message is not particularly informative, and is not paired with an > identical message when a CPU is brought online. Finally, it slows the > CPU hotplug path down, thus allowing less CPU hotplug operations per > second. Just remove it. CPU hotplug isn't a fast operation anyway - it's also fairly disruptive in that it uses stop_machine() to halt activity everywhere while taking the CPU offline. If you're worried about a single printk slowing things down, it sounds like you're doing something wrong here - maybe you should be using cpuidle for power management rather than trying to make CPU hotplug do that for you? A rapidly changing CPU hotplug state has other side effects - such as reading /proc/interrupts is meaningless, because as soon as you've read it the CPUs could have changed. It also makes a farce of spreading interrupts over the available CPUs. All in all, I don't think this is a good idea...
On 01/17/2017 03:23 PM, Russell King - ARM Linux wrote: > On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote: >> This message is not particularly informative, and is not paired with an >> identical message when a CPU is brought online. Finally, it slows the >> CPU hotplug path down, thus allowing less CPU hotplug operations per >> second. Just remove it. > > CPU hotplug isn't a fast operation anyway - it's also fairly disruptive > in that it uses stop_machine() to halt activity everywhere while taking > the CPU offline. We have a test that consists in shutting down all CPUs as frequently as we can and do this for about 2 million iterations which takes roughly 24h, and this printk slows thing down by a reasonable amount. Here are some numbers on 500 hotplug operations: w/ printk: real 0m9.997s user 0m0.725s sys 0m3.030s # w/o printk: real 0m8.547s user 0m0.436s sys 0m1.838s > > If you're worried about a single printk slowing things down, it sounds > like you're doing something wrong here - maybe you should be using > cpuidle for power management rather than trying to make CPU hotplug do > that for you? No, this is not about power management considerations here, just how fast we can bang it. > > A rapidly changing CPU hotplug state has other side effects - such as > reading /proc/interrupts is meaningless, because as soon as you've read > it the CPUs could have changed. It also makes a farce of spreading > interrupts over the available CPUs. > > All in all, I don't think this is a good idea... Well, for one it's inconsistent, and it also leaves room for subtle timing/caching issues with some platforms (OK, maybe not so much). Improving the speed and consistency was the primary motive.
On Tue 2017-01-17 15:39:45, Florian Fainelli wrote: > On 01/17/2017 03:23 PM, Russell King - ARM Linux wrote: > > On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote: > >> This message is not particularly informative, and is not paired with an > >> identical message when a CPU is brought online. Finally, it slows the > >> CPU hotplug path down, thus allowing less CPU hotplug operations per > >> second. Just remove it. > > > > CPU hotplug isn't a fast operation anyway - it's also fairly disruptive > > in that it uses stop_machine() to halt activity everywhere while taking > > the CPU offline. > > We have a test that consists in shutting down all CPUs as frequently as > we can and do this for about 2 million iterations which takes roughly > 24h, and this printk slows thing down by a reasonable amount. Here are > some numbers on 500 hotplug operations: > > w/ printk: > real 0m9.997s > user 0m0.725s > sys 0m3.030s > # > > w/o printk: > real 0m8.547s > user 0m0.436s > sys 0m1.838s I am curious that a single printk() might make such a big difference. One reason might be that the messages are pushed to a "slow" console. Another reason might be that there are many other messages printed on the system and there is a contention on logbuf_lock or other console related locks. There might be also the opposite problem. The messages are also read by userspace tools that store them into /var/log/messages or systemd logs. If these are the only messages printed to the log and if there is no other activity on the system. Then the waken loggers might make a difference, especially if all CPUs are getting disabled and only one is available at some point. Well, I am not sure what other operations are needed to do the CPU hotplug operation. I cannot judge how the message is useful and if the speed up is worth removing it. Best Regards, Petr
On 01/18/2017 01:55 AM, Petr Mladek wrote: > On Tue 2017-01-17 15:39:45, Florian Fainelli wrote: >> On 01/17/2017 03:23 PM, Russell King - ARM Linux wrote: >>> On Tue, Jan 17, 2017 at 03:07:12PM -0800, Florian Fainelli wrote: >>>> This message is not particularly informative, and is not paired with an >>>> identical message when a CPU is brought online. Finally, it slows the >>>> CPU hotplug path down, thus allowing less CPU hotplug operations per >>>> second. Just remove it. >>> >>> CPU hotplug isn't a fast operation anyway - it's also fairly disruptive >>> in that it uses stop_machine() to halt activity everywhere while taking >>> the CPU offline. >> >> We have a test that consists in shutting down all CPUs as frequently as >> we can and do this for about 2 million iterations which takes roughly >> 24h, and this printk slows thing down by a reasonable amount. Here are >> some numbers on 500 hotplug operations: >> >> w/ printk: >> real 0m9.997s >> user 0m0.725s >> sys 0m3.030s >> # >> >> w/o printk: >> real 0m8.547s >> user 0m0.436s >> sys 0m1.838s > > I am curious that a single printk() might make such a big difference. It does, because of how printk() is implemented (there is nothing wrong with it, just slow by nature and how the UART gets written to as well). > > One reason might be that the messages are pushed to a "slow" console. 115200 UART, yes that's slow, but not unusual. > > Another reason might be that there are many other messages printed > on the system and there is a contention on logbuf_lock or other > console related locks. The other messages being printed are those from the hotplug script that I run which just checkpoints its running every 50 instances, so it does not occur that often, the console really is not busy, which really extracts the overhead of printing "CPU: shutdown". > > There might be also the opposite problem. The messages are also read > by userspace tools that store them into /var/log/messages or systemd > logs. If these are the only messages printed to the log and if there > is no other activity on the system. Then the waken loggers might make > a difference, especially if all CPUs are getting disabled and only > one is available at some point. There is none of that, no systemd, no syslog, just minimal userspace running. > > Well, I am not sure what other operations are needed to do the > CPU hotplug operation. > > I cannot judge how the message is useful and if the speed up > is worth removing it. > > Best Regards, > Petr >
On Tue, Jan 17, 2017 at 03:39:45PM -0800, Florian Fainelli wrote: > Well, for one it's inconsistent, and it also leaves room for subtle > timing/caching issues with some platforms (OK, maybe not so much). > Improving the speed and consistency was the primary motive. Actually, it's not as inconsistent as you claim. c68b0274fb3c ("ARM: reduce "Booted secondary processor" message to debug level") dropped the corresponding pr_info() in the bringup path to debug level, so to be consistent, this one should be dropped to debug level too. Removing it arguably makes it more inconsistent! So, I'd be willing to accept a patch lowering this to a pr_debug() to make it more consistent with the bringup path, but otherwise the consistency argument doesn't really stand.
On 01/18/2017 02:39 PM, Russell King - ARM Linux wrote: > On Tue, Jan 17, 2017 at 03:39:45PM -0800, Florian Fainelli wrote: >> Well, for one it's inconsistent, and it also leaves room for subtle >> timing/caching issues with some platforms (OK, maybe not so much). >> Improving the speed and consistency was the primary motive. > > Actually, it's not as inconsistent as you claim. > > c68b0274fb3c ("ARM: reduce "Booted secondary processor" message to debug level") > dropped the corresponding pr_info() in the bringup path to debug level, > so to be consistent, this one should be dropped to debug level too. > > Removing it arguably makes it more inconsistent! OK, fair point, I missed that one. > > So, I'd be willing to accept a patch lowering this to a pr_debug() > to make it more consistent with the bringup path, but otherwise the > consistency argument doesn't really stand. Sounds good, thanks!
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 7dd14e8395e6..b3e02751ab64 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -251,7 +251,6 @@ void __cpu_die(unsigned int cpu) pr_err("CPU%u: cpu didn't die\n", cpu); return; } - pr_notice("CPU%u: shutdown\n", cpu); /* * platform_cpu_kill() is generally expected to do the powering off
This message is not particularly informative, and is not paired with an identical message when a CPU is brought online. Finally, it slows the CPU hotplug path down, thus allowing less CPU hotplug operations per second. Just remove it. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> --- arch/arm/kernel/smp.c | 1 - 1 file changed, 1 deletion(-)