diff mbox

perf: Use raw_smp_processor_id insted of smp_processor_id

Message ID 1347450610-4342-1-git-send-email-rogerq@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Roger Quadros Sept. 12, 2012, 11:50 a.m. UTC
gets rid of below messages with CONFIG_DEBUG_PREEMPT enabled

[   28.832916] debug_smp_processor_id: 18 callbacks suppressed
[   28.832946] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/1763
[   28.841491] caller is pwrdm_set_next_pwrst+0x54/0x120

Tested with perf on OMAP4 Panda.

Signed-off-by: Roger Quadros <rogerq@ti.com>
---
 arch/arm/mach-omap2/clock.c       |    6 +++---
 arch/arm/mach-omap2/pm34xx.c      |   11 +++++++----
 arch/arm/mach-omap2/powerdomain.c |    4 ++--
 3 files changed, 12 insertions(+), 9 deletions(-)

Comments

Russell King - ARM Linux Sept. 12, 2012, 12:05 p.m. UTC | #1
On Wed, Sep 12, 2012 at 02:50:10PM +0300, Roger Quadros wrote:
> gets rid of below messages with CONFIG_DEBUG_PREEMPT enabled
> 
> [   28.832916] debug_smp_processor_id: 18 callbacks suppressed
> [   28.832946] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/1763
> [   28.841491] caller is pwrdm_set_next_pwrst+0x54/0x120
> 
> Tested with perf on OMAP4 Panda.

NAK.  Using a different function which doesn't have the warning isn't a
subsitute for fixing the problem properly.  What you're doing is papering
over the bug, making the warning go away without properly understanding
the problem.

The warning is there because something is being done wrong.  What that is
is exactly what is being said in the warning message.  You're getting a
CPU number in a context where preemptions can occur - and therefore the
CPU that you're running on can change.

Think about this sequence:

	- cpu = smp_processor_id(); /* returns 0 */
	- you are preempted
	- you resume on CPU 1
	- trace_clock_disable(clk->name, 0, 0);

If trace_clock_disable() uses the CPU number to access per-CPU data
without locking, that's going to cause corruption.

Please, if you're going to fix a warning, analyse it properly first and
don't just search for a function which appears to give you the same
functionality but without the warning message.
Stephen Boyd Sept. 12, 2012, 6:44 p.m. UTC | #2
On 09/12/12 05:05, Russell King - ARM Linux wrote:
> NAK.  Using a different function which doesn't have the warning isn't a
> subsitute for fixing the problem properly.  What you're doing is papering
> over the bug, making the warning go away without properly understanding
> the problem.
>
> The warning is there because something is being done wrong.  What that is
> is exactly what is being said in the warning message.  You're getting a
> CPU number in a context where preemptions can occur - and therefore the
> CPU that you're running on can change.
>
> Think about this sequence:
>
> 	- cpu = smp_processor_id(); /* returns 0 */
> 	- you are preempted
> 	- you resume on CPU 1
> 	- trace_clock_disable(clk->name, 0, 0);
>
> If trace_clock_disable() uses the CPU number to access per-CPU data
> without locking, that's going to cause corruption.
>
> Please, if you're going to fix a warning, analyse it properly first and
> don't just search for a function which appears to give you the same
> functionality but without the warning message.

Is anyone actually using the CPU field in this tracepoint? I don't see
any usecase for it except for the case where you have a percpu clock,
but even then I imagine the name of the clock would be different
depending on which CPU it corresponds to. So why can't we just remove
the CPU field from the tracepoint?
Russell King - ARM Linux Sept. 12, 2012, 8:52 p.m. UTC | #3
On Wed, Sep 12, 2012 at 11:44:29AM -0700, Stephen Boyd wrote:
> On 09/12/12 05:05, Russell King - ARM Linux wrote:
> > NAK.  Using a different function which doesn't have the warning isn't a
> > subsitute for fixing the problem properly.  What you're doing is papering
> > over the bug, making the warning go away without properly understanding
> > the problem.
> >
> > The warning is there because something is being done wrong.  What that is
> > is exactly what is being said in the warning message.  You're getting a
> > CPU number in a context where preemptions can occur - and therefore the
> > CPU that you're running on can change.
> >
> > Think about this sequence:
> >
> > 	- cpu = smp_processor_id(); /* returns 0 */
> > 	- you are preempted
> > 	- you resume on CPU 1
> > 	- trace_clock_disable(clk->name, 0, 0);
> >
> > If trace_clock_disable() uses the CPU number to access per-CPU data
> > without locking, that's going to cause corruption.
> >
> > Please, if you're going to fix a warning, analyse it properly first and
> > don't just search for a function which appears to give you the same
> > functionality but without the warning message.
> 
> Is anyone actually using the CPU field in this tracepoint?

No idea, but for the point I'm raising, that's rather irrelevant... we
need to nip these things in the bud before they become more common
through allowing the misunderstanding to propagate.

> I don't see
> any usecase for it except for the case where you have a percpu clock,

Which actually makes it all the more important to get the right CPU
number, and to have atomicity between reading the CPU number and reading
the percpu clock... without that the results may well be meaningless.

If the use of a CPU number can be eliminated, then it should be.  If not,
the proper fix for this is to use interfaces which prevent pre-emption
occuring during the critical region - so get_cpu() and put_cpu().
diff mbox

Patch

diff --git a/arch/arm/mach-omap2/clock.c b/arch/arm/mach-omap2/clock.c
index ea3f565..2765acd 100644
--- a/arch/arm/mach-omap2/clock.c
+++ b/arch/arm/mach-omap2/clock.c
@@ -285,7 +285,7 @@  void omap2_clk_disable(struct clk *clk)
 	pr_debug("clock: %s: disabling in hardware\n", clk->name);
 
 	if (clk->ops && clk->ops->disable) {
-		trace_clock_disable(clk->name, 0, smp_processor_id());
+		trace_clock_disable(clk->name, 0, raw_smp_processor_id());
 		clk->ops->disable(clk);
 	}
 
@@ -339,7 +339,7 @@  int omap2_clk_enable(struct clk *clk)
 	}
 
 	if (clk->ops && clk->ops->enable) {
-		trace_clock_enable(clk->name, 1, smp_processor_id());
+		trace_clock_enable(clk->name, 1, raw_smp_processor_id());
 		ret = clk->ops->enable(clk);
 		if (ret) {
 			WARN(1, "clock: %s: could not enable: %d\n",
@@ -380,7 +380,7 @@  int omap2_clk_set_rate(struct clk *clk, unsigned long rate)
 
 	/* dpll_ck, core_ck, virt_prcm_set; plus all clksel clocks */
 	if (clk->set_rate) {
-		trace_clock_set_rate(clk->name, rate, smp_processor_id());
+		trace_clock_set_rate(clk->name, rate, raw_smp_processor_id());
 		ret = clk->set_rate(clk, rate);
 	}
 
diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index 05bd8f0..d384068 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -346,18 +346,21 @@  void omap_sram_idle(void)
 
 static void omap3_pm_idle(void)
 {
+	int cpu;
+
 	local_fiq_disable();
 
 	if (omap_irq_pending())
 		goto out;
 
-	trace_power_start(POWER_CSTATE, 1, smp_processor_id());
-	trace_cpu_idle(1, smp_processor_id());
+	cpu = raw_smp_processor_id();
+	trace_power_start(POWER_CSTATE, 1, cpu);
+	trace_cpu_idle(1, cpu);
 
 	omap_sram_idle();
 
-	trace_power_end(smp_processor_id());
-	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+	trace_power_end(cpu);
+	trace_cpu_idle(PWR_EVENT_EXIT, cpu);
 
 out:
 	local_fiq_enable();
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 69b36e1..f1d0d69 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -169,7 +169,7 @@  static int _pwrdm_state_switch(struct powerdomain *pwrdm, int flag)
 				       ((state & OMAP_POWERSTATE_MASK) << 8) |
 				       ((prev & OMAP_POWERSTATE_MASK) << 0));
 			trace_power_domain_target(pwrdm->name, trace_state,
-						  smp_processor_id());
+						  raw_smp_processor_id());
 		}
 		break;
 	default:
@@ -491,7 +491,7 @@  int pwrdm_set_next_pwrst(struct powerdomain *pwrdm, u8 pwrst)
 	if (arch_pwrdm && arch_pwrdm->pwrdm_set_next_pwrst) {
 		/* Trace the pwrdm desired target state */
 		trace_power_domain_target(pwrdm->name, pwrst,
-					  smp_processor_id());
+					  raw_smp_processor_id());
 		/* Program the pwrdm desired target state */
 		ret = arch_pwrdm->pwrdm_set_next_pwrst(pwrdm, pwrst);
 	}