From patchwork Wed Feb 20 13:33:04 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 2167451 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) by patchwork1.kernel.org (Postfix) with ESMTP id E8A613FD4E for ; Wed, 20 Feb 2013 13:36:41 +0000 (UTC) Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.76 #1 (Red Hat Linux)) id 1U89nO-0002xQ-9L; Wed, 20 Feb 2013 13:33:14 +0000 Received: from www.linutronix.de ([62.245.132.108] helo=Galois.linutronix.de) by merlin.infradead.org with esmtps (Exim 4.76 #1 (Red Hat Linux)) id 1U89nL-0002wu-8u for linux-arm-kernel@lists.infradead.org; Wed, 20 Feb 2013 13:33:12 +0000 Received: from localhost ([127.0.0.1]) by Galois.linutronix.de with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1U89nF-0007MR-Hz; Wed, 20 Feb 2013 14:33:05 +0100 Date: Wed, 20 Feb 2013 14:33:04 +0100 (CET) From: Thomas Gleixner To: Jason Liu Subject: Re: too many timer retries happen when do local timer swtich with broadcast timer In-Reply-To: Message-ID: References: User-Agent: Alpine 2.02 (LFD 1266 2009-07-14) MIME-Version: 1.0 X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20130220_083311_519395_4196F7F0 X-CRM114-Status: GOOD ( 17.58 ) X-Spam-Score: -4.9 (----) X-Spam-Report: SpamAssassin version 3.3.2 on merlin.infradead.org summary: Content analysis details: (-4.9 points) pts rule name description ---- ---------------------- -------------------------------------------------- -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, medium trust [62.245.132.108 listed in list.dnswl.org] -0.7 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] Cc: LKML , linux-arm-kernel@lists.infradead.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linux-arm-kernel-bounces@lists.infradead.org Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org On Wed, 20 Feb 2013, Jason Liu wrote: > void arch_idle(void) > { > .... > clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu); > > enter_the_wait_mode(); > > clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu); > } > > when the broadcast timer interrupt arrives(this interrupt just wakeup > the ARM, and ARM has no chance > to handle it since local irq is disabled. In fact it's disabled in > cpu_idle() of arch/arm/kernel/process.c) > > the broadcast timer interrupt will wake up the CPU and run: > > clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, &cpu); -> > tick_broadcast_oneshot_control(...); > -> > tick_program_event(dev->next_event, 1); > -> > tick_dev_program_event(dev, expires, force); > -> > for (i = 0;;) { > int ret = clockevents_program_event(dev, expires, now); > if (!ret || !force) > return ret; > > dev->retries++; > .... > now = ktime_get(); > expires = ktime_add_ns(now, dev->min_delta_ns); > } > clockevents_program_event(dev, expires, now); > > delta = ktime_to_ns(ktime_sub(expires, now)); > > if (delta <= 0) > return -ETIME; > > when the bc timer interrupt arrives, which means the last local timer > expires too. so, > clockevents_program_event will return -ETIME, which will cause the > dev->retries++ > when retry to program the expired timer. > > Even under the worst case, after the re-program the expired timer, > then CPU enter idle > quickly before the re-progam timer expired, it will make system > ping-pang forever, That's nonsense. The timer IPI brings the core out of the deep idle state. So after returning from enter_wait_mode() and after calling clockevents_notify() it returns from arch_idle() to cpu_idle(). In cpu_idle() interrupts are reenabled, so the timer IPI handler is invoked. That calls the event_handler of the per cpu local clockevent device (the one which stops in C3). That ends up in the generic timer code which expires timers and reprograms the local clock event device with the next pending timer. So you cannot go idle again, before the expired timers of this event are handled and their callbacks invoked. Now the reprogramming itself is a different issue. It's necessary as we have no information whether the wakeup was caused by the timer IPI or by something else. We might try to be clever and store the pending IPI in tick_handle_oneshot_broadcast() and avoid the reprogramming in that case. Completely untested patch below. Thanks, tglx Index: linux-2.6/kernel/time/tick-broadcast.c =================================================================== --- linux-2.6.orig/kernel/time/tick-broadcast.c +++ linux-2.6/kernel/time/tick-broadcast.c @@ -29,6 +29,7 @@ static struct tick_device tick_broadcast_device; /* FIXME: Use cpumask_var_t. */ static DECLARE_BITMAP(tick_broadcast_mask, NR_CPUS); +static DECLARE_BITMAP(tick_broadcast_pending, NR_CPUS); static DECLARE_BITMAP(tmpmask, NR_CPUS); static DEFINE_RAW_SPINLOCK(tick_broadcast_lock); static int tick_broadcast_force; @@ -417,9 +418,10 @@ again: /* Find all expired events */ for_each_cpu(cpu, tick_get_broadcast_oneshot_mask()) { td = &per_cpu(tick_cpu_device, cpu); - if (td->evtdev->next_event.tv64 <= now.tv64) + if (td->evtdev->next_event.tv64 <= now.tv64) { cpumask_set_cpu(cpu, to_cpumask(tmpmask)); - else if (td->evtdev->next_event.tv64 < next_event.tv64) + set_bit(cpu, tick_broadcast_pending); + } else if (td->evtdev->next_event.tv64 < next_event.tv64) next_event.tv64 = td->evtdev->next_event.tv64; } @@ -493,7 +495,8 @@ void tick_broadcast_oneshot_control(unsi cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask()); clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT); - if (dev->next_event.tv64 != KTIME_MAX) + if (dev->next_event.tv64 != KTIME_MAX && + !test_and_clear_bit(cpu, tick_broadcast_pending)) tick_program_event(dev->next_event, 1); } }