From patchwork Mon Jun 17 16:15:35 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Lezcano X-Patchwork-Id: 2734821 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id D3B07C0AB1 for ; Mon, 17 Jun 2013 16:16:13 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 1908620445 for ; Mon, 17 Jun 2013 16:16:09 +0000 (UTC) Received: from casper.infradead.org (casper.infradead.org [85.118.1.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 286A520443 for ; Mon, 17 Jun 2013 16:16:04 +0000 (UTC) Received: from merlin.infradead.org ([2001:4978:20e::2]) by casper.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1Uoc64-0003a7-Uk; Mon, 17 Jun 2013 16:16:01 +0000 Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1Uoc62-0005QF-2U; Mon, 17 Jun 2013 16:15:58 +0000 Received: from mail-wi0-x235.google.com ([2a00:1450:400c:c05::235]) by merlin.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1Uoc5y-0005Ph-C4 for linux-arm-kernel@lists.infradead.org; Mon, 17 Jun 2013 16:15:56 +0000 Received: by mail-wi0-f181.google.com with SMTP id hq4so2352192wib.2 for ; Mon, 17 Jun 2013 09:15:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:date:message-id:x-mailer:x-gm-message-state; bh=jVfYBTLK2g682xbhDDXm4jfUUsADHjoToWGgtXt5iMQ=; b=mbbO5+dTAmR/h0mdSZ1cbUo7rLNwcoRgiecV96r3mzpANnqwWs8+MUYXDzi2ygNotZ IUXPPT2meLjnLyc8SB+fhkRqFMyMSMmnpAm71G4MqqD4OhbSXwwqL8cIsB40ii81Q1Sf 4R740fDvVeq9i6gFGam/6uY9OY6TNCnI+KZPJfAPWhkj++5s+A+jcpHTxHx0t5BzoUwf D2E8DvhhyTD6StMf8nwcLT65hbNZyTNhCaNPTWqETMEtcyEwL21ibQ1TBzP0dcs0nd+x 5quMzusIMPFg5LfmIrVDjJlygryoylftKGzgcN00fo/HUQMvuWTf0tUCoiGkuxIaNGYP vGJA== X-Received: by 10.180.93.136 with SMTP id cu8mr5333503wib.49.1371485730346; Mon, 17 Jun 2013 09:15:30 -0700 (PDT) Received: from mai.home (AToulouse-654-1-455-222.w83-205.abo.wanadoo.fr. [83.205.78.222]) by mx.google.com with ESMTPSA id en3sm23083909wid.1.2013.06.17.09.15.28 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 17 Jun 2013 09:15:29 -0700 (PDT) From: Daniel Lezcano To: tglx@linutronix.de Subject: [PATCH] tick: fix tick_broadcast_pending_mask not cleared Date: Mon, 17 Jun 2013 18:15:35 +0200 Message-Id: <1371485735-31249-1-git-send-email-daniel.lezcano@linaro.org> X-Mailer: git-send-email 1.7.9.5 X-Gm-Message-State: ALoCoQmQ8F7mpLGXtrpAcExrG6M19mdA4rrr0Fq/m4SyAXpJ0h1a9v5VFidFTGku3TyoXpeCvxrg X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20130617_121554_675474_FAC13580 X-CRM114-Status: GOOD ( 19.01 ) X-Spam-Score: -1.9 (-) Cc: linaro-kernel@lists.linaro.org, linux-pm@vger.kernel.org, swarren@wwwdotorg.org, josephl@nvidia.com, linux-tegra@vger.kernel.org, patche@linaro.org, linux-arm-kernel@lists.infradead.org X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-5.3 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The recent modification in the cpuidle framework consolidated the timer broadcast code across the different drivers by setting a new flag in the idle state. It tells the cpuidle core code to enter/exit to the broadcast mode for the cpu when entering a deep idle state. The broadcast timer enter/exit is no longer handled by the back-end driver. This change made the local interrupt to be enabled *before* calling CLOCK_EVENT_NOTIFY_EXIT. bad or not (see below) ? On a tegra114, a four cores system, when the flag has been introduced in the driver, the following warning appeared: [ 25.629559] WARNING: CPU: 2 PID: 0 at kernel/time/tick-broadcast.c:578 tick_broadcast_oneshot_control +0x1a4/0x1d0() [ 25.629565] Modules linked in: [ 25.629574] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-rc3-next-20130529+ #15 [ 25.629601] [] (unwind_backtrace+0x0/0xf8) from [] (show_stack+0x10/0x14) [ 25.629616] [] (show_stack+0x10/0x14) from [] (dump_stack+0x80/0xc4) [ 25.629633] [] (dump_stack+0x80/0xc4) from [] (warn_slowpath_common+0x64/0x88) [ 25.629646] [] (warn_slowpath_common+0x64/0x88) from [] (warn_slowpath_null+0x1c/0x24) [ 25.629657] [] (warn_slowpath_null+0x1c/0x24) from [] (tick_broadcast_oneshot_control+0x1a4/0x1d0) [ 25.629670] [] (tick_broadcast_oneshot_control+0x1a4/0x1d0) from [] (tick_notify+0x240/0x40c) [ 25.629685] [] (tick_notify+0x240/0x40c) from [] (notifier_call_chain+0x44/0x84) [ 25.629699] [] (notifier_call_chain+0x44/0x84) from [] (raw_notifier_call_chain+0x18/0x20) [ 25.629712] [] (raw_notifier_call_chain+0x18/0x20) from [] (clockevents_notify+0x28/0x170) [ 25.629726] [] (clockevents_notify+0x28/0x170) from [] (cpuidle_idle_call+0x11c/0x168) [ 25.629739] [] (cpuidle_idle_call+0x11c/0x168) from [] (arch_cpu_idle+0x8/0x38) [ 25.629755] [] (arch_cpu_idle+0x8/0x38) from [] (cpu_startup_entry+0x60/0x134) [ 25.629767] [] (cpu_startup_entry+0x60/0x134) from [<804fe9a4>] (0x804fe9a4) [ 25.629772] ---[ end trace 5484e77e2531bccc ]--- I don't have the hardware, so I wasn't able to reproduce the warning but after looking a while in the code, I deduced the following: 1. the CPU2 enters a deep idle state and sets the broadcast timer 2. the timer expires, the tick_handle_oneshot_broadcast function is called, setting the tick_broadcast_pending_mask and waking up the idle cpu CPU2 3. the CPU2 exits idle and invokes tick_broadcast_oneshot_control with CLOCK_EVENT_NOTIFY_EXIT with the following code: [...] if (dev->next_event.tv64 == KTIME_MAX) goto out; if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_pending_mask)) goto out; [...] 4. if there is no next event planned for CPU2, we fulfil the first condition and we jump to the 'out' section without clearing the tick_broadcast_pending_mask 5. CPU2 goes to deep idle again and calls tick_broadcast_oneshot_control with CLOCK_NOTIFY_EVENT_ENTER but with the tick_broadcast_pending_mask set for CPU2, leading to the WARNING. Above, it is mentionned the change moved the CLOCK_EVENT_NOTIFY_EXIT after the local interrupt were enabled. If it is before, this warning does not occur. My hypothesis is the code path described before does not happen because when a broadcast timer expires, dev->next_event.tv64 is always different from KTIME_MAX because the timer handler did not set the value yet (local interrupt are still disabled). I don't see anywhere in the code, a clockevents_notify(ENTER/EXIT) block must be done with the local interrupt disabled in between, furthermore the function uses 'raw_spin_lock_irqsave' which make me think, we don't care about that. Invert the conditions and make the tick broadcast code immune from the local interrupts context. Signed-off-by: Daniel Lezcano Reported-by: Joseph Lo Tested-by: Joseph Lo --- kernel/time/tick-broadcast.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index d067c01..58d88e8 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -610,8 +610,6 @@ void tick_broadcast_oneshot_control(unsigned long reason) } else { if (cpumask_test_and_clear_cpu(cpu, tick_broadcast_oneshot_mask)) { clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT); - if (dev->next_event.tv64 == KTIME_MAX) - goto out; /* * The cpu which was handling the broadcast * timer marked this cpu in the broadcast @@ -625,6 +623,8 @@ void tick_broadcast_oneshot_control(unsigned long reason) tick_broadcast_pending_mask)) goto out; + if (dev->next_event.tv64 == KTIME_MAX) + goto out; /* * If the pending bit is not set, then we are * either the CPU handling the broadcast