From patchwork Sun Aug 20 13:00:40 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 9911145 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 2ED2A602B1 for ; Sun, 20 Aug 2017 13:01:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1548F28710 for ; Sun, 20 Aug 2017 13:01:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 08AD7287BB; Sun, 20 Aug 2017 13:01:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.6 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, DKIM_VALID, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6CB2528710 for ; Sun, 20 Aug 2017 13:01:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Subject:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=0aaHolFpPtp5eA9LKXSPvlFuK8XYRqsZFSC4hsATolw=; b=odBDXsoZ5fDo3U toQV4XVoWBJcOtmO/+sP/RVB17eqzwKjxDah4a5W+JetmojrSOMsxB6wVCTNdKSj9mNgi3xmz6g4d reGaFri2Pw0YNiRdMkAdVOTABltiq0dlfkKQbWdeUJB8qNHtOx2jvqAxs0q817nPIYumiajTTUrJA BkvOHJCNv0PhOaZmVTJ8KzuuxQYGSSlHNaxBYNUwKUcX/ySc/AxF+01Gj3CuQBF+h/KCDxPeVBr3k gQm32X0bFflUxw3A6EjIyCG7k0PkW8n2CbqeoQZabSAoJFzvb/D0IR4d6n8MgOOipyx7WU+3E1zra zHU051x9FnCDl3Lzzcsw==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1djPr2-0001ot-NJ; Sun, 20 Aug 2017 13:01:24 +0000 Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]) by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux)) id 1djPqy-0001aZ-61 for linux-arm-kernel@lists.infradead.org; Sun, 20 Aug 2017 13:01:22 +0000 Received: by mail-pf0-x242.google.com with SMTP id z3so2363794pfk.0 for ; Sun, 20 Aug 2017 06:00:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:in-reply-to:references :organization:mime-version:content-transfer-encoding; bh=vn7ZTU8PxMse5xvagc+b+oK8WPXv2HpYfa69+b0TDtg=; b=tmiylAp3E0G/9WaL2EHhB4KeCOLCsMGDZ2kDxloVb403SofYO8Xz9Y/yO6xUihX9ct lcRac3Rnv5WWqmOHN5o/BFnU7YRpyhLocm2gPrSCMLOczlEW4Lku0rCHbpsK6x13zZVf +W5pQE5zqnUAkuF4h3PZam/+BRzEkRAcDn22r97OSaft0S0K96nJz+jFKkNQugKKz0k3 0V2HzNXK1pVTwpP7uwFmEGAKy4OuSyMAO5UdRGRhwCL0h+LHZjFND9+1ArTbt4clk6/E tIig1voP5xaHMyq1l/CRIH4y7Euj7iw0nw31doDyV4ZLbWD1EMe9C7ibHoRtdOtXS5xa NNcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:organization:mime-version:content-transfer-encoding; bh=vn7ZTU8PxMse5xvagc+b+oK8WPXv2HpYfa69+b0TDtg=; b=HKGCZbeYDwa1axEpH2uBsmfYzzgrh7SESyBKxCdckaDbvM1V/wSAmhVc/XJJpAmWS7 +c5O2W/Rl6b7ntPK25OMqokdupoFuJWfXrBHc2/p1gPVJIMfsJS1qE5dGyRUXud5qYm6 8ObbqCUpCr4GiqPIWHnxCRJJOgTBAILfK4DgacYgzr4mjCp0ZoouBzAYGlIyUXpyG/2O 3ZmrQzBIUDs1H8d2NU/ffCWqLiU2crWS+9HbRTGaM5BubRtGhdrCMj5KfumcuZrtpRv9 S/0HU9PF3cRmXvMLt6X7DEgFEbrfy4HBqD4H7IJFa3bKZtB5xScHrKhEiz0MHE702nOz HBZw== X-Gm-Message-State: AHYfb5jDxlCnFMu+Do34kX5YW19YZTa0S4XD+NVXHniRUkW3aK2NCfNM qWvn+kgEF63fDg== X-Received: by 10.98.9.90 with SMTP id e87mr7002649pfd.138.1503234058474; Sun, 20 Aug 2017 06:00:58 -0700 (PDT) Received: from roar.ozlabs.ibm.com (203-219-56-202.tpgi.com.au. [203.219.56.202]) by smtp.gmail.com with ESMTPSA id f15sm21203802pga.5.2017.08.20.06.00.51 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sun, 20 Aug 2017 06:00:57 -0700 (PDT) Date: Sun, 20 Aug 2017 23:00:40 +1000 From: Nicholas Piggin To: "Paul E. McKenney" Subject: Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this? Message-ID: <20170820230040.706b62ac@roar.ozlabs.ibm.com> In-Reply-To: <20170820144553.2ab2727b@ppc64le> References: <20170728182053.000072aa@huawei.com> <20170728190349.GM3730@linux.vnet.ibm.com> <20170731120847.00003d5c@huawei.com> <20170731150411.GA3730@linux.vnet.ibm.com> <20170731162757.000058ba@huawei.com> <20170801184646.GE3730@linux.vnet.ibm.com> <20170802172555.0000468a@huawei.com> <20170815154743.GK7017@linux.vnet.ibm.com> <87wp63smwn.fsf@concordia.ellerman.id.au> <20170816125617.GY7017@linux.vnet.ibm.com> <20170816162731.GA22978@linux.vnet.ibm.com> <20170820144553.2ab2727b@ppc64le> Organization: IBM X-Mailer: Claws Mail 3.15.0-dirty (GTK+ 2.24.31; x86_64-pc-linux-gnu) MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20170820_060120_268538_89A1D9E2 X-CRM114-Status: GOOD ( 27.26 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dzickus@redhat.com, sfr@canb.auug.org.au, Michael Ellerman , linuxarm@huawei.com, David Miller , abdhalee@linux.vnet.ibm.com, john.stultz@linaro.org, Jonathan Cameron , sparclinux@vger.kernel.org, tglx@linutronix.de, linuxppc-dev@lists.ozlabs.org, akpm@linux-foundation.org, linux-arm-kernel@lists.infradead.org Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP On Sun, 20 Aug 2017 14:45:53 +1000 Nicholas Piggin wrote: > On Wed, 16 Aug 2017 09:27:31 -0700 > "Paul E. McKenney" wrote: > > On Wed, Aug 16, 2017 at 05:56:17AM -0700, Paul E. McKenney wrote: > > > > Thomas, John, am I misinterpreting the timer trace event messages? > > So I did some digging, and what you find is that rcu_sched seems to do a > simple scheudle_timeout(1) and just goes out to lunch for many seconds. > The process_timeout timer never fires (when it finally does wake after > one of these events, it usually removes the timer with del_timer_sync). > > So this patch seems to fix it. Testing, comments welcome. Okay this had a problem of trying to forward the timer from a timer callback function. This was my other approach which also fixes the RCU warnings, but it's a little more complex. I reworked it a bit so the mod_timer fast path hopefully doesn't have much more overhead (actually by reading jiffies only when needed, it probably saves a load). Thanks, Nick --- [PATCH] timers: Fix excessive granularity of new timers after a nohz idle When a timer base is idle, it is forwarded when a new timer is added to ensure that granularity does not become excessive. When not idle, the timer tick is expected to increment the base. However there is a window after a timer is restarted from nohz, when it is marked not-idle, and before the timer tick on this CPU, where a timer may be added on an ancient base that does not get forwarded (beacause the timer appears not-idle). This results in excessive granularity. So much so that a 1 jiffy timeout has blown out to 10s of seconds and triggered the RCU stall warning detector. Fix this by keeping track of whether the timer has been idle since it was last run or forwarded, and allow forwarding in the case that is true (even if it is not currently idle). Also add a comment noting a case where we could get an unexpectedly large granularity for a timer. I debugged this problem by adding warnings for such cases, but it seems we can't add them in general due to this corner case. Signed-off-by: Nicholas Piggin --- kernel/time/timer.c | 32 +++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 8f5d1bf18854..ee7b8b688b48 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -203,6 +203,7 @@ struct timer_base { bool migration_enabled; bool nohz_active; bool is_idle; + bool was_idle; DECLARE_BITMAP(pending_map, WHEEL_SIZE); struct hlist_head vectors[WHEEL_SIZE]; } ____cacheline_aligned; @@ -856,13 +857,19 @@ get_target_base(struct timer_base *base, unsigned tflags) static inline void forward_timer_base(struct timer_base *base) { - unsigned long jnow = READ_ONCE(jiffies); + unsigned long jnow; /* - * We only forward the base when it's idle and we have a delta between - * base clock and jiffies. + * We only forward the base when we are idle or have just come out + * of idle (was_idle logic), and have a delta between base clock + * and jiffies. In the common case, run_timers will take care of it. */ - if (!base->is_idle || (long) (jnow - base->clk) < 2) + if (likely(!base->was_idle)) + return; + + jnow = READ_ONCE(jiffies); + base->was_idle = base->is_idle; + if ((long)(jnow - base->clk) < 2) return; /* @@ -938,6 +945,13 @@ __mod_timer(struct timer_list *timer, unsigned long expires, bool pending_only) * same array bucket then just return: */ if (timer_pending(timer)) { + /* + * The downside of this optimization is that it can result in + * larger granularity than you would get from adding a new + * timer with this expiry. Would a timer flag for networking + * be appropriate, then we can try to keep expiry of general + * timers within ~1/8th of their interval? + */ if (timer->expires == expires) return 1; @@ -1499,8 +1513,10 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) /* * If we expect to sleep more than a tick, mark the base idle: */ - if ((expires - basem) > TICK_NSEC) + if ((expires - basem) > TICK_NSEC) { + base->was_idle = true; base->is_idle = true; + } } raw_spin_unlock(&base->lock); @@ -1587,6 +1603,12 @@ static inline void __run_timers(struct timer_base *base) struct hlist_head heads[LVL_DEPTH]; int levels; + /* + * was_idle must be cleared before running timers so that any timer + * functions that call mod_timer will not try to forward the base. + */ + base->was_idle = false; + if (!time_after_eq(jiffies, base->clk)) return;