From patchwork Fri Jul 28 12:54:16 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Boqun Feng
X-Patchwork-Id: 9868713
Date: Fri, 28 Jul 2017 20:54:16 +0800
From: Boqun Feng
To: Jonathan Cameron
Subject: Re: RCU lockup issues when CONFIG_SOFTLOCKUP_DETECTOR=n - any one else seeing this?
Message-ID: <20170728125416.j7gcgvnxgv2gq73u@tardis>
References: <20170726.154540.150558937277891719.davem@davemloft.net>
 <20170726231505.GG3730@linux.vnet.ibm.com>
 <20170726.162200.1904949371593276937.davem@davemloft.net>
 <20170727014214.GH3730@linux.vnet.ibm.com>
 <20170727143400.23e4d2b2@roar.ozlabs.ibm.com>
 <20170727124913.GL3730@linux.vnet.ibm.com>
 <20170727144903.000022a1@huawei.com>
 <20170727173923.000001b2@huawei.com>
 <20170727165245.GD3730@linux.vnet.ibm.com>
 <20170728084411.00001ddb@huawei.com>
Content-Disposition: inline
In-Reply-To: <20170728084411.00001ddb@huawei.com>
User-Agent: NeoMutt/20170609 (1.8.3)
Cc: dzickus@redhat.com, sfr@canb.auug.org.au, linuxarm@huawei.com,
 Nicholas Piggin, abdhalee@linux.vnet.ibm.com, sparclinux@vger.kernel.org,
 akpm@linux-foundation.org, "Paul E.
 McKenney", linuxppc-dev@lists.ozlabs.org, David Miller,
 linux-arm-kernel@lists.infradead.org

Hi Jonathan,

FWIW, there is a missed-wakeup issue in swake_up() and swake_up_all():

	https://marc.info/?l=linux-kernel&m=149750022019663

and RCU began using swait/wake last year, so I thought this could be
relevant.

Could you try the following patch and see if it works? Thanks.

Regards,
Boqun

------------------>8
Subject: [PATCH] swait: Remove the lockless swait_active() check in swake_up*()

Steven Rostedt reported a potential race in RCU core because of
swake_up():

        CPU0                            CPU1
        ----                            ----
                                        __call_rcu_core() {

                                         spin_lock(rnp_root)
                                         need_wake = __rcu_start_gp() {
                                          rcu_start_gp_advanced() {
                                           gp_flags = FLAG_INIT
                                          }
                                         }

 rcu_gp_kthread() {
   swait_event_interruptible(wq,
        gp_flags & FLAG_INIT) {
   spin_lock(q->lock)

                                        *fetch wq->task_list here! *

   list_add(wq->task_list, q->task_list)
   spin_unlock(q->lock);

   *fetch old value of gp_flags here *

                                         spin_unlock(rnp_root)

                                         rcu_gp_kthread_wake() {
                                          swake_up(wq) {
                                           swait_active(wq) {
                                            list_empty(wq->task_list)

                                           } * return false *

  if (condition) * false *
    schedule();

In this case, a wakeup is missed, which could cause the rcu_gp_kthread
to wait for a long time.

The reason is that we do a lockless swait_active() check in swake_up().
To fix this, we can either 1) add a smp_mb() in swake_up() before
swait_active() to provide the proper ordering or 2) simply remove the
swait_active() check in swake_up().

Solution 2 not only fixes this problem but also keeps the swait and
wait APIs as close as possible, as wake_up() doesn't provide a full
barrier and doesn't do a lockless check of the wait queue either.
Moreover, there are users already using swait_active() to do their quick
checks for the wait queues, so it makes less sense for swake_up() and
swake_up_all() to do this on their own.

This patch then removes the lockless swait_active() check in swake_up()
and swake_up_all().

Reported-by: Steven Rostedt
Signed-off-by: Boqun Feng
Acked-by: Paul E. McKenney
Tested-by: Paul E. McKenney
---
 kernel/sched/swait.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index 3d5610dcce11..2227e183e202 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
 {
 	unsigned long flags;
 
-	if (!swait_active(q))
-		return;
-
 	raw_spin_lock_irqsave(&q->lock, flags);
 	swake_up_locked(q);
 	raw_spin_unlock_irqrestore(&q->lock, flags);
@@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
 	struct swait_queue *curr;
 	LIST_HEAD(tmp);
 
-	if (!swait_active(q))
-		return;
-
 	raw_spin_lock_irq(&q->lock);
 	list_splice_init(&q->task_list, &tmp);
 	while (!list_empty(&tmp)) {
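
For readers who want to step through the interleaving from the commit
message, here is a rough single-threaded replay in plain userspace C.
It is only a sketch, not kernel code: every name in it (toy_state,
toy_swake_up_lockless(), toy_swake_up_locked(), toy_replay(),
toy_selfcheck()) is made up for illustration, and the reordering is
modelled by hand by capturing each side's "early" read in a local
variable, which is an assumption about how the race plays out rather
than something the hardware is being asked to reproduce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_state {
	bool gp_flag;       /* stands in for gp_flags & FLAG_INIT */
	bool waiter_queued; /* stands in for !list_empty(&q->task_list) */
	bool waiter_woken;  /* set when the waker really wakes the waiter */
};

/* Waker with the lockless early-out that the patch removes. */
void toy_swake_up_lockless(struct toy_state *s, bool stale_list_view)
{
	if (!stale_list_view)	/* swait_active() saw an empty list */
		return;
	if (s->waiter_queued)
		s->waiter_woken = true;
}

/* Waker after the patch: the queue is only inspected "under the lock". */
void toy_swake_up_locked(struct toy_state *s)
{
	if (s->waiter_queued)
		s->waiter_woken = true;
}

/*
 * Replay the interleaving once. Returns true if the waiter goes to
 * sleep and nobody wakes it, i.e. the wakeup was missed.
 */
bool toy_replay(bool use_lockless_check)
{
	struct toy_state s = { false, false, false };

	/* Waker: its load of the task list is satisfied early (empty). */
	bool waker_list_view = s.waiter_queued;

	/* Waiter: enqueue itself, then read the old condition value. */
	s.waiter_queued = true;
	bool waiter_cond_view = s.gp_flag;

	/* Waker: the store to the flag becomes visible only now. */
	s.gp_flag = true;

	/* Waker: swake_up(). */
	if (use_lockless_check)
		toy_swake_up_lockless(&s, waker_list_view);
	else
		toy_swake_up_locked(&s);

	/* Waiter: condition looked false, so it would call schedule(). */
	bool sleeps = !waiter_cond_view;
	return sleeps && !s.waiter_woken;
}

/* The lockless variant loses the wakeup; the locked variant does not. */
void toy_selfcheck(void)
{
	assert(toy_replay(true));
	assert(!toy_replay(false));
}
```

Calling toy_selfcheck() from a small main() exercises both variants. The
model covers just this one interleaving, so it shows why dropping the
early-out closes this particular window; it says nothing about the
smp_mb() alternative or about other orderings.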