From patchwork Wed Nov 2 05:07:07 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Norris X-Patchwork-Id: 9408475 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 2BA0A60234 for ; Wed, 2 Nov 2016 05:07:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1BA9D29C07 for ; Wed, 2 Nov 2016 05:07:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 105F429DB4; Wed, 2 Nov 2016 05:07:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3212829C07 for ; Wed, 2 Nov 2016 05:07:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751987AbcKBFHM (ORCPT ); Wed, 2 Nov 2016 01:07:12 -0400 Received: from mail-pf0-f181.google.com ([209.85.192.181]:36161 "EHLO mail-pf0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751936AbcKBFHL (ORCPT ); Wed, 2 Nov 2016 01:07:11 -0400 Received: by mail-pf0-f181.google.com with SMTP id 189so4613802pfz.3 for ; Tue, 01 Nov 2016 22:07:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=A++r+/pTHhj+Ph5QYnNSRGZAzqdT/YleVmaybjh2BBY=; b=UoqmzCD27jaAlbkzIFnjl5gNqyXrLK2lVlXm95nBaHkUys1SXYcTntUQazcq6i/v3y LLyuezgQRSY9RkphlGQgiMvMHgMGZhWzY8P9MVGdI9aDzS5jS+WdoAVlvZn4XNrros2W o2uN0h+XELPphqEWuvjHmt9fWULwhK9o/F29o= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=A++r+/pTHhj+Ph5QYnNSRGZAzqdT/YleVmaybjh2BBY=; b=SCCT3joJ1yWl2YJqt/TrdUjPun3DXwIiEt9NrdfXaMfV4k4gEHPPfcUa80j6ZbeVy2 PjJZ0zdPRc6vliI8cCPvpy89fDRT1baAXkyTZUjOKTVvXvgpKYRkrZc6qt3oMUi1xEe9 Vk6TxAexVQl6LHWM0e2H0p+nJwRkhCgt5erQiVgzBeyyhRHRcJTDGPXKBV5KojbDNzQF +JXJqT58OZaMVwVmH5vPmRvxJjJuzju9QMD8uhvLvm8QSOb+GSNK/YzpDWXX3CKNcIaC tVXMfMWa5aXSybKh/ifDpBwBHsV9Xv2pqhMzVlG2xNqRW49hLh9ntw8QlZ6f0CM0r6fZ EzYA== X-Gm-Message-State: ABUngvfhNl/iEAL4Tgtyhl2iy5mqWIt+KTB8KNHnBLJZaikYXdngTr9QV/2YAPWKRsmRRfoD X-Received: by 10.99.102.69 with SMTP id a66mr2747359pgc.71.1478063230972; Tue, 01 Nov 2016 22:07:10 -0700 (PDT) Received: from google.com ([2620:0:1000:1301:e0d0:af0:86aa:a317]) by smtp.gmail.com with ESMTPSA id z9sm671686pfd.29.2016.11.01.22.07.09 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 01 Nov 2016 22:07:10 -0700 (PDT) Date: Tue, 1 Nov 2016 22:07:07 -0700 From: Brian Norris To: "Rafael J. Wysocki" Cc: Dmitry Torokhov , Pavel Machek , Len Brown , Greg Kroah-Hartman , lkml , Doug Anderson , Brian Norris , Jeffy Chen , "linux-pm@vger.kernel.org" , Chuansheng Liu , Kevin Hilman , Ulf Hansson Subject: Re: [PATCH v2 2/2] PM / sleep: don't suspend parent when async child suspend_{noirq,late} fails Message-ID: <20161102050706.GA49402@google.com> References: <1476923170-111986-2-git-send-email-briannorris@chromium.org> <8017823.VJuZzSqtaY@vostro.rjw.lan> <50971906.K6xak2t6Z6@vostro.rjw.lan> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <50971906.K6xak2t6Z6@vostro.rjw.lan> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-pm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP + more genpd folks On Wed, Nov 02, 2016 at 04:51:08AM +0100, Rafael J. Wysocki wrote: > On Tuesday, November 01, 2016 12:04:28 AM Dmitry Torokhov wrote: > > On Mon, Oct 31, 2016 at 10:25 PM, Rafael J. Wysocki wrote: > > > On Thursday, October 27, 2016 09:05:34 AM Brian Norris wrote: > > >> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c > > >> index c58563581345..eaf6b53463a5 100644 > > >> --- a/drivers/base/power/main.c > > >> +++ b/drivers/base/power/main.c > > >> @@ -1040,6 +1040,9 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a > > >> > > >> dpm_wait_for_children(dev, async); > > >> > > >> + if (async_error) > > >> + goto Complete; > > >> + > > > > > > This is a second chech for async_error in this routine and is the first one > > > really needed after adding this? > > > > There is really no point in waiting for children to be suspended if > > error has already been signalled; that's what first check achieves. > > The 2nd check ensures that we abort suspend if any of the children > > failed to suspend. > > > > I'd say both checks are needed (well, 1st is helpful, 2nd is essential). > > OK, fair enough. Sort of agreed, although I'm still not sure how helpful the 1st one is; kinda serves to complicate things, for little real benefit IMO (you don't save much time by "not waiting" -- either the child quickly notices the same error and complete()'s quickly, or else you're going to wait for that child in the end anyway). I think it's also important to ask why we do this optimization in the {late,noirq} cases, but we don't do this in __device_suspend(). As demonstrated by the $subject bug, I think we would yield fewer bugs by sharing code structure (if not the code itself) among the similar phases. I'm happy for you to take my current patch, of course, but I think some further effort on making this consistent might be warranted. Either put all of these short-circuit checks after the wait_for_children(), or else add the same short-circuit for the missing case (__device_suspend()). i.e., this (untested) patch: --- I can test this and send it in proper form if that looks preferable. P.S. To get slightly off-topic here (but speaking of noirq bugs): I noticed the genpd code has comments like this scattered all over: * This function is only called in "noirq" and "syscore" stages of system power * transitions, so it need not acquire locks (all of the "noirq" callbacks are * executed sequentially, so it is guaranteed that it will never run twice in * parallel). Isn't that no longer true, now that noirq suspend can be asynchronous? Maybe we should grep for the phrase "need not acquire locks" throughout the kernel, in order to find low-hanging fruit for race conditions :) -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c index e44944f4be77..2932a5bd892f 100644 --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1027,6 +1027,8 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a TRACE_DEVICE(dev); TRACE_SUSPEND(0); + dpm_wait_for_children(dev, async); + if (async_error) goto Complete; @@ -1038,8 +1040,6 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a if (dev->power.syscore || dev->power.direct_complete) goto Complete; - dpm_wait_for_children(dev, async); - if (dev->pm_domain) { info = "noirq power domain "; callback = pm_noirq_op(&dev->pm_domain->ops, state); @@ -1174,6 +1174,8 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as __pm_runtime_disable(dev, false); + dpm_wait_for_children(dev, async); + if (async_error) goto Complete; @@ -1185,8 +1187,6 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as if (dev->power.syscore || dev->power.direct_complete) goto Complete; - dpm_wait_for_children(dev, async); - if (dev->pm_domain) { info = "late power domain "; callback = pm_late_early_op(&dev->pm_domain->ops, state);