From patchwork Wed Sep 11 10:21:45 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Srivatsa S. Bhat"
X-Patchwork-Id: 2872031
Message-ID: <52304439.3030301@linux.vnet.ibm.com>
Date: Wed, 11 Sep 2013 15:51:45 +0530
From: "Srivatsa S. Bhat"
To: "Rafael J. Wysocki"
CC: Stephen Warren, Viresh Kumar, "linux-pm@vger.kernel.org",
 "linux-kernel@vger.kernel.org", cpufreq
Subject: Re: cpufreq_stats NULL deref on second system suspend
References: <522E1FEF.6080803@wwwdotorg.org> <1775778.MeiRhuYy7o@vostro.rjw.lan>
 <522F86AD.6010603@wwwdotorg.org> <2521560.SfeNbV74nj@vostro.rjw.lan>
In-Reply-To: <2521560.SfeNbV74nj@vostro.rjw.lan>
X-Mailing-List: linux-pm@vger.kernel.org

On 09/11/2013 04:04 AM, Rafael J. Wysocki wrote:
> On Tuesday, September 10, 2013 02:53:01 PM Stephen Warren wrote:
>> On 09/09/2013 05:14 PM, Rafael J. Wysocki wrote:
>>> On Monday, September 09, 2013 03:29:06 PM Stephen Warren wrote:
>>>> On 09/09/2013 02:24 PM, Rafael J. Wysocki wrote:
>>>>> On Monday, September 09, 2013 02:01:32 PM Stephen Warren wrote:
>>>>>> On 09/09/2013 02:01 PM, Rafael J. Wysocki wrote:
>>>>>>> On Monday, September 09, 2013 01:22:23 PM Stephen Warren wrote:
>>>>>>>> Viresh,
>>>>>>>>
>>>>>>>> I'm seeing the crash below when suspending my system for the second time.
>>>>>>>>
>>>>>>>> I can avoid this with the following patch, which adds a check which
>>>>>>>> already exists in all-but-one other places that the same lookup is made:
>>>>>>>
>>>>>>> Which kernel did you test?
>>>>>>
>>>>>> next-20130909.
>>>>>
>>>>> Is it reproducible with the current mainline?
>>>>
>>>> This does not affect v3.11, but does affect current HEAD; 300893b "Merge
>>>> tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfs".
>>>
>>> What system does it break on?
>>
>> A dual-core ARM system (NVIDIA Tegra20 SoC, Harmony board).
>>
>>> Any chance to bisect cpufreq changes between 3.11 and the current HEAD?
>>
>> Sure, it's due to 5302c3f "cpufreq: Perform light-weight init/teardown
>> during suspend/resume".
>
> Thanks!
>
> Srivatsa, any chance to look into this?
>

Sure, Rafael. Thanks for CC'ing me.

Stephen, I went through the code and I think I found out what is going wrong.
Can you please try the following patch?

Regards,
Srivatsa S. Bhat

----------------------------------------------------------------------------

From: Srivatsa S. Bhat
Subject: [PATCH] cpufreq: Fix crash in cpufreq-stats during suspend/resume

Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
dereference during the second attempt to suspend a system. He also
pin-pointed the problem to commit 5302c3f "cpufreq: Perform light-weight
init/teardown during suspend/resume".
That commit actually ensured that the cpufreq-stats table and the
cpufreq-stats sysfs entries are *not* torn down (i.e., not freed) during
suspend/resume, which makes it all the more surprising. However, it turns
out that the root cause is not that we access already-freed memory, but
that the reference to the allocated memory gets moved around and we lose
track of it during resume, leading to the reported crash in a subsequent
suspend attempt.

In the suspend path, during CPU offline, the value of policy->cpu is updated
by choosing one of the surviving CPUs in that policy, as long as there is
at least one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
invoked to update the reference to the stats structure by assigning it to
the new CPU. However, in the resume path, during CPU online, we end up
assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats know
about this. Thus the reference to the stats structure remains (incorrectly)
associated with the old CPU. So, in a subsequent suspend attempt, during CPU
offline, we end up accessing an incorrect location to get the stats
structure, which eventually leads to the NULL pointer dereference.

Fix this by letting cpufreq-stats know about the update of the policy->cpu
during CPU online in the resume path. (Also, move the update_policy_cpu()
function higher up in the file, so that __cpufreq_add_dev() can invoke it).

Reported-by: Stephen Warren
Signed-off-by: Srivatsa S. Bhat
---
 drivers/cpufreq/cpufreq.c | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 5a64f66..62bdb95 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -947,6 +947,18 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy)
 	kfree(policy);
 }
 
+static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu)
+{
+	policy->last_cpu = policy->cpu;
+	policy->cpu = cpu;
+
+#ifdef CONFIG_CPU_FREQ_TABLE
+	cpufreq_frequency_table_update_policy_cpu(policy);
+#endif
+	blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
+			CPUFREQ_UPDATE_POLICY_CPU, policy);
+}
+
 static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
 			     bool frozen)
 {
@@ -1000,7 +1012,18 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif,
 	if (!policy)
 		goto nomem_out;
 
-	policy->cpu = cpu;
+
+	/*
+	 * In the resume path, since we restore a saved policy, the assignment
+	 * to policy->cpu is like an update of the existing policy, rather than
+	 * the creation of a brand new one. So we need to perform this update
+	 * by invoking update_policy_cpu().
+	 */
+	if (frozen && cpu != policy->cpu)
+		update_policy_cpu(policy, cpu);
+	else
+		policy->cpu = cpu;
+
 	policy->governor = CPUFREQ_DEFAULT_GOVERNOR;
 	cpumask_copy(policy->cpus, cpumask_of(cpu));
 
@@ -1092,18 +1115,6 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	return __cpufreq_add_dev(dev, sif, false);
 }
 
-static void update_policy_cpu(struct cpufreq_policy *policy, unsigned int cpu)
-{
-	policy->last_cpu = policy->cpu;
-	policy->cpu = cpu;
-
-#ifdef CONFIG_CPU_FREQ_TABLE
-	cpufreq_frequency_table_update_policy_cpu(policy);
-#endif
-	blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
-			CPUFREQ_UPDATE_POLICY_CPU, policy);
-}
-
 static int cpufreq_nominate_new_policy_cpu(struct cpufreq_policy *policy,
 					   unsigned int old_cpu, bool frozen)
 {
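
For anyone who wants to see the failure mode in isolation, here is a rough
user-space sketch of the bookkeeping described in the changelog. It is only a
model, not kernel code: struct stats, struct policy, stats_table[],
stats_follow_policy_cpu() and lookup_after_one_cycle() are hypothetical
stand-ins invented for illustration, and the CPU numbers are arbitrary. The
only thing it is meant to show is that a plain assignment to policy->cpu
leaves the stats reference behind on the old CPU, whereas going through
update_policy_cpu() moves it along.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2	/* illustrative; any small number works */

/* Hypothetical stand-ins, NOT the real kernel definitions. */
struct stats { unsigned int total_trans; };
struct policy { unsigned int cpu, last_cpu; };

/* One stats reference slot per CPU, looked up through policy->cpu. */
static struct stats *stats_table[NR_CPUS];

/* Models the stats side reacting to a policy->cpu change notification. */
static void stats_follow_policy_cpu(const struct policy *p)
{
	if (p->cpu == p->last_cpu)
		return;
	stats_table[p->cpu] = stats_table[p->last_cpu];
	stats_table[p->last_cpu] = NULL;
}

/* Models update_policy_cpu(): change the owning CPU and notify stats. */
static void update_policy_cpu(struct policy *p, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	p->last_cpu = p->cpu;
	p->cpu = cpu;
	stats_follow_policy_cpu(p);
}

/*
 * Run one suspend/resume cycle and return the stats reference that a
 * second suspend would find through policy->cpu.
 * notify_on_resume == 0 models the plain assignment, 1 models the fix.
 */
struct stats *lookup_after_one_cycle(int notify_on_resume)
{
	static struct stats st;
	struct policy p = { .cpu = 0, .last_cpu = 0 };
	unsigned int i;

	for (i = 0; i < NR_CPUS; i++)
		stats_table[i] = NULL;
	stats_table[p.cpu] = &st;

	/* Suspend: policy->cpu goes offline, a surviving CPU takes over. */
	update_policy_cpu(&p, 1);

	/* Resume: a fresh CPU becomes policy->cpu again. */
	if (notify_on_resume)
		update_policy_cpu(&p, 0);
	else
		p.cpu = 0;	/* stats are never told about the move */

	return stats_table[p.cpu];
}
```

In this model lookup_after_one_cycle(0) returns NULL, which corresponds to
the pointer that the second suspend ends up dereferencing, while
lookup_after_one_cycle(1) returns the original stats structure.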