From patchwork Sat Jan 31 00:32:44 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Viresh Kumar
X-Patchwork-Id: 5753661
X-Patchwork-Delegate: rjw@sisk.pl
Return-Path:
X-Original-To: patchwork-linux-pm@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 5AF779F38B
	for ; Sat, 31 Jan 2015 00:33:01 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 77EFF2026D
	for ; Sat, 31 Jan 2015 00:33:00 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 9C096201EF
	for ; Sat, 31 Jan 2015 00:32:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1760347AbbAaAcy (ORCPT ); Fri, 30 Jan 2015 19:32:54 -0500
Received: from mail-pa0-f54.google.com ([209.85.220.54]:57043 "EHLO
	mail-pa0-f54.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1762029AbbAaAcx (ORCPT );
	Fri, 30 Jan 2015 19:32:53 -0500
Received: by mail-pa0-f54.google.com with SMTP id eu11so58419591pac.13
	for ; Fri, 30 Jan 2015 16:32:53 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20130820;
	h=x-gm-message-state:from:to:cc:subject:date:message-id;
	bh=y2BrXK3vM+/NhwRmfouTJfs3a7ZKBv544FXu7ICzyYk=;
	b=fXnd2GHB/pK9i4Io2twiDGdjEbRpD/Rs+00tZ2S8L+v0isuUXLVjSN3NzHLEPzzhkg
	EJZPxzECTCo+dpUfnfURnFqck4MTptsUL62upa8aB6sGhzxcvNj4WYg2AOnncVAorg/q
	aufUAJKHbFpfv0gOzkLoh+Wa9uuyCD0BaIttYoOxSti1Ry84jmLIxjmEFVNU2TT8aEuQ
	ZGL+4iCmmvkAM2hDyIIhk+2xx0i1WS/CFMEQag1IBMjlAqYHNDxVm9maPGFVgrTsaQnD
	UK0VyuLFH6JzNUxrs7C2JT6d6tUyCYHq+b/zl8B5z9zd5O5AiX1gsue8Tk5iagcB/SsK
	3iDQ==
X-Gm-Message-State: ALoCoQkTFsR/o9s/aSaOkN83AcC/XdzaTcW3iYRom7VcMf91Jl47TBXW4su3zHSqhhmjZLeMspvG
X-Received: by 10.68.224.234 with SMTP id
	rf10mr12774000pbc.124.1422664373133;
	Fri, 30 Jan 2015 16:32:53 -0800 (PST)
Received: from localhost ([122.167.221.35])
	by mx.google.com with ESMTPSA id gr7sm11869168pbc.75.2015.01.30.16.32.51
	(version=TLSv1.2 cipher=RC4-SHA bits=128/128);
	Fri, 30 Jan 2015 16:32:52 -0800 (PST)
From: Viresh Kumar
To: Rafael Wysocki , santosh.shilimkar@oracle.com,
	ethan.zhao@oracle.com
Cc: linaro-kernel@lists.linaro.org, linux-pm@vger.kernel.org,
	Viresh Kumar
Subject: [PATCH Resend] cpufreq: Set cpufreq_cpu_data to NULL before putting
	kobject
Date: Sat, 31 Jan 2015 06:02:44 +0530
Message-Id:
X-Mailer: git-send-email 2.3.0.rc0.44.ga94655d
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-pm@vger.kernel.org
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs to be
cleared before calling kobject_put(&policy->kobj) *and* under the lock.
Otherwise, if someone else calls cpufreq_cpu_get() in parallel with it, they
can obtain a non-NULL policy from it *after* kobject_put(&policy->kobj) was
executed.

Consider this case:

Thread A                              Thread B
cpufreq_cpu_get()
  read_lock_irqsave()
  read-per-cpu cpufreq_cpu_data
                                      per_cpu(cpufreq_cpu_data, cpu) = NULL
                                      kobject_put(&policy->kobj);
  kobject_get(&policy->kobj);

This results in the warning below:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47 kobject_get+0x41/0x50()
Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl lockd grace
sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon
...
Call Trace:
 [] dump_stack+0x46/0x58
 [] warn_slowpath_common+0x81/0xa0
 [] warn_slowpath_null+0x1a/0x20
 [] kobject_get+0x41/0x50
 [] cpufreq_cpu_get+0x75/0xc0
 [] cpufreq_update_policy+0x2e/0x1f0
 [] ? up+0x32/0x50
 [] ? acpi_ns_get_node+0xcb/0xf2
 [] ? acpi_evaluate_object+0x22c/0x252
 [] ? acpi_get_handle+0x95/0xc0
 [] ? acpi_has_method+0x25/0x40
 [] acpi_processor_ppc_has_changed+0x77/0x82
 [] ? move_linked_works+0x66/0x90
 [] acpi_processor_notify+0x58/0xe7
 [] acpi_ev_notify_dispatch+0x44/0x5c
 [] acpi_os_execute_deferred+0x15/0x22
 [] process_one_work+0x160/0x410
 [] worker_thread+0x11b/0x520
 [] ? rescuer_thread+0x380/0x380
 [] kthread+0xe1/0x100
 [] ? kthread_create_on_node+0x1b0/0x1b0
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_create_on_node+0x1b0/0x1b0
---[ end trace 89e66eb9795efdf7 ]---

And here is the actual race (+ the race mentioned above):

Thread A: Workqueue: kacpi_notify

acpi_processor_notify()
  acpi_processor_ppc_has_changed()
    cpufreq_update_policy()
      cpufreq_cpu_get()
        kobject_get()

Thread B: xenbus_thread()

xenbus_thread()
  msg->u.watch.handle->callback()
    handle_vcpu_hotplug_event()
      vcpu_hotplug()
        cpu_down()
          __cpu_notify(CPU_POST_DEAD..)
            cpufreq_cpu_callback()
              __cpufreq_remove_dev_finish()
                cpufreq_policy_put_kobj()
                  kobject_put()

cpufreq_cpu_get() gets the policy from the per-cpu variable cpufreq_cpu_data
under cpufreq_driver_lock, and once it gets a valid policy it expects it not
to be freed until cpufreq_cpu_put() is called.

The race happens when another thread puts the kobject first and clears
cpufreq_cpu_data only later. The first thread then gets a valid policy
structure, and before it does kobject_get() on it, the second one has already
done kobject_put().

Fix this by setting cpufreq_cpu_data to NULL before putting the kobject, and
by doing so under the lock.

Reported-by: Ethan Zhao
Reported-by: Santosh Shilimkar
Signed-off-by: Viresh Kumar
---
 drivers/cpufreq/cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4473eba1d6b0..e3bf702b5588 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1409,9 +1409,10 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 	unsigned long flags;
 	struct cpufreq_policy *policy;
 
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, cpu);
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	per_cpu(cpufreq_cpu_data, cpu) = NULL;
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (!policy) {
 		pr_debug("%s: No cpu_data found\n", __func__);
@@ -1466,7 +1467,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 		}
 	}
 
-	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 	return 0;
 }