From patchwork Tue Jul 29 15:36:55 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Cormier X-Patchwork-Id: 4641111 Return-Path: X-Original-To: patchwork-davinci@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 67C559F2B8 for ; Tue, 29 Jul 2014 15:37:34 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id C646B20123 for ; Tue, 29 Jul 2014 15:37:30 +0000 (UTC) Received: from bear.ext.ti.com (bear.ext.ti.com [192.94.94.41]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EA8F520125 for ; Tue, 29 Jul 2014 15:37:28 +0000 (UTC) Received: from dflxv15.itg.ti.com ([128.247.5.124]) by bear.ext.ti.com (8.13.7/8.13.7) with ESMTP id s6TFbRlV027589 for ; Tue, 29 Jul 2014 10:37:28 -0500 Received: from DLEE71.ent.ti.com (dlee71.ent.ti.com [157.170.170.114]) by dflxv15.itg.ti.com (8.14.3/8.13.8) with ESMTP id s6TFbRI8019892 for ; Tue, 29 Jul 2014 10:37:27 -0500 Received: from dlep33.itg.ti.com (157.170.170.75) by DLEE71.ent.ti.com (157.170.170.114) with Microsoft SMTP Server id 14.3.174.1; Tue, 29 Jul 2014 10:37:27 -0500 Received: from linux.omap.com (dlelxs01.itg.ti.com [157.170.227.31]) by dlep33.itg.ti.com (8.14.3/8.13.8) with ESMTP id s6TFbRXu005483 for ; Tue, 29 Jul 2014 10:37:27 -0500 Received: from linux.omap.com (localhost [127.0.0.1]) by linux.omap.com (Postfix) with ESMTP id B68E980626 for ; Tue, 29 Jul 2014 10:37:27 -0500 (CDT) X-Original-To: davinci-linux-open-source@linux.davincidsp.com Delivered-To: davinci-linux-open-source@linux.davincidsp.com Received: from dflxv17.itg.ti.com (dflxv17.itg.ti.com [128.247.5.93]) by linux.omap.com (Postfix) with ESMTP id 6F06180626 for ; Tue, 29 Jul 2014 10:37:18 -0500 (CDT) Received: from neches.ext.ti.com (neches.ext.ti.com [192.91.81.29]) by dflxv17.itg.ti.com (8.14.3/8.13.8) with ESMTP id s6TFbIOv011035 for ; Tue, 29 Jul 2014 10:37:18 -0500 Received: from mail6.bemta8.messagelabs.com (mail6.bemta8.messagelabs.com [216.82.243.55]) by neches.ext.ti.com (8.13.7/8.13.7) with ESMTP id s6TFbHPM007929 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 29 Jul 2014 10:37:18 -0500 Received: from [216.82.242.131:45529] by server-14.bemta-8.messagelabs.com id 0F/8A-02730-DAFB7D35; Tue, 29 Jul 2014 15:37:17 +0000 X-Env-Sender: jcormier@criticallink.com X-Msg-Ref: server-5.tower-76.messagelabs.com!1406648235!34069802!1 X-Originating-IP: [209.85.220.182] X-SpamReason: No, hits=0.8 required=7.0 tests=BODY_RANDOM_LONG, RCVD_BY_IP X-StarScan-Received: X-StarScan-Version: 6.11.3; banners=-,-,- X-VirusChecked: Checked Received: (qmail 19686 invoked from network); 29 Jul 2014 15:37:16 -0000 Received: from mail-vc0-f182.google.com (HELO mail-vc0-f182.google.com) (209.85.220.182) by server-5.tower-76.messagelabs.com with RC4-SHA encrypted SMTP; 29 Jul 2014 15:37:16 -0000 Received: by mail-vc0-f182.google.com with SMTP id hy4so13936427vcb.13 for ; Tue, 29 Jul 2014 08:37:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type:content-transfer-encoding; bh=lOxHIY4NFEojWRLiUXEuNPG1QePEzZ3wYvIklG912SE=; b=ftftzHWgCC7edV15V7k+XNoH0J10Ck0CfkUMyZGyXI9DbT2Ystvl4gjhLDg5ZmwdGP jONyiOc5Z1NXSpfFmbvDAiMnNv/AjZlcBEzcIgzFiDW7E30QqgwyCvw9oQz50z7rhBYp sJ3axlVbWLB5kriL922Dovg2A8ba4ULiGYdWjRHsBDis2aFZ4iAVT2KPszpbIBXGMafI 23YcDTHRYf6hA3mxNkigJ581G1uXDHJ9/wCalZZc0VELbzmtw6kBiGuKZKhB3PoClgp6 WT36fOD4uktpo2sfN88OJ5uyqcei6lHWMtqcMOUllugiYvetA64eKw09WOpHmGg4l/Bz /kow== X-Gm-Message-State: ALoCoQke6hkHSjUxAV2d94Mt7dv3jJePvnqLzTSHX0ekiYqur8iBCLO0lcnS93TXl12RRZLYZvh8 X-Received: by 10.220.251.134 with SMTP id ms6mr2679313vcb.10.1406648235717; Tue, 29 Jul 2014 08:37:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.220.192.136 with HTTP; Tue, 29 Jul 2014 08:36:55 -0700 (PDT) In-Reply-To: <2588C357BB271A4CA8E86BCFF043FA920C4ACE9D@EFJDFWMB01.EFJDFW.local> References: <2588C357BB271A4CA8E86BCFF043FA920C4ACE9D@EFJDFWMB01.EFJDFW.local> From: Jon Cormier Date: Tue, 29 Jul 2014 11:36:55 -0400 Message-ID: Subject: Re: i2c-davinci.c: CPU FREQ causes lock up due to xfr_complete To: Brian Niebuhr CC: "davinci-linux-open-source@linux.davincidsp.com" , Tim Iskander X-BeenThere: davinci-linux-open-source@linux.davincidsp.com X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: Errors-To: davinci-linux-open-source-bounces+patchwork-davinci=patchwork.kernel.org@linux.davincidsp.com X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Okay here is my attempt. Author: Cormier, Jonathan Date: Tue Jul 29 11:26:22 2014 -0400 i2c: davinci: Change xfr_complete completion to a mutex There are several problems with the use of a completion for this task: 1. If no I2C transfer has occurred, a cpufreq transition will block forever. 2. Once an I2C transfer has occurred the cpufreq transition will never block since the completion is never reinitialized. 3. Even if you do reinitialize the completion for every I2C transfer, (1) is not solved and there is still a race condition where the cpufreq transition could start just before an I2C transfer starts and the I2C transfer occurs during the cpufreq transition. Signed-off-by: Cormier, Jonathan On Tue, Jul 29, 2014 at 10:38 AM, Brian Niebuhr wrote: > > >> -----Original Message----- >> From: davinci-linux-open-source-bounces@linux.davincidsp.com >> [mailto:davinci-linux-open-source-bounces@linux.davincidsp.com] On >> Behalf Of Jon Cormier >> Sent: Tuesday, July 29, 2014 9:20 AM >> To: davinci-linux-open-source@linux.davincidsp.com >> Cc: Tim Iskander >> Subject: i2c-davinci.c: CPU FREQ causes lock up due to xfr_complete >> >> Reported issue: >> >> 1. Enable I2C1, flash the new kernel and reboot >> 2. Immediately after reboot, attempt to change the processor clock: "echo >> 456000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed" >> 3. Process blocks >> >> However, if we do the following: >> 1. Enable I2C1, flash the new kernel and reboot >> 2. Immediately after reboot, run: "i2cdetect -y 2 0x08 0x08" or just "i2cdetect >> -y 2" >> 3. Then run: "echo 456000 > >> /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed" >> 4. Command succeeds > > I found this problem as well, and I haven't had time to put together a proper patch. Right now we're just working around it. There are several problems with the use of a completion for this task: > > 1. If no I2C transfer has occurred, a cpufreq transition will block forever (which is the bug you found) > 2. Once an I2C transfer has occurred the cpufreq transition will never block since the completion is never reinitialized. > 3. Even if you do reinitialize the completion for every I2C transfer, (1) is not solved and there is still a race condition where the cpufreq transition could start just before an I2C transfer starts and the I2C transfer occurs during the cpufreq transition. > > Personally I think the correct solution is to use a mutex instead of a completion. The cpufreq code would take the mutex during the prechange callback and release it during the postchange callback. Likewise the I2C transfer function would hold the mutex while an I2C transfer is ongoing. That way an I2C transfer cannot occur during a cpufreq transition and vice-versa. As I mentioned, I have not had time to put together a proper patch, but I would be happy to see this issue addressed if someone else can do it. > > >> Here's the kernel hung task stack trace: >> >> INFO: task sh:1428 blocked for more than 120 seconds. >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> sh D c026dc74 0 1428 1426 0x00000000 >> [] (schedule+0x2a8/0x334) from [] >> (schedule_timeout+0x1c/0x218) >> [] (schedule_timeout+0x1c/0x218) from [] >> (wait_for_common+0xf0/0x1b8) >> [] (wait_for_common+0xf0/0x1b8) from [] >> (i2c_davinci_cpufreq_transition+0x18/0x50) >> [] (i2c_davinci_cpufreq_transition+0x18/0x50) from [] >> (notifier_call_chain+0x38/0x68) >> [] (notifier_call_chain+0x38/0x68) from [] >> (__srcu_notifier_call_chain+0x40/0x58) >> [] (__srcu_notifier_call_chain+0x40/0x58) from [] >> (srcu_notifier_call_chain+0x14/0x18) >> [] (srcu_notifier_call_chain+0x14/0x18) from [] >> (cpufreq_notify_transition+0xc8/0xfc) >> [] (cpufreq_notify_transition+0xc8/0xfc) from [] >> (davinci_target+0x144/0x154) >> [] (davinci_target+0x144/0x154) from [] >> (__cpufreq_driver_target+0x28/0x38) >> [] (__cpufreq_driver_target+0x28/0x38) from [] >> (cpufreq_set+0x54/0x70) >> [] (cpufreq_set+0x54/0x70) from [] >> (store_scaling_setspeed+0x58/0x6c) >> [] (store_scaling_setspeed+0x58/0x6c) from [] >> (store+0x58/0x74) >> [] (store+0x58/0x74) from [] >> (sysfs_write_file+0x108/0x140) >> [] (sysfs_write_file+0x108/0x140) from [] >> (vfs_write+0xb0/0x118) >> [] (vfs_write+0xb0/0x118) from [] >> (sys_write+0x3c/0x68) >> [] (sys_write+0x3c/0x68) from [] >> (ret_fast_syscall+0x0/0x28) >> Kernel panic - not syncing: hung_task: blocked tasks >> [] (unwind_backtrace+0x0/0xd0) from [] >> (panic+0x44/0xc8) >> [] (panic+0x44/0xc8) from [] (watchdog+0x1d4/0x21c) >> [] (watchdog+0x1d4/0x21c) from [] >> (kthread+0x78/0x80) >> [] (kthread+0x78/0x80) from [] >> (kernel_thread_exit+0x0/0x8) >> According to the stack trace the kernel gets stuck in the >> "i2c_davinci_cpufreq_transition" function when it calls >> "wait_for_completion(&dev->xfr_complete);" The two other places this >> xfr_complete variable is referenced is the init_completion in the probe and >> the complete at the end of the i2c_davinci_xfer function. My understanding >> as to what this was intended for was to ensure that a transfer in progress >> completed before changing the clock frequency. But as its currently done >> the only thing it does is make sure there has been a completed i2c transfer >> on this device ever. Is my understanding correct? >> Currently the workaround is to simply disable the wait_for_completion as >> seen below. How would you fix this to ensure a transfer in progress >> completes before changing clocks without hanging if no transfer was ever >> attempted? >> diff --git a/drivers/i2c/busses/i2c-davinci.c b/drivers/i2c/busses/i2c-davinci.c >> index a76d85f..564247f 100644 >> --- a/drivers/i2c/busses/i2c-davinci.c >> +++ b/drivers/i2c/busses/i2c-davinci.c >> @@ -596,7 +596,7 @@ static int i2c_davinci_cpufreq_transition(struct >> notifier_block *nb, >> >> dev = container_of(nb, struct davinci_i2c_dev, freq_transition); >> if (val == CPUFREQ_PRECHANGE) { >> - wait_for_completion(&dev->xfr_complete); >> + //wait_for_completion(&dev->xfr_complete); >> davinci_i2c_reset_ctrl(dev, 0); >> } else if (val == CPUFREQ_POSTCHANGE) { >> i2c_davinci_calc_clk_dividers(dev); >> Patch were this was introduced: >> sha: 82c0de11b734c5acec13c0f6007466da81cd16d9 i2c:davinci:Add cpufreq >> support >> >> -- >> Jonathan Cormier >> CriticalLink > This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you. diff --git a/drivers/i2c/busses/i2c-davinci.c b/drivers/i2c/busses/i2c-davinci.c index a76d85f..9eac1c1 100644 --- a/drivers/i2c/busses/i2c-davinci.c +++ b/drivers/i2c/busses/i2c-davinci.c @@ -111,7 +111,7 @@ struct davinci_i2c_dev { u8 terminate; struct i2c_adapter adapter; #ifdef CONFIG_CPU_FREQ - struct completion xfr_complete; + struct mutex xfr_lock; struct notifier_block freq_transition; #endif }; @@ -438,10 +438,14 @@ i2c_davinci_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num) dev_dbg(dev->dev, "%s: msgs: %d\n", __func__, num); +#ifdef CONFIG_CPU_FREQ + mutex_lock(&dev->xfr_lock); +#endif + ret = i2c_davinci_wait_bus_not_busy(dev, 1); if (ret < 0) { dev_warn(dev->dev, "timeout waiting for bus ready\n"); - return ret; + goto exit; } for (i = 0; i < num; i++) { @@ -449,14 +453,17 @@ i2c_davinci_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], int num) dev_dbg(dev->dev, "%s [%d/%d] ret: %d\n", __func__, i + 1, num, ret); if (ret < 0) - return ret; + goto exit; } + ret = num; + +exit: #ifdef CONFIG_CPU_FREQ - complete(&dev->xfr_complete); + mutex_unlock(&dev->xfr_lock); #endif - return num; + return ret; } static u32 i2c_davinci_func(struct i2c_adapter *adap) @@ -596,11 +603,16 @@ static int i2c_davinci_cpufreq_transition(struct notifier_block *nb, dev = container_of(nb, struct davinci_i2c_dev, freq_transition); if (val == CPUFREQ_PRECHANGE) { - wait_for_completion(&dev->xfr_complete); +#ifdef CONFIG_CPU_FREQ + mutex_lock(&dev->xfr_lock); +#endif davinci_i2c_reset_ctrl(dev, 0); } else if (val == CPUFREQ_POSTCHANGE) { i2c_davinci_calc_clk_dividers(dev); davinci_i2c_reset_ctrl(dev, 1); +#ifdef CONFIG_CPU_FREQ + mutex_unlock(&dev->xfr_lock); +#endif } return 0; @@ -670,7 +682,7 @@ static int davinci_i2c_probe(struct platform_device *pdev) init_completion(&dev->cmd_complete); #ifdef CONFIG_CPU_FREQ - init_completion(&dev->xfr_complete); + mutex_init(&dev->xfr_lock); #endif dev->dev = get_device(&pdev->dev); dev->irq = irq->start; I'm using the 3.2 branch and the above patch revealed another bug. INFO: task sh:2144 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. sh D c03262f0 0 2144 1383 0x00000000 [] (__schedule+0x350/0x3b0) from [] (__mutex_lock_slowpath+0 x90/0x100) [] (__mutex_lock_slowpath+0x90/0x100) from [] (i2c_davinci_x fer+0x30/0x30c) [] (i2c_davinci_xfer+0x30/0x30c) from [] (i2c_transfer+0xbc/ 0x110) [] (i2c_transfer+0xbc/0x110) from [] (i2c_master_send+0x38/0 x48) [] (i2c_master_send+0x38/0x48) from [] (regmap_i2c_write+0x1 0/0x2c) [] (regmap_i2c_write+0x10/0x2c) from [] (_regmap_raw_write+0 xa4/0x144) [] (_regmap_raw_write+0xa4/0x144) from [] (regmap_write+0x24 /0x38) [] (regmap_write+0x24/0x38) from [] (tps65023_dcdc_set_volta ge+0xc0/0xe8) [] (tps65023_dcdc_set_voltage+0xc0/0xe8) from [] (_regulator _do_set_voltage+0x3c/0x1d0) [] (_regulator_do_set_voltage+0x3c/0x1d0) from [] (regulator _set_voltage+0xb8/0xcc) [] (regulator_set_voltage+0xb8/0xcc) from [] (davinci_target +0xcc/0x14c) [] (davinci_target+0xcc/0x14c) from [] (__cpufreq_driver_tar get+0x2c/0x3c) [] (__cpufreq_driver_target+0x2c/0x3c) from [] (cpufreq_set+ 0x54/0x70) [] (cpufreq_set+0x54/0x70) from [] (store_scaling_setspeed+0 x58/0x6c) [] (store_scaling_setspeed+0x58/0x6c) from [] (store+0x58/0x 70) [] (store+0x58/0x70) from [] (sysfs_write_file+0x108/0x140) [] (sysfs_write_file+0x108/0x140) from [] (vfs_write+0xb0/0x 138) [] (vfs_write+0xb0/0x138) from [] (sys_write+0x3c/0x68) [] (sys_write+0x3c/0x68) from [] (ret_fast_syscall+0x0/0x2c) davinci_target was notifying CPUFREQ_PRECHANGE before calling set_voltage which relied tried talking to the pmic over i2c causing a hang. @@ -101,17 +101,17 @@ static int davinci_target(struct cpufreq_policy *policy, cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE); /* if moving to higher frequency, up the voltage beforehand */ if (pdata->set_voltage && freqs.new > freqs.old) { ret = pdata->set_voltage(idx); if (ret) goto out; } I fixed this by moving the cpufreq_notifiy. This doesn't apply to the later kernels as the davinci cpufreq driver has changed quite a bit. Unfortunately I don't have the setup to test a patch against a newer kernel right now. Author: Cormier, Jonathan Date: Tue Jul 29 11:22:50 2014 -0400 ARM: DAVINCI: Reorder cpufreq_nofity_transistion so that set_voltage happens first. set_voltage may make use of the i2c bus to communicate with the PMIC. In this case we dont want the i2c to reset until after we set the voltage. Signed-off-by: Cormier, Jonathan diff --git a/arch/arm/mach-davinci/cpufreq.c b/arch/arm/mach-davinci/cpufreq.c index 5bba707..cbaee6c 100644 --- a/arch/arm/mach-davinci/cpufreq.c +++ b/arch/arm/mach-davinci/cpufreq.c @@ -102,8 +102,6 @@ static int davinci_target(struct cpufreq_policy *policy, if (ret) return -EINVAL; - cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE); - /* if moving to higher frequency, up the voltage beforehand */ if (pdata->set_voltage && freqs.new > freqs.old) { ret = pdata->set_voltage(idx); @@ -111,6 +109,8 @@ static int davinci_target(struct cpufreq_policy *policy, goto out; } + cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE); + ret = clk_set_rate(armclk, idx); if (ret) goto out;