From patchwork Thu Apr 7 15:44:11 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Grygorii Strashko X-Patchwork-Id: 8774141 Return-Path: X-Original-To: patchwork-linux-omap@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 410EB9FBEA for ; Thu, 7 Apr 2016 15:44:24 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id E01AC201ED for ; Thu, 7 Apr 2016 15:44:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AEACA201BC for ; Thu, 7 Apr 2016 15:44:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756656AbcDGPoS (ORCPT ); Thu, 7 Apr 2016 11:44:18 -0400 Received: from bear.ext.ti.com ([192.94.94.41]:38052 "EHLO bear.ext.ti.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756470AbcDGPoR (ORCPT ); Thu, 7 Apr 2016 11:44:17 -0400 Received: from dlelxv90.itg.ti.com ([172.17.2.17]) by bear.ext.ti.com (8.13.7/8.13.7) with ESMTP id u37FiEiU022464; Thu, 7 Apr 2016 10:44:14 -0500 Received: from DLEE70.ent.ti.com (dlee70.ent.ti.com [157.170.170.113]) by dlelxv90.itg.ti.com (8.14.3/8.13.8) with ESMTP id u37FiDYw023198; Thu, 7 Apr 2016 10:44:13 -0500 Received: from dlep33.itg.ti.com (157.170.170.75) by DLEE70.ent.ti.com (157.170.170.113) with Microsoft SMTP Server id 14.3.224.2; Thu, 7 Apr 2016 10:44:13 -0500 Received: from [172.22.234.251] (ileax41-snat.itg.ti.com [10.172.224.153]) by dlep33.itg.ti.com (8.14.3/8.13.8) with ESMTP id u37FiBkg003251; Thu, 7 Apr 2016 10:44:12 -0500 Subject: Re: Weird kernel dumps seen on 437-gp on linux-next To: Nishanth Menon , "Franklin S Cooper Jr." , , Tony Lindgren References: <57067A25.1030401@ti.com> <57067BA7.4020005@ti.com> From: Grygorii Strashko Message-ID: <5706804B.8020608@ti.com> Date: Thu, 7 Apr 2016 18:44:11 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <57067BA7.4020005@ti.com> Sender: linux-omap-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-omap@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On 04/07/2016 06:24 PM, Nishanth Menon wrote: > On 04/07/2016 10:17 AM, Franklin S Cooper Jr. wrote: >> Hi All, >> >> While testing patches to fix NFS not working on linux-next I ran into to >> a bunch of kernel dumps when booting my 437-gp evm via NFS. >> >> Bootlog: http://pastebin.ubuntu.com/15670741/ >> Another Bootlog: http://pastebin.ubuntu.com/15671302/ >> >> I'm pretty sure it isn't related to the NFS issue. I dropped the local >> NFS fixes and reverted the patches that originally caused the NFS >> problem and I'm still able to see these kernel dumps. Now this doesn't >> occur all the time. Originally I was able to see this issue 1 out of 5 >> times but recently it takes over 10 times for this problem to occur. >> >> Also I am unable to reproduce this issue on my AM335x GP evm. I've >> rebooted the board 10 times and haven't seen any kernel dumps. >> > https://github.com/nmenon/kernel-test-logs/blob/next-20160406/multi_v7_defconfig/am43xx-gpevm.txt > looked good to me at least.. but then it was reported as intermittent.. > > Could be > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/381174.html > > The guilty patch is 55ee701 "arm: omap2: board-generic: use omap4_local_timer_init for AM437x" which changes clocksource from gptimer to 32k_counter. You see this "INFO: rcu_sched self-detected stall on CPU", because sometime during boot wrong value is read from 32k_counter, which cause invalid jiffies calculation. Personally, I think that it could be errata: am437x: "Advisory 22 Watchdog Timers: Delay Needed to Read Some WDTimer Registers After Wakeup" ompa4460: i103 "1.1 Delay needed to read some GP timer, WD timer and sync timer registers after wakeup" also, you can check it with below diff: [4294938600] ============= cycle_now19816771 delta4294959581 tkr->cycle_last19824486 jiffies4294938600 12E6143 [4294938600] ============= cycle_now19817092 delta4294959902 tkr->cycle_last19824486 jiffies4294938600 12E6284 [4294938600] ============= cycle_now19817413 delta4294960223 tkr->cycle_last19824486 jiffies4294938600 12E63C5 ^^^ good jiffies [13078482] BUG: ================================== now19817734 cycle_last4314784734 real_tk19824486 ^^^ bad jiffies <-- 7-8 second of kernel boot [13078482] INFO: rcu_preempt self-detected stall on CPU diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index bfbf284..0858077 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -1053,10 +1053,9 @@ static size_t print_time(u64 ts, char *buf) rem_nsec = do_div(ts, 1000000000); if (!buf) - return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts); + return snprintf(NULL, 0, "[%5lu] ", (unsigned long)jiffies); - return sprintf(buf, "[%5lu.%06lu] ", - (unsigned long)ts, rem_nsec / 1000); + return sprintf(buf, "[%lu] ", (unsigned long)jiffies); } with above change you could see smth. like this in log: