From patchwork Thu Jul 5 11:10:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Agner X-Patchwork-Id: 10508641 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id C5F3D600F5 for ; Thu, 5 Jul 2018 11:11:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B312328ECA for ; Thu, 5 Jul 2018 11:11:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A6C4E28ED1; Thu, 5 Jul 2018 11:11:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,MAILING_LIST_MULTI autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 21F4828ECA for ; Thu, 5 Jul 2018 11:11:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:Cc:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Message-ID:References:In-Reply-To: Subject:To:From:Date:MIME-Version:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=xHVqnAHG3jIF04p/Y1mtdmemQvEiW6oKtrHPBV8m5uU=; b=Li+XIVJHh2QpTF k/EuuVmFVOLa5dRk2ZtqzszLhMPjdjttaPUnxMJKuuQEJxp0yn7gd5Ttb5iaIdi14/fupPaCxZlGL Ftu5HC9iPMd6kaLPDMQFcZheB8UHf1u4Rd5ejEV2gerXGGSdrt+Z3/MRlGBPV9t+PZNnS1QjoK+Ld Oi2k5ljPM4fsyViyzaJARvLgjf8oXlU+lkJ3DVDZ14jKjitBIl0NGxSMBvJ1qHKck2xNWdYpfJg33 Cvw9fYAZXFGYr6UYRc7HjJUs9k2giEwil1bolshUAZ7JYAfZmHMyTW9jhWIQ4z6Y9d0gQircmP2lt QCp6vM+s5MFrjUqU0gOA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.90_1 #2 (Red Hat Linux)) id 1fb2A8-0004l4-Rz; Thu, 05 Jul 2018 11:11:00 +0000 Received: from mail.kmu-office.ch ([2a02:418:6a02::a2]) by bombadil.infradead.org with esmtps (Exim 4.90_1 #2 (Red Hat Linux)) id 1fb2A2-0004Gt-05 for linux-arm-kernel@lists.infradead.org; Thu, 05 Jul 2018 11:10:57 +0000 Received: from webmail.kmu-office.ch (unknown [IPv6:2a02:418:6a02::a3]) by mail.kmu-office.ch (Postfix) with ESMTPSA id D531F5C16B5; Thu, 5 Jul 2018 13:10:36 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=agner.ch; s=dkim; t=1530789036; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QZMFsVF3wWYvU38wc/bvFkTGB/+cLVHuNAs7V1Ja1Lk=; b=jP8GChEJ+zJXDxLCAtBXZdvQgzWUaNL2AeyMJLs4dNxholdIZuHbz+EmOpxgO1CULwKGaB l2X/fYdCIPkj7NJy4z3fhDBFuAUDiZRoOop2TASuzjHrx58ndAGjzDTdtGwvKEWcYOfNyt KLCB6fBo9utrXzJiRnapidMgmzIhu9U= MIME-Version: 1.0 Date: Thu, 05 Jul 2018 13:10:34 +0200 From: Stefan Agner To: "A.s. Dong" Subject: Re: Self-detected stall on CPU when using SD card In-Reply-To: References: <037cbd3560a68bad0cea92f453c1b819@agner.ch> <0f262b04fab61ec75b7e368a8f4968aa@agner.ch> Message-ID: <0ba425d59b0739272cf871722441f69c@agner.ch> X-Sender: stefan@agner.ch User-Agent: Roundcube Webmail/1.3.4 X-Spamd-Result: default: False [-0.10 / 15.00]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; RCPT_COUNT_FIVE(0.00)[6]; MIME_GOOD(-0.10)[text/plain]; FROM_HAS_DN(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; DKIM_SIGNED(0.00)[]; TO_DN_SOME(0.00)[]; RCVD_COUNT_ZERO(0.00)[0]; ASN(0.00)[asn:29691, ipnet:2a02:418::/29, country:CH]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-0.00)[36.59%]; ARC_NA(0.00)[] X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20180705_041054_640138_402FBE17 X-CRM114-Status: GOOD ( 26.80 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fabio Estevam , linux-mmc-owner@vger.kernel.org, linux-mmc@vger.kernel.org, adrian.hunter@intel.com, linux-arm-kernel@lists.infradead.org Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP On 05.07.2018 05:13, A.s. Dong wrote: >> -----Original Message----- >> From: Stefan Agner [mailto:stefan@agner.ch] >> Sent: Tuesday, July 3, 2018 9:13 PM >> To: adrian.hunter@intel.com; A.s. Dong ; Fabio >> Estevam >> Cc: linux-mmc@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux- >> mmc-owner@vger.kernel.org >> Subject: Re: Self-detected stall on CPU when using SD card >> >> Dong Aisheng, >> >> Maybe you can shed some light on the issue below? >> > > I tried on MX6Q SDB board but did not reproduce it with a SD card > plugged into Slot2 with 4.18-rc1. > Any difference we need to care? > >> >> On 26.06.2018 16:45, Stefan Agner wrote: >> > On 26.06.2018 12:53, Stefan Agner wrote: >> >> Hi, >> >> >> >> On our Colibri iMX6 (arch/arm/boot/dts/imx6qdl-colibri.dtsi) we >> >> experience the following stack trace when a SD card is plugged in: >> >> >> > >> > [...] >> > >> >> [] (esdhc_pltfm_set_clock) from [] >> >> (sdhci_set_ios+0xd8/0x584) >> >> r10:ffffe000 r9:c1105900 r8:00004097 r7:d818f2e8 r6:d818f480 >> >> r5:d818f2e8 >> >> r4:d818f000 >> >> [] (sdhci_set_ios) from [] >> >> (sdhci_runtime_resume_host+0xa0/0x184) >> >> r9:c1105900 r8:00004097 r7:d818f2e8 r6:d818f648 r5:d818f000 >> >> r4:d818f480 >> > >> > [...] >> > >> >> >> >> It used to work in v4.9, so I started a git bisect. It pointed me to >> >> this commit: >> >> >> >> Commit d1e4f74f911d ("mmc: sdhci: Do not use spin lock in set_ios >> >> paths"). >> >> >> >> Reverting the commit on-top of v4.18-rc1 seems to fix the issue too. >> >> >> >> Any idea? >> > >> > I figured out that the same platform had a GPIO Key which triggered >> > all the time. This seems to exacerbate the MMC issue such that it >> > triggers on very boot, about rootfs mount time. >> > > > That's a bit strange.... Why that GPIO key triggers all the time? > This was a bug in our device tree: we configured a 100k pull-up where there is a 100k pull-down externally... The level was always around 1.6V and caused thousands of interrupts during boot up. The stack trace really suggests that there are simply to many interrupts happening on the GPIO key and stalling the CPU. Of course, at this point I could have stopped and just do the device tree fix. But the question was just nagging me: Why does disabling interrupts during esdhc_pltfm_set_clock fixes the issue too?? There must be something particular with esdhc_pltfm_set_clock. After digging some hours I finally found the issue, and it is somewhat mind blowing: The SD card clock and the GPIO key pins happen to be routed right next to each other! Without SD card the GPIO key stays at a stable level and does not cause interrupts. However, when the SD card clock next to the GPIO key is enabled it reliably generates interrupts due to cross talk! But the SD card clock is enabled even outside of esdhc_pltfm_set_clock. If just cross talk is the problem then disabling interrupts during clock change esdhc_pltfm_set_clock should make no real difference... It turns out, that in the i.MX 6 case the driver clears ESDHC_CLOCK_MASK which generates 200MHz on the SD card clk pin for a short period of time! This exacerbated cross talk just enough to cause a complete stall! When interrupts are disabled during that short phase, cross talk does not do any harm.... And it seems that with 50MHz (which is the rate on function exit) cross talk does not appear/doe not cause enough interrupts to stall the CPU completely... Anyway, we probably don't want the card to be clocked at 200MHz for that short time. Fixing this is rather trivial, just make sure that we disable card clock before changing rate: } With that clocks get disabled before we clear the divider registers further down. I will send a patch for this. --- Stefan >> > This change seems to fix the issue as well, not sure though whether >> > this is a proper fix: >> > >> > --- a/drivers/mmc/host/sdhci-esdhc-imx.c >> > +++ b/drivers/mmc/host/sdhci-esdhc-imx.c >> > @@ -697,6 +697,7 @@ static inline void esdhc_pltfm_set_clock(struct >> > sdhci_host *host, >> > int ddr_pre_div = imx_data->is_ddr ? 2 : 1; >> > int pre_div = 1; >> > int div = 1; >> > + unsigned long flags; >> > u32 temp, val; >> > >> > if (clock == 0) { >> > @@ -724,6 +725,7 @@ static inline void esdhc_pltfm_set_clock(struct >> > sdhci_host *host, >> > pre_div = 2; >> > } >> > >> > + spin_lock_irqsave(&host->lock, flags); >> > temp = sdhci_readl(host, ESDHC_SYSTEM_CONTROL); >> > temp &= ~(ESDHC_CLOCK_IPGEN | ESDHC_CLOCK_HCKEN | >> > ESDHC_CLOCK_PEREN >> > | ESDHC_CLOCK_MASK); >> > @@ -754,6 +756,7 @@ static inline void esdhc_pltfm_set_clock(struct >> > sdhci_host *host, >> > writel(val | ESDHC_VENDOR_SPEC_FRC_SDCLK_ON, >> > host->ioaddr + ESDHC_VENDOR_SPEC); >> > } >> > + spin_unlock_irqrestore(&host->lock, flags); >> >> The bits ESDHC_CLOCK_IPGEN, ESDHC_CLOCK_HCKEN, >> ESDHC_CLOCK_PEREN which are cleared and written in that section are >> actually mentioned as "Reserved. Always write as 1." in the reference >> manual... >> > > That's for legacy platforms like MX5. No side affect on MX6&7 as we know. > So keep it now. Is the issue related to that? > > Regards > Dong Aisheng > >> -- >> Stefan >> >> > >> > mdelay(1); >> > } >> > >> > -- >> > Stefan --- a/drivers/mmc/host/sdhci-esdhc-imx.c +++ b/drivers/mmc/host/sdhci-esdhc-imx.c @@ -708,14 +708,14 @@ static inline void esdhc_pltfm_set_clock(struct sdhci_host *host, int div = 1; u32 temp, val; + if (esdhc_is_usdhc(imx_data)) { + val = readl(host->ioaddr + ESDHC_VENDOR_SPEC); + writel(val & ~ESDHC_VENDOR_SPEC_FRC_SDCLK_ON, + host->ioaddr + ESDHC_VENDOR_SPEC); + } + if (clock == 0) { host->mmc->actual_clock = 0; - - if (esdhc_is_usdhc(imx_data)) { - val = readl(host->ioaddr + ESDHC_VENDOR_SPEC); - writel(val & ~ESDHC_VENDOR_SPEC_FRC_SDCLK_ON, - host->ioaddr + ESDHC_VENDOR_SPEC); - } return;