From patchwork Wed Mar 8 16:43:19 2017
X-Patchwork-Submitter: Robert Jarzmik
X-Patchwork-Id: 9611649
From: Robert Jarzmik
To: Petr Cvek
Cc: Ulf Hansson, vinod.koul@intel.com, linux-mmc@vger.kernel.org, Haojian Zhuang, linux-arm-kernel@lists.infradead.org, dmaengine@vger.kernel.org, Daniel Mack
Subject: Re: [BUG] dmaengine: pxa_dma: + mmc: pxamci: race condition with DMA error on tx channel
Date: Wed, 08 Mar 2017 17:43:19 +0100
In-Reply-To: (Petr Cvek's message of "Wed, 8 Mar 2017 07:57:08 +0100")
Message-ID: <877f3zwwgo.fsf@belgarion.home>

Petr Cvek writes:

Hi Petr,

> I wasn't able to track the problem to a single patch as the problem occurs at
> random time
> (from the boot to like a half an hour) and it's maybe dependent on a
> level of a battery charge (maybe because of kernel log writes of charging
> messages).
Mmmh, long reproduction time, that will be bad.

> It seems that most occurrency is during writes on an SD card. Using an SDHC
> card decreases the time to fail. After failure the OS is unavailable (rootfs
> in on the card).
Okay, let me try running a write loop on my SD card to see if I can reproduce
this.

> From my poking in the kernel source code it seems there is a probability that pxamci_irq() takes longer to call and its subsequent call pxamci_data_done() isn't fast enough to set [1]
> 	host->data = NULL;
>
> From the DMA side, the DMA done interrupt is generated:
> 	pxad_chan_handler() -> vchan_cookie_complete()
> ...where a tasklet for vchan_complete() is scheduled
At least that seems to hint that the DMA part is sound so far. The bothersome
part is the log error "mmc0: DMA error on tx channel". I would need a bit of
guidance here, with the same log with [1] applied.

> , where finally with interrupts enabled (can pxamci_irq() be called here?) the
> callback pxamci_dma_irq() is called.
When DMA completes, there is a tiny window, before pxamci_dma_irq() is called,
when pxamci_irq() can be called, yes. As soon as the spinlock is taken in
pxamci_dma_irq(), there are no more races.

> From my tests it seems at this point [2] the host->data is always NULL and rest
> of the callback is never called. It is called once with a nonempty host->data
> only just before the failure.
>
> During the testing I put udelay(100) at the start of pxamci_dma_irq() and fail
> occurred after like 2 hours (when I for the first time tapped the touchscreen -
> higher CPU usage and interrupts).
Mmm I would need more data here.
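
To make the window discussed above concrete, here is a minimal user-space
sketch of the pattern, not driver code. It assumes the completion callback
takes the host lock and bails out when host->data is already NULL, as the
observations above suggest. The names model_host, model_data,
model_data_done() and model_dma_callback() are hypothetical stand-ins for the
pxamci structures, pxamci_data_done() and pxamci_dma_irq(); a pthread mutex
stands in for the host spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the in-flight data request. */
struct model_data {
	int error;
};

/* Hypothetical stand-in for the host state; the mutex models the host lock. */
struct model_host {
	pthread_mutex_t lock;
	struct model_data *data;	/* NULL once the request is finished */
	int callbacks_handled;		/* callback saw a pending request */
	int callbacks_skipped;		/* callback saw data == NULL */
};

static void model_host_init(struct model_host *host, struct model_data *data)
{
	assert(host != NULL);
	pthread_mutex_init(&host->lock, NULL);
	host->data = data;
	host->callbacks_handled = 0;
	host->callbacks_skipped = 0;
}

/* Models the IRQ path finishing the request: clears data under the lock. */
static void model_data_done(struct model_host *host)
{
	pthread_mutex_lock(&host->lock);
	host->data = NULL;
	pthread_mutex_unlock(&host->lock);
}

/*
 * Models the DMA completion callback. dma_ok is the assumed outcome of the
 * transfer. Returns 1 if a request was still pending, 0 if it was skipped.
 */
static int model_dma_callback(struct model_host *host, int dma_ok)
{
	int handled = 0;

	pthread_mutex_lock(&host->lock);
	if (host->data) {
		if (!dma_ok) {
			host->data->error = -5;	/* -EIO in the kernel */
			host->data = NULL;	/* error path finishes the request */
		}
		host->callbacks_handled++;
		handled = 1;
	} else {
		host->callbacks_skipped++;
	}
	pthread_mutex_unlock(&host->lock);
	return handled;
}
```

In this model, the common case reported above (host->data always NULL in the
callback) corresponds to model_data_done() running before
model_dma_callback(), and the rare case just before the failure corresponds
to the opposite order.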
The biggest help I could get would be the pxa dma traces here:
  echo -n 'file pxa_dma.c +p' > /sys/kernel/debug/dynamic_debug/control
  echo -n 'file virt-dma.c +p' > /sys/kernel/debug/dynamic_debug/control

And then capture the last traces and send them to me.

Cheers.

diff --git a/drivers/mmc/host/pxamci.c b/drivers/mmc/host/pxamci.c
index c763b404510f..ed3812b2a34d 100644
--- a/drivers/mmc/host/pxamci.c
+++ b/drivers/mmc/host/pxamci.c
@@ -571,8 +571,9 @@ static void pxamci_dma_irq(void *param)
 	if (likely(status == DMA_COMPLETE)) {
 		writel(BUF_PART_FULL, host->base + MMC_PRTBUF);
 	} else {
-		pr_err("%s: DMA error on %s channel\n", mmc_hostname(host->mmc),
-			host->data->flags & MMC_DATA_READ ? "rx" : "tx");
+		pr_err("%s: DMA error on %s channel: %d\n",
+		       mmc_hostname(host->mmc),
+		       host->data->flags & MMC_DATA_READ ? "rx" : "tx", status);
 		host->data->error = -EIO;
 		pxamci_data_done(host, 0);
 	}
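
The patch above prints the status as a raw integer, so a small decoding aid
may help when reading the resulting log. The sketch below is user-space code
and assumes the enum dma_status ordering DMA_COMPLETE, DMA_IN_PROGRESS,
DMA_PAUSED, DMA_ERROR from include/linux/dmaengine.h; please check this
against the tree being tested. The MODEL_* names, model_dma_status_name() and
model_format_dma_error() are hypothetical helpers, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed mirror of the kernel's enum dma_status; verify against your tree. */
enum model_dma_status {
	MODEL_DMA_COMPLETE,	/* 0 */
	MODEL_DMA_IN_PROGRESS,	/* 1 */
	MODEL_DMA_PAUSED,	/* 2 */
	MODEL_DMA_ERROR		/* 3 */
};

/* Map the integer printed by the patched pr_err() to a symbolic name. */
static const char *model_dma_status_name(int status)
{
	switch (status) {
	case MODEL_DMA_COMPLETE:
		return "DMA_COMPLETE";
	case MODEL_DMA_IN_PROGRESS:
		return "DMA_IN_PROGRESS";
	case MODEL_DMA_PAUSED:
		return "DMA_PAUSED";
	case MODEL_DMA_ERROR:
		return "DMA_ERROR";
	default:
		return "unknown";
	}
}

/*
 * Build a line shaped like the patched message, with the decoded name
 * appended in parentheses. Returns the snprintf() result.
 */
static int model_format_dma_error(char *buf, size_t len, const char *hostname,
				  int is_read, int status)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s: DMA error on %s channel: %d (%s)\n",
			hostname, is_read ? "rx" : "tx", status,
			model_dma_status_name(status));
}
```

For instance, calling model_format_dma_error(buf, sizeof(buf), "mmc0", 0, 3)
fills buf with the tx-channel message followed by the decoded status name.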