From patchwork Sat Mar 17 11:36:55 2018
X-Patchwork-Submitter: Helge Deller
X-Patchwork-Id: 10290773
Date: Sat, 17 Mar 2018 12:36:55 +0100
From: Helge Deller
To: Carlo Pisani
Cc: debian-hppa@lists.debian.org, linux-parisc@vger.kernel.org
Subject: Re: kernel 4.15.7/64bit, C3600 is unstable during heavy I/O on PCI
Message-ID: <20180317113655.GA30572@ls3530.fritz.box>
X-Mailing-List: linux-parisc@vger.kernel.org

* Carlo Pisani:
> I have created and applied the following patch
> testing the kernel with heavy I/O seems now stable
>
> my C3600 is still under testing, moving chunks of 500Mbyte and compiling gcc-v6
>
>
> http://93.55.217.0//wonderland/chunk_of/user/ivelegacy/happa-dev/hppa2_0001_HPMC_fix_my_v1.patch

This one seems wrong. I think you just didn't hit a HPMC with your first
patch, and as such this patch has no influence...

Helge

--- drivers/parisc/lba_pci.c	2018-01-28 22:20:33.000000000 +0100
+++ drivers/parisc/lba_pci.c	2018-03-15 12:26:44.839894952 +0100
@@ -1405,7 +1405,7 @@
 
 	/* Set HF mode as the default (vs. -1 mode). */
 	stat = READ_REG32(d->hba.base_addr + LBA_STAT_CTL);
-	WRITE_REG32(stat | HF_ENABLE, d->hba.base_addr + LBA_STAT_CTL);
+	WRITE_REG32(stat & ~HF_ENABLE, d->hba.base_addr + LBA_STAT_CTL);
 
 	/*
 	** Writing a zero to STAT_CTL.rf (bit 0) will clear reset signal

That's the patch from Kyle:
  https://www.spinics.net/lists/linux-parisc/msg01027.html
which comes out of this mail thread:
  https://www.spinics.net/lists/linux-parisc/msg01024.html
specifically with those notes:
  https://www.spinics.net/lists/linux-parisc/msg01026.html

Citing here:
  "bus timeout" usually means we tried to read an address that doesn't
  respond. that is, nothing on the bus accepted the transaction for it,
  so it timed out and HPMC'd the box.

  what you really need is the IIR, and the address it tried to access
  (both the kernel vaddr which will be in the register, and the "system
  requester address" from the hpmc dump which will be the physical
  address mapped.

  not sure why the hpmc handler is getting skipped, that's a little weird.

  you can try hacking elroy to set softfail mode on that bus, which will
  result in a timeout on the pci bus to return -1 (like what x86 and most
  other architectures do) rather than hang the box, but it really likely
  means a driver bug.

So, you changed LBA to return -1 instead of faulting via HPMC, which is
of course one work-around to avoid the HPMC.
But could you try to check the driver instead?
You run this SATA controller:
  01:05.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID
Can you maybe try to localize where the drivers/ata/sata_via.c driver
triggers the HPMC?

--- arch/parisc/kernel/hpmc.S	2018-01-28 22:20:33.000000000 +0100
+++ arch/parisc/kernel/hpmc.S	2018-03-15 14:13:46.611969815 +0100
@@ -308,4 +290,5 @@
 	.align 4
 	.export os_hpmc_size
 os_hpmc_size:
-	.word .os_hpmc_end-.os_hpmc
+	/* .word .os_hpmc_end-.os_hpmc */
+	.word (.os_hpmc_end - .os_hpmc) * 4 /* sizeof(u32) */