From patchwork Fri Dec 7 22:29:51 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Don Brace
X-Patchwork-Id: 10718979
Subject: [PATCH 17/20] smartpqi: correct lun reset issues
From: Don Brace
Date: Fri, 7 Dec 2018 16:29:51 -0600
Message-ID: <154422179177.1218.6693507854725315677.stgit@brunhilda>
In-Reply-To: <154422079293.1218.12539829857034151457.stgit@brunhilda>
References: <154422079293.1218.12539829857034151457.stgit@brunhilda>
User-Agent: StGit/0.19-dirty
Sender: linux-scsi-owner@vger.kernel.org
X-Mailing-List: linux-scsi@vger.kernel.org

From: Kevin Barnett

Problem:
The Linux kernel takes a logical volume offline after a LUN reset. This is
generally accompanied by this message in the dmesg output:

Device offlined - not ready after error recovery

Root Cause:
The root cause is a "quirk" in the timeout handling in the Linux SCSI
layer. The Linux kernel places a 30-second timeout on most media access
commands (reads and writes) that it sends to device drivers.
When a media access command times out, the Linux kernel goes into error
recovery mode for the LUN that was the target of the command that timed
out. Every command that timed out is kept on a list inside the Linux
kernel to be retried later. The kernel attempts to recover the command(s)
that timed out by issuing a LUN reset followed by a TEST UNIT READY. If the
LUN reset and TEST UNIT READY commands are successful, the kernel retries
the command(s) that timed out.

Each SCSI command issued by the kernel has a result field associated with
it. This field indicates the final result of the command (success or
error). When a command times out, the kernel places a value in this result
field indicating that the command timed out.

The "quirk" is that after the LUN reset and TEST UNIT READY commands are
completed, the kernel checks each command on the timed-out command list
before retrying it. If the result field is still "timed out", the kernel
treats that command as not having been successfully recovered for a retry.
If the number of commands in this state is greater than two, the kernel
takes the LUN offline.

Fix:
When our RAIDStack receives a LUN reset, it simply waits until all
outstanding commands complete. Generally, all of these outstanding commands
complete successfully. Therefore, the fix in the smartpqi driver is to
always set the command result field to indicate success when a request
completes successfully. This normally isn’t necessary because the result
field is always initialized to success when the command is submitted to the
driver. So when the command completes successfully, the result field is
left untouched. But in this case, the kernel changes the result field
behind the driver’s back and then expects the field to be changed by the
driver as the commands that timed out complete.
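For illustration only, the interaction described above can be modeled in
a few lines of userspace C. This is a rough sketch, not kernel code: the
struct, the result values, and every sketch_* name are hypothetical
stand-ins, and the "greater than two" threshold is taken from the
description above rather than from the SCSI midlayer source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's types or values. */
enum { SKETCH_RESULT_OK = 0, SKETCH_RESULT_TIMED_OUT = 1 };
#define SKETCH_MAX_CMDS		8
#define SKETCH_OFFLINE_LIMIT	2	/* "greater than two", per the text */

struct sketch_cmd {
	int result;	/* initialized to success at submission */
};

/* Midlayer side: the result field is changed behind the driver's back. */
static void sketch_mark_timed_out(struct sketch_cmd *cmd)
{
	cmd->result = SKETCH_RESULT_TIMED_OUT;
}

/* Driver side: a command completes successfully after the LUN reset. */
static void sketch_complete_success(struct sketch_cmd *cmd, bool with_fix)
{
	/* With the fix, success is written explicitly on completion. */
	if (with_fix && cmd)
		cmd->result = SKETCH_RESULT_OK;
	/* Without the fix, the field is left untouched. */
}

/* Midlayer side: count commands still marked as timed out. */
static bool sketch_lun_goes_offline(const struct sketch_cmd *cmds, size_t n)
{
	size_t stale = 0;

	for (size_t i = 0; i < n; i++)
		if (cmds[i].result == SKETCH_RESULT_TIMED_OUT)
			stale++;

	return stale > SKETCH_OFFLINE_LIMIT;
}

/* Scenario: n commands time out, then all of them complete successfully. */
static bool sketch_run(size_t n, bool with_fix)
{
	struct sketch_cmd cmds[SKETCH_MAX_CMDS] = { { SKETCH_RESULT_OK } };

	assert(n <= SKETCH_MAX_CMDS);

	for (size_t i = 0; i < n; i++)
		sketch_mark_timed_out(&cmds[i]);
	for (size_t i = 0; i < n; i++)
		sketch_complete_success(&cmds[i], with_fix);

	return sketch_lun_goes_offline(cmds, n);
}
```

In this model, sketch_run(3, false) reports the LUN as offlined because
all three result fields still read "timed out", while sketch_run(3, true)
does not, since each successful completion overwrites the stale value.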

Reviewed-by: Dave Carroll
Reviewed-by: Scott Teel
Signed-off-by: Kevin Barnett
Signed-off-by: Don Brace
---
 drivers/scsi/smartpqi/smartpqi_init.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index bee14fc8a35e..2f2a07a38dad 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -2841,6 +2841,9 @@ static unsigned int pqi_process_io_intr(struct pqi_ctrl_info *ctrl_info,
 		switch (response->header.iu_type) {
 		case PQI_RESPONSE_IU_RAID_PATH_IO_SUCCESS:
 		case PQI_RESPONSE_IU_AIO_PATH_IO_SUCCESS:
+			if (io_request->scmd)
+				io_request->scmd->result = 0;
+			/* fall through */
 		case PQI_RESPONSE_IU_GENERAL_MANAGEMENT:
 			break;
 		case PQI_RESPONSE_IU_VENDOR_GENERAL: