From patchwork Mon Jan 16 20:01:18 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Bottomley X-Patchwork-Id: 9519383 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id AAE226020B for ; Mon, 16 Jan 2017 20:01:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 999D128066 for ; Mon, 16 Jan 2017 20:01:40 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8E4A628138; Mon, 16 Jan 2017 20:01:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1C65928066 for ; Mon, 16 Jan 2017 20:01:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751391AbdAPUB3 (ORCPT ); Mon, 16 Jan 2017 15:01:29 -0500 Received: from bedivere.hansenpartnership.com ([66.63.167.143]:56132 "EHLO bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750884AbdAPUB1 (ORCPT ); Mon, 16 Jan 2017 15:01:27 -0500 Received: from localhost (localhost [127.0.0.1]) by bedivere.hansenpartnership.com (Postfix) with ESMTP id AC3AB8EE216; Mon, 16 Jan 2017 12:01:20 -0800 (PST) Received: from bedivere.hansenpartnership.com ([127.0.0.1]) by localhost (bedivere.hansenpartnership.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6B673PCBERNr; Mon, 16 Jan 2017 12:01:20 -0800 (PST) Received: from [153.66.254.194] (unknown [50.46.144.141]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by bedivere.hansenpartnership.com (Postfix) with ESMTPSA id D6AAC8EE0CD; Mon, 16 Jan 2017 12:01:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=hansenpartnership.com; s=20151216; t=1484596880; bh=S7X3ZMJKDwDZrFnmc9K+p+WU5ZBzFbnTG8c6FlyDHWI=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=o26a0iDXC5pTYp49eoBUrCCmwP3R8tjqvijfFo3Gu32G6Prrq1vB3AR3xPX3Ky91P Mi8lszCXFM0YNUTqI0EGLa5tYw3FA1nJktb4fj1wfp9B0JJ9zsqAg7mnVTOiDYlMqi ypOJ/S1hfmyHU5ns0J11G1WF2zbTti6yDHcnbBno= Message-ID: <1484596878.2540.58.camel@HansenPartnership.com> Subject: Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands From: James Bottomley To: "Martin K. Petersen" Cc: David Miller , Bart Van Assche , Christoph Hellwig , jbaron@akamai.com, "linux-kernel@vger.kernel.org" , sagi@grimberg.me, Sathya Prakash , Suganath Prabu Subramani , Hannes Reinecke , "linux-scsi@vger.kernel.org" , Christoph Hellwig , Chaitra Basappa , dledford@redhat.com, Sreekanth Reddy Date: Mon, 16 Jan 2017 12:01:18 -0800 In-Reply-To: References: <20161229080250.GA11605@infradead.org> <1483226343.2518.32.camel@linux.vnet.ibm.com> <1483280506.5512.1.camel@sandisk.com> <20170101.113311.417107391370623850.davem@davemloft.net> <1483292364.2345.5.camel@linux.vnet.ibm.com> X-Mailer: Evolution 3.16.5 Mime-Version: 1.0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From 91d249409546569444897a1ffde65c421e064899 Mon Sep 17 00:00:00 2001 From: James Bottomley Date: Sun, 1 Jan 2017 09:39:24 -0800 Subject: [PATCH] scsi: mpt3sas: fix hang on ata passthrough commands mpt3sas has a firmware failure where it can only handle one pass through ATA command at a time. If another comes in, contrary to the SAT standard, it will hang until the first one completes (causing long commands like secure erase to timeout). The original fix was to block the device when an ATA command came in, but this caused a regression with commit 669f044170d8933c3d66d231b69ea97cb8447338 Author: Bart Van Assche Date: Tue Nov 22 16:17:13 2016 -0800 scsi: srp_transport: Move queuecommand() wait code to SCSI core So fix the original fix of the secure erase timeout by properly returning SAM_STAT_BUSY like the SAT recommends. The original patch also had a concurrency problem since scsih_qcmd is lockless at that point (this is fixed by using atomic bitops to set and test the flag). Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination) Signed-off-by: James Bottomley Acked-by: Sreekanth Reddy Reviewed-by: Christoph Hellwig --- v2 - use bitops for lockless atomicity v3 - update description, change function name --- drivers/scsi/mpt3sas/mpt3sas_base.h | 12 +++++++++++ drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 +++++++++++++++++++++++------------- 2 files changed, 38 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 394fe13..dcb33f4 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -393,6 +393,7 @@ struct MPT3SAS_TARGET { * @eedp_enable: eedp support enable bit * @eedp_type: 0(type_1), 1(type_2), 2(type_3) * @eedp_block_length: block size + * @ata_command_pending: SATL passthrough outstanding for device */ struct MPT3SAS_DEVICE { struct MPT3SAS_TARGET *sas_target; @@ -404,6 +405,17 @@ struct MPT3SAS_DEVICE { u8 ignore_delay_remove; /* Iopriority Command Handling */ u8 ncq_prio_enable; + /* + * Bug workaround for SATL handling: the mpt2/3sas firmware + * doesn't return BUSY or TASK_SET_FULL for subsequent + * commands while a SATL pass through is in operation as the + * spec requires, it simply does nothing with them until the + * pass through completes, causing them possibly to timeout if + * the passthrough is a long executing command (like format or + * secure erase). This variable allows us to do the right + * thing while a SATL command is pending. + */ + unsigned long ata_command_pending; }; diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index b5c966e..830e2c1 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -3899,9 +3899,18 @@ _scsih_temp_threshold_events(struct MPT3SAS_ADAPTER *ioc, } } -static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd) +static int _scsih_set_satl_pending(struct scsi_cmnd *scmd, bool pending) { - return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16); + struct MPT3SAS_DEVICE *priv = scmd->device->hostdata; + + if (scmd->cmnd[0] != ATA_12 && scmd->cmnd[0] != ATA_16) + return 0; + + if (pending) + return test_and_set_bit(0, &priv->ata_command_pending); + + clear_bit(0, &priv->ata_command_pending); + return 0; } /** @@ -3925,9 +3934,7 @@ _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc) if (!scmd) continue; count++; - if (ata_12_16_cmd(scmd)) - scsi_internal_device_unblock(scmd->device, - SDEV_RUNNING); + _scsih_set_satl_pending(scmd, false); mpt3sas_base_free_smid(ioc, smid); scsi_dma_unmap(scmd); if (ioc->pci_error_recovery) @@ -4063,13 +4070,6 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd) if (ioc->logging_level & MPT_DEBUG_SCSI) scsi_print_command(scmd); - /* - * Lock the device for any subsequent command until command is - * done. - */ - if (ata_12_16_cmd(scmd)) - scsi_internal_device_block(scmd->device); - sas_device_priv_data = scmd->device->hostdata; if (!sas_device_priv_data || !sas_device_priv_data->sas_target) { scmd->result = DID_NO_CONNECT << 16; @@ -4083,6 +4083,19 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd) return 0; } + /* + * Bug work around for firmware SATL handling. The loop + * is based on atomic operations and ensures consistency + * since we're lockless at this point + */ + do { + if (sas_device_priv_data->ata_command_pending) { + scmd->result = SAM_STAT_BUSY; + scmd->scsi_done(scmd); + return 0; + } + } while (_scsih_set_satl_pending(scmd, true)); + sas_target_priv_data = sas_device_priv_data->sas_target; /* invalid device handle */ @@ -4650,8 +4663,7 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply) if (scmd == NULL) return 1; - if (ata_12_16_cmd(scmd)) - scsi_internal_device_unblock(scmd->device, SDEV_RUNNING); + _scsih_set_satl_pending(scmd, false); mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);