From patchwork Fri Jun 26 16:33:59 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 6682531 Return-Path: X-Original-To: patchwork-linux-scsi@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 0F7099F399 for ; Fri, 26 Jun 2015 16:34:12 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 2F0E02069D for ; Fri, 26 Jun 2015 16:34:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2B3EE20547 for ; Fri, 26 Jun 2015 16:34:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752485AbbFZQeH (ORCPT ); Fri, 26 Jun 2015 12:34:07 -0400 Received: from e24smtp01.br.ibm.com ([32.104.18.85]:49467 "EHLO e24smtp01.br.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752469AbbFZQeF (ORCPT ); Fri, 26 Jun 2015 12:34:05 -0400 Received: from /spool/local by e24smtp01.br.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 26 Jun 2015 13:34:03 -0300 Received: from d24dlp01.br.ibm.com (9.18.248.204) by e24smtp01.br.ibm.com (10.172.0.143) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 26 Jun 2015 13:34:02 -0300 X-Helo: d24dlp01.br.ibm.com X-MailFrom: mauricfo@linux.vnet.ibm.com X-RcptTo: linux-scsi@vger.kernel.org Received: from d24relay03.br.ibm.com (d24relay03.br.ibm.com [9.13.184.25]) by d24dlp01.br.ibm.com (Postfix) with ESMTP id 97F873520066; Fri, 26 Jun 2015 12:32:56 -0400 (EDT) Received: from d24av03.br.ibm.com (d24av03.br.ibm.com [9.8.31.95]) by d24relay03.br.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t5QGWkgg6357480; Fri, 26 Jun 2015 13:32:46 -0300 Received: from d24av03.br.ibm.com (localhost [127.0.0.1]) by d24av03.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t5QGY0hZ005468; Fri, 26 Jun 2015 13:34:00 -0300 Received: from [9.18.235.110] (mauricfo-t440.br.ibm.com [9.18.235.110]) by d24av03.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id t5QGY0ao005465; Fri, 26 Jun 2015 13:34:00 -0300 Message-ID: <558D7EF7.30104@linux.vnet.ibm.com> Date: Fri, 26 Jun 2015 13:33:59 -0300 From: Mauricio Faria de Oliveira User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: linux-scsi@vger.kernel.org CC: linux-kernel@vger.kernel.org, JBottomley@parallels.com, qla2xxx-upstream@qlogic.com, cascardo@debian.org, mauricfo@linux.vnet.ibm.com Subject: [PATCH] [RESEND] qla2xxx: prevent board_disable from running during EEH References: <1430395985-17101-1-git-send-email-cascardo@linux.vnet.ibm.com> In-Reply-To: <1430395985-17101-1-git-send-email-cascardo@linux.vnet.ibm.com> X-Forwarded-Message-Id: <1430395985-17101-1-git-send-email-cascardo@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15062616-1524-0000-0000-000002F581E4 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Spam-Status: No, score=-8.3 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Commit f3ddac1918fe963bcbf8d407a3a3c0881b47248b ("[SCSI] qla2xxx: Disable adapter when we encounter a PCI disconnect.") has introduced a code that disables the board, releasing some resources, when reading 0xffffffff. In case this happens when there is an EEH, this read will trigger EEH detection and set PCI channel offline. EEH will be able to recover the card from this state by doing a reset, so it's a better option than simply disabling the card. Since eeh_check_failure will mark the channel as offline before returning the read value, in case there really was an EEH, we can simply check for pci_channel_offline, preventing the board_disable code from running if it's true. Without this patch, EEH code will try to access those same resources that board_disable will try to free. This race can cause EEH recovery to fail. [ 504.370577] EEH: Notify device driver to resume [ 504.370580] qla2xxx [0001:07:00.0]-9002:2: The device failed to resume I/O from slot/link_reset. Signed-off-by: Thadeu Lima de Souza Cascardo Cc: Cc: Acked-by: Himanshu Madhani --- drivers/scsi/qla2xxx/qla_isr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index a04a1b1..8132926 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -116,7 +116,7 @@ bool qla2x00_check_reg32_for_disconnect(scsi_qla_host_t *vha, uint32_t reg) { /* Check for PCI disconnection */ - if (reg == 0xffffffff) { + if (reg == 0xffffffff && !pci_channel_offline(vha->hw->pdev)) { if (!test_and_set_bit(PFLG_DISCONNECTED, &vha->pci_flags) && !test_bit(PFLG_DRIVER_REMOVING, &vha->pci_flags) && !test_bit(PFLG_DRIVER_PROBING, &vha->pci_flags)) {