From patchwork Thu May 11 21:54:05 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Norris
X-Patchwork-Id: 9723195
X-Patchwork-Delegate: bhelgaas@google.com
From: Brian Norris
To: Bjorn Helgaas
Cc: linux-pci@vger.kernel.org, Rajat Jain, Guenter Roeck, Jeffy Chen,
 Shawn Lin, Brian Norris
Subject: [RFC PATCH] PCI: disable MSI/MSI-X before resetting
Date: Thu, 11 May 2017 14:54:05 -0700
Message-Id: <20170511215405.91410-1-briannorris@chromium.org>
X-Mailer: git-send-email 2.13.0.rc2.291.g57267f2277-goog
Sender: linux-pci-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-pci@vger.kernel.org

Despite the claims in the associated comment block, it seems that
clearing the command register is not enough to guarantee that no MSI
interrupts get triggered during Function Level Reset.

Through code instrumentation, I'm able to clearly trace cases like this:

(0) reset a device:
      echo 1 > /sys/bus/pci/devices/xxx/reset
(1) disable an MSI interrupt for device 'xxx' in a PCI reset handler
    (disable_irq())
(2) pcie_flr() initiates reset:
      pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
                               PCI_EXP_DEVCTL_BCR_FLR);
(3) about 0.5 ms later, kernel handles IRQ:
      -> __pci_msi_desc_mask_irq()
         -> pci_write_config_dword()
(4) this is well before the device is actually ready, and the system
    sees a bus abort

Tested with MSI, but presumably MSI-X could have the same issue.

Tested on Samsung Chromebook Plus, with RK3399 OP1.

Signed-off-by: Brian Norris
---
RFC, because I'm not really sure this is the right approach, or if
there's something else that's misconfigured or buggy.

Note that right now, configuration space aborts trigger SError aborts
(and panics) on RK3399, so this sort of problem is fatal. It's not clear
to me if that's a spec violation (many other PCI controllers manage to
mask such aborts), or an acceptable behavior. But FWIW, that means that
polling behavior like in commit 5adecf817dd6 ("PCI: Wait for up to
1000ms after FLR reset") cannot work; the first read when the device is
not ready will cause a panic.
---
 drivers/pci/pci.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b01bd5bba8e6..861a3b2d7026 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4156,6 +4156,17 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
 	pci_set_power_state(dev, PCI_D0);
 
 	pci_save_state(dev);
+
+	/*
+	 * Disable MSI/MSI-X before resetting. Some devices have been found to
+	 * trigger interrupts while in the middle of Function Level Reset. The
+	 * MSI/MSI-X state will get restored after we reset.
+	 */
+	if (dev->msi_enabled)
+		pci_msi_set_enable(dev, 0);
+	if (dev->msix_enabled)
+		pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
+
 	/*
 	 * Disable the device by clearing the Command register, except for
 	 * INTx-disable which is set.  This not only disables MMIO and I/O port