From patchwork Wed Oct 30 09:50:37 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chang Liu
X-Patchwork-Id: 3112751
X-Patchwork-Delegate: bhelgaas@google.com
From: Chang Liu
To: linux-pci@vger.kernel.org
Cc: Chang Liu
Subject: [PATCH] PCI: Blacklist certain hardware from clearing Bus Master bit
Date: Wed, 30 Oct 2013 09:50:37 +0000
Message-Id: <1383126637-4641-1-git-send-email-cl91tp@gmail.com>
X-Mailer: git-send-email 1.8.4.1
X-Mailing-List: linux-pci@vger.kernel.org

This patch adds a blacklist that prevents the PCI device shutdown code
from clearing the Bus Master bit on hardware that cannot tolerate it.

Clearing the Bus Master bit was originally added in commit b566a22 to
address PCI devices that break kexec by continuing to do DMA after
having been signaled to shut down. However, this introduced a
regression (https://bugzilla.kernel.org/show_bug.cgi?id=63861) that
hangs the machine on the kernel power-off path.

It has been pointed out previously (https://lkml.org/lkml/2012/6/6/545)
that clearing the Bus Master bit during PCI shutdown may have
surprising consequences, as some devices may not tolerate it and may
hang the system indefinitely. However, not clearing it may break kexec.

Only one bug report has come up since the introduction of b566a22,
which suggests that these misbehaving devices are in the minority. The
logical way to fix the hang without breaking kexec is therefore to
blacklist these devices from having their Bus Master bit cleared during
the PCI shutdown routine.
---
This fixes the above-mentioned bug:
https://bugzilla.kernel.org/show_bug.cgi?id=63861

As Alan Cox warned in https://lkml.org/lkml/2012/6/6/545, some devices
will break if we clear the Bus Master bit on shutdown. So sooner or
later we will need to blacklist some devices if we are to keep clearing
Bus Master as the default behavior. Now that the first device (as far
as I am aware) that breaks under this default behavior has surfaced, we
might as well add the infrastructure code now in case other devices
break in the future. These devices are unlikely to be many, so a
blacklist likely won't add much maintenance overhead. And we should
keep the blacklist here in the PCI shutdown code instead of in
individual device drivers, since this is the most direct way and will
likely aid future debugging.

 drivers/pci/pci-driver.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 98f7b9b..1744ebf 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -376,6 +376,29 @@ static int pci_device_remove(struct device * dev)
 	return 0;
 }
 
+/*
+ * Blacklisting certain hardware from having their Bus Master bit cleared
+ * during device shutdown. This is to workaround certain hardware's issue
+ * with clearing Bus Master bit that hangs the entire system.
+ */
+struct {
+	unsigned short vendor;
+	unsigned short device;
+} pci_bus_master_blacklist[] = {
+	{ 0x8086, 0x9c03 }, /* Intel Corporation Lynx Point-LP SATA Controller */
+};
+
+static bool pci_should_clear_master(struct pci_dev *pdev)
+{
+	int i;
+	for (i = 0; i < ARRAY_SIZE(pci_bus_master_blacklist); i++) {
+		if (pdev->vendor == pci_bus_master_blacklist[i].vendor
+		    && pdev->device == pci_bus_master_blacklist[i].device)
+			return false;
+	}
+	return true;
+}
+
 static void pci_device_shutdown(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
@@ -391,9 +414,18 @@ static void pci_device_shutdown(struct device *dev)
 	/*
 	 * Turn off Bus Master bit on the device to tell it to not
 	 * continue to do DMA. Don't touch devices in D3cold or unknown states.
+	 * This is useful for proper kexec. However, a number of hardware
+	 * aren't happy with this. At the slightest, some hardware simply
+	 * ignore the bus master bit. For some other hardware, clearing
+	 * the bit on shutdown path hangs the entire system.
+	 * This is likely to be a firmware or hardware problem, but
+	 * we can workaround it here by blacklisting these hardware
+	 * from having their bus master bit cleared during device shutdown.
 	 */
-	if (pci_dev->current_state <= PCI_D3hot)
+	if (pci_should_clear_master(pci_dev)
+	    && pci_dev->current_state <= PCI_D3hot) {
 		pci_clear_master(pci_dev);
+	}
 }
 
 #ifdef CONFIG_PM
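
For anyone who wants to exercise the lookup outside the kernel, the
blacklist check in the patch above can be sketched as a small userspace
program. This is only an illustration under stated assumptions, not the
kernel code: struct mock_pci_dev, bus_master_blacklist and
should_clear_master() are hypothetical stand-ins for struct pci_dev,
pci_bus_master_blacklist and pci_should_clear_master(), and ARRAY_SIZE
is redefined locally because the kernel header is not available. The
table here is also marked static const, which the patch does not do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local replacement for the kernel's ARRAY_SIZE() macro (assumption). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the two struct pci_dev fields the lookup reads. */
struct mock_pci_dev {
	unsigned short vendor;
	unsigned short device;
};

/* Same single entry as the patch; static const is a choice made for this sketch. */
static const struct {
	unsigned short vendor;
	unsigned short device;
} bus_master_blacklist[] = {
	{ 0x8086, 0x9c03 }, /* Intel Corporation Lynx Point-LP SATA Controller */
};

/* Returns false when the device is blacklisted, true otherwise. */
static bool should_clear_master(const struct mock_pci_dev *pdev)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(bus_master_blacklist); i++) {
		if (pdev->vendor == bus_master_blacklist[i].vendor &&
		    pdev->device == bus_master_blacklist[i].device)
			return false;
	}
	return true;
}
```

Calling should_clear_master() on a device with IDs 8086:9c03 returns
false, so the shutdown path would leave Bus Master set; any other
vendor/device pair returns true and keeps the behavior from b566a22.
Both IDs must match, so another device from the same vendor is not
affected.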