From patchwork Tue Mar 3 22:38:16 2015
X-Patchwork-Submitter: Bjorn Helgaas
X-Patchwork-Id: 5926731
Date: Tue, 3 Mar 2015 16:38:16 -0600
From: Bjorn Helgaas
To: Daniel J Blueman
Cc: Jiang Liu, Ingo Molnar, H Peter Anvin, Thomas Gleixner, Linux Kernel,
 Steffen Persvold, "x86@kernel.org", Yinghai Lu,
 linux-pci@vger.kernel.org, linux-acpi@vger.kernel.org
Subject: Re: PCIe 32-bit MMIO exhaustion
Message-ID: <20150303223816.GB22299@google.com>
References: <54C8A10B.3070207@numascale.com> <54EC0013.7000100@numascale.com>
In-Reply-To: <54EC0013.7000100@numascale.com>

[+cc linux-pci,
linux-acpi]

On Tue, Feb 24, 2015 at 12:37:39PM +0800, Daniel J Blueman wrote:
> Hi Bjorn, Jiang,
>
> On 29/01/2015 23:23, Bjorn Helgaas wrote:
> >Hi Daniel,
> >
> >On Wed, Jan 28, 2015 at 2:42 AM, Daniel J Blueman wrote:
> >>With systems with a large number of PCI devices, we're seeing lack of 32-bit
> >>MMIO space, eg one quad-port NetXtreme-2 adapter takes 128MB of space [1].
> >>
> >>An errata to the PCIe 2.1 spec provides guidance on limitations with 64-bit
> >>non-prefetchable BARs (since bridges have only 32-bit non-prefetchable
> >>ranges) stating that vendors can enable the prefetchable bit in BARs under
> >>certain circumstances to allow 64-bit allocation [2].
> >>
> >>The problem with that, is that vendors can't know apriori what hosts their
> >>products will be in, so can't just advertise prefetchable 64-bit BARs. What
> >>can be done, is system firmware can use the 64-bit prefetchable BAR in
> >>bridges, and assign a 64-bit non-prefetchable device BAR into that area,
> >>where it is safe to do so (following the guidance).
> >>
> >>At present, linux denies such allocations [3] and disables the BARs. It
> >>seems a practical solution to allow them if the firmware believes it is
> >>safe.
> >
> >This particular message ([3]):
> >
> >>pci 0002:01:00.0: BAR 0: [mem size 0x00002000 64bit] conflicts with PCI Bus
> >>0002:00 [mem 0x10020000000-0x10027ffffff pref]
> >
> >is misleading at best and likely a symptom of a bug. We printed the
> >*size* of BAR 0, not an address, which means we haven't assigned space
> >for the BAR. That means it should not conflict with anything.
> >
> >We already do revert to firmware assignments in some situations when
> >Linux can't figure out how to assign things itself. But apparently
> >not in *this* situation.
> >
> >Without seeing the whole picture, it's hard for me to figure out
> >what's going on here. Could you open a bug report at
> >http://bugzilla.kernel.org (category drivers/PCI) and attach a
> >complete dmesg and "lspci -vv" output? Then we can look at what
> >firmware did and what Linux thought was wrong with it.
>
> Done a while back:
> https://bugzilla.kernel.org/show_bug.cgi?id=92671
>
> An interesting question popped up: I find the kernel doesn't accept
> IO BARs and bridge windows after address 0xffff, though the PCI spec
> and modern hardware allows 32-bit decode.
>
> Thus for practical reasons, our NumaConnect firmware doesn't setup
> IO BARs/windows beyond the first PCI domain (which is the only one
> with legacy support, and no drivers seem to require IO their BARs
> anyway), ...

If we don't handle IO ports above 0xffff, I think that's broken. I'm
pretty sure we do handle that on ia64 (it's done by assigning 64K of
IO space to each host bridge, and I think it's typically translated
by the bridge so each root bus sees a 0-64K space on PCI). We should
be able to do something similar on x86, but it may not be implemented
there yet.

> and we get conflicts and warnings [1]:
>
> pnp 00:00: disabling [io 0x0061] because it overlaps 0001:05:00.0
> BAR 0 [io 0x0000-0x00ff]
> pci 0001:03:00.0: BAR 13: no space for [io size 0x1000]
> pci 0001:03:00.0: BAR 13: failed to assign [io size 0x1000]
>
> Is there a cleaner way of dealing with this, in our firmware and/or
> the kernel? Eg, I guess if IO BARs aren't assigned (value 0) on PCI
> domains without IO bridge windows in the ACPI AML, no need to
> conflict/attempt assignment?

Yes, we should be able to deal with this better. The complaint about
disabling the pnp 00:00 resource is bogus because the PCI 0001:05:00.0
BAR is not assigned and should never be enabled, so this is not a real
conflict.  My intent is that the PCI resource corresponding to this BAR
should have the IORESOURCE_UNSET bit set.  That will prevent
pci_enable_resources() from setting the PCI_COMMAND_IO bit, which is
what would enable the BAR.
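
To make the intent above concrete, here is a minimal user-space sketch
(not kernel code) of how an "unset" flag on a BAR resource could keep it
out of both the enable path and the PNP overlap check. Everything named
sketch_* / SKETCH_* is a hypothetical stand-in defined locally for
illustration. The flag values are assumed to mirror the kernel's
IORESOURCE_* and PCI_COMMAND_* definitions, and the real
pci_enable_resources() may treat an unset BAR differently (for example
by failing the enable), so this models only the behavior described in
the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; values assumed to match the kernel's definitions. */
#define SKETCH_IORESOURCE_IO      0x00000100u
#define SKETCH_IORESOURCE_MEM     0x00000200u
#define SKETCH_IORESOURCE_UNSET   0x20000000u
#define SKETCH_PCI_COMMAND_IO     0x1u
#define SKETCH_PCI_COMMAND_MEMORY 0x2u

/* Hypothetical, simplified model of a resource: inclusive range + flags. */
struct sketch_resource {
	uint64_t start;
	uint64_t end;
	uint32_t flags;
};

/*
 * Command-register bits this BAR would contribute when the device is
 * enabled. An unset BAR contributes nothing, so it is never decoded.
 */
uint16_t sketch_enable_bits(const struct sketch_resource *bar)
{
	if (bar->flags & SKETCH_IORESOURCE_UNSET)
		return 0;
	if (bar->flags & SKETCH_IORESOURCE_IO)
		return SKETCH_PCI_COMMAND_IO;
	if (bar->flags & SKETCH_IORESOURCE_MEM)
		return SKETCH_PCI_COMMAND_MEMORY;
	return 0;
}

/*
 * True if a PNP resource really conflicts with a PCI BAR: same type,
 * BAR has an assigned address, and the two inclusive ranges overlap.
 */
bool sketch_pnp_conflicts(const struct sketch_resource *pnp,
			  const struct sketch_resource *bar)
{
	uint32_t type = bar->flags &
			(SKETCH_IORESOURCE_IO | SKETCH_IORESOURCE_MEM);

	assert(pnp->start <= pnp->end && bar->start <= bar->end);

	if (!type || !(pnp->flags & type))
		return false;	/* not comparable */
	if (bar->flags & SKETCH_IORESOURCE_UNSET)
		return false;	/* unassigned BAR cannot conflict */

	return !(pnp->end < bar->start || pnp->start > bar->end);
}
```

With the ranges from the log above (PNP [io 0x0061] against BAR 0
[io 0x0000-0x00ff]), sketch_pnp_conflicts() reports a conflict only when
the BAR's flags lack the unset bit, which matches the observation that
the warning appears because the bit is not set where it should be.
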
Can you try the patch below?  I don't think it will work right off the
bat because I think the fact that we print "[io 0x0000-0x00ff]" instead
of "[io size 0x0100]" means we don't have IORESOURCE_UNSET set in the
PCI resource.  But maybe you can figure out where it *should* be getting
set?

Bjorn


commit fd4888cf942a2ae9cdefc46d1fba86b2c7ec2dbf
Author: Bjorn Helgaas
Date:   Tue Mar 3 16:13:56 2015 -0600

    PNP: Don't check for overlaps with unassigned PCI BARs

    After 0509ad5e1a7d ("PNP: disable PNP motherboard resources that overlap
    PCI BARs"), we disable and warn about PNP resources that overlap PCI BARs.
    But we assume that all PCI BARs are valid, which is incorrect, because a
    BAR may not have any space assigned to it.  In that case, we will not
    enable the BAR, so no other resource can conflict with it.

    Ignore PCI BARs that are unassigned, as indicated by IORESOURCE_UNSET.

    Signed-off-by: Bjorn Helgaas

---
diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index ebf0d6710b5a..943c1cb9566c 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -246,13 +246,16 @@ static void quirk_system_pci_resources(struct pnp_dev *dev)
 	 */
 	for_each_pci_dev(pdev) {
 		for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
-			unsigned long type;
+			unsigned long flags, type;
 
-			type = pci_resource_flags(pdev, i) &
-					(IORESOURCE_IO | IORESOURCE_MEM);
+			flags = pci_resource_flags(pdev, i);
+			type = flags & (IORESOURCE_IO | IORESOURCE_MEM);
 			if (!type || pci_resource_len(pdev, i) == 0)
 				continue;
 
+			if (flags & IORESOURCE_UNSET)
+				continue;
+
 			pci_start = pci_resource_start(pdev, i);
 			pci_end = pci_resource_end(pdev, i);
 			for (j = 0;