Message ID: 20230421165037.2506-1-Jonathan.Cameron@huawei.com (mailing list archive)
State: New, archived
Series: [RFC] hw/arm/virt: Provide DT binding generation for PCI eXpander Bridges
On Fri, 21 Apr 2023 at 17:50, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> This patch and the problem it is trying to resolve will form part of a talk
> I will be giving next week at Linaro Connect. https://sched.co/1K85p
>
> So in the spirit of giving people almost no warning... Plus being able to
> say the discussion has kicked off here is the simplest solution I could
> come up with. If we can agree on an approach to the sizing of memory
> windows question by Thursday even better (I'll update the slides!) ;)
>
> This is a precursor for CXL on arm-virt (for which DT support was
> requested). CXL QEMU emulation uses pxb-cxl which inherits from the
> slightly simpler pxb-pcie. ACPI support for these additional host bridges
> has been available for some time but relies on an interesting dance with
> EDK2:
> * During initial board creation the PXB buses are largely ignored
>   and ACPI tables + DT passed to EDK2 only expose the root bus on
>   which the section of the PXB instance that exists as a normal PCI EP
>   may be discovered.
> * EDK2 then carries out full PCI bus enumeration, including the buses
>   below the PXB host bridges.
> * Finally EDK2 calls back into QEMU to rebuild the tables via
>   virt_acpi_build_update(), at which point the correct DSDT for the
>   PXB host bridges is generated and an adjusted DSDT section is generated
>   for the primary host bridge (holes are punched in the _CRS).
>
> For device tree bindings there is no such dance with the firmware and as
> such we need QEMU to directly present device tree entries for the primary
> PCIe bus and the PXB instances.

So, not knowing anything about CXL, my naive question is: this is PCI,
why can't we handle it the way we handle everything else PCI, i.e.
present the PCI controller in the DTB and it's the guest kernel's job
to probe, enumerate, etc whatever it wants ?

thanks
-- PMM
On Mon, 24 Apr 2023 10:28:30 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Fri, 21 Apr 2023 at 17:50, Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > This patch and the problem it is trying to resolve will form part of a talk
> > I will be giving next week at Linaro Connect. https://sched.co/1K85p
> >
> > So in the spirit of giving people almost no warning... Plus being able to
> > say the discussion has kicked off here is the simplest solution I could
> > come up with. If we can agree on an approach to the sizing of memory
> > windows question by Thursday even better (I'll update the slides!) ;)
> >
> > This is a precursor for CXL on arm-virt (for which DT support was
> > requested). CXL QEMU emulation uses pxb-cxl which inherits from the
> > slightly simpler pxb-pcie. ACPI support for these additional host bridges
> > has been available for some time but relies on an interesting dance with
> > EDK2:
> > * During initial board creation the PXB buses are largely ignored
> >   and ACPI tables + DT passed to EDK2 only expose the root bus on
> >   which the section of the PXB instance that exists as a normal PCI EP
> >   may be discovered.
> > * EDK2 then carries out full PCI bus enumeration, including the buses
> >   below the PXB host bridges.
> > * Finally EDK2 calls back into QEMU to rebuild the tables via
> >   virt_acpi_build_update(), at which point the correct DSDT for the
> >   PXB host bridges is generated and an adjusted DSDT section is generated
> >   for the primary host bridge (holes are punched in the _CRS).
> >
> > For device tree bindings there is no such dance with the firmware and as
> > such we need QEMU to directly present device tree entries for the primary
> > PCIe bus and the PXB instances.
>
> So, not knowing anything about CXL, my naive question is:
> this is PCI, why can't we handle it the way we handle everything
> else PCI, i.e. present the PCI controller in the DTB and it's
> the guest kernel's job to probe, enumerate, etc whatever it wants ?

Absolutely the kernel will still do the enumeration. But there's a problem -
it won't always work unless there is 'enough room'.

If the aim is to do it in a similar fashion to how PCI Expander Bridges (PXB)
are handled today with ACPI (we could look at doing something different of
course):

We have one set of memory windows for assigning PCI BARs into etc. In the
model used for PXB, that set of resources is shared between different host
bridges including the main one (each one has a separate DT description).

So for virt, VIRT_PCIE_MMIO, VIRT_PCIE_IO, VIRT_PCIE_ECAM, VIRT_HIGH_PCIE_ECAM
and VIRT_HIGH_PCIE_MMIO are split up between the host bridges. Each host
bridge has its own DT description.

For ACPI, how to split the available memory windows up depends on the
requirements of the PCIe devices below the host bridges. For ACPI we 'know'
the answer as to what is required at the point of creating the description
because EDK2 did the work for us. For DT we currently don't. What to do about
that is the question this RFC tries to address.

In theory we could change the kernel to support enumerating PXB instances,
but that's a very different model from today where they are just 'standard'
host bridges, using the generic bindings for ACPI (and after this patch for
DT).

I'll add an example in a reply (on the wrong machine and stuck in a meeting
today).

Jonathan

>
> thanks
> -- PMM
On Mon, 24 Apr 2023 at 16:41, Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 24 Apr 2023 10:28:30 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
> > So, not knowing anything about CXL, my naive question is:
> > this is PCI, why can't we handle it the way we handle everything
> > else PCI, i.e. present the PCI controller in the DTB and it's
> > the guest kernel's job to probe, enumerate, etc whatever it wants ?
>
> Absolutely the kernel will still do the enumeration. But there's a problem
> - it won't always work, unless there is 'enough room'.
>
> If the aim is to do it in a similar fashion to how PCI Expander Bridges (PXB)
> are handled today with ACPI (we could look at doing something different of course):
>
> We have one set of memory windows for assigning PCI BARs into etc. In the model
> used for PXB, that set of resources is shared between different host bridges
> including the main one (each one has a separate DT description).
>
> So for virt, VIRT_PCIE_MMIO, VIRT_PCIE_IO, VIRT_PCIE_ECAM, VIRT_HIGH_PCIE_ECAM
> and VIRT_HIGH_PCIE_MMIO are split up between the host bridges.
> Each host bridge has its own DT description.
>
> For ACPI, how to split the available memory windows up depends on the requirements
> of the PCIe devices below the host bridges. For ACPI we 'know' the answer
> as to what is required at the point of creating the description because EDK2
> did the work for us. For DT we currently don't. What to do about that is the
> question this RFC tries to address.
>
> In theory we could change the kernel to support enumerating PXB instances, but
> that's a very different model from today where they are just 'standard'
> host bridges, using the generic bindings for ACPI (and after this patch for DT).

On the other hand, having QEMU enumerate PCI devices is *also* a
very different model from today, where we assume that the guest
code is the one that is going to deal with enumerating PCI devices.
To my mind one of the major advantages of PCI is exactly that it
is entirely probeable and discoverable, so that there is no need
for the dtb to include a lot of information that the kernel can
find out for itself by directly asking the hardware...

thanks
-- PMM
On Mon, 24 Apr 2023 16:45:48 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Mon, 24 Apr 2023 at 16:41, Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 24 Apr 2023 10:28:30 +0100
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> > > So, not knowing anything about CXL, my naive question is:
> > > this is PCI, why can't we handle it the way we handle everything
> > > else PCI, i.e. present the PCI controller in the DTB and it's
> > > the guest kernel's job to probe, enumerate, etc whatever it wants ?
> >
> > Absolutely the kernel will still do the enumeration. But there's a problem
> > - it won't always work, unless there is 'enough room'.
> >
> > If the aim is to do it in a similar fashion to how PCI Expander Bridges (PXB)
> > are handled today with ACPI (we could look at doing something different of course):
> >
> > We have one set of memory windows for assigning PCI BARs into etc. In the model
> > used for PXB, that set of resources is shared between different host bridges
> > including the main one (each one has a separate DT description).
> >
> > So for virt, VIRT_PCIE_MMIO, VIRT_PCIE_IO, VIRT_PCIE_ECAM, VIRT_HIGH_PCIE_ECAM
> > and VIRT_HIGH_PCIE_MMIO are split up between the host bridges.
> > Each host bridge has its own DT description.
> >
> > For ACPI, how to split the available memory windows up depends on the requirements
> > of the PCIe devices below the host bridges. For ACPI we 'know' the answer
> > as to what is required at the point of creating the description because EDK2
> > did the work for us. For DT we currently don't. What to do about that is the
> > question this RFC tries to address.
> >
> > In theory we could change the kernel to support enumerating PXB instances, but
> > that's a very different model from today where they are just 'standard'
> > host bridges, using the generic bindings for ACPI (and after this patch for DT).
>
> On the other hand, having QEMU enumerate PCI devices is *also* a
> very different model from today, where we assume that the guest
> code is the one that is going to deal with enumerating PCI devices.
> To my mind one of the major advantages of PCI is exactly that it
> is entirely probeable and discoverable, so that there is no need
> for the dtb to include a lot of information that the kernel can
> find out for itself by directly asking the hardware...

I absolutely agree that QEMU enumerating PCI seems a silly level of complexity
to introduce. So the easy route is to just use the bus numbers to partition
resources. We have those available without any complexity. It's not the same
as how it's done with ACPI, but then the alternatives aren't either (though
maybe they are closer). Note the current proposed algorithm may be too
simplistic (perhaps some alignment forcing should adjust the division of the
resources to start on round-number addresses).

A more complex route would be to push the whole problem to the kernel. I
doubt we'll get any agreement to add this to the generic host bridge DT
bindings given it's just to support QEMU-specific host bridges which the
kernel has previously had no knowledge of beyond as generic PCIe host
bridges. So we'd need QEMU to detect the presence of PXB instances, then use
a new DT compatible for the main host bridge and kick off a new type of host
bridge discovery. Probably also need to modify EDK2 to recognize this, which
will give us compatibility problems with existing ACPI usage (EDK2 reads the
DT bindings to kick off its PCIe enumeration, so this gets fiddly). May also
have problems with the kernel having to do late discovery of host bridges
(not tried it, but sounds 'interesting'.)

Anyhow, I promised some illustrative examples. For reference, here's one of
my test configurations (slightly pathological as it exercises some of the
heuristics in enumeration):

-+-[0000:00]-+-00.0  Red Hat, Inc. QEMU PCIe Host bridge
 |           +-01.0-[01]--
 |           +-02.0  Red Hat, Inc. Virtio block device
 |           +-03.0  Red Hat, Inc. Virtio network device
 |           +-04.0  Red Hat, Inc. QEMU PCIe Expander bridge
 |           \-05.0  Red Hat, Inc. QEMU PCIe Expander bridge
 +-[0000:0c]---00.0-[0d-0f]--+-00.0-[0e-0f]----00.0-[0f]----00.0  Red Hat, Inc. Virtio RNG
 |                           \-00.1  Red Hat, Inc. Virtio RNG
 \-[0000:30]---00.0-[31]----00.0  Red Hat, Inc. Virtio RNG

It's somewhat easier to see the current ACPI results vs what this patch
proposes for DT by looking at /proc/iomem than by looking at the firmware
descriptions themselves. /proc/iomem includes the firmware-described host
bridge windows and the PCI buses and devices enumerated below them (EDK2
enumerated for ACPI, Linux kernel for DT).

There are some other subtle differences in how the memory is mapped, but
hopefully you can see the relationship between what happens with ACPI
(requiring enumeration) and the proposal in this RFC for DT, which doesn't.

Relevant chunks of /proc/iomem for ACPI for the 32-bit region (64-bit one
similar):

//FW described main HB
10000000-103fffff : PCI Bus 0000:00
  //Everything indented discovered by enumeration of PCI bus.
  10000000-101fffff : PCI Bus 0000:01
  10200000-1023ffff : 0000:00:03.0
  10240000-10240fff : 0000:00:01.0
  10241000-10241fff : 0000:00:02.0
  10242000-10242fff : 0000:00:03.0
//FW described. 1st PXB on bus 0xc
//Key: start is defined as previous enumerated device end + some heuristic-driven padding.
10400000-10700fff : PCI Bus 0000:0c
  10400000-105fffff : PCI Bus 0000:0d
    10400000-10400fff : PCI Bus 0000:0e
      10400000-10400fff : PCI Bus 0000:0f
        10400000-10400fff : 0000:0f:00.0
    10401000-10401fff : 0000:0d:00.1
  10600000-10600fff : 0000:0c:00.0
//FW described spare bit assigned to main HB
10701000-107fffff : PCI Bus 0000:00
//FW described. 2nd PXB on bus 0x30
//Key: start is defined as previous enumerated device end + some heuristic-driven padding.
10800000-10a00fff : PCI Bus 0000:30
  10800000-108fffff : PCI Bus 0000:31
    10800000-10800fff : 0000:31:00.0
  10900000-10900fff : 0000:30:00.0
10a01000-3efeffff : PCI Bus 0000:00

DT version of the equivalent from this patch. Memory split up based only on
bus numbers:

//FW described.
10000000-1233f3ff : pcie@10000000
  //Everything indented is enumerated by the Linux kernel
  10000000-101fffff : PCI Bus 0000:01
  10200000-1023ffff : 0000:00:03.0
  10240000-10240fff : 0000:00:01.0
  10241000-10241fff : 0000:00:02.0
  10242000-10242fff : 0000:00:03.0
  //lots of space here
//FW described 1st PXB - note way above end of previous enumerated region.
1233f400-18cfcfff : pcie@1233f400
  12340000-12340fff : 0001:0c:00.0
  12400000-126fffff : PCI Bus 0001:0d
    12400000-125fffff : PCI Bus 0001:0e
      12400000-125fffff : PCI Bus 0001:0f
        12400000-12400fff : 0001:0f:00.0
    12600000-12600fff : 0001:0d:00.1
  //lots of space here..
//FW described 2nd PXB - note way above end of previous enumerated region.
18cfd000-3efeffff : pcie@18cfd000
  18cfd000-18cfdfff : 0002:30:00.0
  18d00000-18efffff : PCI Bus 0002:31
    18d00000-18d00fff : 0002:31:00.0
  //lots of space here.

Note the main bridge _CRS and hence /proc/iomem entries have entries after
the PXBs because QEMU provides the left-over bits of the window, even though
there is no easy way to use them. For the DT scheme they end up in the space
for whichever PXB has the highest bus number.

In this case we have masses of room in the DT scheme as all the devices have
small BARs. I have test cases that fail to have enough room, but those are
even worse to read and require hacks to the PCI test device driver.

I'm not great at reading DT binary from sysfs, so some of the DT version of
this is ripped from /proc/iomem above rather than taken directly:

//Only first entry of each _CRS type is actually useful.
//In DT node: pcie@10000000
Device (PCI0)
{
    Name (_HID, "PNP0A08" /* PCI Express Bus */)  // _HID: Hardware ID
    Name (_CID, "PNP0A03" /* PCI Bus */)  // _CID: Compatible ID
    Name (_SEG, Zero)  // _SEG: PCI Segment
    Name (_BBN, Zero)  // _BBN: BIOS Bus Number
    ...
    Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
    {
        //Bus numbers 0 to B :
        // DT bus-range = [0, b]
        WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
            0x0000,             // Granularity
            0x0000,             // Range Minimum
            0x000B,             // Range Maximum
            0x0000,             // Translation Offset
            0x000C,             // Length
            ,, )
        //First part to cover devices enumerated + space from heuristics (bit of padding)
        // DT ranges first entry 10000000-1233f3ff : pcie@10000000
        DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x00000000,         // Granularity
            0x10000000,         // Range Minimum
            0x103FFFFF,         // Range Maximum
            0x00000000,         // Translation Offset
            0x00400000,         // Length
            ,, , AddressRangeMemory, TypeStatic)
        //A hole from enumeration of the PXBs. Doesn't exist in DT form...
        DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x00000000,         // Granularity
            0x10701000,         // Range Minimum
            0x107FFFFF,         // Range Maximum
            0x00000000,         // Translation Offset
            0x000FF000,         // Length
            ,, , AddressRangeMemory, TypeStatic)
        //Left over above PXBs. Doesn't exist in DT form where we assign 'the rest' to
        //the PXB with the highest bus number.
        DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x00000000,         // Granularity
            0x10A01000,         // Range Minimum
            0x3EFEFFFF,         // Range Maximum
            0x00000000,         // Translation Offset
            0x2E5EF000,         // Length
            ,, , AddressRangeMemory, TypeStatic)
        //First part to cover devices enumerated (+ space from heuristics)
        //I left the whole range for IO on the DT version. Don't really care about this
        //being available for modern PCIe devices below PXBs anyway.
        DWordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
            0x00000000,         // Granularity
            0x00000000,         // Range Minimum
            0x00000FFF,         // Range Maximum
            0x3EFF0000,         // Translation Offset
            0x00001000,         // Length
            ,, , TypeStatic, DenseTranslation)
        //Left over of original window - not used by anything
        DWordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
            0x00000000,         // Granularity
            0x00003000,         // Range Minimum
            0x0000FFFF,         // Range Maximum
            0x3EFF0000,         // Translation Offset
            0x0000D000,         // Length
            ,, , TypeStatic, DenseTranslation)
        //First part to cover the devices enumerated.
        // DT proportional version: 8000000000-85ffffffff : pcie@10000000
        // Note that in this example all the devices have small BARs so with enumeration
        // they don't need as much space as I gave with the DT version.
        QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x0000000000000000, // Granularity
            0x0000008000000000, // Range Minimum
            0x00000080000FFFFF, // Range Maximum
            0x0000000000000000, // Translation Offset
            0x0000000000100000, // Length
            ,, , AddressRangeMemory, TypeStatic)
        //Left over of original window - not used by anything.
        QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x0000000000000000, // Granularity
            0x0000008000400000, // Range Minimum
            0x000000FFFFFFFFFF, // Range Maximum
            0x0000000000000000, // Translation Offset
            0x0000007FFFC00000, // Length
            ,, , AddressRangeMemory, TypeStatic)
    })
    ...

Device (PC0C)
{
    Name (_HID, "PNP0A08" /* PCI Express Bus */)  // _HID: Hardware ID
    Name (_CID, "PNP0A03" /* PCI Bus */)  // _CID: Compatible ID
    Name (_BBN, 0x0C)  // _BBN: BIOS Bus Number
    Name (_UID, 0x0C)  // _UID: Unique ID
    ...
    Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
    {
        // IO space big enough for enumerated devices on first PXB
        // For DT I gave no IO space to PXBs for now.
        DWordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
            0x00000000,         // Granularity
            0x00001000,         // Range Minimum
            0x00001FFF,         // Range Maximum
            0x3EFF0000,         // Translation Offset
            0x00001000,         // Length
            ,, , TypeStatic, DenseTranslation)
        //Lower mem region for enumerated devices on first PXB
        //DT : 1233f400-18cfcfff : pcie@1233f400
        DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x00000000,         // Granularity
            0x10400000,         // Range Minimum
            0x10700FFF,         // Range Maximum
            0x00000000,         // Translation Offset
            0x00301000,         // Length
            ,, , AddressRangeMemory, TypeStatic)
        //High mem region for enumerated devices on first PXB
        //DT: 8600000000-97ffffffff : pcie@1233f400
        QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x0000000000000000, // Granularity
            0x0000008000100000, // Range Minimum
            0x00000080002FFFFF, // Range Maximum
            0x0000000000000000, // Translation Offset
            0x0000000000200000, // Length
            ,, , AddressRangeMemory, TypeStatic)
        //PXB covers bus numbers 0xc to 0xf
        WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
            0x0000,             // Granularity
            0x000C,             // Range Minimum
            0x000F,             // Range Maximum
            0x0000,             // Translation Offset
            0x0004,             // Length
            ,, )
    })

Device (PC30)
{
    Name (_HID, "PNP0A08" /* PCI Express Bus */)  // _HID: Hardware ID
    Name (_CID, "PNP0A03" /* PCI Bus */)  // _CID: Compatible ID
    Name (_BBN, 0x30)  // _BBN: BIOS Bus Number
    Name (_UID, 0x30)  // _UID: Unique ID
    Name (_STR, Unicode ("pxb Device"))  // _STR: Description String
    Name (_CCA, One)  // _CCA: Cache Coherency Attribute
    Name (_PRT, Package (0x80)  // _PRT: PCI Routing Table
    ...
    Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
    {
        //IO space for 2nd PXB - punched out of the main bus space.
        //DT: Again no IO space.
        DWordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
            0x00000000,         // Granularity
            0x00002000,         // Range Minimum
            0x00002FFF,         // Range Maximum
            0x3EFF0000,         // Translation Offset
            0x00001000,         // Length
            ,, , TypeStatic, DenseTranslation)
        //low mem space for 2nd PXB punched out of main host bridge space.
        // DT: 18cfd000-3efeffff : pcie@18cfd000
        DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x00000000,         // Granularity
            0x10800000,         // Range Minimum
            0x10A00FFF,         // Range Maximum
            0x00000000,         // Translation Offset
            0x00201000,         // Length
            ,, , AddressRangeMemory, TypeStatic)
        //high mem space for 2nd PXB punched out of main host bridge space.
        // DT: 9800000000-ffffffffff : pcie@18cfd000
        QWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
            0x0000000000000000, // Granularity
            0x0000008000300000, // Range Minimum
            0x00000080003FFFFF, // Range Maximum
            0x0000000000000000, // Translation Offset
            0x0000000000100000, // Length
            ,, , AddressRangeMemory, TypeStatic)
        // Just 2 bus numbers (directly connected device)
        WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
            0x0000,             // Granularity
            0x0030,             // Range Minimum
            0x0031,             // Range Maximum
            0x0000,             // Translation Offset
            0x0002,             // Length
            ,, )
    })

Maybe this mess will help illustrate the issues and approaches. I personally
think the rough approach of this patch is the most sensible: don't enumerate
the bus at all for the DT case, just do partitioning based on bus numbers.
In corner cases that will mean the kernel enumeration fails when things would
have worked with the current ACPI / EDK2 dance that allows enumeration before
the ACPI tables the kernel uses are written.

Jonathan

>
> thanks
> -- PMM
On Mon, 24 Apr 2023 at 22:56, Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 24 Apr 2023 16:45:48 +0100
> Peter Maydell <peter.maydell@linaro.org> wrote:
> > On the other hand, having QEMU enumerate PCI devices is *also* a
> > very different model from today, where we assume that the guest
> > code is the one that is going to deal with enumerating PCI devices.
> > To my mind one of the major advantages of PCI is exactly that it
> > is entirely probeable and discoverable, so that there is no need
> > for the dtb to include a lot of information that the kernel can
> > find out for itself by directly asking the hardware...
>
> I absolutely agree that QEMU enumerating PCI seems a silly level of complexity
> to introduce. So the easy route is to just use the bus numbers to partition
> resources. We have those available without any complexity. It's not the
> same as how it's done with ACPI, but then the alternatives aren't either
> (though maybe they are closer). Note the current proposed algorithm may be
> too simplistic (perhaps some alignment forcing should adjust the division
> of the resources to start on round-number addresses)

I think we definitely need to talk about this later this week,
but my initial view is that if:

(1) the guest kernel can get the information it needs to do this
    by probing the hardware
(2) doing it in QEMU gives you "this isn't a great allocation",
    "we don't really have the info we need to do it optimally",
    "this is more of a policy decision" effects
    (which is what it's sounding like to me)

this is a really strong argument for "guest software should be
doing this". DTB-booting kernels has always meant the kernel
doing a lot of work that under ACPI/UEFI/x86-PC is typically
done by firmware, and this seems similar to me.

thanks
-- PMM
On Tue, 25 Apr 2023 09:28:44 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Mon, 24 Apr 2023 at 22:56, Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 24 Apr 2023 16:45:48 +0100
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> > > On the other hand, having QEMU enumerate PCI devices is *also* a
> > > very different model from today, where we assume that the guest
> > > code is the one that is going to deal with enumerating PCI devices.
> > > To my mind one of the major advantages of PCI is exactly that it
> > > is entirely probeable and discoverable, so that there is no need
> > > for the dtb to include a lot of information that the kernel can
> > > find out for itself by directly asking the hardware...
> >
> > I absolutely agree that QEMU enumerating PCI seems a silly level of complexity
> > to introduce. So the easy route is to just use the bus numbers to partition
> > resources. We have those available without any complexity. It's not the
> > same as how it's done with ACPI, but then the alternatives aren't either
> > (though maybe they are closer). Note the current proposed algorithm may be
> > too simplistic (perhaps some alignment forcing should adjust the division
> > of the resources to start on round-number addresses)
>
> I think we definitely need to talk about this later this week,
> but my initial view is that if:
> (1) the guest kernel can get the information it needs to do this
>     by probing the hardware

Not currently (from a quick look) - see below. But we could potentially make
it visible by augmenting the config space of the PCIe devices that are
discoverable. Or we could in theory ignore the bus numbers that we do provide
as QEMU parameters, but that would be rather surprising for users.

> (2) doing it in QEMU gives you "this isn't a great allocation",
>     "we don't really have the info we need to do it optimally",
>     "this is more of a policy decision" effects
>     (which is what it's sounding like to me)
>
> this is a really strong argument for "guest software should be
> doing this". DTB-booting kernels has always meant the kernel
> doing a lot of work that under ACPI/UEFI/x86-PC is typically
> done by firmware, and this seems similar to me.

We could explore only solving the problem for pxb-cxl for now. However, we
would still be talking about infrastructure in the kernel only to support
emulated CXL devices, and I can see that being controversial. A normal CXL
host bridge is not something we can enumerate.

I guess this may call for a PoC on the kernel side of things to see how bad
it looks. I suspect very ugly indeed :( but I could be wrong.

I think we'll also have to augment the PXB PCI devices that appear on the
main bus to provide discoverability they don't currently have (such as bus
numbers). Probably a DVSEC entry in PCI extended config space. Current dump
of what is there:

root@debian:~# lspci -s 05.0 -xxxx -vvv
00:05.0 Host bridge: Red Hat, Inc. QEMU PCIe Expander bridge
        Subsystem: Red Hat, Inc. QEMU PCIe Expander bridge
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
00: 36 1b 0b 00 00 00 a0 00 00 00 00 06 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 f4 1a 00 11
30: 00 00 00 00 00 00 00 00 00 00 00 00 ff 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Which is pretty light on info.

Jonathan

>
> thanks
> -- PMM
On Tue, 25 Apr 2023 at 18:37, Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
> We could explore only solving the problem for pxb-cxl for now.
> However, we would still be talking infrastructure in kernel only
> to support emulated CXL devices and I can see that being
> controversial. A normal CXL host bridge is not something
> we can enumerate.

Hmm, so what is real hardware doing that our emulation is not?

-- PMM
On Tue, 25 Apr 2023 21:15:25 +0100
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Tue, 25 Apr 2023 at 18:37, Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> > We could explore only solving the problem for pxb-cxl for now.
> > However, we would still be talking infrastructure in kernel only
> > to support emulated CXL devices and I can see that being
> > controversial. A normal CXL host bridge is not something
> > we can enumerate.
>
> Hmm, so what is real hardware doing that our emulation is not?

For real hardware, the host bridges would not typically share the various
windows. Various options, but most likely they would be in different PCI
segments, with separate ECAM space and separate space into which the BARs
would be mapped. That separation would be needed as the physical address
routing in the host would need to direct the reads and writes to the correct
bit of hardware, and that tends to be done with very simple address decoders
on the appropriate interconnects - those internal routing tables are normally
effectively fixed for a given system.

We could add another full host bridge to arm-virt. That would require
separate emulation as I don't think we can reuse the pxb-cxl stuff.

The PXB approach is to ignore the normal restrictions on routing being fairly
fixed, by building a fixed configuration before the OS sees it - allowing
different host bridges to use different parts of a single region. Arguably
very early boot firmware could do that with a physical system, but it would
involve some nasty impdef programming of address decoder logic. This would be
similar to what firmware does to change the routing depending on whether it
finds a 2 socket configuration or a 4 socket configuration in servers. What
entity would do this is implementation defined - it may well be before any
application processor boots.

Jonathan

>
> -- PMM
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ac626b3bef..4648437bba 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -55,6 +55,8 @@
 #include "qemu/bitops.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pci_bus.h"
 #include "hw/pci-host/gpex.h"
 #include "hw/virtio/virtio-pci.h"
 #include "hw/core/sysbus-fdt.h"
@@ -1648,6 +1650,210 @@ static void virt_build_smbios(VirtMachineState *vms)
     }
 }
 
+static void virt_add_pxb_dt(VirtMachineState *vms, PCIBus *pxb_bus,
+                            int max_bus_num, hwaddr mmio_per_bus,
+                            hwaddr mmio_high_per_bus,
+                            int pci_domain,
+                            Error **errp)
+{
+    void *fdt = MACHINE(vms)->fdt;
+    int ecam_id = VIRT_ECAM_ID(vms->highmem_ecam);
+    char *nodename_pxb;
+    uint8_t bus_num = pci_bus_num(pxb_bus);
+    hwaddr base_mmio_pxb = vms->memmap[VIRT_PCIE_MMIO].base +
+        mmio_per_bus * bus_num;
+    hwaddr size_mmio_pxb;
+    hwaddr base_mmio_high_pxb = vms->memmap[VIRT_HIGH_PCIE_MMIO].base +
+        mmio_high_per_bus * bus_num;
+    hwaddr size_mmio_high_pxb;
+    hwaddr base_ecam_pxb, size_ecam_pxb;
+    int buses;
+
+    buses = pci_bus_count_buses(pxb_bus);
+    if (bus_num + buses >= max_bus_num) {
+        error_setg(errp,
+                   "Insufficient bus numbers (req %d > avail: %d) available for PXB topology",
+                   buses, max_bus_num - bus_num - 1);
+        return;
+    }
+
+    base_ecam_pxb = vms->memmap[ecam_id].base;
+    base_ecam_pxb += bus_num * PCIE_MMCFG_SIZE_MIN;
+    size_ecam_pxb = PCIE_MMCFG_SIZE_MIN * (max_bus_num - bus_num);
+    size_mmio_pxb = mmio_per_bus * (max_bus_num - bus_num);
+    size_mmio_high_pxb = mmio_high_per_bus * (max_bus_num - bus_num);
+    nodename_pxb = g_strdup_printf("/pcie@%" PRIx64, base_mmio_pxb);
+    qemu_fdt_add_subnode(fdt, nodename_pxb);
+    qemu_fdt_setprop_cell(fdt, nodename_pxb, "linux,pci-domain", pci_domain);
+    qemu_fdt_setprop_string(fdt, nodename_pxb, "device_type", "pci");
+    qemu_fdt_setprop_cell(fdt, nodename_pxb, "#address-cells", 3);
+    qemu_fdt_setprop_cell(fdt, nodename_pxb, "#size-cells", 2);
+    qemu_fdt_setprop_string(fdt, nodename_pxb,
+                            "compatible", "pci-host-ecam-generic");
+    /* I'm not sure what this should be. */
+    if (vms->msi_phandle) {
+        qemu_fdt_setprop_cells(fdt, nodename_pxb, "msi-map",
+                               0, vms->msi_phandle, 0, 0x10000);
+    }
+    qemu_fdt_setprop_cells(fdt, nodename_pxb, "bus-range",
+                           bus_num, max_bus_num - 1);
+    qemu_fdt_setprop_sized_cells(fdt, nodename_pxb, "reg", 2, base_ecam_pxb,
+                                 2, size_ecam_pxb);
+
+    if (vms->highmem_mmio) {
+        qemu_fdt_setprop_sized_cells(fdt, nodename_pxb, "ranges",
+                                     1, FDT_PCI_RANGE_IOPORT, 2, 0,
+                                     /* No PIO space for PXB */
+                                     2, 0, 2, 0,
+                                     1, FDT_PCI_RANGE_MMIO,
+                                     2, base_mmio_pxb,
+                                     2, base_mmio_pxb,
+                                     2, size_mmio_pxb,
+                                     1, FDT_PCI_RANGE_MMIO_64BIT,
+                                     2, base_mmio_high_pxb,
+                                     2, base_mmio_high_pxb,
+                                     2, size_mmio_high_pxb);
+    } else {
+        qemu_fdt_setprop_sized_cells(fdt, nodename_pxb, "ranges",
+                                     1, FDT_PCI_RANGE_IOPORT, 2, 0,
+                                     2, 0, 2, 0,
+                                     1, FDT_PCI_RANGE_MMIO,
+                                     2, base_mmio_pxb,
+                                     2, base_mmio_pxb,
+                                     2, size_mmio_pxb);
+    }
+}
+
+static void virt_update_pci_dt(VirtMachineState *vms, Error **errp)
+{
+    void *fdt = MACHINE(vms)->fdt;
+    char *nodename = vms->pciehb_nodename;
+    hwaddr base_mmio = vms->memmap[VIRT_PCIE_MMIO].base;
+    hwaddr base_mmio_high = vms->memmap[VIRT_HIGH_PCIE_MMIO].base;
+    hwaddr size_mmio, size_mmio_high;
+    hwaddr base_pio = vms->memmap[VIRT_PCIE_PIO].base;
+    hwaddr size_pio = vms->memmap[VIRT_PCIE_PIO].size;
+    int ecam_id = VIRT_ECAM_ID(vms->highmem_ecam);
+    hwaddr base_ecam = vms->memmap[ecam_id].base;
+    int total_bus_nr = vms->memmap[ecam_id].size / PCIE_MMCFG_SIZE_MIN;
+    hwaddr mmio_per_bus, mmio_high_per_bus, size_ecam;
+    int pci_domain = 1; /* Main bus is 0 */
+    int buses, max_bus_nr;
+    PCIBus *bus;
+
+    /*
+     * EDK2 ACPI flow for PXB instances breaks if we modify the DT to
+     * incorporate them. So for now only enable DT for PXBs if acpi=off.
+     */
+    if (virt_is_acpi_enabled(vms)) {
+        return;
+    }
+
+    /*
+     * Only support PCI Expander Bridges if highmem_ecam is available.
+     * Hard to squeeze them into the smaller ECAM.
+     */
+    if (!vms->highmem_ecam) {
+        return;
+    }
+
+    /* First update DT for the main PCI bus */
+    bus = vms->bus;
+    max_bus_nr = total_bus_nr;
+    /* Find the max bus nr as the smallest of the PXB bus numbers */
+    QLIST_FOREACH(bus, &bus->child, sibling) {
+        uint8_t bus_num;
+
+        if (!pci_bus_is_root(bus)) {
+            continue;
+        }
+        bus_num = pci_bus_num(bus);
+        if (bus_num < max_bus_nr) {
+            max_bus_nr = bus_num;
+        }
+    }
+
+    buses = pci_bus_count_buses(vms->bus);
+    if (buses >= max_bus_nr) {
+        error_setg(errp,
+                   "Insufficient bus numbers (req %d > avail: %d) available for primary PCIe topology",
+                   buses, max_bus_nr - 1);
+        return;
+    }
+
+    /*
+     * For other resources options are:
+     * 1) Divide them up based on bus number ranges.
+     * 2) Discover what is needed by doing part of bus enumeration.
+     *    This is complex, and we may not want that complexity in QEMU.
+     */
+
+    size_ecam = max_bus_nr * PCIE_MMCFG_SIZE_MIN;
+    if (size_ecam > vms->memmap[ecam_id].size) {
+        size_ecam = vms->memmap[ecam_id].size;
+    }
+
+    mmio_per_bus = vms->memmap[VIRT_PCIE_MMIO].size / total_bus_nr;
+    size_mmio = mmio_per_bus * max_bus_nr;
+    mmio_high_per_bus = vms->memmap[VIRT_HIGH_PCIE_MMIO].size / total_bus_nr;
+    size_mmio_high = mmio_high_per_bus * max_bus_nr;
+    qemu_fdt_setprop_cells(fdt, nodename, "bus-range", 0, max_bus_nr - 1);
+
+    qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", 2, base_ecam,
+                                 2, size_ecam);
+
+    /*
+     * Leave all of PIO space for the main GPEX - avoids issues with
+     * regions that are too small to map in the OS.
+     */
+    if (vms->highmem_mmio) {
+        qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+                                     1, FDT_PCI_RANGE_IOPORT, 2, 0,
+                                     2, base_pio, 2, size_pio,
+                                     1, FDT_PCI_RANGE_MMIO, 2, base_mmio,
+                                     2, base_mmio, 2, size_mmio,
+                                     1, FDT_PCI_RANGE_MMIO_64BIT,
+                                     2, base_mmio_high,
+                                     2, base_mmio_high, 2, size_mmio_high);
+    } else {
+        qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+                                     1, FDT_PCI_RANGE_IOPORT, 2, 0,
+                                     2, base_pio, 2, size_pio,
+                                     1, FDT_PCI_RANGE_MMIO, 2, base_mmio,
+                                     2, base_mmio, 2, size_mmio);
+    }
+
+    /* Now add the PCI Expander Bridges (PXB) */
+    bus = vms->bus;
+    QLIST_FOREACH(bus, &bus->child, sibling) {
+        uint16_t max_bus_num = 0x100;
+        PCIBus *bus_start = vms->bus;
+        int bus_num = pci_bus_num(bus);
+
+        if (!pci_bus_is_root(bus)) {
+            continue;
+        }
+
+        /* Find the minimum PXB bus number greater than this one */
+        QLIST_FOREACH(bus_start, &bus_start->child, sibling) {
+            uint8_t this_bus_num;
+
+            if (!pci_bus_is_root(bus_start)) {
+                continue;
+            }
+
+            this_bus_num = pci_bus_num(bus_start);
+            if (this_bus_num > bus_num &&
+                this_bus_num < max_bus_num) {
+                max_bus_num = this_bus_num;
+            }
+        }
+
+        virt_add_pxb_dt(vms, bus, max_bus_num, mmio_per_bus, mmio_high_per_bus,
+                        pci_domain++, errp);
+    }
+}
+
 static void virt_machine_done(Notifier *notifier, void *data)
 {
@@ -1658,6 +1864,14 @@ void virt_machine_done(Notifier *notifier, void *data)
     struct arm_boot_info *info = &vms->bootinfo;
     AddressSpace *as = arm_boot_address_space(cpu, info);
 
+    /*
+     * If PCI Expander Bridges (pxb-pcie) have been added, the adjustments to
+     * the main PCIe DT entries and creation of those for the PXB host bridge
+     * entries may only be made after all PCI devices are present, as only
+     * at that time can resource requirements (bus numbers etc.) be known.
+     */
+    virt_update_pci_dt(vms, &error_fatal);
+
     /*
      * If the user provided a dtb, we assume the dynamic sysbus nodes
      * already are integrated there. This corresponds to a use case where
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index def5000e7b..47fd581fe4 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -552,6 +552,27 @@ void pci_root_bus_cleanup(PCIBus *bus)
     qbus_unrealize(BUS(bus));
 }
 
+int pci_bus_count_buses(PCIBus *bus)
+{
+    int buses = 0;
+    int devfn;
+
+    for (devfn = 0; devfn < 256; devfn++) {
+        PCIDevice *d;
+
+        if (!bus->devices[devfn]) {
+            continue;
+        }
+        d = bus->devices[devfn];
+        if (!object_dynamic_cast(OBJECT(d), TYPE_PCI_BRIDGE)) {
+            continue;
+        }
+        buses += pci_bus_count_buses(&PCI_BRIDGE(d)->sec_bus);
+        buses++;
+    }
+    return buses;
+}
+
 void pci_bus_irqs(PCIBus *bus, pci_set_irq_fn set_irq, void *irq_opaque,
                   int nirq)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index d5a40cd058..565a968c14 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -282,6 +282,7 @@ PCIBus *pci_root_bus_new(DeviceState *parent, const char *name,
                          MemoryRegion *address_space_io,
                          uint8_t devfn_min, const char *typename);
 void pci_root_bus_cleanup(PCIBus *bus);
+int pci_bus_count_buses(PCIBus *bus);
 void pci_bus_irqs(PCIBus *bus, pci_set_irq_fn set_irq, void *irq_opaque,
                   int nirq);
 void pci_bus_map_irqs(PCIBus *bus, pci_map_irq_fn map_irq);
This patch and the problem it is trying to resolve will form part of a talk
I will be giving next week at Linaro Connect. https://sched.co/1K85p

So in the spirit of giving people almost no warning... Plus being able to
say the discussion has kicked off here is the simplest solution I could
come up with. If we can agree on an approach to the sizing of memory
windows question by Thursday, even better (I'll update the slides!) ;)

This is a precursor for CXL on arm-virt (for which DT support was
requested). CXL QEMU emulation uses pxb-cxl, which inherits from the
slightly simpler pxb-pcie. ACPI support for these additional host bridges
has been available for some time but relies on an interesting dance with
EDK2:
* During initial board creation the PXB buses are largely ignored,
  and the ACPI tables + DT passed to EDK2 only expose the root bus, on
  which the section of the PXB instance that exists as a normal PCI EP
  may be discovered.
* EDK2 then carries out full PCI bus enumeration, including the buses
  below the PXB host bridges.
* Finally EDK2 calls back into QEMU to rebuild the tables via
  virt_acpi_build_update(), at which point the correct DSDT for the
  PXB host bridges is generated and an adjusted DSDT section is generated
  for the primary host bridge (holes are punched in the _CRS).

For device tree bindings there is no such dance with the firmware, and as
such we need QEMU to directly present device tree entries for the primary
PCIe bus and the PXB instances. This cannot be fully done either at
initial PCIe instantiation (PXB instances are not yet instantiated) or in
virt_machine_done() as it is for ACPI (virtio-mem hot plug routines rely
on the presence of the primary PCIe bus). Thus the proposed solution is to
set things up initially without being careful, and later update them with
additional checks that we don't get overlapping bus numbers.

It 'might' be possible to use hotplug handlers to update the DT as we go
along, but that would be rather challenging as each additional PXB
introduction would sometimes require an update of the DT for all other
instances.

Note: the linux,pci-domain DT element is more restrictive than the
equivalent in ACPI, so for now just put each PXB in its own domain. I'll
look at relaxing that on the kernel side, but then this code won't
initially work with older kernels - so this lack of alignment with ACPI
systems may have to persist (you end up with a single segment for ACPI
systems, and multiple for DT).

Another question is how to allocate the resources. A PXB instance uses a
section of the ECAM space (which is fine, as that is defined by bus
numbers) and also part of each of the PCI windows. When EDK2 is involved,
the allocation of the windows is done after enumeration of connected PCI
devices / switches etc, with heuristics adding extra space for hotplug.
Thus the windows configured for the bridges can be easily established and
written into the ACPI _CRS entries.

I've considered two options:
1) (More or less) enumerate the full PCI topology in QEMU, just don't
   actually write the registers for routing. Apply similar heuristics
   (note that EDK2 and the kernel may have different ones, and they have
   changed over time) to establish how much memory is needed in each
   window. This is a complex solution that may still provide allocations
   that aren't big enough for the kernel to then enumerate the PCI
   topology using its own algorithms.
2) Use the one thing we definitely control for PXB instances: the base
   bus number. Based on how many buses are available to each PXB
   instance, allocate a proportion of the available memory windows.
   Again, it's more than possible that this won't leave enough room to
   accommodate all of the necessary BARs, but it is fairly simple to
   implement and fairly simple to understand when the allocation doesn't
   work. This option is what the current code does.

Note that either way, any regressions should be limited to systems using
DT with PXB instances, which would be odd given they don't currently
work.

RFC being posted to get feedback on the general approach we should take.
One option would be to ignore pxb-pcie and only deal with pxb-cxl, on the
basis that that removes the risk of regressions because we don't support
CXL on arm-virt yet. However, the code will end up much the same, so we
can discuss this without the added fun that CXL will bring.

Other more minor nasties:
* EDK2 breaks if we pass it a DT that includes the PXB instances.
* Need to look closer at the interrupt mapping - or potentially make this
  MSI/MSIX only. The equivalent is broken on x86 / q35 :)

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/arm/virt.c        | 214 +++++++++++++++++++++++++++++++++++++++++++
 hw/pci/pci.c         |  21 +++++
 include/hw/pci/pci.h |   1 +
 3 files changed, 236 insertions(+)