From patchwork Mon Oct 31 17:38:05 2016
X-Patchwork-Submitter: Alex Williamson
X-Patchwork-Id: 9406239
From: Alex Williamson
To: qemu-devel@nongnu.org
Cc: Yongji Xie
Date: Mon, 31 Oct 2016 11:38:05 -0600
Message-ID: <20161031173805.14266.47659.stgit@gimli.home>
In-Reply-To: <20161031173548.14266.36112.stgit@gimli.home>
References: <20161031173548.14266.36112.stgit@gimli.home>
Subject: [Qemu-devel] [PULL 5/5] vfio: Add support for mmapping sub-page MMIO BARs

From: Yongji Xie

Kernel commit
05f0c03fbac1 ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio
page is exclusive") now allows VFIO to mmap sub-page BARs. This is the
corresponding QEMU patch. With both patches applied, we can pass
sub-page BARs through to the guest, which can improve I/O performance
for some devices.

In this patch, we expand the MemoryRegions of these sub-page MMIO BARs
to PAGE_SIZE in vfio_pci_write_config(), so that the BARs can be passed
to the KVM_SET_USER_MEMORY_REGION ioctl with a valid size. The expansion
is undone when the guest changes a sub-page BAR's base address so that
it is no longer page aligned. We also set the priority of the expanded
memory regions to zero in case they overlap with BARs that share the
same guest page as a sub-page BAR.

Signed-off-by: Yongji Xie
Signed-off-by: Alex Williamson
---
 hw/vfio/common.c |  3 +-
 hw/vfio/pci.c    | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f528309..801578b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -670,8 +670,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
                           region, name, region->size);
 
     if (!vbasedev->no_mmap &&
-        region->flags & VFIO_REGION_INFO_FLAG_MMAP &&
-        !(region->size & ~qemu_real_host_page_mask)) {
+        region->flags & VFIO_REGION_INFO_FLAG_MMAP) {
 
         ret = vfio_setup_region_sparse_mmaps(region, info);
 
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index b399742..d7dbe0e 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1071,6 +1071,55 @@ static const MemoryRegionOps vfio_vga_ops = {
 };
 
 /*
+ * Expand memory region of sub-page(size < PAGE_SIZE) MMIO BAR to page
+ * size if the BAR is in an exclusive page in host so that we could map
+ * this BAR to guest. But this sub-page BAR may not occupy an exclusive
+ * page in guest. So we should set the priority of the expanded memory
+ * region to zero in case of overlap with BARs which share the same page
+ * with the sub-page BAR in guest. Besides, we should also recover the
+ * size of this sub-page BAR when its base address is changed in guest
+ * and not page aligned any more.
+ */
+static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
+{
+    VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
+    VFIORegion *region = &vdev->bars[bar].region;
+    MemoryRegion *mmap_mr, *mr;
+    PCIIORegion *r;
+    pcibus_t bar_addr;
+    uint64_t size = region->size;
+
+    /* Make sure that the whole region is allowed to be mmapped */
+    if (region->nr_mmaps != 1 || !region->mmaps[0].mmap ||
+        region->mmaps[0].size != region->size) {
+        return;
+    }
+
+    r = &pdev->io_regions[bar];
+    bar_addr = r->addr;
+    mr = region->mem;
+    mmap_mr = &region->mmaps[0].mem;
+
+    /* If BAR is mapped and page aligned, update to fill PAGE_SIZE */
+    if (bar_addr != PCI_BAR_UNMAPPED &&
+        !(bar_addr & ~qemu_real_host_page_mask)) {
+        size = qemu_real_host_page_size;
+    }
+
+    memory_region_transaction_begin();
+
+    memory_region_set_size(mr, size);
+    memory_region_set_size(mmap_mr, size);
+    if (size != region->size && memory_region_is_mapped(mr)) {
+        memory_region_del_subregion(r->address_space, mr);
+        memory_region_add_subregion_overlap(r->address_space,
+                                            bar_addr, mr, 0);
+    }
+
+    memory_region_transaction_commit();
+}
+
+/*
  * PCI config space
  */
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
@@ -1153,6 +1202,24 @@ void vfio_pci_write_config(PCIDevice *pdev,
         } else if (was_enabled && !is_enabled) {
             vfio_msix_disable(vdev);
         }
+    } else if (ranges_overlap(addr, len, PCI_BASE_ADDRESS_0, 24) ||
+               range_covers_byte(addr, len, PCI_COMMAND)) {
+        pcibus_t old_addr[PCI_NUM_REGIONS - 1];
+        int bar;
+
+        for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
+            old_addr[bar] = pdev->io_regions[bar].addr;
+        }
+
+        pci_default_write_config(pdev, addr, val, len);
+
+        for (bar = 0; bar < PCI_ROM_SLOT; bar++) {
+            if (old_addr[bar] != pdev->io_regions[bar].addr &&
+                pdev->io_regions[bar].size > 0 &&
+                pdev->io_regions[bar].size < qemu_real_host_page_size) {
+                vfio_sub_page_bar_update_mapping(pdev, bar);
+            }
+        }
     } else {
         /* Write everything to QEMU to keep emulated bits correct */
         pci_default_write_config(pdev, addr, val, len);
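
A brief note for readers tracing the sizing logic above: the patch expands a
sub-page BAR to a full page only while its guest address stays page aligned,
and recovers the real BAR size otherwise. Below is a minimal, standalone
sketch of that decision, assuming 4 KiB host pages; QEMU obtains the real
values at runtime via qemu_real_host_page_size/qemu_real_host_page_mask, and
effective_region_size() is a hypothetical helper for illustration, not a
QEMU API.

#include <stdint.h>
#include <stdio.h>

/* Assume 4 KiB host pages for this sketch. */
#define HOST_PAGE_SIZE 0x1000ULL
#define HOST_PAGE_MASK (~(HOST_PAGE_SIZE - 1))

/* Mirrors the patch's decision: a sub-page BAR is expanded to a full
 * page only while its guest address is page aligned. */
static uint64_t effective_region_size(uint64_t bar_addr, uint64_t bar_size)
{
    if (bar_size < HOST_PAGE_SIZE && (bar_addr & ~HOST_PAGE_MASK) == 0) {
        return HOST_PAGE_SIZE;  /* page-sized region, valid as a KVM memslot */
    }
    return bar_size;            /* unaligned again: recover the real size */
}

int main(void)
{
    /* A 2 KiB BAR placed at a page-aligned guest address is expanded... */
    printf("0x%llx\n",
           (unsigned long long)effective_region_size(0x10002000, 0x800)); /* 0x1000 */
    /* ...and recovered once the guest moves it to an unaligned address. */
    printf("0x%llx\n",
           (unsigned long long)effective_region_size(0x10002800, 0x800)); /* 0x800 */
    return 0;
}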
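
The priority-zero argument to memory_region_add_subregion_overlap() matters
because the expanded page may also contain another device's BAR in the guest.
QEMU resolves overlapping subregions in favour of the higher priority, so at
priority zero the expanded region yields to any higher-priority region
covering part of the page. The toy resolver below illustrates that rule with
hypothetical addresses and priorities; it is not QEMU's actual flatview code.

#include <stdint.h>
#include <stdio.h>

typedef struct {
    const char *name;
    uint64_t base, size;
    int priority;
} Region;

/* Toy lookup: among overlapping regions, the highest priority claims
 * the address, mimicking memory_region_add_subregion_overlap(). */
static const Region *resolve(const Region *rs, int n, uint64_t addr)
{
    const Region *best = NULL;
    for (int i = 0; i < n; i++) {
        if (addr >= rs[i].base && addr < rs[i].base + rs[i].size &&
            (!best || rs[i].priority > best->priority)) {
            best = &rs[i];
        }
    }
    return best;
}

int main(void)
{
    /* Hypothetical guest layout: a 2 KiB BAR expanded to 4 KiB at priority
     * 0, with another device's BAR sharing the upper half of the page. */
    Region rs[] = {
        { "expanded sub-page BAR", 0x10002000, 0x1000, 0 },
        { "neighbouring BAR",      0x10002800, 0x0800, 1 },
    };
    printf("%s\n", resolve(rs, 2, 0x10002400)->name); /* expanded sub-page BAR */
    printf("%s\n", resolve(rs, 2, 0x10002a00)->name); /* neighbouring BAR */
    return 0;
}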