From patchwork Thu Mar 26 15:24:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460461 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F1A1692A for ; Thu, 26 Mar 2020 15:24:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D2D472076A for ; Thu, 26 Mar 2020 15:24:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727856AbgCZPY4 (ORCPT ); Thu, 26 Mar 2020 11:24:56 -0400 Received: from foss.arm.com ([217.140.110.172]:33672 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726034AbgCZPYz (ORCPT ); Thu, 26 Mar 2020 11:24:55 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 52BB51045; Thu, 26 Mar 2020 08:24:55 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 647C83F71E; Thu, 26 Mar 2020 08:24:54 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 01/32] Makefile: Use correct objcopy binary when cross-compiling for x86_64 Date: Thu, 26 Mar 2020 15:24:07 +0000 Message-Id: <20200326152438.6218-2-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use the compiler toolchain version of objcopy instead of the native one 
when cross-compiling for the x86_64 architecture. Reviewed-by: Andre Przywara Tested-by: Andre Przywara Signed-off-by: Alexandru Elisei --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index b76d844f2e01..6d6880dd4f8a 100644 --- a/Makefile +++ b/Makefile @@ -22,6 +22,7 @@ CC := $(CROSS_COMPILE)gcc CFLAGS := LD := $(CROSS_COMPILE)ld LDFLAGS := +OBJCOPY := $(CROSS_COMPILE)objcopy FIND := find CSCOPE := cscope @@ -479,7 +480,7 @@ x86/bios/bios.bin.elf: x86/bios/entry.S x86/bios/e820.c x86/bios/int10.c x86/bio x86/bios/bios.bin: x86/bios/bios.bin.elf $(E) " OBJCOPY " $@ - $(Q) objcopy -O binary -j .text x86/bios/bios.bin.elf x86/bios/bios.bin + $(Q) $(OBJCOPY) -O binary -j .text x86/bios/bios.bin.elf x86/bios/bios.bin x86/bios/bios-rom.o: x86/bios/bios-rom.S x86/bios/bios.bin x86/bios/bios-rom.h $(E) " CC " $@ From patchwork Thu Mar 26 15:24:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460463 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 007EB92A for ; Thu, 26 Mar 2020 15:24:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D61FB20748 for ; Thu, 26 Mar 2020 15:24:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728150AbgCZPY5 (ORCPT ); Thu, 26 Mar 2020 11:24:57 -0400 Received: from foss.arm.com ([217.140.110.172]:33680 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726034AbgCZPY4 (ORCPT ); Thu, 26 Mar 2020 11:24:56 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6DF9D7FA; Thu, 26 Mar 2020 08:24:56 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com 
(e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 88B293F71E; Thu, 26 Mar 2020 08:24:55 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 02/32] hw/i8042: Compile only for x86 Date: Thu, 26 Mar 2020 15:24:08 +0000 Message-Id: <20200326152438.6218-3-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The initialization function for the i8042 emulated device does exactly nothing for all architectures, except for x86. As a result, the device is usable only for x86, so let's make the file an architecture specific object file. Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- Makefile | 2 +- hw/i8042.c | 4 ---- 2 files changed, 1 insertion(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 6d6880dd4f8a..33eddcbb4d66 100644 --- a/Makefile +++ b/Makefile @@ -103,7 +103,6 @@ OBJS += hw/pci-shmem.o OBJS += kvm-ipc.o OBJS += builtin-sandbox.o OBJS += virtio/mmio.o -OBJS += hw/i8042.o # Translate uname -m into ARCH string ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \ @@ -124,6 +123,7 @@ endif #x86 ifeq ($(ARCH),x86) DEFINES += -DCONFIG_X86 + OBJS += hw/i8042.o OBJS += x86/boot.o OBJS += x86/cpuid.o OBJS += x86/interrupt.o diff --git a/hw/i8042.c b/hw/i8042.c index 288b7d1108ac..2d8c96e9c7e6 100644 --- a/hw/i8042.c +++ b/hw/i8042.c @@ -349,10 +349,6 @@ static struct ioport_operations kbd_ops = { int kbd__init(struct kvm *kvm) { -#ifndef CONFIG_X86 - return 0; -#endif - kbd_reset(); state.kvm = kvm; ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL); From patchwork Thu Mar 26 
15:24:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460465 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8B3F01667 for ; Thu, 26 Mar 2020 15:24:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7774720775 for ; Thu, 26 Mar 2020 15:24:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728213AbgCZPY6 (ORCPT ); Thu, 26 Mar 2020 11:24:58 -0400 Received: from foss.arm.com ([217.140.110.172]:33692 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728157AbgCZPY5 (ORCPT ); Thu, 26 Mar 2020 11:24:57 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A67FE1045; Thu, 26 Mar 2020 08:24:57 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A56703F71E; Thu, 26 Mar 2020 08:24:56 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, Julien Thierry Subject: [PATCH v3 kvmtool 03/32] pci: Fix BAR resource sizing arbitration Date: Thu, 26 Mar 2020 15:24:09 +0000 Message-Id: <20200326152438.6218-4-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sami Mujawar According to the 'PCI Local Bus Specification, Revision 3.0, February 3, 2004, Section 
6.2.5.1, Implementation Notes, page 227' "Software saves the original value of the Base Address register, writes 0 FFFF FFFFh to the register, then reads it back. Size calculation can be done from the 32-bit value read by first clearing encoding information bits (bit 0 for I/O, bits 0-3 for memory), inverting all 32 bits (logical NOT), then incrementing by 1. The resultant 32-bit value is the memory/I/O range size decoded by the register. Note that the upper 16 bits of the result is ignored if the Base Address register is for I/O and bits 16-31 returned zero upon read." kvmtool was returning the actual BAR resource size, which is incorrect because software drivers invert all 32 bits (logical NOT) and then increment by 1. The result is a very large computed resource size (in some cases more than 4GB), which causes drivers to assert or fail to work. For example, if the BAR resource size was 0x1000, kvmtool would return 0x1000 instead of 0xFFFFF000. Fix pci__config_wr() to return the size of the BAR in accordance with the PCI Local Bus specification, Implementation Notes. Reviewed-by: Andre Przywara Signed-off-by: Sami Mujawar Signed-off-by: Julien Thierry [Reworked algorithm, removed power-of-two check] Signed-off-by: Alexandru Elisei --- pci.c | 42 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 36 insertions(+), 6 deletions(-) diff --git a/pci.c b/pci.c index 689869cb79a3..3198732935eb 100644 --- a/pci.c +++ b/pci.c @@ -149,6 +149,8 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, u8 bar, offset; struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; + u32 value = 0; + u32 mask; if (!pci_device_exists(addr.bus_number, dev_num, 0)) return; @@ -169,13 +171,41 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, bar = (offset - PCI_BAR_OFFSET(0)) / sizeof(u32); /* - * If the kernel masks the BAR it would expect to find the size of the - * BAR there next time it reads from it. 
When the kernel got the size it - * would write the address back. + * If the kernel masks the BAR, it will expect to find the size of the + * BAR there next time it reads from it. After the kernel reads the + * size, it will write the address back. */ - if (bar < 6 && ioport__read32(data) == 0xFFFFFFFF) { - u32 sz = pci_hdr->bar_size[bar]; - memcpy(base + offset, &sz, sizeof(sz)); + if (bar < 6) { + if (pci_hdr->bar[bar] & PCI_BASE_ADDRESS_SPACE_IO) + mask = (u32)PCI_BASE_ADDRESS_IO_MASK; + else + mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; + /* + * According to the PCI local bus specification REV 3.0: + * The number of upper bits that a device actually implements + * depends on how much of the address space the device will + * respond to. A device that wants a 1 MB memory address space + * (using a 32-bit base address register) would build the top + * 12 bits of the address register, hardwiring the other bits + * to 0. + * + * Furthermore, software can determine how much address space + * the device requires by writing a value of all 1's to the + * register and then reading the value back. The device will + * return 0's in all don't-care address bits, effectively + * specifying the address space required. + * + * Software computes the size of the address space with the + * formula S = ~B + 1, where S is the memory size and B is the + * value read from the BAR. This means that the BAR value that + * kvmtool should return is B = ~(S - 1). + */ + memcpy(&value, data, size); + if (value == 0xffffffff) + value = ~(pci_hdr->bar_size[bar] - 1); + /* Preserve the special bits. 
*/ + value = (value & mask) | (pci_hdr->bar[bar] & ~mask); + memcpy(base + offset, &value, size); } else { memcpy(base + offset, data, size); } From patchwork Thu Mar 26 15:24:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460471 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 635C41667 for ; Thu, 26 Mar 2020 15:25:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 387ED20774 for ; Thu, 26 Mar 2020 15:25:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728187AbgCZPZA (ORCPT ); Thu, 26 Mar 2020 11:25:00 -0400 Received: from foss.arm.com ([217.140.110.172]:33698 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726034AbgCZPZA (ORCPT ); Thu, 26 Mar 2020 11:25:00 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C43727FA; Thu, 26 Mar 2020 08:24:58 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DCED93F71E; Thu, 26 Mar 2020 08:24:57 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 04/32] Remove pci-shmem device Date: Thu, 26 Mar 2020 15:24:10 +0000 Message-Id: <20200326152438.6218-5-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: 
X-Mailing-List: kvm@vger.kernel.org The pci-shmem emulated device ("ivshmem") was created by QEMU for cross-VM data sharing. The only Linux driver that uses this device is the Android Virtual System on a Chip staging driver, which also mentions a character device driver implemented on top of shmem, which was removed from Linux. On the kvmtool side, the only commits touching the pci-shmem device since it was introduced in 2012 were made when refactoring various kvmtool subsystems. Let's remove the maintenance burden on the kvmtool maintainers and remove this unused device. Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- Makefile | 1 - builtin-run.c | 5 - hw/pci-shmem.c | 400 ---------------------------------------- include/kvm/pci-shmem.h | 32 ---- 4 files changed, 438 deletions(-) delete mode 100644 hw/pci-shmem.c delete mode 100644 include/kvm/pci-shmem.h diff --git a/Makefile b/Makefile index 33eddcbb4d66..f75413e74819 100644 --- a/Makefile +++ b/Makefile @@ -99,7 +99,6 @@ OBJS += util/read-write.o OBJS += util/util.o OBJS += virtio/9p.o OBJS += virtio/9p-pdu.o -OBJS += hw/pci-shmem.o OBJS += kvm-ipc.o OBJS += builtin-sandbox.o OBJS += virtio/mmio.o diff --git a/builtin-run.c b/builtin-run.c index f8dc6c7229b0..9cb8c75300eb 100644 --- a/builtin-run.c +++ b/builtin-run.c @@ -31,7 +31,6 @@ #include "kvm/sdl.h" #include "kvm/vnc.h" #include "kvm/guest_compat.h" -#include "kvm/pci-shmem.h" #include "kvm/kvm-ipc.h" #include "kvm/builtin-debug.h" @@ -99,10 +98,6 @@ void kvm_run_set_wrapper_sandbox(void) OPT_INTEGER('c', "cpus", &(cfg)->nrcpus, "Number of CPUs"), \ OPT_U64('m', "mem", &(cfg)->ram_size, "Virtual machine memory" \ " size in MiB."), \ - OPT_CALLBACK('\0', "shmem", NULL, \ - "[pci:]:[:handle=][:create]", \ - "Share host shmem with guest via pci device", \ - shmem_parser, NULL), \ OPT_CALLBACK('d', "disk", kvm, "image or rootfs_dir", "Disk " \ " image or rootfs directory", img_name_parser, \ kvm), \ diff --git a/hw/pci-shmem.c 
b/hw/pci-shmem.c deleted file mode 100644 index f92bc75544d7..000000000000 --- a/hw/pci-shmem.c +++ /dev/null @@ -1,400 +0,0 @@ -#include "kvm/devices.h" -#include "kvm/pci-shmem.h" -#include "kvm/virtio-pci-dev.h" -#include "kvm/irq.h" -#include "kvm/kvm.h" -#include "kvm/pci.h" -#include "kvm/util.h" -#include "kvm/ioport.h" -#include "kvm/ioeventfd.h" - -#include -#include -#include -#include -#include - -#define MB_SHIFT (20) -#define KB_SHIFT (10) -#define GB_SHIFT (30) - -static struct pci_device_header pci_shmem_pci_device = { - .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), - .device_id = cpu_to_le16(0x1110), - .header_type = PCI_HEADER_TYPE_NORMAL, - .class[2] = 0xFF, /* misc pci device */ - .status = cpu_to_le16(PCI_STATUS_CAP_LIST), - .capabilities = (void *)&pci_shmem_pci_device.msix - (void *)&pci_shmem_pci_device, - .msix.cap = PCI_CAP_ID_MSIX, - .msix.ctrl = cpu_to_le16(1), - .msix.table_offset = cpu_to_le32(1), /* Use BAR 1 */ - .msix.pba_offset = cpu_to_le32(0x1001), /* Use BAR 1 */ -}; - -static struct device_header pci_shmem_device = { - .bus_type = DEVICE_BUS_PCI, - .data = &pci_shmem_pci_device, -}; - -/* registers for the Inter-VM shared memory device */ -enum ivshmem_registers { - INTRMASK = 0, - INTRSTATUS = 4, - IVPOSITION = 8, - DOORBELL = 12, -}; - -static struct shmem_info *shmem_region; -static u16 ivshmem_registers; -static int local_fd; -static u32 local_id; -static u64 msix_block; -static u64 msix_pba; -static struct msix_table msix_table[2]; - -int pci_shmem__register_mem(struct shmem_info *si) -{ - if (!shmem_region) { - shmem_region = si; - } else { - pr_warning("only single shmem currently avail. 
ignoring.\n"); - free(si); - } - return 0; -} - -static bool shmem_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) -{ - u16 offset = port - ivshmem_registers; - - switch (offset) { - case INTRMASK: - break; - case INTRSTATUS: - break; - case IVPOSITION: - ioport__write32(data, local_id); - break; - case DOORBELL: - break; - }; - - return true; -} - -static bool shmem_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) -{ - u16 offset = port - ivshmem_registers; - - switch (offset) { - case INTRMASK: - break; - case INTRSTATUS: - break; - case IVPOSITION: - break; - case DOORBELL: - break; - }; - - return true; -} - -static struct ioport_operations shmem_pci__io_ops = { - .io_in = shmem_pci__io_in, - .io_out = shmem_pci__io_out, -}; - -static void callback_mmio_msix(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr) -{ - void *mem; - - if (addr - msix_block < 0x1000) - mem = &msix_table; - else - mem = &msix_pba; - - if (is_write) - memcpy(mem + addr - msix_block, data, len); - else - memcpy(data, mem + addr - msix_block, len); -} - -/* - * Return an irqfd which can be used by other guests to signal this guest - * whenever they need to poke it - */ -int pci_shmem__get_local_irqfd(struct kvm *kvm) -{ - int fd, gsi, r; - - if (local_fd == 0) { - fd = eventfd(0, 0); - if (fd < 0) - return fd; - - if (pci_shmem_pci_device.msix.ctrl & cpu_to_le16(PCI_MSIX_FLAGS_ENABLE)) { - gsi = irq__add_msix_route(kvm, &msix_table[0].msg, - pci_shmem_device.dev_num << 3); - if (gsi < 0) - return gsi; - } else { - gsi = pci_shmem_pci_device.irq_line; - } - - r = irq__add_irqfd(kvm, gsi, fd, -1); - if (r < 0) - return r; - - local_fd = fd; - } - - return local_fd; -} - -/* - * Connect a new client to ivshmem by adding the appropriate datamatch - * to the DOORBELL - */ -int pci_shmem__add_client(struct kvm *kvm, u32 id, int fd) -{ - struct kvm_ioeventfd ioevent; - - ioevent = (struct 
kvm_ioeventfd) { - .addr = ivshmem_registers + DOORBELL, - .len = sizeof(u32), - .datamatch = id, - .fd = fd, - .flags = KVM_IOEVENTFD_FLAG_PIO | KVM_IOEVENTFD_FLAG_DATAMATCH, - }; - - return ioctl(kvm->vm_fd, KVM_IOEVENTFD, &ioevent); -} - -/* - * Remove a client connected to ivshmem by removing the appropriate datamatch - * from the DOORBELL - */ -int pci_shmem__remove_client(struct kvm *kvm, u32 id) -{ - struct kvm_ioeventfd ioevent; - - ioevent = (struct kvm_ioeventfd) { - .addr = ivshmem_registers + DOORBELL, - .len = sizeof(u32), - .datamatch = id, - .flags = KVM_IOEVENTFD_FLAG_PIO - | KVM_IOEVENTFD_FLAG_DATAMATCH - | KVM_IOEVENTFD_FLAG_DEASSIGN, - }; - - return ioctl(kvm->vm_fd, KVM_IOEVENTFD, &ioevent); -} - -static void *setup_shmem(const char *key, size_t len, int creating) -{ - int fd; - int rtn; - void *mem; - int flag = O_RDWR; - - if (creating) - flag |= O_CREAT; - - fd = shm_open(key, flag, S_IRUSR | S_IWUSR); - if (fd < 0) { - pr_warning("Failed to open shared memory file %s\n", key); - return NULL; - } - - if (creating) { - rtn = ftruncate(fd, (off_t) len); - if (rtn < 0) - pr_warning("Can't ftruncate(fd,%zu)\n", len); - } - mem = mmap(NULL, len, - PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE, fd, 0); - if (mem == MAP_FAILED) { - pr_warning("Failed to mmap shared memory file"); - mem = NULL; - } - close(fd); - - return mem; -} - -int shmem_parser(const struct option *opt, const char *arg, int unset) -{ - const u64 default_size = SHMEM_DEFAULT_SIZE; - const u64 default_phys_addr = SHMEM_DEFAULT_ADDR; - const char *default_handle = SHMEM_DEFAULT_HANDLE; - struct shmem_info *si = malloc(sizeof(struct shmem_info)); - u64 phys_addr; - u64 size; - char *handle = NULL; - int create = 0; - const char *p = arg; - char *next; - int base = 10; - int verbose = 0; - - const int skip_pci = strlen("pci:"); - if (verbose) - pr_info("shmem_parser(%p,%s,%d)", opt, arg, unset); - /* parse out optional addr family */ - if (strcasestr(p, "pci:")) { - p += 
skip_pci; - } else if (strcasestr(p, "mem:")) { - die("I can't add to E820 map yet.\n"); - } - /* parse out physical addr */ - base = 10; - if (strcasestr(p, "0x")) - base = 16; - phys_addr = strtoll(p, &next, base); - if (next == p && phys_addr == 0) { - pr_info("shmem: no physical addr specified, using default."); - phys_addr = default_phys_addr; - } - if (*next != ':' && *next != '\0') - die("shmem: unexpected chars after phys addr.\n"); - if (*next == '\0') - p = next; - else - p = next + 1; - /* parse out size */ - base = 10; - if (strcasestr(p, "0x")) - base = 16; - size = strtoll(p, &next, base); - if (next == p && size == 0) { - pr_info("shmem: no size specified, using default."); - size = default_size; - } - /* look for [KMGkmg][Bb]* uses base 2. */ - int skip_B = 0; - if (strspn(next, "KMGkmg")) { /* might have a prefix */ - if (*(next + 1) == 'B' || *(next + 1) == 'b') - skip_B = 1; - switch (*next) { - case 'K': - case 'k': - size = size << KB_SHIFT; - break; - case 'M': - case 'm': - size = size << MB_SHIFT; - break; - case 'G': - case 'g': - size = size << GB_SHIFT; - break; - default: - die("shmem: bug in detecting size prefix."); - break; - } - next += 1 + skip_B; - } - if (*next != ':' && *next != '\0') { - die("shmem: unexpected chars after phys size. <%c><%c>\n", - *next, *p); - } - if (*next == '\0') - p = next; - else - p = next + 1; - /* parse out optional shmem handle */ - const int skip_handle = strlen("handle="); - next = strcasestr(p, "handle="); - if (*p && next) { - if (p != next) - die("unexpected chars before handle\n"); - p += skip_handle; - next = strchrnul(p, ':'); - if (next - p) { - handle = malloc(next - p + 1); - strncpy(handle, p, next - p); - handle[next - p] = '\0'; /* just in case. */ - } - if (*next == '\0') - p = next; - else - p = next + 1; - } - /* parse optional create flag to see if we should create shm seg. 
*/ - if (*p && strcasestr(p, "create")) { - create = 1; - p += strlen("create"); - } - if (*p != '\0') - die("shmem: unexpected trailing chars\n"); - if (handle == NULL) { - handle = malloc(strlen(default_handle) + 1); - strcpy(handle, default_handle); - } - if (verbose) { - pr_info("shmem: phys_addr = %llx", - (unsigned long long)phys_addr); - pr_info("shmem: size = %llx", (unsigned long long)size); - pr_info("shmem: handle = %s", handle); - pr_info("shmem: create = %d", create); - } - - si->phys_addr = phys_addr; - si->size = size; - si->handle = handle; - si->create = create; - pci_shmem__register_mem(si); /* ownership of si, etc. passed on. */ - return 0; -} - -int pci_shmem__init(struct kvm *kvm) -{ - char *mem; - int r; - - if (shmem_region == NULL) - return 0; - - /* Register MMIO space for MSI-X */ - r = ioport__register(kvm, IOPORT_EMPTY, &shmem_pci__io_ops, IOPORT_SIZE, NULL); - if (r < 0) - return r; - ivshmem_registers = (u16)r; - - msix_block = pci_get_io_space_block(0x1010); - kvm__register_mmio(kvm, msix_block, 0x1010, false, callback_mmio_msix, NULL); - - /* - * This registers 3 BARs: - * - * 0 - ivshmem registers - * 1 - MSI-X MMIO space - * 2 - Shared memory block - */ - pci_shmem_pci_device.bar[0] = cpu_to_le32(ivshmem_registers | PCI_BASE_ADDRESS_SPACE_IO); - pci_shmem_pci_device.bar_size[0] = shmem_region->size; - pci_shmem_pci_device.bar[1] = cpu_to_le32(msix_block | PCI_BASE_ADDRESS_SPACE_MEMORY); - pci_shmem_pci_device.bar_size[1] = 0x1010; - pci_shmem_pci_device.bar[2] = cpu_to_le32(shmem_region->phys_addr | PCI_BASE_ADDRESS_SPACE_MEMORY); - pci_shmem_pci_device.bar_size[2] = shmem_region->size; - - device__register(&pci_shmem_device); - - /* Open shared memory and plug it into the guest */ - mem = setup_shmem(shmem_region->handle, shmem_region->size, - shmem_region->create); - if (mem == NULL) - return -EINVAL; - - kvm__register_dev_mem(kvm, shmem_region->phys_addr, shmem_region->size, - mem); - return 0; -} -dev_init(pci_shmem__init); - 
-int pci_shmem__exit(struct kvm *kvm) -{ - return 0; -} -dev_exit(pci_shmem__exit); diff --git a/include/kvm/pci-shmem.h b/include/kvm/pci-shmem.h deleted file mode 100644 index 6cff2b85bfd3..000000000000 --- a/include/kvm/pci-shmem.h +++ /dev/null @@ -1,32 +0,0 @@ -#ifndef KVM__PCI_SHMEM_H -#define KVM__PCI_SHMEM_H - -#include -#include - -#include "kvm/parse-options.h" - -#define SHMEM_DEFAULT_SIZE (16 << MB_SHIFT) -#define SHMEM_DEFAULT_ADDR (0xc8000000) -#define SHMEM_DEFAULT_HANDLE "/kvm_shmem" - -struct kvm; -struct shmem_info; - -struct shmem_info { - u64 phys_addr; - u64 size; - char *handle; - int create; -}; - -int pci_shmem__init(struct kvm *kvm); -int pci_shmem__exit(struct kvm *kvm); -int pci_shmem__register_mem(struct shmem_info *si); -int shmem_parser(const struct option *opt, const char *arg, int unset); - -int pci_shmem__get_local_irqfd(struct kvm *kvm); -int pci_shmem__add_client(struct kvm *kvm, u32 id, int fd); -int pci_shmem__remove_client(struct kvm *kvm, u32 id); - -#endif From patchwork Thu Mar 26 15:24:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460467 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C90CE1667 for ; Thu, 26 Mar 2020 15:25:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B448F20774 for ; Thu, 26 Mar 2020 15:25:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728225AbgCZPZA (ORCPT ); Thu, 26 Mar 2020 11:25:00 -0400 Received: from foss.arm.com ([217.140.110.172]:33708 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728194AbgCZPZA (ORCPT ); Thu, 26 Mar 2020 11:25:00 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by 
usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E0BF61045; Thu, 26 Mar 2020 08:24:59 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 03E683F71E; Thu, 26 Mar 2020 08:24:58 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 05/32] Check that a PCI device's memory size is power of two Date: Thu, 26 Mar 2020 15:24:11 +0000 Message-Id: <20200326152438.6218-6-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org According to the PCI local bus specification [1], a device's memory size must be a power of two. This is also implicit in the mechanism that a CPU uses to get the memory size requirement for a PCI device. The vesa device requests a memory size that isn't a power of two. According to the same spec [1], a device is allowed to consume more memory than it actually requires. As a result, the amount of memory that the vesa device now reserves has been increased. To prevent slip-ups in the future, a few BUILD_BUG_ON statements were added in places where the memory size is known at compile time. 
[1] PCI Local Bus Specification Revision 3.0, section 6.2.5.1 Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- hw/vesa.c | 3 +++ include/kvm/util.h | 2 ++ include/kvm/vesa.h | 6 +++++- virtio/pci.c | 3 +++ 4 files changed, 13 insertions(+), 1 deletion(-) diff --git a/hw/vesa.c b/hw/vesa.c index f3c5114cf4fe..d75b4b316a1e 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -58,6 +58,9 @@ struct framebuffer *vesa__init(struct kvm *kvm) char *mem; int r; + BUILD_BUG_ON(!is_power_of_two(VESA_MEM_SIZE)); + BUILD_BUG_ON(VESA_MEM_SIZE < VESA_BPP/8 * VESA_WIDTH * VESA_HEIGHT); + if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; diff --git a/include/kvm/util.h b/include/kvm/util.h index 4ca7aa9392b6..199724c4018c 100644 --- a/include/kvm/util.h +++ b/include/kvm/util.h @@ -104,6 +104,8 @@ static inline unsigned long roundup_pow_of_two(unsigned long x) return x ? 1UL << fls_long(x - 1) : 0; } +#define is_power_of_two(x) ((x) > 0 ? ((x) & ((x) - 1)) == 0 : 0) + struct kvm; void *mmap_hugetlbfs(struct kvm *kvm, const char *htlbfs_path, u64 size); void *mmap_anon_or_hugetlbfs(struct kvm *kvm, const char *hugetlbfs_path, u64 size); diff --git a/include/kvm/vesa.h b/include/kvm/vesa.h index 0fac11ab5a9f..e7d971343642 100644 --- a/include/kvm/vesa.h +++ b/include/kvm/vesa.h @@ -5,8 +5,12 @@ #define VESA_HEIGHT 480 #define VESA_MEM_ADDR 0xd0000000 -#define VESA_MEM_SIZE (4*VESA_WIDTH*VESA_HEIGHT) #define VESA_BPP 32 +/* + * We actually only need VESA_BPP/8*VESA_WIDTH*VESA_HEIGHT bytes. But the memory + * size must be a power of 2, so we round up. 
+ */ +#define VESA_MEM_SIZE (1 << 21) struct kvm; struct biosregs; diff --git a/virtio/pci.c b/virtio/pci.c index 99653cad2c0f..04e801827df9 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -435,6 +435,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vpci->kvm = kvm; vpci->dev = dev; + BUILD_BUG_ON(!is_power_of_two(IOPORT_SIZE)); + BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); + r = ioport__register(kvm, IOPORT_EMPTY, &virtio_pci__io_ops, IOPORT_SIZE, vdev); if (r < 0) return r; From patchwork Thu Mar 26 15:24:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460469 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1716292A for ; Thu, 26 Mar 2020 15:25:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 031642076A for ; Thu, 26 Mar 2020 15:25:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728279AbgCZPZC (ORCPT ); Thu, 26 Mar 2020 11:25:02 -0400 Received: from foss.arm.com ([217.140.110.172]:33716 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728232AbgCZPZB (ORCPT ); Thu, 26 Mar 2020 11:25:01 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 142321063; Thu, 26 Mar 2020 08:25:01 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 23ACB3F71E; Thu, 26 Mar 2020 08:25:00 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 
06/32] arm/pci: Advertise only PCI bus 0 in the DT Date: Thu, 26 Mar 2020 15:24:12 +0000 Message-Id: <20200326152438.6218-7-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The "bus-range" property encodes the PCI bus number of the PCI controller and the largest bus number of any PCI buses that are subordinate to this node [1]. kvmtool emulates only PCI bus 0. Advertise this in the PCI DT node by setting "bus-range" to <0,0>. [1] IEEE Std 1275-1994, Section 3 "Bus Nodes Properties and Methods" Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- arm/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arm/pci.c b/arm/pci.c index 557cfa98938d..ed325fa4a811 100644 --- a/arm/pci.c +++ b/arm/pci.c @@ -30,7 +30,7 @@ void pci__generate_fdt_nodes(void *fdt) struct of_interrupt_map_entry irq_map[OF_PCI_IRQ_MAP_MAX]; unsigned nentries = 0; /* Bus range */ - u32 bus_range[] = { cpu_to_fdt32(0), cpu_to_fdt32(1), }; + u32 bus_range[] = { cpu_to_fdt32(0), cpu_to_fdt32(0), }; /* Configuration Space */ u64 cfg_reg_prop[] = { cpu_to_fdt64(KVM_PCI_CFG_AREA), cpu_to_fdt64(ARM_PCI_CFG_SIZE), }; From patchwork Thu Mar 26 15:24:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460475 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3A2D192A for ; Thu, 26 Mar 2020 15:25:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1B69A2076A for ; Thu, 26 Mar 2020 15:25:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1728264AbgCZPZD (ORCPT ); Thu, 26 Mar 2020 11:25:03 -0400
Received: from foss.arm.com ([217.140.110.172]:33724 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728290AbgCZPZD (ORCPT ); Thu, 26 Mar 2020 11:25:03 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 48CDF7FA; Thu, 26 Mar 2020 08:25:02 -0700 (PDT)
Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 478A03F71E; Thu, 26 Mar 2020 08:25:01 -0700 (PDT)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, Julien Thierry
Subject: [PATCH v3 kvmtool 07/32] ioport: pci: Move port allocations to PCI devices
Date: Thu, 26 Mar 2020 15:24:13 +0000
Message-Id: <20200326152438.6218-8-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com>
References: <20200326152438.6218-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

From: Julien Thierry

The dynamic ioport allocation with IOPORT_EMPTY is currently only used by
PCI devices. Other devices use fixed ports for which they request
registration to the ioport API.

PCI ports need to be in the PCI IO space and there is no reason the ioport
API should know a PCI port is being allocated and needs to be placed in
PCI IO space. This currently just happens to be the case.

Move the responsibility of dynamic allocation of ioports from the ioport
API to PCI.

In the future, if other types of devices also need dynamic ioport
allocation, they'll have to figure out the range of ports they are
allowed to use.
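
For illustration only (this is not part of the patch): the scheme described
above can be sketched as a small bump allocator owned by the PCI code. The
names sketch_get_io_port_block, SKETCH_IOPORT_START and SKETCH_BLOCK_ALIGN
are hypothetical; the values 0x6200 and 0x400 mirror IOPORT_START and
IOPORT_SIZE as they appear in this series, and locking is deliberately left
out.

```c
#include <stdint.h>

/* Hypothetical constants; 0x6200 and 0x400 mirror values used in this series. */
#define SKETCH_IOPORT_START	0x6200u
#define SKETCH_BLOCK_ALIGN	0x400u

/* Round x up to the next multiple of a; a must be a power of two. */
#define SKETCH_ALIGN(x, a)	(((x) + ((a) - 1)) & ~((uint32_t)(a) - 1))

/* Next free port; owned by the PCI layer rather than the generic ioport code. */
static uint32_t sketch_next_port = SKETCH_IOPORT_START;

/* Hand out an aligned block of 'size' ports and advance the cursor past it. */
static uint16_t sketch_get_io_port_block(uint32_t size)
{
	uint32_t port = SKETCH_ALIGN(sketch_next_port, SKETCH_BLOCK_ALIGN);

	sketch_next_port = port + size;
	return (uint16_t)port;
}
```

With these values the first block handed out by the sketch starts at 0x6400
rather than 0x6200, because the start address is rounded up to the block
alignment before it is returned.
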
Reviewed-by: Andre Przywara Signed-off-by: Julien Thierry [Renamed functions for clarity] Signed-off-by: Alexandru Elisei --- hw/vesa.c | 4 ++-- include/kvm/ioport.h | 3 --- include/kvm/pci.h | 4 +++- ioport.c | 18 ------------------ pci.c | 17 +++++++++++++---- powerpc/include/kvm/kvm-arch.h | 2 +- vfio/core.c | 6 ++++-- vfio/pci.c | 4 ++-- virtio/pci.c | 7 ++++--- x86/include/kvm/kvm-arch.h | 2 +- 10 files changed, 30 insertions(+), 37 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index d75b4b316a1e..24fb46faad3b 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -63,8 +63,8 @@ struct framebuffer *vesa__init(struct kvm *kvm) if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; - - r = ioport__register(kvm, IOPORT_EMPTY, &vesa_io_ops, IOPORT_SIZE, NULL); + r = pci_get_io_port_block(IOPORT_SIZE); + r = ioport__register(kvm, r, &vesa_io_ops, IOPORT_SIZE, NULL); if (r < 0) return ERR_PTR(r); diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h index db52a479742b..b10fcd5b4412 100644 --- a/include/kvm/ioport.h +++ b/include/kvm/ioport.h @@ -14,11 +14,8 @@ /* some ports we reserve for own use */ #define IOPORT_DBG 0xe0 -#define IOPORT_START 0x6200 #define IOPORT_SIZE 0x400 -#define IOPORT_EMPTY USHRT_MAX - struct kvm; struct ioport { diff --git a/include/kvm/pci.h b/include/kvm/pci.h index a86c15a70e6d..ccb155e3e8fe 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -19,6 +19,7 @@ #define PCI_CONFIG_DATA 0xcfc #define PCI_CONFIG_BUS_FORWARD 0xcfa #define PCI_IO_SIZE 0x100 +#define PCI_IOPORT_START 0x6200 #define PCI_CFG_SIZE (1ULL << 24) struct kvm; @@ -152,7 +153,8 @@ struct pci_device_header { int pci__init(struct kvm *kvm); int pci__exit(struct kvm *kvm); struct pci_device_header *pci__find_dev(u8 dev_num); -u32 pci_get_io_space_block(u32 size); +u32 pci_get_mmio_block(u32 size); +u16 pci_get_io_port_block(u32 size); void pci__assign_irq(struct device_header *dev_hdr); void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, 
int size); void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size); diff --git a/ioport.c b/ioport.c index a6dc65e3e6c6..a72e4035881a 100644 --- a/ioport.c +++ b/ioport.c @@ -16,24 +16,8 @@ #define ioport_node(n) rb_entry(n, struct ioport, node) -DEFINE_MUTEX(ioport_mutex); - -static u16 free_io_port_idx; /* protected by ioport_mutex */ - static struct rb_root ioport_tree = RB_ROOT; -static u16 ioport__find_free_port(void) -{ - u16 free_port; - - mutex_lock(&ioport_mutex); - free_port = IOPORT_START + free_io_port_idx * IOPORT_SIZE; - free_io_port_idx++; - mutex_unlock(&ioport_mutex); - - return free_port; -} - static struct ioport *ioport_search(struct rb_root *root, u64 addr) { struct rb_int_node *node; @@ -85,8 +69,6 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i int r; br_write_lock(kvm); - if (port == IOPORT_EMPTY) - port = ioport__find_free_port(); entry = ioport_search(&ioport_tree, port); if (entry) { diff --git a/pci.c b/pci.c index 3198732935eb..80b5c5d3d7f3 100644 --- a/pci.c +++ b/pci.c @@ -15,15 +15,24 @@ static u32 pci_config_address_bits; * (That's why it can still 32bit even with 64bit guests-- 64bit * PCI isn't currently supported.) */ -static u32 io_space_blocks = KVM_PCI_MMIO_AREA; +static u32 mmio_blocks = KVM_PCI_MMIO_AREA; +static u16 io_port_blocks = PCI_IOPORT_START; + +u16 pci_get_io_port_block(u32 size) +{ + u16 port = ALIGN(io_port_blocks, IOPORT_SIZE); + + io_port_blocks = port + size; + return port; +} /* * BARs must be naturally aligned, so enforce this in the allocator. 
*/ -u32 pci_get_io_space_block(u32 size) +u32 pci_get_mmio_block(u32 size) { - u32 block = ALIGN(io_space_blocks, size); - io_space_blocks = block + size; + u32 block = ALIGN(mmio_blocks, size); + mmio_blocks = block + size; return block; } diff --git a/powerpc/include/kvm/kvm-arch.h b/powerpc/include/kvm/kvm-arch.h index 8126b96cb66a..26d440b22bdd 100644 --- a/powerpc/include/kvm/kvm-arch.h +++ b/powerpc/include/kvm/kvm-arch.h @@ -34,7 +34,7 @@ #define KVM_MMIO_START PPC_MMIO_START /* - * This is the address that pci_get_io_space_block() starts allocating + * This is the address that pci_get_io_port_block() starts allocating * from. Note that this is a PCI bus address. */ #define KVM_IOPORT_AREA 0x0 diff --git a/vfio/core.c b/vfio/core.c index 17b5b0cfc9ac..0ed1e6fee6bf 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -202,8 +202,10 @@ static int vfio_setup_trap_region(struct kvm *kvm, struct vfio_device *vdev, struct vfio_region *region) { if (region->is_ioport) { - int port = ioport__register(kvm, IOPORT_EMPTY, &vfio_ioport_ops, - region->info.size, region); + int port = pci_get_io_port_block(region->info.size); + + port = ioport__register(kvm, port, &vfio_ioport_ops, + region->info.size, region); if (port < 0) return port; diff --git a/vfio/pci.c b/vfio/pci.c index 76e24c156906..8e5d8572bc0c 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -750,7 +750,7 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, * powers of two. 
*/ mmio_size = roundup_pow_of_two(table->size + pba->size); - table->guest_phys_addr = pci_get_io_space_block(mmio_size); + table->guest_phys_addr = pci_get_mmio_block(mmio_size); if (!table->guest_phys_addr) { pr_err("cannot allocate IO space"); ret = -ENOMEM; @@ -846,7 +846,7 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, if (!region->is_ioport) { /* Grab some MMIO space in the guest */ map_size = ALIGN(region->info.size, PAGE_SIZE); - region->guest_phys_addr = pci_get_io_space_block(map_size); + region->guest_phys_addr = pci_get_mmio_block(map_size); } /* Map the BARs into the guest or setup a trap region. */ diff --git a/virtio/pci.c b/virtio/pci.c index 04e801827df9..d73414abde05 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -438,18 +438,19 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, BUILD_BUG_ON(!is_power_of_two(IOPORT_SIZE)); BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); - r = ioport__register(kvm, IOPORT_EMPTY, &virtio_pci__io_ops, IOPORT_SIZE, vdev); + r = pci_get_io_port_block(IOPORT_SIZE); + r = ioport__register(kvm, r, &virtio_pci__io_ops, IOPORT_SIZE, vdev); if (r < 0) return r; vpci->port_addr = (u16)r; - vpci->mmio_addr = pci_get_io_space_block(IOPORT_SIZE); + vpci->mmio_addr = pci_get_mmio_block(IOPORT_SIZE); r = kvm__register_mmio(kvm, vpci->mmio_addr, IOPORT_SIZE, false, virtio_pci__io_mmio_callback, vpci); if (r < 0) goto free_ioport; - vpci->msix_io_block = pci_get_io_space_block(PCI_IO_SIZE * 2); + vpci->msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); r = kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE * 2, false, virtio_pci__msix_mmio_callback, vpci); if (r < 0) diff --git a/x86/include/kvm/kvm-arch.h b/x86/include/kvm/kvm-arch.h index bfdd3438a9de..85cd336c7577 100644 --- a/x86/include/kvm/kvm-arch.h +++ b/x86/include/kvm/kvm-arch.h @@ -16,7 +16,7 @@ #define KVM_MMIO_START KVM_32BIT_GAP_START -/* This is the address that pci_get_io_space_block() starts 
allocating +/* This is the address that pci_get_io_port_block() starts allocating * from. Note that this is a PCI bus address (though same on x86). */ #define KVM_IOPORT_AREA 0x0 From patchwork Thu Mar 26 15:24:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460473 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4CEAB1667 for ; Thu, 26 Mar 2020 15:25:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 39E762076A for ; Thu, 26 Mar 2020 15:25:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728294AbgCZPZE (ORCPT ); Thu, 26 Mar 2020 11:25:04 -0400 Received: from foss.arm.com ([217.140.110.172]:33732 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728256AbgCZPZD (ORCPT ); Thu, 26 Mar 2020 11:25:03 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8342B1045; Thu, 26 Mar 2020 08:25:03 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7CD4D3F71E; Thu, 26 Mar 2020 08:25:02 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, Julien Thierry Subject: [PATCH v3 kvmtool 08/32] pci: Fix ioport allocation size Date: Thu, 26 Mar 2020 15:24:14 +0000 Message-Id: <20200326152438.6218-9-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: 
kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

From: Julien Thierry

The PCI Local Bus Specification, Rev. 3.0,
Section 6.2.5.1. "Address Maps" states:
"Devices that map control functions into I/O Space must not consume more
than 256 bytes per I/O Base Address register."

Yet all the PCI devices allocate IO ports of IOPORT_SIZE (= 1024 bytes).

Fix this by having PCI devices use 256-byte ports for IO BARs.

There is no hard requirement on the size of the memory region described
by memory BARs. Since BAR 1 is supposed to offer the same functionality as
IO ports, let's make its size match BAR 0.

Signed-off-by: Julien Thierry
[Added rationale for changing BAR1 size to PCI_IO_SIZE]
Signed-off-by: Alexandru Elisei
Reviewed-by: Andre Przywara
---
 hw/vesa.c            |  4 ++--
 include/kvm/ioport.h |  1 -
 pci.c                |  2 +-
 virtio/pci.c         | 15 +++++++--------
 4 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/hw/vesa.c b/hw/vesa.c
index 24fb46faad3b..d8d91aa9c873 100644
--- a/hw/vesa.c
+++ b/hw/vesa.c
@@ -63,8 +63,8 @@ struct framebuffer *vesa__init(struct kvm *kvm)
 
 	if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk)
 		return NULL;
-	r = pci_get_io_port_block(IOPORT_SIZE);
-	r = ioport__register(kvm, r, &vesa_io_ops, IOPORT_SIZE, NULL);
+	r = pci_get_io_port_block(PCI_IO_SIZE);
+	r = ioport__register(kvm, r, &vesa_io_ops, PCI_IO_SIZE, NULL);
 	if (r < 0)
 		return ERR_PTR(r);
 
diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h
index b10fcd5b4412..8c86b7151f25 100644
--- a/include/kvm/ioport.h
+++ b/include/kvm/ioport.h
@@ -14,7 +14,6 @@
 
 /* some ports we reserve for own use */
 #define IOPORT_DBG			0xe0
-#define IOPORT_SIZE			0x400
 
 struct kvm;
 
diff --git a/pci.c b/pci.c
index 80b5c5d3d7f3..b6892d974c08 100644
--- a/pci.c
+++ b/pci.c
@@ -20,7 +20,7 @@ static u16 io_port_blocks = PCI_IOPORT_START;
 
 u16 pci_get_io_port_block(u32 size)
 {
-	u16 port = ALIGN(io_port_blocks, IOPORT_SIZE);
+	u16 port = ALIGN(io_port_blocks, PCI_IO_SIZE);
 
 	io_port_blocks =
port + size; return port; diff --git a/virtio/pci.c b/virtio/pci.c index d73414abde05..eeb5b5efa6e1 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -421,7 +421,7 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, { struct virtio_pci *vpci = ptr; int direction = is_write ? KVM_EXIT_IO_OUT : KVM_EXIT_IO_IN; - u16 port = vpci->port_addr + (addr & (IOPORT_SIZE - 1)); + u16 port = vpci->port_addr + (addr & (PCI_IO_SIZE - 1)); kvm__emulate_io(vcpu, port, data, direction, len, 1); } @@ -435,17 +435,16 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vpci->kvm = kvm; vpci->dev = dev; - BUILD_BUG_ON(!is_power_of_two(IOPORT_SIZE)); BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); - r = pci_get_io_port_block(IOPORT_SIZE); - r = ioport__register(kvm, r, &virtio_pci__io_ops, IOPORT_SIZE, vdev); + r = pci_get_io_port_block(PCI_IO_SIZE); + r = ioport__register(kvm, r, &virtio_pci__io_ops, PCI_IO_SIZE, vdev); if (r < 0) return r; vpci->port_addr = (u16)r; - vpci->mmio_addr = pci_get_mmio_block(IOPORT_SIZE); - r = kvm__register_mmio(kvm, vpci->mmio_addr, IOPORT_SIZE, false, + vpci->mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); + r = kvm__register_mmio(kvm, vpci->mmio_addr, PCI_IO_SIZE, false, virtio_pci__io_mmio_callback, vpci); if (r < 0) goto free_ioport; @@ -475,8 +474,8 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, | PCI_BASE_ADDRESS_SPACE_MEMORY), .status = cpu_to_le16(PCI_STATUS_CAP_LIST), .capabilities = (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr, - .bar_size[0] = cpu_to_le32(IOPORT_SIZE), - .bar_size[1] = cpu_to_le32(IOPORT_SIZE), + .bar_size[0] = cpu_to_le32(PCI_IO_SIZE), + .bar_size[1] = cpu_to_le32(PCI_IO_SIZE), .bar_size[2] = cpu_to_le32(PCI_IO_SIZE*2), }; From patchwork Thu Mar 26 15:24:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460477 Return-Path: Received: from 
mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AE3BD17EA for ; Thu, 26 Mar 2020 15:25:06 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9054E2076A for ; Thu, 26 Mar 2020 15:25:06 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728232AbgCZPZF (ORCPT ); Thu, 26 Mar 2020 11:25:05 -0400
Received: from foss.arm.com ([217.140.110.172]:33742 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728326AbgCZPZF (ORCPT ); Thu, 26 Mar 2020 11:25:05 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B948B7FA; Thu, 26 Mar 2020 08:25:04 -0700 (PDT)
Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B76133F71E; Thu, 26 Mar 2020 08:25:03 -0700 (PDT)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, Julien Thierry
Subject: [PATCH v3 kvmtool 09/32] virtio/pci: Make memory and IO BARs independent
Date: Thu, 26 Mar 2020 15:24:15 +0000
Message-Id: <20200326152438.6218-10-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com>
References: <20200326152438.6218-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

From: Julien Thierry

Currently, callbacks for memory BAR 1 call the IO port emulation. This
means that the memory BAR needs I/O Space to be enabled whenever Memory
Space is enabled.

Refactor the code so the two types of BARs are independent.
Also, unify ioport/mmio callback arguments so that they all receive a virtio_device. Reviewed-by: Andre Przywara Signed-off-by: Julien Thierry [Cosmetic changes wrt to where local variables are initialized] Signed-off-by: Alexandru Elisei --- virtio/pci.c | 63 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 40 insertions(+), 23 deletions(-) diff --git a/virtio/pci.c b/virtio/pci.c index eeb5b5efa6e1..281c31817598 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -87,8 +87,8 @@ static inline bool virtio_pci__msix_enabled(struct virtio_pci *vpci) return vpci->pci_hdr.msix.ctrl & cpu_to_le16(PCI_MSIX_FLAGS_ENABLE); } -static bool virtio_pci__specific_io_in(struct kvm *kvm, struct virtio_device *vdev, u16 port, - void *data, int size, int offset) +static bool virtio_pci__specific_data_in(struct kvm *kvm, struct virtio_device *vdev, + void *data, int size, unsigned long offset) { u32 config_offset; struct virtio_pci *vpci = vdev->virtio; @@ -117,20 +117,17 @@ static bool virtio_pci__specific_io_in(struct kvm *kvm, struct virtio_device *vd return false; } -static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +static bool virtio_pci__data_in(struct kvm_cpu *vcpu, struct virtio_device *vdev, + unsigned long offset, void *data, int size) { - unsigned long offset; bool ret = true; - struct virtio_device *vdev; struct virtio_pci *vpci; struct virt_queue *vq; struct kvm *kvm; u32 val; kvm = vcpu->kvm; - vdev = ioport->priv; vpci = vdev->virtio; - offset = port - vpci->port_addr; switch (offset) { case VIRTIO_PCI_HOST_FEATURES: @@ -154,13 +151,22 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p vpci->isr = VIRTIO_IRQ_LOW; break; default: - ret = virtio_pci__specific_io_in(kvm, vdev, port, data, size, offset); + ret = virtio_pci__specific_data_in(kvm, vdev, data, size, offset); break; }; return ret; } +static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu 
*vcpu, u16 port, void *data, int size) +{ + struct virtio_device *vdev = ioport->priv; + struct virtio_pci *vpci = vdev->virtio; + unsigned long offset = port - vpci->port_addr; + + return virtio_pci__data_in(vcpu, vdev, offset, data, size); +} + static void update_msix_map(struct virtio_pci *vpci, struct msix_table *msix_entry, u32 vecnum) { @@ -185,8 +191,8 @@ static void update_msix_map(struct virtio_pci *vpci, irq__update_msix_route(vpci->kvm, gsi, &msix_entry->msg); } -static bool virtio_pci__specific_io_out(struct kvm *kvm, struct virtio_device *vdev, u16 port, - void *data, int size, int offset) +static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device *vdev, + void *data, int size, unsigned long offset) { struct virtio_pci *vpci = vdev->virtio; u32 config_offset, vec; @@ -259,19 +265,16 @@ static bool virtio_pci__specific_io_out(struct kvm *kvm, struct virtio_device *v return false; } -static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vdev, + unsigned long offset, void *data, int size) { - unsigned long offset; bool ret = true; - struct virtio_device *vdev; struct virtio_pci *vpci; struct kvm *kvm; u32 val; kvm = vcpu->kvm; - vdev = ioport->priv; vpci = vdev->virtio; - offset = port - vpci->port_addr; switch (offset) { case VIRTIO_PCI_GUEST_FEATURES: @@ -304,13 +307,22 @@ static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 virtio_notify_status(kvm, vdev, vpci->dev, vpci->status); break; default: - ret = virtio_pci__specific_io_out(kvm, vdev, port, data, size, offset); + ret = virtio_pci__specific_data_out(kvm, vdev, data, size, offset); break; }; return ret; } +static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +{ + struct virtio_device *vdev = ioport->priv; + struct virtio_pci *vpci = vdev->virtio; + unsigned long 
offset = port - vpci->port_addr; + + return virtio_pci__data_out(vcpu, vdev, offset, data, size); +} + static struct ioport_operations virtio_pci__io_ops = { .io_in = virtio_pci__io_in, .io_out = virtio_pci__io_out, @@ -320,7 +332,8 @@ static void virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr) { - struct virtio_pci *vpci = ptr; + struct virtio_device *vdev = ptr; + struct virtio_pci *vpci = vdev->virtio; struct msix_table *table; int vecnum; size_t offset; @@ -419,11 +432,15 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr) { - struct virtio_pci *vpci = ptr; - int direction = is_write ? KVM_EXIT_IO_OUT : KVM_EXIT_IO_IN; - u16 port = vpci->port_addr + (addr & (PCI_IO_SIZE - 1)); + struct virtio_device *vdev = ptr; + struct virtio_pci *vpci = vdev->virtio; - kvm__emulate_io(vcpu, port, data, direction, len, 1); + if (!is_write) + virtio_pci__data_in(vcpu, vdev, addr - vpci->mmio_addr, + data, len); + else + virtio_pci__data_out(vcpu, vdev, addr - vpci->mmio_addr, + data, len); } int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, @@ -445,13 +462,13 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vpci->mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); r = kvm__register_mmio(kvm, vpci->mmio_addr, PCI_IO_SIZE, false, - virtio_pci__io_mmio_callback, vpci); + virtio_pci__io_mmio_callback, vdev); if (r < 0) goto free_ioport; vpci->msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); r = kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE * 2, false, - virtio_pci__msix_mmio_callback, vpci); + virtio_pci__msix_mmio_callback, vdev); if (r < 0) goto free_mmio; From patchwork Thu Mar 26 15:24:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460479 Return-Path: Received: from 
mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ED1181667 for ; Thu, 26 Mar 2020 15:25:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D85B420748 for ; Thu, 26 Mar 2020 15:25:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728340AbgCZPZH (ORCPT ); Thu, 26 Mar 2020 11:25:07 -0400 Received: from foss.arm.com ([217.140.110.172]:33752 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728322AbgCZPZG (ORCPT ); Thu, 26 Mar 2020 11:25:06 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DBF071045; Thu, 26 Mar 2020 08:25:05 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id F00663F71E; Thu, 26 Mar 2020 08:25:04 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 10/32] vfio/pci: Allocate correct size for MSIX table and PBA BARs Date: Thu, 26 Mar 2020 15:24:16 +0000 Message-Id: <20200326152438.6218-11-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org kvmtool assumes that the BAR that holds the address for the MSIX table and PBA structure has a size which is equal to their total size and it allocates memory from MMIO space accordingly. However, when initializing the BARs, the BAR size is set to the region size reported by VFIO. 
When the physical BAR size is greater than the mmio space that kvmtool allocates, we can have a situation where the BAR overlaps with another BAR, in which case kvmtool will fail to map the memory. This was found when trying to do PCI passthrough with a PCIe Realtek r8168 NIC, when the guest was also using virtio-block and virtio-net devices: [..] [ 0.197926] PCI: OF: PROBE_ONLY enabled [ 0.198454] pci-host-generic 40000000.pci: host bridge /pci ranges: [ 0.199291] pci-host-generic 40000000.pci: IO 0x00007000..0x0000ffff -> 0x00007000 [ 0.200331] pci-host-generic 40000000.pci: MEM 0x41000000..0x7fffffff -> 0x41000000 [ 0.201480] pci-host-generic 40000000.pci: ECAM at [mem 0x40000000-0x40ffffff] for [bus 00] [ 0.202635] pci-host-generic 40000000.pci: PCI host bridge to bus 0000:00 [ 0.203535] pci_bus 0000:00: root bus resource [bus 00] [ 0.204227] pci_bus 0000:00: root bus resource [io 0x0000-0x8fff] (bus address [0x7000-0xffff]) [ 0.205483] pci_bus 0000:00: root bus resource [mem 0x41000000-0x7fffffff] [ 0.206456] pci 0000:00:00.0: [10ec:8168] type 00 class 0x020000 [ 0.207399] pci 0000:00:00.0: reg 0x10: [io 0x0000-0x00ff] [ 0.208252] pci 0000:00:00.0: reg 0x18: [mem 0x41002000-0x41002fff] [ 0.209233] pci 0000:00:00.0: reg 0x20: [mem 0x41000000-0x41003fff] [ 0.210481] pci 0000:00:01.0: [1af4:1000] type 00 class 0x020000 [ 0.211349] pci 0000:00:01.0: reg 0x10: [io 0x0100-0x01ff] [ 0.212118] pci 0000:00:01.0: reg 0x14: [mem 0x41003000-0x410030ff] [ 0.212982] pci 0000:00:01.0: reg 0x18: [mem 0x41003200-0x410033ff] [ 0.214247] pci 0000:00:02.0: [1af4:1001] type 00 class 0x018000 [ 0.215096] pci 0000:00:02.0: reg 0x10: [io 0x0200-0x02ff] [ 0.215863] pci 0000:00:02.0: reg 0x14: [mem 0x41003400-0x410034ff] [ 0.216723] pci 0000:00:02.0: reg 0x18: [mem 0x41003600-0x410037ff] [ 0.218105] pci 0000:00:00.0: can't claim BAR 4 [mem 0x41000000-0x41003fff]: address conflict with 0000:00:00.0 [mem 0x41002000-0x41002fff] [..] 
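
As an aside (not part of the patch): the conflict reported at the end of the
log above comes down to two inclusive address ranges sharing at least one
address. A minimal sketch of that check follows, using a hypothetical helper
sketch_ranges_overlap and the two ranges the kernel printed for device
0000:00:00.0 (reg 0x20 is BAR 4, reg 0x18 is BAR 2).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the inclusive ranges [a_start, a_end] and
 * [b_start, b_end] have at least one address in common.
 */
static bool sketch_ranges_overlap(uint64_t a_start, uint64_t a_end,
				  uint64_t b_start, uint64_t b_end)
{
	return a_start <= b_end && b_start <= a_end;
}

/* Ranges copied from the log: reg 0x20 (BAR 4) and reg 0x18 (BAR 2). */
static bool sketch_log_example_conflicts(void)
{
	return sketch_ranges_overlap(0x41000000, 0x41003fff,
				     0x41002000, 0x41002fff);
}
```

The same check also flags BAR 4 against the virtio BARs at 0x41003000 and
above, since the 16KiB window reported for BAR 4 runs to 0x41003fff.
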
Guest output of lspci -vv: 00:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06) Subsystem: TP-LINK Technologies Co., Ltd. TG-3468 Gigabit PCI Express Network Adapter Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Signed-off-by: Alexandru Elisei --- vfio/pci.c | 68 +++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 52 insertions(+), 16 deletions(-) diff --git a/vfio/pci.c b/vfio/pci.c index 8e5d8572bc0c..bbb8469c8d93 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -715,17 +715,44 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) return 0; } -static int vfio_pci_create_msix_table(struct kvm *kvm, - struct vfio_pci_device *pdev) +static int vfio_pci_get_region_info(struct vfio_device *vdev, u32 index, + struct vfio_region_info *info) +{ + int ret; + + *info = (struct vfio_region_info) { + .argsz = sizeof(*info), + .index = index, + }; + + ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, info); + if (ret) { + ret = -errno; + vfio_dev_err(vdev, "cannot get info for BAR %u", index); + return ret; + } + + if (info->size && !is_power_of_two(info->size)) { + vfio_dev_err(vdev, "region is not power of two: 0x%llx", + info->size); + return -EINVAL; + } + + return 0; +} + +static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev) { int ret; size_t i; - size_t mmio_size; + size_t map_size; size_t nr_entries; struct vfio_pci_msi_entry *entries; + struct vfio_pci_device *pdev = &vdev->pci; struct vfio_pci_msix_pba *pba = &pdev->msix_pba; struct vfio_pci_msix_table *table = &pdev->msix_table; struct msix_cap *msix = PCI_CAP(&pdev->hdr, pdev->msix.pos); + struct vfio_region_info info; table->bar = msix->table_offset & PCI_MSIX_TABLE_BIR; pba->bar = msix->pba_offset & PCI_MSIX_TABLE_BIR; @@ -744,15 +771,31 @@ static int 
vfio_pci_create_msix_table(struct kvm *kvm,
 	for (i = 0; i < nr_entries; i++)
 		entries[i].config.ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
 
+	ret = vfio_pci_get_region_info(vdev, table->bar, &info);
+	if (ret)
+		return ret;
+	if (!info.size)
+		return -EINVAL;
+	map_size = info.size;
+
+	if (table->bar != pba->bar) {
+		ret = vfio_pci_get_region_info(vdev, pba->bar, &info);
+		if (ret)
+			return ret;
+		if (!info.size)
+			return -EINVAL;
+		map_size += info.size;
+	}
+
 	/*
 	 * To ease MSI-X cap configuration in case they share the same BAR,
 	 * collapse table and pending array. The size of the BAR regions must be
 	 * powers of two.
 	 */
-	mmio_size = roundup_pow_of_two(table->size + pba->size);
-	table->guest_phys_addr = pci_get_mmio_block(mmio_size);
+	map_size = ALIGN(map_size, PAGE_SIZE);
+	table->guest_phys_addr = pci_get_mmio_block(map_size);
 	if (!table->guest_phys_addr) {
-		pr_err("cannot allocate IO space");
+		pr_err("cannot allocate MMIO space");
 		ret = -ENOMEM;
 		goto out_free;
 	}
@@ -816,17 +859,10 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev,
 
 	region->vdev = vdev;
 	region->is_ioport = !!(bar & PCI_BASE_ADDRESS_SPACE_IO);
-	region->info = (struct vfio_region_info) {
-		.argsz = sizeof(region->info),
-		.index = nr,
-	};
 
-	ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, &region->info);
-	if (ret) {
-		ret = -errno;
-		vfio_dev_err(vdev, "cannot get info for BAR %zu", nr);
+	ret = vfio_pci_get_region_info(vdev, nr, &region->info);
+	if (ret)
 		return ret;
-	}
 
 	/* Ignore invalid or unimplemented regions */
 	if (!region->info.size)
@@ -871,7 +907,7 @@ static int vfio_pci_configure_dev_regions(struct kvm *kvm,
 		return ret;
 
 	if (pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) {
-		ret = vfio_pci_create_msix_table(kvm, pdev);
+		ret = vfio_pci_create_msix_table(kvm, vdev);
 		if (ret)
 			return ret;
 	}

From patchwork Thu Mar 26 15:24:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandru Elisei
X-Patchwork-Id:
11460481 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 25DC41667 for ; Thu, 26 Mar 2020 15:25:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0FEF72076A for ; Thu, 26 Mar 2020 15:25:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726034AbgCZPZH (ORCPT ); Thu, 26 Mar 2020 11:25:07 -0400 Received: from foss.arm.com ([217.140.110.172]:33760 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728341AbgCZPZH (ORCPT ); Thu, 26 Mar 2020 11:25:07 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0EB347FA; Thu, 26 Mar 2020 08:25:07 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 1D3913F71E; Thu, 26 Mar 2020 08:25:06 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 11/32] vfio/pci: Don't assume that only even numbered BARs are 64bit Date: Thu, 26 Mar 2020 15:24:17 +0000 Message-Id: <20200326152438.6218-12-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Not all devices have the bottom 32 bits of a 64 bit BAR in an even numbered BAR. For example, on an NVIDIA Quadro P400, BARs 1 and 3 are 64bit. Remove this assumption. 
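The skip logic described above can be sketched in isolation. This is only an illustration, not the kvmtool code: the helper names bar_is_64bit and bars_to_configure, the NR_BARS constant and the BAR_* macros are hypothetical, and the bit values are assumed to mirror the BAR type encoding from linux/pci_regs.h. The point is that the flag set by the low half of a 64-bit BAR makes the loop skip exactly the next index, whether that index is odd or even.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_BARS            6
/* Assumed to match the BAR encoding in linux/pci_regs.h. */
#define BAR_SPACE_IO       0x01u
#define BAR_MEM_TYPE_MASK  0x06u
#define BAR_MEM_TYPE_64    0x04u

/* Hypothetical helper: true if this BAR is the low half of a 64-bit memory BAR. */
static bool bar_is_64bit(uint32_t bar)
{
	if (bar & BAR_SPACE_IO)
		return false;
	return (bar & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
}

/*
 * Hypothetical helper: returns a bitmask of the BAR indices that would be
 * configured. The index that follows a 64-bit BAR holds its upper 32 bits,
 * so it is skipped without looking at its contents.
 */
static unsigned int bars_to_configure(const uint32_t bars[NR_BARS])
{
	unsigned int mask = 0;
	bool is_64bit = false;
	int i;

	for (i = 0; i < NR_BARS; i++) {
		if (is_64bit) {
			/* Top half of the previous BAR: skip it and reset the flag. */
			is_64bit = false;
			continue;
		}
		mask |= 1u << i;
		is_64bit = bar_is_64bit(bars[i]);
	}

	return mask;
}
```

With 64-bit BARs at indices 1 and 3, the sketch selects BARs 0, 1, 3 and 5. A check of the form "i % 2 && is_64bit" would not skip index 2 in that layout, because 2 is even.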
Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- vfio/pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/vfio/pci.c b/vfio/pci.c index bbb8469c8d93..1bdc20038411 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -920,8 +920,10 @@ static int vfio_pci_configure_dev_regions(struct kvm *kvm, for (i = VFIO_PCI_BAR0_REGION_INDEX; i <= VFIO_PCI_BAR5_REGION_INDEX; ++i) { /* Ignore top half of 64-bit BAR */ - if (i % 2 && is_64bit) + if (is_64bit) { + is_64bit = false; continue; + } ret = vfio_pci_configure_bar(kvm, vdev, i); if (ret) From patchwork Thu Mar 26 15:24:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460523 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9BCC81667 for ; Thu, 26 Mar 2020 15:25:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8706520748 for ; Thu, 26 Mar 2020 15:25:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728348AbgCZPZK (ORCPT ); Thu, 26 Mar 2020 11:25:10 -0400 Received: from foss.arm.com ([217.140.110.172]:33766 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728312AbgCZPZI (ORCPT ); Thu, 26 Mar 2020 11:25:08 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2F18A1045; Thu, 26 Mar 2020 08:25:08 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 46A463F71E; Thu, 26 Mar 2020 08:25:07 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, 
lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 12/32] vfio/pci: Ignore expansion ROM BAR writes Date: Thu, 26 Mar 2020 15:24:18 +0000 Message-Id: <20200326152438.6218-13-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To get the size of the expansion ROM, software writes 0xfffff800 to the expansion ROM BAR in the PCI configuration space. PCI emulation executes the optional configuration space write callback that a device can implement before emulating this write. kvmtool's implementation of VFIO doesn't have support for emulating expansion ROMs. However, the callback writes the guest value to the hardware BAR, and then it reads it back to the emulated BAR to make sure the write has completed successfully. After this, we return to regular PCI emulation and because the BAR is no longer 0, we write back to the BAR the value that the guest used to get the size. As a result, the guest will think that the ROM size is 0x800 after the subsequent read and we end up unintentionally exposing to the guest a BAR which we don't emulate. Let's fix this by ignoring writes to the expansion ROM BAR. 
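For reference, the size computation that leads the guest to 0x800 can be sketched as below. This is a minimal illustration of the usual PCI sizing arithmetic, not code from kvmtool or from any guest: the function name rom_size_from_readback is hypothetical, and ROM_ADDRESS_MASK is assumed to have the same value as PCI_ROM_ADDRESS_MASK in linux/pci_regs.h.

```c
#include <stdint.h>

/* Assumed to equal PCI_ROM_ADDRESS_MASK: the address bits of the ROM BAR. */
#define ROM_ADDRESS_MASK 0xfffff800u

/*
 * Hypothetical helper: given the value read back from the expansion ROM BAR
 * after writing all ones to its address bits, return the ROM size that
 * sizing software would derive, or 0 if the BAR reads back as unimplemented.
 */
static uint32_t rom_size_from_readback(uint32_t readback)
{
	uint32_t addr_bits = readback & ROM_ADDRESS_MASK;

	if (!addr_bits)
		return 0;

	/* The lowest writable address bit gives the size. */
	return ~addr_bits + 1;
}
```

If the emulated BAR returns the guest's own 0xfffff800, the sketch yields 0x800, which is the bogus ROM size described above. A BAR that keeps reading as 0 is treated as not implemented.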
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- vfio/pci.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/vfio/pci.c b/vfio/pci.c index 1bdc20038411..1f38f90c3ae9 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -472,6 +472,9 @@ static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hd struct vfio_device *vdev; void *base = pci_hdr; + if (offset == PCI_ROM_ADDRESS) + return; + pdev = container_of(pci_hdr, struct vfio_pci_device, hdr); vdev = container_of(pdev, struct vfio_device, pci); info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; From patchwork Thu Mar 26 15:24:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460483 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B1AF492A for ; Thu, 26 Mar 2020 15:25:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9D56E2073E for ; Thu, 26 Mar 2020 15:25:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728404AbgCZPZL (ORCPT ); Thu, 26 Mar 2020 11:25:11 -0400 Received: from foss.arm.com ([217.140.110.172]:33782 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728364AbgCZPZJ (ORCPT ); Thu, 26 Mar 2020 11:25:09 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4DE6C7FA; Thu, 26 Mar 2020 08:25:09 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 63C913F71E; Thu, 26 Mar 2020 08:25:08 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, 
sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 13/32] vfio/pci: Don't access unallocated regions Date: Thu, 26 Mar 2020 15:24:19 +0000 Message-Id: <20200326152438.6218-14-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Don't try to configure a BAR if there is no region associated with it. Also move the variable declarations from inside the loop to the start of the function for consistency. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- vfio/pci.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/vfio/pci.c b/vfio/pci.c index 1f38f90c3ae9..4412c6d7a862 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -645,16 +645,19 @@ static int vfio_pci_parse_cfg_space(struct vfio_device *vdev) static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) { int i; + u64 base; ssize_t hdr_sz; struct msix_cap *msix; struct vfio_region_info *info; struct vfio_pci_device *pdev = &vdev->pci; + struct vfio_region *region; /* Initialise the BARs */ for (i = VFIO_PCI_BAR0_REGION_INDEX; i <= VFIO_PCI_BAR5_REGION_INDEX; ++i) { - u64 base; - struct vfio_region *region = &vdev->regions[i]; + if ((u32)i == vdev->info.num_regions) + break; + region = &vdev->regions[i]; /* Construct a fake reg to match what we've mapped. 
*/ if (region->is_ioport) { base = (region->port_base & PCI_BASE_ADDRESS_IO_MASK) | @@ -853,11 +856,12 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, u32 bar; size_t map_size; struct vfio_pci_device *pdev = &vdev->pci; - struct vfio_region *region = &vdev->regions[nr]; + struct vfio_region *region; if (nr >= vdev->info.num_regions) return 0; + region = &vdev->regions[nr]; bar = pdev->hdr.bar[nr]; region->vdev = vdev; From patchwork Thu Mar 26 15:24:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460487 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DE08D17EA for ; Thu, 26 Mar 2020 15:25:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C85672073E for ; Thu, 26 Mar 2020 15:25:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728341AbgCZPZL (ORCPT ); Thu, 26 Mar 2020 11:25:11 -0400 Received: from foss.arm.com ([217.140.110.172]:33792 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728365AbgCZPZL (ORCPT ); Thu, 26 Mar 2020 11:25:11 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 64DB31045; Thu, 26 Mar 2020 08:25:10 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 7FBF03F71E; Thu, 26 Mar 2020 08:25:09 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 14/32] virtio: Don't ignore initialization failures Date: Thu, 
26 Mar 2020 15:24:20 +0000 Message-Id: <20200326152438.6218-15-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Don't ignore an error in the bus specific initialization function in virtio_init; don't ignore the result of virtio_init; and don't return 0 in virtio_blk__init and virtio_scsi__init when we encounter an error. Hopefully this will save some developer's time debugging faulty virtio devices in a guest. To take advantage of the cleanup function virtio_blk__exit, we have moved appending the new device to the list before the call to virtio_init. To safeguard against this in the future, virtio_init has been annotated with the compiler attribute warn_unused_result. Signed-off-by: Alexandru Elisei --- include/kvm/kvm.h | 1 + include/kvm/virtio.h | 7 ++++--- include/linux/compiler.h | 2 +- virtio/9p.c | 9 ++++++--- virtio/balloon.c | 10 +++++++--- virtio/blk.c | 14 +++++++++----- virtio/console.c | 11 ++++++++--- virtio/core.c | 9 +++++---- virtio/net.c | 32 ++++++++++++++++++-------------- virtio/scsi.c | 14 +++++++++----- 10 files changed, 68 insertions(+), 41 deletions(-) diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h index 7a738183d67a..c6dc6ef72d11 100644 --- a/include/kvm/kvm.h +++ b/include/kvm/kvm.h @@ -8,6 +8,7 @@ #include #include +#include #include #include #include diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h index 19b913732cd5..3a311f54f2a5 100644 --- a/include/kvm/virtio.h +++ b/include/kvm/virtio.h @@ -7,6 +7,7 @@ #include #include +#include #include #include @@ -204,9 +205,9 @@ struct virtio_ops { int (*reset)(struct kvm *kvm, struct virtio_device *vdev); }; -int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, - struct virtio_ops *ops, enum virtio_trans trans, - int 
device_id, int subsys_id, int class); +int __must_check virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, + struct virtio_ops *ops, enum virtio_trans trans, + int device_id, int subsys_id, int class); int virtio_compat_add_message(const char *device, const char *config); const char* virtio_trans_name(enum virtio_trans trans); diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 898420b81aec..a662ba0a5c68 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -14,7 +14,7 @@ #define __packed __attribute__((packed)) #define __iomem #define __force -#define __must_check +#define __must_check __attribute__((warn_unused_result)) #define unlikely #endif diff --git a/virtio/9p.c b/virtio/9p.c index ac70dbc31207..b78f2b3f0e09 100644 --- a/virtio/9p.c +++ b/virtio/9p.c @@ -1551,11 +1551,14 @@ int virtio_9p_img_name_parser(const struct option *opt, const char *arg, int uns int virtio_9p__init(struct kvm *kvm) { struct p9_dev *p9dev; + int r; list_for_each_entry(p9dev, &devs, list) { - virtio_init(kvm, p9dev, &p9dev->vdev, &p9_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_9P, - VIRTIO_ID_9P, PCI_CLASS_9P); + r = virtio_init(kvm, p9dev, &p9dev->vdev, &p9_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_9P, + VIRTIO_ID_9P, PCI_CLASS_9P); + if (r < 0) + return r; } return 0; diff --git a/virtio/balloon.c b/virtio/balloon.c index 0bd16703dfee..8e8803fed607 100644 --- a/virtio/balloon.c +++ b/virtio/balloon.c @@ -264,6 +264,8 @@ struct virtio_ops bln_dev_virtio_ops = { int virtio_bln__init(struct kvm *kvm) { + int r; + if (!kvm->cfg.balloon) return 0; @@ -273,9 +275,11 @@ int virtio_bln__init(struct kvm *kvm) bdev.stat_waitfd = eventfd(0, 0); memset(&bdev.config, 0, sizeof(struct virtio_balloon_config)); - virtio_init(kvm, &bdev, &bdev.vdev, &bln_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLN, - VIRTIO_ID_BALLOON, PCI_CLASS_BLN); + r = virtio_init(kvm, &bdev, &bdev.vdev, 
&bln_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLN, + VIRTIO_ID_BALLOON, PCI_CLASS_BLN); + if (r < 0) + return r; if (compat_id == -1) compat_id = virtio_compat_add_message("virtio-balloon", "CONFIG_VIRTIO_BALLOON"); diff --git a/virtio/blk.c b/virtio/blk.c index f267be1563dc..4d02d101af81 100644 --- a/virtio/blk.c +++ b/virtio/blk.c @@ -306,6 +306,7 @@ static struct virtio_ops blk_dev_virtio_ops = { static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk) { struct blk_dev *bdev; + int r; if (!disk) return -EINVAL; @@ -323,12 +324,14 @@ static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk) .kvm = kvm, }; - virtio_init(kvm, bdev, &bdev->vdev, &blk_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLK, - VIRTIO_ID_BLOCK, PCI_CLASS_BLK); - list_add_tail(&bdev->list, &bdevs); + r = virtio_init(kvm, bdev, &bdev->vdev, &blk_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLK, + VIRTIO_ID_BLOCK, PCI_CLASS_BLK); + if (r < 0) + return r; + disk_image__set_callback(bdev->disk, virtio_blk_complete); if (compat_id == -1) @@ -359,7 +362,8 @@ int virtio_blk__init(struct kvm *kvm) return 0; cleanup: - return virtio_blk__exit(kvm); + virtio_blk__exit(kvm); + return r; } virtio_dev_init(virtio_blk__init); diff --git a/virtio/console.c b/virtio/console.c index f1be02549222..e0b98df37965 100644 --- a/virtio/console.c +++ b/virtio/console.c @@ -230,12 +230,17 @@ static struct virtio_ops con_dev_virtio_ops = { int virtio_console__init(struct kvm *kvm) { + int r; + if (kvm->cfg.active_console != CONSOLE_VIRTIO) return 0; - virtio_init(kvm, &cdev, &cdev.vdev, &con_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_CONSOLE, - VIRTIO_ID_CONSOLE, PCI_CLASS_CONSOLE); + r = virtio_init(kvm, &cdev, &cdev.vdev, &con_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_CONSOLE, + VIRTIO_ID_CONSOLE, PCI_CLASS_CONSOLE); + if (r < 0) + return r; + if (compat_id == -1) compat_id 
= virtio_compat_add_message("virtio-console", "CONFIG_VIRTIO_CONSOLE"); diff --git a/virtio/core.c b/virtio/core.c index e10ec362e1ea..f5b3c07fc100 100644 --- a/virtio/core.c +++ b/virtio/core.c @@ -259,6 +259,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { void *virtio; + int r; switch (trans) { case VIRTIO_PCI: @@ -272,7 +273,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vdev->ops->init = virtio_pci__init; vdev->ops->exit = virtio_pci__exit; vdev->ops->reset = virtio_pci__reset; - vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); + r = vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); break; case VIRTIO_MMIO: virtio = calloc(sizeof(struct virtio_mmio), 1); @@ -285,13 +286,13 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vdev->ops->init = virtio_mmio_init; vdev->ops->exit = virtio_mmio_exit; vdev->ops->reset = virtio_mmio_reset; - vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); + r = vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); break; default: - return -1; + r = -1; }; - return 0; + return r; } int virtio_compat_add_message(const char *device, const char *config) diff --git a/virtio/net.c b/virtio/net.c index 091406912a24..425c13ba1136 100644 --- a/virtio/net.c +++ b/virtio/net.c @@ -910,7 +910,7 @@ done: static int virtio_net__init_one(struct virtio_net_params *params) { - int i, err; + int i, r; struct net_dev *ndev; struct virtio_ops *ops; enum virtio_trans trans = VIRTIO_DEFAULT_TRANS(params->kvm); @@ -920,10 +920,8 @@ static int virtio_net__init_one(struct virtio_net_params *params) return -ENOMEM; ops = malloc(sizeof(*ops)); - if (ops == NULL) { - err = -ENOMEM; - goto err_free_ndev; - } + if (ops == NULL) + return -ENOMEM; list_add_tail(&ndev->list, &ndevs); @@ -969,8 +967,10 @@ static int virtio_net__init_one(struct virtio_net_params *params) virtio_trans_name(trans)); } - 
virtio_init(params->kvm, ndev, &ndev->vdev, ops, trans, - PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET); + r = virtio_init(params->kvm, ndev, &ndev->vdev, ops, trans, + PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET); + if (r < 0) + return r; if (params->vhost) virtio_net__vhost_init(params->kvm, ndev); @@ -979,19 +979,17 @@ static int virtio_net__init_one(struct virtio_net_params *params) compat_id = virtio_compat_add_message("virtio-net", "CONFIG_VIRTIO_NET"); return 0; - -err_free_ndev: - free(ndev); - return err; } int virtio_net__init(struct kvm *kvm) { - int i; + int i, r; for (i = 0; i < kvm->cfg.num_net_devices; i++) { kvm->cfg.net_params[i].kvm = kvm; - virtio_net__init_one(&kvm->cfg.net_params[i]); + r = virtio_net__init_one(&kvm->cfg.net_params[i]); + if (r < 0) + goto cleanup; } if (kvm->cfg.num_net_devices == 0 && kvm->cfg.no_net == 0) { @@ -1007,10 +1005,16 @@ int virtio_net__init(struct kvm *kvm) str_to_mac(kvm->cfg.guest_mac, net_params.guest_mac); str_to_mac(kvm->cfg.host_mac, net_params.host_mac); - virtio_net__init_one(&net_params); + r = virtio_net__init_one(&net_params); + if (r < 0) + goto cleanup; } return 0; + +cleanup: + virtio_net__exit(kvm); + return r; } virtio_dev_init(virtio_net__init); diff --git a/virtio/scsi.c b/virtio/scsi.c index 1ec78fe0945a..16a86cb7e0e6 100644 --- a/virtio/scsi.c +++ b/virtio/scsi.c @@ -234,6 +234,7 @@ static void virtio_scsi_vhost_init(struct kvm *kvm, struct scsi_dev *sdev) static int virtio_scsi_init_one(struct kvm *kvm, struct disk_image *disk) { struct scsi_dev *sdev; + int r; if (!disk) return -EINVAL; @@ -260,12 +261,14 @@ static int virtio_scsi_init_one(struct kvm *kvm, struct disk_image *disk) strlcpy((char *)&sdev->target.vhost_wwpn, disk->wwpn, sizeof(sdev->target.vhost_wwpn)); sdev->target.vhost_tpgt = strtol(disk->tpgt, NULL, 0); - virtio_init(kvm, sdev, &sdev->vdev, &scsi_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_SCSI, - VIRTIO_ID_SCSI, PCI_CLASS_BLK); - 
list_add_tail(&sdev->list, &sdevs); + r = virtio_init(kvm, sdev, &sdev->vdev, &scsi_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_SCSI, + VIRTIO_ID_SCSI, PCI_CLASS_BLK); + if (r < 0) + return r; + virtio_scsi_vhost_init(kvm, sdev); if (compat_id == -1) @@ -302,7 +305,8 @@ int virtio_scsi_init(struct kvm *kvm) return 0; cleanup: - return virtio_scsi_exit(kvm); + virtio_scsi_exit(kvm); + return r; } virtio_dev_init(virtio_scsi_init); From patchwork Thu Mar 26 15:24:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460485 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BF3A81667 for ; Thu, 26 Mar 2020 15:25:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A21642073E for ; Thu, 26 Mar 2020 15:25:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728325AbgCZPZM (ORCPT ); Thu, 26 Mar 2020 11:25:12 -0400 Received: from foss.arm.com ([217.140.110.172]:33798 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728338AbgCZPZM (ORCPT ); Thu, 26 Mar 2020 11:25:12 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 815CC7FA; Thu, 26 Mar 2020 08:25:11 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 990EB3F71E; Thu, 26 Mar 2020 08:25:10 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 15/32] Don't ignore errors registering a device, ioport or 
mmio emulation Date: Thu, 26 Mar 2020 15:24:21 +0000 Message-Id: <20200326152438.6218-16-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org An error returned by device__register, kvm__register_mmio and ioport__register means that the device will not be emulated properly. Annotate the functions with __must_check, so we get a compiler warning when this error is ignored. And fix several instances where the caller returns 0 even if the function failed. Also make sure the ioport emulation code uses ioport_remove consistently. Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- arm/ioport.c | 3 +- hw/i8042.c | 12 ++++++-- hw/vesa.c | 4 ++- include/kvm/devices.h | 3 +- include/kvm/ioport.h | 6 ++-- include/kvm/kvm.h | 6 ++-- ioport.c | 25 ++++++++-------- mips/kvm.c | 3 +- powerpc/ioport.c | 3 +- virtio/mmio.c | 13 +++++++-- x86/ioport.c | 66 ++++++++++++++++++++++++++++++++----------- 11 files changed, 101 insertions(+), 43 deletions(-) diff --git a/arm/ioport.c b/arm/ioport.c index bdd30b6fe812..2f0feb9ab69f 100644 --- a/arm/ioport.c +++ b/arm/ioport.c @@ -1,8 +1,9 @@ #include "kvm/ioport.h" #include "kvm/irq.h" -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { + return 0; } void ioport__map_irq(u8 *irq) diff --git a/hw/i8042.c b/hw/i8042.c index 2d8c96e9c7e6..37a99a2dc6b8 100644 --- a/hw/i8042.c +++ b/hw/i8042.c @@ -349,10 +349,18 @@ static struct ioport_operations kbd_ops = { int kbd__init(struct kvm *kvm) { + int r; + kbd_reset(); state.kvm = kvm; - ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL); - ioport__register(kvm, I8042_COMMAND_REG, &kbd_ops, 2, NULL); + r = ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL); + if (r < 0) + return r; + r = 
ioport__register(kvm, I8042_COMMAND_REG, &kbd_ops, 2, NULL); + if (r < 0) { + ioport__unregister(kvm, I8042_DATA_REG); + return r; + } return 0; } diff --git a/hw/vesa.c b/hw/vesa.c index d8d91aa9c873..b92cc990b730 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -70,7 +70,9 @@ struct framebuffer *vesa__init(struct kvm *kvm) vesa_base_addr = (u16)r; vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); - device__register(&vesa_device); + r = device__register(&vesa_device); + if (r < 0) + return ERR_PTR(r); mem = mmap(NULL, VESA_MEM_SIZE, PROT_RW, MAP_ANON_NORESERVE, -1, 0); if (mem == MAP_FAILED) diff --git a/include/kvm/devices.h b/include/kvm/devices.h index 405f19521977..e445db6f56b1 100644 --- a/include/kvm/devices.h +++ b/include/kvm/devices.h @@ -3,6 +3,7 @@ #include #include +#include enum device_bus_type { DEVICE_BUS_PCI, @@ -18,7 +19,7 @@ struct device_header { struct rb_node node; }; -int device__register(struct device_header *dev); +int __must_check device__register(struct device_header *dev); void device__unregister(struct device_header *dev); struct device_header *device__find_dev(enum device_bus_type bus_type, u8 dev_num); diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h index 8c86b7151f25..62a719327e3f 100644 --- a/include/kvm/ioport.h +++ b/include/kvm/ioport.h @@ -33,11 +33,11 @@ struct ioport_operations { enum irq_type)); }; -void ioport__setup_arch(struct kvm *kvm); +int ioport__setup_arch(struct kvm *kvm); void ioport__map_irq(u8 *irq); -int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, - int count, void *param); +int __must_check ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, + int count, void *param); int ioport__unregister(struct kvm *kvm, u16 port); int ioport__init(struct kvm *kvm); int ioport__exit(struct kvm *kvm); diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h index c6dc6ef72d11..50119a8672eb 100644 --- a/include/kvm/kvm.h +++ b/include/kvm/kvm.h @@ 
-128,9 +128,9 @@ static inline int kvm__reserve_mem(struct kvm *kvm, u64 guest_phys, u64 size) KVM_MEM_TYPE_RESERVED); } -int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, - void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), - void *ptr); +int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, + void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), + void *ptr); bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr); void kvm__reboot(struct kvm *kvm); void kvm__pause(struct kvm *kvm); diff --git a/ioport.c b/ioport.c index a72e4035881a..cb778ed8d757 100644 --- a/ioport.c +++ b/ioport.c @@ -73,7 +73,7 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i entry = ioport_search(&ioport_tree, port); if (entry) { pr_warning("ioport re-registered: %x", port); - rb_int_erase(&ioport_tree, &entry->node); + ioport_remove(&ioport_tree, entry); } entry = malloc(sizeof(*entry)); @@ -91,16 +91,21 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i }; r = ioport_insert(&ioport_tree, entry); - if (r < 0) { - free(entry); - br_write_unlock(kvm); - return r; - } - - device__register(&entry->dev_hdr); + if (r < 0) + goto out_free; + r = device__register(&entry->dev_hdr); + if (r < 0) + goto out_remove; br_write_unlock(kvm); return port; + +out_remove: + ioport_remove(&ioport_tree, entry); +out_free: + free(entry); + br_write_unlock(kvm); + return r; } int ioport__unregister(struct kvm *kvm, u16 port) @@ -196,9 +201,7 @@ out: int ioport__init(struct kvm *kvm) { - ioport__setup_arch(kvm); - - return 0; + return ioport__setup_arch(kvm); } dev_base_init(ioport__init); diff --git a/mips/kvm.c b/mips/kvm.c index 211770da0d85..26355930d3b6 100644 --- a/mips/kvm.c +++ b/mips/kvm.c @@ -100,8 +100,9 @@ void kvm__irq_trigger(struct kvm *kvm, int irq) 
die_perror("KVM_IRQ_LINE ioctl"); } -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { + return 0; } bool kvm__arch_cpu_supports_vm(void) diff --git a/powerpc/ioport.c b/powerpc/ioport.c index 58dc625c54fe..0c188b61a51a 100644 --- a/powerpc/ioport.c +++ b/powerpc/ioport.c @@ -12,9 +12,10 @@ #include -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { /* PPC has no legacy ioports to set up */ + return 0; } void ioport__map_irq(u8 *irq) diff --git a/virtio/mmio.c b/virtio/mmio.c index 03cecc366292..5537c39367d6 100644 --- a/virtio/mmio.c +++ b/virtio/mmio.c @@ -292,13 +292,16 @@ int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { struct virtio_mmio *vmmio = vdev->virtio; + int r; vmmio->addr = virtio_mmio_get_io_space_block(VIRTIO_MMIO_IO_SIZE); vmmio->kvm = kvm; vmmio->dev = dev; - kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE, - false, virtio_mmio_mmio_callback, vdev); + r = kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE, + false, virtio_mmio_mmio_callback, vdev); + if (r < 0) + return r; vmmio->hdr = (struct virtio_mmio_hdr) { .magic = {'v', 'i', 'r', 't'}, @@ -313,7 +316,11 @@ int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, .data = generate_virtio_mmio_fdt_node, }; - device__register(&vmmio->dev_hdr); + r = device__register(&vmmio->dev_hdr); + if (r < 0) { + kvm__deregister_mmio(kvm, vmmio->addr); + return r; + } /* * Instantiate guest virtio-mmio devices using kernel command line diff --git a/x86/ioport.c b/x86/ioport.c index 8572c758ed4f..7ad7b8f3f497 100644 --- a/x86/ioport.c +++ b/x86/ioport.c @@ -69,50 +69,84 @@ void ioport__map_irq(u8 *irq) { } -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { + int r; + /* Legacy ioport setup */ /* 0000 - 001F - DMA1 controller */ - ioport__register(kvm, 0x0000, &dummy_read_write_ioport_ops, 32, NULL); + r 
= ioport__register(kvm, 0x0000, &dummy_read_write_ioport_ops, 32, NULL); + if (r < 0) + return r; /* 0x0020 - 0x003F - 8259A PIC 1 */ - ioport__register(kvm, 0x0020, &dummy_read_write_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x0020, &dummy_read_write_ioport_ops, 2, NULL); + if (r < 0) + return r; /* PORT 0040-005F - PIT - PROGRAMMABLE INTERVAL TIMER (8253, 8254) */ - ioport__register(kvm, 0x0040, &dummy_read_write_ioport_ops, 4, NULL); + r = ioport__register(kvm, 0x0040, &dummy_read_write_ioport_ops, 4, NULL); + if (r < 0) + return r; /* 0092 - PS/2 system control port A */ - ioport__register(kvm, 0x0092, &ps2_control_a_ops, 1, NULL); + r = ioport__register(kvm, 0x0092, &ps2_control_a_ops, 1, NULL); + if (r < 0) + return r; /* 0x00A0 - 0x00AF - 8259A PIC 2 */ - ioport__register(kvm, 0x00A0, &dummy_read_write_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x00A0, &dummy_read_write_ioport_ops, 2, NULL); + if (r < 0) + return r; /* 00C0 - 001F - DMA2 controller */ - ioport__register(kvm, 0x00C0, &dummy_read_write_ioport_ops, 32, NULL); + r = ioport__register(kvm, 0x00C0, &dummy_read_write_ioport_ops, 32, NULL); + if (r < 0) + return r; /* PORT 00E0-00EF are 'motherboard specific' so we use them for our internal debugging purposes. */ - ioport__register(kvm, IOPORT_DBG, &debug_ops, 1, NULL); + r = ioport__register(kvm, IOPORT_DBG, &debug_ops, 1, NULL); + if (r < 0) + return r; /* PORT 00ED - DUMMY PORT FOR DELAY??? 
*/ - ioport__register(kvm, 0x00ED, &dummy_write_only_ioport_ops, 1, NULL); + r = ioport__register(kvm, 0x00ED, &dummy_write_only_ioport_ops, 1, NULL); + if (r < 0) + return r; /* 0x00F0 - 0x00FF - Math co-processor */ - ioport__register(kvm, 0x00F0, &dummy_write_only_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x00F0, &dummy_write_only_ioport_ops, 2, NULL); + if (r < 0) + return r; /* PORT 0278-027A - PARALLEL PRINTER PORT (usually LPT1, sometimes LPT2) */ - ioport__register(kvm, 0x0278, &dummy_read_write_ioport_ops, 3, NULL); + r = ioport__register(kvm, 0x0278, &dummy_read_write_ioport_ops, 3, NULL); + if (r < 0) + return r; /* PORT 0378-037A - PARALLEL PRINTER PORT (usually LPT2, sometimes LPT3) */ - ioport__register(kvm, 0x0378, &dummy_read_write_ioport_ops, 3, NULL); + r = ioport__register(kvm, 0x0378, &dummy_read_write_ioport_ops, 3, NULL); + if (r < 0) + return r; /* PORT 03D4-03D5 - COLOR VIDEO - CRT CONTROL REGISTERS */ - ioport__register(kvm, 0x03D4, &dummy_read_write_ioport_ops, 1, NULL); - ioport__register(kvm, 0x03D5, &dummy_write_only_ioport_ops, 1, NULL); + r = ioport__register(kvm, 0x03D4, &dummy_read_write_ioport_ops, 1, NULL); + if (r < 0) + return r; + r = ioport__register(kvm, 0x03D5, &dummy_write_only_ioport_ops, 1, NULL); + if (r < 0) + return r; - ioport__register(kvm, 0x402, &seabios_debug_ops, 1, NULL); + r = ioport__register(kvm, 0x402, &seabios_debug_ops, 1, NULL); + if (r < 0) + return r; /* 0510 - QEMU BIOS configuration register */ - ioport__register(kvm, 0x510, &dummy_read_write_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x510, &dummy_read_write_ioport_ops, 2, NULL); + if (r < 0) + return r; + + return 0; } From patchwork Thu Mar 26 15:24:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460521 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by 
pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2DE0E1667 for ; Thu, 26 Mar 2020 15:25:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 178D32076A for ; Thu, 26 Mar 2020 15:25:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728441AbgCZPZO (ORCPT ); Thu, 26 Mar 2020 11:25:14 -0400 Received: from foss.arm.com ([217.140.110.172]:33806 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728383AbgCZPZN (ORCPT ); Thu, 26 Mar 2020 11:25:13 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9FBC07FA; Thu, 26 Mar 2020 08:25:12 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B6F363F71E; Thu, 26 Mar 2020 08:25:11 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 16/32] hw/vesa: Don't ignore fatal errors Date: Thu, 26 Mar 2020 15:24:22 +0000 Message-Id: <20200326152438.6218-17-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Failing an mmap call or failing to create a memslot means that device emulation will not work; don't ignore it. 
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- hw/vesa.c | 28 ++++++++++++++++++++-------- 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index b92cc990b730..ad09373ea2ff 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -63,22 +63,25 @@ struct framebuffer *vesa__init(struct kvm *kvm) if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; - r = pci_get_io_port_block(PCI_IO_SIZE); - r = ioport__register(kvm, r, &vesa_io_ops, PCI_IO_SIZE, NULL); + vesa_base_addr = pci_get_io_port_block(PCI_IO_SIZE); + r = ioport__register(kvm, vesa_base_addr, &vesa_io_ops, PCI_IO_SIZE, NULL); if (r < 0) - return ERR_PTR(r); + goto out_error; - vesa_base_addr = (u16)r; vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); r = device__register(&vesa_device); if (r < 0) - return ERR_PTR(r); + goto unregister_ioport; mem = mmap(NULL, VESA_MEM_SIZE, PROT_RW, MAP_ANON_NORESERVE, -1, 0); - if (mem == MAP_FAILED) - ERR_PTR(-errno); + if (mem == MAP_FAILED) { + r = -errno; + goto unregister_device; + } - kvm__register_dev_mem(kvm, VESA_MEM_ADDR, VESA_MEM_SIZE, mem); + r = kvm__register_dev_mem(kvm, VESA_MEM_ADDR, VESA_MEM_SIZE, mem); + if (r < 0) + goto unmap_dev; vesafb = (struct framebuffer) { .width = VESA_WIDTH, @@ -90,4 +93,13 @@ struct framebuffer *vesa__init(struct kvm *kvm) .kvm = kvm, }; return fb__register(&vesafb); + +unmap_dev: + munmap(mem, VESA_MEM_SIZE); +unregister_device: + device__unregister(&vesa_device); +unregister_ioport: + ioport__unregister(kvm, vesa_base_addr); +out_error: + return ERR_PTR(r); } From patchwork Thu Mar 26 15:24:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460519 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3964C92A for ; Thu, 26 Mar 2020 
15:25:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 249F520748 for ; Thu, 26 Mar 2020 15:25:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728449AbgCZPZP (ORCPT ); Thu, 26 Mar 2020 11:25:15 -0400 Received: from foss.arm.com ([217.140.110.172]:33814 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728426AbgCZPZO (ORCPT ); Thu, 26 Mar 2020 11:25:14 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C214B1045; Thu, 26 Mar 2020 08:25:13 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D376C3F71E; Thu, 26 Mar 2020 08:25:12 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 17/32] hw/vesa: Set the size for BAR 0 Date: Thu, 26 Mar 2020 15:24:23 +0000 Message-Id: <20200326152438.6218-18-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Implemented BARs have a non-zero address and a size. Let's set the size for BAR 0. 
Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- hw/vesa.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/vesa.c b/hw/vesa.c index ad09373ea2ff..dd59a112330b 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -69,6 +69,7 @@ struct framebuffer *vesa__init(struct kvm *kvm) goto out_error; vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); + vesa_pci_device.bar_size[0] = PCI_IO_SIZE; r = device__register(&vesa_device); if (r < 0) goto unregister_ioport; From patchwork Thu Mar 26 15:24:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460517 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D7E8D1667 for ; Thu, 26 Mar 2020 15:25:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B74872073E for ; Thu, 26 Mar 2020 15:25:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728426AbgCZPZP (ORCPT ); Thu, 26 Mar 2020 11:25:15 -0400 Received: from foss.arm.com ([217.140.110.172]:33830 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728446AbgCZPZP (ORCPT ); Thu, 26 Mar 2020 11:25:15 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DE24F7FA; Thu, 26 Mar 2020 08:25:14 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 01BC53F71E; Thu, 26 Mar 2020 08:25:13 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 18/32] ioport: Fail 
when registering overlapping ports Date: Thu, 26 Mar 2020 15:24:24 +0000 Message-Id: <20200326152438.6218-19-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If we try to register a range of ports which overlaps with another, already registered, I/O ports region then device emulation for that region will not work anymore. There's nothing sane that the ioport emulation layer can do in this case so refuse to allocate the port. This matches the behavior of kvm__register_mmio. There's no need to protect allocating a new ioport struct with a lock, so move the lock to protect the actual ioport insertion in the tree. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- ioport.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/ioport.c b/ioport.c index cb778ed8d757..d9f2e8ea3c3b 100644 --- a/ioport.c +++ b/ioport.c @@ -68,14 +68,6 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i struct ioport *entry; int r; - br_write_lock(kvm); - - entry = ioport_search(&ioport_tree, port); - if (entry) { - pr_warning("ioport re-registered: %x", port); - ioport_remove(&ioport_tree, entry); - } - entry = malloc(sizeof(*entry)); if (entry == NULL) return -ENOMEM; @@ -90,6 +82,7 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i }, }; + br_write_lock(kvm); r = ioport_insert(&ioport_tree, entry); if (r < 0) goto out_free; From patchwork Thu Mar 26 15:24:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460489 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by 
pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 644D492A for ; Thu, 26 Mar 2020 15:25:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 480F32076A for ; Thu, 26 Mar 2020 15:25:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728470AbgCZPZR (ORCPT ); Thu, 26 Mar 2020 11:25:17 -0400 Received: from foss.arm.com ([217.140.110.172]:33836 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728461AbgCZPZQ (ORCPT ); Thu, 26 Mar 2020 11:25:16 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0C4727FA; Thu, 26 Mar 2020 08:25:16 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 236CA3F71E; Thu, 26 Mar 2020 08:25:15 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 19/32] ioport: mmio: Use a mutex and reference counting for locking Date: Thu, 26 Mar 2020 15:24:25 +0000 Message-Id: <20200326152438.6218-20-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org kvmtool uses brlock for protecting accesses to the ioport and mmio red-black trees. brlock allows concurrent reads, but only one writer, which is assumed not to be a VCPU thread (for more information see commit 0b907ed2eaec ("kvm tools: Add a brlock")). This is done by issuing a compiler barrier on read and pausing the entire virtual machine on writes. 
When KVM_BRLOCK_DEBUG is defined, brlock instead uses a pthread read/write lock. When we implement reassignable BARs, the mmio or ioport mapping will be done as a result of a VCPU mmio access. When brlock is a pthread read/write lock, it means that we will try to acquire a write lock with the read lock already held by the same VCPU and we will deadlock. When it's not, a VCPU will have to call kvm__pause, which means the virtual machine will stay paused forever. Let's avoid all this by using a mutex and reference counting the red-black tree entries. This way we can guarantee that we won't unregister a node that another thread is currently using for emulation. Signed-off-by: Alexandru Elisei --- include/kvm/ioport.h | 2 + include/kvm/rbtree-interval.h | 4 +- ioport.c | 64 +++++++++++++++++------- mmio.c | 91 +++++++++++++++++++++++++---------- 4 files changed, 118 insertions(+), 43 deletions(-) diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h index 62a719327e3f..039633f76bdd 100644 --- a/include/kvm/ioport.h +++ b/include/kvm/ioport.h @@ -22,6 +22,8 @@ struct ioport { struct ioport_operations *ops; void *priv; struct device_header dev_hdr; + u32 refcount; + bool remove; }; struct ioport_operations { diff --git a/include/kvm/rbtree-interval.h b/include/kvm/rbtree-interval.h index 730eb5e8551d..17cd3b5f3199 100644 --- a/include/kvm/rbtree-interval.h +++ b/include/kvm/rbtree-interval.h @@ -6,7 +6,9 @@ #define RB_INT_INIT(l, h) \ (struct rb_int_node){.low = l, .high = h} -#define rb_int(n) rb_entry(n, struct rb_int_node, node) +#define rb_int(n) rb_entry(n, struct rb_int_node, node) +#define rb_int_start(n) ((n)->low) +#define rb_int_end(n) ((n)->low + (n)->high - 1) struct rb_int_node { struct rb_node node; diff --git a/ioport.c b/ioport.c index d9f2e8ea3c3b..5d7288809528 100644 --- a/ioport.c +++ b/ioport.c @@ -2,7 +2,6 @@ #include "kvm/kvm.h" #include "kvm/util.h" -#include "kvm/brlock.h" #include "kvm/rbtree-interval.h" #include "kvm/mutex.h" @@ -16,6 
+15,8 @@ #define ioport_node(n) rb_entry(n, struct ioport, node) +static DEFINE_MUTEX(ioport_lock); + static struct rb_root ioport_tree = RB_ROOT; static struct ioport *ioport_search(struct rb_root *root, u64 addr) @@ -39,6 +40,36 @@ static void ioport_remove(struct rb_root *root, struct ioport *data) rb_int_erase(root, &data->node); } +static struct ioport *ioport_get(struct rb_root *root, u64 addr) +{ + struct ioport *ioport; + + mutex_lock(&ioport_lock); + ioport = ioport_search(root, addr); + if (ioport) + ioport->refcount++; + mutex_unlock(&ioport_lock); + + return ioport; +} + +/* Called with ioport_lock held. */ +static void ioport_unregister(struct rb_root *root, struct ioport *data) +{ + device__unregister(&data->dev_hdr); + ioport_remove(root, data); + free(data); +} + +static void ioport_put(struct rb_root *root, struct ioport *data) +{ + mutex_lock(&ioport_lock); + data->refcount--; + if (data->remove && data->refcount == 0) + ioport_unregister(root, data); + mutex_unlock(&ioport_lock); +} + #ifdef CONFIG_HAS_LIBFDT static void generate_ioport_fdt_node(void *fdt, struct device_header *dev_hdr, @@ -80,16 +111,18 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i .bus_type = DEVICE_BUS_IOPORT, .data = generate_ioport_fdt_node, }, + .refcount = 0, + .remove = false, }; - br_write_lock(kvm); + mutex_lock(&ioport_lock); r = ioport_insert(&ioport_tree, entry); if (r < 0) goto out_free; r = device__register(&entry->dev_hdr); if (r < 0) goto out_remove; - br_write_unlock(kvm); + mutex_unlock(&ioport_lock); return port; @@ -97,7 +130,7 @@ out_remove: ioport_remove(&ioport_tree, entry); out_free: free(entry); - br_write_unlock(kvm); + mutex_unlock(&ioport_lock); return r; } @@ -106,22 +139,22 @@ int ioport__unregister(struct kvm *kvm, u16 port) struct ioport *entry; int r; - br_write_lock(kvm); + mutex_lock(&ioport_lock); r = -ENOENT; entry = ioport_search(&ioport_tree, port); if (!entry) goto done; - 
device__unregister(&entry->dev_hdr); - ioport_remove(&ioport_tree, entry); - - free(entry); + if (entry->refcount) + entry->remove = true; + else + ioport_unregister(&ioport_tree, entry); r = 0; done: - br_write_unlock(kvm); + mutex_unlock(&ioport_lock); return r; } @@ -136,9 +169,7 @@ static void ioport__unregister_all(void) while (rb) { rb_node = rb_int(rb); entry = ioport_node(rb_node); - device__unregister(&entry->dev_hdr); - ioport_remove(&ioport_tree, entry); - free(entry); + ioport_unregister(&ioport_tree, entry); rb = rb_first(&ioport_tree); } } @@ -164,8 +195,7 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, void *ptr = data; struct kvm *kvm = vcpu->kvm; - br_read_lock(kvm); - entry = ioport_search(&ioport_tree, port); + entry = ioport_get(&ioport_tree, port); if (!entry) goto out; @@ -180,9 +210,9 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, ptr += size; } -out: - br_read_unlock(kvm); + ioport_put(&ioport_tree, entry); +out: if (ret) return true; diff --git a/mmio.c b/mmio.c index 61e1d47a587d..57a4d5b9d64d 100644 --- a/mmio.c +++ b/mmio.c @@ -1,7 +1,7 @@ #include "kvm/kvm.h" #include "kvm/kvm-cpu.h" #include "kvm/rbtree-interval.h" -#include "kvm/brlock.h" +#include "kvm/mutex.h" #include #include @@ -15,10 +15,14 @@ #define mmio_node(n) rb_entry(n, struct mmio_mapping, node) +static DEFINE_MUTEX(mmio_lock); + struct mmio_mapping { struct rb_int_node node; void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr); void *ptr; + u32 refcount; + bool remove; }; static struct rb_root mmio_tree = RB_ROOT; @@ -51,6 +55,11 @@ static int mmio_insert(struct rb_root *root, struct mmio_mapping *data) return rb_int_insert(root, &data->node); } +static void mmio_remove(struct rb_root *root, struct mmio_mapping *data) +{ + rb_int_erase(root, &data->node); +} + static const char *to_direction(u8 is_write) { if (is_write) @@ -59,6 +68,41 @@ static const char 
*to_direction(u8 is_write) return "read"; } +static struct mmio_mapping *mmio_get(struct rb_root *root, u64 phys_addr, u32 len) +{ + struct mmio_mapping *mmio; + + mutex_lock(&mmio_lock); + mmio = mmio_search(root, phys_addr, len); + if (mmio) + mmio->refcount++; + mutex_unlock(&mmio_lock); + + return mmio; +} + +/* Called with mmio_lock held. */ +static void mmio_deregister(struct kvm *kvm, struct rb_root *root, struct mmio_mapping *mmio) +{ + struct kvm_coalesced_mmio_zone zone = (struct kvm_coalesced_mmio_zone) { + .addr = rb_int_start(&mmio->node), + .size = 1, + }; + ioctl(kvm->vm_fd, KVM_UNREGISTER_COALESCED_MMIO, &zone); + + mmio_remove(root, mmio); + free(mmio); +} + +static void mmio_put(struct kvm *kvm, struct rb_root *root, struct mmio_mapping *mmio) +{ + mutex_lock(&mmio_lock); + mmio->refcount--; + if (mmio->remove && mmio->refcount == 0) + mmio_deregister(kvm, root, mmio); + mutex_unlock(&mmio_lock); +} + int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), void *ptr) @@ -72,9 +116,11 @@ int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool c return -ENOMEM; *mmio = (struct mmio_mapping) { - .node = RB_INT_INIT(phys_addr, phys_addr + phys_addr_len), - .mmio_fn = mmio_fn, - .ptr = ptr, + .node = RB_INT_INIT(phys_addr, phys_addr + phys_addr_len), + .mmio_fn = mmio_fn, + .ptr = ptr, + .refcount = 0, + .remove = false, }; if (coalesce) { @@ -88,9 +134,9 @@ int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool c return -errno; } } - br_write_lock(kvm); + mutex_lock(&mmio_lock); ret = mmio_insert(&mmio_tree, mmio); - br_write_unlock(kvm); + mutex_unlock(&mmio_lock); return ret; } @@ -98,25 +144,20 @@ int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool c bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr) { struct mmio_mapping *mmio; - struct 
kvm_coalesced_mmio_zone zone; - br_write_lock(kvm); + mutex_lock(&mmio_lock); mmio = mmio_search_single(&mmio_tree, phys_addr); if (mmio == NULL) { - br_write_unlock(kvm); + mutex_unlock(&mmio_lock); return false; } - zone = (struct kvm_coalesced_mmio_zone) { - .addr = phys_addr, - .size = 1, - }; - ioctl(kvm->vm_fd, KVM_UNREGISTER_COALESCED_MMIO, &zone); + if (mmio->refcount) + mmio->remove = true; + else + mmio_deregister(kvm, &mmio_tree, mmio); + mutex_unlock(&mmio_lock); - rb_int_erase(&mmio_tree, &mmio->node); - br_write_unlock(kvm); - - free(mmio); return true; } @@ -124,18 +165,18 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u { struct mmio_mapping *mmio; - br_read_lock(vcpu->kvm); - mmio = mmio_search(&mmio_tree, phys_addr, len); - - if (mmio) - mmio->mmio_fn(vcpu, phys_addr, data, len, is_write, mmio->ptr); - else { + mmio = mmio_get(&mmio_tree, phys_addr, len); + if (!mmio) { if (vcpu->kvm->cfg.mmio_debug) fprintf(stderr, "Warning: Ignoring MMIO %s at %016llx (length %u)\n", to_direction(is_write), (unsigned long long)phys_addr, len); + goto out; } - br_read_unlock(vcpu->kvm); + mmio->mmio_fn(vcpu, phys_addr, data, len, is_write, mmio->ptr); + mmio_put(vcpu->kvm, &mmio_tree, mmio); + +out: return true; } From patchwork Thu Mar 26 15:24:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460491 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 007821667 for ; Thu, 26 Mar 2020 15:25:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E113920774 for ; Thu, 26 Mar 2020 15:25:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728329AbgCZPZT (ORCPT ); Thu, 26 Mar 2020 11:25:19 -0400 Received: from 
foss.arm.com ([217.140.110.172]:33844 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728428AbgCZPZR (ORCPT ); Thu, 26 Mar 2020 11:25:17 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 28A3B1045; Thu, 26 Mar 2020 08:25:17 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 3FA2A3F71E; Thu, 26 Mar 2020 08:25:16 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 20/32] pci: Add helpers for BAR values and memory/IO space access Date: Thu, 26 Mar 2020 15:24:26 +0000 Message-Id: <20200326152438.6218-21-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org We're going to be checking the BAR type, the address written to it and if access to memory or I/O space is enabled quite often when we add support for reassignable BARs; make our life easier by adding helpers for it. 
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- include/kvm/pci.h | 53 +++++++++++++++++++++++++++++++++++++++++++++ pci.c | 4 ++-- powerpc/spapr_pci.c | 2 +- 3 files changed, 56 insertions(+), 3 deletions(-) diff --git a/include/kvm/pci.h b/include/kvm/pci.h index ccb155e3e8fe..adb4b5c082d5 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -5,6 +5,7 @@ #include #include #include +#include #include "kvm/devices.h" #include "kvm/msi.h" @@ -161,4 +162,56 @@ void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, void *pci_find_cap(struct pci_device_header *hdr, u8 cap_type); +static inline bool __pci__memory_space_enabled(u16 command) +{ + return command & PCI_COMMAND_MEMORY; +} + +static inline bool pci__memory_space_enabled(struct pci_device_header *pci_hdr) +{ + return __pci__memory_space_enabled(pci_hdr->command); +} + +static inline bool __pci__io_space_enabled(u16 command) +{ + return command & PCI_COMMAND_IO; +} + +static inline bool pci__io_space_enabled(struct pci_device_header *pci_hdr) +{ + return __pci__io_space_enabled(pci_hdr->command); +} + +static inline bool __pci__bar_is_io(u32 bar) +{ + return bar & PCI_BASE_ADDRESS_SPACE_IO; +} + +static inline bool pci__bar_is_io(struct pci_device_header *pci_hdr, int bar_num) +{ + return __pci__bar_is_io(pci_hdr->bar[bar_num]); +} + +static inline bool pci__bar_is_memory(struct pci_device_header *pci_hdr, int bar_num) +{ + return !pci__bar_is_io(pci_hdr, bar_num); +} + +static inline u32 __pci__bar_address(u32 bar) +{ + if (__pci__bar_is_io(bar)) + return bar & PCI_BASE_ADDRESS_IO_MASK; + return bar & PCI_BASE_ADDRESS_MEM_MASK; +} + +static inline u32 pci__bar_address(struct pci_device_header *pci_hdr, int bar_num) +{ + return __pci__bar_address(pci_hdr->bar[bar_num]); +} + +static inline u32 pci__bar_size(struct pci_device_header *pci_hdr, int bar_num) +{ + return pci_hdr->bar_size[bar_num]; +} + #endif /* KVM__PCI_H */ diff --git a/pci.c b/pci.c index 
b6892d974c08..7399c76c0819 100644 --- a/pci.c +++ b/pci.c @@ -185,7 +185,7 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, * size, it will write the address back. */ if (bar < 6) { - if (pci_hdr->bar[bar] & PCI_BASE_ADDRESS_SPACE_IO) + if (pci__bar_is_io(pci_hdr, bar)) mask = (u32)PCI_BASE_ADDRESS_IO_MASK; else mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; @@ -211,7 +211,7 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, */ memcpy(&value, data, size); if (value == 0xffffffff) - value = ~(pci_hdr->bar_size[bar] - 1); + value = ~(pci__bar_size(pci_hdr, bar) - 1); /* Preserve the special bits. */ value = (value & mask) | (pci_hdr->bar[bar] & ~mask); memcpy(base + offset, &value, size); diff --git a/powerpc/spapr_pci.c b/powerpc/spapr_pci.c index a15f7d895a46..7be44d950acb 100644 --- a/powerpc/spapr_pci.c +++ b/powerpc/spapr_pci.c @@ -369,7 +369,7 @@ int spapr_populate_pci_devices(struct kvm *kvm, of_pci_b_ddddd(devid) | of_pci_b_fff(fn) | of_pci_b_rrrrrrrr(bars[i])); - reg[n+1].size = cpu_to_be64(hdr->bar_size[i]); + reg[n+1].size = cpu_to_be64(pci__bar_size(hdr, i)); reg[n+1].addr = 0; assigned_addresses[n].phys_hi = cpu_to_be32( From patchwork Thu Mar 26 15:24:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460493 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AF8D11667 for ; Thu, 26 Mar 2020 15:25:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8730A20774 for ; Thu, 26 Mar 2020 15:25:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728478AbgCZPZU (ORCPT ); Thu, 26 Mar 2020 11:25:20 -0400 Received: from foss.arm.com ([217.140.110.172]:33860 "EHLO foss.arm.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728477AbgCZPZT (ORCPT ); Thu, 26 Mar 2020 11:25:19 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 448C27FA; Thu, 26 Mar 2020 08:25:18 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5CDFA3F71E; Thu, 26 Mar 2020 08:25:17 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 21/32] virtio/pci: Get emulated region address from BARs Date: Thu, 26 Mar 2020 15:24:27 +0000 Message-Id: <20200326152438.6218-22-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The struct virtio_pci fields port_addr, mmio_addr and msix_io_block represent the same addresses that are written in the corresponding BARs. Remove this duplication of information and always use the address from the BAR. This will make our life a lot easier when we add support for reassignable BARs, because we won't have to update the fields on each BAR change. No functional changes. 
Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- include/kvm/virtio-pci.h | 3 -- virtio/pci.c | 82 +++++++++++++++++++++++++--------------- 2 files changed, 52 insertions(+), 33 deletions(-) diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h index 278a25950d8b..959b4b81c871 100644 --- a/include/kvm/virtio-pci.h +++ b/include/kvm/virtio-pci.h @@ -24,8 +24,6 @@ struct virtio_pci { void *dev; struct kvm *kvm; - u16 port_addr; - u32 mmio_addr; u8 status; u8 isr; u32 features; @@ -43,7 +41,6 @@ struct virtio_pci { u32 config_gsi; u32 vq_vector[VIRTIO_PCI_MAX_VQ]; u32 gsis[VIRTIO_PCI_MAX_VQ]; - u32 msix_io_block; u64 msix_pba; struct msix_table msix_table[VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG]; diff --git a/virtio/pci.c b/virtio/pci.c index 281c31817598..d111dc499f5e 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -13,6 +13,21 @@ #include #include +static u16 virtio_pci__port_addr(struct virtio_pci *vpci) +{ + return pci__bar_address(&vpci->pci_hdr, 0); +} + +static u32 virtio_pci__mmio_addr(struct virtio_pci *vpci) +{ + return pci__bar_address(&vpci->pci_hdr, 1); +} + +static u32 virtio_pci__msix_io_addr(struct virtio_pci *vpci) +{ + return pci__bar_address(&vpci->pci_hdr, 2); +} + static void virtio_pci__ioevent_callback(struct kvm *kvm, void *param) { struct virtio_pci_ioevent_param *ioeventfd = param; @@ -25,6 +40,8 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde { struct ioevent ioevent; struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr = virtio_pci__mmio_addr(vpci); + u16 port_addr = virtio_pci__port_addr(vpci); int r, flags = 0; int fd; @@ -48,7 +65,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde flags |= IOEVENTFD_FLAG_USER_POLL; /* ioport */ - ioevent.io_addr = vpci->port_addr + VIRTIO_PCI_QUEUE_NOTIFY; + ioevent.io_addr = port_addr + VIRTIO_PCI_QUEUE_NOTIFY; ioevent.io_len = sizeof(u16); ioevent.fd = fd = eventfd(0, 0); r = 
ioeventfd__add_event(&ioevent, flags | IOEVENTFD_FLAG_PIO); @@ -56,7 +73,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde return r; /* mmio */ - ioevent.io_addr = vpci->mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY; + ioevent.io_addr = mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY; ioevent.io_len = sizeof(u16); ioevent.fd = eventfd(0, 0); r = ioeventfd__add_event(&ioevent, flags); @@ -68,7 +85,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde return 0; free_ioport_evt: - ioeventfd__del_event(vpci->port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); + ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); return r; } @@ -76,9 +93,11 @@ static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq) { struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr = virtio_pci__mmio_addr(vpci); + u16 port_addr = virtio_pci__port_addr(vpci); - ioeventfd__del_event(vpci->mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); - ioeventfd__del_event(vpci->port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); + ioeventfd__del_event(mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); + ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); virtio_exit_vq(kvm, vdev, vpci->dev, vq); } @@ -162,7 +181,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p { struct virtio_device *vdev = ioport->priv; struct virtio_pci *vpci = vdev->virtio; - unsigned long offset = port - vpci->port_addr; + unsigned long offset = port - virtio_pci__port_addr(vpci); return virtio_pci__data_in(vcpu, vdev, offset, data, size); } @@ -318,7 +337,7 @@ static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 { struct virtio_device *vdev = ioport->priv; struct virtio_pci *vpci = vdev->virtio; - unsigned long offset = port - vpci->port_addr; + unsigned long offset = port - virtio_pci__port_addr(vpci); return virtio_pci__data_out(vcpu, vdev, offset, data, size); } @@ -335,17 +354,18 @@ static void 
virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu, struct virtio_device *vdev = ptr; struct virtio_pci *vpci = vdev->virtio; struct msix_table *table; + u32 msix_io_addr = virtio_pci__msix_io_addr(vpci); int vecnum; size_t offset; - if (addr > vpci->msix_io_block + PCI_IO_SIZE) { + if (addr > msix_io_addr + PCI_IO_SIZE) { if (is_write) return; table = (struct msix_table *)&vpci->msix_pba; - offset = addr - (vpci->msix_io_block + PCI_IO_SIZE); + offset = addr - (msix_io_addr + PCI_IO_SIZE); } else { table = vpci->msix_table; - offset = addr - vpci->msix_io_block; + offset = addr - msix_io_addr; } vecnum = offset / sizeof(struct msix_table); offset = offset % sizeof(struct msix_table); @@ -434,19 +454,20 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, { struct virtio_device *vdev = ptr; struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr = virtio_pci__mmio_addr(vpci); if (!is_write) - virtio_pci__data_in(vcpu, vdev, addr - vpci->mmio_addr, - data, len); + virtio_pci__data_in(vcpu, vdev, addr - mmio_addr, data, len); else - virtio_pci__data_out(vcpu, vdev, addr - vpci->mmio_addr, - data, len); + virtio_pci__data_out(vcpu, vdev, addr - mmio_addr, data, len); } int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr, msix_io_block; + u16 port_addr; int r; vpci->kvm = kvm; @@ -454,20 +475,21 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); - r = pci_get_io_port_block(PCI_IO_SIZE); - r = ioport__register(kvm, r, &virtio_pci__io_ops, PCI_IO_SIZE, vdev); + port_addr = pci_get_io_port_block(PCI_IO_SIZE); + r = ioport__register(kvm, port_addr, &virtio_pci__io_ops, PCI_IO_SIZE, + vdev); if (r < 0) return r; - vpci->port_addr = (u16)r; + port_addr = (u16)r; - vpci->mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); - r = kvm__register_mmio(kvm, vpci->mmio_addr, 
PCI_IO_SIZE, false, + mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); + r = kvm__register_mmio(kvm, mmio_addr, PCI_IO_SIZE, false, virtio_pci__io_mmio_callback, vdev); if (r < 0) goto free_ioport; - vpci->msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); - r = kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE * 2, false, + msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); + r = kvm__register_mmio(kvm, msix_io_block, PCI_IO_SIZE * 2, false, virtio_pci__msix_mmio_callback, vdev); if (r < 0) goto free_mmio; @@ -483,11 +505,11 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, .class[2] = (class >> 16) & 0xff, .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), .subsys_id = cpu_to_le16(subsys_id), - .bar[0] = cpu_to_le32(vpci->port_addr + .bar[0] = cpu_to_le32(port_addr | PCI_BASE_ADDRESS_SPACE_IO), - .bar[1] = cpu_to_le32(vpci->mmio_addr + .bar[1] = cpu_to_le32(mmio_addr | PCI_BASE_ADDRESS_SPACE_MEMORY), - .bar[2] = cpu_to_le32(vpci->msix_io_block + .bar[2] = cpu_to_le32(msix_io_block | PCI_BASE_ADDRESS_SPACE_MEMORY), .status = cpu_to_le16(PCI_STATUS_CAP_LIST), .capabilities = (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr, @@ -534,11 +556,11 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, return 0; free_msix_mmio: - kvm__deregister_mmio(kvm, vpci->msix_io_block); + kvm__deregister_mmio(kvm, msix_io_block); free_mmio: - kvm__deregister_mmio(kvm, vpci->mmio_addr); + kvm__deregister_mmio(kvm, mmio_addr); free_ioport: - ioport__unregister(kvm, vpci->port_addr); + ioport__unregister(kvm, port_addr); return r; } @@ -558,9 +580,9 @@ int virtio_pci__exit(struct kvm *kvm, struct virtio_device *vdev) struct virtio_pci *vpci = vdev->virtio; virtio_pci__reset(kvm, vdev); - kvm__deregister_mmio(kvm, vpci->mmio_addr); - kvm__deregister_mmio(kvm, vpci->msix_io_block); - ioport__unregister(kvm, vpci->port_addr); + kvm__deregister_mmio(kvm, virtio_pci__mmio_addr(vpci)); + 
kvm__deregister_mmio(kvm, virtio_pci__msix_io_addr(vpci)); + ioport__unregister(kvm, virtio_pci__port_addr(vpci)); return 0; } From patchwork Thu Mar 26 15:24:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460515 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D242E1667 for ; Thu, 26 Mar 2020 15:25:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B4FCA2073E for ; Thu, 26 Mar 2020 15:25:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728464AbgCZPZU (ORCPT ); Thu, 26 Mar 2020 11:25:20 -0400 Received: from foss.arm.com ([217.140.110.172]:33866 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728445AbgCZPZT (ORCPT ); Thu, 26 Mar 2020 11:25:19 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5E93F1045; Thu, 26 Mar 2020 08:25:19 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 784E03F71E; Thu, 26 Mar 2020 08:25:18 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 22/32] vfio: Destroy memslot when unmapping the associated VAs Date: Thu, 26 Mar 2020 15:24:28 +0000 Message-Id: <20200326152438.6218-23-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk 
List-ID: X-Mailing-List: kvm@vger.kernel.org

When we want to map a device region into the guest address space, first we perform an mmap on the device fd. The resulting VMA is a mapping between host userspace addresses and physical addresses associated with the device. Next, we create a memslot, which populates the stage 2 table with the mappings between guest physical addresses and the device physical addresses.

However, when we want to unmap the device from the guest address space, we only call munmap, which destroys the VMA and the stage 2 mappings, but doesn't destroy the memslot and kvmtool's internal mem_bank structure associated with the memslot.

This has been perfectly fine so far, because we only unmap a device region when we exit kvmtool. This will change when we add support for reassignable BARs, and we will have to unmap vfio regions as the guest kernel writes new addresses in the BARs. This can lead to two possible problems:

- We refuse to create a valid BAR mapping because of a stale mem_bank structure which belonged to a previously unmapped region.

- It is possible that the mmap in vfio_map_region returns the same address that was used to create a memslot, but was unmapped by vfio_unmap_region. Guest accesses to the device memory will fault because the stage 2 mappings are missing, and this can lead to performance degradation.

Let's do the right thing and destroy the memslot and the mem_bank struct associated with it when we unmap a vfio region. Set host_addr to NULL after the munmap call so we won't try to unmap an address which is currently used by the process for something else if vfio_unmap_region gets called twice.
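For illustration only, and not part of this patch: a minimal sketch of the "clear host_addr after munmap" idea, so that a second unmap call becomes a no-op instead of unmapping whatever the process has placed at that address in the meantime. struct region, region_map() and region_unmap() are hypothetical stand-ins, and an anonymous mapping is assumed in place of a mapping of the device fd.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the struct vfio_region fields used here. */
struct region {
	void *host_addr;	/* NULL when the region is not mapped */
	size_t size;
};

/* Map an anonymous region; a real device would mmap the device fd. */
static int region_map(struct region *r, size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	r->host_addr = p;
	r->size = size;
	return 0;
}

/* Returns 1 if something was unmapped, 0 if there was nothing to do. */
static int region_unmap(struct region *r)
{
	assert(r != NULL);
	if (!r->host_addr)
		return 0;
	/* The memslot would be destroyed here, before the VMA goes away. */
	munmap(r->host_addr, r->size);
	r->host_addr = NULL;	/* a second call must not touch this address */
	return 1;
}
```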
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- include/kvm/kvm.h | 4 ++ kvm.c | 101 ++++++++++++++++++++++++++++++++++++++++------ vfio/core.c | 6 +++ 3 files changed, 99 insertions(+), 12 deletions(-) diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h index 50119a8672eb..9428f57a1c0c 100644 --- a/include/kvm/kvm.h +++ b/include/kvm/kvm.h @@ -1,6 +1,7 @@ #ifndef KVM__KVM_H #define KVM__KVM_H +#include "kvm/mutex.h" #include "kvm/kvm-arch.h" #include "kvm/kvm-config.h" #include "kvm/util-init.h" @@ -56,6 +57,7 @@ struct kvm_mem_bank { void *host_addr; u64 size; enum kvm_mem_type type; + u32 slot; }; struct kvm { @@ -72,6 +74,7 @@ struct kvm { u64 ram_size; void *ram_start; u64 ram_pagesize; + struct mutex mem_banks_lock; struct list_head mem_banks; bool nmi_disabled; @@ -106,6 +109,7 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level); void kvm__irq_trigger(struct kvm *kvm, int irq); bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, int size, u32 count); bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u8 is_write); +int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr); int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr, enum kvm_mem_type type); static inline int kvm__register_ram(struct kvm *kvm, u64 guest_phys, u64 size, diff --git a/kvm.c b/kvm.c index 57c4ff98ec4c..26f6b9bc58a3 100644 --- a/kvm.c +++ b/kvm.c @@ -157,6 +157,7 @@ struct kvm *kvm__new(void) if (!kvm) return ERR_PTR(-ENOMEM); + mutex_init(&kvm->mem_banks_lock); kvm->sys_fd = -1; kvm->vm_fd = -1; @@ -183,20 +184,84 @@ int kvm__exit(struct kvm *kvm) } core_exit(kvm__exit); +int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, + void *userspace_addr) +{ + struct kvm_userspace_memory_region mem; + struct kvm_mem_bank *bank; + int ret; + + mutex_lock(&kvm->mem_banks_lock); + list_for_each_entry(bank, &kvm->mem_banks, list) + if 
(bank->guest_phys_addr == guest_phys && + bank->size == size && bank->host_addr == userspace_addr) + break; + + if (&bank->list == &kvm->mem_banks) { + pr_err("Region [%llx-%llx] not found", guest_phys, + guest_phys + size - 1); + ret = -EINVAL; + goto out; + } + + if (bank->type == KVM_MEM_TYPE_RESERVED) { + pr_err("Cannot delete reserved region [%llx-%llx]", + guest_phys, guest_phys + size - 1); + ret = -EINVAL; + goto out; + } + + mem = (struct kvm_userspace_memory_region) { + .slot = bank->slot, + .guest_phys_addr = guest_phys, + .memory_size = 0, + .userspace_addr = (unsigned long)userspace_addr, + }; + + ret = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &mem); + if (ret < 0) { + ret = -errno; + goto out; + } + + list_del(&bank->list); + free(bank); + kvm->mem_slots--; + ret = 0; + +out: + mutex_unlock(&kvm->mem_banks_lock); + return ret; +} + int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr, enum kvm_mem_type type) { struct kvm_userspace_memory_region mem; struct kvm_mem_bank *merged = NULL; struct kvm_mem_bank *bank; + struct list_head *prev_entry; + u32 slot; int ret; - /* Check for overlap */ + mutex_lock(&kvm->mem_banks_lock); + /* Check for overlap and find first empty slot. */ + slot = 0; + prev_entry = &kvm->mem_banks; list_for_each_entry(bank, &kvm->mem_banks, list) { u64 bank_end = bank->guest_phys_addr + bank->size - 1; u64 end = guest_phys + size - 1; - if (guest_phys > bank_end || end < bank->guest_phys_addr) + if (guest_phys > bank_end || end < bank->guest_phys_addr) { + /* + * Keep the banks sorted ascending by slot, so it's + * easier for us to find a free slot. 
+ */ + if (bank->slot == slot) { + slot++; + prev_entry = &bank->list; + } continue; + } /* Merge overlapping reserved regions */ if (bank->type == KVM_MEM_TYPE_RESERVED && @@ -226,38 +291,50 @@ int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, kvm_mem_type_to_string(bank->type), bank->guest_phys_addr, bank->guest_phys_addr + bank->size - 1); - return -EINVAL; + ret = -EINVAL; + goto out; } - if (merged) - return 0; + if (merged) { + ret = 0; + goto out; + } bank = malloc(sizeof(*bank)); - if (!bank) - return -ENOMEM; + if (!bank) { + ret = -ENOMEM; + goto out; + } INIT_LIST_HEAD(&bank->list); bank->guest_phys_addr = guest_phys; bank->host_addr = userspace_addr; bank->size = size; bank->type = type; + bank->slot = slot; if (type != KVM_MEM_TYPE_RESERVED) { mem = (struct kvm_userspace_memory_region) { - .slot = kvm->mem_slots++, + .slot = slot, .guest_phys_addr = guest_phys, .memory_size = size, .userspace_addr = (unsigned long)userspace_addr, }; ret = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &mem); - if (ret < 0) - return -errno; + if (ret < 0) { + ret = -errno; + goto out; + } } - list_add(&bank->list, &kvm->mem_banks); + list_add(&bank->list, prev_entry); + kvm->mem_slots++; + ret = 0; - return 0; +out: + mutex_unlock(&kvm->mem_banks_lock); + return ret; } void *guest_flat_to_host(struct kvm *kvm, u64 offset) diff --git a/vfio/core.c b/vfio/core.c index 0ed1e6fee6bf..b80ebf3bb913 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -256,8 +256,14 @@ int vfio_map_region(struct kvm *kvm, struct vfio_device *vdev, void vfio_unmap_region(struct kvm *kvm, struct vfio_region *region) { + u64 map_size; + if (region->host_addr) { + map_size = ALIGN(region->info.size, PAGE_SIZE); + kvm__destroy_mem(kvm, region->guest_phys_addr, map_size, + region->host_addr); munmap(region->host_addr, region->info.size); + region->host_addr = NULL; } else if (region->is_ioport) { ioport__unregister(kvm, region->port_base); } else { From patchwork Thu Mar 26 15:24:29 2020 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460495 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11CDF92A for ; Thu, 26 Mar 2020 15:25:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E601020787 for ; Thu, 26 Mar 2020 15:25:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728486AbgCZPZV (ORCPT ); Thu, 26 Mar 2020 11:25:21 -0400 Received: from foss.arm.com ([217.140.110.172]:33874 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728461AbgCZPZU (ORCPT ); Thu, 26 Mar 2020 11:25:20 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7C0E37FA; Thu, 26 Mar 2020 08:25:20 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 945843F71E; Thu, 26 Mar 2020 08:25:19 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 23/32] vfio: Reserve ioports when configuring the BAR Date: Thu, 26 Mar 2020 15:24:29 +0000 Message-Id: <20200326152438.6218-24-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's be consistent and reserve ioports when we are configuring the BAR, not when we map it, just like we do with mmio regions. 
Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- vfio/core.c | 9 +++------ vfio/pci.c | 4 +++- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/vfio/core.c b/vfio/core.c index b80ebf3bb913..bad3c7c8cd26 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -202,14 +202,11 @@ static int vfio_setup_trap_region(struct kvm *kvm, struct vfio_device *vdev, struct vfio_region *region) { if (region->is_ioport) { - int port = pci_get_io_port_block(region->info.size); - - port = ioport__register(kvm, port, &vfio_ioport_ops, - region->info.size, region); + int port = ioport__register(kvm, region->port_base, + &vfio_ioport_ops, region->info.size, + region); if (port < 0) return port; - - region->port_base = port; return 0; } diff --git a/vfio/pci.c b/vfio/pci.c index 4412c6d7a862..fe02574390f6 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -886,7 +886,9 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, } } - if (!region->is_ioport) { + if (region->is_ioport) { + region->port_base = pci_get_io_port_block(region->info.size); + } else { /* Grab some MMIO space in the guest */ map_size = ALIGN(region->info.size, PAGE_SIZE); region->guest_phys_addr = pci_get_mmio_block(map_size); From patchwork Thu Mar 26 15:24:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460511 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D2AE917EA for ; Thu, 26 Mar 2020 15:25:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B2A8E20774 for ; Thu, 26 Mar 2020 15:25:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728421AbgCZPZX (ORCPT ); Thu, 26 Mar 2020 11:25:23 -0400 Received: from foss.arm.com ([217.140.110.172]:33882 
"EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728495AbgCZPZV (ORCPT ); Thu, 26 Mar 2020 11:25:21 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 963841045; Thu, 26 Mar 2020 08:25:21 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AF1FA3F71E; Thu, 26 Mar 2020 08:25:20 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 24/32] pci: Limit configuration transaction size to 32 bits Date: Thu, 26 Mar 2020 15:24:30 +0000 Message-Id: <20200326152438.6218-25-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From PCI Local Bus Specification Revision 3.0. section 3.8 "64-Bit Bus Extension": "The bandwidth requirements for I/O and configuration transactions cannot justify the added complexity, and, therefore, only memory transactions support 64-bit data transfers". Further down, the spec also describes the possible responses of a target which has been requested to do a 64-bit transaction. Limit the transaction to the lower 32 bits, to match the second accepted behaviour. 
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- pci.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/pci.c b/pci.c index 7399c76c0819..611e2c0bf1da 100644 --- a/pci.c +++ b/pci.c @@ -119,6 +119,9 @@ static bool pci_config_data_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 { union pci_config_address pci_config_address; + if (size > 4) + size = 4; + pci_config_address.w = ioport__read32(&pci_config_address_bits); /* * If someone accesses PCI configuration space offsets that are not @@ -135,6 +138,9 @@ static bool pci_config_data_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 { union pci_config_address pci_config_address; + if (size > 4) + size = 4; + pci_config_address.w = ioport__read32(&pci_config_address_bits); /* * If someone accesses PCI configuration space offsets that are not @@ -248,6 +254,9 @@ static void pci_config_mmio_access(struct kvm_cpu *vcpu, u64 addr, u8 *data, cfg_addr.w = (u32)addr; cfg_addr.enable_bit = 1; + if (len > 4) + len = 4; + if (is_write) pci__config_wr(kvm, cfg_addr, data, len); else From patchwork Thu Mar 26 15:24:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460497 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2F80392A for ; Thu, 26 Mar 2020 15:25:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1B2132076A for ; Thu, 26 Mar 2020 15:25:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728513AbgCZPZX (ORCPT ); Thu, 26 Mar 2020 11:25:23 -0400 Received: from foss.arm.com ([217.140.110.172]:33890 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728466AbgCZPZX (ORCPT ); Thu, 26 Mar 2020 11:25:23 -0400 Received: from 
usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B093F7FA; Thu, 26 Mar 2020 08:25:22 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C9A183F71E; Thu, 26 Mar 2020 08:25:21 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 25/32] vfio/pci: Don't write configuration value twice Date: Thu, 26 Mar 2020 15:24:31 +0000 Message-Id: <20200326152438.6218-26-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

After writing to the device fd as part of the PCI configuration space emulation, we read back from the device to make sure that the write finished. The value is read back into the PCI configuration space and afterwards, the same value is copied by the PCI emulation code. Let's read from the device fd into a temporary variable, to prevent this double write.

The double write is harmless in itself. But when we implement reassignable BARs, we need to keep track of the old BAR value, and the VFIO code is overwriting it.
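For illustration only, and not part of this patch: a minimal sketch of reading the value back into a scratch variable, so the shadow copy of the configuration space is left for the generic PCI code to update. cfg_write_through() and scratch_fd() are hypothetical helpers, and a temporary file is assumed to stand in for the VFIO device fd.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the device fd: an anonymous temporary file. */
static int scratch_fd(void)
{
	FILE *f = tmpfile();

	return f ? fileno(f) : -1;
}

/*
 * Forward a config space write of sz bytes (sz <= 4) to the backing fd,
 * then read the bytes back into a scratch variable instead of into the
 * shadow header. Returns 0 on success, -1 on failure.
 */
static int cfg_write_through(int fd, off_t base, unsigned int offset,
			     const void *data, size_t sz)
{
	uint32_t tmp = 0;

	assert(sz <= sizeof(tmp));
	if (pwrite(fd, data, sz, base + offset) != (ssize_t)sz)
		return -1;
	/* Read back only to confirm the write; the shadow copy is untouched. */
	if (pread(fd, &tmp, sz, base + offset) != (ssize_t)sz)
		return -1;
	return 0;
}
```

With this shape the caller still sees the old BAR value in its shadow header after the call returns, and can compare it against the new one.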
Signed-off-by: Alexandru Elisei --- vfio/pci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/vfio/pci.c b/vfio/pci.c index fe02574390f6..8b2a0c8dbac3 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -470,7 +470,7 @@ static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hd struct vfio_region_info *info; struct vfio_pci_device *pdev; struct vfio_device *vdev; - void *base = pci_hdr; + u32 tmp; if (offset == PCI_ROM_ADDRESS) return; @@ -490,7 +490,7 @@ static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hd if (pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSI) vfio_pci_msi_cap_write(kvm, vdev, offset, data, sz); - if (pread(vdev->fd, base + offset, sz, info->offset + offset) != sz) + if (pread(vdev->fd, &tmp, sz, info->offset + offset) != sz) vfio_dev_warn(vdev, "Failed to read %d bytes from Configuration Space at 0x%x", sz, offset); } From patchwork Thu Mar 26 15:24:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460499 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4D0491667 for ; Thu, 26 Mar 2020 15:25:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2D7A220774 for ; Thu, 26 Mar 2020 15:25:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728424AbgCZPZZ (ORCPT ); Thu, 26 Mar 2020 11:25:25 -0400 Received: from foss.arm.com ([217.140.110.172]:33898 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728466AbgCZPZY (ORCPT ); Thu, 26 Mar 2020 11:25:24 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id CA9EC1045; Thu, 26 Mar 2020 08:25:23 -0700 (PDT) Received: 
from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E3F623F71E; Thu, 26 Mar 2020 08:25:22 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 26/32] vesa: Create device exactly once Date: Thu, 26 Mar 2020 15:24:32 +0000 Message-Id: <20200326152438.6218-27-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org A vesa device is used by the SDL, GTK or VNC framebuffers. Don't allow the user to specify more than one of these options because kvmtool will create identical devices and bad things will happen: $ ./lkvm run -c2 -m2048 -k bzImage --sdl --gtk # lkvm run -k bzImage -m 2048 -c 2 --name guest-10159 Error: device region [d0000000-d012bfff] would overlap device region [d0000000-d012bfff] *** Error in `./lkvm': free(): invalid pointer: 0x00007fad78002e40 *** *** Error in `./lkvm': free(): invalid pointer: 0x00007fad78002e40 *** *** Error in `./lkvm': free(): invalid pointer: 0x00007fad78002e40 *** ======= Backtrace: ========= ======= Backtrace: ========= /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fae0ed447e5] ======= Backtrace: ========= /lib/x86_64-linux-gnu/libc.so.6/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7fae0ed4d37a] (+0x777e5)[0x7fae0ed447e5] /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fae0ed447e5] /lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7fae0ed4d37a] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7fae0ed5153c] *** Error in `./lkvm': free(): invalid pointer: 0x00007fad78002e40 *** /lib/x86_64-linux-gnu/libglib-2.0.so.0(g_string_free+0x3b)[0x7fae0f814dab] 
/lib/x86_64-linux-gnu/libglib-2.0.so.0(g_string_free+0x3b)[0x7fae0f814dab] /usr/lib/x86_64-linux-gnu/libgtk-3.so.0(+0x21121c)[0x7fae1023321c] /usr/lib/x86_64-linux-gnu/libgtk-3.so.0(+0x21121c)[0x7fae1023321c] ======= Backtrace: ========= Aborted (core dumped) The vesa device is explicitly created during the initialization phase of the above framebuffers. Remove the superfluous check for their existence. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- hw/vesa.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index dd59a112330b..8071ad153f27 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -61,8 +61,11 @@ struct framebuffer *vesa__init(struct kvm *kvm) BUILD_BUG_ON(!is_power_of_two(VESA_MEM_SIZE)); BUILD_BUG_ON(VESA_MEM_SIZE < VESA_BPP/8 * VESA_WIDTH * VESA_HEIGHT); - if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) - return NULL; + if (device__find_dev(vesa_device.bus_type, vesa_device.dev_num) == &vesa_device) { + r = -EEXIST; + goto out_error; + } + vesa_base_addr = pci_get_io_port_block(PCI_IO_SIZE); r = ioport__register(kvm, vesa_base_addr, &vesa_io_ops, PCI_IO_SIZE, NULL); if (r < 0) From patchwork Thu Mar 26 15:24:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460501 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A608E1667 for ; Thu, 26 Mar 2020 15:25:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 801B420748 for ; Thu, 26 Mar 2020 15:25:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727122AbgCZPZ0 (ORCPT ); Thu, 26 Mar 2020 11:25:26 -0400 Received: from foss.arm.com ([217.140.110.172]:33906 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org 
with ESMTP id S1728437AbgCZPZ0 (ORCPT ); Thu, 26 Mar 2020 11:25:26 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 276A41045; Thu, 26 Mar 2020 08:25:25 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 0AC103F71E; Thu, 26 Mar 2020 08:25:23 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, Alexandru Elisei Subject: [PATCH v3 kvmtool 27/32] pci: Implement callbacks for toggling BAR emulation Date: Thu, 26 Mar 2020 15:24:33 +0000 Message-Id: <20200326152438.6218-28-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Alexandru Elisei Implement callbacks for activating and deactivating emulation for a BAR region. This is in preparation for allowing a guest operating system to enable and disable access to I/O or memory space, or to reassign the BARs. The emulated vesa device framebuffer isn't designed to allow stopping and restarting at arbitrary points in the guest execution. Furthermore, on x86, the kernel will not change the BAR addresses, which on bare metal are programmed by the firmware, so take the easy way out and refuse to activate/deactivate emulation for the BAR regions. We also take this opportunity to make the vesa emulation code more consistent by moving all static variable definitions in one place, at the top of the file. 
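For illustration only, and not part of this patch: a minimal sketch of per-device callbacks for toggling BAR emulation, including a device that refuses the request as described for the vesa framebuffer. The bar_toggle_fn type, struct emulated_dev and the helper names are hypothetical, and the error codes are an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BARS 6	/* a type 0 PCI header has six BARs */

/* Hypothetical callback type: called when emulation for a BAR is toggled. */
typedef int (*bar_toggle_fn)(int bar_num, void *data);

struct emulated_dev {
	bar_toggle_fn activate;
	bar_toggle_fn deactivate;
	void *data;
	bool active[NUM_BARS];
};

/* Called when the guest enables or disables access to a BAR. */
static int dev_toggle_bar(struct emulated_dev *dev, int bar_num, bool enable)
{
	bar_toggle_fn fn;
	int r;

	assert(dev != NULL);
	if (bar_num < 0 || bar_num >= NUM_BARS)
		return -EINVAL;
	if (dev->active[bar_num] == enable)
		return 0;	/* already in the requested state */

	fn = enable ? dev->activate : dev->deactivate;
	r = fn ? fn(bar_num, dev->data) : -ENOENT;
	if (r < 0)
		return r;	/* state is unchanged when the device refuses */
	dev->active[bar_num] = enable;
	return 0;
}

/* A device that cannot be stopped and restarted simply refuses. */
static int refuse_toggle(int bar_num, void *data)
{
	(void)bar_num;
	(void)data;
	return -EINVAL;
}

/* A device that accepts the request and counts how often it was asked. */
static int count_toggle(int bar_num, void *data)
{
	(void)bar_num;
	++*(int *)data;
	return 0;
}
```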
Signed-off-by: Alexandru Elisei --- hw/vesa.c | 70 ++++++++++++++++++++------------ include/kvm/pci.h | 18 ++++++++- pci.c | 44 ++++++++++++++++++++ vfio/pci.c | 100 ++++++++++++++++++++++++++++++++++++++-------- virtio/pci.c | 90 ++++++++++++++++++++++++++++++----------- 5 files changed, 254 insertions(+), 68 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index 8071ad153f27..31c2d16ae4de 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -18,6 +18,31 @@ #include #include +static struct pci_device_header vesa_pci_device = { + .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), + .device_id = cpu_to_le16(PCI_DEVICE_ID_VESA), + .header_type = PCI_HEADER_TYPE_NORMAL, + .revision_id = 0, + .class[2] = 0x03, + .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), + .subsys_id = cpu_to_le16(PCI_SUBSYSTEM_ID_VESA), + .bar[1] = cpu_to_le32(VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY), + .bar_size[1] = VESA_MEM_SIZE, +}; + +static struct device_header vesa_device = { + .bus_type = DEVICE_BUS_PCI, + .data = &vesa_pci_device, +}; + +static struct framebuffer vesafb = { + .width = VESA_WIDTH, + .height = VESA_HEIGHT, + .depth = VESA_BPP, + .mem_addr = VESA_MEM_ADDR, + .mem_size = VESA_MEM_SIZE, +}; + static bool vesa_pci_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) { return true; @@ -33,24 +58,19 @@ static struct ioport_operations vesa_io_ops = { .io_out = vesa_pci_io_out, }; -static struct pci_device_header vesa_pci_device = { - .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), - .device_id = cpu_to_le16(PCI_DEVICE_ID_VESA), - .header_type = PCI_HEADER_TYPE_NORMAL, - .revision_id = 0, - .class[2] = 0x03, - .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), - .subsys_id = cpu_to_le16(PCI_SUBSYSTEM_ID_VESA), - .bar[1] = cpu_to_le32(VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY), - .bar_size[1] = VESA_MEM_SIZE, -}; - -static struct device_header vesa_device = { - .bus_type = DEVICE_BUS_PCI, - 
.data = &vesa_pci_device, -}; +static int vesa__bar_activate(struct kvm *kvm, struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + /* We don't support remapping of the framebuffer. */ + return 0; +} -static struct framebuffer vesafb; +static int vesa__bar_deactivate(struct kvm *kvm, struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + /* We don't support remapping of the framebuffer. */ + return -EINVAL; +} struct framebuffer *vesa__init(struct kvm *kvm) { @@ -73,6 +93,11 @@ struct framebuffer *vesa__init(struct kvm *kvm) vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); vesa_pci_device.bar_size[0] = PCI_IO_SIZE; + r = pci__register_bar_regions(kvm, &vesa_pci_device, vesa__bar_activate, + vesa__bar_deactivate, NULL); + if (r < 0) + goto unregister_ioport; + r = device__register(&vesa_device); if (r < 0) goto unregister_ioport; @@ -87,15 +112,8 @@ struct framebuffer *vesa__init(struct kvm *kvm) if (r < 0) goto unmap_dev; - vesafb = (struct framebuffer) { - .width = VESA_WIDTH, - .height = VESA_HEIGHT, - .depth = VESA_BPP, - .mem = mem, - .mem_addr = VESA_MEM_ADDR, - .mem_size = VESA_MEM_SIZE, - .kvm = kvm, - }; + vesafb.mem = mem; + vesafb.kvm = kvm; return fb__register(&vesafb); unmap_dev: diff --git a/include/kvm/pci.h b/include/kvm/pci.h index adb4b5c082d5..1d7d4c0cea5a 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -89,12 +89,19 @@ struct pci_cap_hdr { u8 next; }; +struct pci_device_header; + +typedef int (*bar_activate_fn_t)(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data); +typedef int (*bar_deactivate_fn_t)(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data); + #define PCI_BAR_OFFSET(b) (offsetof(struct pci_device_header, bar[b])) #define PCI_DEV_CFG_SIZE 256 #define PCI_DEV_CFG_MASK (PCI_DEV_CFG_SIZE - 1) -struct pci_device_header; - struct pci_config_operations { void (*write)(struct kvm *kvm, struct pci_device_header *pci_hdr, 
u8 offset, void *data, int sz); @@ -136,6 +143,9 @@ struct pci_device_header { /* Private to lkvm */ u32 bar_size[6]; + bar_activate_fn_t bar_activate_fn; + bar_deactivate_fn_t bar_deactivate_fn; + void *data; struct pci_config_operations cfg_ops; /* * PCI INTx# are level-triggered, but virtual device often feature @@ -162,6 +172,10 @@ void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, void *pci_find_cap(struct pci_device_header *hdr, u8 cap_type); +int pci__register_bar_regions(struct kvm *kvm, struct pci_device_header *pci_hdr, + bar_activate_fn_t bar_activate_fn, + bar_deactivate_fn_t bar_deactivate_fn, void *data); + static inline bool __pci__memory_space_enabled(u16 command) { return command & PCI_COMMAND_MEMORY; diff --git a/pci.c b/pci.c index 611e2c0bf1da..4ace190898f2 100644 --- a/pci.c +++ b/pci.c @@ -66,6 +66,11 @@ void pci__assign_irq(struct device_header *dev_hdr) pci_hdr->irq_type = IRQ_TYPE_EDGE_RISING; } +static bool pci_bar_is_implemented(struct pci_device_header *pci_hdr, int bar_num) +{ + return pci__bar_size(pci_hdr, bar_num); +} + static void *pci_config_address_ptr(u16 port) { unsigned long offset; @@ -273,6 +278,45 @@ struct pci_device_header *pci__find_dev(u8 dev_num) return hdr->data; } +int pci__register_bar_regions(struct kvm *kvm, struct pci_device_header *pci_hdr, + bar_activate_fn_t bar_activate_fn, + bar_deactivate_fn_t bar_deactivate_fn, void *data) +{ + int i, r; + bool has_bar_regions = false; + + assert(bar_activate_fn && bar_deactivate_fn); + + pci_hdr->bar_activate_fn = bar_activate_fn; + pci_hdr->bar_deactivate_fn = bar_deactivate_fn; + pci_hdr->data = data; + + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(pci_hdr, i)) + continue; + + has_bar_regions = true; + + if (pci__bar_is_io(pci_hdr, i) && + pci__io_space_enabled(pci_hdr)) { + r = bar_activate_fn(kvm, pci_hdr, i, data); + if (r < 0) + return r; + } + + if (pci__bar_is_memory(pci_hdr, i) && + pci__memory_space_enabled(pci_hdr)) { + r 
= bar_activate_fn(kvm, pci_hdr, i, data); + if (r < 0) + return r; + } + } + + assert(has_bar_regions); + + return 0; +} + int pci__init(struct kvm *kvm) { int r; diff --git a/vfio/pci.c b/vfio/pci.c index 8b2a0c8dbac3..18e22a8c5320 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -8,6 +8,8 @@ #include #include +#include + /* Wrapper around UAPI vfio_irq_set */ union vfio_irq_eventfd { struct vfio_irq_set irq; @@ -446,6 +448,81 @@ out_unlock: mutex_unlock(&pdev->msi.mutex); } +static int vfio_pci_bar_activate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct vfio_device *vdev = data; + struct vfio_pci_device *pdev = &vdev->pci; + struct vfio_pci_msix_pba *pba = &pdev->msix_pba; + struct vfio_pci_msix_table *table = &pdev->msix_table; + struct vfio_region *region; + bool has_msix; + int ret; + + assert((u32)bar_num < vdev->info.num_regions); + + region = &vdev->regions[bar_num]; + has_msix = pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX; + + if (has_msix && (u32)bar_num == table->bar) { + ret = kvm__register_mmio(kvm, table->guest_phys_addr, + table->size, false, + vfio_pci_msix_table_access, pdev); + if (ret < 0 || table->bar != pba->bar) + goto out; + } + + if (has_msix && (u32)bar_num == pba->bar) { + ret = kvm__register_mmio(kvm, pba->guest_phys_addr, + pba->size, false, + vfio_pci_msix_pba_access, pdev); + goto out; + } + + ret = vfio_map_region(kvm, vdev, region); +out: + return ret; +} + +static int vfio_pci_bar_deactivate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct vfio_device *vdev = data; + struct vfio_pci_device *pdev = &vdev->pci; + struct vfio_pci_msix_pba *pba = &pdev->msix_pba; + struct vfio_pci_msix_table *table = &pdev->msix_table; + struct vfio_region *region; + bool has_msix, success; + int ret; + + assert((u32)bar_num < vdev->info.num_regions); + + region = &vdev->regions[bar_num]; + has_msix = pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX; + + if (has_msix && 
(u32)bar_num == table->bar) { + success = kvm__deregister_mmio(kvm, table->guest_phys_addr); + /* kvm__deregister_mmio fails when the region is not found. */ + ret = (success ? 0 : -ENOENT); + if (ret < 0 || table->bar!= pba->bar) + goto out; + } + + if (has_msix && (u32)bar_num == pba->bar) { + success = kvm__deregister_mmio(kvm, pba->guest_phys_addr); + ret = (success ? 0 : -ENOENT); + goto out; + } + + vfio_unmap_region(kvm, region); + ret = 0; + +out: + return ret; +} + static void vfio_pci_cfg_read(struct kvm *kvm, struct pci_device_header *pci_hdr, u8 offset, void *data, int sz) { @@ -805,12 +882,6 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev) ret = -ENOMEM; goto out_free; } - pba->guest_phys_addr = table->guest_phys_addr + table->size; - - ret = kvm__register_mmio(kvm, table->guest_phys_addr, table->size, - false, vfio_pci_msix_table_access, pdev); - if (ret < 0) - goto out_free; /* * We could map the physical PBA directly into the guest, but it's @@ -820,10 +891,7 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev) * between MSI-X table and PBA. For the sake of isolation, create a * virtual PBA. */ - ret = kvm__register_mmio(kvm, pba->guest_phys_addr, pba->size, false, - vfio_pci_msix_pba_access, pdev); - if (ret < 0) - goto out_free; + pba->guest_phys_addr = table->guest_phys_addr + table->size; pdev->msix.entries = entries; pdev->msix.nr_entries = nr_entries; @@ -894,11 +962,6 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, region->guest_phys_addr = pci_get_mmio_block(map_size); } - /* Map the BARs into the guest or setup a trap region. 
*/ - ret = vfio_map_region(kvm, vdev, region); - if (ret) - return ret; - return 0; } @@ -945,7 +1008,12 @@ static int vfio_pci_configure_dev_regions(struct kvm *kvm, } /* We've configured the BARs, fake up a Configuration Space */ - return vfio_pci_fixup_cfg_space(vdev); + ret = vfio_pci_fixup_cfg_space(vdev); + if (ret) + return ret; + + return pci__register_bar_regions(kvm, &pdev->hdr, vfio_pci_bar_activate, + vfio_pci_bar_deactivate, vdev); } /* diff --git a/virtio/pci.c b/virtio/pci.c index d111dc499f5e..598da699c241 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -11,6 +11,7 @@ #include #include #include +#include #include static u16 virtio_pci__port_addr(struct virtio_pci *vpci) @@ -462,6 +463,64 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, virtio_pci__data_out(vcpu, vdev, addr - mmio_addr, data, len); } +static int virtio_pci__bar_activate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct virtio_device *vdev = data; + u32 bar_addr, bar_size; + int r = -EINVAL; + + assert(bar_num <= 2); + + bar_addr = pci__bar_address(pci_hdr, bar_num); + bar_size = pci__bar_size(pci_hdr, bar_num); + + switch (bar_num) { + case 0: + r = ioport__register(kvm, bar_addr, &virtio_pci__io_ops, + bar_size, vdev); + if (r > 0) + r = 0; + break; + case 1: + r = kvm__register_mmio(kvm, bar_addr, bar_size, false, + virtio_pci__io_mmio_callback, vdev); + break; + case 2: + r = kvm__register_mmio(kvm, bar_addr, bar_size, false, + virtio_pci__msix_mmio_callback, vdev); + } + + return r; +} + +static int virtio_pci__bar_deactivate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + u32 bar_addr; + bool success; + int r = -EINVAL; + + assert(bar_num <= 2); + + bar_addr = pci__bar_address(pci_hdr, bar_num); + + switch (bar_num) { + case 0: + r = ioport__unregister(kvm, bar_addr); + break; + case 1: + case 2: + success = kvm__deregister_mmio(kvm, bar_addr); + /* kvm__deregister_mmio fails when 
the region is not found. */ + r = (success ? 0 : -ENOENT); + } + + return r; +} + int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { @@ -476,23 +535,8 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); port_addr = pci_get_io_port_block(PCI_IO_SIZE); - r = ioport__register(kvm, port_addr, &virtio_pci__io_ops, PCI_IO_SIZE, - vdev); - if (r < 0) - return r; - port_addr = (u16)r; - mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); - r = kvm__register_mmio(kvm, mmio_addr, PCI_IO_SIZE, false, - virtio_pci__io_mmio_callback, vdev); - if (r < 0) - goto free_ioport; - msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); - r = kvm__register_mmio(kvm, msix_io_block, PCI_IO_SIZE * 2, false, - virtio_pci__msix_mmio_callback, vdev); - if (r < 0) - goto free_mmio; vpci->pci_hdr = (struct pci_device_header) { .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), @@ -518,6 +562,12 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, .bar_size[2] = cpu_to_le32(PCI_IO_SIZE*2), }; + r = pci__register_bar_regions(kvm, &vpci->pci_hdr, + virtio_pci__bar_activate, + virtio_pci__bar_deactivate, vdev); + if (r < 0) + return r; + vpci->dev_hdr = (struct device_header) { .bus_type = DEVICE_BUS_PCI, .data = &vpci->pci_hdr, @@ -548,20 +598,12 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, r = device__register(&vpci->dev_hdr); if (r < 0) - goto free_msix_mmio; + return r; /* save the IRQ that device__register() has allocated */ vpci->legacy_irq_line = vpci->pci_hdr.irq_line; return 0; - -free_msix_mmio: - kvm__deregister_mmio(kvm, msix_io_block); -free_mmio: - kvm__deregister_mmio(kvm, mmio_addr); -free_ioport: - ioport__unregister(kvm, port_addr); - return r; } int virtio_pci__reset(struct kvm *kvm, struct virtio_device *vdev) From patchwork Thu Mar 26 15:24:34 2020 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460503 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4211F92A for ; Thu, 26 Mar 2020 15:25:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 21EF32076A for ; Thu, 26 Mar 2020 15:25:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728473AbgCZPZ1 (ORCPT ); Thu, 26 Mar 2020 11:25:27 -0400 Received: from foss.arm.com ([217.140.110.172]:33922 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727655AbgCZPZ1 (ORCPT ); Thu, 26 Mar 2020 11:25:27 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4281F7FA; Thu, 26 Mar 2020 08:25:26 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5AE833F71E; Thu, 26 Mar 2020 08:25:25 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 28/32] pci: Toggle BAR I/O and memory space emulation Date: Thu, 26 Mar 2020 15:24:34 +0000 Message-Id: <20200326152438.6218-29-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org During configuration of the BAR addresses, a Linux guest disables and enables access to I/O and memory space. 
When access is disabled, we don't stop emulating the memory regions described by the BARs. Now that we have callbacks for activating and deactivating emulation for a BAR region, let's use that to stop emulation when access is disabled, and re-activate it when access is re-enabled. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- pci.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/pci.c b/pci.c index 4ace190898f2..c2860e6707fe 100644 --- a/pci.c +++ b/pci.c @@ -163,6 +163,42 @@ static struct ioport_operations pci_config_data_ops = { .io_out = pci_config_data_out, }; +static void pci_config_command_wr(struct kvm *kvm, + struct pci_device_header *pci_hdr, + u16 new_command) +{ + int i; + bool toggle_io, toggle_mem; + + toggle_io = (pci_hdr->command ^ new_command) & PCI_COMMAND_IO; + toggle_mem = (pci_hdr->command ^ new_command) & PCI_COMMAND_MEMORY; + + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(pci_hdr, i)) + continue; + + if (toggle_io && pci__bar_is_io(pci_hdr, i)) { + if (__pci__io_space_enabled(new_command)) + pci_hdr->bar_activate_fn(kvm, pci_hdr, i, + pci_hdr->data); + else + pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, + pci_hdr->data); + } + + if (toggle_mem && pci__bar_is_memory(pci_hdr, i)) { + if (__pci__memory_space_enabled(new_command)) + pci_hdr->bar_activate_fn(kvm, pci_hdr, i, + pci_hdr->data); + else + pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, + pci_hdr->data); + } + } + + pci_hdr->command = new_command; +} + void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size) { void *base; @@ -188,6 +224,12 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, if (*(u32 *)(base + offset) == 0) return; + if (offset == PCI_COMMAND) { + memcpy(&value, data, size); + pci_config_command_wr(kvm, pci_hdr, (u16)value); + return; + } + bar = (offset - PCI_BAR_OFFSET(0)) / sizeof(u32); /* From patchwork Thu Mar 26 15:24:35 2020 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460505 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A114C1667 for ; Thu, 26 Mar 2020 15:25:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 81A1F20774 for ; Thu, 26 Mar 2020 15:25:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728345AbgCZPZ2 (ORCPT ); Thu, 26 Mar 2020 11:25:28 -0400 Received: from foss.arm.com ([217.140.110.172]:33928 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728496AbgCZPZ2 (ORCPT ); Thu, 26 Mar 2020 11:25:28 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 603271045; Thu, 26 Mar 2020 08:25:27 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 79C0C3F71E; Thu, 26 Mar 2020 08:25:26 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 29/32] pci: Implement reassignable BARs Date: Thu, 26 Mar 2020 15:24:35 +0000 Message-Id: <20200326152438.6218-30-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org BARs are used by the guest to configure the access to the PCI device by writing the address to which the device will respond. 
The basic idea for adding support for reassignable BARs is straightforward: deactivate emulation for the memory region described by the old BAR value, and activate emulation for the new region.

BAR reassignment can be done while device access is enabled and memory regions for different devices can overlap as long as no access is made to the overlapping memory regions. This means that it is legal for the BARs of two distinct devices to point to an overlapping memory region, and indeed, this is how Linux does resource assignment at boot. To account for this situation, the simple algorithm described above is enhanced to scan for all devices and:

- Deactivate emulation for any BARs that might overlap with the new BAR value.
- Enable emulation for any BARs that were overlapping with the old value after the BAR has been updated.

Activating/deactivating emulation of a memory region has side effects. In order to prevent the execution of the same callback twice, we now keep track of the state of the region emulation. For example, this can happen if we program a BAR with an address that overlaps a second BAR, thus deactivating emulation for the second BAR, and then we disable all region accesses to the second BAR by writing to the command register.

Signed-off-by: Alexandru Elisei
--- include/kvm/pci.h | 14 ++- pci.c | 273 +++++++++++++++++++++++++++++++++++++--------- vfio/pci.c | 12 ++ 3 files changed, 244 insertions(+), 55 deletions(-) diff --git a/include/kvm/pci.h b/include/kvm/pci.h index 1d7d4c0cea5a..be75f77fd2cb 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -11,6 +11,17 @@ #include "kvm/msi.h" #include "kvm/fdt.h" +#define pci_dev_err(pci_hdr, fmt, ...) \ + pr_err("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_warn(pci_hdr, fmt, ...) \ + pr_warning("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_info(pci_hdr, fmt, ...) 
\ + pr_info("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_dbg(pci_hdr, fmt, ...) \ + pr_debug("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_die(pci_hdr, fmt, ...) \ + die("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) + /* * PCI Configuration Mechanism #1 I/O ports. See Section 3.7.4.1. * ("Configuration Mechanism #1") of the PCI Local Bus Specification 2.1 for @@ -142,7 +153,8 @@ struct pci_device_header { }; /* Private to lkvm */ - u32 bar_size[6]; + u32 bar_size[6]; + bool bar_active[6]; bar_activate_fn_t bar_activate_fn; bar_deactivate_fn_t bar_deactivate_fn; void *data; diff --git a/pci.c b/pci.c index c2860e6707fe..68ece65441a6 100644 --- a/pci.c +++ b/pci.c @@ -71,6 +71,11 @@ static bool pci_bar_is_implemented(struct pci_device_header *pci_hdr, int bar_nu return pci__bar_size(pci_hdr, bar_num); } +static bool pci_bar_is_active(struct pci_device_header *pci_hdr, int bar_num) +{ + return pci_hdr->bar_active[bar_num]; +} + static void *pci_config_address_ptr(u16 port) { unsigned long offset; @@ -163,6 +168,46 @@ static struct ioport_operations pci_config_data_ops = { .io_out = pci_config_data_out, }; +static int pci_activate_bar(struct kvm *kvm, struct pci_device_header *pci_hdr, + int bar_num) +{ + int r = 0; + + if (pci_bar_is_active(pci_hdr, bar_num)) + goto out; + + r = pci_hdr->bar_activate_fn(kvm, pci_hdr, bar_num, pci_hdr->data); + if (r < 0) { + pci_dev_warn(pci_hdr, "Error activating emulation for BAR %d", + bar_num); + goto out; + } + pci_hdr->bar_active[bar_num] = true; + +out: + return r; +} + +static int pci_deactivate_bar(struct kvm *kvm, struct pci_device_header *pci_hdr, + int bar_num) +{ + int r = 0; + + if (!pci_bar_is_active(pci_hdr, bar_num)) + goto out; + + r = pci_hdr->bar_deactivate_fn(kvm, pci_hdr, bar_num, pci_hdr->data); + if (r < 0) { + pci_dev_warn(pci_hdr, "Error deactivating emulation for BAR %d", + bar_num); + 
goto out; + } + pci_hdr->bar_active[bar_num] = false; + +out: + return r; +} + static void pci_config_command_wr(struct kvm *kvm, struct pci_device_header *pci_hdr, u16 new_command) @@ -179,26 +224,179 @@ static void pci_config_command_wr(struct kvm *kvm, if (toggle_io && pci__bar_is_io(pci_hdr, i)) { if (__pci__io_space_enabled(new_command)) - pci_hdr->bar_activate_fn(kvm, pci_hdr, i, - pci_hdr->data); + pci_activate_bar(kvm, pci_hdr, i); else - pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, - pci_hdr->data); + pci_deactivate_bar(kvm, pci_hdr, i); } if (toggle_mem && pci__bar_is_memory(pci_hdr, i)) { if (__pci__memory_space_enabled(new_command)) - pci_hdr->bar_activate_fn(kvm, pci_hdr, i, - pci_hdr->data); + pci_activate_bar(kvm, pci_hdr, i); else - pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, - pci_hdr->data); + pci_deactivate_bar(kvm, pci_hdr, i); } } pci_hdr->command = new_command; } +static int pci_deactivate_bar_regions(struct kvm *kvm, + struct pci_device_header *pci_hdr, + u32 start, u32 size) +{ + struct device_header *dev_hdr; + struct pci_device_header *tmp_hdr; + u32 tmp_addr, tmp_size; + int i, r; + + dev_hdr = device__first_dev(DEVICE_BUS_PCI); + while (dev_hdr) { + tmp_hdr = dev_hdr->data; + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(tmp_hdr, i)) + continue; + + tmp_addr = pci__bar_address(tmp_hdr, i); + tmp_size = pci__bar_size(tmp_hdr, i); + + if (tmp_addr + tmp_size <= start || + tmp_addr >= start + size) + continue; + + r = pci_deactivate_bar(kvm, tmp_hdr, i); + if (r < 0) + return r; + } + dev_hdr = device__next_dev(dev_hdr); + } + + return 0; +} + +static int pci_activate_bar_regions(struct kvm *kvm, + struct pci_device_header *pci_hdr, + u32 start, u32 size) +{ + struct device_header *dev_hdr; + struct pci_device_header *tmp_hdr; + u32 tmp_addr, tmp_size; + int i, r; + + dev_hdr = device__first_dev(DEVICE_BUS_PCI); + while (dev_hdr) { + tmp_hdr = dev_hdr->data; + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(tmp_hdr, i)) + 
continue; + + tmp_addr = pci__bar_address(tmp_hdr, i); + tmp_size = pci__bar_size(tmp_hdr, i); + + if (tmp_addr + tmp_size <= start || + tmp_addr >= start + size) + continue; + + r = pci_activate_bar(kvm, tmp_hdr, i); + if (r < 0) + return r; + } + dev_hdr = device__next_dev(dev_hdr); + } + + return 0; +} + +static void pci_config_bar_wr(struct kvm *kvm, + struct pci_device_header *pci_hdr, int bar_num, + u32 value) +{ + u32 old_addr, new_addr, bar_size; + u32 mask; + int r; + + if (pci__bar_is_io(pci_hdr, bar_num)) + mask = (u32)PCI_BASE_ADDRESS_IO_MASK; + else + mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; + + /* + * If the kernel masks the BAR, it will expect to find the size of the + * BAR there next time it reads from it. After the kernel reads the + * size, it will write the address back. + * + * According to the PCI local bus specification REV 3.0: The number of + * upper bits that a device actually implements depends on how much of + * the address space the device will respond to. A device that wants a 1 + * MB memory address space (using a 32-bit base address register) would + * build the top 12 bits of the address register, hardwiring the other + * bits to 0. + * + * Furthermore, software can determine how much address space the device + * requires by writing a value of all 1's to the register and then + * reading the value back. The device will return 0's in all don't-care + * address bits, effectively specifying the address space required. + * + * Software computes the size of the address space with the formula + * S = ~B + 1, where S is the memory size and B is the value read from + * the BAR. This means that the BAR value that kvmtool should return is + * B = ~(S - 1). + */ + if (value == 0xffffffff) { + value = ~(pci__bar_size(pci_hdr, bar_num) - 1); + /* Preserve the special bits. 
*/ + value = (value & mask) | (pci_hdr->bar[bar_num] & ~mask); + pci_hdr->bar[bar_num] = value; + return; + } + + value = (value & mask) | (pci_hdr->bar[bar_num] & ~mask); + + /* Don't toggle emulation when region type access is disabled. */ + if (pci__bar_is_io(pci_hdr, bar_num) && + !pci__io_space_enabled(pci_hdr)) { + pci_hdr->bar[bar_num] = value; + return; + } + + if (pci__bar_is_memory(pci_hdr, bar_num) && + !pci__memory_space_enabled(pci_hdr)) { + pci_hdr->bar[bar_num] = value; + return; + } + + old_addr = pci__bar_address(pci_hdr, bar_num); + new_addr = __pci__bar_address(value); + bar_size = pci__bar_size(pci_hdr, bar_num); + + r = pci_deactivate_bar(kvm, pci_hdr, bar_num); + if (r < 0) + return; + + r = pci_deactivate_bar_regions(kvm, pci_hdr, new_addr, bar_size); + if (r < 0) { + /* + * We cannot update the BAR because of an overlapping region + * that failed to deactivate emulation, so keep the old BAR + * value and re-activate emulation for it. + */ + pci_activate_bar(kvm, pci_hdr, bar_num); + return; + } + + pci_hdr->bar[bar_num] = value; + r = pci_activate_bar(kvm, pci_hdr, bar_num); + if (r < 0) { + /* + * New region cannot be emulated, re-enable the regions that + * were overlapping. 
+ */ + pci_activate_bar_regions(kvm, pci_hdr, new_addr, bar_size); + return; + } + + pci_activate_bar_regions(kvm, pci_hdr, old_addr, bar_size); +} + void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size) { void *base; @@ -206,7 +404,6 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; u32 value = 0; - u32 mask; if (!pci_device_exists(addr.bus_number, dev_num, 0)) return; @@ -231,46 +428,13 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, } bar = (offset - PCI_BAR_OFFSET(0)) / sizeof(u32); - - /* - * If the kernel masks the BAR, it will expect to find the size of the - * BAR there next time it reads from it. After the kernel reads the - * size, it will write the address back. - */ if (bar < 6) { - if (pci__bar_is_io(pci_hdr, bar)) - mask = (u32)PCI_BASE_ADDRESS_IO_MASK; - else - mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; - /* - * According to the PCI local bus specification REV 3.0: - * The number of upper bits that a device actually implements - * depends on how much of the address space the device will - * respond to. A device that wants a 1 MB memory address space - * (using a 32-bit base address register) would build the top - * 12 bits of the address register, hardwiring the other bits - * to 0. - * - * Furthermore, software can determine how much address space - * the device requires by writing a value of all 1's to the - * register and then reading the value back. The device will - * return 0's in all don't-care address bits, effectively - * specifying the address space required. - * - * Software computes the size of the address space with the - * formula S = ~B + 1, where S is the memory size and B is the - * value read from the BAR. This means that the BAR value that - * kvmtool should return is B = ~(S - 1). 
- */ memcpy(&value, data, size); - if (value == 0xffffffff) - value = ~(pci__bar_size(pci_hdr, bar) - 1); - /* Preserve the special bits. */ - value = (value & mask) | (pci_hdr->bar[bar] & ~mask); - memcpy(base + offset, &value, size); - } else { - memcpy(base + offset, data, size); + pci_config_bar_wr(kvm, pci_hdr, bar, value); + return; } + + memcpy(base + offset, data, size); } void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size) @@ -338,20 +502,21 @@ int pci__register_bar_regions(struct kvm *kvm, struct pci_device_header *pci_hdr continue; has_bar_regions = true; + assert(!pci_bar_is_active(pci_hdr, i)); if (pci__bar_is_io(pci_hdr, i) && pci__io_space_enabled(pci_hdr)) { - r = bar_activate_fn(kvm, pci_hdr, i, data); - if (r < 0) - return r; - } + r = pci_activate_bar(kvm, pci_hdr, i); + if (r < 0) + return r; + } if (pci__bar_is_memory(pci_hdr, i) && pci__memory_space_enabled(pci_hdr)) { - r = bar_activate_fn(kvm, pci_hdr, i, data); - if (r < 0) - return r; - } + r = pci_activate_bar(kvm, pci_hdr, i); + if (r < 0) + return r; + } } assert(has_bar_regions); diff --git a/vfio/pci.c b/vfio/pci.c index 18e22a8c5320..2b891496547d 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -457,6 +457,7 @@ static int vfio_pci_bar_activate(struct kvm *kvm, struct vfio_pci_msix_pba *pba = &pdev->msix_pba; struct vfio_pci_msix_table *table = &pdev->msix_table; struct vfio_region *region; + u32 bar_addr; bool has_msix; int ret; @@ -465,7 +466,14 @@ static int vfio_pci_bar_activate(struct kvm *kvm, region = &vdev->regions[bar_num]; has_msix = pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX; + bar_addr = pci__bar_address(pci_hdr, bar_num); + if (pci__bar_is_io(pci_hdr, bar_num)) + region->port_base = bar_addr; + else + region->guest_phys_addr = bar_addr; + if (has_msix && (u32)bar_num == table->bar) { + table->guest_phys_addr = region->guest_phys_addr; ret = kvm__register_mmio(kvm, table->guest_phys_addr, table->size, false, vfio_pci_msix_table_access, pdev); 
@@ -474,6 +482,10 @@ static int vfio_pci_bar_activate(struct kvm *kvm, } if (has_msix && (u32)bar_num == pba->bar) { + if (pba->bar == table->bar) + pba->guest_phys_addr = table->guest_phys_addr + table->size; + else + pba->guest_phys_addr = region->guest_phys_addr; ret = kvm__register_mmio(kvm, pba->guest_phys_addr, pba->size, false, vfio_pci_msix_pba_access, pdev); From patchwork Thu Mar 26 15:24:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460507 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 595631667 for ; Thu, 26 Mar 2020 15:25:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 451502076A for ; Thu, 26 Mar 2020 15:25:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728507AbgCZPZ3 (ORCPT ); Thu, 26 Mar 2020 11:25:29 -0400 Received: from foss.arm.com ([217.140.110.172]:33938 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728496AbgCZPZ3 (ORCPT ); Thu, 26 Mar 2020 11:25:29 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9EE3E7FA; Thu, 26 Mar 2020 08:25:28 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 965EE3F71E; Thu, 26 Mar 2020 08:25:27 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, Julien Thierry Subject: [PATCH v3 kvmtool 30/32] arm/fdt: Remove 'linux,pci-probe-only' property Date: Thu, 26 Mar 2020 15:24:36 +0000 Message-Id: 
<20200326152438.6218-31-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Julien Thierry PCI now supports configurable BARs. Get rid of the no longer needed, Linux-only, fdt property. Reviewed-by: Andre Przywara Signed-off-by: Julien Thierry Signed-off-by: Alexandru Elisei --- arm/fdt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arm/fdt.c b/arm/fdt.c index c80e6da323b6..02091e9e0bee 100644 --- a/arm/fdt.c +++ b/arm/fdt.c @@ -130,7 +130,6 @@ static int setup_fdt(struct kvm *kvm) /* /chosen */ _FDT(fdt_begin_node(fdt, "chosen")); - _FDT(fdt_property_cell(fdt, "linux,pci-probe-only", 1)); /* Pass on our amended command line to a Linux kernel only. */ if (kvm->cfg.firmware_filename) { From patchwork Thu Mar 26 15:24:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460509 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 57FBE1667 for ; Thu, 26 Mar 2020 15:25:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 42B822076A for ; Thu, 26 Mar 2020 15:25:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728465AbgCZPZa (ORCPT ); Thu, 26 Mar 2020 11:25:30 -0400 Received: from foss.arm.com ([217.140.110.172]:33946 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728496AbgCZPZa (ORCPT ); Thu, 26 Mar 2020 11:25:30 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BB0CD1045; Thu, 26 
Mar 2020 08:25:29 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D295C3F71E; Thu, 26 Mar 2020 08:25:28 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 31/32] vfio: Trap MMIO access to BAR addresses which aren't page aligned Date: Thu, 26 Mar 2020 15:24:37 +0000 Message-Id: <20200326152438.6218-32-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM_SET_USER_MEMORY_REGION will fail if the guest physical address is not aligned to the page size. However, it is legal for a guest to program an address which isn't aligned to the page size. Trap and emulate MMIO accesses to the region when that happens. Without this patch, when assigning a Seagate Barracuda hard drive to a VM I was seeing these errors: [ 0.286029] pci 0000:00:00.0: BAR 0: assigned [mem 0x41004600-0x4100467f] Error: 0000:01:00.0: failed to register region with KVM Error: [1095:3132] Error activating emulation for BAR 0 [..] 
[ 10.561794] irq 13: nobody cared (try booting with the "irqpoll" option) [ 10.563122] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-seattle-00009-g909b20467ed1 #133 [ 10.563124] Hardware name: linux,dummy-virt (DT) [ 10.563126] Call trace: [ 10.563134] dump_backtrace+0x0/0x140 [ 10.563137] show_stack+0x14/0x20 [ 10.563141] dump_stack+0xbc/0x100 [ 10.563146] __report_bad_irq+0x48/0xd4 [ 10.563148] note_interrupt+0x288/0x378 [ 10.563151] handle_irq_event_percpu+0x80/0x88 [ 10.563153] handle_irq_event+0x44/0xc8 [ 10.563155] handle_fasteoi_irq+0xb4/0x160 [ 10.563157] generic_handle_irq+0x24/0x38 [ 10.563159] __handle_domain_irq+0x60/0xb8 [ 10.563162] gic_handle_irq+0x50/0xa0 [ 10.563164] el1_irq+0xb8/0x180 [ 10.563166] arch_cpu_idle+0x10/0x18 [ 10.563170] do_idle+0x204/0x290 [ 10.563172] cpu_startup_entry+0x20/0x40 [ 10.563175] rest_init+0xd4/0xe0 [ 10.563180] arch_call_rest_init+0xc/0x14 [ 10.563182] start_kernel+0x420/0x44c [ 10.563183] handlers: [ 10.563650] [<000000001e474803>] sil24_interrupt [ 10.564559] Disabling IRQ #13 [..] [ 11.832916] ata1: spurious interrupt (slot_stat 0x0 active_tag -84148995 sactive 0x0) [ 12.045444] ata_ratelimit: 1 callbacks suppressed With this patch, I don't see the errors and the device works as expected. Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- vfio/core.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/vfio/core.c b/vfio/core.c index bad3c7c8cd26..0b45e78b40b4 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -226,6 +226,15 @@ int vfio_map_region(struct kvm *kvm, struct vfio_device *vdev, if (!(region->info.flags & VFIO_REGION_INFO_FLAG_MMAP)) return vfio_setup_trap_region(kvm, vdev, region); + /* + * KVM_SET_USER_MEMORY_REGION will fail because the guest physical + * address isn't page aligned, let's emulate the region ourselves. 
+ */ + if (region->guest_phys_addr & (PAGE_SIZE - 1)) + return kvm__register_mmio(kvm, region->guest_phys_addr, + region->info.size, false, + vfio_mmio_access, region); + if (region->info.flags & VFIO_REGION_INFO_FLAG_READ) prot |= PROT_READ; if (region->info.flags & VFIO_REGION_INFO_FLAG_WRITE) From patchwork Thu Mar 26 15:24:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11460513 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7A1BB92A for ; Thu, 26 Mar 2020 15:25:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5E1F120774 for ; Thu, 26 Mar 2020 15:25:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728549AbgCZPZc (ORCPT ); Thu, 26 Mar 2020 11:25:32 -0400 Received: from foss.arm.com ([217.140.110.172]:33956 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728516AbgCZPZb (ORCPT ); Thu, 26 Mar 2020 11:25:31 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D6F7B7FA; Thu, 26 Mar 2020 08:25:30 -0700 (PDT) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id EFF283F71E; Thu, 26 Mar 2020 08:25:29 -0700 (PDT) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com Subject: [PATCH v3 kvmtool 32/32] arm/arm64: Add PCI Express 1.1 support Date: Thu, 26 Mar 2020 15:24:38 +0000 Message-Id: <20200326152438.6218-33-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: 
<20200326152438.6218-1-alexandru.elisei@arm.com> References: <20200326152438.6218-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org PCI Express comes with an extended addressing scheme, which directly translated into a bigger device configuration space (256->4096 bytes) and bigger PCI configuration space (16->256 MB), as well as mandatory capabilities (power management [1] and PCI Express capability [2]). However, our virtio PCI implementation implements version 0.9 of the protocol and it still uses transitional PCI device IDs, so we have opted to omit the mandatory PCI Express capabilities. For VFIO, the power management and PCI Express capability are left for a subsequent patch. [1] PCI Express Base Specification Revision 1.1, section 7.6 [2] PCI Express Base Specification Revision 1.1, section 7.8 Signed-off-by: Alexandru Elisei --- arm/include/arm-common/kvm-arch.h | 4 +- arm/pci.c | 2 +- builtin-run.c | 1 + include/kvm/kvm-config.h | 2 +- include/kvm/pci.h | 76 ++++++++++++++++++++++++++++--- pci.c | 5 +- vfio/pci.c | 26 +++++++---- 7 files changed, 96 insertions(+), 20 deletions(-) diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h index b9d486d5eac2..13c55fa3dc29 100644 --- a/arm/include/arm-common/kvm-arch.h +++ b/arm/include/arm-common/kvm-arch.h @@ -23,7 +23,7 @@ #define ARM_IOPORT_SIZE (ARM_MMIO_AREA - ARM_IOPORT_AREA) #define ARM_VIRTIO_MMIO_SIZE (ARM_AXI_AREA - (ARM_MMIO_AREA + ARM_GIC_SIZE)) -#define ARM_PCI_CFG_SIZE (1ULL << 24) +#define ARM_PCI_CFG_SIZE (1ULL << 28) #define ARM_PCI_MMIO_SIZE (ARM_MEMORY_AREA - \ (ARM_AXI_AREA + ARM_PCI_CFG_SIZE)) @@ -50,6 +50,8 @@ #define VIRTIO_RING_ENDIAN (VIRTIO_ENDIAN_LE | VIRTIO_ENDIAN_BE) +#define ARCH_HAS_PCI_EXP 1 + static inline bool arm_addr_in_ioport_region(u64 phys_addr) { u64 limit = KVM_IOPORT_AREA + ARM_IOPORT_SIZE; diff --git a/arm/pci.c b/arm/pci.c index 
ed325fa4a811..2251f627d8b5 100644 --- a/arm/pci.c +++ b/arm/pci.c @@ -62,7 +62,7 @@ void pci__generate_fdt_nodes(void *fdt) _FDT(fdt_property_cell(fdt, "#address-cells", 0x3)); _FDT(fdt_property_cell(fdt, "#size-cells", 0x2)); _FDT(fdt_property_cell(fdt, "#interrupt-cells", 0x1)); - _FDT(fdt_property_string(fdt, "compatible", "pci-host-cam-generic")); + _FDT(fdt_property_string(fdt, "compatible", "pci-host-ecam-generic")); _FDT(fdt_property(fdt, "dma-coherent", NULL, 0)); _FDT(fdt_property(fdt, "bus-range", bus_range, sizeof(bus_range))); diff --git a/builtin-run.c b/builtin-run.c index 9cb8c75300eb..def8a1f803ad 100644 --- a/builtin-run.c +++ b/builtin-run.c @@ -27,6 +27,7 @@ #include "kvm/irq.h" #include "kvm/kvm.h" #include "kvm/pci.h" +#include "kvm/vfio.h" #include "kvm/rtc.h" #include "kvm/sdl.h" #include "kvm/vnc.h" diff --git a/include/kvm/kvm-config.h b/include/kvm/kvm-config.h index a052b0bc7582..a1012c57b7a7 100644 --- a/include/kvm/kvm-config.h +++ b/include/kvm/kvm-config.h @@ -2,7 +2,6 @@ #define KVM_CONFIG_H_ #include "kvm/disk-image.h" -#include "kvm/vfio.h" #include "kvm/kvm-config-arch.h" #define DEFAULT_KVM_DEV "/dev/kvm" @@ -18,6 +17,7 @@ #define MIN_RAM_SIZE_MB (64ULL) #define MIN_RAM_SIZE_BYTE (MIN_RAM_SIZE_MB << MB_SHIFT) +struct vfio_device_params; struct kvm_config { struct kvm_config_arch arch; struct disk_image_params disk_image[MAX_DISK_IMAGES]; diff --git a/include/kvm/pci.h b/include/kvm/pci.h index be75f77fd2cb..71ee9d8cb01f 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -10,6 +10,7 @@ #include "kvm/devices.h" #include "kvm/msi.h" #include "kvm/fdt.h" +#include "kvm.h" #define pci_dev_err(pci_hdr, fmt, ...) 
\ pr_err("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) @@ -32,9 +33,41 @@ #define PCI_CONFIG_BUS_FORWARD 0xcfa #define PCI_IO_SIZE 0x100 #define PCI_IOPORT_START 0x6200 -#define PCI_CFG_SIZE (1ULL << 24) -struct kvm; +#define PCIE_CAP_REG_VER 0x1 +#define PCIE_CAP_REG_DEV_LEGACY (1 << 4) +#define PM_CAP_VER 0x3 + +#ifdef ARCH_HAS_PCI_EXP +#define PCI_CFG_SIZE (1ULL << 28) +#define PCI_DEV_CFG_SIZE 4096 + +union pci_config_address { + struct { +#if __BYTE_ORDER == __LITTLE_ENDIAN + unsigned reg_offset : 2; /* 1 .. 0 */ + unsigned register_number : 10; /* 11 .. 2 */ + unsigned function_number : 3; /* 14 .. 12 */ + unsigned device_number : 5; /* 19 .. 15 */ + unsigned bus_number : 8; /* 27 .. 20 */ + unsigned reserved : 3; /* 30 .. 28 */ + unsigned enable_bit : 1; /* 31 */ +#else + unsigned enable_bit : 1; /* 31 */ + unsigned reserved : 3; /* 30 .. 28 */ + unsigned bus_number : 8; /* 27 .. 20 */ + unsigned device_number : 5; /* 19 .. 15 */ + unsigned function_number : 3; /* 14 .. 12 */ + unsigned register_number : 10; /* 11 .. 2 */ + unsigned reg_offset : 2; /* 1 .. 
0 */ +#endif + }; + u32 w; +}; + +#else +#define PCI_CFG_SIZE (1ULL << 24) +#define PCI_DEV_CFG_SIZE 256 union pci_config_address { struct { @@ -58,6 +91,8 @@ union pci_config_address { }; u32 w; }; +#endif +#define PCI_DEV_CFG_MASK (PCI_DEV_CFG_SIZE - 1) struct msix_table { struct msi_msg msg; @@ -100,6 +135,33 @@ struct pci_cap_hdr { u8 next; }; +struct pcie_cap { + u8 cap; + u8 next; + u16 cap_reg; + u32 dev_cap; + u16 dev_ctrl; + u16 dev_status; + u32 link_cap; + u16 link_ctrl; + u16 link_status; + u32 slot_cap; + u16 slot_ctrl; + u16 slot_status; + u16 root_ctrl; + u16 root_cap; + u32 root_status; +}; + +struct pm_cap { + u8 cap; + u8 next; + u16 pmc; + u16 pmcsr; + u8 pmcsr_bse; + u8 data; +}; + struct pci_device_header; typedef int (*bar_activate_fn_t)(struct kvm *kvm, @@ -110,14 +172,12 @@ typedef int (*bar_deactivate_fn_t)(struct kvm *kvm, int bar_num, void *data); #define PCI_BAR_OFFSET(b) (offsetof(struct pci_device_header, bar[b])) -#define PCI_DEV_CFG_SIZE 256 -#define PCI_DEV_CFG_MASK (PCI_DEV_CFG_SIZE - 1) struct pci_config_operations { void (*write)(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz); + u16 offset, void *data, int sz); void (*read)(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz); + u16 offset, void *data, int sz); }; struct pci_device_header { @@ -147,6 +207,10 @@ struct pci_device_header { u8 min_gnt; u8 max_lat; struct msix_cap msix; +#ifdef ARCH_HAS_PCI_EXP + struct pm_cap pm; + struct pcie_cap pcie; +#endif } __attribute__((packed)); /* Pad to PCI config space size */ u8 __pad[PCI_DEV_CFG_SIZE]; diff --git a/pci.c b/pci.c index 68ece65441a6..b471209a6efc 100644 --- a/pci.c +++ b/pci.c @@ -400,7 +400,8 @@ static void pci_config_bar_wr(struct kvm *kvm, void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size) { void *base; - u8 bar, offset; + u8 bar; + u16 offset; struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; 
u32 value = 0; @@ -439,7 +440,7 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size) { - u8 offset; + u16 offset; struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; diff --git a/vfio/pci.c b/vfio/pci.c index 2b891496547d..6b8726227ea0 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -311,7 +311,7 @@ out_unlock: } static void vfio_pci_msix_cap_write(struct kvm *kvm, - struct vfio_device *vdev, u8 off, + struct vfio_device *vdev, u16 off, void *data, int sz) { struct vfio_pci_device *pdev = &vdev->pci; @@ -343,7 +343,7 @@ static void vfio_pci_msix_cap_write(struct kvm *kvm, } static int vfio_pci_msi_vector_write(struct kvm *kvm, struct vfio_device *vdev, - u8 off, u8 *data, u32 sz) + u16 off, u8 *data, u32 sz) { size_t i; u32 mask = 0; @@ -391,7 +391,7 @@ static int vfio_pci_msi_vector_write(struct kvm *kvm, struct vfio_device *vdev, } static void vfio_pci_msi_cap_write(struct kvm *kvm, struct vfio_device *vdev, - u8 off, u8 *data, u32 sz) + u16 off, u8 *data, u32 sz) { u8 ctrl; struct msi_msg msg; @@ -536,7 +536,7 @@ out: } static void vfio_pci_cfg_read(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz) + u16 offset, void *data, int sz) { struct vfio_region_info *info; struct vfio_pci_device *pdev; @@ -554,7 +554,7 @@ static void vfio_pci_cfg_read(struct kvm *kvm, struct pci_device_header *pci_hdr } static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz) + u16 offset, void *data, int sz) { struct vfio_region_info *info; struct vfio_pci_device *pdev; @@ -638,15 +638,17 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev) { int ret; size_t size; - u8 pos, next; + u16 pos, next; struct pci_cap_hdr *cap; - u8 virt_hdr[PCI_DEV_CFG_SIZE]; + u8 *virt_hdr; struct vfio_pci_device *pdev = &vdev->pci; if (!(pdev->hdr.status & PCI_STATUS_CAP_LIST)) 
return 0; - memset(virt_hdr, 0, PCI_DEV_CFG_SIZE); + virt_hdr = calloc(1, PCI_DEV_CFG_SIZE); + if (!virt_hdr) + return -errno; pos = pdev->hdr.capabilities & ~3; @@ -682,6 +684,8 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev) size = PCI_DEV_CFG_SIZE - PCI_STD_HEADER_SIZEOF; memcpy((void *)&pdev->hdr + pos, virt_hdr + pos, size); + free(virt_hdr); + return 0; } @@ -792,7 +796,11 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) /* Install our fake Configuration Space */ info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; - hdr_sz = PCI_DEV_CFG_SIZE; + /* + * We don't touch the extended configuration space, let's be cautious + * and not overwrite it all with zeros, or bad things might happen. + */ + hdr_sz = PCI_CFG_SPACE_SIZE; if (pwrite(vdev->fd, &pdev->hdr, hdr_sz, info->offset) != hdr_sz) { vfio_dev_err(vdev, "failed to write %zd bytes to Config Space", hdr_sz);