From patchwork Thu Jan 23 13:47:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347911 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8D56B92A for ; Thu, 23 Jan 2020 13:48:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6B4DD24689 for ; Thu, 23 Jan 2020 13:48:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728057AbgAWNsW (ORCPT ); Thu, 23 Jan 2020 08:48:22 -0500 Received: from foss.arm.com ([217.140.110.172]:39646 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726871AbgAWNsW (ORCPT ); Thu, 23 Jan 2020 08:48:22 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9F9E91007; Thu, 23 Jan 2020 05:48:20 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 95DEF3F68E; Thu, 23 Jan 2020 05:48:19 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 01/30] Makefile: Use correct objcopy binary when cross-compiling for x86_64 Date: Thu, 23 Jan 2020 13:47:36 +0000 Message-Id: <20200123134805.1993-2-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use the compiler toolchain version of objcopy instead of 
the native one when cross-compiling for the x86_64 architecture. Reviewed-by: Andre Przywara Tested-by: Andre Przywara Signed-off-by: Alexandru Elisei --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index b76d844f2e01..6d6880dd4f8a 100644 --- a/Makefile +++ b/Makefile @@ -22,6 +22,7 @@ CC := $(CROSS_COMPILE)gcc CFLAGS := LD := $(CROSS_COMPILE)ld LDFLAGS := +OBJCOPY := $(CROSS_COMPILE)objcopy FIND := find CSCOPE := cscope @@ -479,7 +480,7 @@ x86/bios/bios.bin.elf: x86/bios/entry.S x86/bios/e820.c x86/bios/int10.c x86/bio x86/bios/bios.bin: x86/bios/bios.bin.elf $(E) " OBJCOPY " $@ - $(Q) objcopy -O binary -j .text x86/bios/bios.bin.elf x86/bios/bios.bin + $(Q) $(OBJCOPY) -O binary -j .text x86/bios/bios.bin.elf x86/bios/bios.bin x86/bios/bios-rom.o: x86/bios/bios-rom.S x86/bios/bios.bin x86/bios/bios-rom.h $(E) " CC " $@ From patchwork Thu Jan 23 13:47:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347909 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 74E6A17EF for ; Thu, 23 Jan 2020 13:48:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5488724689 for ; Thu, 23 Jan 2020 13:48:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728709AbgAWNsW (ORCPT ); Thu, 23 Jan 2020 08:48:22 -0500 Received: from foss.arm.com ([217.140.110.172]:39654 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727194AbgAWNsW (ORCPT ); Thu, 23 Jan 2020 08:48:22 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DFE94106F; Thu, 23 Jan 2020 05:48:21 -0800 (PST) Received: from 
e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D4CFE3F68E; Thu, 23 Jan 2020 05:48:20 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 02/30] hw/i8042: Compile only for x86 Date: Thu, 23 Jan 2020 13:47:37 +0000 Message-Id: <20200123134805.1993-3-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The initialization function for the i8042 emulated device does exactly nothing for all architectures, except for x86. As a result, the device is usable only for x86, so let's make the file an architecture specific object file. 
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- Makefile | 2 +- hw/i8042.c | 4 ---- 2 files changed, 1 insertion(+), 5 deletions(-) diff --git a/Makefile b/Makefile index 6d6880dd4f8a..33eddcbb4d66 100644 --- a/Makefile +++ b/Makefile @@ -103,7 +103,6 @@ OBJS += hw/pci-shmem.o OBJS += kvm-ipc.o OBJS += builtin-sandbox.o OBJS += virtio/mmio.o -OBJS += hw/i8042.o # Translate uname -m into ARCH string ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \ @@ -124,6 +123,7 @@ endif #x86 ifeq ($(ARCH),x86) DEFINES += -DCONFIG_X86 + OBJS += hw/i8042.o OBJS += x86/boot.o OBJS += x86/cpuid.o OBJS += x86/interrupt.o diff --git a/hw/i8042.c b/hw/i8042.c index 288b7d1108ac..2d8c96e9c7e6 100644 --- a/hw/i8042.c +++ b/hw/i8042.c @@ -349,10 +349,6 @@ static struct ioport_operations kbd_ops = { int kbd__init(struct kvm *kvm) { -#ifndef CONFIG_X86 - return 0; -#endif - kbd_reset(); state.kvm = kvm; ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL); From patchwork Thu Jan 23 13:47:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347969 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7B17B159A for ; Thu, 23 Jan 2020 13:49:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6306E20661 for ; Thu, 23 Jan 2020 13:49:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728831AbgAWNsY (ORCPT ); Thu, 23 Jan 2020 08:48:24 -0500 Received: from foss.arm.com ([217.140.110.172]:39664 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728731AbgAWNsX (ORCPT ); Thu, 23 Jan 2020 08:48:23 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com 
(Postfix) with ESMTP id 3EE10FEC; Thu, 23 Jan 2020 05:48:23 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 2284D3F68E; Thu, 23 Jan 2020 05:48:22 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org, Julien Thierry Subject: [PATCH v2 kvmtool 03/30] pci: Fix BAR resource sizing arbitration Date: Thu, 23 Jan 2020 13:47:38 +0000 Message-Id: <20200123134805.1993-4-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org

From: Sami Mujawar

According to the 'PCI Local Bus Specification, Revision 3.0, February 3, 2004, Section 6.2.5.1, Implementation Notes, page 227':

"Software saves the original value of the Base Address register, writes 0 FFFF FFFFh to the register, then reads it back. Size calculation can be done from the 32-bit value read by first clearing encoding information bits (bit 0 for I/O, bits 0-3 for memory), inverting all 32 bits (logical NOT), then incrementing by 1. The resultant 32-bit value is the memory/I/O range size decoded by the register. Note that the upper 16 bits of the result is ignored if the Base Address register is for I/O and bits 16-31 returned zero upon read."

kvmtool was returning the actual BAR resource size, which is incorrect because software drivers invert all 32 bits of the value they read back (logical NOT) and then increment by 1. This produces a very large resource size (in some cases more than 4GB), which causes drivers to assert or fail to work. For example, if the BAR resource size was 0x1000, kvmtool would return 0x1000 instead of 0xFFFFF000.
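The sizing handshake described above can be sketched in a few lines of standalone C. This is only an illustration of the arithmetic, not the kvmtool code: the names bar_sizing_value(), bar_decoded_size() and BAR_MEM_MASK are hypothetical, and the sketch assumes a 32-bit memory BAR whose low four bits carry the encoding information.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: address bits of a 32-bit memory BAR (bits 0-3 are flags). */
#define BAR_MEM_MASK 0xFFFFFFF0u

/*
 * Value a device model would present after the guest writes all 1's:
 * B = ~(S - 1), with the encoding bits of the current BAR value preserved.
 * The relation only holds for a non-zero power-of-two size.
 */
static uint32_t bar_sizing_value(uint32_t size, uint32_t bar)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (~(size - 1) & BAR_MEM_MASK) | (bar & ~BAR_MEM_MASK);
}

/*
 * What guest software computes from the value it reads back:
 * clear the encoding bits, invert all 32 bits, add 1 (S = ~B + 1).
 */
static uint32_t bar_decoded_size(uint32_t readback)
{
	return ~(readback & BAR_MEM_MASK) + 1;
}
```

With a 0x1000-byte BAR, bar_sizing_value(0x1000, 0) gives 0xFFFFF000 and the guest decodes it back to 0x1000. Feeding the raw size 0x1000 into bar_decoded_size() instead yields 0xFFFFF000, which is the "very large resource size" the drivers were tripping over.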
Fixed pci__config_wr() to return the size of the BAR in accordance with the PCI Local Bus specification, Implementation Notes. Signed-off-by: Sami Mujawar Signed-off-by: Julien Thierry [Reworked algorithm, removed power-of-two check] Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- pci.c | 42 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 36 insertions(+), 6 deletions(-) diff --git a/pci.c b/pci.c index 689869cb79a3..3198732935eb 100644 --- a/pci.c +++ b/pci.c @@ -149,6 +149,8 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, u8 bar, offset; struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; + u32 value = 0; + u32 mask; if (!pci_device_exists(addr.bus_number, dev_num, 0)) return; @@ -169,13 +171,41 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, bar = (offset - PCI_BAR_OFFSET(0)) / sizeof(u32); /* - * If the kernel masks the BAR it would expect to find the size of the - * BAR there next time it reads from it. When the kernel got the size it - * would write the address back. + * If the kernel masks the BAR, it will expect to find the size of the + * BAR there next time it reads from it. After the kernel reads the + * size, it will write the address back. */ - if (bar < 6 && ioport__read32(data) == 0xFFFFFFFF) { - u32 sz = pci_hdr->bar_size[bar]; - memcpy(base + offset, &sz, sizeof(sz)); + if (bar < 6) { + if (pci_hdr->bar[bar] & PCI_BASE_ADDRESS_SPACE_IO) + mask = (u32)PCI_BASE_ADDRESS_IO_MASK; + else + mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; + /* + * According to the PCI local bus specification REV 3.0: + * The number of upper bits that a device actually implements + * depends on how much of the address space the device will + * respond to. A device that wants a 1 MB memory address space + * (using a 32-bit base address register) would build the top + * 12 bits of the address register, hardwiring the other bits + * to 0. 
+ * + * Furthermore, software can determine how much address space + * the device requires by writing a value of all 1's to the + * register and then reading the value back. The device will + * return 0's in all don't-care address bits, effectively + * specifying the address space required. + * + * Software computes the size of the address space with the + * formula S = ~B + 1, where S is the memory size and B is the + * value read from the BAR. This means that the BAR value that + * kvmtool should return is B = ~(S - 1). + */ + memcpy(&value, data, size); + if (value == 0xffffffff) + value = ~(pci_hdr->bar_size[bar] - 1); + /* Preserve the special bits. */ + value = (value & mask) | (pci_hdr->bar[bar] & ~mask); + memcpy(base + offset, &value, size); } else { memcpy(base + offset, data, size); } From patchwork Thu Jan 23 13:47:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347913 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1F157159A for ; Thu, 23 Jan 2020 13:48:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E75382467B for ; Thu, 23 Jan 2020 13:48:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729027AbgAWNsZ (ORCPT ); Thu, 23 Jan 2020 08:48:25 -0500 Received: from foss.arm.com ([217.140.110.172]:39670 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728911AbgAWNsZ (ORCPT ); Thu, 23 Jan 2020 08:48:25 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7995BFEC; Thu, 23 Jan 2020 05:48:24 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by 
usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 773843F68E; Thu, 23 Jan 2020 05:48:23 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 04/30] Remove pci-shmem device Date: Thu, 23 Jan 2020 13:47:39 +0000 Message-Id: <20200123134805.1993-5-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The pci-shmem emulated device ("ivshmem") was created by QEMU for cross-VM data sharing. The only Linux driver that uses this device is the Android Virtual System on a Chip staging driver, which also mentions a character device driver implemented on top of shmem, which was removed from Linux. On the kvmtool side, the only commits touching the pci-shmem device since it was introduced in 2012 were made when refactoring various kvmtool subsystems. Let's remove the maintenance burden on the kvmtool maintainers and remove this unused device. 
Reviewed-by: Andre Przywara Signed-off-by: Alexandru Elisei --- Makefile | 1 - builtin-run.c | 5 - hw/pci-shmem.c | 400 ---------------------------------------- include/kvm/pci-shmem.h | 32 ---- 4 files changed, 438 deletions(-) delete mode 100644 hw/pci-shmem.c delete mode 100644 include/kvm/pci-shmem.h diff --git a/Makefile b/Makefile index 33eddcbb4d66..f75413e74819 100644 --- a/Makefile +++ b/Makefile @@ -99,7 +99,6 @@ OBJS += util/read-write.o OBJS += util/util.o OBJS += virtio/9p.o OBJS += virtio/9p-pdu.o -OBJS += hw/pci-shmem.o OBJS += kvm-ipc.o OBJS += builtin-sandbox.o OBJS += virtio/mmio.o diff --git a/builtin-run.c b/builtin-run.c index f8dc6c7229b0..9cb8c75300eb 100644 --- a/builtin-run.c +++ b/builtin-run.c @@ -31,7 +31,6 @@ #include "kvm/sdl.h" #include "kvm/vnc.h" #include "kvm/guest_compat.h" -#include "kvm/pci-shmem.h" #include "kvm/kvm-ipc.h" #include "kvm/builtin-debug.h" @@ -99,10 +98,6 @@ void kvm_run_set_wrapper_sandbox(void) OPT_INTEGER('c', "cpus", &(cfg)->nrcpus, "Number of CPUs"), \ OPT_U64('m', "mem", &(cfg)->ram_size, "Virtual machine memory" \ " size in MiB."), \ - OPT_CALLBACK('\0', "shmem", NULL, \ - "[pci:]:[:handle=][:create]", \ - "Share host shmem with guest via pci device", \ - shmem_parser, NULL), \ OPT_CALLBACK('d', "disk", kvm, "image or rootfs_dir", "Disk " \ " image or rootfs directory", img_name_parser, \ kvm), \ diff --git a/hw/pci-shmem.c b/hw/pci-shmem.c deleted file mode 100644 index f92bc75544d7..000000000000 --- a/hw/pci-shmem.c +++ /dev/null @@ -1,400 +0,0 @@ -#include "kvm/devices.h" -#include "kvm/pci-shmem.h" -#include "kvm/virtio-pci-dev.h" -#include "kvm/irq.h" -#include "kvm/kvm.h" -#include "kvm/pci.h" -#include "kvm/util.h" -#include "kvm/ioport.h" -#include "kvm/ioeventfd.h" - -#include -#include -#include -#include -#include - -#define MB_SHIFT (20) -#define KB_SHIFT (10) -#define GB_SHIFT (30) - -static struct pci_device_header pci_shmem_pci_device = { - .vendor_id = 
cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), - .device_id = cpu_to_le16(0x1110), - .header_type = PCI_HEADER_TYPE_NORMAL, - .class[2] = 0xFF, /* misc pci device */ - .status = cpu_to_le16(PCI_STATUS_CAP_LIST), - .capabilities = (void *)&pci_shmem_pci_device.msix - (void *)&pci_shmem_pci_device, - .msix.cap = PCI_CAP_ID_MSIX, - .msix.ctrl = cpu_to_le16(1), - .msix.table_offset = cpu_to_le32(1), /* Use BAR 1 */ - .msix.pba_offset = cpu_to_le32(0x1001), /* Use BAR 1 */ -}; - -static struct device_header pci_shmem_device = { - .bus_type = DEVICE_BUS_PCI, - .data = &pci_shmem_pci_device, -}; - -/* registers for the Inter-VM shared memory device */ -enum ivshmem_registers { - INTRMASK = 0, - INTRSTATUS = 4, - IVPOSITION = 8, - DOORBELL = 12, -}; - -static struct shmem_info *shmem_region; -static u16 ivshmem_registers; -static int local_fd; -static u32 local_id; -static u64 msix_block; -static u64 msix_pba; -static struct msix_table msix_table[2]; - -int pci_shmem__register_mem(struct shmem_info *si) -{ - if (!shmem_region) { - shmem_region = si; - } else { - pr_warning("only single shmem currently avail. 
ignoring.\n"); - free(si); - } - return 0; -} - -static bool shmem_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) -{ - u16 offset = port - ivshmem_registers; - - switch (offset) { - case INTRMASK: - break; - case INTRSTATUS: - break; - case IVPOSITION: - ioport__write32(data, local_id); - break; - case DOORBELL: - break; - }; - - return true; -} - -static bool shmem_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) -{ - u16 offset = port - ivshmem_registers; - - switch (offset) { - case INTRMASK: - break; - case INTRSTATUS: - break; - case IVPOSITION: - break; - case DOORBELL: - break; - }; - - return true; -} - -static struct ioport_operations shmem_pci__io_ops = { - .io_in = shmem_pci__io_in, - .io_out = shmem_pci__io_out, -}; - -static void callback_mmio_msix(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr) -{ - void *mem; - - if (addr - msix_block < 0x1000) - mem = &msix_table; - else - mem = &msix_pba; - - if (is_write) - memcpy(mem + addr - msix_block, data, len); - else - memcpy(data, mem + addr - msix_block, len); -} - -/* - * Return an irqfd which can be used by other guests to signal this guest - * whenever they need to poke it - */ -int pci_shmem__get_local_irqfd(struct kvm *kvm) -{ - int fd, gsi, r; - - if (local_fd == 0) { - fd = eventfd(0, 0); - if (fd < 0) - return fd; - - if (pci_shmem_pci_device.msix.ctrl & cpu_to_le16(PCI_MSIX_FLAGS_ENABLE)) { - gsi = irq__add_msix_route(kvm, &msix_table[0].msg, - pci_shmem_device.dev_num << 3); - if (gsi < 0) - return gsi; - } else { - gsi = pci_shmem_pci_device.irq_line; - } - - r = irq__add_irqfd(kvm, gsi, fd, -1); - if (r < 0) - return r; - - local_fd = fd; - } - - return local_fd; -} - -/* - * Connect a new client to ivshmem by adding the appropriate datamatch - * to the DOORBELL - */ -int pci_shmem__add_client(struct kvm *kvm, u32 id, int fd) -{ - struct kvm_ioeventfd ioevent; - - ioevent = (struct 
kvm_ioeventfd) { - .addr = ivshmem_registers + DOORBELL, - .len = sizeof(u32), - .datamatch = id, - .fd = fd, - .flags = KVM_IOEVENTFD_FLAG_PIO | KVM_IOEVENTFD_FLAG_DATAMATCH, - }; - - return ioctl(kvm->vm_fd, KVM_IOEVENTFD, &ioevent); -} - -/* - * Remove a client connected to ivshmem by removing the appropriate datamatch - * from the DOORBELL - */ -int pci_shmem__remove_client(struct kvm *kvm, u32 id) -{ - struct kvm_ioeventfd ioevent; - - ioevent = (struct kvm_ioeventfd) { - .addr = ivshmem_registers + DOORBELL, - .len = sizeof(u32), - .datamatch = id, - .flags = KVM_IOEVENTFD_FLAG_PIO - | KVM_IOEVENTFD_FLAG_DATAMATCH - | KVM_IOEVENTFD_FLAG_DEASSIGN, - }; - - return ioctl(kvm->vm_fd, KVM_IOEVENTFD, &ioevent); -} - -static void *setup_shmem(const char *key, size_t len, int creating) -{ - int fd; - int rtn; - void *mem; - int flag = O_RDWR; - - if (creating) - flag |= O_CREAT; - - fd = shm_open(key, flag, S_IRUSR | S_IWUSR); - if (fd < 0) { - pr_warning("Failed to open shared memory file %s\n", key); - return NULL; - } - - if (creating) { - rtn = ftruncate(fd, (off_t) len); - if (rtn < 0) - pr_warning("Can't ftruncate(fd,%zu)\n", len); - } - mem = mmap(NULL, len, - PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE, fd, 0); - if (mem == MAP_FAILED) { - pr_warning("Failed to mmap shared memory file"); - mem = NULL; - } - close(fd); - - return mem; -} - -int shmem_parser(const struct option *opt, const char *arg, int unset) -{ - const u64 default_size = SHMEM_DEFAULT_SIZE; - const u64 default_phys_addr = SHMEM_DEFAULT_ADDR; - const char *default_handle = SHMEM_DEFAULT_HANDLE; - struct shmem_info *si = malloc(sizeof(struct shmem_info)); - u64 phys_addr; - u64 size; - char *handle = NULL; - int create = 0; - const char *p = arg; - char *next; - int base = 10; - int verbose = 0; - - const int skip_pci = strlen("pci:"); - if (verbose) - pr_info("shmem_parser(%p,%s,%d)", opt, arg, unset); - /* parse out optional addr family */ - if (strcasestr(p, "pci:")) { - p += 
skip_pci; - } else if (strcasestr(p, "mem:")) { - die("I can't add to E820 map yet.\n"); - } - /* parse out physical addr */ - base = 10; - if (strcasestr(p, "0x")) - base = 16; - phys_addr = strtoll(p, &next, base); - if (next == p && phys_addr == 0) { - pr_info("shmem: no physical addr specified, using default."); - phys_addr = default_phys_addr; - } - if (*next != ':' && *next != '\0') - die("shmem: unexpected chars after phys addr.\n"); - if (*next == '\0') - p = next; - else - p = next + 1; - /* parse out size */ - base = 10; - if (strcasestr(p, "0x")) - base = 16; - size = strtoll(p, &next, base); - if (next == p && size == 0) { - pr_info("shmem: no size specified, using default."); - size = default_size; - } - /* look for [KMGkmg][Bb]* uses base 2. */ - int skip_B = 0; - if (strspn(next, "KMGkmg")) { /* might have a prefix */ - if (*(next + 1) == 'B' || *(next + 1) == 'b') - skip_B = 1; - switch (*next) { - case 'K': - case 'k': - size = size << KB_SHIFT; - break; - case 'M': - case 'm': - size = size << MB_SHIFT; - break; - case 'G': - case 'g': - size = size << GB_SHIFT; - break; - default: - die("shmem: bug in detecting size prefix."); - break; - } - next += 1 + skip_B; - } - if (*next != ':' && *next != '\0') { - die("shmem: unexpected chars after phys size. <%c><%c>\n", - *next, *p); - } - if (*next == '\0') - p = next; - else - p = next + 1; - /* parse out optional shmem handle */ - const int skip_handle = strlen("handle="); - next = strcasestr(p, "handle="); - if (*p && next) { - if (p != next) - die("unexpected chars before handle\n"); - p += skip_handle; - next = strchrnul(p, ':'); - if (next - p) { - handle = malloc(next - p + 1); - strncpy(handle, p, next - p); - handle[next - p] = '\0'; /* just in case. */ - } - if (*next == '\0') - p = next; - else - p = next + 1; - } - /* parse optional create flag to see if we should create shm seg. 
*/ - if (*p && strcasestr(p, "create")) { - create = 1; - p += strlen("create"); - } - if (*p != '\0') - die("shmem: unexpected trailing chars\n"); - if (handle == NULL) { - handle = malloc(strlen(default_handle) + 1); - strcpy(handle, default_handle); - } - if (verbose) { - pr_info("shmem: phys_addr = %llx", - (unsigned long long)phys_addr); - pr_info("shmem: size = %llx", (unsigned long long)size); - pr_info("shmem: handle = %s", handle); - pr_info("shmem: create = %d", create); - } - - si->phys_addr = phys_addr; - si->size = size; - si->handle = handle; - si->create = create; - pci_shmem__register_mem(si); /* ownership of si, etc. passed on. */ - return 0; -} - -int pci_shmem__init(struct kvm *kvm) -{ - char *mem; - int r; - - if (shmem_region == NULL) - return 0; - - /* Register MMIO space for MSI-X */ - r = ioport__register(kvm, IOPORT_EMPTY, &shmem_pci__io_ops, IOPORT_SIZE, NULL); - if (r < 0) - return r; - ivshmem_registers = (u16)r; - - msix_block = pci_get_io_space_block(0x1010); - kvm__register_mmio(kvm, msix_block, 0x1010, false, callback_mmio_msix, NULL); - - /* - * This registers 3 BARs: - * - * 0 - ivshmem registers - * 1 - MSI-X MMIO space - * 2 - Shared memory block - */ - pci_shmem_pci_device.bar[0] = cpu_to_le32(ivshmem_registers | PCI_BASE_ADDRESS_SPACE_IO); - pci_shmem_pci_device.bar_size[0] = shmem_region->size; - pci_shmem_pci_device.bar[1] = cpu_to_le32(msix_block | PCI_BASE_ADDRESS_SPACE_MEMORY); - pci_shmem_pci_device.bar_size[1] = 0x1010; - pci_shmem_pci_device.bar[2] = cpu_to_le32(shmem_region->phys_addr | PCI_BASE_ADDRESS_SPACE_MEMORY); - pci_shmem_pci_device.bar_size[2] = shmem_region->size; - - device__register(&pci_shmem_device); - - /* Open shared memory and plug it into the guest */ - mem = setup_shmem(shmem_region->handle, shmem_region->size, - shmem_region->create); - if (mem == NULL) - return -EINVAL; - - kvm__register_dev_mem(kvm, shmem_region->phys_addr, shmem_region->size, - mem); - return 0; -} -dev_init(pci_shmem__init); - 
-int pci_shmem__exit(struct kvm *kvm) -{ - return 0; -} -dev_exit(pci_shmem__exit); diff --git a/include/kvm/pci-shmem.h b/include/kvm/pci-shmem.h deleted file mode 100644 index 6cff2b85bfd3..000000000000 --- a/include/kvm/pci-shmem.h +++ /dev/null @@ -1,32 +0,0 @@ -#ifndef KVM__PCI_SHMEM_H -#define KVM__PCI_SHMEM_H - -#include -#include - -#include "kvm/parse-options.h" - -#define SHMEM_DEFAULT_SIZE (16 << MB_SHIFT) -#define SHMEM_DEFAULT_ADDR (0xc8000000) -#define SHMEM_DEFAULT_HANDLE "/kvm_shmem" - -struct kvm; -struct shmem_info; - -struct shmem_info { - u64 phys_addr; - u64 size; - char *handle; - int create; -}; - -int pci_shmem__init(struct kvm *kvm); -int pci_shmem__exit(struct kvm *kvm); -int pci_shmem__register_mem(struct shmem_info *si); -int shmem_parser(const struct option *opt, const char *arg, int unset); - -int pci_shmem__get_local_irqfd(struct kvm *kvm); -int pci_shmem__add_client(struct kvm *kvm, u32 id, int fd); -int pci_shmem__remove_client(struct kvm *kvm, u32 id); - -#endif From patchwork Thu Jan 23 13:47:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347915 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 68B6017EF for ; Thu, 23 Jan 2020 13:48:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5053820661 for ; Thu, 23 Jan 2020 13:48:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729017AbgAWNs1 (ORCPT ); Thu, 23 Jan 2020 08:48:27 -0500 Received: from foss.arm.com ([217.140.110.172]:39680 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729028AbgAWNs0 (ORCPT ); Thu, 23 Jan 2020 08:48:26 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by 
usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AF4501007; Thu, 23 Jan 2020 05:48:25 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AE9A13F68E; Thu, 23 Jan 2020 05:48:24 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 05/30] Check that a PCI device's memory size is power of two Date: Thu, 23 Jan 2020 13:47:40 +0000 Message-Id: <20200123134805.1993-6-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org According to the PCI local bus specification [1], a device's memory size must be a power of two. This is also implicit in the mechanism that a CPU uses to get the memory size requirement for a PCI device. The vesa device requests a memory size that isn't a power of two. According to the same spec [1], a device is allowed to consume more memory than it actually requires. As a result, the amount of memory that the vesa device now reserves has been increased. To prevent slip-ups in the future, a few BUILD_BUG_ON statements were added in places where the memory size is known at compile time. 
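The compile-time check described above can be sketched in standalone C11 as follows. This is a hedged illustration rather than the kvmtool code: it uses the standard static_assert in place of BUILD_BUG_ON, the FB_* names and next_power_of_two() are hypothetical, and the 640-pixel width is an assumption, since the hunk only shows the 480-pixel height and 32 bpp.

```c
#include <assert.h>

/* Same expression as the is_power_of_two() macro added by the patch. */
#define is_power_of_two(x) ((x) > 0 ? ((x) & ((x) - 1)) == 0 : 0)

/* Hypothetical names; the width of 640 is an assumption. */
#define FB_WIDTH	640
#define FB_HEIGHT	480
#define FB_BPP		32
#define FB_BYTES	(FB_BPP / 8 * FB_WIDTH * FB_HEIGHT)	/* 1228800 bytes */
#define FB_BAR_SIZE	(1 << 21)				/* 2 MiB */

/* The framebuffer size itself is not a power of two ... */
static_assert(!is_power_of_two(FB_BYTES), "framebuffer size is a power of two");
/* ... so the BAR advertises a power-of-two size that still covers it. */
static_assert(is_power_of_two(FB_BAR_SIZE), "BAR size must be a power of two");
static_assert(FB_BAR_SIZE >= FB_BYTES, "BAR too small for the framebuffer");

/* Smallest power of two that is >= x, for x >= 1 (simple loop, no overflow check). */
static unsigned long next_power_of_two(unsigned long x)
{
	unsigned long p = 1;

	while (p < x)
		p <<= 1;
	return p;
}
```

Under these assumptions next_power_of_two(FB_BYTES) equals 1 << 21, so the rounded-up size is the smallest power of two that still holds the whole framebuffer.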
[1] PCI Local Bus Specification Revision 3.0, section 6.2.5.1 Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- hw/vesa.c | 3 +++ include/kvm/util.h | 2 ++ include/kvm/vesa.h | 6 +++++- virtio/pci.c | 3 +++ 4 files changed, 13 insertions(+), 1 deletion(-) diff --git a/hw/vesa.c b/hw/vesa.c index f3c5114cf4fe..d75b4b316a1e 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -58,6 +58,9 @@ struct framebuffer *vesa__init(struct kvm *kvm) char *mem; int r; + BUILD_BUG_ON(!is_power_of_two(VESA_MEM_SIZE)); + BUILD_BUG_ON(VESA_MEM_SIZE < VESA_BPP/8 * VESA_WIDTH * VESA_HEIGHT); + if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; diff --git a/include/kvm/util.h b/include/kvm/util.h index 4ca7aa9392b6..199724c4018c 100644 --- a/include/kvm/util.h +++ b/include/kvm/util.h @@ -104,6 +104,8 @@ static inline unsigned long roundup_pow_of_two(unsigned long x) return x ? 1UL << fls_long(x - 1) : 0; } +#define is_power_of_two(x) ((x) > 0 ? ((x) & ((x) - 1)) == 0 : 0) + struct kvm; void *mmap_hugetlbfs(struct kvm *kvm, const char *htlbfs_path, u64 size); void *mmap_anon_or_hugetlbfs(struct kvm *kvm, const char *hugetlbfs_path, u64 size); diff --git a/include/kvm/vesa.h b/include/kvm/vesa.h index 0fac11ab5a9f..e7d971343642 100644 --- a/include/kvm/vesa.h +++ b/include/kvm/vesa.h @@ -5,8 +5,12 @@ #define VESA_HEIGHT 480 #define VESA_MEM_ADDR 0xd0000000 -#define VESA_MEM_SIZE (4*VESA_WIDTH*VESA_HEIGHT) #define VESA_BPP 32 +/* + * We actually only need VESA_BPP/8*VESA_WIDTH*VESA_HEIGHT bytes. But the memory + * size must be a power of 2, so we round up. 
+ */ +#define VESA_MEM_SIZE (1 << 21) struct kvm; struct biosregs; diff --git a/virtio/pci.c b/virtio/pci.c index 99653cad2c0f..04e801827df9 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -435,6 +435,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vpci->kvm = kvm; vpci->dev = dev; + BUILD_BUG_ON(!is_power_of_two(IOPORT_SIZE)); + BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); + r = ioport__register(kvm, IOPORT_EMPTY, &virtio_pci__io_ops, IOPORT_SIZE, vdev); if (r < 0) return r; From patchwork Thu Jan 23 13:47:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347917 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E609192A for ; Thu, 23 Jan 2020 13:48:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CEFD824684 for ; Thu, 23 Jan 2020 13:48:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729028AbgAWNs1 (ORCPT ); Thu, 23 Jan 2020 08:48:27 -0500 Received: from foss.arm.com ([217.140.110.172]:39688 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728899AbgAWNs1 (ORCPT ); Thu, 23 Jan 2020 08:48:27 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id F1D99FEC; Thu, 23 Jan 2020 05:48:26 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E804B3F68E; Thu, 23 Jan 2020 05:48:25 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH 
v2 kvmtool 06/30] arm/pci: Advertise only PCI bus 0 in the DT Date: Thu, 23 Jan 2020 13:47:41 +0000 Message-Id: <20200123134805.1993-7-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The "bus-range" property encodes the PCI bus number of the PCI controller and the largest bus number of any PCI buses that are subordinate to this node [1]. kvmtool emulates only PCI bus 0. Advertise this in the PCI DT node by setting "bus-range" to <0,0>. [1] IEEE Std 1275-1994, Section 3 "Bus Nodes Properties and Methods" Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- arm/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arm/pci.c b/arm/pci.c index 557cfa98938d..ed325fa4a811 100644 --- a/arm/pci.c +++ b/arm/pci.c @@ -30,7 +30,7 @@ void pci__generate_fdt_nodes(void *fdt) struct of_interrupt_map_entry irq_map[OF_PCI_IRQ_MAP_MAX]; unsigned nentries = 0; /* Bus range */ - u32 bus_range[] = { cpu_to_fdt32(0), cpu_to_fdt32(1), }; + u32 bus_range[] = { cpu_to_fdt32(0), cpu_to_fdt32(0), }; /* Configuration Space */ u64 cfg_reg_prop[] = { cpu_to_fdt64(KVM_PCI_CFG_AREA), cpu_to_fdt64(ARM_PCI_CFG_SIZE), }; From patchwork Thu Jan 23 13:47:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347921 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2BF5892A for ; Thu, 23 Jan 2020 13:48:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0B12424686 for ; Thu, 23 Jan 2020 13:48:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1729066AbgAWNsa (ORCPT ); Thu, 23 Jan 2020 08:48:30 -0500 Received: from foss.arm.com ([217.140.110.172]:39696 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729030AbgAWNs3 (ORCPT ); Thu, 23 Jan 2020 08:48:29 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 501E81007; Thu, 23 Jan 2020 05:48:28 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 340363F68E; Thu, 23 Jan 2020 05:48:27 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org, Julien Thierry Subject: [PATCH v2 kvmtool 07/30] ioport: pci: Move port allocations to PCI devices Date: Thu, 23 Jan 2020 13:47:42 +0000 Message-Id: <20200123134805.1993-8-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Julien Thierry The dynamic ioport allocation with IOPORT_EMPTY is currently only used by PCI devices. Other devices use fixed ports for which they request registration to the ioport API. PCI ports need to be in the PCI IO space and there is no reason the ioport API should know a PCI port is being allocated and needs to be placed in PCI IO space. This currently just happens to be the case. Move the responsibility of dynamic allocation of ioports from the ioport API to PCI. In the future, if other types of devices also need dynamic ioport allocation, they'll have to figure out the range of ports they are allowed to use.
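As an illustration of the allocation scheme described above, here is a minimal standalone sketch of a bump allocator for I/O port blocks. It mirrors the names used in this patch (pci_get_io_port_block, PCI_IOPORT_START, IOPORT_SIZE), but the ALIGN_UP macro and the range-checking assert are assumptions added for the sketch and are not part of the patch; kvmtool uses its own ALIGN() helper and does no exhaustion check here.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_IOPORT_START 0x6200u /* first port available to PCI devices */
#define IOPORT_SIZE      0x400u  /* alignment granule, a power of two */

/* Assumed helper: round x up to a multiple of a (a is a power of two). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Next free port; only ever grows, blocks are never returned. */
static uint16_t io_port_blocks = PCI_IOPORT_START;

/* Hand out the next IOPORT_SIZE-aligned block of `size` ports. */
uint16_t pci_get_io_port_block(uint32_t size)
{
	uint32_t port = ALIGN_UP((uint32_t)io_port_blocks, IOPORT_SIZE);

	/* Not in the patch: refuse to run past the 16-bit port space. */
	assert(port <= UINT16_MAX && size <= UINT16_MAX - port);

	io_port_blocks = (uint16_t)(port + size);
	return (uint16_t)port;
}
```

With these values, 0x6200 rounded up to a multiple of 0x400 is 0x6400, so two consecutive 0x400-byte requests would get 0x6400 and then 0x6800. A caller would pass the returned port straight to ioport__register(), as the hunks in this patch do.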
Signed-off-by: Julien Thierry [Renamed functions for clarity] Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- hw/vesa.c | 4 ++-- include/kvm/ioport.h | 3 --- include/kvm/pci.h | 4 +++- ioport.c | 18 ------------------ pci.c | 17 +++++++++++++---- powerpc/include/kvm/kvm-arch.h | 2 +- vfio/core.c | 6 ++++-- vfio/pci.c | 4 ++-- virtio/pci.c | 7 ++++--- x86/include/kvm/kvm-arch.h | 2 +- 10 files changed, 30 insertions(+), 37 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index d75b4b316a1e..24fb46faad3b 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -63,8 +63,8 @@ struct framebuffer *vesa__init(struct kvm *kvm) if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; - - r = ioport__register(kvm, IOPORT_EMPTY, &vesa_io_ops, IOPORT_SIZE, NULL); + r = pci_get_io_port_block(IOPORT_SIZE); + r = ioport__register(kvm, r, &vesa_io_ops, IOPORT_SIZE, NULL); if (r < 0) return ERR_PTR(r); diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h index db52a479742b..b10fcd5b4412 100644 --- a/include/kvm/ioport.h +++ b/include/kvm/ioport.h @@ -14,11 +14,8 @@ /* some ports we reserve for own use */ #define IOPORT_DBG 0xe0 -#define IOPORT_START 0x6200 #define IOPORT_SIZE 0x400 -#define IOPORT_EMPTY USHRT_MAX - struct kvm; struct ioport { diff --git a/include/kvm/pci.h b/include/kvm/pci.h index a86c15a70e6d..ccb155e3e8fe 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -19,6 +19,7 @@ #define PCI_CONFIG_DATA 0xcfc #define PCI_CONFIG_BUS_FORWARD 0xcfa #define PCI_IO_SIZE 0x100 +#define PCI_IOPORT_START 0x6200 #define PCI_CFG_SIZE (1ULL << 24) struct kvm; @@ -152,7 +153,8 @@ struct pci_device_header { int pci__init(struct kvm *kvm); int pci__exit(struct kvm *kvm); struct pci_device_header *pci__find_dev(u8 dev_num); -u32 pci_get_io_space_block(u32 size); +u32 pci_get_mmio_block(u32 size); +u16 pci_get_io_port_block(u32 size); void pci__assign_irq(struct device_header *dev_hdr); void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, 
int size); void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size); diff --git a/ioport.c b/ioport.c index a6dc65e3e6c6..a72e4035881a 100644 --- a/ioport.c +++ b/ioport.c @@ -16,24 +16,8 @@ #define ioport_node(n) rb_entry(n, struct ioport, node) -DEFINE_MUTEX(ioport_mutex); - -static u16 free_io_port_idx; /* protected by ioport_mutex */ - static struct rb_root ioport_tree = RB_ROOT; -static u16 ioport__find_free_port(void) -{ - u16 free_port; - - mutex_lock(&ioport_mutex); - free_port = IOPORT_START + free_io_port_idx * IOPORT_SIZE; - free_io_port_idx++; - mutex_unlock(&ioport_mutex); - - return free_port; -} - static struct ioport *ioport_search(struct rb_root *root, u64 addr) { struct rb_int_node *node; @@ -85,8 +69,6 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i int r; br_write_lock(kvm); - if (port == IOPORT_EMPTY) - port = ioport__find_free_port(); entry = ioport_search(&ioport_tree, port); if (entry) { diff --git a/pci.c b/pci.c index 3198732935eb..80b5c5d3d7f3 100644 --- a/pci.c +++ b/pci.c @@ -15,15 +15,24 @@ static u32 pci_config_address_bits; * (That's why it can still 32bit even with 64bit guests-- 64bit * PCI isn't currently supported.) */ -static u32 io_space_blocks = KVM_PCI_MMIO_AREA; +static u32 mmio_blocks = KVM_PCI_MMIO_AREA; +static u16 io_port_blocks = PCI_IOPORT_START; + +u16 pci_get_io_port_block(u32 size) +{ + u16 port = ALIGN(io_port_blocks, IOPORT_SIZE); + + io_port_blocks = port + size; + return port; +} /* * BARs must be naturally aligned, so enforce this in the allocator. 
*/ -u32 pci_get_io_space_block(u32 size) +u32 pci_get_mmio_block(u32 size) { - u32 block = ALIGN(io_space_blocks, size); - io_space_blocks = block + size; + u32 block = ALIGN(mmio_blocks, size); + mmio_blocks = block + size; return block; } diff --git a/powerpc/include/kvm/kvm-arch.h b/powerpc/include/kvm/kvm-arch.h index 8126b96cb66a..26d440b22bdd 100644 --- a/powerpc/include/kvm/kvm-arch.h +++ b/powerpc/include/kvm/kvm-arch.h @@ -34,7 +34,7 @@ #define KVM_MMIO_START PPC_MMIO_START /* - * This is the address that pci_get_io_space_block() starts allocating + * This is the address that pci_get_io_port_block() starts allocating * from. Note that this is a PCI bus address. */ #define KVM_IOPORT_AREA 0x0 diff --git a/vfio/core.c b/vfio/core.c index 17b5b0cfc9ac..0ed1e6fee6bf 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -202,8 +202,10 @@ static int vfio_setup_trap_region(struct kvm *kvm, struct vfio_device *vdev, struct vfio_region *region) { if (region->is_ioport) { - int port = ioport__register(kvm, IOPORT_EMPTY, &vfio_ioport_ops, - region->info.size, region); + int port = pci_get_io_port_block(region->info.size); + + port = ioport__register(kvm, port, &vfio_ioport_ops, + region->info.size, region); if (port < 0) return port; diff --git a/vfio/pci.c b/vfio/pci.c index 76e24c156906..8e5d8572bc0c 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -750,7 +750,7 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, * powers of two. 
*/ mmio_size = roundup_pow_of_two(table->size + pba->size); - table->guest_phys_addr = pci_get_io_space_block(mmio_size); + table->guest_phys_addr = pci_get_mmio_block(mmio_size); if (!table->guest_phys_addr) { pr_err("cannot allocate IO space"); ret = -ENOMEM; @@ -846,7 +846,7 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, if (!region->is_ioport) { /* Grab some MMIO space in the guest */ map_size = ALIGN(region->info.size, PAGE_SIZE); - region->guest_phys_addr = pci_get_io_space_block(map_size); + region->guest_phys_addr = pci_get_mmio_block(map_size); } /* Map the BARs into the guest or setup a trap region. */ diff --git a/virtio/pci.c b/virtio/pci.c index 04e801827df9..d73414abde05 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -438,18 +438,19 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, BUILD_BUG_ON(!is_power_of_two(IOPORT_SIZE)); BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); - r = ioport__register(kvm, IOPORT_EMPTY, &virtio_pci__io_ops, IOPORT_SIZE, vdev); + r = pci_get_io_port_block(IOPORT_SIZE); + r = ioport__register(kvm, r, &virtio_pci__io_ops, IOPORT_SIZE, vdev); if (r < 0) return r; vpci->port_addr = (u16)r; - vpci->mmio_addr = pci_get_io_space_block(IOPORT_SIZE); + vpci->mmio_addr = pci_get_mmio_block(IOPORT_SIZE); r = kvm__register_mmio(kvm, vpci->mmio_addr, IOPORT_SIZE, false, virtio_pci__io_mmio_callback, vpci); if (r < 0) goto free_ioport; - vpci->msix_io_block = pci_get_io_space_block(PCI_IO_SIZE * 2); + vpci->msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); r = kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE * 2, false, virtio_pci__msix_mmio_callback, vpci); if (r < 0) diff --git a/x86/include/kvm/kvm-arch.h b/x86/include/kvm/kvm-arch.h index bfdd3438a9de..85cd336c7577 100644 --- a/x86/include/kvm/kvm-arch.h +++ b/x86/include/kvm/kvm-arch.h @@ -16,7 +16,7 @@ #define KVM_MMIO_START KVM_32BIT_GAP_START -/* This is the address that pci_get_io_space_block() starts 
allocating +/* This is the address that pci_get_io_port_block() starts allocating * from. Note that this is a PCI bus address (though same on x86). */ #define KVM_IOPORT_AREA 0x0 From patchwork Thu Jan 23 13:47:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347919 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EBBDA17EF for ; Thu, 23 Jan 2020 13:48:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CCC2D2468A for ; Thu, 23 Jan 2020 13:48:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729078AbgAWNsb (ORCPT ); Thu, 23 Jan 2020 08:48:31 -0500 Received: from foss.arm.com ([217.140.110.172]:39704 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729049AbgAWNsa (ORCPT ); Thu, 23 Jan 2020 08:48:30 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AE76111B3; Thu, 23 Jan 2020 05:48:29 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 8D5473F68E; Thu, 23 Jan 2020 05:48:28 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org, Julien Thierry Subject: [PATCH v2 kvmtool 08/30] pci: Fix ioport allocation size Date: Thu, 23 Jan 2020 13:47:43 +0000 Message-Id: <20200123134805.1993-9-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 
1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Julien Thierry The PCI Local Bus Specification, Rev. 3.0, Section 6.2.5.1. "Address Maps" states: "Devices that map control functions into I/O Space must not consume more than 256 bytes per I/O Base Address register." Yet all the PCI devices allocate IO ports of IOPORT_SIZE (= 1024 bytes). Fix this by having PCI devices use 256-byte ports for IO BARs. There is no hard requirement on the size of the memory region described by memory BARs. Since BAR 1 is supposed to offer the same functionality as IO ports, let's make its size match BAR 0. Signed-off-by: Julien Thierry [Added rationale for changing BAR1 size to PCI_IO_SIZE] Signed-off-by: Alexandru Elisei --- hw/vesa.c | 4 ++-- include/kvm/ioport.h | 1 - pci.c | 2 +- virtio/pci.c | 15 +++++++-------- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index 24fb46faad3b..d8d91aa9c873 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -63,8 +63,8 @@ struct framebuffer *vesa__init(struct kvm *kvm) if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; - r = pci_get_io_port_block(IOPORT_SIZE); - r = ioport__register(kvm, r, &vesa_io_ops, IOPORT_SIZE, NULL); + r = pci_get_io_port_block(PCI_IO_SIZE); + r = ioport__register(kvm, r, &vesa_io_ops, PCI_IO_SIZE, NULL); if (r < 0) return ERR_PTR(r); diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h index b10fcd5b4412..8c86b7151f25 100644 --- a/include/kvm/ioport.h +++ b/include/kvm/ioport.h @@ -14,7 +14,6 @@ /* some ports we reserve for own use */ #define IOPORT_DBG 0xe0 -#define IOPORT_SIZE 0x400 struct kvm; diff --git a/pci.c b/pci.c index 80b5c5d3d7f3..b6892d974c08 100644 --- a/pci.c +++ b/pci.c @@ -20,7 +20,7 @@ static u16 io_port_blocks = PCI_IOPORT_START; u16 pci_get_io_port_block(u32 size) { - u16 port = ALIGN(io_port_blocks, IOPORT_SIZE); + u16 port = ALIGN(io_port_blocks, PCI_IO_SIZE); io_port_blocks = port + size; return
port; diff --git a/virtio/pci.c b/virtio/pci.c index d73414abde05..eeb5b5efa6e1 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -421,7 +421,7 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, { struct virtio_pci *vpci = ptr; int direction = is_write ? KVM_EXIT_IO_OUT : KVM_EXIT_IO_IN; - u16 port = vpci->port_addr + (addr & (IOPORT_SIZE - 1)); + u16 port = vpci->port_addr + (addr & (PCI_IO_SIZE - 1)); kvm__emulate_io(vcpu, port, data, direction, len, 1); } @@ -435,17 +435,16 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vpci->kvm = kvm; vpci->dev = dev; - BUILD_BUG_ON(!is_power_of_two(IOPORT_SIZE)); BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); - r = pci_get_io_port_block(IOPORT_SIZE); - r = ioport__register(kvm, r, &virtio_pci__io_ops, IOPORT_SIZE, vdev); + r = pci_get_io_port_block(PCI_IO_SIZE); + r = ioport__register(kvm, r, &virtio_pci__io_ops, PCI_IO_SIZE, vdev); if (r < 0) return r; vpci->port_addr = (u16)r; - vpci->mmio_addr = pci_get_mmio_block(IOPORT_SIZE); - r = kvm__register_mmio(kvm, vpci->mmio_addr, IOPORT_SIZE, false, + vpci->mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); + r = kvm__register_mmio(kvm, vpci->mmio_addr, PCI_IO_SIZE, false, virtio_pci__io_mmio_callback, vpci); if (r < 0) goto free_ioport; @@ -475,8 +474,8 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, | PCI_BASE_ADDRESS_SPACE_MEMORY), .status = cpu_to_le16(PCI_STATUS_CAP_LIST), .capabilities = (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr, - .bar_size[0] = cpu_to_le32(IOPORT_SIZE), - .bar_size[1] = cpu_to_le32(IOPORT_SIZE), + .bar_size[0] = cpu_to_le32(PCI_IO_SIZE), + .bar_size[1] = cpu_to_le32(PCI_IO_SIZE), .bar_size[2] = cpu_to_le32(PCI_IO_SIZE*2), }; From patchwork Thu Jan 23 13:47:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347923 Return-Path: Received: from mail.kernel.org 
(pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6844792A for ; Thu, 23 Jan 2020 13:48:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 500AC24688 for ; Thu, 23 Jan 2020 13:48:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729049AbgAWNsc (ORCPT ); Thu, 23 Jan 2020 08:48:32 -0500 Received: from foss.arm.com ([217.140.110.172]:39716 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729083AbgAWNsb (ORCPT ); Thu, 23 Jan 2020 08:48:31 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 158D5FEC; Thu, 23 Jan 2020 05:48:31 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E53893F68E; Thu, 23 Jan 2020 05:48:29 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org, Julien Thierry Subject: [PATCH v2 kvmtool 09/30] arm/pci: Fix PCI IO region Date: Thu, 23 Jan 2020 13:47:44 +0000 Message-Id: <20200123134805.1993-10-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Julien Thierry The PCI IO region currently exposed through the DT contains ports that are reserved by non-PCI devices. Use the proper PCI IO start so that the region exposed through the DT can actually be used to reassign device BARs.
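To make the arithmetic behind the new region concrete, the following sketch recomputes the start, size and alignment padding that this patch derives with ARM_PCI_IO_START and ARM_PCI_IO_SIZE. The helper function names and the ALIGN_UP macro are hypothetical and exist only for this illustration. The value of ARM_MMIO_AREA is not stated in this patch; 0x10000 is an assumption, chosen because it matches the "IO 0x00007000..0x0000ffff" window printed in the boot log quoted in patch 11 of this series.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K            0x1000u
#define PCI_IOPORT_START 0x6200u  /* from include/kvm/pci.h (patch 07) */
#define ARM_MMIO_AREA    0x10000u /* ASSUMPTION: inferred from the boot log */

/* Assumed helper: round x up to a multiple of a (a is a power of two). */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

static_assert((SZ_4K & (SZ_4K - 1)) == 0, "SZ_4K must be a power of two");

/* First 4K-aligned port at or above PCI_IOPORT_START. */
uint32_t arm_pci_io_start(void)
{
	return ALIGN_UP(PCI_IOPORT_START, SZ_4K);
}

/* Size of the IO window, truncated to a multiple of 4K. */
uint32_t arm_pci_io_size(void)
{
	return (ARM_MMIO_AREA - arm_pci_io_start()) & ~(SZ_4K - 1);
}

/* Ports to skip so that PCI allocation begins at the aligned start. */
uint32_t arm_pci_align_pad(void)
{
	return arm_pci_io_start() - PCI_IOPORT_START;
}
```

Under these assumptions the window starts at 0x7000 and is 0x9000 bytes long, and 0xe00 ports are skipped. That padding is the value pci__arm_init() passes to pci_get_io_port_block() so that the first real device allocation lands on the aligned start.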
Signed-off-by: Julien Thierry Signed-off-by: Alexandru Elisei --- arm/include/arm-common/pci.h | 1 + arm/kvm.c | 3 +++ arm/pci.c | 21 ++++++++++++++++++--- 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/arm/include/arm-common/pci.h b/arm/include/arm-common/pci.h index 9008a0ed072e..aea42b8895e9 100644 --- a/arm/include/arm-common/pci.h +++ b/arm/include/arm-common/pci.h @@ -1,6 +1,7 @@ #ifndef ARM_COMMON__PCI_H #define ARM_COMMON__PCI_H +void pci__arm_init(struct kvm *kvm); void pci__generate_fdt_nodes(void *fdt); #endif /* ARM_COMMON__PCI_H */ diff --git a/arm/kvm.c b/arm/kvm.c index 1f85fc60588f..5c30ec1e0515 100644 --- a/arm/kvm.c +++ b/arm/kvm.c @@ -6,6 +6,7 @@ #include "kvm/fdt.h" #include "arm-common/gic.h" +#include "arm-common/pci.h" #include #include @@ -86,6 +87,8 @@ void kvm__arch_init(struct kvm *kvm, const char *hugetlbfs_path, u64 ram_size) /* Create the virtual GIC. */ if (gic__create(kvm, kvm->cfg.arch.irqchip)) die("Failed to create virtual GIC"); + + pci__arm_init(kvm); } #define FDT_ALIGN SZ_2M diff --git a/arm/pci.c b/arm/pci.c index ed325fa4a811..1c0949a22408 100644 --- a/arm/pci.c +++ b/arm/pci.c @@ -1,3 +1,5 @@ +#include "linux/sizes.h" + #include "kvm/devices.h" #include "kvm/fdt.h" #include "kvm/kvm.h" @@ -7,6 +9,11 @@ #include "arm-common/pci.h" +#define ARM_PCI_IO_START ALIGN(PCI_IOPORT_START, SZ_4K) + +/* Must be a multiple of 4k */ +#define ARM_PCI_IO_SIZE ((ARM_MMIO_AREA - ARM_PCI_IO_START) & ~(SZ_4K - 1)) + /* * An entry in the interrupt-map table looks like: * @@ -24,6 +31,14 @@ struct of_interrupt_map_entry { struct of_gic_irq gic_irq; } __attribute__((packed)); +void pci__arm_init(struct kvm *kvm) +{ + u32 align_pad = ARM_PCI_IO_START - PCI_IOPORT_START; + + /* Make PCI port allocation start at a properly aligned address */ + pci_get_io_port_block(align_pad); +} + void pci__generate_fdt_nodes(void *fdt) { struct device_header *dev_hdr; @@ -40,10 +55,10 @@ void pci__generate_fdt_nodes(void *fdt) .pci_addr = { .hi = 
cpu_to_fdt32(of_pci_b_ss(OF_PCI_SS_IO)), .mid = 0, - .lo = 0, + .lo = cpu_to_fdt32(ARM_PCI_IO_START), }, - .cpu_addr = cpu_to_fdt64(KVM_IOPORT_AREA), - .length = cpu_to_fdt64(ARM_IOPORT_SIZE), + .cpu_addr = cpu_to_fdt64(ARM_PCI_IO_START), + .length = cpu_to_fdt64(ARM_PCI_IO_SIZE), }, { .pci_addr = { From patchwork Thu Jan 23 13:47:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347925 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A54992A for ; Thu, 23 Jan 2020 13:48:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 792AD2468B for ; Thu, 23 Jan 2020 13:48:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729043AbgAWNsd (ORCPT ); Thu, 23 Jan 2020 08:48:33 -0500 Received: from foss.arm.com ([217.140.110.172]:39726 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728981AbgAWNsd (ORCPT ); Thu, 23 Jan 2020 08:48:33 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7081CFEC; Thu, 23 Jan 2020 05:48:32 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 4DD963F68E; Thu, 23 Jan 2020 05:48:31 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org, Julien Thierry Subject: [PATCH v2 kvmtool 10/30] virtio/pci: Make memory and IO BARs independent Date: Thu, 23 Jan 2020 13:47:45 +0000 Message-Id: <20200123134805.1993-11-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Julien Thierry Currently, callbacks for memory BAR 1 call the IO port emulation. This means that the memory BAR needs I/O Space to be enabled whenever Memory Space is enabled. Refactor the code so the two types of BARs are independent. Also, unify ioport/mmio callback arguments so that they all receive a virtio_device. Signed-off-by: Julien Thierry Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- virtio/pci.c | 71 +++++++++++++++++++++++++++++++++++----------------- 1 file changed, 48 insertions(+), 23 deletions(-) diff --git a/virtio/pci.c b/virtio/pci.c index eeb5b5efa6e1..6723a1f3a84d 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -87,8 +87,8 @@ static inline bool virtio_pci__msix_enabled(struct virtio_pci *vpci) return vpci->pci_hdr.msix.ctrl & cpu_to_le16(PCI_MSIX_FLAGS_ENABLE); } -static bool virtio_pci__specific_io_in(struct kvm *kvm, struct virtio_device *vdev, u16 port, - void *data, int size, int offset) +static bool virtio_pci__specific_data_in(struct kvm *kvm, struct virtio_device *vdev, + void *data, int size, unsigned long offset) { u32 config_offset; struct virtio_pci *vpci = vdev->virtio; @@ -117,20 +117,17 @@ static bool virtio_pci__specific_io_in(struct kvm *kvm, struct virtio_device *vd return false; } -static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +static bool virtio_pci__data_in(struct kvm_cpu *vcpu, struct virtio_device *vdev, + unsigned long offset, void *data, int size) { - unsigned long offset; bool ret = true; - struct virtio_device *vdev; struct virtio_pci *vpci; struct virt_queue *vq; struct kvm *kvm; u32 val; kvm = vcpu->kvm; - vdev = ioport->priv; vpci = vdev->virtio; - offset = port - vpci->port_addr; switch (offset) {
case VIRTIO_PCI_HOST_FEATURES: @@ -154,13 +151,26 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p vpci->isr = VIRTIO_IRQ_LOW; break; default: - ret = virtio_pci__specific_io_in(kvm, vdev, port, data, size, offset); + ret = virtio_pci__specific_data_in(kvm, vdev, data, size, offset); break; }; return ret; } +static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +{ + unsigned long offset; + struct virtio_device *vdev; + struct virtio_pci *vpci; + + vdev = ioport->priv; + vpci = vdev->virtio; + offset = port - vpci->port_addr; + + return virtio_pci__data_in(vcpu, vdev, offset, data, size); +} + static void update_msix_map(struct virtio_pci *vpci, struct msix_table *msix_entry, u32 vecnum) { @@ -185,8 +195,8 @@ static void update_msix_map(struct virtio_pci *vpci, irq__update_msix_route(vpci->kvm, gsi, &msix_entry->msg); } -static bool virtio_pci__specific_io_out(struct kvm *kvm, struct virtio_device *vdev, u16 port, - void *data, int size, int offset) +static bool virtio_pci__specific_data_out(struct kvm *kvm, struct virtio_device *vdev, + void *data, int size, unsigned long offset) { struct virtio_pci *vpci = vdev->virtio; u32 config_offset, vec; @@ -259,19 +269,16 @@ static bool virtio_pci__specific_io_out(struct kvm *kvm, struct virtio_device *v return false; } -static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +static bool virtio_pci__data_out(struct kvm_cpu *vcpu, struct virtio_device *vdev, + unsigned long offset, void *data, int size) { - unsigned long offset; bool ret = true; - struct virtio_device *vdev; struct virtio_pci *vpci; struct kvm *kvm; u32 val; kvm = vcpu->kvm; - vdev = ioport->priv; vpci = vdev->virtio; - offset = port - vpci->port_addr; switch (offset) { case VIRTIO_PCI_GUEST_FEATURES: @@ -304,13 +311,26 @@ static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 
virtio_notify_status(kvm, vdev, vpci->dev, vpci->status); break; default: - ret = virtio_pci__specific_io_out(kvm, vdev, port, data, size, offset); + ret = virtio_pci__specific_data_out(kvm, vdev, data, size, offset); break; }; return ret; } +static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) +{ + unsigned long offset; + struct virtio_device *vdev; + struct virtio_pci *vpci; + + vdev = ioport->priv; + vpci = vdev->virtio; + offset = port - vpci->port_addr; + + return virtio_pci__data_out(vcpu, vdev, offset, data, size); +} + static struct ioport_operations virtio_pci__io_ops = { .io_in = virtio_pci__io_in, .io_out = virtio_pci__io_out, @@ -320,7 +340,8 @@ static void virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr) { - struct virtio_pci *vpci = ptr; + struct virtio_device *vdev = ptr; + struct virtio_pci *vpci = vdev->virtio; struct msix_table *table; int vecnum; size_t offset; @@ -419,11 +440,15 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr) { - struct virtio_pci *vpci = ptr; - int direction = is_write ? 
KVM_EXIT_IO_OUT : KVM_EXIT_IO_IN; - u16 port = vpci->port_addr + (addr & (PCI_IO_SIZE - 1)); + struct virtio_device *vdev = ptr; + struct virtio_pci *vpci = vdev->virtio; - kvm__emulate_io(vcpu, port, data, direction, len, 1); + if (!is_write) + virtio_pci__data_in(vcpu, vdev, addr - vpci->mmio_addr, + data, len); + else + virtio_pci__data_out(vcpu, vdev, addr - vpci->mmio_addr, + data, len); } int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, @@ -445,13 +470,13 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vpci->mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); r = kvm__register_mmio(kvm, vpci->mmio_addr, PCI_IO_SIZE, false, - virtio_pci__io_mmio_callback, vpci); + virtio_pci__io_mmio_callback, vdev); if (r < 0) goto free_ioport; vpci->msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); r = kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE * 2, false, - virtio_pci__msix_mmio_callback, vpci); + virtio_pci__msix_mmio_callback, vdev); if (r < 0) goto free_mmio; From patchwork Thu Jan 23 13:47:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347929 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5963992A for ; Thu, 23 Jan 2020 13:48:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 37BCB2467B for ; Thu, 23 Jan 2020 13:48:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729050AbgAWNsf (ORCPT ); Thu, 23 Jan 2020 08:48:35 -0500 Received: from foss.arm.com ([217.140.110.172]:39732 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729083AbgAWNse (ORCPT ); Thu, 23 Jan 2020 08:48:34 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown 
[10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A6E8EFEC; Thu, 23 Jan 2020 05:48:33 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id A79FF3F68E; Thu, 23 Jan 2020 05:48:32 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 11/30] vfio/pci: Allocate correct size for MSIX table and PBA BARs Date: Thu, 23 Jan 2020 13:47:46 +0000 Message-Id: <20200123134805.1993-12-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org kvmtool assumes that the BAR that holds the address for the MSIX table and PBA structure has a size which is equal to their total size and it allocates memory from MMIO space accordingly. However, when initializing the BARs, the BAR size is set to the region size reported by VFIO. When the physical BAR size is greater than the mmio space that kvmtool allocates, we can have a situation where the BAR overlaps with another BAR, in which case kvmtool will fail to map the memory. This was found when trying to do PCI passthrough with a PCIe Realtek r8168 NIC, when the guest was also using virtio-block and virtio-net devices: [..] 
[ 0.197926] PCI: OF: PROBE_ONLY enabled [ 0.198454] pci-host-generic 40000000.pci: host bridge /pci ranges: [ 0.199291] pci-host-generic 40000000.pci: IO 0x00007000..0x0000ffff -> 0x00007000 [ 0.200331] pci-host-generic 40000000.pci: MEM 0x41000000..0x7fffffff -> 0x41000000 [ 0.201480] pci-host-generic 40000000.pci: ECAM at [mem 0x40000000-0x40ffffff] for [bus 00] [ 0.202635] pci-host-generic 40000000.pci: PCI host bridge to bus 0000:00 [ 0.203535] pci_bus 0000:00: root bus resource [bus 00] [ 0.204227] pci_bus 0000:00: root bus resource [io 0x0000-0x8fff] (bus address [0x7000-0xffff]) [ 0.205483] pci_bus 0000:00: root bus resource [mem 0x41000000-0x7fffffff] [ 0.206456] pci 0000:00:00.0: [10ec:8168] type 00 class 0x020000 [ 0.207399] pci 0000:00:00.0: reg 0x10: [io 0x0000-0x00ff] [ 0.208252] pci 0000:00:00.0: reg 0x18: [mem 0x41002000-0x41002fff] [ 0.209233] pci 0000:00:00.0: reg 0x20: [mem 0x41000000-0x41003fff] [ 0.210481] pci 0000:00:01.0: [1af4:1000] type 00 class 0x020000 [ 0.211349] pci 0000:00:01.0: reg 0x10: [io 0x0100-0x01ff] [ 0.212118] pci 0000:00:01.0: reg 0x14: [mem 0x41003000-0x410030ff] [ 0.212982] pci 0000:00:01.0: reg 0x18: [mem 0x41003200-0x410033ff] [ 0.214247] pci 0000:00:02.0: [1af4:1001] type 00 class 0x018000 [ 0.215096] pci 0000:00:02.0: reg 0x10: [io 0x0200-0x02ff] [ 0.215863] pci 0000:00:02.0: reg 0x14: [mem 0x41003400-0x410034ff] [ 0.216723] pci 0000:00:02.0: reg 0x18: [mem 0x41003600-0x410037ff] [ 0.218105] pci 0000:00:00.0: can't claim BAR 4 [mem 0x41000000-0x41003fff]: address conflict with 0000:00:00.0 [mem 0x41002000-0x41002fff] [..] Guest output of lspci -vv: 00:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06) Subsystem: TP-LINK Technologies Co., Ltd. 
TG-3468 Gigabit PCI Express Network Adapter Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reviewed-by: Andre Przywara --- vfio/pci.c | 68 +++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 52 insertions(+), 16 deletions(-) diff --git a/vfio/pci.c b/vfio/pci.c index 8e5d8572bc0c..bbb8469c8d93 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -715,17 +715,44 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) return 0; } -static int vfio_pci_create_msix_table(struct kvm *kvm, - struct vfio_pci_device *pdev) +static int vfio_pci_get_region_info(struct vfio_device *vdev, u32 index, + struct vfio_region_info *info) +{ + int ret; + + *info = (struct vfio_region_info) { + .argsz = sizeof(*info), + .index = index, + }; + + ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, info); + if (ret) { + ret = -errno; + vfio_dev_err(vdev, "cannot get info for BAR %u", index); + return ret; + } + + if (info->size && !is_power_of_two(info->size)) { + vfio_dev_err(vdev, "region is not power of two: 0x%llx", + info->size); + return -EINVAL; + } + + return 0; +} + +static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev) { int ret; size_t i; - size_t mmio_size; + size_t map_size; size_t nr_entries; struct vfio_pci_msi_entry *entries; + struct vfio_pci_device *pdev = &vdev->pci; struct vfio_pci_msix_pba *pba = &pdev->msix_pba; struct vfio_pci_msix_table *table = &pdev->msix_table; struct msix_cap *msix = PCI_CAP(&pdev->hdr, pdev->msix.pos); + struct vfio_region_info info; table->bar = msix->table_offset & PCI_MSIX_TABLE_BIR; pba->bar = msix->pba_offset & PCI_MSIX_TABLE_BIR; @@ -744,15 +771,31 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, for (i = 0; i < nr_entries; i++) entries[i].config.ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT; + ret = vfio_pci_get_region_info(vdev, table->bar, &info); + if (ret) + return 
ret; + if (!info.size) + return -EINVAL; + map_size = info.size; + + if (table->bar != pba->bar) { + ret = vfio_pci_get_region_info(vdev, pba->bar, &info); + if (ret) + return ret; + if (!info.size) + return -EINVAL; + map_size += info.size; + } + /* * To ease MSI-X cap configuration in case they share the same BAR, * collapse table and pending array. The size of the BAR regions must be * powers of two. */ - mmio_size = roundup_pow_of_two(table->size + pba->size); - table->guest_phys_addr = pci_get_mmio_block(mmio_size); + map_size = ALIGN(map_size, PAGE_SIZE); + table->guest_phys_addr = pci_get_mmio_block(map_size); if (!table->guest_phys_addr) { - pr_err("cannot allocate IO space"); + pr_err("cannot allocate MMIO space"); ret = -ENOMEM; goto out_free; } @@ -816,17 +859,10 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, region->vdev = vdev; region->is_ioport = !!(bar & PCI_BASE_ADDRESS_SPACE_IO); - region->info = (struct vfio_region_info) { - .argsz = sizeof(region->info), - .index = nr, - }; - ret = ioctl(vdev->fd, VFIO_DEVICE_GET_REGION_INFO, &region->info); - if (ret) { - ret = -errno; - vfio_dev_err(vdev, "cannot get info for BAR %zu", nr); + ret = vfio_pci_get_region_info(vdev, nr, &region->info); + if (ret) return ret; - } /* Ignore invalid or unimplemented regions */ if (!region->info.size) @@ -871,7 +907,7 @@ static int vfio_pci_configure_dev_regions(struct kvm *kvm, return ret; if (pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) { - ret = vfio_pci_create_msix_table(kvm, pdev); + ret = vfio_pci_create_msix_table(kvm, vdev); if (ret) return ret; } From patchwork Thu Jan 23 13:47:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347931 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2092292A for ; Thu, 23 Jan 2020 
13:48:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 09E7B24686 for ; Thu, 23 Jan 2020 13:48:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729094AbgAWNsg (ORCPT ); Thu, 23 Jan 2020 08:48:36 -0500 Received: from foss.arm.com ([217.140.110.172]:39740 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729095AbgAWNsf (ORCPT ); Thu, 23 Jan 2020 08:48:35 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E970CFEC; Thu, 23 Jan 2020 05:48:34 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id DE7E23F68E; Thu, 23 Jan 2020 05:48:33 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 12/30] vfio/pci: Don't assume that only even numbered BARs are 64bit Date: Thu, 23 Jan 2020 13:47:47 +0000 Message-Id: <20200123134805.1993-13-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Not all devices have the bottom 32 bits of a 64 bit BAR in an even numbered BAR. For example, on an NVIDIA Quadro P400, BARs 1 and 3 are 64bit. Remove this assumption. 
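For illustration only, the skip logic described above can be sketched outside of kvmtool as follows. The macro and function names are hypothetical and not part of this patch; the bit layout (bit 0 selects I/O space, bits 2:1 equal to 0b10 mark a 64-bit memory BAR) follows the PCI BAR encoding. The sketch assumes that the BAR following any 64-bit BAR is its upper half, whatever the index of the lower half.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, mirroring the PCI BAR bit layout. */
#define DEMO_BAR_SPACE_IO      0x01u /* bit 0: 1 = I/O space, 0 = memory */
#define DEMO_BAR_MEM_TYPE_MASK 0x06u /* bits 2:1: memory BAR type */
#define DEMO_BAR_MEM_TYPE_64   0x04u /* type 0b10: 64-bit memory BAR */
#define DEMO_NR_BARS           6

static bool demo_bar_is_64bit(uint32_t bar)
{
	return !(bar & DEMO_BAR_SPACE_IO) &&
	       (bar & DEMO_BAR_MEM_TYPE_MASK) == DEMO_BAR_MEM_TYPE_64;
}

/*
 * Return a bitmask of the BAR indices that would be configured.
 * The BAR after a 64-bit BAR holds the upper 32 address bits, so it
 * is skipped regardless of whether its index is odd or even.
 */
static unsigned int demo_bars_to_configure(const uint32_t bar[DEMO_NR_BARS])
{
	unsigned int mask = 0;
	bool is_64bit = false;
	size_t i;

	for (i = 0; i < DEMO_NR_BARS; i++) {
		if (is_64bit) {
			/* Upper half of the previous BAR: skip it once. */
			is_64bit = false;
			continue;
		}
		mask |= 1u << i;
		is_64bit = demo_bar_is_64bit(bar[i]);
	}

	return mask;
}
```

With a layout where BARs 1 and 3 are 64-bit, BARs 2 and 4 are skipped and BARs 0, 1, 3 and 5 are configured.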
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- vfio/pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/vfio/pci.c b/vfio/pci.c index bbb8469c8d93..1bdc20038411 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -920,8 +920,10 @@ static int vfio_pci_configure_dev_regions(struct kvm *kvm, for (i = VFIO_PCI_BAR0_REGION_INDEX; i <= VFIO_PCI_BAR5_REGION_INDEX; ++i) { /* Ignore top half of 64-bit BAR */ - if (i % 2 && is_64bit) + if (is_64bit) { + is_64bit = false; continue; + } ret = vfio_pci_configure_bar(kvm, vdev, i); if (ret) From patchwork Thu Jan 23 13:47:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347933 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 48C23159A for ; Thu, 23 Jan 2020 13:48:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2798C24685 for ; Thu, 23 Jan 2020 13:48:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729095AbgAWNsh (ORCPT ); Thu, 23 Jan 2020 08:48:37 -0500 Received: from foss.arm.com ([217.140.110.172]:39750 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729096AbgAWNsg (ORCPT ); Thu, 23 Jan 2020 08:48:36 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 30D20106F; Thu, 23 Jan 2020 05:48:36 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 2BF943F68E; Thu, 23 Jan 2020 05:48:35 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, 
lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 13/30] vfio/pci: Ignore expansion ROM BAR writes Date: Thu, 23 Jan 2020 13:47:48 +0000 Message-Id: <20200123134805.1993-14-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To get the size of the expansion ROM, software writes 0xfffff800 to the expansion ROM BAR in the PCI configuration space. Before emulating this write, PCI emulation executes the optional configuration space write callback that a device can implement. VFIO doesn't have support for emulating expansion ROMs. However, the callback writes the guest value to the hardware BAR, and then reads it back into the emulated BAR to make sure the write has completed successfully. After this, we return to regular PCI emulation and, because the BAR is no longer 0, we write back to the BAR the value that the guest used to get the size. As a result, the guest will think that the ROM size is 0x800 after the subsequent read and we end up unintentionally exposing to the guest a BAR which we don't emulate. Let's fix this by ignoring writes to the expansion ROM BAR. 
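As a rough sketch of the arithmetic involved (not kvmtool code; every name below is hypothetical), this is how a guest would turn the value read back from the expansion ROM BAR into a size, and what ignoring the write looks like in a toy config space. It assumes the usual layout where bits 31:11 of the ROM BAR hold the base address and the register lives at offset 0x30 of a type 0 header.

```c
#include <stdint.h>

#define DEMO_ROM_BAR_OFFSET    0x30u       /* expansion ROM BAR, type 0 header */
#define DEMO_ROM_BAR_ADDR_MASK 0xfffff800u /* bits 31:11: ROM base address */

/* Size the guest infers from the value read back after the sizing write. */
static uint32_t demo_rom_size_from_readback(uint32_t readback)
{
	uint32_t addr_bits = readback & DEMO_ROM_BAR_ADDR_MASK;

	/* All address bits zero means no expansion ROM is implemented. */
	return addr_bits ? ~addr_bits + 1 : 0;
}

/*
 * Toy emulated config space write (64 dwords = 256 bytes). Writes to the
 * expansion ROM BAR are dropped, so the register keeps reading as 0.
 * The caller must pass a dword-aligned offset below 256.
 */
static void demo_cfg_write(uint32_t cfg[64], uint32_t offset, uint32_t value)
{
	if (offset == DEMO_ROM_BAR_OFFSET)
		return;

	cfg[offset / 4] = value;
}
```

If the sizing value 0xfffff800 is stored verbatim, the guest computes a size of 0x800; if the write is ignored, the register stays 0 and the guest concludes that there is no ROM.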
Signed-off-by: Alexandru Elisei --- vfio/pci.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/vfio/pci.c b/vfio/pci.c index 1bdc20038411..1f38f90c3ae9 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -472,6 +472,9 @@ static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hd struct vfio_device *vdev; void *base = pci_hdr; + if (offset == PCI_ROM_ADDRESS) + return; + pdev = container_of(pci_hdr, struct vfio_pci_device, hdr); vdev = container_of(pdev, struct vfio_device, pci); info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; From patchwork Thu Jan 23 13:47:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347935 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CD8C917EF for ; Thu, 23 Jan 2020 13:48:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id ABE1A24685 for ; Thu, 23 Jan 2020 13:48:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729129AbgAWNsi (ORCPT ); Thu, 23 Jan 2020 08:48:38 -0500 Received: from foss.arm.com ([217.140.110.172]:39760 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729108AbgAWNsh (ORCPT ); Thu, 23 Jan 2020 08:48:37 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7157511B3; Thu, 23 Jan 2020 05:48:37 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 683773F68E; Thu, 23 Jan 2020 05:48:36 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, 
lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 14/30] vfio/pci: Don't access potentially unallocated regions Date: Thu, 23 Jan 2020 13:47:49 +0000 Message-Id: <20200123134805.1993-15-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Don't try to configure a BAR if there is no region associated with it. Signed-off-by: Alexandru Elisei --- vfio/pci.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/vfio/pci.c b/vfio/pci.c index 1f38f90c3ae9..f86a7d9b7032 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -652,6 +652,8 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) /* Initialise the BARs */ for (i = VFIO_PCI_BAR0_REGION_INDEX; i <= VFIO_PCI_BAR5_REGION_INDEX; ++i) { + if ((u32)i == vdev->info.num_regions) + break; u64 base; struct vfio_region *region = &vdev->regions[i]; @@ -853,11 +855,12 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, u32 bar; size_t map_size; struct vfio_pci_device *pdev = &vdev->pci; - struct vfio_region *region = &vdev->regions[nr]; + struct vfio_region *region; if (nr >= vdev->info.num_regions) return 0; + region = &vdev->regions[nr]; bar = pdev->hdr.bar[nr]; region->vdev = vdev; From patchwork Thu Jan 23 13:47:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347937 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E3C5992A for ; Thu, 23 Jan 2020 13:48:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B80DB20661 for ; Thu, 23 
Jan 2020 13:48:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729146AbgAWNsk (ORCPT ); Thu, 23 Jan 2020 08:48:40 -0500 Received: from foss.arm.com ([217.140.110.172]:39770 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729108AbgAWNsj (ORCPT ); Thu, 23 Jan 2020 08:48:39 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B2DAEFEC; Thu, 23 Jan 2020 05:48:38 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id AE9F43F68E; Thu, 23 Jan 2020 05:48:37 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 15/30] virtio: Don't ignore initialization failures Date: Thu, 23 Jan 2020 13:47:50 +0000 Message-Id: <20200123134805.1993-16-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Don't ignore an error in the bus-specific initialization function in virtio_init; don't ignore the result of virtio_init; and don't return 0 in virtio_blk__init and virtio_scsi__init when we encounter an error. Hopefully this will save some developer's time debugging faulty virtio devices in a guest. To take advantage of the cleanup function virtio_blk__exit, we have moved appending the new device to the list before the call to virtio_init. To safeguard against this in the future, virtio_init has been annotated with the compiler attribute warn_unused_result. 
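A minimal sketch of the annotation, for readers unfamiliar with it. The demo_* functions are hypothetical stand-ins, not kvmtool functions; the attribute itself is the GCC/Clang warn_unused_result function attribute. A caller that discards the return value of the annotated function should get a compiler warning, while a caller that propagates it, as below, compiles cleanly.

```c
#include <errno.h>

#ifndef __must_check
#define __must_check __attribute__((warn_unused_result))
#endif

/* Hypothetical transport setup: only transports 0 and 1 are known. */
static int __must_check demo_transport_init(int trans)
{
	return (trans == 0 || trans == 1) ? 0 : -EINVAL;
}

/* The caller stores the result and propagates the error. */
static int demo_device_init(int trans)
{
	int r;

	r = demo_transport_init(trans);
	if (r < 0)
		return r;

	return 0;
}
```

Writing a bare `demo_transport_init(trans);` statement instead would be the pattern this patch is meant to catch at build time.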
Signed-off-by: Alexandru Elisei --- include/kvm/kvm.h | 1 + include/kvm/virtio.h | 7 ++++--- include/linux/compiler.h | 2 +- virtio/9p.c | 9 ++++++--- virtio/balloon.c | 10 +++++++--- virtio/blk.c | 14 +++++++++----- virtio/console.c | 11 ++++++++--- virtio/core.c | 9 +++++---- virtio/net.c | 32 ++++++++++++++++++-------------- virtio/scsi.c | 14 +++++++++----- 10 files changed, 68 insertions(+), 41 deletions(-) diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h index 7a738183d67a..c6dc6ef72d11 100644 --- a/include/kvm/kvm.h +++ b/include/kvm/kvm.h @@ -8,6 +8,7 @@ #include #include +#include #include #include #include diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h index 19b913732cd5..3a311f54f2a5 100644 --- a/include/kvm/virtio.h +++ b/include/kvm/virtio.h @@ -7,6 +7,7 @@ #include #include +#include #include #include @@ -204,9 +205,9 @@ struct virtio_ops { int (*reset)(struct kvm *kvm, struct virtio_device *vdev); }; -int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, - struct virtio_ops *ops, enum virtio_trans trans, - int device_id, int subsys_id, int class); +int __must_check virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, + struct virtio_ops *ops, enum virtio_trans trans, + int device_id, int subsys_id, int class); int virtio_compat_add_message(const char *device, const char *config); const char* virtio_trans_name(enum virtio_trans trans); diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 898420b81aec..a662ba0a5c68 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -14,7 +14,7 @@ #define __packed __attribute__((packed)) #define __iomem #define __force -#define __must_check +#define __must_check __attribute__((warn_unused_result)) #define unlikely #endif diff --git a/virtio/9p.c b/virtio/9p.c index ac70dbc31207..b78f2b3f0e09 100644 --- a/virtio/9p.c +++ b/virtio/9p.c @@ -1551,11 +1551,14 @@ int virtio_9p_img_name_parser(const struct option *opt, const char *arg, int uns 
int virtio_9p__init(struct kvm *kvm) { struct p9_dev *p9dev; + int r; list_for_each_entry(p9dev, &devs, list) { - virtio_init(kvm, p9dev, &p9dev->vdev, &p9_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_9P, - VIRTIO_ID_9P, PCI_CLASS_9P); + r = virtio_init(kvm, p9dev, &p9dev->vdev, &p9_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_9P, + VIRTIO_ID_9P, PCI_CLASS_9P); + if (r < 0) + return r; } return 0; diff --git a/virtio/balloon.c b/virtio/balloon.c index 0bd16703dfee..8e8803fed607 100644 --- a/virtio/balloon.c +++ b/virtio/balloon.c @@ -264,6 +264,8 @@ struct virtio_ops bln_dev_virtio_ops = { int virtio_bln__init(struct kvm *kvm) { + int r; + if (!kvm->cfg.balloon) return 0; @@ -273,9 +275,11 @@ int virtio_bln__init(struct kvm *kvm) bdev.stat_waitfd = eventfd(0, 0); memset(&bdev.config, 0, sizeof(struct virtio_balloon_config)); - virtio_init(kvm, &bdev, &bdev.vdev, &bln_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLN, - VIRTIO_ID_BALLOON, PCI_CLASS_BLN); + r = virtio_init(kvm, &bdev, &bdev.vdev, &bln_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLN, + VIRTIO_ID_BALLOON, PCI_CLASS_BLN); + if (r < 0) + return r; if (compat_id == -1) compat_id = virtio_compat_add_message("virtio-balloon", "CONFIG_VIRTIO_BALLOON"); diff --git a/virtio/blk.c b/virtio/blk.c index f267be1563dc..4d02d101af81 100644 --- a/virtio/blk.c +++ b/virtio/blk.c @@ -306,6 +306,7 @@ static struct virtio_ops blk_dev_virtio_ops = { static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk) { struct blk_dev *bdev; + int r; if (!disk) return -EINVAL; @@ -323,12 +324,14 @@ static int virtio_blk__init_one(struct kvm *kvm, struct disk_image *disk) .kvm = kvm, }; - virtio_init(kvm, bdev, &bdev->vdev, &blk_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLK, - VIRTIO_ID_BLOCK, PCI_CLASS_BLK); - list_add_tail(&bdev->list, &bdevs); + r = virtio_init(kvm, bdev, &bdev->vdev, &blk_dev_virtio_ops, + 
VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_BLK, + VIRTIO_ID_BLOCK, PCI_CLASS_BLK); + if (r < 0) + return r; + disk_image__set_callback(bdev->disk, virtio_blk_complete); if (compat_id == -1) @@ -359,7 +362,8 @@ int virtio_blk__init(struct kvm *kvm) return 0; cleanup: - return virtio_blk__exit(kvm); + virtio_blk__exit(kvm); + return r; } virtio_dev_init(virtio_blk__init); diff --git a/virtio/console.c b/virtio/console.c index f1be02549222..e0b98df37965 100644 --- a/virtio/console.c +++ b/virtio/console.c @@ -230,12 +230,17 @@ static struct virtio_ops con_dev_virtio_ops = { int virtio_console__init(struct kvm *kvm) { + int r; + if (kvm->cfg.active_console != CONSOLE_VIRTIO) return 0; - virtio_init(kvm, &cdev, &cdev.vdev, &con_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_CONSOLE, - VIRTIO_ID_CONSOLE, PCI_CLASS_CONSOLE); + r = virtio_init(kvm, &cdev, &cdev.vdev, &con_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_CONSOLE, + VIRTIO_ID_CONSOLE, PCI_CLASS_CONSOLE); + if (r < 0) + return r; + if (compat_id == -1) compat_id = virtio_compat_add_message("virtio-console", "CONFIG_VIRTIO_CONSOLE"); diff --git a/virtio/core.c b/virtio/core.c index e10ec362e1ea..f5b3c07fc100 100644 --- a/virtio/core.c +++ b/virtio/core.c @@ -259,6 +259,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { void *virtio; + int r; switch (trans) { case VIRTIO_PCI: @@ -272,7 +273,7 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vdev->ops->init = virtio_pci__init; vdev->ops->exit = virtio_pci__exit; vdev->ops->reset = virtio_pci__reset; - vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); + r = vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); break; case VIRTIO_MMIO: virtio = calloc(sizeof(struct virtio_mmio), 1); @@ -285,13 +286,13 @@ int virtio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, vdev->ops->init = virtio_mmio_init; 
vdev->ops->exit = virtio_mmio_exit; vdev->ops->reset = virtio_mmio_reset; - vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); + r = vdev->ops->init(kvm, dev, vdev, device_id, subsys_id, class); break; default: - return -1; + r = -1; }; - return 0; + return r; } int virtio_compat_add_message(const char *device, const char *config) diff --git a/virtio/net.c b/virtio/net.c index 091406912a24..425c13ba1136 100644 --- a/virtio/net.c +++ b/virtio/net.c @@ -910,7 +910,7 @@ done: static int virtio_net__init_one(struct virtio_net_params *params) { - int i, err; + int i, r; struct net_dev *ndev; struct virtio_ops *ops; enum virtio_trans trans = VIRTIO_DEFAULT_TRANS(params->kvm); @@ -920,10 +920,8 @@ static int virtio_net__init_one(struct virtio_net_params *params) return -ENOMEM; ops = malloc(sizeof(*ops)); - if (ops == NULL) { - err = -ENOMEM; - goto err_free_ndev; - } + if (ops == NULL) + return -ENOMEM; list_add_tail(&ndev->list, &ndevs); @@ -969,8 +967,10 @@ static int virtio_net__init_one(struct virtio_net_params *params) virtio_trans_name(trans)); } - virtio_init(params->kvm, ndev, &ndev->vdev, ops, trans, - PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET); + r = virtio_init(params->kvm, ndev, &ndev->vdev, ops, trans, + PCI_DEVICE_ID_VIRTIO_NET, VIRTIO_ID_NET, PCI_CLASS_NET); + if (r < 0) + return r; if (params->vhost) virtio_net__vhost_init(params->kvm, ndev); @@ -979,19 +979,17 @@ static int virtio_net__init_one(struct virtio_net_params *params) compat_id = virtio_compat_add_message("virtio-net", "CONFIG_VIRTIO_NET"); return 0; - -err_free_ndev: - free(ndev); - return err; } int virtio_net__init(struct kvm *kvm) { - int i; + int i, r; for (i = 0; i < kvm->cfg.num_net_devices; i++) { kvm->cfg.net_params[i].kvm = kvm; - virtio_net__init_one(&kvm->cfg.net_params[i]); + r = virtio_net__init_one(&kvm->cfg.net_params[i]); + if (r < 0) + goto cleanup; } if (kvm->cfg.num_net_devices == 0 && kvm->cfg.no_net == 0) { @@ -1007,10 +1005,16 @@ int 
virtio_net__init(struct kvm *kvm) str_to_mac(kvm->cfg.guest_mac, net_params.guest_mac); str_to_mac(kvm->cfg.host_mac, net_params.host_mac); - virtio_net__init_one(&net_params); + r = virtio_net__init_one(&net_params); + if (r < 0) + goto cleanup; } return 0; + +cleanup: + virtio_net__exit(kvm); + return r; } virtio_dev_init(virtio_net__init); diff --git a/virtio/scsi.c b/virtio/scsi.c index 1ec78fe0945a..16a86cb7e0e6 100644 --- a/virtio/scsi.c +++ b/virtio/scsi.c @@ -234,6 +234,7 @@ static void virtio_scsi_vhost_init(struct kvm *kvm, struct scsi_dev *sdev) static int virtio_scsi_init_one(struct kvm *kvm, struct disk_image *disk) { struct scsi_dev *sdev; + int r; if (!disk) return -EINVAL; @@ -260,12 +261,14 @@ static int virtio_scsi_init_one(struct kvm *kvm, struct disk_image *disk) strlcpy((char *)&sdev->target.vhost_wwpn, disk->wwpn, sizeof(sdev->target.vhost_wwpn)); sdev->target.vhost_tpgt = strtol(disk->tpgt, NULL, 0); - virtio_init(kvm, sdev, &sdev->vdev, &scsi_dev_virtio_ops, - VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_SCSI, - VIRTIO_ID_SCSI, PCI_CLASS_BLK); - list_add_tail(&sdev->list, &sdevs); + r = virtio_init(kvm, sdev, &sdev->vdev, &scsi_dev_virtio_ops, + VIRTIO_DEFAULT_TRANS(kvm), PCI_DEVICE_ID_VIRTIO_SCSI, + VIRTIO_ID_SCSI, PCI_CLASS_BLK); + if (r < 0) + return r; + virtio_scsi_vhost_init(kvm, sdev); if (compat_id == -1) @@ -302,7 +305,8 @@ int virtio_scsi_init(struct kvm *kvm) return 0; cleanup: - return virtio_scsi_exit(kvm); + virtio_scsi_exit(kvm); + return r; } virtio_dev_init(virtio_scsi_init); From patchwork Thu Jan 23 13:47:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347967 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9B5C3159A for ; Thu, 23 Jan 2020 13:49:05 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7A15F2467E for ; Thu, 23 Jan 2020 13:49:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729157AbgAWNsl (ORCPT ); Thu, 23 Jan 2020 08:48:41 -0500 Received: from foss.arm.com ([217.140.110.172]:39776 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729138AbgAWNsl (ORCPT ); Thu, 23 Jan 2020 08:48:41 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EC0FC106F; Thu, 23 Jan 2020 05:48:39 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id EAB463F68E; Thu, 23 Jan 2020 05:48:38 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 16/30] Don't ignore errors registering a device, ioport or mmio emulation Date: Thu, 23 Jan 2020 13:47:51 +0000 Message-Id: <20200123134805.1993-17-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org An error returned by device__register, kvm__register_mmio or ioport__register means that the device will not be emulated properly. Annotate these functions with __must_check, so we get a compiler warning when the error is ignored. Also fix several instances where the caller returns 0 even though the function failed. 
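The error handling pattern applied throughout this patch can be sketched with a toy registry. Everything named demo_* is hypothetical and stands in for the real registration functions; the point is only the shape of the caller: check each result, undo earlier registrations on failure, and return the error instead of 0.

```c
#include <errno.h>

#define DEMO_MAX_PORTS 4

static unsigned short demo_ports[DEMO_MAX_PORTS];
static int demo_nr_ports;

/* Register a port; fails if it is already present or the table is full. */
static int __attribute__((warn_unused_result)) demo_register(unsigned short port)
{
	int i;

	for (i = 0; i < demo_nr_ports; i++)
		if (demo_ports[i] == port)
			return -EEXIST;

	if (demo_nr_ports == DEMO_MAX_PORTS)
		return -ENOSPC;

	demo_ports[demo_nr_ports++] = port;
	return 0;
}

/* Remove a port by moving the last entry into its slot. */
static void demo_unregister(unsigned short port)
{
	int i;

	for (i = 0; i < demo_nr_ports; i++) {
		if (demo_ports[i] == port) {
			demo_ports[i] = demo_ports[--demo_nr_ports];
			return;
		}
	}
}

/* Register two ports; if the second fails, roll back the first. */
static int demo_init_pair(unsigned short first, unsigned short second)
{
	int r;

	r = demo_register(first);
	if (r < 0)
		return r;

	r = demo_register(second);
	if (r < 0) {
		demo_unregister(first);
		return r;
	}

	return 0;
}
```

After a failed demo_init_pair() call the registry is left exactly as it was before the call, which is the property the rollback is there to provide.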
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- arm/ioport.c | 3 +- hw/i8042.c | 12 ++++++-- hw/vesa.c | 4 ++- include/kvm/devices.h | 3 +- include/kvm/ioport.h | 6 ++-- include/kvm/kvm.h | 6 ++-- ioport.c | 23 ++++++++------- mips/kvm.c | 3 +- powerpc/ioport.c | 3 +- virtio/mmio.c | 13 +++++++-- x86/ioport.c | 66 ++++++++++++++++++++++++++++++++----------- 11 files changed, 100 insertions(+), 42 deletions(-) diff --git a/arm/ioport.c b/arm/ioport.c index bdd30b6fe812..2f0feb9ab69f 100644 --- a/arm/ioport.c +++ b/arm/ioport.c @@ -1,8 +1,9 @@ #include "kvm/ioport.h" #include "kvm/irq.h" -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { + return 0; } void ioport__map_irq(u8 *irq) diff --git a/hw/i8042.c b/hw/i8042.c index 2d8c96e9c7e6..37a99a2dc6b8 100644 --- a/hw/i8042.c +++ b/hw/i8042.c @@ -349,10 +349,18 @@ static struct ioport_operations kbd_ops = { int kbd__init(struct kvm *kvm) { + int r; + kbd_reset(); state.kvm = kvm; - ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL); - ioport__register(kvm, I8042_COMMAND_REG, &kbd_ops, 2, NULL); + r = ioport__register(kvm, I8042_DATA_REG, &kbd_ops, 2, NULL); + if (r < 0) + return r; + r = ioport__register(kvm, I8042_COMMAND_REG, &kbd_ops, 2, NULL); + if (r < 0) { + ioport__unregister(kvm, I8042_DATA_REG); + return r; + } return 0; } diff --git a/hw/vesa.c b/hw/vesa.c index d8d91aa9c873..b92cc990b730 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -70,7 +70,9 @@ struct framebuffer *vesa__init(struct kvm *kvm) vesa_base_addr = (u16)r; vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); - device__register(&vesa_device); + r = device__register(&vesa_device); + if (r < 0) + return ERR_PTR(r); mem = mmap(NULL, VESA_MEM_SIZE, PROT_RW, MAP_ANON_NORESERVE, -1, 0); if (mem == MAP_FAILED) diff --git a/include/kvm/devices.h b/include/kvm/devices.h index 405f19521977..e445db6f56b1 100644 --- a/include/kvm/devices.h +++ b/include/kvm/devices.h @@ -3,6 +3,7 
@@ #include #include +#include enum device_bus_type { DEVICE_BUS_PCI, @@ -18,7 +19,7 @@ struct device_header { struct rb_node node; }; -int device__register(struct device_header *dev); +int __must_check device__register(struct device_header *dev); void device__unregister(struct device_header *dev); struct device_header *device__find_dev(enum device_bus_type bus_type, u8 dev_num); diff --git a/include/kvm/ioport.h b/include/kvm/ioport.h index 8c86b7151f25..62a719327e3f 100644 --- a/include/kvm/ioport.h +++ b/include/kvm/ioport.h @@ -33,11 +33,11 @@ struct ioport_operations { enum irq_type)); }; -void ioport__setup_arch(struct kvm *kvm); +int ioport__setup_arch(struct kvm *kvm); void ioport__map_irq(u8 *irq); -int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, - int count, void *param); +int __must_check ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, + int count, void *param); int ioport__unregister(struct kvm *kvm, u16 port); int ioport__init(struct kvm *kvm); int ioport__exit(struct kvm *kvm); diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h index c6dc6ef72d11..50119a8672eb 100644 --- a/include/kvm/kvm.h +++ b/include/kvm/kvm.h @@ -128,9 +128,9 @@ static inline int kvm__reserve_mem(struct kvm *kvm, u64 guest_phys, u64 size) KVM_MEM_TYPE_RESERVED); } -int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, - void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), - void *ptr); +int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, + void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), + void *ptr); bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr); void kvm__reboot(struct kvm *kvm); void kvm__pause(struct kvm *kvm); diff --git a/ioport.c b/ioport.c index a72e4035881a..d224819c6e43 100644 --- a/ioport.c +++ b/ioport.c @@ -91,16 +91,21 @@ int 
ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i }; r = ioport_insert(&ioport_tree, entry); - if (r < 0) { - free(entry); - br_write_unlock(kvm); - return r; - } - - device__register(&entry->dev_hdr); + if (r < 0) + goto out_free; + r = device__register(&entry->dev_hdr); + if (r < 0) + goto out_erase; br_write_unlock(kvm); return port; + +out_erase: + rb_int_erase(&ioport_tree, &entry->node); +out_free: + free(entry); + br_write_unlock(kvm); + return r; } int ioport__unregister(struct kvm *kvm, u16 port) @@ -196,9 +201,7 @@ out: int ioport__init(struct kvm *kvm) { - ioport__setup_arch(kvm); - - return 0; + return ioport__setup_arch(kvm); } dev_base_init(ioport__init); diff --git a/mips/kvm.c b/mips/kvm.c index 211770da0d85..26355930d3b6 100644 --- a/mips/kvm.c +++ b/mips/kvm.c @@ -100,8 +100,9 @@ void kvm__irq_trigger(struct kvm *kvm, int irq) die_perror("KVM_IRQ_LINE ioctl"); } -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { + return 0; } bool kvm__arch_cpu_supports_vm(void) diff --git a/powerpc/ioport.c b/powerpc/ioport.c index 58dc625c54fe..0c188b61a51a 100644 --- a/powerpc/ioport.c +++ b/powerpc/ioport.c @@ -12,9 +12,10 @@ #include -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { /* PPC has no legacy ioports to set up */ + return 0; } void ioport__map_irq(u8 *irq) diff --git a/virtio/mmio.c b/virtio/mmio.c index 03cecc366292..5537c39367d6 100644 --- a/virtio/mmio.c +++ b/virtio/mmio.c @@ -292,13 +292,16 @@ int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { struct virtio_mmio *vmmio = vdev->virtio; + int r; vmmio->addr = virtio_mmio_get_io_space_block(VIRTIO_MMIO_IO_SIZE); vmmio->kvm = kvm; vmmio->dev = dev; - kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE, - false, virtio_mmio_mmio_callback, vdev); + r = kvm__register_mmio(kvm, vmmio->addr, VIRTIO_MMIO_IO_SIZE, + false, 
virtio_mmio_mmio_callback, vdev); + if (r < 0) + return r; vmmio->hdr = (struct virtio_mmio_hdr) { .magic = {'v', 'i', 'r', 't'}, @@ -313,7 +316,11 @@ int virtio_mmio_init(struct kvm *kvm, void *dev, struct virtio_device *vdev, .data = generate_virtio_mmio_fdt_node, }; - device__register(&vmmio->dev_hdr); + r = device__register(&vmmio->dev_hdr); + if (r < 0) { + kvm__deregister_mmio(kvm, vmmio->addr); + return r; + } /* * Instantiate guest virtio-mmio devices using kernel command line diff --git a/x86/ioport.c b/x86/ioport.c index 8572c758ed4f..7ad7b8f3f497 100644 --- a/x86/ioport.c +++ b/x86/ioport.c @@ -69,50 +69,84 @@ void ioport__map_irq(u8 *irq) { } -void ioport__setup_arch(struct kvm *kvm) +int ioport__setup_arch(struct kvm *kvm) { + int r; + /* Legacy ioport setup */ /* 0000 - 001F - DMA1 controller */ - ioport__register(kvm, 0x0000, &dummy_read_write_ioport_ops, 32, NULL); + r = ioport__register(kvm, 0x0000, &dummy_read_write_ioport_ops, 32, NULL); + if (r < 0) + return r; /* 0x0020 - 0x003F - 8259A PIC 1 */ - ioport__register(kvm, 0x0020, &dummy_read_write_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x0020, &dummy_read_write_ioport_ops, 2, NULL); + if (r < 0) + return r; /* PORT 0040-005F - PIT - PROGRAMMABLE INTERVAL TIMER (8253, 8254) */ - ioport__register(kvm, 0x0040, &dummy_read_write_ioport_ops, 4, NULL); + r = ioport__register(kvm, 0x0040, &dummy_read_write_ioport_ops, 4, NULL); + if (r < 0) + return r; /* 0092 - PS/2 system control port A */ - ioport__register(kvm, 0x0092, &ps2_control_a_ops, 1, NULL); + r = ioport__register(kvm, 0x0092, &ps2_control_a_ops, 1, NULL); + if (r < 0) + return r; /* 0x00A0 - 0x00AF - 8259A PIC 2 */ - ioport__register(kvm, 0x00A0, &dummy_read_write_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x00A0, &dummy_read_write_ioport_ops, 2, NULL); + if (r < 0) + return r; /* 00C0 - 001F - DMA2 controller */ - ioport__register(kvm, 0x00C0, &dummy_read_write_ioport_ops, 32, NULL); + r = ioport__register(kvm, 0x00C0, 
&dummy_read_write_ioport_ops, 32, NULL); + if (r < 0) + return r; /* PORT 00E0-00EF are 'motherboard specific' so we use them for our internal debugging purposes. */ - ioport__register(kvm, IOPORT_DBG, &debug_ops, 1, NULL); + r = ioport__register(kvm, IOPORT_DBG, &debug_ops, 1, NULL); + if (r < 0) + return r; /* PORT 00ED - DUMMY PORT FOR DELAY??? */ - ioport__register(kvm, 0x00ED, &dummy_write_only_ioport_ops, 1, NULL); + r = ioport__register(kvm, 0x00ED, &dummy_write_only_ioport_ops, 1, NULL); + if (r < 0) + return r; /* 0x00F0 - 0x00FF - Math co-processor */ - ioport__register(kvm, 0x00F0, &dummy_write_only_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x00F0, &dummy_write_only_ioport_ops, 2, NULL); + if (r < 0) + return r; /* PORT 0278-027A - PARALLEL PRINTER PORT (usually LPT1, sometimes LPT2) */ - ioport__register(kvm, 0x0278, &dummy_read_write_ioport_ops, 3, NULL); + r = ioport__register(kvm, 0x0278, &dummy_read_write_ioport_ops, 3, NULL); + if (r < 0) + return r; /* PORT 0378-037A - PARALLEL PRINTER PORT (usually LPT2, sometimes LPT3) */ - ioport__register(kvm, 0x0378, &dummy_read_write_ioport_ops, 3, NULL); + r = ioport__register(kvm, 0x0378, &dummy_read_write_ioport_ops, 3, NULL); + if (r < 0) + return r; /* PORT 03D4-03D5 - COLOR VIDEO - CRT CONTROL REGISTERS */ - ioport__register(kvm, 0x03D4, &dummy_read_write_ioport_ops, 1, NULL); - ioport__register(kvm, 0x03D5, &dummy_write_only_ioport_ops, 1, NULL); + r = ioport__register(kvm, 0x03D4, &dummy_read_write_ioport_ops, 1, NULL); + if (r < 0) + return r; + r = ioport__register(kvm, 0x03D5, &dummy_write_only_ioport_ops, 1, NULL); + if (r < 0) + return r; - ioport__register(kvm, 0x402, &seabios_debug_ops, 1, NULL); + r = ioport__register(kvm, 0x402, &seabios_debug_ops, 1, NULL); + if (r < 0) + return r; /* 0510 - QEMU BIOS configuration register */ - ioport__register(kvm, 0x510, &dummy_read_write_ioport_ops, 2, NULL); + r = ioport__register(kvm, 0x510, &dummy_read_write_ioport_ops, 2, NULL); + if (r < 0) 
+ return r; + + return 0; } From patchwork Thu Jan 23 13:47:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347939 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6FF3592A for ; Thu, 23 Jan 2020 13:48:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 57E7924688 for ; Thu, 23 Jan 2020 13:48:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729030AbgAWNsm (ORCPT ); Thu, 23 Jan 2020 08:48:42 -0500 Received: from foss.arm.com ([217.140.110.172]:39784 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729145AbgAWNsl (ORCPT ); Thu, 23 Jan 2020 08:48:41 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3D78B11B3; Thu, 23 Jan 2020 05:48:41 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 2D7713F68E; Thu, 23 Jan 2020 05:48:40 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 17/30] hw/vesa: Don't ignore fatal errors Date: Thu, 23 Jan 2020 13:47:52 +0000 Message-Id: <20200123134805.1993-18-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Failing an mmap call or failing to create a memslot means that device
emulation will not work, don't ignore it. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- hw/vesa.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index b92cc990b730..a665736a76d7 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -76,9 +76,11 @@ struct framebuffer *vesa__init(struct kvm *kvm) mem = mmap(NULL, VESA_MEM_SIZE, PROT_RW, MAP_ANON_NORESERVE, -1, 0); if (mem == MAP_FAILED) - ERR_PTR(-errno); + return ERR_PTR(-errno); - kvm__register_dev_mem(kvm, VESA_MEM_ADDR, VESA_MEM_SIZE, mem); + r = kvm__register_dev_mem(kvm, VESA_MEM_ADDR, VESA_MEM_SIZE, mem); + if (r < 0) + return ERR_PTR(r); vesafb = (struct framebuffer) { .width = VESA_WIDTH, From patchwork Thu Jan 23 13:47:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347965 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2D58892A for ; Thu, 23 Jan 2020 13:49:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 16C1A2467B for ; Thu, 23 Jan 2020 13:49:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729134AbgAWNso (ORCPT ); Thu, 23 Jan 2020 08:48:44 -0500 Received: from foss.arm.com ([217.140.110.172]:39794 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729182AbgAWNsm (ORCPT ); Thu, 23 Jan 2020 08:48:42 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7FF5DFEC; Thu, 23 Jan 2020 05:48:42 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 74ECE3F68E; Thu, 23 Jan 2020 05:48:41 -0800 (PST) From: 
Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 18/30] hw/vesa: Set the size for BAR 0 Date: Thu, 23 Jan 2020 13:47:53 +0000 Message-Id: <20200123134805.1993-19-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org BAR 0 is an I/O BAR and is registered as an ioport region. Let's set its size, so a guest can actually use it. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- hw/vesa.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/vesa.c b/hw/vesa.c index a665736a76d7..e988c0425946 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -70,6 +70,7 @@ struct framebuffer *vesa__init(struct kvm *kvm) vesa_base_addr = (u16)r; vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); + vesa_pci_device.bar_size[0] = PCI_IO_SIZE; r = device__register(&vesa_device); if (r < 0) return ERR_PTR(r); From patchwork Thu Jan 23 13:47:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347941 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 85C28159A for ; Thu, 23 Jan 2020 13:48:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6E3DA2467F for ; Thu, 23 Jan 2020 13:48:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729117AbgAWNso (ORCPT ); Thu, 23 Jan 2020 08:48:44 -0500 Received: from foss.arm.com ([217.140.110.172]:39800 
"EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729152AbgAWNso (ORCPT ); Thu, 23 Jan 2020 08:48:44 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C1B36FEC; Thu, 23 Jan 2020 05:48:43 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B95B43F68E; Thu, 23 Jan 2020 05:48:42 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 19/30] Use independent read/write locks for ioport and mmio Date: Thu, 23 Jan 2020 13:47:54 +0000 Message-Id: <20200123134805.1993-20-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org kvmtool uses brlock for protecting accesses to the ioport and mmio red-black trees. brlock allows concurrent reads, but only one writer, which is assumed not to be a VCPU thread. This is done by issuing a compiler barrier on read and pausing the entire virtual machine on writes. When KVM_BRLOCK_DEBUG is defined, brlock uses instead a pthread read/write lock. When we will implement reassignable BARs, the mmio or ioport mapping will be done as a result of a VCPU mmio access. When brlock is a read/write lock, it means that we will try to acquire a write lock with the read lock already held by the same VCPU and we will deadlock. When it's not, a VCPU will have to call kvm__pause, which means the virtual machine will stay paused forever. 
Let's avoid all this by using separate pthread_rwlock_t locks for the mmio and the ioport red-black trees and carefully choosing our read critical region such that modification as a result of a guest mmio access doesn't deadlock. Signed-off-by: Alexandru Elisei --- ioport.c | 20 +++++++++++--------- mmio.c | 26 +++++++++++++++++--------- 2 files changed, 28 insertions(+), 18 deletions(-) diff --git a/ioport.c b/ioport.c index d224819c6e43..c044a80dd763 100644 --- a/ioport.c +++ b/ioport.c @@ -2,9 +2,9 @@ #include "kvm/kvm.h" #include "kvm/util.h" -#include "kvm/brlock.h" #include "kvm/rbtree-interval.h" #include "kvm/mutex.h" +#include "kvm/rwsem.h" #include /* for KVM_EXIT_* */ #include @@ -16,6 +16,8 @@ #define ioport_node(n) rb_entry(n, struct ioport, node) +static DECLARE_RWSEM(ioport_lock); + static struct rb_root ioport_tree = RB_ROOT; static struct ioport *ioport_search(struct rb_root *root, u64 addr) @@ -68,7 +70,7 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i struct ioport *entry; int r; - br_write_lock(kvm); + down_write(&ioport_lock); entry = ioport_search(&ioport_tree, port); if (entry) { @@ -96,7 +98,7 @@ int ioport__register(struct kvm *kvm, u16 port, struct ioport_operations *ops, i r = device__register(&entry->dev_hdr); if (r < 0) goto out_erase; - br_write_unlock(kvm); + up_write(&ioport_lock); return port; @@ -104,7 +106,7 @@ out_erase: rb_int_erase(&ioport_tree, &entry->node); out_free: free(entry); - br_write_unlock(kvm); + up_write(&ioport_lock); return r; } @@ -113,7 +115,7 @@ int ioport__unregister(struct kvm *kvm, u16 port) struct ioport *entry; int r; - br_write_lock(kvm); + down_write(&ioport_lock); r = -ENOENT; entry = ioport_search(&ioport_tree, port); @@ -128,7 +130,7 @@ int ioport__unregister(struct kvm *kvm, u16 port) r = 0; done: - br_write_unlock(kvm); + up_write(&ioport_lock); return r; } @@ -171,8 +173,10 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, void 
*ptr = data; struct kvm *kvm = vcpu->kvm; - br_read_lock(kvm); + down_read(&ioport_lock); entry = ioport_search(&ioport_tree, port); + up_read(&ioport_lock); + if (!entry) goto out; @@ -188,8 +192,6 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, } out: - br_read_unlock(kvm); - if (ret) return true; diff --git a/mmio.c b/mmio.c index 61e1d47a587d..4e0ff830c738 100644 --- a/mmio.c +++ b/mmio.c @@ -1,7 +1,7 @@ #include "kvm/kvm.h" #include "kvm/kvm-cpu.h" #include "kvm/rbtree-interval.h" -#include "kvm/brlock.h" +#include "kvm/rwsem.h" #include #include @@ -15,6 +15,8 @@ #define mmio_node(n) rb_entry(n, struct mmio_mapping, node) +static DECLARE_RWSEM(mmio_lock); + struct mmio_mapping { struct rb_int_node node; void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr); @@ -61,7 +63,7 @@ static const char *to_direction(u8 is_write) int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool coalesce, void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr), - void *ptr) + void *ptr) { struct mmio_mapping *mmio; struct kvm_coalesced_mmio_zone zone; @@ -88,9 +90,9 @@ int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 phys_addr_len, bool c return -errno; } } - br_write_lock(kvm); + down_write(&mmio_lock); ret = mmio_insert(&mmio_tree, mmio); - br_write_unlock(kvm); + up_write(&mmio_lock); return ret; } @@ -100,10 +102,10 @@ bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr) struct mmio_mapping *mmio; struct kvm_coalesced_mmio_zone zone; - br_write_lock(kvm); + down_write(&mmio_lock); mmio = mmio_search_single(&mmio_tree, phys_addr); if (mmio == NULL) { - br_write_unlock(kvm); + up_write(&mmio_lock); return false; } @@ -114,7 +116,7 @@ bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr) ioctl(kvm->vm_fd, KVM_UNREGISTER_COALESCED_MMIO, &zone); rb_int_erase(&mmio_tree, &mmio->node); - br_write_unlock(kvm); + up_write(&mmio_lock); 
free(mmio); return true; @@ -124,8 +126,15 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u { struct mmio_mapping *mmio; - br_read_lock(vcpu->kvm); + /* + * The callback might call kvm__register_mmio which takes a write lock, + * so avoid deadlocks by protecting only the node search with a reader + * lock. Note that there is still a small time window for a node to be + * deleted by another vcpu before mmio_fn gets called. + */ + down_read(&mmio_lock); mmio = mmio_search(&mmio_tree, phys_addr, len); + up_read(&mmio_lock); if (mmio) mmio->mmio_fn(vcpu, phys_addr, data, len, is_write, mmio->ptr); @@ -135,7 +144,6 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u to_direction(is_write), (unsigned long long)phys_addr, len); } - br_read_unlock(vcpu->kvm); return true; } From patchwork Thu Jan 23 13:47:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347943 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 63756159A for ; Thu, 23 Jan 2020 13:48:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4BD5C24685 for ; Thu, 23 Jan 2020 13:48:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729147AbgAWNsq (ORCPT ); Thu, 23 Jan 2020 08:48:46 -0500 Received: from foss.arm.com ([217.140.110.172]:39806 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729190AbgAWNsp (ORCPT ); Thu, 23 Jan 2020 08:48:45 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1B190FEC; Thu, 23 Jan 2020 05:48:45 -0800 (PST) Received: from e123195-lin.cambridge.arm.com 
(e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 08C2F3F68E; Thu, 23 Jan 2020 05:48:43 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 20/30] pci: Add helpers for BAR values and memory/IO space access Date: Thu, 23 Jan 2020 13:47:55 +0000 Message-Id: <20200123134805.1993-21-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org We're going to be checking the BAR type, the address written to it and if access to memory or I/O space is enabled quite often when we add support for reassignable BARs, so add helpers for it. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- include/kvm/pci.h | 48 +++++++++++++++++++++++++++++++++++++++++++++++ pci.c | 2 +- 2 files changed, 49 insertions(+), 1 deletion(-) diff --git a/include/kvm/pci.h b/include/kvm/pci.h index ccb155e3e8fe..235cd82fff3c 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -5,6 +5,7 @@ #include #include #include +#include #include "kvm/devices.h" #include "kvm/msi.h" @@ -161,4 +162,51 @@ void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, void *pci_find_cap(struct pci_device_header *hdr, u8 cap_type); +static inline bool __pci__memory_space_enabled(u16 command) +{ + return command & PCI_COMMAND_MEMORY; +} + +static inline bool pci__memory_space_enabled(struct pci_device_header *pci_hdr) +{ + return __pci__memory_space_enabled(pci_hdr->command); +} + +static inline bool __pci__io_space_enabled(u16 command) +{ + return command & PCI_COMMAND_IO; +} + +static inline bool pci__io_space_enabled(struct
pci_device_header *pci_hdr) +{ + return __pci__io_space_enabled(pci_hdr->command); +} + +static inline bool __pci__bar_is_io(u32 bar) +{ + return bar & PCI_BASE_ADDRESS_SPACE_IO; +} + +static inline bool pci__bar_is_io(struct pci_device_header *pci_hdr, int bar_num) +{ + return __pci__bar_is_io(pci_hdr->bar[bar_num]); +} + +static inline bool pci__bar_is_memory(struct pci_device_header *pci_hdr, int bar_num) +{ + return !pci__bar_is_io(pci_hdr, bar_num); +} + +static inline u32 __pci__bar_address(u32 bar) +{ + if (__pci__bar_is_io(bar)) + return bar & PCI_BASE_ADDRESS_IO_MASK; + return bar & PCI_BASE_ADDRESS_MEM_MASK; +} + +static inline u32 pci__bar_address(struct pci_device_header *pci_hdr, int bar_num) +{ + return __pci__bar_address(pci_hdr->bar[bar_num]); +} + #endif /* KVM__PCI_H */ diff --git a/pci.c b/pci.c index b6892d974c08..4f7b863298f6 100644 --- a/pci.c +++ b/pci.c @@ -185,7 +185,7 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, * size, it will write the address back. 
*/ if (bar < 6) { - if (pci_hdr->bar[bar] & PCI_BASE_ADDRESS_SPACE_IO) + if (pci__bar_is_io(pci_hdr, bar)) mask = (u32)PCI_BASE_ADDRESS_IO_MASK; else mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; From patchwork Thu Jan 23 13:47:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347945 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2A8D992A for ; Thu, 23 Jan 2020 13:48:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0921A24689 for ; Thu, 23 Jan 2020 13:48:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729143AbgAWNss (ORCPT ); Thu, 23 Jan 2020 08:48:48 -0500 Received: from foss.arm.com ([217.140.110.172]:39816 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729162AbgAWNsr (ORCPT ); Thu, 23 Jan 2020 08:48:47 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 50408FEC; Thu, 23 Jan 2020 05:48:46 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 4FB433F68E; Thu, 23 Jan 2020 05:48:45 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 21/30] virtio/pci: Get emulated region address from BARs Date: Thu, 23 Jan 2020 13:47:56 +0000 Message-Id: <20200123134805.1993-22-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> 
MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The struct virtio_pci fields port_addr, mmio_addr and msix_io_block represent the same addresses that are written in the corresponding BARs. Remove this duplication of information and always use the address from the BAR. This will make our life a lot easier when we add support for reassignable BARs, because we won't have to update the fields on each BAR change. No functional changes. Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- include/kvm/virtio-pci.h | 3 -- virtio/pci.c | 86 ++++++++++++++++++++++++++-------------- 2 files changed, 56 insertions(+), 33 deletions(-) diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h index 278a25950d8b..959b4b81c871 100644 --- a/include/kvm/virtio-pci.h +++ b/include/kvm/virtio-pci.h @@ -24,8 +24,6 @@ struct virtio_pci { void *dev; struct kvm *kvm; - u16 port_addr; - u32 mmio_addr; u8 status; u8 isr; u32 features; @@ -43,7 +41,6 @@ struct virtio_pci { u32 config_gsi; u32 vq_vector[VIRTIO_PCI_MAX_VQ]; u32 gsis[VIRTIO_PCI_MAX_VQ]; - u32 msix_io_block; u64 msix_pba; struct msix_table msix_table[VIRTIO_PCI_MAX_VQ + VIRTIO_PCI_MAX_CONFIG]; diff --git a/virtio/pci.c b/virtio/pci.c index 6723a1f3a84d..c4822514856c 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -13,6 +13,21 @@ #include #include +static u16 virtio_pci__port_addr(struct virtio_pci *vpci) +{ + return pci__bar_address(&vpci->pci_hdr, 0); +} + +static u32 virtio_pci__mmio_addr(struct virtio_pci *vpci) +{ + return pci__bar_address(&vpci->pci_hdr, 1); +} + +static u32 virtio_pci__msix_io_addr(struct virtio_pci *vpci) +{ + return pci__bar_address(&vpci->pci_hdr, 2); +} + static void virtio_pci__ioevent_callback(struct kvm *kvm, void *param) { struct virtio_pci_ioevent_param *ioeventfd = param; @@ -25,6 +40,8 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde { struct ioevent ioevent; struct virtio_pci 
*vpci = vdev->virtio; + u32 mmio_addr = virtio_pci__mmio_addr(vpci); + u16 port_addr = virtio_pci__port_addr(vpci); int r, flags = 0; int fd; @@ -48,7 +65,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde flags |= IOEVENTFD_FLAG_USER_POLL; /* ioport */ - ioevent.io_addr = vpci->port_addr + VIRTIO_PCI_QUEUE_NOTIFY; + ioevent.io_addr = port_addr + VIRTIO_PCI_QUEUE_NOTIFY; ioevent.io_len = sizeof(u16); ioevent.fd = fd = eventfd(0, 0); r = ioeventfd__add_event(&ioevent, flags | IOEVENTFD_FLAG_PIO); @@ -56,7 +73,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde return r; /* mmio */ - ioevent.io_addr = vpci->mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY; + ioevent.io_addr = mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY; ioevent.io_len = sizeof(u16); ioevent.fd = eventfd(0, 0); r = ioeventfd__add_event(&ioevent, flags); @@ -68,7 +85,7 @@ static int virtio_pci__init_ioeventfd(struct kvm *kvm, struct virtio_device *vde return 0; free_ioport_evt: - ioeventfd__del_event(vpci->port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); + ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); return r; } @@ -76,9 +93,11 @@ static void virtio_pci_exit_vq(struct kvm *kvm, struct virtio_device *vdev, int vq) { struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr = virtio_pci__mmio_addr(vpci); + u16 port_addr = virtio_pci__port_addr(vpci); - ioeventfd__del_event(vpci->mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); - ioeventfd__del_event(vpci->port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); + ioeventfd__del_event(mmio_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); + ioeventfd__del_event(port_addr + VIRTIO_PCI_QUEUE_NOTIFY, vq); virtio_exit_vq(kvm, vdev, vpci->dev, vq); } @@ -163,10 +182,12 @@ static bool virtio_pci__io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 p unsigned long offset; struct virtio_device *vdev; struct virtio_pci *vpci; + u16 port_addr; vdev = ioport->priv; vpci = vdev->virtio; - offset = port - vpci->port_addr; + port_addr = 
virtio_pci__port_addr(vpci); + offset = port - port_addr; return virtio_pci__data_in(vcpu, vdev, offset, data, size); } @@ -323,10 +344,12 @@ static bool virtio_pci__io_out(struct ioport *ioport, struct kvm_cpu *vcpu, u16 unsigned long offset; struct virtio_device *vdev; struct virtio_pci *vpci; + u16 port_addr; vdev = ioport->priv; vpci = vdev->virtio; - offset = port - vpci->port_addr; + port_addr = virtio_pci__port_addr(vpci); + offset = port - port_addr; return virtio_pci__data_out(vcpu, vdev, offset, data, size); } @@ -343,17 +366,18 @@ static void virtio_pci__msix_mmio_callback(struct kvm_cpu *vcpu, struct virtio_device *vdev = ptr; struct virtio_pci *vpci = vdev->virtio; struct msix_table *table; + u32 msix_io_addr = virtio_pci__msix_io_addr(vpci); int vecnum; size_t offset; - if (addr > vpci->msix_io_block + PCI_IO_SIZE) { + if (addr > msix_io_addr + PCI_IO_SIZE) { if (is_write) return; table = (struct msix_table *)&vpci->msix_pba; - offset = addr - (vpci->msix_io_block + PCI_IO_SIZE); + offset = addr - (msix_io_addr + PCI_IO_SIZE); } else { table = vpci->msix_table; - offset = addr - vpci->msix_io_block; + offset = addr - msix_io_addr; } vecnum = offset / sizeof(struct msix_table); offset = offset % sizeof(struct msix_table); @@ -442,19 +466,20 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, { struct virtio_device *vdev = ptr; struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr = virtio_pci__mmio_addr(vpci); if (!is_write) - virtio_pci__data_in(vcpu, vdev, addr - vpci->mmio_addr, - data, len); + virtio_pci__data_in(vcpu, vdev, addr - mmio_addr, data, len); else - virtio_pci__data_out(vcpu, vdev, addr - vpci->mmio_addr, - data, len); + virtio_pci__data_out(vcpu, vdev, addr - mmio_addr, data, len); } int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { struct virtio_pci *vpci = vdev->virtio; + u32 mmio_addr, msix_io_block; + u16 port_addr; int r; vpci->kvm = kvm; @@ 
-462,20 +487,21 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); - r = pci_get_io_port_block(PCI_IO_SIZE); - r = ioport__register(kvm, r, &virtio_pci__io_ops, PCI_IO_SIZE, vdev); + port_addr = pci_get_io_port_block(PCI_IO_SIZE); + r = ioport__register(kvm, port_addr, &virtio_pci__io_ops, PCI_IO_SIZE, + vdev); if (r < 0) return r; - vpci->port_addr = (u16)r; + port_addr = (u16)r; - vpci->mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); - r = kvm__register_mmio(kvm, vpci->mmio_addr, PCI_IO_SIZE, false, + mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); + r = kvm__register_mmio(kvm, mmio_addr, PCI_IO_SIZE, false, virtio_pci__io_mmio_callback, vdev); if (r < 0) goto free_ioport; - vpci->msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); - r = kvm__register_mmio(kvm, vpci->msix_io_block, PCI_IO_SIZE * 2, false, + msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); + r = kvm__register_mmio(kvm, msix_io_block, PCI_IO_SIZE * 2, false, virtio_pci__msix_mmio_callback, vdev); if (r < 0) goto free_mmio; @@ -491,11 +517,11 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, .class[2] = (class >> 16) & 0xff, .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), .subsys_id = cpu_to_le16(subsys_id), - .bar[0] = cpu_to_le32(vpci->port_addr + .bar[0] = cpu_to_le32(port_addr | PCI_BASE_ADDRESS_SPACE_IO), - .bar[1] = cpu_to_le32(vpci->mmio_addr + .bar[1] = cpu_to_le32(mmio_addr | PCI_BASE_ADDRESS_SPACE_MEMORY), - .bar[2] = cpu_to_le32(vpci->msix_io_block + .bar[2] = cpu_to_le32(msix_io_block | PCI_BASE_ADDRESS_SPACE_MEMORY), .status = cpu_to_le16(PCI_STATUS_CAP_LIST), .capabilities = (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr, @@ -542,11 +568,11 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, return 0; free_msix_mmio: - kvm__deregister_mmio(kvm, vpci->msix_io_block); + kvm__deregister_mmio(kvm, msix_io_block); free_mmio: - 
kvm__deregister_mmio(kvm, vpci->mmio_addr); + kvm__deregister_mmio(kvm, mmio_addr); free_ioport: - ioport__unregister(kvm, vpci->port_addr); + ioport__unregister(kvm, port_addr); return r; } @@ -566,9 +592,9 @@ int virtio_pci__exit(struct kvm *kvm, struct virtio_device *vdev) struct virtio_pci *vpci = vdev->virtio; virtio_pci__reset(kvm, vdev); - kvm__deregister_mmio(kvm, vpci->mmio_addr); - kvm__deregister_mmio(kvm, vpci->msix_io_block); - ioport__unregister(kvm, vpci->port_addr); + kvm__deregister_mmio(kvm, virtio_pci__mmio_addr(vpci)); + kvm__deregister_mmio(kvm, virtio_pci__msix_io_addr(vpci)); + ioport__unregister(kvm, virtio_pci__port_addr(vpci)); return 0; } From patchwork Thu Jan 23 13:47:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347949 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4ED04159A for ; Thu, 23 Jan 2020 13:48:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3753024686 for ; Thu, 23 Jan 2020 13:48:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729196AbgAWNst (ORCPT ); Thu, 23 Jan 2020 08:48:49 -0500 Received: from foss.arm.com ([217.140.110.172]:39822 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729195AbgAWNss (ORCPT ); Thu, 23 Jan 2020 08:48:48 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 83807106F; Thu, 23 Jan 2020 05:48:47 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 840883F68E; Thu, 23 Jan 2020 05:48:46 -0800 (PST) From: Alexandru Elisei To: 
kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org
Subject: [PATCH v2 kvmtool 22/30] vfio: Destroy memslot when unmapping the associated VAs
Date: Thu, 23 Jan 2020 13:47:57 +0000
Message-Id: <20200123134805.1993-23-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com>
References: <20200123134805.1993-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

When we want to map a device region into the guest address space, first we perform an mmap on the device fd. The resulting VMA is a mapping between host userspace addresses and physical addresses associated with the device. Next, we create a memslot, which populates the stage 2 table with the mappings between guest physical addresses and the device physical addresses.

However, when we want to unmap the device from the guest address space, we only call munmap, which destroys the VMA and the stage 2 mappings, but doesn't destroy the memslot and kvmtool's internal mem_bank structure associated with the memslot.

This has been perfectly fine so far, because we only unmap a device region when we exit kvmtool. This will change when we add support for reassignable BARs, and we will have to unmap vfio regions as the guest kernel writes new addresses in the BARs. This can lead to two possible problems:

- We refuse to create a valid BAR mapping because of a stale mem_bank structure which belonged to a previously unmapped region.

- It is possible that the mmap in vfio_map_region returns the same address that was used to create a memslot, but was unmapped by vfio_unmap_region. Guest accesses to the device memory will fault because the stage 2 mappings are missing, and this can lead to performance degradation.
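
The first problem can be illustrated with a toy model. This is only a sketch under stated assumptions, not kvmtool code: toy_bank, toy_register_mem and toy_destroy_mem are made-up names, the bank table is a fixed array instead of a list, and no KVM ioctl is issued. It shows how a bank entry that is never removed keeps rejecting a range that has already been unmapped:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_BANKS	8

/* Hypothetical stand-in for kvmtool's mem_bank bookkeeping. */
struct toy_bank {
	uint64_t	guest_phys;
	uint64_t	size;
	bool		in_use;
};

static struct toy_bank toy_banks[TOY_MAX_BANKS];

/* Returns 0 on success, -1 if the range overlaps a bank or the table is full. */
int toy_register_mem(uint64_t guest_phys, uint64_t size)
{
	uint64_t end = guest_phys + size - 1;
	struct toy_bank *free_bank = NULL;
	size_t i;

	assert(size > 0);

	for (i = 0; i < TOY_MAX_BANKS; i++) {
		struct toy_bank *bank = &toy_banks[i];
		uint64_t bank_end;

		if (!bank->in_use) {
			/* Remember the first unused entry. */
			if (!free_bank)
				free_bank = bank;
			continue;
		}

		bank_end = bank->guest_phys + bank->size - 1;
		/* A stale entry is indistinguishable from a live one here. */
		if (guest_phys <= bank_end && end >= bank->guest_phys)
			return -1;
	}

	if (!free_bank)
		return -1;

	free_bank->guest_phys = guest_phys;
	free_bank->size = size;
	free_bank->in_use = true;

	return 0;
}

/* Returns 0 if a matching bank was found and released, -1 otherwise. */
int toy_destroy_mem(uint64_t guest_phys, uint64_t size)
{
	size_t i;

	for (i = 0; i < TOY_MAX_BANKS; i++) {
		struct toy_bank *bank = &toy_banks[i];

		if (bank->in_use && bank->guest_phys == guest_phys &&
		    bank->size == size) {
			bank->in_use = false;
			return 0;
		}
	}

	return -1;
}
```

In this model, registering the same range twice fails until toy_destroy_mem is called for it, which mirrors a BAR being unmapped and then mapped again at the same guest address.
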
Let's do the right thing and destroy the memslot and the mem_bank struct associated with it when we unmap a vfio region. Set host_addr to NULL after the munmap call so we won't try to unmap an address which is currently used if vfio_unmap_region gets called twice. Signed-off-by: Alexandru Elisei --- include/kvm/kvm.h | 2 ++ kvm.c | 65 ++++++++++++++++++++++++++++++++++++++++++++--- vfio/core.c | 6 +++++ 3 files changed, 69 insertions(+), 4 deletions(-) diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h index 50119a8672eb..c7e57b890cdd 100644 --- a/include/kvm/kvm.h +++ b/include/kvm/kvm.h @@ -56,6 +56,7 @@ struct kvm_mem_bank { void *host_addr; u64 size; enum kvm_mem_type type; + u32 slot; }; struct kvm { @@ -106,6 +107,7 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level); void kvm__irq_trigger(struct kvm *kvm, int irq); bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int direction, int size, u32 count); bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 len, u8 is_write); +int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr); int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr, enum kvm_mem_type type); static inline int kvm__register_ram(struct kvm *kvm, u64 guest_phys, u64 size, diff --git a/kvm.c b/kvm.c index 57c4ff98ec4c..afcf55c7bf45 100644 --- a/kvm.c +++ b/kvm.c @@ -183,20 +183,75 @@ int kvm__exit(struct kvm *kvm) } core_exit(kvm__exit); +int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, + void *userspace_addr) +{ + struct kvm_userspace_memory_region mem; + struct kvm_mem_bank *bank; + int ret; + + list_for_each_entry(bank, &kvm->mem_banks, list) + if (bank->guest_phys_addr == guest_phys && + bank->size == size && bank->host_addr == userspace_addr) + break; + + if (&bank->list == &kvm->mem_banks) { + pr_err("Region [%llx-%llx] not found", guest_phys, + guest_phys + size - 1); + return -EINVAL; + } + + if (bank->type == 
KVM_MEM_TYPE_RESERVED) { + pr_err("Cannot delete reserved region [%llx-%llx]", + guest_phys, guest_phys + size - 1); + return -EINVAL; + } + + mem = (struct kvm_userspace_memory_region) { + .slot = bank->slot, + .guest_phys_addr = guest_phys, + .memory_size = 0, + .userspace_addr = (unsigned long)userspace_addr, + }; + + ret = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &mem); + if (ret < 0) + return -errno; + + list_del(&bank->list); + free(bank); + kvm->mem_slots--; + + return 0; +} + int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void *userspace_addr, enum kvm_mem_type type) { struct kvm_userspace_memory_region mem; struct kvm_mem_bank *merged = NULL; struct kvm_mem_bank *bank; + struct list_head *prev_entry; + u32 slot; int ret; - /* Check for overlap */ + /* Check for overlap and find first empty slot. */ + slot = 0; + prev_entry = &kvm->mem_banks; list_for_each_entry(bank, &kvm->mem_banks, list) { u64 bank_end = bank->guest_phys_addr + bank->size - 1; u64 end = guest_phys + size - 1; - if (guest_phys > bank_end || end < bank->guest_phys_addr) + if (guest_phys > bank_end || end < bank->guest_phys_addr) { + /* + * Keep the banks sorted ascending by slot, so it's + * easier for us to find a free slot. 
+ */ + if (bank->slot == slot) { + slot++; + prev_entry = &bank->list; + } continue; + } /* Merge overlapping reserved regions */ if (bank->type == KVM_MEM_TYPE_RESERVED && @@ -241,10 +296,11 @@ int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, bank->host_addr = userspace_addr; bank->size = size; bank->type = type; + bank->slot = slot; if (type != KVM_MEM_TYPE_RESERVED) { mem = (struct kvm_userspace_memory_region) { - .slot = kvm->mem_slots++, + .slot = slot, .guest_phys_addr = guest_phys, .memory_size = size, .userspace_addr = (unsigned long)userspace_addr, @@ -255,7 +311,8 @@ int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, return -errno; } - list_add(&bank->list, &kvm->mem_banks); + list_add(&bank->list, prev_entry); + kvm->mem_slots++; return 0; } diff --git a/vfio/core.c b/vfio/core.c index 0ed1e6fee6bf..73fdac8be675 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -256,8 +256,14 @@ int vfio_map_region(struct kvm *kvm, struct vfio_device *vdev, void vfio_unmap_region(struct kvm *kvm, struct vfio_region *region) { + u64 map_size; + if (region->host_addr) { + map_size = ALIGN(region->info.size, PAGE_SIZE); munmap(region->host_addr, region->info.size); + kvm__destroy_mem(kvm, region->guest_phys_addr, map_size, + region->host_addr); + region->host_addr = NULL; } else if (region->is_ioport) { ioport__unregister(kvm, region->port_base); } else { From patchwork Thu Jan 23 13:47:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347947 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 87B97159A for ; Thu, 23 Jan 2020 13:48:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6E7222467B for ; Thu, 23 Jan 2020 13:48:50 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729214AbgAWNst (ORCPT ); Thu, 23 Jan 2020 08:48:49 -0500
Received: from foss.arm.com ([217.140.110.172]:39832 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729191AbgAWNst (ORCPT ); Thu, 23 Jan 2020 08:48:49 -0500
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B91C3FEC; Thu, 23 Jan 2020 05:48:48 -0800 (PST)
Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id B79BA3F68E; Thu, 23 Jan 2020 05:48:47 -0800 (PST)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org
Subject: [PATCH v2 kvmtool 23/30] vfio: Reserve ioports when configuring the BAR
Date: Thu, 23 Jan 2020 13:47:58 +0000
Message-Id: <20200123134805.1993-24-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com>
References: <20200123134805.1993-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

Let's be consistent and reserve ioports when we are configuring the BAR, not when we map it, just like we do with mmio regions.
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- vfio/core.c | 9 +++------ vfio/pci.c | 4 +++- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/vfio/core.c b/vfio/core.c index 73fdac8be675..6b9b58ea8d2f 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -202,14 +202,11 @@ static int vfio_setup_trap_region(struct kvm *kvm, struct vfio_device *vdev, struct vfio_region *region) { if (region->is_ioport) { - int port = pci_get_io_port_block(region->info.size); - - port = ioport__register(kvm, port, &vfio_ioport_ops, - region->info.size, region); + int port = ioport__register(kvm, region->port_base, + &vfio_ioport_ops, region->info.size, + region); if (port < 0) return port; - - region->port_base = port; return 0; } diff --git a/vfio/pci.c b/vfio/pci.c index f86a7d9b7032..abde16dc8693 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -885,7 +885,9 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, } } - if (!region->is_ioport) { + if (region->is_ioport) { + region->port_base = pci_get_io_port_block(region->info.size); + } else { /* Grab some MMIO space in the guest */ map_size = ALIGN(region->info.size, PAGE_SIZE); region->guest_phys_addr = pci_get_mmio_block(map_size); From patchwork Thu Jan 23 13:47:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347951 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B886892A for ; Thu, 23 Jan 2020 13:48:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9F57524688 for ; Thu, 23 Jan 2020 13:48:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729208AbgAWNsw (ORCPT ); Thu, 23 Jan 2020 08:48:52 -0500 Received: from foss.arm.com ([217.140.110.172]:39838 
"EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729221AbgAWNsu (ORCPT ); Thu, 23 Jan 2020 08:48:50 -0500
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EC71AFEC; Thu, 23 Jan 2020 05:48:49 -0800 (PST)
Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id ED6593F68E; Thu, 23 Jan 2020 05:48:48 -0800 (PST)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org
Subject: [PATCH v2 kvmtool 24/30] vfio/pci: Don't write configuration value twice
Date: Thu, 23 Jan 2020 13:47:59 +0000
Message-Id: <20200123134805.1993-25-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com>
References: <20200123134805.1993-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

After writing to the device fd as part of the PCI configuration space emulation, we read back from the device to make sure that the write finished. The value is read back into the PCI configuration space and afterwards, the same value is copied by the PCI emulation code. Let's read from the device fd into a temporary variable, to prevent this double write.

The double write is harmless in itself. But when we implement reassignable BARs, we need to keep track of the old BAR value, and the VFIO code is overwriting it.
Signed-off-by: Alexandru Elisei --- vfio/pci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/vfio/pci.c b/vfio/pci.c index abde16dc8693..8a775a4a4a54 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -470,7 +470,7 @@ static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hd struct vfio_region_info *info; struct vfio_pci_device *pdev; struct vfio_device *vdev; - void *base = pci_hdr; + u32 tmp; if (offset == PCI_ROM_ADDRESS) return; @@ -490,7 +490,7 @@ static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hd if (pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSI) vfio_pci_msi_cap_write(kvm, vdev, offset, data, sz); - if (pread(vdev->fd, base + offset, sz, info->offset + offset) != sz) + if (pread(vdev->fd, &tmp, sz, info->offset + offset) != sz) vfio_dev_warn(vdev, "Failed to read %d bytes from Configuration Space at 0x%x", sz, offset); } From patchwork Thu Jan 23 13:48:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347953 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 158F6159A for ; Thu, 23 Jan 2020 13:48:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id DE7A424684 for ; Thu, 23 Jan 2020 13:48:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729239AbgAWNsy (ORCPT ); Thu, 23 Jan 2020 08:48:54 -0500 Received: from foss.arm.com ([217.140.110.172]:39846 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729108AbgAWNsw (ORCPT ); Thu, 23 Jan 2020 08:48:52 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4B2F7FEC; Thu, 23 Jan 2020 05:48:51 -0800 (PST) Received: 
from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 2CC683F68E; Thu, 23 Jan 2020 05:48:50 -0800 (PST)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org
Subject: [PATCH v2 kvmtool 25/30] pci: Implement callbacks for toggling BAR emulation
Date: Thu, 23 Jan 2020 13:48:00 +0000
Message-Id: <20200123134805.1993-26-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com>
References: <20200123134805.1993-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

Implement callbacks for activating and deactivating emulation for a BAR region. This is in preparation for allowing a guest operating system to enable and disable access to I/O or memory space, or to reassign the BARs.

The emulated vesa device has been refactored in the process and the static variables were removed in order to make using the callbacks less painful. The framebuffer isn't designed to allow stopping and restarting at arbitrary points in the guest execution. Furthermore, on x86, the kernel will not change the BAR addresses, which on bare metal are programmed by the firmware, so take the easy way out and refuse to deactivate emulation for the BAR regions.
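
A minimal sketch of how such callbacks might be wired up, assuming a much simplified header structure. All toy_* names are hypothetical and only loosely modelled on the registration helper added by this patch; endianness conversion and the kvm pointer are left out. BARs with a non-zero size are activated at registration time only if the matching enable bit is set in the command register:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_BARS		6
#define TOY_COMMAND_IO		0x1	/* same bit as PCI_COMMAND_IO */
#define TOY_COMMAND_MEMORY	0x2	/* same bit as PCI_COMMAND_MEMORY */
#define TOY_BAR_SPACE_IO	0x1	/* same bit as PCI_BASE_ADDRESS_SPACE_IO */

struct toy_pci_hdr;

typedef int (*toy_bar_fn_t)(struct toy_pci_hdr *hdr, int bar_num, void *data);

/* Hypothetical, heavily trimmed stand-in for a PCI device header. */
struct toy_pci_hdr {
	uint16_t	command;
	uint32_t	bar[TOY_NUM_BARS];
	uint32_t	bar_size[TOY_NUM_BARS];
	toy_bar_fn_t	bar_activate_fn;
	toy_bar_fn_t	bar_deactivate_fn;
	void		*data;
};

/* Store the callbacks and activate every implemented, enabled BAR. */
int toy_register_bar_regions(struct toy_pci_hdr *hdr, toy_bar_fn_t activate,
			     toy_bar_fn_t deactivate, void *data)
{
	int i, r;

	assert(activate && deactivate);

	hdr->bar_activate_fn = activate;
	hdr->bar_deactivate_fn = deactivate;
	hdr->data = data;

	for (i = 0; i < TOY_NUM_BARS; i++) {
		bool is_io;
		uint16_t enable_bit;

		/* A BAR with size 0 is treated as not implemented. */
		if (!hdr->bar_size[i])
			continue;

		is_io = hdr->bar[i] & TOY_BAR_SPACE_IO;
		enable_bit = is_io ? TOY_COMMAND_IO : TOY_COMMAND_MEMORY;
		if (!(hdr->command & enable_bit))
			continue;

		r = activate(hdr, i, data);
		if (r < 0)
			return r;
	}

	return 0;
}

/* Example activate callback: counts activations through the data pointer. */
int toy_count_activate(struct toy_pci_hdr *hdr, int bar_num, void *data)
{
	(void)hdr;
	(void)bar_num;
	(*(int *)data)++;
	return 0;
}

/* Example deactivate callback that always refuses, as the vesa device does. */
int toy_refuse_deactivate(struct toy_pci_hdr *hdr, int bar_num, void *data)
{
	(void)hdr;
	(void)bar_num;
	(void)data;
	return -1;
}
```

With a header whose command register only has the memory bit set, an I/O BAR stays inactive after registration while a memory BAR is activated once.
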
Signed-off-by: Alexandru Elisei --- hw/vesa.c | 120 ++++++++++++++++++++++++++++++++-------------- include/kvm/pci.h | 19 +++++++- pci.c | 44 +++++++++++++++++ vfio/pci.c | 100 +++++++++++++++++++++++++++++++------- virtio/pci.c | 90 ++++++++++++++++++++++++---------- 5 files changed, 294 insertions(+), 79 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index e988c0425946..74ebebbefa6b 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -18,6 +18,12 @@ #include #include +struct vesa_dev { + struct pci_device_header pci_hdr; + struct device_header dev_hdr; + struct framebuffer fb; +}; + static bool vesa_pci_io_in(struct ioport *ioport, struct kvm_cpu *vcpu, u16 port, void *data, int size) { return true; @@ -33,29 +39,52 @@ static struct ioport_operations vesa_io_ops = { .io_out = vesa_pci_io_out, }; -static struct pci_device_header vesa_pci_device = { - .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), - .device_id = cpu_to_le16(PCI_DEVICE_ID_VESA), - .header_type = PCI_HEADER_TYPE_NORMAL, - .revision_id = 0, - .class[2] = 0x03, - .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), - .subsys_id = cpu_to_le16(PCI_SUBSYSTEM_ID_VESA), - .bar[1] = cpu_to_le32(VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY), - .bar_size[1] = VESA_MEM_SIZE, -}; +static int vesa__bar_activate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct vesa_dev *vdev = data; + u32 bar_addr, bar_size; + char *mem; + int r; -static struct device_header vesa_device = { - .bus_type = DEVICE_BUS_PCI, - .data = &vesa_pci_device, -}; + bar_addr = pci__bar_address(pci_hdr, bar_num); + bar_size = pci_hdr->bar_size[bar_num]; -static struct framebuffer vesafb; + switch (bar_num) { + case 0: + r = ioport__register(kvm, bar_addr, &vesa_io_ops, bar_size, + NULL); + break; + case 1: + mem = mmap(NULL, bar_size, PROT_RW, MAP_ANON_NORESERVE, -1, 0); + if (mem == MAP_FAILED) { + r = -errno; + break; + } + r = kvm__register_dev_mem(kvm, bar_addr, bar_size, 
mem); + if (r < 0) + break; + vdev->fb.mem = mem; + break; + default: + r = -EINVAL; + } + + return r; +} + +static int vesa__bar_deactivate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + return -EINVAL; +} struct framebuffer *vesa__init(struct kvm *kvm) { - u16 vesa_base_addr; - char *mem; + struct vesa_dev *vdev; + u16 port_addr; int r; BUILD_BUG_ON(!is_power_of_two(VESA_MEM_SIZE)); @@ -63,34 +92,51 @@ struct framebuffer *vesa__init(struct kvm *kvm) if (!kvm->cfg.vnc && !kvm->cfg.sdl && !kvm->cfg.gtk) return NULL; - r = pci_get_io_port_block(PCI_IO_SIZE); - r = ioport__register(kvm, r, &vesa_io_ops, PCI_IO_SIZE, NULL); - if (r < 0) - return ERR_PTR(r); - vesa_base_addr = (u16)r; - vesa_pci_device.bar[0] = cpu_to_le32(vesa_base_addr | PCI_BASE_ADDRESS_SPACE_IO); - vesa_pci_device.bar_size[0] = PCI_IO_SIZE; - r = device__register(&vesa_device); - if (r < 0) - return ERR_PTR(r); + vdev = calloc(1, sizeof(*vdev)); + if (vdev == NULL) + return ERR_PTR(-ENOMEM); - mem = mmap(NULL, VESA_MEM_SIZE, PROT_RW, MAP_ANON_NORESERVE, -1, 0); - if (mem == MAP_FAILED) - return ERR_PTR(-errno); + port_addr = pci_get_io_port_block(PCI_IO_SIZE); - r = kvm__register_dev_mem(kvm, VESA_MEM_ADDR, VESA_MEM_SIZE, mem); - if (r < 0) - return ERR_PTR(r); + vdev->pci_hdr = (struct pci_device_header) { + .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), + .device_id = cpu_to_le16(PCI_DEVICE_ID_VESA), + .command = PCI_COMMAND_IO | PCI_COMMAND_MEMORY, + .header_type = PCI_HEADER_TYPE_NORMAL, + .revision_id = 0, + .class[2] = 0x03, + .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), + .subsys_id = cpu_to_le16(PCI_SUBSYSTEM_ID_VESA), + .bar[0] = cpu_to_le32(port_addr | PCI_BASE_ADDRESS_SPACE_IO), + .bar_size[0] = PCI_IO_SIZE, + .bar[1] = cpu_to_le32(VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY), + .bar_size[1] = VESA_MEM_SIZE, + }; - vesafb = (struct framebuffer) { + vdev->fb = (struct framebuffer) { .width = VESA_WIDTH, 
.height = VESA_HEIGHT, .depth = VESA_BPP, - .mem = mem, + .mem = NULL, .mem_addr = VESA_MEM_ADDR, .mem_size = VESA_MEM_SIZE, .kvm = kvm, }; - return fb__register(&vesafb); + + r = pci__register_bar_regions(kvm, &vdev->pci_hdr, vesa__bar_activate, + vesa__bar_deactivate, vdev); + if (r < 0) + return ERR_PTR(r); + + vdev->dev_hdr = (struct device_header) { + .bus_type = DEVICE_BUS_PCI, + .data = &vdev->pci_hdr, + }; + + r = device__register(&vdev->dev_hdr); + if (r < 0) + return ERR_PTR(r); + + return fb__register(&vdev->fb); } diff --git a/include/kvm/pci.h b/include/kvm/pci.h index 235cd82fff3c..bf42f497168f 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -89,12 +89,19 @@ struct pci_cap_hdr { u8 next; }; +struct pci_device_header; + +typedef int (*bar_activate_fn_t)(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data); +typedef int (*bar_deactivate_fn_t)(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data); + #define PCI_BAR_OFFSET(b) (offsetof(struct pci_device_header, bar[b])) #define PCI_DEV_CFG_SIZE 256 #define PCI_DEV_CFG_MASK (PCI_DEV_CFG_SIZE - 1) -struct pci_device_header; - struct pci_config_operations { void (*write)(struct kvm *kvm, struct pci_device_header *pci_hdr, u8 offset, void *data, int sz); @@ -136,6 +143,9 @@ struct pci_device_header { /* Private to lkvm */ u32 bar_size[6]; + bar_activate_fn_t bar_activate_fn; + bar_deactivate_fn_t bar_deactivate_fn; + void *data; struct pci_config_operations cfg_ops; /* * PCI INTx# are level-triggered, but virtual device often feature @@ -160,8 +170,13 @@ void pci__assign_irq(struct device_header *dev_hdr); void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size); void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size); + void *pci_find_cap(struct pci_device_header *hdr, u8 cap_type); +int pci__register_bar_regions(struct kvm *kvm, struct pci_device_header *pci_hdr, + 
bar_activate_fn_t bar_activate_fn, + bar_deactivate_fn_t bar_deactivate_fn, void *data); + static inline bool __pci__memory_space_enabled(u16 command) { return command & PCI_COMMAND_MEMORY; diff --git a/pci.c b/pci.c index 4f7b863298f6..5412f2defa2e 100644 --- a/pci.c +++ b/pci.c @@ -66,6 +66,11 @@ void pci__assign_irq(struct device_header *dev_hdr) pci_hdr->irq_type = IRQ_TYPE_EDGE_RISING; } +static bool pci_bar_is_implemented(struct pci_device_header *pci_hdr, int bar_num) +{ + return bar_num < 6 && pci_hdr->bar_size[bar_num]; +} + static void *pci_config_address_ptr(u16 port) { unsigned long offset; @@ -264,6 +269,45 @@ struct pci_device_header *pci__find_dev(u8 dev_num) return hdr->data; } +int pci__register_bar_regions(struct kvm *kvm, struct pci_device_header *pci_hdr, + bar_activate_fn_t bar_activate_fn, + bar_deactivate_fn_t bar_deactivate_fn, void *data) +{ + int i, r; + bool has_bar_regions = false; + + assert(bar_activate_fn && bar_deactivate_fn); + + pci_hdr->bar_activate_fn = bar_activate_fn; + pci_hdr->bar_deactivate_fn = bar_deactivate_fn; + pci_hdr->data = data; + + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(pci_hdr, i)) + continue; + + has_bar_regions = true; + + if (pci__bar_is_io(pci_hdr, i) && + pci__io_space_enabled(pci_hdr)) { + r = bar_activate_fn(kvm, pci_hdr, i, data); + if (r < 0) + return r; + } + + if (pci__bar_is_memory(pci_hdr, i) && + pci__memory_space_enabled(pci_hdr)) { + r = bar_activate_fn(kvm, pci_hdr, i, data); + if (r < 0) + return r; + } + } + + assert(has_bar_regions); + + return 0; +} + int pci__init(struct kvm *kvm) { int r; diff --git a/vfio/pci.c b/vfio/pci.c index 8a775a4a4a54..9e595562180b 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -446,6 +446,83 @@ out_unlock: mutex_unlock(&pdev->msi.mutex); } +static int vfio_pci_bar_activate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct vfio_device *vdev = data; + struct vfio_pci_device *pdev = &vdev->pci; + struct 
vfio_pci_msix_pba *pba = &pdev->msix_pba; + struct vfio_pci_msix_table *table = &pdev->msix_table; + struct vfio_region *region = &vdev->regions[bar_num]; + int ret; + + if (!region->info.size) { + ret = -EINVAL; + goto out; + } + + if ((pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) && + (u32)bar_num == table->bar) { + ret = kvm__register_mmio(kvm, table->guest_phys_addr, + table->size, false, + vfio_pci_msix_table_access, pdev); + if (ret < 0 || table->bar!= pba->bar) + goto out; + } + + if ((pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) && + (u32)bar_num == pba->bar) { + ret = kvm__register_mmio(kvm, pba->guest_phys_addr, + pba->size, false, + vfio_pci_msix_pba_access, pdev); + goto out; + } + + ret = vfio_map_region(kvm, vdev, region); +out: + return ret; +} + +static int vfio_pci_bar_deactivate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct vfio_device *vdev = data; + struct vfio_pci_device *pdev = &vdev->pci; + struct vfio_pci_msix_pba *pba = &pdev->msix_pba; + struct vfio_pci_msix_table *table = &pdev->msix_table; + struct vfio_region *region = &vdev->regions[bar_num]; + int ret; + bool success; + + if (!region->info.size) { + ret = -EINVAL; + goto out; + } + + if ((pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) && + (u32)bar_num == table->bar) { + success = kvm__deregister_mmio(kvm, table->guest_phys_addr); + ret = (success ? 0 : -EINVAL); + if (ret < 0 || table->bar!= pba->bar) + goto out; + } + + if ((pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) && + (u32)bar_num == pba->bar) { + success = kvm__deregister_mmio(kvm, pba->guest_phys_addr); + ret = (success ? 
0 : -EINVAL); + goto out; + } + + vfio_unmap_region(kvm, region); + ret = 0; + +out: + return ret; +} + static void vfio_pci_cfg_read(struct kvm *kvm, struct pci_device_header *pci_hdr, u8 offset, void *data, int sz) { @@ -804,12 +881,6 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev) ret = -ENOMEM; goto out_free; } - pba->guest_phys_addr = table->guest_phys_addr + table->size; - - ret = kvm__register_mmio(kvm, table->guest_phys_addr, table->size, - false, vfio_pci_msix_table_access, pdev); - if (ret < 0) - goto out_free; /* * We could map the physical PBA directly into the guest, but it's @@ -819,10 +890,7 @@ static int vfio_pci_create_msix_table(struct kvm *kvm, struct vfio_device *vdev) * between MSI-X table and PBA. For the sake of isolation, create a * virtual PBA. */ - ret = kvm__register_mmio(kvm, pba->guest_phys_addr, pba->size, false, - vfio_pci_msix_pba_access, pdev); - if (ret < 0) - goto out_free; + pba->guest_phys_addr = table->guest_phys_addr + table->size; pdev->msix.entries = entries; pdev->msix.nr_entries = nr_entries; @@ -893,11 +961,6 @@ static int vfio_pci_configure_bar(struct kvm *kvm, struct vfio_device *vdev, region->guest_phys_addr = pci_get_mmio_block(map_size); } - /* Map the BARs into the guest or setup a trap region. 
*/ - ret = vfio_map_region(kvm, vdev, region); - if (ret) - return ret; - return 0; } @@ -944,7 +1007,12 @@ static int vfio_pci_configure_dev_regions(struct kvm *kvm, } /* We've configured the BARs, fake up a Configuration Space */ - return vfio_pci_fixup_cfg_space(vdev); + ret = vfio_pci_fixup_cfg_space(vdev); + if (ret) + return ret; + + return pci__register_bar_regions(kvm, &pdev->hdr, vfio_pci_bar_activate, + vfio_pci_bar_deactivate, vdev); } /* diff --git a/virtio/pci.c b/virtio/pci.c index c4822514856c..5a3cc6f1e943 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -474,6 +474,65 @@ static void virtio_pci__io_mmio_callback(struct kvm_cpu *vcpu, virtio_pci__data_out(vcpu, vdev, addr - mmio_addr, data, len); } +static int virtio_pci__bar_activate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + struct virtio_device *vdev = data; + u32 bar_addr, bar_size; + int r; + + bar_addr = pci__bar_address(pci_hdr, bar_num); + bar_size = pci_hdr->bar_size[bar_num]; + + switch (bar_num) { + case 0: + r = ioport__register(kvm, bar_addr, &virtio_pci__io_ops, + bar_size, vdev); + if (r > 0) + r = 0; + break; + case 1: + r = kvm__register_mmio(kvm, bar_addr, bar_size, false, + virtio_pci__io_mmio_callback, vdev); + break; + case 2: + r = kvm__register_mmio(kvm, bar_addr, bar_size, false, + virtio_pci__msix_mmio_callback, vdev); + break; + default: + r = -EINVAL; + } + + return r; +} + +static int virtio_pci__bar_deactivate(struct kvm *kvm, + struct pci_device_header *pci_hdr, + int bar_num, void *data) +{ + u32 bar_addr; + bool success; + int r; + + bar_addr = pci__bar_address(pci_hdr, bar_num); + + switch (bar_num) { + case 0: + r = ioport__unregister(kvm, bar_addr); + break; + case 1: + case 2: + success = kvm__deregister_mmio(kvm, bar_addr); + r = (success ? 
0 : -EINVAL); + break; + default: + r = -EINVAL; + } + + return r; +} + int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, int device_id, int subsys_id, int class) { @@ -488,23 +547,8 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, BUILD_BUG_ON(!is_power_of_two(PCI_IO_SIZE)); port_addr = pci_get_io_port_block(PCI_IO_SIZE); - r = ioport__register(kvm, port_addr, &virtio_pci__io_ops, PCI_IO_SIZE, - vdev); - if (r < 0) - return r; - port_addr = (u16)r; - mmio_addr = pci_get_mmio_block(PCI_IO_SIZE); - r = kvm__register_mmio(kvm, mmio_addr, PCI_IO_SIZE, false, - virtio_pci__io_mmio_callback, vdev); - if (r < 0) - goto free_ioport; - msix_io_block = pci_get_mmio_block(PCI_IO_SIZE * 2); - r = kvm__register_mmio(kvm, msix_io_block, PCI_IO_SIZE * 2, false, - virtio_pci__msix_mmio_callback, vdev); - if (r < 0) - goto free_mmio; vpci->pci_hdr = (struct pci_device_header) { .vendor_id = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET), @@ -530,6 +574,12 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, .bar_size[2] = cpu_to_le32(PCI_IO_SIZE*2), }; + r = pci__register_bar_regions(kvm, &vpci->pci_hdr, + virtio_pci__bar_activate, + virtio_pci__bar_deactivate, vdev); + if (r < 0) + return r; + vpci->dev_hdr = (struct device_header) { .bus_type = DEVICE_BUS_PCI, .data = &vpci->pci_hdr, @@ -560,20 +610,12 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, r = device__register(&vpci->dev_hdr); if (r < 0) - goto free_msix_mmio; + return r; /* save the IRQ that device__register() has allocated */ vpci->legacy_irq_line = vpci->pci_hdr.irq_line; return 0; - -free_msix_mmio: - kvm__deregister_mmio(kvm, msix_io_block); -free_mmio: - kvm__deregister_mmio(kvm, mmio_addr); -free_ioport: - ioport__unregister(kvm, port_addr); - return r; } int virtio_pci__reset(struct kvm *kvm, struct virtio_device *vdev) From patchwork Thu Jan 23 13:48:01 2020 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347963 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6291492A for ; Thu, 23 Jan 2020 13:49:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4AD6820661 for ; Thu, 23 Jan 2020 13:49:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729162AbgAWNsx (ORCPT ); Thu, 23 Jan 2020 08:48:53 -0500 Received: from foss.arm.com ([217.140.110.172]:39858 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729225AbgAWNsx (ORCPT ); Thu, 23 Jan 2020 08:48:53 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 836C2106F; Thu, 23 Jan 2020 05:48:52 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 823BA3F68E; Thu, 23 Jan 2020 05:48:51 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 26/30] pci: Toggle BAR I/O and memory space emulation Date: Thu, 23 Jan 2020 13:48:01 +0000 Message-Id: <20200123134805.1993-27-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org During configuration of the BAR addresses, a Linux guest disables and enables access to I/O and memory space. 
When access is disabled, we don't stop emulating the memory regions described by the BARs. Now that we have callbacks for activating and deactivating emulation for a BAR region, let's use that to stop emulation when access is disabled, and re-activate it when access is re-enabled. The vesa emulation hasn't been designed with toggling on and off in mind, so refuse writes to the PCI command register that disable memory or IO access. Signed-off-by: Alexandru Elisei --- hw/vesa.c | 16 ++++++++++++++++ pci.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 58 insertions(+) diff --git a/hw/vesa.c b/hw/vesa.c index 74ebebbefa6b..3044a86078fb 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -81,6 +81,18 @@ static int vesa__bar_deactivate(struct kvm *kvm, return -EINVAL; } +static void vesa__pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hdr, + u8 offset, void *data, int sz) +{ + u32 value; + + if (offset == PCI_COMMAND) { + memcpy(&value, data, sz); + value |= (PCI_COMMAND_IO | PCI_COMMAND_MEMORY); + memcpy(data, &value, sz); + } +} + struct framebuffer *vesa__init(struct kvm *kvm) { struct vesa_dev *vdev; @@ -114,6 +126,10 @@ struct framebuffer *vesa__init(struct kvm *kvm) .bar_size[1] = VESA_MEM_SIZE, }; + vdev->pci_hdr.cfg_ops = (struct pci_config_operations) { + .write = vesa__pci_cfg_write, + }; + vdev->fb = (struct framebuffer) { .width = VESA_WIDTH, .height = VESA_HEIGHT, diff --git a/pci.c b/pci.c index 5412f2defa2e..98331a1fc205 100644 --- a/pci.c +++ b/pci.c @@ -157,6 +157,42 @@ static struct ioport_operations pci_config_data_ops = { .io_out = pci_config_data_out, }; +static void pci_config_command_wr(struct kvm *kvm, + struct pci_device_header *pci_hdr, + u16 new_command) +{ + int i; + bool toggle_io, toggle_mem; + + toggle_io = (pci_hdr->command ^ new_command) & PCI_COMMAND_IO; + toggle_mem = (pci_hdr->command ^ new_command) & PCI_COMMAND_MEMORY; + + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(pci_hdr, i)) + continue; + + if 
(toggle_io && pci__bar_is_io(pci_hdr, i)) { + if (__pci__io_space_enabled(new_command)) + pci_hdr->bar_activate_fn(kvm, pci_hdr, i, + pci_hdr->data); + else + pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, + pci_hdr->data); + } + + if (toggle_mem && pci__bar_is_memory(pci_hdr, i)) { + if (__pci__memory_space_enabled(new_command)) + pci_hdr->bar_activate_fn(kvm, pci_hdr, i, + pci_hdr->data); + else + pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, + pci_hdr->data); + } + } + + pci_hdr->command = new_command; +} + void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size) { void *base; @@ -182,6 +218,12 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, if (*(u32 *)(base + offset) == 0) return; + if (offset == PCI_COMMAND) { + memcpy(&value, data, size); + pci_config_command_wr(kvm, pci_hdr, (u16)value); + return; + } + bar = (offset - PCI_BAR_OFFSET(0)) / sizeof(u32); /* From patchwork Thu Jan 23 13:48:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347955 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7455692A for ; Thu, 23 Jan 2020 13:48:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 427C22467B for ; Thu, 23 Jan 2020 13:48:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729253AbgAWNsz (ORCPT ); Thu, 23 Jan 2020 08:48:55 -0500 Received: from foss.arm.com ([217.140.110.172]:39866 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729238AbgAWNsy (ORCPT ); Thu, 23 Jan 2020 08:48:54 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BDE00FEC; Thu, 23 Jan 2020 
05:48:53 -0800 (PST)
Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id BCDA73F68E; Thu, 23 Jan 2020 05:48:52 -0800 (PST)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org
Subject: [PATCH v2 kvmtool 27/30] pci: Implement reassignable BARs
Date: Thu, 23 Jan 2020 13:48:02 +0000
Message-Id: <20200123134805.1993-28-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com>
References: <20200123134805.1993-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

BARs are used by the guest to configure access to the PCI device by writing the address to which the device will respond. The basic idea for adding support for reassignable BARs is straightforward: deactivate emulation for the memory region described by the old BAR value, and activate emulation for the new region.

BAR reassignment can be done while device access is enabled, and memory regions for different devices can overlap as long as no access is made to the overlapping memory regions. This means that it is legal for the BARs of two distinct devices to point to an overlapping memory region, and indeed, this is how Linux does resource assignment at boot. To account for this situation, the simple algorithm described above is enhanced to scan all devices and:

- Deactivate emulation for any BARs that might overlap with the new BAR value.
- Enable emulation for any BARs that were overlapping with the old value after the BAR has been updated.

Activating/deactivating emulation of a memory region has side effects. In order to prevent the execution of the same callback twice we now keep track of the state of the region emulation.
For example, this can happen if we program a BAR with an address that overlaps a second BAR, thus deactivating emulation for the second BAR, and then we disable all region accesses to the second BAR by writing to the command register. Signed-off-by: Alexandru Elisei --- hw/vesa.c | 6 +- include/kvm/pci.h | 23 +++- pci.c | 274 +++++++++++++++++++++++++++++++++++--------- powerpc/spapr_pci.c | 2 +- vfio/pci.c | 15 ++- virtio/pci.c | 8 +- 6 files changed, 261 insertions(+), 67 deletions(-) diff --git a/hw/vesa.c b/hw/vesa.c index 3044a86078fb..aca938f79c82 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -49,7 +49,7 @@ static int vesa__bar_activate(struct kvm *kvm, int r; bar_addr = pci__bar_address(pci_hdr, bar_num); - bar_size = pci_hdr->bar_size[bar_num]; + bar_size = pci__bar_size(pci_hdr, bar_num); switch (bar_num) { case 0: @@ -121,9 +121,9 @@ struct framebuffer *vesa__init(struct kvm *kvm) .subsys_vendor_id = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET), .subsys_id = cpu_to_le16(PCI_SUBSYSTEM_ID_VESA), .bar[0] = cpu_to_le32(port_addr | PCI_BASE_ADDRESS_SPACE_IO), - .bar_size[0] = PCI_IO_SIZE, + .bar_info[0] = (struct pci_bar_info) {.size = PCI_IO_SIZE}, .bar[1] = cpu_to_le32(VESA_MEM_ADDR | PCI_BASE_ADDRESS_SPACE_MEMORY), - .bar_size[1] = VESA_MEM_SIZE, + .bar_info[1] = (struct pci_bar_info) {.size = VESA_MEM_SIZE}, }; vdev->pci_hdr.cfg_ops = (struct pci_config_operations) { diff --git a/include/kvm/pci.h b/include/kvm/pci.h index bf42f497168f..ae71ef33237c 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -11,6 +11,17 @@ #include "kvm/msi.h" #include "kvm/fdt.h" +#define pci_dev_err(pci_hdr, fmt, ...) \ + pr_err("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_warn(pci_hdr, fmt, ...) \ + pr_warning("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_info(pci_hdr, fmt, ...) 
\ + pr_info("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_dbg(pci_hdr, fmt, ...) \ + pr_debug("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) +#define pci_dev_die(pci_hdr, fmt, ...) \ + die("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) + /* * PCI Configuration Mechanism #1 I/O ports. See Section 3.7.4.1. * ("Configuration Mechanism #1") of the PCI Local Bus Specification 2.1 for @@ -89,6 +100,11 @@ struct pci_cap_hdr { u8 next; }; +struct pci_bar_info { + u32 size; + bool active; +}; + struct pci_device_header; typedef int (*bar_activate_fn_t)(struct kvm *kvm, @@ -142,7 +158,7 @@ struct pci_device_header { }; /* Private to lkvm */ - u32 bar_size[6]; + struct pci_bar_info bar_info[6]; bar_activate_fn_t bar_activate_fn; bar_deactivate_fn_t bar_deactivate_fn; void *data; @@ -224,4 +240,9 @@ static inline u32 pci__bar_address(struct pci_device_header *pci_hdr, int bar_nu return __pci__bar_address(pci_hdr->bar[bar_num]); } +static inline u32 pci__bar_size(struct pci_device_header *pci_hdr, int bar_num) +{ + return pci_hdr->bar_info[bar_num].size; +} + #endif /* KVM__PCI_H */ diff --git a/pci.c b/pci.c index 98331a1fc205..1e9791250bc3 100644 --- a/pci.c +++ b/pci.c @@ -68,7 +68,7 @@ void pci__assign_irq(struct device_header *dev_hdr) static bool pci_bar_is_implemented(struct pci_device_header *pci_hdr, int bar_num) { - return bar_num < 6 && pci_hdr->bar_size[bar_num]; + return bar_num < 6 && pci__bar_size(pci_hdr, bar_num); } static void *pci_config_address_ptr(u16 port) @@ -157,6 +157,46 @@ static struct ioport_operations pci_config_data_ops = { .io_out = pci_config_data_out, }; +static int pci_activate_bar(struct kvm *kvm, struct pci_device_header *pci_hdr, + int bar_num) +{ + int r = 0; + + if (pci_hdr->bar_info[bar_num].active) + goto out; + + r = pci_hdr->bar_activate_fn(kvm, pci_hdr, bar_num, pci_hdr->data); + if (r < 0) { + pci_dev_err(pci_hdr, "Error activating 
emulation for BAR %d", + bar_num); + goto out; + } + pci_hdr->bar_info[bar_num].active = true; + +out: + return r; +} + +static int pci_deactivate_bar(struct kvm *kvm, struct pci_device_header *pci_hdr, + int bar_num) +{ + int r = 0; + + if (!pci_hdr->bar_info[bar_num].active) + goto out; + + r = pci_hdr->bar_deactivate_fn(kvm, pci_hdr, bar_num, pci_hdr->data); + if (r < 0) { + pci_dev_err(pci_hdr, "Error deactivating emulation for BAR %d", + bar_num); + goto out; + } + pci_hdr->bar_info[bar_num].active = false; + +out: + return r; +} + static void pci_config_command_wr(struct kvm *kvm, struct pci_device_header *pci_hdr, u16 new_command) @@ -173,26 +213,179 @@ static void pci_config_command_wr(struct kvm *kvm, if (toggle_io && pci__bar_is_io(pci_hdr, i)) { if (__pci__io_space_enabled(new_command)) - pci_hdr->bar_activate_fn(kvm, pci_hdr, i, - pci_hdr->data); - else - pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, - pci_hdr->data); + pci_activate_bar(kvm, pci_hdr, i); + if (!__pci__io_space_enabled(new_command)) + pci_deactivate_bar(kvm, pci_hdr, i); } if (toggle_mem && pci__bar_is_memory(pci_hdr, i)) { if (__pci__memory_space_enabled(new_command)) - pci_hdr->bar_activate_fn(kvm, pci_hdr, i, - pci_hdr->data); - else - pci_hdr->bar_deactivate_fn(kvm, pci_hdr, i, - pci_hdr->data); + pci_activate_bar(kvm, pci_hdr, i); + if (!__pci__memory_space_enabled(new_command)) + pci_deactivate_bar(kvm, pci_hdr, i); } } pci_hdr->command = new_command; } +static int pci_deactivate_bar_regions(struct kvm *kvm, + struct pci_device_header *pci_hdr, + u32 start, u32 size) +{ + struct device_header *dev_hdr; + struct pci_device_header *tmp_hdr; + u32 tmp_addr, tmp_size; + int i, r; + + dev_hdr = device__first_dev(DEVICE_BUS_PCI); + while (dev_hdr) { + tmp_hdr = dev_hdr->data; + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(tmp_hdr, i)) + continue; + + tmp_addr = pci__bar_address(tmp_hdr, i); + tmp_size = pci__bar_size(tmp_hdr, i); + + if (tmp_addr + tmp_size <= start || + 
tmp_addr >= start + size) + continue; + + r = pci_deactivate_bar(kvm, tmp_hdr, i); + if (r < 0) + return r; + } + dev_hdr = device__next_dev(dev_hdr); + } + + return 0; +} + +static int pci_activate_bar_regions(struct kvm *kvm, + struct pci_device_header *pci_hdr, + u32 start, u32 size) +{ + struct device_header *dev_hdr; + struct pci_device_header *tmp_hdr; + u32 tmp_addr, tmp_size; + int i, r; + + dev_hdr = device__first_dev(DEVICE_BUS_PCI); + while (dev_hdr) { + tmp_hdr = dev_hdr->data; + for (i = 0; i < 6; i++) { + if (!pci_bar_is_implemented(tmp_hdr, i)) + continue; + + tmp_addr = pci__bar_address(tmp_hdr, i); + tmp_size = pci__bar_size(tmp_hdr, i); + + if (tmp_addr + tmp_size <= start || + tmp_addr >= start + size) + continue; + + r = pci_activate_bar(kvm, tmp_hdr, i); + if (r < 0) + return r; + } + dev_hdr = device__next_dev(dev_hdr); + } + + return 0; +} + +static void pci_config_bar_wr(struct kvm *kvm, + struct pci_device_header *pci_hdr, int bar_num, + u32 value) +{ + u32 old_addr, new_addr, bar_size; + u32 mask; + int r; + + if (pci__bar_is_io(pci_hdr, bar_num)) + mask = (u32)PCI_BASE_ADDRESS_IO_MASK; + else + mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; + + /* + * If the kernel masks the BAR, it will expect to find the size of the + * BAR there next time it reads from it. After the kernel reads the + * size, it will write the address back. + * + * According to the PCI local bus specification REV 3.0: The number of + * upper bits that a device actually implements depends on how much of + * the address space the device will respond to. A device that wants a 1 + * MB memory address space (using a 32-bit base address register) would + * build the top 12 bits of the address register, hardwiring the other + * bits to 0. + * + * Furthermore, software can determine how much address space the device + * requires by writing a value of all 1's to the register and then + * reading the value back. 
The device will return 0's in all don't-care + * address bits, effectively specifying the address space required. + * + * Software computes the size of the address space with the formula + * S = ~B + 1, where S is the memory size and B is the value read from + * the BAR. This means that the BAR value that kvmtool should return is + * B = ~(S - 1). + */ + if (value == 0xffffffff) { + value = ~(pci__bar_size(pci_hdr, bar_num) - 1); + /* Preserve the special bits. */ + value = (value & mask) | (pci_hdr->bar[bar_num] & ~mask); + pci_hdr->bar[bar_num] = value; + return; + } + + value = (value & mask) | (pci_hdr->bar[bar_num] & ~mask); + + /* Don't toggle emulation when region type access is disabled. */ + if (pci__bar_is_io(pci_hdr, bar_num) && + !pci__io_space_enabled(pci_hdr)) { + pci_hdr->bar[bar_num] = value; + return; + } + + if (pci__bar_is_memory(pci_hdr, bar_num) && + !pci__memory_space_enabled(pci_hdr)) { + pci_hdr->bar[bar_num] = value; + return; + } + + old_addr = pci__bar_address(pci_hdr, bar_num); + new_addr = __pci__bar_address(value); + bar_size = pci__bar_size(pci_hdr, bar_num); + + r = pci_deactivate_bar(kvm, pci_hdr, bar_num); + if (r < 0) + return; + + r = pci_deactivate_bar_regions(kvm, pci_hdr, new_addr, bar_size); + if (r < 0) { + /* + * We cannot update the BAR because of an overlapping region + * that failed to deactivate emulation, so keep the old BAR + * value and re-activate emulation for it. + */ + pci_activate_bar(kvm, pci_hdr, bar_num); + return; + } + + pci_hdr->bar[bar_num] = value; + r = pci_activate_bar(kvm, pci_hdr, bar_num); + if (r < 0) { + /* + * New region cannot be emulated, re-enable the regions that + * were overlapping. 
+ */ + pci_activate_bar_regions(kvm, pci_hdr, new_addr, bar_size); + return; + } + + pci_activate_bar_regions(kvm, pci_hdr, old_addr, bar_size); +} + void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size) { void *base; @@ -200,7 +393,6 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; u32 value = 0; - u32 mask; if (!pci_device_exists(addr.bus_number, dev_num, 0)) return; @@ -225,46 +417,13 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, } bar = (offset - PCI_BAR_OFFSET(0)) / sizeof(u32); - - /* - * If the kernel masks the BAR, it will expect to find the size of the - * BAR there next time it reads from it. After the kernel reads the - * size, it will write the address back. - */ if (bar < 6) { - if (pci__bar_is_io(pci_hdr, bar)) - mask = (u32)PCI_BASE_ADDRESS_IO_MASK; - else - mask = (u32)PCI_BASE_ADDRESS_MEM_MASK; - /* - * According to the PCI local bus specification REV 3.0: - * The number of upper bits that a device actually implements - * depends on how much of the address space the device will - * respond to. A device that wants a 1 MB memory address space - * (using a 32-bit base address register) would build the top - * 12 bits of the address register, hardwiring the other bits - * to 0. - * - * Furthermore, software can determine how much address space - * the device requires by writing a value of all 1's to the - * register and then reading the value back. The device will - * return 0's in all don't-care address bits, effectively - * specifying the address space required. - * - * Software computes the size of the address space with the - * formula S = ~B + 1, where S is the memory size and B is the - * value read from the BAR. This means that the BAR value that - * kvmtool should return is B = ~(S - 1). 
- */ memcpy(&value, data, size); - if (value == 0xffffffff) - value = ~(pci_hdr->bar_size[bar] - 1); - /* Preserve the special bits. */ - value = (value & mask) | (pci_hdr->bar[bar] & ~mask); - memcpy(base + offset, &value, size); - } else { - memcpy(base + offset, data, size); + pci_config_bar_wr(kvm, pci_hdr, bar, value); + return; } + + memcpy(base + offset, data, size); } void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size) @@ -329,20 +488,21 @@ int pci__register_bar_regions(struct kvm *kvm, struct pci_device_header *pci_hdr continue; has_bar_regions = true; + assert(!pci_hdr->bar_info[i].active); if (pci__bar_is_io(pci_hdr, i) && pci__io_space_enabled(pci_hdr)) { - r = bar_activate_fn(kvm, pci_hdr, i, data); - if (r < 0) - return r; - } + r = pci_activate_bar(kvm, pci_hdr, i); + if (r < 0) + return r; + } if (pci__bar_is_memory(pci_hdr, i) && pci__memory_space_enabled(pci_hdr)) { - r = bar_activate_fn(kvm, pci_hdr, i, data); - if (r < 0) - return r; - } + r = pci_activate_bar(kvm, pci_hdr, i); + if (r < 0) + return r; + } } assert(has_bar_regions); diff --git a/powerpc/spapr_pci.c b/powerpc/spapr_pci.c index a15f7d895a46..7be44d950acb 100644 --- a/powerpc/spapr_pci.c +++ b/powerpc/spapr_pci.c @@ -369,7 +369,7 @@ int spapr_populate_pci_devices(struct kvm *kvm, of_pci_b_ddddd(devid) | of_pci_b_fff(fn) | of_pci_b_rrrrrrrr(bars[i])); - reg[n+1].size = cpu_to_be64(hdr->bar_size[i]); + reg[n+1].size = cpu_to_be64(pci__bar_size(hdr, i)); reg[n+1].addr = 0; assigned_addresses[n].phys_hi = cpu_to_be32( diff --git a/vfio/pci.c b/vfio/pci.c index 9e595562180b..3a641e72e574 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -455,6 +455,7 @@ static int vfio_pci_bar_activate(struct kvm *kvm, struct vfio_pci_msix_pba *pba = &pdev->msix_pba; struct vfio_pci_msix_table *table = &pdev->msix_table; struct vfio_region *region = &vdev->regions[bar_num]; + u32 bar_addr; int ret; if (!region->info.size) { @@ -462,8 +463,11 @@ static int 
vfio_pci_bar_activate(struct kvm *kvm, goto out; } + bar_addr = pci__bar_address(pci_hdr, bar_num); + if ((pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) && (u32)bar_num == table->bar) { + table->guest_phys_addr = region->guest_phys_addr = bar_addr; ret = kvm__register_mmio(kvm, table->guest_phys_addr, table->size, false, vfio_pci_msix_table_access, pdev); @@ -473,13 +477,22 @@ static int vfio_pci_bar_activate(struct kvm *kvm, if ((pdev->irq_modes & VFIO_PCI_IRQ_MODE_MSIX) && (u32)bar_num == pba->bar) { + if (pba->bar == table->bar) + pba->guest_phys_addr = table->guest_phys_addr + table->size; + else + pba->guest_phys_addr = region->guest_phys_addr = bar_addr; ret = kvm__register_mmio(kvm, pba->guest_phys_addr, pba->size, false, vfio_pci_msix_pba_access, pdev); goto out; } + if (pci__bar_is_io(pci_hdr, bar_num)) + region->port_base = bar_addr; + else + region->guest_phys_addr = bar_addr; ret = vfio_map_region(kvm, vdev, region); + out: return ret; } @@ -749,7 +762,7 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) if (!base) continue; - pdev->hdr.bar_size[i] = region->info.size; + pdev->hdr.bar_info[i].size = region->info.size; } /* I really can't be bothered to support cardbus. 
*/ diff --git a/virtio/pci.c b/virtio/pci.c index 5a3cc6f1e943..e02430881394 100644 --- a/virtio/pci.c +++ b/virtio/pci.c @@ -483,7 +483,7 @@ static int virtio_pci__bar_activate(struct kvm *kvm, int r; bar_addr = pci__bar_address(pci_hdr, bar_num); - bar_size = pci_hdr->bar_size[bar_num]; + bar_size = pci__bar_size(pci_hdr, bar_num); switch (bar_num) { case 0: @@ -569,9 +569,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct virtio_device *vdev, | PCI_BASE_ADDRESS_SPACE_MEMORY), .status = cpu_to_le16(PCI_STATUS_CAP_LIST), .capabilities = (void *)&vpci->pci_hdr.msix - (void *)&vpci->pci_hdr, - .bar_size[0] = cpu_to_le32(PCI_IO_SIZE), - .bar_size[1] = cpu_to_le32(PCI_IO_SIZE), - .bar_size[2] = cpu_to_le32(PCI_IO_SIZE*2), + .bar_info[0] = (struct pci_bar_info) {.size = cpu_to_le32(PCI_IO_SIZE)}, + .bar_info[1] = (struct pci_bar_info) {.size = cpu_to_le32(PCI_IO_SIZE)}, + .bar_info[2] = (struct pci_bar_info) {.size = cpu_to_le32(PCI_IO_SIZE*2)}, }; r = pci__register_bar_regions(kvm, &vpci->pci_hdr, From patchwork Thu Jan 23 13:48:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347957 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D8E7917EF for ; Thu, 23 Jan 2020 13:48:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BABAF2468A for ; Thu, 23 Jan 2020 13:48:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729240AbgAWNs4 (ORCPT ); Thu, 23 Jan 2020 08:48:56 -0500 Received: from foss.arm.com ([217.140.110.172]:39872 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729251AbgAWNsz (ORCPT ); Thu, 23 Jan 2020 08:48:55 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by 
usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 234B2FEC; Thu, 23 Jan 2020 05:48:55 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id F3AB63F68E; Thu, 23 Jan 2020 05:48:53 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org, Julien Thierry Subject: [PATCH v2 kvmtool 28/30] arm/fdt: Remove 'linux,pci-probe-only' property Date: Thu, 23 Jan 2020 13:48:03 +0000 Message-Id: <20200123134805.1993-29-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Julien Thierry PCI now supports configurable BARs. Get rid of the no longer needed, Linux-only, fdt property. Signed-off-by: Julien Thierry Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- arm/fdt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arm/fdt.c b/arm/fdt.c index c80e6da323b6..02091e9e0bee 100644 --- a/arm/fdt.c +++ b/arm/fdt.c @@ -130,7 +130,6 @@ static int setup_fdt(struct kvm *kvm) /* /chosen */ _FDT(fdt_begin_node(fdt, "chosen")); - _FDT(fdt_property_cell(fdt, "linux,pci-probe-only", 1)); /* Pass on our amended command line to a Linux kernel only. 
*/ if (kvm->cfg.firmware_filename) { From patchwork Thu Jan 23 13:48:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347959 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2A4D159A for ; Thu, 23 Jan 2020 13:48:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CAC112467B for ; Thu, 23 Jan 2020 13:48:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729261AbgAWNs5 (ORCPT ); Thu, 23 Jan 2020 08:48:57 -0500 Received: from foss.arm.com ([217.140.110.172]:39880 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729165AbgAWNs5 (ORCPT ); Thu, 23 Jan 2020 08:48:57 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5CF03FEC; Thu, 23 Jan 2020 05:48:56 -0800 (PST) Received: from e123195-lin.cambridge.arm.com (e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 5796C3F68E; Thu, 23 Jan 2020 05:48:55 -0800 (PST) From: Alexandru Elisei To: kvm@vger.kernel.org Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org Subject: [PATCH v2 kvmtool 29/30] vfio: Trap MMIO access to BAR addresses which aren't page aligned Date: Thu, 23 Jan 2020 13:48:04 +0000 Message-Id: <20200123134805.1993-30-alexandru.elisei@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com> References: <20200123134805.1993-1-alexandru.elisei@arm.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org 
KVM_SET_USER_MEMORY_REGION will fail if the guest physical address is not aligned to the page size. However, it is legal for a guest to program an address which isn't aligned to the page size. Trap and emulate MMIO accesses to the region when that happens.

Without this patch, when assigning a Seagate Barracuda hard drive to a VM I was seeing these errors:

[ 0.286029] pci 0000:00:00.0: BAR 0: assigned [mem 0x41004600-0x4100467f]
Error: 0000:01:00.0: failed to register region with KVM
Error: [1095:3132] Error activating emulation for BAR 0
[..]
[ 10.561794] irq 13: nobody cared (try booting with the "irqpoll" option)
[ 10.563122] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-seattle-00009-g909b20467ed1 #133
[ 10.563124] Hardware name: linux,dummy-virt (DT)
[ 10.563126] Call trace:
[ 10.563134] dump_backtrace+0x0/0x140
[ 10.563137] show_stack+0x14/0x20
[ 10.563141] dump_stack+0xbc/0x100
[ 10.563146] __report_bad_irq+0x48/0xd4
[ 10.563148] note_interrupt+0x288/0x378
[ 10.563151] handle_irq_event_percpu+0x80/0x88
[ 10.563153] handle_irq_event+0x44/0xc8
[ 10.563155] handle_fasteoi_irq+0xb4/0x160
[ 10.563157] generic_handle_irq+0x24/0x38
[ 10.563159] __handle_domain_irq+0x60/0xb8
[ 10.563162] gic_handle_irq+0x50/0xa0
[ 10.563164] el1_irq+0xb8/0x180
[ 10.563166] arch_cpu_idle+0x10/0x18
[ 10.563170] do_idle+0x204/0x290
[ 10.563172] cpu_startup_entry+0x20/0x40
[ 10.563175] rest_init+0xd4/0xe0
[ 10.563180] arch_call_rest_init+0xc/0x14
[ 10.563182] start_kernel+0x420/0x44c
[ 10.563183] handlers:
[ 10.563650] [<000000001e474803>] sil24_interrupt
[ 10.564559] Disabling IRQ #13
[..]
[ 11.832916] ata1: spurious interrupt (slot_stat 0x0 active_tag -84148995 sactive 0x0)
[ 12.045444] ata_ratelimit: 1 callbacks suppressed

With this patch, I don't see the errors and the device works as expected.
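
As a rough illustration of the alignment test described above, the sketch below applies a page mask to the BAR 0 address from the log. It assumes a 4 KiB page size (the real value comes from the host), and `sketch_needs_trap` is a hypothetical helper introduced here for illustration, not a kvmtool function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. kvmtool itself uses the host's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: returns true when the guest physical address is
 * not page aligned, i.e. when the region cannot be registered as a KVM
 * memory slot and accesses have to be trapped and emulated instead.
 */
static bool sketch_needs_trap(uint64_t guest_phys_addr)
{
	return (guest_phys_addr & (SKETCH_PAGE_SIZE - 1)) != 0;
}
```

Under that assumption, the address 0x41004600 assigned to BAR 0 leaves a remainder of 0x600 within its page, so the helper reports that the region must be trapped, whereas 0x41004000 would be eligible for direct mapping.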
Signed-off-by: Alexandru Elisei Reviewed-by: Andre Przywara --- vfio/core.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/vfio/core.c b/vfio/core.c index 6b9b58ea8d2f..b23e77c54771 100644 --- a/vfio/core.c +++ b/vfio/core.c @@ -226,6 +226,15 @@ int vfio_map_region(struct kvm *kvm, struct vfio_device *vdev, if (!(region->info.flags & VFIO_REGION_INFO_FLAG_MMAP)) return vfio_setup_trap_region(kvm, vdev, region); + /* + * KVM_SET_USER_MEMORY_REGION will fail because the guest physical + * address isn't page aligned, let's emulate the region ourselves. + */ + if (region->guest_phys_addr & (PAGE_SIZE - 1)) + return kvm__register_mmio(kvm, region->guest_phys_addr, + region->info.size, false, + vfio_mmio_access, region); + if (region->info.flags & VFIO_REGION_INFO_FLAG_READ) prot |= PROT_READ; if (region->info.flags & VFIO_REGION_INFO_FLAG_WRITE) From patchwork Thu Jan 23 13:48:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexandru Elisei X-Patchwork-Id: 11347961 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5C8CE92A for ; Thu, 23 Jan 2020 13:49:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3A0B820661 for ; Thu, 23 Jan 2020 13:49:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729268AbgAWNs7 (ORCPT ); Thu, 23 Jan 2020 08:48:59 -0500 Received: from foss.arm.com ([217.140.110.172]:39886 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729246AbgAWNs6 (ORCPT ); Thu, 23 Jan 2020 08:48:58 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 959BF106F; Thu, 23 Jan 2020 05:48:57 -0800 (PST) Received: from e123195-lin.cambridge.arm.com 
(e123195-lin.cambridge.arm.com [10.1.196.63]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 958503F68E; Thu, 23 Jan 2020 05:48:56 -0800 (PST)
From: Alexandru Elisei
To: kvm@vger.kernel.org
Cc: will@kernel.org, julien.thierry.kdev@gmail.com, andre.przywara@arm.com, sami.mujawar@arm.com, lorenzo.pieralisi@arm.com, maz@kernel.org
Subject: [PATCH v2 kvmtool 30/30] arm/arm64: Add PCI Express 1.1 support
Date: Thu, 23 Jan 2020 13:48:05 +0000
Message-Id: <20200123134805.1993-31-alexandru.elisei@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200123134805.1993-1-alexandru.elisei@arm.com>
References: <20200123134805.1993-1-alexandru.elisei@arm.com>
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

PCI Express comes with an extended addressing scheme, which directly translates into a bigger device configuration space (256->4096 bytes) and a bigger PCI configuration space (16->256 MB), as well as mandatory capabilities (power management [1] and PCI Express capability [2]).

However, our virtio PCI implementation implements version 0.9 of the protocol and it still uses transitional PCI device IDs, so we have opted to omit the mandatory PCI Express capabilities. For VFIO, the power management and PCI Express capability are left for a subsequent patch.
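
The sizes quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the `sketch_` names are hypothetical and not part of kvmtool, and the bit positions assume the usual layout of 8 bus bits, 5 device bits, 3 function bits and a 12-bit register offset.

```c
#include <stdint.h>

/* Bytes of configuration space per function. */
#define SKETCH_PCI_DEV_CFG_SIZE   256u   /* conventional PCI */
#define SKETCH_PCIE_DEV_CFG_SIZE  4096u  /* PCI Express */

/* 256 buses x 32 devices x 8 functions share one configuration window. */
static uint64_t sketch_cfg_window_size(uint32_t dev_cfg_size)
{
	return 256ull * 32 * 8 * dev_cfg_size;
}

/*
 * Hypothetical helper: byte offset of a register inside the PCI Express
 * configuration window, assuming bus in bits 27..20, device in bits
 * 19..15, function in bits 14..12 and the register offset in bits 11..0.
 */
static uint32_t sketch_ecam_offset(uint8_t bus, uint8_t dev, uint8_t fn,
				   uint16_t reg)
{
	return ((uint32_t)bus << 20) |
	       ((uint32_t)(dev & 0x1f) << 15) |
	       ((uint32_t)(fn & 0x7) << 12) |
	       (uint32_t)(reg & 0xfff);
}
```

With 256 bytes per function the window is 1 << 24 bytes (16 MB), and with 4096 bytes per function it grows to 1 << 28 bytes (256 MB), which matches the figures in the description.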
[1] PCI Express Base Specification Revision 1.1, section 7.6 [2] PCI Express Base Specification Revision 1.1, section 7.8 Signed-off-by: Alexandru Elisei --- arm/include/arm-common/kvm-arch.h | 4 +- arm/pci.c | 2 +- builtin-run.c | 1 + hw/vesa.c | 2 +- include/kvm/kvm-config.h | 2 +- include/kvm/pci.h | 76 ++++++++++++++++++++++++++++--- pci.c | 5 +- vfio/pci.c | 26 +++++++---- 8 files changed, 97 insertions(+), 21 deletions(-) diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h index b9d486d5eac2..13c55fa3dc29 100644 --- a/arm/include/arm-common/kvm-arch.h +++ b/arm/include/arm-common/kvm-arch.h @@ -23,7 +23,7 @@ #define ARM_IOPORT_SIZE (ARM_MMIO_AREA - ARM_IOPORT_AREA) #define ARM_VIRTIO_MMIO_SIZE (ARM_AXI_AREA - (ARM_MMIO_AREA + ARM_GIC_SIZE)) -#define ARM_PCI_CFG_SIZE (1ULL << 24) +#define ARM_PCI_CFG_SIZE (1ULL << 28) #define ARM_PCI_MMIO_SIZE (ARM_MEMORY_AREA - \ (ARM_AXI_AREA + ARM_PCI_CFG_SIZE)) @@ -50,6 +50,8 @@ #define VIRTIO_RING_ENDIAN (VIRTIO_ENDIAN_LE | VIRTIO_ENDIAN_BE) +#define ARCH_HAS_PCI_EXP 1 + static inline bool arm_addr_in_ioport_region(u64 phys_addr) { u64 limit = KVM_IOPORT_AREA + ARM_IOPORT_SIZE; diff --git a/arm/pci.c b/arm/pci.c index 1c0949a22408..eec9f3d936a5 100644 --- a/arm/pci.c +++ b/arm/pci.c @@ -77,7 +77,7 @@ void pci__generate_fdt_nodes(void *fdt) _FDT(fdt_property_cell(fdt, "#address-cells", 0x3)); _FDT(fdt_property_cell(fdt, "#size-cells", 0x2)); _FDT(fdt_property_cell(fdt, "#interrupt-cells", 0x1)); - _FDT(fdt_property_string(fdt, "compatible", "pci-host-cam-generic")); + _FDT(fdt_property_string(fdt, "compatible", "pci-host-ecam-generic")); _FDT(fdt_property(fdt, "dma-coherent", NULL, 0)); _FDT(fdt_property(fdt, "bus-range", bus_range, sizeof(bus_range))); diff --git a/builtin-run.c b/builtin-run.c index 9cb8c75300eb..def8a1f803ad 100644 --- a/builtin-run.c +++ b/builtin-run.c @@ -27,6 +27,7 @@ #include "kvm/irq.h" #include "kvm/kvm.h" #include "kvm/pci.h" +#include "kvm/vfio.h" #include 
"kvm/rtc.h" #include "kvm/sdl.h" #include "kvm/vnc.h" diff --git a/hw/vesa.c b/hw/vesa.c index aca938f79c82..4321cfbb6ddc 100644 --- a/hw/vesa.c +++ b/hw/vesa.c @@ -82,7 +82,7 @@ static int vesa__bar_deactivate(struct kvm *kvm, } static void vesa__pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz) + u16 offset, void *data, int sz) { u32 value; diff --git a/include/kvm/kvm-config.h b/include/kvm/kvm-config.h index a052b0bc7582..a1012c57b7a7 100644 --- a/include/kvm/kvm-config.h +++ b/include/kvm/kvm-config.h @@ -2,7 +2,6 @@ #define KVM_CONFIG_H_ #include "kvm/disk-image.h" -#include "kvm/vfio.h" #include "kvm/kvm-config-arch.h" #define DEFAULT_KVM_DEV "/dev/kvm" @@ -18,6 +17,7 @@ #define MIN_RAM_SIZE_MB (64ULL) #define MIN_RAM_SIZE_BYTE (MIN_RAM_SIZE_MB << MB_SHIFT) +struct vfio_device_params; struct kvm_config { struct kvm_config_arch arch; struct disk_image_params disk_image[MAX_DISK_IMAGES]; diff --git a/include/kvm/pci.h b/include/kvm/pci.h index ae71ef33237c..0c3c74b82626 100644 --- a/include/kvm/pci.h +++ b/include/kvm/pci.h @@ -10,6 +10,7 @@ #include "kvm/devices.h" #include "kvm/msi.h" #include "kvm/fdt.h" +#include "kvm.h" #define pci_dev_err(pci_hdr, fmt, ...) \ pr_err("[%04x:%04x] " fmt, pci_hdr->vendor_id, pci_hdr->device_id, ##__VA_ARGS__) @@ -32,9 +33,41 @@ #define PCI_CONFIG_BUS_FORWARD 0xcfa #define PCI_IO_SIZE 0x100 #define PCI_IOPORT_START 0x6200 -#define PCI_CFG_SIZE (1ULL << 24) -struct kvm; +#define PCIE_CAP_REG_VER 0x1 +#define PCIE_CAP_REG_DEV_LEGACY (1 << 4) +#define PM_CAP_VER 0x3 + +#ifdef ARCH_HAS_PCI_EXP +#define PCI_CFG_SIZE (1ULL << 28) +#define PCI_DEV_CFG_SIZE PCI_CFG_SPACE_EXP_SIZE + +union pci_config_address { + struct { +#if __BYTE_ORDER == __LITTLE_ENDIAN + unsigned reg_offset : 2; /* 1 .. 0 */ + unsigned register_number : 10; /* 11 .. 2 */ + unsigned function_number : 3; /* 14 .. 12 */ + unsigned device_number : 5; /* 19 .. 15 */ + unsigned bus_number : 8; /* 27 .. 
20 */ + unsigned reserved : 3; /* 30 .. 28 */ + unsigned enable_bit : 1; /* 31 */ +#else + unsigned enable_bit : 1; /* 31 */ + unsigned reserved : 3; /* 30 .. 28 */ + unsigned bus_number : 8; /* 27 .. 20 */ + unsigned device_number : 5; /* 19 .. 15 */ + unsigned function_number : 3; /* 14 .. 12 */ + unsigned register_number : 10; /* 11 .. 2 */ + unsigned reg_offset : 2; /* 1 .. 0 */ +#endif + }; + u32 w; +}; + +#else +#define PCI_CFG_SIZE (1ULL << 24) +#define PCI_DEV_CFG_SIZE PCI_CFG_SPACE_SIZE union pci_config_address { struct { @@ -58,6 +91,8 @@ union pci_config_address { }; u32 w; }; +#endif +#define PCI_DEV_CFG_MASK (PCI_DEV_CFG_SIZE - 1) struct msix_table { struct msi_msg msg; @@ -100,6 +135,33 @@ struct pci_cap_hdr { u8 next; }; +struct pcie_cap { + u8 cap; + u8 next; + u16 cap_reg; + u32 dev_cap; + u16 dev_ctrl; + u16 dev_status; + u32 link_cap; + u16 link_ctrl; + u16 link_status; + u32 slot_cap; + u16 slot_ctrl; + u16 slot_status; + u16 root_ctrl; + u16 root_cap; + u32 root_status; +}; + +struct pm_cap { + u8 cap; + u8 next; + u16 pmc; + u16 pmcsr; + u8 pmcsr_bse; + u8 data; +}; + struct pci_bar_info { u32 size; bool active; @@ -115,14 +177,12 @@ typedef int (*bar_deactivate_fn_t)(struct kvm *kvm, int bar_num, void *data); #define PCI_BAR_OFFSET(b) (offsetof(struct pci_device_header, bar[b])) -#define PCI_DEV_CFG_SIZE 256 -#define PCI_DEV_CFG_MASK (PCI_DEV_CFG_SIZE - 1) struct pci_config_operations { void (*write)(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz); + u16 offset, void *data, int sz); void (*read)(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz); + u16 offset, void *data, int sz); }; struct pci_device_header { @@ -152,6 +212,10 @@ struct pci_device_header { u8 min_gnt; u8 max_lat; struct msix_cap msix; +#ifdef ARCH_HAS_PCI_EXP + struct pm_cap pm; + struct pcie_cap pcie; +#endif } __attribute__((packed)); /* Pad to PCI config space size */ u8 __pad[PCI_DEV_CFG_SIZE]; diff 
--git a/pci.c b/pci.c index 1e9791250bc3..ea3df8d2e28a 100644 --- a/pci.c +++ b/pci.c @@ -389,7 +389,8 @@ static void pci_config_bar_wr(struct kvm *kvm, void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, int size) { void *base; - u8 bar, offset; + u8 bar; + u16 offset; struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; u32 value = 0; @@ -428,7 +429,7 @@ void pci__config_wr(struct kvm *kvm, union pci_config_address addr, void *data, void pci__config_rd(struct kvm *kvm, union pci_config_address addr, void *data, int size) { - u8 offset; + u16 offset; struct pci_device_header *pci_hdr; u8 dev_num = addr.device_number; diff --git a/vfio/pci.c b/vfio/pci.c index 3a641e72e574..05e8b54e77ac 100644 --- a/vfio/pci.c +++ b/vfio/pci.c @@ -309,7 +309,7 @@ out_unlock: } static void vfio_pci_msix_cap_write(struct kvm *kvm, - struct vfio_device *vdev, u8 off, + struct vfio_device *vdev, u16 off, void *data, int sz) { struct vfio_pci_device *pdev = &vdev->pci; @@ -341,7 +341,7 @@ static void vfio_pci_msix_cap_write(struct kvm *kvm, } static int vfio_pci_msi_vector_write(struct kvm *kvm, struct vfio_device *vdev, - u8 off, u8 *data, u32 sz) + u16 off, u8 *data, u32 sz) { size_t i; u32 mask = 0; @@ -389,7 +389,7 @@ static int vfio_pci_msi_vector_write(struct kvm *kvm, struct vfio_device *vdev, } static void vfio_pci_msi_cap_write(struct kvm *kvm, struct vfio_device *vdev, - u8 off, u8 *data, u32 sz) + u16 off, u8 *data, u32 sz) { u8 ctrl; struct msi_msg msg; @@ -537,7 +537,7 @@ out: } static void vfio_pci_cfg_read(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz) + u16 offset, void *data, int sz) { struct vfio_region_info *info; struct vfio_pci_device *pdev; @@ -555,7 +555,7 @@ static void vfio_pci_cfg_read(struct kvm *kvm, struct pci_device_header *pci_hdr } static void vfio_pci_cfg_write(struct kvm *kvm, struct pci_device_header *pci_hdr, - u8 offset, void *data, int sz) + u16 offset, void *data, int 
sz) { struct vfio_region_info *info; struct vfio_pci_device *pdev; @@ -639,15 +639,17 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev) { int ret; size_t size; - u8 pos, next; + u16 pos, next; struct pci_cap_hdr *cap; - u8 virt_hdr[PCI_DEV_CFG_SIZE]; + u8 *virt_hdr; struct vfio_pci_device *pdev = &vdev->pci; if (!(pdev->hdr.status & PCI_STATUS_CAP_LIST)) return 0; - memset(virt_hdr, 0, PCI_DEV_CFG_SIZE); + virt_hdr = calloc(1, PCI_DEV_CFG_SIZE); + if (!virt_hdr) + return -errno; pos = pdev->hdr.capabilities & ~3; @@ -683,6 +685,8 @@ static int vfio_pci_parse_caps(struct vfio_device *vdev) size = PCI_DEV_CFG_SIZE - PCI_STD_HEADER_SIZEOF; memcpy((void *)&pdev->hdr + pos, virt_hdr + pos, size); + free(virt_hdr); + return 0; } @@ -792,7 +796,11 @@ static int vfio_pci_fixup_cfg_space(struct vfio_device *vdev) /* Install our fake Configuration Space */ info = &vdev->regions[VFIO_PCI_CONFIG_REGION_INDEX].info; - hdr_sz = PCI_DEV_CFG_SIZE; + /* + * We don't touch the extended configuration space, let's be cautious + * and not overwrite it all with zeros, or bad things might happen. + */ + hdr_sz = PCI_CFG_SPACE_SIZE; if (pwrite(vdev->fd, &pdev->hdr, hdr_sz, info->offset) != hdr_sz) { vfio_dev_err(vdev, "failed to write %zd bytes to Config Space", hdr_sz);
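For readers following the addressing change, here is a minimal sketch of how an offset into the 256 MB configuration window splits into bus, device, function and register offset. It follows the bit ranges given in the comments of the ARCH_HAS_PCI_EXP variant of union pci_config_address above (bus 27..20, device 19..15, function 14..12, register 11..0). The names ecam_addr, ecam_decode, ecam_encode and the ECAM_* macros are hypothetical and exist only for this illustration; they are not part of the patch or of kvmtool.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration; the patch itself uses bitfields. */
#define ECAM_BUS_SHIFT		20
#define ECAM_DEV_SHIFT		15
#define ECAM_FN_SHIFT		12
#define ECAM_DEV_CFG_SIZE	(1u << 12)	/* 4096 bytes per function */
#define ECAM_CFG_SIZE		(1ull << 28)	/* 256 MB, same value as the new PCI_CFG_SIZE */

/* Decoded form of an offset into the configuration window (hypothetical type). */
struct ecam_addr {
	uint8_t  bus;		/* bits 27..20 */
	uint8_t  dev;		/* bits 19..15 */
	uint8_t  fn;		/* bits 14..12 */
	uint16_t offset;	/* bits 11..0, needs more than 8 bits */
};

/* Split a configuration window offset into its fields. */
static struct ecam_addr ecam_decode(uint32_t addr)
{
	struct ecam_addr a;

	a.bus    = (addr >> ECAM_BUS_SHIFT) & 0xff;
	a.dev    = (addr >> ECAM_DEV_SHIFT) & 0x1f;
	a.fn     = (addr >> ECAM_FN_SHIFT) & 0x7;
	a.offset = addr & (ECAM_DEV_CFG_SIZE - 1);

	return a;
}

/* Build a configuration window offset from its fields. */
static uint32_t ecam_encode(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t offset)
{
	/* Reject values that do not fit in their fields. */
	assert(dev < 32 && fn < 8 && offset < ECAM_DEV_CFG_SIZE);

	return ((uint32_t)bus << ECAM_BUS_SHIFT) |
	       ((uint32_t)dev << ECAM_DEV_SHIFT) |
	       ((uint32_t)fn << ECAM_FN_SHIFT) |
	       offset;
}
```

With 8 bus bits, 5 device bits, 3 function bits and 12 offset bits, the window spans 2^28 bytes, which matches the 16->256 MB growth described in the commit message. An offset such as 0x100 lies in the extended part of the device configuration space and does not fit in a u8, which is consistent with the patch widening the offset parameters from u8 to u16.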