From patchwork Wed Jun 15 15:51:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Hajnoczi X-Patchwork-Id: 12882674 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 97FA2C433EF for ; Wed, 15 Jun 2022 16:13:34 +0000 (UTC) Received: from localhost ([::1]:41100 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o1Ve9-0000Ph-Fw for qemu-devel@archiver.kernel.org; Wed, 15 Jun 2022 12:13:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:43434) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o1VLu-0004Lw-Mm for qemu-devel@nongnu.org; Wed, 15 Jun 2022 11:54:42 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:57533) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o1VLr-0007O4-Ss for qemu-devel@nongnu.org; Wed, 15 Jun 2022 11:54:42 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655308479; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LuD1dQBr5PDdf9gF6OwOpS+szju7RxeLsLEULQigrW0=; b=YrZjK1EThzQcCtdVqxp48if2ZmHQcwrIq6sIDbgxgitK7VmG/p+7QucExrPm0kUu0/RLFH 8stRJKpiUlfBWGLqtyixsC01hJzoIj5hHu85Q9Tn2Ss0h0720NkyJB5/NCHw52IGxQ0n9z +yyfjiJJCpVQLMcsX+YETj/nSPy3h/o= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-389-rgRPNjs6PMeXavUL9ZRM8g-1; Wed, 15 Jun 2022 11:54:36 -0400 X-MC-Unique: rgRPNjs6PMeXavUL9ZRM8g-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 227898339A5; Wed, 15 Jun 2022 15:54:35 +0000 (UTC) Received: from localhost (unknown [10.39.192.100]) by smtp.corp.redhat.com (Postfix) with ESMTP id 712CE2166B29; Wed, 15 Jun 2022 15:54:34 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Thomas Huth , Jagannathan Raman , Aarushi Mehta , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Elena Ufimtseva , Marcel Apfelbaum , Stefan Hajnoczi , Laurent Vivier , , "Dr. David Alan Gilbert" , Richard Henderson , virtio-fs@redhat.com, Hanna Reitz , David Hildenbrand , =?utf-8?q?Alex_Benn=C3=A9e?= , Eric Blake , Kevin Wolf , =?utf-8?q?Da?= =?utf-8?q?niel_P=2E_Berrang=C3=A9?= , Beraldo Leal , Peter Xu , Eduardo Habkost , Qiuhao Li , Paolo Bonzini , Markus Armbruster , Bandan Das , "Michael S. Tsirkin" , Stefano Garzarella , Alexander Bulekov , Julia Suvorova , Darren Kenny , Wainer dos Santos Moschetta , John G Johnson Subject: [PULL 14/18] vfio-user: handle PCI BAR accesses Date: Wed, 15 Jun 2022 16:51:25 +0100 Message-Id: <20220615155129.1025811-15-stefanha@redhat.com> In-Reply-To: <20220615155129.1025811-1-stefanha@redhat.com> References: <20220615155129.1025811-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Jagannathan Raman Determine the BARs used by the PCI device and register handlers to manage the access to the same. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi Message-id: 3373e10b5be5f42846f0632d4382466e1698c505.1655151679.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi --- include/exec/memory.h | 3 + hw/remote/vfio-user-obj.c | 190 ++++++++++++++++++++++++++++++++ softmmu/physmem.c | 4 +- tests/qtest/fuzz/generic_fuzz.c | 9 +- hw/remote/trace-events | 3 + 5 files changed, 203 insertions(+), 6 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index f1c19451bc..a6a0f4d8ad 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2810,6 +2810,9 @@ MemTxResult address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr, const void *buf, hwaddr len); +int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr); +bool prepare_mmio_access(MemoryRegion *mr); + static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write) { if (is_write) { diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 7b21f77052..dd760a99e2 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -52,6 +52,7 @@ #include "hw/qdev-core.h" #include "hw/pci/pci.h" #include "qemu/timer.h" +#include "exec/memory.h" #define TYPE_VFU_OBJECT "x-vfio-user-server" OBJECT_DECLARE_TYPE(VfuObject, VfuObjectClass, VFU_OBJECT) @@ -332,6 +333,193 @@ static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) trace_vfu_dma_unregister((uint64_t)info->iova.iov_base); } +static int vfu_object_mr_rw(MemoryRegion *mr, uint8_t *buf, hwaddr offset, + hwaddr size, const bool is_write) +{ + uint8_t *ptr = buf; + bool release_lock = false; + uint8_t *ram_ptr = NULL; + MemTxResult result; + int access_size; + uint64_t val; + + if (memory_access_is_direct(mr, is_write)) { + /** + * Some devices expose a PCI expansion ROM, which could be buffer + * based as compared to other regions which are primarily based on + * MemoryRegionOps. memory_region_find() would already check + * for buffer overflow, we don't need to repeat it here. + */ + ram_ptr = memory_region_get_ram_ptr(mr); + + if (is_write) { + memcpy((ram_ptr + offset), buf, size); + } else { + memcpy(buf, (ram_ptr + offset), size); + } + + return 0; + } + + while (size) { + /** + * The read/write logic used below is similar to the ones in + * flatview_read/write_continue() + */ + release_lock = prepare_mmio_access(mr); + + access_size = memory_access_size(mr, size, offset); + + if (is_write) { + val = ldn_he_p(ptr, access_size); + + result = memory_region_dispatch_write(mr, offset, val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); + } else { + result = memory_region_dispatch_read(mr, offset, &val, + size_memop(access_size), + MEMTXATTRS_UNSPECIFIED); + + stn_he_p(ptr, access_size, val); + } + + if (release_lock) { + qemu_mutex_unlock_iothread(); + release_lock = false; + } + + if (result != MEMTX_OK) { + return -1; + } + + size -= access_size; + ptr += access_size; + offset += access_size; + } + + return 0; +} + +static size_t vfu_object_bar_rw(PCIDevice *pci_dev, int pci_bar, + hwaddr bar_offset, char * const buf, + hwaddr len, const bool is_write) +{ + MemoryRegionSection section = { 0 }; + uint8_t *ptr = (uint8_t *)buf; + MemoryRegion *section_mr = NULL; + uint64_t section_size; + hwaddr section_offset; + hwaddr size = 0; + + while (len) { + section = memory_region_find(pci_dev->io_regions[pci_bar].memory, + bar_offset, len); + + if (!section.mr) { + warn_report("vfu: invalid address 0x%"PRIx64"", bar_offset); + return size; + } + + section_mr = section.mr; + section_offset = section.offset_within_region; + section_size = int128_get64(section.size); + + if (is_write && section_mr->readonly) { + warn_report("vfu: attempting to write to readonly region in " + "bar %d - [0x%"PRIx64" - 0x%"PRIx64"]", + pci_bar, bar_offset, + (bar_offset + section_size)); + memory_region_unref(section_mr); + return size; + } + + if (vfu_object_mr_rw(section_mr, ptr, section_offset, + section_size, is_write)) { + warn_report("vfu: failed to %s " + "[0x%"PRIx64" - 0x%"PRIx64"] in bar %d", + is_write ? "write to" : "read from", bar_offset, + (bar_offset + section_size), pci_bar); + memory_region_unref(section_mr); + return size; + } + + size += section_size; + bar_offset += section_size; + ptr += section_size; + len -= section_size; + + memory_region_unref(section_mr); + } + + return size; +} + +/** + * VFU_OBJECT_BAR_HANDLER - macro for defining handlers for PCI BARs. + * + * To create handler for BAR number 2, VFU_OBJECT_BAR_HANDLER(2) would + * define vfu_object_bar2_handler + */ +#define VFU_OBJECT_BAR_HANDLER(BAR_NO) \ + static ssize_t vfu_object_bar##BAR_NO##_handler(vfu_ctx_t *vfu_ctx, \ + char * const buf, size_t count, \ + loff_t offset, const bool is_write) \ + { \ + VfuObject *o = vfu_get_private(vfu_ctx); \ + PCIDevice *pci_dev = o->pci_dev; \ + \ + return vfu_object_bar_rw(pci_dev, BAR_NO, offset, \ + buf, count, is_write); \ + } \ + +VFU_OBJECT_BAR_HANDLER(0) +VFU_OBJECT_BAR_HANDLER(1) +VFU_OBJECT_BAR_HANDLER(2) +VFU_OBJECT_BAR_HANDLER(3) +VFU_OBJECT_BAR_HANDLER(4) +VFU_OBJECT_BAR_HANDLER(5) +VFU_OBJECT_BAR_HANDLER(6) + +static vfu_region_access_cb_t *vfu_object_bar_handlers[PCI_NUM_REGIONS] = { + &vfu_object_bar0_handler, + &vfu_object_bar1_handler, + &vfu_object_bar2_handler, + &vfu_object_bar3_handler, + &vfu_object_bar4_handler, + &vfu_object_bar5_handler, + &vfu_object_bar6_handler, +}; + +/** + * vfu_object_register_bars - Identify active BAR regions of pdev and setup + * callbacks to handle read/write accesses + */ +static void vfu_object_register_bars(vfu_ctx_t *vfu_ctx, PCIDevice *pdev) +{ + int flags = VFU_REGION_FLAG_RW; + int i; + + for (i = 0; i < PCI_NUM_REGIONS; i++) { + if (!pdev->io_regions[i].size) { + continue; + } + + if ((i == VFU_PCI_DEV_ROM_REGION_IDX) || + pdev->io_regions[i].memory->readonly) { + flags &= ~VFU_REGION_FLAG_WRITE; + } + + vfu_setup_region(vfu_ctx, VFU_PCI_DEV_BAR0_REGION_IDX + i, + (size_t)pdev->io_regions[i].size, + vfu_object_bar_handlers[i], + flags, NULL, 0, -1, 0); + + trace_vfu_bar_register(i, pdev->io_regions[i].addr, + pdev->io_regions[i].size); + } +} + /* * TYPE_VFU_OBJECT depends on the availability of the 'socket' and 'device' * properties. It also depends on devices instantiated in QEMU. These @@ -442,6 +630,8 @@ static void vfu_object_init_ctx(VfuObject *o, Error **errp) goto fail; } + vfu_object_register_bars(o->vfu_ctx, o->pci_dev); + ret = vfu_realize_ctx(o->vfu_ctx); if (ret < 0) { error_setg(errp, "vfu: Failed to realize device %s- %s", diff --git a/softmmu/physmem.c b/softmmu/physmem.c index 657841eed0..fb16be57a6 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -2719,7 +2719,7 @@ void memory_region_flush_rom_device(MemoryRegion *mr, hwaddr addr, hwaddr size) invalidate_and_set_dirty(mr, addr, size); } -static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr) +int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr) { unsigned access_size_max = mr->ops->valid.max_access_size; @@ -2746,7 +2746,7 @@ static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr) return l; } -static bool prepare_mmio_access(MemoryRegion *mr) +bool prepare_mmio_access(MemoryRegion *mr) { bool release_lock = false; diff --git a/tests/qtest/fuzz/generic_fuzz.c b/tests/qtest/fuzz/generic_fuzz.c index 25df19fd5a..447ffe8178 100644 --- a/tests/qtest/fuzz/generic_fuzz.c +++ b/tests/qtest/fuzz/generic_fuzz.c @@ -144,7 +144,7 @@ static void *pattern_alloc(pattern p, size_t len) return buf; } -static int memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr) +static int fuzz_memory_access_size(MemoryRegion *mr, unsigned l, hwaddr addr) { unsigned access_size_max = mr->ops->valid.max_access_size; @@ -242,11 +242,12 @@ void fuzz_dma_read_cb(size_t addr, size_t len, MemoryRegion *mr) /* * If mr1 isn't RAM, address_space_translate doesn't update l. Use - * memory_access_size to identify the number of bytes that it is safe - * to write without accidentally writing to another MemoryRegion. + * fuzz_memory_access_size to identify the number of bytes that it + * is safe to write without accidentally writing to another + * MemoryRegion. */ if (!memory_region_is_ram(mr1)) { - l = memory_access_size(mr1, l, addr1); + l = fuzz_memory_access_size(mr1, l, addr1); } if (memory_region_is_ram(mr1) || memory_region_is_romd(mr1) || diff --git a/hw/remote/trace-events b/hw/remote/trace-events index f945c7e33b..847d50d88f 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -9,3 +9,6 @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u -> 0x%x" vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%u <- 0x%x" vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes" vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64"" +vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64"" +vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64"" +vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""