From patchwork Tue Dec 5 03:33:13 2017
X-Patchwork-Submitter: "Wang, Wei W" <wei.w.wang@intel.com>
X-Patchwork-Id: 10092041
From: Wei Wang <wei.w.wang@intel.com>
To: virtio-dev@lists.oasis-open.org, qemu-devel@nongnu.org, mst@redhat.com,
	marcandre.lureau@redhat.com, jasowang@redhat.com, stefanha@redhat.com,
	pbonzini@redhat.com
Cc: zhiyong.yang@intel.com, jan.kiszka@siemens.com,
	Wei Wang <wei.w.wang@intel.com>, avi.cohen@huawei.com
Date: Tue, 5 Dec 2017 11:33:13 +0800
Message-Id: <1512444796-30615-5-git-send-email-wei.w.wang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1512444796-30615-1-git-send-email-wei.w.wang@intel.com>
References: <1512444796-30615-1-git-send-email-wei.w.wang@intel.com>
Subject: [Qemu-devel] [PATCH v3 4/7] vhost-pci-slave: add vhost-pci slave implementation

The vhost-pci slave implementation is added to support the creation of
the vhost-pci-net device. It follows the vhost-user protocol to get the
master VM's info (e.g. memory regions, vring addresses).

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
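A note for reviewers (not part of the commit): the slave reuses the
standard vhost-user framing. vp_slave_can_read() asks the chardev layer
for a fixed-size header first, and vp_slave_read() then pulls msg.size
payload bytes. The sketch below is illustrative only: the struct name is
invented here, the field layout follows the vhost-user protocol spec, and
the actual VhostUserMsg type used by this patch comes from
hw/virtio/vhost-user.h.

  /* Illustrative sketch, not part of this patch: the wire framing that
   * vp_slave_can_read()/vp_slave_read() assume. The 12-byte header is
   * read first; 'size' payload bytes follow it. */
  #include <stdint.h>

  struct vhost_user_hdr_sketch {
      uint32_t request;   /* e.g. VHOST_USER_SET_MEM_TABLE */
      uint32_t flags;     /* protocol version bits plus the REPLY flag */
      uint32_t size;      /* number of payload bytes after the header */
  };
  /* The payload (u64, vring state, memory table, ...) follows the header. */
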
 hw/net/vhost_pci_net.c              |  20 ++-
 hw/virtio/Makefile.objs             |   1 +
 hw/virtio/vhost-pci-slave.c         | 307 ++++++++++++++++++++++++++++++++++++
 include/hw/virtio/vhost-pci-net.h   |   3 +
 include/hw/virtio/vhost-pci-slave.h |  12 ++
 5 files changed, 342 insertions(+), 1 deletion(-)
 create mode 100644 hw/virtio/vhost-pci-slave.c
 create mode 100644 include/hw/virtio/vhost-pci-slave.h

diff --git a/hw/net/vhost_pci_net.c b/hw/net/vhost_pci_net.c
index db8f954..11184c3 100644
--- a/hw/net/vhost_pci_net.c
+++ b/hw/net/vhost_pci_net.c
@@ -21,6 +21,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "hw/virtio/vhost-pci-net.h"
 #include "hw/virtio/virtio-net.h"
+#include "hw/virtio/vhost-pci-slave.h"
 
 static uint64_t vpnet_get_features(VirtIODevice *vdev, uint64_t features,
                                    Error **errp)
@@ -45,6 +46,10 @@ static void vpnet_device_realize(DeviceState *dev, Error **errp)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VhostPCINet *vpnet = VHOST_PCI_NET(vdev);
 
+    qemu_chr_fe_set_handlers(&vpnet->chr_be, vp_slave_can_read,
+                             vp_slave_read, vp_slave_event, NULL,
+                             vpnet, NULL, true);
+
     virtio_init(vdev, "vhost-pci-net", VIRTIO_ID_VHOST_PCI_NET,
                 vpnet->config_size);
 
@@ -59,7 +64,20 @@ static void vpnet_device_realize(DeviceState *dev, Error **errp)
 static void vpnet_device_unrealize(DeviceState *dev, Error **errp)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
-
+    VhostPCINet *vpnet = VHOST_PCI_NET(vdev);
+    int i, ret, nregions = vpnet->metadata->nregions;
+
+    for (i = 0; i < nregions; i++) {
+        ret = munmap(vpnet->remote_mem_base[i], vpnet->remote_mem_map_size[i]);
+        if (ret < 0) {
+            error_report("%s: failed to unmap mr[%d]", __func__, i);
+            continue;
+        }
+        memory_region_del_subregion(&vpnet->bar_region,
+                                    &vpnet->remote_mem_region[i]);
+    }
+
+    qemu_chr_fe_deinit(&vpnet->chr_be, true);
     virtio_cleanup(vdev);
 }
 
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 765d363..5e81f2f 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -3,6 +3,7 @@ common-obj-y += virtio-rng.o
 common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 common-obj-y += virtio-bus.o
 common-obj-y += virtio-mmio.o
+common-obj-y += vhost-pci-slave.o
 
 obj-y += virtio.o virtio-balloon.o
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
diff --git a/hw/virtio/vhost-pci-slave.c b/hw/virtio/vhost-pci-slave.c
new file mode 100644
index 0000000..b1a8620
--- /dev/null
+++ b/hw/virtio/vhost-pci-slave.c
@@ -0,0 +1,307 @@
+/*
+ * Vhost-pci Slave
+ *
+ * Copyright Intel Corp. 2017
+ *
+ * Authors:
+ *  Wei Wang <wei.w.wang@intel.com>
+ *  Zhiyong Yang <zhiyong.yang@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include <sys/mman.h>
+#include <unistd.h>
+
+#include "monitor/qdev.h"
+#include "qapi/error.h"
+#include "qemu/config-file.h"
+#include "qemu/error-report.h"
+#include "hw/virtio/virtio-pci.h"
+#include "hw/virtio/vhost-pci-slave.h"
+#include "hw/virtio/vhost-user.h"
+#include "hw/virtio/vhost-pci-net.h"
+
+#define VHOST_USER_PROTOCOL_FEATURES 0
+
+static int vp_slave_write(CharBackend *chr_be, VhostUserMsg *msg)
+{
+    int size;
+
+    if (!msg) {
+        return 0;
+    }
+
+    /* The payload size has been assigned; add the header size to it here */
+    size = msg->size + VHOST_USER_HDR_SIZE;
+    msg->flags &= ~VHOST_USER_VERSION_MASK;
+    msg->flags |= VHOST_USER_VERSION;
+
+    return qemu_chr_fe_write_all(chr_be, (const uint8_t *)msg, size)
+           == size ? 0 : -1;
+}
+
+static int vp_slave_get_features(VhostPCINet *vpnet, CharBackend *chr_be,
+                                 VhostUserMsg *msg)
+{
+    /* Offer the initial features, which have the protocol feature bit set */
+    msg->payload.u64 = (uint64_t)vpnet->host_features |
+                       (1 << VHOST_USER_F_PROTOCOL_FEATURES);
+    msg->size = sizeof(msg->payload.u64);
+    msg->flags |= VHOST_USER_REPLY_MASK;
+
+    return vp_slave_write(chr_be, msg);
+}
+
+static void vp_slave_set_features(VhostPCINet *vpnet, VhostUserMsg *msg)
+{
+    vpnet->host_features = msg->payload.u64 &
+                           ~(1 << VHOST_USER_F_PROTOCOL_FEATURES);
+}
+
+void vp_slave_event(void *opaque, int event)
+{
+    switch (event) {
+    case CHR_EVENT_OPENED:
+        break;
+    case CHR_EVENT_CLOSED:
+        break;
+    }
+}
+
+static int vp_slave_get_protocol_features(CharBackend *chr_be,
+                                          VhostUserMsg *msg)
+{
+    msg->payload.u64 = VHOST_USER_PROTOCOL_FEATURES;
+    msg->size = sizeof(msg->payload.u64);
+    msg->flags |= VHOST_USER_REPLY_MASK;
+
+    return vp_slave_write(chr_be, msg);
+}
+
+static int vp_slave_get_queue_num(CharBackend *chr_be, VhostUserMsg *msg)
+{
+    msg->payload.u64 = VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX;
+    msg->size = sizeof(msg->payload.u64);
+    msg->flags |= VHOST_USER_REPLY_MASK;
+
+    return vp_slave_write(chr_be, msg);
+}
+
+/* Set up the vhost-pci-net device BAR to map the remote memory */
+static int vp_slave_set_mem_table(VhostPCINet *vpnet, VhostUserMsg *msg,
+                                  int *fds, int fd_num)
+{
+    VhostUserMemory *mem = &msg->payload.memory;
+    VhostUserMemoryRegion *region = mem->regions;
+    uint32_t i, nregions = mem->nregions;
+    uint64_t bar_map_offset = METADATA_SIZE;
+    void *remote_mem_ptr;
+
+    /* Sanity check */
+    if (fd_num != nregions) {
+        error_report("%s: fd num doesn't match region num", __func__);
+        return -1;
+    }
+
+    vpnet->metadata->nregions = nregions;
+    vpnet->remote_mem_region = g_malloc(nregions * sizeof(MemoryRegion));
+
+    for (i = 0; i < nregions; i++) {
+        vpnet->remote_mem_map_size[i] = region[i].memory_size +
+                                        region[i].mmap_offset;
+        /*
+         * Map the remote memory in QEMU. It will then be exposed to the
+         * guest via the vhost-pci device BAR. The mapped base addr and
+         * size are recorded and will be used when cleaning up the device.
+         */
+        vpnet->remote_mem_base[i] = mmap(NULL, vpnet->remote_mem_map_size[i],
+                                         PROT_READ | PROT_WRITE, MAP_SHARED,
+                                         fds[i], 0);
+        if (vpnet->remote_mem_base[i] == MAP_FAILED) {
+            error_report("%s: map peer memory region %d failed", __func__, i);
+            return -1;
+        }
+
+        remote_mem_ptr = vpnet->remote_mem_base[i] + region[i].mmap_offset;
+        /*
+         * The BAR MMIO is different from a traditional one, because it is
+         * set up as regular RAM. The guest will be able to access it
+         * directly, without VM exits, just like accessing its own RAM.
+         */
+        memory_region_init_ram_ptr(&vpnet->remote_mem_region[i], NULL,
+                                   "RemoteMemory", region[i].memory_size,
+                                   remote_mem_ptr);
+        /*
+         * The remote memory regions, which are scattered in the remote VM's
+         * address space, are laid out contiguously in the BAR.
+         */
+        memory_region_add_subregion(&vpnet->bar_region, bar_map_offset,
+                                    &vpnet->remote_mem_region[i]);
+        bar_map_offset += region[i].memory_size;
+
+        vpnet->metadata->mem[i].gpa = region[i].guest_phys_addr;
+        vpnet->metadata->mem[i].size = region[i].memory_size;
+    }
+
+    return 0;
+}
+
+static void vp_slave_set_vring_num(VhostPCINet *vpnet, VhostUserMsg *msg)
+{
+    struct vhost_vring_state *state = &msg->payload.state;
+
+    vpnet->metadata->vq[state->index].vring_num = state->num;
+}
+
+static void vp_slave_set_vring_base(VhostPCINet *vpnet, VhostUserMsg *msg)
+{
+    struct vhost_vring_state *state = &msg->payload.state;
+
+    vpnet->metadata->vq[state->index].last_avail_idx = state->num;
+}
+
+static int vp_slave_get_vring_base(CharBackend *chr_be, VhostUserMsg *msg)
+{
+    msg->flags |= VHOST_USER_REPLY_MASK;
+    msg->size = sizeof(msg->payload.state);
+    /* Send back the last_avail_idx, which is 0 here */
+    msg->payload.state.num = 0;
+
+    return vp_slave_write(chr_be, msg);
+}
+
+static void vp_slave_set_vring_addr(VhostPCINet *vpnet, VhostUserMsg *msg)
+{
+    uint32_t index = msg->payload.addr.index;
+
+    vpnet->metadata->vq[index].desc_gpa = msg->payload.addr.desc_user_addr;
+    vpnet->metadata->vq[index].avail_gpa = msg->payload.addr.avail_user_addr;
+    vpnet->metadata->vq[index].used_gpa = msg->payload.addr.used_user_addr;
+    vpnet->metadata->nvqs = msg->payload.addr.index + 1;
+}
+
+static void vp_slave_set_vring_enable(VhostPCINet *vpnet, VhostUserMsg *msg)
+{
+    struct vhost_vring_state *state = &msg->payload.state;
+
+    vpnet->metadata->vq[state->index].vring_enabled = (int)state->num;
+}
+
+int vp_slave_can_read(void *opaque)
+{
+    return VHOST_USER_HDR_SIZE;
+}
+
+void vp_slave_read(void *opaque, const uint8_t *buf, int size)
+{
+    int ret, fd_num, fds[VHOST_MEMORY_MAX_NREGIONS];
+    VhostUserMsg msg;
+    uint8_t *p = (uint8_t *) &msg;
+    VhostPCINet *vpnet = (VhostPCINet *)opaque;
+    CharBackend *chr_be = &vpnet->chr_be;
+
+    if (size != VHOST_USER_HDR_SIZE) {
+        error_report("%s: wrong message size received %d", __func__, size);
+        return;
+    }
+
+    memcpy(p, buf, VHOST_USER_HDR_SIZE);
+
+    if (msg.size) {
+        p += VHOST_USER_HDR_SIZE;
+        size = qemu_chr_fe_read_all(chr_be, p, msg.size);
+        if (size != msg.size) {
+            error_report("%s: wrong message size received %d != %d", __func__,
+                         size, msg.size);
+            return;
+        }
+    }
+
+    if (msg.request > VHOST_USER_MAX) {
+        error_report("%s: read an incorrect msg %d", __func__, msg.request);
+        return;
+    }
+
+    switch (msg.request) {
+    case VHOST_USER_GET_FEATURES:
+        ret = vp_slave_get_features(vpnet, chr_be, &msg);
+        if (ret < 0) {
+            goto err_handling;
+        }
+        break;
+    case VHOST_USER_SET_FEATURES:
+        vp_slave_set_features(vpnet, &msg);
+        break;
+    case VHOST_USER_GET_PROTOCOL_FEATURES:
+        ret = vp_slave_get_protocol_features(chr_be, &msg);
+        if (ret < 0) {
+            goto err_handling;
+        }
+        break;
+    case VHOST_USER_SET_PROTOCOL_FEATURES:
+        break;
+    case VHOST_USER_GET_QUEUE_NUM:
+        ret = vp_slave_get_queue_num(chr_be, &msg);
+        if (ret < 0) {
+            goto err_handling;
+        }
+        break;
+    case VHOST_USER_SET_OWNER:
+        break;
+    case VHOST_USER_SET_MEM_TABLE:
+        fd_num = qemu_chr_fe_get_msgfds(chr_be, fds, sizeof(fds) / sizeof(int));
+        vp_slave_set_mem_table(vpnet, &msg, fds, fd_num);
+        break;
+    case VHOST_USER_SET_VRING_NUM:
+        vp_slave_set_vring_num(vpnet, &msg);
+        break;
+    case VHOST_USER_SET_VRING_BASE:
+        vp_slave_set_vring_base(vpnet, &msg);
+        break;
+    case VHOST_USER_GET_VRING_BASE:
+        ret = vp_slave_get_vring_base(chr_be, &msg);
+        if (ret < 0) {
+            goto err_handling;
+        }
+        break;
+    case VHOST_USER_SET_VRING_ADDR:
+        vp_slave_set_vring_addr(vpnet, &msg);
+        break;
+    case VHOST_USER_SET_VRING_KICK:
+        /* Consume the fd */
+        qemu_chr_fe_get_msgfds(chr_be, fds, 1);
+        /*
+         * This is a non-blocking eventfd.
+         * The receive function forces it to be blocking,
+         * so revert it back to non-blocking.
+         */
+        qemu_set_nonblock(fds[0]);
+        break;
+    case VHOST_USER_SET_VRING_CALL:
+        /* Consume the fd, and revert it to non-blocking. */
+        qemu_chr_fe_get_msgfds(chr_be, fds, 1);
+        qemu_set_nonblock(fds[0]);
+        break;
+    case VHOST_USER_SET_VRING_ENABLE:
+        vp_slave_set_vring_enable(vpnet, &msg);
+        break;
+    case VHOST_USER_SET_LOG_BASE:
+        break;
+    case VHOST_USER_SET_LOG_FD:
+        qemu_chr_fe_get_msgfds(chr_be, fds, 1);
+        close(fds[0]);
+        break;
+    case VHOST_USER_SEND_RARP:
+        break;
+    default:
+        error_report("vhost-pci-slave does not support msg request = %d",
+                     msg.request);
+        break;
+    }
+    return;
+
+err_handling:
+    error_report("%s: handle request %d failed", __func__, msg.request);
+}
diff --git a/include/hw/virtio/vhost-pci-net.h b/include/hw/virtio/vhost-pci-net.h
index 0c6886d..6f4ab6a 100644
--- a/include/hw/virtio/vhost-pci-net.h
+++ b/include/hw/virtio/vhost-pci-net.h
@@ -27,7 +27,10 @@ typedef struct VhostPCINet {
     VirtIODevice parent_obj;
     MemoryRegion bar_region;
     MemoryRegion metadata_region;
+    MemoryRegion *remote_mem_region;
     struct vpnet_metadata *metadata;
+    void *remote_mem_base[MAX_REMOTE_REGION];
+    uint64_t remote_mem_map_size[MAX_REMOTE_REGION];
     uint32_t host_features;
     size_t config_size;
     uint16_t status;
diff --git a/include/hw/virtio/vhost-pci-slave.h b/include/hw/virtio/vhost-pci-slave.h
new file mode 100644
index 0000000..1be6a87
--- /dev/null
+++ b/include/hw/virtio/vhost-pci-slave.h
@@ -0,0 +1,12 @@
+#ifndef QEMU_VHOST_PCI_SLAVE_H
+#define QEMU_VHOST_PCI_SLAVE_H
+
+#include "linux-headers/linux/vhost.h"
+
+extern int vp_slave_can_read(void *opaque);
+
+extern void vp_slave_read(void *opaque, const uint8_t *buf, int size);
+
+extern void vp_slave_event(void *opaque, int event);
+
+#endif