From patchwork Wed Aug 24 22:25:11 2011
X-Patchwork-Submitter: David Evensky
X-Patchwork-Id: 1094482
Date: Wed, 24 Aug 2011 15:25:11 -0700
From: David Evensky
To: penberg@kernel.org
Cc: Sasha Levin, kvm@vger.kernel.org
Subject: [PATCH] kvm tools: adds a PCI device that exports a host shared segment as a PCI BAR in the guest
Message-ID: <20110824222510.GC14835@dancer.ca.sandia.gov>
X-Mailing-List: kvm@vger.kernel.org

This patch adds a PCI device that provides PCI device memory to the
guest. This memory in the guest exists as a shared memory segment in
the host. This is similar to the memory sharing capability of Nahanni
(ivshmem) available in QEMU. In this case, the shared memory segment
is exposed as a PCI BAR only.

A new command line argument is added as:
    --shmem pci:0xc8000000:16MB:handle=/newmem:create

which sets the PCI BAR at 0xc8000000; the shared memory segment and
the region pointed to by the BAR will both be 16MB. On the host side
the shm_open handle will be '/newmem', and the kvm tool will create
the shared segment, set its size, and initialize it. If the size,
handle, or create flag is absent, the defaults are 16MB,
handle=/kvm_shmem, and create false. The address family, 'pci:', is
also optional, as it is the only address family currently supported.
Only a single --shmem is supported at this time.
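
For illustration only (this is not part of the patch): a host process
that wants to share data with the guest could attach to the same
segment roughly as follows. This is a minimal sketch; the helper name
attach_shmem() is hypothetical, and it assumes the segment already
exists, for example because the kvm tool was started with the create
flag.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): map an existing POSIX
 * shared memory segment read/write and report its length in *len_out.
 * Returns NULL on failure.
 */
static void *attach_shmem(const char *handle, size_t *len_out)
{
	struct stat st;
	void *mem;
	int fd;

	fd = shm_open(handle, O_RDWR, 0);
	if (fd == -1)
		return NULL;
	/* Use the size that the creator set with ftruncate(). */
	if (fstat(fd, &st) == -1 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	mem = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
		   MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close() */
	if (mem == MAP_FAILED)	/* mmap() reports failure as MAP_FAILED */
		return NULL;
	*len_out = (size_t)st.st_size;
	return mem;
}
```

With a guest started using the --shmem example above, calling
attach_shmem("/newmem", &len) on the host should give a 16MB mapping
backed by the same pages as the guest BAR. On older glibc versions
shm_open() may require linking with -lrt.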
Signed-off-by: David Evensky
---
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/builtin-run.c linux-kvm_pci_shmem/tools/kvm/builtin-run.c
--- linux-kvm/tools/kvm/builtin-run.c	2011-08-24 10:21:22.342077674 -0700
+++ linux-kvm_pci_shmem/tools/kvm/builtin-run.c	2011-08-24 14:17:33.190451297 -0700
@@ -28,6 +28,8 @@
 #include "kvm/sdl.h"
 #include "kvm/vnc.h"
 #include "kvm/guest_compat.h"
+#include "shmem-util.h"
+#include "kvm/pci-shmem.h"
 
 #include
 
@@ -52,6 +54,8 @@
 #define DEFAULT_SCRIPT "none"
 
 #define MB_SHIFT (20)
+#define KB_SHIFT (10)
+#define GB_SHIFT (30)
 #define MIN_RAM_SIZE_MB (64ULL)
 #define MIN_RAM_SIZE_BYTE (MIN_RAM_SIZE_MB << MB_SHIFT)
 
@@ -151,6 +155,130 @@ static int virtio_9p_rootdir_parser(cons
 
 	return 0;
 }
+static int shmem_parser(const struct option *opt, const char *arg, int unset)
+{
+	const uint64_t default_size = SHMEM_DEFAULT_SIZE;
+	const uint64_t default_phys_addr = SHMEM_DEFAULT_ADDR;
+	const char *default_handle = SHMEM_DEFAULT_HANDLE;
+	enum { PCI, UNK } addr_type = PCI;
+	uint64_t phys_addr;
+	uint64_t size;
+	char *handle = NULL;
+	int create = 0;
+	const char *p = arg;
+	char *next;
+	int base = 10;
+	int verbose = 0;
+
+	const int skip_pci = strlen("pci:");
+	if (verbose)
+		pr_info("shmem_parser(%p,%s,%d)", opt, arg, unset);
+	/* parse out optional addr family */
+	if (strcasestr(p, "pci:")) {
+		p += skip_pci;
+		addr_type = PCI;
+	} else if (strcasestr(p, "mem:")) {
+		die("I can't add to E820 map yet.\n");
+	}
+	/* parse out physical addr */
+	base = 10;
+	if (strcasestr(p, "0x"))
+		base = 16;
+	phys_addr = strtoll(p, &next, base);
+	if (next == p && phys_addr == 0) {
+		pr_info("shmem: no physical addr specified, using default.");
+		phys_addr = default_phys_addr;
+	}
+	if (*next != ':' && *next != '\0')
+		die("shmem: unexpected chars after phys addr.\n");
+	if (*next == '\0')
+		p = next;
+	else
+		p = next + 1;
+	/* parse out size */
+	base = 10;
+	if (strcasestr(p, "0x"))
+		base = 16;
+	size = strtoll(p, &next, base);
+	if (next == p && size == 0) {
+		pr_info("shmem: no size specified, using default.");
+		size = default_size;
+	}
+	/* look for [KMGkmg][Bb]* uses base 2. */
+	int skip_B = 0;
+	if (strspn(next, "KMGkmg")) {	/* might have a prefix */
+		if (*(next + 1) == 'B' || *(next + 1) == 'b')
+			skip_B = 1;
+		switch (*next) {
+		case 'K':
+		case 'k':
+			size = size << KB_SHIFT;
+			break;
+		case 'M':
+		case 'm':
+			size = size << MB_SHIFT;
+			break;
+		case 'G':
+		case 'g':
+			size = size << GB_SHIFT;
+			break;
+		default:
+			die("shmem: bug in detecting size prefix.");
+			break;
+		}
+		next += 1 + skip_B;
+	}
+	if (*next != ':' && *next != '\0') {
+		die("shmem: unexpected chars after phys size. <%c><%c>\n",
+		    *next, *p);
+	}
+	if (*next == '\0')
+		p = next;
+	else
+		p = next + 1;
+	/* parse out optional shmem handle */
+	const int skip_handle = strlen("handle=");
+	next = strcasestr(p, "handle=");
+	if (*p && next) {
+		if (p != next)
+			die("unexpected chars before handle\n");
+		p += skip_handle;
+		next = strchrnul(p, ':');
+		if (next - p) {
+			handle = malloc(next - p + 1);
+			strncpy(handle, p, next - p);
+			handle[next - p] = '\0';	/* just in case. */
+		}
+		if (*next == '\0')
+			p = next;
+		else
+			p = next + 1;
+	}
+	/* parse optional create flag to see if we should create shm seg. */
+	if (*p && strcasestr(p, "create")) {
+		create = 1;
+		p += strlen("create");
+	}
+	if (*p != '\0')
+		die("shmem: unexpected trailing chars\n");
+	if (handle == NULL) {
+		handle = malloc(strlen(default_handle) + 1);
+		strcpy(handle, default_handle);
+	}
+	if (verbose) {
+		pr_info("shmem: phys_addr = %lx", phys_addr);
+		pr_info("shmem: size = %lx", size);
+		pr_info("shmem: handle = %s", handle);
+		pr_info("shmem: create = %d", create);
+	}
+	struct shmem_info *si = malloc(sizeof(struct shmem_info));
+	si->phys_addr = phys_addr;
+	si->size = size;
+	si->handle = handle;
+	si->create = create;
+	pci_shmem__register_mem(si);	/* ownership of si, etc. passed on. */
+	return 0;
+}
 
 static const struct option options[] = {
 	OPT_GROUP("Basic options:"),
@@ -158,6 +286,10 @@ static const struct option options[] = {
 			"A name for the guest"),
 	OPT_INTEGER('c', "cpus", &nrcpus, "Number of CPUs"),
 	OPT_U64('m', "mem", &ram_size, "Virtual machine memory size in MiB."),
+	OPT_CALLBACK('\0', "shmem", NULL,
+		     "[pci:]:[:handle=][:create]",
+		     "Share host shmem with guest via pci device",
+		     shmem_parser),
 	OPT_CALLBACK('d', "disk", NULL, "image or rootfs_dir", "Disk image or rootfs directory", img_name_parser),
 	OPT_BOOLEAN('\0', "balloon", &balloon, "Enable virtio balloon"),
 	OPT_BOOLEAN('\0', "vnc", &vnc, "Enable VNC framebuffer"),
@@ -695,6 +827,8 @@ int kvm_cmd_run(int argc, const char **a
 
 	kbd__init(kvm);
 
+	pci_shmem__init(kvm);
+
 	if (vnc || sdl)
 		fb = vesa__init(kvm);
 
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/hw/pci-shmem.c linux-kvm_pci_shmem/tools/kvm/hw/pci-shmem.c
--- linux-kvm/tools/kvm/hw/pci-shmem.c	1969-12-31 16:00:00.000000000 -0800
+++ linux-kvm_pci_shmem/tools/kvm/hw/pci-shmem.c	2011-08-24 14:18:09.227234167 -0700
@@ -0,0 +1,59 @@
+#include "shmem-util.h"
+#include "kvm/pci-shmem.h"
+#include "kvm/virtio-pci-dev.h"
+#include "kvm/irq.h"
+#include "kvm/kvm.h"
+#include "kvm/pci.h"
+#include "kvm/util.h"
+
+static struct pci_device_header pci_shmem_pci_device = {
+	.vendor_id = PCI_VENDOR_ID_PCI_SHMEM,
+	.device_id = PCI_DEVICE_ID_PCI_SHMEM,
+	.header_type = PCI_HEADER_TYPE_NORMAL,
+	.class = 0xFF0000,	/* misc pci device */
+};
+
+static struct shmem_info *shmem_region;
+
+int pci_shmem__register_mem(struct shmem_info *si)
+{
+	if (shmem_region == NULL) {
+		shmem_region = si;
+	} else {
+		pr_warning("only single shmem currently avail. ignoring.\n");
+		free(si);
+	}
+	return 0;
+}
+
+int pci_shmem__init(struct kvm *kvm)
+{
+	u8 dev, line, pin;
+	char *mem;
+	int verbose = 0;
+	if (irq__register_device(PCI_DEVICE_ID_PCI_SHMEM, &dev, &pin, &line)
+	    < 0)
+		return 0;
+	/* ignore irq stuff, just want bus info for now. */
+	/* pci_shmem_pci_device.irq_pin = pin; */
+	/* pci_shmem_pci_device.irq_line = line; */
+	if (shmem_region == 0) {
+		if (verbose == 1)
+			pr_warning
+			    ("pci_shmem_init: memory region not registered\n");
+		return 0;
+	}
+	pci_shmem_pci_device.bar[0] =
+	    shmem_region->phys_addr | PCI_BASE_ADDRESS_SPACE_MEMORY;
+	pci_shmem_pci_device.bar_size[0] = shmem_region->size;
+	pci__register(&pci_shmem_pci_device, dev);
+
+	mem =
+	    setup_shmem(shmem_region->handle, shmem_region->size,
+			shmem_region->create, verbose);
+	if (mem == NULL)
+		return 0;
+	kvm__register_mem(kvm, shmem_region->phys_addr, shmem_region->size,
+			  mem);
+	return 1;
+}
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/include/kvm/pci-shmem.h linux-kvm_pci_shmem/tools/kvm/include/kvm/pci-shmem.h
--- linux-kvm/tools/kvm/include/kvm/pci-shmem.h	1969-12-31 16:00:00.000000000 -0800
+++ linux-kvm_pci_shmem/tools/kvm/include/kvm/pci-shmem.h	2011-08-13 15:43:01.067953711 -0700
@@ -0,0 +1,13 @@
+#ifndef KVM__PCI_SHMEM_H
+#define KVM__PCI_SHMEM_H
+
+#include
+#include
+
+struct kvm;
+struct shmem_info;
+
+int pci_shmem__init(struct kvm *self);
+int pci_shmem__register_mem(struct shmem_info *si);
+
+#endif
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/include/kvm/virtio-pci-dev.h linux-kvm_pci_shmem/tools/kvm/include/kvm/virtio-pci-dev.h
--- linux-kvm/tools/kvm/include/kvm/virtio-pci-dev.h	2011-08-09 15:38:48.760120973 -0700
+++ linux-kvm_pci_shmem/tools/kvm/include/kvm/virtio-pci-dev.h	2011-08-18 10:06:12.171539230 -0700
@@ -15,10 +15,13 @@
 #define PCI_DEVICE_ID_VIRTIO_BLN 0x1005
 #define PCI_DEVICE_ID_VIRTIO_P9 0x1009
 #define PCI_DEVICE_ID_VESA 0x2000
+#define PCI_DEVICE_ID_PCI_SHMEM 0x0001
 
 #define PCI_VENDOR_ID_REDHAT_QUMRANET 0x1af4
+#define PCI_VENDOR_ID_PCI_SHMEM 0x0001
 
 #define PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET 0x1af4
 #define PCI_SUBSYSTEM_ID_VESA 0x0004
+#define PCI_SUBSYSTEM_ID_PCI_SHMEM 0x0001
 
 #endif /* VIRTIO_PCI_DEV_H_ */
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/include/shmem-util.h linux-kvm_pci_shmem/tools/kvm/include/shmem-util.h
--- linux-kvm/tools/kvm/include/shmem-util.h	1969-12-31 16:00:00.000000000 -0800
+++ linux-kvm_pci_shmem/tools/kvm/include/shmem-util.h	2011-08-24 14:15:30.271780710 -0700
@@ -0,0 +1,27 @@
+#ifndef SHMEM_UTIL_H
+#define SHMEM_UTIL_H
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define SHMEM_DEFAULT_SIZE (16 << MB_SHIFT)
+#define SHMEM_DEFAULT_ADDR (0xc8000000)
+#define SHMEM_DEFAULT_HANDLE "/kvm_shmem"
+struct shmem_info {
+	uint64_t phys_addr;
+	uint64_t size;
+	char *handle;
+	int create;
+};
+
+inline void *setup_shmem(const char *key, size_t len, int creating,
+			 int verbose);
+inline void fill_mem(void *buf, size_t buf_size, char *fill, size_t fill_len);
+
+#endif /* SHMEM_UTIL_H */
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/Makefile linux-kvm_pci_shmem/tools/kvm/Makefile
--- linux-kvm/tools/kvm/Makefile	2011-08-24 10:21:22.342077676 -0700
+++ linux-kvm_pci_shmem/tools/kvm/Makefile	2011-08-24 13:55:37.442003451 -0700
@@ -77,10 +77,12 @@ OBJS += threadpool.o
 OBJS += util/parse-options.o
 OBJS += util/rbtree-interval.o
 OBJS += util/strbuf.o
+OBJS += util/shmem-util.o
 OBJS += virtio/9p.o
 OBJS += virtio/9p-pdu.o
 OBJS += hw/vesa.o
 OBJS += hw/i8042.o
+OBJS += hw/pci-shmem.o
 
 FLAGS_BFD := $(CFLAGS) -lbfd
 has_bfd := $(call try-cc,$(SOURCE_BFD),$(FLAGS_BFD))
diff -uprN -X linux-kvm/Documentation/dontdiff linux-kvm/tools/kvm/util/shmem-util.c linux-kvm_pci_shmem/tools/kvm/util/shmem-util.c
--- linux-kvm/tools/kvm/util/shmem-util.c	1969-12-31 16:00:00.000000000 -0800
+++ linux-kvm_pci_shmem/tools/kvm/util/shmem-util.c	2011-08-24 14:19:31.801027897 -0700
@@ -0,0 +1,64 @@
+#include
+#include
+#include
+#include
+#include
+#include
+#include "shmem-util.h"
+
+inline void *setup_shmem(const char *key, size_t len, int creating, int verbose)
+{
+	int fd;
+	int rtn;
+	void *mem;
+	int flag = O_RDWR;
+	if (creating)
+		flag |= O_CREAT;
+	fd = shm_open(key, flag, S_IRUSR | S_IWUSR);
+	if (fd == -1) {
+		fprintf(stderr, "Failed to open shared memory file %s\n", key);
+		fflush(stderr);
+		return NULL;
+	}
+	if (creating) {
+		if (verbose)
+			fprintf(stderr, "Truncating file.\n");
+		rtn = ftruncate(fd, (off_t) len);
+		if (rtn == -1) {
+			fprintf(stderr, "Can't ftruncate(fd,%ld)\n", len);
+			fflush(stderr);
+		}
+	}
+	mem = mmap(NULL, len,
+		   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE, fd, 0);
+	close(fd);
+	if (mem == NULL) {
+		fprintf(stderr, "Failed to mmap shared memory file\n");
+		fflush(stderr);
+		return NULL;
+	}
+	if (creating) {
+		int fill_bytes = 0xae0dae0d;
+		if (verbose)
+			fprintf(stderr, "Filling buffer\n");
+		fill_mem(mem, len, (char *)&fill_bytes, 4);
+	}
+	return mem;
+}
+
+inline void fill_mem(void *buf, size_t buf_size, char *fill, size_t fill_len)
+{
+	size_t i;
+
+	if (fill_len == 1) {
+		memset(buf, fill[0], buf_size);
+	} else {
+		if (buf_size > fill_len) {
+			for (i = 0; i < buf_size - fill_len; i += fill_len)
+				memcpy(((char *)buf) + i, fill, fill_len);
+			memcpy(buf + i, fill, buf_size - i);
+		} else {
+			memcpy(buf, fill, buf_size);
+		}
+	}
+}
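
A note on the size syntax that shmem_parser() accepts (a number,
optionally followed by K, M or G and an optional B, all powers of
two): the same rule can be sketched as a small standalone helper. This
is only an illustration under stated assumptions, not code from the
patch. The name parse_size() is hypothetical, and it uses strtoull()
with base 0 to detect a leading "0x" instead of the strcasestr() test
above. Note that base 0 also reads a leading "0" as octal.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): parse "<number>[K|M|G][B]"
 * at *s, where the suffix is a power-of-two multiplier as in
 * shmem_parser(). On success, store the byte count in *out, advance
 * *s past the token and return 0. Return -1 if the token does not
 * start with a digit or the value would overflow 64 bits.
 */
static int parse_size(const char **s, uint64_t *out)
{
	unsigned long long v;
	char *end;
	int shift = 0;

	if (!isdigit((unsigned char)**s))
		return -1;
	errno = 0;
	v = strtoull(*s, &end, 0);	/* base 0: "0x" selects hex */
	if (end == *s || errno == ERANGE)
		return -1;
	switch (toupper((unsigned char)*end)) {
	case 'K':
		shift = 10;
		break;
	case 'M':
		shift = 20;
		break;
	case 'G':
		shift = 30;
		break;
	}
	if (shift) {
		end++;
		if (*end == 'B' || *end == 'b')
			end++;
		if (v > (UINT64_MAX >> shift))
			return -1;	/* the shift would overflow */
	}
	*out = (uint64_t)v << shift;
	*s = end;
	return 0;
}
```

The caller is still responsible for checking that the character left
at *s is ':' or the end of the string, as shmem_parser() does after
each field.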