From patchwork Wed Feb 10 09:26:25 2021
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 12080305
From: Stefan Hajnoczi
To: Peter Maydell, qemu-devel@nongnu.org
Cc: Fam Zheng, John G Johnson, thuth@redhat.com, Jagannathan Raman,
 Eduardo Habkost, qemu-block@nongnu.org, "Michael S. Tsirkin",
 "Denis V. Lunev", Philippe Mathieu-Daudé, Stefan Hajnoczi,
 Daniel P. Berrangé, Wainer dos Santos Moschetta, Elena Ufimtseva,
 Igor Mammedov, Paolo Bonzini, Alex Bennée
Subject: [PULL v4 24/27] multi-process: create IOHUB object to handle irq
Date: Wed, 10 Feb 2021 09:26:25 +0000
Message-Id: <20210210092628.193785-25-stefanha@redhat.com>
In-Reply-To: <20210210092628.193785-1-stefanha@redhat.com>
References: <20210210092628.193785-1-stefanha@redhat.com>

From: Jagannathan Raman

An IOHUB object is added to manage PCI IRQs. It uses the KVM_IRQFD ioctl
to create an irqfd for injecting PCI interrupts into the guest. The IOHUB
object forwards the irqfd to the remote process, which uses this fd to
send interrupts directly to the guest, bypassing QEMU.

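The signaling primitive behind an irqfd is an eventfd. The following is a minimal sketch of the eventfd semantics that the description above relies on, not code from this series: it uses plain Linux eventfd(2) and leaves out the KVM_IRQFD registration entirely. The helper names notifier_set() and notifier_test_and_clear() are hypothetical stand-ins, loosely modeled on QEMU's EventNotifier helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal the eventfd by adding 1 to its counter. */
static int notifier_set(int fd)
{
    uint64_t one = 1;
    ssize_t n;

    do {
        n = write(fd, &one, sizeof(one));
    } while (n < 0 && errno == EINTR);

    return n == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical helper: return 1 if the eventfd was signaled (and reset its
 * counter), 0 if nothing was pending on a non-blocking fd, -1 on error.
 */
static int notifier_test_and_clear(int fd)
{
    uint64_t count;
    ssize_t n;

    do {
        n = read(fd, &count, sizeof(count));
    } while (n < 0 && errno == EINTR);

    if (n == (ssize_t)sizeof(count)) {
        return 1;
    }
    return (n < 0 && errno == EAGAIN) ? 0 : -1;
}
```

With a descriptor from eventfd(0, EFD_NONBLOCK), several notifier_set() calls collapse into a single pending event that one notifier_test_and_clear() consumes. Because the descriptor can be passed to another process, the remote device can signal it without a round trip through QEMU; in the real setup KVM, not userspace, consumes the interrupt eventfd.
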
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Signed-off-by: Elena Ufimtseva
Reviewed-by: Stefan Hajnoczi
Message-id: 51d5c3d54e28a68b002e3875c59599c9f5a424a1.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi
---
 MAINTAINERS                     |   2 +
 include/hw/pci/pci_ids.h        |   3 +
 include/hw/remote/iohub.h       |  42 +++++++++++
 include/hw/remote/machine.h     |   2 +
 include/hw/remote/mpqemu-link.h |   1 +
 include/hw/remote/proxy.h       |   4 ++
 hw/remote/iohub.c               | 119 ++++++++++++++++++++++++++++++++
 hw/remote/machine.c             |  10 +++
 hw/remote/message.c             |   4 ++
 hw/remote/mpqemu-link.c         |   5 ++
 hw/remote/proxy.c               |  56 +++++++++++++++
 hw/remote/meson.build           |   1 +
 12 files changed, 249 insertions(+)
 create mode 100644 include/hw/remote/iohub.h
 create mode 100644 hw/remote/iohub.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 3817e807b1..e6f1eca30f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3221,6 +3221,8 @@ F: hw/remote/proxy.c
 F: include/hw/remote/proxy.h
 F: hw/remote/proxy-memory-listener.c
 F: include/hw/remote/proxy-memory-listener.h
+F: hw/remote/iohub.c
+F: include/hw/remote/iohub.h
 
 Build and test automation
 -------------------------
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 11f8ab7149..bd0c17dc78 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -192,6 +192,9 @@
 #define PCI_DEVICE_ID_SUN_SIMBA          0x5000
 #define PCI_DEVICE_ID_SUN_SABRE          0xa000
 
+#define PCI_VENDOR_ID_ORACLE             0x108e
+#define PCI_DEVICE_ID_REMOTE_IOHUB       0xb000
+
 #define PCI_VENDOR_ID_CMD                0x1095
 #define PCI_DEVICE_ID_CMD_646            0x0646
 
diff --git a/include/hw/remote/iohub.h b/include/hw/remote/iohub.h
new file mode 100644
index 0000000000..0bf98e0d78
--- /dev/null
+++ b/include/hw/remote/iohub.h
@@ -0,0 +1,42 @@
+/*
+ * IO Hub for remote device
+ *
+ * Copyright © 2018, 2021 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef REMOTE_IOHUB_H
+#define REMOTE_IOHUB_H
+
+#include "hw/pci/pci.h"
+#include "qemu/event_notifier.h"
+#include "qemu/thread-posix.h"
+#include "hw/remote/mpqemu-link.h"
+
+#define REMOTE_IOHUB_NB_PIRQS    PCI_DEVFN_MAX
+
+typedef struct ResampleToken {
+    void *iohub;
+    int pirq;
+} ResampleToken;
+
+typedef struct RemoteIOHubState {
+    PCIDevice d;
+    EventNotifier irqfds[REMOTE_IOHUB_NB_PIRQS];
+    EventNotifier resamplefds[REMOTE_IOHUB_NB_PIRQS];
+    unsigned int irq_level[REMOTE_IOHUB_NB_PIRQS];
+    ResampleToken token[REMOTE_IOHUB_NB_PIRQS];
+    QemuMutex irq_level_lock[REMOTE_IOHUB_NB_PIRQS];
+} RemoteIOHubState;
+
+int remote_iohub_map_irq(PCIDevice *pci_dev, int intx);
+void remote_iohub_set_irq(void *opaque, int pirq, int level);
+void process_set_irqfd_msg(PCIDevice *pci_dev, MPQemuMsg *msg);
+
+void remote_iohub_init(RemoteIOHubState *iohub);
+void remote_iohub_finalize(RemoteIOHubState *iohub);
+
+#endif
diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h
index b92b2ce705..2a2a33c4b2 100644
--- a/include/hw/remote/machine.h
+++ b/include/hw/remote/machine.h
@@ -15,11 +15,13 @@
 #include "hw/boards.h"
 #include "hw/pci-host/remote.h"
 #include "io/channel.h"
+#include "hw/remote/iohub.h"
 
 struct RemoteMachineState {
     MachineState parent_obj;
 
     RemotePCIHost *host;
+    RemoteIOHubState iohub;
 };
 
 /* Used to pass to co-routine device and ioc. */
diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h
index 6303e62b17..71d206f00e 100644
--- a/include/hw/remote/mpqemu-link.h
+++ b/include/hw/remote/mpqemu-link.h
@@ -39,6 +39,7 @@ typedef enum {
     MPQEMU_CMD_PCI_CFGREAD,
     MPQEMU_CMD_BAR_WRITE,
     MPQEMU_CMD_BAR_READ,
+    MPQEMU_CMD_SET_IRQFD,
     MPQEMU_CMD_MAX,
 } MPQemuCmd;
 
diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h
index 12888b4f90..741def71f1 100644
--- a/include/hw/remote/proxy.h
+++ b/include/hw/remote/proxy.h
@@ -12,6 +12,7 @@
 #include "hw/pci/pci.h"
 #include "io/channel.h"
 #include "hw/remote/proxy-memory-listener.h"
+#include "qemu/event_notifier.h"
 
 #define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev"
 OBJECT_DECLARE_SIMPLE_TYPE(PCIProxyDev, PCI_PROXY_DEV)
@@ -38,6 +39,9 @@ struct PCIProxyDev {
     QIOChannel *ioc;
     Error *migration_blocker;
     ProxyMemoryListener proxy_listener;
+    int virq;
+    EventNotifier intr;
+    EventNotifier resample;
     ProxyMemoryRegion region[PCI_NUM_REGIONS];
 };
 
diff --git a/hw/remote/iohub.c b/hw/remote/iohub.c
new file mode 100644
index 0000000000..e4ff131a6b
--- /dev/null
+++ b/hw/remote/iohub.c
@@ -0,0 +1,119 @@
+/*
+ * Remote IO Hub
+ *
+ * Copyright © 2018, 2021 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_ids.h"
+#include "hw/pci/pci_bus.h"
+#include "qemu/thread.h"
+#include "hw/boards.h"
+#include "hw/remote/machine.h"
+#include "hw/remote/iohub.h"
+#include "qemu/main-loop.h"
+
+void remote_iohub_init(RemoteIOHubState *iohub)
+{
+    int pirq;
+
+    memset(&iohub->irqfds, 0, sizeof(iohub->irqfds));
+    memset(&iohub->resamplefds, 0, sizeof(iohub->resamplefds));
+
+    for (pirq = 0; pirq < REMOTE_IOHUB_NB_PIRQS; pirq++) {
+        qemu_mutex_init(&iohub->irq_level_lock[pirq]);
+        iohub->irq_level[pirq] = 0;
+        event_notifier_init_fd(&iohub->irqfds[pirq], -1);
+        event_notifier_init_fd(&iohub->resamplefds[pirq], -1);
+    }
+}
+
+void remote_iohub_finalize(RemoteIOHubState *iohub)
+{
+    int pirq;
+
+    for (pirq = 0; pirq < REMOTE_IOHUB_NB_PIRQS; pirq++) {
+        qemu_set_fd_handler(event_notifier_get_fd(&iohub->resamplefds[pirq]),
+                            NULL, NULL, NULL);
+        event_notifier_cleanup(&iohub->irqfds[pirq]);
+        event_notifier_cleanup(&iohub->resamplefds[pirq]);
+        qemu_mutex_destroy(&iohub->irq_level_lock[pirq]);
+    }
+}
+
+int remote_iohub_map_irq(PCIDevice *pci_dev, int intx)
+{
+    return pci_dev->devfn;
+}
+
+void remote_iohub_set_irq(void *opaque, int pirq, int level)
+{
+    RemoteIOHubState *iohub = opaque;
+
+    assert(pirq >= 0);
+    assert(pirq < PCI_DEVFN_MAX);
+
+    QEMU_LOCK_GUARD(&iohub->irq_level_lock[pirq]);
+
+    if (level) {
+        if (++iohub->irq_level[pirq] == 1) {
+            event_notifier_set(&iohub->irqfds[pirq]);
+        }
+    } else if (iohub->irq_level[pirq] > 0) {
+        iohub->irq_level[pirq]--;
+    }
+}
+
+static void intr_resample_handler(void *opaque)
+{
+    ResampleToken *token = opaque;
+    RemoteIOHubState *iohub = token->iohub;
+    int pirq, s;
+
+    pirq = token->pirq;
+
+    s = event_notifier_test_and_clear(&iohub->resamplefds[pirq]);
+
+    assert(s >= 0);
+
+    QEMU_LOCK_GUARD(&iohub->irq_level_lock[pirq]);
+
+    if (iohub->irq_level[pirq]) {
+        event_notifier_set(&iohub->irqfds[pirq]);
+    }
+}
+
+void process_set_irqfd_msg(PCIDevice *pci_dev, MPQemuMsg *msg)
+{
+    RemoteMachineState *machine = REMOTE_MACHINE(current_machine);
+    RemoteIOHubState *iohub = &machine->iohub;
+    int pirq, intx;
+
+    intx = pci_get_byte(pci_dev->config + PCI_INTERRUPT_PIN) - 1;
+
+    pirq = remote_iohub_map_irq(pci_dev, intx);
+
+    if (event_notifier_get_fd(&iohub->irqfds[pirq]) != -1) {
+        qemu_set_fd_handler(event_notifier_get_fd(&iohub->resamplefds[pirq]),
+                            NULL, NULL, NULL);
+        event_notifier_cleanup(&iohub->irqfds[pirq]);
+        event_notifier_cleanup(&iohub->resamplefds[pirq]);
+        memset(&iohub->token[pirq], 0, sizeof(ResampleToken));
+    }
+
+    event_notifier_init_fd(&iohub->irqfds[pirq], msg->fds[0]);
+    event_notifier_init_fd(&iohub->resamplefds[pirq], msg->fds[1]);
+
+    iohub->token[pirq].iohub = iohub;
+    iohub->token[pirq].pirq = pirq;
+
+    qemu_set_fd_handler(msg->fds[1], intr_resample_handler, NULL,
+                        &iohub->token[pirq]);
+}
diff --git a/hw/remote/machine.c b/hw/remote/machine.c
index 9519a6c0a4..c0ab4f528a 100644
--- a/hw/remote/machine.c
+++ b/hw/remote/machine.c
@@ -20,12 +20,15 @@
 #include "exec/address-spaces.h"
 #include "exec/memory.h"
 #include "qapi/error.h"
+#include "hw/pci/pci_host.h"
+#include "hw/remote/iohub.h"
 
 static void remote_machine_init(MachineState *machine)
 {
     MemoryRegion *system_memory, *system_io, *pci_memory;
     RemoteMachineState *s = REMOTE_MACHINE(machine);
     RemotePCIHost *rem_host;
+    PCIHostState *pci_host;
 
     system_memory = get_system_memory();
     system_io = get_system_io();
@@ -45,6 +48,13 @@ static void remote_machine_init(MachineState *machine)
     memory_region_add_subregion_overlap(system_memory, 0x0, pci_memory, -1);
 
     qdev_realize(DEVICE(rem_host), sysbus_get_default(), &error_fatal);
+
+    pci_host = PCI_HOST_BRIDGE(rem_host);
+
+    remote_iohub_init(&s->iohub);
+
+    pci_bus_irqs(pci_host->bus, remote_iohub_set_irq, remote_iohub_map_irq,
+                 &s->iohub, REMOTE_IOHUB_NB_PIRQS);
 }
 
 static void remote_machine_class_init(ObjectClass *oc, void *data)
diff --git a/hw/remote/message.c b/hw/remote/message.c
index 25341d8ad2..adab040ca1 100644
--- a/hw/remote/message.c
+++ b/hw/remote/message.c
@@ -18,6 +18,7 @@
 #include "hw/pci/pci.h"
 #include "exec/memattrs.h"
 #include "hw/remote/memory.h"
+#include "hw/remote/iohub.h"
 
 static void process_config_write(QIOChannel *ioc, PCIDevice *dev,
                                  MPQemuMsg *msg, Error **errp);
@@ -65,6 +66,9 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
         case MPQEMU_CMD_SYNC_SYSMEM:
             remote_sysmem_reconfig(&msg, &local_err);
             break;
+        case MPQEMU_CMD_SET_IRQFD:
+            process_set_irqfd_msg(pci_dev, &msg);
+            break;
         default:
             error_setg(&local_err,
                        "Unknown command (%d) received for device %s"
diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c
index 52bfeddcdc..9ce31526e8 100644
--- a/hw/remote/mpqemu-link.c
+++ b/hw/remote/mpqemu-link.c
@@ -254,6 +254,11 @@ bool mpqemu_msg_valid(MPQemuMsg *msg)
             return false;
         }
         break;
+    case MPQEMU_CMD_SET_IRQFD:
+        if (msg->size || (msg->num_fds != 2)) {
+            return false;
+        }
+        break;
     default:
         break;
     }
diff --git a/hw/remote/proxy.c b/hw/remote/proxy.c
index 472b2df335..555b3103f4 100644
--- a/hw/remote/proxy.c
+++ b/hw/remote/proxy.c
@@ -21,6 +21,57 @@
 #include "qemu/error-report.h"
 #include "hw/remote/proxy-memory-listener.h"
 #include "qom/object.h"
+#include "qemu/event_notifier.h"
+#include "sysemu/kvm.h"
+#include "util/event_notifier-posix.c"
+
+static void proxy_intx_update(PCIDevice *pci_dev)
+{
+    PCIProxyDev *dev = PCI_PROXY_DEV(pci_dev);
+    PCIINTxRoute route;
+    int pin = pci_get_byte(pci_dev->config + PCI_INTERRUPT_PIN) - 1;
+
+    if (dev->virq != -1) {
+        kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, &dev->intr, dev->virq);
+        dev->virq = -1;
+    }
+
+    route = pci_device_route_intx_to_irq(pci_dev, pin);
+
+    dev->virq = route.irq;
+
+    if (dev->virq != -1) {
+        kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, &dev->intr,
+                                           &dev->resample, dev->virq);
+    }
+}
+
+static void setup_irqfd(PCIProxyDev *dev)
+{
+    PCIDevice *pci_dev = PCI_DEVICE(dev);
+    MPQemuMsg msg;
+    Error *local_err = NULL;
+
+    event_notifier_init(&dev->intr, 0);
+    event_notifier_init(&dev->resample, 0);
+
+    memset(&msg, 0, sizeof(MPQemuMsg));
+    msg.cmd = MPQEMU_CMD_SET_IRQFD;
+    msg.num_fds = 2;
+    msg.fds[0] = event_notifier_get_fd(&dev->intr);
+    msg.fds[1] = event_notifier_get_fd(&dev->resample);
+    msg.size = 0;
+
+    if (!mpqemu_msg_send(&msg, dev->ioc, &local_err)) {
+        error_report_err(local_err);
+    }
+
+    dev->virq = -1;
+
+    proxy_intx_update(pci_dev);
+
+    pci_device_set_intx_routing_notifier(pci_dev, proxy_intx_update);
+}
 
 static void pci_proxy_dev_realize(PCIDevice *device, Error **errp)
 {
@@ -56,6 +107,8 @@ static void pci_proxy_dev_realize(PCIDevice *device, Error **errp)
     qio_channel_set_blocking(dev->ioc, true, NULL);
 
     proxy_memory_listener_configure(&dev->proxy_listener, dev->ioc);
+
+    setup_irqfd(dev);
 }
 
 static void pci_proxy_dev_exit(PCIDevice *pdev)
@@ -71,6 +124,9 @@ static void pci_proxy_dev_exit(PCIDevice *pdev)
     error_free(dev->migration_blocker);
 
     proxy_memory_listener_deconfigure(&dev->proxy_listener);
+
+    event_notifier_cleanup(&dev->intr);
+    event_notifier_cleanup(&dev->resample);
 }
 
 static void config_op_send(PCIProxyDev *pdev, uint32_t addr, uint32_t *val,
diff --git a/hw/remote/meson.build b/hw/remote/meson.build
index 7f11be4736..e6a5574242 100644
--- a/hw/remote/meson.build
+++ b/hw/remote/meson.build
@@ -5,6 +5,7 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('mpqemu-link.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('remote-obj.c'))
 remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy.c'))
+remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('iohub.c'))
 
 specific_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('memory.c'))
 specific_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('proxy-memory-listener.c'))
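
The level accounting in remote_iohub_set_irq() and intr_resample_handler() above can be exercised in isolation. The following is a simplified sketch, not part of the patch: PinModel, pin_set_irq() and pin_resample() are hypothetical names, a plain counter stands in for event_notifier_set(), and the per-pirq mutex is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-pin model; the patch keeps this state per pirq. */
typedef struct PinModel {
    unsigned int level;    /* number of outstanding assertions */
    unsigned int signals;  /* how often the irqfd would have been set */
} PinModel;

/* Same branch structure as remote_iohub_set_irq(), without locking. */
static void pin_set_irq(PinModel *pin, bool level)
{
    if (level) {
        /* Signal only on the 0 -> 1 transition. */
        if (++pin->level == 1) {
            pin->signals++;    /* stands in for event_notifier_set() */
        }
    } else if (pin->level > 0) {
        pin->level--;
    }
}

/* Same check as intr_resample_handler(): re-signal while still asserted. */
static void pin_resample(PinModel *pin)
{
    if (pin->level) {
        pin->signals++;
    }
}
```

In this model, two assertions followed by one deassertion leave the level at 1, so a resample event signals again; after the second deassertion a resample event does nothing. That appears to be the reason the handler re-checks irq_level rather than relying on the initial edge alone.
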