From patchwork Mon Mar 8 12:31:38 2021
From: Greg Kurz
To: qemu-devel@nongnu.org
Subject: [PATCH 1/4] vhost-user: Introduce nested event loop in vhost_user_read()
Date: Mon, 8 Mar 2021 13:31:38 +0100
Message-Id: <20210308123141.26444-2-groug@kaod.org>
In-Reply-To: <20210308123141.26444-1-groug@kaod.org>
References: <20210308123141.26444-1-groug@kaod.org>
Cc: Greg Kurz, "Michael S. Tsirkin", Vivek Goyal, Stefan Hajnoczi, "Dr. David Alan Gilbert"

A deadlock condition potentially exists if a vhost-user process needs
to request something of QEMU on the slave channel while processing a
vhost-user message.

This doesn't seem to affect any vhost-user implementation so far, but
it is currently biting the upcoming enablement of DAX with virtio-fs.
The issue is observed when the guest does an emergency reboot while a
mapping still exists in the DAX window, which is very easy to trigger
with a busy enough workload (e.g. as simulated by blogbench [1]):

- QEMU sends VHOST_USER_GET_VRING_BASE to virtiofsd.

- In order to complete the request, virtiofsd then asks QEMU to remove
  the mapping on the slave channel.

All these exchanges are synchronous, hence the deadlock.

As pointed out by Stefan Hajnoczi: when QEMU's vhost-user master
implementation sends a vhost-user protocol message, vhost_user_read()
does a "blocking" read during which slave_fd is not monitored by QEMU.

As a preliminary step to address this, split vhost_user_read() into a
nested event loop and a one-shot callback that does the actual reading.
A subsequent patch will teach the loop to monitor and process messages
from the slave channel as well.

[1] https://github.com/jedisct1/Blogbench

Suggested-by: Stefan Hajnoczi
Signed-off-by: Greg Kurz
---
 hw/virtio/vhost-user.c | 59 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 54 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 2fdd5daf74bb..8a0574d5f959 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -294,15 +294,27 @@ static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
     return 0;
 }
 
-static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
+struct vhost_user_read_cb_data {
+    struct vhost_dev *dev;
+    VhostUserMsg *msg;
+    GMainLoop *loop;
+    int ret;
+};
+
+static gboolean vhost_user_read_cb(GIOChannel *source, GIOCondition condition,
+                                   gpointer opaque)
 {
+    struct vhost_user_read_cb_data *data = opaque;
+    struct vhost_dev *dev = data->dev;
+    VhostUserMsg *msg = data->msg;
     struct vhost_user *u = dev->opaque;
     CharBackend *chr = u->user->chr;
     uint8_t *p = (uint8_t *) msg;
     int r, size;
 
     if (vhost_user_read_header(dev, msg) < 0) {
-        return -1;
+        data->ret = -1;
+        goto end;
     }
 
     /* validate message size is sane */
@@ -310,7 +322,8 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
         error_report("Failed to read msg header."
                      " Size %d exceeds the maximum %zu.", msg->hdr.size,
                      VHOST_USER_PAYLOAD_SIZE);
-        return -1;
+        data->ret = -1;
+        goto end;
     }
 
     if (msg->hdr.size) {
@@ -320,11 +333,47 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
         if (r != size) {
             error_report("Failed to read msg payload."
" Read %d instead of %d.", r, msg->hdr.size); - return -1; + data->ret = -1; + goto end; } } - return 0; +end: + g_main_loop_quit(data->loop); + return G_SOURCE_REMOVE; +} + +static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg) +{ + struct vhost_user *u = dev->opaque; + CharBackend *chr = u->user->chr; + GMainContext *prev_ctxt = chr->chr->gcontext; + GMainContext *ctxt = g_main_context_new(); + GMainLoop *loop = g_main_loop_new(ctxt, FALSE); + struct vhost_user_read_cb_data data = { + .dev = dev, + .loop = loop, + .msg = msg, + .ret = 0 + }; + + /* Switch context and add a new watch to monitor chardev activity */ + qemu_chr_be_update_read_handlers(chr->chr, ctxt); + qemu_chr_fe_add_watch(chr, G_IO_IN | G_IO_HUP, vhost_user_read_cb, &data); + + g_main_loop_run(loop); + + /* + * Restore the previous context. This also destroys/recreates event + * sources : this guarantees that all pending events in the original + * context that have been processed by the nested loop are purged. + */ + qemu_chr_be_update_read_handlers(chr->chr, prev_ctxt); + + g_main_loop_unref(loop); + g_main_context_unref(ctxt); + + return data.ret; } static int process_message_reply(struct vhost_dev *dev, From patchwork Mon Mar 8 12:31:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kurz X-Patchwork-Id: 12122261 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49778C433DB for ; Mon, 8 Mar 2021 13:25:40 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AB581651BB for ; Mon, 8 Mar 2021 13:25:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AB581651BB Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kaod.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45078 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lJFtC-0001Fz-Oz for qemu-devel@archiver.kernel.org; Mon, 08 Mar 2021 08:25:38 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:37630) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lJF3Q-0000TT-J7 for qemu-devel@nongnu.org; Mon, 08 Mar 2021 07:32:08 -0500 Received: from us-smtp-delivery-44.mimecast.com ([207.211.30.44]:50749) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1lJF3O-0001ve-Iy for qemu-devel@nongnu.org; Mon, 08 Mar 2021 07:32:08 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-107-AvdlgY69OYurw77E-E1bOg-1; Mon, 08 Mar 2021 07:31:54 -0500 X-MC-Unique: AvdlgY69OYurw77E-E1bOg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 
From patchwork Mon Mar 8 12:31:39 2021
From: Greg Kurz
To: qemu-devel@nongnu.org
Cc: Greg Kurz, "Michael S. Tsirkin", Vivek Goyal, Stefan Hajnoczi, "Dr. David Alan Gilbert"
Subject: [PATCH 2/4] vhost-user: Convert slave channel to QIOChannelSocket
Date: Mon, 8 Mar 2021 13:31:39 +0100
Message-Id: <20210308123141.26444-3-groug@kaod.org>
In-Reply-To: <20210308123141.26444-1-groug@kaod.org>
References: <20210308123141.26444-1-groug@kaod.org>

The slave channel is implemented with socketpair(): QEMU creates the
pair, passes one of the sockets to virtiofsd and monitors the other
one with the main event loop using qemu_set_fd_handler().

In order to fix a potential deadlock between QEMU and a vhost-user
external process (e.g. virtiofsd with DAX), we want the nested event
loop in vhost_user_read() to monitor the slave channel as well.

Prepare the ground for this by converting the slave channel to a
QIOChannelSocket. This will make monitoring of the slave channel as
simple as calling qio_channel_add_watch_source() from vhost_user_read()
instead of open-coding it. It also allows us to get rid of the
ancillary data parsing, since QIOChannelSocket can do that for us.

Note that the MSG_CTRUNC check is dropped on the way because
QIOChannelSocket ignores this case. This isn't a problem since
slave_read() provisions space for 8 file descriptors, while the
affected vhost-user slave protocol messages generally only convey one.
If for some reason a buggy implementation passes more file descriptors,
there is no need to break the connection, just as we don't break it
when some other type of ancillary data is received: this isn't
explicitly violating the protocol per se, so it seems better to
ignore it.
Signed-off-by: Greg Kurz
---
 hw/virtio/vhost-user.c | 99 ++++++++++++++++++------------------------
 1 file changed, 42 insertions(+), 57 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 8a0574d5f959..e395d0d1fd81 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -16,6 +16,7 @@
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-net.h"
 #include "chardev/char-fe.h"
+#include "io/channel-socket.h"
 #include "sysemu/kvm.h"
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
@@ -237,7 +238,8 @@ struct vhost_user {
     struct vhost_dev *dev;
     /* Shared between vhost devs of the same virtio device */
     VhostUserState *user;
-    int slave_fd;
+    QIOChannel *slave_ioc;
+    GSource *slave_src;
     NotifierWithReturn postcopy_notifier;
     struct PostCopyFD postcopy_fd;
     uint64_t postcopy_client_bases[VHOST_USER_MAX_RAM_SLOTS];
@@ -1441,56 +1443,33 @@ static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
     return 0;
 }
 
-static void slave_read(void *opaque)
+static gboolean slave_read(QIOChannel *ioc, GIOCondition condition,
+                           gpointer opaque)
 {
     struct vhost_dev *dev = opaque;
     struct vhost_user *u = dev->opaque;
     VhostUserHeader hdr = { 0, };
     VhostUserPayload payload = { 0, };
-    int size, ret = 0;
+    ssize_t size;
+    int ret = 0;
     struct iovec iov;
-    struct msghdr msgh;
-    int fd[VHOST_USER_SLAVE_MAX_FDS];
-    char control[CMSG_SPACE(sizeof(fd))];
-    struct cmsghdr *cmsg;
-    int i, fdsize = 0;
-
-    memset(&msgh, 0, sizeof(msgh));
-    msgh.msg_iov = &iov;
-    msgh.msg_iovlen = 1;
-    msgh.msg_control = control;
-    msgh.msg_controllen = sizeof(control);
-
-    memset(fd, -1, sizeof(fd));
+    g_autofree int *fd = NULL;
+    size_t fdsize = 0;
+    int i;
 
     /* Read header */
     iov.iov_base = &hdr;
     iov.iov_len = VHOST_USER_HDR_SIZE;
 
     do {
-        size = recvmsg(u->slave_fd, &msgh, 0);
-    } while (size < 0 && (errno == EINTR || errno == EAGAIN));
+        size = qio_channel_readv_full(ioc, &iov, 1, &fd, &fdsize, NULL);
+    } while (size == QIO_CHANNEL_ERR_BLOCK);
 
     if (size != VHOST_USER_HDR_SIZE) {
         error_report("Failed to read from slave.");
         goto err;
     }
 
-    if (msgh.msg_flags & MSG_CTRUNC) {
-        error_report("Truncated message.");
-        goto err;
-    }
-
-    for (cmsg = CMSG_FIRSTHDR(&msgh); cmsg != NULL;
-         cmsg = CMSG_NXTHDR(&msgh, cmsg)) {
-        if (cmsg->cmsg_level == SOL_SOCKET &&
-            cmsg->cmsg_type == SCM_RIGHTS) {
-            fdsize = cmsg->cmsg_len - CMSG_LEN(0);
-            memcpy(fd, CMSG_DATA(cmsg), fdsize);
-            break;
-        }
-    }
-
     if (hdr.size > VHOST_USER_PAYLOAD_SIZE) {
         error_report("Failed to read msg header."
                      " Size %d exceeds the maximum %zu.", hdr.size,
@@ -1500,8 +1479,8 @@ static void slave_read(void *opaque)
 
     /* Read payload */
     do {
-        size = read(u->slave_fd, &payload, hdr.size);
-    } while (size < 0 && (errno == EINTR || errno == EAGAIN));
+        size = qio_channel_read(ioc, (char *) &payload, hdr.size, NULL);
+    } while (size == QIO_CHANNEL_ERR_BLOCK);
 
     if (size != hdr.size) {
         error_report("Failed to read payload from slave.");
@@ -1517,7 +1496,7 @@ static void slave_read(void *opaque)
         break;
     case VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG:
         ret = vhost_user_slave_handle_vring_host_notifier(dev, &payload.area,
-                                                          fd[0]);
+                                                          fd ? fd[0] : -1);
         break;
     default:
         error_report("Received unexpected msg type: %d.", hdr.request);
@@ -1525,8 +1504,8 @@ static void slave_read(void *opaque)
     }
 
     /* Close the remaining file descriptors.
      */
-    for (i = 0; i < fdsize; i++) {
-        if (fd[i] != -1) {
+    if (fd) {
+        for (i = 0; i < fdsize; i++) {
             close(fd[i]);
         }
     }
@@ -1551,8 +1530,8 @@ static void slave_read(void *opaque)
         iovec[1].iov_len = hdr.size;
 
         do {
-            size = writev(u->slave_fd, iovec, ARRAY_SIZE(iovec));
-        } while (size < 0 && (errno == EINTR || errno == EAGAIN));
+            size = qio_channel_writev(ioc, iovec, ARRAY_SIZE(iovec), NULL);
+        } while (size == QIO_CHANNEL_ERR_BLOCK);
 
         if (size != VHOST_USER_HDR_SIZE + hdr.size) {
             error_report("Failed to send msg reply to slave.");
@@ -1560,18 +1539,20 @@ static void slave_read(void *opaque)
         }
     }
 
-    return;
+    return G_SOURCE_CONTINUE;
 
 err:
-    qemu_set_fd_handler(u->slave_fd, NULL, NULL, NULL);
-    close(u->slave_fd);
-    u->slave_fd = -1;
-    for (i = 0; i < fdsize; i++) {
-        if (fd[i] != -1) {
+    g_source_destroy(u->slave_src);
+    g_source_unref(u->slave_src);
+    u->slave_src = NULL;
+    object_unref(OBJECT(ioc));
+    u->slave_ioc = NULL;
+    if (fd) {
+        for (i = 0; i < fdsize; i++) {
             close(fd[i]);
         }
     }
-    return;
+    return G_SOURCE_REMOVE;
 }
 
 static int vhost_setup_slave_channel(struct vhost_dev *dev)
@@ -1595,8 +1576,9 @@ static int vhost_setup_slave_channel(struct vhost_dev *dev)
         return -1;
     }
 
-    u->slave_fd = sv[0];
-    qemu_set_fd_handler(u->slave_fd, slave_read, NULL, dev);
+    u->slave_ioc = QIO_CHANNEL(qio_channel_socket_new_fd(sv[0], &error_abort));
+    u->slave_src = qio_channel_add_watch_source(u->slave_ioc, G_IO_IN,
+                                                slave_read, dev, NULL, NULL);
 
     if (reply_supported) {
         msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
@@ -1614,9 +1596,11 @@ static int vhost_setup_slave_channel(struct vhost_dev *dev)
 out:
     close(sv[1]);
     if (ret) {
-        qemu_set_fd_handler(u->slave_fd, NULL, NULL, NULL);
-        close(u->slave_fd);
-        u->slave_fd = -1;
+        g_source_destroy(u->slave_src);
+        g_source_unref(u->slave_src);
+        u->slave_src = NULL;
+        object_unref(OBJECT(u->slave_ioc));
+        u->slave_ioc = NULL;
     }
 
     return ret;
@@ -1853,7 +1837,6 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque)
 
     u = g_new0(struct vhost_user, 1);
     u->user = opaque;
-    u->slave_fd = -1;
     u->dev = dev;
     dev->opaque = u;
@@ -1968,10 +1951,12 @@ static int vhost_user_backend_cleanup(struct vhost_dev *dev)
         close(u->postcopy_fd.fd);
         u->postcopy_fd.handler = NULL;
     }
-    if (u->slave_fd >= 0) {
-        qemu_set_fd_handler(u->slave_fd, NULL, NULL, NULL);
-        close(u->slave_fd);
-        u->slave_fd = -1;
+    if (u->slave_ioc) {
+        g_source_destroy(u->slave_src);
+        g_source_unref(u->slave_src);
+        u->slave_src = NULL;
+        object_unref(OBJECT(u->slave_ioc));
+        u->slave_ioc = NULL;
     }
     g_free(u->region_rb);
     u->region_rb = NULL;
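
For reference, the fd-passing side of the conversion reduces to the
sketch below. It mirrors the qio_channel_readv_full() call from the
patch; everything else (the helper name, the policy of closing all
received descriptors) is illustrative only:

/*
 * Sketch: QIOChannelSocket hands back SCM_RIGHTS file descriptors
 * without any manual cmsg parsing. read_msg_with_fds() is a
 * hypothetical helper, not a function from this patch.
 */
#include "qemu/osdep.h"
#include "io/channel.h"

static ssize_t read_msg_with_fds(QIOChannel *ioc, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    g_autofree int *fds = NULL;   /* allocated by the channel on success */
    size_t nfds = 0;
    ssize_t size;

    do {
        size = qio_channel_readv_full(ioc, &iov, 1, &fds, &nfds, NULL);
    } while (size == QIO_CHANNEL_ERR_BLOCK);

    /* fds stays NULL when no descriptors accompany the payload */
    if (fds) {
        for (size_t i = 0; i < nfds; i++) {
            close(fds[i]);        /* this sketch keeps none of them */
        }
    }

    return size;
}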
From patchwork Mon Mar 8 12:31:40 2021
From: Greg Kurz
To: qemu-devel@nongnu.org
Cc: Greg Kurz, "Michael S. Tsirkin", Vivek Goyal, Stefan Hajnoczi, "Dr. David Alan Gilbert"
Subject: [PATCH 3/4] vhost-user: Monitor slave channel in vhost_user_read()
Date: Mon, 8 Mar 2021 13:31:40 +0100
Message-Id: <20210308123141.26444-4-groug@kaod.org>
In-Reply-To: <20210308123141.26444-1-groug@kaod.org>
References: <20210308123141.26444-1-groug@kaod.org>

Now that everything is in place, have the nested event loop monitor
the slave channel. The source in the main event loop is destroyed and
recreated to ensure that any pending event for the slave channel that
was previously detected is purged. This guarantees that the main loop
won't invoke slave_read() based on an event that was already handled
by the nested loop.
Signed-off-by: Greg Kurz
---
 hw/virtio/vhost-user.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e395d0d1fd81..7669b3f2a99f 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -345,6 +345,9 @@ end:
     return G_SOURCE_REMOVE;
 }
 
+static gboolean slave_read(QIOChannel *ioc, GIOCondition condition,
+                           gpointer opaque);
+
 static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
 {
     struct vhost_user *u = dev->opaque;
@@ -352,6 +355,7 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
     GMainContext *prev_ctxt = chr->chr->gcontext;
     GMainContext *ctxt = g_main_context_new();
     GMainLoop *loop = g_main_loop_new(ctxt, FALSE);
+    GSource *slave_src = NULL;
     struct vhost_user_read_cb_data data = {
         .dev = dev,
         .loop = loop,
@@ -363,8 +367,30 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
     qemu_chr_be_update_read_handlers(chr->chr, ctxt);
     qemu_chr_fe_add_watch(chr, G_IO_IN | G_IO_HUP, vhost_user_read_cb, &data);
 
+    if (u->slave_ioc) {
+        /*
+         * This guarantees that all pending events in the main context
+         * for the slave channel are purged. They will be re-detected
+         * and processed now by the nested loop.
+         */
+        g_source_destroy(u->slave_src);
+        g_source_unref(u->slave_src);
+        u->slave_src = NULL;
+        slave_src = qio_channel_add_watch_source(u->slave_ioc, G_IO_IN,
+                                                 slave_read, dev, NULL,
+                                                 ctxt);
+    }
+
     g_main_loop_run(loop);
 
+    if (u->slave_ioc) {
+        g_source_destroy(slave_src);
+        g_source_unref(slave_src);
+        u->slave_src = qio_channel_add_watch_source(u->slave_ioc, G_IO_IN,
+                                                    slave_read, dev, NULL,
+                                                    NULL);
+    }
+
     /*
      * Restore the previous context. This also destroys/recreates event
      * sources : this guarantees that all pending events in the original
@@ -372,6 +398,7 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
      */
     qemu_chr_be_update_read_handlers(chr->chr, prev_ctxt);
 
+
     g_main_loop_unref(loop);
     g_main_context_unref(ctxt);
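
For reference, the destroy-and-recreate hand-over above reduces to the
following sketch, written in plain GLib (GIOChannel) rather than the
QIOChannel API used by the patch; the helper name is illustrative:

/*
 * Sketch: moving a watch between contexts. Destroying the old source
 * discards any dispatch it had already queued in its context; the
 * fresh watch re-detects the same readable condition from scratch in
 * the target context.
 */
#include <glib.h>

static GSource *move_watch(GSource *old_src, GIOChannel *chan,
                           GIOFunc cb, gpointer opaque, GMainContext *ctxt)
{
    GSource *src;

    g_source_destroy(old_src);      /* pending events die with the source */
    g_source_unref(old_src);

    src = g_io_create_watch(chan, G_IO_IN);
    g_source_set_callback(src, (GSourceFunc)cb, opaque, NULL);
    g_source_attach(src, ctxt);     /* NULL ctxt means the default context */

    return src;
}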
From patchwork Mon Mar 8 12:31:41 2021
From: Greg Kurz
To: qemu-devel@nongnu.org
Cc: Greg Kurz, "Michael S. Tsirkin", Vivek Goyal, Stefan Hajnoczi, "Dr. David Alan Gilbert"
Subject: [PATCH 4/4] virtiofsd: Release vu_dispatch_lock when stopping queue
Date: Mon, 8 Mar 2021 13:31:41 +0100
Message-Id: <20210308123141.26444-5-groug@kaod.org>
In-Reply-To: <20210308123141.26444-1-groug@kaod.org>
References: <20210308123141.26444-1-groug@kaod.org>

QEMU can stop a virtqueue by sending a VHOST_USER_GET_VRING_BASE
request to virtiofsd. As with all other vhost-user protocol messages,
the thread that runs the main event loop in virtiofsd takes the
vu_dispatch lock in write mode. This ensures that no other thread can
access virtqueues or memory tables at the same time.

In the case of VHOST_USER_GET_VRING_BASE, the main thread basically
notifies the queue thread that it should terminate and waits for its
termination:

main()
 virtio_loop()
  vu_dispatch_wrlock()
  vu_dispatch()
   vu_process_message()
    vu_get_vring_base_exec()
     fv_queue_cleanup_thread()
      pthread_join()

Unfortunately, the queue thread ends up calling virtio_send_msg()
at some point, which itself needs to grab the lock:

fv_queue_thread()
 g_list_foreach()
  fv_queue_worker()
   fuse_session_process_buf_int()
    do_release()
     lo_release()
      fuse_reply_err()
       send_reply()
        send_reply_iov()
         fuse_send_reply_iov_nofree()
          fuse_send_msg()
           virtio_send_msg()
            vu_dispatch_rdlock() <-- Deadlock !

Simply have the main thread release the lock before going to sleep
and take it back afterwards.
A very similar patch was already sent by Vivek Goyal some time back:

https://listman.redhat.com/archives/virtio-fs/2021-January/msg00073.html

The only difference here is that this is done in fv_queue_set_started()
because fv_queue_cleanup_thread() can also be called from virtio_loop()
without the lock being held.

Signed-off-by: Greg Kurz
Reviewed-by: Vivek Goyal
Reviewed-by: Stefan Hajnoczi
---
 tools/virtiofsd/fuse_virtio.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
index 523ee64fb7ae..3e13997406bf 100644
--- a/tools/virtiofsd/fuse_virtio.c
+++ b/tools/virtiofsd/fuse_virtio.c
@@ -792,7 +792,13 @@ static void fv_queue_set_started(VuDev *dev, int qidx, bool started)
             assert(0);
         }
     } else {
+        /*
+         * Temporarily drop write-lock taken in virtio_loop() so that
+         * the queue thread doesn't block in virtio_send_msg().
+         */
+        vu_dispatch_unlock(vud);
        fv_queue_cleanup_thread(vud, qidx);
+        vu_dispatch_wrlock(vud);
     }
 }
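
For reference, the deadlock and its fix reduce to a generic pthread
pattern; stop_queue() below is a hypothetical helper, not virtiofsd
code:

/*
 * Sketch: joining a thread while holding a write lock it needs
 * deadlocks; drop the lock around the join, then retake it.
 */
#include <pthread.h>

static pthread_rwlock_t dispatch_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Called with dispatch_lock held in write mode, as in virtio_loop() */
static void stop_queue(pthread_t queue_thread)
{
    /*
     * The queue thread must acquire a read lock before it can reach
     * its exit path, so it cannot terminate while we hold the write
     * lock: release it for the duration of the join.
     */
    pthread_rwlock_unlock(&dispatch_lock);
    pthread_join(queue_thread, NULL);
    pthread_rwlock_wrlock(&dispatch_lock);
}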