From patchwork Fri Mar 8 23:05:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587428 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E34DC3BBD5 for ; Fri, 8 Mar 2024 23:06:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939175; cv=none; b=kPttnSJkkqnRxTyB2b/qyEhiEEqI5J7kxUiEXHx5T79uKqK9HwWLwkiO7U0AxTCdYgQD7sxP0j0fPGFclG/fLfEo7VXWgwvxgbS+qxR2eb0LxXNisRBZ6D0kPX9a64r8kRhfzbaXDo6wFtW8n5r6B0c6lmuRPXX6PxFxZvFWHA4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939175; c=relaxed/simple; bh=heT93CNz8M4GwBdCOZkIHxkdserSfvqS5SdZ6rnGMbA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=k3eQWmv5L4rJdoc93hHfCyPdUebQFxZGYe2UYOYLKoySJSqUTYpPNesWBbpxZvzfvz4aQulkqgQoVXTFxPOPiHvxD333n4PFDiTyL2KKpsZTlLbpiFR5jThEK3O++w2Kod5Gfyhp5EE/vIDeHhAndlDKNu9xgp9C205SuwnSpgA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Fcbz8QJb; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Fcbz8QJb" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939171; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s1fqiArHyLhCU0d5Vyof/5+mbhO22X+BF7PWrc8qFrE=; b=Fcbz8QJb2oaTsiRp0WLUWZd+hEetRgLUOkPi04iVrSVWwzUFbX8MQ6G/Y4VFcSX/1Ptyo6 +/QSwTEdIQQP/71xN5Sb/7peNgLFmh8toN6ceBcKXpVCl5ph2z5TdH0iGyBVQjB6AOdnWk kpaLwioWFlHYHYE55y3iitkeekMNs94= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-347-7t25bQA-P66lx1ZIL_U-7g-1; Fri, 08 Mar 2024 18:06:06 -0500 X-MC-Unique: 7t25bQA-P66lx1ZIL_U-7g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3B9408007A2; Fri, 8 Mar 2024 23:06:06 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4112A37FD; Fri, 8 Mar 2024 23:06:04 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, stable@vger.kernel.org Subject: [PATCH v2 1/7] vfio/pci: Disable auto-enable of exclusive INTx IRQ Date: Fri, 8 Mar 2024 16:05:22 -0700 Message-ID: <20240308230557.805580-2-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 Currently for devices requiring masking at the irqchip for INTx, ie. devices without DisINTx support, the IRQ is enabled in request_irq() and subsequently disabled as necessary to align with the masked status flag. This presents a window where the interrupt could fire between these events, resulting in the IRQ incrementing the disable depth twice. This would be unrecoverable for a user since the masked flag prevents nested enables through vfio. Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx is never auto-enabled, then unmask as required. Cc: stable@vger.kernel.org Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reviewed-by: Kevin Tian Signed-off-by: Alex Williamson --- drivers/vfio/pci/vfio_pci_intrs.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 237beac83809..136101179fcb 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -296,8 +296,15 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) ctx->trigger = trigger; + /* + * Devices without DisINTx support require an exclusive interrupt, + * IRQ masking is performed at the IRQ chip. The masked status is + * protected by vdev->irqlock. Setup the IRQ without auto-enable and + * unmask as necessary below under lock. DisINTx is unmodified by + * the IRQ configuration and may therefore use auto-enable. + */ if (!vdev->pci_2_3) - irqflags = 0; + irqflags = IRQF_NO_AUTOEN; ret = request_irq(pdev->irq, vfio_intx_handler, irqflags, ctx->name, vdev); @@ -308,13 +315,9 @@ static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) return ret; } - /* - * INTx disable will stick across the new irq setup, - * disable_irq won't. - */ spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && ctx->masked) - disable_irq_nosync(pdev->irq); + if (!vdev->pci_2_3 && !ctx->masked) + enable_irq(pdev->irq); spin_unlock_irqrestore(&vdev->irqlock, flags); return 0; From patchwork Fri Mar 8 23:05:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587426 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6D6B93A1C4 for ; Fri, 8 Mar 2024 23:06:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939174; cv=none; b=hbtOhtxKGKk4cOY5Gg9PaPhDjzLkTV8yNhxuQrXZAwTZn8Kv8SXytg5vg1mNKC+A3hnX/9B8Kc4C7GkkEfRIYOeu8KaajB0NwvLJqt1DPBIhwt794ZkSETYeUi1lDStzRejMkzxwNkRx3dedezN+nRgzPsSwTHvz8r/R5+iQEj4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939174; c=relaxed/simple; bh=RRNuwGOodP63KJuqXtOzgGIXmJxXs5aKT4xUnjW/L3M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LrfPMTXjPT6zcXV4vk314zspgNfOcFDzVtY/P0ej11KNcJO1NqOMYCMXBGk+tCSgUf0EI1wlEwlUMN4NhYZNqepirmiQHrq+pYih7q1gJmsRlOfI3zBZjkDUj6Dt2L5CyDIzK8ndO+a4ObviDdtdt4zKHcAPRFwE2ibQ55yzqXg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=BsH+yC9H; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="BsH+yC9H" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939171; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6LsYRp/Tbqx78yCSxYtu/4eHgx9ZiAm6cb+pYYvgz1s=; b=BsH+yC9H2sWgqu3uuKoeUAPvcjBV858LM8tdxl5uBz4YfQRuBrG2HZU8lqKG+wNka+YJNn hjVwAlkaW/aMZFymFhnwaejaxr/R/3Z8f0nyXHChCLh6OIdRMUzdDbPcVC4caZYYQLFDFj xVvkQZL2KKuJr8M9s/vabFhHxs8vUx4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-221-aSK38dycPjW6tp_5qnzF2A-1; Fri, 08 Mar 2024 18:06:08 -0500 X-MC-Unique: aSK38dycPjW6tp_5qnzF2A-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 960B2800264; Fri, 8 Mar 2024 23:06:07 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8D44436FF; Fri, 8 Mar 2024 23:06:06 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, stable@vger.kernel.org Subject: [PATCH v2 2/7] vfio/pci: Lock external INTx masking ops Date: Fri, 8 Mar 2024 16:05:23 -0700 Message-ID: <20240308230557.805580-3-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 Mask operations through config space changes to DisINTx may race INTx configuration changes via ioctl. Create wrappers that add locking for paths outside of the core interrupt code. In particular, irq_type is updated holding igate, therefore testing is_intx() requires holding igate. For example clearing DisINTx from config space can otherwise race changes of the interrupt configuration. This aligns interfaces which may trigger the INTx eventfd into two camps, one side serialized by igate and the other only enabled while INTx is configured. A subsequent patch introduces synchronization for the latter flows. Cc: stable@vger.kernel.org Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Reinette Chatre Reviewed-by: Kevin Tian Reviewed-by: Reinette Chatre Signed-off-by: Alex Williamson Reviewed-by: Eric Auger --- drivers/vfio/pci/vfio_pci_intrs.c | 34 +++++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 6 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 136101179fcb..75c85eec21b3 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -99,13 +99,15 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused) } /* Returns true if the INTx vfio_pci_irq_ctx.masked value is changed. */ -bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) +static bool __vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; unsigned long flags; bool masked_changed = false; + lockdep_assert_held(&vdev->igate); + spin_lock_irqsave(&vdev->irqlock, flags); /* @@ -143,6 +145,17 @@ bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) return masked_changed; } +bool vfio_pci_intx_mask(struct vfio_pci_core_device *vdev) +{ + bool mask_changed; + + mutex_lock(&vdev->igate); + mask_changed = __vfio_pci_intx_mask(vdev); + mutex_unlock(&vdev->igate); + + return mask_changed; +} + /* * If this is triggered by an eventfd, we can't call eventfd_signal * or else we'll deadlock on the eventfd wait queue. Return >0 when @@ -194,12 +207,21 @@ static int vfio_pci_intx_unmask_handler(void *opaque, void *unused) return ret; } -void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) +static void __vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) { + lockdep_assert_held(&vdev->igate); + if (vfio_pci_intx_unmask_handler(vdev, NULL) > 0) vfio_send_intx_eventfd(vdev, NULL); } +void vfio_pci_intx_unmask(struct vfio_pci_core_device *vdev) +{ + mutex_lock(&vdev->igate); + __vfio_pci_intx_unmask(vdev); + mutex_unlock(&vdev->igate); +} + static irqreturn_t vfio_intx_handler(int irq, void *dev_id) { struct vfio_pci_core_device *vdev = dev_id; @@ -563,11 +585,11 @@ static int vfio_pci_set_intx_unmask(struct vfio_pci_core_device *vdev, return -EINVAL; if (flags & VFIO_IRQ_SET_DATA_NONE) { - vfio_pci_intx_unmask(vdev); + __vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t unmask = *(uint8_t *)data; if (unmask) - vfio_pci_intx_unmask(vdev); + __vfio_pci_intx_unmask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { struct vfio_pci_irq_ctx *ctx = vfio_irq_ctx_get(vdev, 0); int32_t fd = *(int32_t *)data; @@ -594,11 +616,11 @@ static int vfio_pci_set_intx_mask(struct vfio_pci_core_device *vdev, return -EINVAL; if (flags & VFIO_IRQ_SET_DATA_NONE) { - vfio_pci_intx_mask(vdev); + __vfio_pci_intx_mask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { uint8_t mask = *(uint8_t *)data; if (mask) - vfio_pci_intx_mask(vdev); + __vfio_pci_intx_mask(vdev); } else if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { return -ENOTTY; /* XXX implement me */ } From patchwork Fri Mar 8 23:05:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587427 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C87FE3FB92 for ; Fri, 8 Mar 2024 23:06:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939175; cv=none; b=o9j3RPIy6T2ONL1A1BYu1oH1v9UzQ7FvbGYBbLzvI9wAzWVSnTVndatd/SALX+A6fmwLksJlD7B/52rgxH11mK5uvwHhDRUZqxUCgTgFi+CGzBUePdfwFTc6IZcwFHdoe7572GQ2cdFZNDlyrLgPPVfXukmXXCCLD82xAUA8rsM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939175; c=relaxed/simple; bh=k+R2VROY6vUH5PdEi8jgrtrsikXwjab0c1hYvgZZfOA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Yn/dtcPI0sDAXLhK1YI8qVy6xwPQdd6pHd/Zj3kgNY7pPhK9EBLcJWkQc2gXMJO/LfihM6Ben4Qj6gvyDzz8+vCLCCgmTUcPT9AN9cbCbc0HX4nb0mjyoYHRPGjBq89zJxQM3/ai4n2WCzIXygcoJjaZB22MKjX5zGt/8TZdvd4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=TVTXhvwy; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TVTXhvwy" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939172; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=eXn3Gg6N1f84hEIo4OgYVMjxq/BUaA1ykSSynr+HxW0=; b=TVTXhvwyBdK7MOP3GubiPjihk4R11TTKBhDTY1Fsp3FZEpZOeG6NFFw3o6fCAnfGegPqNV EtR3+CxRyb4DQH6PgJkW1wd9v515uKT/U3sWBg15Yqi3OCVFvB10fN/x3Pq4PritkKCMTf ph3UmvFZUk9U51hd0aeHRG2g0F4fAYY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-523-BEWV8FfbPm64jqF_xRnC2A-1; Fri, 08 Mar 2024 18:06:09 -0500 X-MC-Unique: BEWV8FfbPm64jqF_xRnC2A-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E230C86D4C1; Fri, 8 Mar 2024 23:06:08 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id E861547CB; Fri, 8 Mar 2024 23:06:07 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com Subject: [PATCH v2 3/7] vfio: Introduce interface to flush virqfd inject workqueue Date: Fri, 8 Mar 2024 16:05:24 -0700 Message-ID: <20240308230557.805580-4-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 In order to synchronize changes that can affect the thread callback, introduce an interface to force a flush of the inject workqueue. The irqfd pointer is only valid under spinlock, but the workqueue cannot be flushed under spinlock. Therefore the flush work for the irqfd is queued under spinlock. The vfio_irqfd_cleanup_wq workqueue is re-used for queuing this work such that flushing the workqueue is also ordered relative to shutdown. Reviewed-by: Kevin Tian Reviewed-by: Reinette Chatre Signed-off-by: Alex Williamson Reviewed-by: Eric Auger --- drivers/vfio/virqfd.c | 21 +++++++++++++++++++++ include/linux/vfio.h | 2 ++ 2 files changed, 23 insertions(+) diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c index 29c564b7a6e1..532269133801 100644 --- a/drivers/vfio/virqfd.c +++ b/drivers/vfio/virqfd.c @@ -101,6 +101,13 @@ static void virqfd_inject(struct work_struct *work) virqfd->thread(virqfd->opaque, virqfd->data); } +static void virqfd_flush_inject(struct work_struct *work) +{ + struct virqfd *virqfd = container_of(work, struct virqfd, flush_inject); + + flush_work(&virqfd->inject); +} + int vfio_virqfd_enable(void *opaque, int (*handler)(void *, void *), void (*thread)(void *, void *), @@ -124,6 +131,7 @@ int vfio_virqfd_enable(void *opaque, INIT_WORK(&virqfd->shutdown, virqfd_shutdown); INIT_WORK(&virqfd->inject, virqfd_inject); + INIT_WORK(&virqfd->flush_inject, virqfd_flush_inject); irqfd = fdget(fd); if (!irqfd.file) { @@ -213,3 +221,16 @@ void vfio_virqfd_disable(struct virqfd **pvirqfd) flush_workqueue(vfio_irqfd_cleanup_wq); } EXPORT_SYMBOL_GPL(vfio_virqfd_disable); + +void vfio_virqfd_flush_thread(struct virqfd **pvirqfd) +{ + unsigned long flags; + + spin_lock_irqsave(&virqfd_lock, flags); + if (*pvirqfd && (*pvirqfd)->thread) + queue_work(vfio_irqfd_cleanup_wq, &(*pvirqfd)->flush_inject); + spin_unlock_irqrestore(&virqfd_lock, flags); + + flush_workqueue(vfio_irqfd_cleanup_wq); +} +EXPORT_SYMBOL_GPL(vfio_virqfd_flush_thread); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 89b265bc6ec3..8b1a29820409 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -356,6 +356,7 @@ struct virqfd { wait_queue_entry_t wait; poll_table pt; struct work_struct shutdown; + struct work_struct flush_inject; struct virqfd **pvirqfd; }; @@ -363,5 +364,6 @@ int vfio_virqfd_enable(void *opaque, int (*handler)(void *, void *), void (*thread)(void *, void *), void *data, struct virqfd **pvirqfd, int fd); void vfio_virqfd_disable(struct virqfd **pvirqfd); +void vfio_virqfd_flush_thread(struct virqfd **pvirqfd); #endif /* VFIO_H */ From patchwork Fri Mar 8 23:05:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587429 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 346754BA88 for ; Fri, 8 Mar 2024 23:06:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939178; cv=none; b=G2AF7JRdxnvZg3kqeX2OVUOSycpFN/0cDF6x6TOo1p90LSCjxE6721oq7uM+6ZYgvnQzYE2Gi4Ekjyagsjk5/aC9wpkjboRccooAUUXsuGSkAJ8fpL2t0VD81asbzTk498HSEbMupcYXc51HybG4/AxogH6UFfRB4j2sFOc3X6w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939178; c=relaxed/simple; bh=8s1hywDNEEyjxGemEE8705xYNh1NyVGgBGOpM7NxcDs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dKWeQELXEKG1Rpf9vqAYGy8dXcQRTlSsIR1qi2N4L0PpHaK5Lz0hiIbKYmCe3x4u9VzD+Lj+JksziisZ49+GvP3j1FIor9Ezw+f7usLhYPl+KjijQYljDtf22OASH5vFhxIj0DJ6Y/oIgFIYs6ecCcWD03lH5MFPXqEFQll/3Rw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=OO5FZi6m; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="OO5FZi6m" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939175; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QEMDRUYAwwblqvmSVpxrM3ioP4wjTxARdoB/48mCjPw=; b=OO5FZi6m3CdCzmNUZXvnT4eNIu8IU5296dsnDhb8F/qJWiFogspWN4OQSHycwtB64DiFQp WXYHgUE5gP+gsCnQcGWhf/wC1eOnNZmLmajM4OKp+XtP3L0Qw/5cEH/06ht73Uu186G5nl 1NOI17Z+FTsGVjdDnbeXWKpybIOpz10= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-277-v834kelnOJiUox6BG9FYpQ-1; Fri, 08 Mar 2024 18:06:10 -0500 X-MC-Unique: v834kelnOJiUox6BG9FYpQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4A6BF3801F50; Fri, 8 Mar 2024 23:06:10 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 409ED47CF; Fri, 8 Mar 2024 23:06:08 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, stable@vger.kernel.org Subject: [PATCH v2 4/7] vfio/pci: Create persistent INTx handler Date: Fri, 8 Mar 2024 16:05:25 -0700 Message-ID: <20240308230557.805580-5-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 A vulnerability exists where the eventfd for INTx signaling can be deconfigured, which unregisters the IRQ handler but still allows eventfds to be signaled with a NULL context through the SET_IRQS ioctl or through unmask irqfd if the device interrupt is pending. Ideally this could be solved with some additional locking; the igate mutex serializes the ioctl and config space accesses, and the interrupt handler is unregistered relative to the trigger, but the irqfd path runs asynchronous to those. The igate mutex cannot be acquired from the atomic context of the eventfd wake function. Disabling the irqfd relative to the eventfd registration is potentially incompatible with existing userspace. As a result, the solution implemented here moves configuration of the INTx interrupt handler to track the lifetime of the INTx context object and irq_type configuration, rather than registration of a particular trigger eventfd. Synchronization is added between the ioctl path and eventfd_signal() wrapper such that the eventfd trigger can be dynamically updated relative to in-flight interrupts or irqfd callbacks. Cc: stable@vger.kernel.org Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Reinette Chatre Reviewed-by: Kevin Tian Reviewed-by: Reinette Chatre Signed-off-by: Alex Williamson Reviewed-by: Eric Auger --- drivers/vfio/pci/vfio_pci_intrs.c | 145 ++++++++++++++++-------------- 1 file changed, 78 insertions(+), 67 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c index 75c85eec21b3..fb5392b749ff 100644 --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -90,11 +90,15 @@ static void vfio_send_intx_eventfd(void *opaque, void *unused) if (likely(is_intx(vdev) && !vdev->virq_disabled)) { struct vfio_pci_irq_ctx *ctx; + struct eventfd_ctx *trigger; ctx = vfio_irq_ctx_get(vdev, 0); if (WARN_ON_ONCE(!ctx)) return; - eventfd_signal(ctx->trigger); + + trigger = READ_ONCE(ctx->trigger); + if (likely(trigger)) + eventfd_signal(trigger); } } @@ -253,100 +257,100 @@ static irqreturn_t vfio_intx_handler(int irq, void *dev_id) return ret; } -static int vfio_intx_enable(struct vfio_pci_core_device *vdev) +static int vfio_intx_enable(struct vfio_pci_core_device *vdev, + struct eventfd_ctx *trigger) { + struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; + unsigned long irqflags; + char *name; + int ret; if (!is_irq_none(vdev)) return -EINVAL; - if (!vdev->pdev->irq) + if (!pdev->irq) return -ENODEV; + name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", pci_name(pdev)); + if (!name) + return -ENOMEM; + ctx = vfio_irq_ctx_alloc(vdev, 0); if (!ctx) return -ENOMEM; + ctx->name = name; + ctx->trigger = trigger; + /* - * If the virtual interrupt is masked, restore it. Devices - * supporting DisINTx can be masked at the hardware level - * here, non-PCI-2.3 devices will have to wait until the - * interrupt is enabled. + * Fill the initial masked state based on virq_disabled. After + * enable, changing the DisINTx bit in vconfig directly changes INTx + * masking. igate prevents races during setup, once running masked + * is protected via irqlock. + * + * Devices supporting DisINTx also reflect the current mask state in + * the physical DisINTx bit, which is not affected during IRQ setup. + * + * Devices without DisINTx support require an exclusive interrupt. + * IRQ masking is performed at the IRQ chip. Again, igate protects + * against races during setup and IRQ handlers and irqfds are not + * yet active, therefore masked is stable and can be used to + * conditionally auto-enable the IRQ. + * + * irq_type must be stable while the IRQ handler is registered, + * therefore it must be set before request_irq(). */ ctx->masked = vdev->virq_disabled; - if (vdev->pci_2_3) - pci_intx(vdev->pdev, !ctx->masked); + if (vdev->pci_2_3) { + pci_intx(pdev, !ctx->masked); + irqflags = IRQF_SHARED; + } else { + irqflags = ctx->masked ? IRQF_NO_AUTOEN : 0; + } vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX; + ret = request_irq(pdev->irq, vfio_intx_handler, + irqflags, ctx->name, vdev); + if (ret) { + vdev->irq_type = VFIO_PCI_NUM_IRQS; + kfree(name); + vfio_irq_ctx_free(vdev, ctx, 0); + return ret; + } + return 0; } -static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, int fd) +static int vfio_intx_set_signal(struct vfio_pci_core_device *vdev, + struct eventfd_ctx *trigger) { struct pci_dev *pdev = vdev->pdev; - unsigned long irqflags = IRQF_SHARED; struct vfio_pci_irq_ctx *ctx; - struct eventfd_ctx *trigger; - unsigned long flags; - int ret; + struct eventfd_ctx *old; ctx = vfio_irq_ctx_get(vdev, 0); if (WARN_ON_ONCE(!ctx)) return -EINVAL; - if (ctx->trigger) { - free_irq(pdev->irq, vdev); - kfree(ctx->name); - eventfd_ctx_put(ctx->trigger); - ctx->trigger = NULL; - } - - if (fd < 0) /* Disable only */ - return 0; - - ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-intx(%s)", - pci_name(pdev)); - if (!ctx->name) - return -ENOMEM; - - trigger = eventfd_ctx_fdget(fd); - if (IS_ERR(trigger)) { - kfree(ctx->name); - return PTR_ERR(trigger); - } + old = ctx->trigger; - ctx->trigger = trigger; + WRITE_ONCE(ctx->trigger, trigger); - /* - * Devices without DisINTx support require an exclusive interrupt, - * IRQ masking is performed at the IRQ chip. The masked status is - * protected by vdev->irqlock. Setup the IRQ without auto-enable and - * unmask as necessary below under lock. DisINTx is unmodified by - * the IRQ configuration and may therefore use auto-enable. - */ - if (!vdev->pci_2_3) - irqflags = IRQF_NO_AUTOEN; - - ret = request_irq(pdev->irq, vfio_intx_handler, - irqflags, ctx->name, vdev); - if (ret) { - ctx->trigger = NULL; - kfree(ctx->name); - eventfd_ctx_put(trigger); - return ret; + /* Releasing an old ctx requires synchronizing in-flight users */ + if (old) { + synchronize_irq(pdev->irq); + vfio_virqfd_flush_thread(&ctx->unmask); + eventfd_ctx_put(old); } - spin_lock_irqsave(&vdev->irqlock, flags); - if (!vdev->pci_2_3 && !ctx->masked) - enable_irq(pdev->irq); - spin_unlock_irqrestore(&vdev->irqlock, flags); - return 0; } static void vfio_intx_disable(struct vfio_pci_core_device *vdev) { + struct pci_dev *pdev = vdev->pdev; struct vfio_pci_irq_ctx *ctx; ctx = vfio_irq_ctx_get(vdev, 0); @@ -354,10 +358,13 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev) if (ctx) { vfio_virqfd_disable(&ctx->unmask); vfio_virqfd_disable(&ctx->mask); + free_irq(pdev->irq, vdev); + if (ctx->trigger) + eventfd_ctx_put(ctx->trigger); + kfree(ctx->name); + vfio_irq_ctx_free(vdev, ctx, 0); } - vfio_intx_set_signal(vdev, -1); vdev->irq_type = VFIO_PCI_NUM_IRQS; - vfio_irq_ctx_free(vdev, ctx, 0); } /* @@ -641,19 +648,23 @@ static int vfio_pci_set_intx_trigger(struct vfio_pci_core_device *vdev, return -EINVAL; if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { + struct eventfd_ctx *trigger = NULL; int32_t fd = *(int32_t *)data; int ret; - if (is_intx(vdev)) - return vfio_intx_set_signal(vdev, fd); + if (fd >= 0) { + trigger = eventfd_ctx_fdget(fd); + if (IS_ERR(trigger)) + return PTR_ERR(trigger); + } - ret = vfio_intx_enable(vdev); - if (ret) - return ret; + if (is_intx(vdev)) + ret = vfio_intx_set_signal(vdev, trigger); + else + ret = vfio_intx_enable(vdev, trigger); - ret = vfio_intx_set_signal(vdev, fd); - if (ret) - vfio_intx_disable(vdev); + if (ret && trigger) + eventfd_ctx_put(trigger); return ret; } From patchwork Fri Mar 8 23:05:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587430 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 88F424CB24 for ; Fri, 8 Mar 2024 23:06:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939178; cv=none; b=o/Q2YQsloAA4grKZd8033/rPK0COr3UvjdWU830gK22eTm2m+/cgptRYNauqoLfVgSkVhe9lW+vJu2G3s85F1c2dQ/g01nyo9TtX25TYNo0057PAC4PiPEE0VERd/u/v6esirMUQ4xJfxumYoSkB14PnCF90/qxOxpCnNd6k0Lk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939178; c=relaxed/simple; bh=xe/gkZ/Xvg/uXlto0Dtj8eqOvEjluTFaDYjfd6lHxrA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q5WnDs/Y3YpYJcVpVahkhUGGlUM8AkLbL1WygDqG0oZxoKYvcENY3L/FHNOz1GdyBCR+/+U6qrJu+F137+fNRN3Xc63t64WdAJsrOcTGzYuh4p6a++SHT30sLQHy/E5dV2eaYO4vTUGcQifYeVd/NoMO0gaF2wVOKnvntiEstnM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=LDwwbYJl; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="LDwwbYJl" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939175; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1dEKfErNXa51Ijqh6RZq2oy2pBONTNUJTZpcaMjJ9JU=; b=LDwwbYJl23kPdgPzkwQZx80azGl4pzvjIk2jKMPCxmKsE819Dp85PAbY0YJvbJn9BzyU4d jFU/qVLjJtQHOq7YxwHeMCgDHhpIGDQmLn4Hz/nSBp5sUk2xhbtZTP1ARIe9zIknDHCZSu ANsj9taBiDPEO5CK3ec/vcuI7OPDckI= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-295-AtEUu7PaO76Jg_z-aNWO4g-1; Fri, 08 Mar 2024 18:06:12 -0500 X-MC-Unique: AtEUu7PaO76Jg_z-aNWO4g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A37EB3801F50; Fri, 8 Mar 2024 23:06:11 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9B69F47CB; Fri, 8 Mar 2024 23:06:10 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, stable@vger.kernel.org Subject: [PATCH v2 5/7] vfio/platform: Disable virqfds on cleanup Date: Fri, 8 Mar 2024 16:05:26 -0700 Message-ID: <20240308230557.805580-6-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 irqfds for mask and unmask that are not specifically disabled by the user are leaked. Remove any irqfds during cleanup Cc: Eric Auger Cc: stable@vger.kernel.org Fixes: a7fa7c77cf15 ("vfio/platform: implement IRQ masking/unmasking via an eventfd") Signed-off-by: Alex Williamson Reviewed-by: Kevin Tian Reviewed-by: Eric Auger --- drivers/vfio/platform/vfio_platform_irq.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index 61a1bfb68ac7..e5dcada9e86c 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -321,8 +321,11 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) { int i; - for (i = 0; i < vdev->num_irqs; i++) + for (i = 0; i < vdev->num_irqs; i++) { + vfio_virqfd_disable(&vdev->irqs[i].mask); + vfio_virqfd_disable(&vdev->irqs[i].unmask); vfio_set_trigger(vdev, i, -1, NULL); + } vdev->num_irqs = 0; kfree(vdev->irqs); From patchwork Fri Mar 8 23:05:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587431 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AAF7A4D9EC for ; Fri, 8 Mar 2024 23:06:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939179; cv=none; b=Fn6i/dlnM58WPiMlWRGmTQlPOU3yj55W9nSvZ+FYw5jX7jHy/lOSef/xMa5HMkZt1ycPXRPlfGEb9S4ci650W5n1p7zd0BTFYwnRcuBfcAJ+8lT1hiKVrlFZ2KkDC+TeWUx4C0A4qNY3KnMmxvzBZLTzTnXHlV1xFN8nfqgJImQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939179; c=relaxed/simple; bh=IgRfhOHk13Mv0ASTofy/83LFTItxFaSl3Db+woVfbPI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bVMQGbqg56oEeFlTImAyQpFtKnD2YN1+1uAQrSKuhT/+bS/Y3StAkpx+XTTl+BdIdQs4DQK8wM7BTnFWTZ5dpPMEdA9H5wR95pc/50ihH1Gx9PqYO66tnc33ScQqJZ5815zM/rlKhV/V6tphcyI5fObR68TsJh3LeXz0jxZrK8I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=EbX37EPi; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EbX37EPi" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939176; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T30nVL2WzB2fz1DAo6eXAKzxiL2l64osi6FTP+vjx0w=; b=EbX37EPi2qQha1jBrdoKRPobN0D4d9CnRQyRb8yviEwqZkSOzQVOWFlPn7eYj/7iUA5X2b sw6StHXli98AL0Q2JPk6nGhBYfLl4IVWKkn2YAKiNpJ0kCVUngKhL4CQ/K+A///5fY9/ga uKHQXkJVjKxupD4wPCb6vI84Qfvsd98= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-648-VsIc0bXfMm6eAF6PUsRH1A-1; Fri, 08 Mar 2024 18:06:13 -0500 X-MC-Unique: VsIc0bXfMm6eAF6PUsRH1A-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0ABFB29AA387; Fri, 8 Mar 2024 23:06:13 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0130347CF; Fri, 8 Mar 2024 23:06:11 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, stable@vger.kernel.org Subject: [PATCH v2 6/7] vfio/platform: Create persistent IRQ handlers Date: Fri, 8 Mar 2024 16:05:27 -0700 Message-ID: <20240308230557.805580-7-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 The vfio-platform SET_IRQS ioctl currently allows loopback triggering of an interrupt before a signaling eventfd has been configured by the user, which thereby allows a NULL pointer dereference. Rather than register the IRQ relative to a valid trigger, register all IRQs in a disabled state in the device open path. This allows mask operations on the IRQ to nest within the overall enable state governed by a valid eventfd signal. This decouples @masked, protected by the @locked spinlock from @trigger, protected via the @igate mutex. In doing so, it's guaranteed that changes to @trigger cannot race the IRQ handlers because the IRQ handler is synchronously disabled before modifying the trigger, and loopback triggering of the IRQ via ioctl is safe due to serialization with trigger changes via igate. For compatibility, request_irq() failures are maintained to be local to the SET_IRQS ioctl rather than a fatal error in the open device path. This allows, for example, a userspace driver with polling mode support to continue to work regardless of moving the request_irq() call site. This necessarily blocks all SET_IRQS access to the failed index. Cc: Eric Auger Cc: stable@vger.kernel.org Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd") Signed-off-by: Alex Williamson Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Tested-by: Eric Auger --- drivers/vfio/platform/vfio_platform_irq.c | 100 +++++++++++++++------- 1 file changed, 68 insertions(+), 32 deletions(-) diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c index e5dcada9e86c..ef41ecef83af 100644 --- a/drivers/vfio/platform/vfio_platform_irq.c +++ b/drivers/vfio/platform/vfio_platform_irq.c @@ -136,6 +136,16 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev, return 0; } +/* + * The trigger eventfd is guaranteed valid in the interrupt path + * and protected by the igate mutex when triggered via ioctl. + */ +static void vfio_send_eventfd(struct vfio_platform_irq *irq_ctx) +{ + if (likely(irq_ctx->trigger)) + eventfd_signal(irq_ctx->trigger); +} + static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; @@ -155,7 +165,7 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) spin_unlock_irqrestore(&irq_ctx->lock, flags); if (ret == IRQ_HANDLED) - eventfd_signal(irq_ctx->trigger); + vfio_send_eventfd(irq_ctx); return ret; } @@ -164,52 +174,40 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id) { struct vfio_platform_irq *irq_ctx = dev_id; - eventfd_signal(irq_ctx->trigger); + vfio_send_eventfd(irq_ctx); return IRQ_HANDLED; } static int vfio_set_trigger(struct vfio_platform_device *vdev, int index, - int fd, irq_handler_t handler) + int fd) { struct vfio_platform_irq *irq = &vdev->irqs[index]; struct eventfd_ctx *trigger; - int ret; if (irq->trigger) { - irq_clear_status_flags(irq->hwirq, IRQ_NOAUTOEN); - free_irq(irq->hwirq, irq); - kfree(irq->name); + disable_irq(irq->hwirq); eventfd_ctx_put(irq->trigger); irq->trigger = NULL; } if (fd < 0) /* Disable only */ return 0; - irq->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-irq[%d](%s)", - irq->hwirq, vdev->name); - if (!irq->name) - return -ENOMEM; trigger = eventfd_ctx_fdget(fd); - if (IS_ERR(trigger)) { - kfree(irq->name); + if (IS_ERR(trigger)) return PTR_ERR(trigger); - } irq->trigger = trigger; - irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN); - ret = request_irq(irq->hwirq, handler, 0, irq->name, irq); - if (ret) { - kfree(irq->name); - eventfd_ctx_put(trigger); - irq->trigger = NULL; - return ret; - } - - if (!irq->masked) - enable_irq(irq->hwirq); + /* + * irq->masked effectively provides nested disables within the overall + * enable relative to trigger. Specifically request_irq() is called + * with NO_AUTOEN, therefore the IRQ is initially disabled. The user + * may only further disable the IRQ with a MASK operations because + * irq->masked is initially false. + */ + enable_irq(irq->hwirq); return 0; } @@ -228,7 +226,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, handler = vfio_irq_handler; if (!count && (flags & VFIO_IRQ_SET_DATA_NONE)) - return vfio_set_trigger(vdev, index, -1, handler); + return vfio_set_trigger(vdev, index, -1); if (start != 0 || count != 1) return -EINVAL; @@ -236,7 +234,7 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev, if (flags & VFIO_IRQ_SET_DATA_EVENTFD) { int32_t fd = *(int32_t *)data; - return vfio_set_trigger(vdev, index, fd, handler); + return vfio_set_trigger(vdev, index, fd); } if (flags & VFIO_IRQ_SET_DATA_NONE) { @@ -260,6 +258,14 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev, unsigned start, unsigned count, uint32_t flags, void *data) = NULL; + /* + * For compatibility, errors from request_irq() are local to the + * SET_IRQS path and reflected in the name pointer. This allows, + * for example, polling mode fallback for an exclusive IRQ failure. + */ + if (IS_ERR(vdev->irqs[index].name)) + return PTR_ERR(vdev->irqs[index].name); + switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) { case VFIO_IRQ_SET_ACTION_MASK: func = vfio_platform_set_irq_mask; @@ -280,7 +286,7 @@ int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev, int vfio_platform_irq_init(struct vfio_platform_device *vdev) { - int cnt = 0, i; + int cnt = 0, i, ret = 0; while (vdev->get_irq(vdev, cnt) >= 0) cnt++; @@ -292,29 +298,54 @@ int vfio_platform_irq_init(struct vfio_platform_device *vdev) for (i = 0; i < cnt; i++) { int hwirq = vdev->get_irq(vdev, i); + irq_handler_t handler = vfio_irq_handler; - if (hwirq < 0) + if (hwirq < 0) { + ret = -EINVAL; goto err; + } spin_lock_init(&vdev->irqs[i].lock); vdev->irqs[i].flags = VFIO_IRQ_INFO_EVENTFD; - if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) + if (irq_get_trigger_type(hwirq) & IRQ_TYPE_LEVEL_MASK) { vdev->irqs[i].flags |= VFIO_IRQ_INFO_MASKABLE | VFIO_IRQ_INFO_AUTOMASKED; + handler = vfio_automasked_irq_handler; + } vdev->irqs[i].count = 1; vdev->irqs[i].hwirq = hwirq; vdev->irqs[i].masked = false; + vdev->irqs[i].name = kasprintf(GFP_KERNEL_ACCOUNT, + "vfio-irq[%d](%s)", hwirq, + vdev->name); + if (!vdev->irqs[i].name) { + ret = -ENOMEM; + goto err; + } + + ret = request_irq(hwirq, handler, IRQF_NO_AUTOEN, + vdev->irqs[i].name, &vdev->irqs[i]); + if (ret) { + kfree(vdev->irqs[i].name); + vdev->irqs[i].name = ERR_PTR(ret); + } } vdev->num_irqs = cnt; return 0; err: + for (--i; i >= 0; i--) { + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + kfree(vdev->irqs[i].name); + } + } kfree(vdev->irqs); - return -EINVAL; + return ret; } void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) @@ -324,7 +355,12 @@ void vfio_platform_irq_cleanup(struct vfio_platform_device *vdev) for (i = 0; i < vdev->num_irqs; i++) { vfio_virqfd_disable(&vdev->irqs[i].mask); vfio_virqfd_disable(&vdev->irqs[i].unmask); - vfio_set_trigger(vdev, i, -1, NULL); + if (!IS_ERR(vdev->irqs[i].name)) { + free_irq(vdev->irqs[i].hwirq, &vdev->irqs[i]); + if (vdev->irqs[i].trigger) + eventfd_ctx_put(vdev->irqs[i].trigger); + kfree(vdev->irqs[i].name); + } } vdev->num_irqs = 0; From patchwork Fri Mar 8 23:05:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 13587432 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1D87054775 for ; Fri, 8 Mar 2024 23:06:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939180; cv=none; b=LG+JSnYj4oSKqw2lIgX3kXqI34+gP5XFE2+LgkQOcPfb9Xr6RWr8qSY74vjjJhyYq5eol1JIolsVovvVkq1OXJiUFYU6HWNRlLEcMIihpr4Rw1+mE4AjU1NfT6wU/Fjxtds8I6unxzNujg6D9wpUAQjcoJ72pkPb+nm/T2zAlE4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709939180; c=relaxed/simple; bh=xI6bQYs7D7L/1GayJOzEXbmk72qf1CYrvd8eI1QF5Gs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UwQGYsmkIqClpOkTOCAufpLnmKRVZX1ESUVUJsnCYKjFL9Z2UoSXqrackwX8bjjysygk0kP/P4OG8gsNo0gxuXxn2iKxD50Rw9LGntt+MiuLjJtChMc3pZL/KLE4d0kRR0OJ57GchbTxOSRINZNOeZHJQrA44WYPnSRewYhita8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=hd8ZV/ak; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="hd8ZV/ak" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1709939178; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=N8jD+utIUIWsnb+QHB/Vd0dznqOHt9b4raUNDQiHScE=; b=hd8ZV/akANlZ5jRea7Y/r0DwDAQ5Y/fwXs+OaxArfPZRyfGf8DffQCGshvNthSV1jBfOnD 6OsMveef3rxE38oXYa4pXmljuNmZnXcHQLXavvMR8ULU+wJxfz2J4pThHIekFO2U0i1XKv NFIAwwZLozPDezz1ulV5VnefTfexQUI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-475-YgeSMQxMPHK209lCwn6AMQ-1; Fri, 08 Mar 2024 18:06:14 -0500 X-MC-Unique: YgeSMQxMPHK209lCwn6AMQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 66973185A781; Fri, 8 Mar 2024 23:06:14 +0000 (UTC) Received: from omen.home.shazbot.org (unknown [10.22.8.4]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5C8CD47CB; Fri, 8 Mar 2024 23:06:13 +0000 (UTC) From: Alex Williamson To: alex.williamson@redhat.com Cc: kvm@vger.kernel.org, eric.auger@redhat.com, clg@redhat.com, reinette.chatre@intel.com, linux-kernel@vger.kernel.org, kevin.tian@intel.com, diana.craciun@oss.nxp.com, stable@vger.kernel.org Subject: [PATCH v2 7/7] vfio/fsl-mc: Block calling interrupt handler without trigger Date: Fri, 8 Mar 2024 16:05:28 -0700 Message-ID: <20240308230557.805580-8-alex.williamson@redhat.com> In-Reply-To: <20240308230557.805580-1-alex.williamson@redhat.com> References: <20240308230557.805580-1-alex.williamson@redhat.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 The eventfd_ctx trigger pointer of the vfio_fsl_mc_irq object is initially NULL and may become NULL if the user sets the trigger eventfd to -1. The interrupt handler itself is guaranteed that trigger is always valid between request_irq() and free_irq(), but the loopback testing mechanisms to invoke the handler function need to test the trigger. The triggering and setting ioctl paths both make use of igate and are therefore mutually exclusive. The vfio-fsl-mc driver does not make use of irqfds, nor does it support any sort of masking operations, therefore unlike vfio-pci and vfio-platform, the flow can remain essentially unchanged. Cc: Diana Craciun Cc: stable@vger.kernel.org Fixes: cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd") Signed-off-by: Alex Williamson Reviewed-by: Kevin Tian Reviewed-by: Eric Auger --- drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c b/drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c index d62fbfff20b8..82b2afa9b7e3 100644 --- a/drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c @@ -141,13 +141,14 @@ static int vfio_fsl_mc_set_irq_trigger(struct vfio_fsl_mc_device *vdev, irq = &vdev->mc_irqs[index]; if (flags & VFIO_IRQ_SET_DATA_NONE) { - vfio_fsl_mc_irq_handler(hwirq, irq); + if (irq->trigger) + eventfd_signal(irq->trigger); } else if (flags & VFIO_IRQ_SET_DATA_BOOL) { u8 trigger = *(u8 *)data; - if (trigger) - vfio_fsl_mc_irq_handler(hwirq, irq); + if (trigger && irq->trigger) + eventfd_signal(irq->trigger); } return 0;