From patchwork Sat Nov 5 22:44:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony DeRossi
X-Patchwork-Id: 13033282
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by smtp.lore.kernel.org (Postfix) with ESMTP id 49C37C4332F
    for ; Sat, 5 Nov 2022 22:49:43 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S230001AbiKEWtl (ORCPT );
    Sat, 5 Nov 2022 18:49:41 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51832 "EHLO
    lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S229479AbiKEWtk (ORCPT );
    Sat, 5 Nov 2022 18:49:40 -0400
Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634])
    by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BD380EE21
    for ; Sat, 5 Nov 2022 15:49:39 -0700 (PDT)
Received: by mail-pl1-x634.google.com with SMTP id y4so7999243plb.2
    for ; Sat, 05 Nov 2022 15:49:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=20210112;
    h=content-transfer-encoding:mime-version:references:in-reply-to
     :message-id:date:subject:cc:to:from:from:to:cc:subject:date
     :message-id:reply-to;
    bh=e5SjKpl5wQpBKJaujaO+g62YnASncJMmHCTnqeXvOsQ=;
    b=VFH6RS+O3lhv0kawwFy4GqXV8uGlb/yNBb1JaE+x8ghjRBzS1ugbBZm3CTqzAic4h2
     uCjk2OWTFHAITNd2ZHoFR/zIzjybtgkEe/OEuOZfMKe0QzWuWQPUWCRCj6CiycnBCn2G
     k/tXvI23ZIMZ29XRRYS9gGcFuLdyYwxoQQEUwb1q8n5m7Ix6XNxn0NAHBUcgUhdL7oR5
     /0EApsDhsgf7C245qF0QFYJCNQkUx1tZ8tXRHvEUxoOrIwU8PQ6tzNm9jDYhri8V56Yx
     B0dz/gFBNP8n8ti6H74/flnh8Es3qnuZMKU5WzGULsGZMnQ5bj7t9AQJ8PqZ4BcZZoje
     Np8Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=1e100.net; s=20210112;
    h=content-transfer-encoding:mime-version:references:in-reply-to
     :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
     :subject:date:message-id:reply-to;
    bh=e5SjKpl5wQpBKJaujaO+g62YnASncJMmHCTnqeXvOsQ=;
    b=LIu/y1qDhXVG5DyupUl1C1bYWUuG16flqsFUD9OS5dc7SVqMdnAIM1BADPbdBhlrWs
     oiHx8xJVmhGVDMIa7xiyuaOkHnM3XnAgQ0BCv6PnMW6lBh+2UFZb8jZr1CiSmz7tsmOW
     RuVEhDrWWHmgLW6e3555qs6xkL/SUCW7BTt6j2IYMe75KP1AkUPFLKEPZNTxmeM4hhSS
     By0azsnuiFVRMABESdIqCWFh06ts2mroYvDoxhgc0n4tKoYiF6Q+9AU938jcmDRYK+TX
     xSh6+eYZUY0awWQuuf/ivXfddrs5ONkRNrTETNsZL9OjN6OQj5UPCF/z0q9UEjitf4V8
     qp+g==
X-Gm-Message-State: ACrzQf1u75hAulUsIwkUt0eGpp6rMdnSuRLHkyp0kQGWn/BS3CtGeQpQ
    gqumckfVjAgAw0JzmTjNsmgPuFBzAOpwsQ==
X-Google-Smtp-Source: AMsMyM6bB2lywVy13ko/5gWhZIRg+2nHiO19flXRewLtimzUgdyDe/aw/3YB1qU9thY0Nqfo9mzTqw==
X-Received: by 2002:a17:90b:4c8a:b0:214:2ed8:6501 with SMTP id
    my10-20020a17090b4c8a00b002142ed86501mr21276436pjb.70.1667688579014;
    Sat, 05 Nov 2022 15:49:39 -0700 (PDT)
Received: from crazyhorse.local ([174.127.229.57])
    by smtp.googlemail.com with ESMTPSA id
    rj14-20020a17090b3e8e00b001fde655225fsm14716728pjb.2.2022.11.05.15.49.38
    (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
    Sat, 05 Nov 2022 15:49:38 -0700 (PDT)
From: Anthony DeRossi
To: kvm@vger.kernel.org
Cc: alex.williamson@redhat.com, cohuck@redhat.com, jgg@ziepe.ca,
    kevin.tian@intel.com, abhsahu@nvidia.com, yishaih@nvidia.com
Subject: [PATCH v5 1/3] vfio: Fix container device registration life cycle
Date: Sat, 5 Nov 2022 15:44:56 -0700
Message-Id: <20221105224458.8180-2-ajderossi@gmail.com>
X-Mailer: git-send-email 2.37.4
In-Reply-To: <20221105224458.8180-1-ajderossi@gmail.com>
References: <20221105224458.8180-1-ajderossi@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

In vfio_device_open(), vfio_device_container_register() is always called
when open_count == 1. On error, vfio_device_container_unregister() is
only called when open_count == 1 and close_device is set. This leaks a
registration for devices without a close_device implementation.

In vfio_device_fops_release(), vfio_device_container_unregister() is
called unconditionally. This can cause a device to be unregistered
multiple times.

Treating container device registration/unregistration uniformly (always
when open_count == 1) fixes both issues.

Fixes: ce4b4657ff18 ("vfio: Replace the DMA unmapping notifier with a callback")
Signed-off-by: Anthony DeRossi
Reviewed-by: Jason Gunthorpe
Reviewed-by: Kevin Tian
---
 drivers/vfio/vfio_main.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 2d168793d4e1..9a4af880e941 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -801,8 +801,9 @@ static struct file *vfio_device_open(struct vfio_device *device)
 err_close_device:
 	mutex_lock(&device->dev_set->lock);
 	mutex_lock(&device->group->group_lock);
-	if (device->open_count == 1 && device->ops->close_device) {
-		device->ops->close_device(device);
+	if (device->open_count == 1) {
+		if (device->ops->close_device)
+			device->ops->close_device(device);
 
 		vfio_device_container_unregister(device);
 	}
@@ -1017,10 +1018,12 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
 	mutex_lock(&device->dev_set->lock);
 	vfio_assert_device_open(device);
 	mutex_lock(&device->group->group_lock);
-	if (device->open_count == 1 && device->ops->close_device)
-		device->ops->close_device(device);
+	if (device->open_count == 1) {
+		if (device->ops->close_device)
+			device->ops->close_device(device);
 
-	vfio_device_container_unregister(device);
+		vfio_device_container_unregister(device);
+	}
 	mutex_unlock(&device->group->group_lock);
 	device->open_count--;
 	if (device->open_count == 0)

From patchwork Sat Nov 5 22:44:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony DeRossi
X-Patchwork-Id: 13033283
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by smtp.lore.kernel.org (Postfix) with ESMTP id 6C60CC4332F
    for ; Sat, 5 Nov 2022 22:49:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S230054AbiKEWtq (ORCPT );
    Sat, 5 Nov 2022 18:49:46 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51854 "EHLO
    lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S229479AbiKEWto (ORCPT );
    Sat, 5 Nov 2022 18:49:44 -0400
Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c])
    by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 28F5612A8C
    for ; Sat, 5 Nov 2022 15:49:44 -0700 (PDT)
Received: by mail-pf1-x42c.google.com with SMTP id y13so7501987pfp.7
    for ; Sat, 05 Nov 2022 15:49:44 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=20210112;
    h=content-transfer-encoding:mime-version:references:in-reply-to
     :message-id:date:subject:cc:to:from:from:to:cc:subject:date
     :message-id:reply-to;
    bh=ajNJxEoZIzXXTPYQjvuC6wM/hhZCYCl7Jf/OFsj6B8Y=;
    b=YaQPUZ6YWQd8jcp7M5FuLZG7oyfaMhTw/I2rRVGZB+gIPyDqjgrULFqK5p7yTnPtnr
     OkPyGZfOEWHDK0AjGuNMc1IQogAwBkPfX1Lb9yZ6AYg9R3cqyFkBQK3QVuhF61eLVApO
     5Xkw1zoA66vED0KLtQjNkOlzFdhfQfAYHMjtVrwAS/IAJA+McBCi/MVy8LWPLvB1RetO
     8jBaQWbvQs9s7ObWs0/3U9KYur3fOngC8T1E52UWrBWxP9hk7pZJLE+SbIEnBZHBGkBx
     3iyV27QtrYJRlhgkJgcS3JSZ9AGcOS4e5VcFSNQEgN1itqE5O5j2EDYUcwjQthb1sZBN
     HNKw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=1e100.net; s=20210112;
    h=content-transfer-encoding:mime-version:references:in-reply-to
     :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
     :subject:date:message-id:reply-to;
    bh=ajNJxEoZIzXXTPYQjvuC6wM/hhZCYCl7Jf/OFsj6B8Y=;
    b=64XQ3D4e4u9bfnB0n8v+4kRKDlsbAq3KakVKFSHeYC5mYMacpuBneCW9+S1n8YsfZv
     4aKb6eE5Qd+zMnT9pyulRrC3hc4yt9eNd9LIVHah+a2NQQCvpjBGag9tjgqmoQ2cFtoQ
     T+4EjcKeeSlPZvQL6N43bzY2xHeou3Cn60xy+ezq+93HpF+0YYCffsWZay1/kXogZnZH
     JnSHayzCKVCZAAuydRrp7oCvHtqCIryZihnEe8+fNoxNdbKDNV/JIlol/xjG2BsbIaln
     BqS3Xio7dP0VBKnayXl5zwVC7Ujz827UdItDK3rK4/rs7r/YbbxFFuHgTGd+Es+lnHjG
     NxAQ==
X-Gm-Message-State: ACrzQf0xQyhsKlzJfLEiX3Q8okiikOEYoI4dzKilOpvzYgg4nL+3uExF
    NHKzlFsQA/SAUffwcQYNdtmO0JJTlFEQXA==
X-Google-Smtp-Source: AMsMyM6Dg9mghp3xhzDo45vpeQmTilaoqsi3S8jbTzN2MMbwtve93rop3NVbkDQZCBKv9jYpC5kJ3Q==
X-Received: by 2002:a62:874f:0:b0:56c:45eb:1ffa with SMTP id
    i76-20020a62874f000000b0056c45eb1ffamr43206784pfe.58.1667688581812;
    Sat, 05 Nov 2022 15:49:41 -0700 (PDT)
Received: from crazyhorse.local ([174.127.229.57])
    by smtp.googlemail.com with ESMTPSA id
    rj14-20020a17090b3e8e00b001fde655225fsm14716728pjb.2.2022.11.05.15.49.41
    (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
    Sat, 05 Nov 2022 15:49:41 -0700 (PDT)
From: Anthony DeRossi
To: kvm@vger.kernel.org
Cc: alex.williamson@redhat.com, cohuck@redhat.com, jgg@ziepe.ca,
    kevin.tian@intel.com, abhsahu@nvidia.com, yishaih@nvidia.com
Subject: [PATCH v5 2/3] vfio: Export the device set open count
Date: Sat, 5 Nov 2022 15:44:57 -0700
Message-Id: <20221105224458.8180-3-ajderossi@gmail.com>
X-Mailer: git-send-email 2.37.4
In-Reply-To: <20221105224458.8180-1-ajderossi@gmail.com>
References: <20221105224458.8180-1-ajderossi@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

The open count of a device set is the sum of the open counts of all
devices in the set. Drivers can use this value to determine whether
shared resources are in use without tracking them manually or accessing
the private open_count in vfio_device.

Signed-off-by: Anthony DeRossi
Reviewed-by: Jason Gunthorpe
Reviewed-by: Kevin Tian
---
 drivers/vfio/vfio_main.c | 11 +++++++++++
 include/linux/vfio.h     |  1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 9a4af880e941..ab34faabcebb 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -125,6 +125,17 @@ static void vfio_release_device_set(struct vfio_device *device)
 	xa_unlock(&vfio_device_set_xa);
 }
 
+unsigned int vfio_device_set_open_count(struct vfio_device_set *dev_set)
+{
+	struct vfio_device *cur;
+	unsigned int open_count = 0;
+
+	list_for_each_entry(cur, &dev_set->device_list, dev_set_list)
+		open_count += cur->open_count;
+	return open_count;
+}
+EXPORT_SYMBOL_GPL(vfio_device_set_open_count);
+
 /*
  * Group objects - create, release, get, put, search
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e7cebeb875dd..fdd393f70b19 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -189,6 +189,7 @@ int vfio_register_emulated_iommu_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 
 int vfio_assign_device_set(struct vfio_device *device, void *set_id);
+unsigned int vfio_device_set_open_count(struct vfio_device_set *dev_set);
 
 int vfio_mig_get_next_state(struct vfio_device *device,
 			    enum vfio_device_mig_state cur_fsm,

From patchwork Sat Nov 5 22:44:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony DeRossi
X-Patchwork-Id: 13033284
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by smtp.lore.kernel.org (Postfix) with ESMTP id 9974EC43217
    for ; Sat, 5 Nov 2022 22:49:49 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S230074AbiKEWts (ORCPT );
    Sat, 5 Nov 2022 18:49:48 -0400
Received:
    from lindbergh.monkeyblade.net ([23.128.96.19]:51892 "EHLO
    lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S230057AbiKEWtr (ORCPT );
    Sat, 5 Nov 2022 18:49:47 -0400
Received: from mail-pg1-x529.google.com (mail-pg1-x529.google.com [IPv6:2607:f8b0:4864:20::529])
    by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D53313DC7
    for ; Sat, 5 Nov 2022 15:49:46 -0700 (PDT)
Received: by mail-pg1-x529.google.com with SMTP id 78so7337814pgb.13
    for ; Sat, 05 Nov 2022 15:49:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=20210112;
    h=content-transfer-encoding:mime-version:references:in-reply-to
     :message-id:date:subject:cc:to:from:from:to:cc:subject:date
     :message-id:reply-to;
    bh=h6pDOE1RG9rRgZEeL9h0FSsm2VZYbQptR492yBEKa5A=;
    b=gdffMLuQrslOcBUorSS3t2Sx6Dg0CXDe56LBqaJB52w3Y/DjH2qmbOERB7Kc6TyRMn
     5UmHQoochRiQzSX43I4vxo6UqJfAvgTWlQuZc67WS+4B2hKe7V7XIasu1hS7kmNtWSpw
     jH8eRky02ZmnONIoGvXW96uttSEbZsAqyNJAE9DT7+eoR7ewaMdTD6RyQzsljxe54PSn
     Uuh/LqYHQ/LVNW/jl9gocw+Q26vJh9CNzhIEgcHEfs3cDzOaDeWCZ0unYTBlUEbkknep
     ey/TWrcBCmeh5q89nzZmBWu+T7AuVegxSNxh4pn3ysvebQZ8FeWp6B+WdO161W9PnDSw
     0MYQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=1e100.net; s=20210112;
    h=content-transfer-encoding:mime-version:references:in-reply-to
     :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
     :subject:date:message-id:reply-to;
    bh=h6pDOE1RG9rRgZEeL9h0FSsm2VZYbQptR492yBEKa5A=;
    b=AXO0WT9IbrwmyuWzE9wdgKjCDLBqKh+FT5ww5b4yjST8atUDa7GcVllEQVxg67cz0L
     XGnkVQXb99Z++w+a5QnAWP3vKNaFUfF1+LBZ4AqcdUHNPHVrAWnsuFARTulDubOrkAzP
     yOAtnm9ywL2YDJLUZQG1T1eHco/5FcDPRShdMPmRrSOFiVeWlWstryMP696IRbD6e/au
     JnHj7gOzSbVnhTCNFBfyz9Y/AexxlUw8izC9Z3RBt7yZ9pKk+dGBDILYwXLgscw0lVR3
     MhjuIhOmMh3zsCA72gT2mPRitxm184nvF4KNwwBT8iKaATmvk4imLpCVNHySalsAtoKI
     py5g==
X-Gm-Message-State: ANoB5pkP0d8Mt2lpGNPsSqiUI25Kn5DeBGsQTxHsmxkSKyKO1nyObXLb
    7lB5/pctlujZ+Mzeyu/tk/6D/TvGedybzw==
X-Google-Smtp-Source:
    AA0mqf7aSukzA9ZTp5RDbA53Z/6GfFU08/UmpaNptKNc0I+rvszDcdWC8DMVvhEUVC9MrIFe8jORYg==
X-Received: by 2002:a63:e855:0:b0:470:6287:fd4d with SMTP id
    a21-20020a63e855000000b004706287fd4dmr1756949pgk.295.1667688585876;
    Sat, 05 Nov 2022 15:49:45 -0700 (PDT)
Received: from crazyhorse.local ([174.127.229.57])
    by smtp.googlemail.com with ESMTPSA id
    rj14-20020a17090b3e8e00b001fde655225fsm14716728pjb.2.2022.11.05.15.49.45
    (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
    Sat, 05 Nov 2022 15:49:45 -0700 (PDT)
From: Anthony DeRossi
To: kvm@vger.kernel.org
Cc: alex.williamson@redhat.com, cohuck@redhat.com, jgg@ziepe.ca,
    kevin.tian@intel.com, abhsahu@nvidia.com, yishaih@nvidia.com
Subject: [PATCH v5 3/3] vfio/pci: Check the device set open count on reset
Date: Sat, 5 Nov 2022 15:44:58 -0700
Message-Id: <20221105224458.8180-4-ajderossi@gmail.com>
X-Mailer: git-send-email 2.37.4
In-Reply-To: <20221105224458.8180-1-ajderossi@gmail.com>
References: <20221105224458.8180-1-ajderossi@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: kvm@vger.kernel.org

vfio_pci_dev_set_needs_reset() inspects the open_count of every device
in the set to determine whether a reset is allowed. The current device
always has open_count == 1 within vfio_pci_core_disable(), effectively
disabling the reset logic. This field is also documented as private in
vfio_device, so it should not be used to determine whether other devices
in the set are open.

Checking for vfio_device_set_open_count() > 1 on the device set fixes
both issues.

After commit 2cd8b14aaa66 ("vfio/pci: Move to the device set
infrastructure"), failure to create a new file for a device would cause
the reset to be skipped due to open_count being decremented after
calling close_device() in the error path.

After commit eadd86f835c6 ("vfio: Remove calls to
vfio_group_add_container_user()"), releasing a device would always skip
the reset due to an ordering change in vfio_device_fops_release().

Failing to reset the device leaves it in an unknown state, potentially
causing errors when it is accessed later or bound to a different driver.

This issue was observed with a Radeon RX Vega 56 [1002:687f] (rev c3)
assigned to a Windows guest. After shutting down the guest, unbinding
the device from vfio-pci, and binding the device to amdgpu:

[ 548.007102] [drm:psp_hw_start [amdgpu]] *ERROR* PSP create ring failed!
[ 548.027174] [drm:psp_hw_init [amdgpu]] *ERROR* PSP firmware loading failed
[ 548.027242] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* hw_init of IP block failed -22
[ 548.027306] amdgpu 0000:0a:00.0: amdgpu: amdgpu_device_ip_init failed
[ 548.027308] amdgpu 0000:0a:00.0: amdgpu: Fatal error during GPU init

Fixes: 2cd8b14aaa66 ("vfio/pci: Move to the device set infrastructure")
Fixes: eadd86f835c6 ("vfio: Remove calls to vfio_group_add_container_user()")
Signed-off-by: Anthony DeRossi
Reviewed-by: Jason Gunthorpe
Reviewed-by: Kevin Tian
---
 drivers/vfio/pci/vfio_pci_core.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index badc9d828cac..e030c2120183 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -2488,12 +2488,12 @@ static bool vfio_pci_dev_set_needs_reset(struct vfio_device_set *dev_set)
 	struct vfio_pci_core_device *cur;
 	bool needs_reset = false;
 
-	list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list) {
-		/* No VFIO device in the set can have an open device FD */
-		if (cur->vdev.open_count)
-			return false;
+	/* No other VFIO device in the set can be open. */
+	if (vfio_device_set_open_count(dev_set) > 1)
+		return false;
+
+	list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
 		needs_reset |= cur->needs_reset;
-	}
 
 	return needs_reset;
 }
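
Reader's note, not part of the series: the behaviour change in patch 3
can be modelled in userspace. The sketch below is a minimal illustration
under stated assumptions. struct model_device and the model_* helpers
are hypothetical stand-ins invented here, not kernel APIs, and all
locking is omitted. It contrasts the old per-device check with the
sum-based check that patch 3 switches to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the reset check looks at. */
struct model_device {
	unsigned int open_count;
	bool needs_reset;
};

/* Sum of the per-device open counts, as described in patch 2. */
static unsigned int model_set_open_count(const struct model_device *devs,
					 size_t n)
{
	unsigned int total = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += devs[i].open_count;
	return total;
}

/* Old logic: any open device, including the one being closed, vetoes. */
static bool model_needs_reset_old(const struct model_device *devs, size_t n)
{
	bool needs_reset = false;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].open_count)
			return false;
		needs_reset |= devs[i].needs_reset;
	}
	return needs_reset;
}

/* New logic: tolerate exactly one open, the device being closed. */
static bool model_needs_reset_new(const struct model_device *devs, size_t n)
{
	bool needs_reset = false;

	if (model_set_open_count(devs, n) > 1)
		return false;

	for (size_t i = 0; i < n; i++)
		needs_reset |= devs[i].needs_reset;
	return needs_reset;
}
```

For a two-device set where only the closing device is open
(open_count == 1, needs_reset set), the old check returns false and the
new check returns true. Once the second device is also open, both
checks return false.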