From patchwork Thu Jun 30 10:25:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yishai Hadas X-Patchwork-Id: 12901620 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFC49C43334 for ; Thu, 30 Jun 2022 10:27:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235026AbiF3K1Y (ORCPT ); Thu, 30 Jun 2022 06:27:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40556 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234937AbiF3K07 (ORCPT ); Thu, 30 Jun 2022 06:26:59 -0400 Received: from NAM10-DM6-obe.outbound.protection.outlook.com (mail-dm6nam10on2055.outbound.protection.outlook.com [40.107.93.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CBA81CE00; Thu, 30 Jun 2022 03:26:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=RdZc+o+GsG+rgOpLZWfEhhCeFMqjy0tmTbCg2Id21tPUXwrcrszyJcyBFG3Dla49rz7SB/wt/ZoEC/SJ56VYTbjiZCL8lX9Fnwm2X9LHHPlmUF04RPA/myJB/G5GDTatNLptPLg/FkSZjkk2tXAbSCEOFqF9NNtFr9EZkbPwBMYlLI/syqXRVfS+dV5s9GlXZ1HF+8C5+C97e5P+iu3ktiVn/8sghjQ3xbkpo2xu178D930m+vWbJlltU+MjD3M49owE7kEpeu/gu3BjRq8CxzUO+PQ9C+gEJwo7ld/DrNVD9WgTgryWnOzQkWmL5PlqCEjUj2Hu7s+mhQr+TzIU+Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=j8CbhQOtH/MavfK6utdDspsEAlQxOfHii+3T7MThgLQ=; b=S0ES14NIf0/DIT7X5rVeE4Fa5/LAgJSJiN+OkkYGgfxP5ueoXl5yddlQbv5ViwEUQDpeE1wMaDMmcwyGkTS6D9/ycZLoMYj6TzrOVXh7UqnLSunyhUPD3pIFHr4WMgOPQLMkdS5kV1Oj5MkqA4AQ9PU8sGSRtRJiETj7JkMAB0nVh1NO6QkNTeCmEhtXjy8VtGCac/JfDDpWkyZVHWSOg8LWcJo3H/8JVhqdKKM8LkjYn4hgPUmtrVJbCilWv5lEUI7iYg+NUUwJcjsTpTeyQ1HGn9lYYhzQHw/N6JA/Aixx08nmYVSsttgXwx62jmkNMHDP2LntiAirUdUKdIJUSQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 12.22.5.236) smtp.rcpttodomain=intel.com smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=j8CbhQOtH/MavfK6utdDspsEAlQxOfHii+3T7MThgLQ=; b=WHR0UNoMncrOOyHcjU56onNpaJR8U+IOt770L8QWBs5BhduwBoCwWQebKLaQSG7T89QGNY7eo7E2/1Lq9Hs9seEKjiLkmS1dkgAC7BcyOwe44PlQN7aNJIFmkMmaHYkvW8cQXW3pjvb6CuFFQl2H1PoNesP1Cxd4Rkp8TbYGntuTp33LddSvzMQPj1d8Ost+OtOGWRKEEICKz7pI/K+62QLKvQU/KDsDPBLq+S9p8ehae58ZGxew0hON+IE42KQRwpWAlsU3sO2tjZ0ShTQhN62NW88LuOZVyFbA23eFrouUNlev7Q05FQET0kX1YZo0lvgyLdXbqjLzxd6lUdXtlw== Received: from BN9PR03CA0500.namprd03.prod.outlook.com (2603:10b6:408:130::25) by CY4PR12MB1512.namprd12.prod.outlook.com (2603:10b6:910:3::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5373.18; Thu, 30 Jun 2022 10:26:57 +0000 Received: from BN8NAM11FT048.eop-nam11.prod.protection.outlook.com (2603:10b6:408:130:cafe::68) by BN9PR03CA0500.outlook.office365.com (2603:10b6:408:130::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5395.14 via Frontend Transport; Thu, 30 Jun 2022 10:26:57 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 12.22.5.236) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 12.22.5.236 as permitted sender) receiver=protection.outlook.com; client-ip=12.22.5.236; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (12.22.5.236) by BN8NAM11FT048.mail.protection.outlook.com (10.13.177.117) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.5395.14 via Frontend Transport; Thu, 30 Jun 2022 10:26:56 +0000 Received: from drhqmail203.nvidia.com (10.126.190.182) by DRHQMAIL109.nvidia.com (10.27.9.19) with Microsoft SMTP Server (TLS) id 15.0.1497.32; Thu, 30 Jun 2022 10:26:55 +0000 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail203.nvidia.com (10.126.190.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.26; Thu, 30 Jun 2022 03:26:54 -0700 Received: from vdi.nvidia.com (10.127.8.10) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.986.26 via Frontend Transport; Thu, 30 Jun 2022 03:26:51 -0700 From: Yishai Hadas To: , CC: , , , , , , , , , Subject: [PATCH vfio 08/13] vfio: Introduce the DMA logging feature support Date: Thu, 30 Jun 2022 13:25:40 +0300 Message-ID: <20220630102545.18005-9-yishaih@nvidia.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20220630102545.18005-1-yishaih@nvidia.com> References: <20220630102545.18005-1-yishaih@nvidia.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 6280eb64-1104-429b-1834-08da5a830f28 X-MS-TrafficTypeDiagnostic: CY4PR12MB1512:EE_ X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: a09SkQpurWQQCSEiNXF+CIVy7d3//xBEaWpIstthq33hXjRRDG8WmJy8M2YMSfIURx/soG/AeakDqil/u/by6TcASvjAilntAm9L7N3XvGJlubk6xp5ykKTnvQPQxY9S0gvQSkeDrMFHt+6jujxq3gfQxevSa63SkoN+0x6Kfm6ODjEND3vDXLgmjzn9A/Z5/JEKRcPyiJ02dyefxmBXDfEV+FcfYCBwyVWkwF6h18GY/md3J4jbRbR2FiF+SYYQ8l5aQum2iYKvD+jM7nVujIpzUxrZQX1d4jK9rHHWr5dJSfXF5QtyH0IL/MqTnz3B4kEz1QzZrsKcdYw4MsCiSJztXrfmqRBXPya/v4ckO7h3E3vPVaZ+H6iGpPnHMSGPJvSvi41myI1FKuRwR0VLDSmjw0I+KsBwkfG+0f+Jnm7IeM29l4DGOR4jSoQQvLYqX82Zoi2Jl1FvF2voDvvA0aGdDYMG724NIVaS/ELAHz1Ng/ubWI2XO6J0GHRXZig4ULWKT/rfgvljpk5/OG1BdvCOwRgJrQSAFLTYboMQ7fYdGjqFxl0VjIr6TmFWW4s0m+yjSCt+mgvrKxeVh3/G501BS58v1C2/+tiTJbYL9IszNZ/inweyBollj8bcZaUcjjSY5EbPRZ8FtX9UPYlRTl4IJXp/ygGTW1rBcKMkBEyaeEYre9LWUCegiHX7DTh3uulj93HsxtRcHFDIZJfX7TCMxbcZlNGUwhDE1i97XZy2kZriCZMzO5wcgwlpiKeCbnILKZRUEYuv92K7YMiBN9Aka/2Mo7Y5KUom0Mc3AKJM+NcywLJq/GbuxOWHSpRyMgdLeWZmzaCQihR7SSeOrNj0huPNaedqwh3PEORNUZI= X-Forefront-Antispam-Report: CIP:12.22.5.236;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:InfoNoRecords;CAT:NONE;SFS:(13230016)(4636009)(346002)(396003)(376002)(39860400002)(136003)(36840700001)(40470700004)(46966006)(316002)(1076003)(2906002)(6666004)(5660300002)(54906003)(6636002)(186003)(41300700001)(82310400005)(82740400003)(36756003)(70206006)(86362001)(110136005)(2616005)(356005)(336012)(7696005)(426003)(40460700003)(8676002)(70586007)(4326008)(47076005)(8936002)(26005)(81166007)(83380400001)(40480700001)(478600001)(36860700001)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 30 Jun 2022 10:26:56.5348 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 6280eb64-1104-429b-1834-08da5a830f28 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[12.22.5.236];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT048.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY4PR12MB1512 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Introduce the DMA logging feature support in the vfio core layer. It includes the processing of the device start/stop/report DMA logging UAPIs and calling the relevant driver 'op' to do the work. Specifically, Upon start, the core translates the given input ranges into an interval tree, checks for unexpected overlapping, non aligned ranges and then pass the translated input to the driver for start tracking the given ranges. Upon report, the core translates the given input user space bitmap and page size into an IOVA kernel bitmap iterator. Then it iterates it and call the driver to set the corresponding bits for the dirtied pages in a specific IOVA range. Upon stop, the driver is called to stop the previous started tracking. The next patches from the series will introduce the mlx5 driver implementation for the logging ops. Signed-off-by: Yishai Hadas Reported-by: kernel test robot Reported-by: kernel test robot --- drivers/vfio/pci/vfio_pci_core.c | 5 + drivers/vfio/vfio_main.c | 162 +++++++++++++++++++++++++++++++ include/linux/vfio.h | 21 +++- 3 files changed, 186 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 2e003913c561..8dcd212971fe 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1862,6 +1862,11 @@ int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev) return -EINVAL; } + if (vdev->vdev.log_ops && !(vdev->vdev.log_ops->log_start && + vdev->vdev.log_ops->log_stop && + vdev->vdev.log_ops->log_read_and_clear)) + return -EINVAL; + /* * Prevent binding to PFs with VFs enabled, the VFs might be in use * by the host or other users. We cannot capture the VFs if they diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index aac9213a783d..8eb8ba837059 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -32,6 +32,8 @@ #include #include #include +#include +#include #include "vfio.h" #define DRIVER_VERSION "0.3" @@ -1601,6 +1603,154 @@ static int vfio_ioctl_device_feature_migration(struct vfio_device *device, return 0; } +#define LOG_MAX_RANGES 1024 + +static int +vfio_ioctl_device_feature_logging_start(struct vfio_device *device, + u32 flags, void __user *arg, + size_t argsz) +{ + size_t minsz = + offsetofend(struct vfio_device_feature_dma_logging_control, + ranges); + struct vfio_device_feature_dma_logging_range __user *ranges; + struct vfio_device_feature_dma_logging_control control; + struct vfio_device_feature_dma_logging_range range; + struct rb_root_cached root = RB_ROOT_CACHED; + struct interval_tree_node *nodes; + u32 nnodes; + int i, ret; + + if (!device->log_ops) + return -ENOTTY; + + ret = vfio_check_feature(flags, argsz, + VFIO_DEVICE_FEATURE_SET, + sizeof(control)); + if (ret != 1) + return ret; + + if (copy_from_user(&control, arg, minsz)) + return -EFAULT; + + nnodes = control.num_ranges; + if (!nnodes || nnodes > LOG_MAX_RANGES) + return -EINVAL; + + ranges = (struct vfio_device_feature_dma_logging_range __user *) + control.ranges; + nodes = kmalloc_array(nnodes, sizeof(struct interval_tree_node), + GFP_KERNEL); + if (!nodes) + return -ENOMEM; + + for (i = 0; i < nnodes; i++) { + if (copy_from_user(&range, &ranges[i], sizeof(range))) { + ret = -EFAULT; + goto end; + } + if (!IS_ALIGNED(range.iova, control.page_size) || + !IS_ALIGNED(range.length, control.page_size)) { + ret = -EINVAL; + goto end; + } + nodes[i].start = range.iova; + nodes[i].last = range.iova + range.length - 1; + if (interval_tree_iter_first(&root, nodes[i].start, + nodes[i].last)) { + /* Range overlapping */ + ret = -EINVAL; + goto end; + } + interval_tree_insert(nodes + i, &root); + } + + ret = device->log_ops->log_start(device, &root, nnodes, + &control.page_size); + if (ret) + goto end; + + if (copy_to_user(arg, &control, sizeof(control))) { + ret = -EFAULT; + device->log_ops->log_stop(device); + } + +end: + kfree(nodes); + return ret; +} + +static int +vfio_ioctl_device_feature_logging_stop(struct vfio_device *device, + u32 flags, void __user *arg, + size_t argsz) +{ + int ret; + + if (!device->log_ops) + return -ENOTTY; + + ret = vfio_check_feature(flags, argsz, + VFIO_DEVICE_FEATURE_SET, 0); + if (ret != 1) + return ret; + + return device->log_ops->log_stop(device); +} + +static int +vfio_ioctl_device_feature_logging_report(struct vfio_device *device, + u32 flags, void __user *arg, + size_t argsz) +{ + size_t minsz = + offsetofend(struct vfio_device_feature_dma_logging_report, + bitmap); + struct vfio_device_feature_dma_logging_report report; + struct iova_bitmap_iter iter; + int ret; + + if (!device->log_ops) + return -ENOTTY; + + ret = vfio_check_feature(flags, argsz, + VFIO_DEVICE_FEATURE_GET, + sizeof(report)); + if (ret != 1) + return ret; + + if (copy_from_user(&report, arg, minsz)) + return -EFAULT; + + if (report.page_size < PAGE_SIZE) + return -EINVAL; + + iova_bitmap_init(&iter.dirty, report.iova, ilog2(report.page_size)); + ret = iova_bitmap_iter_init(&iter, report.iova, report.length, + (unsigned long __user *)report.bitmap); + if (ret) + return ret; + + for (; iova_bitmap_iter_done(&iter); + iova_bitmap_iter_advance(&iter)) { + ret = iova_bitmap_iter_get(&iter); + if (ret) + break; + + ret = device->log_ops->log_read_and_clear(device, + iova_bitmap_iova(&iter), + iova_bitmap_length(&iter), &iter.dirty); + + iova_bitmap_iter_put(&iter); + + if (ret) + break; + } + + iova_bitmap_iter_free(&iter); + return ret; +} + static int vfio_ioctl_device_feature(struct vfio_device *device, struct vfio_device_feature __user *arg) { @@ -1634,6 +1784,18 @@ static int vfio_ioctl_device_feature(struct vfio_device *device, return vfio_ioctl_device_feature_mig_device_state( device, feature.flags, arg->data, feature.argsz - minsz); + case VFIO_DEVICE_FEATURE_DMA_LOGGING_START: + return vfio_ioctl_device_feature_logging_start( + device, feature.flags, arg->data, + feature.argsz - minsz); + case VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP: + return vfio_ioctl_device_feature_logging_stop( + device, feature.flags, arg->data, + feature.argsz - minsz); + case VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT: + return vfio_ioctl_device_feature_logging_report( + device, feature.flags, arg->data, + feature.argsz - minsz); default: if (unlikely(!device->ops->device_feature)) return -EINVAL; diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 4d26e149db81..feed84d686ec 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -14,6 +14,7 @@ #include #include #include +#include struct kvm; @@ -33,10 +34,11 @@ struct vfio_device { struct device *dev; const struct vfio_device_ops *ops; /* - * mig_ops is a static property of the vfio_device which must be set - * prior to registering the vfio_device. + * mig_ops/log_ops is a static property of the vfio_device which must + * be set prior to registering the vfio_device. */ const struct vfio_migration_ops *mig_ops; + const struct vfio_log_ops *log_ops; struct vfio_group *group; struct vfio_device_set *dev_set; struct list_head dev_set_list; @@ -104,6 +106,21 @@ struct vfio_migration_ops { enum vfio_device_mig_state *curr_state); }; +/** + * @log_start: Optional callback to ask the device start DMA logging. + * @log_stop: Optional callback to ask the device stop DMA logging. + * @log_read_and_clear: Optional callback to ask the device read + * and clear the dirty DMAs in some given range. + */ +struct vfio_log_ops { + int (*log_start)(struct vfio_device *device, + struct rb_root_cached *ranges, u32 nnodes, u64 *page_size); + int (*log_stop)(struct vfio_device *device); + int (*log_read_and_clear)(struct vfio_device *device, + unsigned long iova, unsigned long length, + struct iova_bitmap *dirty); +}; + /** * vfio_check_feature - Validate user input for the VFIO_DEVICE_FEATURE ioctl * @flags: Arg from the device_feature op