From patchwork Wed Sep 4 15:30:31 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13791129
From: Michael Guralnik
Subject: [PATCH rdma-next 1/8] net/mlx5: Expand mkey page size to support 6 bits
Date: Wed, 4 Sep 2024 18:30:31 +0300
Message-ID: <20240904153038.23054-2-michaelgur@nvidia.com>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com>
References: <20240904153038.23054-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org

Protect the usage of the 6th bit with the relevant capability to ensure we
are using the new page sizes with FW that supports the bit extension.
Signed-off-by: Michael Guralnik
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 14 ++++++++------
 drivers/infiniband/hw/mlx5/mr.c      | 10 ++++------
 drivers/infiniband/hw/mlx5/odp.c     |  2 +-
 include/linux/mlx5/mlx5_ifc.h        |  7 ++++---
 4 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 926a965e4570..89c2ab728577 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -67,12 +67,14 @@ __mlx5_log_page_size_to_bitmap(unsigned int log_pgsz_bits,
  * For mkc users, instead of a page_offset the command has a start_iova which
  * specifies both the page_offset and the on-the-wire IOVA
  */
-#define mlx5_umem_find_best_pgsz(umem, typ, log_pgsz_fld, pgsz_shift, iova) \
-	ib_umem_find_best_pgsz(umem, \
-			       __mlx5_log_page_size_to_bitmap( \
-				       __mlx5_bit_sz(typ, log_pgsz_fld), \
-				       pgsz_shift), \
-			       iova)
+#define mlx5_umem_find_best_pgsz(umem, dev, iova) \
+	ib_umem_find_best_pgsz( \
+		umem, \
+		__mlx5_log_page_size_to_bitmap( \
+			MLX5_CAP_GEN_2(dev->mdev, umr_log_entity_size_5) ? 6 : \
+									   5, \
+			0), \
+		iova)
 
 static __always_inline unsigned long
 __mlx5_page_offset_to_bitmask(unsigned int page_offset_bits,
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 73962bd0b216..0b52f080879f 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1119,8 +1119,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
 	if (umem->is_dmabuf)
 		page_size = mlx5_umem_dmabuf_default_pgsz(umem, iova);
 	else
-		page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size,
-						     0, iova);
+		page_size = mlx5_umem_find_best_pgsz(umem, dev, iova);
 	if (WARN_ON(!page_size))
 		return ERR_PTR(-EINVAL);
 
@@ -1425,8 +1424,8 @@ static struct ib_mr *create_real_mr(struct ib_pd *pd, struct ib_umem *umem,
 		mr = alloc_cacheable_mr(pd, umem, iova, access_flags,
 					MLX5_MKC_ACCESS_MODE_MTT);
 	} else {
-		unsigned int page_size = mlx5_umem_find_best_pgsz(
-			umem, mkc, log_page_size, 0, iova);
+		unsigned int page_size =
+			mlx5_umem_find_best_pgsz(umem, dev, iova);
 
 		mutex_lock(&dev->slow_path_mutex);
 		mr = reg_create(pd, umem, iova, access_flags, page_size,
@@ -1744,8 +1743,7 @@ static bool can_use_umr_rereg_pas(struct mlx5_ib_mr *mr,
 	if (!mlx5r_umr_can_load_pas(dev, new_umem->length))
 		return false;
 
-	*page_size =
-		mlx5_umem_find_best_pgsz(new_umem, mkc, log_page_size, 0, iova);
+	*page_size = mlx5_umem_find_best_pgsz(new_umem, dev, iova);
 	if (WARN_ON(!*page_size))
 		return false;
 	return (mr->mmkey.cache_ent->rb_key.ndescs) >=
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 44a3428ea342..221820874e7a 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -693,7 +693,7 @@ static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt,
 	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
 	u32 xlt_flags = 0;
 	int err;
-	unsigned int page_size;
+	unsigned long page_size;
 
 	if (flags & MLX5_PF_FLAGS_ENABLE)
 		xlt_flags |= MLX5_IB_UPD_XLT_ENABLE;
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 691a285f9c1e..1be2495362ee 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1995,7 +1995,9 @@ struct mlx5_ifc_cmd_hca_cap_2_bits {
 	u8         dp_ordering_force[0x1];
 	u8         reserved_at_89[0x9];
 	u8         query_vuid[0x1];
-	u8         reserved_at_93[0xd];
+	u8         reserved_at_93[0x5];
+	u8         umr_log_entity_size_5[0x1];
+	u8         reserved_at_99[0x7];
 
 	u8         max_reformat_insert_size[0x8];
 	u8         max_reformat_insert_offset[0x8];
@@ -4221,8 +4223,7 @@ struct mlx5_ifc_mkc_bits {
 
 	u8         reserved_at_1c0[0x19];
 	u8         relaxed_ordering_read[0x1];
-	u8         reserved_at_1d9[0x1];
-	u8         log_page_size[0x5];
+	u8         log_page_size[0x6];
 
 	u8         reserved_at_1e0[0x20];
 };

From patchwork Wed Sep 4 15:30:32 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13791130
From: Michael Guralnik
Subject: [PATCH rdma-next 2/8] net/mlx5: Expose HW bits for Memory scheme ODP
Date: Wed, 4 Sep 2024 18:30:32 +0300
Message-ID: <20240904153038.23054-3-michaelgur@nvidia.com>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com>
References: <20240904153038.23054-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org

Expose the IFC bits needed to support the new memory scheme on-demand
paging (ODP).

Change the macro that reads ODP capabilities so it can read from the new IFC
layout, and update the upper-layer code so it compiles against that layout.

Signed-off-by: Michael Guralnik
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/odp.c              | 40 +++++++------
 .../net/ethernet/mellanox/mlx5/core/main.c    | 28 ++++-----
 include/linux/mlx5/device.h                   |  4 ++
 include/linux/mlx5/mlx5_ifc.h                 | 57 +++++++++++++++----
 4 files changed, 86 insertions(+), 43 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 221820874e7a..300504bf79d7 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -332,46 +332,46 @@ static void internal_fill_odp_caps(struct mlx5_ib_dev *dev)
 	else
 		dev->odp_max_size = BIT_ULL(MLX5_MAX_UMR_SHIFT + PAGE_SHIFT);
 
-	if (MLX5_CAP_ODP(dev->mdev, ud_odp_caps.send))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, ud_odp_caps.send))
 		caps->per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_SEND;
 
-	if (MLX5_CAP_ODP(dev->mdev, ud_odp_caps.srq_receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, ud_odp_caps.srq_receive))
 		caps->per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
 
-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.send))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.send))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SEND;
 
-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.receive))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_RECV;
 
-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.write))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.write))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_WRITE;
 
-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.read))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.read))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_READ;
 
-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.atomic))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.atomic))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_ATOMIC;
 
-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.srq_receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.srq_receive))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
 
-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.send))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.send))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_SEND;
 
-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.receive))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_RECV;
 
-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.write))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.write))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_WRITE;
 
-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.read))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.read))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_READ;
 
-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.atomic))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.atomic))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_ATOMIC;
 
-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.srq_receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.srq_receive))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
 
 	if (MLX5_CAP_GEN(dev->mdev, fixed_buffer_size) &&
@@ -388,13 +388,17 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev,
 	int wq_num = pfault->event_subtype == MLX5_PFAULT_SUBTYPE_WQE ?
 		     pfault->wqe.wq_num : pfault->token;
 	u32 in[MLX5_ST_SZ_DW(page_fault_resume_in)] = {};
+	void *info;
 	int err;
 
 	MLX5_SET(page_fault_resume_in, in, opcode, MLX5_CMD_OP_PAGE_FAULT_RESUME);
-	MLX5_SET(page_fault_resume_in, in, page_fault_type, pfault->type);
-	MLX5_SET(page_fault_resume_in, in, token, pfault->token);
-	MLX5_SET(page_fault_resume_in, in, wq_number, wq_num);
-	MLX5_SET(page_fault_resume_in, in, error, !!error);
+
+	info = MLX5_ADDR_OF(page_fault_resume_in, in,
+			    page_fault_info.trans_page_fault_info);
+	MLX5_SET(trans_page_fault_info, info, page_fault_type, pfault->type);
+	MLX5_SET(trans_page_fault_info, info, fault_token, pfault->token);
+	MLX5_SET(trans_page_fault_info, info, wq_number, wq_num);
+	MLX5_SET(trans_page_fault_info, info, error, !!error);
 
 	err = mlx5_cmd_exec_in(dev->mdev, page_fault_resume, in);
 	if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 5b7e6f4b5c7e..cc2aa46cff04 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -479,20 +479,20 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx)
 		} \
 	} while (0)
 
-	ODP_CAP_SET_MAX(dev, ud_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, rc_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.send);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.receive);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.write);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.read);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.atomic);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.send);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.receive);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.write);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.read);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.atomic);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.ud_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.rc_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.send);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.write);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.read);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.atomic);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.send);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.write);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.read);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.atomic);
 
 	if (!do_set)
 		return 0;
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index ba875a619b97..bd081f276654 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1369,6 +1369,10 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_ODP(mdev, cap)\
 	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, cap)
 
+#define MLX5_CAP_ODP_SCHEME(mdev, cap) \
+	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \
+		 transport_page_fault_scheme_cap.cap)
+
 #define MLX5_CAP_ODP_MAX(mdev, cap)\
 	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->max, cap)
 
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 1be2495362ee..fcccfc34e076 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1326,11 +1326,13 @@ struct mlx5_ifc_atomic_caps_bits {
 	u8         reserved_at_e0[0x720];
 };
 
-struct mlx5_ifc_odp_cap_bits {
+struct mlx5_ifc_odp_scheme_cap_bits {
 	u8         reserved_at_0[0x40];
 
 	u8         sig[0x1];
-	u8         reserved_at_41[0x1f];
+	u8         reserved_at_41[0x4];
+	u8         page_prefetch[0x1];
+	u8         reserved_at_46[0x1a];
 
 	u8         reserved_at_60[0x20];
 
@@ -1344,7 +1346,20 @@ struct mlx5_ifc_odp_cap_bits {
 
 	struct mlx5_ifc_odp_per_transport_service_cap_bits dc_odp_caps;
 
-	u8         reserved_at_120[0x6E0];
+	u8         reserved_at_120[0xe0];
+};
+
+struct mlx5_ifc_odp_cap_bits {
+	struct mlx5_ifc_odp_scheme_cap_bits transport_page_fault_scheme_cap;
+
+	struct mlx5_ifc_odp_scheme_cap_bits memory_page_fault_scheme_cap;
+
+	u8         reserved_at_400[0x200];
+
+	u8         mem_page_fault[0x1];
+	u8         reserved_at_601[0x1f];
+
+	u8         reserved_at_620[0x1e0];
 };
 
 struct mlx5_ifc_tls_cap_bits {
@@ -2041,7 +2056,8 @@ struct mlx5_ifc_cmd_hca_cap_2_bits {
 	u8         min_mkey_log_entity_size_fixed_buffer[0x5];
 	u8         ec_vf_vport_base[0x10];
 
-	u8         reserved_at_3a0[0x10];
+	u8         reserved_at_3a0[0xa];
+	u8         max_mkey_log_entity_size_mtt[0x6];
 	u8         max_rqt_vhca_id[0x10];
 
 	u8         reserved_at_3c0[0x20];
@@ -7270,6 +7286,30 @@ struct mlx5_ifc_qp_2err_in_bits {
 	u8         reserved_at_60[0x20];
 };
 
+struct mlx5_ifc_trans_page_fault_info_bits {
+	u8         error[0x1];
+	u8         reserved_at_1[0x4];
+	u8         page_fault_type[0x3];
+	u8         wq_number[0x18];
+
+	u8         reserved_at_20[0x8];
+	u8         fault_token[0x18];
+};
+
+struct mlx5_ifc_mem_page_fault_info_bits {
+	u8         error[0x1];
+	u8         reserved_at_1[0xf];
+	u8         fault_token_47_32[0x10];
+
+	u8         fault_token_31_0[0x20];
+};
+
+union mlx5_ifc_page_fault_resume_in_page_fault_info_auto_bits {
+	struct mlx5_ifc_trans_page_fault_info_bits trans_page_fault_info;
+	struct mlx5_ifc_mem_page_fault_info_bits mem_page_fault_info;
+	u8         reserved_at_0[0x40];
+};
+
 struct mlx5_ifc_page_fault_resume_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
@@ -7286,13 +7326,8 @@ struct mlx5_ifc_page_fault_resume_in_bits {
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
 
-	u8         error[0x1];
-	u8         reserved_at_41[0x4];
-	u8         page_fault_type[0x3];
-	u8         wq_number[0x18];
-
-	u8         reserved_at_60[0x8];
-	u8         token[0x18];
+	union mlx5_ifc_page_fault_resume_in_page_fault_info_auto_bits
+		page_fault_info;
 };
 
 struct mlx5_ifc_nop_out_bits {

From patchwork Wed Sep 4 15:30:33 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13791131
with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.27; Wed, 4 Sep 2024 15:31:02 +0000 Received: from SJ5PEPF000001EA.namprd05.prod.outlook.com (2603:10b6:a03:59d:cafe::e6) by BY1P220CA0012.outlook.office365.com (2603:10b6:a03:59d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7939.14 via Frontend Transport; Wed, 4 Sep 2024 15:31:02 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.232) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.232 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.232; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.232) by SJ5PEPF000001EA.mail.protection.outlook.com (10.167.242.198) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.13 via Frontend Transport; Wed, 4 Sep 2024 15:31:02 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:49 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:49 -0700 Received: from vdi.nvidia.com (10.127.8.13) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Wed, 4 Sep 2024 08:30:47 -0700 From: Michael Guralnik To: , CC: , , , Michael Guralnik Subject: [PATCH rdma-next 3/8] RDMA/mlx5: Add new ODP memory scheme eqe format Date: Wed, 4 Sep 2024 18:30:33 +0300 Message-ID: <20240904153038.23054-4-michaelgur@nvidia.com> X-Mailer: 
git-send-email 2.17.2 In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com> References: <20240904153038.23054-1-michaelgur@nvidia.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ5PEPF000001EA:EE_|BL3PR12MB6451:EE_ X-MS-Office365-Filtering-Correlation-Id: e96aa37f-74e9-46a4-e652-08dcccf69579 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|376014|82310400026|36860700013; X-Microsoft-Antispam-Message-Info: FNkLHH1DP/IjFTAZnx/Nna1fztv4eyrGLqD8+YTSpovsr3391IqjZuov+ixyy2ZmWDvgltMmbKbwP9DGZed2mMR4i1xheR8hwiTMhMI8NQN1KxoBCASPZxAXFopF/LAmZ0bQf7SzFkRBA8L/5qGo6+xUn/OmE3XDHCC8OTHiIn8mtNrvc4mZPo4924qR0+52VnaMCG/BsYHzvw1YApstUydqoCmJDsY8tVHZt2FLNAYBLQsDydyAR1qWPe4uDTHZ3oSwYaO6BIJeeObDu+By4rU+lzER5inxy363XlntBZWNL/BZuR3bS5moXZcv6YQj0UFlhCQaP+gwPzN7kxYZ46ZwX33oev0M3kzAs4PbPFY60GV475AtUNUyWJ1gcJbr2q6qe9cMSdyrotTqpISmrJCYRU4JJLIiqrSFjyjN7qFuaInazNJRYGw56riQW+/M/z0anRc0+/yxcPU93UDjye+CIuEvdP+f2co7Jwei3lMDkHWiLsoG17P2dIDqYEGNx1OxVkoMLv/XnfXPRqDJqMJjlpxvdortJ93b1ZZjD1YLgKKVnz+CpbIE8a5p/LoNq8VJWI65dXR4/n2TD+/VJja26qbt9ERFvrgOX0msyoJFpEVKEfms74NMHZNkjyHxVueMiUK8EBf6GxIRfCuDDIZwfB+v4J1DZnsLBuQurX1zTTa5IEAK0ouj2yNRl8XAC+/60SUvkcbxz7GsCd/D6TPYkjye6gKheA8AC6zFhIPJP7YsdtaxY2cIRKUalh3crZH4/RQswmLvpUKZ6FOx8PWL0vbLdl3NT6S0JaYtvCO4dK12I5MTxAtduCiLC60yK44DmmJ02TJ4TAK8+u9Zn0Ccvg70nZsJkfg3RpsYv33uVBZqhjIQCADYwYLOvl8kU5NbaqRXldx/eJzDuP6IKMCauhBd+jKrsa5sW7tzH6GBc5ZF+4NuMEyLMgoNfQOLut/8yAmyxAmwwYOyjuzDTnSTJkLrBXgrFx4v471/java6aQrwcj3nbEvwsRCKXjZP0P3wRIvcyJTBGr4e+5MVme63fJrbnB32jgv4+9qZUa4aLtdImV3PHqTExQcySFB8bE+OL19e3h51I0vqcYod3vmXqcHmlKokjTWPK+884sNb7eWheONU6AnsdNJcX4prXdScbXkQLrriwyJYHcPCiAS0gLKbL0gVkwrmgm99JSzYypyhJWqYMcsd47o0nIfLuAPBiaZgRgoWcWZ0Z8p0KMWT+olCSE7Y1oEQbRmG6696DCkzkhiy8WsG6zfrzONQy4zU4RIV5TjQk+1wwYQsQhY4RSBF2RYtvKZh2
vGlJN50vecw5udl7y/gUo7UcONYl/L/wXQkpJNPxZjDzLCsYUHClhKIcMQzJEA+Ra/rc5PHLTRzF/F+o4Q8/wFdxPFbTeDq+ikB/I4sSn6+FYZOZvZMRAClED/+wlQS7K1G2VSOnhwTtHUYBEiLiwEdjJS X-Forefront-Antispam-Report: CIP:216.228.118.232;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge1.nvidia.com;CAT:NONE;SFS:(13230040)(1800799024)(376014)(82310400026)(36860700013);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2024 15:31:02.1426 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: e96aa37f-74e9-46a4-e652-08dcccf69579 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.232];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ5PEPF000001EA.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL3PR12MB6451 Add new fields to support the new memory scheme page fault and extend the token field to u64 as in the new scheme the token is 48 bit. Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 48 +++++++++++++++++++------------- include/linux/mlx5/device.h | 22 ++++++++++++++- 2 files changed, 50 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 300504bf79d7..f01026d507a3 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -45,7 +45,7 @@ /* Contains the details of a pagefault. 
*/ struct mlx5_pagefault { u32 bytes_committed; - u32 token; + u64 token; u8 event_subtype; u8 type; union { @@ -74,6 +74,14 @@ struct mlx5_pagefault { u32 rdma_op_len; u64 rdma_va; } rdma; + struct { + u64 va; + u32 mkey; + u32 fault_byte_count; + u32 prefetch_before_byte_count; + u32 prefetch_after_byte_count; + u8 flags; + } memory; }; struct mlx5_ib_pf_eq *eq; @@ -1273,7 +1281,7 @@ static void mlx5_ib_mr_wqe_pfault_handler(struct mlx5_ib_dev *dev, if (ret) mlx5_ib_err( dev, - "Failed reading a WQE following page fault, error %d, wqe_index %x, qpn %x\n", + "Failed reading a WQE following page fault, error %d, wqe_index %x, qpn %llx\n", ret, wqe_index, pfault->token); resolve_page_fault: @@ -1332,13 +1340,13 @@ static void mlx5_ib_mr_rdma_pfault_handler(struct mlx5_ib_dev *dev, } else if (ret < 0 || pages_in_range(address, length) > ret) { mlx5_ib_page_fault_resume(dev, pfault, 1); if (ret != -ENOENT) - mlx5_ib_dbg(dev, "PAGE FAULT error %d. QP 0x%x, type: 0x%x\n", + mlx5_ib_dbg(dev, "PAGE FAULT error %d. QP 0x%llx, type: 0x%x\n", ret, pfault->token, pfault->type); return; } mlx5_ib_page_fault_resume(dev, pfault, 0); - mlx5_ib_dbg(dev, "PAGE FAULT completed. QP 0x%x, type: 0x%x, prefetch_activated: %d\n", + mlx5_ib_dbg(dev, "PAGE FAULT completed. QP 0x%llx, type: 0x%x, prefetch_activated: %d\n", pfault->token, pfault->type, prefetch_activated); @@ -1354,7 +1362,7 @@ static void mlx5_ib_mr_rdma_pfault_handler(struct mlx5_ib_dev *dev, prefetch_len, &bytes_committed, NULL); if (ret < 0 && ret != -EAGAIN) { - mlx5_ib_dbg(dev, "Prefetch failed. ret: %d, QP 0x%x, address: 0x%.16llx, length = 0x%.16x\n", + mlx5_ib_dbg(dev, "Prefetch failed. 
ret: %d, QP 0x%llx, address: 0x%.16llx, length = 0x%.16x\n", ret, pfault->token, address, prefetch_len); } } @@ -1405,15 +1413,12 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) pf_eqe = &eqe->data.page_fault; pfault->event_subtype = eqe->sub_type; - pfault->bytes_committed = be32_to_cpu(pf_eqe->bytes_committed); - - mlx5_ib_dbg(eq->dev, - "PAGE_FAULT: subtype: 0x%02x, bytes_committed: 0x%06x\n", - eqe->sub_type, pfault->bytes_committed); switch (eqe->sub_type) { case MLX5_PFAULT_SUBTYPE_RDMA: /* RDMA based event */ + pfault->bytes_committed = + be32_to_cpu(pf_eqe->rdma.bytes_committed); pfault->type = be32_to_cpu(pf_eqe->rdma.pftype_token) >> 24; pfault->token = @@ -1427,10 +1432,12 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) be32_to_cpu(pf_eqe->rdma.rdma_op_len); pfault->rdma.rdma_va = be64_to_cpu(pf_eqe->rdma.rdma_va); - mlx5_ib_dbg(eq->dev, - "PAGE_FAULT: type:0x%x, token: 0x%06x, r_key: 0x%08x\n", - pfault->type, pfault->token, - pfault->rdma.r_key); + mlx5_ib_dbg( + eq->dev, + "PAGE_FAULT: subtype: 0x%02x, bytes_committed: 0x%06x, type:0x%x, token: 0x%06llx, r_key: 0x%08x\n", + eqe->sub_type, pfault->bytes_committed, + pfault->type, pfault->token, + pfault->rdma.r_key); mlx5_ib_dbg(eq->dev, "PAGE_FAULT: rdma_op_len: 0x%08x, rdma_va: 0x%016llx\n", pfault->rdma.rdma_op_len, @@ -1439,6 +1446,8 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) case MLX5_PFAULT_SUBTYPE_WQE: /* WQE based event */ + pfault->bytes_committed = + be32_to_cpu(pf_eqe->wqe.bytes_committed); pfault->type = (be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24) & 0x7; pfault->token = @@ -1450,11 +1459,12 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) be16_to_cpu(pf_eqe->wqe.wqe_index); pfault->wqe.packet_size = be16_to_cpu(pf_eqe->wqe.packet_length); - mlx5_ib_dbg(eq->dev, - "PAGE_FAULT: type:0x%x, token: 0x%06x, wq_num: 0x%06x, wqe_index: 0x%04x\n", - pfault->type, pfault->token, - pfault->wqe.wq_num, - pfault->wqe.wqe_index); + 
mlx5_ib_dbg( + eq->dev, + "PAGE_FAULT: subtype: 0x%02x, bytes_committed: 0x%06x, type:0x%x, token: 0x%06llx, wq_num: 0x%06x, wqe_index: 0x%04x\n", + eqe->sub_type, pfault->bytes_committed, + pfault->type, pfault->token, pfault->wqe.wq_num, + pfault->wqe.wqe_index); break; default: diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index bd081f276654..154095256d0d 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -211,6 +211,7 @@ enum { enum { MLX5_PFAULT_SUBTYPE_WQE = 0, MLX5_PFAULT_SUBTYPE_RDMA = 1, + MLX5_PFAULT_SUBTYPE_MEMORY = 2, }; enum wqe_page_fault_type { @@ -646,10 +647,11 @@ struct mlx5_eqe_page_req { __be32 rsvd1[5]; }; +#define MEMORY_SCHEME_PAGE_FAULT_GRANULARITY 4096 struct mlx5_eqe_page_fault { - __be32 bytes_committed; union { struct { + __be32 bytes_committed; u16 reserved1; __be16 wqe_index; u16 reserved2; @@ -659,6 +661,7 @@ struct mlx5_eqe_page_fault { __be32 pftype_wq; } __packed wqe; struct { + __be32 bytes_committed; __be32 r_key; u16 reserved1; __be16 packet_length; @@ -666,6 +669,23 @@ struct mlx5_eqe_page_fault { __be64 rdma_va; __be32 pftype_token; } __packed rdma; + struct { + u8 flags; + u8 reserved1; + __be16 post_demand_fault_pages; + __be16 pre_demand_fault_pages; + __be16 token47_32; + __be32 token31_0; + /* + * FW changed from specifying the fault size in byte + * count to 4k pages granularity. The size specified + * in pages uses bits 31:12, to keep backward + * compatibility. 
+ */ + __be32 demand_fault_pages; + __be32 mkey; + __be64 va; + } __packed memory; } __packed; } __packed; From patchwork Wed Sep 4 15:30:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Guralnik X-Patchwork-Id: 13791133 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2051.outbound.protection.outlook.com [40.107.244.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46DFE1DEFF3 for ; Wed, 4 Sep 2024 15:31:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.51 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463874; cv=fail; b=DCk9VJDKnbuvXQIWKj85/8z/LiezKghmTpGquqI5pErd+J2fIeiEkSvdBz/9E10tGRZkqQUqHQFss1pljGZQtBpdnGkNI6ZTP5PZZZIjB3vXB1K0jJfoYUjHXxoV7aKBSfE+YPxcgu9h2oGjrg/id2fmSPCNcXY8kDGZ6yRSD1U= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463874; c=relaxed/simple; bh=BUFmWbS9FKrS39yLkH2iuk0nOTBFDgj893f2A0hrvPw=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=enwdXcUF241GmMHDb48mlfL/yeN8sC25zSECfVR00Hpl9pKNFy+aJNEyOCMMttCDW4HDSelL2xTNd+Karyh7C0VHwgcXHzx1NUee5QXk6HDhYrydYrOEjQdI+E7WFE3B5tUKpAyJFNwqoUbQ8nrlQhYjwjzybW76aBt+xbpgob0= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=GHj/QH8O; arc=fail smtp.client-ip=40.107.244.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com 
header.i=@Nvidia.com header.b="GHj/QH8O" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=iR1CMiHEu29jfG8+sUH1qFfA+VgVUF0IKyd5s+KJiQdFO9BQzCkbhGGVodL3Umc0N4ssb/pCRl2VPFlkYpBtxlOGjfZmodYO7HO1hbtJffA2FUob+9uisGuoTTRa4VtXsSn2Qy7nzXn71UBMBsi96hSeHnqgeRvHc5na4yt8gUhzeJlxEqBJG4AdCFjYNvs1zEFS07Sn5doKUM2uUr8ZUk6LlxKpLAp081YelCzEgyxcu7CP2jcJXqD6x2bdvjhQ3Vt+hLWHB4PB77aD6DVMRJwbKXoJmtEapwj1xysSL4VXNtlnGJ8tM4GgJTJAex74VtEm/jz6fGNUSKF7zETxww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GkwQYRhdbg7/r4cMwsxk5qyL1NVCqzgjNM0m4JkjMMs=; b=qIO/5qe8lFDMA02Ns6tFcUnxDMh3Whcz/N75xrtjyZ2PIJCHloBM5kelm4wDItqjIsfAVspBM0RFYZ84/618qlICIxgMTuZwA/kG6N6i3Xf+4zSHZD/vGHMjka5RnSZjlsuYHhGUUnMiloDyP3B1q+6U9Wh33ZJ2dTWwtxCcvf6vlRmgwl/hI+/aAlYz4fu7+nV4ivgaT60L5yrPnqZ5ZkFcWizETq2kOEfrpbq7gxcNJzcWislZXc5rzzkA6qf1zt17qYRtFHTvHpG7dARkt92IP2nw1G0I7RV0UmdM63jucNMjeTUZOHzBVFwVl78xoHgZGhZ+zARDJIxau9a0Tg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GkwQYRhdbg7/r4cMwsxk5qyL1NVCqzgjNM0m4JkjMMs=; b=GHj/QH8O6iaXJT3S82YtBpGi1NL+g22v15s7P5Pk5PEAgwdEGJCWuCipNRwLOWwj+i3EwYgU3dahZX+4HY58BjG6SpPHHtsfl6EtJliiyGM++PxkZRO+rkV7H+TcrXH9BYipKg4/4HoJOceBzBZ9Uc/NRwtUKpf5r0cEfLTg/gh9oJFlL3kWp4YVQxGLlhvZYzqJhedK4Gk2e/ulrQPGcyF9Nci/oOapgrVcIWlCiv5r6PkbKaFaohFei5+YZHXC4s5qjElQNgA97cX4tMuCBJORy/cjexCSDCIQfyk8TAx7YJeWBs0scO97poUaEnncHoGkJhADPGcljcLaQYJaJA== Received: from 
SJ0PR03CA0070.namprd03.prod.outlook.com (2603:10b6:a03:331::15) by CH3PR12MB9341.namprd12.prod.outlook.com (2603:10b6:610:1cd::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.20; Wed, 4 Sep 2024 15:31:07 +0000 Received: from SJ1PEPF00001CDC.namprd05.prod.outlook.com (2603:10b6:a03:331:cafe::58) by SJ0PR03CA0070.outlook.office365.com (2603:10b6:a03:331::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.27 via Frontend Transport; Wed, 4 Sep 2024 15:31:07 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ1PEPF00001CDC.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.13 via Frontend Transport; Wed, 4 Sep 2024 15:31:07 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:51 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:51 -0700 Received: from vdi.nvidia.com (10.127.8.13) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Wed, 4 Sep 2024 08:30:49 -0700 From: Michael Guralnik To: , CC: , , , Michael Guralnik Subject: [PATCH rdma-next 4/8] RDMA/mlx5: Enforce umem 
boundaries for explicit ODP page faults Date: Wed, 4 Sep 2024 18:30:34 +0300 Message-ID: <20240904153038.23054-5-michaelgur@nvidia.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com> References: <20240904153038.23054-1-michaelgur@nvidia.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PEPF00001CDC:EE_|CH3PR12MB9341:EE_ X-MS-Office365-Filtering-Correlation-Id: 3ff6c53d-1dff-441b-98b5-08dcccf69897 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|36860700013|82310400026|376014; X-Microsoft-Antispam-Message-Info: UPnBycswoFVuNfFSoXLipqtJ1xeqhxC0DQ0fguqmnWA+COB1HCB4Tsut9/A69YBjMhN2HL7nga1vOR9M4ml6S76ub+zyQUh1zbt0tqeVOTygpP035GOlKVciiRADuFG7nAf0BF/TQvIO+mrsLGAEqveVIx6GL/x7vSyDcfGSbHzFxxP2TbKEkektO6n/AuJNS+iMBuKJ2HKRNnGrACGF5HQkzcZcemnpiUtTWUhO7ff66u/SgGYWl2lpS0qYgv6qXB+W4hbDrvK3UVFLVcKGwBA+ocO7R06vNrQHefawpeA05++hKi5oDXOIQWg92brkGZBEyN4/WujWHMi05+KTnQpglL2/80rDvkFTG9qjuAb/9SnTFrYS1w/RRZhMQs+slL9fJXalFWLQEwL7r+NkFdGhvohrmoAERW77W4251ED+IX+29K1k0Rr4SMdbbxjSoUZw5CCdb7hbB4V1/mZnnvL+gcaje4Wph3HPFwXh5owVaYV41JG8rek+WokRU22Af30ZtcMZrgCJ7SBSsIq6UZko/zLcUcE3sX2uAFGkVSMWEVg5Bw/hlcjxj9fQuVGg4gc19JtZzFQSmlleJgBRTAnBekSXcFLfB9wgCSe/RhfVFHoxNwl9z20ppOLYba5trA8nNb+6BWpEpJJjmcvm1hwdRCI5H470mBdHhcMZPdJHpnlNOtx9M+J3v444pvMu59anaL1PkAZk4vrUCGR08qG2ZPc03eWzMIOKUR9DoC3kfznuJQN2F44XT1Bk7VwknSaMf1gfsTYxEkLEbBrORfUaNUZwRJpQzxnMI/sM3OBxqwBPeO3givrRZZO4fRnJLyFg2zThRrNBhx6oKkp33uqtR3y6JaowMQbkOMujISBc1XLSu6Jdgix6ZC+ZoCct5TMDzDO2Rtf4GUgcU/+uFxsDnivcVB/MN7uReaVbOeFem0ZAsDO8HaUt/Vyp0J5WTALeMAuWA9EnO1hAERjf7xTsjq5VEccla5FZLASLu3s1UUNlUCppD+eRplb+jxuccZQ2hHrSjq3J3gseykuF6kfknwvTIZVHpPzxo95tu8WTKUzo5wIgB9m5ADfrcSRpnKAJtmUR3OK4G/Jh6wRd6vDEL33QkhGoISXIiTW/eODbQcQPB0E/7w3p2piU2KcYBkhI8khSsnobTav/LgS/
UCsIkCclUd3vLTpp7FEH+hx4E84tTXHCGZUjgCfHyUyezAWO4oeZ9upXmEG0hV1iUMi//jhD7cPjxokidilZjokwVsTWk4sNKFsyssJkx0IvIePtZHUKsrmk9c1lq9eI2YbG6iI9pMQ/exG4wKQqf6So8lJvZkCJyKiHeGz3Rf2cSD/Nk/UY3a/5Sp17UalRWQLJvnOLFx6HenBHwwSu9/k/4/HQm5pdo+udMAm1CkbpRFJ7k+Kch4GX1epp2ud3VYM/xrAUkFYq5+qbvHOLjmk= X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230040)(1800799024)(36860700013)(82310400026)(376014);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2024 15:31:07.3235 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 3ff6c53d-1dff-441b-98b5-08dcccf69897 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ1PEPF00001CDC.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH3PR12MB9341 The new memory scheme page faults request that the driver fetch additional pages around the faulted memory access. This is done in order to prefetch pages before and after the area that got the page fault, assuming this will reduce the total number of page faults. The driver should ensure it handles only the pages that are within the umem range.
Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index f01026d507a3..20ad2616bed0 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -748,24 +748,31 @@ static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt, * >0: Number of pages mapped */ static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt, - u32 *bytes_mapped, u32 flags) + u32 *bytes_mapped, u32 flags, bool permissive_fault) { struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem); - if (unlikely(io_virt < mr->ibmr.iova)) + if (unlikely(io_virt < mr->ibmr.iova) && !permissive_fault) return -EFAULT; if (mr->umem->is_dmabuf) return pagefault_dmabuf_mr(mr, bcnt, bytes_mapped, flags); if (!odp->is_implicit_odp) { + u64 offset = io_virt < mr->ibmr.iova ? 0 : io_virt - mr->ibmr.iova; u64 user_va; - if (check_add_overflow(io_virt - mr->ibmr.iova, - (u64)odp->umem.address, &user_va)) + if (check_add_overflow(offset, (u64)odp->umem.address, + &user_va)) return -EFAULT; - if (unlikely(user_va >= ib_umem_end(odp) || - ib_umem_end(odp) - user_va < bcnt)) + + if (permissive_fault) { + if (user_va < ib_umem_start(odp)) + user_va = ib_umem_start(odp); + if ((user_va + bcnt) > ib_umem_end(odp)) + bcnt = ib_umem_end(odp) - user_va; + } else if (unlikely(user_va >= ib_umem_end(odp) || + ib_umem_end(odp) - user_va < bcnt)) return -EFAULT; return pagefault_real_mr(mr, odp, user_va, bcnt, bytes_mapped, flags); @@ -872,7 +879,7 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, case MLX5_MKEY_MR: mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); - ret = pagefault_mr(mr, io_virt, bcnt, bytes_mapped, 0); + ret = pagefault_mr(mr, io_virt, bcnt, bytes_mapped, 0, false); if (ret < 0) goto end; @@ -1727,7 +1734,7 @@ static void 
mlx5_ib_prefetch_mr_work(struct work_struct *w) for (i = 0; i < work->num_sge; ++i) { ret = pagefault_mr(work->frags[i].mr, work->frags[i].io_virt, work->frags[i].length, &bytes_mapped, - work->pf_flags); + work->pf_flags, false); if (ret <= 0) continue; mlx5_update_odp_stats(work->frags[i].mr, prefetch, ret); @@ -1778,7 +1785,7 @@ static int mlx5_ib_prefetch_sg_list(struct ib_pd *pd, if (IS_ERR(mr)) return PTR_ERR(mr); ret = pagefault_mr(mr, sg_list[i].addr, sg_list[i].length, - &bytes_mapped, pf_flags); + &bytes_mapped, pf_flags, false); if (ret < 0) { mlx5r_deref_odp_mkey(&mr->mmkey); return ret; From patchwork Wed Sep 4 15:30:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Guralnik X-Patchwork-Id: 13791134 Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2064.outbound.protection.outlook.com [40.107.244.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 960BC2C95 for ; Wed, 4 Sep 2024 15:31:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.64 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463875; cv=fail; b=XwMYtjK8LG360IDFevNHxVvEaGHN60/JMOQptALqML+jRogrNu3GFSe2iLQ7RcmjqG17iYwiYkgHASKaDthL6F9Sk/6/qSj7vD5aI/Xxr3UT33SBYklQr5ZKjYMMAjvr2bAV/69sOdr02I2KEqq6NnGF2B5jmHKZuBPgdO9yibo= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463875; c=relaxed/simple; bh=XfGdmeJwrAdNQXtVJ/wcJICIKVlJzEmIZ4IS2AjexXE=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=kNTv3hG13S4qtbu1iEoKfYnO/HNEXViovEW+2Dd/tCRS73yEcJcOLjslFQAcFJwNFy+DcXwfpKzrHHp2WILjB1rX3KyTXYCGl3WS33DdBOuUlUbc76KBE1Erhtloi07boiN6CkPcssjvLDEGEfOLhvC2P4WviIRCvEy+3u0BJIc= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass 
(p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=cRp4W00s; arc=fail smtp.client-ip=40.107.244.64 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="cRp4W00s" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=tSb1yZDKt4ny4sSm1pfb1JZugnInpiagOgIHF95JAAoueYh4/bpejHQ1wK0Hdj44bbV2HPrpGbx41W2ImyScAc98TefEvbv/MPm0c43cSv2mDE1v1DBG8g6Qc3YCLkfOAPR4c5GlEPP4fetX/4cVWILtYvlCYBs+/hr2iunEnBwPQu+TeweTP+EJykNhAzrjkS2rRnoQ4MSBr1K51CLVbKDeNhSFSP/PNF6x2RtkZXJt94bpSt9R+uM5o6ZJgo/SmCrbT90EULKWOzvxlFQDnq7dr29/PoK87ANZ+oS1Buq6GNDJBVpMrofGNVRbGhlTYqqJjFM+1NLfBL1DLSDt6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=G5HqaHeN2EHybHDUbaqlvfUmjZvJMB5U5BsFxBHNYks=; b=iJWAP4+W7dkAMcxlObpQzzsP1rtLQutk9xgP6SeBWY4Rg56DpWaNe/a0a9Jb5vald7I2EA9jKY3ZSsFNsg2cFyyqqi4xWA9sd7KrE0MjsGJv/sYi54ZYuEWycDt8MJIDZpKTbJfzYf/7sLVcpiki4h/i5j4CG8QPip1DajkwEkwBX0DgrEt89n+Dp9C6rsd7FoJlLV+HxAqYPkkwtd9CPc9S8p4CR2/dRNZv+CTEKJsS5DhnXxt9jmaNhNoXL5zwUbY9Bwmnd0GsVutTw4vJqMPIE6yQwCriPbNzZuU6G0zDc793ruKH7lnOe20kGiAfs+Zy+/eosZ/t8IaCqJWrJg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=G5HqaHeN2EHybHDUbaqlvfUmjZvJMB5U5BsFxBHNYks=; b=cRp4W00sW3wEA5L8Y7RMbjoaA2d6kfVr+oLn6k2GHTXIkUjQhBv9e+VmgJ/zQqQwIf7VMJmHq8dhRxsKdkWTUJGFWY7/+P2s+4NzNy8u6ZNlpFJEsEEUIwPgpOnE1+w3gQxGLUXgmo+StB/6CJhf61X+54/3hUaaIXanIwM+hq1dgrMPnTax416rPIYbPmjsVd/GigHInWiHFhzpHeVvBMxz7e8s8hq4fwVINnEQ6nyzX3xAjAtQnuWTqzGjAqJLT7v51uOSqcxYlv1nRVRHzhBvySCXxPt95GKZHM58tP88NcAEaIt7BWjcpzgGu3qAgPSB8PUI2BthLay49vmwHg== Received: from SJ0PR03CA0073.namprd03.prod.outlook.com (2603:10b6:a03:331::18) by SN7PR12MB6743.namprd12.prod.outlook.com (2603:10b6:806:26d::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.24; Wed, 4 Sep 2024 15:31:08 +0000 Received: from SJ1PEPF00001CDC.namprd05.prod.outlook.com (2603:10b6:a03:331:cafe::c0) by SJ0PR03CA0073.outlook.office365.com (2603:10b6:a03:331::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.27 via Frontend Transport; Wed, 4 Sep 2024 15:31:08 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ1PEPF00001CDC.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.13 via Frontend Transport; Wed, 4 Sep 2024 15:31:08 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:54 -0700 Received: from 
drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:53 -0700 Received: from vdi.nvidia.com (10.127.8.13) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Wed, 4 Sep 2024 08:30:52 -0700 From: Michael Guralnik To: , CC: , , , Michael Guralnik Subject: [PATCH rdma-next 5/8] RDMA/mlx5: Split ODP mkey search logic Date: Wed, 4 Sep 2024 18:30:35 +0300 Message-ID: <20240904153038.23054-6-michaelgur@nvidia.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com> References: <20240904153038.23054-1-michaelgur@nvidia.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PEPF00001CDC:EE_|SN7PR12MB6743:EE_ X-MS-Office365-Filtering-Correlation-Id: 0dbccf17-48ae-4968-2ddc-08dcccf6994f X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|36860700013|376014|82310400026; X-Microsoft-Antispam-Message-Info: 
vtkjw8imDrpH21WdLH84Eo7NL1kKvARZ+MkINW+9YrgkvzkieMIXVeUEk7pHnatGIhNFCsLjiVxRMYclrSgvS2mE/jbnY4FoByHLitEYpCzsq457fgeZ1HctCvYR+7iiPLPekUD+WsYqSpaMI7gjws6ghMjcRYIIzqmZ6YwUrv8QUOczkjjhmrCejqWCsFSqrG0YpUB3aX20F0ATs3Xlqlpb8Y9NFgSKt/IuM1nkPQnFVUYDFKBo9cDpGqsWPZtZaF5N05y33BcO/g1NHtPoX2dAkD39yRgsxzHHMWGVtz4K0jqYQJaqvOAbfTbEl5kwtP5tni18NPPO2GQHsOb7sbEOuaL745rPel9uS9PrskKv9qpw1v5PN0GbtkfFEMhNXi0EuemlHIZLQGn65i7nI2sKxBDGW6oLki7L4hkGRScwXH17xYL85mObY9q0naX4MFHOUxC6bKGpTQqLyc2rFiKohGnEX7pPY/fmVseSZ9vkOB7obmhdw1Dn7Y1si3sptd3Rj7+N4ZK1VvhmQnKYY6pDCLIALUsb8znS5Jb7jHNi6TcfQduI80FuyT6UWdzCvBTgquVQ8xPJJg6ahp33cTyEi0pFWWiCy3ykMta9qUpXcMSNihOhnAWF3pXt7BjbICDPzSCUlRwvLEyrYgwa0APnX2PJpsQsxD/ikxiID0LtFlapD+Bi/uxgLx7Wysv12j7SDIHv8BdKR6Pvb44HzebrgeJ2IO0f/S8dW1O1EIF03ix7EOeQPLsc0QVHCNK+QxaDKRvI5XJesOjHo9hhoYciSz7I2VrBaVB8jYlE3IyYirupT1lXwmXwV+NS04FcGesg8S5ZbECHsiKbj7h/WJyhT7iU4+/64RWJo144bhUOpexe7vQJnE/xiNNYt5wMt9syRzg0NYXsoQOMXY0evrN8TboV/uTJcuH1KUNdiNjs1GCNzlEAvlbnC2j+qQ87mS0eO4jcQldvm+BkPukFSQc6CrMtE4xBKwATwPHsoe4hfXWneB9ykYJfTBZKD1v1AwydTycqMRO6q0jhvpDmD9SSKEAU/WQQi7Pcqdkv7055iQDDM89zcv2DS5QJVgWWWAnmjud3hA6lVbqDwlMrOGJ0bh5JIhPAyE0fZDUGDWqFBhDNfhwt+aURIpKNVsoJ2iAu1xM8aEWGDo42GDVlztKerVpOlGvMiOB4nTd1MP/ZZMIh0qPdWzPw7IihvfPb8+qEFiCwKzZ9IoxZ3Tb7Ny8QhgwiyJKPnVVNFrJ/qwHRWE8HVs+DOWKWpAu+rBtaE/hhjC3HU1Ri8H6Uoz8gFDhUxgmWqaeU8hdXHCwvc+mfkAaECUdtBdKCknyDfJ6PlfAjpwklTt1urxrP6yCwXZ6wS+/i5X0ErKAMJsnfLXFwu3CzgMZTT0Q2ljKlFrhkAeuhpwZ/dDF1epTkUnPkReimgYdUuxRrqqJ4z4sKyie5wBeM18sd8+6cwbCOUx0R X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230040)(1800799024)(36860700013)(376014)(82310400026);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2024 15:31:08.5579 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 0dbccf17-48ae-4968-2ddc-08dcccf6994f X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ1PEPF00001CDC.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN7PR12MB6743 Split the search for the ODP mkey when handling an rdma type page fault to a helper function, later to be used in other page fault types. Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 65 +++++++++++++++++++------------- 1 file changed, 39 insertions(+), 26 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 20ad2616bed0..05b92f4cac0e 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -819,6 +819,27 @@ static bool mkey_is_eq(struct mlx5_ib_mkey *mmkey, u32 key) return mmkey->key == key; } +static struct mlx5_ib_mkey *find_odp_mkey(struct mlx5_ib_dev *dev, u32 key) +{ + struct mlx5_ib_mkey *mmkey; + + xa_lock(&dev->odp_mkeys); + mmkey = xa_load(&dev->odp_mkeys, mlx5_base_mkey(key)); + if (!mmkey) { + mmkey = ERR_PTR(-ENOENT); + goto out; + } + if (!mkey_is_eq(mmkey, key)) { + mmkey = ERR_PTR(-EFAULT); + goto out; + } + refcount_inc(&mmkey->usecount); +out: + xa_unlock(&dev->odp_mkeys); + + return mmkey; +} + /* * Handle a single data segment in a page-fault WQE or RDMA region. 
* @@ -846,32 +867,24 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, io_virt += *bytes_committed; bcnt -= *bytes_committed; - next_mr: - xa_lock(&dev->odp_mkeys); - mmkey = xa_load(&dev->odp_mkeys, mlx5_base_mkey(key)); - if (!mmkey) { - xa_unlock(&dev->odp_mkeys); - mlx5_ib_dbg( - dev, - "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n", - key); - if (bytes_mapped) - *bytes_mapped += bcnt; - /* - * The user could specify a SGL with multiple lkeys and only - * some of them are ODP. Treat the non-ODP ones as fully - * faulted. - */ - ret = 0; - goto end; - } - refcount_inc(&mmkey->usecount); - xa_unlock(&dev->odp_mkeys); - - if (!mkey_is_eq(mmkey, key)) { - mlx5_ib_dbg(dev, "failed to find mkey %x\n", key); - ret = -EFAULT; + mmkey = find_odp_mkey(dev, key); + if (IS_ERR(mmkey)) { + ret = PTR_ERR(mmkey); + if (ret == -ENOENT) { + mlx5_ib_dbg( + dev, + "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n", + key); + if (bytes_mapped) + *bytes_mapped += bcnt; + /* + * The user could specify a SGL with multiple lkeys and + * only some of them are ODP. Treat the non-ODP ones as + * fully faulted. 
+ */ + ret = 0; + } goto end; } @@ -966,7 +979,7 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, } end: - if (mmkey) + if (!IS_ERR(mmkey)) mlx5r_deref_odp_mkey(mmkey); while (head) { frame = head; From patchwork Wed Sep 4 15:30:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Guralnik X-Patchwork-Id: 13791136 Received: from NAM04-BN8-obe.outbound.protection.outlook.com (mail-bn8nam04on2059.outbound.protection.outlook.com [40.107.100.59]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A3591DEFC0 for ; Wed, 4 Sep 2024 15:31:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.100.59 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463880; cv=fail; b=fbiGQWNnPmrfYoRhb0p34hxVEZvt3wY2v/kTaB+uv+HbX8OrCw5GPx/KXzBGiVH1jFDWULGnIJXFfwtpKXPTL1wkExXv+dy/hpIvcYGekap0+zAco+CD+GLmQq27FxtiHHUtZ0/XVWH/a0ouHVG7/UQNpYEMqfebEREokUZnoy4= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463880; c=relaxed/simple; bh=jPeiXjI37ANc39RDWrpbzn5xjmtlmGhKn/vQVj11IIk=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mPVnYej1mVFs9OUY3GmIaPClTBFScaJW25XLKK8gKFbleVdwPXUJFXruIstjQpyKeWEb/jBJEPKI+hL94YiZ+u9X5Cc75zgnzjv76/jJCo24uoso3w+cKpc/+xTuRhzXiP2Ja0VpdTPwpDuVrW1NBhpNaCLy8eYxaFRkhaVFYQs= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=fXjaHNEA; arc=fail smtp.client-ip=40.107.100.59 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="fXjaHNEA" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=XRwFWocbHtg3aJxJxYlUWwSfNgfucYIpFI/1jmPWj2aSeOOQ1Aann7b0pj+Su7HW1h5/s3ksLLp4a1ip44FeUR6GFNt7k8r0wzakHzfgRvYe2MtlZbybjTAS5zIYFDmIgkk/xYWnX5cj8tNUyASWu6fobsTbP1mbLgZMjzvFsJnOhB1Z8SIkTbNws0DG59iAnU7WAGT7f0QB3sGLZ0EpTa4jlbSTYO5B/3mAzwKGqI3LnxrFZ1spj04sEl8yyfy8TpYBz6BAqwckACAx00Q9mlVoi4XxRCaV0C/C+jtSeCaTJ+wpG4DcZaoBJtoNQ9Teabuf2n3MTc4yYofVPCxI1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=k+9e/SoPO2gjOMHZLXNmZsnaD9SS90hM6w7pGL1AZBQ=; b=wznFR3FlV0CsOplcMj98hTOb8GuhxEEcQXL2YPoVCmddvSiyYlf53VcVbF0d8ZWITasTlEHdjXWTXqTfocU38mQeVx3gMWf5YxoHPwsp6+qe72X890cb5KjFXZqTo/JKxD+BlJognE5qkTyGzZt7nLgqr2cwYCpr+8jXDQtzRFTbWwR6Az00/DWqxBcFw0w+Qw4FZcgcqPXS8QQ2ucf1GTZ9I2DmLVx9DbKlZ7aJAQqAx/tzQH1X8OF79ie6tpWnt4EYjqRzbrvwak57mLhx9HJXj1NmU20YSA4mymv4AIeKDz+WmIsES1xczBIE7UEPc5e/dvWHqXd0lekyxHfbrg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=k+9e/SoPO2gjOMHZLXNmZsnaD9SS90hM6w7pGL1AZBQ=; 
b=fXjaHNEAs2wQt+BtkuZqkPeXxYVWGI6pwSPMsjw+HJtE/1c6D1XX/xVOeRlel93G8+M2OV9f23JHPlFBXdTifMyJTsed4FoTBG9yDB+2EQDOMeWvEwUnk+hN1J+/h+65EabDXNjKnADm6RSbNzpeHUUChgSmXCtXzTJanid60tHJrbug0efBHq29PtUJdG8SpQbB1ZevFBllPGnE0dD4yD3XQNyPkG4mc9qFkdPzMnJRyXSKPN8W2o2YZYRQ2Gi0NvOWELlj2cZUxQtftHHz+5puRFEEkYtUEeZuVVikdNprpK4IdPPR+/paH+x67xuiVzWgqCuq34z86Ns4ORUxVw== Received: from SJ0PR05CA0097.namprd05.prod.outlook.com (2603:10b6:a03:334::12) by SN7PR12MB7201.namprd12.prod.outlook.com (2603:10b6:806:2a8::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.25; Wed, 4 Sep 2024 15:31:13 +0000 Received: from SJ1PEPF00001CDD.namprd05.prod.outlook.com (2603:10b6:a03:334:cafe::f1) by SJ0PR05CA0097.outlook.office365.com (2603:10b6:a03:334::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7939.12 via Frontend Transport; Wed, 4 Sep 2024 15:31:13 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ1PEPF00001CDD.mail.protection.outlook.com (10.167.242.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.13 via Frontend Transport; Wed, 4 Sep 2024 15:31:13 +0000 Received: from drhqmail201.nvidia.com (10.126.190.180) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:56 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail201.nvidia.com (10.126.190.180) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:55 -0700 Received: from vdi.nvidia.com (10.127.8.13) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Wed, 4 Sep 2024 08:30:54 -0700 From: Michael Guralnik To: , CC: , , , Michael Guralnik Subject: [PATCH rdma-next 6/8] RDMA/mlx5: Add handling for memory scheme page fault events Date: Wed, 4 Sep 2024 18:30:36 +0300 Message-ID: <20240904153038.23054-7-michaelgur@nvidia.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com> References: <20240904153038.23054-1-michaelgur@nvidia.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PEPF00001CDD:EE_|SN7PR12MB7201:EE_ X-MS-Office365-Filtering-Correlation-Id: 991f3db8-b1eb-4f5b-60ae-08dcccf69c3d X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|376014|82310400026|36860700013; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ1PEPF00001CDD.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN7PR12MB7201

The memory scheme page fault event is a new approach to handling page faults on mkeys that use the on-demand-paging feature. The major shift in this scheme is that the HW takes responsibility for parsing the faulted mkey, instead of the previous approach in which the driver would read and parse the WQEs and query the mkeys to reach the direct mkey that needs to be handled. Therefore, the event we get from FW in this scheme contains the direct mkey and the address we need to handle, and requires much less work from the driver.

Additionally, to optimize performance, the FW can generate the event for a memory area larger than the faulted memory operation requires, in order to 'prefetch' surrounding memory that will likely be used soon.

Unlike previous types of page fault, the memory scheme page fault does not always require a resume command after the page fault is handled, as the FW can post multiple events on the same mkey and sets the 'last' flag only on the page fault that requires the resume command.
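As a rough illustration of the prefetch window and the 'last' flag described above, the following user-space sketch computes the widened range and checks whether a resume command is needed. The struct mem_fault, struct fault_range and the helper names are hypothetical stand-ins invented for this example and are not the driver's types. The field widths are assumptions, and only the BIT(7) flag value and the range arithmetic mirror the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the memory part of a page fault event. */
struct mem_fault {
	uint64_t va;                          /* faulting virtual address */
	uint32_t fault_byte_count;            /* bytes actually demanded */
	uint32_t prefetch_before_byte_count;  /* optional bytes before va */
	uint32_t prefetch_after_byte_count;   /* optional bytes after the demand */
	uint8_t flags;
};

/* Same bit position as MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST (BIT(7)) in the patch. */
#define MEM_FAULT_FLAG_LAST (1u << 7)

/* Hypothetical helper type: a start address plus a length in bytes. */
struct fault_range {
	uint64_t va;
	size_t len;
};

/* Widened range: prefetch-before + demanded bytes + prefetch-after. */
static inline struct fault_range prefetch_range(const struct mem_fault *f)
{
	struct fault_range r = {
		.va = f->va - f->prefetch_before_byte_count,
		.len = (size_t)f->prefetch_before_byte_count +
		       f->fault_byte_count + f->prefetch_after_byte_count,
	};
	return r;
}

/* Demand-only range, usable as a fallback if faulting the wide range fails. */
static inline struct fault_range demand_range(const struct mem_fault *f)
{
	struct fault_range r = { .va = f->va, .len = f->fault_byte_count };
	return r;
}

/* A resume command is only needed for the event carrying the 'last' flag. */
static inline int needs_resume(const struct mem_fault *f)
{
	return (f->flags & MEM_FAULT_FLAG_LAST) != 0;
}
```

A handler built on these helpers would first try prefetch_range(), fall back to demand_range() on failure, and issue the resume command only when needs_resume() is true or an error has to be reported.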
Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 120 +++++++++++++++++++++++++++++-- 1 file changed, 114 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 05b92f4cac0e..841725557f2a 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -401,12 +401,24 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev, MLX5_SET(page_fault_resume_in, in, opcode, MLX5_CMD_OP_PAGE_FAULT_RESUME); - info = MLX5_ADDR_OF(page_fault_resume_in, in, - page_fault_info.trans_page_fault_info); - MLX5_SET(trans_page_fault_info, info, page_fault_type, pfault->type); - MLX5_SET(trans_page_fault_info, info, fault_token, pfault->token); - MLX5_SET(trans_page_fault_info, info, wq_number, wq_num); - MLX5_SET(trans_page_fault_info, info, error, !!error); + if (pfault->event_subtype == MLX5_PFAULT_SUBTYPE_MEMORY) { + info = MLX5_ADDR_OF(page_fault_resume_in, in, + page_fault_info.mem_page_fault_info); + MLX5_SET(mem_page_fault_info, info, fault_token_31_0, + pfault->token & 0xffffffff); + MLX5_SET(mem_page_fault_info, info, fault_token_47_32, + (pfault->token >> 32) & 0xffff); + MLX5_SET(mem_page_fault_info, info, error, !!error); + } else { + info = MLX5_ADDR_OF(page_fault_resume_in, in, + page_fault_info.trans_page_fault_info); + MLX5_SET(trans_page_fault_info, info, page_fault_type, + pfault->type); + MLX5_SET(trans_page_fault_info, info, fault_token, + pfault->token); + MLX5_SET(trans_page_fault_info, info, wq_number, wq_num); + MLX5_SET(trans_page_fault_info, info, error, !!error); + } err = mlx5_cmd_exec_in(dev->mdev, page_fault_resume, in); if (err) @@ -1388,6 +1400,63 @@ static void mlx5_ib_mr_rdma_pfault_handler(struct mlx5_ib_dev *dev, } } +#define MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST BIT(7) +static void mlx5_ib_mr_memory_pfault_handler(struct mlx5_ib_dev *dev, + struct mlx5_pagefault *pfault) +{ + u64 prefetch_va = + 
pfault->memory.va - pfault->memory.prefetch_before_byte_count; + size_t prefetch_size = pfault->memory.prefetch_before_byte_count + + pfault->memory.fault_byte_count + + pfault->memory.prefetch_after_byte_count; + struct mlx5_ib_mkey *mmkey; + struct mlx5_ib_mr *mr; + int ret = 0; + + mmkey = find_odp_mkey(dev, pfault->memory.mkey); + if (IS_ERR(mmkey)) + goto err; + + mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + + /* If prefetch fails, handle only demanded page fault */ + ret = pagefault_mr(mr, prefetch_va, prefetch_size, NULL, 0, true); + if (ret < 0) { + ret = pagefault_mr(mr, pfault->memory.va, + pfault->memory.fault_byte_count, NULL, 0, + true); + if (ret < 0) + goto err; + } + + mlx5_update_odp_stats(mr, faults, ret); + mlx5r_deref_odp_mkey(mmkey); + + if (pfault->memory.flags & MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST) + mlx5_ib_page_fault_resume(dev, pfault, 0); + + mlx5_ib_dbg( + dev, + "PAGE FAULT completed %s. token 0x%llx, mkey: 0x%x, va: 0x%llx, byte_count: 0x%x\n", + pfault->memory.flags & MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST ? + "" : + "without resume cmd", + pfault->token, pfault->memory.mkey, pfault->memory.va, + pfault->memory.fault_byte_count); + + return; + +err: + if (!IS_ERR(mmkey)) + mlx5r_deref_odp_mkey(mmkey); + mlx5_ib_page_fault_resume(dev, pfault, 1); + mlx5_ib_dbg( + dev, + "PAGE FAULT error. 
token 0x%llx, mkey: 0x%x, va: 0x%llx, byte_count: 0x%x, err: %d\n", + pfault->token, pfault->memory.mkey, pfault->memory.va, + pfault->memory.fault_byte_count, ret); +} + static void mlx5_ib_pfault(struct mlx5_ib_dev *dev, struct mlx5_pagefault *pfault) { u8 event_subtype = pfault->event_subtype; @@ -1399,6 +1468,9 @@ static void mlx5_ib_pfault(struct mlx5_ib_dev *dev, struct mlx5_pagefault *pfaul case MLX5_PFAULT_SUBTYPE_RDMA: mlx5_ib_mr_rdma_pfault_handler(dev, pfault); break; + case MLX5_PFAULT_SUBTYPE_MEMORY: + mlx5_ib_mr_memory_pfault_handler(dev, pfault); + break; default: mlx5_ib_err(dev, "Invalid page fault event subtype: 0x%x\n", event_subtype); @@ -1417,6 +1489,7 @@ static void mlx5_ib_eqe_pf_action(struct work_struct *work) mempool_free(pfault, eq->pool); } +#define MEMORY_SCHEME_PAGE_FAULT_GRANULARITY 4096 static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) { struct mlx5_eqe_page_fault *pf_eqe; @@ -1487,6 +1560,41 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) pfault->wqe.wqe_index); break; + case MLX5_PFAULT_SUBTYPE_MEMORY: + /* Memory based event */ + pfault->bytes_committed = 0; + pfault->token = + be32_to_cpu(pf_eqe->memory.token31_0) | + ((u64)be16_to_cpu(pf_eqe->memory.token47_32) + << 32); + pfault->memory.va = be64_to_cpu(pf_eqe->memory.va); + pfault->memory.mkey = be32_to_cpu(pf_eqe->memory.mkey); + pfault->memory.fault_byte_count = (be32_to_cpu( + pf_eqe->memory.demand_fault_pages) >> 12) * + MEMORY_SCHEME_PAGE_FAULT_GRANULARITY; + pfault->memory.prefetch_before_byte_count = + be16_to_cpu( + pf_eqe->memory.pre_demand_fault_pages) * + MEMORY_SCHEME_PAGE_FAULT_GRANULARITY; + pfault->memory.prefetch_after_byte_count = + be16_to_cpu( + pf_eqe->memory.post_demand_fault_pages) * + MEMORY_SCHEME_PAGE_FAULT_GRANULARITY; + pfault->memory.flags = pf_eqe->memory.flags; + mlx5_ib_dbg( + eq->dev, + "PAGE_FAULT: subtype: 0x%02x, token: 0x%06llx, mkey: 0x%06x, fault_byte_count: 0x%06x, va: 0x%016llx, flags: 0x%02x\n", + eqe->sub_type, 
pfault->token, + pfault->memory.mkey, + pfault->memory.fault_byte_count, + pfault->memory.va, pfault->memory.flags); + mlx5_ib_dbg( + eq->dev, + "PAGE_FAULT: prefetch size: before: 0x%06x, after 0x%06x\n", + pfault->memory.prefetch_before_byte_count, + pfault->memory.prefetch_after_byte_count); + break; + default: mlx5_ib_warn(eq->dev, "Unsupported page fault event sub-type: 0x%02hhx\n", From patchwork Wed Sep 4 15:30:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Guralnik X-Patchwork-Id: 13791137 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2057.outbound.protection.outlook.com [40.107.243.57]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 919B51DFE05 for ; Wed, 4 Sep 2024 15:31:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.57 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463885; cv=fail; b=mIVw8tcvFnTIFWuVFLFjK27eTjMGjMEtrhsr5uqI8rY+q8hD1fxkIz1iHocXSYOhVD/WD8yR7HcVAyRJKCZztTt5eBWkr5e3ydfI3F7n94A0X1jOs0uyU4BJLholcl5QhOMrflT5UltuxsuWQKYiEBqFgBq+t1HX+9wDkmFkLB8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725463885; c=relaxed/simple; bh=l8iaeG1WhCbfvR3zZonOKfUTavfdbYsLCzJvn5eoc4s=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=F/DlpBNb8BTUIPgTWOZvK5pl5qpAHgftqY2USYW7biTprGO9tTBls48Q1rEgvns8/ezoYT2Cap+o59l8PQvTxn4Zk7Jg6BUEfVRWSVoa84P4CzanhmoEa7B8Izgc/Vvvy9GkgTHhtIy7feH+fsh3GbhcPtwPysOku70p9YjTqEY= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=NtaMtliC; arc=fail smtp.client-ip=40.107.243.57 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="NtaMtliC" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=jVrl2z9avk0FI3XKrjP6IN0FKd/jfGqvOdYkjVZqT8AKk9kkTyTBb2KE1e3LLRqGk3YYJRpsJHJdJQ4jHxiTlWEuec+N4j8L5GjOCRzcLtd/hA2qsy7NQQcYrCAPuoxXIjKnwwAUgBoHI/0kNOOD2T4eukQIp+CBa4AcHvpd9lgTc7jrVNLLyEEu/P8wbuhUkp3v8J5YmnDE/YLnwCSWrxnu1GdEOTuOsusbf7fqjKY77WLA1m3kdyz6fCN03FiQZbxgsP7rz7+mjH2j9NvpoOELRUIpsVfa+ShPzAPwNea9BEno7YFzaTEkel6QaPAfDuI1KZHQ/9UAE61/IpdLxA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=mgCDuzsAuTAc3ap6+H2PXvjqsgR07GVCqCY0pgV2TVA=; b=QvZHY3B/dzUukwFFahCPGYAulA/fYEZbjdZRTEN+4pVsd/i/732QroG0S57udWgh6B1XaT0IzwMS55Oeknw7OwECag0HSby/f8X9fqn5RRS+aQqyM8dr96Ii0NNSxThTzCwRLWC3HP3ecgg5WUHeZrk84OV7fC7t6iXbqFWcH6H3oTC56Tsz2/JIGaiAHBwzdgD5MjqHQmvGsewmmg/2BFiLht885mSRh5mGXwZuEzpb6e+7o2S7Ondi48Pozw6IX3InBfmIU6psPsiizsMT/ThgXnCSoH0CLMdYLXQpey7KNjwxSoDiWCL7Xj5B9D+KiBUAQgI5WaiLbCw//shUhg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=mgCDuzsAuTAc3ap6+H2PXvjqsgR07GVCqCY0pgV2TVA=; 
b=NtaMtliCs94L/xaxCJQZ6vAvcWTPPN5khZOXaga3zQ0j5NWDCwXMRDblEph7zuMlRvVAMujBDFL7pKDIt9RyEzy52/NBMNEMOgdZH6wB6OtAZ3qj96uLflqH6XRNSPrdeKEGCgAtQYF4ThCkJuNtWQqjMwXnQlcPVefdpzvrT5oDBlI1uHcy7EWPTbYmQnFV3PdN+BSJL3tOni+g4oJbY2xcMndxzvapOkrG0hmnF/2MT547lgGW8gcqHH/rLtGBN96jq7l/uzWxrCg4N9oUonOHtb6O2IHGDpTxAx0+qsQxn5Ss9IjjhoL4dgD5NUzcFmlPJZvxLd8Vb5DjJujXPQ== Received: from SJ0PR03CA0075.namprd03.prod.outlook.com (2603:10b6:a03:331::20) by DS0PR12MB6559.namprd12.prod.outlook.com (2603:10b6:8:d1::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.27; Wed, 4 Sep 2024 15:31:15 +0000 Received: from SJ1PEPF00001CDC.namprd05.prod.outlook.com (2603:10b6:a03:331:cafe::ae) by SJ0PR03CA0075.outlook.office365.com (2603:10b6:a03:331::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.26 via Frontend Transport; Wed, 4 Sep 2024 15:31:15 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ1PEPF00001CDC.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7918.13 via Frontend Transport; Wed, 4 Sep 2024 15:31:15 +0000 Received: from drhqmail203.nvidia.com (10.126.190.182) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:58 -0700 Received: from drhqmail203.nvidia.com (10.126.190.182) by drhqmail203.nvidia.com (10.126.190.182) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.4; Wed, 4 Sep 2024 08:30:57 -0700 Received: from vdi.nvidia.com (10.127.8.13) by mail.nvidia.com (10.126.190.182) with Microsoft SMTP Server id 15.2.1544.4 via Frontend Transport; Wed, 4 Sep 2024 08:30:56 -0700 From: Michael Guralnik To: , CC: , , , Michael Guralnik Subject: [PATCH rdma-next 7/8] RDMA/mlx5: Add implicit MR handling to ODP memory scheme Date: Wed, 4 Sep 2024 18:30:37 +0300 Message-ID: <20240904153038.23054-8-michaelgur@nvidia.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com> References: <20240904153038.23054-1-michaelgur@nvidia.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ1PEPF00001CDC:EE_|DS0PR12MB6559:EE_ X-MS-Office365-Filtering-Correlation-Id: 489cea98-a527-43db-7426-08dcccf69d63 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|376014|82310400026|36860700013; X-Microsoft-Antispam-Message-Info: 
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ1PEPF00001CDC.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DS0PR12MB6559 Implicit MRs in ODP memory scheme require allocating a private null mkey and assigning the mkey and va differently in the KSM mkey. The page faults are received on the null mkey so we also add storing the null mkey in the odp_mkey xarray. Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 + drivers/infiniband/hw/mlx5/odp.c | 116 +++++++++++++++++++++++++-- 2 files changed, 111 insertions(+), 8 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 89c2ab728577..b973568d406a 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -643,6 +643,8 @@ enum mlx5_mkey_type { MLX5_MKEY_MR = 1, MLX5_MKEY_MW, MLX5_MKEY_INDIRECT_DEVX, + MLX5_MKEY_NULL, + MLX5_MKEY_IMPLICIT_CHILD, }; struct mlx5r_cache_rb_key { @@ -728,6 +730,7 @@ struct mlx5_ib_mr { struct mlx5_ib_mr *dd_crossed_mr; struct list_head dd_node; u8 revoked :1; + struct mlx5_ib_mkey null_mmkey; }; }; }; diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 841725557f2a..4b37446758fd 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -107,13 +107,20 @@ static u64 mlx5_imr_ksm_entries; static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, struct mlx5_ib_mr *imr, int flags) { + struct mlx5_core_dev *dev = mr_to_mdev(imr)->mdev; struct mlx5_klm *end = pklm + nentries; + int step = MLX5_CAP_ODP(dev, mem_page_fault) ? MLX5_IMR_MTT_SIZE : 0; + __be32 key = MLX5_CAP_ODP(dev, mem_page_fault) ? 
+ cpu_to_be32(imr->null_mmkey.key) : + mr_to_mdev(imr)->mkeys.null_mkey; + u64 va = + MLX5_CAP_ODP(dev, mem_page_fault) ? idx * MLX5_IMR_MTT_SIZE : 0; if (flags & MLX5_IB_UPD_XLT_ZAP) { - for (; pklm != end; pklm++, idx++) { + for (; pklm != end; pklm++, idx++, va += step) { pklm->bcount = cpu_to_be32(MLX5_IMR_MTT_SIZE); - pklm->key = mr_to_mdev(imr)->mkeys.null_mkey; - pklm->va = 0; + pklm->key = key; + pklm->va = cpu_to_be64(va); } return; } @@ -137,7 +144,7 @@ static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, */ lockdep_assert_held(&to_ib_umem_odp(imr->umem)->umem_mutex); - for (; pklm != end; pklm++, idx++) { + for (; pklm != end; pklm++, idx++, va += step) { struct mlx5_ib_mr *mtt = xa_load(&imr->implicit_children, idx); pklm->bcount = cpu_to_be32(MLX5_IMR_MTT_SIZE); @@ -145,8 +152,8 @@ static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, pklm->key = cpu_to_be32(mtt->ibmr.lkey); pklm->va = cpu_to_be64(idx * MLX5_IMR_MTT_SIZE); } else { - pklm->key = mr_to_mdev(imr)->mkeys.null_mkey; - pklm->va = 0; + pklm->key = key; + pklm->va = cpu_to_be64(va); } } } @@ -225,6 +232,9 @@ static void destroy_unused_implicit_child_mr(struct mlx5_ib_mr *mr) return; xa_erase(&imr->implicit_children, idx); + if (MLX5_CAP_ODP(mr_to_mdev(mr)->mdev, mem_page_fault)) + xa_erase(&mr_to_mdev(mr)->odp_mkeys, + mlx5_base_mkey(mr->mmkey.key)); /* Freeing a MR is a sleeping operation, so bounce to a work queue */ INIT_WORK(&mr->odp_destroy.work, free_implicit_child_mr_work); @@ -492,6 +502,16 @@ static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr, } xa_unlock(&imr->implicit_children); + if (MLX5_CAP_ODP(dev->mdev, mem_page_fault)) { + ret = xa_store(&dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key), + &mr->mmkey, GFP_KERNEL); + if (xa_is_err(ret)) { + ret = ERR_PTR(xa_err(ret)); + xa_erase(&imr->implicit_children, idx); + goto out_mr; + } + mr->mmkey.type = MLX5_MKEY_IMPLICIT_CHILD; + } mlx5_ib_dbg(mr_to_mdev(imr), "key %x 
mr %p\n", mr->mmkey.key, mr); return mr; @@ -502,6 +522,57 @@ static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr, return ret; } +/* + * When using memory scheme ODP, implicit MRs can't use the reserved null mkey + * and each implicit MR needs to assign a private null mkey to get the page + * faults on. + * The null mkey is created with the properties to enable getting the page + * fault for every time it is accessed and having all relevant access flags. + */ +static int alloc_implicit_mr_null_mkey(struct mlx5_ib_dev *dev, + struct mlx5_ib_mr *imr, + struct mlx5_ib_pd *pd) +{ + size_t inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + 64; + void *mkc; + u32 *in; + int err; + + in = kzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + MLX5_SET(create_mkey_in, in, translations_octword_actual_size, 4); + MLX5_SET(create_mkey_in, in, pg_access, 1); + + mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); + MLX5_SET(mkc, mkc, a, 1); + MLX5_SET(mkc, mkc, rw, 1); + MLX5_SET(mkc, mkc, rr, 1); + MLX5_SET(mkc, mkc, lw, 1); + MLX5_SET(mkc, mkc, lr, 1); + MLX5_SET(mkc, mkc, free, 0); + MLX5_SET(mkc, mkc, umr_en, 0); + MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); + + MLX5_SET(mkc, mkc, translations_octword_size, 4); + MLX5_SET(mkc, mkc, log_page_size, 61); + MLX5_SET(mkc, mkc, length64, 1); + MLX5_SET(mkc, mkc, pd, pd->pdn); + MLX5_SET64(mkc, mkc, start_addr, 0); + MLX5_SET(mkc, mkc, qpn, 0xffffff); + + err = mlx5_core_create_mkey(dev->mdev, &imr->null_mmkey.key, in, inlen); + if (err) + goto free_in; + + imr->null_mmkey.type = MLX5_MKEY_NULL; + +free_in: + kfree(in); + return err; +} + struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, int access_flags) { @@ -534,6 +605,16 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, imr->is_odp_implicit = true; xa_init(&imr->implicit_children); + if (MLX5_CAP_ODP(dev->mdev, mem_page_fault)) { + err = alloc_implicit_mr_null_mkey(dev, imr, pd); + if (err) 
+ goto out_mr; + + err = mlx5r_store_odp_mkey(dev, &imr->null_mmkey); + if (err) + goto out_mr; + } + err = mlx5r_umr_update_xlt(imr, 0, mlx5_imr_ksm_entries, MLX5_KSM_PAGE_SHIFT, @@ -568,6 +649,14 @@ void mlx5_ib_free_odp_mr(struct mlx5_ib_mr *mr) xa_erase(&mr->implicit_children, idx); mlx5_ib_dereg_mr(&mtt->ibmr, NULL); } + + if (mr->null_mmkey.key) { + xa_erase(&mr_to_mdev(mr)->odp_mkeys, + mlx5_base_mkey(mr->null_mmkey.key)); + + mlx5_core_destroy_mkey(mr_to_mdev(mr)->mdev, + mr->null_mmkey.key); + } } #define MLX5_PF_FLAGS_DOWNGRADE BIT(1) @@ -1410,14 +1499,25 @@ static void mlx5_ib_mr_memory_pfault_handler(struct mlx5_ib_dev *dev, pfault->memory.fault_byte_count + pfault->memory.prefetch_after_byte_count; struct mlx5_ib_mkey *mmkey; - struct mlx5_ib_mr *mr; + struct mlx5_ib_mr *mr, *child_mr; int ret = 0; mmkey = find_odp_mkey(dev, pfault->memory.mkey); if (IS_ERR(mmkey)) goto err; - mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + switch (mmkey->type) { + case MLX5_MKEY_IMPLICIT_CHILD: + child_mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + mr = child_mr->parent; + break; + case MLX5_MKEY_NULL: + mr = container_of(mmkey, struct mlx5_ib_mr, null_mmkey); + break; + default: + mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + break; + } /* If prefetch fails, handle only demanded page fault */ ret = pagefault_mr(mr, prefetch_va, prefetch_size, NULL, 0, true); From patchwork Wed Sep 4 15:30:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Guralnik X-Patchwork-Id: 13791135 Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2040.outbound.protection.outlook.com [40.107.220.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 524211DEFFE for ; Wed, 4 Sep 2024 15:31:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail 
smtp.client-ip=40.107.220.40 From: Michael Guralnik To: , CC: , , , Michael Guralnik Subject: [PATCH rdma-next 8/8] net/mlx5: Handle memory scheme ODP capabilities Date: Wed, 4 Sep 2024 18:30:38 +0300 Message-ID: <20240904153038.23054-9-michaelgur@nvidia.com> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20240904153038.23054-1-michaelgur@nvidia.com> References: <20240904153038.23054-1-michaelgur@nvidia.com> Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org MIME-Version: 1.0
When running over new FW that supports the memory scheme ODP, set the mem_page_fault capability to signal to the FW that the driver works in the new scheme. In the memory scheme ODP, the per_transport_service capabilities are read-only for the driver, so skip setting them. Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- .../net/ethernet/mellanox/mlx5/core/main.c | 22 +++++++++++++++---- include/linux/mlx5/device.h | 10 ++++++--- 2 files changed, 25 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index cc2aa46cff04..944c209e9569 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -454,8 +454,8 @@ static int handle_hca_cap_atomic(struct mlx5_core_dev *dev, void *set_ctx) static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx) { + bool do_set = false, mem_page_fault = false; void *set_hca_cap; - bool do_set = false; int err; if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) || @@ -470,6 +470,17 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx) memcpy(set_hca_cap, dev->caps.hca[MLX5_CAP_ODP]->cur, MLX5_ST_SZ_BYTES(odp_cap)); + /* For best performance, enable memory scheme ODP only when + * it has page prefetch enabled.
+ */ + if (MLX5_CAP_ODP_MAX(dev, mem_page_fault) && + MLX5_CAP_ODP_MAX(dev, memory_page_fault_scheme_cap.page_prefetch)) { + mem_page_fault = true; + do_set = true; + MLX5_SET(odp_cap, set_hca_cap, mem_page_fault, mem_page_fault); + goto set; + } + #define ODP_CAP_SET_MAX(dev, field) \ do { \ u32 _res = MLX5_CAP_ODP_MAX(dev, field); \ @@ -494,10 +505,13 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx) ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.read); ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.atomic); - if (!do_set) - return 0; +set: + if (do_set) + err = set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_ODP); - return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_ODP); + mlx5_core_dbg(dev, "Using ODP %s scheme\n", + mem_page_fault ? "memory" : "transport"); + return err; } static int max_uc_list_get_devlink_param(struct mlx5_core_dev *dev) diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 154095256d0d..57c9b18c3adb 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -1389,9 +1389,13 @@ enum mlx5_qcam_feature_groups { #define MLX5_CAP_ODP(mdev, cap)\ MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, cap) -#define MLX5_CAP_ODP_SCHEME(mdev, cap) \ - MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ - transport_page_fault_scheme_cap.cap) +#define MLX5_CAP_ODP_SCHEME(mdev, cap) \ + (MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ + mem_page_fault) ? \ + MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ + memory_page_fault_scheme_cap.cap) : \ + MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ + transport_page_fault_scheme_cap.cap)) #define MLX5_CAP_ODP_MAX(mdev, cap)\ MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->max, cap)
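Note for readers of this patch: the two decisions it makes can be sketched outside the kernel as plain C. This is only an illustration under stated assumptions, not driver code. The structs `odp_scheme_caps` and `odp_caps` and the helpers `want_memory_scheme()`, `active_scheme()` and `odp_scheme_selfcheck()` are hypothetical stand-ins for the firmware capability layout and the `MLX5_CAP_ODP*` accessors. The first helper mirrors the check added to handle_hca_cap_odp() (memory scheme only when both mem_page_fault and page_prefetch are available in the max caps); the second mirrors how the reworked MLX5_CAP_ODP_SCHEME() picks the capability block to read from the current caps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified capability layout (not the mlx5_ifc one). */
struct odp_scheme_caps {
	bool page_prefetch;
	unsigned int rc_odp_caps;
};

struct odp_caps {
	bool mem_page_fault;
	struct odp_scheme_caps memory_scheme;
	struct odp_scheme_caps transport_scheme;
};

/*
 * Mirrors the condition in handle_hca_cap_odp(): opt into the memory
 * scheme only if the device supports it AND supports page prefetch.
 */
static bool want_memory_scheme(const struct odp_caps *max)
{
	return max->mem_page_fault && max->memory_scheme.page_prefetch;
}

/*
 * Mirrors the reworked MLX5_CAP_ODP_SCHEME(): once mem_page_fault is set
 * in the current caps, per-scheme fields come from the memory scheme
 * block; otherwise they come from the transport scheme block.
 */
static const struct odp_scheme_caps *active_scheme(const struct odp_caps *cur)
{
	return cur->mem_page_fault ? &cur->memory_scheme
				   : &cur->transport_scheme;
}

/* Small usage example exercising both helpers. */
static void odp_scheme_selfcheck(void)
{
	struct odp_caps caps = {
		.mem_page_fault = true,
		.memory_scheme = { .page_prefetch = true, .rc_odp_caps = 3 },
		.transport_scheme = { .page_prefetch = false, .rc_odp_caps = 1 },
	};

	assert(want_memory_scheme(&caps));
	assert(active_scheme(&caps)->rc_odp_caps == 3);

	/* Without mem_page_fault the transport scheme block is used. */
	caps.mem_page_fault = false;
	assert(!want_memory_scheme(&caps));
	assert(active_scheme(&caps)->rc_odp_caps == 1);
}
```

Returning a pointer to the selected block, instead of a per-field macro, keeps the sketch short; the patch itself keeps the macro form so existing MLX5_CAP_ODP_SCHEME() callers need no change.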