From patchwork Mon Sep 9 10:04:57 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796569
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 1/8] net/mlx5: Expand mkey page size to support 6 bits
Date: Mon, 9 Sep 2024 13:04:57 +0300
Message-ID: <20240909100504.29797-2-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org

Expand the mkc log_page_size field to 6 bits and protect the usage of the
6th bit with the relevant capability, to ensure the new page sizes are used
only with FW that supports the bit extension.

Signed-off-by: Michael Guralnik
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 27 ++++++++++++++++-----------
 drivers/infiniband/hw/mlx5/mr.c      | 10 ++++------
 drivers/infiniband/hw/mlx5/odp.c     |  2 +-
 include/linux/mlx5/mlx5_ifc.h        |  7 ++++---
 4 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 926a965e4570..ea8eb368108f 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -63,17 +63,6 @@ __mlx5_log_page_size_to_bitmap(unsigned int log_pgsz_bits,
 	return GENMASK(largest_pg_shift, pgsz_shift);
 }

-/*
- * For mkc users, instead of a page_offset the command has a start_iova which
- * specifies both the page_offset and the on-the-wire IOVA
- */
-#define mlx5_umem_find_best_pgsz(umem, typ, log_pgsz_fld, pgsz_shift, iova) \
-	ib_umem_find_best_pgsz(umem,                                         \
-			       __mlx5_log_page_size_to_bitmap(               \
-				       __mlx5_bit_sz(typ, log_pgsz_fld),     \
-				       pgsz_shift),                          \
-			       iova)
-
 static __always_inline unsigned long
 __mlx5_page_offset_to_bitmask(unsigned int page_offset_bits,
 			      unsigned int offset_shift)
@@ -1725,4 +1714,20 @@ static inline u32 smi_to_native_portnum(struct mlx5_ib_dev *dev, u32 port)
 	return (port - 1) / dev->num_ports + 1;
 }

+/*
+ * For mkc users, instead of a page_offset the command has a start_iova which
+ * specifies both the page_offset and the on-the-wire IOVA
+ */
+static __always_inline unsigned long
+mlx5_umem_mkc_find_best_pgsz(struct mlx5_ib_dev *dev, struct ib_umem *umem,
+			     u64 iova)
+{
+	int page_size_bits =
+		MLX5_CAP_GEN_2(dev->mdev, umr_log_entity_size_5) ? 6 : 5;
+	unsigned long bitmap =
+		__mlx5_log_page_size_to_bitmap(page_size_bits, 0);
+
+	return ib_umem_find_best_pgsz(umem, bitmap, iova);
+}
+
 #endif /* MLX5_IB_H */
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 73962bd0b216..3d6a14ece6db 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1119,8 +1119,7 @@ static struct mlx5_ib_mr *alloc_cacheable_mr(struct ib_pd *pd,
 	if (umem->is_dmabuf)
 		page_size = mlx5_umem_dmabuf_default_pgsz(umem, iova);
 	else
-		page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size,
-						     0, iova);
+		page_size = mlx5_umem_mkc_find_best_pgsz(dev, umem, iova);
 	if (WARN_ON(!page_size))
 		return ERR_PTR(-EINVAL);

@@ -1425,8 +1424,8 @@ static struct ib_mr *create_real_mr(struct ib_pd *pd, struct ib_umem *umem,
 		mr = alloc_cacheable_mr(pd, umem, iova, access_flags,
 					MLX5_MKC_ACCESS_MODE_MTT);
 	} else {
-		unsigned int page_size = mlx5_umem_find_best_pgsz(
-			umem, mkc, log_page_size, 0, iova);
+		unsigned int page_size =
+			mlx5_umem_mkc_find_best_pgsz(dev, umem, iova);

 		mutex_lock(&dev->slow_path_mutex);
 		mr = reg_create(pd, umem, iova, access_flags, page_size,
@@ -1744,8 +1743,7 @@ static bool can_use_umr_rereg_pas(struct mlx5_ib_mr *mr,
 	if (!mlx5r_umr_can_load_pas(dev, new_umem->length))
 		return false;

-	*page_size =
-		mlx5_umem_find_best_pgsz(new_umem, mkc, log_page_size, 0, iova);
+	*page_size = mlx5_umem_mkc_find_best_pgsz(dev, new_umem, iova);
 	if (WARN_ON(!*page_size))
 		return false;
 	return (mr->mmkey.cache_ent->rb_key.ndescs) >=
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 44a3428ea342..221820874e7a 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -693,7 +693,7 @@ static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt,
 	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
 	u32 xlt_flags = 0;
 	int err;
-	unsigned int page_size;
+	unsigned long page_size;

 	if (flags & MLX5_PF_FLAGS_ENABLE)
 		xlt_flags |= MLX5_IB_UPD_XLT_ENABLE;
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 691a285f9c1e..1be2495362ee 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1995,7 +1995,9 @@ struct mlx5_ifc_cmd_hca_cap_2_bits {
 	u8 dp_ordering_force[0x1];
 	u8 reserved_at_89[0x9];
 	u8 query_vuid[0x1];
-	u8 reserved_at_93[0xd];
+	u8 reserved_at_93[0x5];
+	u8 umr_log_entity_size_5[0x1];
+	u8 reserved_at_99[0x7];
 	u8 max_reformat_insert_size[0x8];
 	u8 max_reformat_insert_offset[0x8];

@@ -4221,8 +4223,7 @@ struct mlx5_ifc_mkc_bits {

 	u8 reserved_at_1c0[0x19];
 	u8 relaxed_ordering_read[0x1];
-	u8 reserved_at_1d9[0x1];
-	u8 log_page_size[0x5];
+	u8 log_page_size[0x6];

 	u8 reserved_at_1e0[0x20];
 };
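
To make the effect of the extra bit concrete, the standalone sketch below (illustrative only, not part of the patch) approximates the page-size bitmap that __mlx5_log_page_size_to_bitmap() produces for a 5-bit versus a 6-bit log_page_size field; pgsz_bitmap(), GENMASK64() and NBITS are made-up userspace stand-ins for the kernel helpers.

#include <stdio.h>

/* Userspace stand-ins; the real driver uses GENMASK() and
 * __mlx5_log_page_size_to_bitmap(). */
#define NBITS 64
#define GENMASK64(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (NBITS - 1 - (h))))

/* Bitmap of MR page sizes encodable by a log_page_size field of
 * 'log_pgsz_bits' bits: bit N set means a 2^N-byte page size fits. */
static unsigned long long pgsz_bitmap(unsigned int log_pgsz_bits)
{
	unsigned int largest = (1U << log_pgsz_bits) - 1;

	if (largest > NBITS - 1)
		largest = NBITS - 1;
	return GENMASK64(largest, 0);
}

int main(void)
{
	printf("5-bit field: 0x%llx\n", pgsz_bitmap(5)); /* 0xffffffff */
	printf("6-bit field: 0x%llx\n", pgsz_bitmap(6)); /* all 64 bits set */
	return 0;
}

With 5 bits the largest encodable MR page size is 2^31 bytes; the extra bit gated by umr_log_entity_size_5 lets the mkey describe page sizes up to the full 64-bit range.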

From patchwork Mon Sep 9 10:04:58 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796570
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 2/8] net/mlx5: Expose HW bits for Memory scheme ODP
Date: Mon, 9 Sep 2024 13:04:58 +0300
Message-ID: <20240909100504.29797-3-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>

Expose the IFC bits that support the new memory scheme for on-demand paging.
Change the macro that reads the ODP capabilities so it can read from the new
IFC layout, and adjust the upper-layer code so it still compiles.

Signed-off-by: Michael Guralnik
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/odp.c              | 40 +++++++------
 .../net/ethernet/mellanox/mlx5/core/main.c    | 28 ++++-----
 include/linux/mlx5/device.h                   |  4 ++
 include/linux/mlx5/mlx5_ifc.h                 | 57 +++++++++++++++----
 4 files changed, 86 insertions(+), 43 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 221820874e7a..300504bf79d7 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -332,46 +332,46 @@ static void internal_fill_odp_caps(struct mlx5_ib_dev *dev)
 	else
 		dev->odp_max_size = BIT_ULL(MLX5_MAX_UMR_SHIFT + PAGE_SHIFT);

-	if (MLX5_CAP_ODP(dev->mdev, ud_odp_caps.send))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, ud_odp_caps.send))
 		caps->per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_SEND;

-	if (MLX5_CAP_ODP(dev->mdev, ud_odp_caps.srq_receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, ud_odp_caps.srq_receive))
 		caps->per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;

-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.send))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.send))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SEND;

-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.receive))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_RECV;

-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.write))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.write))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_WRITE;

-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.read))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.read))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_READ;

-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.atomic))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.atomic))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_ATOMIC;

-	if (MLX5_CAP_ODP(dev->mdev, rc_odp_caps.srq_receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, rc_odp_caps.srq_receive))
 		caps->per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;

-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.send))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.send))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_SEND;

-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.receive))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_RECV;

-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.write))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.write))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_WRITE;

-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.read))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.read))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_READ;

-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.atomic))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.atomic))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_ATOMIC;

-	if (MLX5_CAP_ODP(dev->mdev, xrc_odp_caps.srq_receive))
+	if (MLX5_CAP_ODP_SCHEME(dev->mdev, xrc_odp_caps.srq_receive))
 		caps->per_transport_caps.xrc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;

 	if (MLX5_CAP_GEN(dev->mdev, fixed_buffer_size) &&
@@ -388,13 +388,17 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev,
 	int wq_num = pfault->event_subtype == MLX5_PFAULT_SUBTYPE_WQE ?
 		     pfault->wqe.wq_num : pfault->token;
 	u32 in[MLX5_ST_SZ_DW(page_fault_resume_in)] = {};
+	void *info;
 	int err;

 	MLX5_SET(page_fault_resume_in, in, opcode, MLX5_CMD_OP_PAGE_FAULT_RESUME);
-	MLX5_SET(page_fault_resume_in, in, page_fault_type, pfault->type);
-	MLX5_SET(page_fault_resume_in, in, token, pfault->token);
-	MLX5_SET(page_fault_resume_in, in, wq_number, wq_num);
-	MLX5_SET(page_fault_resume_in, in, error, !!error);
+
+	info = MLX5_ADDR_OF(page_fault_resume_in, in,
+			    page_fault_info.trans_page_fault_info);
+	MLX5_SET(trans_page_fault_info, info, page_fault_type, pfault->type);
+	MLX5_SET(trans_page_fault_info, info, fault_token, pfault->token);
+	MLX5_SET(trans_page_fault_info, info, wq_number, wq_num);
+	MLX5_SET(trans_page_fault_info, info, error, !!error);

 	err = mlx5_cmd_exec_in(dev->mdev, page_fault_resume, in);
 	if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 5b7e6f4b5c7e..cc2aa46cff04 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -479,20 +479,20 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx)
 		}							\
 	} while (0)

-	ODP_CAP_SET_MAX(dev, ud_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, rc_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.send);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.receive);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.write);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.read);
-	ODP_CAP_SET_MAX(dev, xrc_odp_caps.atomic);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.srq_receive);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.send);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.receive);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.write);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.read);
-	ODP_CAP_SET_MAX(dev, dc_odp_caps.atomic);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.ud_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.rc_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.send);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.write);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.read);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.xrc_odp_caps.atomic);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.srq_receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.send);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.receive);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.write);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.read);
+	ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.atomic);

 	if (!do_set)
 		return 0;
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index ba875a619b97..bd081f276654 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -1369,6 +1369,10 @@ enum mlx5_qcam_feature_groups {
 #define MLX5_CAP_ODP(mdev, cap)\
 	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, cap)

+#define MLX5_CAP_ODP_SCHEME(mdev, cap) \
+	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \
+		 transport_page_fault_scheme_cap.cap)
+
 #define MLX5_CAP_ODP_MAX(mdev, cap)\
 	MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->max, cap)
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 1be2495362ee..fcccfc34e076 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1326,11 +1326,13 @@ struct mlx5_ifc_atomic_caps_bits {
 	u8 reserved_at_e0[0x720];
 };

-struct mlx5_ifc_odp_cap_bits {
+struct mlx5_ifc_odp_scheme_cap_bits {
 	u8 reserved_at_0[0x40];

 	u8 sig[0x1];
-	u8 reserved_at_41[0x1f];
+	u8 reserved_at_41[0x4];
+	u8 page_prefetch[0x1];
+	u8 reserved_at_46[0x1a];

 	u8 reserved_at_60[0x20];

@@ -1344,7 +1346,20 @@ struct mlx5_ifc_odp_cap_bits {

 	struct mlx5_ifc_odp_per_transport_service_cap_bits dc_odp_caps;

-	u8 reserved_at_120[0x6E0];
+	u8 reserved_at_120[0xe0];
+};
+
+struct mlx5_ifc_odp_cap_bits {
+	struct mlx5_ifc_odp_scheme_cap_bits transport_page_fault_scheme_cap;
+
+	struct mlx5_ifc_odp_scheme_cap_bits memory_page_fault_scheme_cap;
+
+	u8 reserved_at_400[0x200];
+
+	u8 mem_page_fault[0x1];
+	u8 reserved_at_601[0x1f];
+
+	u8 reserved_at_620[0x1e0];
 };

 struct mlx5_ifc_tls_cap_bits {
@@ -2041,7 +2056,8 @@ struct mlx5_ifc_cmd_hca_cap_2_bits {
 	u8 min_mkey_log_entity_size_fixed_buffer[0x5];
 	u8 ec_vf_vport_base[0x10];

-	u8 reserved_at_3a0[0x10];
+	u8 reserved_at_3a0[0xa];
+	u8 max_mkey_log_entity_size_mtt[0x6];

 	u8 max_rqt_vhca_id[0x10];

 	u8 reserved_at_3c0[0x20];
@@ -7270,6 +7286,30 @@ struct mlx5_ifc_qp_2err_in_bits {
 	u8 reserved_at_60[0x20];
 };

+struct mlx5_ifc_trans_page_fault_info_bits {
+	u8 error[0x1];
+	u8 reserved_at_1[0x4];
+	u8 page_fault_type[0x3];
+	u8 wq_number[0x18];
+
+	u8 reserved_at_20[0x8];
+	u8 fault_token[0x18];
+};
+
+struct mlx5_ifc_mem_page_fault_info_bits {
+	u8 error[0x1];
+	u8 reserved_at_1[0xf];
+	u8 fault_token_47_32[0x10];
+
+	u8 fault_token_31_0[0x20];
+};
+
+union mlx5_ifc_page_fault_resume_in_page_fault_info_auto_bits {
+	struct mlx5_ifc_trans_page_fault_info_bits trans_page_fault_info;
+	struct mlx5_ifc_mem_page_fault_info_bits mem_page_fault_info;
+	u8 reserved_at_0[0x40];
+};
+
 struct mlx5_ifc_page_fault_resume_out_bits {
 	u8 status[0x8];
 	u8 reserved_at_8[0x18];
@@ -7286,13 +7326,8 @@ struct mlx5_ifc_page_fault_resume_in_bits {
 	u8 reserved_at_20[0x10];
 	u8 op_mod[0x10];

-	u8 error[0x1];
-	u8 reserved_at_41[0x4];
-	u8 page_fault_type[0x3];
-	u8 wq_number[0x18];
-
-	u8 reserved_at_60[0x8];
-	u8 token[0x18];
+	union mlx5_ifc_page_fault_resume_in_page_fault_info_auto_bits
+		page_fault_info;
 };

 struct mlx5_ifc_nop_out_bits {
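
As a side note on the layout change, the sketch below (plain userspace C, not the IFC definitions themselves) models how the former top-level odp_cap fields move into a per-scheme sub-structure; the struct and field names mirror the IFC names above, while the flat bool members are a simplification for illustration.

#include <stdbool.h>
#include <stdio.h>

struct per_transport_caps {
	bool send, receive, write, read, atomic, srq_receive;
};

/* Roughly the shape the whole odp_cap had before this patch. */
struct odp_scheme_cap {
	struct per_transport_caps ud_odp_caps;
	struct per_transport_caps rc_odp_caps;
	struct per_transport_caps xrc_odp_caps;
	struct per_transport_caps dc_odp_caps;
};

/* New layout: the old caps become the transport scheme, and a second
 * instance plus a mem_page_fault bit describe the memory scheme. */
struct odp_cap {
	struct odp_scheme_cap transport_page_fault_scheme_cap;
	struct odp_scheme_cap memory_page_fault_scheme_cap;
	bool mem_page_fault;
};

int main(void)
{
	struct odp_cap cap = {
		.transport_page_fault_scheme_cap = {
			.rc_odp_caps = { .send = true },
		},
	};

	/* Old field path: rc_odp_caps.send
	 * New field path, i.e. what MLX5_CAP_ODP_SCHEME() prepends: */
	printf("rc send: %d\n",
	       cap.transport_page_fault_scheme_cap.rc_odp_caps.send);
	return 0;
}

This is why existing readers either switch to MLX5_CAP_ODP_SCHEME() or spell out the transport_page_fault_scheme_cap prefix, as the main.c hunk does.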

From patchwork Mon Sep 9 10:04:59 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796571
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 3/8] RDMA/mlx5: Add new ODP memory scheme eqe format
Date: Mon, 9 Sep 2024 13:04:59 +0300
Message-ID: <20240909100504.29797-4-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>

Add new fields to support the new memory scheme page fault and extend the
token field to u64, since in the new scheme the token is 48 bits wide.

Signed-off-by: Michael Guralnik
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/odp.c | 48 +++++++++++++++++++-------------
 include/linux/mlx5/device.h      | 22 ++++++++++++++-
 2 files changed, 50 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 300504bf79d7..f01026d507a3 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -45,7 +45,7 @@
 /* Contains the details of a pagefault. */
 struct mlx5_pagefault {
 	u32 bytes_committed;
-	u32 token;
+	u64 token;
 	u8 event_subtype;
 	u8 type;
 	union {
@@ -74,6 +74,14 @@ struct mlx5_pagefault {
 			u32 rdma_op_len;
 			u64 rdma_va;
 		} rdma;
+		struct {
+			u64 va;
+			u32 mkey;
+			u32 fault_byte_count;
+			u32 prefetch_before_byte_count;
+			u32 prefetch_after_byte_count;
+			u8 flags;
+		} memory;
 	};

 	struct mlx5_ib_pf_eq *eq;
@@ -1273,7 +1281,7 @@ static void mlx5_ib_mr_wqe_pfault_handler(struct mlx5_ib_dev *dev,
 	if (ret)
 		mlx5_ib_err(
 			dev,
-			"Failed reading a WQE following page fault, error %d, wqe_index %x, qpn %x\n",
+			"Failed reading a WQE following page fault, error %d, wqe_index %x, qpn %llx\n",
 			ret, wqe_index, pfault->token);

 resolve_page_fault:
@@ -1332,13 +1340,13 @@ static void mlx5_ib_mr_rdma_pfault_handler(struct mlx5_ib_dev *dev,
 	} else if (ret < 0 || pages_in_range(address, length) > ret) {
 		mlx5_ib_page_fault_resume(dev, pfault, 1);
 		if (ret != -ENOENT)
-			mlx5_ib_dbg(dev, "PAGE FAULT error %d. QP 0x%x, type: 0x%x\n",
+			mlx5_ib_dbg(dev, "PAGE FAULT error %d. QP 0x%llx, type: 0x%x\n",
 				    ret, pfault->token, pfault->type);
 		return;
 	}

 	mlx5_ib_page_fault_resume(dev, pfault, 0);
-	mlx5_ib_dbg(dev, "PAGE FAULT completed. QP 0x%x, type: 0x%x, prefetch_activated: %d\n",
+	mlx5_ib_dbg(dev, "PAGE FAULT completed. QP 0x%llx, type: 0x%x, prefetch_activated: %d\n",
 		    pfault->token, pfault->type,
 		    prefetch_activated);

@@ -1354,7 +1362,7 @@ static void mlx5_ib_mr_rdma_pfault_handler(struct mlx5_ib_dev *dev,
 					    prefetch_len,
 					    &bytes_committed, NULL);
 		if (ret < 0 && ret != -EAGAIN) {
-			mlx5_ib_dbg(dev, "Prefetch failed. ret: %d, QP 0x%x, address: 0x%.16llx, length = 0x%.16x\n",
+			mlx5_ib_dbg(dev, "Prefetch failed. ret: %d, QP 0x%llx, address: 0x%.16llx, length = 0x%.16x\n",
 				    ret, pfault->token, address, prefetch_len);
 		}
 	}
@@ -1405,15 +1413,12 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq)
 		pf_eqe = &eqe->data.page_fault;
 		pfault->event_subtype = eqe->sub_type;
-		pfault->bytes_committed = be32_to_cpu(pf_eqe->bytes_committed);
-
-		mlx5_ib_dbg(eq->dev,
-			    "PAGE_FAULT: subtype: 0x%02x, bytes_committed: 0x%06x\n",
-			    eqe->sub_type, pfault->bytes_committed);

 		switch (eqe->sub_type) {
 		case MLX5_PFAULT_SUBTYPE_RDMA:
 			/* RDMA based event */
+			pfault->bytes_committed =
+				be32_to_cpu(pf_eqe->rdma.bytes_committed);
 			pfault->type =
 				be32_to_cpu(pf_eqe->rdma.pftype_token) >> 24;
 			pfault->token =
@@ -1427,10 +1432,12 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq)
 				be32_to_cpu(pf_eqe->rdma.rdma_op_len);
 			pfault->rdma.rdma_va =
 				be64_to_cpu(pf_eqe->rdma.rdma_va);
-			mlx5_ib_dbg(eq->dev,
-				    "PAGE_FAULT: type:0x%x, token: 0x%06x, r_key: 0x%08x\n",
-				    pfault->type, pfault->token,
-				    pfault->rdma.r_key);
+			mlx5_ib_dbg(
+				eq->dev,
+				"PAGE_FAULT: subtype: 0x%02x, bytes_committed: 0x%06x, type:0x%x, token: 0x%06llx, r_key: 0x%08x\n",
+				eqe->sub_type, pfault->bytes_committed,
+				pfault->type, pfault->token,
+				pfault->rdma.r_key);
 			mlx5_ib_dbg(eq->dev,
 				    "PAGE_FAULT: rdma_op_len: 0x%08x, rdma_va: 0x%016llx\n",
 				    pfault->rdma.rdma_op_len,
@@ -1439,6 +1446,8 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq)

 		case MLX5_PFAULT_SUBTYPE_WQE:
 			/* WQE based event */
+			pfault->bytes_committed =
+				be32_to_cpu(pf_eqe->wqe.bytes_committed);
 			pfault->type =
 				(be32_to_cpu(pf_eqe->wqe.pftype_wq) >> 24) & 0x7;
 			pfault->token =
@@ -1450,11 +1459,12 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq)
 				be16_to_cpu(pf_eqe->wqe.wqe_index);
 			pfault->wqe.packet_size =
 				be16_to_cpu(pf_eqe->wqe.packet_length);
-			mlx5_ib_dbg(eq->dev,
-				    "PAGE_FAULT: type:0x%x, token: 0x%06x, wq_num: 0x%06x, wqe_index: 0x%04x\n",
-				    pfault->type, pfault->token,
-				    pfault->wqe.wq_num,
-				    pfault->wqe.wqe_index);
+			mlx5_ib_dbg(
+				eq->dev,
+				"PAGE_FAULT: subtype: 0x%02x, bytes_committed: 0x%06x, type:0x%x, token: 0x%06llx, wq_num: 0x%06x, wqe_index: 0x%04x\n",
+				eqe->sub_type, pfault->bytes_committed,
+				pfault->type, pfault->token, pfault->wqe.wq_num,
+				pfault->wqe.wqe_index);
 			break;

 		default:
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index bd081f276654..154095256d0d 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -211,6 +211,7 @@ enum {
 enum {
 	MLX5_PFAULT_SUBTYPE_WQE = 0,
 	MLX5_PFAULT_SUBTYPE_RDMA = 1,
+	MLX5_PFAULT_SUBTYPE_MEMORY = 2,
 };

 enum wqe_page_fault_type {
@@ -646,10 +647,11 @@ struct mlx5_eqe_page_req {
 	__be32 rsvd1[5];
 };

+#define MEMORY_SCHEME_PAGE_FAULT_GRANULARITY 4096
 struct mlx5_eqe_page_fault {
-	__be32 bytes_committed;
 	union {
 		struct {
+			__be32 bytes_committed;
 			u16 reserved1;
 			__be16 wqe_index;
 			u16 reserved2;
@@ -659,6 +661,7 @@ struct mlx5_eqe_page_fault {
 			__be32 pftype_wq;
 		} __packed wqe;
 		struct {
+			__be32 bytes_committed;
 			__be32 r_key;
 			u16 reserved1;
 			__be16 packet_length;
@@ -666,6 +669,23 @@ struct mlx5_eqe_page_fault {
 			__be64 rdma_va;
 			__be32 pftype_token;
 		} __packed rdma;
+		struct {
+			u8 flags;
+			u8 reserved1;
+			__be16 post_demand_fault_pages;
+			__be16 pre_demand_fault_pages;
+			__be16 token47_32;
+			__be32 token31_0;
+			/*
+			 * FW changed from specifying the fault size in byte
+			 * count to 4k pages granularity. The size specified
+			 * in pages uses bits 31:12, to keep backward
+			 * compatibility.
+			 */
+			__be32 demand_fault_pages;
+			__be32 mkey;
+			__be64 va;
+		} __packed memory;
 	} __packed;
 } __packed;
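
For illustration, the standalone sketch below shows how the 48-bit fault token and the page-granular fault size could be recovered from the new memory-scheme EQE fields. It is a guess at the decoding (the actual driver parsing lands in later patches of the series), memory_pf_token() and memory_pf_fault_bytes() are made-up names, and ntohs()/ntohl() stand in for the kernel's be16_to_cpu()/be32_to_cpu().

#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>	/* ntohs()/ntohl() as byte-swap stand-ins */

/* Only the fields needed for the illustration. */
struct memory_pf_eqe {
	uint16_t token47_32;		/* __be16 token47_32 */
	uint32_t token31_0;		/* __be32 token31_0 */
	uint32_t demand_fault_pages;	/* __be32, page count in bits 31:12 */
};

/* Reassemble the 48-bit fault token from its two halves. */
static uint64_t memory_pf_token(const struct memory_pf_eqe *eqe)
{
	return ((uint64_t)ntohs(eqe->token47_32) << 32) |
	       ntohl(eqe->token31_0);
}

/* Because the page count sits in bits 31:12, the field read as a plain
 * number is already a 4 KB-granular byte count; masking the low bits is
 * enough to get bytes. */
static uint32_t memory_pf_fault_bytes(const struct memory_pf_eqe *eqe)
{
	return ntohl(eqe->demand_fault_pages) & ~0xfffU;
}

int main(void)
{
	struct memory_pf_eqe eqe = {
		.token47_32 = htons(0xabcd),
		.token31_0 = htonl(0x12345678),
		.demand_fault_pages = htonl(3 << 12),	/* 3 pages */
	};

	printf("token: 0x%llx\n",	/* 0xabcd12345678 */
	       (unsigned long long)memory_pf_token(&eqe));
	printf("fault bytes: %u\n",	/* 12288 */
	       memory_pf_fault_bytes(&eqe));
	return 0;
}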

From patchwork Mon Sep 9 10:05:00 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796572
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 4/8] RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults
Date: Mon, 9 Sep 2024 13:05:00 +0300
Message-ID: <20240909100504.29797-5-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>

The new memory scheme page faults can request that the driver fetch
additional pages beyond the faulted memory access. This is done in order to
prefetch pages before and after the faulted area, assuming this will reduce
the total number of page faults. The driver must make sure it handles only
pages that are within the umem range.

Signed-off-by: Michael Guralnik
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/odp.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index f01026d507a3..20ad2616bed0 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -748,24 +748,31 @@ static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt,
  *  >0: Number of pages mapped
  */
 static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt,
-			u32 *bytes_mapped, u32 flags)
+			u32 *bytes_mapped, u32 flags, bool permissive_fault)
 {
 	struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);

-	if (unlikely(io_virt < mr->ibmr.iova))
+	if (unlikely(io_virt < mr->ibmr.iova) && !permissive_fault)
 		return -EFAULT;

 	if (mr->umem->is_dmabuf)
 		return pagefault_dmabuf_mr(mr, bcnt, bytes_mapped, flags);

 	if (!odp->is_implicit_odp) {
+		u64 offset = io_virt < mr->ibmr.iova ? 0 : io_virt - mr->ibmr.iova;
 		u64 user_va;

-		if (check_add_overflow(io_virt - mr->ibmr.iova,
-				       (u64)odp->umem.address, &user_va))
+		if (check_add_overflow(offset, (u64)odp->umem.address,
+				       &user_va))
 			return -EFAULT;
-		if (unlikely(user_va >= ib_umem_end(odp) ||
-			     ib_umem_end(odp) - user_va < bcnt))
+
+		if (permissive_fault) {
+			if (user_va < ib_umem_start(odp))
+				user_va = ib_umem_start(odp);
+			if ((user_va + bcnt) > ib_umem_end(odp))
+				bcnt = ib_umem_end(odp) - user_va;
+		} else if (unlikely(user_va >= ib_umem_end(odp) ||
+				    ib_umem_end(odp) - user_va < bcnt))
 			return -EFAULT;
 		return pagefault_real_mr(mr, odp, user_va, bcnt, bytes_mapped,
 					 flags);
@@ -872,7 +879,7 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev,
 	case MLX5_MKEY_MR:
 		mr = container_of(mmkey, struct mlx5_ib_mr, mmkey);

-		ret = pagefault_mr(mr, io_virt, bcnt, bytes_mapped, 0);
+		ret = pagefault_mr(mr, io_virt, bcnt, bytes_mapped, 0, false);
 		if (ret < 0)
 			goto end;

@@ -1727,7 +1734,7 @@ static void mlx5_ib_prefetch_mr_work(struct work_struct *w)
 	for (i = 0; i < work->num_sge; ++i) {
 		ret = pagefault_mr(work->frags[i].mr, work->frags[i].io_virt,
 				   work->frags[i].length, &bytes_mapped,
-				   work->pf_flags);
+				   work->pf_flags, false);
 		if (ret <= 0)
 			continue;
 		mlx5_update_odp_stats(work->frags[i].mr, prefetch, ret);
@@ -1778,7 +1785,7 @@ static int mlx5_ib_prefetch_sg_list(struct ib_pd *pd,
 		if (IS_ERR(mr))
 			return PTR_ERR(mr);
 		ret = pagefault_mr(mr, sg_list[i].addr, sg_list[i].length,
-				   &bytes_mapped, pf_flags);
+				   &bytes_mapped, pf_flags, false);
 		if (ret < 0) {
 			mlx5r_deref_odp_mkey(&mr->mmkey);
 			return ret;
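
The boundary handling amounts to clamping the requested window to the umem. The minimal userspace sketch below (hypothetical numbers, clamp_to_umem() is a made-up helper, not driver code) mirrors what the permissive_fault branch of pagefault_mr() does.

#include <stdint.h>
#include <stdio.h>

/* Trim [*va, *va + *len) so it stays inside [umem_start, umem_end). */
static void clamp_to_umem(uint64_t *va, uint64_t *len,
			  uint64_t umem_start, uint64_t umem_end)
{
	if (*va < umem_start)
		*va = umem_start;
	if (*va + *len > umem_end)
		*len = umem_end - *va;
}

int main(void)
{
	/* Hypothetical numbers: a 16 KB prefetch window starting 4 KB
	 * before the registered range. */
	uint64_t va = 0x1000, len = 0x4000;
	uint64_t umem_start = 0x2000, umem_end = 0x5000;

	clamp_to_umem(&va, &len, umem_start, umem_end);
	/* Prints va=0x2000 len=0x3000: only the in-range part is handled. */
	printf("va=0x%llx len=0x%llx\n",
	       (unsigned long long)va, (unsigned long long)len);
	return 0;
}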
From patchwork Mon Sep 9 10:05:01 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796573
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 5/8] RDMA/mlx5: Split ODP mkey search logic
Date: Mon, 9 Sep 2024 13:05:01 +0300
Message-ID: <20240909100504.29797-6-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org
Split the search for the ODP mkey when handling an RDMA type page fault into a helper function, to be reused later by other page fault types.

Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 65 +++++++++++++++++++------------- 1 file changed, 39 insertions(+), 26 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 20ad2616bed0..05b92f4cac0e 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -819,6 +819,27 @@ static bool mkey_is_eq(struct mlx5_ib_mkey *mmkey, u32 key) return mmkey->key == key; } +static struct mlx5_ib_mkey *find_odp_mkey(struct mlx5_ib_dev *dev, u32 key) +{ + struct mlx5_ib_mkey *mmkey; + + xa_lock(&dev->odp_mkeys); + mmkey = xa_load(&dev->odp_mkeys, mlx5_base_mkey(key)); + if (!mmkey) { + mmkey = ERR_PTR(-ENOENT); + goto out; + } + if (!mkey_is_eq(mmkey, key)) { + mmkey = ERR_PTR(-EFAULT); + goto out; + } + refcount_inc(&mmkey->usecount); +out: + xa_unlock(&dev->odp_mkeys); + + return mmkey; +} + /* * Handle a single data segment in a page-fault WQE or RDMA region.
 * @@ -846,32 +867,24 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, io_virt += *bytes_committed; bcnt -= *bytes_committed; - next_mr: - xa_lock(&dev->odp_mkeys); - mmkey = xa_load(&dev->odp_mkeys, mlx5_base_mkey(key)); - if (!mmkey) { - xa_unlock(&dev->odp_mkeys); - mlx5_ib_dbg( - dev, - "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n", - key); - if (bytes_mapped) - *bytes_mapped += bcnt; - /* - * The user could specify a SGL with multiple lkeys and only - * some of them are ODP. Treat the non-ODP ones as fully - * faulted. - */ - ret = 0; - goto end; - } - refcount_inc(&mmkey->usecount); - xa_unlock(&dev->odp_mkeys); - - if (!mkey_is_eq(mmkey, key)) { - mlx5_ib_dbg(dev, "failed to find mkey %x\n", key); - ret = -EFAULT; + mmkey = find_odp_mkey(dev, key); + if (IS_ERR(mmkey)) { + ret = PTR_ERR(mmkey); + if (ret == -ENOENT) { + mlx5_ib_dbg( + dev, + "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n", + key); + if (bytes_mapped) + *bytes_mapped += bcnt; + /* + * The user could specify a SGL with multiple lkeys and + * only some of them are ODP. Treat the non-ODP ones as + * fully faulted. + */ + ret = 0; + } goto end; } @@ -966,7 +979,7 @@ static int pagefault_single_data_segment(struct mlx5_ib_dev *dev, } end: - if (mmkey) + if (!IS_ERR(mmkey)) mlx5r_deref_odp_mkey(mmkey); while (head) { frame = head;
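For illustration only (not part of the patch): a standalone C sketch of the error split that callers of find_odp_mkey() now rely on; handle_segment() is a made-up stand-in for the caller logic. -ENOENT means the lkey is not an ODP mkey, so its bytes are simply reported as mapped, while any other error aborts the handler.

#include <errno.h>
#include <stdio.h>

/* Sketch of the caller-side policy: -ENOENT means "not an ODP mkey",
 * which is not an error for the page fault handler; the bytes are
 * counted as already mapped. Any other negative value is a real
 * failure that is propagated.
 */
static int handle_segment(int lookup_result, unsigned int bcnt,
			  unsigned int *bytes_mapped)
{
	if (lookup_result == -ENOENT) {
		if (bytes_mapped)
			*bytes_mapped += bcnt;	/* treat as fully faulted */
		return 0;
	}
	if (lookup_result < 0)
		return lookup_result;		/* e.g. -EFAULT */
	return 0;				/* mkey found, go fault it */
}

int main(void)
{
	unsigned int mapped = 0;

	/* Prints "0 4096": missing mkey is tolerated */
	printf("%d %u\n", handle_segment(-ENOENT, 4096, &mapped), mapped);
	/* Prints a negative errno: any other failure aborts */
	printf("%d\n", handle_segment(-EFAULT, 4096, &mapped));
	return 0;
}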
From patchwork Mon Sep 9 10:05:02 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796574
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 6/8] RDMA/mlx5: Add handling for memory scheme page fault events
Date: Mon, 9 Sep 2024 13:05:02 +0300
Message-ID: <20240909100504.29797-7-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org

The memory scheme page fault event is a new approach to
handling page faults on mkeys using the on-demand-paging feature. The major shift in this scheme is that the HW takes responsibility for parsing the faulted mkey, instead of the previous approach where the driver would read and parse the WQEs and query the mkeys to reach the direct mkey that needs handling. Therefore, the event the driver gets from FW in this scheme already contains the direct mkey and the address to handle, and requires much less work from the driver. Additionally, to optimize performance, the FW may generate the event for a memory area larger than the one the faulting memory operation requires, to 'prefetch' surrounding memory that is likely to be used soon. Unlike previous types of page fault, a memory scheme page fault does not always require a resume command after it is handled: the FW can post multiple events on the same mkey and sets the 'last' flag only on the page fault that requires the resume command.

Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/odp.c | 120 +++++++++++++++++++++++++++++-- 1 file changed, 114 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 05b92f4cac0e..841725557f2a 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -401,12 +401,24 @@ static void mlx5_ib_page_fault_resume(struct mlx5_ib_dev *dev, MLX5_SET(page_fault_resume_in, in, opcode, MLX5_CMD_OP_PAGE_FAULT_RESUME); - info = MLX5_ADDR_OF(page_fault_resume_in, in, - page_fault_info.trans_page_fault_info); - MLX5_SET(trans_page_fault_info, info, page_fault_type, pfault->type); - MLX5_SET(trans_page_fault_info, info, fault_token, pfault->token); - MLX5_SET(trans_page_fault_info, info, wq_number, wq_num); - MLX5_SET(trans_page_fault_info, info, error, !!error); + if (pfault->event_subtype == MLX5_PFAULT_SUBTYPE_MEMORY) { + info = MLX5_ADDR_OF(page_fault_resume_in, in, + page_fault_info.mem_page_fault_info); + MLX5_SET(mem_page_fault_info, info, fault_token_31_0, + pfault->token & 0xffffffff); + MLX5_SET(mem_page_fault_info, info, fault_token_47_32, + (pfault->token >> 32) & 0xffff); + MLX5_SET(mem_page_fault_info, info, error, !!error); + } else { + info = MLX5_ADDR_OF(page_fault_resume_in, in, + page_fault_info.trans_page_fault_info); + MLX5_SET(trans_page_fault_info, info, page_fault_type, + pfault->type); + MLX5_SET(trans_page_fault_info, info, fault_token, + pfault->token); + MLX5_SET(trans_page_fault_info, info, wq_number, wq_num); + MLX5_SET(trans_page_fault_info, info, error, !!error); + } err = mlx5_cmd_exec_in(dev->mdev, page_fault_resume, in); if (err) @@ -1388,6 +1400,63 @@ static void mlx5_ib_mr_rdma_pfault_handler(struct mlx5_ib_dev *dev, } } +#define MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST BIT(7) +static void mlx5_ib_mr_memory_pfault_handler(struct mlx5_ib_dev *dev, + struct mlx5_pagefault *pfault) +{ + u64 prefetch_va = + pfault->memory.va - pfault->memory.prefetch_before_byte_count; + size_t prefetch_size = pfault->memory.prefetch_before_byte_count + + pfault->memory.fault_byte_count + + pfault->memory.prefetch_after_byte_count; + struct mlx5_ib_mkey *mmkey; + struct mlx5_ib_mr *mr; + int ret = 0; + + mmkey = find_odp_mkey(dev, pfault->memory.mkey); + if (IS_ERR(mmkey)) + goto err; + + mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + + /* If prefetch fails, handle only demanded page fault */ + ret = pagefault_mr(mr, prefetch_va, prefetch_size, NULL, 0, true); + if (ret < 0) { + ret =
pagefault_mr(mr, pfault->memory.va, + pfault->memory.fault_byte_count, NULL, 0, + true); + if (ret < 0) + goto err; + } + + mlx5_update_odp_stats(mr, faults, ret); + mlx5r_deref_odp_mkey(mmkey); + + if (pfault->memory.flags & MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST) + mlx5_ib_page_fault_resume(dev, pfault, 0); + + mlx5_ib_dbg( + dev, + "PAGE FAULT completed %s. token 0x%llx, mkey: 0x%x, va: 0x%llx, byte_count: 0x%x\n", + pfault->memory.flags & MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST ? + "" : + "without resume cmd", + pfault->token, pfault->memory.mkey, pfault->memory.va, + pfault->memory.fault_byte_count); + + return; + +err: + if (!IS_ERR(mmkey)) + mlx5r_deref_odp_mkey(mmkey); + mlx5_ib_page_fault_resume(dev, pfault, 1); + mlx5_ib_dbg( + dev, + "PAGE FAULT error. token 0x%llx, mkey: 0x%x, va: 0x%llx, byte_count: 0x%x, err: %d\n", + pfault->token, pfault->memory.mkey, pfault->memory.va, + pfault->memory.fault_byte_count, ret); +} + static void mlx5_ib_pfault(struct mlx5_ib_dev *dev, struct mlx5_pagefault *pfault) { u8 event_subtype = pfault->event_subtype; @@ -1399,6 +1468,9 @@ static void mlx5_ib_pfault(struct mlx5_ib_dev *dev, struct mlx5_pagefault *pfaul case MLX5_PFAULT_SUBTYPE_RDMA: mlx5_ib_mr_rdma_pfault_handler(dev, pfault); break; + case MLX5_PFAULT_SUBTYPE_MEMORY: + mlx5_ib_mr_memory_pfault_handler(dev, pfault); + break; default: mlx5_ib_err(dev, "Invalid page fault event subtype: 0x%x\n", event_subtype); @@ -1417,6 +1489,7 @@ static void mlx5_ib_eqe_pf_action(struct work_struct *work) mempool_free(pfault, eq->pool); } +#define MEMORY_SCHEME_PAGE_FAULT_GRANULARITY 4096 static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) { struct mlx5_eqe_page_fault *pf_eqe; @@ -1487,6 +1560,41 @@ static void mlx5_ib_eq_pf_process(struct mlx5_ib_pf_eq *eq) pfault->wqe.wqe_index); break; + case MLX5_PFAULT_SUBTYPE_MEMORY: + /* Memory based event */ + pfault->bytes_committed = 0; + pfault->token = + be32_to_cpu(pf_eqe->memory.token31_0) | + ((u64)be16_to_cpu(pf_eqe->memory.token47_32) + << 32); + pfault->memory.va = be64_to_cpu(pf_eqe->memory.va); + pfault->memory.mkey = be32_to_cpu(pf_eqe->memory.mkey); + pfault->memory.fault_byte_count = (be32_to_cpu( + pf_eqe->memory.demand_fault_pages) >> 12) * + MEMORY_SCHEME_PAGE_FAULT_GRANULARITY; + pfault->memory.prefetch_before_byte_count = + be16_to_cpu( + pf_eqe->memory.pre_demand_fault_pages) * + MEMORY_SCHEME_PAGE_FAULT_GRANULARITY; + pfault->memory.prefetch_after_byte_count = + be16_to_cpu( + pf_eqe->memory.post_demand_fault_pages) * + MEMORY_SCHEME_PAGE_FAULT_GRANULARITY; + pfault->memory.flags = pf_eqe->memory.flags; + mlx5_ib_dbg( + eq->dev, + "PAGE_FAULT: subtype: 0x%02x, token: 0x%06llx, mkey: 0x%06x, fault_byte_count: 0x%06x, va: 0x%016llx, flags: 0x%02x\n", + eqe->sub_type, pfault->token, + pfault->memory.mkey, + pfault->memory.fault_byte_count, + pfault->memory.va, pfault->memory.flags); + mlx5_ib_dbg( + eq->dev, + "PAGE_FAULT: prefetch size: before: 0x%06x, after 0x%06x\n", + pfault->memory.prefetch_before_byte_count, + pfault->memory.prefetch_after_byte_count); + break; + default: mlx5_ib_warn(eq->dev, "Unsupported page fault event sub-type: 0x%02hhx\n",
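For illustration only (not part of the patch): a standalone C sketch of how a memory-scheme event translates into the range the handler faults in, assuming the 4096-byte granularity and the BIT(7) 'last' flag defined in this patch; struct mem_pfault and the sample values are made up.

#include <stdint.h>
#include <stdio.h>

#define PF_GRANULARITY	4096	  /* MEMORY_SCHEME_PAGE_FAULT_GRANULARITY */
#define PF_FLAGS_LAST	(1u << 7) /* MLX5_MEMORY_PAGE_FAULT_FLAGS_LAST */

/* Illustrative container for the decoded event fields */
struct mem_pfault {
	uint64_t va;
	uint32_t fault_bytes;
	uint32_t prefetch_before_bytes;
	uint32_t prefetch_after_bytes;
	uint8_t flags;
};

int main(void)
{
	/* The 48-bit token is split over two EQE fields: low 32 bits
	 * and high 16 bits, recombined exactly as in the patch.
	 */
	uint32_t token31_0 = 0x12345678;
	uint16_t token47_32 = 0xabc;
	uint64_t token = token31_0 | ((uint64_t)token47_32 << 32);

	struct mem_pfault pf = {
		.va = 0x7f0000003000,
		.fault_bytes = 2 * PF_GRANULARITY,
		.prefetch_before_bytes = 1 * PF_GRANULARITY,
		.prefetch_after_bytes = 4 * PF_GRANULARITY,
		.flags = PF_FLAGS_LAST,
	};

	/* The handler first tries the whole prefetch window around the
	 * demanded range ...
	 */
	uint64_t prefetch_va = pf.va - pf.prefetch_before_bytes;
	uint64_t prefetch_size = pf.prefetch_before_bytes +
				 pf.fault_bytes + pf.prefetch_after_bytes;

	printf("token=0x%llx\n", (unsigned long long)token);
	printf("prefetch window: va=0x%llx size=0x%llx\n",
	       (unsigned long long)prefetch_va,
	       (unsigned long long)prefetch_size);
	/* ... and only sends a resume command when the 'last' flag is set. */
	printf("needs resume: %s\n",
	       (pf.flags & PF_FLAGS_LAST) ? "yes" : "no");
	return 0;
}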
From patchwork Mon Sep 9 10:05:03 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796575
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 7/8] RDMA/mlx5: Add implicit MR handling to ODP memory scheme
Date: Mon, 9 Sep 2024 13:05:03 +0300
Message-ID: <20240909100504.29797-8-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org
Implicit MRs in the ODP memory scheme require allocating a private null mkey and assigning the mkey and va differently in the KSM mkey. The page faults are received on the null mkey, so we also store the null mkey in the odp_mkeys xarray.
Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 + drivers/infiniband/hw/mlx5/odp.c | 116 +++++++++++++++++++++++++-- 2 files changed, 111 insertions(+), 8 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index ea8eb368108f..227dbaf7a754 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -630,6 +630,8 @@ enum mlx5_mkey_type { MLX5_MKEY_MR = 1, MLX5_MKEY_MW, MLX5_MKEY_INDIRECT_DEVX, + MLX5_MKEY_NULL, + MLX5_MKEY_IMPLICIT_CHILD, }; struct mlx5r_cache_rb_key { @@ -715,6 +717,7 @@ struct mlx5_ib_mr { struct mlx5_ib_mr *dd_crossed_mr; struct list_head dd_node; u8 revoked :1; + struct mlx5_ib_mkey null_mmkey; }; }; }; diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 841725557f2a..4b37446758fd 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -107,13 +107,20 @@ static u64 mlx5_imr_ksm_entries; static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, struct mlx5_ib_mr *imr, int flags) { + struct mlx5_core_dev *dev = mr_to_mdev(imr)->mdev; struct mlx5_klm *end = pklm + nentries; + int step = MLX5_CAP_ODP(dev, mem_page_fault) ? MLX5_IMR_MTT_SIZE : 0; + __be32 key = MLX5_CAP_ODP(dev, mem_page_fault) ? + cpu_to_be32(imr->null_mmkey.key) : + mr_to_mdev(imr)->mkeys.null_mkey; + u64 va = + MLX5_CAP_ODP(dev, mem_page_fault) ? idx * MLX5_IMR_MTT_SIZE : 0; if (flags & MLX5_IB_UPD_XLT_ZAP) { - for (; pklm != end; pklm++, idx++) { + for (; pklm != end; pklm++, idx++, va += step) { pklm->bcount = cpu_to_be32(MLX5_IMR_MTT_SIZE); - pklm->key = mr_to_mdev(imr)->mkeys.null_mkey; - pklm->va = 0; + pklm->key = key; + pklm->va = cpu_to_be64(va); } return; } @@ -137,7 +144,7 @@ static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, */ lockdep_assert_held(&to_ib_umem_odp(imr->umem)->umem_mutex); - for (; pklm != end; pklm++, idx++) { + for (; pklm != end; pklm++, idx++, va += step) { struct mlx5_ib_mr *mtt = xa_load(&imr->implicit_children, idx); pklm->bcount = cpu_to_be32(MLX5_IMR_MTT_SIZE); @@ -145,8 +152,8 @@ static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, pklm->key = cpu_to_be32(mtt->ibmr.lkey); pklm->va = cpu_to_be64(idx * MLX5_IMR_MTT_SIZE); } else { - pklm->key = mr_to_mdev(imr)->mkeys.null_mkey; - pklm->va = 0; + pklm->key = key; + pklm->va = cpu_to_be64(va); } } } @@ -225,6 +232,9 @@ static void destroy_unused_implicit_child_mr(struct mlx5_ib_mr *mr) return; xa_erase(&imr->implicit_children, idx); + if (MLX5_CAP_ODP(mr_to_mdev(mr)->mdev, mem_page_fault)) + xa_erase(&mr_to_mdev(mr)->odp_mkeys, + mlx5_base_mkey(mr->mmkey.key)); /* Freeing a MR is a sleeping operation, so bounce to a work queue */ INIT_WORK(&mr->odp_destroy.work, free_implicit_child_mr_work); @@ -492,6 +502,16 @@ static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr, } xa_unlock(&imr->implicit_children); + if (MLX5_CAP_ODP(dev->mdev, mem_page_fault)) { + ret = xa_store(&dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key), + &mr->mmkey, GFP_KERNEL); + if (xa_is_err(ret)) { + ret = ERR_PTR(xa_err(ret)); + xa_erase(&imr->implicit_children, idx); + goto out_mr; + } + mr->mmkey.type = MLX5_MKEY_IMPLICIT_CHILD; + } mlx5_ib_dbg(mr_to_mdev(imr), "key %x mr %p\n", mr->mmkey.key, mr); return mr; @@ -502,6 +522,57 @@ static struct mlx5_ib_mr *implicit_get_child_mr(struct mlx5_ib_mr *imr, return ret; } +/* + * When using memory scheme ODP, implicit 
MRs can't use the reserved null mkey + and each implicit MR needs to assign a private null mkey to get the page + faults on. + The null mkey is created with the properties to enable getting the page + fault for every time it is accessed and having all relevant access flags. + */ +static int alloc_implicit_mr_null_mkey(struct mlx5_ib_dev *dev, + struct mlx5_ib_mr *imr, + struct mlx5_ib_pd *pd) +{ + size_t inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + 64; + void *mkc; + u32 *in; + int err; + + in = kzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + MLX5_SET(create_mkey_in, in, translations_octword_actual_size, 4); + MLX5_SET(create_mkey_in, in, pg_access, 1); + + mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); + MLX5_SET(mkc, mkc, a, 1); + MLX5_SET(mkc, mkc, rw, 1); + MLX5_SET(mkc, mkc, rr, 1); + MLX5_SET(mkc, mkc, lw, 1); + MLX5_SET(mkc, mkc, lr, 1); + MLX5_SET(mkc, mkc, free, 0); + MLX5_SET(mkc, mkc, umr_en, 0); + MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); + + MLX5_SET(mkc, mkc, translations_octword_size, 4); + MLX5_SET(mkc, mkc, log_page_size, 61); + MLX5_SET(mkc, mkc, length64, 1); + MLX5_SET(mkc, mkc, pd, pd->pdn); + MLX5_SET64(mkc, mkc, start_addr, 0); + MLX5_SET(mkc, mkc, qpn, 0xffffff); + + err = mlx5_core_create_mkey(dev->mdev, &imr->null_mmkey.key, in, inlen); + if (err) + goto free_in; + + imr->null_mmkey.type = MLX5_MKEY_NULL; + +free_in: + kfree(in); + return err; +} + struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, int access_flags) { @@ -534,6 +605,16 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, imr->is_odp_implicit = true; xa_init(&imr->implicit_children); + if (MLX5_CAP_ODP(dev->mdev, mem_page_fault)) { + err = alloc_implicit_mr_null_mkey(dev, imr, pd); + if (err) + goto out_mr; + + err = mlx5r_store_odp_mkey(dev, &imr->null_mmkey); + if (err) + goto out_mr; + } + err = mlx5r_umr_update_xlt(imr, 0, mlx5_imr_ksm_entries, MLX5_KSM_PAGE_SHIFT, @@ -568,6 +649,14 @@ void mlx5_ib_free_odp_mr(struct mlx5_ib_mr *mr) xa_erase(&mr->implicit_children, idx); mlx5_ib_dereg_mr(&mtt->ibmr, NULL); } + + if (mr->null_mmkey.key) { + xa_erase(&mr_to_mdev(mr)->odp_mkeys, + mlx5_base_mkey(mr->null_mmkey.key)); + + mlx5_core_destroy_mkey(mr_to_mdev(mr)->mdev, + mr->null_mmkey.key); + } } #define MLX5_PF_FLAGS_DOWNGRADE BIT(1) @@ -1410,14 +1499,25 @@ static void mlx5_ib_mr_memory_pfault_handler(struct mlx5_ib_dev *dev, pfault->memory.fault_byte_count + pfault->memory.prefetch_after_byte_count; struct mlx5_ib_mkey *mmkey; - struct mlx5_ib_mr *mr; + struct mlx5_ib_mr *mr, *child_mr; int ret = 0; mmkey = find_odp_mkey(dev, pfault->memory.mkey); if (IS_ERR(mmkey)) goto err; - mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + switch (mmkey->type) { + case MLX5_MKEY_IMPLICIT_CHILD: + child_mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + mr = child_mr->parent; + break; + case MLX5_MKEY_NULL: + mr = container_of(mmkey, struct mlx5_ib_mr, null_mmkey); + break; + default: + mr = container_of(mmkey, struct mlx5_ib_mr, mmkey); + break; + } /* If prefetch fails, handle only demanded page fault */ ret = pagefault_mr(mr, prefetch_va, prefetch_size, NULL, 0, true);
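For illustration only (not part of the patch): a standalone C sketch of how empty KSM slots are filled in the two schemes, as populate_klm() does in this series; IMR_MTT_SIZE, struct klm and fill_empty_entries() are stand-ins, not the driver's definitions.

#include <stdint.h>
#include <stdio.h>

#define IMR_MTT_SIZE	(1ULL << 30)	/* stand-in for MLX5_IMR_MTT_SIZE */

struct klm {
	uint32_t key;
	uint64_t va;
};

/* Sketch of the choice for entries that have no child MR: in the legacy
 * (transport) scheme every empty slot points at the global null mkey with
 * va 0; in the memory scheme each slot points at the implicit MR's private
 * null mkey and carries its own va, so the HW can report the faulted
 * address directly.
 */
static void fill_empty_entries(struct klm *tbl, size_t nentries,
			       int mem_scheme, uint32_t global_null_key,
			       uint32_t private_null_key)
{
	for (size_t idx = 0; idx < nentries; idx++) {
		tbl[idx].key = mem_scheme ? private_null_key : global_null_key;
		tbl[idx].va = mem_scheme ? idx * IMR_MTT_SIZE : 0;
	}
}

int main(void)
{
	struct klm tbl[4];

	fill_empty_entries(tbl, 4, 1, 0x100, 0x200);
	for (size_t i = 0; i < 4; i++)
		printf("entry %zu: key=0x%x va=0x%llx\n", i, tbl[i].key,
		       (unsigned long long)tbl[i].va);
	return 0;
}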
From patchwork Mon Sep 9 10:05:04 2024
X-Patchwork-Submitter: Michael Guralnik
X-Patchwork-Id: 13796576
From: Michael Guralnik
Subject: [PATCH v2 rdma-next 8/8] net/mlx5: Handle memory scheme ODP capabilities
Date: Mon, 9 Sep 2024 13:05:04 +0300
Message-ID: <20240909100504.29797-9-michaelgur@nvidia.com>
In-Reply-To: <20240909100504.29797-1-michaelgur@nvidia.com>
References: <20240909100504.29797-1-michaelgur@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org
When running over new FW that supports the new memory scheme ODP, set the cap in the FW to signal the FW that we are working in the new scheme. In memory scheme ODP the per_transport_service capabilities are RO for the driver, so we skip setting them.
Signed-off-by: Michael Guralnik Reviewed-by: Leon Romanovsky --- .../net/ethernet/mellanox/mlx5/core/main.c | 22 +++++++++++++++---- include/linux/mlx5/device.h | 10 ++++++--- 2 files changed, 25 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index cc2aa46cff04..944c209e9569 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -454,8 +454,8 @@ static int handle_hca_cap_atomic(struct mlx5_core_dev *dev, void *set_ctx) static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx) { + bool do_set = false, mem_page_fault = false; void *set_hca_cap; - bool do_set = false; int err; if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) || @@ -470,6 +470,17 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx) memcpy(set_hca_cap, dev->caps.hca[MLX5_CAP_ODP]->cur, MLX5_ST_SZ_BYTES(odp_cap)); + /* For best performance, enable memory scheme ODP only when + * it has page prefetch enabled. + */ + if (MLX5_CAP_ODP_MAX(dev, mem_page_fault) && + MLX5_CAP_ODP_MAX(dev, memory_page_fault_scheme_cap.page_prefetch)) { + mem_page_fault = true; + do_set = true; + MLX5_SET(odp_cap, set_hca_cap, mem_page_fault, mem_page_fault); + goto set; + }; + #define ODP_CAP_SET_MAX(dev, field) \ do { \ u32 _res = MLX5_CAP_ODP_MAX(dev, field); \ @@ -494,10 +505,13 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev, void *set_ctx) ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.read); ODP_CAP_SET_MAX(dev, transport_page_fault_scheme_cap.dc_odp_caps.atomic); - if (!do_set) - return 0; +set: + if (do_set) + err = set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_ODP); - return set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_ODP); + mlx5_core_dbg(dev, "Using ODP %s scheme\n", + mem_page_fault ? "memory" : "transport"); + return err; } static int max_uc_list_get_devlink_param(struct mlx5_core_dev *dev) diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 154095256d0d..57c9b18c3adb 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -1389,9 +1389,13 @@ enum mlx5_qcam_feature_groups { #define MLX5_CAP_ODP(mdev, cap)\ MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, cap) -#define MLX5_CAP_ODP_SCHEME(mdev, cap) \ - MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ - transport_page_fault_scheme_cap.cap) +#define MLX5_CAP_ODP_SCHEME(mdev, cap) \ + (MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ + mem_page_fault) ? \ + MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ + memory_page_fault_scheme_cap.cap) : \ + MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->cur, \ + transport_page_fault_scheme_cap.cap)) #define MLX5_CAP_ODP_MAX(mdev, cap)\ MLX5_GET(odp_cap, mdev->caps.hca[MLX5_CAP_ODP]->max, cap)
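For illustration only (not part of the patch): a standalone C sketch of the scheme-selection policy that handle_hca_cap_odp() implements, where the two flags stand in for the MLX5_CAP_ODP_MAX(dev, mem_page_fault) and MLX5_CAP_ODP_MAX(dev, memory_page_fault_scheme_cap.page_prefetch) reads.

#include <stdio.h>

/* Sketch of the policy: opt into the memory scheme only when the device
 * reports both mem_page_fault and page_prefetch in its maximum ODP caps;
 * otherwise fall back to the transport scheme and program the
 * per-transport capabilities as before.
 */
static const char *select_odp_scheme(int max_mem_page_fault,
				     int max_page_prefetch)
{
	if (max_mem_page_fault && max_page_prefetch)
		return "memory";
	return "transport";
}

int main(void)
{
	printf("%s\n", select_odp_scheme(1, 1));	/* memory */
	printf("%s\n", select_odp_scheme(1, 0));	/* transport */
	printf("%s\n", select_odp_scheme(0, 1));	/* transport */
	return 0;
}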