From patchwork Wed Sep 7 11:37:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrisious Haddad
X-Patchwork-Id: 12968919
From: Patrisious Haddad
To: Sagi Grimberg, Christoph Hellwig
CC: Patrisious Haddad, Leon Romanovsky, Linux-nvme, Michael Guralnik, Israel Rukshin, Maor Gottlieb, "Max Gurtovoy"
Subject: [PATCH rdma-next 1/4] net/mlx5: Introduce CQE error syndrome
Date: Wed, 7 Sep 2022 14:37:57 +0300
Message-ID: <20220907113800.22182-2-phaddad@nvidia.com>
X-Mailer: git-send-email 2.18.1
In-Reply-To: <20220907113800.22182-1-phaddad@nvidia.com>
References: <20220907113800.22182-1-phaddad@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

Introduce CQE error syndrome bits, which reside in qp_context_extension
and report the reason the QP was moved to the error state. This is useful
for cases in which no CQE is generated, such as a remote write rkey
violation.

Signed-off-by: Patrisious Haddad
Reviewed-by: Leon Romanovsky
---
 include/linux/mlx5/mlx5_ifc.h | 47 +++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 4acd5610e96b..1c3b258baaa7 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1441,7 +1441,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         null_mkey[0x1];
 	u8         log_max_klm_list_size[0x6];
 
-	u8         reserved_at_120[0xa];
+	u8         reserved_at_120[0x2];
+	u8         qpc_extension[0x1];
+	u8         reserved_at_123[0x7];
 	u8         log_max_ra_req_dc[0x6];
 	u8         reserved_at_130[0x9];
 	u8         vnic_env_cq_overrun[0x1];
@@ -1605,7 +1607,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 
 	u8         log_bf_reg_size[0x5];
 
-	u8         reserved_at_270[0x6];
+	u8         reserved_at_270[0x3];
+	u8         qp_error_syndrome[0x1];
+	u8         reserved_at_274[0x2];
 	u8         lag_dct[0x2];
 	u8         lag_tx_port_affinity[0x1];
 	u8         lag_native_fdb_selection[0x1];
@@ -5257,6 +5261,37 @@ struct mlx5_ifc_query_rmp_in_bits {
 	u8         reserved_at_60[0x20];
 };
 
+struct mlx5_ifc_cqe_error_syndrome_bits {
+	u8         hw_error_syndrome[0x8];
+	u8         hw_syndrome_type[0x4];
+	u8         reserved_at_c[0x4];
+	u8         vendor_error_syndrome[0x8];
+	u8         syndrome[0x8];
+};
+
+struct mlx5_ifc_qp_context_extension_bits {
+	u8         reserved_at_0[0x60];
+
+	struct mlx5_ifc_cqe_error_syndrome_bits error_syndrome;
+
+	u8         reserved_at_80[0x580];
+};
+
+struct mlx5_ifc_qpc_extension_and_pas_list_in_bits {
+	struct mlx5_ifc_qp_context_extension_bits qpc_data_extension;
+
+	u8         pas[0][0x40];
+};
+
+struct mlx5_ifc_qp_pas_list_in_bits {
+	struct mlx5_ifc_cmd_pas_bits pas[0];
+};
+
+union mlx5_ifc_qp_pas_or_qpc_ext_and_pas_bits {
+	struct mlx5_ifc_qp_pas_list_in_bits qp_pas_list;
+	struct mlx5_ifc_qpc_extension_and_pas_list_in_bits qpc_ext_and_pas_list;
+};
+
 struct mlx5_ifc_query_qp_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
@@ -5273,7 +5308,7 @@ struct mlx5_ifc_query_qp_out_bits {
 
 	u8         reserved_at_800[0x80];
 
-	u8         pas[][0x40];
+	union mlx5_ifc_qp_pas_or_qpc_ext_and_pas_bits qp_pas_or_qpc_ext_and_pas;
 };
 
 struct mlx5_ifc_query_qp_in_bits {
@@ -5283,7 +5318,8 @@ struct mlx5_ifc_query_qp_in_bits {
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
 
-	u8         reserved_at_40[0x8];
+	u8         qpc_ext[0x1];
+	u8         reserved_at_41[0x7];
 	u8         qpn[0x18];
 
 	u8         reserved_at_60[0x20];
@@ -8417,7 +8453,8 @@ struct mlx5_ifc_create_qp_in_bits {
 	u8         reserved_at_20[0x10];
 	u8         op_mod[0x10];
 
-	u8         reserved_at_40[0x8];
+	u8         qpc_ext[0x1];
+	u8         reserved_at_41[0x7];
 	u8         input_qpn[0x18];
 
 	u8         reserved_at_60[0x20];

From patchwork Wed Sep 7 11:37:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrisious Haddad
X-Patchwork-Id: 12968920
From: Patrisious Haddad
To: Sagi Grimberg, Christoph Hellwig
CC: Patrisious Haddad, Leon Romanovsky, Linux-nvme, Michael Guralnik, Israel Rukshin, Maor Gottlieb, "Max Gurtovoy"
Subject: [PATCH rdma-next 2/4] RDMA/core: Introduce ib_get_qp_err_syndrome function
Date: Wed, 7 Sep 2022 14:37:58 +0300
Message-ID: <20220907113800.22182-3-phaddad@nvidia.com>
X-Mailer: git-send-email 2.18.1
In-Reply-To: <20220907113800.22182-1-phaddad@nvidia.com>
References: <20220907113800.22182-1-phaddad@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

Introduce the ib_get_qp_err_syndrome function, which enables kernel
applications to query the reason the QP moved to error state, even in
cases in which no CQE was generated.

Signed-off-by: Patrisious Haddad
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/core/device.c |  1 +
 drivers/infiniband/core/verbs.c  |  8 ++++++++
 include/rdma/ib_verbs.h          | 13 +++++++++++++
 3 files changed, 22 insertions(+)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index ae60c73babcc..8235b8fa1100 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2657,6 +2657,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, get_netdev);
 	SET_DEVICE_OP(dev_ops, get_numa_node);
 	SET_DEVICE_OP(dev_ops, get_port_immutable);
+	SET_DEVICE_OP(dev_ops, get_qp_err_syndrome);
 	SET_DEVICE_OP(dev_ops, get_vector_affinity);
 	SET_DEVICE_OP(dev_ops, get_vf_config);
 	SET_DEVICE_OP(dev_ops, get_vf_guid);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index e54b3f1b730e..ac20af8be33a 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1952,6 +1952,14 @@ int ib_query_qp(struct ib_qp *qp,
 }
 EXPORT_SYMBOL(ib_query_qp);
 
+int ib_get_qp_err_syndrome(struct ib_qp *qp, char *str)
+{
+	return qp->device->ops.get_qp_err_syndrome ?
+		qp->device->ops.get_qp_err_syndrome(qp->real_qp,
+						    str) : -EOPNOTSUPP;
+}
+EXPORT_SYMBOL(ib_get_qp_err_syndrome);
+
 int ib_close_qp(struct ib_qp *qp)
 {
 	struct ib_qp *real_qp;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 975d6e9efbcb..9a94f2ef993c 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2465,6 +2465,7 @@ struct ib_device_ops {
 			 int qp_attr_mask, struct ib_udata *udata);
 	int (*query_qp)(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 			int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+	int (*get_qp_err_syndrome)(struct ib_qp *qp, char *str);
 	int (*destroy_qp)(struct ib_qp *qp, struct ib_udata *udata);
 	int (*create_cq)(struct ib_cq *cq, const struct ib_cq_init_attr *attr,
 			 struct ib_udata *udata);
@@ -3777,6 +3778,18 @@ int ib_query_qp(struct ib_qp *qp,
 		int qp_attr_mask,
 		struct ib_qp_init_attr *qp_init_attr);
 
+#define IB_ERR_SYNDROME_LENGTH 256
+
+/**
+ * ib_get_qp_err_syndrome - Returns a string that describes the reason
+ * the specified QP moved to error state.
+ * @qp : The QP to query.
+ * @str: The reason the qp moved to error state.
+ *
+ * NOTE: the user must pass a str with size of at least IB_ERR_SYNDROME_LENGTH
+ */
+int ib_get_qp_err_syndrome(struct ib_qp *qp, char *str);
+
 /**
  * ib_destroy_qp - Destroys the specified QP.
  * @qp: The QP to destroy.
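
The optional-op dispatch that ib_get_qp_err_syndrome adds to verbs.c can be
tried outside the kernel with a minimal user-space sketch. Everything in it
is an assumption made for illustration: the struct definitions are reduced
stand-ins for the kernel ones, and demo_get_qp_err_syndrome,
demo_log_qp_error and demo_selfcheck are hypothetical names that do not
appear in this series. It shows a caller sizing its buffer to
IB_ERR_SYNDROME_LENGTH and treating -EOPNOTSUPP as "reason unavailable".

```c
/* User-space sketch only: these structs are reduced stand-ins, not the
 * kernel definitions from include/rdma/ib_verbs.h. */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define IB_ERR_SYNDROME_LENGTH 256

struct ib_qp;

struct ib_device_ops {
	int (*get_qp_err_syndrome)(struct ib_qp *qp, char *str);
};

struct ib_device {
	struct ib_device_ops ops;
};

struct ib_qp {
	struct ib_device *device;
	struct ib_qp *real_qp;
};

/* Same dispatch shape as the patch: use the driver op when it is set. */
static int ib_get_qp_err_syndrome(struct ib_qp *qp, char *str)
{
	return qp->device->ops.get_qp_err_syndrome ?
		qp->device->ops.get_qp_err_syndrome(qp->real_qp, str) :
		-EOPNOTSUPP;
}

/* Hypothetical driver callback; the text it writes is made up. */
static int demo_get_qp_err_syndrome(struct ib_qp *qp, char *str)
{
	(void)qp;
	snprintf(str, IB_ERR_SYNDROME_LENGTH, "demo reason (0x%x)\n", 0x13);
	return 0;
}

/* Hypothetical caller: the buffer is sized as the kernel-doc requires. */
static int demo_log_qp_error(struct ib_qp *qp, char *out, size_t outlen)
{
	char reason[IB_ERR_SYNDROME_LENGTH] = "";
	int err = ib_get_qp_err_syndrome(qp, reason);

	if (err == -EOPNOTSUPP)
		snprintf(out, outlen, "QP error: reason unavailable");
	else if (err)
		snprintf(out, outlen, "QP error: query failed (%d)", err);
	else
		snprintf(out, outlen, "QP error: %s", reason);
	return err;
}

/* Exercises both the supported and the unsupported path. */
static void demo_selfcheck(void)
{
	struct ib_device with_op = { .ops = { demo_get_qp_err_syndrome } };
	struct ib_device without_op = { .ops = { NULL } };
	struct ib_qp qp = { .device = &with_op };
	char msg[IB_ERR_SYNDROME_LENGTH + 16];

	qp.real_qp = &qp;
	assert(demo_log_qp_error(&qp, msg, sizeof(msg)) == 0);
	assert(strcmp(msg, "QP error: demo reason (0x13)\n") == 0);

	qp.device = &without_op;
	assert(demo_log_qp_error(&qp, msg, sizeof(msg)) == -EOPNOTSUPP);
}
```

Calling demo_selfcheck() from a main() runs both paths. In the kernel the
callback would be the one a driver registers through SET_DEVICE_OP, and a
device without it yields -EOPNOTSUPP in the same way.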
From patchwork Wed Sep 7 11:37:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrisious Haddad
X-Patchwork-Id: 12968921
From: Patrisious Haddad
To: Sagi Grimberg, Christoph Hellwig
CC: Patrisious Haddad, Leon Romanovsky, Linux-nvme, Michael Guralnik, Israel Rukshin, Maor Gottlieb, "Max Gurtovoy"
Subject: [PATCH rdma-next 3/4] RDMA/mlx5: Implement ib_get_qp_err_syndrome
Date: Wed, 7 Sep 2022 14:37:59 +0300
Message-ID: <20220907113800.22182-4-phaddad@nvidia.com>
X-Mailer: git-send-email 2.18.1
In-Reply-To: <20220907113800.22182-1-phaddad@nvidia.com>
References: <20220907113800.22182-1-phaddad@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

Implement ib_get_qp_err_syndrome using a query_qp FW call and return the
result in a human readable string.

Signed-off-by: Patrisious Haddad
Reviewed-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/main.c    |  1 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
 drivers/infiniband/hw/mlx5/qp.c      | 42 +++++++++++++++++++++++++++-
 drivers/infiniband/hw/mlx5/qp.h      |  2 +-
 drivers/infiniband/hw/mlx5/qpc.c     |  4 ++-
 5 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 7c40efae96a3..c18d3e36542b 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3716,6 +3716,7 @@ static const struct ib_device_ops mlx5_ib_dev_ops = {
 	.get_dev_fw_str = get_dev_fw_str,
 	.get_dma_mr = mlx5_ib_get_dma_mr,
 	.get_link_layer = mlx5_ib_port_link_layer,
+	.get_qp_err_syndrome = mlx5_ib_get_qp_err_syndrome,
 	.map_mr_sg = mlx5_ib_map_mr_sg,
 	.map_mr_sg_pi = mlx5_ib_map_mr_sg_pi,
 	.mmap = mlx5_ib_mmap,
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 2e2ad3918385..bbd414cbd695 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1232,6 +1232,7 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		      int attr_mask, struct ib_udata *udata);
 int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 		     int qp_attr_mask, struct ib_qp_init_attr *qp_init_attr);
+int mlx5_ib_get_qp_err_syndrome(struct ib_qp *ibqp, char *str);
 int mlx5_ib_destroy_qp(struct ib_qp *qp, struct ib_udata *udata);
 void mlx5_ib_drain_sq(struct ib_qp *qp);
 void mlx5_ib_drain_rq(struct ib_qp *qp);
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 40d9410ec303..7cf2fe549b9a 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4806,7 +4806,8 @@ static int query_qp_attr(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 	if (!outb)
 		return -ENOMEM;
 
-	err = mlx5_core_qp_query(dev, &qp->trans_qp.base.mqp, outb, outlen);
+	err = mlx5_core_qp_query(dev, &qp->trans_qp.base.mqp, outb, outlen,
+				 false);
 	if (err)
 		goto out;
 
@@ -4992,6 +4993,45 @@ int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 	return err;
 }
 
+int mlx5_ib_get_qp_err_syndrome(struct ib_qp *ibqp, char *str)
+{
+	struct mlx5_ib_dev *dev = to_mdev(ibqp->device);
+	int outlen = MLX5_ST_SZ_BYTES(query_qp_out);
+	struct mlx5_ib_qp *qp = to_mqp(ibqp);
+	void *pas_ext_union, *err_syn;
+	u32 *outb;
+	int err;
+
+	if (!MLX5_CAP_GEN(dev->mdev, qpc_extension) ||
+	    !MLX5_CAP_GEN(dev->mdev, qp_error_syndrome))
+		return -EOPNOTSUPP;
+
+	outb = kzalloc(outlen, GFP_KERNEL);
+	if (!outb)
+		return -ENOMEM;
+
+	err = mlx5_core_qp_query(dev, &qp->trans_qp.base.mqp, outb, outlen,
+				 true);
+	if (err)
+		goto out;
+
+	pas_ext_union =
+		MLX5_ADDR_OF(query_qp_out, outb, qp_pas_or_qpc_ext_and_pas);
+	err_syn = MLX5_ADDR_OF(qpc_extension_and_pas_list_in, pas_ext_union,
+			       qpc_data_extension.error_syndrome);
+
+	scnprintf(str, IB_ERR_SYNDROME_LENGTH, "%s (0x%x 0x%x 0x%x)\n",
+		  ib_wc_status_msg(
+			  MLX5_GET(cqe_error_syndrome, err_syn, syndrome)),
+		  MLX5_GET(cqe_error_syndrome, err_syn, vendor_error_syndrome),
+		  MLX5_GET(cqe_error_syndrome, err_syn, hw_syndrome_type),
+		  MLX5_GET(cqe_error_syndrome, err_syn, hw_error_syndrome));
+
+out:
+	kfree(outb);
+	return err;
+}
+
 int mlx5_ib_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(ibxrcd->device);
diff --git a/drivers/infiniband/hw/mlx5/qp.h b/drivers/infiniband/hw/mlx5/qp.h
index 5d4e140db99c..8d792ca00b32 100644
--- a/drivers/infiniband/hw/mlx5/qp.h
+++ b/drivers/infiniband/hw/mlx5/qp.h
@@ -20,7 +20,7 @@ int mlx5_core_qp_modify(struct mlx5_ib_dev *dev, u16 opcode, u32 opt_param_mask,
 int mlx5_core_destroy_qp(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp);
 int mlx5_core_destroy_dct(struct mlx5_ib_dev *dev, struct mlx5_core_dct *dct);
 int mlx5_core_qp_query(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp,
-		       u32 *out, int outlen);
+		       u32 *out, int outlen, bool qpc_ext);
 int mlx5_core_dct_query(struct mlx5_ib_dev *dev, struct mlx5_core_dct *dct,
 			u32 *out, int outlen);
 
diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c
index 542e4c63a8de..7a1854aab441 100644
--- a/drivers/infiniband/hw/mlx5/qpc.c
+++ b/drivers/infiniband/hw/mlx5/qpc.c
@@ -504,12 +504,14 @@ void mlx5_cleanup_qp_table(struct mlx5_ib_dev *dev)
 }
 
 int mlx5_core_qp_query(struct mlx5_ib_dev *dev, struct mlx5_core_qp *qp,
-		       u32 *out, int outlen)
+		       u32 *out, int outlen, bool qpc_ext)
 {
 	u32 in[MLX5_ST_SZ_DW(query_qp_in)] = {};
 
 	MLX5_SET(query_qp_in, in, opcode, MLX5_CMD_OP_QUERY_QP);
 	MLX5_SET(query_qp_in, in, qpn, qp->qpn);
+	MLX5_SET(query_qp_in, in, qpc_ext, qpc_ext);
+
 	return mlx5_cmd_exec(dev->mdev, in, sizeof(in), out, outlen);
 }
 

From patchwork Wed Sep 7 11:38:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Patrisious Haddad
X-Patchwork-Id: 12968922
Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-sn1anam02on2062.outbound.protection.outlook.com [40.107.96.62]) by
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 452272C13D for ; Wed, 7 Sep 2022 04:38:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JT/Nw8Yvg8De7H5jM2i5uUz6dSenV7DQrXsD7g2hOC9H3TkPt2g0H3P1TEPn7YZvPJjyJX5B7a3/lfSpmmWPYZvJI3EQZxYKo77DuI3f+VWEodFsXh80lA2jR2GnG8ybs76cpN0pwfT2Xfzx2x6O55t3K0bmuXJ/N8UeXlESX0c/DBfFkXi3OQ3/xdihk8OVZoiZn6pwrKUsxUDb/3a1loL1jVKoCzYihb81oDYCGJkjr234I6nUTrxuw0pauGNtJ0WWibgo4w79DjHIAi1PRNXCd+7V3dEtKJKqZsN7UL5T0TIDe3QBQ0UhUVbTyxBlCAq3BvV9pJ0ZQO+tZfDPaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=copC5HG4Ch0HIA8UVLVLZEdftzJdxrw2gJhHLyDIszw=; b=JWpJr0YFBP4D9UKJaXkzF0XTvd/jMVyjpRMqlKTcrMEozzGvYXST5jQE4tvTE1oztuJRHShT1zMCrtqCnEpoZe1clS4TQW5vRzD5UtlsNv8e2PknDcwvID1s4LqlOhcr6fm4osUD9z/8sXAAZrwtdBqSpdXQr3VlRVtqEYQzCIoXTvUalwT6y6pIdWOzOMNdTWZzbABk12f4F8lLFqP+x+EZrRFB0LcLGAiDqJR0MmrtHqnJeEWNmSBhqVoEcgEod9AFN/5NVuGhJzL5JeB01UxIEU8duYEEzLmC2ZtnIpVpFqH6oz7jw3vdTaz4MW2SoB8BVMtShixAQ60vmQWWaA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 12.22.5.234) smtp.rcpttodomain=lists.infradead.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=copC5HG4Ch0HIA8UVLVLZEdftzJdxrw2gJhHLyDIszw=; 
b=gz0ou2/b2FN1Xkv577B+GY2nTE0WGHCrG9kKcLLuibygkbBV9vG5xUmT72yzc695OsKrAXreqqccvLTeyForrUU6SoZbo394p9dqkmjvFg/HrUqM0uaZgyZeFXxT8Zcy/8BduGMwPn5jOzDdTn4mJTpEdswhtWdM8tjZk5ENYSc29tdWqyPeAZgAWQSflW3abmjlwJ6DO9jPFUaYaaBWI1UA1pBsMgz19470vqODqW1vRVZklfBEo+LzgdDusP3yLLuu+JSSUgRjwDMxsotRF1De7QFdwkJ1V11J/XMPMFcZVlAXp0C8dYm/hVbWvk/nFORqMhC4zgSi/70Yb3PaEQ== Received: from MW4PR04CA0075.namprd04.prod.outlook.com (2603:10b6:303:6b::20) by DM6PR12MB4267.namprd12.prod.outlook.com (2603:10b6:5:21e::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5588.12; Wed, 7 Sep 2022 11:38:57 +0000 Received: from CO1NAM11FT078.eop-nam11.prod.protection.outlook.com (2603:10b6:303:6b:cafe::c2) by MW4PR04CA0075.outlook.office365.com (2603:10b6:303:6b::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5612.12 via Frontend Transport; Wed, 7 Sep 2022 11:38:56 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 12.22.5.234) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 12.22.5.234 as permitted sender) receiver=protection.outlook.com; client-ip=12.22.5.234; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (12.22.5.234) by CO1NAM11FT078.mail.protection.outlook.com (10.13.175.177) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.5612.13 via Frontend Transport; Wed, 7 Sep 2022 11:38:56 +0000 Received: from rnnvmail205.nvidia.com (10.129.68.10) by DRHQMAIL101.nvidia.com (10.27.9.10) with Microsoft SMTP Server (TLS) id 15.0.1497.38; Wed, 7 Sep 2022 11:38:55 +0000 Received: from rnnvmail205.nvidia.com (10.129.68.10) by rnnvmail205.nvidia.com (10.129.68.10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.986.29; Wed, 7 Sep 2022 04:38:54 
-0700 Received: from vdi.nvidia.com (10.127.8.11) by mail.nvidia.com (10.129.68.10) with Microsoft SMTP Server id 15.2.986.29 via Frontend Transport; Wed, 7 Sep 2022 04:38:52 -0700 From: Patrisious Haddad To: Sagi Grimberg , Christoph Hellwig CC: Israel Rukshin , Leon Romanovsky , Linux-nvme , , Michael Guralnik , Maor Gottlieb , Max Gurtovoy Subject: [PATCH rdma-next 4/4] nvme-rdma: add more error details when a QP moves to an error state Date: Wed, 7 Sep 2022 14:38:00 +0300 Message-ID: <20220907113800.22182-5-phaddad@nvidia.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: <20220907113800.22182-1-phaddad@nvidia.com> References: <20220907113800.22182-1-phaddad@nvidia.com> MIME-Version: 1.0 X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1NAM11FT078:EE_|DM6PR12MB4267:EE_ X-MS-Office365-Filtering-Correlation-Id: 3bfd96da-e701-4e55-7499-08da90c58c46 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: P0LCv9ku/AtENOf0UdtlNhtIEQUT8BEC/a0YSMTWjtKKFowSDpowgEQa9Xg/aIGgb41HKozj2UFlX64qoaxFYuQM5dfze1QIc+jCIh/pJUvlNFbiU1EFvyDGGMl32xydfudMP0pEX7PdeBKR0sxLpv0qkLQLM4Ud6r0qMgY+dQomW26XCHIdpDHtwkCYIwaQfK1imhnrJTByD5IBlM+FEyAO0AtoOHF8AomhnTRjNFiWo7s4xS4r5Q7OtkA+/Ux9+MsoaAWvpF0L/1WNzlpTaDkSmwQbLMR6+4UNEUVwX9wTSAfZy3c+yC4TDDzPbOOVl0/KxeGSuDFBdk94WOeXeGc/XBQ2MlQOdXEVOqvS+x23WiJ9V/DYGjb2MjFFbAAhJSsDaJXzNCzL/an/rkzyYUXB/BedxCzeD8TTzefnsHqkrJKIKkYDhoY+/T4AuhMhpsaKhjUdTKe2sRnwzleeL4se5PAlik7gWPo9SDTc/fUCxPHrgqeL7YD0v99ggiiMWut+KhGUaKM+M6hwxYg8xG+5vYq/Zd2/TPhlRk9/XJBYLVA7+Q8nBf0bNJDNtphivWn9lrUauam0ezfQpWfj9j9G/PbfelxWO4uTw+zMBsUE/Rwo85pNBcSUwj3tfGIMqCzTI1FaFoeoA5r2svxj6VNLPabRDECFW+mbOr9TKTYIpZZ4en1cZ6b2TaIUXBZ1QmrP9nfFBclS7i13cDYLmC+Qg8greisUlq8gG3s4AfBDqp6bkk8J+B30ZHPcTgfZXbRYvAM69dNTsPTF8xADoS3Wt6HSZPwnMxSq72SStLHfheIJlXdaBuNx33uCtt4L X-Forefront-Antispam-Report: 
CIP:12.22.5.234;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:InfoNoRecords;CAT:NONE;SFS:(13230016)(4636009)(39860400002)(396003)(346002)(136003)(376002)(40470700004)(36840700001)(46966006)(70206006)(8676002)(83380400001)(186003)(4326008)(316002)(107886003)(82310400005)(6666004)(54906003)(41300700001)(110136005)(86362001)(36756003)(47076005)(1076003)(426003)(2616005)(70586007)(336012)(26005)(36860700001)(81166007)(5660300002)(2906002)(40480700001)(356005)(7696005)(8936002)(478600001)(82740400003)(40460700003)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 07 Sep 2022 11:38:56.1481 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 3bfd96da-e701-4e55-7499-08da90c58c46 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[12.22.5.234];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT078.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB4267 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Israel Rukshin Add debug prints for fatal QP events that are helpful for finding the root cause of the errors. The ib_get_qp_err_syndrome is called at a work queue since the QP event callback is running on an interrupt context that can't sleep. 

Signed-off-by: Israel Rukshin
Reviewed-by: Max Gurtovoy
Reviewed-by: Leon Romanovsky
---
 drivers/nvme/host/rdma.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3100643be299..7e56c0dbe8ea 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -99,6 +99,7 @@ struct nvme_rdma_queue {
 	bool			pi_support;
 	int			cq_size;
 	struct mutex		queue_lock;
+	struct work_struct	qp_err_work;
 };
 
 struct nvme_rdma_ctrl {
@@ -237,11 +238,31 @@ static struct nvme_rdma_qe *nvme_rdma_alloc_ring(struct ib_device *ibdev,
 	return NULL;
 }
 
+static void nvme_rdma_qp_error_work(struct work_struct *work)
+{
+	struct nvme_rdma_queue *queue = container_of(work,
+			struct nvme_rdma_queue, qp_err_work);
+	int ret;
+	char err[IB_ERR_SYNDROME_LENGTH];
+
+	ret = ib_get_qp_err_syndrome(queue->qp, err);
+	if (ret)
+		return;
+
+	pr_err("Queue %d got QP error syndrome %s\n",
+		nvme_rdma_queue_idx(queue), err);
+}
+
 static void nvme_rdma_qp_event(struct ib_event *event, void *context)
 {
+	struct nvme_rdma_queue *queue = context;
+
 	pr_debug("QP event %s (%d)\n",
 		 ib_event_msg(event->event), event->event);
 
+	if (event->event == IB_EVENT_QP_FATAL ||
+	    event->event == IB_EVENT_QP_ACCESS_ERR)
+		queue_work(nvme_wq, &queue->qp_err_work);
 }
 
 static int nvme_rdma_wait_for_cm(struct nvme_rdma_queue *queue)
@@ -261,7 +282,9 @@ static int nvme_rdma_create_qp(struct nvme_rdma_queue *queue, const int factor)
 	struct ib_qp_init_attr init_attr;
 	int ret;
 
+	INIT_WORK(&queue->qp_err_work, nvme_rdma_qp_error_work);
 	memset(&init_attr, 0, sizeof(init_attr));
+	init_attr.qp_context = queue;
 	init_attr.event_handler = nvme_rdma_qp_event;
 	/* +1 for drain */
 	init_attr.cap.max_send_wr = factor * queue->queue_size + 1;
@@ -434,6 +457,7 @@ static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
 		ib_mr_pool_destroy(queue->qp, &queue->qp->sig_mrs);
 	ib_mr_pool_destroy(queue->qp, &queue->qp->rdma_mrs);
 
+	flush_work(&queue->qp_err_work);
 	/*
 	 * The cm_id object might have been destroyed during RDMA connection
 	 * establishment error flow to avoid getting other cma events, thus