From patchwork Tue Dec 4 12:04:17 2018
X-Patchwork-Submitter: Gal Pressman
X-Patchwork-Id: 10711627
From: Gal Pressman
To: Doug Ledford, Jason Gunthorpe
CC: Alexander Matushevsky, Yossi Leybovich, Tom Tucker, Gal Pressman
Subject: [PATCH rdma-next 01/13] RDMA: Add EFA related definitions
Date: Tue, 4 Dec 2018 14:04:17 +0200
Message-ID: <1543925069-8838-2-git-send-email-galpress@amazon.com>
In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com>
References: <1543925069-8838-1-git-send-email-galpress@amazon.com>

Add the EFA node type, transport type and protocol type to the core code. EFA relies on an underlying implementation similar to reliable datagram, so we also define a new QP type named Scalable Reliable Datagram (SRD). The EFA reliable datagram transport provides reliable out-of-order delivery and transparently utilizes multiple network paths to reduce network tail latency. Its interface is similar to UD; in particular, it supports message sizes up to the MTU, with error handling extended to support reliable communication.

Signed-off-by: Gal Pressman
---
 drivers/infiniband/core/verbs.c | 2 ++
 include/rdma/ib_verbs.h         | 9 +++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 178899e3ce73..970744ffbf33 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -206,6 +206,8 @@ rdma_node_get_transport(enum rdma_node_type node_type)
 		return RDMA_TRANSPORT_USNIC_UDP;
 	if (node_type == RDMA_NODE_RNIC)
 		return RDMA_TRANSPORT_IWARP;
+	if (node_type == RDMA_NODE_EFA)
+		return RDMA_TRANSPORT_EFA;
 
 	return RDMA_TRANSPORT_IB;
 }
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 92633c15125b..8d4b07b346b7 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -108,6 +108,7 @@ enum rdma_node_type {
 	RDMA_NODE_RNIC,
 	RDMA_NODE_USNIC,
 	RDMA_NODE_USNIC_UDP,
+	RDMA_NODE_EFA,
 };
 
 enum {
@@ -119,14 +120,16 @@ enum rdma_transport_type {
 	RDMA_TRANSPORT_IB,
 	RDMA_TRANSPORT_IWARP,
 	RDMA_TRANSPORT_USNIC,
-	RDMA_TRANSPORT_USNIC_UDP
+	RDMA_TRANSPORT_USNIC_UDP,
+	RDMA_TRANSPORT_EFA,
 };
 
 enum rdma_protocol_type {
 	RDMA_PROTOCOL_IB,
 	RDMA_PROTOCOL_IBOE,
 	RDMA_PROTOCOL_IWARP,
-	RDMA_PROTOCOL_USNIC_UDP
+	RDMA_PROTOCOL_USNIC_UDP,
+	RDMA_PROTOCOL_EFA,
 };
 
 __attribute_const__ enum rdma_transport_type
@@ -538,6 +541,7 @@ static inline struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 #define RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP 0x00800000
 #define RDMA_CORE_CAP_PROT_RAW_PACKET 0x01000000
 #define RDMA_CORE_CAP_PROT_USNIC 0x02000000
+#define RDMA_CORE_CAP_PROT_EFA 0x04000000
 
 #define RDMA_CORE_PORT_IB_GRH_REQUIRED (RDMA_CORE_CAP_IB_GRH_REQUIRED \
 					| RDMA_CORE_CAP_PROT_ROCE \
@@ -1095,6 +1099,7 @@ enum ib_qp_type {
 	IB_QPT_RAW_PACKET = 8,
 	IB_QPT_XRC_INI = 9,
 	IB_QPT_XRC_TGT,
+	IB_QPT_SRD,
 	IB_QPT_MAX,
 	IB_QPT_DRIVER = 0xFF,
 	/* Reserve a range for qp types internal to the low level driver.
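
For readers following the series: once a provider wires up the new QP type, requesting an SRD QP through the verbs API looks just like requesting a UD QP, except that qp_type is set to IB_QPT_SRD. The sketch below is illustrative only and is not part of the patch; pd, send_cq, recv_cq and the capability numbers are placeholders, and whether a given provider accepts SRD QPs from in-kernel consumers is provider-specific (EFA exposes SRD primarily to user space).

/* Illustrative sketch, not part of this patch. */
struct ib_qp_init_attr init_attr = {
	.qp_type = IB_QPT_SRD,		/* new QP type introduced above */
	.send_cq = send_cq,		/* placeholder: previously created CQs */
	.recv_cq = recv_cq,
	.cap = {
		.max_send_wr  = 256,	/* placeholder sizing */
		.max_recv_wr  = 256,
		.max_send_sge = 1,
		.max_recv_sge = 1,
	},
};
struct ib_qp *qp;

qp = ib_create_qp(pd, &init_attr);	/* pd: placeholder protection domain */
if (IS_ERR(qp))
	return PTR_ERR(qp);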
From patchwork Tue Dec 4 12:04:18 2018
X-Patchwork-Submitter: Gal Pressman
X-Patchwork-Id: 10711631
From: Gal Pressman
To: Doug Ledford, Jason Gunthorpe
CC: Alexander Matushevsky, Yossi Leybovich, Tom Tucker, Gal Pressman
Subject: [PATCH rdma-next 02/13] RDMA/efa: Add EFA device definitions
Date: Tue, 4 Dec 2018 14:04:18 +0200
Message-ID: <1543925069-8838-3-git-send-email-galpress@amazon.com>
In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com>
References: <1543925069-8838-1-git-send-email-galpress@amazon.com>

The EFA PCIe device implements a single Admin Queue (AQ) and Admin Completion Queue (ACQ) pair used to initialize the device and communicate configuration with it (similar to the NVMe and ENA network devices). Through this pair we run SET/GET commands for querying and configuring the device, CREATE/MODIFY/DESTROY commands for queues, and IB-specific commands such as Address Handle (AH), Memory Registration (MR) and Protection Domain (PD) management. In addition to the admin queues (AQ/ACQ), there are data path queues, classified as Send Queues (SQ), Receive Queues (RQ) and Completion Queues (CQ).

Signed-off-by: Gal Pressman
---
 drivers/infiniband/hw/efa/efa_admin_cmds_defs.h | 783 ++++++++++++++++++++++++
 drivers/infiniband/hw/efa/efa_admin_defs.h      | 135 ++++
 drivers/infiniband/hw/efa/efa_common_defs.h     |  17 +
 drivers/infiniband/hw/efa/efa_regs_defs.h       | 117 ++++
 4 files changed, 1052 insertions(+)
 create mode 100644 drivers/infiniband/hw/efa/efa_admin_cmds_defs.h
 create mode 100644 drivers/infiniband/hw/efa/efa_admin_defs.h
 create mode 100644 drivers/infiniband/hw/efa/efa_common_defs.h
 create mode 100644 drivers/infiniband/hw/efa/efa_regs_defs.h

diff --git a/drivers/infiniband/hw/efa/efa_admin_cmds_defs.h b/drivers/infiniband/hw/efa/efa_admin_cmds_defs.h
new file mode 100644
index 000000000000..b8516d50df12
--- /dev/null
+++ b/drivers/infiniband/hw/efa/efa_admin_cmds_defs.h
@@ -0,0 +1,783 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. + */ +#ifndef _EFA_ADMIN_CMDS_H_ +#define _EFA_ADMIN_CMDS_H_ + +#define EFA_ADMIN_API_VERSION_MAJOR 0 +#define EFA_ADMIN_API_VERSION_MINOR 1 + +/* EFA admin queue opcodes */ +enum efa_admin_aq_opcode { + /* starting opcode of efa admin commands */ + EFA_ADMIN_START_CMD_RANGE = 100, + /* Query device capabilities */ + EFA_ADMIN_QUERY_DEV = EFA_ADMIN_START_CMD_RANGE, + /* Modify device attributes */ + EFA_ADMIN_MODIFY_DEV = 101, + /* Create QP */ + EFA_ADMIN_CREATE_QP = 102, + /* Modify QP */ + EFA_ADMIN_MODIFY_QP = 103, + /* Query QP */ + EFA_ADMIN_QUERY_QP = 104, + /* Destroy QP */ + EFA_ADMIN_DESTROY_QP = 105, + /* Create Address Handle */ + EFA_ADMIN_CREATE_AH = 106, + /* Destroy Address Handle */ + EFA_ADMIN_DESTROY_AH = 107, + /* Register Memory Region */ + EFA_ADMIN_REG_MR = 108, + /* Deregister Memory Region */ + EFA_ADMIN_DEREG_MR = 109, + /* Create Completion Q */ + EFA_ADMIN_CREATE_CQ = 110, + /* Destroy Completion Q */ + EFA_ADMIN_DESTROY_CQ = 111, + EFA_ADMIN_GET_FEATURE = 112, + EFA_ADMIN_SET_FEATURE = 113, + EFA_ADMIN_GET_STATS = 114, + EFA_ADMIN_MAX_OPCODE = 114, +}; + +enum efa_admin_aq_feature_id { + EFA_ADMIN_DEVICE_ATTR = 1, + EFA_ADMIN_AENQ_CONFIG = 2, + EFA_ADMIN_NETWORK_ATTR = 3, + EFA_ADMIN_QUEUE_ATTR = 4, + EFA_ADMIN_HW_HINTS = 5, + EFA_ADMIN_FEATURES_OPCODE_NUM = 8, +}; + +/* QP transport type */ +enum efa_admin_qp_type { + /* Unreliable Datagram */ + EFA_ADMIN_QP_TYPE_UD = 1, + /* Scalable Reliable Datagram */ + EFA_ADMIN_QP_TYPE_SRD = 2, +}; + +/* QP state */ +enum efa_admin_qp_state { + /* Reset queue */ + EFA_ADMIN_QP_STATE_RESET = 0, + /* Init queue */ + EFA_ADMIN_QP_STATE_INIT = 1, + /* Ready to receive */ + EFA_ADMIN_QP_STATE_RTR = 2, + /* Ready to send */ + EFA_ADMIN_QP_STATE_RTS = 3, + /* Send queue drain */ + EFA_ADMIN_QP_STATE_SQD = 4, + /* Send queue error */ + EFA_ADMIN_QP_STATE_SQE
= 5, + /* Queue in error state */ + EFA_ADMIN_QP_STATE_ERR = 6, +}; + +/* Device attributes */ +struct efa_admin_dev_attr { + /* FW version */ + u32 fw_ver; + + u32 max_mr_size; + + u32 max_qp; + + u32 max_cq; + + u32 max_mr; + + u32 max_pd; + + u32 max_ah; + + /* Enable low-latency queues */ + u32 llq_en; +}; + +/* Query device command */ +struct efa_admin_query_dev { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; +}; + +/* Query device response. */ +struct efa_admin_query_dev_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; + + /* Device attributes */ + struct efa_admin_dev_attr dev_attr; +}; + +/* Modify device command */ +struct efa_admin_modify_dev { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* Device attributes */ + struct efa_admin_dev_attr dev_attr; +}; + +/* Modify device response. */ +struct efa_admin_modify_dev_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; +}; + +/* + * QP allocation sizes, converted by fabric QueuePair (QP) create command + * from QP capabilities. + */ +struct efa_admin_qp_alloc_size { + /* Send descriptor ring size in bytes */ + u32 send_queue_ring_size; + + /* Max number of WQEs that can be outstanding on send queue. */ + u32 send_queue_depth; + + /* + * Recv descriptor ring size in bytes, sufficient for user-provided + * number of WQEs + */ + u32 recv_queue_ring_size; + + /* MBZ */ + u32 reserved; +}; + +/* Create QP command. */ +struct efa_admin_create_qp_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* Protection Domain associated with this QP */ + u16 pd; + + /* QP type */ + u8 qp_type; + + /* + * 0 : sq_virt - If set, SQ ring base address is + * virtual (IOVA returned by MR registration) + * 1 : rq_virt - If set, RQ ring base address is + * virtual (IOVA returned by MR registration) + * 7:2 : reserved - MBZ + */ + u8 flags; + + /* + * Send queue (SQ) ring base physical address. This field is not + * used if this is a Low Latency Queue(LLQ). + */ + u64 sq_base_addr; + + /* Receive queue (RQ) ring base address. */ + u64 rq_base_addr; + + /* Index of CQ to be associated with Send Queue completions */ + u32 send_cq_idx; + + /* Index of CQ to be associated with Recv Queue completions */ + u32 recv_cq_idx; + + /* + * Memory registration key for the SQ ring, used only when not in + * LLQ mode and base address is virtual + */ + u32 sq_l_key; + + /* + * Memory registration key for the RQ ring, used only when base + * address is virtual + */ + u32 rq_l_key; + + /* Requested QP allocation sizes */ + struct efa_admin_qp_alloc_size qp_alloc_size; +}; + +/* Create QP response. 
*/ +struct efa_admin_create_qp_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; + + /* Opaque handle to be used for consequent operations on the QP */ + u32 qp_handle; + + /* QP number in the given EFA virtual device */ + u16 qp_num; + + /* MBZ */ + u16 reserved; + + /* Index of sub-CQ for Send Queue completions */ + u16 send_sub_cq_idx; + + /* Index of sub-CQ for Receive Queue completions */ + u16 recv_sub_cq_idx; + + /* SQ doorbell address, as offset to PCIe DB BAR */ + u32 sq_db_offset; + + /* RQ doorbell address, as offset to PCIe DB BAR */ + u32 rq_db_offset; + + /* + * low latency send queue ring base address as an offset to PCIe + * MMIO LLQ_MEM BAR + */ + u32 llq_descriptors_offset; +}; + +/* Modify QP command */ +struct efa_admin_modify_qp_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* QP handle returned by create_qp command */ + u32 qp_handle; + + /* QP state */ + u32 qp_state; + + /* Override current QP state (before applying the transition) */ + u32 cur_qp_state; + + /* QKey */ + u32 qkey; + + /* Enable async notification when SQ is drained */ + u32 sq_drained_async_notify; +}; + +/* Modify QP response */ +struct efa_admin_modify_qp_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; +}; + +/* Query QP command */ +struct efa_admin_query_qp_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* QP handle returned by create_qp command */ + u32 qp_handle; +}; + +/* Query QP response */ +struct efa_admin_query_qp_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; + + /* QP state */ + u32 qp_state; + + /* QKey */ + u32 qkey; + + /* Indicates that draining is in progress */ + u32 sq_draining; +}; + +/* Destroy QP command */ +struct efa_admin_destroy_qp_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* QP handle returned by create_qp command */ + u32 qp_handle; +}; + +/* Destroy QP response */ +struct efa_admin_destroy_qp_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; +}; + +/* + * Create Address Handle command parameters. Must not be called more than + * once for the same destination + */ +struct efa_admin_create_ah_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* Destination address in network byte order */ + u8 dest_addr[16]; +}; + +/* Create Address Handle response */ +struct efa_admin_create_ah_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; + + /* Target interface address handle (opaque) */ + u16 ah; + + u16 reserved; +}; + +/* Destroy Address Handle command parameters. */ +struct efa_admin_destroy_ah_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* Target interface address handle (opaque) */ + u16 ah; + + u16 reserved; +}; + +/* Destroy Address Handle response */ +struct efa_admin_destroy_ah_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; +}; + +/* + * Registration of MemoryRegion, required for QP working with Virtual + * Addresses. 
In standard verbs semantics, region length is limited to 2GB + * space, but EFA offers larger MR support for large memory space, to ease + * on users working with very large datasets (i.e. full GPU memory mapping). + */ +struct efa_admin_reg_mr_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* Protection Domain */ + u16 pd; + + /* MBZ */ + u16 reserved16_w1; + + /* Physical Buffer List, each element is page-aligned. */ + union { + /* + * Inline array of guest-physical page addresses of user + * memory pages (optimization for short region + * registrations) + */ + u64 inline_pbl_array[4]; + + /* points to PBL (direct or indirect, chained if needed) */ + struct efa_admin_ctrl_buff_info pbl; + } pbl; + + /* Memory region length, in bytes. */ + u64 mr_length; + + /* + * flags and page size + * 4:0 : phys_page_size_shift - page size is (1 << + * phys_page_size_shift). Page size is used for + * building the Virtual to Physical address mapping + * 6:5 : reserved - MBZ + * 7 : mem_addr_phy_mode_en - Enable bit for physical + * memory registration (no translation), can be used + * only by privileged clients. If set, PBL must + * contain a single entry. + */ + u8 flags; + + /* + * permissions + * 0 : local_write_enable - Write permissions: value + * of 1 needed for RQ buffers and for RDMA write + * 7:1 : reserved1 - remote access flags, etc + */ + u8 permissions; + + u16 reserved16_w5; + + /* number of pages in PBL (redundant, could be calculated) */ + u32 page_num; + + /* + * IO Virtual Address associated with this MR. If + * mem_addr_phy_mode_en is set, contains the physical address of + * the region. + */ + u64 iova; +}; + +struct efa_admin_reg_mr_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; + + /* + * L_Key, to be used in conjunction with local buffer references in + * SQ and RQ WQE, or with virtual RQ/CQ rings + */ + u32 l_key; + + /* + * R_Key, to be used in RDMA messages to refer to remotely accessed + * memory region + */ + u32 r_key; +}; + +/* Deregister a MemoryRegion */ +struct efa_admin_dereg_mr_cmd { + /* Common Admin Queue descriptor */ + struct efa_admin_aq_common_desc aq_common_desc; + + /* L_Key, memory region's l_key */ + u32 l_key; +}; + +struct efa_admin_dereg_mr_resp { + /* Common Admin Queue completion descriptor */ + struct efa_admin_acq_common_desc acq_common_desc; +}; + +/* Create CQ command */ +struct efa_admin_create_cq_cmd { + struct efa_admin_aq_common_desc aq_common_desc; + + /* + * 4:0 : reserved5 + * 5 : interrupt_mode_enabled - if set, cq operates + * in interrupt mode (i.e. CQ events and MSI-X are + * generated), otherwise - polling + * 6 : virt - If set, ring base address is virtual + * (IOVA returned by MR registration) + * 7 : reserved6 + */ + u8 cq_caps_1; + + /* + * 4:0 : cq_entry_size_words - size of CQ entry in + * 32-bit words, valid values: 4, 8. + * 7:5 : reserved7 + */ + u8 cq_caps_2; + + /* completion queue depth in # of entries. must be power of 2 */ + u16 cq_depth; + + /* msix vector assigned to this cq */ + u32 msix_vector_idx; + + /* + * CQ ring base address, virtual or physical depending on 'virt' + * flag + */ + struct efa_common_mem_addr cq_ba; + + /* + * Memory registration key for the ring, used only when base + * address is virtual + */ + u32 l_key; + + /* + * number of sub cqs - must be equal to sub_cqs_per_cq of queue + * attributes. 
+ */ + u16 num_sub_cqs; + + u16 reserved8; +}; + +struct efa_admin_create_cq_resp { + struct efa_admin_acq_common_desc acq_common_desc; + + u16 cq_idx; + + /* actual cq depth in number of entries */ + u16 cq_actual_depth; +}; + +/* Destroy CQ command */ +struct efa_admin_destroy_cq_cmd { + struct efa_admin_aq_common_desc aq_common_desc; + + u16 cq_idx; + + u16 reserved1; +}; + +struct efa_admin_destroy_cq_resp { + struct efa_admin_acq_common_desc acq_common_desc; +}; + +/* + * EFA AQ Get Statistics command. Extended statistics are placed in control + * buffer pointed by AQ entry + */ +struct efa_admin_aq_get_stats_cmd { + struct efa_admin_aq_common_desc aq_common_descriptor; + + union { + /* command specific inline data */ + u32 inline_data_w1[3]; + + struct efa_admin_ctrl_buff_info control_buffer; + } u; + + /* stats type as defined in enum efa_admin_get_stats_type */ + u8 type; + + /* stats scope defined in enum efa_admin_get_stats_scope */ + u8 scope; + + u16 reserved3; + + /* queue id. used when scope is specific_queue */ + u16 queue_idx; + + /* + * device id, value 0xFFFF means mine. only privileged device can get + * stats of other device + */ + u16 device_id; +}; + +/* Basic Statistics Command. */ +struct efa_admin_basic_stats { + u64 tx_bytes; + + u64 tx_pkts; + + u64 rx_bytes; + + u64 rx_pkts; + + u64 rx_drops; +}; + +struct efa_admin_acq_get_stats_resp { + struct efa_admin_acq_common_desc acq_common_desc; + + struct efa_admin_basic_stats basic_stats; +}; + +struct efa_admin_get_set_feature_common_desc { + /* + * 1:0 : select - 0x1 - current value; 0x3 - default + * value + * 7:3 : reserved3 + */ + u8 flags; + + /* as appears in efa_admin_aq_feature_id */ + u8 feature_id; + + /* MBZ */ + u16 reserved16; +}; + +struct efa_admin_feature_device_attr_desc { + u32 fw_version; + + u32 admin_api_version; + + u32 device_version; + + /* MBZ */ + u32 reserved1; + + /* bitmap of efa_admin_aq_feature_id */ + u64 supported_features; + + /* Indicates how many bits are used physical address access. */ + u8 phys_addr_width; + + /* MBZ */ + u8 reserved2; + + /* Indicates how many bits are used virtual address access. */ + u8 virt_addr_width; + + /* MBZ */ + u8 reserved3; + + /* Bar used for SQ and RQ doorbells. 
*/ + u16 db_bar; + + /* MBZ */ + u16 reserved; +}; + +struct efa_admin_feature_queue_attr_desc { + /* The maximum number of send queues supported */ + u32 max_sq; + + u32 max_sq_depth; + + /* max send wr used in inline-buf */ + u32 inline_buf_size; + + /* The maximum number of receive queues supported */ + u32 max_rq; + + u32 max_rq_depth; + + /* The maximum number of completion queues supported per VF */ + u32 max_cq; + + u32 max_cq_depth; + + /* Number of sub-CQs to be created for each CQ */ + u16 sub_cqs_per_cq; + + u16 reserved; + + /* + * Maximum number of SGEs (buffs) allowed for a single send work + * queue element (WQE) + */ + u16 max_wr_send_sges; + + /* Maximum number of SGEs allowed for a single recv WQE */ + u16 max_wr_recv_sges; + + /* The maximum number of memory regions supported */ + u32 max_mr; + + /* The maximum number of pages can be registered */ + u32 max_mr_pages; + + /* The maximum number of protection domains supported */ + u32 max_pd; + + /* The maximum number of address handles supported */ + u32 max_ah; +}; + +struct efa_admin_feature_aenq_desc { + /* bitmask for AENQ groups the device can report */ + u32 supported_groups; + + /* bitmask for AENQ groups to report */ + u32 enabled_groups; +}; + +struct efa_admin_feature_network_attr_desc { + /* Raw address data in network byte order */ + u8 addr[16]; + + u32 mtu; +}; + +/* + * When hint value is 0, hints capabilities are not supported or driver + * should use its own predefined value + */ +struct efa_admin_hw_hints { + /* value in ms */ + u16 mmio_read_timeout; + + /* value in ms */ + u16 driver_watchdog_timeout; + + /* value in ms */ + u16 admin_completion_timeout; + + /* poll interval in ms */ + u16 poll_interval; +}; + +struct efa_admin_get_feature_cmd { + struct efa_admin_aq_common_desc aq_common_descriptor; + + struct efa_admin_ctrl_buff_info control_buffer; + + struct efa_admin_get_set_feature_common_desc feature_common; + + u32 raw[11]; +}; + +struct efa_admin_get_feature_resp { + struct efa_admin_acq_common_desc acq_common_desc; + + union { + u32 raw[14]; + + struct efa_admin_feature_device_attr_desc device_attr; + + struct efa_admin_feature_aenq_desc aenq; + + struct efa_admin_feature_network_attr_desc network_attr; + + struct efa_admin_feature_queue_attr_desc queue_attr; + + struct efa_admin_hw_hints hw_hints; + } u; +}; + +struct efa_admin_set_feature_cmd { + struct efa_admin_aq_common_desc aq_common_descriptor; + + struct efa_admin_ctrl_buff_info control_buffer; + + struct efa_admin_get_set_feature_common_desc feature_common; + + union { + u32 raw[11]; + + /* AENQ configuration */ + struct efa_admin_feature_aenq_desc aenq; + } u; +}; + +struct efa_admin_set_feature_resp { + struct efa_admin_acq_common_desc acq_common_desc; + + union { + u32 raw[14]; + } u; +}; + +/* asynchronous event notification groups */ +enum efa_admin_aenq_group { + EFA_ADMIN_FATAL_ERROR = 1, + EFA_ADMIN_WARNING = 2, + EFA_ADMIN_NOTIFICATION = 3, + EFA_ADMIN_KEEP_ALIVE = 4, + EFA_ADMIN_AENQ_GROUPS_NUM = 5, +}; + +enum efa_admin_aenq_notification_syndrom { + EFA_ADMIN_SUSPEND = 0, + EFA_ADMIN_RESUME = 1, + EFA_ADMIN_UPDATE_HINTS = 2, +}; + +struct efa_admin_mmio_req_read_less_resp { + u16 req_id; + + u16 reg_off; + + /* value is valid when poll is cleared */ + u32 reg_val; +}; + +/* create_qp_cmd */ +#define EFA_ADMIN_CREATE_QP_CMD_SQ_VIRT_MASK BIT(0) +#define EFA_ADMIN_CREATE_QP_CMD_RQ_VIRT_SHIFT 1 +#define EFA_ADMIN_CREATE_QP_CMD_RQ_VIRT_MASK BIT(1) + +/* reg_mr_cmd */ +#define EFA_ADMIN_REG_MR_CMD_PHYS_PAGE_SIZE_SHIFT_MASK 
GENMASK(4, 0) +#define EFA_ADMIN_REG_MR_CMD_MEM_ADDR_PHY_MODE_EN_SHIFT 7 +#define EFA_ADMIN_REG_MR_CMD_MEM_ADDR_PHY_MODE_EN_MASK BIT(7) +#define EFA_ADMIN_REG_MR_CMD_LOCAL_WRITE_ENABLE_MASK BIT(0) + +/* create_cq_cmd */ +#define EFA_ADMIN_CREATE_CQ_CMD_INTERRUPT_MODE_ENABLED_SHIFT 5 +#define EFA_ADMIN_CREATE_CQ_CMD_INTERRUPT_MODE_ENABLED_MASK BIT(5) +#define EFA_ADMIN_CREATE_CQ_CMD_VIRT_SHIFT 6 +#define EFA_ADMIN_CREATE_CQ_CMD_VIRT_MASK BIT(6) +#define EFA_ADMIN_CREATE_CQ_CMD_CQ_ENTRY_SIZE_WORDS_MASK GENMASK(4, 0) + +/* get_set_feature_common_desc */ +#define EFA_ADMIN_GET_SET_FEATURE_COMMON_DESC_SELECT_MASK GENMASK(1, 0) + +#endif /*_EFA_ADMIN_CMDS_H_ */ diff --git a/drivers/infiniband/hw/efa/efa_admin_defs.h b/drivers/infiniband/hw/efa/efa_admin_defs.h new file mode 100644 index 000000000000..b2651e7cd2ac --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_admin_defs.h @@ -0,0 +1,135 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. + */ +#ifndef _EFA_ADMIN_H_ +#define _EFA_ADMIN_H_ + +enum efa_admin_aq_completion_status { + EFA_ADMIN_SUCCESS = 0, + EFA_ADMIN_RESOURCE_ALLOCATION_FAILURE = 1, + EFA_ADMIN_BAD_OPCODE = 2, + EFA_ADMIN_UNSUPPORTED_OPCODE = 3, + EFA_ADMIN_MALFORMED_REQUEST = 4, + /* Additional status is provided in ACQ entry extended_status */ + EFA_ADMIN_ILLEGAL_PARAMETER = 5, + EFA_ADMIN_UNKNOWN_ERROR = 6, + EFA_ADMIN_RESOURCE_BUSY = 7, +}; + +struct efa_admin_aq_common_desc { + /* + * 11:0 : command_id + * 15:12 : reserved12 + */ + u16 command_id; + + /* as appears in efa_admin_aq_opcode */ + u8 opcode; + + /* + * 0 : phase + * 1 : ctrl_data - control buffer address valid + * 2 : ctrl_data_indirect - control buffer address + * points to list of pages with addresses of control + * buffers + * 7:3 : reserved3 + */ + u8 flags; +}; + +/* + * used in efa_admin_aq_entry. Can point directly to control data, or to a + * page list chunk. Used also at the end of indirect mode page list chunks, + * for chaining. 
+ */ +struct efa_admin_ctrl_buff_info { + u32 length; + + struct efa_common_mem_addr address; +}; + +struct efa_admin_aq_entry { + struct efa_admin_aq_common_desc aq_common_descriptor; + + union { + u32 inline_data_w1[3]; + + struct efa_admin_ctrl_buff_info control_buffer; + } u; + + u32 inline_data_w4[12]; +}; + +struct efa_admin_acq_common_desc { + /* + * command identifier to associate it with the aq descriptor + * 11:0 : command_id + * 15:12 : reserved12 + */ + u16 command; + + u8 status; + + /* + * 0 : phase + * 7:1 : reserved1 + */ + u8 flags; + + u16 extended_status; + + /* + * indicates to the driver which AQ entry has been consumed by the + * device and could be reused + */ + u16 sq_head_indx; +}; + +struct efa_admin_acq_entry { + struct efa_admin_acq_common_desc acq_common_descriptor; + + u32 response_specific_data[14]; +}; + +struct efa_admin_aenq_common_desc { + u16 group; + + u16 syndrom; + + /* + * 0 : phase + * 7:1 : reserved - MBZ + */ + u8 flags; + + u8 reserved1[3]; + + u32 timestamp_low; + + u32 timestamp_high; +}; + +struct efa_admin_aenq_entry { + struct efa_admin_aenq_common_desc aenq_common_desc; + + /* command specific inline data */ + u32 inline_data_w4[12]; +}; + +/* aq_common_desc */ +#define EFA_ADMIN_AQ_COMMON_DESC_COMMAND_ID_MASK GENMASK(11, 0) +#define EFA_ADMIN_AQ_COMMON_DESC_PHASE_MASK BIT(0) +#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_SHIFT 1 +#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_MASK BIT(1) +#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_SHIFT 2 +#define EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK BIT(2) + +/* acq_common_desc */ +#define EFA_ADMIN_ACQ_COMMON_DESC_COMMAND_ID_MASK GENMASK(11, 0) +#define EFA_ADMIN_ACQ_COMMON_DESC_PHASE_MASK BIT(0) + +/* aenq_common_desc */ +#define EFA_ADMIN_AENQ_COMMON_DESC_PHASE_MASK BIT(0) + +#endif /*_EFA_ADMIN_H_ */ diff --git a/drivers/infiniband/hw/efa/efa_common_defs.h b/drivers/infiniband/hw/efa/efa_common_defs.h new file mode 100644 index 000000000000..7dcee8e9a930 --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_common_defs.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. + */ +#ifndef _EFA_COMMON_H_ +#define _EFA_COMMON_H_ + +#define EFA_COMMON_SPEC_VERSION_MAJOR 2 +#define EFA_COMMON_SPEC_VERSION_MINOR 0 + +struct efa_common_mem_addr { + u32 mem_addr_low; + + u32 mem_addr_high; +}; + +#endif /*_EFA_COMMON_H_ */ diff --git a/drivers/infiniband/hw/efa/efa_regs_defs.h b/drivers/infiniband/hw/efa/efa_regs_defs.h new file mode 100644 index 000000000000..906bc0c88240 --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_regs_defs.h @@ -0,0 +1,117 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. 
+ */ +#ifndef _EFA_REGS_H_ +#define _EFA_REGS_H_ + +enum efa_regs_reset_reason_types { + EFA_REGS_RESET_NORMAL = 0, + /* Keep alive timeout */ + EFA_REGS_RESET_KEEP_ALIVE_TO = 1, + EFA_REGS_RESET_ADMIN_TO = 2, + EFA_REGS_RESET_INIT_ERR = 3, + EFA_REGS_RESET_DRIVER_INVALID_STATE = 4, + EFA_REGS_RESET_OS_TRIGGER = 5, + EFA_REGS_RESET_SHUTDOWN = 6, + EFA_REGS_RESET_USER_TRIGGER = 7, + EFA_REGS_RESET_GENERIC = 8, +}; + +/* common_registers offsets */ + +/* 0x54 base */ +#define EFA_REGS_DEV_CTL_OFF 0x54 +#define EFA_REGS_DEV_STS_OFF 0x58 + +/* efa_registers offsets */ + +/* 0x800 base */ +#define EFA_REGS_VERSION_OFF 0x800 +#define EFA_REGS_CONTROLLER_VERSION_OFF 0x804 +#define EFA_REGS_CAPS_OFF 0x808 +#define EFA_REGS_CAPS_EXT_OFF 0x80c +#define EFA_REGS_AQ_BASE_LO_OFF 0x810 +#define EFA_REGS_AQ_BASE_HI_OFF 0x814 +#define EFA_REGS_AQ_CAPS_OFF 0x818 +#define EFA_REGS_ACQ_BASE_LO_OFF 0x820 +#define EFA_REGS_ACQ_BASE_HI_OFF 0x824 +#define EFA_REGS_ACQ_CAPS_OFF 0x828 +#define EFA_REGS_AQ_PROD_DB_OFF 0x82c +#define EFA_REGS_AENQ_CAPS_OFF 0x830 +#define EFA_REGS_AENQ_BASE_LO_OFF 0x834 +#define EFA_REGS_AENQ_BASE_HI_OFF 0x838 +#define EFA_REGS_AENQ_CONS_DB_OFF 0x83c +#define EFA_REGS_INTR_MASK_OFF 0x844 +#define EFA_REGS_MMIO_REG_READ_OFF 0x84c +#define EFA_REGS_MMIO_RESP_LO_OFF 0x850 +#define EFA_REGS_MMIO_RESP_HI_OFF 0x854 + +/* dev_ctl register */ +#define EFA_REGS_DEV_CTL_DEV_RESET_MASK 0x1 +#define EFA_REGS_DEV_CTL_AQ_RESTART_SHIFT 1 +#define EFA_REGS_DEV_CTL_AQ_RESTART_MASK 0x2 +#define EFA_REGS_DEV_CTL_RESET_REASON_SHIFT 28 +#define EFA_REGS_DEV_CTL_RESET_REASON_MASK 0xf0000000 + +/* dev_sts register */ +#define EFA_REGS_DEV_STS_READY_MASK 0x1 +#define EFA_REGS_DEV_STS_AQ_RESTART_IN_PROGRESS_SHIFT 1 +#define EFA_REGS_DEV_STS_AQ_RESTART_IN_PROGRESS_MASK 0x2 +#define EFA_REGS_DEV_STS_AQ_RESTART_FINISHED_SHIFT 2 +#define EFA_REGS_DEV_STS_AQ_RESTART_FINISHED_MASK 0x4 +#define EFA_REGS_DEV_STS_RESET_IN_PROGRESS_SHIFT 3 +#define EFA_REGS_DEV_STS_RESET_IN_PROGRESS_MASK 0x8 +#define EFA_REGS_DEV_STS_RESET_FINISHED_SHIFT 4 +#define EFA_REGS_DEV_STS_RESET_FINISHED_MASK 0x10 +#define EFA_REGS_DEV_STS_FATAL_ERROR_SHIFT 5 +#define EFA_REGS_DEV_STS_FATAL_ERROR_MASK 0x20 + +/* version register */ +#define EFA_REGS_VERSION_MINOR_VERSION_MASK 0xff +#define EFA_REGS_VERSION_MAJOR_VERSION_SHIFT 8 +#define EFA_REGS_VERSION_MAJOR_VERSION_MASK 0xff00 + +/* controller_version register */ +#define EFA_REGS_CONTROLLER_VERSION_SUBMINOR_VERSION_MASK 0xff +#define EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_SHIFT 8 +#define EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_MASK 0xff00 +#define EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_SHIFT 16 +#define EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_MASK 0xff0000 +#define EFA_REGS_CONTROLLER_VERSION_IMPL_ID_SHIFT 24 +#define EFA_REGS_CONTROLLER_VERSION_IMPL_ID_MASK 0xff000000 + +/* caps register */ +#define EFA_REGS_CAPS_CONTIGUOUS_QUEUE_REQUIRED_MASK 0x1 +#define EFA_REGS_CAPS_RESET_TIMEOUT_SHIFT 1 +#define EFA_REGS_CAPS_RESET_TIMEOUT_MASK 0x3e +#define EFA_REGS_CAPS_DMA_ADDR_WIDTH_SHIFT 8 +#define EFA_REGS_CAPS_DMA_ADDR_WIDTH_MASK 0xff00 +#define EFA_REGS_CAPS_ADMIN_CMD_TO_SHIFT 16 +#define EFA_REGS_CAPS_ADMIN_CMD_TO_MASK 0xf0000 + +/* aq_caps register */ +#define EFA_REGS_AQ_CAPS_AQ_DEPTH_MASK 0xffff +#define EFA_REGS_AQ_CAPS_AQ_ENTRY_SIZE_SHIFT 16 +#define EFA_REGS_AQ_CAPS_AQ_ENTRY_SIZE_MASK 0xffff0000 + +/* acq_caps register */ +#define EFA_REGS_ACQ_CAPS_ACQ_DEPTH_MASK 0xffff +#define EFA_REGS_ACQ_CAPS_ACQ_ENTRY_SIZE_SHIFT 16 +#define EFA_REGS_ACQ_CAPS_ACQ_ENTRY_SIZE_MASK 0xff0000 
+#define EFA_REGS_ACQ_CAPS_ACQ_MSIX_VECTOR_SHIFT 24 +#define EFA_REGS_ACQ_CAPS_ACQ_MSIX_VECTOR_MASK 0xff000000 + +/* aenq_caps register */ +#define EFA_REGS_AENQ_CAPS_AENQ_DEPTH_MASK 0xffff +#define EFA_REGS_AENQ_CAPS_AENQ_ENTRY_SIZE_SHIFT 16 +#define EFA_REGS_AENQ_CAPS_AENQ_ENTRY_SIZE_MASK 0xff0000 +#define EFA_REGS_AENQ_CAPS_AENQ_MSIX_VECTOR_SHIFT 24 +#define EFA_REGS_AENQ_CAPS_AENQ_MSIX_VECTOR_MASK 0xff000000 + +/* mmio_reg_read register */ +#define EFA_REGS_MMIO_REG_READ_REQ_ID_MASK 0xffff +#define EFA_REGS_MMIO_REG_READ_REG_OFF_SHIFT 16 +#define EFA_REGS_MMIO_REG_READ_REG_OFF_MASK 0xffff0000 + +#endif /*_EFA_REGS_H_ */
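
To make the relationship between these definitions concrete, here is a rough sketch (not part of the patch) of how the structures above compose into a GET_FEATURE(DEVICE_ATTR) admin command. The helper name is hypothetical; assigning command_id, setting the phase bit and ringing the AQ doorbell are the job of the admin queue code added later in the series, so only the descriptor setup is shown.

#include <linux/string.h>
#include "efa_admin_cmds_defs.h"

/* Illustrative sketch only; real submission goes through the com layer added later. */
static void efa_example_build_get_device_attr(struct efa_admin_get_feature_cmd *cmd)
{
	memset(cmd, 0, sizeof(*cmd));

	/* GET_FEATURE opcode with the DEVICE_ATTR feature id selected */
	cmd->aq_common_descriptor.opcode = EFA_ADMIN_GET_FEATURE;
	cmd->feature_common.feature_id = EFA_ADMIN_DEVICE_ATTR;

	/*
	 * The admin queue code copies the command into an efa_admin_aq_entry,
	 * fills command_id and the phase bit, and waits for the matching
	 * completion; the response arrives as an efa_admin_get_feature_resp
	 * whose u.device_attr member carries efa_admin_feature_device_attr_desc.
	 */
}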
From patchwork Tue Dec 4 12:04:19 2018
X-Patchwork-Submitter: Gal Pressman
X-Patchwork-Id: 10711645
From: Gal Pressman
To: Doug Ledford, Jason Gunthorpe
CC: Alexander Matushevsky, Yossi Leybovich, Tom Tucker, Gal Pressman
Subject: [PATCH rdma-next 03/13] RDMA/efa: Add the PCI device id definitions
Date: Tue, 4 Dec 2018 14:04:19 +0200
Message-ID: <1543925069-8838-4-git-send-email-galpress@amazon.com>
In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com>
References: <1543925069-8838-1-git-send-email-galpress@amazon.com>

Add the EFA PCI device IDs.

Signed-off-by: Gal Pressman
---
 drivers/infiniband/hw/efa/efa_pci_id_tbl.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 drivers/infiniband/hw/efa/efa_pci_id_tbl.h

diff --git a/drivers/infiniband/hw/efa/efa_pci_id_tbl.h b/drivers/infiniband/hw/efa/efa_pci_id_tbl.h
new file mode 100644
index 000000000000..3bb21a95fe54
--- /dev/null
+++ b/drivers/infiniband/hw/efa/efa_pci_id_tbl.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
+/*
+ * Copyright 2018 Amazon.com, Inc. or its affiliates.
+ */
+
+#ifndef _EFA_PCI_ID_TBL_H_
+#define _EFA_PCI_ID_TBL_H_
+
+#ifndef PCI_VENDOR_ID_AMAZON
+#define PCI_VENDOR_ID_AMAZON 0x1d0f
+#endif
+
+#ifndef PCI_DEV_ID_EFA_VF
+#define PCI_DEV_ID_EFA_VF 0xefa0
+#endif
+
+#define EFA_PCI_DEVICE_ID(devid) \
+	{ PCI_VDEVICE(AMAZON, devid) }
+
+static const struct pci_device_id efa_pci_tbl[] = {
+	EFA_PCI_DEVICE_ID(PCI_DEV_ID_EFA_VF),
+	{ }
+};
+
+#endif /* _EFA_PCI_ID_TBL_H_ */
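
For context (not part of this patch): a table like efa_pci_tbl is normally consumed by a pci_driver registration. The probe/remove callbacks and the registration itself land later in the series; the hookup below is only a sketch, and the efa_probe/efa_remove names are assumptions used for illustration.

#include <linux/module.h>
#include <linux/pci.h>
#include "efa_pci_id_tbl.h"

/* Illustrative sketch; the real registration is added by a later patch. */
MODULE_DEVICE_TABLE(pci, efa_pci_tbl);

static struct pci_driver efa_pci_driver = {
	.name     = "efa",
	.id_table = efa_pci_tbl,	/* matches PCI_VENDOR_ID_AMAZON / 0xefa0 */
	.probe    = efa_probe,		/* assumed callback, defined elsewhere */
	.remove   = efa_remove,		/* assumed callback, defined elsewhere */
};

module_pci_driver(efa_pci_driver);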
From patchwork Tue Dec 4 12:04:20 2018
X-Patchwork-Submitter: Gal Pressman
X-Patchwork-Id: 10711633
From: Gal Pressman
To: Doug Ledford, Jason Gunthorpe
CC: Alexander Matushevsky, Yossi Leybovich, Tom Tucker, Gal Pressman
Subject: [PATCH rdma-next 04/13] RDMA/efa: Add the efa.h header file
Date: Tue, 4 Dec 2018 14:04:20 +0200
Message-ID: <1543925069-8838-5-git-send-email-galpress@amazon.com>
In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com>
References: <1543925069-8838-1-git-send-email-galpress@amazon.com>

Add the EFA driver's generic header file, defining the driver's device-independent internal data structures, including the bitmap and other management helpers.

Signed-off-by: Gal Pressman
---
 drivers/infiniband/hw/efa/efa.h | 191 ++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)
 create mode 100644 drivers/infiniband/hw/efa/efa.h

diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h
new file mode 100644
index 000000000000..9cc763923884
--- /dev/null
+++ b/drivers/infiniband/hw/efa/efa.h
@@ -0,0 +1,191 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates.
+ */ + +#ifndef _EFA_H_ +#define _EFA_H_ + +#include +#include +#include + +#include + +#include "efa_com_cmd.h" + +#define DRV_MODULE_NAME "efa" +#define DEVICE_NAME "Elastic Fabric Adapter (EFA)" + +#ifdef pr_fmt +#undef pr_fmt +#endif +#define pr_fmt(fmt) DRV_MODULE_NAME ": %s: " fmt, __func__ + +#define EFA_IRQNAME_SIZE 40 + +/* 1 for AENQ + ADMIN */ +#define EFA_NUM_MSIX_VEC 1 +#define EFA_MGMNT_MSIX_VEC_IDX 0 + +#define EFA_BITMAP_INVAL U32_MAX + +enum { + EFA_DEVICE_RUNNING_BIT, + EFA_MSIX_ENABLED_BIT +}; + +struct efa_caps { + u32 max_sq; + u32 max_sq_depth; /* wqes */ + u32 max_rq; + u32 max_rq_depth; /* wqes */ + u32 max_cq; + u32 max_cq_depth; /* cqes */ + u32 inline_buf_size; + u32 max_sq_sge; + u32 max_rq_sge; + u32 max_mr; + u64 max_mr_pages; + u64 page_size_cap; /* bytes */ + u32 max_pd; + u32 max_ah; + u16 sub_cqs_per_cq; + u16 max_inline_data; /* bytes */ +}; + +struct efa_bitmap { + u32 last; + u32 max; + u32 mask; + u32 avail; + /* Protects bitmap */ + spinlock_t lock; + unsigned long *table; +}; + +struct efa_irq { + irq_handler_t handler; + void *data; + int cpu; + u32 vector; + cpumask_t affinity_hint_mask; + char name[EFA_IRQNAME_SIZE]; +}; + +struct efa_sw_stats { + u64 alloc_pd_alloc_err; + u64 alloc_pd_bitmap_full_err; + u64 mmap_entry_alloc_err; + u64 create_qp_alloc_err; + u64 create_cq_alloc_err; + u64 reg_mr_alloc_err; + u64 alloc_ucontext_alloc_err; + u64 create_ah_alloc_err; +}; + +struct efa_stats { + struct efa_sw_stats sw_stats; + u64 keep_alive_rcvd; +}; + +struct efa_dev { + struct ib_device ibdev; + struct pci_dev *pdev; + struct efa_com_dev *edev; + struct efa_caps caps; + + u64 reg_bar_addr; + u64 reg_bar_len; + u64 mem_bar_addr; + u64 mem_bar_len; + u64 db_bar_addr; + u64 db_bar_len; + u8 addr[EFA_GID_SIZE]; + u32 mtu; + u8 db_bar_idx; + + int admin_msix_vector_idx; + unsigned long state; + struct efa_irq admin_irq; + + struct list_head ctx_list; + + /* Protects efa_dev state */ + struct mutex efa_dev_lock; + + struct list_head efa_ah_list; + /* Protects efa_ah_list */ + struct mutex ah_list_lock; + struct efa_bitmap pd_bitmap; + + struct efa_stats stats; +}; + +u32 efa_bitmap_alloc(struct efa_bitmap *bitmap); +void efa_bitmap_free(struct efa_bitmap *bitmap, u32 obj); +int efa_bitmap_init(struct efa_bitmap *bitmap, u32 num); +void efa_bitmap_cleanup(struct efa_bitmap *bitmap); +u32 efa_bitmap_avail(struct efa_bitmap *bitmap); + +int efa_get_device_attributes(struct efa_dev *dev, + struct efa_com_get_device_attr_result *result); + +int efa_query_device(struct ib_device *ibdev, + struct ib_device_attr *props, + struct ib_udata *udata); +int efa_query_port(struct ib_device *ibdev, u8 port, + struct ib_port_attr *props); +int efa_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, + int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr); +int efa_query_gid(struct ib_device *ibdev, u8 port, int index, + union ib_gid *gid); +int efa_query_pkey(struct ib_device *ibdev, u8 port, u16 index, + u16 *pkey); +struct ib_pd *efa_alloc_pd(struct ib_device *ibdev, + struct ib_ucontext *ibucontext, + struct ib_udata *udata); +int efa_dealloc_pd(struct ib_pd *ibpd); +int efa_destroy_qp_handle(struct efa_dev *dev, u32 qp_handle); +int efa_destroy_qp(struct ib_qp *ibqp); +struct ib_qp *efa_create_qp(struct ib_pd *ibpd, + struct ib_qp_init_attr *init_attr, + struct ib_udata *udata); +int efa_destroy_cq(struct ib_cq *ibcq); +struct ib_cq *efa_create_cq(struct ib_device *ibdev, + const struct ib_cq_init_attr *attr, + struct ib_ucontext *ibucontext, + struct 
ib_udata *udata); +struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length, + u64 virt_addr, int access_flags, + struct ib_udata *udata); +int efa_dereg_mr(struct ib_mr *ibmr); +int efa_get_port_immutable(struct ib_device *ibdev, u8 port_num, + struct ib_port_immutable *immutable); +struct ib_ucontext *efa_alloc_ucontext(struct ib_device *ibdev, + struct ib_udata *udata); +int efa_dealloc_ucontext(struct ib_ucontext *ibucontext); +int efa_mmap(struct ib_ucontext *ibucontext, + struct vm_area_struct *vma); +struct ib_ah *efa_create_ah(struct ib_pd *ibpd, + struct rdma_ah_attr *ah_attr, + struct ib_udata *udata); +int efa_destroy_ah(struct ib_ah *ibah); +int efa_post_send(struct ib_qp *ibqp, + const struct ib_send_wr *wr, + const struct ib_send_wr **bad_wr); +int efa_post_recv(struct ib_qp *ibqp, + const struct ib_recv_wr *wr, + const struct ib_recv_wr **bad_wr); +int efa_poll_cq(struct ib_cq *ibcq, int num_entries, + struct ib_wc *wc); +int efa_req_notify_cq(struct ib_cq *ibcq, + enum ib_cq_notify_flags flags); +struct ib_mr *efa_get_dma_mr(struct ib_pd *ibpd, int acc); +int efa_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, + int attr_mask, struct ib_udata *udata); +enum rdma_link_layer efa_port_link_layer(struct ib_device *ibdev, + u8 port_num); + +#endif /* _EFA_H_ */
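
As a usage illustration (not part of this patch), the bitmap helpers declared above can hand out small device object numbers such as PD numbers; the pd_bitmap member of struct efa_dev exists for exactly that. The sketch assumes efa_bitmap_alloc() returns EFA_BITMAP_INVAL when the range is exhausted, which is how the define above reads; the function name is hypothetical and the real callers arrive in later patches.

/* Illustrative sketch of the bitmap allocator API declared above. */
static int efa_example_pd_numbers(struct efa_dev *dev)
{
	u32 pdn;
	int err;

	err = efa_bitmap_init(&dev->pd_bitmap, dev->caps.max_pd);
	if (err)
		return err;

	pdn = efa_bitmap_alloc(&dev->pd_bitmap);
	if (pdn == EFA_BITMAP_INVAL) {
		efa_bitmap_cleanup(&dev->pd_bitmap);
		return -ENOMEM;
	}

	/* ... pdn would be handed to the device as the PD number ... */

	efa_bitmap_free(&dev->pd_bitmap, pdn);
	efa_bitmap_cleanup(&dev->pd_bitmap);
	return 0;
}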
From patchwork Tue Dec 4 12:04:21 2018
X-Patchwork-Submitter: Gal Pressman
X-Patchwork-Id: 10711639
From: Gal Pressman
To: Doug Ledford, Jason Gunthorpe
CC: Alexander Matushevsky, Yossi Leybovich, Tom Tucker, Gal Pressman
Subject: [PATCH rdma-next 05/13] RDMA/efa: Add the efa_com.h file
Date: Tue, 4 Dec 2018 14:04:21 +0200
Message-ID: <1543925069-8838-6-git-send-email-galpress@amazon.com>
In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com>
References: <1543925069-8838-1-git-send-email-galpress@amazon.com>

A helper header file for the EFA admin queue, admin completion queue, asynchronous event notification queue, and various hardware configuration data structures and functions.

Signed-off-by: Gal Pressman
---
 drivers/infiniband/hw/efa/efa_com.h | 139 ++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)
 create mode 100644 drivers/infiniband/hw/efa/efa_com.h

diff --git a/drivers/infiniband/hw/efa/efa_com.h b/drivers/infiniband/hw/efa/efa_com.h
new file mode 100644
index 000000000000..845835d8a682
--- /dev/null
+++ b/drivers/infiniband/hw/efa/efa_com.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates.
+ */ + +#ifndef _EFA_COM_H_ +#define _EFA_COM_H_ + +#include +#include + +#include "efa_common_defs.h" +#include "efa_admin_defs.h" +#include "efa_admin_cmds_defs.h" +#include "efa_regs_defs.h" + +#define EFA_MAX_HANDLERS 256 + +#define ADMIN_SQ_SIZE(depth) ((depth) * sizeof(struct efa_admin_aq_entry)) +#define ADMIN_CQ_SIZE(depth) ((depth) * sizeof(struct efa_admin_acq_entry)) +#define ADMIN_AENQ_SIZE(depth) ((depth) * sizeof(struct efa_admin_aenq_entry)) + +struct efa_com_admin_cq { + struct efa_admin_acq_entry *entries; + dma_addr_t dma_addr; + spinlock_t lock; /* Protects ACQ */ + + u16 cc; /* consumer counter */ + u8 phase; +}; + +struct efa_com_admin_sq { + struct efa_admin_aq_entry *entries; + dma_addr_t dma_addr; + spinlock_t lock; /* Protects ASQ */ + + u32 __iomem *db_addr; + + u16 cc; /* consumer counter */ + u16 pc; /* producer counter */ + u8 phase; + +}; + +struct efa_com_stats_admin { + u64 aborted_cmd; + u64 submitted_cmd; + u64 completed_cmd; + u64 no_completion; +}; + +#define EFA_AQ_STATE_RUNNING_BIT 0 + +struct efa_com_admin_queue { + void *dmadev; + struct efa_comp_ctx *comp_ctx; + u32 completion_timeout; /* usecs */ + u16 poll_interval; /* msecs */ + u16 depth; + struct efa_com_admin_cq cq; + struct efa_com_admin_sq sq; + u16 msix_vector_idx; + /* Indicate if the admin queue should poll for completion */ + bool polling; + + unsigned long state; + + /* Count the number of available admin commands */ + struct semaphore avail_cmds; + + struct efa_com_stats_admin stats; + + spinlock_t comp_ctx_lock; /* Protects completion context pool */ + u32 *comp_ctx_pool; + u16 comp_ctx_pool_next; +}; + +struct efa_aenq_handlers; + +struct efa_com_aenq { + struct efa_admin_aenq_entry *entries; + struct efa_aenq_handlers *aenq_handlers; + dma_addr_t dma_addr; + u32 cc; /* consumer counter */ + u16 msix_vector_idx; + u16 depth; + u8 phase; +}; + +struct efa_com_mmio_read { + struct efa_admin_mmio_req_read_less_resp *read_resp; + dma_addr_t read_resp_dma_addr; + u16 seq_num; + u16 mmio_read_timeout; /* usecs */ + /* serializes mmio reads */ + spinlock_t lock; +}; + +struct efa_com_dev { + struct efa_com_admin_queue admin_queue; + struct efa_com_aenq aenq; + u8 __iomem *reg_bar; + void *dmadev; + u32 supported_features; + u32 dma_addr_bits; + + struct efa_com_mmio_read mmio_read; +}; + +typedef void (*efa_aenq_handler)(void *data, + struct efa_admin_aenq_entry *aenq_e); + +/* Holds aenq handlers. 
Indexed by AENQ event group */ +struct efa_aenq_handlers { + efa_aenq_handler handlers[EFA_MAX_HANDLERS]; + efa_aenq_handler unimplemented_handler; +}; + +int efa_com_admin_init(struct efa_com_dev *edev, + struct efa_aenq_handlers *aenq_handlers); +void efa_com_admin_destroy(struct efa_com_dev *edev); +int efa_com_dev_reset(struct efa_com_dev *edev, + enum efa_regs_reset_reason_types reset_reason); +void efa_com_set_admin_polling_mode(struct efa_com_dev *edev, bool polling); +void efa_com_admin_q_comp_intr_handler(struct efa_com_dev *edev); +int efa_com_mmio_reg_read_init(struct efa_com_dev *edev); +void efa_com_mmio_reg_read_destroy(struct efa_com_dev *edev); + +int efa_com_validate_version(struct efa_com_dev *edev); +int efa_com_get_dma_width(struct efa_com_dev *edev); + +int efa_com_cmd_exec(struct efa_com_admin_queue *admin_queue, + struct efa_admin_aq_entry *cmd, + size_t cmd_size, + struct efa_admin_acq_entry *comp, + size_t comp_size); +void efa_com_aenq_intr_handler(struct efa_com_dev *edev, void *data); + +#endif /* _EFA_COM_H_ */ From patchwork Tue Dec 4 12:04:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711635 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3799713AF for ; Tue, 4 Dec 2018 12:05:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 284662A35B for ; Tue, 4 Dec 2018 12:05:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1C7822AEFD; Tue, 4 Dec 2018 12:05:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EE7BA2A35B for ; Tue, 4 Dec 2018 12:05:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726143AbeLDMFV (ORCPT ); Tue, 4 Dec 2018 07:05:21 -0500 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]:21017 "EHLO smtp-fw-6001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725767AbeLDMFV (ORCPT ); Tue, 4 Dec 2018 07:05:21 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925119; x=1575461119; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=ESlBKWYvUsPmwJfV+s40J4ARtiQfEF3HQa5DuHzxSDY=; b=NSiGZgcl9w7qXZ2gKpb8st5DplJm4tBO+uDYHDf0mW3aOKrjRw0CNS0B W0S+FyugIpUuFiwNw5ITviFKiSHIuCRHqX0FZNndP0m3JmfD3jVrSBiSl Tcik6YkbmzqtsLJqqnIxqs0H/5NtHpJmU3YyvBS6YIJRhl26gTx4jgCYy w=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="370801259" Received: from iad6-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-1d-38ae4ad2.us-east-1.amazon.com) ([10.124.125.6]) by smtp-border-fw-out-6001.iad6.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:19 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1d-38ae4ad2.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5Hgt104019 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 
verify=FAIL); Tue, 4 Dec 2018 12:05:18 GMT Received: from EX13D13EUB002.ant.amazon.com (10.43.166.205) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:03 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D13EUB002.ant.amazon.com (10.43.166.205) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:02 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:00 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 06/13] RDMA/efa: Add the com service API definitions Date: Tue, 4 Dec 2018 14:04:22 +0200 Message-ID: <1543925069-8838-7-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Header file for the various commands that can be sent through the admin queue. This includes queue create/modify/destroy, setting up and removing protection domains, address handles, memory registration, etc. Signed-off-by: Gal Pressman --- drivers/infiniband/hw/efa/efa_com_cmd.h | 217 ++++++++++++++++++++++++++++++++ 1 file changed, 217 insertions(+) create mode 100644 drivers/infiniband/hw/efa/efa_com_cmd.h diff --git a/drivers/infiniband/hw/efa/efa_com_cmd.h b/drivers/infiniband/hw/efa/efa_com_cmd.h new file mode 100644 index 000000000000..307f03936049 --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_com_cmd.h @@ -0,0 +1,217 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates.
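As a usage illustration (not part of this patch), a minimal sketch of creating and then destroying an address handle through the com API declared below; the helper name and error handling are assumptions.

/* Illustration only: round trip through the AH commands declared below. */
static int efa_example_ah_roundtrip(struct efa_com_dev *edev, const u8 *gid)
{
        struct efa_com_create_ah_params create_params = {};
        struct efa_com_create_ah_result create_result = {};
        struct efa_com_destroy_ah_params destroy_params = {};
        int err;

        memcpy(create_params.dest_addr, gid, EFA_GID_SIZE);

        err = efa_com_create_ah(edev, &create_params, &create_result);
        if (err)
                return err;

        destroy_params.ah = create_result.ah;
        return efa_com_destroy_ah(edev, &destroy_params);
}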
+ */ + +#ifndef _EFA_COM_CMD_H_ +#define _EFA_COM_CMD_H_ + +#include "efa_com.h" + +#define EFA_GID_SIZE 16 + +struct efa_com_create_qp_params { + u32 pd; + u8 qp_type; + u64 rq_base_addr; + u32 send_cq_idx; + u32 recv_cq_idx; + /* + * Send descriptor ring size in bytes, + * sufficient for user-provided number of WQEs and SGL size + */ + u32 sq_ring_size_in_bytes; + /* Max number of WQEs that will be posted on send queue */ + u32 sq_depth; + /* Recv descriptor ring size in bytes, sufficient */ + u32 rq_ring_size_in_bytes; +}; + +struct efa_com_create_qp_result { + u32 qp_handle; + u32 qp_num; + u32 sq_db_offset; + u32 rq_db_offset; + u32 llq_descriptors_offset; + u16 send_sub_cq_idx; + u16 recv_sub_cq_idx; +}; + +struct efa_com_destroy_qp_params { + u32 qp_handle; +}; + +struct efa_com_create_cq_params { + /* completion queue depth in # of entries */ + u16 cq_depth; + u8 entry_size_in_bytes; + /* cq physical base address in OS memory */ + dma_addr_t dma_addr; + u16 num_sub_cqs; +}; + +struct efa_com_create_cq_result { + /* cq identifier */ + u16 cq_idx; + /* actual cq depth in # of entries */ + u16 actual_depth; +}; + +struct efa_com_destroy_cq_params { + u16 cq_idx; +}; + +struct efa_com_create_ah_params { + /* Destination address in network byte order */ + u8 dest_addr[EFA_GID_SIZE]; +}; + +struct efa_com_create_ah_result { + u16 ah; +}; + +struct efa_com_destroy_ah_params { + u16 ah; +}; + +struct efa_com_get_network_attr_result { + u8 addr[EFA_GID_SIZE]; + u32 mtu; +}; + +struct efa_com_get_device_attr_result { + u32 fw_version; + u32 admin_api_version; + u32 vendor_id; + u32 vendor_part_id; + u32 device_version; + u32 supported_features; + u32 phys_addr_width; + u32 virt_addr_width; + u32 max_sq; + u16 max_sq_depth; + u32 max_rq; + u16 max_rq_depth; + u32 max_cq; + u32 max_cq_depth; + u32 inline_buf_size; + u16 max_sq_sge; + u16 max_rq_sge; + u32 max_mr; + u64 max_mr_pages; + u64 page_size_cap; + u32 max_pd; + u32 max_ah; + u16 sub_cqs_per_cq; + u8 db_bar; +}; + +struct efa_com_get_hw_hints_result { + u16 mmio_read_timeout; + u16 driver_watchdog_timeout; + u16 admin_completion_timeout; + u16 poll_interval; + u32 reserved[4]; +}; + +struct efa_com_mem_addr { + u32 mem_addr_low; + u32 mem_addr_high; +}; + +/* Used at indirect mode page list chunks for chaining */ +struct efa_com_ctrl_buff_info { + /* indicates length of the buffer pointed by control_buffer_address. */ + u32 length; + /* points to control buffer (direct or indirect) */ + struct efa_com_mem_addr address; +}; + +struct efa_com_reg_mr_params { + /* Protection Domain */ + u16 pd; + /* Memory region length, in bytes. */ + u64 mr_length_in_bytes; + /* + * phys_page_size_shift - page size is (1 << phys_page_size_shift) + * Page size is used for building the Virtual to Physical + * address mapping + */ + u8 page_shift; + /* + * permissions + * 0: local_write_enable - Write permissions: value of 1 needed + * for RQ buffers and for RDMA write:1: reserved1 - remote + * access flags, etc + */ + u8 permissions; + bool inline_pbl; + bool indirect; + /* number of pages in PBL (redundant, could be calculated) */ + u32 page_num; + /* IO Virtual Address associated with this MR. */ + u64 iova; + /* words 8:15: Physical Buffer List, each element is page-aligned. */ + union { + /* + * Inline array of physical addresses of app pages + * (optimization for short region reservations) + */ + u64 inline_pbl_array[4]; + /* + * Describes the next physically contiguous chunk of indirect + * page list. 
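A hypothetical sketch of filling efa_com_reg_mr_params for a registration small enough to fit the inline PBL array; the 4KB page size, permission value and helper name are assumptions based on the comments in this structure.

/* Illustration only: inline-PBL registration of up to four 4KB pages. */
static void efa_example_fill_inline_mr(struct efa_com_reg_mr_params *params,
                                       u16 pdn, u64 iova, u64 length,
                                       const dma_addr_t *pages, u32 npages)
{
        u32 i;

        params->pd = pdn;
        params->iova = iova;
        params->mr_length_in_bytes = length;
        params->page_shift = 12;   /* 4KB pages */
        params->page_num = npages; /* assumed <= 4 for the inline array */
        params->permissions = 1;   /* local_write_enable */
        params->inline_pbl = true;
        params->indirect = false;

        for (i = 0; i < npages; i++)
                params->pbl.inline_pbl_array[i] = pages[i];
}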
A page list contains physical addresses of command + * data pages. Data pages are 4KB; page list chunks are + * variable-sized. + */ + struct efa_com_ctrl_buff_info pbl; + } pbl; +}; + +struct efa_com_reg_mr_result { + /* + * To be used in conjunction with local buffers references in SQ and + * RQ WQE + */ + u32 l_key; + /* + * To be used in incoming RDMA semantics messages to refer to remotely + * accessed memory region + */ + u32 r_key; +}; + +struct efa_com_dereg_mr_params { + u32 l_key; +}; + +void efa_com_set_dma_addr(dma_addr_t addr, u32 *addr_high, u32 *addr_low); +int efa_com_create_qp(struct efa_com_dev *edev, + struct efa_com_create_qp_params *params, + struct efa_com_create_qp_result *res); +int efa_com_destroy_qp(struct efa_com_dev *edev, + struct efa_com_destroy_qp_params *params); +int efa_com_create_cq(struct efa_com_dev *edev, + struct efa_com_create_cq_params *params, + struct efa_com_create_cq_result *result); +int efa_com_destroy_cq(struct efa_com_dev *edev, + struct efa_com_destroy_cq_params *params); +int efa_com_register_mr(struct efa_com_dev *edev, + struct efa_com_reg_mr_params *params, + struct efa_com_reg_mr_result *result); +int efa_com_dereg_mr(struct efa_com_dev *edev, + struct efa_com_dereg_mr_params *params); +int efa_com_create_ah(struct efa_com_dev *edev, + struct efa_com_create_ah_params *params, + struct efa_com_create_ah_result *result); +int efa_com_destroy_ah(struct efa_com_dev *edev, + struct efa_com_destroy_ah_params *params); +int efa_com_get_network_attr(struct efa_com_dev *edev, + struct efa_com_get_network_attr_result *result); +int efa_com_get_device_attr(struct efa_com_dev *edev, + struct efa_com_get_device_attr_result *result); +int efa_com_get_hw_hints(struct efa_com_dev *edev, + struct efa_com_get_hw_hints_result *result); +int efa_com_set_aenq_config(struct efa_com_dev *edev, u32 groups); + +#endif /* _EFA_COM_CMD_H_ */ From patchwork Tue Dec 4 12:04:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711637 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EC4F81731 for ; Tue, 4 Dec 2018 12:05:23 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DEA3F2A35B for ; Tue, 4 Dec 2018 12:05:23 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D2D152AEFD; Tue, 4 Dec 2018 12:05:23 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C26742AD6B for ; Tue, 4 Dec 2018 12:05:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726145AbeLDMFW (ORCPT ); Tue, 4 Dec 2018 07:05:22 -0500 Received: from smtp-fw-2101.amazon.com ([72.21.196.25]:35195 "EHLO smtp-fw-2101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725955AbeLDMFW (ORCPT ); Tue, 4 Dec 2018 07:05:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925121; x=1575461121; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=PFZJbbr956VTk0ZWm1aai+QzoejEb4dW597MwsfAQBc=; b=oL3muAplSMe4kWteUWPAZ3j7D6FCQRzeQqTAavfYuX9NF6+3KQmZneMd l2T/ZT5DShKdkGu1Ojh06PLAYM0plz1N5UzSofI/6xeP+Wxb3Oi0WMSdG H1uckMCVlAsAHFKfwlF+r6T+TrjM97xgc0JlQsBJM0hxpHlvgbwP4fS9d o=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="707203089" Received: from iad6-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-1d-38ae4ad2.us-east-1.amazon.com) ([10.124.125.2]) by smtp-border-fw-out-2101.iad2.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:20 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1d-38ae4ad2.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5Hh2104019 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:20 GMT Received: from EX13D02EUC003.ant.amazon.com (10.43.164.10) by EX13MTAUEA001.ant.amazon.com (10.43.61.82) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:06 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D02EUC003.ant.amazon.com (10.43.164.10) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:05 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:02 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 07/13] RDMA/efa: Add the ABI definitions Date: Tue, 4 Dec 2018 14:04:23 +0200 Message-ID: <1543925069-8838-8-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add the EFA ABI file. Signed-off-by: Gal Pressman --- include/uapi/rdma/efa-abi.h | 89 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 include/uapi/rdma/efa-abi.h diff --git a/include/uapi/rdma/efa-abi.h b/include/uapi/rdma/efa-abi.h new file mode 100644 index 000000000000..0f97a0fba967 --- /dev/null +++ b/include/uapi/rdma/efa-abi.h @@ -0,0 +1,89 @@ +/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. 
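A hypothetical, user-space style sketch of testing the capability masks defined below once the ucontext response has been received from the kernel; the helper name and error convention are assumptions.

/* Illustration only: interpreting the ucontext capability masks. */
static int efa_example_check_caps(const struct efa_ibv_alloc_ucontext_resp *resp)
{
        /* can the kernel parse udata on create AH? */
        if (!(resp->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_CREATE_AH))
                return -1;

        /* does the kernel expose the SRD QP type? */
        if (!(resp->kernel_supp_mask & EFA_KERNEL_SUPP_QPT_SRD))
                return -1;

        return 0;
}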
+ */ + +#ifndef _EFA_ABI_H_ +#define _EFA_ABI_H_ + +enum efa_ibv_user_cmds_supp_udata { + EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE = 1 << 0, + EFA_USER_CMDS_SUPP_UDATA_CREATE_AH = 1 << 1, +}; + +enum efa_ibv_kernel_supp_mask { + EFA_KERNEL_SUPP_QPT_SRD = 1 << 0, +}; + +struct efa_ibv_alloc_ucontext_resp { + __u32 comp_mask; + __u32 cmds_supp_udata_mask; + __u32 kernel_supp_mask; + __u8 reserved_60[0x4]; +}; + +struct efa_ibv_alloc_pd_resp { + __u32 comp_mask; + __u32 pdn; +}; + +struct efa_ibv_create_cq { + __u32 comp_mask; + __u32 cq_entry_size; + __u16 num_sub_cqs; + __u8 reserved_50[0x6]; +}; + +struct efa_ibv_create_cq_resp { + __u32 comp_mask; + __u8 reserved_20[0x4]; + __aligned_u64 q_mmap_key; + __aligned_u64 q_mmap_size; + __u16 cq_idx; + __u8 reserved_d0[0x6]; +}; + +struct efa_ibv_create_qp { + __u32 comp_mask; + __u32 rq_entries; + __u32 rq_entry_size; + __u32 sq_depth; + __u32 sq_ring_size; + __u32 srd_qp; +}; + +struct efa_ibv_create_qp_resp { + __u32 comp_mask; + /* the offset inside the page of the rq db */ + __u32 rq_db_offset; + /* the offset inside the page of the sq db */ + __u32 sq_db_offset; + /* the offset inside the page of descriptors buffer */ + __u32 llq_desc_offset; + __aligned_u64 rq_mmap_key; + __aligned_u64 rq_mmap_size; + __aligned_u64 rq_db_mmap_key; + __aligned_u64 sq_db_mmap_key; + __aligned_u64 llq_desc_mmap_key; + __u16 send_sub_cq_idx; + __u16 recv_sub_cq_idx; + __u8 reserved_1e0[0x4]; +}; + +struct efa_ibv_create_ah_resp { + __u32 comp_mask; + __u16 efa_address_handle; + __u8 reserved_30[0x2]; +}; + +struct efa_ibv_ex_query_device_resp { + __u32 comp_mask; + __u16 max_sq_sge; + __u16 max_rq_sge; + __u16 max_sq_wr; + __u16 max_rq_wr; + __u16 sub_cqs_per_cq; + __u16 max_inline_data; +}; + +#endif /* _EFA_ABI_H_ */ From patchwork Tue Dec 4 12:04:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711643 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7423C13AF for ; Tue, 4 Dec 2018 12:05:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 634EC2A35B for ; Tue, 4 Dec 2018 12:05:33 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 57A532AE4E; Tue, 4 Dec 2018 12:05:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 758F72A35B for ; Tue, 4 Dec 2018 12:05:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726165AbeLDMFb (ORCPT ); Tue, 4 Dec 2018 07:05:31 -0500 Received: from smtp-fw-9101.amazon.com ([207.171.184.25]:18986 "EHLO smtp-fw-9101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725956AbeLDMFb (ORCPT ); Tue, 4 Dec 2018 07:05:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925128; x=1575461128; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=1jWmzLX2jjgJ8yPIbWdf3P6NcY55vTBxuOls5Lmw76k=; 
b=Hnga5quyh6rSzR1ezIkno7yg5Zrj2ERjXVvAiUJe033tQaMUTrqvz+x2 KKuC4zum1s8ktn3IOHsn1d317ommXaZueLh1IHvEgja+Czia7d+KAHZ8K Q32VANIyVtEwfhGT1Rb1/IElMfvTAzFBKW9mloxv9EI7+JkOamshY9l1W A=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="774001404" Received: from sea3-co-svc-lb6-vlan3.sea.amazon.com (HELO email-inbound-relay-1d-74cf8b49.us-east-1.amazon.com) ([10.47.22.38]) by smtp-border-fw-out-9101.sea19.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:26 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan3.iad.amazon.com [10.40.159.166]) by email-inbound-relay-1d-74cf8b49.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5LFh082740 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:24 GMT Received: from EX13D13EUA002.ant.amazon.com (10.43.165.18) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:09 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D13EUA002.ant.amazon.com (10.43.165.18) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:08 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:05 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 08/13] RDMA/efa: Implement functions that submit and complete admin commands Date: Tue, 4 Dec 2018 14:04:24 +0200 Message-ID: <1543925069-8838-9-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add admin commands submissions/completions implementation. Signed-off-by: Gal Pressman --- drivers/infiniband/hw/efa/efa_com.c | 1122 +++++++++++++++++++++++++++++++++++ 1 file changed, 1122 insertions(+) create mode 100644 drivers/infiniband/hw/efa/efa_com.c diff --git a/drivers/infiniband/hw/efa/efa_com.c b/drivers/infiniband/hw/efa/efa_com.c new file mode 100644 index 000000000000..b325b9c58726 --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_com.c @@ -0,0 +1,1122 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. 
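Before the implementation, a distilled sketch of the ring arithmetic the admin queues below rely on (power-of-two depth, phase bit toggling on wrap); the helper and its variables are illustrative only.

/* Illustration only: producer slot selection and phase flip on wrap-around. */
static u16 efa_example_next_slot(u16 *pc, u16 depth, u8 *phase)
{
        u16 slot = *pc & (depth - 1);

        (*pc)++;
        if ((*pc & (depth - 1)) == 0)
                *phase = !*phase; /* wrapped, toggle the expected phase bit */

        return slot;
}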
+ */ + +#include "efa.h" +#include "efa_com.h" +#include "efa_regs_defs.h" + +#define ADMIN_CMD_TIMEOUT_US 30000000 /* usecs */ + +#define EFA_REG_READ_TIMEOUT_US 50000 /* usecs */ +#define EFA_MMIO_READ_INVALID 0xffffffff + +#define EFA_POLL_INTERVAL_MS 100 /* msecs */ + +#define EFA_ASYNC_QUEUE_DEPTH 16 +#define EFA_ADMIN_QUEUE_DEPTH 32 + +#define MIN_EFA_VER\ + ((EFA_ADMIN_API_VERSION_MAJOR << EFA_REGS_VERSION_MAJOR_VERSION_SHIFT) | \ + (EFA_ADMIN_API_VERSION_MINOR & EFA_REGS_VERSION_MINOR_VERSION_MASK)) + +#define EFA_CTRL_MAJOR 0 +#define EFA_CTRL_MINOR 0 +#define EFA_CTRL_SUB_MINOR 1 + +#define MIN_EFA_CTRL_VER \ + (((EFA_CTRL_MAJOR) << \ + (EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_SHIFT)) | \ + ((EFA_CTRL_MINOR) << \ + (EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_SHIFT)) | \ + (EFA_CTRL_SUB_MINOR)) + +#define EFA_DMA_ADDR_TO_UINT32_LOW(x) ((u32)((u64)(x))) +#define EFA_DMA_ADDR_TO_UINT32_HIGH(x) ((u32)(((u64)(x)) >> 32)) + +#define EFA_REGS_ADMIN_INTR_MASK 1 + +enum efa_cmd_status { + EFA_CMD_SUBMITTED, + EFA_CMD_COMPLETED, + /* Abort - canceled by the driver */ + EFA_CMD_ABORTED, +}; + +struct efa_comp_ctx { + struct completion wait_event; + struct efa_admin_acq_entry *user_cqe; + u32 comp_size; + enum efa_cmd_status status; + /* status from the device */ + u8 comp_status; + u8 cmd_opcode; + bool occupied; +}; + +static u32 efa_com_reg_read32(struct efa_com_dev *edev, u16 offset) +{ + struct efa_com_mmio_read *mmio_read = &edev->mmio_read; + struct efa_admin_mmio_req_read_less_resp *read_resp; + unsigned long exp_time; + u32 mmio_read_reg; + u32 err; + + read_resp = mmio_read->read_resp; + + spin_lock(&mmio_read->lock); + mmio_read->seq_num++; + + /* trash DMA req_id to identify when hardware is done */ + read_resp->req_id = mmio_read->seq_num + 0x9aL; + mmio_read_reg = (offset << EFA_REGS_MMIO_REG_READ_REG_OFF_SHIFT) & + EFA_REGS_MMIO_REG_READ_REG_OFF_MASK; + mmio_read_reg |= mmio_read->seq_num & + EFA_REGS_MMIO_REG_READ_REQ_ID_MASK; + + writel(mmio_read_reg, edev->reg_bar + EFA_REGS_MMIO_REG_READ_OFF); + + exp_time = jiffies + usecs_to_jiffies(mmio_read->mmio_read_timeout); + do { + if (READ_ONCE(read_resp->req_id) == mmio_read->seq_num) + break; + udelay(1); + } while (time_is_after_jiffies(exp_time)); + + if (unlikely(read_resp->req_id != mmio_read->seq_num)) { + pr_err("Reading register timed out. 
expected: req id[%u] offset[%#x] actual: req id[%u] offset[%#x]\n", + mmio_read->seq_num, offset, read_resp->req_id, + read_resp->reg_off); + err = EFA_MMIO_READ_INVALID; + goto out; + } + + if (read_resp->reg_off != offset) { + pr_err("Reading register failed: wrong offset provided"); + err = EFA_MMIO_READ_INVALID; + goto out; + } + + err = read_resp->reg_val; +out: + spin_unlock(&mmio_read->lock); + return err; +} + +static int efa_com_admin_init_sq(struct efa_com_dev *edev) +{ + struct efa_com_admin_queue *queue = &edev->admin_queue; + struct efa_com_admin_sq *sq = &queue->sq; + u16 size = ADMIN_SQ_SIZE(queue->depth); + u32 addr_high; + u32 addr_low; + u32 aq_caps; + + sq->entries = dma_zalloc_coherent(queue->dmadev, size, &sq->dma_addr, + GFP_KERNEL); + if (!sq->entries) + return -ENOMEM; + + spin_lock_init(&sq->lock); + + sq->cc = 0; + sq->pc = 0; + sq->phase = 1; + + sq->db_addr = (u32 __iomem *)(edev->reg_bar + EFA_REGS_AQ_PROD_DB_OFF); + + pr_debug("init sq dma_addr....\n"); + addr_high = EFA_DMA_ADDR_TO_UINT32_HIGH(sq->dma_addr); + addr_low = EFA_DMA_ADDR_TO_UINT32_LOW(sq->dma_addr); + + writel(addr_low, edev->reg_bar + EFA_REGS_AQ_BASE_LO_OFF); + writel(addr_high, edev->reg_bar + EFA_REGS_AQ_BASE_HI_OFF); + + aq_caps = 0; + aq_caps |= queue->depth & EFA_REGS_AQ_CAPS_AQ_DEPTH_MASK; + aq_caps |= (sizeof(struct efa_admin_aq_entry) << + EFA_REGS_AQ_CAPS_AQ_ENTRY_SIZE_SHIFT) & + EFA_REGS_AQ_CAPS_AQ_ENTRY_SIZE_MASK; + + writel(aq_caps, edev->reg_bar + EFA_REGS_AQ_CAPS_OFF); + + return 0; +} + +static int efa_com_admin_init_cq(struct efa_com_dev *edev) +{ + struct efa_com_admin_queue *queue = &edev->admin_queue; + struct efa_com_admin_cq *cq = &queue->cq; + u16 size = ADMIN_CQ_SIZE(queue->depth); + u32 addr_high; + u32 addr_low; + u32 acq_caps; + + cq->entries = dma_zalloc_coherent(queue->dmadev, size, &cq->dma_addr, + GFP_KERNEL); + if (!cq->entries) + return -ENOMEM; + + spin_lock_init(&cq->lock); + + cq->cc = 0; + cq->phase = 1; + + pr_debug("init cq dma_addr....\n"); + addr_high = EFA_DMA_ADDR_TO_UINT32_HIGH(cq->dma_addr); + addr_low = EFA_DMA_ADDR_TO_UINT32_LOW(cq->dma_addr); + + writel(addr_low, edev->reg_bar + EFA_REGS_ACQ_BASE_LO_OFF); + writel(addr_high, edev->reg_bar + EFA_REGS_ACQ_BASE_HI_OFF); + + acq_caps = 0; + acq_caps |= queue->depth & EFA_REGS_ACQ_CAPS_ACQ_DEPTH_MASK; + acq_caps |= (sizeof(struct efa_admin_acq_entry) << + EFA_REGS_ACQ_CAPS_ACQ_ENTRY_SIZE_SHIFT) & + EFA_REGS_ACQ_CAPS_ACQ_ENTRY_SIZE_MASK; + acq_caps |= (queue->msix_vector_idx << + EFA_REGS_ACQ_CAPS_ACQ_MSIX_VECTOR_SHIFT) & + EFA_REGS_ACQ_CAPS_ACQ_MSIX_VECTOR_MASK; + + writel(acq_caps, edev->reg_bar + EFA_REGS_ACQ_CAPS_OFF); + + return 0; +} + +static int efa_com_admin_init_aenq(struct efa_com_dev *edev, + struct efa_aenq_handlers *aenq_handlers) +{ + struct efa_com_aenq *aenq = &edev->aenq; + u32 addr_low, addr_high, aenq_caps; + u16 size; + + pr_debug("init aenq...\n"); + + if (unlikely(!aenq_handlers)) { + pr_err("aenq handlers pointer is NULL\n"); + return -EINVAL; + } + + size = ADMIN_AENQ_SIZE(EFA_ASYNC_QUEUE_DEPTH); + aenq->entries = dma_zalloc_coherent(edev->dmadev, size, &aenq->dma_addr, + GFP_KERNEL); + if (!aenq->entries) + return -ENOMEM; + + aenq->aenq_handlers = aenq_handlers; + aenq->depth = EFA_ASYNC_QUEUE_DEPTH; + aenq->cc = 0; + aenq->phase = 1; + + addr_low = EFA_DMA_ADDR_TO_UINT32_LOW(aenq->dma_addr); + addr_high = EFA_DMA_ADDR_TO_UINT32_HIGH(aenq->dma_addr); + + writel(addr_low, edev->reg_bar + EFA_REGS_AENQ_BASE_LO_OFF); + writel(addr_high, edev->reg_bar + EFA_REGS_AENQ_BASE_HI_OFF); 
+ + aenq_caps = 0; + aenq_caps |= aenq->depth & EFA_REGS_AENQ_CAPS_AENQ_DEPTH_MASK; + aenq_caps |= (sizeof(struct efa_admin_aenq_entry) << + EFA_REGS_AENQ_CAPS_AENQ_ENTRY_SIZE_SHIFT) & + EFA_REGS_AENQ_CAPS_AENQ_ENTRY_SIZE_MASK; + aenq_caps |= (aenq->msix_vector_idx + << EFA_REGS_AENQ_CAPS_AENQ_MSIX_VECTOR_SHIFT) & + EFA_REGS_AENQ_CAPS_AENQ_MSIX_VECTOR_MASK; + writel(aenq_caps, edev->reg_bar + EFA_REGS_AENQ_CAPS_OFF); + + /* + * Init cons_db to mark that all entries in the queue + * are initially available + */ + writel(edev->aenq.cc, edev->reg_bar + EFA_REGS_AENQ_CONS_DB_OFF); + + pr_debug("init aenq succeeded\n"); + + return 0; +} + +/* ID to be used with efa_com_get_comp_ctx */ +static u16 efa_com_alloc_ctx_id(struct efa_com_admin_queue *admin_queue) +{ + u16 ctx_id; + + spin_lock(&admin_queue->comp_ctx_lock); + ctx_id = admin_queue->comp_ctx_pool[admin_queue->comp_ctx_pool_next]; + admin_queue->comp_ctx_pool_next++; + spin_unlock(&admin_queue->comp_ctx_lock); + + return ctx_id; +} + +static inline void efa_com_put_comp_ctx(struct efa_com_admin_queue *queue, + struct efa_comp_ctx *comp_ctx) +{ + u16 comp_id = comp_ctx->user_cqe->acq_common_descriptor.command & + EFA_ADMIN_ACQ_COMMON_DESC_COMMAND_ID_MASK; + + pr_debug("Putting completion command_id %d", comp_id); + comp_ctx->occupied = false; + spin_lock(&queue->comp_ctx_lock); + queue->comp_ctx_pool_next--; + queue->comp_ctx_pool[queue->comp_ctx_pool_next] = comp_id; + spin_unlock(&queue->comp_ctx_lock); + up(&queue->avail_cmds); +} + +static struct efa_comp_ctx *efa_com_get_comp_ctx(struct efa_com_admin_queue *queue, + u16 command_id, bool capture) +{ + if (unlikely(command_id >= queue->depth)) { + pr_err("command id is larger than the queue size. cmd_id: %u queue size %d\n", + command_id, queue->depth); + return NULL; + } + + if (unlikely(queue->comp_ctx[command_id].occupied && capture)) { + pr_err("Completion context is occupied\n"); + return NULL; + } + + if (capture) { + queue->comp_ctx[command_id].occupied = true; + pr_debug("Taking completion ctxt command_id %d", command_id); + } + + return &queue->comp_ctx[command_id]; +} + +static struct efa_comp_ctx *__efa_com_submit_admin_cmd(struct efa_com_admin_queue *admin_queue, + struct efa_admin_aq_entry *cmd, + size_t cmd_size_in_bytes, + struct efa_admin_acq_entry *comp, + size_t comp_size_in_bytes) +{ + struct efa_comp_ctx *comp_ctx; + u16 queue_size_mask; + u16 ctx_id; + u16 pi; + + queue_size_mask = admin_queue->depth - 1; + pi = admin_queue->sq.pc & queue_size_mask; + + ctx_id = efa_com_alloc_ctx_id(admin_queue); + + cmd->aq_common_descriptor.flags |= admin_queue->sq.phase & + EFA_ADMIN_AQ_COMMON_DESC_PHASE_MASK; + + cmd->aq_common_descriptor.command_id |= ctx_id & + EFA_ADMIN_AQ_COMMON_DESC_COMMAND_ID_MASK; + + comp_ctx = efa_com_get_comp_ctx(admin_queue, ctx_id, true); + if (unlikely(!comp_ctx)) + return ERR_PTR(-EINVAL); + + comp_ctx->status = EFA_CMD_SUBMITTED; + comp_ctx->comp_size = comp_size_in_bytes; + comp_ctx->user_cqe = comp; + comp_ctx->cmd_opcode = cmd->aq_common_descriptor.opcode; + + reinit_completion(&comp_ctx->wait_event); + + memcpy(&admin_queue->sq.entries[pi], cmd, cmd_size_in_bytes); + + admin_queue->sq.pc++; + admin_queue->stats.submitted_cmd++; + + if (unlikely((admin_queue->sq.pc & queue_size_mask) == 0)) + admin_queue->sq.phase = !admin_queue->sq.phase; + + /* barrier not needed in case of writel */ + writel(admin_queue->sq.pc, admin_queue->sq.db_addr); + + return comp_ctx; +} + +static inline int efa_com_init_comp_ctxt(struct efa_com_admin_queue *queue) +{ 
+ size_t pool_size = queue->depth * sizeof(*queue->comp_ctx_pool); + size_t size = queue->depth * sizeof(struct efa_comp_ctx); + struct efa_comp_ctx *comp_ctx; + u16 i; + + queue->comp_ctx = devm_kzalloc(queue->dmadev, size, GFP_KERNEL); + queue->comp_ctx_pool = + devm_kzalloc(queue->dmadev, pool_size, GFP_KERNEL); + if (unlikely(!queue->comp_ctx || !queue->comp_ctx_pool)) { + devm_kfree(queue->dmadev, queue->comp_ctx_pool); + devm_kfree(queue->dmadev, queue->comp_ctx); + return -ENOMEM; + } + + for (i = 0; i < queue->depth; i++) { + comp_ctx = efa_com_get_comp_ctx(queue, i, false); + if (comp_ctx) + init_completion(&comp_ctx->wait_event); + + queue->comp_ctx_pool[i] = i; + } + + spin_lock_init(&queue->comp_ctx_lock); + + queue->comp_ctx_pool_next = 0; + + return 0; +} + +static struct efa_comp_ctx *efa_com_submit_admin_cmd(struct efa_com_admin_queue *admin_queue, + struct efa_admin_aq_entry *cmd, + size_t cmd_size_in_bytes, + struct efa_admin_acq_entry *comp, + size_t comp_size_in_bytes) +{ + struct efa_comp_ctx *comp_ctx; + + /* In case of queue FULL */ + down(&admin_queue->avail_cmds); + + spin_lock(&admin_queue->sq.lock); + if (unlikely(!test_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state))) { + pr_err("Admin queue is closed\n"); + spin_unlock(&admin_queue->sq.lock); + return ERR_PTR(-ENODEV); + } + + comp_ctx = + __efa_com_submit_admin_cmd(admin_queue, cmd, cmd_size_in_bytes, + comp, comp_size_in_bytes); + spin_unlock(&admin_queue->sq.lock); + if (unlikely(IS_ERR(comp_ctx))) + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state); + + return comp_ctx; +} + +static void efa_com_handle_single_admin_completion(struct efa_com_admin_queue *admin_queue, + struct efa_admin_acq_entry *cqe) +{ + struct efa_comp_ctx *comp_ctx; + u16 cmd_id; + + cmd_id = cqe->acq_common_descriptor.command & + EFA_ADMIN_ACQ_COMMON_DESC_COMMAND_ID_MASK; + + comp_ctx = efa_com_get_comp_ctx(admin_queue, cmd_id, false); + if (unlikely(!comp_ctx)) { + pr_err("comp_ctx is NULL. 
Changing the admin queue running state\n"); + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state); + return; + } + + comp_ctx->status = EFA_CMD_COMPLETED; + comp_ctx->comp_status = cqe->acq_common_descriptor.status; + if (comp_ctx->user_cqe) + memcpy(comp_ctx->user_cqe, cqe, comp_ctx->comp_size); + + if (!admin_queue->polling) + complete(&comp_ctx->wait_event); +} + +static void efa_com_handle_admin_completion(struct efa_com_admin_queue *admin_queue) +{ + struct efa_admin_acq_entry *cqe; + u16 queue_size_mask; + u16 comp_num = 0; + u8 phase; + u16 ci; + + queue_size_mask = admin_queue->depth - 1; + + ci = admin_queue->cq.cc & queue_size_mask; + phase = admin_queue->cq.phase; + + cqe = &admin_queue->cq.entries[ci]; + + /* Go over all the completions */ + while ((READ_ONCE(cqe->acq_common_descriptor.flags) & + EFA_ADMIN_ACQ_COMMON_DESC_PHASE_MASK) == phase) { + /* + * Do not read the rest of the completion entry before the + * phase bit was validated + */ + dma_rmb(); + efa_com_handle_single_admin_completion(admin_queue, cqe); + + ci++; + comp_num++; + if (unlikely(ci == admin_queue->depth)) { + ci = 0; + phase = !phase; + } + + cqe = &admin_queue->cq.entries[ci]; + } + + admin_queue->cq.cc += comp_num; + admin_queue->cq.phase = phase; + admin_queue->sq.cc += comp_num; + admin_queue->stats.completed_cmd += comp_num; +} + +static int efa_com_comp_status_to_errno(u8 comp_status) +{ + if (unlikely(comp_status)) + pr_err("admin command failed[%u]\n", comp_status); + + switch (comp_status) { + case EFA_ADMIN_SUCCESS: + return 0; + case EFA_ADMIN_RESOURCE_ALLOCATION_FAILURE: + return -ENOMEM; + case EFA_ADMIN_UNSUPPORTED_OPCODE: + return -EOPNOTSUPP; + case EFA_ADMIN_BAD_OPCODE: + case EFA_ADMIN_MALFORMED_REQUEST: + case EFA_ADMIN_ILLEGAL_PARAMETER: + case EFA_ADMIN_UNKNOWN_ERROR: + return -EINVAL; + default: + return -EINVAL; + } +} + +static int efa_com_wait_and_process_admin_cq_polling(struct efa_comp_ctx *comp_ctx, + struct efa_com_admin_queue *admin_queue) +{ + unsigned long timeout; + unsigned long flags; + int err; + + timeout = jiffies + usecs_to_jiffies(admin_queue->completion_timeout); + + while (1) { + spin_lock_irqsave(&admin_queue->cq.lock, flags); + efa_com_handle_admin_completion(admin_queue); + spin_unlock_irqrestore(&admin_queue->cq.lock, flags); + + if (comp_ctx->status != EFA_CMD_SUBMITTED) + break; + + if (time_is_before_jiffies(timeout)) { + pr_err("Wait for completion (polling) timeout\n"); + /* EFA didn't have any completion */ + admin_queue->stats.no_completion++; + + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state); + err = -ETIME; + goto out; + } + + msleep(admin_queue->poll_interval); + } + + if (unlikely(comp_ctx->status == EFA_CMD_ABORTED)) { + pr_err("Command was aborted\n"); + admin_queue->stats.aborted_cmd++; + err = -ENODEV; + goto out; + } + + WARN(comp_ctx->status != EFA_CMD_COMPLETED, "Invalid comp status %d\n", + comp_ctx->status); + + err = efa_com_comp_status_to_errno(comp_ctx->comp_status); +out: + efa_com_put_comp_ctx(admin_queue, comp_ctx); + return err; +} + +static int efa_com_wait_and_process_admin_cq_interrupts(struct efa_comp_ctx *comp_ctx, + struct efa_com_admin_queue *admin_queue) +{ + unsigned long flags; + int err; + + wait_for_completion_timeout(&comp_ctx->wait_event, + usecs_to_jiffies( + admin_queue->completion_timeout)); + + /* + * In case the command wasn't completed find out the root cause. 
+ * There might be 2 kinds of errors + * 1) No completion (timeout reached) + * 2) There is completion but the device didn't get any msi-x interrupt. + */ + if (unlikely(comp_ctx->status == EFA_CMD_SUBMITTED)) { + spin_lock_irqsave(&admin_queue->cq.lock, flags); + efa_com_handle_admin_completion(admin_queue); + spin_unlock_irqrestore(&admin_queue->cq.lock, flags); + + admin_queue->stats.no_completion++; + + if (comp_ctx->status == EFA_CMD_COMPLETED) + pr_err("The device sent a completion but the driver didn't receive any MSI-X interrupt for admin cmd %d status %d (ctx: 0x%p, sq producer: %d, sq consumer: %d, cq consumer: %d)\n", + comp_ctx->cmd_opcode, comp_ctx->status, comp_ctx, + admin_queue->sq.pc, admin_queue->sq.cc, + admin_queue->cq.cc); + else + pr_err("The device didn't send any completion for admin cmd %d status %d (ctx 0x%p, sq producer: %d, sq consumer: %d, cq consumer: %d)\n", + comp_ctx->cmd_opcode, comp_ctx->status, comp_ctx, + admin_queue->sq.pc, admin_queue->sq.cc, + admin_queue->cq.cc); + + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state); + err = -ETIME; + goto out; + } + + err = efa_com_comp_status_to_errno(comp_ctx->comp_status); +out: + efa_com_put_comp_ctx(admin_queue, comp_ctx); + return err; +} + +/* + * There are two types to wait for completion. + * Polling mode - wait until the completion is available. + * Async mode - wait on wait queue until the completion is ready + * (or the timeout expired). + * It is expected that the IRQ called efa_com_handle_admin_completion + * to mark the completions. + */ +static int efa_com_wait_and_process_admin_cq(struct efa_comp_ctx *comp_ctx, + struct efa_com_admin_queue *admin_queue) +{ + if (admin_queue->polling) + return efa_com_wait_and_process_admin_cq_polling(comp_ctx, + admin_queue); + + return efa_com_wait_and_process_admin_cq_interrupts(comp_ctx, + admin_queue); +} + +/* + * efa_com_cmd_exec - Execute admin command + * @admin_queue: admin queue. + * @cmd: the admin command to execute. + * @cmd_size: the command size. + * @comp: command completion return entry. + * @comp_size: command completion size. + * Submit an admin command and then wait until the device will return a + * completion. + * The completion will be copied into comp. + * + * @return - 0 on success, negative value on failure. + */ +int efa_com_cmd_exec(struct efa_com_admin_queue *admin_queue, + struct efa_admin_aq_entry *cmd, + size_t cmd_size, + struct efa_admin_acq_entry *comp, + size_t comp_size) +{ + struct efa_comp_ctx *comp_ctx; + int err; + + pr_debug("opcode %d", cmd->aq_common_descriptor.opcode); + comp_ctx = efa_com_submit_admin_cmd(admin_queue, cmd, cmd_size, comp, + comp_size); + if (unlikely(IS_ERR(comp_ctx))) { + pr_err("Failed to submit command opcode %u err %ld\n", + cmd->aq_common_descriptor.opcode, PTR_ERR(comp_ctx)); + + return PTR_ERR(comp_ctx); + } + + err = efa_com_wait_and_process_admin_cq(comp_ctx, admin_queue); + if (unlikely(err)) + pr_err("Failed to process command opcode %u err %d\n", + cmd->aq_common_descriptor.opcode, err); + + return err; +} + +/* + * efa_com_abort_admin_commands - Abort all the outstanding admin commands. + * @edev: EFA communication layer struct + * + * This method aborts all the outstanding admin commands. + * The caller should then call efa_com_wait_for_abort_completion to make sure + * all the commands were completed. 
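For reference, a minimal sketch of the wrapping pattern around efa_com_cmd_exec() that the command handlers added later in this series follow, shown here with the destroy-QP admin command; the helper name is an assumption.

/* Illustration only: wrapping one admin command around efa_com_cmd_exec(). */
static int efa_example_destroy_qp(struct efa_com_admin_queue *aq, u32 qp_handle)
{
        struct efa_admin_destroy_qp_resp resp = {};
        struct efa_admin_destroy_qp_cmd cmd = {};

        cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_QP;
        cmd.qp_handle = qp_handle;

        return efa_com_cmd_exec(aq,
                                (struct efa_admin_aq_entry *)&cmd,
                                sizeof(cmd),
                                (struct efa_admin_acq_entry *)&resp,
                                sizeof(resp));
}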
+ */ +static void efa_com_abort_admin_commands(struct efa_com_dev *edev) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_comp_ctx *comp_ctx; + unsigned long flags; + u16 i; + + spin_lock(&admin_queue->sq.lock); + spin_lock_irqsave(&admin_queue->cq.lock, flags); + for (i = 0; i < admin_queue->depth; i++) { + comp_ctx = efa_com_get_comp_ctx(admin_queue, i, false); + if (unlikely(!comp_ctx)) + break; + + comp_ctx->status = EFA_CMD_ABORTED; + + complete(&comp_ctx->wait_event); + } + spin_unlock_irqrestore(&admin_queue->cq.lock, flags); + spin_unlock(&admin_queue->sq.lock); +} + +/* + * efa_com_wait_for_abort_completion - Wait for admin commands abort. + * @edev: EFA communication layer struct + * + * This method wait until all the outstanding admin commands will be completed. + */ +static void efa_com_wait_for_abort_completion(struct efa_com_dev *edev) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + int i; + + /* all mine */ + for (i = 0; i < admin_queue->depth; i++) + down(&admin_queue->avail_cmds); + + /* let it go */ + for (i = 0; i < admin_queue->depth; i++) + up(&admin_queue->avail_cmds); +} + +static void efa_com_admin_flush(struct efa_com_dev *edev) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state); + + efa_com_abort_admin_commands(edev); + efa_com_wait_for_abort_completion(edev); +} + +/* + * efa_com_admin_destroy - Destroy the admin and the async events queues. + * @edev: EFA communication layer struct + */ +void efa_com_admin_destroy(struct efa_com_dev *edev) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_com_admin_cq *cq = &admin_queue->cq; + struct efa_com_admin_sq *sq = &admin_queue->sq; + struct efa_com_aenq *aenq = &edev->aenq; + u16 size; + + efa_com_admin_flush(edev); + + devm_kfree(edev->dmadev, admin_queue->comp_ctx_pool); + devm_kfree(edev->dmadev, admin_queue->comp_ctx); + + size = ADMIN_SQ_SIZE(admin_queue->depth); + dma_free_coherent(edev->dmadev, size, sq->entries, sq->dma_addr); + + pr_debug("destroyed SQ"); + size = ADMIN_CQ_SIZE(admin_queue->depth); + dma_free_coherent(edev->dmadev, size, cq->entries, cq->dma_addr); + pr_debug("destroyed CQ"); + + size = ADMIN_AENQ_SIZE(aenq->depth); + dma_free_coherent(edev->dmadev, size, aenq->entries, aenq->dma_addr); + pr_debug("destroyed AENQ"); +} + +/* + * efa_com_set_admin_polling_mode - Set the admin completion queue polling mode + * @edev: EFA communication layer struct + * @polling: Enable/Disable polling mode + * + * Set the admin completion mode. + */ +void efa_com_set_admin_polling_mode(struct efa_com_dev *edev, bool polling) +{ + u32 mask_value = 0; + + if (polling) + mask_value = EFA_REGS_ADMIN_INTR_MASK; + + writel(mask_value, edev->reg_bar + EFA_REGS_INTR_MASK_OFF); + edev->admin_queue.polling = polling; +} + +/* + * efa_com_admin_init - Init the admin and the async queues + * @edev: EFA communication layer struct + * @aenq_handlers: Those handlers to be called upon event. + * + * Initialize the admin submission and completion queues. + * Initialize the asynchronous events notification queues. + * + * @return - 0 on success, negative value on failure. 
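A hedged sketch of one plausible bring-up order for the com routines, based only on the prototypes in efa_com.h and the fact that every register read goes through the readless MMIO mechanism; the helper name and exact ordering are assumptions.

/* Illustration only: plausible com-layer bring-up with rollback. */
static int efa_example_com_bringup(struct efa_com_dev *edev,
                                   struct efa_aenq_handlers *handlers)
{
        int err;

        /* readless register reads must work before anything touches registers */
        err = efa_com_mmio_reg_read_init(edev);
        if (err)
                return err;

        err = efa_com_validate_version(edev);
        if (err)
                goto err_mmio_destroy;

        err = efa_com_get_dma_width(edev);
        if (err < 0)
                goto err_mmio_destroy;

        err = efa_com_admin_init(edev, handlers);
        if (err)
                goto err_mmio_destroy;

        return 0;

err_mmio_destroy:
        efa_com_mmio_reg_read_destroy(edev);
        return err;
}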
+ */ +int efa_com_admin_init(struct efa_com_dev *edev, + struct efa_aenq_handlers *aenq_handlers) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + u32 timeout; + u32 dev_sts; + u32 cap; + int err; + + dev_sts = efa_com_reg_read32(edev, EFA_REGS_DEV_STS_OFF); + if (!(dev_sts & EFA_REGS_DEV_STS_READY_MASK)) { + pr_err("Device isn't ready, abort com init 0x%08x\n", dev_sts); + return -ENODEV; + } + + admin_queue->depth = EFA_ADMIN_QUEUE_DEPTH; + + admin_queue->dmadev = edev->dmadev; + admin_queue->polling = true; + + sema_init(&admin_queue->avail_cmds, admin_queue->depth); + + err = efa_com_init_comp_ctxt(admin_queue); + if (err) + return err; + + err = efa_com_admin_init_sq(edev); + if (err) + goto err_destroy_comp_ctxt; + + err = efa_com_admin_init_cq(edev); + if (err) + goto err_destroy_sq; + + efa_com_set_admin_polling_mode(edev, false); + + err = efa_com_admin_init_aenq(edev, aenq_handlers); + if (err) + goto err_destroy_cq; + + cap = efa_com_reg_read32(edev, EFA_REGS_CAPS_OFF); + timeout = (cap & EFA_REGS_CAPS_ADMIN_CMD_TO_MASK) >> + EFA_REGS_CAPS_ADMIN_CMD_TO_SHIFT; + if (timeout) + /* the resolution of timeout reg is 100ms */ + admin_queue->completion_timeout = timeout * 100000; + else + admin_queue->completion_timeout = ADMIN_CMD_TIMEOUT_US; + + admin_queue->poll_interval = EFA_POLL_INTERVAL_MS; + + set_bit(EFA_AQ_STATE_RUNNING_BIT, &admin_queue->state); + + return 0; + +err_destroy_cq: + dma_free_coherent(edev->dmadev, ADMIN_CQ_SIZE(admin_queue->depth), + admin_queue->cq.entries, admin_queue->cq.dma_addr); +err_destroy_sq: + dma_free_coherent(edev->dmadev, ADMIN_SQ_SIZE(admin_queue->depth), + admin_queue->sq.entries, admin_queue->sq.dma_addr); +err_destroy_comp_ctxt: + devm_kfree(edev->dmadev, admin_queue->comp_ctx); + + return err; +} + +/* + * efa_com_admin_q_comp_intr_handler - admin queue interrupt handler + * @edev: EFA communication layer struct + * + * This method go over the admin completion queue and wake up all the pending + * threads that wait on the commands wait event. + * + * @note: Should be called after MSI-X interrupt. + */ +void efa_com_admin_q_comp_intr_handler(struct efa_com_dev *edev) +{ + unsigned long flags; + + spin_lock_irqsave(&edev->admin_queue.cq.lock, flags); + efa_com_handle_admin_completion(&edev->admin_queue); + spin_unlock_irqrestore(&edev->admin_queue.cq.lock, flags); +} + +/* + * efa_handle_specific_aenq_event: + * return the handler that is relevant to the specific event group + */ +static efa_aenq_handler efa_com_get_specific_aenq_cb(struct efa_com_dev *edev, + u16 group) +{ + struct efa_aenq_handlers *aenq_handlers = edev->aenq.aenq_handlers; + + if (group < EFA_MAX_HANDLERS && aenq_handlers->handlers[group]) + return aenq_handlers->handlers[group]; + + return aenq_handlers->unimplemented_handler; +} + +/* + * efa_com_aenq_intr_handler - AENQ interrupt handler + * @edev: EFA communication layer struct + * + * Go over the async event notification queue and call the proper aenq handler. 
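A hypothetical example of a callback matching the efa_aenq_handler typedef and of populating the table this routine dispatches into; the group index and handler body are assumptions, not defined by this patch.

/* Illustration only: one AENQ callback and its handler table entry. */
static void efa_example_aenq_handler(void *data,
                                     struct efa_admin_aenq_entry *aenq_e)
{
        /* "data" is whatever was passed to efa_com_aenq_intr_handler() */
        pr_debug("AENQ event, group %u\n", aenq_e->aenq_common_desc.group);
}

static struct efa_aenq_handlers efa_example_aenq_handlers = {
        .handlers[0] = efa_example_aenq_handler, /* group 0 is just an example */
        .unimplemented_handler = efa_example_aenq_handler,
};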
+ */ +void efa_com_aenq_intr_handler(struct efa_com_dev *edev, void *data) +{ + struct efa_admin_aenq_common_desc *aenq_common; + struct efa_com_aenq *aenq = &edev->aenq; + struct efa_admin_aenq_entry *aenq_e; + efa_aenq_handler handler_cb; + u32 processed = 0; + u8 phase; + u32 ci; + + ci = aenq->cc & (aenq->depth - 1); + phase = aenq->phase; + aenq_e = &aenq->entries[ci]; /* Get first entry */ + aenq_common = &aenq_e->aenq_common_desc; + + /* Go over all the events */ + while ((READ_ONCE(aenq_common->flags) & + EFA_ADMIN_AENQ_COMMON_DESC_PHASE_MASK) == phase) { + /* + * Do not read the rest of the completion entry before the + * phase bit was validated + */ + dma_rmb(); + + /* Handle specific event*/ + handler_cb = efa_com_get_specific_aenq_cb(edev, + aenq_common->group); + handler_cb(data, aenq_e); /* call the actual event handler*/ + + /* Get next event entry */ + ci++; + processed++; + + if (unlikely(ci == aenq->depth)) { + ci = 0; + phase = !phase; + } + aenq_e = &aenq->entries[ci]; + aenq_common = &aenq_e->aenq_common_desc; + } + + aenq->cc += processed; + aenq->phase = phase; + + /* Don't update aenq doorbell if there weren't any processed events */ + if (!processed) + return; + + /* barrier not needed in case of writel */ + writel(aenq->cc, edev->reg_bar + EFA_REGS_AENQ_CONS_DB_OFF); +} + +static void efa_com_mmio_reg_read_resp_addr_init(struct efa_com_dev *edev) +{ + struct efa_com_mmio_read *mmio_read = &edev->mmio_read; + u32 addr_high; + u32 addr_low; + + /* dma_addr_bits is unknown at this point */ + addr_high = (mmio_read->read_resp_dma_addr >> 32) & GENMASK(31, 0); + addr_low = mmio_read->read_resp_dma_addr & GENMASK(31, 0); + + writel(addr_high, edev->reg_bar + EFA_REGS_MMIO_RESP_HI_OFF); + writel(addr_low, edev->reg_bar + EFA_REGS_MMIO_RESP_LO_OFF); +} + +int efa_com_mmio_reg_read_init(struct efa_com_dev *edev) +{ + struct efa_com_mmio_read *mmio_read = &edev->mmio_read; + + spin_lock_init(&mmio_read->lock); + mmio_read->read_resp = + dma_zalloc_coherent(edev->dmadev, sizeof(*mmio_read->read_resp), + &mmio_read->read_resp_dma_addr, GFP_KERNEL); + if (unlikely(!mmio_read->read_resp)) + return -ENOMEM; + + efa_com_mmio_reg_read_resp_addr_init(edev); + + mmio_read->read_resp->req_id = 0; + mmio_read->seq_num = 0; + mmio_read->mmio_read_timeout = EFA_REG_READ_TIMEOUT_US; + + return 0; +} + +void efa_com_mmio_reg_read_destroy(struct efa_com_dev *edev) +{ + struct efa_com_mmio_read *mmio_read = &edev->mmio_read; + + /* just in case someone is still spinning on a read */ + spin_lock(&mmio_read->lock); + dma_free_coherent(edev->dmadev, sizeof(*mmio_read->read_resp), + mmio_read->read_resp, mmio_read->read_resp_dma_addr); + spin_unlock(&mmio_read->lock); +} + +/* + * efa_com_validate_version - Validate the device parameters + * @edev: EFA communication layer struct + * + * This method validate the device parameters are the same as the saved + * parameters in edev. + * This method is useful after device reset, to validate the device mac address + * and the device offloads are the same as before the reset. + * + * @return - 0 on success negative value otherwise. 
+ */ +int efa_com_validate_version(struct efa_com_dev *edev) +{ + u32 ctrl_ver_masked; + u32 ctrl_ver; + u32 ver; + + /* + * Make sure the EFA version and the controller version are at least + * as the driver expects + */ + ver = efa_com_reg_read32(edev, EFA_REGS_VERSION_OFF); + ctrl_ver = efa_com_reg_read32(edev, + EFA_REGS_CONTROLLER_VERSION_OFF); + + pr_info("efa device version: %d.%d\n", + (ver & EFA_REGS_VERSION_MAJOR_VERSION_MASK) >> + EFA_REGS_VERSION_MAJOR_VERSION_SHIFT, + ver & EFA_REGS_VERSION_MINOR_VERSION_MASK); + + if (ver < MIN_EFA_VER) { + pr_err("EFA version is lower than the minimal version the driver supports\n"); + return -1; + } + + pr_info("efa controller version: %d.%d.%d implementation version %d\n", + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_MASK) >> + EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_SHIFT, + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_MASK) >> + EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_SHIFT, + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_SUBMINOR_VERSION_MASK), + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_IMPL_ID_MASK) >> + EFA_REGS_CONTROLLER_VERSION_IMPL_ID_SHIFT); + + ctrl_ver_masked = + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_MAJOR_VERSION_MASK) | + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_MINOR_VERSION_MASK) | + (ctrl_ver & EFA_REGS_CONTROLLER_VERSION_SUBMINOR_VERSION_MASK); + + /* Validate the ctrl version without the implementation ID */ + if (ctrl_ver_masked < MIN_EFA_CTRL_VER) { + pr_err("EFA ctrl version is lower than the minimal ctrl version the driver supports\n"); + return -1; + } + + return 0; +} + +/* + * efa_com_get_dma_width - Retrieve physical dma address width the device + * supports. + * @edev: EFA communication layer struct + * + * Retrieve the maximum physical address bits the device can handle. + * + * @return: > 0 on Success and negative value otherwise. + */ +int efa_com_get_dma_width(struct efa_com_dev *edev) +{ + u32 caps = efa_com_reg_read32(edev, EFA_REGS_CAPS_OFF); + int width; + + width = (caps & EFA_REGS_CAPS_DMA_ADDR_WIDTH_MASK) >> + EFA_REGS_CAPS_DMA_ADDR_WIDTH_SHIFT; + + pr_debug("EFA dma width: %d\n", width); + + if (width < 32 || width > 64) { + pr_err("DMA width illegal value: %d\n", width); + return -EINVAL; + } + + edev->dma_addr_bits = width; + + return width; +} + +static int wait_for_reset_state(struct efa_com_dev *edev, u32 timeout, + u16 exp_state) +{ + u32 val, i; + + for (i = 0; i < timeout; i++) { + val = efa_com_reg_read32(edev, EFA_REGS_DEV_STS_OFF); + + if ((val & EFA_REGS_DEV_STS_RESET_IN_PROGRESS_MASK) == + exp_state) + return 0; + + pr_debug("reset indication val %d\n", val); + msleep(EFA_POLL_INTERVAL_MS); + } + + return -ETIME; +} + +/* + * efa_com_dev_reset - Perform device FLR to the device. + * @edev: EFA communication layer struct + * @reset_reason: Specify what is the trigger for the reset in case of an error. + * + * @return - 0 on success, negative value on failure. 
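A brief, hypothetical sketch of the post-reset sequence the kernel-doc above and the efa_com_validate_version() comment suggest: reset, then re-validate versions and the DMA width; the helper name is an assumption.

/* Illustration only: reset followed by re-validation. */
static int efa_example_reset_and_revalidate(struct efa_com_dev *edev,
                                            enum efa_regs_reset_reason_types reason)
{
        int err;

        err = efa_com_dev_reset(edev, reason);
        if (err)
                return err;

        err = efa_com_validate_version(edev);
        if (err)
                return err;

        /* efa_com_get_dma_width() returns the width on success, negative on error */
        err = efa_com_get_dma_width(edev);
        return err < 0 ? err : 0;
}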
+ */ +int efa_com_dev_reset(struct efa_com_dev *edev, + enum efa_regs_reset_reason_types reset_reason) +{ + u32 stat, timeout, cap, reset_val; + int err; + + stat = efa_com_reg_read32(edev, EFA_REGS_DEV_STS_OFF); + cap = efa_com_reg_read32(edev, EFA_REGS_CAPS_OFF); + + if (!(stat & EFA_REGS_DEV_STS_READY_MASK)) { + pr_err("Device isn't ready, can't reset device\n"); + return -EINVAL; + } + + timeout = (cap & EFA_REGS_CAPS_RESET_TIMEOUT_MASK) >> + EFA_REGS_CAPS_RESET_TIMEOUT_SHIFT; + if (!timeout) { + pr_err("Invalid timeout value\n"); + return -EINVAL; + } + + /* start reset */ + reset_val = EFA_REGS_DEV_CTL_DEV_RESET_MASK; + reset_val |= (reset_reason << EFA_REGS_DEV_CTL_RESET_REASON_SHIFT) & + EFA_REGS_DEV_CTL_RESET_REASON_MASK; + writel(reset_val, edev->reg_bar + EFA_REGS_DEV_CTL_OFF); + + /* reset clears the mmio readless address, restore it */ + efa_com_mmio_reg_read_resp_addr_init(edev); + + err = wait_for_reset_state(edev, timeout, + EFA_REGS_DEV_STS_RESET_IN_PROGRESS_MASK); + if (err) { + pr_err("Reset indication didn't turn on\n"); + return err; + } + + /* reset done */ + writel(0, edev->reg_bar + EFA_REGS_DEV_CTL_OFF); + err = wait_for_reset_state(edev, timeout, 0); + if (err) { + pr_err("Reset indication didn't turn off\n"); + return err; + } + + timeout = (cap & EFA_REGS_CAPS_ADMIN_CMD_TO_MASK) >> + EFA_REGS_CAPS_ADMIN_CMD_TO_SHIFT; + if (timeout) + /* the resolution of timeout reg is 100ms */ + edev->admin_queue.completion_timeout = timeout * 100000; + else + edev->admin_queue.completion_timeout = ADMIN_CMD_TIMEOUT_US; + + return 0; +} + From patchwork Tue Dec 4 12:04:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711653 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DCE501731 for ; Tue, 4 Dec 2018 12:05:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CD99F2AD6B for ; Tue, 4 Dec 2018 12:05:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C10312AEFD; Tue, 4 Dec 2018 12:05:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B9E942AE4E for ; Tue, 4 Dec 2018 12:05:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726223AbeLDMFk (ORCPT ); Tue, 4 Dec 2018 07:05:40 -0500 Received: from smtp-fw-33001.amazon.com ([207.171.190.10]:1134 "EHLO smtp-fw-33001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726218AbeLDMFk (ORCPT ); Tue, 4 Dec 2018 07:05:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925137; x=1575461137; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=1mIiL9WJnqc/+r3abWOvaOcYHPNqUmnKRCyKc7X/1HY=; b=Sv6ueFybxmZ4FFyIPUoqExXRVs3DHMLvYcqXdR/tRAztMWSGS5hZ4gLS k7JY2xfzApT8akQTjJXsR6xfN8YzdYlcl945pMHW/hJumSas8m1Hu8H5z IpsHKo9JRED+ym6c+zdSAiCp4jb3sFiccmVu82sj6FA5ZPfPOxR0W2nwe 4=; X-IronPort-AV: 
E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="768929550" Received: from sea3-co-svc-lb6-vlan2.sea.amazon.com (HELO email-inbound-relay-1a-af6a10df.us-east-1.amazon.com) ([10.47.22.34]) by smtp-border-fw-out-33001.sea14.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:29 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1a-af6a10df.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5NaQ071908 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:25 GMT Received: from EX13D13EUA001.ant.amazon.com (10.43.165.24) by EX13MTAUEA001.ant.amazon.com (10.43.61.82) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:11 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D13EUA001.ant.amazon.com (10.43.165.24) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:10 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:08 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 09/13] RDMA/efa: Add com command handlers Date: Tue, 4 Dec 2018 14:04:25 +0200 Message-ID: <1543925069-8838-10-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add the EFA common commands implementation. Signed-off-by: Gal Pressman --- drivers/infiniband/hw/efa/efa_com_cmd.c | 544 ++++++++++++++++++++++++++++++++ 1 file changed, 544 insertions(+) create mode 100644 drivers/infiniband/hw/efa/efa_com_cmd.c diff --git a/drivers/infiniband/hw/efa/efa_com_cmd.c b/drivers/infiniband/hw/efa/efa_com_cmd.c new file mode 100644 index 000000000000..da9d4678928d --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_com_cmd.c @@ -0,0 +1,544 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. 
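As a usage illustration (not part of this patch), a sketch of how a caller might drive the create-CQ handler implemented below; the entry size and sub-CQ count are placeholders that would normally come from the device attributes.

/* Illustration only: creating a completion queue through the com layer. */
static int efa_example_create_cq(struct efa_com_dev *edev,
                                 dma_addr_t cq_dma_addr, u16 depth, u16 *cq_idx)
{
        struct efa_com_create_cq_params params = {};
        struct efa_com_create_cq_result result = {};
        int err;

        params.dma_addr = cq_dma_addr;   /* coherent buffer holding the CQ ring */
        params.cq_depth = depth;
        params.entry_size_in_bytes = 32; /* placeholder entry size */
        params.num_sub_cqs = 2;          /* placeholder, see sub_cqs_per_cq */

        err = efa_com_create_cq(edev, &params, &result);
        if (err)
                return err;

        *cq_idx = result.cq_idx;
        return 0;
}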
+ */ + +#include "efa.h" +#include "efa_com.h" +#include "efa_com_cmd.h" + +void efa_com_set_dma_addr(dma_addr_t addr, u32 *addr_high, u32 *addr_low) +{ + *addr_low = addr & GENMASK(31, 0); + *addr_high = (addr >> 32) & GENMASK(31, 0); +} + +int efa_com_create_qp(struct efa_com_dev *edev, + struct efa_com_create_qp_params *params, + struct efa_com_create_qp_result *res) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_create_qp_resp cmd_completion; + struct efa_admin_create_qp_cmd create_qp_cmd; + int err; + + memset(&create_qp_cmd, 0x0, sizeof(create_qp_cmd)); + + create_qp_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_QP; + + create_qp_cmd.pd = params->pd; + create_qp_cmd.qp_type = params->qp_type; + create_qp_cmd.flags = 0; + create_qp_cmd.sq_base_addr = 0; + create_qp_cmd.rq_base_addr = params->rq_base_addr; + create_qp_cmd.send_cq_idx = params->send_cq_idx; + create_qp_cmd.recv_cq_idx = params->recv_cq_idx; + create_qp_cmd.qp_alloc_size.send_queue_ring_size = + params->sq_ring_size_in_bytes; + create_qp_cmd.qp_alloc_size.send_queue_depth = + params->sq_depth; + create_qp_cmd.qp_alloc_size.recv_queue_ring_size = + params->rq_ring_size_in_bytes; + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&create_qp_cmd, + sizeof(create_qp_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) { + pr_err("Failed to create qp [%d]\n", err); + return err; + } + + res->qp_handle = cmd_completion.qp_handle; + res->qp_num = cmd_completion.qp_num; + res->sq_db_offset = cmd_completion.sq_db_offset; + res->rq_db_offset = cmd_completion.rq_db_offset; + res->llq_descriptors_offset = cmd_completion.llq_descriptors_offset; + res->send_sub_cq_idx = cmd_completion.send_sub_cq_idx; + res->recv_sub_cq_idx = cmd_completion.recv_sub_cq_idx; + + return err; +} + +int efa_com_destroy_qp(struct efa_com_dev *edev, + struct efa_com_destroy_qp_params *params) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_destroy_qp_resp cmd_completion; + struct efa_admin_destroy_qp_cmd qp_cmd; + int err; + + memset(&qp_cmd, 0x0, sizeof(qp_cmd)); + + qp_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_QP; + + qp_cmd.qp_handle = params->qp_handle; + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&qp_cmd, + sizeof(qp_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) + pr_err("failed to destroy qp-0x%08x [%d]\n", qp_cmd.qp_handle, + err); + + return err; +} + +int efa_com_create_cq(struct efa_com_dev *edev, + struct efa_com_create_cq_params *params, + struct efa_com_create_cq_result *result) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_create_cq_resp cmd_completion; + struct efa_admin_create_cq_cmd create_cmd; + int err; + + memset(&create_cmd, 0x0, sizeof(create_cmd)); + create_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_CQ; + create_cmd.cq_caps_2 = (params->entry_size_in_bytes / 4) & + EFA_ADMIN_CREATE_CQ_CMD_CQ_ENTRY_SIZE_WORDS_MASK; + create_cmd.msix_vector_idx = 0; + create_cmd.cq_depth = params->cq_depth; + create_cmd.num_sub_cqs = params->num_sub_cqs; + + efa_com_set_dma_addr(params->dma_addr, + &create_cmd.cq_ba.mem_addr_high, + &create_cmd.cq_ba.mem_addr_low); + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&create_cmd, + sizeof(create_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) { + pr_err("failed 
to create cq[%d]\n", err); + return err; + } + + pr_debug("created cq[%u], depth[%u]\n", cmd_completion.cq_idx, + params->cq_depth); + result->cq_idx = cmd_completion.cq_idx; + result->actual_depth = params->cq_depth; + + return err; +} + +int efa_com_destroy_cq(struct efa_com_dev *edev, + struct efa_com_destroy_cq_params *params) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_destroy_cq_resp destroy_resp; + struct efa_admin_destroy_cq_cmd destroy_cmd; + int err; + + memset(&destroy_cmd, 0x0, sizeof(destroy_cmd)); + + destroy_cmd.cq_idx = params->cq_idx; + destroy_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_CQ; + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&destroy_cmd, + sizeof(destroy_cmd), + (struct efa_admin_acq_entry *)&destroy_resp, + sizeof(destroy_resp)); + + if (unlikely(err)) + pr_err("Failed to destroy CQ. error: %d\n", err); + + return err; +} + +int efa_com_register_mr(struct efa_com_dev *edev, + struct efa_com_reg_mr_params *params, + struct efa_com_reg_mr_result *result) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_reg_mr_resp cmd_completion; + struct efa_admin_reg_mr_cmd mr_cmd; + int err; + + memset(&mr_cmd, 0x0, sizeof(mr_cmd)); + + mr_cmd.aq_common_desc.opcode = EFA_ADMIN_REG_MR; + mr_cmd.aq_common_desc.flags = 0; + + mr_cmd.pd = params->pd; + mr_cmd.mr_length = params->mr_length_in_bytes; + + mr_cmd.flags |= 0 & + EFA_ADMIN_REG_MR_CMD_MEM_ADDR_PHY_MODE_EN_MASK; + mr_cmd.flags |= params->page_shift & + EFA_ADMIN_REG_MR_CMD_PHYS_PAGE_SIZE_SHIFT_MASK; + mr_cmd.iova = params->iova; + mr_cmd.permissions |= params->permissions & + EFA_ADMIN_REG_MR_CMD_LOCAL_WRITE_ENABLE_MASK; + + if (params->inline_pbl) { + memcpy(mr_cmd.pbl.inline_pbl_array, + params->pbl.inline_pbl_array, + sizeof(mr_cmd.pbl.inline_pbl_array)); + } else { + mr_cmd.pbl.pbl.length = params->pbl.pbl.length; + mr_cmd.pbl.pbl.address.mem_addr_low = + params->pbl.pbl.address.mem_addr_low; + mr_cmd.pbl.pbl.address.mem_addr_high = + params->pbl.pbl.address.mem_addr_high; + mr_cmd.aq_common_desc.flags |= + EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_MASK; + if (params->indirect) + mr_cmd.aq_common_desc.flags |= + EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK; + } + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&mr_cmd, + sizeof(mr_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) { + pr_err("failed to register mr [%d]\n", err); + return err; + } + + result->l_key = cmd_completion.l_key; + result->r_key = cmd_completion.r_key; + + return err; +} + +int efa_com_dereg_mr(struct efa_com_dev *edev, + struct efa_com_dereg_mr_params *params) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_dereg_mr_resp cmd_completion; + struct efa_admin_dereg_mr_cmd mr_cmd; + int err; + + memset(&mr_cmd, 0x0, sizeof(mr_cmd)); + + mr_cmd.aq_common_desc.opcode = EFA_ADMIN_DEREG_MR; + mr_cmd.l_key = params->l_key; + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&mr_cmd, + sizeof(mr_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) + pr_err("failed to de-register mr(lkey-0x%08X) [%d]\n", + mr_cmd.l_key, err); + + return err; +} + +int efa_com_create_ah(struct efa_com_dev *edev, + struct efa_com_create_ah_params *params, + struct efa_com_create_ah_result *result) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct 
efa_admin_create_ah_resp cmd_completion; + struct efa_admin_create_ah_cmd ah_cmd; + int err; + + memset(&ah_cmd, 0x0, sizeof(ah_cmd)); + + ah_cmd.aq_common_desc.opcode = EFA_ADMIN_CREATE_AH; + + memcpy(ah_cmd.dest_addr, params->dest_addr, sizeof(ah_cmd.dest_addr)); + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&ah_cmd, + sizeof(ah_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) { + pr_err("failed to create ah [%d]\n", err); + return err; + } + + result->ah = cmd_completion.ah; + + return err; +} + +int efa_com_destroy_ah(struct efa_com_dev *edev, + struct efa_com_destroy_ah_params *params) +{ + struct efa_com_admin_queue *admin_queue = &edev->admin_queue; + struct efa_admin_destroy_ah_resp cmd_completion; + struct efa_admin_destroy_ah_cmd ah_cmd; + int err; + + memset(&ah_cmd, 0x0, sizeof(ah_cmd)); + + ah_cmd.aq_common_desc.opcode = EFA_ADMIN_DESTROY_AH; + ah_cmd.ah = params->ah; + + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)&ah_cmd, + sizeof(ah_cmd), + (struct efa_admin_acq_entry *)&cmd_completion, + sizeof(cmd_completion)); + if (unlikely(err)) + pr_err("failed to destroy ah-%#x [%d]\n", ah_cmd.ah, err); + + return err; +} + +static bool +efa_com_check_supported_feature_id(struct efa_com_dev *edev, + enum efa_admin_aq_feature_id feature_id) +{ + u32 feature_mask = 1 << feature_id; + + /* Device attributes is always supported */ + if (feature_id != EFA_ADMIN_DEVICE_ATTR && + !(edev->supported_features & feature_mask)) + return false; + + return true; +} + +static int efa_com_get_feature_ex(struct efa_com_dev *edev, + struct efa_admin_get_feature_resp *get_resp, + enum efa_admin_aq_feature_id feature_id, + dma_addr_t control_buf_dma_addr, + u32 control_buff_size) +{ + struct efa_admin_get_feature_cmd get_cmd; + struct efa_com_admin_queue *admin_queue; + int err; + + if (!efa_com_check_supported_feature_id(edev, feature_id)) { + pr_err("Feature %d isn't supported\n", feature_id); + return -EOPNOTSUPP; + } + + memset(&get_cmd, 0x0, sizeof(get_cmd)); + admin_queue = &edev->admin_queue; + + get_cmd.aq_common_descriptor.opcode = EFA_ADMIN_GET_FEATURE; + + if (control_buff_size) + get_cmd.aq_common_descriptor.flags = + EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK; + else + get_cmd.aq_common_descriptor.flags = 0; + + efa_com_set_dma_addr(control_buf_dma_addr, + &get_cmd.control_buffer.address.mem_addr_high, + &get_cmd.control_buffer.address.mem_addr_low); + + get_cmd.control_buffer.length = control_buff_size; + get_cmd.feature_common.feature_id = feature_id; + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *) + &get_cmd, + sizeof(get_cmd), + (struct efa_admin_acq_entry *) + get_resp, + sizeof(*get_resp)); + + if (unlikely(err)) + pr_err("Failed to submit get_feature command %d error: %d\n", + feature_id, err); + + return err; +} + +static int efa_com_get_feature(struct efa_com_dev *edev, + struct efa_admin_get_feature_resp *get_resp, + enum efa_admin_aq_feature_id feature_id) +{ + return efa_com_get_feature_ex(edev, get_resp, feature_id, 0, 0); +} + +int efa_com_get_network_attr(struct efa_com_dev *edev, + struct efa_com_get_network_attr_result *result) +{ + struct efa_admin_get_feature_resp resp; + int err; + + err = efa_com_get_feature(edev, &resp, + EFA_ADMIN_NETWORK_ATTR); + if (unlikely(err)) { + pr_err("Failed to get network attributes %d\n", err); + return err; + } + + memcpy(result->addr, resp.u.network_attr.addr, + sizeof(resp.u.network_attr.addr)); + result->mtu 
= resp.u.network_attr.mtu; + + return 0; +} + +int efa_com_get_device_attr(struct efa_com_dev *edev, + struct efa_com_get_device_attr_result *result) +{ + struct pci_dev *pdev = container_of(edev->dmadev, struct pci_dev, dev); + struct efa_admin_get_feature_resp resp; + int err; + + err = efa_com_get_feature(edev, &resp, EFA_ADMIN_DEVICE_ATTR); + if (unlikely(err)) { + pr_err("Failed to get network attributes %d\n", err); + return err; + } + + result->fw_version = resp.u.device_attr.fw_version; + result->admin_api_version = resp.u.device_attr.admin_api_version; + result->vendor_id = pdev->vendor; + result->vendor_part_id = pdev->device; + result->device_version = resp.u.device_attr.device_version; + result->supported_features = resp.u.device_attr.supported_features; + result->phys_addr_width = resp.u.device_attr.phys_addr_width; + result->virt_addr_width = resp.u.device_attr.virt_addr_width; + result->db_bar = resp.u.device_attr.db_bar; + + if (unlikely(result->admin_api_version < 1)) { + pr_err("Failed to get device attr api version [%u < 1]\n", + result->admin_api_version); + return -EINVAL; + } + + edev->supported_features = resp.u.device_attr.supported_features; + err = efa_com_get_feature(edev, &resp, + EFA_ADMIN_QUEUE_ATTR); + if (unlikely(err)) { + pr_err("Failed to get network attributes %d\n", err); + return err; + } + + result->max_sq = resp.u.queue_attr.max_sq; + result->max_sq_depth = min_t(u32, resp.u.queue_attr.max_sq_depth, + U16_MAX); + result->max_rq = resp.u.queue_attr.max_sq; + result->max_rq_depth = min_t(u32, resp.u.queue_attr.max_rq_depth, + U16_MAX); + result->max_cq = resp.u.queue_attr.max_cq; + result->max_cq_depth = resp.u.queue_attr.max_cq_depth; + result->inline_buf_size = resp.u.queue_attr.inline_buf_size; + result->max_sq_sge = resp.u.queue_attr.max_wr_send_sges; + result->max_rq_sge = resp.u.queue_attr.max_wr_recv_sges; + result->max_mr = resp.u.queue_attr.max_mr; + result->max_mr_pages = resp.u.queue_attr.max_mr_pages; + result->page_size_cap = PAGE_SIZE; + result->max_pd = resp.u.queue_attr.max_pd; + result->max_ah = resp.u.queue_attr.max_ah; + result->sub_cqs_per_cq = resp.u.queue_attr.sub_cqs_per_cq; + + return 0; +} + +int efa_com_get_hw_hints(struct efa_com_dev *edev, + struct efa_com_get_hw_hints_result *result) +{ + struct efa_admin_get_feature_resp resp; + int err; + + err = efa_com_get_feature(edev, &resp, EFA_ADMIN_HW_HINTS); + if (unlikely(err)) { + pr_err("Failed to get hw hints %d\n", err); + return err; + } + + result->admin_completion_timeout = resp.u.hw_hints.admin_completion_timeout; + result->driver_watchdog_timeout = resp.u.hw_hints.driver_watchdog_timeout; + result->mmio_read_timeout = resp.u.hw_hints.mmio_read_timeout; + result->poll_interval = resp.u.hw_hints.poll_interval; + + return 0; +} + +static int efa_com_set_feature_ex(struct efa_com_dev *edev, + struct efa_admin_set_feature_resp *set_resp, + struct efa_admin_set_feature_cmd *set_cmd, + enum efa_admin_aq_feature_id feature_id, + dma_addr_t control_buf_dma_addr, + u32 control_buff_size) +{ + struct efa_com_admin_queue *admin_queue; + int err; + + if (!efa_com_check_supported_feature_id(edev, feature_id)) { + pr_err("Feature %d isn't supported\n", feature_id); + return -EOPNOTSUPP; + } + + admin_queue = &edev->admin_queue; + + set_cmd->aq_common_descriptor.opcode = EFA_ADMIN_SET_FEATURE; + if (control_buff_size) { + set_cmd->aq_common_descriptor.flags = + EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT_MASK; + efa_com_set_dma_addr(control_buf_dma_addr, + 
&set_cmd->control_buffer.address.mem_addr_high, + &set_cmd->control_buffer.address.mem_addr_low); + } else { + set_cmd->aq_common_descriptor.flags = 0; + } + + set_cmd->control_buffer.length = control_buff_size; + set_cmd->feature_common.feature_id = feature_id; + err = efa_com_cmd_exec(admin_queue, + (struct efa_admin_aq_entry *)set_cmd, + sizeof(*set_cmd), + (struct efa_admin_acq_entry *)set_resp, + sizeof(*set_resp)); + + if (unlikely(err)) + pr_err("Failed to submit set_feature command %d error: %d\n", + feature_id, err); + + return err; +} + +static int efa_com_set_feature(struct efa_com_dev *edev, + struct efa_admin_set_feature_resp *set_resp, + struct efa_admin_set_feature_cmd *set_cmd, + enum efa_admin_aq_feature_id feature_id) +{ + return efa_com_set_feature_ex(edev, set_resp, set_cmd, feature_id, + 0, 0); +} + +int efa_com_set_aenq_config(struct efa_com_dev *edev, u32 groups) +{ + struct efa_admin_get_feature_resp get_resp; + struct efa_admin_set_feature_resp set_resp; + struct efa_admin_set_feature_cmd cmd; + int err; + + pr_debug("configuring aenq with groups[0x%x]\n", groups); + + err = efa_com_get_feature(edev, &get_resp, EFA_ADMIN_AENQ_CONFIG); + if (unlikely(err)) { + pr_err("Failed to get aenq attributes: %d\n", err); + return err; + } + + pr_debug("Get aenq groups: supported[0x%x] enabled[0x%x]\n", + get_resp.u.aenq.supported_groups, + get_resp.u.aenq.enabled_groups); + + if ((get_resp.u.aenq.supported_groups & groups) != groups) { + pr_err("Trying to set unsupported aenq groups[0x%x] supported[0x%x]\n", + groups, get_resp.u.aenq.supported_groups); + return -EOPNOTSUPP; + } + + memset(&cmd, 0, sizeof(cmd)); + cmd.u.aenq.enabled_groups = groups; + err = efa_com_set_feature(edev, &set_resp, &cmd, + EFA_ADMIN_AENQ_CONFIG); + if (unlikely(err)) { + pr_err("Failed to set aenq attributes: %d\n", err); + return err; + } + + return 0; +} From patchwork Tue Dec 4 12:04:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711641 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2A0D213AF for ; Tue, 4 Dec 2018 12:05:30 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1C7F22A35B for ; Tue, 4 Dec 2018 12:05:30 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 10B1A2AE4E; Tue, 4 Dec 2018 12:05:30 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF96E2A35B for ; Tue, 4 Dec 2018 12:05:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726152AbeLDMF3 (ORCPT ); Tue, 4 Dec 2018 07:05:29 -0500 Received: from smtp-fw-2101.amazon.com ([72.21.196.25]:31252 "EHLO smtp-fw-2101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726097AbeLDMF3 (ORCPT ); Tue, 4 Dec 2018 07:05:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925128; x=1575461128; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=w/dER8QoYguplwOH8reFdZEVsBcMeXkMiMNLzS3caUU=; b=uNzBeiXs+4tpU5nX1Bo4rNKpHhSWZUDNlxkTEt6LM04GcDAxuCAxHpqT GNEpslytX0vtDO9MOggd+fEdBMKwmP5EQ5fZhRidNcS43/QeUjwWNZjke ymbfDgRH+1JSe0UQYxWXLWWpGpumxrjHkwEaz368Pe7Cd3rk/AgUgGg0Y Y=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="707203105" Received: from iad6-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-1e-97fdccfd.us-east-1.amazon.com) ([10.124.125.2]) by smtp-border-fw-out-2101.iad2.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:28 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1e-97fdccfd.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5IWN071272 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:27 GMT Received: from EX13D02EUC003.ant.amazon.com (10.43.164.10) by EX13MTAUEA001.ant.amazon.com (10.43.61.82) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:14 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D02EUC003.ant.amazon.com (10.43.164.10) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:13 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:10 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 10/13] RDMA/efa: Add bitmap allocation service Date: Tue, 4 Dec 2018 14:04:26 +0200 Message-ID: <1543925069-8838-11-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Bitmap allocation service is currently used for assigning Protection Domain (PD) numbers. Signed-off-by: Gal Pressman --- drivers/infiniband/hw/efa/efa_bitmap.c | 76 ++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 drivers/infiniband/hw/efa/efa_bitmap.c diff --git a/drivers/infiniband/hw/efa/efa_bitmap.c b/drivers/infiniband/hw/efa/efa_bitmap.c new file mode 100644 index 000000000000..251cc68d25f5 --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_bitmap.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* + * Copyright 2006, 2007 Cisco Systems, Inc. All rights reserved. + * Copyright 2007, 2008 Mellanox Technologies. All rights reserved. + * Copyright 2018 Amazon.com, Inc. or its affiliates. 
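+ *
+ * Simple spinlock-protected ID allocator: efa_bitmap_alloc() scans for a
+ * zero bit starting right after the last successful allocation and wraps
+ * around, so IDs are handed out round-robin; the table size must be a
+ * power of two. The driver currently uses it to assign Protection Domain
+ * numbers. Typical usage (illustrative sketch, "max_pd" is a placeholder,
+ * not part of this patch):
+ *
+ *	struct efa_bitmap pd_bitmap;
+ *	u32 pdn;
+ *	int err;
+ *
+ *	err = efa_bitmap_init(&pd_bitmap, roundup_pow_of_two(max_pd));
+ *	if (err)
+ *		return err;
+ *	pdn = efa_bitmap_alloc(&pd_bitmap);
+ *	if (pdn == EFA_BITMAP_INVAL)
+ *		return -ENOMEM;
+ *	...
+ *	efa_bitmap_free(&pd_bitmap, pdn);
+ *	efa_bitmap_cleanup(&pd_bitmap);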
+ */ + +#include + +#include "efa.h" + +u32 efa_bitmap_alloc(struct efa_bitmap *bitmap) +{ + u32 obj; + + spin_lock(&bitmap->lock); + + obj = find_next_zero_bit(bitmap->table, bitmap->max, bitmap->last); + if (obj >= bitmap->max) + obj = find_first_zero_bit(bitmap->table, bitmap->max); + + if (obj < bitmap->max) { + set_bit(obj, bitmap->table); + bitmap->last = obj + 1; + if (bitmap->last == bitmap->max) + bitmap->last = 0; + } else { + obj = EFA_BITMAP_INVAL; + } + + if (obj != EFA_BITMAP_INVAL) + --bitmap->avail; + + spin_unlock(&bitmap->lock); + + return obj; +} + +void efa_bitmap_free(struct efa_bitmap *bitmap, u32 obj) +{ + obj &= bitmap->mask; + + spin_lock(&bitmap->lock); + bitmap_clear(bitmap->table, obj, 1); + bitmap->avail++; + spin_unlock(&bitmap->lock); +} + +u32 efa_bitmap_avail(struct efa_bitmap *bitmap) +{ + return bitmap->avail; +} + +int efa_bitmap_init(struct efa_bitmap *bitmap, u32 num) +{ + /* num must be a power of 2 */ + if (!is_power_of_2(num)) + return -EINVAL; + + bitmap->last = 0; + bitmap->max = num; + bitmap->mask = num - 1; + bitmap->avail = num; + spin_lock_init(&bitmap->lock); + bitmap->table = kcalloc(BITS_TO_LONGS(bitmap->max), + sizeof(long), GFP_KERNEL); + if (!bitmap->table) + return -ENOMEM; + + return 0; +} + +void efa_bitmap_cleanup(struct efa_bitmap *bitmap) +{ + kfree(bitmap->table); +} From patchwork Tue Dec 4 12:04:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711651 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F11B31731 for ; Tue, 4 Dec 2018 12:05:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DF7122A35B for ; Tue, 4 Dec 2018 12:05:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D3AD82AF03; Tue, 4 Dec 2018 12:05:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3D1472AD6B for ; Tue, 4 Dec 2018 12:05:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726194AbeLDMFj (ORCPT ); Tue, 4 Dec 2018 07:05:39 -0500 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:27302 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725956AbeLDMFi (ORCPT ); Tue, 4 Dec 2018 07:05:38 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925131; x=1575461131; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=ocNdH4AWlxvGFFirRZEJ+ywIQsEk50DuOQF20RK9bxI=; b=h+8eixbxzNqPIIpdFDPlmxMrGHlD0JpUZgBWy/g77l0DbJrXS5kNnrrv IqyLj8x2sdfNY3iL2cdT8jybTQdaeE3ia4do6xtlcA5IuNW/gCkLeeeBy 4utg18Ox2F2nMSYqhXSYdcijFEDrRielDIC8qKm5SX7LAEbxaJmjYGVVT U=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="645915032" Received: from sea3-co-svc-lb6-vlan3.sea.amazon.com (HELO email-inbound-relay-1d-37fd6b3d.us-east-1.amazon.com) ([10.47.22.38]) by smtp-border-fw-out-9102.sea19.amazon.com with 
ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:30 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan3.iad.amazon.com [10.40.159.166]) by email-inbound-relay-1d-37fd6b3d.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5STj055450 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:29 GMT Received: from EX13D19EUB001.ant.amazon.com (10.43.166.229) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:16 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D19EUB001.ant.amazon.com (10.43.166.229) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:15 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:13 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 11/13] RDMA/efa: Add EFA verbs implementation Date: Tue, 4 Dec 2018 14:04:27 +0200 Message-ID: <1543925069-8838-12-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add a file that implements the EFA verbs. Signed-off-by: Gal Pressman --- drivers/infiniband/hw/efa/efa_verbs.c | 1827 +++++++++++++++++++++++++++++++++ 1 file changed, 1827 insertions(+) create mode 100644 drivers/infiniband/hw/efa/efa_verbs.c diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c new file mode 100644 index 000000000000..ec887648060e --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_verbs.c @@ -0,0 +1,1827 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. 
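+ *
+ * Userspace does not get queue and doorbell memory mapped directly by the
+ * verbs calls below; each mapping is described by an efa_mmap_entry kept
+ * on the ucontext's pending_mmaps list (under ucontext->lock) until it is
+ * consumed, and the EFA_MMAP_*_BAR_MEMORY_FLAG bits (bits 61-63)
+ * distinguish doorbell/register/memory BAR mappings from plain host
+ * memory mappings such as the RQ and CQ buffers.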
+ */ + +#include + +#include +#include +#include +#include +#include + +#include "efa.h" + +#define EFA_MMAP_DB_BAR_MEMORY_FLAG BIT(61) +#define EFA_MMAP_REG_BAR_MEMORY_FLAG BIT(62) +#define EFA_MMAP_MEM_BAR_MEMORY_FLAG BIT(63) +#define EFA_MMAP_BARS_MEMORY_MASK \ + (EFA_MMAP_REG_BAR_MEMORY_FLAG | EFA_MMAP_MEM_BAR_MEMORY_FLAG | \ + EFA_MMAP_DB_BAR_MEMORY_FLAG) + +struct efa_ucontext { + struct ib_ucontext ibucontext; + /* Protects ucontext state */ + struct mutex lock; + struct list_head link; + struct list_head pending_mmaps; + u64 mmap_key; +}; + +#define EFA_AENQ_ENABLED_GROUPS \ + (BIT(EFA_ADMIN_FATAL_ERROR) | BIT(EFA_ADMIN_WARNING) | \ + BIT(EFA_ADMIN_NOTIFICATION) | BIT(EFA_ADMIN_KEEP_ALIVE)) + +struct efa_pd { + struct ib_pd ibpd; + u32 pdn; +}; + +struct efa_mr { + struct ib_mr ibmr; + struct ib_umem *umem; + u64 vaddr; +}; + +struct efa_cq { + struct ib_cq ibcq; + struct efa_ucontext *ucontext; + u16 cq_idx; + dma_addr_t dma_addr; + void *cpu_addr; + size_t size; +}; + +struct efa_qp { + struct ib_qp ibqp; + enum ib_qp_state state; + u32 qp_handle; + dma_addr_t rq_dma_addr; + void *rq_cpu_addr; + size_t rq_size; +}; + +struct efa_ah { + struct ib_ah ibah; + /* dest_addr */ + u8 id[EFA_GID_SIZE]; +}; + +struct efa_ah_id { + struct list_head list; + /* dest_addr */ + u8 id[EFA_GID_SIZE]; + u16 address_handle; + unsigned int ref_count; +}; + +struct efa_mmap_entry { + struct list_head list; + void *obj; + u64 address; + u64 length; + u64 key; +}; + +static void mmap_entry_insert(struct efa_ucontext *ucontext, + struct efa_mmap_entry *entry, + u64 mem_flag); + +static void mmap_obj_entries_remove(struct efa_ucontext *ucontext, + void *obj); + +#define EFA_PAGE_SHIFT 12 +#define EFA_PAGE_SIZE BIT(EFA_PAGE_SHIFT) +#define EFA_PAGE_PTR_SIZE 8 + +#define EFA_CHUNK_ALLOC_SIZE BIT(EFA_PAGE_SHIFT) +#define EFA_CHUNK_PTR_SIZE sizeof(struct efa_com_ctrl_buff_info) + +#define EFA_PAGE_PTRS_PER_CHUNK \ + ((EFA_CHUNK_ALLOC_SIZE - EFA_CHUNK_PTR_SIZE) / EFA_PAGE_PTR_SIZE) + +#define EFA_CHUNK_USED_SIZE \ + ((EFA_PAGE_PTRS_PER_CHUNK * EFA_PAGE_PTR_SIZE) + EFA_CHUNK_PTR_SIZE) + +#define EFA_SUPPORTED_ACCESS_FLAGS IB_ACCESS_LOCAL_WRITE + +struct pbl_chunk { + u64 *buf; + u32 length; + dma_addr_t dma_addr; +}; + +struct pbl_chunk_list { + unsigned int size; + struct pbl_chunk *chunks; +}; + +struct pbl_context { + u64 *pbl_buf; + u32 pbl_buf_size_in_bytes; + bool physically_continuous; + union { + struct { + dma_addr_t dma_addr; + } continuous; + struct { + u32 pbl_buf_size_in_pages; + struct scatterlist *sgl; + int sg_dma_cnt; + struct pbl_chunk_list chunk_list; + } indirect; + } phys; + + struct efa_dev *dev; + struct device *dmadev; +}; + +static inline struct efa_dev *to_edev(struct ib_device *ibdev) +{ + return container_of(ibdev, struct efa_dev, ibdev); +} + +static inline struct efa_ucontext *to_eucontext(struct ib_ucontext *ibucontext) +{ + return container_of(ibucontext, struct efa_ucontext, ibucontext); +} + +static inline struct efa_pd *to_epd(struct ib_pd *ibpd) +{ + return container_of(ibpd, struct efa_pd, ibpd); +} + +static inline struct efa_mr *to_emr(struct ib_mr *ibmr) +{ + return container_of(ibmr, struct efa_mr, ibmr); +} + +static inline struct efa_qp *to_eqp(struct ib_qp *ibqp) +{ + return container_of(ibqp, struct efa_qp, ibqp); +} + +static inline struct efa_cq *to_ecq(struct ib_cq *ibcq) +{ + return container_of(ibcq, struct efa_cq, ibcq); +} + +static inline struct efa_ah *to_eah(struct ib_ah *ibah) +{ + return container_of(ibah, struct efa_ah, ibah); +} + +#define 
field_avail(x, fld, sz) (offsetof(typeof(x), fld) + \ + sizeof(((typeof(x) *)0)->fld) <= (sz)) + +#define EFA_IS_RESERVED_CLEARED(reserved) \ + !memchr_inv(reserved, 0, sizeof(reserved)) + +int efa_query_device(struct ib_device *ibdev, + struct ib_device_attr *props, + struct ib_udata *udata) +{ + struct efa_ibv_ex_query_device_resp resp = {}; + struct efa_com_get_device_attr_result result; + struct efa_dev *dev = to_edev(ibdev); + int err; + + pr_debug("--->\n"); + memset(props, 0, sizeof(*props)); + + if (udata && udata->inlen && + !ib_is_udata_cleared(udata, 0, udata->inlen)) { + pr_err_ratelimited("Incompatible ABI params, udata not cleared\n"); + return -EINVAL; + } + + err = efa_get_device_attributes(dev, &result); + if (err) { + pr_err("failed to get device_attr err[%d]!\n", err); + return err; + } + + props->max_mr_size = result.max_mr_pages * PAGE_SIZE; + props->page_size_cap = result.page_size_cap; + props->vendor_id = result.vendor_id; + props->vendor_part_id = result.vendor_part_id; + props->hw_ver = dev->pdev->subsystem_device; + props->max_qp = result.max_sq; + props->device_cap_flags = IB_DEVICE_PORT_ACTIVE_EVENT | + IB_DEVICE_VIRTUAL_FUNCTION | + IB_DEVICE_BLOCK_MULTICAST_LOOPBACK; + props->max_cq = result.max_cq; + props->max_pd = result.max_pd; + props->max_mr = result.max_mr; + props->max_ah = result.max_ah; + props->max_cqe = result.max_cq_depth; + props->max_qp_wr = min_t(u16, result.max_sq_depth, + result.max_rq_depth); + props->max_send_sge = result.max_sq_sge; + props->max_recv_sge = result.max_rq_sge; + + if (udata && udata->outlen) { + resp.sub_cqs_per_cq = result.sub_cqs_per_cq; + resp.max_sq_sge = result.max_sq_sge; + resp.max_rq_sge = result.max_rq_sge; + resp.max_sq_wr = result.max_sq_depth; + resp.max_rq_wr = result.max_rq_depth; + resp.max_inline_data = result.inline_buf_size; + + err = ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen)); + if (err) { + pr_err_ratelimited("failed to copy udata for query_device.\n"); + return err; + } + } + + return err; +} + +int efa_query_port(struct ib_device *ibdev, u8 port, + struct ib_port_attr *props) +{ + struct efa_dev *dev = to_edev(ibdev); + + pr_debug("--->\n"); + + mutex_lock(&dev->efa_dev_lock); + memset(props, 0, sizeof(*props)); + + props->lid = 0; + props->lmc = 1; + props->sm_lid = 0; + props->sm_sl = 0; + + props->state = IB_PORT_ACTIVE; + props->phys_state = 5; + props->port_cap_flags = 0; + props->gid_tbl_len = 1; + props->pkey_tbl_len = 1; + props->bad_pkey_cntr = 0; + props->qkey_viol_cntr = 0; + props->active_speed = IB_SPEED_EDR; + props->active_width = IB_WIDTH_4X; + props->max_mtu = ib_mtu_int_to_enum(dev->mtu); + props->active_mtu = ib_mtu_int_to_enum(dev->mtu); + props->max_msg_sz = dev->mtu; + props->max_vl_num = 1; + mutex_unlock(&dev->efa_dev_lock); + return 0; +} + +int efa_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, + int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr) +{ + struct efa_qp *qp = to_eqp(ibqp); + + pr_debug("--->\n"); + + memset(qp_attr, 0, sizeof(*qp_attr)); + memset(qp_init_attr, 0, sizeof(*qp_init_attr)); + + qp_attr->qp_state = qp->state; + qp_attr->cur_qp_state = qp->state; + qp_attr->port_num = 1; + + qp_init_attr->qp_type = ibqp->qp_type; + qp_init_attr->recv_cq = ibqp->recv_cq; + qp_init_attr->send_cq = ibqp->send_cq; + + return 0; +} + +int efa_query_gid(struct ib_device *ibdev, u8 port, int index, + union ib_gid *gid) +{ + struct efa_dev *dev = to_edev(ibdev); + + pr_debug("port %d gid index %d\n", port, index); + + if (index > 1) + 
return -EINVAL; + + mutex_lock(&dev->efa_dev_lock); + memcpy(gid->raw, dev->addr, sizeof(dev->addr)); + mutex_unlock(&dev->efa_dev_lock); + + return 0; +} + +int efa_query_pkey(struct ib_device *ibdev, u8 port, u16 index, + u16 *pkey) +{ + pr_debug("--->\n"); + if (index > 1) + return -EINVAL; + + *pkey = 0xffff; + return 0; +} + +struct ib_pd *efa_alloc_pd(struct ib_device *ibdev, + struct ib_ucontext *ibucontext, + struct ib_udata *udata) +{ + struct efa_ibv_alloc_pd_resp resp = {}; + struct efa_dev *dev = to_edev(ibdev); + struct efa_pd *pd; + int err; + + pr_debug("--->\n"); + + if (!ibucontext) { + pr_err("ibucontext is not valid\n"); + return ERR_PTR(-EOPNOTSUPP); + } + + if (udata && udata->inlen && + !ib_is_udata_cleared(udata, 0, udata->inlen)) { + pr_err_ratelimited("Incompatible ABI params, udata not cleared\n"); + return ERR_PTR(-EINVAL); + } + + pd = kzalloc(sizeof(*pd), GFP_KERNEL); + if (!pd) { + dev->stats.sw_stats.alloc_pd_alloc_err++; + return ERR_PTR(-ENOMEM); + } + + pd->pdn = efa_bitmap_alloc(&dev->pd_bitmap); + if (pd->pdn == EFA_BITMAP_INVAL) { + pr_err("Failed to alloc PD (max_pd %u)\n", dev->caps.max_pd); + dev->stats.sw_stats.alloc_pd_bitmap_full_err++; + kfree(pd); + return ERR_PTR(-ENOMEM); + } + + resp.pdn = pd->pdn; + + if (udata && udata->outlen) { + err = ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen)); + if (err) { + pr_err_ratelimited("failed to copy udata for alloc_pd\n"); + efa_bitmap_free(&dev->pd_bitmap, pd->pdn); + kfree(pd); + return ERR_PTR(err); + } + } + + pr_debug("Allocated pd[%d]\n", pd->pdn); + + return &pd->ibpd; +} + +int efa_dealloc_pd(struct ib_pd *ibpd) +{ + struct efa_dev *dev = to_edev(ibpd->device); + struct efa_pd *pd = to_epd(ibpd); + + pr_debug("Dealloc pd[%d]\n", pd->pdn); + efa_bitmap_free(&dev->pd_bitmap, pd->pdn); + kfree(pd); + + return 0; +} + +int efa_destroy_qp_handle(struct efa_dev *dev, u32 qp_handle) +{ + struct efa_com_destroy_qp_params params = { .qp_handle = qp_handle }; + + return efa_com_destroy_qp(dev->edev, ¶ms); +} + +int efa_destroy_qp(struct ib_qp *ibqp) +{ + struct efa_dev *dev = to_edev(ibqp->pd->device); + struct efa_qp *qp = to_eqp(ibqp); + struct efa_ucontext *ucontext; + + pr_debug("Destroy qp[%u]\n", ibqp->qp_num); + ucontext = ibqp->pd->uobject ? 
+ to_eucontext(ibqp->pd->uobject->context) : + NULL; + + if (!ucontext) + return -EOPNOTSUPP; + + efa_destroy_qp_handle(dev, qp->qp_handle); + mmap_obj_entries_remove(ucontext, qp); + + if (qp->rq_cpu_addr) { + pr_debug("qp->cpu_addr[%p] freed: size[%lu], dma[%pad]\n", + qp->rq_cpu_addr, qp->rq_size, + &qp->rq_dma_addr); + dma_free_coherent(&dev->pdev->dev, qp->rq_size, + qp->rq_cpu_addr, qp->rq_dma_addr); + } + + kfree(qp); + return 0; +} + +static int qp_mmap_entries_setup(struct efa_qp *qp, + struct efa_dev *dev, + struct efa_ucontext *ucontext, + struct efa_com_create_qp_params *params, + struct efa_ibv_create_qp_resp *resp) +{ + struct efa_mmap_entry *rq_db_entry; + struct efa_mmap_entry *sq_db_entry; + struct efa_mmap_entry *rq_entry; + struct efa_mmap_entry *sq_entry; + + sq_db_entry = kzalloc(sizeof(*sq_db_entry), GFP_KERNEL); + sq_entry = kzalloc(sizeof(*sq_entry), GFP_KERNEL); + if (!sq_db_entry || !sq_entry) { + dev->stats.sw_stats.mmap_entry_alloc_err++; + goto err_alloc; + } + + if (qp->rq_size) { + rq_entry = kzalloc(sizeof(*rq_entry), GFP_KERNEL); + rq_db_entry = kzalloc(sizeof(*rq_db_entry), GFP_KERNEL); + if (!rq_entry || !rq_db_entry) { + dev->stats.sw_stats.mmap_entry_alloc_err++; + goto err_alloc_rq; + } + + rq_db_entry->obj = qp; + rq_entry->obj = qp; + + rq_entry->address = virt_to_phys(qp->rq_cpu_addr); + rq_entry->length = qp->rq_size; + mmap_entry_insert(ucontext, rq_entry, 0); + resp->rq_mmap_key = rq_entry->key; + resp->rq_mmap_size = qp->rq_size; + + rq_db_entry->address = dev->db_bar_addr + + resp->rq_db_offset; + rq_db_entry->length = PAGE_SIZE; + mmap_entry_insert(ucontext, rq_db_entry, + EFA_MMAP_DB_BAR_MEMORY_FLAG); + resp->rq_db_mmap_key = rq_db_entry->key; + resp->rq_db_offset &= ~PAGE_MASK; + } + + sq_db_entry->obj = qp; + sq_entry->obj = qp; + + sq_db_entry->address = dev->db_bar_addr + resp->sq_db_offset; + resp->sq_db_offset &= ~PAGE_MASK; + sq_db_entry->length = PAGE_SIZE; + mmap_entry_insert(ucontext, sq_db_entry, EFA_MMAP_DB_BAR_MEMORY_FLAG); + resp->sq_db_mmap_key = sq_db_entry->key; + + sq_entry->address = dev->mem_bar_addr + resp->llq_desc_offset; + resp->llq_desc_offset &= ~PAGE_MASK; + sq_entry->length = PAGE_ALIGN(params->sq_ring_size_in_bytes + + resp->llq_desc_offset); + mmap_entry_insert(ucontext, sq_entry, EFA_MMAP_MEM_BAR_MEMORY_FLAG); + resp->llq_desc_mmap_key = sq_entry->key; + + return 0; + +err_alloc_rq: + kfree(rq_entry); + kfree(rq_db_entry); +err_alloc: + kfree(sq_entry); + kfree(sq_db_entry); + return -ENOMEM; +} + +static int efa_qp_validate_cap(struct efa_dev *dev, + struct ib_qp_init_attr *init_attr) +{ + if (init_attr->cap.max_send_wr > dev->caps.max_sq_depth) { + pr_err("qp: requested send wr[%u] exceeds the max[%u]\n", + init_attr->cap.max_send_wr, + dev->caps.max_sq_depth); + return -EINVAL; + } + if (init_attr->cap.max_recv_wr > dev->caps.max_rq_depth) { + pr_err("qp: requested receive wr[%u] exceeds the max[%u]\n", + init_attr->cap.max_recv_wr, + dev->caps.max_rq_depth); + return -EINVAL; + } + if (init_attr->cap.max_send_sge > dev->caps.max_sq_sge) { + pr_err("qp: requested sge send[%u] exceeds the max[%u]\n", + init_attr->cap.max_send_sge, dev->caps.max_sq_sge); + return -EINVAL; + } + if (init_attr->cap.max_recv_sge > dev->caps.max_rq_sge) { + pr_err("qp: requested sge recv[%u] exceeds the max[%u]\n", + init_attr->cap.max_recv_sge, dev->caps.max_rq_sge); + return -EINVAL; + } + if (init_attr->cap.max_inline_data > dev->caps.inline_buf_size) { + pr_warn("requested inline data[%u] exceeds the max[%u]\n", + 
init_attr->cap.max_inline_data, + dev->caps.inline_buf_size); + return -EINVAL; + } + + return 0; +} + +struct ib_qp *efa_create_qp(struct ib_pd *ibpd, + struct ib_qp_init_attr *init_attr, + struct ib_udata *udata) +{ + struct efa_com_create_qp_params create_qp_params = {}; + struct efa_com_create_qp_result create_qp_resp; + struct efa_dev *dev = to_edev(ibpd->device); + struct efa_ibv_create_qp_resp resp = {}; + struct efa_ibv_create_qp cmd = {}; + struct efa_ucontext *ucontext; + struct efa_qp *qp; + int err; + + ucontext = ibpd->uobject ? to_eucontext(ibpd->uobject->context) : + NULL; + + err = efa_qp_validate_cap(dev, init_attr); + if (err) + return ERR_PTR(err); + + if (!ucontext) + return ERR_PTR(-EOPNOTSUPP); + + if (init_attr->qp_type != IB_QPT_UD && + init_attr->qp_type != IB_QPT_SRD) { + pr_err("unsupported qp type %d\n", init_attr->qp_type); + return ERR_PTR(-EINVAL); + } + + if (!udata || !field_avail(cmd, srd_qp, udata->inlen)) { + pr_err_ratelimited("Incompatible ABI params, no input udata\n"); + return ERR_PTR(-EINVAL); + } + + if (udata->inlen > sizeof(cmd) && + !ib_is_udata_cleared(udata, sizeof(cmd), + udata->inlen - sizeof(cmd))) { + pr_err_ratelimited("Incompatible ABI params, unknown fields in udata\n"); + return ERR_PTR(-EINVAL); + } + + err = ib_copy_from_udata(&cmd, udata, + min(sizeof(cmd), udata->inlen)); + if (err) { + pr_err_ratelimited("%s: cannot copy udata for create_qp\n", + dev_name(&dev->ibdev.dev)); + return ERR_PTR(err); + } + + if (cmd.comp_mask) { + pr_err_ratelimited("Incompatible ABI params, unknown fields in udata\n"); + return ERR_PTR(-EINVAL); + } + + qp = kzalloc(sizeof(*qp), GFP_KERNEL); + if (!qp) { + dev->stats.sw_stats.create_qp_alloc_err++; + return ERR_PTR(-ENOMEM); + } + + create_qp_params.pd = to_epd(ibpd)->pdn; + if (init_attr->qp_type == IB_QPT_SRD) + create_qp_params.qp_type = EFA_ADMIN_QP_TYPE_SRD; + else + create_qp_params.qp_type = EFA_ADMIN_QP_TYPE_UD; + + pr_debug("create QP, qp type %d srd qp %d\n", + init_attr->qp_type, cmd.srd_qp); + create_qp_params.send_cq_idx = to_ecq(init_attr->send_cq)->cq_idx; + create_qp_params.recv_cq_idx = to_ecq(init_attr->recv_cq)->cq_idx; + create_qp_params.sq_depth = cmd.sq_depth; + create_qp_params.sq_ring_size_in_bytes = cmd.sq_ring_size; + + create_qp_params.rq_ring_size_in_bytes = cmd.rq_entries * + cmd.rq_entry_size; + qp->rq_size = PAGE_ALIGN(create_qp_params.rq_ring_size_in_bytes); + if (qp->rq_size) { + qp->rq_cpu_addr = dma_zalloc_coherent(&dev->pdev->dev, + qp->rq_size, + &qp->rq_dma_addr, + GFP_KERNEL); + if (!qp->rq_cpu_addr) { + dev->stats.sw_stats.create_qp_alloc_err++; + err = -ENOMEM; + goto err_free_qp; + } + pr_debug("qp->cpu_addr[%p] allocated: size[%lu], dma[%pad]\n", + qp->rq_cpu_addr, qp->rq_size, &qp->rq_dma_addr); + create_qp_params.rq_base_addr = qp->rq_dma_addr; + } + + memset(&resp, 0, sizeof(resp)); + err = efa_com_create_qp(dev->edev, &create_qp_params, + &create_qp_resp); + if (err) { + pr_err("failed to create qp %d\n", err); + err = -EINVAL; + goto err_free_dma; + } + + WARN_ON_ONCE(create_qp_resp.sq_db_offset > dev->db_bar_len); + WARN_ON_ONCE(create_qp_resp.rq_db_offset > dev->db_bar_len); + WARN_ON_ONCE(create_qp_resp.llq_descriptors_offset > + dev->mem_bar_len); + + resp.sq_db_offset = create_qp_resp.sq_db_offset; + resp.rq_db_offset = create_qp_resp.rq_db_offset; + resp.llq_desc_offset = create_qp_resp.llq_descriptors_offset; + resp.send_sub_cq_idx = create_qp_resp.send_sub_cq_idx; + resp.recv_sub_cq_idx = create_qp_resp.recv_sub_cq_idx; + + err = 
qp_mmap_entries_setup(qp, dev, ucontext, &create_qp_params, + &resp); + if (err) + goto err_destroy_qp; + + qp->qp_handle = create_qp_resp.qp_handle; + qp->ibqp.qp_num = create_qp_resp.qp_num; + qp->ibqp.qp_type = init_attr->qp_type; + + if (udata && udata->outlen) { + err = ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen)); + if (err) { + pr_err_ratelimited("failed to copy udata for qp[%u]", + create_qp_resp.qp_num); + goto err_mmap_remove; + } + } + + pr_debug("Created qp[%d]\n", qp->ibqp.qp_num); + + return &qp->ibqp; + +err_mmap_remove: + mmap_obj_entries_remove(ucontext, qp); +err_destroy_qp: + efa_destroy_qp_handle(dev, create_qp_resp.qp_handle); +err_free_dma: + if (qp->rq_size) { + pr_debug("qp->cpu_addr[%p] freed: size[%lu], dma[%pad]\n", + qp->rq_cpu_addr, qp->rq_size, &qp->rq_dma_addr); + dma_free_coherent(&dev->pdev->dev, qp->rq_size, + qp->rq_cpu_addr, qp->rq_dma_addr); + } +err_free_qp: + kfree(qp); + return ERR_PTR(err); +} + +static int efa_destroy_cq_idx(struct efa_dev *dev, int cq_idx) +{ + struct efa_com_destroy_cq_params params = { .cq_idx = cq_idx }; + + return efa_com_destroy_cq(dev->edev, ¶ms); +} + +int efa_destroy_cq(struct ib_cq *ibcq) +{ + struct efa_dev *dev = to_edev(ibcq->device); + struct efa_cq *cq = to_ecq(ibcq); + + pr_debug("Destroy cq[%d] virt[%p] freed: size[%lu], dma[%pad]\n", + cq->cq_idx, cq->cpu_addr, cq->size, &cq->dma_addr); + if (!cq->ucontext) + return -EOPNOTSUPP; + + efa_destroy_cq_idx(dev, cq->cq_idx); + + mmap_obj_entries_remove(cq->ucontext, cq); + dma_free_coherent(&dev->pdev->dev, cq->size, + cq->cpu_addr, cq->dma_addr); + + kfree(cq); + return 0; +} + +static int cq_mmap_entries_setup(struct efa_cq *cq, + struct efa_ibv_create_cq_resp *resp) +{ + struct efa_mmap_entry *cq_entry; + + cq_entry = kzalloc(sizeof(*cq_entry), GFP_KERNEL); + if (!cq_entry) + return -ENOMEM; + + cq_entry->obj = cq; + + cq_entry->address = virt_to_phys(cq->cpu_addr); + cq_entry->length = cq->size; + mmap_entry_insert(cq->ucontext, cq_entry, 0); + resp->q_mmap_key = cq_entry->key; + resp->q_mmap_size = cq_entry->length; + + return 0; +} + +static struct ib_cq *do_create_cq(struct ib_device *ibdev, int entries, + int vector, struct ib_ucontext *ibucontext, + struct ib_udata *udata) +{ + struct efa_ibv_create_cq_resp resp = {}; + struct efa_com_create_cq_params params; + struct efa_com_create_cq_result result; + struct efa_dev *dev = to_edev(ibdev); + struct efa_ibv_create_cq cmd = {}; + struct efa_cq *cq; + int err; + + pr_debug("entries %d udata %p\n", entries, udata); + + if (entries < 1 || entries > dev->caps.max_cq_depth) { + pr_err("cq: requested entries[%u] non-positive or greater than max[%u]\n", + entries, dev->caps.max_cq_depth); + return ERR_PTR(-EINVAL); + } + + if (!ibucontext) { + pr_err("context is not valid "); + return ERR_PTR(-EOPNOTSUPP); + } + + if (!udata || !field_avail(cmd, num_sub_cqs, udata->inlen)) { + pr_err_ratelimited("Incompatible ABI params, no input udata\n"); + return ERR_PTR(-EINVAL); + } + + if (udata->inlen > sizeof(cmd) && + !ib_is_udata_cleared(udata, sizeof(cmd), + udata->inlen - sizeof(cmd))) { + pr_err_ratelimited("Incompatible ABI params, unknown fields in udata\n"); + return ERR_PTR(-EINVAL); + } + + err = ib_copy_from_udata(&cmd, udata, + min(sizeof(cmd), udata->inlen)); + if (err) { + pr_err_ratelimited("%s: cannot copy udata for create_cq\n", + dev_name(&dev->ibdev.dev)); + return ERR_PTR(err); + } + + if (cmd.comp_mask || !EFA_IS_RESERVED_CLEARED(cmd.reserved_50)) { + pr_err_ratelimited("Incompatible 
ABI params, unknown fields in udata\n"); + return ERR_PTR(-EINVAL); + } + + if (!cmd.cq_entry_size) { + pr_err("invalid entry size [%u]\n", cmd.cq_entry_size); + return ERR_PTR(-EINVAL); + } + + if (cmd.num_sub_cqs != dev->caps.sub_cqs_per_cq) { + pr_err("invalid number of sub cqs[%u] expected[%u]\n", + cmd.num_sub_cqs, dev->caps.sub_cqs_per_cq); + return ERR_PTR(-EINVAL); + } + + cq = kzalloc(sizeof(*cq), GFP_KERNEL); + if (!cq) { + dev->stats.sw_stats.create_cq_alloc_err++; + return ERR_PTR(-ENOMEM); + } + + memset(&resp, 0, sizeof(resp)); + cq->ucontext = to_eucontext(ibucontext); + cq->size = PAGE_ALIGN(cmd.cq_entry_size * entries * cmd.num_sub_cqs); + cq->cpu_addr = dma_zalloc_coherent(&dev->pdev->dev, + cq->size, &cq->dma_addr, + GFP_KERNEL); + if (!cq->cpu_addr) { + dev->stats.sw_stats.create_cq_alloc_err++; + err = -ENOMEM; + goto err_free_cq; + } + pr_debug("cq->cpu_addr[%p] allocated: size[%lu], dma[%pad]\n", + cq->cpu_addr, cq->size, &cq->dma_addr); + + params.cq_depth = entries; + params.dma_addr = cq->dma_addr; + params.entry_size_in_bytes = cmd.cq_entry_size; + params.num_sub_cqs = cmd.num_sub_cqs; + err = efa_com_create_cq(dev->edev, ¶ms, &result); + if (err) { + pr_err("failed to create cq [%d]!\n", err); + goto err_free_dma; + } + + resp.cq_idx = result.cq_idx; + cq->cq_idx = result.cq_idx; + cq->ibcq.cqe = result.actual_depth; + WARN_ON_ONCE(entries != result.actual_depth); + + err = cq_mmap_entries_setup(cq, &resp); + if (err) { + pr_err("could not setup cq[%u] mmap entries!\n", cq->cq_idx); + goto err_destroy_cq; + } + + if (udata && udata->outlen) { + err = ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen)); + if (err) { + pr_err_ratelimited("failed to copy udata for %s", + dev_name(&dev->ibdev.dev)); + goto err_mmap_remove; + } + } + + pr_debug("Created cq[%d], cq depth[%u]. 
dma[%pad] virt[%p]\n", + cq->cq_idx, result.actual_depth, &cq->dma_addr, cq->cpu_addr); + + return &cq->ibcq; + +err_mmap_remove: + mmap_obj_entries_remove(to_eucontext(ibucontext), cq); +err_destroy_cq: + efa_destroy_cq_idx(dev, cq->cq_idx); +err_free_dma: + pr_debug("cq->cpu_addr[%p] freed: size[%lu], dma[%pad]\n", + cq->cpu_addr, cq->size, &cq->dma_addr); + dma_free_coherent(&dev->pdev->dev, cq->size, cq->cpu_addr, + cq->dma_addr); +err_free_cq: + kfree(cq); + return ERR_PTR(err); +} + +struct ib_cq *efa_create_cq(struct ib_device *ibdev, + const struct ib_cq_init_attr *attr, + struct ib_ucontext *ibucontext, + struct ib_udata *udata) +{ + pr_debug("--->\n"); + return do_create_cq(ibdev, attr->cqe, attr->comp_vector, ibucontext, + udata); +} + +static int umem_to_page_list(struct ib_umem *umem, + u64 *page_list, + u32 hp_cnt, + u8 hp_shift) +{ + u32 pages_in_hp = BIT(hp_shift - PAGE_SHIFT); + unsigned int page_idx = 0; + unsigned int hp_idx = 0; + struct scatterlist *sg; + unsigned int entry; + + if (umem->page_shift != PAGE_SHIFT) + return -EINVAL; + + pr_debug("hp_cnt[%u], pages_in_hp[%u]\n", hp_cnt, pages_in_hp); + + for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { + if (unlikely(sg_dma_len(sg) != PAGE_SIZE)) { + pr_err("sg_dma_len[%u] != PAGE_SIZE[%lu]\n", + sg_dma_len(sg), PAGE_SIZE); + return -EINVAL; + } + + if (page_idx % pages_in_hp == 0) { + page_list[hp_idx] = sg_dma_address(sg); + hp_idx++; + } + page_idx++; + } + + return 0; +} + +static struct scatterlist *efa_vmalloc_buf_to_sg(u64 *buf, int page_cnt) +{ + struct scatterlist *sglist; + struct page *pg; + int i; + + sglist = kcalloc(page_cnt, sizeof(*sglist), GFP_KERNEL); + if (!sglist) + return NULL; + sg_init_table(sglist, page_cnt); + for (i = 0; i < page_cnt; i++) { + pg = vmalloc_to_page(buf); + if (!pg) + goto err; + WARN_ON_ONCE(PageHighMem(pg)); + sg_set_page(&sglist[i], pg, EFA_PAGE_SIZE, 0); + buf = (u64 *)((u8 *)buf + EFA_PAGE_SIZE); + } + return sglist; + +err: + kfree(sglist); + return NULL; +} + +/* + * create a chunk list of physical pages dma addresses from the supplied + * scatter gather list + */ +static int pbl_chunk_list_create(struct pbl_context *pbl) +{ + unsigned int entry, npg_in_sg, chunk_list_size, chunk_idx, page_idx; + struct pbl_chunk_list *chunk_list = &pbl->phys.indirect.chunk_list; + int page_cnt = pbl->phys.indirect.pbl_buf_size_in_pages; + struct scatterlist *pages_sgl = pbl->phys.indirect.sgl; + int sg_dma_cnt = pbl->phys.indirect.sg_dma_cnt; + struct efa_com_ctrl_buff_info *ctrl_buf; + u64 *cur_chunk_buf, *prev_chunk_buf; + struct scatterlist *sg; + dma_addr_t dma_addr; + int i; + + /* allocate a chunk list that consists of 4KB chunks */ + chunk_list_size = DIV_ROUND_UP(page_cnt, EFA_PAGE_PTRS_PER_CHUNK); + + chunk_list->size = chunk_list_size; + chunk_list->chunks = kcalloc(chunk_list_size, + sizeof(*chunk_list->chunks), + GFP_KERNEL); + if (!chunk_list->chunks) + return -ENOMEM; + + pr_debug("chunk_list_size[%u] - pages[%u]\n", chunk_list_size, + page_cnt); + + /* allocate chunk buffers: */ + for (i = 0; i < chunk_list_size; i++) { + chunk_list->chunks[i].buf = kzalloc(EFA_CHUNK_ALLOC_SIZE, + GFP_KERNEL); + if (!chunk_list->chunks[i].buf) + goto chunk_list_dealloc; + + chunk_list->chunks[i].length = EFA_CHUNK_USED_SIZE; + } + chunk_list->chunks[chunk_list_size - 1].length = + ((page_cnt % EFA_PAGE_PTRS_PER_CHUNK) * EFA_PAGE_PTR_SIZE) + + EFA_CHUNK_PTR_SIZE; + + /* fill the dma addresses of sg list pages to chunks: */ + chunk_idx = 0; + page_idx = 0; + cur_chunk_buf = 
chunk_list->chunks[0].buf; + for_each_sg(pages_sgl, sg, sg_dma_cnt, entry) { + npg_in_sg = sg_dma_len(sg) >> EFA_PAGE_SHIFT; + for (i = 0; i < npg_in_sg; i++) { + cur_chunk_buf[page_idx++] = sg_dma_address(sg) + + (EFA_PAGE_SIZE * i); + + if (page_idx == EFA_PAGE_PTRS_PER_CHUNK) { + chunk_idx++; + cur_chunk_buf = chunk_list->chunks[chunk_idx].buf; + page_idx = 0; + } + } + } + + /* map chunks to dma and fill chunks next ptrs */ + for (i = chunk_list_size - 1; i >= 0; i--) { + dma_addr = dma_map_single(pbl->dmadev, + chunk_list->chunks[i].buf, + chunk_list->chunks[i].length, + DMA_TO_DEVICE); + if (dma_mapping_error(pbl->dmadev, dma_addr)) { + pr_err("chunk[%u] dma_map_failed\n", i); + goto chunk_list_unmap; + } + + chunk_list->chunks[i].dma_addr = dma_addr; + pr_debug("chunk[%u] mapped at [%pad]\n", i, &dma_addr); + + if (!i) + break; + + prev_chunk_buf = chunk_list->chunks[i - 1].buf; + + ctrl_buf = (struct efa_com_ctrl_buff_info *) + &prev_chunk_buf[EFA_PAGE_PTRS_PER_CHUNK]; + ctrl_buf->length = chunk_list->chunks[i].length; + + efa_com_set_dma_addr(dma_addr, + &ctrl_buf->address.mem_addr_high, + &ctrl_buf->address.mem_addr_low); + } + + return 0; + +chunk_list_unmap: + for (; i < chunk_list_size; i++) { + dma_unmap_single(pbl->dmadev, chunk_list->chunks[i].dma_addr, + chunk_list->chunks[i].length, DMA_TO_DEVICE); + } +chunk_list_dealloc: + for (i = 0; i < chunk_list_size; i++) + kfree(chunk_list->chunks[i].buf); + + kfree(chunk_list->chunks); + return -ENOMEM; +} + +static void pbl_chunk_list_destroy(struct pbl_context *pbl) +{ + struct pbl_chunk_list *chunk_list = &pbl->phys.indirect.chunk_list; + int i; + + for (i = 0; i < chunk_list->size; i++) { + dma_unmap_single(pbl->dmadev, chunk_list->chunks[i].dma_addr, + chunk_list->chunks[i].length, DMA_TO_DEVICE); + kfree(chunk_list->chunks[i].buf); + } + + kfree(chunk_list->chunks); +} + +/* initialize pbl continuous mode: map pbl buffer to a dma address. */ +static int pbl_continuous_initialize(struct pbl_context *pbl) +{ + dma_addr_t dma_addr; + + dma_addr = dma_map_single(pbl->dmadev, pbl->pbl_buf, + pbl->pbl_buf_size_in_bytes, DMA_TO_DEVICE); + if (dma_mapping_error(pbl->dmadev, dma_addr)) { + pr_err("Unable to map pbl to DMA address"); + return -ENOMEM; + } + + pbl->phys.continuous.dma_addr = dma_addr; + pr_debug("pbl continuous - dma_addr = %pad, size[%u]\n", + &dma_addr, pbl->pbl_buf_size_in_bytes); + + return 0; +} + +/* + * initialize pbl indirect mode: + * create a chunk list out of the dma addresses of the physical pages of + * pbl buffer. 
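+ *
+ * The pbl buffer itself was allocated with vzalloc(), so it is only
+ * virtually contiguous: its pages are gathered into a scatterlist,
+ * DMA mapped, and their addresses packed into 4KB chunks. Each chunk
+ * ends with an efa_com_ctrl_buff_info holding the DMA address and
+ * length of the next chunk, so only the first chunk has to be handed
+ * to the device and the rest of the list can be walked from there.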
+ */ +static int pbl_indirect_initialize(struct pbl_context *pbl) +{ + u32 size_in_pages = DIV_ROUND_UP(pbl->pbl_buf_size_in_bytes, + EFA_PAGE_SIZE); + struct scatterlist *sgl; + int sg_dma_cnt, err; + + sgl = efa_vmalloc_buf_to_sg(pbl->pbl_buf, size_in_pages); + if (!sgl) + return -ENOMEM; + + sg_dma_cnt = dma_map_sg(pbl->dmadev, sgl, size_in_pages, DMA_TO_DEVICE); + if (!sg_dma_cnt) { + err = -EINVAL; + goto err_map; + } + + pbl->phys.indirect.pbl_buf_size_in_pages = size_in_pages; + pbl->phys.indirect.sgl = sgl; + pbl->phys.indirect.sg_dma_cnt = sg_dma_cnt; + err = pbl_chunk_list_create(pbl); + if (err) { + pr_err("chunk_list creation failed[%d]!\n", err); + goto err_chunk; + } + + pr_debug("pbl indirect - size[%u], chunks[%u]\n", + pbl->pbl_buf_size_in_bytes, + pbl->phys.indirect.chunk_list.size); + + return 0; + +err_chunk: + dma_unmap_sg(pbl->dmadev, sgl, size_in_pages, DMA_TO_DEVICE); +err_map: + kfree(sgl); + return err; +} + +static void pbl_indirect_terminate(struct pbl_context *pbl) +{ + pbl_chunk_list_destroy(pbl); + dma_unmap_sg(pbl->dmadev, pbl->phys.indirect.sgl, + pbl->phys.indirect.pbl_buf_size_in_pages, DMA_TO_DEVICE); + kfree(pbl->phys.indirect.sgl); +} + +/* create a page buffer list from a mapped user memory region */ +static int pbl_create(struct pbl_context *pbl, + struct efa_dev *dev, + struct ib_umem *umem, + int hp_cnt, + u8 hp_shift) +{ + int err; + + pbl->dev = dev; + pbl->dmadev = &dev->pdev->dev; + pbl->pbl_buf_size_in_bytes = hp_cnt * EFA_PAGE_PTR_SIZE; + pbl->pbl_buf = kzalloc(pbl->pbl_buf_size_in_bytes, + GFP_KERNEL | __GFP_NOWARN); + if (pbl->pbl_buf) { + pbl->physically_continuous = true; + err = umem_to_page_list(umem, pbl->pbl_buf, hp_cnt, hp_shift); + if (err) + goto err_continuous; + err = pbl_continuous_initialize(pbl); + if (err) + goto err_continuous; + } else { + pbl->physically_continuous = false; + pbl->pbl_buf = vzalloc(pbl->pbl_buf_size_in_bytes); + if (!pbl->pbl_buf) + return -ENOMEM; + + err = umem_to_page_list(umem, pbl->pbl_buf, hp_cnt, hp_shift); + if (err) + goto err_indirect; + err = pbl_indirect_initialize(pbl); + if (err) + goto err_indirect; + } + + pr_debug("user_pbl_created: user_pages[%u], continuous[%u]\n", + hp_cnt, pbl->physically_continuous); + + return 0; + +err_continuous: + kfree(pbl->pbl_buf); + return err; +err_indirect: + vfree(pbl->pbl_buf); + return err; +} + +static void pbl_destroy(struct pbl_context *pbl) +{ + if (pbl->physically_continuous) { + dma_unmap_single(pbl->dmadev, pbl->phys.continuous.dma_addr, + pbl->pbl_buf_size_in_bytes, DMA_TO_DEVICE); + kfree(pbl->pbl_buf); + } else { + pbl_indirect_terminate(pbl); + vfree(pbl->pbl_buf); + } +} + +static int efa_create_inline_pbl(struct efa_mr *mr, + struct efa_com_reg_mr_params *params) +{ + int err; + + params->inline_pbl = true; + err = umem_to_page_list(mr->umem, params->pbl.inline_pbl_array, + params->page_num, params->page_shift); + if (err) { + pr_err("failed to create inline pbl[%d]\n", err); + return err; + } + + pr_debug("inline_pbl_array - pages[%u]\n", params->page_num); + + return 0; +} + +static int efa_create_pbl(struct efa_dev *dev, + struct pbl_context *pbl, + struct efa_mr *mr, + struct efa_com_reg_mr_params *params) +{ + int err; + + err = pbl_create(pbl, dev, mr->umem, params->page_num, + params->page_shift); + if (err) { + pr_err("failed to create pbl[%d]\n", err); + return err; + } + + params->inline_pbl = false; + params->indirect = !pbl->physically_continuous; + if (pbl->physically_continuous) { + params->pbl.pbl.length = 
pbl->pbl_buf_size_in_bytes; + + efa_com_set_dma_addr(pbl->phys.continuous.dma_addr, + &params->pbl.pbl.address.mem_addr_high, + &params->pbl.pbl.address.mem_addr_low); + } else { + params->pbl.pbl.length = + pbl->phys.indirect.chunk_list.chunks[0].length; + + efa_com_set_dma_addr(pbl->phys.indirect.chunk_list.chunks[0].dma_addr, + &params->pbl.pbl.address.mem_addr_high, + &params->pbl.pbl.address.mem_addr_low); + } + + return 0; +} + +static void efa_cont_pages(struct ib_umem *umem, u64 addr, + unsigned long max_page_shift, + int *count, u8 *shift, u32 *ncont) +{ + unsigned long page_shift = umem->page_shift; + struct scatterlist *sg; + u64 base = ~0, p = 0; + unsigned long tmp; + unsigned long m; + u64 len, pfn; + int i = 0; + int entry; + + addr = addr >> page_shift; + tmp = (unsigned long)addr; + m = find_first_bit(&tmp, BITS_PER_LONG); + if (max_page_shift) + m = min_t(unsigned long, max_page_shift - page_shift, m); + + for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { + len = sg_dma_len(sg) >> page_shift; + pfn = sg_dma_address(sg) >> page_shift; + if (base + p != pfn) { + /* + * If either the offset or the new + * base are unaligned update m + */ + tmp = (unsigned long)(pfn | p); + if (!IS_ALIGNED(tmp, 1 << m)) + m = find_first_bit(&tmp, BITS_PER_LONG); + + base = pfn; + p = 0; + } + + p += len; + i += len; + } + + if (i) { + m = min_t(unsigned long, ilog2(roundup_pow_of_two(i)), m); + *ncont = DIV_ROUND_UP(i, (1 << m)); + } else { + m = 0; + *ncont = 0; + } + + *shift = page_shift + m; + *count = i; +} + +struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length, + u64 virt_addr, int access_flags, + struct ib_udata *udata) +{ + struct efa_dev *dev = to_edev(ibpd->device); + struct efa_com_reg_mr_params params = {}; + struct efa_com_reg_mr_result result = {}; + struct pbl_context pbl; + struct efa_mr *mr; + int inline_size; + int npages; + int err; + + if (udata && udata->inlen && + !ib_is_udata_cleared(udata, 0, udata->inlen)) { + pr_err_ratelimited("Incompatible ABI params, udata not cleared\n"); + return ERR_PTR(-EINVAL); + } + + if (access_flags & ~EFA_SUPPORTED_ACCESS_FLAGS) { + pr_err("Unsupported access flags[%#x], supported[%#x]\n", + access_flags, EFA_SUPPORTED_ACCESS_FLAGS); + return ERR_PTR(-EOPNOTSUPP); + } + + mr = kzalloc(sizeof(*mr), GFP_KERNEL); + if (!mr) { + dev->stats.sw_stats.reg_mr_alloc_err++; + return ERR_PTR(-ENOMEM); + } + + mr->umem = ib_umem_get(ibpd->uobject->context, start, length, + access_flags, 0); + if (IS_ERR(mr->umem)) { + err = PTR_ERR(mr->umem); + pr_err("failed to pin and map user space memory[%d]\n", err); + goto err; + } + + params.pd = to_epd(ibpd)->pdn; + params.iova = virt_addr; + params.mr_length_in_bytes = length; + params.permissions = access_flags & 0x1; + + efa_cont_pages(mr->umem, start, + EFA_ADMIN_REG_MR_CMD_PHYS_PAGE_SIZE_SHIFT_MASK, &npages, + &params.page_shift, &params.page_num); + pr_debug("start %#llx length %#llx npages %d params.page_shift %u params.page_num %u\n", + start, length, npages, params.page_shift, params.page_num); + + inline_size = ARRAY_SIZE(params.pbl.inline_pbl_array); + if (params.page_num <= inline_size) { + err = efa_create_inline_pbl(mr, &params); + if (err) + goto err_unmap; + + err = efa_com_register_mr(dev->edev, &params, &result); + if (err) { + pr_err("efa_com_register_mr failed - %d!\n", err); + goto err_unmap; + } + } else { + err = efa_create_pbl(dev, &pbl, mr, &params); + if (err) + goto err_unmap; + + err = efa_com_register_mr(dev->edev, &params, &result); + pbl_destroy(&pbl); + + if (err) { +
pr_err("efa_com_register_mr failed - %d!\n", err); + goto err_unmap; + } + } + + mr->vaddr = virt_addr; + mr->ibmr.lkey = result.l_key; + mr->ibmr.rkey = result.r_key; + mr->ibmr.length = length; + pr_debug("Registered mr[%d]\n", mr->ibmr.lkey); + + return &mr->ibmr; + +err_unmap: + ib_umem_release(mr->umem); +err: + kfree(mr); + return ERR_PTR(err); +} + +int efa_dereg_mr(struct ib_mr *ibmr) +{ + struct efa_dev *dev = to_edev(ibmr->device); + struct efa_com_dereg_mr_params params; + struct efa_mr *mr = to_emr(ibmr); + + pr_debug("Deregister mr[%d]\n", ibmr->lkey); + + if (mr->umem) { + params.l_key = mr->ibmr.lkey; + efa_com_dereg_mr(dev->edev, ¶ms); + ib_umem_release(mr->umem); + } + + kfree(mr); + + return 0; +} + +int efa_get_port_immutable(struct ib_device *ibdev, u8 port_num, + struct ib_port_immutable *immutable) +{ + pr_debug("--->\n"); + immutable->core_cap_flags = RDMA_CORE_CAP_PROT_EFA; + immutable->gid_tbl_len = 1; + + return 0; +} + +struct ib_ucontext *efa_alloc_ucontext(struct ib_device *ibdev, + struct ib_udata *udata) +{ + struct efa_ibv_alloc_ucontext_resp resp = {}; + struct efa_dev *dev = to_edev(ibdev); + struct efa_ucontext *ucontext; + int err; + + pr_debug("--->\n"); + /* + * it's fine if the driver does not know all request fields, + * we will ack input fields in our response. + */ + + ucontext = kzalloc(sizeof(*ucontext), GFP_KERNEL); + if (!ucontext) { + dev->stats.sw_stats.alloc_ucontext_alloc_err++; + return ERR_PTR(-ENOMEM); + } + + mutex_init(&ucontext->lock); + INIT_LIST_HEAD(&ucontext->pending_mmaps); + + mutex_lock(&dev->efa_dev_lock); + + resp.cmds_supp_udata_mask |= EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE; + resp.cmds_supp_udata_mask |= EFA_USER_CMDS_SUPP_UDATA_CREATE_AH; + resp.kernel_supp_mask |= EFA_KERNEL_SUPP_QPT_SRD; + + if (udata && udata->outlen) { + err = ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen)); + if (err) + goto err_resp; + } + + list_add_tail(&ucontext->link, &dev->ctx_list); + mutex_unlock(&dev->efa_dev_lock); + return &ucontext->ibucontext; + +err_resp: + mutex_unlock(&dev->efa_dev_lock); + kfree(ucontext); + return ERR_PTR(err); +} + +int efa_dealloc_ucontext(struct ib_ucontext *ibucontext) +{ + struct efa_ucontext *ucontext = to_eucontext(ibucontext); + struct efa_dev *dev = to_edev(ibucontext->device); + + pr_debug("--->\n"); + + WARN_ON(!list_empty(&ucontext->pending_mmaps)); + + mutex_lock(&dev->efa_dev_lock); + list_del(&ucontext->link); + mutex_unlock(&dev->efa_dev_lock); + kfree(ucontext); + return 0; +} + +static void mmap_obj_entries_remove(struct efa_ucontext *ucontext, void *obj) +{ + struct efa_mmap_entry *entry, *tmp; + + pr_debug("--->\n"); + + mutex_lock(&ucontext->lock); + list_for_each_entry_safe(entry, tmp, &ucontext->pending_mmaps, list) { + if (entry->obj == obj) { + list_del(&entry->list); + pr_debug("mmap: obj[%p] key[0x%llx] addr[0x%llX] len[0x%llX] removed\n", + entry->obj, entry->key, entry->address, + entry->length); + kfree(entry); + } + } + mutex_unlock(&ucontext->lock); +} + +static struct efa_mmap_entry *mmap_entry_remove(struct efa_ucontext *ucontext, + u64 key, + u64 len) +{ + struct efa_mmap_entry *entry, *tmp; + + mutex_lock(&ucontext->lock); + list_for_each_entry_safe(entry, tmp, &ucontext->pending_mmaps, list) { + if (entry->key == key && entry->length == len) { + list_del_init(&entry->list); + pr_debug("mmap: obj[%p] key[0x%llx] addr[0x%llX] len[0x%llX] removed\n", + entry->obj, key, entry->address, + entry->length); + mutex_unlock(&ucontext->lock); + return entry; + } + } + 
mutex_unlock(&ucontext->lock); + + return NULL; +} + +static void mmap_entry_insert(struct efa_ucontext *ucontext, + struct efa_mmap_entry *entry, + u64 mem_flag) +{ + mutex_lock(&ucontext->lock); + entry->key = ucontext->mmap_key | mem_flag; + ucontext->mmap_key += PAGE_SIZE; + list_add_tail(&entry->list, &ucontext->pending_mmaps); + pr_debug("mmap: obj[%p] addr[0x%llx], len[0x%llx], key[0x%llx] inserted\n", + entry->obj, entry->address, entry->length, entry->key); + mutex_unlock(&ucontext->lock); +} + +static int __efa_mmap(struct efa_dev *dev, + struct vm_area_struct *vma, + u64 mmap_flag, + u64 address, + u64 length) +{ + u64 pfn = address >> PAGE_SHIFT; + int err; + + switch (mmap_flag) { + case EFA_MMAP_REG_BAR_MEMORY_FLAG: + pr_debug("mapping address[0x%llX], length[0x%llX] on register BAR!", + address, length); + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + err = io_remap_pfn_range(vma, vma->vm_start, pfn, length, + vma->vm_page_prot); + break; + case EFA_MMAP_MEM_BAR_MEMORY_FLAG: + pr_debug("mapping address 0x%llX, length[0x%llX] on memory BAR!", + address, length); + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); + err = io_remap_pfn_range(vma, vma->vm_start, pfn, length, + vma->vm_page_prot); + break; + case EFA_MMAP_DB_BAR_MEMORY_FLAG: + pr_debug("mapping address 0x%llX, length[0x%llX] on DB BAR!", + address, length); + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + err = io_remap_pfn_range(vma, vma->vm_start, pfn, length, + vma->vm_page_prot); + break; + default: + pr_debug("mapping address[0x%llX], length[0x%llX] of dma buffer!\n", + address, length); + err = remap_pfn_range(vma, vma->vm_start, pfn, length, + vma->vm_page_prot); + } + + return err; +} + +int efa_mmap(struct ib_ucontext *ibucontext, + struct vm_area_struct *vma) +{ + struct efa_ucontext *ucontext = to_eucontext(ibucontext); + struct efa_dev *dev = to_edev(ibucontext->device); + u64 length = vma->vm_end - vma->vm_start; + u64 key = vma->vm_pgoff << PAGE_SHIFT; + struct efa_mmap_entry *entry; + u64 mmap_flag; + u64 address; + + pr_debug("start 0x%lx, end 0x%lx, length = 0x%llx, key = 0x%llx\n", + vma->vm_start, vma->vm_end, length, key); + + if (length % PAGE_SIZE != 0) { + pr_err("length[0x%llX] is not page size aligned[0x%lX]!", + length, PAGE_SIZE); + return -EINVAL; + } + + entry = mmap_entry_remove(ucontext, key, length); + if (!entry) { + pr_err("key[0x%llX] does not have valid entry!", key); + return -EINVAL; + } + address = entry->address; + kfree(entry); + + mmap_flag = key & EFA_MMAP_BARS_MEMORY_MASK; + return __efa_mmap(dev, vma, mmap_flag, address, length); +} + +static inline bool efa_ah_id_equal(u8 *id1, u8 *id2) +{ + return !memcmp(id1, id2, EFA_GID_SIZE); +} + +static int efa_get_ah_by_id(struct efa_dev *dev, u8 *id, + u16 *ah_res, bool ref_update) +{ + struct efa_ah_id *ah_id; + + list_for_each_entry(ah_id, &dev->efa_ah_list, list) { + if (efa_ah_id_equal(ah_id->id, id)) { + *ah_res = ah_id->address_handle; + if (ref_update) + ah_id->ref_count++; + return 0; + } + } + + return -EINVAL; +} + +static int efa_add_ah_id(struct efa_dev *dev, u8 *id, + u16 address_handle) +{ + struct efa_ah_id *ah_id; + + ah_id = kzalloc(sizeof(*ah_id), GFP_KERNEL); + if (!ah_id) + return -ENOMEM; + + memcpy(ah_id->id, id, sizeof(ah_id->id)); + ah_id->address_handle = address_handle; + ah_id->ref_count = 1; + list_add_tail(&ah_id->list, &dev->efa_ah_list); + + return 0; +} + +static void efa_remove_ah_id(struct efa_dev *dev, u8 *id, u32 *ref_count) +{ + struct efa_ah_id *ah_id, 
*tmp; + + list_for_each_entry_safe(ah_id, tmp, &dev->efa_ah_list, list) { + if (efa_ah_id_equal(ah_id->id, id)) { + *ref_count = --ah_id->ref_count; + if (ah_id->ref_count == 0) { + list_del(&ah_id->list); + kfree(ah_id); + return; + } + } + } +} + +static void ah_destroy_on_device(struct efa_dev *dev, u16 device_ah) +{ + struct efa_com_destroy_ah_params params; + int err; + + params.ah = device_ah; + err = efa_com_destroy_ah(dev->edev, &params); + if (err) + pr_err("efa_com_destroy_ah failed (%d)\n", err); +} + +static int efa_create_ah_id(struct efa_dev *dev, u8 *id, + u16 *efa_address_handle) +{ + struct efa_com_create_ah_params params = {}; + struct efa_com_create_ah_result result = {}; + int err; + + mutex_lock(&dev->ah_list_lock); + err = efa_get_ah_by_id(dev, id, efa_address_handle, true); + if (err) { + memcpy(params.dest_addr, id, sizeof(params.dest_addr)); + err = efa_com_create_ah(dev->edev, &params, &result); + if (err) { + pr_err("efa_com_create_ah failed %d\n", err); + goto err_unlock; + } + + pr_debug("create address handle %u for address %pI6\n", + result.ah, params.dest_addr); + + err = efa_add_ah_id(dev, id, result.ah); + if (err) { + pr_err("efa_add_ah_id failed %d\n", err); + goto err_destroy_ah; + } + + *efa_address_handle = result.ah; + } + mutex_unlock(&dev->ah_list_lock); + + return 0; + +err_destroy_ah: + ah_destroy_on_device(dev, result.ah); +err_unlock: + mutex_unlock(&dev->ah_list_lock); + return err; +} + +static void efa_destroy_ah_id(struct efa_dev *dev, u8 *id) +{ + u16 device_ah; + u32 ref_count; + int err; + + mutex_lock(&dev->ah_list_lock); + err = efa_get_ah_by_id(dev, id, &device_ah, false); + if (err) { + WARN_ON(1); + goto out_unlock; + } + + efa_remove_ah_id(dev, id, &ref_count); + if (!ref_count) + ah_destroy_on_device(dev, device_ah); + +out_unlock: + mutex_unlock(&dev->ah_list_lock); +} + +struct ib_ah *efa_create_ah(struct ib_pd *ibpd, + struct rdma_ah_attr *ah_attr, + struct ib_udata *udata) +{ + struct efa_dev *dev = to_edev(ibpd->device); + struct efa_ibv_create_ah_resp resp = {}; + u16 efa_address_handle; + struct efa_ah *ah; + int err; + + pr_debug("--->\n"); + + if (udata && udata->inlen && + !ib_is_udata_cleared(udata, 0, udata->inlen)) { + pr_err_ratelimited("Incompatible ABI params\n"); + return ERR_PTR(-EINVAL); + } + + ah = kzalloc(sizeof(*ah), GFP_KERNEL); + if (!ah) { + dev->stats.sw_stats.create_ah_alloc_err++; + return ERR_PTR(-ENOMEM); + } + + err = efa_create_ah_id(dev, ah_attr->grh.dgid.raw, &efa_address_handle); + if (err) + goto err_free; + + resp.efa_address_handle = efa_address_handle; + + if (udata && udata->outlen) { + err = ib_copy_to_udata(udata, &resp, + min(sizeof(resp), udata->outlen)); + if (err) { + pr_err_ratelimited("failed to copy udata for create_ah response\n"); + goto err_destroy_ah; + } + } + + memcpy(ah->id, ah_attr->grh.dgid.raw, sizeof(ah->id)); + return &ah->ibah; + +err_destroy_ah: + efa_destroy_ah_id(dev, ah_attr->grh.dgid.raw); +err_free: + kfree(ah); + return ERR_PTR(err); +} + +int efa_destroy_ah(struct ib_ah *ibah) +{ + struct efa_dev *dev = to_edev(ibah->pd->device); + struct efa_ah *ah = to_eah(ibah); + + pr_debug("--->\n"); + efa_destroy_ah_id(dev, ah->id); + + kfree(ah); + return 0; +} + +/* In ib callbacks section - Start of stub funcs */ +int efa_post_send(struct ib_qp *ibqp, + const struct ib_send_wr *wr, + const struct ib_send_wr **bad_wr) +{ + pr_warn("Function not supported\n"); + return -EOPNOTSUPP; +} + +int efa_post_recv(struct ib_qp *ibqp, + const struct ib_recv_wr *wr, + const struct 
ib_recv_wr **bad_wr) +{ + pr_warn("Function not supported\n"); + return -EOPNOTSUPP; +} + +int efa_poll_cq(struct ib_cq *ibcq, int num_entries, + struct ib_wc *wc) +{ + pr_warn("Function not supported\n"); + return -EOPNOTSUPP; +} + +int efa_req_notify_cq(struct ib_cq *ibcq, + enum ib_cq_notify_flags flags) +{ + pr_warn("Function not supported\n"); + return -EOPNOTSUPP; +} + +struct ib_mr *efa_get_dma_mr(struct ib_pd *ibpd, int acc) +{ + pr_warn("Function not supported\n"); + return ERR_PTR(-EOPNOTSUPP); +} + +int efa_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, + int attr_mask, struct ib_udata *udata) +{ + pr_warn("Function not supported\n"); + return -EOPNOTSUPP; +} + +enum rdma_link_layer efa_port_link_layer(struct ib_device *ibdev, + u8 port_num) +{ + pr_debug("--->\n"); + return IB_LINK_LAYER_ETHERNET; +} + From patchwork Tue Dec 4 12:04:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711649 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2561713AF for ; Tue, 4 Dec 2018 12:05:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 175242A35B for ; Tue, 4 Dec 2018 12:05:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0BDDA2AF00; Tue, 4 Dec 2018 12:05:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E3CFC2A35B for ; Tue, 4 Dec 2018 12:05:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726222AbeLDMFj (ORCPT ); Tue, 4 Dec 2018 07:05:39 -0500 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:56079 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726174AbeLDMFj (ORCPT ); Tue, 4 Dec 2018 07:05:39 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925134; x=1575461134; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=guwJ10JM5+1Z9+j00mpjc+c8+6bI/gutLrBxk48mnPc=; b=SU2eOGMRkk96mwi4lwG7ZFvdZLKvA+rVxRwVPiz9r1pQwBiNGoDTM1Yv WEaAaijM+FQ94idqjEERnihggLZ661dkhbw+WE8ygaMSS4xCcxJh1vmz9 gDkTtRk0V1iNvqueBtY4T9ypk1DvFbiqlTLY2uH+LEQzGIMV8O2qAUtnF A=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="645915037" Received: from sea3-co-svc-lb6-vlan3.sea.amazon.com (HELO email-inbound-relay-1a-af6a10df.us-east-1.amazon.com) ([10.47.22.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:34 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1a-af6a10df.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5VFv075952 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:32 GMT Received: from EX13D19EUB002.ant.amazon.com (10.43.166.78) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 
12:05:19 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D19EUB002.ant.amazon.com (10.43.166.78) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:18 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:16 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 12/13] RDMA/efa: Add the efa module Date: Tue, 4 Dec 2018 14:04:28 +0200 Message-ID: <1543925069-8838-13-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add the main EFA module file. Signed-off-by: Gal Pressman --- drivers/infiniband/hw/efa/efa_main.c | 669 +++++++++++++++++++++++++++++++++++ 1 file changed, 669 insertions(+) create mode 100644 drivers/infiniband/hw/efa/efa_main.c diff --git a/drivers/infiniband/hw/efa/efa_main.c b/drivers/infiniband/hw/efa/efa_main.c new file mode 100644 index 000000000000..0e7c6ad99461 --- /dev/null +++ b/drivers/infiniband/hw/efa/efa_main.c @@ -0,0 +1,669 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +/* + * Copyright 2018 Amazon.com, Inc. or its affiliates. + */ + +#include +#include + +#include + +#include "efa.h" +#include "efa_pci_id_tbl.h" + +MODULE_AUTHOR("Amazon.com, Inc. or its affiliates"); +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_DESCRIPTION(DEVICE_NAME); +MODULE_DEVICE_TABLE(pci, efa_pci_tbl); + +#define EFA_REG_BAR 0 +#define EFA_MEM_BAR 2 +#define EFA_BASE_BAR_MASK (BIT(EFA_REG_BAR) | BIT(EFA_MEM_BAR)) + +#define EFA_AENQ_ENABLED_GROUPS \ + (BIT(EFA_ADMIN_FATAL_ERROR) | BIT(EFA_ADMIN_WARNING) | \ + BIT(EFA_ADMIN_NOTIFICATION) | BIT(EFA_ADMIN_KEEP_ALIVE)) + +static void efa_update_network_attr(struct efa_dev *dev, + struct efa_com_get_network_attr_result *network_attr) +{ + pr_debug("-->\n"); + + memcpy(dev->addr, network_attr->addr, sizeof(network_attr->addr)); + dev->mtu = network_attr->mtu; + + pr_debug("full addr %pI6\n", dev->addr); +} + +static void efa_update_dev_cap(struct efa_dev *dev, + struct efa_com_get_device_attr_result *device_attr) +{ + dev->caps.max_sq = device_attr->max_sq; + dev->caps.max_sq_depth = device_attr->max_sq_depth; + dev->caps.max_rq = device_attr->max_sq; + dev->caps.max_rq_depth = device_attr->max_rq_depth; + dev->caps.max_cq = device_attr->max_cq; + dev->caps.max_cq_depth = device_attr->max_cq_depth; + dev->caps.inline_buf_size = device_attr->inline_buf_size; + dev->caps.max_sq_sge = device_attr->max_sq_sge; + dev->caps.max_rq_sge = device_attr->max_rq_sge; + dev->caps.max_mr = device_attr->max_mr; + dev->caps.max_mr_pages = device_attr->max_mr_pages; + dev->caps.page_size_cap = device_attr->page_size_cap; + dev->caps.max_pd = device_attr->max_pd; + dev->caps.max_ah = device_attr->max_ah; + dev->caps.sub_cqs_per_cq = device_attr->sub_cqs_per_cq; + dev->caps.max_inline_data = device_attr->inline_buf_size; +} + +int efa_get_device_attributes(struct efa_dev *dev, + struct efa_com_get_device_attr_result *result) +{ + int err; + + pr_debug("--->\n"); + err = efa_com_get_device_attr(dev->edev, result); + if (err) + pr_err("failed to get 
device_attr err[%d]!\n", err); + + return err; +} + +/* This handler will called for unknown event group or unimplemented handlers */ +static void unimplemented_aenq_handler(void *data, + struct efa_admin_aenq_entry *aenq_e) +{ + pr_err_ratelimited("Unknown event was received or event with unimplemented handler\n"); +} + +static void efa_keep_alive(void *data, struct efa_admin_aenq_entry *aenq_e) +{ + struct efa_dev *dev = (struct efa_dev *)data; + + dev->stats.keep_alive_rcvd++; +} + +static struct efa_aenq_handlers aenq_handlers = { + .handlers = { + [EFA_ADMIN_KEEP_ALIVE] = efa_keep_alive, + }, + .unimplemented_handler = unimplemented_aenq_handler +}; + +static void efa_release_bars(struct efa_dev *dev, int bars_mask) +{ + struct pci_dev *pdev = dev->pdev; + int release_bars; + + release_bars = pci_select_bars(pdev, IORESOURCE_MEM) & bars_mask; + pci_release_selected_regions(pdev, release_bars); +} + +static irqreturn_t efa_intr_msix_mgmnt(int irq, void *data) +{ + struct efa_dev *dev = (struct efa_dev *)data; + + efa_com_admin_q_comp_intr_handler(dev->edev); + + /* Don't call the aenq handler before probe is done */ + if (likely(test_bit(EFA_DEVICE_RUNNING_BIT, &dev->state))) + efa_com_aenq_intr_handler(dev->edev, data); + + return IRQ_HANDLED; +} + +static int efa_request_mgmnt_irq(struct efa_dev *dev) +{ + struct efa_irq *irq; + int err; + + irq = &dev->admin_irq; + err = request_irq(irq->vector, irq->handler, 0, irq->name, + irq->data); + if (err) { + dev_err(&dev->pdev->dev, "failed to request admin irq (%d)\n", + err); + return err; + } + + dev_dbg(&dev->pdev->dev, "set affinity hint of mgmnt irq.to 0x%lx (irq vector: %d)\n", + irq->affinity_hint_mask.bits[0], irq->vector); + irq_set_affinity_hint(irq->vector, &irq->affinity_hint_mask); + + return err; +} + +static void efa_setup_mgmnt_irq(struct efa_dev *dev) +{ + u32 cpu; + + snprintf(dev->admin_irq.name, EFA_IRQNAME_SIZE, + "efa-mgmnt@pci:%s", pci_name(dev->pdev)); + dev->admin_irq.handler = efa_intr_msix_mgmnt; + dev->admin_irq.data = dev; + dev->admin_irq.vector = + pci_irq_vector(dev->pdev, dev->admin_msix_vector_idx); + cpu = cpumask_first(cpu_online_mask); + dev->admin_irq.cpu = cpu; + cpumask_set_cpu(cpu, + &dev->admin_irq.affinity_hint_mask); + pr_info("setup irq:%p vector:%d name:%s\n", + &dev->admin_irq, + dev->admin_irq.vector, + dev->admin_irq.name); +} + +static void efa_free_mgmnt_irq(struct efa_dev *dev) +{ + struct efa_irq *irq; + + irq = &dev->admin_irq; + irq_set_affinity_hint(irq->vector, NULL); + free_irq(irq->vector, irq->data); +} + +static int efa_set_mgmnt_irq(struct efa_dev *dev) +{ + int err; + + efa_setup_mgmnt_irq(dev); + + err = efa_request_mgmnt_irq(dev); + if (err) { + dev_err(&dev->pdev->dev, "Can not setup management interrupts\n"); + return err; + } + + return 0; +} + +static int efa_set_doorbell_bar(struct efa_dev *dev, int db_bar_idx) +{ + struct pci_dev *pdev = dev->pdev; + int bars; + int err; + + dev->db_bar_idx = db_bar_idx; + + if (!(BIT(db_bar_idx) & EFA_BASE_BAR_MASK)) { + bars = pci_select_bars(pdev, IORESOURCE_MEM) & BIT(db_bar_idx); + + err = pci_request_selected_regions(pdev, bars, DRV_MODULE_NAME); + if (err) { + dev_err(&pdev->dev, "pci_request_selected_regions for bar %d failed %d\n", + db_bar_idx, err); + return err; + } + } + + dev->db_bar_addr = pci_resource_start(dev->pdev, db_bar_idx); + dev->db_bar_len = pci_resource_len(dev->pdev, db_bar_idx); + + return 0; +} + +static void efa_release_doorbell_bar(struct efa_dev *dev) +{ + int db_bar_idx = dev->db_bar_idx; + + if 
(!(BIT(db_bar_idx) & EFA_BASE_BAR_MASK)) + efa_release_bars(dev, BIT(db_bar_idx)); +} + +static void efa_update_hw_hints(struct efa_dev *dev, + struct efa_com_get_hw_hints_result *hw_hints) +{ + struct efa_com_dev *edev = dev->edev; + + if (hw_hints->mmio_read_timeout) + edev->mmio_read.mmio_read_timeout = + hw_hints->mmio_read_timeout * 1000; + + if (hw_hints->poll_interval) + edev->admin_queue.poll_interval = hw_hints->poll_interval; + + if (hw_hints->admin_completion_timeout) + edev->admin_queue.completion_timeout = + hw_hints->admin_completion_timeout; +} + +static int efa_ib_device_add(struct efa_dev *dev) +{ + struct efa_com_get_network_attr_result network_attr; + struct efa_com_get_device_attr_result device_attr; + struct efa_com_get_hw_hints_result hw_hints; + struct pci_dev *pdev = dev->pdev; + int err; + + mutex_init(&dev->efa_dev_lock); + mutex_init(&dev->ah_list_lock); + INIT_LIST_HEAD(&dev->ctx_list); + INIT_LIST_HEAD(&dev->efa_ah_list); + + /* init IB device */ + err = efa_get_device_attributes(dev, &device_attr); + if (err) { + pr_err("efa_get_device_attr failed (%d)\n", err); + return err; + } + + efa_update_dev_cap(dev, &device_attr); + + pr_debug("Doorbells bar (%d)\n", device_attr.db_bar); + err = efa_set_doorbell_bar(dev, device_attr.db_bar); + if (err) + return err; + + err = efa_bitmap_init(&dev->pd_bitmap, dev->caps.max_pd); + if (err) { + pr_err("efa_bitmap_init failed (%d)\n", err); + goto err_free_doorbell_bar; + } + + err = efa_com_get_network_attr(dev->edev, &network_attr); + if (err) { + pr_err("efa_com_get_network_attr failed (%d)\n", err); + goto err_free_pd_bitmap; + } + + efa_update_network_attr(dev, &network_attr); + + err = efa_com_get_hw_hints(dev->edev, &hw_hints); + if (err) { + pr_err("efa_get_hw_hints failed (%d)\n", err); + goto err_free_pd_bitmap; + } + + efa_update_hw_hints(dev, &hw_hints); + + /* Try to enable all the available aenq groups */ + err = efa_com_set_aenq_config(dev->edev, EFA_AENQ_ENABLED_GROUPS); + if (err) { + pr_err("efa_aenq_init failed (%d)\n", err); + goto err_free_pd_bitmap; + } + + dev->ibdev.owner = THIS_MODULE; + dev->ibdev.node_type = RDMA_NODE_EFA; + dev->ibdev.phys_port_cnt = 1; + dev->ibdev.num_comp_vectors = 1; + dev->ibdev.dev.parent = &pdev->dev; + dev->ibdev.uverbs_abi_ver = 3; + + dev->ibdev.uverbs_cmd_mask = + (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | + (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | + (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | + (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_REG_MR) | + (1ull << IB_USER_VERBS_CMD_DEREG_MR) | + (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | + (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | + (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | + (1ull << IB_USER_VERBS_CMD_CREATE_QP) | + (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | + (1ull << IB_USER_VERBS_CMD_QUERY_QP) | + (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | + (1ull << IB_USER_VERBS_CMD_CREATE_AH) | + (1ull << IB_USER_VERBS_CMD_OPEN_QP) | + (1ull << IB_USER_VERBS_CMD_DESTROY_AH); + + dev->ibdev.uverbs_ex_cmd_mask = + (1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE); + + dev->ibdev.query_device = efa_query_device; + dev->ibdev.query_port = efa_query_port; + dev->ibdev.query_pkey = efa_query_pkey; + dev->ibdev.query_gid = efa_query_gid; + dev->ibdev.get_link_layer = efa_port_link_layer; + dev->ibdev.alloc_pd = efa_alloc_pd; + dev->ibdev.dealloc_pd = efa_dealloc_pd; + dev->ibdev.create_qp = efa_create_qp; + dev->ibdev.modify_qp = efa_modify_qp; + dev->ibdev.query_qp = efa_query_qp; 
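+ /* Remaining verbs callbacks; the data-path ops assigned below (post_send, post_recv, poll_cq, req_notify_cq, get_dma_mr) are "Function not supported" stubs that return -EOPNOTSUPP. */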
+ dev->ibdev.destroy_qp = efa_destroy_qp; + dev->ibdev.create_cq = efa_create_cq; + dev->ibdev.destroy_cq = efa_destroy_cq; + dev->ibdev.reg_user_mr = efa_reg_mr; + dev->ibdev.dereg_mr = efa_dereg_mr; + dev->ibdev.get_port_immutable = efa_get_port_immutable; + dev->ibdev.alloc_ucontext = efa_alloc_ucontext; + dev->ibdev.dealloc_ucontext = efa_dealloc_ucontext; + dev->ibdev.mmap = efa_mmap; + dev->ibdev.create_ah = efa_create_ah; + dev->ibdev.destroy_ah = efa_destroy_ah; + dev->ibdev.post_send = efa_post_send; + dev->ibdev.post_recv = efa_post_recv; + dev->ibdev.poll_cq = efa_poll_cq; + dev->ibdev.req_notify_cq = efa_req_notify_cq; + dev->ibdev.get_dma_mr = efa_get_dma_mr; + + err = ib_register_device(&dev->ibdev, "efa_%d", NULL); + if (err) + goto err_free_pd_bitmap; + + pr_info("Registered ib device %s\n", dev_name(&dev->ibdev.dev)); + + set_bit(EFA_DEVICE_RUNNING_BIT, &dev->state); + + return 0; + +err_free_pd_bitmap: + efa_bitmap_cleanup(&dev->pd_bitmap); +err_free_doorbell_bar: + efa_release_doorbell_bar(dev); + return err; +} + +static void efa_ib_device_remove(struct efa_dev *dev) +{ + pr_debug("--->\n"); + WARN_ON(!list_empty(&dev->efa_ah_list)); + WARN_ON(!list_empty(&dev->ctx_list)); + WARN_ON(efa_bitmap_avail(&dev->pd_bitmap) != dev->caps.max_pd); + + /* Reset the device only if the device is running. */ + if (test_bit(EFA_DEVICE_RUNNING_BIT, &dev->state)) + efa_com_dev_reset(dev->edev, EFA_REGS_RESET_NORMAL); + + pr_info("Unregister ib device %s\n", dev_name(&dev->ibdev.dev)); + ib_unregister_device(&dev->ibdev); + efa_bitmap_cleanup(&dev->pd_bitmap); + efa_release_doorbell_bar(dev); + pr_debug("<---\n"); +} + +static void efa_disable_msix(struct efa_dev *dev) +{ + pr_debug("--->\n"); + if (test_and_clear_bit(EFA_MSIX_ENABLED_BIT, &dev->state)) + pci_free_irq_vectors(dev->pdev); +} + +static int efa_enable_msix(struct efa_dev *dev) +{ + int msix_vecs, irq_num; + + if (test_bit(EFA_MSIX_ENABLED_BIT, &dev->state)) { + dev_err(&dev->pdev->dev, "Error, MSI-X is already enabled\n"); + return -EPERM; + } + + /* Reserve the max msix vectors we might need */ + msix_vecs = EFA_NUM_MSIX_VEC; + dev_dbg(&dev->pdev->dev, "trying to enable MSI-X, vectors %d\n", + msix_vecs); + + dev->admin_msix_vector_idx = EFA_MGMNT_MSIX_VEC_IDX; + irq_num = pci_alloc_irq_vectors(dev->pdev, msix_vecs, + msix_vecs, PCI_IRQ_MSIX); + + if (irq_num < 0) { + dev_err(&dev->pdev->dev, "Failed to enable MSI-X. 
irq_num %d\n", + irq_num); + return -ENOSPC; + } + + if (irq_num != msix_vecs) { + dev_warn(&dev->pdev->dev, + "Allocated %d MSI-X (out of %d requested)\n", + irq_num, msix_vecs); + return -ENOSPC; + } + + set_bit(EFA_MSIX_ENABLED_BIT, &dev->state); + + return 0; +} + +static int efa_device_init(struct efa_com_dev *edev, struct pci_dev *pdev) +{ + int dma_width; + int err; + + dev_dbg(&pdev->dev, "%s(): ---->\n", __func__); + + err = efa_com_dev_reset(edev, EFA_REGS_RESET_NORMAL); + if (err) { + dev_err(&pdev->dev, "Can not reset device\n"); + return err; + } + + err = efa_com_validate_version(edev); + if (err) { + dev_err(&pdev->dev, "device version is too low\n"); + return err; + } + + dma_width = efa_com_get_dma_width(edev); + if (dma_width < 0) { + dev_err(&pdev->dev, "Invalid dma width value %d", dma_width); + err = dma_width; + return err; + } + + err = pci_set_dma_mask(pdev, DMA_BIT_MASK(dma_width)); + if (err) { + dev_err(&pdev->dev, "pci_set_dma_mask failed 0x%x\n", err); + return err; + } + + err = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(dma_width)); + if (err) { + dev_err(&pdev->dev, + "err_pci_set_consistent_dma_mask failed 0x%x\n", + err); + return err; + } + + return 0; +} + +static int efa_probe_device(struct pci_dev *pdev) +{ + struct efa_com_dev *edev; + struct efa_dev *dev; + int bars; + int err; + + dev_dbg(&pdev->dev, "%s(): --->\n", __func__); + + err = pci_enable_device_mem(pdev); + if (err) { + dev_err(&pdev->dev, "pci_enable_device_mem() failed!\n"); + return err; + } + + pci_set_master(pdev); + + dev = (struct efa_dev *)ib_alloc_device(sizeof(struct efa_dev)); + if (IS_ERR_OR_NULL(dev)) { + dev_err(&pdev->dev, "Device %s alloc failed\n", + dev_name(&pdev->dev)); + err = dev ? PTR_ERR(dev) : -ENOMEM; + goto err_disable_device; + } + + edev = kzalloc(sizeof(*edev), GFP_KERNEL); + if (!edev) { + err = -ENOMEM; + goto err_ibdev_destroy; + } + + pci_set_drvdata(pdev, dev); + edev->dmadev = &pdev->dev; + dev->edev = edev; + dev->pdev = pdev; + + bars = pci_select_bars(pdev, IORESOURCE_MEM) & EFA_BASE_BAR_MASK; + err = pci_request_selected_regions(pdev, bars, DRV_MODULE_NAME); + if (err) { + dev_err(&pdev->dev, "pci_request_selected_regions failed %d\n", + err); + goto err_free_efa_dev; + } + + dev->reg_bar_addr = pci_resource_start(pdev, EFA_REG_BAR); + dev->reg_bar_len = pci_resource_len(pdev, EFA_REG_BAR); + dev->mem_bar_addr = pci_resource_start(pdev, EFA_MEM_BAR); + dev->mem_bar_len = pci_resource_len(pdev, EFA_MEM_BAR); + + edev->reg_bar = devm_ioremap(&pdev->dev, + dev->reg_bar_addr, + dev->reg_bar_len); + if (!edev->reg_bar) { + dev_err(&pdev->dev, "failed to remap regs bar\n"); + err = -EFAULT; + goto err_release_bars; + } + + err = efa_com_mmio_reg_read_init(edev); + if (err) { + dev_err(&pdev->dev, "Failed to init readless MMIO\n"); + goto err_iounmap; + } + + err = efa_device_init(edev, pdev); + if (err) { + dev_err(&pdev->dev, "efa device init failed\n"); + if (err == -ETIME) + err = -EPROBE_DEFER; + goto err_reg_read_destroy; + } + + err = efa_enable_msix(dev); + if (err) { + dev_err(&pdev->dev, "Can not reserve msix vectors\n"); + goto err_reg_read_destroy; + } + + edev->admin_queue.msix_vector_idx = dev->admin_msix_vector_idx; + edev->aenq.msix_vector_idx = dev->admin_msix_vector_idx; + + err = efa_set_mgmnt_irq(dev); + if (err) { + dev_err(&pdev->dev, + "Failed to enable and set the management interrupts\n"); + goto err_disable_msix; + } + + err = efa_com_admin_init(edev, &aenq_handlers); + if (err) { + dev_err(&pdev->dev, + "Can not initialize efa 
admin queue with device\n"); + goto err_free_mgmnt_irq; + } + + dev_dbg(&pdev->dev, "%s(): <---\n", __func__); + return 0; + +err_free_mgmnt_irq: + efa_free_mgmnt_irq(dev); +err_disable_msix: + efa_disable_msix(dev); +err_reg_read_destroy: + efa_com_mmio_reg_read_destroy(edev); +err_iounmap: + devm_iounmap(&pdev->dev, edev->reg_bar); +err_release_bars: + efa_release_bars(dev, EFA_BASE_BAR_MASK); +err_free_efa_dev: + kfree(edev); +err_ibdev_destroy: + ib_dealloc_device(&dev->ibdev); +err_disable_device: + pci_disable_device(pdev); + return err; +} + +static void efa_remove_device(struct pci_dev *pdev) +{ + struct efa_dev *dev = pci_get_drvdata(pdev); + struct efa_com_dev *edev; + + dev_dbg(&pdev->dev, "%s(): --->\n", __func__); + if (!dev) + /* + * This device didn't load properly and its resources + * already released, nothing to do + */ + return; + + edev = dev->edev; + + efa_com_admin_destroy(edev); + efa_free_mgmnt_irq(dev); + efa_disable_msix(dev); + efa_com_mmio_reg_read_destroy(edev); + devm_iounmap(&pdev->dev, edev->reg_bar); + efa_release_bars(dev, EFA_BASE_BAR_MASK); + kfree(edev); + ib_dealloc_device(&dev->ibdev); + pci_disable_device(pdev); + dev_dbg(&pdev->dev, "%s(): <---\n", __func__); +} + +static int efa_probe(struct pci_dev *pdev, const struct pci_device_id *ent) +{ + struct efa_dev *dev; + int err; + + dev_dbg(&pdev->dev, "%s(): --->\n", __func__); + err = efa_probe_device(pdev); + if (err) + return err; + + dev = pci_get_drvdata(pdev); + err = efa_ib_device_add(dev); + if (err) + goto err_remove_device; + + return 0; + +err_remove_device: + efa_remove_device(pdev); + return err; +} + +static void efa_remove(struct pci_dev *pdev) +{ + struct efa_dev *dev = (struct efa_dev *)pci_get_drvdata(pdev); + + dev_dbg(&pdev->dev, "%s(): --->\n", __func__); + efa_ib_device_remove(dev); + efa_remove_device(pdev); +} + +static struct pci_driver efa_pci_driver = { + .name = DRV_MODULE_NAME, + .id_table = efa_pci_tbl, + .probe = efa_probe, + .remove = efa_remove, +}; + +static int __init efa_init(void) +{ + int err; + + err = pci_register_driver(&efa_pci_driver); + if (err) { + pr_err("couldn't register efa driver\n"); + goto err_register; + } + + pr_debug("<---\n"); + return 0; + +err_register: + return err; +} + +static void __exit efa_exit(void) +{ + pr_debug("--->\n"); + pci_unregister_driver(&efa_pci_driver); +} + +module_init(efa_init); +module_exit(efa_exit); From patchwork Tue Dec 4 12:04:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gal Pressman X-Patchwork-Id: 10711647 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0060C1731 for ; Tue, 4 Dec 2018 12:05:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E79A72AE4E for ; Tue, 4 Dec 2018 12:05:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DC41D2AEFD; Tue, 4 Dec 2018 12:05:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C8E4C2A35B for ; Tue, 4 Dec 2018 12:05:37 
+0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726210AbeLDMFh (ORCPT ); Tue, 4 Dec 2018 07:05:37 -0500 Received: from smtp-fw-4101.amazon.com ([72.21.198.25]:59497 "EHLO smtp-fw-4101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726192AbeLDMFh (ORCPT ); Tue, 4 Dec 2018 07:05:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1543925136; x=1575461136; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=THig6bbvK5JQ9KQdEorVmh5x/AuxPaL9slgw0Lqbvl8=; b=SLqpH77lo/S0SoRxWYDjh5uXWWWmGIJ6khJzv7UPk0X+BRFc3MN4W5hS gGKFh3POyxBsHYigrjOITe4Z6Rb3N+PYBIjNE6p3ApThTGJFzONSa73+N NQnqG4L3LHRqRthrpTzUTRGXrESHseraOD0tjNHbCaXuAuiqyIfwjhW3h g=; X-IronPort-AV: E=Sophos;i="5.56,253,1539648000"; d="scan'208";a="748435986" Received: from iad6-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-1d-38ae4ad2.us-east-1.amazon.com) ([10.124.125.6]) by smtp-border-fw-out-4101.iad4.amazon.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 04 Dec 2018 12:05:35 +0000 Received: from EX13MTAUEA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan2.iad.amazon.com [10.40.159.162]) by email-inbound-relay-1d-38ae4ad2.us-east-1.amazon.com (8.14.7/8.14.7) with ESMTP id wB4C5WAL114579 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=FAIL); Tue, 4 Dec 2018 12:05:34 GMT Received: from EX13D02EUB004.ant.amazon.com (10.43.166.221) by EX13MTAUEA001.ant.amazon.com (10.43.61.243) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:21 +0000 Received: from EX13MTAUEB001.ant.amazon.com (10.43.60.96) by EX13D02EUB004.ant.amazon.com (10.43.166.221) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 4 Dec 2018 12:05:20 +0000 Received: from galpress-VirtualBox.hfa16.amazon.com (10.218.62.26) by mail-relay.amazon.com (10.43.60.129) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 4 Dec 2018 12:05:18 +0000 From: Gal Pressman To: Doug Ledford , Jason Gunthorpe CC: Alexander Matushevsky , Yossi Leybovich , , Tom Tucker , Gal Pressman Subject: [PATCH rdma-next 13/13] RDMA/efa: Add driver to Kconfig/Makefile Date: Tue, 4 Dec 2018 14:04:29 +0200 Message-ID: <1543925069-8838-14-git-send-email-galpress@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1543925069-8838-1-git-send-email-galpress@amazon.com> References: <1543925069-8838-1-git-send-email-galpress@amazon.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Add EFA Makefile and Kconfig. 
Signed-off-by: Gal Pressman --- MAINTAINERS | 8 ++++++++ drivers/infiniband/Kconfig | 2 ++ drivers/infiniband/hw/Makefile | 1 + drivers/infiniband/hw/efa/Kconfig | 14 ++++++++++++++ drivers/infiniband/hw/efa/Makefile | 8 ++++++++ 5 files changed, 33 insertions(+) create mode 100644 drivers/infiniband/hw/efa/Kconfig create mode 100644 drivers/infiniband/hw/efa/Makefile diff --git a/MAINTAINERS b/MAINTAINERS index f4855974f325..65755d2a404f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -742,6 +742,14 @@ S: Supported F: Documentation/networking/ena.txt F: drivers/net/ethernet/amazon/ +AMAZON RDMA EFA DRIVER +M: Gal Pressman +R: Yossi Leybovich +L: linux-rdma@vger.kernel.org +Q: https://patchwork.kernel.org/project/linux-rdma/list/ +S: Supported +F: drivers/infiniband/hw/efa/ + AMD CRYPTOGRAPHIC COPROCESSOR (CCP) DRIVER M: Tom Lendacky M: Gary Hook diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig index 0a3ec7c726ec..ddfeb94f67a9 100644 --- a/drivers/infiniband/Kconfig +++ b/drivers/infiniband/Kconfig @@ -120,4 +120,6 @@ source "drivers/infiniband/hw/qedr/Kconfig" source "drivers/infiniband/hw/bnxt_re/Kconfig" +source "drivers/infiniband/hw/efa/Kconfig" + endif # INFINIBAND diff --git a/drivers/infiniband/hw/Makefile b/drivers/infiniband/hw/Makefile index e4f31c1be8f7..06ad3c26fc94 100644 --- a/drivers/infiniband/hw/Makefile +++ b/drivers/infiniband/hw/Makefile @@ -14,3 +14,4 @@ obj-$(CONFIG_INFINIBAND_HFI1) += hfi1/ obj-$(CONFIG_INFINIBAND_HNS) += hns/ obj-$(CONFIG_INFINIBAND_QEDR) += qedr/ obj-$(CONFIG_INFINIBAND_BNXT_RE) += bnxt_re/ +obj-$(CONFIG_INFINIBAND_EFA) += efa/ diff --git a/drivers/infiniband/hw/efa/Kconfig b/drivers/infiniband/hw/efa/Kconfig new file mode 100644 index 000000000000..3d6489e05be4 --- /dev/null +++ b/drivers/infiniband/hw/efa/Kconfig @@ -0,0 +1,14 @@ +# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +# +# Amazon fabric device configuration +# + +config INFINIBAND_EFA + tristate "Amazon Elastic Fabric Adapter (EFA) support" + depends on PCI_MSI && 64BIT && !CPU_BIG_ENDIAN + depends on INFINIBAND_USER_ACCESS + help + This driver supports Amazon Elastic Fabric Adapter (EFA). + + To compile this driver as a module, choose M here. + The module will be called efa. diff --git a/drivers/infiniband/hw/efa/Makefile b/drivers/infiniband/hw/efa/Makefile new file mode 100644 index 000000000000..002979977d9b --- /dev/null +++ b/drivers/infiniband/hw/efa/Makefile @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause +# +# Makefile for Amazon Elastic Fabric Adapter (EFA) device drivers. +# + +obj-$(CONFIG_INFINIBAND_EFA) += efa.o + +efa-y := efa_bitmap.o efa_com_cmd.o efa_com.o efa_main.o efa_verbs.o