From patchwork Mon Mar 18 12:24:19 2019
X-Patchwork-Submitter: Yishai Hadas
X-Patchwork-Id: 10857547
From: Yishai Hadas <yishaih@mellanox.com>
To: linux-rdma@vger.kernel.org
Cc: yishaih@mellanox.com, guyle@mellanox.com, Alexr@mellanox.com, jgg@mellanox.com, majd@mellanox.com
Subject: [PATCH rdma-core 6/6] mlx5: Introduce a new send API in direct verbs
Date: Mon, 18 Mar 2019 14:24:19 +0200
Message-Id: <1552911859-4073-7-git-send-email-yishaih@mellanox.com>
In-Reply-To: <1552911859-4073-1-git-send-email-yishaih@mellanox.com>
References: <1552911859-4073-1-git-send-email-yishaih@mellanox.com>
X-Mailing-List: linux-rdma@vger.kernel.org

From: Guy Levi <guyle@mellanox.com>

A new send work request API was introduced by libibverbs. This patch adds an mlx5-specific extension to that generic API, exposed through direct verbs (DV). By calling mlx5dv_create_qp() with the generic send_ops_flags attribute set, mlx5-specific send work features can be used. A new struct, mlx5dv_qp_ex, gives access to the mlx5-specific send work micro-functions.

Until now, the driver allowed creating a QP with the DC transport type but provided no data path for it, so users had to implement the data path themselves. This patch introduces DC support over the new DV post-send API, giving a complete DC data path (post a send WR, post a receive WR, poll a WC).
Signed-off-by: Guy Levi Signed-off-by: Yishai Hadas --- debian/ibverbs-providers.symbols | 2 + providers/mlx5/CMakeLists.txt | 2 +- providers/mlx5/libmlx5.map | 6 ++ providers/mlx5/man/CMakeLists.txt | 3 + providers/mlx5/man/mlx5dv_create_qp.3.md | 5 ++ providers/mlx5/man/mlx5dv_wr_post.3.md | 94 ++++++++++++++++++++++++++ providers/mlx5/mlx5.h | 9 ++- providers/mlx5/mlx5dv.h | 20 ++++++ providers/mlx5/qp.c | 112 ++++++++++++++++++++++--------- providers/mlx5/verbs.c | 17 ++++- 10 files changed, 236 insertions(+), 34 deletions(-) create mode 100644 providers/mlx5/man/mlx5dv_wr_post.3.md diff --git a/debian/ibverbs-providers.symbols b/debian/ibverbs-providers.symbols index 9be0a94..309bbef 100644 --- a/debian/ibverbs-providers.symbols +++ b/debian/ibverbs-providers.symbols @@ -17,6 +17,7 @@ libmlx5.so.1 ibverbs-providers #MINVER# MLX5_1.7@MLX5_1.7 21 MLX5_1.8@MLX5_1.8 22 MLX5_1.9@MLX5_1.9 23 + MLX5_1.10@MLX5_1.10 24 mlx5dv_init_obj@MLX5_1.0 13 mlx5dv_init_obj@MLX5_1.2 15 mlx5dv_query_device@MLX5_1.0 13 @@ -57,3 +58,4 @@ libmlx5.so.1 ibverbs-providers #MINVER# mlx5dv_devx_destroy_cmd_comp@MLX5_1.9 23 mlx5dv_devx_get_async_cmd_comp@MLX5_1.9 23 mlx5dv_devx_obj_query_async@MLX5_1.9 23 + mlx5dv_qp_ex_from_ibv_qp_ex@MLX5_1.10 24 diff --git a/providers/mlx5/CMakeLists.txt b/providers/mlx5/CMakeLists.txt index d629c58..88b1246 100644 --- a/providers/mlx5/CMakeLists.txt +++ b/providers/mlx5/CMakeLists.txt @@ -11,7 +11,7 @@ if (MLX5_MW_DEBUG) endif() rdma_shared_provider(mlx5 libmlx5.map - 1 1.9.${PACKAGE_VERSION} + 1 1.10.${PACKAGE_VERSION} buf.c cq.c dbrec.c diff --git a/providers/mlx5/libmlx5.map b/providers/mlx5/libmlx5.map index be99767..28c8616 100644 --- a/providers/mlx5/libmlx5.map +++ b/providers/mlx5/libmlx5.map @@ -79,4 +79,10 @@ MLX5_1.9 { mlx5dv_devx_destroy_cmd_comp; mlx5dv_devx_get_async_cmd_comp; mlx5dv_devx_obj_query_async; + mlx5dv_qp_ex_from_ibv_qp_ex; } MLX5_1.8; + +MLX5_1.10 { + global: + mlx5dv_qp_ex_from_ibv_qp_ex; +} MLX5_1.9; diff --git 
a/providers/mlx5/man/CMakeLists.txt b/providers/mlx5/man/CMakeLists.txt index d8d42c3..24bd5d8 100644 --- a/providers/mlx5/man/CMakeLists.txt +++ b/providers/mlx5/man/CMakeLists.txt @@ -18,6 +18,7 @@ rdma_man_pages( mlx5dv_open_device.3.md mlx5dv_query_device.3 mlx5dv_ts_to_ns.3 + mlx5dv_wr_post.3.md mlx5dv.7 ) rdma_alias_man_pages( @@ -39,4 +40,6 @@ rdma_alias_man_pages( mlx5dv_devx_qp_modify.3 mlx5dv_devx_ind_tbl_modify.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_ind_tbl_query.3 mlx5dv_devx_umem_reg.3 mlx5dv_devx_umem_dereg.3 + mlx5dv_wr_post.3 mlx5dv_wr_set_dc_addr.3 + mlx5dv_wr_post.3 mlx5dv_qp_ex_from_ibv_qp_ex.3 ) diff --git a/providers/mlx5/man/mlx5dv_create_qp.3.md b/providers/mlx5/man/mlx5dv_create_qp.3.md index c21b527..7a93e84 100644 --- a/providers/mlx5/man/mlx5dv_create_qp.3.md +++ b/providers/mlx5/man/mlx5dv_create_qp.3.md @@ -95,6 +95,11 @@ struct mlx5dv_dc_init_attr { : used to create a DCT QP. +# NOTES + +**mlx5dv_qp_ex_from_ibv_qp_ex()** is used to get *struct mlx5dv_qp_ex* for +accessing the send ops interfaces when IBV_QP_INIT_ATTR_SEND_OPS_FLAGS is used. + # RETURN VALUE **mlx5dv_create_qp()** diff --git a/providers/mlx5/man/mlx5dv_wr_post.3.md b/providers/mlx5/man/mlx5dv_wr_post.3.md new file mode 100644 index 0000000..2c17627 --- /dev/null +++ b/providers/mlx5/man/mlx5dv_wr_post.3.md @@ -0,0 +1,94 @@ +--- +date: 2019-02-24 +footer: mlx5 +header: "mlx5 Programmer's Manual" +tagline: Verbs +layout: page +license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' +section: 3 +title: MLX5DV_WR +--- + +# NAME + +mlx5dv_wr_set_dc_addr - Attach DC info to the last work request + +# SYNOPSIS + +```c +#include <infiniband/mlx5dv.h> + +static inline void mlx5dv_wr_set_dc_addr(struct mlx5dv_qp_ex *mqp, + struct ibv_ah *ah, + uint32_t remote_dctn, + uint64_t remote_dc_key); +``` + +# DESCRIPTION + +The MLX5DV work request APIs (mlx5dv_wr_\*) are an extension of the IBV work +request API (ibv_wr_\*), adding mlx5-specific features to send work requests. 
+These may be used together with ibv_wr_* calls, or on their own. + +# USAGE + +To use these APIs, a QP must be created using mlx5dv_create_qp() with +*send_ops_flags* of struct ibv_qp_init_attr_ex set. + +If the QP does not support all the requested work request types, then QP +creation will fail. + +The mlx5dv_qp_ex is extracted from the ibv_qp by ibv_qp_to_qp_ex() followed by +mlx5dv_qp_ex_from_ibv_qp_ex(). It should be used to apply the mlx5-specific +features to the posted WR. + +Creating a work request requires using the ibv_qp_ex, as described in the +ibv_wr_post man page, together with the mlx5dv_qp_ex and its available +builders and setters. + +## QP Specific setters + +*DCI* QPs +: *mlx5dv_wr_set_dc_addr()* must be called to set the DCI WR properties. The + destination address of the work is specified by *ah*, the remote DCT + number by *remote_dctn* and the DC key by *remote_dc_key*. + This setter is available when the QP transport is DCI and send_ops_flags + in struct ibv_qp_init_attr_ex is set. + The available builders and setters for a DCI QP are the same as for an RC QP. 
+ +# EXAMPLE + +```c +/* create DC QP type and specify the required send opcodes */ +attr_ex.qp_type = IBV_QPT_DRIVER; +attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS; +attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE; + +attr_dv.comp_mask |= MLX5DV_QP_INIT_ATTR_MASK_DC; +attr_dv.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCI; + +struct ibv_qp *qp = mlx5dv_create_qp(ctx, &attr_ex, &attr_dv); +struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp); +struct mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx); + +ibv_wr_start(qpx); + +/* Use ibv_qp_ex object to set WR generic attributes */ +qpx->wr_id = my_wr_id_1; +qpx->wr_flags = IBV_SEND_SIGNALED; +ibv_wr_rdma_write(qpx, rkey, remote_addr_1); +ibv_wr_set_sge(qpx, lkey, local_addr_1, length_1); + +/* Use the mlx5 DC setter via the mlx5dv_qp_ex object */ +mlx5dv_wr_set_dc_addr(mqpx, ah, remote_dctn, remote_dc_key); + +ret = ibv_wr_complete(qpx); +``` + +# SEE ALSO + +**ibv_post_send**(3), **ibv_create_qp_ex**(3), **ibv_wr_post**(3) + +# AUTHOR + +Guy Levi diff --git a/providers/mlx5/mlx5.h b/providers/mlx5/mlx5.h index 3a22fde..c7c54fd 100644 --- a/providers/mlx5/mlx5.h +++ b/providers/mlx5/mlx5.h @@ -503,6 +503,7 @@ enum mlx5_qp_flags { struct mlx5_qp { struct mlx5_resource rsc; /* This struct must be first */ struct verbs_qp verbs_qp; + struct mlx5dv_qp_ex dv_qp; struct ibv_qp *ibv_qp; struct mlx5_buf buf; int max_inline_data; @@ -690,6 +691,11 @@ static inline struct mlx5_qp *to_mqp(struct ibv_qp *ibqp) return container_of(vqp, struct mlx5_qp, verbs_qp); } +static inline struct mlx5_qp *mqp_from_mlx5dv_qp_ex(struct mlx5dv_qp_ex *dv_qp) +{ + return container_of(dv_qp, struct mlx5_qp, dv_qp); +} + static inline struct mlx5_rwq *to_mrwq(struct ibv_wq *ibwq) { return container_of(ibwq, struct mlx5_rwq, wq); @@ -930,7 +936,8 @@ int mlx5_advise_mr(struct ibv_pd *pd, struct ibv_sge *sg_list, uint32_t num_sges); int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, - const struct ibv_qp_init_attr_ex *attr); + const struct ibv_qp_init_attr_ex *attr, + const 
struct mlx5dv_qp_init_attr *mlx5_attr); static inline void *mlx5_find_uidx(struct mlx5_context *ctx, uint32_t uidx) { diff --git a/providers/mlx5/mlx5dv.h b/providers/mlx5/mlx5dv.h index e2788d8..de4018c 100644 --- a/providers/mlx5/mlx5dv.h +++ b/providers/mlx5/mlx5dv.h @@ -193,6 +193,26 @@ struct ibv_qp *mlx5dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr); +struct mlx5dv_qp_ex { + uint64_t comp_mask; + /* + * Available just for the MLX5 DC QP type with send opcodes of type: + * rdma, atomic and send. + */ + void (*wr_set_dc_addr)(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, + uint32_t remote_dctn, uint64_t remote_dc_key); +}; + +struct mlx5dv_qp_ex *mlx5dv_qp_ex_from_ibv_qp_ex(struct ibv_qp_ex *qp); + +static inline void mlx5dv_wr_set_dc_addr(struct mlx5dv_qp_ex *mqp, + struct ibv_ah *ah, + uint32_t remote_dctn, + uint64_t remote_dc_key) +{ + mqp->wr_set_dc_addr(mqp, ah, remote_dctn, remote_dc_key); +} + enum mlx5dv_flow_action_esp_mask { MLX5DV_FLOW_ACTION_ESP_MASK_FLAGS = 1 << 0, }; diff --git a/providers/mlx5/qp.c b/providers/mlx5/qp.c index f3bce40..b2f749c 100644 --- a/providers/mlx5/qp.c +++ b/providers/mlx5/qp.c @@ -1168,7 +1168,7 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, } enum { - WQE_REQ_SETTERS_UD_XRC = 2, + WQE_REQ_SETTERS_UD_XRC_DC = 2, }; static void mlx5_send_wr_start(struct ibv_qp_ex *ibqp) @@ -1296,14 +1296,15 @@ static inline void _mlx5_send_wr_send(struct ibv_qp_ex *ibqp, _common_wqe_init(ibqp, ib_op); - if (ibqp->qp_base.qp_type == IBV_QPT_UD) + if (ibqp->qp_base.qp_type == IBV_QPT_UD || + ibqp->qp_base.qp_type == IBV_QPT_DRIVER) transport_seg_sz = sizeof(struct mlx5_wqe_datagram_seg); else if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) transport_seg_sz = sizeof(struct mlx5_wqe_xrc_seg); mqp->cur_data = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz; - /* In UD, cur_data may overrun the SQ */ + /* In UD/DC cur_data may 
overrun the SQ */ if (unlikely(mqp->cur_data == mqp->sq.qend)) mqp->cur_data = mlx5_get_send_wqe(mqp, 0); @@ -1435,11 +1436,16 @@ static inline void _mlx5_send_wr_rdma(struct ibv_qp_ex *ibqp, _common_wqe_init(ibqp, ib_op); - if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) + if (ibqp->qp_base.qp_type == IBV_QPT_DRIVER) + transport_seg_sz = sizeof(struct mlx5_wqe_datagram_seg); + else if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) transport_seg_sz = sizeof(struct mlx5_wqe_xrc_seg); raddr_seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz; + /* In DC raddr_seg may overrun the SQ */ + if (unlikely(raddr_seg == mqp->sq.qend)) + raddr_seg = mlx5_get_send_wqe(mqp, 0); set_raddr_seg(raddr_seg, remote_addr, rkey); @@ -1490,11 +1496,16 @@ static inline void _mlx5_send_wr_atomic(struct ibv_qp_ex *ibqp, uint32_t rkey, _common_wqe_init(ibqp, ib_op); - if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) + if (ibqp->qp_base.qp_type == IBV_QPT_DRIVER) + transport_seg_sz = sizeof(struct mlx5_wqe_datagram_seg); + else if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) transport_seg_sz = sizeof(struct mlx5_wqe_xrc_seg); raddr_seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz; + /* In DC raddr_seg may overrun the SQ */ + if (unlikely(raddr_seg == mqp->sq.qend)) + raddr_seg = mlx5_get_send_wqe(mqp, 0); set_raddr_seg(raddr_seg, remote_addr, rkey); @@ -1608,14 +1619,14 @@ mlx5_send_wr_set_sge_rc_uc(struct ibv_qp_ex *ibqp, uint32_t lkey, } static void -mlx5_send_wr_set_sge_ud_xrc(struct ibv_qp_ex *ibqp, uint32_t lkey, - uint64_t addr, uint32_t length) +mlx5_send_wr_set_sge_ud_xrc_dc(struct ibv_qp_ex *ibqp, uint32_t lkey, + uint64_t addr, uint32_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_sge(mqp, lkey, addr, length); - if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC - 1) + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finilize(mqp); else mqp->cur_setters_cnt++; @@ 
-1696,14 +1707,14 @@ mlx5_send_wr_set_sge_list_rc_uc(struct ibv_qp_ex *ibqp, size_t num_sge, } static void -mlx5_send_wr_set_sge_list_ud_xrc(struct ibv_qp_ex *ibqp, size_t num_sge, - const struct ibv_sge *sg_list) +mlx5_send_wr_set_sge_list_ud_xrc_dc(struct ibv_qp_ex *ibqp, size_t num_sge, + const struct ibv_sge *sg_list) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_sge_list(mqp, num_sge, sg_list); - if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC - 1) + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finilize(mqp); else mqp->cur_setters_cnt++; @@ -1833,14 +1844,14 @@ mlx5_send_wr_set_inline_data_rc_uc(struct ibv_qp_ex *ibqp, void *addr, } static void -mlx5_send_wr_set_inline_data_ud_xrc(struct ibv_qp_ex *ibqp, void *addr, - size_t length) +mlx5_send_wr_set_inline_data_ud_xrc_dc(struct ibv_qp_ex *ibqp, void *addr, + size_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_inline_data(mqp, addr, length); - if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC - 1) + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finilize(mqp); else mqp->cur_setters_cnt++; @@ -1927,15 +1938,15 @@ mlx5_send_wr_set_inline_data_list_rc_uc(struct ibv_qp_ex *ibqp, } static void -mlx5_send_wr_set_inline_data_list_ud_xrc(struct ibv_qp_ex *ibqp, - size_t num_buf, - const struct ibv_data_buf *buf_list) +mlx5_send_wr_set_inline_data_list_ud_xrc_dc(struct ibv_qp_ex *ibqp, + size_t num_buf, + const struct ibv_data_buf *buf_list) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_inline_data_list(mqp, num_buf, buf_list); - if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC - 1) + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finilize(mqp); else mqp->cur_setters_cnt++; @@ -2012,7 +2023,7 @@ mlx5_send_wr_set_ud_addr(struct ibv_qp_ex *ibqp, struct ibv_ah *ah, _set_datagram_seg(dseg, &mah->av, remote_qpn, remote_qkey); - if (mqp->cur_setters_cnt == 
WQE_REQ_SETTERS_UD_XRC - 1) + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finilize(mqp); else mqp->cur_setters_cnt++; @@ -2027,7 +2038,27 @@ mlx5_send_wr_set_xrc_srqn(struct ibv_qp_ex *ibqp, uint32_t remote_srqn) xrc_seg->xrc_srqn = htobe32(remote_srqn); - if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC - 1) + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) + _common_wqe_finilize(mqp); + else + mqp->cur_setters_cnt++; +} + +static void mlx5_send_wr_set_dc_addr(struct mlx5dv_qp_ex *dv_qp, + struct ibv_ah *ah, + uint32_t remote_dctn, + uint64_t remote_dc_key) +{ + struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); + struct mlx5_wqe_datagram_seg *dseg = + (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); + struct mlx5_ah *mah = to_mah(ah); + + memcpy(&dseg->av, &mah->av, sizeof(dseg->av)); + dseg->av.dqp_dct |= htobe32(remote_dctn | MLX5_EXTENDED_UD_AV); + dseg->av.key.dc_key = htobe64(remote_dc_key); + + if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finilize(mqp); else mqp->cur_setters_cnt++; @@ -2047,6 +2078,8 @@ enum { IBV_QP_EX_WITH_BIND_MW, MLX5_SUPPORTED_SEND_OPS_FLAGS_XRC = MLX5_SUPPORTED_SEND_OPS_FLAGS_RC, + MLX5_SUPPORTED_SEND_OPS_FLAGS_DCI = + MLX5_SUPPORTED_SEND_OPS_FLAGS_RC, MLX5_SUPPORTED_SEND_OPS_FLAGS_UD = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM, @@ -2063,7 +2096,7 @@ enum { IBV_QP_EX_WITH_TSO, }; -static void fill_wr_builders_rc_xrc(struct ibv_qp_ex *ibqp) +static void fill_wr_builders_rc_xrc_dc(struct ibv_qp_ex *ibqp) { ibqp->wr_send = mlx5_send_wr_send_other; ibqp->wr_send_imm = mlx5_send_wr_send_imm; @@ -2108,12 +2141,12 @@ static void fill_wr_setters_rc_uc(struct ibv_qp_ex *ibqp) ibqp->wr_set_inline_data_list = mlx5_send_wr_set_inline_data_list_rc_uc; } -static void fill_wr_setters_ud_xrc(struct ibv_qp_ex *ibqp) +static void fill_wr_setters_ud_xrc_dc(struct ibv_qp_ex *ibqp) { - ibqp->wr_set_sge = mlx5_send_wr_set_sge_ud_xrc; - ibqp->wr_set_sge_list = 
mlx5_send_wr_set_sge_list_ud_xrc; - ibqp->wr_set_inline_data = mlx5_send_wr_set_inline_data_ud_xrc; - ibqp->wr_set_inline_data_list = mlx5_send_wr_set_inline_data_list_ud_xrc; + ibqp->wr_set_sge = mlx5_send_wr_set_sge_ud_xrc_dc; + ibqp->wr_set_sge_list = mlx5_send_wr_set_sge_list_ud_xrc_dc; + ibqp->wr_set_inline_data = mlx5_send_wr_set_inline_data_ud_xrc_dc; + ibqp->wr_set_inline_data_list = mlx5_send_wr_set_inline_data_list_ud_xrc_dc; } static void fill_wr_setters_eth(struct ibv_qp_ex *ibqp) @@ -2125,10 +2158,12 @@ static void fill_wr_setters_eth(struct ibv_qp_ex *ibqp) } int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, - const struct ibv_qp_init_attr_ex *attr) + const struct ibv_qp_init_attr_ex *attr, + const struct mlx5dv_qp_init_attr *mlx5_attr) { struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex; uint64_t ops = attr->send_ops_flags; + struct mlx5dv_qp_ex *dv_qp; ibqp->wr_start = mlx5_send_wr_start; ibqp->wr_complete = mlx5_send_wr_complete; @@ -2145,7 +2180,7 @@ int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_RC) return EOPNOTSUPP; - fill_wr_builders_rc_xrc(ibqp); + fill_wr_builders_rc_xrc_dc(ibqp); fill_wr_setters_rc_uc(ibqp); break; @@ -2161,8 +2196,8 @@ int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_XRC) return EOPNOTSUPP; - fill_wr_builders_rc_xrc(ibqp); - fill_wr_setters_ud_xrc(ibqp); + fill_wr_builders_rc_xrc_dc(ibqp); + fill_wr_setters_ud_xrc_dc(ibqp); ibqp->wr_set_xrc_srqn = mlx5_send_wr_set_xrc_srqn; break; @@ -2174,7 +2209,7 @@ int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, return EOPNOTSUPP; fill_wr_builders_ud(ibqp); - fill_wr_setters_ud_xrc(ibqp); + fill_wr_setters_ud_xrc_dc(ibqp); ibqp->wr_set_ud_addr = mlx5_send_wr_set_ud_addr; break; @@ -2186,6 +2221,21 @@ int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, fill_wr_setters_eth(ibqp); break; + case IBV_QPT_DRIVER: + dv_qp = &mqp->dv_qp; + + if (!(mlx5_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC && + 
mlx5_attr->dc_init_attr.dc_type == MLX5DV_DCTYPE_DCI)) + return EOPNOTSUPP; + + if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_DCI) + return EOPNOTSUPP; + + fill_wr_builders_rc_xrc_dc(ibqp); + fill_wr_setters_ud_xrc_dc(ibqp); + dv_qp->wr_set_dc_addr = mlx5_send_wr_set_dc_addr; + break; + default: return EOPNOTSUPP; } diff --git a/providers/mlx5/verbs.c b/providers/mlx5/verbs.c index 870279e..abbbf5a 100644 --- a/providers/mlx5/verbs.c +++ b/providers/mlx5/verbs.c @@ -1913,7 +1913,17 @@ static struct ibv_qp *create_qp(struct ibv_context *context, qp->atomics_enabled = 1; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { - ret = mlx5_qp_fill_wr_pfns(qp, attr); + /* + * Scatter2cqe, which is a data-path optimization, is disabled + * since driver DC data-path doesn't support it. + */ + if (mlx5_qp_attr && + mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC) { + mlx5_create_flags &= ~MLX5_QP_FLAG_SCATTER_CQE; + scatter_to_cqe_configured = true; + } + + ret = mlx5_qp_fill_wr_pfns(qp, attr, mlx5_qp_attr); if (ret) { errno = ret; mlx5_dbg(fp, MLX5_DBG_QP, "Failed to handle operations flags (errno %d)\n", errno); @@ -2589,6 +2599,11 @@ struct ibv_qp *mlx5dv_create_qp(struct ibv_context *context, return create_qp(context, qp_attr, mlx5_qp_attr); } +struct mlx5dv_qp_ex *mlx5dv_qp_ex_from_ibv_qp_ex(struct ibv_qp_ex *qp) +{ + return &(container_of(qp, struct mlx5_qp, verbs_qp.qp_ex))->dv_qp; +} + int mlx5_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num) { struct mlx5_srq *msrq = to_msrq(srq);