From patchwork Tue Aug 3 23:19:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A232CC4320E for ; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 834DC60F94 for ; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233096AbhHCXU3 (ORCPT ); Tue, 3 Aug 2021 19:20:29 -0400 Received: from mail.kernel.org ([198.145.29.99]:38930 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232530AbhHCXU3 (ORCPT ); Tue, 3 Aug 2021 19:20:29 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id C8CDD60F70; Tue, 3 Aug 2021 23:20:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032818; bh=tWDjMxjmGW41hmG5vH5C9xA8Y17SgNmUJici0G0fYnw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hsOoK2PVGo0GDze7vHbPrA/XQfZDgPxFMdUOdDk6609Sb6XacWk/CEnM6YVBuKIqe /Uxl/dQrDQxMTBpocpl0gJk+LPG9t5KYj9NBr07Txb9PcAJ6J8EJQc+HRgJM1jFMqS J+9NxGqHqJAOVB3O+RH/jzyuuMdbovJ9NS+If6OErbJc/Rlmm14XKNapXUYQnNhR2t wD+d/1AM+g/jatA38O4Ne0UbpOh8Nu1qK1xw6NXsHF/vyeWCbja0GKX2KUXK+hDPeW RvbgjrMDQvyn57HHJutyEW4K+37T7VTFrBw6b9erIYPPAXWomuvb9m6H+FG/sTifEv iUrmuSSVes35A== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 01/14] net/mlx5: Return mdev from eswitch Date: Tue, 3 Aug 2021 16:19:46 -0700 Message-Id: <20210803231959.26513-2-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch Export a function so users can retrieve the mellanox device that manages the eswitch from the eswitch device. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 12 ++++++++++++ include/linux/mlx5/eswitch.h | 6 ++++++ 2 files changed, 18 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 97e6cb6f13c1..b65a472067d2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -2384,3 +2384,15 @@ u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev) return mlx5_esw_allowed(esw) ? esw->total_vports : 0; } EXPORT_SYMBOL_GPL(mlx5_eswitch_get_total_vports); + +/** + * mlx5_eswitch_get_core_dev - Get the mdev device + * @esw : eswitch device. + * + * Return the mellanox core device which manages the eswitch. + */ +struct mlx5_core_dev *mlx5_eswitch_get_core_dev(struct mlx5_eswitch *esw) +{ + return mlx5_esw_allowed(esw) ? 
esw->dev : NULL; +} +EXPORT_SYMBOL(mlx5_eswitch_get_core_dev); diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h index bc7db2e059eb..c2a34ff85188 100644 --- a/include/linux/mlx5/eswitch.h +++ b/include/linux/mlx5/eswitch.h @@ -128,6 +128,7 @@ u32 mlx5_eswitch_get_vport_metadata_for_set(struct mlx5_eswitch *esw, u8 mlx5_eswitch_mode(struct mlx5_core_dev *dev); u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev); +struct mlx5_core_dev *mlx5_eswitch_get_core_dev(struct mlx5_eswitch *esw); #else /* CONFIG_MLX5_ESWITCH */ @@ -171,6 +172,11 @@ static inline u16 mlx5_eswitch_get_total_vports(const struct mlx5_core_dev *dev) return 0; } +static inline struct mlx5_core_dev *mlx5_eswitch_get_core_dev(struct mlx5_eswitch *esw) +{ + return NULL; +} + #endif /* CONFIG_MLX5_ESWITCH */ static inline bool is_mdev_switchdev_mode(struct mlx5_core_dev *dev) From patchwork Tue Aug 3 23:19:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB9F5C43214 for ; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9B37760F58 for ; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232530AbhHCXUa (ORCPT ); Tue, 3 Aug 2021 19:20:30 -0400 Received: from mail.kernel.org ([198.145.29.99]:38934 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233205AbhHCXUa (ORCPT ); Tue, 3 Aug 2021 19:20:30 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 447D660FA0; Tue, 3 Aug 2021 23:20:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032818; bh=Qktmr9TWKmVNw8s/sQBxRP1Jh/I+VL/hnnYwlQ34WnY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pf4EHnVdsacIMfE4ibILfczso4Eeu4L79LLCnmFVVoQpwWOOUU8hzvIaOg5XTkhC/ O6Xxr0u+LgeMCOZR1JwL0QMGC7BcnQl34w92BtrPLnJ1TmTeKrctiNN7CtLvAEd5kM WlCuCr8RT9RW9smhpkjWgbdMzzDmYtgXECcovKxbtvtZP3s9NtdxPbGRlZd8X/Ks0e 0vhZVJd7P9uV9mpNJUN/KzSZnQ9bhBADjRtULQWie+WjEMCLaAKXwRD8qt9dCjo3xM VF+t0GP7LhH8+LThTw/R6dTMA+IJFLuZRYjve9MODmzED1ohlUueBnXps4yEs/sUtG 0pwycsco1B6Jw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 02/14] net/mlx5: Lag, add initial logic for shared FDB Date: Tue, 3 Aug 2021 16:19:47 -0700 Message-Id: <20210803231959.26513-3-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch As shared FDB requires changes in two subsystems first expose the needed core functions so the RDMA side can be changed. mlx5_lag_is_master(): return true if a given mlx5 device is the lag master. 
mlx5_lag_is_shared_fdb(): Returns true if the lag mode is shared FDB. mlx5_lag_get_peer_mdev(): Return the peer mdev in lag. The mentioned functions will be used by downstream patches in order to add support for shared FDB for the RDMA side. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 49 +++++++++++++++++++ drivers/net/ethernet/mellanox/mlx5/core/lag.h | 1 + include/linux/mlx5/driver.h | 3 ++ 3 files changed, 53 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 5c043c5cc403..3049de648256 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -746,6 +746,21 @@ bool mlx5_lag_is_active(struct mlx5_core_dev *dev) } EXPORT_SYMBOL(mlx5_lag_is_active); +bool mlx5_lag_is_master(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; + bool res; + + spin_lock(&lag_lock); + ldev = mlx5_lag_dev(dev); + res = ldev && __mlx5_lag_is_active(ldev) && + dev == ldev->pf[MLX5_LAG_P1].dev; + spin_unlock(&lag_lock); + + return res; +} +EXPORT_SYMBOL(mlx5_lag_is_master); + bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev; @@ -760,6 +775,20 @@ bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev) } EXPORT_SYMBOL(mlx5_lag_is_sriov); +bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; + bool res; + + spin_lock(&lag_lock); + ldev = mlx5_lag_dev(dev); + res = ldev && __mlx5_lag_is_sriov(ldev) && ldev->shared_fdb; + spin_unlock(&lag_lock); + + return res; +} +EXPORT_SYMBOL(mlx5_lag_is_shared_fdb); + void mlx5_lag_update(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev; @@ -827,6 +856,26 @@ u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev, } EXPORT_SYMBOL(mlx5_lag_get_slave_port); +struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev) +{ + struct mlx5_core_dev *peer_dev = NULL; + struct mlx5_lag *ldev; + + spin_lock(&lag_lock); + ldev = mlx5_lag_dev(dev); + if (!ldev) + goto unlock; + + peer_dev = ldev->pf[MLX5_LAG_P1].dev == dev ? 
+ ldev->pf[MLX5_LAG_P2].dev : + ldev->pf[MLX5_LAG_P1].dev; + +unlock: + spin_unlock(&lag_lock); + return peer_dev; +} +EXPORT_SYMBOL(mlx5_lag_get_peer_mdev); + int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev, u64 *values, int num_counters, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index 191392c37558..70b244b1a09e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -39,6 +39,7 @@ struct lag_tracker { */ struct mlx5_lag { u8 flags; + bool shared_fdb; u8 v2p_map[MLX5_MAX_PORTS]; struct kref ref; struct lag_func pf[MLX5_MAX_PORTS]; diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h index 1efe37466969..af4dd6e9f97f 100644 --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1138,6 +1138,8 @@ bool mlx5_lag_is_roce(struct mlx5_core_dev *dev); bool mlx5_lag_is_sriov(struct mlx5_core_dev *dev); bool mlx5_lag_is_multipath(struct mlx5_core_dev *dev); bool mlx5_lag_is_active(struct mlx5_core_dev *dev); +bool mlx5_lag_is_master(struct mlx5_core_dev *dev); +bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev); struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev); u8 mlx5_lag_get_slave_port(struct mlx5_core_dev *dev, struct net_device *slave); @@ -1145,6 +1147,7 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev, u64 *values, int num_counters, size_t *offsets); +struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev); struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev); void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up); int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type, From patchwork Tue Aug 3 23:19:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27F3AC432BE for ; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0AB1F61037 for ; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233430AbhHCXUc (ORCPT ); Tue, 3 Aug 2021 19:20:32 -0400 Received: from mail.kernel.org ([198.145.29.99]:38940 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233297AbhHCXUa (ORCPT ); Tue, 3 Aug 2021 19:20:30 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id B787D60F94; Tue, 3 Aug 2021 23:20:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032819; bh=Eyax6avRleEjEtCaI7hHY+JEdQxmSBAv/5kk0G0HAZE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RwqN/2Gi+Qws5up0hgcenvrscmPvifdYiu/IH7ZuUDwawT+8iW7IA0Y1sfzF8LZX/ PVk8ldm3fEJPkL7h+u3il98CEydzCSR+mOcEdNXE70MxZlSMF+siK+k550ScGC3jhA WT5VwEFy68iR8L7H/VGwaBiLyRe3jarrzXm/fdCoTWReONyU2qUbGIF9eBVodkfcvR 
8N+4CNko2BPlZbIqcyE2SRppC225kB83M6KGY9GrpPjH0DDOxyVMxRk2pCJzmj4hgt /uiEPnFvjyPVBYvPh2hfA9z+xYHG6wJxIh/5f7/F05zYnwC4Q6WWjsdhTk8ASIfvJe FYlwJvcQUtSlQ== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 03/14] RDMA/mlx5: Fill port info based on the relevant eswitch Date: Tue, 3 Aug 2021 16:19:48 -0700 Message-Id: <20210803231959.26513-4-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch In shared FDB a single RDMA device can have representors that are connected to two different eswitches. Use the right eswitch when preparing the response to userspace. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/infiniband/hw/mlx5/std_types.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/std_types.c b/drivers/infiniband/hw/mlx5/std_types.c index c0ddf7b3c6e2..bbfcce3bdc84 100644 --- a/drivers/infiniband/hw/mlx5/std_types.c +++ b/drivers/infiniband/hw/mlx5/std_types.c @@ -114,14 +114,18 @@ static int fill_vport_vhca_id(struct mlx5_core_dev *mdev, u16 vport, static int fill_switchdev_info(struct mlx5_ib_dev *dev, u32 port_num, struct mlx5_ib_uapi_query_port *info) { - struct mlx5_core_dev *mdev = dev->mdev; struct mlx5_eswitch_rep *rep; + struct mlx5_core_dev *mdev; int err; rep = dev->port[port_num - 1].rep; if (!rep) return -EOPNOTSUPP; + mdev = mlx5_eswitch_get_core_dev(rep->esw); + if (!mdev) + return -EINVAL; + info->vport = rep->vport; info->flags |= MLX5_IB_UAPI_QUERY_PORT_VPORT; @@ -138,9 +142,9 @@ static int fill_switchdev_info(struct mlx5_ib_dev *dev, u32 port_num, if (err) return err; - if (mlx5_eswitch_vport_match_metadata_enabled(mdev->priv.eswitch)) { + if (mlx5_eswitch_vport_match_metadata_enabled(rep->esw)) { info->reg_c0.value = mlx5_eswitch_get_vport_metadata_for_match( - mdev->priv.eswitch, rep->vport); + rep->esw, rep->vport); info->reg_c0.mask = mlx5_eswitch_get_vport_metadata_mask(); info->flags |= MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0; } From patchwork Tue Aug 3 23:19:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417619 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A173C43216 for ; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 34EF461037 for ; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232047AbhHCXUc (ORCPT ); Tue, 3 Aug 2021 19:20:32 -0400 Received: from mail.kernel.org ([198.145.29.99]:38950 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233362AbhHCXUb (ORCPT ); Tue, 3 Aug 2021 19:20:31 -0400 Received: by 
mail.kernel.org (Postfix) with ESMTPSA id 31ABC60F9C; Tue, 3 Aug 2021 23:20:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032819; bh=V5VFVrJWUkXo8/cOXZnkCjvTJpHkG/7Y51NWhGHWV0c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Y3yvlFpQsrpwgSNYYBMJhEqsuaaVHjjFSSd0KM6wRTK4L7vnNA+rhq7XyndnIeZs4 v24zTCDjOAVAJMB/tXzeZmCLqMD0O0MJRVugGV0+aOMAHJuwJtAhhrtizpZ7Hx1iPO EYH6Lj2FjCK09cTg/CqywJI3G/nC1mtYBo9GFewqewoxExMydHX5890VWHrEkkpESC cGNqXnn1Vvo9Ryw4YN6hrZ4mOFw8+bSybb8m4PE6gNVRbCVVvmEZvkHp3N/9nY6g7Z c9KRmIQzQrF5Y9wJ0YQPtW0WZu09yJKH9FBgcxwFzqbG4RtE40j8swGDX/NAxy86N6 TaSDfRo513fQA== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 04/14] {net, RDMA}/mlx5: Extend send to vport rules Date: Tue, 3 Aug 2021 16:19:49 -0700 Message-Id: <20210803231959.26513-5-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch In shared FDB there is only one eswitch which is active and it receives traffic from all representors and all vports in the HCA. While the Ethernet representor will always reside on its native PF the IB representor will not. Extend send to vport rule creation to support such flows. Need to account for source vport that sends the traffic (on which the representors resides) and the target eswitch the traffic which reach. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/infiniband/hw/mlx5/ib_rep.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 5 +++-- include/linux/mlx5/eswitch.h | 1 + 4 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c index b25e0b33a11a..bf5a6e4d1c03 100644 --- a/drivers/infiniband/hw/mlx5/ib_rep.c +++ b/drivers/infiniband/hw/mlx5/ib_rep.c @@ -123,7 +123,7 @@ struct mlx5_flow_handle *create_flow_rule_vport_sq(struct mlx5_ib_dev *dev, rep = dev->port[port - 1].rep; - return mlx5_eswitch_add_send_to_vport_rule(esw, rep, sq->base.mqp.qpn); + return mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep, sq->base.mqp.qpn); } static int mlx5r_rep_probe(struct auxiliary_device *adev, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index bf94bcb6fa5d..1d016cc64015 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -337,7 +337,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, } /* Add re-inject rule to the PF/representor sqs */ - flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw, rep, + flow_rule = mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep, sqns_array[i]); if (IS_ERR(flow_rule)) { err = PTR_ERR(flow_rule); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 7579f3402776..12567002997f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -925,6 +925,7 @@ int mlx5_eswitch_del_vlan_action(struct mlx5_eswitch *esw, struct mlx5_flow_handle * mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch 
*on_esw, + struct mlx5_eswitch *from_esw, struct mlx5_eswitch_rep *rep, u32 sqn) { @@ -943,10 +944,10 @@ mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters); MLX5_SET(fte_match_set_misc, misc, source_sqn, sqn); /* source vport is the esw manager */ - MLX5_SET(fte_match_set_misc, misc, source_port, rep->esw->manager_vport); + MLX5_SET(fte_match_set_misc, misc, source_port, from_esw->manager_vport); if (MLX5_CAP_ESW(on_esw->dev, merged_eswitch)) MLX5_SET(fte_match_set_misc, misc, source_eswitch_owner_vhca_id, - MLX5_CAP_GEN(rep->esw->dev, vhca_id)); + MLX5_CAP_GEN(from_esw->dev, vhca_id)); misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters); MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_sqn); diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h index c2a34ff85188..0bfcf7b8ecf9 100644 --- a/include/linux/mlx5/eswitch.h +++ b/include/linux/mlx5/eswitch.h @@ -63,6 +63,7 @@ struct mlx5_eswitch_rep *mlx5_eswitch_vport_rep(struct mlx5_eswitch *esw, void *mlx5_eswitch_uplink_get_proto_dev(struct mlx5_eswitch *esw, u8 rep_type); struct mlx5_flow_handle * mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, + struct mlx5_eswitch *from_esw, struct mlx5_eswitch_rep *rep, u32 sqn); #ifdef CONFIG_MLX5_ESWITCH From patchwork Tue Aug 3 23:19:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417621 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 825C9C19F31 for ; Tue, 3 Aug 2021 23:20:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6B01F6103C for ; Tue, 3 Aug 2021 23:20:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233478AbhHCXUd (ORCPT ); Tue, 3 Aug 2021 19:20:33 -0400 Received: from mail.kernel.org ([198.145.29.99]:38954 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233369AbhHCXUb (ORCPT ); Tue, 3 Aug 2021 19:20:31 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 9E3FD60F70; Tue, 3 Aug 2021 23:20:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032820; bh=XIEHZdKku3R1IkD1Nyx3gMbhq0oS9QVpqCEARSwY/9o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uuhhiD+URfWvErrxCIKZVrRZ6EeXMuSYG9grgYlG1YpBUsixWRVFgXokHnfigNl29 yJ2/AwtOXEWUJh6urqQ1ZeAc/MZpr6J0HOQv77+NI9cywobDY8yOAJ8vWAcqLgtnLf 85e8wx84yij8jWJ8RK+3B5SUobuwdU7SpZSJCCVVjZtxlF1x4l4O3eEM2z+/Fv/wxG 9CejPOYZ/orEmiNG5j/80gs3kZQsP4/vzwzWO0CjBG6AfuP6wGHObLFqwSK4qviTsZ sdZppvsbhVnwTREfkH57QHyGA54TAgsLUNBgK7GBtm/zSDmHsG3AdY8C8HOhEdXOEV 6O82s2XOM7Ptw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 05/14] RDMA/mlx5: Add shared FDB support Date: Tue, 3 Aug 2021 16:19:50 -0700 Message-Id: <20210803231959.26513-6-saeed@kernel.org> X-Mailer: 
git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch Shared FDB allows to create a single RDMA device that holds representors from both eswitches. As shared FDB is only active when both uplink representors are enslaved there is a single RDMA port that represents both uplinks. The number of ports is the number of vports on both eswitches minus one as we only need 1 port for both uplinks. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/infiniband/hw/mlx5/ib_rep.c | 75 ++++++++++++++++++++++++++--- drivers/infiniband/hw/mlx5/main.c | 44 ++++++++++------- 2 files changed, 95 insertions(+), 24 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/ib_rep.c b/drivers/infiniband/hw/mlx5/ib_rep.c index bf5a6e4d1c03..52821485371a 100644 --- a/drivers/infiniband/hw/mlx5/ib_rep.c +++ b/drivers/infiniband/hw/mlx5/ib_rep.c @@ -8,13 +8,15 @@ #include "srq.h" static int -mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) +mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev, + struct mlx5_eswitch_rep *rep, + int vport_index) { struct mlx5_ib_dev *ibdev; - int vport_index; ibdev = mlx5_eswitch_uplink_get_proto_dev(dev->priv.eswitch, REP_IB); - vport_index = rep->vport_index; + if (!ibdev) + return -EINVAL; ibdev->port[vport_index].rep = rep; rep->rep_data[REP_IB].priv = ibdev; @@ -26,19 +28,39 @@ mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) return 0; } +static void mlx5_ib_register_peer_vport_reps(struct mlx5_core_dev *mdev); + static int mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) { u32 num_ports = mlx5_eswitch_get_total_vports(dev); const struct mlx5_ib_profile *profile; + struct mlx5_core_dev *peer_dev; struct mlx5_ib_dev *ibdev; + u32 peer_num_ports; int vport_index; int ret; + vport_index = rep->vport_index; + + if (mlx5_lag_is_shared_fdb(dev)) { + peer_dev = mlx5_lag_get_peer_mdev(dev); + peer_num_ports = mlx5_eswitch_get_total_vports(peer_dev); + if (mlx5_lag_is_master(dev)) { + /* Only 1 ib port is the representor for both uplinks */ + num_ports += peer_num_ports - 1; + } else { + if (rep->vport == MLX5_VPORT_UPLINK) + return 0; + vport_index += peer_num_ports; + dev = peer_dev; + } + } + if (rep->vport == MLX5_VPORT_UPLINK) profile = &raw_eth_profile; else - return mlx5_ib_set_vport_rep(dev, rep); + return mlx5_ib_set_vport_rep(dev, rep, vport_index); ibdev = ib_alloc_device(mlx5_ib_dev, ib_dev); if (!ibdev) @@ -64,6 +86,8 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) goto fail_add; rep->rep_data[REP_IB].priv = ibdev; + if (mlx5_lag_is_shared_fdb(dev)) + mlx5_ib_register_peer_vport_reps(dev); return 0; @@ -82,18 +106,45 @@ static void *mlx5_ib_rep_to_dev(struct mlx5_eswitch_rep *rep) static void mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep) { + struct mlx5_core_dev *mdev = mlx5_eswitch_get_core_dev(rep->esw); struct mlx5_ib_dev *dev = mlx5_ib_rep_to_dev(rep); + int vport_index = rep->vport_index; struct mlx5_ib_port *port; - port = &dev->port[rep->vport_index]; + if (WARN_ON(!mdev)) + return; + + if (mlx5_lag_is_shared_fdb(mdev) && + !mlx5_lag_is_master(mdev)) { + struct mlx5_core_dev *peer_mdev; + + if (rep->vport == MLX5_VPORT_UPLINK) + return; + peer_mdev = mlx5_lag_get_peer_mdev(mdev); + vport_index += 
mlx5_eswitch_get_total_vports(peer_mdev); + } + + if (!dev) + return; + + port = &dev->port[vport_index]; write_lock(&port->roce.netdev_lock); port->roce.netdev = NULL; write_unlock(&port->roce.netdev_lock); rep->rep_data[REP_IB].priv = NULL; port->rep = NULL; - if (rep->vport == MLX5_VPORT_UPLINK) + if (rep->vport == MLX5_VPORT_UPLINK) { + struct mlx5_core_dev *peer_mdev; + struct mlx5_eswitch *esw; + + if (mlx5_lag_is_shared_fdb(mdev)) { + peer_mdev = mlx5_lag_get_peer_mdev(mdev); + esw = peer_mdev->priv.eswitch; + mlx5_eswitch_unregister_vport_reps(esw, REP_IB); + } __mlx5_ib_remove(dev, dev->profile, MLX5_IB_STAGE_MAX); + } } static const struct mlx5_eswitch_rep_ops rep_ops = { @@ -102,6 +153,18 @@ static const struct mlx5_eswitch_rep_ops rep_ops = { .get_proto_dev = mlx5_ib_rep_to_dev, }; +static void mlx5_ib_register_peer_vport_reps(struct mlx5_core_dev *mdev) +{ + struct mlx5_core_dev *peer_mdev = mlx5_lag_get_peer_mdev(mdev); + struct mlx5_eswitch *esw; + + if (!peer_mdev) + return; + + esw = peer_mdev->priv.eswitch; + mlx5_eswitch_register_vport_reps(esw, &rep_ops, REP_IB); +} + struct net_device *mlx5_ib_get_rep_netdev(struct mlx5_eswitch *esw, u16 vport_num) { diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 094c976b1eed..ae05e143401c 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -126,6 +126,7 @@ static int get_port_state(struct ib_device *ibdev, static struct mlx5_roce *mlx5_get_rep_roce(struct mlx5_ib_dev *dev, struct net_device *ndev, + struct net_device *upper, u32 *port_num) { struct net_device *rep_ndev; @@ -137,6 +138,14 @@ static struct mlx5_roce *mlx5_get_rep_roce(struct mlx5_ib_dev *dev, if (!port->rep) continue; + if (upper == ndev && port->rep->vport == MLX5_VPORT_UPLINK) { + *port_num = i + 1; + return &port->roce; + } + + if (upper && port->rep->vport == MLX5_VPORT_UPLINK) + continue; + read_lock(&port->roce.netdev_lock); rep_ndev = mlx5_ib_get_rep_netdev(port->rep->esw, port->rep->vport); @@ -196,11 +205,12 @@ static int mlx5_netdev_event(struct notifier_block *this, } if (ibdev->is_rep) - roce = mlx5_get_rep_roce(ibdev, ndev, &port_num); + roce = mlx5_get_rep_roce(ibdev, ndev, upper, &port_num); if (!roce) return NOTIFY_DONE; - if ((upper == ndev || (!upper && ndev == roce->netdev)) - && ibdev->ib_active) { + if ((upper == ndev || + ((!upper || ibdev->is_rep) && ndev == roce->netdev)) && + ibdev->ib_active) { struct ib_event ibev = { }; enum ib_port_state port_state; @@ -3012,7 +3022,7 @@ static int mlx5_eth_lag_init(struct mlx5_ib_dev *dev) struct mlx5_flow_table *ft; int err; - if (!ns || !mlx5_lag_is_roce(mdev)) + if (!ns || !mlx5_lag_is_active(mdev)) return 0; err = mlx5_cmd_create_vport_lag(mdev); @@ -3074,9 +3084,11 @@ static int mlx5_enable_eth(struct mlx5_ib_dev *dev) { int err; - err = mlx5_nic_vport_enable_roce(dev->mdev); - if (err) - return err; + if (!dev->is_rep && dev->profile != &raw_eth_profile) { + err = mlx5_nic_vport_enable_roce(dev->mdev); + if (err) + return err; + } err = mlx5_eth_lag_init(dev); if (err) @@ -3085,7 +3097,8 @@ static int mlx5_enable_eth(struct mlx5_ib_dev *dev) return 0; err_disable_roce: - mlx5_nic_vport_disable_roce(dev->mdev); + if (!dev->is_rep && dev->profile != &raw_eth_profile) + mlx5_nic_vport_disable_roce(dev->mdev); return err; } @@ -3093,7 +3106,8 @@ static int mlx5_enable_eth(struct mlx5_ib_dev *dev) static void mlx5_disable_eth(struct mlx5_ib_dev *dev) { mlx5_eth_lag_cleanup(dev); - mlx5_nic_vport_disable_roce(dev->mdev); + if 
(!dev->is_rep && dev->profile != &raw_eth_profile) + mlx5_nic_vport_disable_roce(dev->mdev); } static int mlx5_ib_rn_get_params(struct ib_device *device, u32 port_num, @@ -3950,12 +3964,7 @@ static int mlx5_ib_roce_init(struct mlx5_ib_dev *dev) /* Register only for native ports */ err = mlx5_add_netdev_notifier(dev, port_num); - if (err || dev->is_rep || !mlx5_is_roce_init_enabled(mdev)) - /* - * We don't enable ETH interface for - * 1. IB representors - * 2. User disabled ROCE through devlink interface - */ + if (err) return err; err = mlx5_enable_eth(dev); @@ -3980,8 +3989,7 @@ static void mlx5_ib_roce_cleanup(struct mlx5_ib_dev *dev) ll = mlx5_port_type_cap_to_rdma_ll(port_type_cap); if (ll == IB_LINK_LAYER_ETHERNET) { - if (!dev->is_rep) - mlx5_disable_eth(dev); + mlx5_disable_eth(dev); port_num = mlx5_core_native_port_num(dev->mdev) - 1; mlx5_remove_netdev_notifier(dev, port_num); @@ -4037,7 +4045,7 @@ static int mlx5_ib_stage_ib_reg_init(struct mlx5_ib_dev *dev) { const char *name; - if (!mlx5_lag_is_roce(dev->mdev)) + if (!mlx5_lag_is_active(dev->mdev)) name = "mlx5_%d"; else name = "mlx5_bond_%d"; From patchwork Tue Aug 3 23:19:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417623 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5CCDCC432BE for ; Tue, 3 Aug 2021 23:20:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3F9E561037 for ; Tue, 3 Aug 2021 23:20:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233666AbhHCXUe (ORCPT ); Tue, 3 Aug 2021 19:20:34 -0400 Received: from mail.kernel.org ([198.145.29.99]:38958 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229986AbhHCXUc (ORCPT ); Tue, 3 Aug 2021 19:20:32 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 23EF260F93; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032820; bh=CKn/KvEeyZAMuD+iFiH3gFauEveHmSMXEL5LWTioEbc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=t6eXw/E7p6Ij0lxShllsqHoBbFM3/hiGROIRM2adIiIkdeNJgOIJgSTthlygfxJVm ioa0yfwArIxq+fexGzcqOeuo6JT3o4vyc5jZMXgtW62z2E4nwjC2GHMEj5tYiU7vxk nHAsgvI+5KU5xF0qyxijFIJpi4NaBjbq3HDEJKCocz/Zi0yzIO6TxYEg6kgWNor3g6 3liyiJUEbFSja5be6sPo/SGHLtivN4+EnjSVTHgszsuHi/D1ur6xxOnNZEUFjnl4fQ XvzfzHo2wW+3lwt+vSeX86q3GrJhr8tIvEbRLUqU1sVbZgBvbKNIYk3gv0mW1xEM3H 4l4gjMqJJHc6w== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Ariel Levkovich , Roi Dayan Subject: [PATCH mlx5-next 06/14] net/mlx5: E-Switch, set flow source for send to uplink rule Date: Tue, 3 Aug 2021 16:19:51 -0700 Message-Id: <20210803231959.26513-7-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: 
linux-rdma@vger.kernel.org From: Ariel Levkovich Set the flow source param to local vport for the uplink rep send-to-vport rule. This will comply with the recent changes in SW steering that use the flow source as an indication for the rule type - rx or tx. Since the uplink send-to-vport rule is forwarding traffic to the wire it has to indicate that it is an sx rule and can't use the any port value in the flow source. Signed-off-by: Ariel Levkovich Reviewed-by: Roi Dayan Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 12567002997f..1735be77e1fd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -963,6 +963,9 @@ mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch *on_esw, dest.vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID; flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + if (rep->vport == MLX5_VPORT_UPLINK) + spec->flow_context.flow_source = MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT; + flow_rule = mlx5_add_flow_rules(on_esw->fdb_table.offloads.slow_fdb, spec, &flow_act, &dest, 1); if (IS_ERR(flow_rule)) From patchwork Tue Aug 3 23:19:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417625 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D070C4320E for ; Tue, 3 Aug 2021 23:20:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 63D3C61037 for ; Tue, 3 Aug 2021 23:20:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233706AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: from mail.kernel.org ([198.145.29.99]:38966 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233463AbhHCXUc (ORCPT ); Tue, 3 Aug 2021 19:20:32 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 93BB560F70; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032820; bh=f12FcFKLWYfl3Yl2SCUybd/Sw3WOdJzqfECFxbR9zeM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=i6qbRpM9WTJ+hPaArrou5phSH+QR9tF3VISWqrTLKa5dq9CZLKaXSiT/Vlz56B95r QDW7pn3KVKxUQgXR7vq5SQYA0fjz25nlw3M0pAiDqo4TqhI25+MkJWCXzuu7mWPWAD doLubOAb5zbxDqx8Pb/NaLh59YTOBFTVPuiPFrgGDw+VDw8hqSJoBSSi7PpRPsEnTr RxRrxGe3nHngni+sm7bhlx5hb5MbjhcVEcy/DNr5nGwLW/I7DoycM4J2aiQaPNP4bW 8RZrJ357qduzZHhLC3jqqEfjEPEJzsGdgAHzqrveHTBRtmBqR+1Ty8OHHcBupY1JMh 9A3TUrjImKi9Q== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Roi Dayan Subject: [PATCH mlx5-next 07/14] net/mlx5e: Add an option to create a shared mapping Date: Tue, 3 Aug 2021 16:19:52 -0700 Message-Id: <20210803231959.26513-8-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 
In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Roi Dayan The shared mapping is identified by an id and type. Signed-off-by: Roi Dayan Signed-off-by: Saeed Mahameed --- .../ethernet/mellanox/mlx5/core/en/mapping.c | 45 +++++++++++++++++++ .../ethernet/mellanox/mlx5/core/en/mapping.h | 5 +++ 2 files changed, 50 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.c b/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.c index ea321e528749..4e72ca8070e2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.c @@ -5,11 +5,15 @@ #include #include #include +#include #include "mapping.h" #define MAPPING_GRACE_PERIOD 2000 +static LIST_HEAD(shared_ctx_list); +static DEFINE_MUTEX(shared_ctx_lock); + struct mapping_ctx { struct xarray xarray; DECLARE_HASHTABLE(ht, 8); @@ -20,6 +24,10 @@ struct mapping_ctx { struct delayed_work dwork; struct list_head pending_list; spinlock_t pending_list_lock; /* Guards pending list */ + u64 id; + u8 type; + struct list_head list; + refcount_t refcount; }; struct mapping_item { @@ -205,11 +213,48 @@ mapping_create(size_t data_size, u32 max_id, bool delayed_removal) mutex_init(&ctx->lock); xa_init_flags(&ctx->xarray, XA_FLAGS_ALLOC1); + refcount_set(&ctx->refcount, 1); + INIT_LIST_HEAD(&ctx->list); + + return ctx; +} + +struct mapping_ctx * +mapping_create_for_id(u64 id, u8 type, size_t data_size, u32 max_id, bool delayed_removal) +{ + struct mapping_ctx *ctx; + + mutex_lock(&shared_ctx_lock); + list_for_each_entry(ctx, &shared_ctx_list, list) { + if (ctx->id == id && ctx->type == type) { + if (refcount_inc_not_zero(&ctx->refcount)) + goto unlock; + break; + } + } + + ctx = mapping_create(data_size, max_id, delayed_removal); + if (IS_ERR(ctx)) + goto unlock; + + ctx->id = id; + ctx->type = type; + list_add(&ctx->list, &shared_ctx_list); + +unlock: + mutex_unlock(&shared_ctx_lock); return ctx; } void mapping_destroy(struct mapping_ctx *ctx) { + if (!refcount_dec_and_test(&ctx->refcount)) + return; + + mutex_lock(&shared_ctx_lock); + list_del(&ctx->list); + mutex_unlock(&shared_ctx_lock); + mapping_flush_work(ctx); xa_destroy(&ctx->xarray); mutex_destroy(&ctx->lock); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.h b/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.h index 285525cc5470..4e2119f0f4c1 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/mapping.h @@ -24,4 +24,9 @@ struct mapping_ctx *mapping_create(size_t data_size, u32 max_id, bool delayed_removal); void mapping_destroy(struct mapping_ctx *ctx); +/* adds mapping with an id or get an existing mapping with the same id + */ +struct mapping_ctx * +mapping_create_for_id(u64 id, u8 type, size_t data_size, u32 max_id, bool delayed_removal); + #endif /* __MLX5_MAPPING_H__ */ From patchwork Tue Aug 3 23:19:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417629 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E54BC4338F for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 07DB960F93 for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233740AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: from mail.kernel.org ([198.145.29.99]:38974 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233498AbhHCXUd (ORCPT ); Tue, 3 Aug 2021 19:20:33 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 07BB560F94; Tue, 3 Aug 2021 23:20:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032821; bh=AFc/6k5YwUwP9AAOKk7bsRXYfF/doRJC0xMN4+zaCwM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Z8rtetJNIy1A+OrVyI3/SqkUmx/G5HW7LAfoE5s1M8AK064Mkk/pDGL3LOY4C01pl Qy+NbvwGYEuiko8zSBAqG2L0CWkpfrkSRKfEKbqevVqAJuLSgSxMtA1JahPaV5I6Xb Nq3wtO+bB0/LoohoeEzRu/CrJ8bG6u0gYIwiletYdTkOflDGdrQSzO/MLaZyPeAeGe /z17Rg/WGMIqm+lqZIvUlZjR66FdHyMyXBCzKdyRp5PiU6C+5NC8rq3NIZqXV+3mu5 CmVkjOWpGdYxDzpYajjQQbfe636pceY52F8Aq/rn5UnB5Te+ekneZjedioGPJVnAoT /HheTpl/Dc3Zw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Roi Dayan Subject: [PATCH mlx5-next 08/14] net/mlx5e: Use shared mappings for restoring from metadata Date: Tue, 3 Aug 2021 16:19:53 -0700 Message-Id: <20210803231959.26513-9-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Roi Dayan FTEs are added with mapped metadata which is saved per eswitch. When uplink reps are bonded and we are in a single FDB mode, we could fail to find metadata which was stored on one eswitch mapping but not the other or with a different id. To resolve this issue use shared mapping between eswitch ports. We do not have any conflict using a single mapping, for a type, between the ports. 
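For illustration only (not part of the diff): with the helpers added in the previous patch, both eswitches of a bonded pair derive the same key from the NIC system image GUID, so the second caller gets back the context created by the first rather than a private copy. A minimal sketch, with illustrative variable names and error handling trimmed:

	u64 mapping_id = mlx5_query_nic_system_image_guid(esw->dev);
	struct mapping_ctx *zone_map;

	/* same (id, type) on both ports -> same shared, refcounted context */
	zone_map = mapping_create_for_id(mapping_id, MAPPING_TYPE_ZONE,
					 sizeof(u16), 0, true);
	if (IS_ERR(zone_map))
		return PTR_ERR(zone_map);
	...
	mapping_destroy(zone_map);	/* frees only when the last user is gone */
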
Signed-off-by: Roi Dayan Signed-off-by: Saeed Mahameed --- .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 9 ++++++-- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 21 ++++++++++++++----- .../net/ethernet/mellanox/mlx5/core/eswitch.h | 8 +++++++ .../mellanox/mlx5/core/eswitch_offloads.c | 11 +++++++--- 4 files changed, 39 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index 91e7a01e32be..b1707b86aa16 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -2138,6 +2138,7 @@ mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains, struct mlx5_tc_ct_priv *ct_priv; struct mlx5_core_dev *dev; const char *msg; + u64 mapping_id; int err; dev = priv->mdev; @@ -2153,13 +2154,17 @@ mlx5_tc_ct_init(struct mlx5e_priv *priv, struct mlx5_fs_chains *chains, if (!ct_priv) goto err_alloc; - ct_priv->zone_mapping = mapping_create(sizeof(u16), 0, true); + mapping_id = mlx5_query_nic_system_image_guid(dev); + + ct_priv->zone_mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_ZONE, + sizeof(u16), 0, true); if (IS_ERR(ct_priv->zone_mapping)) { err = PTR_ERR(ct_priv->zone_mapping); goto err_mapping_zone; } - ct_priv->labels_mapping = mapping_create(sizeof(u32) * 4, 0, true); + ct_priv->labels_mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_LABELS, + sizeof(u32) * 4, 0, true); if (IS_ERR(ct_priv->labels_mapping)) { err = PTR_ERR(ct_priv->labels_mapping); goto err_mapping_labels; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 629a61e8022f..aca677933423 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -4848,6 +4848,7 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv) struct mlx5_core_dev *dev = priv->mdev; struct mapping_ctx *chains_mapping; struct mlx5_chains_attr attr = {}; + u64 mapping_id; int err; mlx5e_mod_hdr_tbl_init(&tc->mod_hdr); @@ -4861,8 +4862,12 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv) lockdep_set_class(&tc->ht.mutex, &tc_ht_lock_key); - chains_mapping = mapping_create(sizeof(struct mlx5_mapped_obj), - MLX5E_TC_TABLE_CHAIN_TAG_MASK, true); + mapping_id = mlx5_query_nic_system_image_guid(dev); + + chains_mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_CHAIN, + sizeof(struct mlx5_mapped_obj), + MLX5E_TC_TABLE_CHAIN_TAG_MASK, true); + if (IS_ERR(chains_mapping)) { err = PTR_ERR(chains_mapping); goto err_mapping; @@ -4951,6 +4956,7 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) struct mapping_ctx *mapping; struct mlx5_eswitch *esw; struct mlx5e_priv *priv; + u64 mapping_id; int err = 0; uplink_priv = container_of(tc_ht, struct mlx5_rep_uplink_priv, tc_ht); @@ -4967,8 +4973,12 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) uplink_priv->esw_psample = mlx5_esw_sample_init(netdev_priv(priv->netdev)); #endif - mapping = mapping_create(sizeof(struct tunnel_match_key), - TUNNEL_INFO_BITS_MASK, true); + mapping_id = mlx5_query_nic_system_image_guid(esw->dev); + + mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_TUNNEL, + sizeof(struct tunnel_match_key), + TUNNEL_INFO_BITS_MASK, true); + if (IS_ERR(mapping)) { err = PTR_ERR(mapping); goto err_tun_mapping; @@ -4976,7 +4986,8 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) uplink_priv->tunnel_mapping = mapping; /* 0xFFF is reserved for stack devices slow path table mark */ - mapping = 
mapping_create(sz_enc_opts, ENC_OPTS_BITS_MASK - 1, true); + mapping = mapping_create_for_id(mapping_id, MAPPING_TYPE_TUNNEL_ENC_OPTS, + sz_enc_opts, ENC_OPTS_BITS_MASK - 1, true); if (IS_ERR(mapping)) { err = PTR_ERR(mapping); goto err_enc_opts_mapping; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 48cac5bf606d..c3a47349f447 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -86,6 +86,14 @@ struct mlx5_mapped_obj { #define esw_chains(esw) \ ((esw)->fdb_table.offloads.esw_chains_priv) +enum { + MAPPING_TYPE_CHAIN, + MAPPING_TYPE_TUNNEL, + MAPPING_TYPE_TUNNEL_ENC_OPTS, + MAPPING_TYPE_LABELS, + MAPPING_TYPE_ZONE, +}; + struct vport_ingress { struct mlx5_flow_table *acl; struct mlx5_flow_handle *allow_rule; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 1735be77e1fd..dd5eadd6047b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -2787,6 +2787,7 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) struct mapping_ctx *reg_c0_obj_pool; struct mlx5_vport *vport; unsigned long i; + u64 mapping_id; int err; if (MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, reformat) && @@ -2810,9 +2811,13 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) if (err) goto err_vport_metadata; - reg_c0_obj_pool = mapping_create(sizeof(struct mlx5_mapped_obj), - ESW_REG_C0_USER_DATA_METADATA_MASK, - true); + mapping_id = mlx5_query_nic_system_image_guid(esw->dev); + + reg_c0_obj_pool = mapping_create_for_id(mapping_id, MAPPING_TYPE_CHAIN, + sizeof(struct mlx5_mapped_obj), + ESW_REG_C0_USER_DATA_METADATA_MASK, + true); + if (IS_ERR(reg_c0_obj_pool)) { err = PTR_ERR(reg_c0_obj_pool); goto err_pool; From patchwork Tue Aug 3 23:19:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417627 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F9D8C432BE for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 402FB60F58 for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233369AbhHCXUg (ORCPT ); Tue, 3 Aug 2021 19:20:36 -0400 Received: from mail.kernel.org ([198.145.29.99]:38984 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233548AbhHCXUd (ORCPT ); Tue, 3 Aug 2021 19:20:33 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id A8CCF60FA0; Tue, 3 Aug 2021 23:20:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032822; bh=Uja8RzoMCc9BpUyWv8o01AMrFBPiPQUqo8GWoNessjw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CtacHI6n42JUQrWq+Ln86y9afrcNi7wQxmAWtCBsoNL3rOkyGaU4eE6xvPIT1qFrt 
LPHtu9mFbOnpFRpoMfextBBoai1ZQWUZKiJn+V55MuF0yjxx3dvven72hP7SKepYSJ aVQGYdht3P+u9xSXfOz1xXix8NtxzD9yXQdgHSxsbspstXKlsXuVzBoJn2L763Mb7e AJswtI0U/GlxXm5LDWmdOjLBj6bUl1oeESm3573P7N9EsbHTqG3dszDJWwVMB1Ew0E MFcbGxDC0R6Pg2iQoh80Gc2y8rLIunm+3UzbRJwmRSovK6MYzTahupBBmayRzWEWgQ rBi92jN9uA58g== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 09/14] net/mlx5: E-Switch, Add event callback for representors Date: Tue, 3 Aug 2021 16:19:54 -0700 Message-Id: <20210803231959.26513-10-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch This callback will allow to notify representors about relevant events when in OFFLOADS mode. In downstream patches, this will be used to notify about PAIR/UNPAIR devcom events. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- .../mellanox/mlx5/core/eswitch_offloads.c | 50 +++++++++++++++++-- include/linux/mlx5/eswitch.h | 9 ++++ 2 files changed, 56 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index dd5eadd6047b..b57a5c188832 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -2316,11 +2316,22 @@ void esw_offloads_unload_rep(struct mlx5_eswitch *esw, u16 vport_num) #define ESW_OFFLOADS_DEVCOM_PAIR (0) #define ESW_OFFLOADS_DEVCOM_UNPAIR (1) -static int mlx5_esw_offloads_pair(struct mlx5_eswitch *esw, - struct mlx5_eswitch *peer_esw) +static void mlx5_esw_offloads_rep_event_unpair(struct mlx5_eswitch *esw) { + const struct mlx5_eswitch_rep_ops *ops; + struct mlx5_eswitch_rep *rep; + unsigned long i; + u8 rep_type; - return esw_add_fdb_peer_miss_rules(esw, peer_esw->dev); + mlx5_esw_for_each_rep(esw, i, rep) { + rep_type = NUM_REP_TYPES; + while (rep_type--) { + ops = esw->offloads.rep_ops[rep_type]; + if (atomic_read(&rep->rep_data[rep_type].state) == REP_LOADED && + ops->event) + ops->event(esw, rep, MLX5_SWITCHDEV_EVENT_UNPAIR, NULL); + } + } } static void mlx5_esw_offloads_unpair(struct mlx5_eswitch *esw) @@ -2328,9 +2339,42 @@ static void mlx5_esw_offloads_unpair(struct mlx5_eswitch *esw) #if IS_ENABLED(CONFIG_MLX5_CLS_ACT) mlx5e_tc_clean_fdb_peer_flows(esw); #endif + mlx5_esw_offloads_rep_event_unpair(esw); esw_del_fdb_peer_miss_rules(esw); } +static int mlx5_esw_offloads_pair(struct mlx5_eswitch *esw, + struct mlx5_eswitch *peer_esw) +{ + const struct mlx5_eswitch_rep_ops *ops; + struct mlx5_eswitch_rep *rep; + unsigned long i; + u8 rep_type; + int err; + + err = esw_add_fdb_peer_miss_rules(esw, peer_esw->dev); + if (err) + return err; + + mlx5_esw_for_each_rep(esw, i, rep) { + for (rep_type = 0; rep_type < NUM_REP_TYPES; rep_type++) { + ops = esw->offloads.rep_ops[rep_type]; + if (atomic_read(&rep->rep_data[rep_type].state) == REP_LOADED && + ops->event) { + err = ops->event(esw, rep, MLX5_SWITCHDEV_EVENT_PAIR, peer_esw); + if (err) + goto err_out; + } + } + } + + return 0; + +err_out: + mlx5_esw_offloads_unpair(esw); + return err; +} + static int mlx5_esw_offloads_set_ns_peer(struct mlx5_eswitch *esw, struct mlx5_eswitch *peer_esw, bool pair) diff --git a/include/linux/mlx5/eswitch.h b/include/linux/mlx5/eswitch.h 
index 0bfcf7b8ecf9..4ab5c1fc1270 100644 --- a/include/linux/mlx5/eswitch.h +++ b/include/linux/mlx5/eswitch.h @@ -29,11 +29,20 @@ enum { REP_LOADED, }; +enum mlx5_switchdev_event { + MLX5_SWITCHDEV_EVENT_PAIR, + MLX5_SWITCHDEV_EVENT_UNPAIR, +}; + struct mlx5_eswitch_rep; struct mlx5_eswitch_rep_ops { int (*load)(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep); void (*unload)(struct mlx5_eswitch_rep *rep); void *(*get_proto_dev)(struct mlx5_eswitch_rep *rep); + int (*event)(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep, + enum mlx5_switchdev_event event, + void *data); }; struct mlx5_eswitch_rep_data { From patchwork Tue Aug 3 23:19:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417633 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F735C43214 for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 517EF60F94 for ; Tue, 3 Aug 2021 23:20:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233810AbhHCXUh (ORCPT ); Tue, 3 Aug 2021 19:20:37 -0400 Received: from mail.kernel.org ([198.145.29.99]:38988 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233570AbhHCXUe (ORCPT ); Tue, 3 Aug 2021 19:20:34 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 2A61461040; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032822; bh=g1GRdjhhmYnq4odN6yrFsA6z4OJcL9sJuSOMn+enSIA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YcGfgtmdmwyGVqyUyWxGjIp4/Glg9XvPPHR8jN8kXdqskAACQqQB+fRYYqe9OSyN5 AsRvkpRBmiY1DEyGInfZOjxBJUei3H9g5ukiyfXO8wKy4qtoYKdq71jn+KlQYPID7h gOJIXXAsBDos31Mx2/DUjdeziiLSUU5USmVkdGpKrPBWvDCfDnFrTdrvW1kE4SNwss vmd/gbFAtmrrUCCdBgXYXp92SPxkzuOJliAE7jOPUvNNqO2b8sjddYEKtmfaeL5ueF 4Y86vzD0ajiwtpCNmAvmEgA+1rtNew4MgUe/asCsv5rtEKlhiNJI2ZtAsDFGr5Yd/v acCJPq88GPnbw== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 10/14] net/mlx5: Add send to vport rules on paired device Date: Tue, 3 Aug 2021 16:19:55 -0700 Message-Id: <20210803231959.26513-11-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch When two mlx5 devices are paired in switchdev mode, always offload the send-to-vport rule to the peer E-Switch. This allows to abstract the logic when this is really necessary (single FDB) and combine the logic of both cases into one. 
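As a reader's aid (sketch only, not part of the diff): the shape of the flow this patch implements in mlx5e_sqs2vport_start() and the new rep event handlers, with illustrative variable names and error handling trimmed:

	struct mlx5_eswitch *peer_esw = NULL;

	/* mirror the send-to-vport rule on the peer only while paired */
	if (mlx5_devcom_is_paired(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS))
		peer_esw = mlx5_devcom_get_peer_data(esw->dev->priv.devcom,
						     MLX5_DEVCOM_ESW_OFFLOADS);

	rule = mlx5_eswitch_add_send_to_vport_rule(esw, esw, rep, sqn);
	if (peer_esw)
		peer_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw,
								rep, sqn);

	if (peer_esw)
		mlx5_devcom_release_peer_data(esw->dev->priv.devcom,
					      MLX5_DEVCOM_ESW_OFFLOADS);
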
Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/en_rep.c | 86 ++++++++++++++++++- .../net/ethernet/mellanox/mlx5/core/en_rep.h | 2 + .../mellanox/mlx5/core/eswitch_offloads.c | 16 +++- 3 files changed, 101 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 1d016cc64015..cc34600b4dde 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -49,6 +49,7 @@ #include "en/devlink.h" #include "fs_core.h" #include "lib/mlx5.h" +#include "lib/devcom.h" #define CREATE_TRACE_POINTS #include "diag/en_rep_tracepoint.h" #include "en_accel/ipsec.h" @@ -310,6 +311,8 @@ static void mlx5e_sqs2vport_stop(struct mlx5_eswitch *esw, rpriv = mlx5e_rep_to_rep_priv(rep); list_for_each_entry_safe(rep_sq, tmp, &rpriv->vport_sqs_list, list) { mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule); + if (rep_sq->send_to_vport_rule_peer) + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule_peer); list_del(&rep_sq->list); kfree(rep_sq); } @@ -319,6 +322,7 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, struct mlx5_eswitch_rep *rep, u32 *sqns_array, int sqns_num) { + struct mlx5_eswitch *peer_esw = NULL; struct mlx5_flow_handle *flow_rule; struct mlx5e_rep_priv *rpriv; struct mlx5e_rep_sq *rep_sq; @@ -329,6 +333,10 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, return 0; rpriv = mlx5e_rep_to_rep_priv(rep); + if (mlx5_devcom_is_paired(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS)) + peer_esw = mlx5_devcom_get_peer_data(esw->dev->priv.devcom, + MLX5_DEVCOM_ESW_OFFLOADS); + for (i = 0; i < sqns_num; i++) { rep_sq = kzalloc(sizeof(*rep_sq), GFP_KERNEL); if (!rep_sq) { @@ -345,12 +353,34 @@ static int mlx5e_sqs2vport_start(struct mlx5_eswitch *esw, goto out_err; } rep_sq->send_to_vport_rule = flow_rule; + rep_sq->sqn = sqns_array[i]; + + if (peer_esw) { + flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw, + rep, sqns_array[i]); + if (IS_ERR(flow_rule)) { + err = PTR_ERR(flow_rule); + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule); + kfree(rep_sq); + goto out_err; + } + rep_sq->send_to_vport_rule_peer = flow_rule; + } + list_add(&rep_sq->list, &rpriv->vport_sqs_list); } + + if (peer_esw) + mlx5_devcom_release_peer_data(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS); + return 0; out_err: mlx5e_sqs2vport_stop(esw, rep); + + if (peer_esw) + mlx5_devcom_release_peer_data(esw->dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS); + return err; } @@ -1264,10 +1294,64 @@ static void *mlx5e_vport_rep_get_proto_dev(struct mlx5_eswitch_rep *rep) return rpriv->netdev; } +static void mlx5e_vport_rep_event_unpair(struct mlx5_eswitch_rep *rep) +{ + struct mlx5e_rep_priv *rpriv; + struct mlx5e_rep_sq *rep_sq; + + rpriv = mlx5e_rep_to_rep_priv(rep); + list_for_each_entry(rep_sq, &rpriv->vport_sqs_list, list) { + if (!rep_sq->send_to_vport_rule_peer) + continue; + mlx5_eswitch_del_send_to_vport_rule(rep_sq->send_to_vport_rule_peer); + rep_sq->send_to_vport_rule_peer = NULL; + } +} + +static int mlx5e_vport_rep_event_pair(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep, + struct mlx5_eswitch *peer_esw) +{ + struct mlx5_flow_handle *flow_rule; + struct mlx5e_rep_priv *rpriv; + struct mlx5e_rep_sq *rep_sq; + + rpriv = mlx5e_rep_to_rep_priv(rep); + list_for_each_entry(rep_sq, &rpriv->vport_sqs_list, list) { + if 
(rep_sq->send_to_vport_rule_peer) + continue; + flow_rule = mlx5_eswitch_add_send_to_vport_rule(peer_esw, esw, rep, rep_sq->sqn); + if (IS_ERR(flow_rule)) + goto err_out; + rep_sq->send_to_vport_rule_peer = flow_rule; + } + + return 0; +err_out: + mlx5e_vport_rep_event_unpair(rep); + return PTR_ERR(flow_rule); +} + +static int mlx5e_vport_rep_event(struct mlx5_eswitch *esw, + struct mlx5_eswitch_rep *rep, + enum mlx5_switchdev_event event, + void *data) +{ + int err = 0; + + if (event == MLX5_SWITCHDEV_EVENT_PAIR) + err = mlx5e_vport_rep_event_pair(esw, rep, data); + else if (event == MLX5_SWITCHDEV_EVENT_UNPAIR) + mlx5e_vport_rep_event_unpair(rep); + + return err; +} + static const struct mlx5_eswitch_rep_ops rep_ops = { .load = mlx5e_vport_rep_load, .unload = mlx5e_vport_rep_unload, - .get_proto_dev = mlx5e_vport_rep_get_proto_dev + .get_proto_dev = mlx5e_vport_rep_get_proto_dev, + .event = mlx5e_vport_rep_event, }; static int mlx5e_rep_probe(struct auxiliary_device *adev, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index 47a2dfb7792a..8f0c82448eec 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -207,6 +207,8 @@ struct mlx5e_encap_entry { struct mlx5e_rep_sq { struct mlx5_flow_handle *send_to_vport_rule; + struct mlx5_flow_handle *send_to_vport_rule_peer; + u32 sqn; struct list_head list; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index b57a5c188832..e02a8bd2bd96 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -1616,7 +1616,18 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw) goto ns_err; } - table_size = esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ + + /* To be strictly correct: + * MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ) + * should be: + * esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ + + * peer_esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ + * but as the peer device might not be in switchdev mode it's not + * possible. We use the fact that by default FW sets max vfs and max sfs + * to the same value on both devices. If it needs to be changed in the future note + * the peer miss group should also be created based on the number of + * total vports of the peer (currently is also uses esw->total_vports). 
+ */ + table_size = MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ) + MLX5_ESW_MISS_FLOWS + esw->total_vports + esw->esw_funcs.num_vfs; /* create the slow path fdb with encap set, so further table instances @@ -1673,7 +1684,8 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw) source_eswitch_owner_vhca_id_valid, 1); } - ix = esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ; + /* See comment above table_size calculation */ + ix = MLX5_MAX_PORTS * (esw->total_vports * MAX_SQ_NVPORTS + MAX_PF_SQ); MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0); MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, ix - 1); From patchwork Tue Aug 3 23:19:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417635 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 287D7C4320E for ; Tue, 3 Aug 2021 23:20:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1080F61037 for ; Tue, 3 Aug 2021 23:20:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233912AbhHCXUj (ORCPT ); Tue, 3 Aug 2021 19:20:39 -0400 Received: from mail.kernel.org ([198.145.29.99]:38958 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232935AbhHCXUe (ORCPT ); Tue, 3 Aug 2021 19:20:34 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 9B18060F58; Tue, 3 Aug 2021 23:20:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032823; bh=vHYayD8Yt76H93pQteGSxFseJTY2VJBTKWkVabcLxc0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uNRDZfPj6fZHXJ+QMJNHpJnGne0OoOEdgfeL9hEJ73oxhR4UV0IkaFBUjKRf/pldN WDwFD44Iy8Gzh0nN33mU/r7351s9P7kkyxKlKWT+Hb98fjbuhyldJm7Z+mU7TscR9Z kK2NpoBEjMJbz6g7Mbyt/dORhFd/SX+a1XpV3b/jsyrFaeBpmxo6yuF2YSGb99IMIY spCSU08MdvmJDljFYikCy8DM+qIGdoN9RqFUDsKej+rExToTb72xeufQbVTkZcvOpy 7FU7lCnj5OjowF7moAfiza8mXI644G8Bnq6mT0eUAtda3+I0fKXrjiwJPJSCUGADMN g4DZwmRnTkzNA== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 11/14] net/mlx5: Lag, properly lock eswitch if needed Date: Tue, 3 Aug 2021 16:19:56 -0700 Message-Id: <20210803231959.26513-12-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch Currently when doing hardware lag we check the eswitch mode but as this isn't done under a lock the check isn't valid. As the code needs to sync between two different devices an extra care is needed. - When going to change eswitch mode, if hardware lag is active destroy it. - While changing eswitch modes block any hardware bond creation. - Delay handling bonding events until there are no mode changes in progress. 
- When attaching a new mdev to lag, block until there is no mode change in progress. In order for the mode change to finish the interface lock will have to be taken. Release the lock and sleep for 100ms to allow forward progress. As this is a very rare condition (can happen if the user unbinds and binds a PCI function while also changing eswitch mode of the other PCI function) it has no real world impact. As taking multiple eswitch mode locks is now required lockdep will complain about a possible deadlock. Register a key per eswitch to make lockdep happy. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/eswitch.c | 24 +++++- .../net/ethernet/mellanox/mlx5/core/eswitch.h | 5 ++ .../mellanox/mlx5/core/eswitch_offloads.c | 5 +- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 83 ++++++++++++++++--- drivers/net/ethernet/mellanox/mlx5/core/lag.h | 1 + .../net/ethernet/mellanox/mlx5/core/main.c | 5 +- .../ethernet/mellanox/mlx5/core/mlx5_core.h | 2 + 7 files changed, 107 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index b65a472067d2..f3a7f9d3334f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1458,8 +1458,6 @@ int mlx5_eswitch_enable_locked(struct mlx5_eswitch *esw, int mode, int num_vfs) esw->mode = mode; - mlx5_lag_update(esw->dev); - if (mode == MLX5_ESWITCH_LEGACY) { err = esw_legacy_enable(esw); } else { @@ -1506,6 +1504,7 @@ int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs) if (!mlx5_esw_allowed(esw)) return 0; + mlx5_lag_disable_change(esw->dev); down_write(&esw->mode_lock); if (esw->mode == MLX5_ESWITCH_NONE) { ret = mlx5_eswitch_enable_locked(esw, MLX5_ESWITCH_LEGACY, num_vfs); @@ -1519,6 +1518,7 @@ int mlx5_eswitch_enable(struct mlx5_eswitch *esw, int num_vfs) esw->esw_funcs.num_vfs = num_vfs; } up_write(&esw->mode_lock); + mlx5_lag_enable_change(esw->dev); return ret; } @@ -1550,8 +1550,6 @@ void mlx5_eswitch_disable_locked(struct mlx5_eswitch *esw, bool clear_vf) old_mode = esw->mode; esw->mode = MLX5_ESWITCH_NONE; - mlx5_lag_update(esw->dev); - if (old_mode == MLX5_ESWITCH_OFFLOADS) mlx5_rescan_drivers(esw->dev); @@ -1567,10 +1565,12 @@ void mlx5_eswitch_disable(struct mlx5_eswitch *esw, bool clear_vf) if (!mlx5_esw_allowed(esw)) return; + mlx5_lag_disable_change(esw->dev); down_write(&esw->mode_lock); mlx5_eswitch_disable_locked(esw, clear_vf); esw->esw_funcs.num_vfs = 0; up_write(&esw->mode_lock); + mlx5_lag_enable_change(esw->dev); } static int mlx5_query_hca_cap_host_pf(struct mlx5_core_dev *dev, void *out) @@ -1759,7 +1759,9 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev) ida_init(&esw->offloads.vport_metadata_ida); xa_init_flags(&esw->offloads.vhca_map, XA_FLAGS_ALLOC); mutex_init(&esw->state_lock); + lockdep_register_key(&esw->mode_lock_key); init_rwsem(&esw->mode_lock); + lockdep_set_class(&esw->mode_lock, &esw->mode_lock_key); esw->enabled_vports = 0; esw->mode = MLX5_ESWITCH_NONE; @@ -1793,6 +1795,7 @@ void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw) esw->dev->priv.eswitch = NULL; destroy_workqueue(esw->work_queue); + lockdep_unregister_key(&esw->mode_lock_key); mutex_destroy(&esw->state_lock); WARN_ON(!xa_empty(&esw->offloads.vhca_map)); xa_destroy(&esw->offloads.vhca_map); @@ -2366,9 +2369,22 @@ int mlx5_esw_try_lock(struct mlx5_eswitch *esw) */ void mlx5_esw_unlock(struct mlx5_eswitch *esw) { + if 
(!mlx5_esw_allowed(esw)) + return; up_write(&esw->mode_lock); } +/** + * mlx5_esw_lock() - Take write lock on esw mode lock + * @esw: eswitch device. + */ +void mlx5_esw_lock(struct mlx5_eswitch *esw) +{ + if (!mlx5_esw_allowed(esw)) + return; + down_write(&esw->mode_lock); +} + /** * mlx5_eswitch_get_total_vports - Get total vports of the eswitch * diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index c3a47349f447..5a27445fa892 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -323,6 +323,7 @@ struct mlx5_eswitch { u32 large_group_num; } params; struct blocking_notifier_head n_head; + struct lock_class_key mode_lock_key; }; void esw_offloads_disable(struct mlx5_eswitch *esw); @@ -707,6 +708,7 @@ void mlx5_esw_get(struct mlx5_core_dev *dev); void mlx5_esw_put(struct mlx5_core_dev *dev); int mlx5_esw_try_lock(struct mlx5_eswitch *esw); void mlx5_esw_unlock(struct mlx5_eswitch *esw); +void mlx5_esw_lock(struct mlx5_eswitch *esw); void esw_vport_change_handle_locked(struct mlx5_vport *vport); @@ -727,6 +729,9 @@ static inline const u32 *mlx5_esw_query_functions(struct mlx5_core_dev *dev) return ERR_PTR(-EOPNOTSUPP); } +static inline void mlx5_esw_unlock(struct mlx5_eswitch *esw) { return; } +static inline void mlx5_esw_lock(struct mlx5_eswitch *esw) { return; } + static inline struct mlx5_flow_handle * esw_add_restore_rule(struct mlx5_eswitch *esw, u32 tag) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index e02a8bd2bd96..109cbbb99933 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -3051,10 +3051,11 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode, if (esw_mode_from_devlink(mode, &mlx5_mode)) return -EINVAL; + mlx5_lag_disable_change(esw->dev); err = mlx5_esw_try_lock(esw); if (err < 0) { NL_SET_ERR_MSG_MOD(extack, "Can't change mode, E-Switch is busy"); - return err; + goto enable_lag; } cur_mlx5_mode = err; err = 0; @@ -3071,6 +3072,8 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode, unlock: mlx5_esw_unlock(esw); +enable_lag: + mlx5_lag_enable_change(esw->dev); return err; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 3049de648256..459e3e5ef13f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -418,21 +418,48 @@ static void mlx5_queue_bond_work(struct mlx5_lag *ldev, unsigned long delay) queue_delayed_work(ldev->wq, &ldev->bond_work, delay); } +static void mlx5_lag_lock_eswitches(struct mlx5_core_dev *dev0, + struct mlx5_core_dev *dev1) +{ + if (dev0) + mlx5_esw_lock(dev0->priv.eswitch); + if (dev1) + mlx5_esw_lock(dev1->priv.eswitch); +} + +static void mlx5_lag_unlock_eswitches(struct mlx5_core_dev *dev0, + struct mlx5_core_dev *dev1) +{ + if (dev1) + mlx5_esw_unlock(dev1->priv.eswitch); + if (dev0) + mlx5_esw_unlock(dev0->priv.eswitch); +} + static void mlx5_do_bond_work(struct work_struct *work) { struct delayed_work *delayed_work = to_delayed_work(work); struct mlx5_lag *ldev = container_of(delayed_work, struct mlx5_lag, bond_work); + struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; + struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; int status; status = mlx5_dev_list_trylock(); if 
(!status) { - /* 1 sec delay. */ mlx5_queue_bond_work(ldev, HZ); return; } + if (ldev->mode_changes_in_progress) { + mlx5_dev_list_unlock(); + mlx5_queue_bond_work(ldev, HZ); + return; + } + + mlx5_lag_lock_eswitches(dev0, dev1); mlx5_do_bond(ldev); + mlx5_lag_unlock_eswitches(dev0, dev1); mlx5_dev_list_unlock(); } @@ -630,7 +657,7 @@ static void mlx5_ldev_remove_mdev(struct mlx5_lag *ldev, } /* Must be called with intf_mutex held */ -static void __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) +static int __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev = NULL; struct mlx5_core_dev *tmp_dev; @@ -638,7 +665,7 @@ static void __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) if (!MLX5_CAP_GEN(dev, vport_group_manager) || !MLX5_CAP_GEN(dev, lag_master) || MLX5_CAP_GEN(dev, num_lag_ports) != MLX5_MAX_PORTS) - return; + return 0; tmp_dev = mlx5_get_next_phys_dev(dev); if (tmp_dev) @@ -648,15 +675,17 @@ static void __mlx5_lag_dev_add_mdev(struct mlx5_core_dev *dev) ldev = mlx5_lag_dev_alloc(dev); if (!ldev) { mlx5_core_err(dev, "Failed to alloc lag dev\n"); - return; + return 0; } } else { + if (ldev->mode_changes_in_progress) + return -EAGAIN; mlx5_ldev_get(ldev); } mlx5_ldev_add_mdev(ldev, dev); - return; + return 0; } void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev) @@ -667,7 +696,13 @@ void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev) if (!ldev) return; +recheck: mlx5_dev_list_lock(); + if (ldev->mode_changes_in_progress) { + mlx5_dev_list_unlock(); + msleep(100); + goto recheck; + } mlx5_ldev_remove_mdev(ldev, dev); mlx5_dev_list_unlock(); mlx5_ldev_put(ldev); @@ -675,8 +710,16 @@ void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev) void mlx5_lag_add_mdev(struct mlx5_core_dev *dev) { + int err; + +recheck: mlx5_dev_list_lock(); - __mlx5_lag_dev_add_mdev(dev); + err = __mlx5_lag_dev_add_mdev(dev); + if (err) { + mlx5_dev_list_unlock(); + msleep(100); + goto recheck; + } mlx5_dev_list_unlock(); } @@ -716,6 +759,7 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev, if (i >= MLX5_MAX_PORTS) ldev->flags |= MLX5_LAG_FLAG_READY; + mlx5_queue_bond_work(ldev, 0); } bool mlx5_lag_is_roce(struct mlx5_core_dev *dev) @@ -789,19 +833,36 @@ bool mlx5_lag_is_shared_fdb(struct mlx5_core_dev *dev) } EXPORT_SYMBOL(mlx5_lag_is_shared_fdb); -void mlx5_lag_update(struct mlx5_core_dev *dev) +void mlx5_lag_disable_change(struct mlx5_core_dev *dev) { + struct mlx5_core_dev *dev0; + struct mlx5_core_dev *dev1; struct mlx5_lag *ldev; mlx5_dev_list_lock(); + ldev = mlx5_lag_dev(dev); - if (!ldev) - goto unlock; + dev0 = ldev->pf[MLX5_LAG_P1].dev; + dev1 = ldev->pf[MLX5_LAG_P2].dev; - mlx5_do_bond(ldev); + ldev->mode_changes_in_progress++; + if (__mlx5_lag_is_active(ldev)) { + mlx5_lag_lock_eswitches(dev0, dev1); + mlx5_disable_lag(ldev); + mlx5_lag_unlock_eswitches(dev0, dev1); + } + mlx5_dev_list_unlock(); +} -unlock: +void mlx5_lag_enable_change(struct mlx5_core_dev *dev) +{ + struct mlx5_lag *ldev; + + mlx5_dev_list_lock(); + ldev = mlx5_lag_dev(dev); + ldev->mode_changes_in_progress--; mlx5_dev_list_unlock(); + mlx5_queue_bond_work(ldev, 0); } struct net_device *mlx5_lag_get_roce_netdev(struct mlx5_core_dev *dev) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index 70b244b1a09e..e1d7a6671cf3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -39,6 +39,7 @@ struct lag_tracker { */ struct mlx5_lag { u8 flags; + int mode_changes_in_progress; bool shared_fdb; 
u8 v2p_map[MLX5_MAX_PORTS]; struct kref ref; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index eb1b316560a8..1357a6ec8c3c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -1179,6 +1179,7 @@ static int mlx5_load(struct mlx5_core_dev *dev) goto err_ec; } + mlx5_lag_add_mdev(dev); err = mlx5_sriov_attach(dev); if (err) { mlx5_core_err(dev, "sriov init failed %d\n", err); @@ -1186,11 +1187,11 @@ static int mlx5_load(struct mlx5_core_dev *dev) } mlx5_sf_dev_table_create(dev); - mlx5_lag_add_mdev(dev); return 0; err_sriov: + mlx5_lag_remove_mdev(dev); mlx5_ec_cleanup(dev); err_ec: mlx5_sf_hw_table_destroy(dev); @@ -1222,9 +1223,9 @@ static int mlx5_load(struct mlx5_core_dev *dev) static void mlx5_unload(struct mlx5_core_dev *dev) { - mlx5_lag_remove_mdev(dev); mlx5_sf_dev_table_destroy(dev); mlx5_sriov_detach(dev); + mlx5_lag_remove_mdev(dev); mlx5_ec_cleanup(dev); mlx5_sf_hw_table_destroy(dev); mlx5_vhca_event_stop(dev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index 343807ac2036..14ffd74eeabe 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -168,6 +168,8 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev, struct net_device *netdev); void mlx5_lag_remove_netdev(struct mlx5_core_dev *dev, struct net_device *netdev); void mlx5_lag_add_mdev(struct mlx5_core_dev *dev); void mlx5_lag_remove_mdev(struct mlx5_core_dev *dev); +void mlx5_lag_disable_change(struct mlx5_core_dev *dev); +void mlx5_lag_enable_change(struct mlx5_core_dev *dev); int mlx5_events_init(struct mlx5_core_dev *dev); void mlx5_events_cleanup(struct mlx5_core_dev *dev); From patchwork Tue Aug 3 23:19:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417631 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1BA4C19F31 for ; Tue, 3 Aug 2021 23:20:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8DD9660F94 for ; Tue, 3 Aug 2021 23:20:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233922AbhHCXUj (ORCPT ); Tue, 3 Aug 2021 19:20:39 -0400 Received: from mail.kernel.org ([198.145.29.99]:38966 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233670AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 21CFA60F93; Tue, 3 Aug 2021 23:20:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032823; bh=JcEopKgjR7krcW6ZEC6zhfogi5KM4huoKxlKQeSHzDQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HGF2wfSd2w6Dpu4I+XW80N4ha3px7GwDxpBJVREXt+rgusGBfT8ru+6L9UYqb4b4E w6lpCQRyzj8o3NMiuBbuw+OHYC32cSaXmTGT4IDCQmFaFLB0SH2vaDfQq73P6PAJx/ 
kP+Mkvg6P6/bzKO1oOmDsNLIpsKJcldnQg09pBnkzMqXilEm9/qUtDpc03UvVonFFo rN6FHlj2FaZM0LcfE7sOjtE4eHBcu0ngM/PlxLSlSey2onOjTmXUMZ5fakkQ4wOAZ6 UzvWh0WPU+SJouQ+Zmlnj34KpomkPW5PKaRWRdc1URGKJ/kKI9jg89yJk5V/cuLZYL gYxNRyrxiNU7g== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 12/14] net/mlx5: Lag, move lag destruction to a workqueue Date: Tue, 3 Aug 2021 16:19:57 -0700 Message-Id: <20210803231959.26513-13-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch If a netdev is removed from the lag the lag should be destroyed. With downstream patches this might trigger a reconfiguration of representors on a different eswitch and such we don't have the proper locking to so from this path. Move the destruction to be done by the workqueue. As the destruction won't affect the netdev side it okay to do so. The RDMA side will be reconfigured and it already coded to handle such reconfiguration. Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 459e3e5ef13f..89cd2b2af50a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -371,12 +371,13 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) bool do_bond, roce_lag; int err; - if (!mlx5_lag_is_ready(ldev)) - return; - - tracker = ldev->tracker; + if (!mlx5_lag_is_ready(ldev)) { + do_bond = false; + } else { + tracker = ldev->tracker; - do_bond = tracker.is_bonded && mlx5_lag_check_prereq(ldev); + do_bond = tracker.is_bonded && mlx5_lag_check_prereq(ldev); + } if (do_bond && !__mlx5_lag_is_active(ldev)) { roce_lag = !mlx5_sriov_is_enabled(dev0) && @@ -733,11 +734,11 @@ void mlx5_lag_remove_netdev(struct mlx5_core_dev *dev, if (!ldev) return; - if (__mlx5_lag_is_active(ldev)) - mlx5_disable_lag(ldev); - mlx5_ldev_remove_netdev(ldev, netdev); ldev->flags &= ~MLX5_LAG_FLAG_READY; + + if (__mlx5_lag_is_active(ldev)) + mlx5_queue_bond_work(ldev, 0); } /* Must be called with intf_mutex held */ From patchwork Tue Aug 3 23:19:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417637 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6CAEC19F3A for ; Tue, 3 Aug 2021 23:20:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7744C60F58 for ; Tue, 3 Aug 2021 23:20:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233929AbhHCXUk (ORCPT ); Tue, 3 Aug 2021 19:20:40 -0400 Received: from 
mail.kernel.org ([198.145.29.99]:38974 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233705AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 950CE60F94; Tue, 3 Aug 2021 23:20:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032823; bh=+E6/e9DNe2bAuYvcxV8vgchUS5QVc4wisvubHCRs7TQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dJPz6Zxn+tJKyQggIBxlyJqAIQ+nEhu5HKaF9K6pktkLYKyFJAA3YnmpWlM9o/LFV DpIxLkZYjGl7j0IuSTOgbe0eQhX1pSENCovwkfGNFFlhfhQLjJQE0qZ6ZsPDjH8mG4 WsV4fIO36cwI6jt/xmzGhq93cYbAvGH1uQMEAeeME9tTc80GaxDzwnzwcn//hL2bZb ZhwOFJT3kS7Mm5U9c67JTWiwhZ13jUmfq5wwgotb4i4VklAsutLReLsWk1RdDniLsk bRQgb7EuqtUaTZwd1164WMHauxtqFRIE8RZyP/eB2NvuOxMpzebL66dBnY6sDWwLRm b7OGOsXb8WT1w== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch , Mark Zhang Subject: [PATCH mlx5-next 13/14] net/mlx5/ E-Switch, add logic to enable shared FDB Date: Tue, 3 Aug 2021 16:19:58 -0700 Message-Id: <20210803231959.26513-14-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch Shared FDB allows to direct traffic from all the vports in the HCA to a single eswitch. In order to do that three things are needed. 1) Point the ingress ACL of the slave uplink to that of the master. With this, wire traffic from both uplinks will reach the same eswitch with the same metadata where a single steering rule can catch traffic from both ports. 2) Set the FDB root flow table of the slave's eswitch to that of the master. As this flow table can change dynamically make sure to sync it on any set root flow table FDB command. This will make sure traffic from SFs, VFs, ECPFs and PFs reach the master eswitch. 3) Split wire traffic at the eswitch manager egress ACL so that it's directed to the native eswitch manager. We only treat wire traffic from both ports the same at the eswitch level. If such traffic wasn't handled in the eswitch it needs to reach the right representor to be processed by software. For example LACP packets should *always* reach the right uplink representor for correct operation. 
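In code, the three steps above map onto the new mlx5_eswitch_offloads_config_single_fdb() added in this patch. A condensed sketch of that sequence follows; the example_* wrapper name is hypothetical, the esw_set_* helpers are the ones introduced below, and passing a NULL master restores the slave's own roots:

/* Condensed sketch of mlx5_eswitch_offloads_config_single_fdb() below;
 * illustration only, with error handling trimmed to the rollback order.
 */
static int example_config_single_fdb(struct mlx5_eswitch *master_esw,
				     struct mlx5_eswitch *slave_esw)
{
	int err;

	/* 1) point the slave uplink's ingress ACL root at the master's */
	err = esw_set_uplink_slave_ingress_root(master_esw->dev, slave_esw->dev);
	if (err)
		return err;

	/* 2) point the slave's FDB root flow table at the master's */
	err = esw_set_slave_root_fdb(master_esw->dev, slave_esw->dev);
	if (err)
		goto err_fdb;

	/* 3) bounce the slave's wire traffic from the master's eswitch
	 * manager egress ACL back towards the slave's manager vport
	 */
	err = esw_set_master_egress_rule(master_esw->dev, slave_esw->dev);
	if (err)
		goto err_acl;

	return 0;

err_acl:
	esw_set_slave_root_fdb(NULL, slave_esw->dev);
err_fdb:
	esw_set_uplink_slave_ingress_root(NULL, slave_esw->dev);
	return err;
}
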
Signed-off-by: Mark Bloch Reviewed-by: Mark Zhang Signed-off-by: Saeed Mahameed --- .../mellanox/mlx5/core/esw/acl/egress_ofld.c | 16 + .../net/ethernet/mellanox/mlx5/core/eswitch.h | 25 ++ .../mellanox/mlx5/core/eswitch_offloads.c | 293 ++++++++++++++++++ .../net/ethernet/mellanox/mlx5/core/fs_cmd.c | 58 +++- .../net/ethernet/mellanox/mlx5/core/fs_core.c | 2 +- .../net/ethernet/mellanox/mlx5/core/fs_core.h | 2 + 6 files changed, 394 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c index 505bf811984a..2e504c7461c6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.c @@ -15,6 +15,15 @@ static void esw_acl_egress_ofld_fwd2vport_destroy(struct mlx5_vport *vport) vport->egress.offloads.fwd_rule = NULL; } +static void esw_acl_egress_ofld_bounce_rule_destroy(struct mlx5_vport *vport) +{ + if (!vport->egress.offloads.bounce_rule) + return; + + mlx5_del_flow_rules(vport->egress.offloads.bounce_rule); + vport->egress.offloads.bounce_rule = NULL; +} + static int esw_acl_egress_ofld_fwd2vport_create(struct mlx5_eswitch *esw, struct mlx5_vport *vport, struct mlx5_flow_destination *fwd_dest) @@ -87,6 +96,7 @@ static void esw_acl_egress_ofld_rules_destroy(struct mlx5_vport *vport) { esw_acl_egress_vlan_destroy(vport); esw_acl_egress_ofld_fwd2vport_destroy(vport); + esw_acl_egress_ofld_bounce_rule_destroy(vport); } static int esw_acl_egress_ofld_groups_create(struct mlx5_eswitch *esw, @@ -145,6 +155,12 @@ static void esw_acl_egress_ofld_groups_destroy(struct mlx5_vport *vport) mlx5_destroy_flow_group(vport->egress.offloads.fwd_grp); vport->egress.offloads.fwd_grp = NULL; } + + if (!IS_ERR_OR_NULL(vport->egress.offloads.bounce_grp)) { + mlx5_destroy_flow_group(vport->egress.offloads.bounce_grp); + vport->egress.offloads.bounce_grp = NULL; + } + esw_acl_egress_vlan_grp_destroy(vport); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 5a27445fa892..f64aaf85b6ee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -132,6 +132,8 @@ struct vport_egress { struct { struct mlx5_flow_group *fwd_grp; struct mlx5_flow_handle *fwd_rule; + struct mlx5_flow_handle *bounce_rule; + struct mlx5_flow_group *bounce_grp; } offloads; }; }; @@ -714,6 +716,12 @@ void esw_vport_change_handle_locked(struct mlx5_vport *vport); bool mlx5_esw_offloads_controller_valid(const struct mlx5_eswitch *esw, u32 controller); +int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw, + struct mlx5_eswitch *slave_esw); +void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw, + struct mlx5_eswitch *slave_esw); +int mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw); + #else /* CONFIG_MLX5_ESWITCH */ /* eswitch API stubs */ static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; } @@ -744,6 +752,23 @@ mlx5_esw_vport_to_devlink_port_index(const struct mlx5_core_dev *dev, { return vport_num; } + +static inline int +mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw, + struct mlx5_eswitch *slave_esw) +{ + return 0; +} + +static inline void +mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw, + struct mlx5_eswitch *slave_esw) {} + +static inline int +mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw) 
+{ + return 0; +} #endif /* CONFIG_MLX5_ESWITCH */ #endif /* __MLX5_ESWITCH_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 109cbbb99933..192255e67ef4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -2325,6 +2325,274 @@ void esw_offloads_unload_rep(struct mlx5_eswitch *esw, u16 vport_num) mlx5_esw_offloads_devlink_port_unregister(esw, vport_num); } +static int esw_set_uplink_slave_ingress_root(struct mlx5_core_dev *master, + struct mlx5_core_dev *slave) +{ + u32 in[MLX5_ST_SZ_DW(set_flow_table_root_in)] = {}; + u32 out[MLX5_ST_SZ_DW(set_flow_table_root_out)] = {}; + struct mlx5_eswitch *esw; + struct mlx5_flow_root_namespace *root; + struct mlx5_flow_namespace *ns; + struct mlx5_vport *vport; + int err; + + MLX5_SET(set_flow_table_root_in, in, opcode, + MLX5_CMD_OP_SET_FLOW_TABLE_ROOT); + MLX5_SET(set_flow_table_root_in, in, table_type, FS_FT_ESW_INGRESS_ACL); + MLX5_SET(set_flow_table_root_in, in, other_vport, 1); + MLX5_SET(set_flow_table_root_in, in, vport_number, MLX5_VPORT_UPLINK); + + if (master) { + esw = master->priv.eswitch; + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_UPLINK); + MLX5_SET(set_flow_table_root_in, in, table_of_other_vport, 1); + MLX5_SET(set_flow_table_root_in, in, table_vport_number, + MLX5_VPORT_UPLINK); + + ns = mlx5_get_flow_vport_acl_namespace(master, + MLX5_FLOW_NAMESPACE_ESW_INGRESS, + vport->index); + root = find_root(&ns->node); + mutex_lock(&root->chain_lock); + + MLX5_SET(set_flow_table_root_in, in, + table_eswitch_owner_vhca_id_valid, 1); + MLX5_SET(set_flow_table_root_in, in, + table_eswitch_owner_vhca_id, + MLX5_CAP_GEN(master, vhca_id)); + MLX5_SET(set_flow_table_root_in, in, table_id, + root->root_ft->id); + } else { + esw = slave->priv.eswitch; + vport = mlx5_eswitch_get_vport(esw, MLX5_VPORT_UPLINK); + ns = mlx5_get_flow_vport_acl_namespace(slave, + MLX5_FLOW_NAMESPACE_ESW_INGRESS, + vport->index); + root = find_root(&ns->node); + mutex_lock(&root->chain_lock); + MLX5_SET(set_flow_table_root_in, in, table_id, root->root_ft->id); + } + + err = mlx5_cmd_exec(slave, in, sizeof(in), out, sizeof(out)); + mutex_unlock(&root->chain_lock); + + return err; +} + +static int esw_set_slave_root_fdb(struct mlx5_core_dev *master, + struct mlx5_core_dev *slave) +{ + u32 in[MLX5_ST_SZ_DW(set_flow_table_root_in)] = {}; + u32 out[MLX5_ST_SZ_DW(set_flow_table_root_out)] = {}; + struct mlx5_flow_root_namespace *root; + struct mlx5_flow_namespace *ns; + int err; + + MLX5_SET(set_flow_table_root_in, in, opcode, + MLX5_CMD_OP_SET_FLOW_TABLE_ROOT); + MLX5_SET(set_flow_table_root_in, in, table_type, + FS_FT_FDB); + + if (master) { + ns = mlx5_get_flow_namespace(master, + MLX5_FLOW_NAMESPACE_FDB); + root = find_root(&ns->node); + mutex_lock(&root->chain_lock); + MLX5_SET(set_flow_table_root_in, in, + table_eswitch_owner_vhca_id_valid, 1); + MLX5_SET(set_flow_table_root_in, in, + table_eswitch_owner_vhca_id, + MLX5_CAP_GEN(master, vhca_id)); + MLX5_SET(set_flow_table_root_in, in, table_id, + root->root_ft->id); + } else { + ns = mlx5_get_flow_namespace(slave, + MLX5_FLOW_NAMESPACE_FDB); + root = find_root(&ns->node); + mutex_lock(&root->chain_lock); + MLX5_SET(set_flow_table_root_in, in, table_id, + root->root_ft->id); + } + + err = mlx5_cmd_exec(slave, in, sizeof(in), out, sizeof(out)); + mutex_unlock(&root->chain_lock); + + return err; +} + +static int 
__esw_set_master_egress_rule(struct mlx5_core_dev *master, + struct mlx5_core_dev *slave, + struct mlx5_vport *vport, + struct mlx5_flow_table *acl) +{ + struct mlx5_flow_handle *flow_rule = NULL; + struct mlx5_flow_destination dest = {}; + struct mlx5_flow_act flow_act = {}; + struct mlx5_flow_spec *spec; + int err = 0; + void *misc; + + spec = kvzalloc(sizeof(*spec), GFP_KERNEL); + if (!spec) + return -ENOMEM; + + spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS; + misc = MLX5_ADDR_OF(fte_match_param, spec->match_value, + misc_parameters); + MLX5_SET(fte_match_set_misc, misc, source_port, MLX5_VPORT_UPLINK); + MLX5_SET(fte_match_set_misc, misc, source_eswitch_owner_vhca_id, + MLX5_CAP_GEN(slave, vhca_id)); + + misc = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters); + MLX5_SET_TO_ONES(fte_match_set_misc, misc, source_port); + MLX5_SET_TO_ONES(fte_match_set_misc, misc, + source_eswitch_owner_vhca_id); + + flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; + dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT; + dest.vport.num = slave->priv.eswitch->manager_vport; + dest.vport.vhca_id = MLX5_CAP_GEN(slave, vhca_id); + dest.vport.flags |= MLX5_FLOW_DEST_VPORT_VHCA_ID; + + flow_rule = mlx5_add_flow_rules(acl, spec, &flow_act, + &dest, 1); + if (IS_ERR(flow_rule)) + err = PTR_ERR(flow_rule); + else + vport->egress.offloads.bounce_rule = flow_rule; + + kvfree(spec); + return err; +} + +static int esw_set_master_egress_rule(struct mlx5_core_dev *master, + struct mlx5_core_dev *slave) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_eswitch *esw = master->priv.eswitch; + struct mlx5_flow_table_attr ft_attr = { + .max_fte = 1, .prio = 0, .level = 0, + }; + struct mlx5_flow_namespace *egress_ns; + struct mlx5_flow_table *acl; + struct mlx5_flow_group *g; + struct mlx5_vport *vport; + void *match_criteria; + u32 *flow_group_in; + int err; + + vport = mlx5_eswitch_get_vport(esw, esw->manager_vport); + if (IS_ERR(vport)) + return PTR_ERR(vport); + + egress_ns = mlx5_get_flow_vport_acl_namespace(master, + MLX5_FLOW_NAMESPACE_ESW_EGRESS, + vport->index); + if (!egress_ns) + return -EINVAL; + + if (vport->egress.acl) + return -EINVAL; + + flow_group_in = kvzalloc(inlen, GFP_KERNEL); + if (!flow_group_in) + return -ENOMEM; + + acl = mlx5_create_vport_flow_table(egress_ns, &ft_attr, vport->vport); + if (IS_ERR(acl)) { + err = PTR_ERR(acl); + goto out; + } + + match_criteria = MLX5_ADDR_OF(create_flow_group_in, flow_group_in, + match_criteria); + MLX5_SET_TO_ONES(fte_match_param, match_criteria, + misc_parameters.source_port); + MLX5_SET_TO_ONES(fte_match_param, match_criteria, + misc_parameters.source_eswitch_owner_vhca_id); + MLX5_SET(create_flow_group_in, flow_group_in, match_criteria_enable, + MLX5_MATCH_MISC_PARAMETERS); + + MLX5_SET(create_flow_group_in, flow_group_in, + source_eswitch_owner_vhca_id_valid, 1); + MLX5_SET(create_flow_group_in, flow_group_in, start_flow_index, 0); + MLX5_SET(create_flow_group_in, flow_group_in, end_flow_index, 0); + + g = mlx5_create_flow_group(acl, flow_group_in); + if (IS_ERR(g)) { + err = PTR_ERR(g); + goto err_group; + } + + err = __esw_set_master_egress_rule(master, slave, vport, acl); + if (err) + goto err_rule; + + vport->egress.acl = acl; + vport->egress.offloads.bounce_grp = g; + + kvfree(flow_group_in); + + return 0; + +err_rule: + mlx5_destroy_flow_group(g); +err_group: + mlx5_destroy_flow_table(acl); +out: + kvfree(flow_group_in); + return err; +} + +static void esw_unset_master_egress_rule(struct mlx5_core_dev 
*dev) +{ + struct mlx5_vport *vport; + + vport = mlx5_eswitch_get_vport(dev->priv.eswitch, + dev->priv.eswitch->manager_vport); + + esw_acl_egress_ofld_cleanup(vport); +} + +int mlx5_eswitch_offloads_config_single_fdb(struct mlx5_eswitch *master_esw, + struct mlx5_eswitch *slave_esw) +{ + int err; + + err = esw_set_uplink_slave_ingress_root(master_esw->dev, + slave_esw->dev); + if (err) + return -EINVAL; + + err = esw_set_slave_root_fdb(master_esw->dev, + slave_esw->dev); + if (err) + goto err_fdb; + + err = esw_set_master_egress_rule(master_esw->dev, + slave_esw->dev); + if (err) + goto err_acl; + + return err; + +err_acl: + esw_set_slave_root_fdb(NULL, slave_esw->dev); + +err_fdb: + esw_set_uplink_slave_ingress_root(NULL, slave_esw->dev); + + return err; +} + +void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw, + struct mlx5_eswitch *slave_esw) +{ + esw_unset_master_egress_rule(master_esw->dev); + esw_set_slave_root_fdb(NULL, slave_esw->dev); + esw_set_uplink_slave_ingress_root(NULL, slave_esw->dev); +} + #define ESW_OFFLOADS_DEVCOM_PAIR (0) #define ESW_OFFLOADS_DEVCOM_UNPAIR (1) @@ -2674,6 +2942,31 @@ static void esw_destroy_uplink_offloads_acl_tables(struct mlx5_eswitch *esw) esw_vport_destroy_offloads_acl_tables(esw, vport); } +int mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw) +{ + struct mlx5_eswitch_rep *rep; + unsigned long i; + int ret; + + if (!esw || esw->mode != MLX5_ESWITCH_OFFLOADS) + return 0; + + rep = mlx5_eswitch_get_rep(esw, MLX5_VPORT_UPLINK); + if (atomic_read(&rep->rep_data[REP_ETH].state) != REP_LOADED) + return 0; + + ret = mlx5_esw_offloads_rep_load(esw, MLX5_VPORT_UPLINK); + if (ret) + return ret; + + mlx5_esw_for_each_rep(esw, i, rep) { + if (atomic_read(&rep->rep_data[REP_ETH].state) == REP_LOADED) + mlx5_esw_offloads_rep_load(esw, rep->vport); + } + + return 0; +} + static int esw_offloads_steering_init(struct mlx5_eswitch *esw) { struct mlx5_esw_indir_table *indir; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c index 896a6c3dbdb7..7db8df64a60e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.c @@ -152,17 +152,56 @@ static int mlx5_cmd_stub_destroy_ns(struct mlx5_flow_root_namespace *ns) return 0; } +static int mlx5_cmd_set_slave_root_fdb(struct mlx5_core_dev *master, + struct mlx5_core_dev *slave, + bool ft_id_valid, + u32 ft_id) +{ + u32 out[MLX5_ST_SZ_DW(set_flow_table_root_out)] = {}; + u32 in[MLX5_ST_SZ_DW(set_flow_table_root_in)] = {}; + struct mlx5_flow_root_namespace *root; + struct mlx5_flow_namespace *ns; + + MLX5_SET(set_flow_table_root_in, in, opcode, + MLX5_CMD_OP_SET_FLOW_TABLE_ROOT); + MLX5_SET(set_flow_table_root_in, in, table_type, + FS_FT_FDB); + if (ft_id_valid) { + MLX5_SET(set_flow_table_root_in, in, + table_eswitch_owner_vhca_id_valid, 1); + MLX5_SET(set_flow_table_root_in, in, + table_eswitch_owner_vhca_id, + MLX5_CAP_GEN(master, vhca_id)); + MLX5_SET(set_flow_table_root_in, in, table_id, + ft_id); + } else { + ns = mlx5_get_flow_namespace(slave, + MLX5_FLOW_NAMESPACE_FDB); + root = find_root(&ns->node); + MLX5_SET(set_flow_table_root_in, in, table_id, + root->root_ft->id); + } + + return mlx5_cmd_exec(slave, in, sizeof(in), out, sizeof(out)); +} + static int mlx5_cmd_update_root_ft(struct mlx5_flow_root_namespace *ns, struct mlx5_flow_table *ft, u32 underlay_qpn, bool disconnect) { u32 in[MLX5_ST_SZ_DW(set_flow_table_root_in)] = {}; struct mlx5_core_dev *dev = ns->dev; + int 
err; if ((MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_IB) && underlay_qpn == 0) return 0; + if (ft->type == FS_FT_FDB && + mlx5_lag_is_shared_fdb(dev) && + !mlx5_lag_is_master(dev)) + return 0; + MLX5_SET(set_flow_table_root_in, in, opcode, MLX5_CMD_OP_SET_FLOW_TABLE_ROOT); MLX5_SET(set_flow_table_root_in, in, table_type, ft->type); @@ -177,7 +216,24 @@ static int mlx5_cmd_update_root_ft(struct mlx5_flow_root_namespace *ns, MLX5_SET(set_flow_table_root_in, in, other_vport, !!(ft->flags & MLX5_FLOW_TABLE_OTHER_VPORT)); - return mlx5_cmd_exec_in(dev, set_flow_table_root, in); + err = mlx5_cmd_exec_in(dev, set_flow_table_root, in); + if (!err && + ft->type == FS_FT_FDB && + mlx5_lag_is_shared_fdb(dev) && + mlx5_lag_is_master(dev)) { + err = mlx5_cmd_set_slave_root_fdb(dev, + mlx5_lag_get_peer_mdev(dev), + !disconnect, (!disconnect) ? + ft->id : 0); + if (err && !disconnect) { + MLX5_SET(set_flow_table_root_in, in, op_mod, 0); + MLX5_SET(set_flow_table_root_in, in, table_id, + ns->root_ft->id); + mlx5_cmd_exec_in(dev, set_flow_table_root, in); + } + } + + return err; } static int mlx5_cmd_create_flow_table(struct mlx5_flow_root_namespace *ns, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c index d7bf0a3e4a52..1fba8544314a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c @@ -413,7 +413,7 @@ static bool check_valid_spec(const struct mlx5_flow_spec *spec) return true; } -static struct mlx5_flow_root_namespace *find_root(struct fs_node *node) +struct mlx5_flow_root_namespace *find_root(struct fs_node *node) { struct fs_node *root; struct mlx5_flow_namespace *ns; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h index 7317cdeab661..98240badc342 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h @@ -294,6 +294,8 @@ void mlx5_fs_egress_acls_cleanup(struct mlx5_core_dev *dev); int mlx5_fs_ingress_acls_init(struct mlx5_core_dev *dev, int total_vports); void mlx5_fs_ingress_acls_cleanup(struct mlx5_core_dev *dev); +struct mlx5_flow_root_namespace *find_root(struct fs_node *node); + #define fs_get_obj(v, _node) {v = container_of((_node), typeof(*v), node); } #define fs_list_for_each_entry(pos, root) \ From patchwork Tue Aug 3 23:19:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 12417639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6548C432BE for ; Tue, 3 Aug 2021 23:20:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A4D1260F58 for ; Tue, 3 Aug 2021 23:20:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233958AbhHCXUm (ORCPT ); Tue, 3 Aug 2021 19:20:42 -0400 Received: from mail.kernel.org ([198.145.29.99]:38984 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S233743AbhHCXUf (ORCPT ); Tue, 3 Aug 2021 19:20:35 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 1510760F70; Tue, 3 Aug 2021 23:20:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1628032824; bh=wPMbOjqoH1dBdJO1XE08P8TUVvPbWReG9PxCWFW6RTE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TXyLsvAg7vUfVeQZiTI971az1RFIJbbbInyZ/Pyiz2iwc0k1uZOLPx5lhxmT5RbXY 6V37Lk42+Zt6xKOXy0ytIc3gLWM5M/uMk1Te9pb04GB686BjQyOm9iV4tgwjYL/C0k n3LRATgsI/CLSCUsVfNLerkacPAEZtt3N9pM+6UHzrDdcaOKH4u2Q4pa5blJejSukz xjdWhn6gVtqWCkeFztG1cynIB9VCGezMCBJQcUfg2rQtpaZxH0bKjSCQGbqfozxOtm ll4AaiMT6274LjZA2LD4HP3R1Hx37pDhu0oKJj63c1PosZkylbQO4AZAoqXRwDhA+A 7aPJyh8phggFQ== From: Saeed Mahameed To: Saeed Mahameed , Leon Romanovsky Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org, Mark Bloch Subject: [PATCH mlx5-next 14/14] net/mlx5: Lag, Create shared FDB when in switchdev mode Date: Tue, 3 Aug 2021 16:19:59 -0700 Message-Id: <20210803231959.26513-15-saeed@kernel.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210803231959.26513-1-saeed@kernel.org> References: <20210803231959.26513-1-saeed@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mark Bloch If both eswitches are in switchdev mode and the uplink representors are enslaved to the same bond device create a shared FDB configuration. When moving to shared FDB mode not only the hardware needs be configured but the RDMA driver needs to reconfigure itself. When such change is done, unload the RDMA devices, configure the hardware and load the RDMA representors. When destroying the lag (can happen if a PCI function is unbinded, driver is unloaded or by just removing a netdev from the bond) make sure to restore the system to the previous state only if possible. For example, if a PCI function is unbinded there is no need to load the representors as the device is going away. 
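In mlx5_do_bond() (patch below) this becomes roughly the following bring-up order. A simplified sketch, with the roce_lag handling and the adev/driver-rescan flag juggling trimmed, and with a hypothetical example_* wrapper name:

/* Simplified sketch of the shared-FDB bring-up done in mlx5_do_bond()
 * below; illustration only.
 */
static void example_shared_fdb_bond(struct mlx5_lag *ldev,
				    struct lag_tracker *tracker)
{
	struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
	struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
	bool shared_fdb = mlx5_shared_fdb_supported(ldev);
	int err;

	/* unload the RDMA (IB) auxiliary devices first */
	mlx5_lag_remove_devices(ldev);

	/* configure the hardware: create the LAG with the shared FDB mode */
	err = mlx5_activate_lag(ldev, tracker, MLX5_LAG_FLAG_SRIOV, shared_fdb);
	if (err)
		goto add_back;

	/* reload the representors, now running on the single FDB */
	err = mlx5_eswitch_reload_reps(dev0->priv.eswitch);
	if (!err)
		err = mlx5_eswitch_reload_reps(dev1->priv.eswitch);
	if (err) {
		mlx5_deactivate_lag(ldev);
		goto add_back;
	}
	return;

add_back:
	/* restore the previous, non-shared configuration where possible */
	mlx5_lag_add_devices(ldev);
}
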
Signed-off-by: Mark Bloch Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 118 +++++++++++++++--- drivers/net/ethernet/mellanox/mlx5/core/lag.h | 3 +- .../net/ethernet/mellanox/mlx5/core/lag_mp.c | 2 +- 3 files changed, 105 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index 89cd2b2af50a..f4dfa55c8c7e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -32,7 +32,9 @@ #include #include +#include #include +#include "lib/devcom.h" #include "mlx5_core.h" #include "eswitch.h" #include "lag.h" @@ -45,7 +47,7 @@ static DEFINE_SPINLOCK(lag_lock); static int mlx5_cmd_create_lag(struct mlx5_core_dev *dev, u8 remap_port1, - u8 remap_port2) + u8 remap_port2, bool shared_fdb) { u32 in[MLX5_ST_SZ_DW(create_lag_in)] = {}; void *lag_ctx = MLX5_ADDR_OF(create_lag_in, in, ctx); @@ -54,6 +56,7 @@ static int mlx5_cmd_create_lag(struct mlx5_core_dev *dev, u8 remap_port1, MLX5_SET(lagc, lag_ctx, tx_remap_affinity_1, remap_port1); MLX5_SET(lagc, lag_ctx, tx_remap_affinity_2, remap_port2); + MLX5_SET(lagc, lag_ctx, fdb_selection_mode, shared_fdb); return mlx5_cmd_exec_in(dev, create_lag, in); } @@ -224,35 +227,59 @@ void mlx5_modify_lag(struct mlx5_lag *ldev, } static int mlx5_create_lag(struct mlx5_lag *ldev, - struct lag_tracker *tracker) + struct lag_tracker *tracker, + bool shared_fdb) { struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; + struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + u32 in[MLX5_ST_SZ_DW(destroy_lag_in)] = {}; int err; mlx5_infer_tx_affinity_mapping(tracker, &ldev->v2p_map[MLX5_LAG_P1], &ldev->v2p_map[MLX5_LAG_P2]); - mlx5_core_info(dev0, "lag map port 1:%d port 2:%d", - ldev->v2p_map[MLX5_LAG_P1], ldev->v2p_map[MLX5_LAG_P2]); + mlx5_core_info(dev0, "lag map port 1:%d port 2:%d shared_fdb:%d", + ldev->v2p_map[MLX5_LAG_P1], ldev->v2p_map[MLX5_LAG_P2], + shared_fdb); err = mlx5_cmd_create_lag(dev0, ldev->v2p_map[MLX5_LAG_P1], - ldev->v2p_map[MLX5_LAG_P2]); - if (err) + ldev->v2p_map[MLX5_LAG_P2], shared_fdb); + if (err) { mlx5_core_err(dev0, "Failed to create LAG (%d)\n", err); + return err; + } + + if (shared_fdb) { + err = mlx5_eswitch_offloads_config_single_fdb(dev0->priv.eswitch, + dev1->priv.eswitch); + if (err) + mlx5_core_err(dev0, "Can't enable single FDB mode\n"); + else + mlx5_core_info(dev0, "Operation mode is single FDB\n"); + } + + if (err) { + MLX5_SET(destroy_lag_in, in, opcode, MLX5_CMD_OP_DESTROY_LAG); + if (mlx5_cmd_exec_in(dev0, destroy_lag, in)) + mlx5_core_err(dev0, + "Failed to deactivate RoCE LAG; driver restart required\n"); + } + return err; } int mlx5_activate_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker, - u8 flags) + u8 flags, + bool shared_fdb) { bool roce_lag = !!(flags & MLX5_LAG_FLAG_ROCE); struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; int err; - err = mlx5_create_lag(ldev, tracker); + err = mlx5_create_lag(ldev, tracker, shared_fdb); if (err) { if (roce_lag) { mlx5_core_err(dev0, @@ -266,6 +293,7 @@ int mlx5_activate_lag(struct mlx5_lag *ldev, } ldev->flags |= flags; + ldev->shared_fdb = shared_fdb; return 0; } @@ -278,6 +306,12 @@ static int mlx5_deactivate_lag(struct mlx5_lag *ldev) ldev->flags &= ~MLX5_LAG_MODE_FLAGS; + if (ldev->shared_fdb) { + mlx5_eswitch_offloads_destroy_single_fdb(ldev->pf[MLX5_LAG_P1].dev->priv.eswitch, + ldev->pf[MLX5_LAG_P2].dev->priv.eswitch); + ldev->shared_fdb = false; + } + MLX5_SET(destroy_lag_in, in, opcode, 
MLX5_CMD_OP_DESTROY_LAG); err = mlx5_cmd_exec_in(dev0, destroy_lag, in); if (err) { @@ -333,6 +367,10 @@ static void mlx5_lag_remove_devices(struct mlx5_lag *ldev) if (!ldev->pf[i].dev) continue; + if (ldev->pf[i].dev->priv.flags & + MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV) + continue; + ldev->pf[i].dev->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(ldev->pf[i].dev); } @@ -342,12 +380,15 @@ static void mlx5_disable_lag(struct mlx5_lag *ldev) { struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + bool shared_fdb = ldev->shared_fdb; bool roce_lag; int err; roce_lag = __mlx5_lag_is_roce(ldev); - if (roce_lag) { + if (shared_fdb) { + mlx5_lag_remove_devices(ldev); + } else if (roce_lag) { if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) { dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(dev0); @@ -359,8 +400,34 @@ static void mlx5_disable_lag(struct mlx5_lag *ldev) if (err) return; - if (roce_lag) + if (shared_fdb || roce_lag) mlx5_lag_add_devices(ldev); + + if (shared_fdb) { + if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) + mlx5_eswitch_reload_reps(dev0->priv.eswitch); + if (!(dev1->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV)) + mlx5_eswitch_reload_reps(dev1->priv.eswitch); + } +} + +static bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev) +{ + struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev; + struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev; + + if (is_mdev_switchdev_mode(dev0) && + is_mdev_switchdev_mode(dev1) && + mlx5_eswitch_vport_match_metadata_enabled(dev0->priv.eswitch) && + mlx5_eswitch_vport_match_metadata_enabled(dev1->priv.eswitch) && + mlx5_devcom_is_paired(dev0->priv.devcom, + MLX5_DEVCOM_ESW_OFFLOADS) && + MLX5_CAP_GEN(dev1, lag_native_fdb_selection) && + MLX5_CAP_ESW(dev1, root_ft_on_other_esw) && + MLX5_CAP_ESW(dev0, esw_shared_ingress_acl)) + return true; + + return false; } static void mlx5_do_bond(struct mlx5_lag *ldev) @@ -380,6 +447,8 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) } if (do_bond && !__mlx5_lag_is_active(ldev)) { + bool shared_fdb = mlx5_shared_fdb_supported(ldev); + roce_lag = !mlx5_sriov_is_enabled(dev0) && !mlx5_sriov_is_enabled(dev1); @@ -389,23 +458,40 @@ static void mlx5_do_bond(struct mlx5_lag *ldev) dev1->priv.eswitch->mode == MLX5_ESWITCH_NONE; #endif - if (roce_lag) + if (shared_fdb || roce_lag) mlx5_lag_remove_devices(ldev); err = mlx5_activate_lag(ldev, &tracker, roce_lag ? 
MLX5_LAG_FLAG_ROCE : - MLX5_LAG_FLAG_SRIOV); + MLX5_LAG_FLAG_SRIOV, + shared_fdb); if (err) { - if (roce_lag) + if (shared_fdb || roce_lag) mlx5_lag_add_devices(ldev); return; - } - - if (roce_lag) { + } else if (roce_lag) { dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; mlx5_rescan_drivers_locked(dev0); mlx5_nic_vport_enable_roce(dev1); + } else if (shared_fdb) { + dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; + mlx5_rescan_drivers_locked(dev0); + + err = mlx5_eswitch_reload_reps(dev0->priv.eswitch); + if (!err) + err = mlx5_eswitch_reload_reps(dev1->priv.eswitch); + + if (err) { + dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV; + mlx5_rescan_drivers_locked(dev0); + mlx5_deactivate_lag(ldev); + mlx5_lag_add_devices(ldev); + mlx5_eswitch_reload_reps(dev0->priv.eswitch); + mlx5_eswitch_reload_reps(dev1->priv.eswitch); + mlx5_core_err(dev0, "Failed to enable lag\n"); + return; + } } } else if (do_bond && __mlx5_lag_is_active(ldev)) { mlx5_modify_lag(ldev, &tracker); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index e1d7a6671cf3..d4bae528954e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -73,7 +73,8 @@ void mlx5_modify_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker); int mlx5_activate_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker, - u8 flags); + u8 flags, + bool shared_fdb); int mlx5_lag_dev_get_netdev_idx(struct mlx5_lag *ldev, struct net_device *ndev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c index c4bf8b679541..011b639b29bf 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c @@ -161,7 +161,7 @@ static void mlx5_lag_fib_route_event(struct mlx5_lag *ldev, struct lag_tracker tracker; tracker = ldev->tracker; - mlx5_activate_lag(ldev, &tracker, MLX5_LAG_FLAG_MULTIPATH); + mlx5_activate_lag(ldev, &tracker, MLX5_LAG_FLAG_MULTIPATH, false); } mlx5_lag_set_port_affinity(ldev, MLX5_LAG_NORMAL_AFFINITY);
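
With the signature change above, every caller now states explicitly whether the LAG should share its FDB. For reference, the updated prototype from lag.h and the multipath call from lag_mp.c, both taken from the hunks above:

int mlx5_activate_lag(struct mlx5_lag *ldev, struct lag_tracker *tracker,
		      u8 flags, bool shared_fdb);

/* multipath LAG keeps per-port FDBs, so it passes shared_fdb = false */
mlx5_activate_lag(ldev, &tracker, MLX5_LAG_FLAG_MULTIPATH, false);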