From patchwork Wed Apr  7 00:14:52 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shiraz Saleem
X-Patchwork-Id: 12186293
X-Patchwork-Delegate: kuba@kernel.org
From: Shiraz Saleem
To: dledford@redhat.com, jgg@nvidia.com, kuba@kernel.org, davem@davemloft.net
Cc: linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
    david.m.ertman@intel.com, anthony.l.nguyen@intel.com,
    Mustafa Ismail, Shiraz Saleem
Subject: [PATCH v4 resend 13/23] RDMA/irdma: Add QoS definitions
Date: Tue, 6 Apr 2021 19:14:52 -0500
Message-Id: <20210407001502.1890-14-shiraz.saleem@intel.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210407001502.1890-1-shiraz.saleem@intel.com>
References: <20210407001502.1890-1-shiraz.saleem@intel.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Mustafa Ismail

Add definitions for managing the RDMA HW work scheduler (WS) tree.

A WS node is created via a control QP operation that specifies the
bandwidth allocation, arbitration scheme, and traffic class of the QP.
The returned Qset handle associates the QoS parameters with the QP.
The Qset is registered with the LAN and an equivalent node is created
in the LAN packet scheduler tree.

Signed-off-by: Mustafa Ismail
Signed-off-by: Shiraz Saleem
---
 drivers/infiniband/hw/irdma/ws.c | 406 +++++++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/irdma/ws.h |  41 ++++
 2 files changed, 447 insertions(+)
 create mode 100644 drivers/infiniband/hw/irdma/ws.c
 create mode 100644 drivers/infiniband/hw/irdma/ws.h

diff --git a/drivers/infiniband/hw/irdma/ws.c b/drivers/infiniband/hw/irdma/ws.c
new file mode 100644
index 0000000..6e4d1d8
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/ws.c
@@ -0,0 +1,406 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2019 - 2020 Intel Corporation */
+#include "osdep.h"
+#include "status.h"
+#include "hmc.h"
+#include "defs.h"
+#include "type.h"
+#include "protos.h"
+
+#include "ws.h"
+
+/**
+ * irdma_alloc_node - Allocate a WS node and init
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ * @node_type: Type of node, leaf or parent
+ * @parent: parent node pointer
+ */
+static struct irdma_ws_node *irdma_alloc_node(struct irdma_sc_vsi *vsi,
+					      u8 user_pri,
+					      enum irdma_ws_node_type node_type,
+					      struct irdma_ws_node *parent)
+{
+	struct irdma_virt_mem ws_mem;
+	struct irdma_ws_node *node;
+	u16 node_index = 0;
+
+	ws_mem.size = sizeof(struct irdma_ws_node);
+	ws_mem.va = kzalloc(ws_mem.size, GFP_KERNEL);
+	if (!ws_mem.va)
+		return NULL;
+
+	if (parent || vsi->vm_vf_type == IRDMA_VF_TYPE) {
+		node_index = irdma_alloc_ws_node_id(vsi->dev);
+		if (node_index == IRDMA_WS_NODE_INVALID) {
+			kfree(ws_mem.va);
+			return NULL;
+		}
+	}
+
+	node = ws_mem.va;
+	node->index = node_index;
+	node->vsi_index = vsi->vsi_idx;
+	INIT_LIST_HEAD(&node->child_list_head);
+	if (node_type == WS_NODE_TYPE_LEAF) {
+		node->type_leaf = true;
+		node->traffic_class = vsi->qos[user_pri].traffic_class;
+		node->user_pri = user_pri;
+		node->rel_bw = vsi->qos[user_pri].rel_bw;
+		if (!node->rel_bw)
+			node->rel_bw = 1;
+
+		node->lan_qs_handle = vsi->qos[user_pri].lan_qos_handle;
+		node->prio_type = IRDMA_PRIO_WEIGHTED_RR;
+	} else {
+		node->rel_bw = 1;
+		node->prio_type = IRDMA_PRIO_WEIGHTED_RR;
+		node->enable = true;
+	}
+
+	node->parent = parent;
+
+	return node;
+}
+
+/**
+ * irdma_free_node - Free a WS node
+ * @vsi: VSI structure of device
+ * @node: Pointer to node to free
+ */
+static void irdma_free_node(struct irdma_sc_vsi *vsi,
+			    struct irdma_ws_node *node)
+{
+	struct irdma_virt_mem ws_mem;
+
+	if (node->index)
+		irdma_free_ws_node_id(vsi->dev, node->index);
+
+	ws_mem.va = node;
+	ws_mem.size = sizeof(struct irdma_ws_node);
+	kfree(ws_mem.va);
+}
+
+/**
+ * irdma_ws_cqp_cmd - Post CQP work scheduler node cmd
+ * @vsi: vsi pointer
+ * @node: pointer to node
+ * @cmd: add, remove or modify
+ */
+static enum irdma_status_code
+irdma_ws_cqp_cmd(struct irdma_sc_vsi *vsi, struct irdma_ws_node *node, u8 cmd)
+{
+	struct irdma_ws_node_info node_info = {};
+
+	node_info.id = node->index;
+	node_info.vsi = node->vsi_index;
+	if (node->parent)
+		node_info.parent_id = node->parent->index;
+	else
+		node_info.parent_id = node_info.id;
+
+	node_info.weight = node->rel_bw;
+	node_info.tc = node->traffic_class;
+	node_info.prio_type = node->prio_type;
+	node_info.type_leaf = node->type_leaf;
+	node_info.enable = node->enable;
+	if (irdma_cqp_ws_node_cmd(vsi->dev, cmd, &node_info)) {
+		ibdev_dbg(to_ibdev(vsi->dev), "WS: CQP WS CMD failed\n");
+		return IRDMA_ERR_NO_MEMORY;
+	}
+
+	if (node->type_leaf && cmd == IRDMA_OP_WS_ADD_NODE) {
+		node->qs_handle = node_info.qs_handle;
+		vsi->qos[node->user_pri].qs_handle = node_info.qs_handle;
+	}
+
+	return 0;
+}
+
+/**
+ * ws_find_node - Find SC WS node based on VSI id or TC
+ * @parent: parent node of First VSI or TC node
+ * @match_val: value to match
+ * @type: match type VSI/TC
+ */
+static struct irdma_ws_node *ws_find_node(struct irdma_ws_node *parent,
+					  u16 match_val,
+					  enum irdma_ws_match_type type)
+{
+	struct irdma_ws_node *node;
+
+	switch (type) {
+	case WS_MATCH_TYPE_VSI:
+		list_for_each_entry(node, &parent->child_list_head, siblings) {
+			if (node->vsi_index == match_val)
+				return node;
+		}
+		break;
+	case WS_MATCH_TYPE_TC:
+		list_for_each_entry(node, &parent->child_list_head, siblings) {
+			if (node->traffic_class == match_val)
+				return node;
+		}
+		break;
+	default:
+		break;
+	}
+
+	return NULL;
+}
+
+/**
+ * irdma_tc_in_use - Checks to see if a leaf node is in use
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+static bool irdma_tc_in_use(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	int i;
+
+	mutex_lock(&vsi->qos[user_pri].qos_mutex);
+	if (!list_empty(&vsi->qos[user_pri].qplist)) {
+		mutex_unlock(&vsi->qos[user_pri].qos_mutex);
+		return true;
+	}
+
+	/* Check if the traffic class associated with the given user priority
+	 * is in use by any other user priority. If so, nothing left to do
+	 */
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		if (vsi->qos[i].traffic_class == vsi->qos[user_pri].traffic_class &&
+		    !list_empty(&vsi->qos[i].qplist)) {
+			mutex_unlock(&vsi->qos[user_pri].qos_mutex);
+			return true;
+		}
+	}
+	mutex_unlock(&vsi->qos[user_pri].qos_mutex);
+
+	return false;
+}
+
+/**
+ * irdma_remove_leaf - Remove leaf node unconditionally
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+static void irdma_remove_leaf(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	struct irdma_ws_node *ws_tree_root, *vsi_node, *tc_node;
+	int i;
+	u16 traffic_class;
+
+	traffic_class = vsi->qos[user_pri].traffic_class;
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++)
+		if (vsi->qos[i].traffic_class == traffic_class)
+			vsi->qos[i].valid = false;
+
+	ws_tree_root = vsi->dev->ws_tree_root;
+	if (!ws_tree_root)
+		return;
+
+	vsi_node = ws_find_node(ws_tree_root, vsi->vsi_idx,
+				WS_MATCH_TYPE_VSI);
+	if (!vsi_node)
+		return;
+
+	tc_node = ws_find_node(vsi_node,
+			       vsi->qos[user_pri].traffic_class,
+			       WS_MATCH_TYPE_TC);
+	if (!tc_node)
+		return;
+
+	irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_DELETE_NODE);
+	vsi->unregister_qset(vsi, tc_node);
+	list_del(&tc_node->siblings);
+	irdma_free_node(vsi, tc_node);
+	/* Check if VSI node can be freed */
+	if (list_empty(&vsi_node->child_list_head)) {
+		irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_DELETE_NODE);
+		list_del(&vsi_node->siblings);
+		irdma_free_node(vsi, vsi_node);
+		/* Free head node if there are no remaining VSI nodes */
+		if (list_empty(&ws_tree_root->child_list_head)) {
+			irdma_ws_cqp_cmd(vsi, ws_tree_root,
+					 IRDMA_OP_WS_DELETE_NODE);
+			irdma_free_node(vsi, ws_tree_root);
+			vsi->dev->ws_tree_root = NULL;
+		}
+	}
+}
+
+/**
+ * irdma_ws_add - Build work scheduler tree, set RDMA qs_handle
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+enum irdma_status_code irdma_ws_add(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	struct irdma_ws_node *ws_tree_root;
+	struct irdma_ws_node *vsi_node;
+	struct irdma_ws_node *tc_node;
+	u16 traffic_class;
+	enum irdma_status_code ret = 0;
+	int i;
+
+	mutex_lock(&vsi->dev->ws_mutex);
+	if (vsi->tc_change_pending) {
+		ret = IRDMA_ERR_NOT_READY;
+		goto exit;
+	}
+
+	if (vsi->qos[user_pri].valid)
+		goto exit;
+
+	ws_tree_root = vsi->dev->ws_tree_root;
+	if (!ws_tree_root) {
+		ibdev_dbg(to_ibdev(vsi->dev), "WS: Creating root node\n");
+		ws_tree_root = irdma_alloc_node(vsi, user_pri,
+						WS_NODE_TYPE_PARENT, NULL);
+		if (!ws_tree_root) {
+			ret = IRDMA_ERR_NO_MEMORY;
+			goto exit;
+		}
+
+		ret = irdma_ws_cqp_cmd(vsi, ws_tree_root, IRDMA_OP_WS_ADD_NODE);
+		if (ret) {
+			irdma_free_node(vsi, ws_tree_root);
+			goto exit;
+		}
+
+		vsi->dev->ws_tree_root = ws_tree_root;
+	}
+
+	/* Find a second tier node that matches the VSI */
+	vsi_node = ws_find_node(ws_tree_root, vsi->vsi_idx,
+				WS_MATCH_TYPE_VSI);
+
+	/* If VSI node doesn't exist, add one */
+	if (!vsi_node) {
+		ibdev_dbg(to_ibdev(vsi->dev),
+			  "WS: Node not found matching VSI %d\n",
+			  vsi->vsi_idx);
+		vsi_node = irdma_alloc_node(vsi, user_pri, WS_NODE_TYPE_PARENT,
+					    ws_tree_root);
+		if (!vsi_node) {
+			ret = IRDMA_ERR_NO_MEMORY;
+			goto vsi_add_err;
+		}
+
+		ret = irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_ADD_NODE);
+		if (ret) {
+			irdma_free_node(vsi, vsi_node);
+			goto vsi_add_err;
+		}
+
+		list_add(&vsi_node->siblings, &ws_tree_root->child_list_head);
+	}
+
+	ibdev_dbg(to_ibdev(vsi->dev),
+		  "WS: Using node %d which represents VSI %d\n",
+		  vsi_node->index, vsi->vsi_idx);
+	traffic_class = vsi->qos[user_pri].traffic_class;
+	tc_node = ws_find_node(vsi_node, traffic_class,
+			       WS_MATCH_TYPE_TC);
+	if (!tc_node) {
+		/* Add leaf node */
+		ibdev_dbg(to_ibdev(vsi->dev),
+			  "WS: Node not found matching VSI %d and TC %d\n",
+			  vsi->vsi_idx, traffic_class);
+		tc_node = irdma_alloc_node(vsi, user_pri, WS_NODE_TYPE_LEAF,
+					   vsi_node);
+		if (!tc_node) {
+			ret = IRDMA_ERR_NO_MEMORY;
+			goto leaf_add_err;
+		}
+
+		ret = irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_ADD_NODE);
+		if (ret) {
+			irdma_free_node(vsi, tc_node);
+			goto leaf_add_err;
+		}
+
+		list_add(&tc_node->siblings, &vsi_node->child_list_head);
+		/*
+		 * callback to LAN to update the LAN tree with our node
+		 */
+		ret = vsi->register_qset(vsi, tc_node);
+		if (ret)
+			goto reg_err;
+
+		tc_node->enable = true;
+		ret = irdma_ws_cqp_cmd(vsi, tc_node, IRDMA_OP_WS_MODIFY_NODE);
+		if (ret)
+			goto reg_err;
+	}
+	ibdev_dbg(to_ibdev(vsi->dev),
+		  "WS: Using node %d which represents VSI %d TC %d\n",
+		  tc_node->index, vsi->vsi_idx, traffic_class);
+	/*
+	 * Iterate through other UPs and update the QS handle if they have
+	 * a matching traffic class.
+	 */
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; i++) {
+		if (vsi->qos[i].traffic_class == traffic_class) {
+			vsi->qos[i].qs_handle = tc_node->qs_handle;
+			vsi->qos[i].lan_qos_handle = tc_node->lan_qs_handle;
+			vsi->qos[i].l2_sched_node_id = tc_node->l2_sched_node_id;
+			vsi->qos[i].valid = true;
+		}
+	}
+	goto exit;
+
+leaf_add_err:
+	if (list_empty(&vsi_node->child_list_head)) {
+		if (irdma_ws_cqp_cmd(vsi, vsi_node, IRDMA_OP_WS_DELETE_NODE))
+			goto exit;
+		list_del(&vsi_node->siblings);
+		irdma_free_node(vsi, vsi_node);
+	}
+
+vsi_add_err:
+	/* Free head node if there are no remaining VSI nodes */
+	if (list_empty(&ws_tree_root->child_list_head)) {
+		irdma_ws_cqp_cmd(vsi, ws_tree_root, IRDMA_OP_WS_DELETE_NODE);
+		vsi->dev->ws_tree_root = NULL;
+		irdma_free_node(vsi, ws_tree_root);
+	}
+
+exit:
+	mutex_unlock(&vsi->dev->ws_mutex);
+	return ret;
+
+reg_err:
+	mutex_unlock(&vsi->dev->ws_mutex);
+	irdma_ws_remove(vsi, user_pri);
+	return ret;
+}
+
+/**
+ * irdma_ws_remove - Free WS scheduler node, update WS tree
+ * @vsi: vsi pointer
+ * @user_pri: user priority
+ */
+void irdma_ws_remove(struct irdma_sc_vsi *vsi, u8 user_pri)
+{
+	mutex_lock(&vsi->dev->ws_mutex);
+	if (irdma_tc_in_use(vsi, user_pri))
+		goto exit;
+	irdma_remove_leaf(vsi, user_pri);
+exit:
+	mutex_unlock(&vsi->dev->ws_mutex);
+}
+
+/**
+ * irdma_ws_reset - Reset entire WS tree
+ * @vsi: vsi pointer
+ */
+void irdma_ws_reset(struct irdma_sc_vsi *vsi)
+{
+	u8 i;
+
+	mutex_lock(&vsi->dev->ws_mutex);
+	for (i = 0; i < IRDMA_MAX_USER_PRIORITY; ++i)
+		irdma_remove_leaf(vsi, i);
+	mutex_unlock(&vsi->dev->ws_mutex);
+}
diff --git a/drivers/infiniband/hw/irdma/ws.h b/drivers/infiniband/hw/irdma/ws.h
new file mode 100644
index 0000000..f0e16f6
--- /dev/null
+++ b/drivers/infiniband/hw/irdma/ws.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2015 - 2020 Intel Corporation */
+#ifndef IRDMA_WS_H
+#define IRDMA_WS_H
+
+#include "osdep.h"
+
+enum irdma_ws_node_type {
+	WS_NODE_TYPE_PARENT,
+	WS_NODE_TYPE_LEAF,
+};
+
+enum irdma_ws_match_type {
+	WS_MATCH_TYPE_VSI,
+	WS_MATCH_TYPE_TC,
+};
+
+struct irdma_ws_node {
+	struct list_head siblings;
+	struct list_head child_list_head;
+	struct irdma_ws_node *parent;
+	u64 lan_qs_handle; /* opaque handle used by LAN */
+	u32 l2_sched_node_id;
+	u16 index;
+	u16 qs_handle;
+	u16 vsi_index;
+	u8 traffic_class;
+	u8 user_pri;
+	u8 rel_bw;
+	u8 abstraction_layer; /* used for splitting a TC */
+	u8 prio_type;
+	bool type_leaf:1;
+	bool enable:1;
+};
+
+struct irdma_sc_vsi;
+enum irdma_status_code irdma_ws_add(struct irdma_sc_vsi *vsi, u8 user_pri);
+void irdma_ws_remove(struct irdma_sc_vsi *vsi, u8 user_pri);
+void irdma_ws_reset(struct irdma_sc_vsi *vsi);
+
+#endif /* IRDMA_WS_H */
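
For readers following the tree-building logic in irdma_ws_add(), here is a minimal user-space sketch of the same find-or-create walk (root, then VSI node, then TC leaf). It is not part of the patch. The names ws_node, ws_find, ws_new_child and ws_add are hypothetical stand-ins, and the sketch assumes plain singly linked child lists instead of the kernel's struct list_head. CQP commands, node-ID allocation, locking, Qset registration and the error unwinding paths are all left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct irdma_ws_node: only the fields
 * needed to show the root -> VSI -> TC shape are kept.
 */
struct ws_node {
	struct ws_node *parent;
	struct ws_node *first_child;	/* head of this node's child list */
	struct ws_node *next_sibling;
	uint16_t vsi_index;
	uint8_t traffic_class;
	int is_leaf;
};

enum ws_match { WS_MATCH_VSI, WS_MATCH_TC };

/* Linear search over one parent's children, in the spirit of ws_find_node(). */
static struct ws_node *ws_find(struct ws_node *parent, uint16_t val,
			       enum ws_match type)
{
	struct ws_node *n;

	for (n = parent->first_child; n; n = n->next_sibling) {
		if (type == WS_MATCH_VSI && n->vsi_index == val)
			return n;
		if (type == WS_MATCH_TC && n->traffic_class == val)
			return n;
	}
	return NULL;
}

/* Allocate a zeroed node and, if it has a parent, link it as a child. */
static struct ws_node *ws_new_child(struct ws_node *parent, uint16_t vsi,
				    uint8_t tc, int is_leaf)
{
	struct ws_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->parent = parent;
	n->vsi_index = vsi;
	n->traffic_class = tc;
	n->is_leaf = is_leaf;
	if (parent) {
		n->next_sibling = parent->first_child;
		parent->first_child = n;
	}
	return n;
}

/* Find or create root -> VSI node -> TC leaf and return the leaf.
 * Simplification: on allocation failure this returns NULL and leaves any
 * nodes created so far in place; the patch unwinds them instead.
 */
static struct ws_node *ws_add(struct ws_node **root, uint16_t vsi, uint8_t tc)
{
	struct ws_node *vsi_node, *tc_node;

	if (!*root) {
		*root = ws_new_child(NULL, vsi, 0, 0);
		if (!*root)
			return NULL;
	}

	vsi_node = ws_find(*root, vsi, WS_MATCH_VSI);
	if (!vsi_node) {
		vsi_node = ws_new_child(*root, vsi, 0, 0);
		if (!vsi_node)
			return NULL;
	}

	tc_node = ws_find(vsi_node, tc, WS_MATCH_TC);
	if (!tc_node)
		tc_node = ws_new_child(vsi_node, vsi, tc, 1);
	return tc_node;
}

#ifdef WS_SKETCH_DEMO
/* Build with -DWS_SKETCH_DEMO to run this small check (nodes are not freed). */
int main(void)
{
	struct ws_node *root = NULL;
	struct ws_node *first = ws_add(&root, 5, 2);
	struct ws_node *again = ws_add(&root, 5, 2);

	assert(first && first == again);	/* same VSI and TC share one leaf */
	return 0;
}
#endif
```

Adding the same (VSI, TC) pair twice yields the same leaf. This is the property that allows irdma_ws_add() to copy one qs_handle to every user priority mapped to that traffic class.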