From patchwork Fri Oct 6 11:02:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mateusz Polchlopek X-Patchwork-Id: 13411281 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3A7D1C281 for ; Fri, 6 Oct 2023 11:05:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ch2raXB5" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D03E6D6 for ; Fri, 6 Oct 2023 04:05:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1696590342; x=1728126342; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PhC5i0WPq26xuaRCDFJA407gh4koR2a2+1sV0lTgnVU=; b=Ch2raXB53CHrDkg8HzlqGQO1m/quwo8uD6RD754kqSbx8r2IjBbXp9Xa 0pKNfPsoD7N/mcVI3k3QQmJ2TCcRBcJYVhjOaXs8PfvS1Rlm5FrydVNTO 6PaegbASrjQOM15DVMTgF4NxM2r0c2vpWbctfglynK88u8PQkdYe14cRA iqrJ5NKWenzv3WK2mEmDsFXwjUOU0Hx4UrUDt3lVWUuUJg0kqw/NWM8O6 bd3IhGG/HrOrcqZyBDg54/Aw+LzCnoylwf/gDRrCWGmkn++EE5W/JQFYT FXZWqhq6eJjJtw+6W8MnWGKKTqX/r68O3/trRsIE6gJiqOmk1IZJZb1n7 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="2332731" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="2332731" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Oct 2023 04:05:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="895844401" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="895844401" Received: from irvmail002.ir.intel.com ([10.43.11.120]) by fmsmga001.fm.intel.com with ESMTP; 06 Oct 2023 04:04:05 -0700 Received: from fedora.igk.intel.com (Metan_eth.igk.intel.com [10.123.220.124]) by irvmail002.ir.intel.com (Postfix) with ESMTP id 6944F36345; Fri, 6 Oct 2023 12:05:35 +0100 (IST) From: Mateusz Polchlopek To: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org, Raj Victor , Michal Wilczynski , Przemek Kitszel , Mateusz Polchlopek Subject: [Intel-wired-lan] [PATCH iwl-net v2 1/5] ice: Support 5 layer topology Date: Fri, 6 Oct 2023 07:02:08 -0400 Message-Id: <20231006110212.96305-2-mateusz.polchlopek@intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20231006110212.96305-1-mateusz.polchlopek@intel.com> References: <20231006110212.96305-1-mateusz.polchlopek@intel.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org From: Raj Victor There is a performance issue when the number of VSIs are not multiple of 8. This is caused due to the max children limitation per node(8) in 9 layer topology. The BW credits are shared evenly among the children by default. Assume one node has 8 children and the other has 1. The parent of these nodes share the BW credit equally among them. Apparently this causes a problem for the first node which has 8 children. The 9th VM get more BW credits than the first 8 VMs. Example: 1) With 8 VM's: for x in 0 1 2 3 4 5 6 7; do taskset -c ${x} netperf -P0 -H 172.68.169.125 & sleep .1 ; done tx_queue_0_packets: 23283027 tx_queue_1_packets: 23292289 tx_queue_2_packets: 23276136 tx_queue_3_packets: 23279828 tx_queue_4_packets: 23279828 tx_queue_5_packets: 23279333 tx_queue_6_packets: 23277745 tx_queue_7_packets: 23279950 tx_queue_8_packets: 0 2) With 9 VM's: for x in 0 1 2 3 4 5 6 7 8; do taskset -c ${x} netperf -P0 -H 172.68.169.125 & sleep .1 ; done tx_queue_0_packets: 24163396 tx_queue_1_packets: 24164623 tx_queue_2_packets: 24163188 tx_queue_3_packets: 24163701 tx_queue_4_packets: 24163683 tx_queue_5_packets: 24164668 tx_queue_6_packets: 23327200 tx_queue_7_packets: 24163853 tx_queue_8_packets: 91101417 So on average queue 8 statistics show that 3.7 times more packets were send there than to the other queues. The FW starting with version 3.20, has increased the max number of children per node by reducing the number of layers from 9 to 5. Reflect this on driver side. Signed-off-by: Raj Victor Co-developed-by: Michal Wilczynski Signed-off-by: Michal Wilczynski Reviewed-by: Przemek Kitszel Signed-off-by: Mateusz Polchlopek --- .../net/ethernet/intel/ice/ice_adminq_cmd.h | 23 ++ drivers/net/ethernet/intel/ice/ice_common.c | 5 + drivers/net/ethernet/intel/ice/ice_ddp.c | 199 ++++++++++++++++++ drivers/net/ethernet/intel/ice/ice_ddp.h | 7 +- drivers/net/ethernet/intel/ice/ice_sched.h | 3 + drivers/net/ethernet/intel/ice/ice_type.h | 1 + 6 files changed, 236 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index eb4c13b754a4..cea1f6f7053f 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -120,6 +120,7 @@ struct ice_aqc_list_caps_elem { #define ICE_AQC_CAPS_PCIE_RESET_AVOIDANCE 0x0076 #define ICE_AQC_CAPS_POST_UPDATE_RESET_RESTRICT 0x0077 #define ICE_AQC_CAPS_NVM_MGMT 0x0080 +#define ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE 0x0085 #define ICE_AQC_CAPS_FW_LAG_SUPPORT 0x0092 #define ICE_AQC_BIT_ROCEV2_LAG 0x01 #define ICE_AQC_BIT_SRIOV_LAG 0x02 @@ -806,6 +807,23 @@ struct ice_aqc_get_topo { __le32 addr_low; }; +/* Get/Set Tx Topology (indirect 0x0418/0x0417) */ +struct ice_aqc_get_set_tx_topo { + u8 set_flags; +#define ICE_AQC_TX_TOPO_FLAGS_CORRER BIT(0) +#define ICE_AQC_TX_TOPO_FLAGS_SRC_RAM BIT(1) +#define ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW BIT(4) +#define ICE_AQC_TX_TOPO_FLAGS_ISSUED BIT(5) + + u8 get_flags; +#define ICE_AQC_TX_TOPO_GET_RAM 2 + + __le16 reserved1; + __le32 reserved2; + __le32 addr_high; + __le32 addr_low; +}; + /* Update TSE (indirect 0x0403) * Get TSE (indirect 0x0404) * Add TSE (indirect 0x0401) @@ -2469,6 +2487,7 @@ struct ice_aq_desc { struct ice_aqc_get_link_topo get_link_topo; struct ice_aqc_i2c read_write_i2c; struct ice_aqc_read_i2c_resp read_i2c_resp; + struct ice_aqc_get_set_tx_topo get_set_tx_topo; } params; }; @@ -2575,6 +2594,10 @@ enum ice_adminq_opc { ice_aqc_opc_query_sched_res = 0x0412, ice_aqc_opc_remove_rl_profiles = 0x0415, + /* tx topology commands */ + ice_aqc_opc_set_tx_topo = 0x0417, + ice_aqc_opc_get_tx_topo = 0x0418, + /* PHY commands */ ice_aqc_opc_get_phy_caps = 0x0600, ice_aqc_opc_set_phy_cfg = 0x0601, diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index d4e259b816b9..b4afdbc3e596 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -1549,6 +1549,8 @@ ice_aq_send_cmd(struct ice_hw *hw, struct ice_aq_desc *desc, void *buf, case ice_aqc_opc_set_port_params: case ice_aqc_opc_get_vlan_mode_parameters: case ice_aqc_opc_set_vlan_mode_parameters: + case ice_aqc_opc_set_tx_topo: + case ice_aqc_opc_get_tx_topo: case ice_aqc_opc_add_recipe: case ice_aqc_opc_recipe_to_profile: case ice_aqc_opc_get_recipe: @@ -2105,6 +2107,9 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps, ice_debug(hw, ICE_DBG_INIT, "%s: sriov_lag = %u\n", prefix, caps->sriov_lag); break; + case ICE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE: + caps->tx_sched_topo_comp_mode_en = (number == 1); + break; default: /* Not one of the recognized common capabilities */ found = false; diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.c b/drivers/net/ethernet/intel/ice/ice_ddp.c index b27ec93638b6..95c7712b56b8 100644 --- a/drivers/net/ethernet/intel/ice/ice_ddp.c +++ b/drivers/net/ethernet/intel/ice/ice_ddp.c @@ -4,6 +4,7 @@ #include "ice_common.h" #include "ice.h" #include "ice_ddp.h" +#include "ice_sched.h" /* For supporting double VLAN mode, it is necessary to enable or disable certain * boost tcam entries. The metadata labels names that match the following @@ -1897,3 +1898,201 @@ enum ice_ddp_state ice_copy_and_init_pkg(struct ice_hw *hw, const u8 *buf, return state; } + +/** + * ice_get_set_tx_topo - get or set Tx topology + * @hw: pointer to the HW struct + * @buf: pointer to Tx topology buffer + * @buf_size: buffer size + * @cd: pointer to command details structure or NULL + * @flags: pointer to descriptor flags + * @set: 0-get, 1-set topology + * + * The function will get or set Tx topology + */ +static int +ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size, + struct ice_sq_cd *cd, u8 *flags, bool set) +{ + struct ice_aqc_get_set_tx_topo *cmd; + struct ice_aq_desc desc; + int status; + + cmd = &desc.params.get_set_tx_topo; + if (set) { + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_set_tx_topo); + cmd->set_flags = ICE_AQC_TX_TOPO_FLAGS_ISSUED; + /* requested to update a new topology, not a default topology */ + if (buf) + cmd->set_flags |= ICE_AQC_TX_TOPO_FLAGS_SRC_RAM | + ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW; + } else { + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_tx_topo); + cmd->get_flags = ICE_AQC_TX_TOPO_GET_RAM; + } + desc.flags |= cpu_to_le16(ICE_AQ_FLAG_RD); + status = ice_aq_send_cmd(hw, &desc, buf, buf_size, cd); + if (status) + return status; + /* read the return flag values (first byte) for get operation */ + if (!set && flags) + *flags = desc.params.get_set_tx_topo.set_flags; + + return 0; +} + +/** + * ice_cfg_tx_topo - Initialize new Tx topology if available + * @hw: pointer to the HW struct + * @buf: pointer to Tx topology buffer + * @len: buffer size + * + * The function will apply the new Tx topology from the package buffer + * if available. + */ +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len) +{ + u8 *current_topo, *new_topo = NULL; + struct ice_run_time_cfg_seg *seg; + struct ice_buf_hdr *section; + struct ice_pkg_hdr *pkg_hdr; + enum ice_ddp_state state; + u16 offset, size = 0; + u32 reg = 0; + int status; + u8 flags; + + if (!buf || !len) + return -EINVAL; + + /* Does FW support new Tx topology mode ? */ + if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) { + ice_debug(hw, ICE_DBG_INIT, "FW doesn't support compatibility mode\n"); + return -EOPNOTSUPP; + } + + current_topo = kzalloc(ICE_AQ_MAX_BUF_LEN, GFP_KERNEL); + if (!current_topo) + return -ENOMEM; + + /* Get the current Tx topology */ + status = ice_get_set_tx_topo(hw, current_topo, ICE_AQ_MAX_BUF_LEN, NULL, + &flags, false); + + kfree(current_topo); + + if (status) { + ice_debug(hw, ICE_DBG_INIT, "Get current topology is failed\n"); + return status; + } + + /* Is default topology already applied ? */ + if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) && + hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) { + ice_debug(hw, ICE_DBG_INIT, "Default topology already applied\n"); + return -EEXIST; + } + + /* Is new topology already applied ? */ + if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) && + hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) { + ice_debug(hw, ICE_DBG_INIT, "New topology already applied\n"); + return -EEXIST; + } + + /* Setting topology already issued? */ + if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) { + ice_debug(hw, ICE_DBG_INIT, "Update Tx topology was done by another PF\n"); + /* Add a small delay before exiting */ + msleep(2000); + return -EEXIST; + } + + /* Change the topology from new to default (5 to 9) */ + if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) && + hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) { + ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 layers\n"); + goto update_topo; + } + + pkg_hdr = (struct ice_pkg_hdr *)buf; + state = ice_verify_pkg(pkg_hdr, len); + if (state) { + ice_debug(hw, ICE_DBG_INIT, "Failed to verify pkg (err: %d)\n", + state); + return -EIO; + } + + /* Find runtime configuration segment */ + seg = (struct ice_run_time_cfg_seg *) + ice_find_seg_in_pkg(hw, SEGMENT_TYPE_ICE_RUN_TIME_CFG, pkg_hdr); + if (!seg) { + ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment is missing\n"); + return -EIO; + } + + if (le32_to_cpu(seg->buf_table.buf_count) < ICE_MIN_S_COUNT) { + ice_debug(hw, ICE_DBG_INIT, "5 layer topology segment count(%d) is wrong\n", + seg->buf_table.buf_count); + return -EIO; + } + + section = ice_pkg_val_buf(seg->buf_table.buf_array); + if (!section || le32_to_cpu(section->section_entry[0].type) != + ICE_SID_TX_5_LAYER_TOPO) { + ice_debug(hw, ICE_DBG_INIT, "5 layer topology section type is wrong\n"); + return -EIO; + } + + size = le16_to_cpu(section->section_entry[0].size); + offset = le16_to_cpu(section->section_entry[0].offset); + if (size < ICE_MIN_S_SZ || size > ICE_MAX_S_SZ) { + ice_debug(hw, ICE_DBG_INIT, "5 layer topology section size is wrong\n"); + return -EIO; + } + + /* Make sure the section fits in the buffer */ + if (offset + size > ICE_PKG_BUF_SIZE) { + ice_debug(hw, ICE_DBG_INIT, "5 layer topology buffer > 4K\n"); + return -EIO; + } + + /* Get the new topology buffer */ + new_topo = ((u8 *)section) + offset; + +update_topo: + /* Acquire global lock to make sure that set topology issued + * by one PF. + */ + status = ice_acquire_res(hw, ICE_GLOBAL_CFG_LOCK_RES_ID, ICE_RES_WRITE, + ICE_GLOBAL_CFG_LOCK_TIMEOUT); + if (status) { + ice_debug(hw, ICE_DBG_INIT, "Failed to acquire global lock\n"); + return status; + } + + /* Check if reset was triggered already. */ + reg = rd32(hw, GLGEN_RSTAT); + if (reg & GLGEN_RSTAT_DEVSTATE_M) { + /* Reset is in progress, re-init the HW again */ + ice_debug(hw, ICE_DBG_INIT, "Reset is in progress. Layer topology might be applied already\n"); + ice_check_reset(hw); + return 0; + } + + /* Set new topology */ + status = ice_get_set_tx_topo(hw, new_topo, size, NULL, NULL, true); + if (status) { + ice_debug(hw, ICE_DBG_INIT, "Failed setting Tx topology\n"); + return status; + } + + /* New topology is updated, delay 1 second before issuing the CORER */ + msleep(1000); + ice_reset(hw, ICE_RESET_CORER); + /* CORER will clear the global lock, so no explicit call + * required for release. + */ + + return 0; +} diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.h b/drivers/net/ethernet/intel/ice/ice_ddp.h index abb5f32f2ef4..c00203df35da 100644 --- a/drivers/net/ethernet/intel/ice/ice_ddp.h +++ b/drivers/net/ethernet/intel/ice/ice_ddp.h @@ -100,8 +100,9 @@ struct ice_pkg_hdr { /* generic segment */ struct ice_generic_seg_hdr { -#define SEGMENT_TYPE_METADATA 0x00000001 -#define SEGMENT_TYPE_ICE 0x00000010 +#define SEGMENT_TYPE_METADATA 0x00000001 +#define SEGMENT_TYPE_ICE 0x00000010 +#define SEGMENT_TYPE_ICE_RUN_TIME_CFG 0x00000020 __le32 seg_type; struct ice_pkg_ver seg_format_ver; __le32 seg_size; @@ -431,4 +432,6 @@ u16 ice_pkg_buf_get_active_sections(struct ice_buf_build *bld); void *ice_pkg_enum_section(struct ice_seg *ice_seg, struct ice_pkg_enum *state, u32 sect_type); +int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len); + #endif diff --git a/drivers/net/ethernet/intel/ice/ice_sched.h b/drivers/net/ethernet/intel/ice/ice_sched.h index 0055d9330c07..9307b9067f63 100644 --- a/drivers/net/ethernet/intel/ice/ice_sched.h +++ b/drivers/net/ethernet/intel/ice/ice_sched.h @@ -6,6 +6,9 @@ #include "ice_common.h" +#define ICE_SCHED_5_LAYERS 5 +#define ICE_SCHED_9_LAYERS 9 + #define SCHED_NODE_NAME_MAX_LEN 32 #define ICE_QGRP_LAYER_OFFSET 2 diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index e68425bd0713..957540c27eb1 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -293,6 +293,7 @@ struct ice_hw_common_caps { bool pcie_reset_avoidance; /* Post update reset restriction */ bool reset_restrict_support; + bool tx_sched_topo_comp_mode_en; }; /* IEEE 1588 TIME_SYNC specific info */ From patchwork Fri Oct 6 11:02:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mateusz Polchlopek X-Patchwork-Id: 13411280 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A79C21BDF9 for ; Fri, 6 Oct 2023 11:05:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="GvK+IzOc" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 52ED3CA for ; Fri, 6 Oct 2023 04:05:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1696590344; x=1728126344; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=c6zZJweTyeHofPvPL00/6e5qNZyuSiVqBJjm+rSh+Iw=; b=GvK+IzOc4Y2SuNIu3xEfd1J6Oi2yLHZFyCDaW3aVCmF8cPzCSXt79Q7q ZFUTNxH/DNg6viOwlMQzhusttyxrmv0kYfcl6Z9BAo9IZ6lAAo6mmuIWo udSpelAc6Wh1OEnnhvfan+8lClFhLULglED/YxHbL71sQ5vTFhIDY+nYK KrVTFmgLqvPHo4vUKJrp+fHJI6CWOc4VClrITyYKtA2UN97pWkLs7woGx l2qeSAVU7sgZCd7iUt2URc8tvc7HjBbJKTRSiPtugVWxiQ4T8JpdPNOqB dW+Tbrk8hcqcfg4tw/h8QHEHN4cLqwr7XgvKR3A0sLDpv3XPT5iTGnCCj A==; X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="2332736" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="2332736" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Oct 2023 04:05:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="895844409" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="895844409" Received: from irvmail002.ir.intel.com ([10.43.11.120]) by fmsmga001.fm.intel.com with ESMTP; 06 Oct 2023 04:04:08 -0700 Received: from fedora.igk.intel.com (Metan_eth.igk.intel.com [10.123.220.124]) by irvmail002.ir.intel.com (Postfix) with ESMTP id 40D323636A; Fri, 6 Oct 2023 12:05:39 +0100 (IST) From: Mateusz Polchlopek To: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org, Raj Victor , Michal Wilczynski , Przemek Kitszel , Mateusz Polchlopek Subject: [Intel-wired-lan] [PATCH iwl-net v2 2/5] ice: Adjust the VSI/Aggregator layers Date: Fri, 6 Oct 2023 07:02:09 -0400 Message-Id: <20231006110212.96305-3-mateusz.polchlopek@intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20231006110212.96305-1-mateusz.polchlopek@intel.com> References: <20231006110212.96305-1-mateusz.polchlopek@intel.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org From: Raj Victor Adjust the VSI/Aggregator layers based on the number of logical layers supported by the FW. Currently the VSI and Aggregator layers are fixed based on the 9 layer scheduler tree layout. Due to performance reasons the number of layers of the scheduler tree is changing from 9 to 5. It requires a readjustment of these VSI/Aggregator layer values. Signed-off-by: Raj Victor Co-developed-by: Michal Wilczynski Signed-off-by: Michal Wilczynski Reviewed-by: Przemek Kitszel Signed-off-by: Mateusz Polchlopek --- drivers/net/ethernet/intel/ice/ice_sched.c | 37 +++++++++++----------- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c index c0533d7b66b9..5a65cf8a1806 100644 --- a/drivers/net/ethernet/intel/ice/ice_sched.c +++ b/drivers/net/ethernet/intel/ice/ice_sched.c @@ -1140,12 +1140,11 @@ u8 ice_sched_get_vsi_layer(struct ice_hw *hw) * 5 or less sw_entry_point_layer */ /* calculate the VSI layer based on number of layers. */ - if (hw->num_tx_sched_layers > ICE_VSI_LAYER_OFFSET + 1) { - u8 layer = hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET; - - if (layer > hw->sw_entry_point_layer) - return layer; - } + if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) + return hw->num_tx_sched_layers - ICE_VSI_LAYER_OFFSET; + else if (hw->num_tx_sched_layers == ICE_SCHED_5_LAYERS) + /* qgroup and VSI layers are same */ + return hw->num_tx_sched_layers - ICE_QGRP_LAYER_OFFSET; return hw->sw_entry_point_layer; } @@ -1162,13 +1161,10 @@ u8 ice_sched_get_agg_layer(struct ice_hw *hw) * 7 or less sw_entry_point_layer */ /* calculate the aggregator layer based on number of layers. */ - if (hw->num_tx_sched_layers > ICE_AGG_LAYER_OFFSET + 1) { - u8 layer = hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET; - - if (layer > hw->sw_entry_point_layer) - return layer; - } - return hw->sw_entry_point_layer; + if (hw->num_tx_sched_layers == ICE_SCHED_9_LAYERS) + return hw->num_tx_sched_layers - ICE_AGG_LAYER_OFFSET; + else + return hw->sw_entry_point_layer; } /** @@ -1523,10 +1519,11 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc, { struct ice_sched_node *vsi_node, *qgrp_node; struct ice_vsi_ctx *vsi_ctx; + u8 qgrp_layer, vsi_layer; u16 max_children; - u8 qgrp_layer; qgrp_layer = ice_sched_get_qgrp_layer(pi->hw); + vsi_layer = ice_sched_get_vsi_layer(pi->hw); max_children = pi->hw->max_children[qgrp_layer]; vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle); @@ -1537,6 +1534,12 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc, if (!vsi_node) return NULL; + /* If the queue group and VSI layer are same then queues + * are all attached directly to VSI + */ + if (qgrp_layer == vsi_layer) + return vsi_node; + /* get the first queue group node from VSI sub-tree */ qgrp_node = ice_sched_get_first_node(pi, vsi_node, qgrp_layer); while (qgrp_node) { @@ -3220,7 +3223,7 @@ ice_sched_add_rl_profile(struct ice_port_info *pi, u8 profile_type; int status; - if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM) + if (!pi || layer_num >= pi->hw->num_tx_sched_layers) return NULL; switch (rl_type) { case ICE_MIN_BW: @@ -3236,8 +3239,6 @@ ice_sched_add_rl_profile(struct ice_port_info *pi, return NULL; } - if (!pi) - return NULL; hw = pi->hw; list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num], list_entry) @@ -3467,7 +3468,7 @@ ice_sched_rm_rl_profile(struct ice_port_info *pi, u8 layer_num, u8 profile_type, struct ice_aqc_rl_profile_info *rl_prof_elem; int status = 0; - if (layer_num >= ICE_AQC_TOPO_MAX_LEVEL_NUM) + if (layer_num >= pi->hw->num_tx_sched_layers) return -EINVAL; /* Check the existing list for RL profile */ list_for_each_entry(rl_prof_elem, &pi->rl_prof_list[layer_num], From patchwork Fri Oct 6 11:02:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mateusz Polchlopek X-Patchwork-Id: 13411282 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 830731C2A2 for ; Fri, 6 Oct 2023 11:05:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="IrwNaIZa" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1E1AE9 for ; Fri, 6 Oct 2023 04:05:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1696590345; x=1728126345; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kNrEmOh5WmMmAaFdoGUjMhORFctkIYyOww9zGbLkdFI=; b=IrwNaIZaZ3AqBLbbXYRJTGBEtjizK0vA4rwKECoBzNIB7F6C4ZXEnDmA G5a7aTbYvi5Hl3wm1MSftws3FMEcxh4IGEvh9A/6X+jSbeuua8xY2yXLd 5tFb4v5VFTfxu3cgI4qVTJox1n0qDQIx6XAmYVcVenBYFrHmvoV3M/8yp FwjouhcE6foVXNtjQRs9mkuLxI3UyE7uFzn3K8IiZkB9E4xnuAaEvhTP6 JiLmp8Sz+CNEExorGmhU37XnJvFxIdKJt67jyqGrWQua1ZDQTLaM3T0sS YoOubyBWPv3b5j4CChVCVLxIWUulWHt+COIUG95Vyl8oD9LbaTWZNtiPT g==; X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="2332739" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="2332739" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Oct 2023 04:05:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="895844417" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="895844417" Received: from irvmail002.ir.intel.com ([10.43.11.120]) by fmsmga001.fm.intel.com with ESMTP; 06 Oct 2023 04:04:10 -0700 Received: from fedora.igk.intel.com (Metan_eth.igk.intel.com [10.123.220.124]) by irvmail002.ir.intel.com (Postfix) with ESMTP id 69DCF36345; Fri, 6 Oct 2023 12:05:41 +0100 (IST) From: Mateusz Polchlopek To: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org, Michal Wilczynski , Przemek Kitszel , Mateusz Polchlopek Subject: [Intel-wired-lan] [PATCH iwl-net v2 3/5] ice: Enable switching default Tx scheduler topology Date: Fri, 6 Oct 2023 07:02:10 -0400 Message-Id: <20231006110212.96305-4-mateusz.polchlopek@intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20231006110212.96305-1-mateusz.polchlopek@intel.com> References: <20231006110212.96305-1-mateusz.polchlopek@intel.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org From: Michal Wilczynski Introduce support for Tx scheduler topology change, based on user selection, from default 9-layer to 5-layer. Change requires NVM (version 3.20 or newer) and DDP package (OS Package 1.3.30 or newer - available for over a year in linux-firmware, since commit aed71f296637 in linux-firmware ("ice: Update package to 1.3.30.0")) https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=aed71f296637 Enable 5-layer topology switch in init path of the driver. To accomplish that upload of the DDP package needs to be delayed, until change in Tx topology is finished. To trigger the Tx change user selection should be changed in NVM using devlink. Then the platform should be rebooted. Signed-off-by: Michal Wilczynski Reviewed-by: Przemek Kitszel Co-developed-by: Mateusz Polchlopek Signed-off-by: Mateusz Polchlopek --- drivers/net/ethernet/intel/ice/ice_ddp.c | 2 +- drivers/net/ethernet/intel/ice/ice_ddp.h | 2 +- drivers/net/ethernet/intel/ice/ice_main.c | 97 ++++++++++++++++++----- 3 files changed, 80 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.c b/drivers/net/ethernet/intel/ice/ice_ddp.c index 95c7712b56b8..44591bcdb537 100644 --- a/drivers/net/ethernet/intel/ice/ice_ddp.c +++ b/drivers/net/ethernet/intel/ice/ice_ddp.c @@ -1950,7 +1950,7 @@ ice_get_set_tx_topo(struct ice_hw *hw, u8 *buf, u16 buf_size, * The function will apply the new Tx topology from the package buffer * if available. */ -int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len) +int ice_cfg_tx_topo(struct ice_hw *hw, const u8 *buf, u32 len) { u8 *current_topo, *new_topo = NULL; struct ice_run_time_cfg_seg *seg; diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.h b/drivers/net/ethernet/intel/ice/ice_ddp.h index c00203df35da..b6f126e11dea 100644 --- a/drivers/net/ethernet/intel/ice/ice_ddp.h +++ b/drivers/net/ethernet/intel/ice/ice_ddp.h @@ -432,6 +432,6 @@ u16 ice_pkg_buf_get_active_sections(struct ice_buf_build *bld); void *ice_pkg_enum_section(struct ice_seg *ice_seg, struct ice_pkg_enum *state, u32 sect_type); -int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len); +int ice_cfg_tx_topo(struct ice_hw *hw, const u8 *buf, u32 len); #endif diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 3363c69d49da..ad3a1572a635 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -4319,11 +4319,11 @@ static char *ice_get_opt_fw_name(struct ice_pf *pf) /** * ice_request_fw - Device initialization routine * @pf: pointer to the PF instance + * @firmware: double pointer to firmware struct */ -static void ice_request_fw(struct ice_pf *pf) +static int ice_request_fw(struct ice_pf *pf, const struct firmware **firmware) { char *opt_fw_filename = ice_get_opt_fw_name(pf); - const struct firmware *firmware = NULL; struct device *dev = ice_pf_to_dev(pf); int err = 0; @@ -4332,29 +4332,86 @@ static void ice_request_fw(struct ice_pf *pf) * and warning messages for other errors. */ if (opt_fw_filename) { - err = firmware_request_nowarn(&firmware, opt_fw_filename, dev); - if (err) { - kfree(opt_fw_filename); - goto dflt_pkg_load; - } - - /* request for firmware was successful. Download to device */ - ice_load_pkg(firmware, pf); + err = firmware_request_nowarn(firmware, opt_fw_filename, dev); kfree(opt_fw_filename); - release_firmware(firmware); - return; + if (!err) + return err; } + err = request_firmware(firmware, ICE_DDP_PKG_FILE, dev); + if (err) + dev_err(dev, "The DDP package file was not found or could not be read. Entering Safe Mode\n"); + + return err; +} + +/** + * ice_init_tx_topology - performs Tx topology initialization + * @hw: pointer to the hardware structure + * @firmware: pointer to firmware structure + */ +static int +ice_init_tx_topology(struct ice_hw *hw, const struct firmware *firmware) +{ + u8 num_tx_sched_layers = hw->num_tx_sched_layers; + struct ice_pf *pf = hw->back; + struct device *dev; + int err; + + dev = ice_pf_to_dev(pf); -dflt_pkg_load: - err = request_firmware(&firmware, ICE_DDP_PKG_FILE, dev); + err = ice_cfg_tx_topo(hw, firmware->data, firmware->size); + if (!err) { + if (hw->num_tx_sched_layers > num_tx_sched_layers) + dev_info(dev, "Tx scheduling layers switching feature disabled\n"); + else + dev_info(dev, "Tx scheduling layers switching feature enabled\n"); + /* if there was a change in topology ice_cfg_tx_topo triggered + * a CORER and we need to re-init hw + */ + ice_deinit_hw(hw); + err = ice_init_hw(hw); + + return err; + } else if (err == -EIO) { + dev_info(dev, "DDP package does not support Tx scheduling layers switching feature - please update to the latest DDP package and try again\n"); + } + + return 0; +} + +/** + * ice_init_ddp_config - DDP related configuration + * @hw: pointer to the hardware structure + * @pf: pointer to pf structure + * + * This function loads DDP file from the disk, then initializes Tx + * topology. At the end DDP package is loaded on the card. + */ +static int ice_init_ddp_config(struct ice_hw *hw, struct ice_pf *pf) +{ + struct device *dev = ice_pf_to_dev(pf); + const struct firmware *firmware = NULL; + int err; + + err = ice_request_fw(pf, &firmware); if (err) { - dev_err(dev, "The DDP package file was not found or could not be read. Entering Safe Mode\n"); - return; + dev_err(dev, "Fail during requesting FW: %d\n", err); + return err; } - /* request for firmware was successful. Download to device */ + err = ice_init_tx_topology(hw, firmware); + if (err) { + dev_err(dev, "Fail during initialization of Tx topology: %d\n", + err); + release_firmware(firmware); + return err; + } + + /* Download firmware to device */ ice_load_pkg(firmware, pf); release_firmware(firmware); + + return 0; } /** @@ -4614,9 +4671,11 @@ static int ice_init_dev(struct ice_pf *pf) ice_init_feature_support(pf); - ice_request_fw(pf); + err = ice_init_ddp_config(hw, pf); + if (err) + return err; - /* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be + /* if ice_init_ddp_config fails, ICE_FLAG_ADV_FEATURES bit won't be * set in pf->state, which will cause ice_is_safe_mode to return * true */ From patchwork Fri Oct 6 11:02:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mateusz Polchlopek X-Patchwork-Id: 13411283 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1A711C2B8 for ; Fri, 6 Oct 2023 11:05:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ilXVJ2X/" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C08D4CE for ; Fri, 6 Oct 2023 04:05:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1696590347; x=1728126347; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Tkc8agmVOmyhZ70IJFdk21jdiK9X2goN0WvtHdCYIZQ=; b=ilXVJ2X/VpvZiRHMiX2iNyTasIgQiXNZLoVI1RV0B2stVIvVUxoPlmIU splG7nqt3yhVjrP1w1JnrQXiDD/xQtBapyaBRkm6ZHwY77HrHpy/o8Al/ cpK6d1sAV6ZO/gMQTH5ru56ULUVeXf503CZFJJGI7JjLFihGguFo5LAjr pF6tAG0Dse4kB5vFhWNfbX6Mi/58DR+5x8q4I29Wr5AWjo4CAcPZVOu5v TOW+gb3oJt3OPQDUowSjhaBmwxmDgJwJZBWkzq2AOUnG739tuP1Y/5uYJ 7Sy+omHSyRh296vVMTQ5P0EItyB+WFeuV1qXeWEyeMLXHFdRcYaP1Ni0k A==; X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="2332744" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="2332744" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Oct 2023 04:05:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="895844428" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="895844428" Received: from irvmail002.ir.intel.com ([10.43.11.120]) by fmsmga001.fm.intel.com with ESMTP; 06 Oct 2023 04:04:12 -0700 Received: from fedora.igk.intel.com (Metan_eth.igk.intel.com [10.123.220.124]) by irvmail002.ir.intel.com (Postfix) with ESMTP id 199713636A; Fri, 6 Oct 2023 12:05:43 +0100 (IST) From: Mateusz Polchlopek To: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org, Lukasz Czapnik , Przemek Kitszel , Mateusz Polchlopek Subject: [Intel-wired-lan] [PATCH iwl-net v2 4/5] ice: Add tx_scheduling_layers devlink param Date: Fri, 6 Oct 2023 07:02:11 -0400 Message-Id: <20231006110212.96305-5-mateusz.polchlopek@intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20231006110212.96305-1-mateusz.polchlopek@intel.com> References: <20231006110212.96305-1-mateusz.polchlopek@intel.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org From: Lukasz Czapnik It was observed that Tx performance was inconsistent across all queues and/or VSIs and that it was directly connected to existing 9-layer topology of the Tx scheduler. Introduce new private devlink param - tx_scheduling_layers. This parameter gives user flexibility to choose the 5-layer transmit scheduler topology which helps to smooth out the transmit performance. Allowed parameter values are 5 and 9. Example usage: Show: devlink dev param show pci/0000:4b:00.0 name tx_scheduling_layers pci/0000:4b:00.0: name tx_scheduling_layers type driver-specific values: cmode permanent value 9 Set: devlink dev param set pci/0000:4b:00.0 name tx_scheduling_layers value 5 cmode permanent devlink dev param set pci/0000:4b:00.0 name tx_scheduling_layers value 9 cmode permanent Signed-off-by: Lukasz Czapnik Reviewed-by: Przemek Kitszel Co-developed-by: Mateusz Polchlopek Signed-off-by: Mateusz Polchlopek --- .../net/ethernet/intel/ice/ice_adminq_cmd.h | 9 + drivers/net/ethernet/intel/ice/ice_devlink.c | 174 +++++++++++++++++- .../net/ethernet/intel/ice/ice_fw_update.c | 7 +- .../net/ethernet/intel/ice/ice_fw_update.h | 3 + drivers/net/ethernet/intel/ice/ice_nvm.c | 2 +- drivers/net/ethernet/intel/ice/ice_nvm.h | 3 + 6 files changed, 192 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index cea1f6f7053f..1202abfb9eb3 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1616,6 +1616,15 @@ struct ice_aqc_nvm { }; #define ICE_AQC_NVM_START_POINT 0 +#define ICE_AQC_NVM_TX_TOPO_MOD_ID 0x14B + +struct ice_aqc_nvm_tx_topo_user_sel { + __le16 length; + u8 data; +#define ICE_AQC_NVM_TX_TOPO_USER_SEL BIT(4) + + u8 reserved; +}; /* NVM Checksum Command (direct, 0x0706) */ struct ice_aqc_nvm_checksum { diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c index 80dc5445b50d..55974079aa3f 100644 --- a/drivers/net/ethernet/intel/ice/ice_devlink.c +++ b/drivers/net/ethernet/intel/ice/ice_devlink.c @@ -736,6 +736,172 @@ ice_devlink_port_unsplit(struct devlink *devlink, struct devlink_port *port, return ice_devlink_port_split(devlink, port, 1, extack); } +enum ice_devlink_param_id { + ICE_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX, + ICE_DEVLINK_PARAM_ID_TX_BALANCE, +}; + +/** + * ice_get_tx_topo_user_sel - Read user's choice from flash + * @pf: pointer to pf structure + * @layers: value read from flash will be saved here + * + * Reads user's preference for Tx Scheduler Topology Tree from PFA TLV. + * + * Returns zero when read was successful, negative values otherwise. + */ +static int ice_get_tx_topo_user_sel(struct ice_pf *pf, uint8_t *layers) +{ + struct ice_aqc_nvm_tx_topo_user_sel usr_sel = {}; + struct ice_hw *hw = &pf->hw; + int err; + + err = ice_acquire_nvm(hw, ICE_RES_READ); + if (err) + return err; + + err = ice_aq_read_nvm(hw, ICE_AQC_NVM_TX_TOPO_MOD_ID, 0, + sizeof(usr_sel), &usr_sel, true, true, NULL); + if (err) + goto exit_release_res; + + if (usr_sel.data & ICE_AQC_NVM_TX_TOPO_USER_SEL) + *layers = ICE_SCHED_5_LAYERS; + else + *layers = ICE_SCHED_9_LAYERS; + +exit_release_res: + ice_release_nvm(hw); + + return err; +} + +/** + * ice_update_tx_topo_user_sel - Save user's preference in flash + * @pf: pointer to pf structure + * @layers: value to be saved in flash + * + * Variable "layers" defines user's preference about number of layers in Tx + * Scheduler Topology Tree. This choice should be stored in PFA TLV field + * and be picked up by driver, next time during init. + * + * Returns zero when save was successful, negative values otherwise. + */ +static int ice_update_tx_topo_user_sel(struct ice_pf *pf, int layers) +{ + struct ice_aqc_nvm_tx_topo_user_sel usr_sel = {}; + struct ice_hw *hw = &pf->hw; + int err; + + err = ice_acquire_nvm(hw, ICE_RES_WRITE); + if (err) + return err; + + err = ice_aq_read_nvm(hw, ICE_AQC_NVM_TX_TOPO_MOD_ID, 0, + sizeof(usr_sel), &usr_sel, true, true, NULL); + if (err) + goto exit_release_res; + + if (layers == ICE_SCHED_5_LAYERS) + usr_sel.data |= ICE_AQC_NVM_TX_TOPO_USER_SEL; + else + usr_sel.data &= ~ICE_AQC_NVM_TX_TOPO_USER_SEL; + + err = ice_write_one_nvm_block(pf, ICE_AQC_NVM_TX_TOPO_MOD_ID, 2, + sizeof(usr_sel.data), &usr_sel.data, + true, NULL, NULL); + if (err) + err = -EIO; + +exit_release_res: + ice_release_nvm(hw); + + return err; +} + +/** + * ice_devlink_tx_sched_layers_get - Get tx_scheduling_layers parameter + * @devlink: pointer to the devlink instance + * @id: the parameter ID to set + * @ctx: context to store the parameter value + * + * Returns zero on success and negative value on failure. + */ +static int ice_devlink_tx_sched_layers_get(struct devlink *devlink, u32 id, + struct devlink_param_gset_ctx *ctx) +{ + struct ice_pf *pf = devlink_priv(devlink); + struct device *dev = ice_pf_to_dev(pf); + int err; + + err = ice_get_tx_topo_user_sel(pf, &ctx->val.vu8); + if (err) { + dev_warn(dev, "Failed to read Tx Scheduler Tree - User Selection data from flash\n"); + return -EIO; + } + + return 0; +} + +/** + * ice_devlink_tx_sched_layers_set - Set tx_scheduling_layers parameter + * @devlink: pointer to the devlink instance + * @id: the parameter ID to set + * @ctx: context to get the parameter value + * + * Returns zero on success and negative value on failure. + */ +static int ice_devlink_tx_sched_layers_set(struct devlink *devlink, u32 id, + struct devlink_param_gset_ctx *ctx) +{ + struct ice_pf *pf = devlink_priv(devlink); + struct device *dev = ice_pf_to_dev(pf); + int err; + + err = ice_update_tx_topo_user_sel(pf, ctx->val.vu8); + if (err) + return -EIO; + + dev_warn(dev, "Tx scheduling layers have been changed on this device. You must reboot the system for the change to take effect."); + + return 0; +} + +/** + * ice_devlink_tx_sched_layers_validate - Validate passed tx_scheduling_layers + * parameter value + * @devlink: unused pointer to devlink instance + * @id: the parameter ID to validate + * @val: value to validate + * @extack: netlink extended ACK structure + * + * Supported values are: + * - 5 - five layers Tx Scheduler Topology Tree + * - 9 - nine layers Tx Scheduler Topology Tree + * + * Returns zero when passed parameter value is supported. Negative value on + * error. + */ +static int ice_devlink_tx_sched_layers_validate(struct devlink *devlink, u32 id, + union devlink_param_value val, + struct netlink_ext_ack *extack) +{ + struct ice_pf *pf = devlink_priv(devlink); + struct ice_hw *hw = &pf->hw; + + if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en) { + NL_SET_ERR_MSG_MOD(extack, "Error: Requested feature is not supported by the FW on this device. Update the FW and run this command again."); + return -EOPNOTSUPP; + } + + if (val.vu8 != ICE_SCHED_5_LAYERS && val.vu8 != ICE_SCHED_9_LAYERS) { + NL_SET_ERR_MSG_MOD(extack, "Error: Wrong number of tx scheduler layers provided."); + return -EINVAL; + } + + return 0; +} + /** * ice_tear_down_devlink_rate_tree - removes devlink-rate exported tree * @pf: pf struct @@ -1389,7 +1555,13 @@ static const struct devlink_param ice_devlink_params[] = { ice_devlink_enable_iw_get, ice_devlink_enable_iw_set, ice_devlink_enable_iw_validate), - + DEVLINK_PARAM_DRIVER(ICE_DEVLINK_PARAM_ID_TX_BALANCE, + "tx_scheduling_layers", + DEVLINK_PARAM_TYPE_U8, + BIT(DEVLINK_PARAM_CMODE_PERMANENT), + ice_devlink_tx_sched_layers_get, + ice_devlink_tx_sched_layers_set, + ice_devlink_tx_sched_layers_validate), }; static void ice_devlink_free(void *devlink_ptr) diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.c b/drivers/net/ethernet/intel/ice/ice_fw_update.c index 319a2d6fe26c..f81db6c107c8 100644 --- a/drivers/net/ethernet/intel/ice/ice_fw_update.c +++ b/drivers/net/ethernet/intel/ice/ice_fw_update.c @@ -286,10 +286,9 @@ ice_send_component_table(struct pldmfw *context, struct pldmfw_component *compon * * Returns: zero on success, or a negative error code on failure. */ -static int -ice_write_one_nvm_block(struct ice_pf *pf, u16 module, u32 offset, - u16 block_size, u8 *block, bool last_cmd, - u8 *reset_level, struct netlink_ext_ack *extack) +int ice_write_one_nvm_block(struct ice_pf *pf, u16 module, u32 offset, + u16 block_size, u8 *block, bool last_cmd, + u8 *reset_level, struct netlink_ext_ack *extack) { u16 completion_module, completion_retval; struct device *dev = ice_pf_to_dev(pf); diff --git a/drivers/net/ethernet/intel/ice/ice_fw_update.h b/drivers/net/ethernet/intel/ice/ice_fw_update.h index 750574885716..04b200462757 100644 --- a/drivers/net/ethernet/intel/ice/ice_fw_update.h +++ b/drivers/net/ethernet/intel/ice/ice_fw_update.h @@ -9,5 +9,8 @@ int ice_devlink_flash_update(struct devlink *devlink, struct netlink_ext_ack *extack); int ice_get_pending_updates(struct ice_pf *pf, u8 *pending, struct netlink_ext_ack *extack); +int ice_write_one_nvm_block(struct ice_pf *pf, u16 module, u32 offset, + u16 block_size, u8 *block, bool last_cmd, + u8 *reset_level, struct netlink_ext_ack *extack); #endif diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c index f6f52a248066..745f2459943f 100644 --- a/drivers/net/ethernet/intel/ice/ice_nvm.c +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c @@ -18,7 +18,7 @@ * * Read the NVM using the admin queue commands (0x0701) */ -static int +int ice_aq_read_nvm(struct ice_hw *hw, u16 module_typeid, u32 offset, u16 length, void *data, bool last_command, bool read_shadow_ram, struct ice_sq_cd *cd) diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.h b/drivers/net/ethernet/intel/ice/ice_nvm.h index 774c2317967d..88978b9a95b1 100644 --- a/drivers/net/ethernet/intel/ice/ice_nvm.h +++ b/drivers/net/ethernet/intel/ice/ice_nvm.h @@ -14,6 +14,9 @@ struct ice_orom_civd_info { int ice_acquire_nvm(struct ice_hw *hw, enum ice_aq_res_access_type access); void ice_release_nvm(struct ice_hw *hw); +int ice_aq_read_nvm(struct ice_hw *hw, u16 module_typeid, u32 offset, u16 length, + void *data, bool last_command, bool read_shadow_ram, + struct ice_sq_cd *cd); int ice_read_flat_nvm(struct ice_hw *hw, u32 offset, u32 *length, u8 *data, bool read_shadow_ram); From patchwork Fri Oct 6 11:02:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mateusz Polchlopek X-Patchwork-Id: 13411284 X-Patchwork-Delegate: kuba@kernel.org Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DBA951C683 for ; Fri, 6 Oct 2023 11:05:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="j7ww8Nta" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C390DDB for ; Fri, 6 Oct 2023 04:05:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1696590348; x=1728126348; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b1ZFwZorx5Q6/aXcIfmXrHZm47jo8kxB6IGrZ9z+pJs=; b=j7ww8Nta/0YrD2pYhmjfhmT1TS6ohHNG6S+bm6voEQWrwWBom+zNtAhf nwDOS46+bJxtdTvz0b1uHHxx01al8fSEIrMHuFMhO+NpGn2WXu4aaPcQJ k61hyrP/SGC4FJt0RHs/QxnsgE+mrawk6t7vVhNFZRoEVJS0n2BObRXom xz6/MUEuTrAp+d5mtOBMrw3gHMDKJa/DV2iUwKXXYohGZdKOz7dOhohEA 2jjK45Tl4zAKzF9mvaKoQKEfD3I0uRautgOQ4IlBIpvkskuSSwF984k/P pEQWqpXAp/Xwz+ZDfV+jCIdzI5O9o2RTW7k5sxryz94HOStVJHXFLi5uF w==; X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="2332747" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="2332747" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Oct 2023 04:05:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10854"; a="895844432" X-IronPort-AV: E=Sophos;i="6.03,203,1694761200"; d="scan'208";a="895844432" Received: from irvmail002.ir.intel.com ([10.43.11.120]) by fmsmga001.fm.intel.com with ESMTP; 06 Oct 2023 04:04:14 -0700 Received: from fedora.igk.intel.com (Metan_eth.igk.intel.com [10.123.220.124]) by irvmail002.ir.intel.com (Postfix) with ESMTP id EC73736345; Fri, 6 Oct 2023 12:05:44 +0100 (IST) From: Mateusz Polchlopek To: intel-wired-lan@lists.osuosl.org Cc: netdev@vger.kernel.org, Michal Wilczynski , Przemek Kitszel , Mateusz Polchlopek Subject: [Intel-wired-lan] [PATCH iwl-net v2 5/5] ice: Document tx_scheduling_layers parameter Date: Fri, 6 Oct 2023 07:02:12 -0400 Message-Id: <20231006110212.96305-6-mateusz.polchlopek@intel.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: <20231006110212.96305-1-mateusz.polchlopek@intel.com> References: <20231006110212.96305-1-mateusz.polchlopek@intel.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net X-Patchwork-Delegate: kuba@kernel.org From: Michal Wilczynski New driver specific parameter 'tx_scheduling_layers' was introduced. Describe parameter in the documentation. Signed-off-by: Michal Wilczynski Reviewed-by: Przemek Kitszel Co-developed-by: Mateusz Polchlopek Signed-off-by: Mateusz Polchlopek --- Documentation/networking/devlink/ice.rst | 50 ++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst index 2f60e34ab926..328a728d197b 100644 --- a/Documentation/networking/devlink/ice.rst +++ b/Documentation/networking/devlink/ice.rst @@ -22,6 +22,56 @@ Parameters - runtime - mutually exclusive with ``enable_roce`` +.. list-table:: Driver-specific parameters implemented + :widths: 5 5 5 85 + + * - Name + - Type + - Mode + - Description + * - ``tx_scheduling_layers`` + - u8 + - permanent + The ice hardware uses hierarchical scheduling for Tx with a fixed + number of layers in the scheduling tree. Root node is representing a + port, while all the leaves represents the queues. This way of + configuring Tx scheduler allows features like DCB or devlink-rate + (documented below) for fine-grained configuration how much BW is given + to any given queue or group of queues, as scheduling parameters can be + configured at any given layer of the tree. By default 9-layer tree + topology was deemed best for most workloads, as it gives optimal + performance to configurability ratio. However for some specific cases, + this might not be the case. A great example would be sending traffic to + queues that is not a multiple of 8. Since in 9-layer topology maximum + number of children is limited to 8, the 9th queue has a different parent + than the rest, and it's given more BW credits. This causes a problem + when the system is sending traffic to 9 queues: + + | tx_queue_0_packets: 24163396 + | tx_queue_1_packets: 24164623 + | tx_queue_2_packets: 24163188 + | tx_queue_3_packets: 24163701 + | tx_queue_4_packets: 24163683 + | tx_queue_5_packets: 24164668 + | tx_queue_6_packets: 23327200 + | tx_queue_7_packets: 24163853 + | tx_queue_8_packets: 91101417 < Too much traffic is sent to 9th + + Sometimes this might be a big concern, so the idea is to empower the + user to switch to 5-layer topology, enabling performance gains but + sacrificing configurability for features like DCB and devlink-rate. + + This parameter gives user flexibility to choose the 5-layer transmit + scheduler topology. After switching parameter reboot is required for + the feature to start working. + + User could choose 9 (the default) or 5 as a value of parameter, e.g.: + $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers + value 5 cmode permanent + + And verify that value has been set: + $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers + Info versions =============