From patchwork Thu Feb 6 18:26:26 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13963539 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8EB0A1A2381; Thu, 6 Feb 2025 18:30:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866632; cv=none; b=ZFCh3RIFXhgxnEA83/DkGqJgTVkdgIkSnX5HRh9fnireOVVMHDshPd4Rg7E6Qc/6ULaog1kJG/JJIWZbTt8hi0bPNHEKaGdPpO+kZhkAIP7+/cYCEVO4ZZmWZkD+xLlH+VUE0mTcB9wh3jr8DV0mEGfHXWOA2Ys+kwPsk7cju3c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866632; c=relaxed/simple; bh=eT8n979ZpwPw5e/QeuHsD+kqpySQIoGeQuEIuBIocjs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cZ8veOieeBUXV5vOAnDZnCiSLbit6RBujKOdbKXKqPZs6dx8Nk7KnQIh/r23brg1uBF2m7XEdWD8lCYfW+K1uJXCfQLBYgJRfc9Jb0H5zPEbWWfMNqfasCPynRzeTIJyLiwwYVvCeynLSUrDGcDhiiR1qJFi3YO/c84zlUI5TFI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Q075m2xd; arc=none smtp.client-ip=198.175.65.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Q075m2xd" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738866630; x=1770402630; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eT8n979ZpwPw5e/QeuHsD+kqpySQIoGeQuEIuBIocjs=; b=Q075m2xdCjkhWyh78xFUrV0jtlM3GaJ6dxiF+K95ti/vdebh8SXJmuX5 laLovz7ujm8DcunMD0QclXvPwzbhX10KPjbR3435FXWeB+j9/4OaGie+3 mpAE/DABOIMOf9Zaz9T/yqqfJdqycUPJDYNW1spG9EQ6D1vlo6qQbAY82 wXc4ziIuicNg1w1Q55sxp71QJw8qpwwL40GLoQPV7KVQ3jj3LKuixFkLp 0MH+I3PleDe5jcqF36QF0tm2Hj8yctaCuXF0np0NflYWqBLCG1Rs9geyK gUagnfx5VCjgGgv7RgRO1HEFam6RZskKafHp9gEgcn/gV3zhS6qzPoNwD Q==; X-CSE-ConnectionGUID: sUPoJAzSSeyW0ICul4P+zQ== X-CSE-MsgGUID: VuH3xO97QcqWyvE7/sFUSg== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="49734441" X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="49734441" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2025 10:30:30 -0800 X-CSE-ConnectionGUID: ijf33W/vTv2rmI7xLcDLWQ== X-CSE-MsgGUID: xCFex8EfQqeSQDK9gI+bsA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="111065875" Received: from newjersey.igk.intel.com ([10.102.20.203]) by orviesa009.jf.intel.com with ESMTP; 06 Feb 2025 10:30:25 -0800 From: Alexander Lobakin To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Alexei Starovoitov , Daniel Borkmann , John Fastabend , Andrii Nakryiko , "Jose E. Marchesi" , =?utf-8?q?Toke_H=C3=B8iland-?= =?utf-8?q?J=C3=B8rgensen?= , Magnus Karlsson , Maciej Fijalkowski , Przemek Kitszel , Jason Baron , Casey Schaufler , Nathan Chancellor , bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 1/4] unroll: add generic loop unroll helpers Date: Thu, 6 Feb 2025 19:26:26 +0100 Message-ID: <20250206182630.3914318-2-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250206182630.3914318-1-aleksander.lobakin@intel.com> References: <20250206182630.3914318-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org There are cases when we need to explicitly unroll loops. For example, cache operations, filling DMA descriptors on very high speeds etc. Add compiler-specific attribute macros to give the compiler a hint that we'd like to unroll a loop. Example usage: #define UNROLL_BATCH 8 unrolled_count(UNROLL_BATCH) for (u32 i = 0; i < UNROLL_BATCH; i++) op(priv, i); Note that sometimes the compilers won't unroll loops if they think this would have worse optimization and perf than without unrolling, and that unroll attributes are available only starting GCC 8. For older compiler versions, no hints/attributes will be applied. For better unrolling/parallelization, don't have any variables that interfere between iterations except for the iterator itself. Co-developed-by: Jose E. Marchesi # pragmas Signed-off-by: Jose E. Marchesi Reviewed-by: Przemek Kitszel Signed-off-by: Alexander Lobakin --- include/linux/unroll.h | 44 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/include/linux/unroll.h b/include/linux/unroll.h index d42fd6366373..863fb69f6a7e 100644 --- a/include/linux/unroll.h +++ b/include/linux/unroll.h @@ -9,6 +9,50 @@ #include +#ifdef CONFIG_CC_IS_CLANG +#define __pick_unrolled(x, y) _Pragma(#x) +#elif CONFIG_GCC_VERSION >= 80000 +#define __pick_unrolled(x, y) _Pragma(#y) +#else +#define __pick_unrolled(x, y) /* not supported */ +#endif + +/** + * unrolled - loop attributes to ask the compiler to unroll it + * + * Usage: + * + * #define BATCH 8 + * + * unrolled_count(BATCH) + * for (u32 i = 0; i < BATCH; i++) + * // loop body without cross-iteration dependencies + * + * This is only a hint and the compiler is free to disable unrolling if it + * thinks the count is suboptimal and may hurt performance and/or hugely + * increase object code size. + * Not having any cross-iteration dependencies (i.e. when iter x + 1 depends + * on what iter x will do with variables) is not a strict requirement, but + * provides best performance and object code size. + * Available only on Clang and GCC 8.x onwards. + */ + +/* Ask the compiler to pick an optimal unroll count, Clang only */ +#define unrolled \ + __pick_unrolled(clang loop unroll(enable), /* nothing */) + +/* Unroll each @n iterations of the loop */ +#define unrolled_count(n) \ + __pick_unrolled(clang loop unroll_count(n), GCC unroll n) + +/* Unroll the whole loop */ +#define unrolled_full \ + __pick_unrolled(clang loop unroll(full), GCC unroll 65534) + +/* Never unroll the loop */ +#define unrolled_none \ + __pick_unrolled(clang loop unroll(disable), GCC unroll 1) + #define UNROLL(N, MACRO, args...) CONCATENATE(__UNROLL_, N)(MACRO, args) #define __UNROLL_0(MACRO, args...) From patchwork Thu Feb 6 18:26:27 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13963540 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 00B3A1B4239; Thu, 6 Feb 2025 18:30:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866637; cv=none; b=TO9EDQ65IfSloIWbrm9wqUP8odtwyTwK/2Gfl2F2hchFtZMDCWD1k0g9vkPD6WZWcgx4CI79Xp0d4J7RnsuojIhArcEGZHTSfgVLNE9TS+qsU9Ya7emVjPBr67vV27qpzYejdjTgDqHCtUoBO2NtaDSrJ+6XgAjzPsP/vZzxQtU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866637; c=relaxed/simple; bh=tCQ2Nhmk3K4pHm1Ho2NAJuNHMIXd4ulXGPcI1saaATI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CyElP3u3mv7TtqPzMC7c1BCUVqx30Q3zhn6j1Iq1D1bCSnYoA9LiKwtRBzWqgS9bHRpubZIkIzUbLpb5L7UJ4qOWcJHxXTs68C98mcY0yI0Byw6DsSbKqLDrhdSWv7JSd7yjX5dXavhe/wOrGJYg73s/N/HXWb6P/E7f+dZaADM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=jQ5y+2Ie; arc=none smtp.client-ip=198.175.65.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="jQ5y+2Ie" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738866636; x=1770402636; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tCQ2Nhmk3K4pHm1Ho2NAJuNHMIXd4ulXGPcI1saaATI=; b=jQ5y+2IewCvIfn0t20tc8/39DwpwSvPt5HNQ6bXwJKvWpFuTpPyi6BzN 5oamj6ex6xYh0DIPjG2TXkHeGnapMSVgE8b+tLB1bY7tHOsDgRIIjVF0p sQSL/VRlZi4DzEVwnsWONyH/MgJhQwm7vsLkbGuxGF5Xdfs35wy2mPSW+ FR13qrbbB1zxavSHt9MsI/Mv3UbuFtmP951crOKqeVl516emlUtDYafJd BTGOtsFlWeuhVInIgzwKEPwxbNHLPFZ04jgFReDV8JS263Z89Y0r8vP3s Wyl9SrxBKIZzJ+nvM0oXqZZSmDyXZF5eUXc6y90KGTCxvmqxv06tghc1s A==; X-CSE-ConnectionGUID: mdjNCPNATOGkWJo0OFwceA== X-CSE-MsgGUID: K14FfMpSQquhsI8cTqb/5A== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="49734475" X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="49734475" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2025 10:30:35 -0800 X-CSE-ConnectionGUID: vwc6IRvkSbGsC9Kh89PRZQ== X-CSE-MsgGUID: PShxg+/QQ+O+iGVJRsImAQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="111065886" Received: from newjersey.igk.intel.com ([10.102.20.203]) by orviesa009.jf.intel.com with ESMTP; 06 Feb 2025 10:30:31 -0800 From: Alexander Lobakin To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Alexei Starovoitov , Daniel Borkmann , John Fastabend , Andrii Nakryiko , "Jose E. Marchesi" , =?utf-8?q?Toke_H=C3=B8iland-?= =?utf-8?q?J=C3=B8rgensen?= , Magnus Karlsson , Maciej Fijalkowski , Przemek Kitszel , Jason Baron , Casey Schaufler , Nathan Chancellor , bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 2/4] i40e: use generic unrolled_count() macro Date: Thu, 6 Feb 2025 19:26:27 +0100 Message-ID: <20250206182630.3914318-3-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250206182630.3914318-1-aleksander.lobakin@intel.com> References: <20250206182630.3914318-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org i40e, as well as ice, has a custom loop unrolling macro for unrolling Tx descriptors filling on XSk xmit. Replace i40e defs with generic unrolled_count(), which is also more convenient as it allows passing defines as its argument, not hardcoded values, while the loop declaration will still be a usual for-loop. Signed-off-by: Alexander Lobakin Acked-by: Maciej Fijalkowski --- drivers/net/ethernet/intel/i40e/i40e_xsk.h | 10 +--------- drivers/net/ethernet/intel/i40e/i40e_xsk.c | 4 +++- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.h b/drivers/net/ethernet/intel/i40e/i40e_xsk.h index ef156fad52f2..dd16351a7af8 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.h +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.h @@ -6,7 +6,7 @@ #include -/* This value should match the pragma in the loop_unrolled_for +/* This value should match the pragma in the unrolled_count() * macro. Why 4? It is strictly empirical. It seems to be a good * compromise between the advantage of having simultaneous outstanding * reads to the DMA array that can hide each others latency and the @@ -14,14 +14,6 @@ */ #define PKTS_PER_BATCH 4 -#ifdef __clang__ -#define loop_unrolled_for _Pragma("clang loop unroll_count(4)") for -#elif __GNUC__ >= 8 -#define loop_unrolled_for _Pragma("GCC unroll 4") for -#else -#define loop_unrolled_for for -#endif - struct i40e_ring; struct i40e_vsi; struct net_device; diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c index e28f1905a4a0..9f47388eaba5 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c @@ -2,6 +2,7 @@ /* Copyright(c) 2018 Intel Corporation. */ #include +#include #include #include "i40e_txrx_common.h" #include "i40e_xsk.h" @@ -529,7 +530,8 @@ static void i40e_xmit_pkt_batch(struct i40e_ring *xdp_ring, struct xdp_desc *des dma_addr_t dma; u32 i; - loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) { + unrolled_count(PKTS_PER_BATCH) + for (i = 0; i < PKTS_PER_BATCH; i++) { u32 cmd = I40E_TX_DESC_CMD_ICRC | xsk_is_eop_desc(&desc[i]); dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc[i].addr); From patchwork Thu Feb 6 18:26:28 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13963541 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5DBB61B4239; Thu, 6 Feb 2025 18:30:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866642; cv=none; b=BtZo8K1lyDbc84Eq7lBvIZfJWyMuhskHH4pMi8Yx3GtG5Zpc5s5l46yVPBoOmPuDTGajoHg5atdFlVUO0sOmSp8OtP0ewYYVTPDwRR62ikQav0uSAKTegFQpZNeP/61zTazbueiatI/hs3sQZvLDsaVipUaGui71MaiiQTptU2Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866642; c=relaxed/simple; bh=QT+oHjlIPggfbQlgJprfaj60Qp6VDwquMDsXmu0uwD8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=C5Ekz7WpZYE8QKcxfGGZDk2mRYsTMSIpzCgcvfyHN+vhyLsbMnyBaUga/5euA605R1KNE8lgmuv6+kMwacDsDZcfCgFqmbXnc7WLpzjGzqeEgOC00kqp9qeK9gXTjzzUQuUK+U8zJbVcxE1I9i9v8l0WvnhMGAz1B48RYcfDgY4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=NbrvLmCt; arc=none smtp.client-ip=198.175.65.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="NbrvLmCt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738866641; x=1770402641; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QT+oHjlIPggfbQlgJprfaj60Qp6VDwquMDsXmu0uwD8=; b=NbrvLmCt1JBE3amaFggK/PLn6ohNamHiCUFVP/37AcWB70l/iP0ooxGe 9Y+7vlgAobuq12Ud+3bilbYMGPMG/xbFbKQSe4aMkQalX/i7o+UG4SxV2 ZoLrvN5tirGLVcX9I1hTlvLYefgR6AnBbhaSwDwsyECbX42R52P9Q8PM9 50e8ndfC8qVN/QB3pq/XBaG7/Au7azhnfepnvtmV3X3dV8ctyd8nP7MYN klyqUjldp+i6Xgd8A4qptz3GtJ/7WrQz2Pltw9hVjAXZ2UVQ2BliL9QAK Mdr00wretvZvQI+84gCaM4S/J+K6UDJ0jupiJXrBD+baXZPlm4Dp5T8o0 w==; X-CSE-ConnectionGUID: H3Jj9lpeTLeZovkowNYgXg== X-CSE-MsgGUID: 4cw8zu8hRSyniyjWsL+LWQ== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="49734500" X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="49734500" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2025 10:30:41 -0800 X-CSE-ConnectionGUID: Z4lY5usIS7CbEEMkjaoYJg== X-CSE-MsgGUID: 1PCNebL5TEOK9nJo6zwdVw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="111065899" Received: from newjersey.igk.intel.com ([10.102.20.203]) by orviesa009.jf.intel.com with ESMTP; 06 Feb 2025 10:30:36 -0800 From: Alexander Lobakin To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Alexei Starovoitov , Daniel Borkmann , John Fastabend , Andrii Nakryiko , "Jose E. Marchesi" , =?utf-8?q?Toke_H=C3=B8iland-?= =?utf-8?q?J=C3=B8rgensen?= , Magnus Karlsson , Maciej Fijalkowski , Przemek Kitszel , Jason Baron , Casey Schaufler , Nathan Chancellor , bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 3/4] ice: use generic unrolled_count() macro Date: Thu, 6 Feb 2025 19:26:28 +0100 Message-ID: <20250206182630.3914318-4-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250206182630.3914318-1-aleksander.lobakin@intel.com> References: <20250206182630.3914318-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org ice, same as i40e, has a custom loop unrolling macros for unrolling Tx descriptors filling on XSk xmit. Replace ice defs with generic unrolled_count(), which is also more convenient as it allows passing defines as its argument, not hardcoded values, while the loop declaration will still be usual for-loop. Signed-off-by: Alexander Lobakin Acked-by: Maciej Fijalkowski --- drivers/net/ethernet/intel/ice/ice_xsk.h | 8 -------- drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +++- 2 files changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h index 45adeb513253..8dc5d55e26c5 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.h +++ b/drivers/net/ethernet/intel/ice/ice_xsk.h @@ -7,14 +7,6 @@ #define PKTS_PER_BATCH 8 -#ifdef __clang__ -#define loop_unrolled_for _Pragma("clang loop unroll_count(8)") for -#elif __GNUC__ >= 8 -#define loop_unrolled_for _Pragma("GCC unroll 8") for -#else -#define loop_unrolled_for for -#endif - struct ice_vsi; #ifdef CONFIG_XDP_SOCKETS diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 8975d2971bc3..a3a4eaa17739 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -2,6 +2,7 @@ /* Copyright (c) 2019, Intel Corporation. */ #include +#include #include #include #include "ice.h" @@ -989,7 +990,8 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct ice_tx_desc *tx_desc; u32 i; - loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) { + unrolled_count(PKTS_PER_BATCH) + for (i = 0; i < PKTS_PER_BATCH; i++) { dma_addr_t dma; dma = xsk_buff_raw_get_dma(xsk_pool, descs[i].addr); From patchwork Thu Feb 6 18:26:29 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Lobakin X-Patchwork-Id: 13963542 X-Patchwork-Delegate: kuba@kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 42A631D63DB; Thu, 6 Feb 2025 18:30:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866649; cv=none; b=M9PCtZB5kwCLUIEfavJsDt1UGQ9b3qiqK86inD982nRcSmOQ65tfj8AVRB3jxyDTSCIbf9v0GypU77OlHC1hI7T/qoE9wMcPHHr4GHhn05tIT5rSWhcG9p/6gFkzFdGu1T2PyaeQeT74DASUan+NsZc6YfT56DssGv4/2qiX994= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738866649; c=relaxed/simple; bh=DH0AgivTe8MS9FAs7iNfl3KHQN3uOqPy3+QRqscL/Ok=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DBSK3st0GnDewqBM237YCndEDdPOQoHnnHoMtBP41MBOtGBd0KOpf7zFCngUXV151+0f1hm9DllY3QREO7XVNPQP7gQsoJvYjgxu+3XqE5C2uemDLWjlzSyqndKDoP801Hvhjs8peCIsTQIlTV9W6lnb4ny9qhLJ2ADButFp9wY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=QgktsySt; arc=none smtp.client-ip=198.175.65.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="QgktsySt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738866647; x=1770402647; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DH0AgivTe8MS9FAs7iNfl3KHQN3uOqPy3+QRqscL/Ok=; b=QgktsyStbg4J5r4KAaFcXgE/aSCQtXO5QdKmBCxsVmu7vkjDm4ApvEdk a0OkZnka5hGJiijxNgMjMSwWZcZvIfNQU/H0jx1ZUJgSLTFjh3opqOotj onvxH3ONo8vCDY0olep8SeV5/5htkBO4IgAf24U3cYMqHMLcFEddVVrh6 oTmHyhU37kpYzfeNhxaYqyFICr+tU0XztiWR6w7MdkNMPr+ENeMc/qFFl /EwUBe/y/cjEDNQihl/+WzGCg7lY3pcVWE2LlixsxH0DdMRtkohmVuSME sF9adz2+iVOSZqOSEXmy831K2r5oaHz0DefuQwWL5VhvFP5o7fU9FWr4/ Q==; X-CSE-ConnectionGUID: P45S6D6FRASMju++BHhZQw== X-CSE-MsgGUID: VwbP7lvlS/K2zkZgHlNRFA== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="49734526" X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="49734526" Received: from orviesa009.jf.intel.com ([10.64.159.149]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2025 10:30:46 -0800 X-CSE-ConnectionGUID: YBOvmXwcSfyh3KWrhJRnew== X-CSE-MsgGUID: fl6iGoDJSPWASY+NiWusAg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,264,1732608000"; d="scan'208";a="111065902" Received: from newjersey.igk.intel.com ([10.102.20.203]) by orviesa009.jf.intel.com with ESMTP; 06 Feb 2025 10:30:42 -0800 From: Alexander Lobakin To: Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni Cc: Alexander Lobakin , Alexei Starovoitov , Daniel Borkmann , John Fastabend , Andrii Nakryiko , "Jose E. Marchesi" , =?utf-8?q?Toke_H=C3=B8iland-?= =?utf-8?q?J=C3=B8rgensen?= , Magnus Karlsson , Maciej Fijalkowski , Przemek Kitszel , Jason Baron , Casey Schaufler , Nathan Chancellor , bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 4/4] xsk: add helper to get &xdp_desc's DMA and meta pointer in one go Date: Thu, 6 Feb 2025 19:26:29 +0100 Message-ID: <20250206182630.3914318-5-aleksander.lobakin@intel.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250206182630.3914318-1-aleksander.lobakin@intel.com> References: <20250206182630.3914318-1-aleksander.lobakin@intel.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org Currently, when your driver supports XSk Tx metadata and you want to send an XSk frame, you need to do the following: * call external xsk_buff_raw_get_dma(); * call inline xsk_buff_get_metadata(), which calls external xsk_buff_raw_get_data() and then do some inline checks. This effectively means that the following piece: addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; is done twice per frame, plus you have 2 external calls per frame, plus this: meta = pool->addrs + addr - pool->tx_metadata_len; if (unlikely(!xsk_buff_valid_tx_metadata(meta))) is always inlined, even if there's no meta or it's invalid. Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that in one go. It returns a small structure with 2 fields: DMA address, filled unconditionally, and metadata pointer, non-NULL only if it's present and valid. The address correction is performed only once and you also have only 1 external call per XSk frame, which does all the calculations and checks outside of your hotpath. You only need to check `if (ctx.meta)` for the metadata presence. To not copy any existing code, derive address correction and getting virtual and DMA address into small helpers. bloat-o-meter reports no object code changes for the existing functionality. Signed-off-by: Alexander Lobakin --- include/net/xdp_sock_drv.h | 43 +++++++++++++++++++++++++++++++--- include/net/xsk_buff_pool.h | 8 +++++++ net/xdp/xsk_buff_pool.c | 46 +++++++++++++++++++++++++++++++++---- 3 files changed, 90 insertions(+), 7 deletions(-) diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 784cd34f5bba..15086dcf51d8 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -196,6 +196,23 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return xp_raw_get_data(pool, addr); } +/** + * xsk_buff_raw_get_ctx - get &xdp_desc context + * @pool: XSk buff pool desc address belongs to + * @addr: desc address (from userspace) + * + * Wrapper for xp_raw_get_ctx() to be used in drivers, see its kdoc for + * details. + * + * Return: new &xdp_desc_ctx struct containing desc's DMA address and metadata + * pointer, if it is present and valid (initialized to %NULL otherwise). + */ +static inline struct xdp_desc_ctx +xsk_buff_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr) +{ + return xp_raw_get_ctx(pool, addr); +} + #define XDP_TXMD_FLAGS_VALID ( \ XDP_TXMD_FLAGS_TIMESTAMP | \ XDP_TXMD_FLAGS_CHECKSUM | \ @@ -207,20 +224,27 @@ xsk_buff_valid_tx_metadata(const struct xsk_tx_metadata *meta) return !(meta->flags & ~XDP_TXMD_FLAGS_VALID); } -static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +static inline struct xsk_tx_metadata * +__xsk_buff_get_metadata(const struct xsk_buff_pool *pool, void *data) { struct xsk_tx_metadata *meta; if (!pool->tx_metadata_len) return NULL; - meta = xp_raw_get_data(pool, addr) - pool->tx_metadata_len; + meta = data - pool->tx_metadata_len; if (unlikely(!xsk_buff_valid_tx_metadata(meta))) return NULL; /* no way to signal the error to the user */ return meta; } +static inline struct xsk_tx_metadata * +xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +{ + return __xsk_buff_get_metadata(pool, xp_raw_get_data(pool, addr)); +} + static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp) { struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp); @@ -388,12 +412,25 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) return NULL; } +static inline struct xdp_desc_ctx +xsk_buff_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr) +{ + return (struct xdp_desc_ctx){ }; +} + static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta) { return false; } -static inline struct xsk_tx_metadata *xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) +static inline struct xsk_tx_metadata * +__xsk_buff_get_metadata(const struct xsk_buff_pool *pool, void *data) +{ + return NULL; +} + +static inline struct xsk_tx_metadata * +xsk_buff_get_metadata(struct xsk_buff_pool *pool, u64 addr) { return NULL; } diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 50779406bc2d..1dcd4d71468a 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -141,6 +141,14 @@ u32 xp_alloc_batch(struct xsk_buff_pool *pool, struct xdp_buff **xdp, u32 max); bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count); void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr); dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr); + +struct xdp_desc_ctx { + dma_addr_t dma; + struct xsk_tx_metadata *meta; +}; + +struct xdp_desc_ctx xp_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr); + static inline dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb) { return xskb->dma; diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 1f7975b49657..c263fb7a68dc 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -699,18 +699,56 @@ void xp_free(struct xdp_buff_xsk *xskb) } EXPORT_SYMBOL(xp_free); -void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr) +static u64 __xp_raw_get_addr(const struct xsk_buff_pool *pool, u64 addr) +{ + return pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; +} + +static void *__xp_raw_get_data(const struct xsk_buff_pool *pool, u64 addr) { - addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; return pool->addrs + addr; } + +void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr) +{ + return __xp_raw_get_data(pool, __xp_raw_get_addr(pool, addr)); +} EXPORT_SYMBOL(xp_raw_get_data); -dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr) +static dma_addr_t __xp_raw_get_dma(const struct xsk_buff_pool *pool, u64 addr) { - addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; return (pool->dma_pages[addr >> PAGE_SHIFT] & ~XSK_NEXT_PG_CONTIG_MASK) + (addr & ~PAGE_MASK); } + +dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr) +{ + return __xp_raw_get_dma(pool, __xp_raw_get_addr(pool, addr)); +} EXPORT_SYMBOL(xp_raw_get_dma); + +/** + * xp_raw_get_ctx - get &xdp_desc context + * @pool: XSk buff pool desc address belongs to + * @addr: desc address (from userspace) + * + * Helper for getting desc's DMA address and metadata pointer, if present. + * Saves one call on hotpath, double calculation of the actual address, + * and inline checks for metadata presence and sanity. + * + * Return: new &xdp_desc_ctx struct containing desc's DMA address and metadata + * pointer, if it is present and valid (initialized to %NULL otherwise). + */ +struct xdp_desc_ctx xp_raw_get_ctx(const struct xsk_buff_pool *pool, u64 addr) +{ + struct xdp_desc_ctx ret; + + addr = __xp_raw_get_addr(pool, addr); + + ret.dma = __xp_raw_get_dma(pool, addr); + ret.meta = __xsk_buff_get_metadata(pool, __xp_raw_get_data(pool, addr)); + + return ret; +} +EXPORT_SYMBOL(xp_raw_get_ctx);