Message ID | 20240819223442.48013-2-anthony.l.nguyen@intel.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Netdev Maintainers |
Series | idpf: XDP chapter II: convert Tx completion to libeth |

On Mon, 19 Aug 2024 15:34:33 -0700 Tony Nguyen wrote:
> There are cases when we need to explicitly unroll loops. For example,
> cache operations, filling DMA descriptors on very high speeds etc.
> Add compiler-specific attribute macros to give the compiler a hint
> that we'd like to unroll a loop.
> Example usage:
>
>     #define UNROLL_BATCH 8
>
>     unrolled_count(UNROLL_BATCH)
>     for (u32 i = 0; i < UNROLL_BATCH; i++)
>         op(priv, i);
>
> Note that sometimes the compilers won't unroll loops if they think this
> would have worse optimization and perf than without unrolling, and that
> unroll attributes are available only starting GCC 8. For older compiler
> versions, no hints/attributes will be applied.
> For better unrolling/parallelization, don't have any variables that
> interfere between iterations except for the iterator itself.

Please run the submissions thru get_maintainers

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 20 Aug 2024 17:55:39 -0700

> On Mon, 19 Aug 2024 15:34:33 -0700 Tony Nguyen wrote:
>> There are cases when we need to explicitly unroll loops. For example,
>> cache operations, filling DMA descriptors on very high speeds etc.
>> Add compiler-specific attribute macros to give the compiler a hint
>> that we'd like to unroll a loop.
>> Example usage:
>>
>>     #define UNROLL_BATCH 8
>>
>>     unrolled_count(UNROLL_BATCH)
>>     for (u32 i = 0; i < UNROLL_BATCH; i++)
>>         op(priv, i);
>>
>> Note that sometimes the compilers won't unroll loops if they think this
>> would have worse optimization and perf than without unrolling, and that
>> unroll attributes are available only starting GCC 8. For older compiler
>> versions, no hints/attributes will be applied.
>> For better unrolling/parallelization, don't have any variables that
>> interfere between iterations except for the iterator itself.
>
> Please run the submissions thru get_maintainers

I always do that. get_maintainers.pl gives nobody for linux/unroll.h.

Thanks,
Olek

On Thu, 22 Aug 2024 17:15:25 +0200 Alexander Lobakin wrote:
> > Please run the submissions thru get_maintainers
>
> I always do that. get_maintainers.pl gives nobody for linux/unroll.h.

You gotta feed it the *patch*, not the path. For keyword matching on the
contents. I wanted to print a warning when people use get_maintainer
with a path but Linus blocked it. I'm convinced 99% of such uses are
misguided.

But TBH I was directing the message at Tony as well. Please just feed
the patches to get_maintainer when posting.

On 8/22/2024 3:59 PM, Jakub Kicinski wrote:
> On Thu, 22 Aug 2024 17:15:25 +0200 Alexander Lobakin wrote:
>>> Please run the submissions thru get_maintainers
>>
>> I always do that. get_maintainers.pl gives nobody for linux/unroll.h.
>
> You gotta feed it the *patch*, not the path. For keyword matching on the
> contents. I wanted to print a warning when people use get_maintainer
> with a path but Linus blocked it. I'm convinced 99% of such uses are
> misguided.
>
> But TBH I was directing the message at Tony as well. Please just feed
> the patches to get_maintainer when posting.

Ack.

Thanks,
Tony

From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 22 Aug 2024 15:59:46 -0700

> On Thu, 22 Aug 2024 17:15:25 +0200 Alexander Lobakin wrote:
>>> Please run the submissions thru get_maintainers
>>
>> I always do that. get_maintainers.pl gives nobody for linux/unroll.h.
>
> You gotta feed it the *patch*, not the path. For keyword matching on the

Oops, sorry. I always do that for patches, not paths, but here I wanted
to use a shortcut not thinking that it may give completely different
results =\

> contents. I wanted to print a warning when people use get_maintainer
> with a path but Linus blocked it. I'm convinced 99% of such uses are
> misguided.
>
> But TBH I was directing the message at Tony as well. Please just feed
> the patches to get_maintainer when posting.

Thanks,
Olek

diff --git a/include/linux/unroll.h b/include/linux/unroll.h
new file mode 100644
index 000000000000..e305d155faa6
--- /dev/null
+++ b/include/linux/unroll.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2024 Intel Corporation */
+
+#ifndef _LINUX_UNROLL_H
+#define _LINUX_UNROLL_H
+
+#ifdef CONFIG_CC_IS_CLANG
+#define __pick_unrolled(x, y)	_Pragma(#x)
+#elif CONFIG_GCC_VERSION >= 80000
+#define __pick_unrolled(x, y)	_Pragma(#y)
+#else
+#define __pick_unrolled(x, y)	/* not supported */
+#endif
+
+/**
+ * unrolled - loop attributes to ask the compiler to unroll it
+ *
+ * Usage:
+ *
+ * #define BATCH 4
+ *	unrolled_count(BATCH)
+ *	for (u32 i = 0; i < BATCH; i++)
+ *		// loop body without cross-iteration dependencies
+ *
+ * This is only a hint and the compiler is free to disable unrolling if it
+ * thinks the count is suboptimal and may hurt performance and/or hugely
+ * increase object code size.
+ * Not having any cross-iteration dependencies (i.e. when iter x + 1 depends
+ * on what iter x will do with variables) is not a strict requirement, but
+ * provides best performance and object code size.
+ * Available only on Clang and GCC 8.x onwards.
+ */
+
+/* Ask the compiler to pick an optimal unroll count, Clang only */
+#define unrolled \
+	__pick_unrolled(clang loop unroll(enable), /* nothing */)
+
+/* Unroll each @n iterations of a loop */
+#define unrolled_count(n) \
+	__pick_unrolled(clang loop unroll_count(n), GCC unroll n)
+
+/* Unroll the whole loop */
+#define unrolled_full \
+	__pick_unrolled(clang loop unroll(full), GCC unroll 65534)
+
+/* Never unroll a loop */
+#define unrolled_none \
+	__pick_unrolled(clang loop unroll(disable), GCC unroll 1)
+
+#endif /* _LINUX_UNROLL_H */
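
For readers who want to try the hint outside the kernel tree, here is a minimal user-space sketch of the same idea. It is not part of the patch. It assumes the compiler's predefined `__clang__` and `__GNUC__` macros as stand-ins for the kernel's `CONFIG_CC_IS_CLANG` and `CONFIG_GCC_VERSION`, and `struct desc` and `fill_descs()` are hypothetical names invented for illustration. The inner loop has no cross-iteration dependencies, as the kernel-doc comment recommends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * User-space approximation of the patch's compiler selection (assumption:
 * __clang__ / __GNUC__ replace CONFIG_CC_IS_CLANG / CONFIG_GCC_VERSION).
 * The macro name mirrors the patch.
 */
#if defined(__clang__)
#define __pick_unrolled(x, y)	_Pragma(#x)
#elif defined(__GNUC__) && __GNUC__ >= 8
#define __pick_unrolled(x, y)	_Pragma(#y)
#else
#define __pick_unrolled(x, y)	/* not supported: no hint is emitted */
#endif

/* Hint: unroll the following loop @n times */
#define unrolled_count(n) \
	__pick_unrolled(clang loop unroll_count(n), GCC unroll n)

#define UNROLL_BATCH 8

/* Hypothetical descriptor layout, for illustration only */
struct desc {
	uint64_t addr;
	uint32_t len;
};

/*
 * Fill @n descriptors in batches of UNROLL_BATCH.
 * @n must be a multiple of UNROLL_BATCH.
 */
static void fill_descs(struct desc *ring, const uint64_t *addrs,
		       uint32_t len, size_t n)
{
	assert(n % UNROLL_BATCH == 0);

	for (size_t base = 0; base < n; base += UNROLL_BATCH) {
		/* Each iteration touches only its own slot */
		unrolled_count(UNROLL_BATCH)
		for (uint32_t i = 0; i < UNROLL_BATCH; i++) {
			ring[base + i].addr = addrs[base + i];
			ring[base + i].len = len;
		}
	}
}
```

The pragma is only a hint, so the result is the same whether or not the compiler honors it. Whether the loop was actually unrolled can only be checked by inspecting the generated assembly at the optimization level in use.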