[v2,1/2] include/uapi/linux/swab.h: move default implementation for swab macros into asm-generic

Message ID 20250319-riscv-swab-v2-1-d53b6d6ab915@iencinas.com (mailing list archive)
State New
Series Implement endianness swap macros for RISC-V

Checks

Context            Check      Description
bjorn/pre-ci_am    success    Success

Commit Message

Ignacio Encinas Rubio March 19, 2025, 9:09 p.m. UTC
Move the default byteswap implementation into asm-generic so that it can
be included from arch code.

This is required by RISC-V in order to have a fallback implementation
without duplicating it.

Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
---
 include/uapi/asm-generic/swab.h | 32 ++++++++++++++++++++++++++++++++
 include/uapi/linux/swab.h       | 33 +--------------------------------
 2 files changed, 33 insertions(+), 32 deletions(-)

Comments

Arnd Bergmann March 19, 2025, 9:12 p.m. UTC | #1
On Wed, Mar 19, 2025, at 22:09, Ignacio Encinas wrote:
> Move the default byteswap implementation into asm-generic so that it can
> be included from arch code.
>
> This is required by RISC-V in order to have a fallback implementation
> without duplicating it.
>
> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
> ---
>  include/uapi/asm-generic/swab.h | 32 ++++++++++++++++++++++++++++++++
>  include/uapi/linux/swab.h       | 33 +--------------------------------
>  2 files changed, 33 insertions(+), 32 deletions(-)
>

I think we should just remove these entirely in favor of the
compiler-provided built-ins.

    Arnd
Ignacio Encinas Rubio March 19, 2025, 9:37 p.m. UTC | #2
On 19/3/25 22:12, Arnd Bergmann wrote:
> On Wed, Mar 19, 2025, at 22:09, Ignacio Encinas wrote:
>> Move the default byteswap implementation into asm-generic so that it can
>> be included from arch code.
>>
>> This is required by RISC-V in order to have a fallback implementation
>> without duplicating it.
>>
>> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
>> ---
>>  include/uapi/asm-generic/swab.h | 32 ++++++++++++++++++++++++++++++++
>>  include/uapi/linux/swab.h       | 33 +--------------------------------
>>  2 files changed, 33 insertions(+), 32 deletions(-)
>>
> 
> I think we should just remove these entirely in favor of the
> compiler-provided built-ins.

Got it. I assumed they existed to explicitly avoid relying on
__builtin_bswap as they might not exist. However, I did a quick grep and
found that there are some uses in the wild.

I couldn't find compiler builtins for ___constant_swahb32 or
___constant_swahw32, so I guess I'll leave them as they are.

Thank you!
Arnd Bergmann March 19, 2025, 9:49 p.m. UTC | #3
On Wed, Mar 19, 2025, at 22:37, Ignacio Encinas Rubio wrote:
> On 19/3/25 22:12, Arnd Bergmann wrote:
>> On Wed, Mar 19, 2025, at 22:09, Ignacio Encinas wrote:
>>> Move the default byteswap implementation into asm-generic so that it can
>>> be included from arch code.
>>>
>>> This is required by RISC-V in order to have a fallback implementation
>>> without duplicating it.
>>>
>>> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
>>> ---
>>>  include/uapi/asm-generic/swab.h | 32 ++++++++++++++++++++++++++++++++
>>>  include/uapi/linux/swab.h       | 33 +--------------------------------
>>>  2 files changed, 33 insertions(+), 32 deletions(-)
>>>
>> 
>> I think we should just remove these entirely in favor of the
>> compiler-provided built-ins.
>
> Got it. I assumed they existed to explicitly avoid relying on
> __builtin_bswap as they might not exist. However, I did a quick grep and
> found that there are some uses in the wild.

Right, I do remember when we had a discussion about this maybe
15 years ago when gcc didn't have the builtins on all architectures
yet, but those versions are long gone, and we never cleaned it up.

> I couldn't find compiler builtins for ___constant_swahb32 nor 
> ___constant_swahw32, so I guess I'll leave them as they are.

Correct. There are also 24-bit and 48-bit swap functions
in include/linux/unaligned.h that have no corresponding builtins.

      Arnd
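
The two halfword variants have no builtin, but they relate to the full
swap in a simple way. A minimal userspace sketch, assuming <stdint.h>
types in place of __u32 and a hypothetical demo_ prefix in place of the
kernel names: swapping the halfwords and then the bytes inside each
halfword gives the same result as __builtin_bswap32.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros; the demo_ names are hypothetical. */
#define demo_swahw32(x) ((uint32_t)(				\
	(((uint32_t)(x) & (uint32_t)0x0000ffffUL) << 16) |	\
	(((uint32_t)(x) & (uint32_t)0xffff0000UL) >> 16)))

#define demo_swahb32(x) ((uint32_t)(				\
	(((uint32_t)(x) & (uint32_t)0x00ff00ffUL) << 8) |	\
	(((uint32_t)(x) & (uint32_t)0xff00ff00UL) >> 8)))

/* Swap halfwords, then bytes within each halfword: a full 32-bit byte swap. */
static inline uint32_t demo_swab32_composed(uint32_t x)
{
	return demo_swahb32(demo_swahw32(x));
}

/* Compare the composition against the compiler builtin (GCC and Clang). */
static inline void demo_check_composition(uint32_t x)
{
	assert(demo_swab32_composed(x) == __builtin_bswap32(x));
}
```

This only illustrates the relationship between the macros; it is not a
proposal to implement the halfword swaps on top of the builtin.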
Ignacio Encinas Rubio March 20, 2025, 10:36 p.m. UTC | #4
On 19/3/25 22:49, Arnd Bergmann wrote:
> On Wed, Mar 19, 2025, at 22:37, Ignacio Encinas Rubio wrote:
>> On 19/3/25 22:12, Arnd Bergmann wrote:
>>> On Wed, Mar 19, 2025, at 22:09, Ignacio Encinas wrote:
>>>> Move the default byteswap implementation into asm-generic so that it can
>>>> be included from arch code.
>>>>
>>>> This is required by RISC-V in order to have a fallback implementation
>>>> without duplicating it.
>>>>
>>>> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
>>>> ---
>>>>  include/uapi/asm-generic/swab.h | 32 ++++++++++++++++++++++++++++++++
>>>>  include/uapi/linux/swab.h       | 33 +--------------------------------
>>>>  2 files changed, 33 insertions(+), 32 deletions(-)
>>>>
>>>
>>> I think we should just remove these entirely in favor of the
>>> compiler-provided built-ins.
>>
>> Got it. I assumed they existed to explicitly avoid relying on
>> __builtin_bswap as they might not exist. However, I did a quick grep and
>> found that there are some uses in the wild.
> 
> Right, I do remember when we had a discussion about this maybe
> 15 years ago when gcc didn't have the builtins on all architectures
> yet, but those versions are long gone, and we never cleaned it up.

I just had a chance to look at this and it looks a bit more complex than
I initially thought. ___constant_swab macros are used in more places
than I expected, and {little,big}_endian.h define their own macros that
are used elsewhere, ...

It is not clear to me how to proceed here. I could:

  1) Just remove ___constant_swab macros and replace them with
  __builtin_bswap everywhere

  2) Go a step further and evaluate removing __constant_htonl and
  relatives

Let me know what you think is the best option :)

I'll resend this series without this patch (and make the RISC-V version
fall back to __builtin_bswap).
 
>> I couldn't find compiler builtins for ___constant_swahb32 nor 
>> ___constant_swahw32, so I guess I'll leave them as they are.
> 
> Correct. There are also 24-bit and 48-bit swap functions
> in include/linux/unaligned.h that have no corresponding builtins.

Thanks for clarifying!
Arnd Bergmann March 21, 2025, 10:23 a.m. UTC | #5
On Thu, Mar 20, 2025, at 23:36, Ignacio Encinas Rubio wrote:
> On 19/3/25 22:49, Arnd Bergmann wrote:
>> On Wed, Mar 19, 2025, at 22:37, Ignacio Encinas Rubio wrote:
>>> On 19/3/25 22:12, Arnd Bergmann wrote:
>> Right, I do remember when we had a discussion about this maybe
>> 15 years ago when gcc didn't have the builtins on all architectures
>> yet, but those versions are long gone, and we never cleaned it up.
>
> I just had a chance to look at this and it looks a bit more complex than
> I initially thought. ___constant_swab macros are used in more places
> than I expected, and {little,big}_endian.h define their own macros that
> are used elsewhere, ...
>
> It is not clear to me how to proceed here. I could:
>
>   1) Just remove ___constant_swab macros and replace them with
>   __builtin_bswap everywhere
>
>   2) Go a step further and evaluate removing __constant_htonl and
>   relatives
>
> Let me know what you think is the best option :)

I think we can start enabling CONFIG_ARCH_USE_BUILTIN_BSWAP
on all architectures and removing the custom versions
from arch/*/include/uapi/asm/swab.h, which all seem to
predate the compiler builtins and likely produce worse code.

    Arnd
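
For reference, the selection being discussed can be sketched in
userspace as below. This is loosely modeled on how
include/uapi/linux/swab.h picks an implementation, not a copy of it:
DEMO_USE_BUILTIN_BSWAP stands in for the effect of
CONFIG_ARCH_USE_BUILTIN_BSWAP, and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Generic shift-and-mask swap, usable in constant expressions. */
#define demo_constant_swab32(x) ((uint32_t)(			\
	(((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |	\
	(((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |	\
	(((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |	\
	(((uint32_t)(x) & (uint32_t)0xff000000UL) >> 24)))

/* Runtime path: an arch hook if one is defined, else the generic code. */
static inline uint32_t demo_fswab32(uint32_t val)
{
#ifdef demo_arch_swab32
	return demo_arch_swab32(val);
#else
	return demo_constant_swab32(val);
#endif
}

#ifdef DEMO_USE_BUILTIN_BSWAP
/* Let the compiler choose the sequence for constants and variables alike. */
#define demo_swab32(x) ((uint32_t)__builtin_bswap32((uint32_t)(x)))
#else
/* Fold constants at compile time; send everything else to the runtime path. */
#define demo_swab32(x)					\
	(__builtin_constant_p((uint32_t)(x)) ?		\
	 demo_constant_swab32(x) :			\
	 demo_fswab32(x))
#endif

/* Whichever path is selected must agree with the builtin. */
static inline void demo_check_swab32(uint32_t x)
{
	assert(demo_swab32(x) == __builtin_bswap32(x));
}
```

With the builtin path enabled, neither the arch hook nor the generic
macro is reached for 32-bit swaps, which is why the custom arch versions
become removable in that configuration.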
Ignacio Encinas Rubio March 21, 2025, 6:38 p.m. UTC | #6
On 21/3/25 11:23, Arnd Bergmann wrote:
> On Thu, Mar 20, 2025, at 23:36, Ignacio Encinas Rubio wrote:
>> On 19/3/25 22:49, Arnd Bergmann wrote:
>>> On Wed, Mar 19, 2025, at 22:37, Ignacio Encinas Rubio wrote:
>>>> On 19/3/25 22:12, Arnd Bergmann wrote:
>>> Right, I do remember when we had a discussion about this maybe
>>> 15 years ago when gcc didn't have the builtins on all architectures
>>> yet, but those versions are long gone, and we never cleaned it up.
>>
>> I just had a chance to look at this and it looks a bit more complex than
>> I initially thought. ___constant_swab macros are used in more places
>> than I expected, and {little,big}_endian.h define their own macros that
>> are used elsewhere, ...
>>
>> It is not clear to me how to proceed here. I could:
>>
>>   1) Just remove ___constant_swab macros and replace them with
>>   __builtin_bswap everywhere
>>
>>   2) Go a step further and evaluate removing __constant_htonl and
>>   relatives
>>
>> Let me know what you think is the best option :)
> 
> I think we can start enabling CONFIG_ARCH_USE_BUILTIN_BSWAP
> on all architectures and removing the custom versions
> from arch/*/include/uapi/asm/swab.h, which all seem to
> predate the compiler builtins and likely produce worse code.

This seems fine for some architectures but I don't think we can use
this approach for RISC-V. RISC-V code assumes that the bitmanip 
extension might not be available (see arch/riscv/include/asm/bitops.h).

The current approach [1] is to detect this at boot and patch the kernel 
to adapt it to the actual hardware running it (using specific 
instructions or not).

On the other hand, I tried using __builtin_bswap for the RISC-V version
as an alternative to the "optimized" one (instead of relying on
___constant_swab, see [2]) and I immediately got compilation errors.

Some architectures seem to require definitions for __bswapsi2 and 
__bswapdi2 [3]. I'm guessing this happens for the architectures that
don't require bit manipulation instructions but have them as extensions.

arm, csky, mips and xtensa seem to fit this description as they
provide their own __bswapsi2 implementations. Note that these either
simply call ___constant_swab or are ___constant_swab written in
assembly language [4] [5].
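
A rough sketch of what such a helper looks like in C, assuming the
compiler emits a call to it when it cannot inline a byte swap for the
target. The demo_ names are hypothetical stand-ins for __bswapsi2 and
__bswapdi2, and unsigned types are used for simplicity:

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit helper: plain shift-and-mask, same shape as ___constant_swab32. */
uint32_t demo_bswapsi2(uint32_t u)
{
	return ((u & 0x000000ffU) << 24) |
	       ((u & 0x0000ff00U) <<  8) |
	       ((u & 0x00ff0000U) >>  8) |
	       ((u & 0xff000000U) >> 24);
}

/* 64-bit helper: swap each 32-bit half and exchange their positions. */
uint64_t demo_bswapdi2(uint64_t u)
{
	uint32_t lo = (uint32_t)u;
	uint32_t hi = (uint32_t)(u >> 32);

	return ((uint64_t)demo_bswapsi2(lo) << 32) | demo_bswapsi2(hi);
}

/* Both helpers should match what the builtins compute on a hosted toolchain. */
static inline void demo_check_helpers(uint32_t a, uint64_t b)
{
	assert(demo_bswapsi2(a) == __builtin_bswap32(a));
	assert(demo_bswapdi2(b) == __builtin_bswap64(b));
}
```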

Unless I'm missing something, it seems to me that using compiler 
builtins (at least for RISC-V, and potentially others) is even more 
problematic than keeping ___constant_swab around. What do you think, 
should we keep patch 1 after all?

We could remove __arch_swab for architectures that always assume bit
manipulation instructions are available, but then the kernel would fall
back to ___constant_swab when CONFIG_ARCH_USE_BUILTIN_BSWAP=n. Turning
their custom implementations into

	#define __arch_swabXY __builtin_bswapXY

would solve this issue, but I'm not sure it is an acceptable approach.
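
Spelled out for one width, the idea would look roughly like this. It is
a userspace sketch with hypothetical demo_ names, not the actual arch
headers: the hook is the builtin, so the runtime path uses it even when
the builtin path is not selected globally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical arch header: every hook is simply the compiler builtin. */
#define demo_arch_swab16 __builtin_bswap16
#define demo_arch_swab32 __builtin_bswap32
#define demo_arch_swab64 __builtin_bswap64

/* Generic fallback, reached only when no arch hook is defined. */
#define demo_constant_swab16(x) ((uint16_t)(			\
	(((uint16_t)(x) & (uint16_t)0x00ffU) << 8) |		\
	(((uint16_t)(x) & (uint16_t)0xff00U) >> 8)))

/* Runtime path: prefers the arch hook over the generic macro. */
static inline uint16_t demo_fswab16(uint16_t val)
{
#ifdef demo_arch_swab16
	return demo_arch_swab16(val);
#else
	return demo_constant_swab16(val);
#endif
}

/* The hook and the generic macro must agree on every input. */
static inline void demo_check_swab16(uint16_t x)
{
	assert(demo_fswab16(x) == demo_constant_swab16(x));
}
```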

Thanks!

[1] https://lore.kernel.org/all/ce034f2b-2f6e-403a-81f1-680af4c72929@ghiti.fr/
[2] https://lore.kernel.org/all/20250319-riscv-swab-v2-2-d53b6d6ab915@iencinas.com/
[3] https://gcc.gnu.org/onlinedocs/gcc-13.3.0/gccint.pdf
[4] https://lore.kernel.org/all/20230512164815.2150839-1-jcmvbkbc@gmail.com/
[5] https://lore.kernel.org/all/1664437198-31260-3-git-send-email-yangtiezhu@loongson.cn/

Patch

diff --git a/include/uapi/asm-generic/swab.h b/include/uapi/asm-generic/swab.h
index f2da4e4fd4d129c43f904c5f1b6234036b57cc77..43d83df007a6fbfb0011452e12e71f429425cad5 100644
--- a/include/uapi/asm-generic/swab.h
+++ b/include/uapi/asm-generic/swab.h
@@ -16,4 +16,36 @@ 
 #endif
 #endif
 
+/*
+ * casts are necessary for constants, because we never know how for sure
+ * how U/UL/ULL map to __u16, __u32, __u64. At least not in a portable way.
+ */
+#define ___constant_swab16(x) ((__u16)(				\
+	(((__u16)(x) & (__u16)0x00ffU) << 8) |			\
+	(((__u16)(x) & (__u16)0xff00U) >> 8)))
+
+#define ___constant_swab32(x) ((__u32)(				\
+	(((__u32)(x) & (__u32)0x000000ffUL) << 24) |		\
+	(((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |		\
+	(((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |		\
+	(((__u32)(x) & (__u32)0xff000000UL) >> 24)))
+
+#define ___constant_swab64(x) ((__u64)(				\
+	(((__u64)(x) & (__u64)0x00000000000000ffULL) << 56) |	\
+	(((__u64)(x) & (__u64)0x000000000000ff00ULL) << 40) |	\
+	(((__u64)(x) & (__u64)0x0000000000ff0000ULL) << 24) |	\
+	(((__u64)(x) & (__u64)0x00000000ff000000ULL) <<  8) |	\
+	(((__u64)(x) & (__u64)0x000000ff00000000ULL) >>  8) |	\
+	(((__u64)(x) & (__u64)0x0000ff0000000000ULL) >> 24) |	\
+	(((__u64)(x) & (__u64)0x00ff000000000000ULL) >> 40) |	\
+	(((__u64)(x) & (__u64)0xff00000000000000ULL) >> 56)))
+
+#define ___constant_swahw32(x) ((__u32)(			\
+	(((__u32)(x) & (__u32)0x0000ffffUL) << 16) |		\
+	(((__u32)(x) & (__u32)0xffff0000UL) >> 16)))
+
+#define ___constant_swahb32(x) ((__u32)(			\
+	(((__u32)(x) & (__u32)0x00ff00ffUL) << 8) |		\
+	(((__u32)(x) & (__u32)0xff00ff00UL) >> 8)))
+
 #endif /* _ASM_GENERIC_SWAB_H */
diff --git a/include/uapi/linux/swab.h b/include/uapi/linux/swab.h
index 01717181339eb0fb5128668ca13f38205c03fa28..ca808c492996f810ce417ce9701306070873847b 100644
--- a/include/uapi/linux/swab.h
+++ b/include/uapi/linux/swab.h
@@ -6,38 +6,7 @@ 
 #include <linux/stddef.h>
 #include <asm/bitsperlong.h>
 #include <asm/swab.h>
-
-/*
- * casts are necessary for constants, because we never know how for sure
- * how U/UL/ULL map to __u16, __u32, __u64. At least not in a portable way.
- */
-#define ___constant_swab16(x) ((__u16)(				\
-	(((__u16)(x) & (__u16)0x00ffU) << 8) |			\
-	(((__u16)(x) & (__u16)0xff00U) >> 8)))
-
-#define ___constant_swab32(x) ((__u32)(				\
-	(((__u32)(x) & (__u32)0x000000ffUL) << 24) |		\
-	(((__u32)(x) & (__u32)0x0000ff00UL) <<  8) |		\
-	(((__u32)(x) & (__u32)0x00ff0000UL) >>  8) |		\
-	(((__u32)(x) & (__u32)0xff000000UL) >> 24)))
-
-#define ___constant_swab64(x) ((__u64)(				\
-	(((__u64)(x) & (__u64)0x00000000000000ffULL) << 56) |	\
-	(((__u64)(x) & (__u64)0x000000000000ff00ULL) << 40) |	\
-	(((__u64)(x) & (__u64)0x0000000000ff0000ULL) << 24) |	\
-	(((__u64)(x) & (__u64)0x00000000ff000000ULL) <<  8) |	\
-	(((__u64)(x) & (__u64)0x000000ff00000000ULL) >>  8) |	\
-	(((__u64)(x) & (__u64)0x0000ff0000000000ULL) >> 24) |	\
-	(((__u64)(x) & (__u64)0x00ff000000000000ULL) >> 40) |	\
-	(((__u64)(x) & (__u64)0xff00000000000000ULL) >> 56)))
-
-#define ___constant_swahw32(x) ((__u32)(			\
-	(((__u32)(x) & (__u32)0x0000ffffUL) << 16) |		\
-	(((__u32)(x) & (__u32)0xffff0000UL) >> 16)))
-
-#define ___constant_swahb32(x) ((__u32)(			\
-	(((__u32)(x) & (__u32)0x00ff00ffUL) << 8) |		\
-	(((__u32)(x) & (__u32)0xff00ff00UL) >> 8)))
+#include <asm-generic/swab.h>
 
 /*
  * Implement the following as inlines, but define the interface using