
[PATCHv2,bpf-next,1/3] bpf, tests: tweak endianness selection

Message ID 20190320125335.19621-1-sergey.senozhatsky@gmail.com (mailing list archive)
State New

Commit Message

Sergey Senozhatsky March 20, 2019, 12:53 p.m. UTC
Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
thus not all compilers are able to compile the following code:

        (__builtin_constant_p(x) ? \
                ___constant_swab16(x) : __builtin_bswap16(x))

That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
instance:

        error: implicit declaration of function '__builtin_bswap16'

We can use __builtin_bswap16() only if compiler has this built-in,
that is, only if __HAVE_BUILTIN_BSWAP16__ is defined. Standard UAPI
__swab16()/__swab32() take care of that, and, additionally, handle
__builtin_constant_p() cases as well:

 #ifdef __HAVE_BUILTIN_BSWAP16__
 #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
 #else
 #define __swab16(x)                             \
         (__builtin_constant_p((__u16)(x)) ?     \
         ___constant_swab16(x) :                 \
         __fswab16(x))
 #endif

So we can tweak selftests/bpf/bpf_endian.h and use UAPI
__swab16()/__swab32().

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---

v2: fixed build error, reshuffled patches (Stanislav Fomichev)

 tools/testing/selftests/bpf/bpf_endian.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Stanislav Fomichev March 20, 2019, 5:13 p.m. UTC | #1
On 03/20, Sergey Senozhatsky wrote:
> Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
> thus not all compilers are able to compile the following code:
> 
>         (__builtin_constant_p(x) ? \
>                 ___constant_swab16(x) : __builtin_bswap16(x))
> 
> That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
> instance:
> 
>         error: implicit declaration of function '__builtin_bswap16'
> 
> We can use __builtin_bswap16() only if compiler has this built-in,
> that is, only if __HAVE_BUILTIN_BSWAP16__ is defined. Standard UAPI
> __swab16()/__swab32() take care of that, and, additionally, handle
> __builtin_constant_p() cases as well:
> 
>  #ifdef __HAVE_BUILTIN_BSWAP16__
>  #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
>  #else
>  #define __swab16(x)                             \
>          (__builtin_constant_p((__u16)(x)) ?     \
>          ___constant_swab16(x) :                 \
>          __fswab16(x))
>  #endif
> 
> So we can tweak selftests/bpf/bpf_endian.h and use UAPI
> __swab16()/__swab32().
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> ---
> 
> v2: fixed build error, reshuffled patches (Stanislav Fomichev)
Tested them locally with the compiler I saw the initial issues with - all
fine, I don't see any errors with the older gcc.

One last question I have is: what happens in the llvm+bpf case? Have
you tested that? I think LLVM has all the builtins required, but since
we are relying on the swab.h now (and it relies on
__HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
used from both userspace and bpf programs).

> 
>  tools/testing/selftests/bpf/bpf_endian.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/tools/testing/selftests/bpf/bpf_endian.h
> index b25595ea4a78..1ed268b2002b 100644
> --- a/tools/testing/selftests/bpf/bpf_endian.h
> +++ b/tools/testing/selftests/bpf/bpf_endian.h
> @@ -20,12 +20,12 @@
>   * use different targets.
>   */
>  #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> -# define __bpf_ntohs(x)			__builtin_bswap16(x)
> -# define __bpf_htons(x)			__builtin_bswap16(x)
> +# define __bpf_ntohs(x)			__swab16(x)
> +# define __bpf_htons(x)			__swab16(x)
>  # define __bpf_constant_ntohs(x)	___constant_swab16(x)
>  # define __bpf_constant_htons(x)	___constant_swab16(x)
> -# define __bpf_ntohl(x)			__builtin_bswap32(x)
> -# define __bpf_htonl(x)			__builtin_bswap32(x)
> +# define __bpf_ntohl(x)			__swab32(x)
> +# define __bpf_htonl(x)			__swab32(x)
>  # define __bpf_constant_ntohl(x)	___constant_swab32(x)
>  # define __bpf_constant_htonl(x)	___constant_swab32(x)
>  #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
> -- 
> 2.21.0
>
Yonghong Song March 20, 2019, 10:20 p.m. UTC | #2
On 3/20/19 10:13 AM, Stanislav Fomichev wrote:
> Tested them locally with the compiler I saw the initial issues with - all
> fine, I don't see any errors with the older gcc.
> 
> One last question I have is: what happens in the llvm+bpf case? Have
> you tested that? I think LLVM has all the builtins required, but since
> we are relying on the swab.h now (and it relies on
> __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
> correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
> used from both userspace and bpf programs).

The in-kernel clang compiler header (linux/compiler-clang.h) does not
define __HAVE_BUILTIN_BSWAP16__, so the code will take the "else" branch
above. So I think it should work with clang + bpf.

Stanislav Fomichev March 20, 2019, 10:27 p.m. UTC | #3
On 03/20, Yonghong Song wrote:
> 
> 
> On 3/20/19 10:13 AM, Stanislav Fomichev wrote:
> > Tested them locally with the compiler I saw the initial issues with - all
> > fine, I don't see any errors with the older gcc.
> > 
> > One last question I have is: what happens in the llvm+bpf case? Have
> > you tested that? I think LLVM has all the builtins required, but since
> > we are relying on the swab.h now (and it relies on
> > __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
> > correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
> > used from both userspace and bpf programs).
> 
> Inside kernel clang compiler header (linux/compiler-clang.h) does not 
> define __HAVE_BUILTIN_BSWAP16__. So it will go to the "else" branch in 
> the above. So I think it should work with clang + bpf.
Hm, isn't it the opposite of what we want then? I think for llvm+bpf we always
want to use the builtins to make it properly generate
BPF_TO_BE/BPF_TO_LE instructions.

Yonghong Song March 20, 2019, 10:45 p.m. UTC | #4
On 3/20/19 3:27 PM, Stanislav Fomichev wrote:
> On 03/20, Yonghong Song wrote:
>>
>>
>> On 3/20/19 10:13 AM, Stanislav Fomichev wrote:
>>> Tested them locally with the compiler I saw the initial issues with - all
>>> fine, I don't see any errors with the older gcc.
>>>
>>> One last question I have is: what happens in the llvm+bpf case? Have
>>> you tested that? I think LLVM has all the builtins required, but since
>>> we are relying on the swab.h now (and it relies on
>>> __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
>>> correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
>>> used from both userspace and bpf programs).
>>
>> Inside kernel clang compiler header (linux/compiler-clang.h) does not
>> define __HAVE_BUILTIN_BSWAP16__. So it will go to the "else" branch in
>> the above. So I think it should work with clang + bpf.
> Hm, isn't it the opposite of what we want then? I think for llvm+bpf we always
> want to use the builtins to make it properly generate
> BPF_TO_BE/BPF_TO_LE instructions.

Okay, I see. Then this patch will not achieve that.
The following are two common ways to compile a bpf program:
   - "clang -target bpf ...": maybe add a __BPF__ macro somewhere
     to indicate that __builtin_bswap16 is always available?
   - "clang <host target> ..." followed by "llc -march=bpf ...":
     in this case, the __BPF__ macro is not available and
     we will not be able to use the builtin swap for the bpf program.

Maybe use the __clang__ macro (or a gcc macro) to distinguish between clang
and gcc. If it is gcc, we check builtin availability; otherwise,
we assume the builtin is always available? This is not pretty, though.
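
A rough sketch of the compiler-based selection described above. This is
only an illustration: demo_swab16() and demo_fswab16() are hypothetical
names, and the GCC 4.8 cutoff is taken from the commit message rather
than verified here. Note that it keys on the compiler only, not on the
target being compiled for.

```c
#include <stdint.h>

/* Portable fallback for compilers without the builtin (hypothetical helper). */
static inline uint16_t demo_fswab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

#if defined(__clang__)
/* Assumption from the thread: clang always provides the builtin. */
# define demo_swab16(x)	((uint16_t)__builtin_bswap16((uint16_t)(x)))
#elif defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
/* gcc: use the builtin only where it is expected to exist. */
# define demo_swab16(x)	((uint16_t)__builtin_bswap16((uint16_t)(x)))
#else
/* Anything else: fall back to plain C. */
# define demo_swab16(x)	demo_fswab16((uint16_t)(x))
#endif
```

Whichever branch is taken, demo_swab16(0x1234) should evaluate to 0x3412.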

Daniel Borkmann March 20, 2019, 11:08 p.m. UTC | #5
On 03/20/2019 11:45 PM, Yonghong Song wrote:
> On 3/20/19 3:27 PM, Stanislav Fomichev wrote:
>> On 03/20, Yonghong Song wrote:
>>> On 3/20/19 10:13 AM, Stanislav Fomichev wrote:
>>>> Tested them locally with the compiler I saw the initial issues with - all
>>>> fine, I don't see any errors with the older gcc.
>>>>
>>>> One last question I have is: what happens in the llvm+bpf case? Have
>>>> you tested that? I think LLVM has all the builtins required, but since
>>>> we are relying on the swab.h now (and it relies on
>>>> __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
>>>> correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
>>>> used from both userspace and bpf programs).
>>>
>>> Inside kernel clang compiler header (linux/compiler-clang.h) does not
>>> define __HAVE_BUILTIN_BSWAP16__. So it will go to the "else" branch in
>>> the above. So I think it should work with clang + bpf.
>> Hm, isn't it the opposite of what we want then? I think for llvm+bpf we always
>> want to use the builtins to make it properly generate
>> BPF_TO_BE/BPF_TO_LE instructions.
> 
> Okay, I see. Then this patch will not achieve that.
> The following are two common ways to compile a bpf program:
>    - "clang -target bpf ...", maybe add macro __BPF__ somewhere
>      to indicate builtin_bswap16 always available?
>    - "clang <host target> ..." and then "llc -march=bpf ..."
>      in this case, __BPF__ macro is not available and
>      we will not be able to use builtin swap for bpf program.
> 
> Maybe use __clang__ macro (or gcc macro) to distinguish between clang 
> and gcc. If it is gcc we will check builtin availability, otherwise,
> we assume builtin always available? This not pretty though.

I think the way this should be fixed is the following: in the case
of LLVM (aka compiling a BPF prog), we want the code to stay as-is;
in the case of gcc compiling the hostprog, we either want to keep
using __builtin_bswap16() or fall back to something else. Thus,
I would suggest we add a new feature test for tooling infra under
tools/build/feature/ that compiles a dummy prog with __builtin_bswap16().
And in bpf_endian.h we define __bpf_ntohs(x) as __bpf_swab16(x),
which resolves either to __builtin_bswap16() or to some fallback
implementation if the builtin is not available. I don't think there
should be much of an issue, and it would follow the standard way to do it.
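
To make the suggestion above concrete, here is a minimal sketch of the
header side. It is an assumption-laden illustration, not the actual
change: HAVE_BUILTIN_BSWAP16_FEATURE and bpf_swab16_fallback() are
hypothetical names, and the sketch assumes the build system would pass
the macro with -D once a dummy program calling __builtin_bswap16()
compiles. Only __bpf_ntohs() and __bpf_swab16() are names from this
thread.

```c
#include <stdint.h>

/*
 * HAVE_BUILTIN_BSWAP16_FEATURE is a hypothetical macro. The assumption is
 * that the build system defines it when the dummy feature program using
 * __builtin_bswap16() compiles successfully.
 */
#ifdef HAVE_BUILTIN_BSWAP16_FEATURE
# define __bpf_swab16(x)	((uint16_t)__builtin_bswap16((uint16_t)(x)))
#else
/* Hypothetical plain C fallback used when the probe fails. */
static inline uint16_t bpf_swab16_fallback(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}
# define __bpf_swab16(x)	bpf_swab16_fallback((uint16_t)(x))
#endif

/* Network order is big endian, so only little-endian builds swap. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define __bpf_ntohs(x)	__bpf_swab16(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define __bpf_ntohs(x)	(x)
#else
# error "Unknown __BYTE_ORDER__"
#endif
```

With this shape the call sites keep using __bpf_ntohs() and only the
definition of __bpf_swab16() depends on the probe result.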

Stanislav Fomichev March 21, 2019, 12:03 a.m. UTC | #6
On 03/21, Daniel Borkmann wrote:
> On 03/20/2019 11:45 PM, Yonghong Song wrote:
> > On 3/20/19 3:27 PM, Stanislav Fomichev wrote:
> >> On 03/20, Yonghong Song wrote:
> >>> On 3/20/19 10:13 AM, Stanislav Fomichev wrote:
> >>>> Tested them locally with the compiler I saw the initial issues with - all
> >>>> fine, I don't see any errors with the older gcc.
> >>>>
> >>>> One last question I have is: what happens in the llvm+bpf case? Have
> >>>> you tested that? I think LLVM has all the builtins required, but since
> >>>> we are relying on the swab.h now (and it relies on
> >>>> __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
> >>>> correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
> >>>> used from both userspace and bpf programs).
> >>>
> >>> Inside kernel clang compiler header (linux/compiler-clang.h) does not
> >>> define __HAVE_BUILTIN_BSWAP16__. So it will go to the "else" branch in
> >>> the above. So I think it should work with clang + bpf.
> >> Hm, isn't it the opposite of what we want then? I think for llvm+bpf we always
> >> want to use the builtins to make it properly generate
> >> BPF_TO_BE/BPF_TO_LE instructions.
> > 
> > Okay, I see. Then this patch will not achieve that.
> > The following are two common ways to compile a bpf program:
> >    - "clang -target bpf ...", maybe add macro __BPF__ somewhere
> >      to indicate builtin_bswap16 always available?
> >    - "clang <host target> ..." and then "llc -march=bpf ..."
> >      in this case, __BPF__ macro is not available and
> >      we will not be able to use builtin swap for bpf program.
> > 
> > Maybe use __clang__ macro (or gcc macro) to distinguish between clang 
> > and gcc. If it is gcc we will check builtin availability, otherwise,
> > we assume builtin always available? This not pretty though.
> 
> I think the way this should be fixed is the following: In case
> of LLVM (aka compiling BPF prog), we want the code to be as-is,
> in case if gcc is compiling the hostprog, we either want to keep
> using __builtin_bswap16() or fall-back to something else. Thus,
> I would suggest, we add a new feature test for tooling infra under
> tools/build/feature/ that compiles a dummy prog with __builtin_bswap16().
> And in the bpf_endian.h we define __bpf_ntohs(x) to __bpf_swab16(x)
> which either resolves to __builtin_bswap16() or some fallback
> implementation if not available. I don't think there should be much
> of an issue and it would follow the standard way to do it.
It's not as easy as llvm vs gcc. We can compile userland tests with
llvm/clang as well. We really need to distinguish between the targets, bpf vs
non-bpf: always use builtins in the bpf case and fall back to swab.h for
userland (or use feature detection, but swab.h should be enough in
theory).

Can we rely on the __bpf__ define?

$ cat tmp.c
#ifdef __bpf__
#error a
#else
#error b
#endif
$ clang -c -target bpf tmp.c 
tmp.c:2:2: error: a
#error a
 ^
 1 error generated.
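
If __bpf__ can be relied on, the target-based split described above
might look like the sketch below. The demo_* names are hypothetical, and
the plain C helpers merely stand in for what swab.h would provide on
non-bpf targets; this is not the actual bpf_endian.h change.

```c
#include <stdint.h>

#ifdef __bpf__
/* bpf target: always use the builtins, as discussed in this thread. */
# define demo_swab16(x)	((uint16_t)__builtin_bswap16((uint16_t)(x)))
# define demo_swab32(x)	((uint32_t)__builtin_bswap32((uint32_t)(x)))
#else
/* Any other target: plain C helpers standing in for swab.h. */
static inline uint16_t demo_fswab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

static inline uint32_t demo_fswab32(uint32_t x)
{
	return ((x & 0x000000ffU) << 24) | ((x & 0x0000ff00U) << 8) |
	       ((x & 0x00ff0000U) >> 8)  | ((x & 0xff000000U) >> 24);
}
# define demo_swab16(x)	demo_fswab16((uint16_t)(x))
# define demo_swab32(x)	demo_fswab32((uint32_t)(x))
#endif
```

The choice is made per target rather than per compiler, so a userland
test built with clang would still take the non-bpf branch.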

Sergey Senozhatsky March 21, 2019, 12:22 a.m. UTC | #7
On (03/20/19 10:13), Stanislav Fomichev wrote:
> Tested them locally with the compiler I saw the initial issues with - all
> fine, I don't see any errors with the older gcc.

Thanks!

> One last question I have is: what happens in the llvm+bpf case? Have
> you tested that? I think LLVM has all the builtins required, but since
> we are relying on the swab.h now (and it relies on
> __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
> correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
> used from both userspace and bpf programs).

Honestly, I haven't, but I think we should be fine.

For !__HAVE_BUILTIN_BSWAP16__ compilers we still do constant folding.
swab16/swab32 turn into

	(__builtin_constant_p((__u16)(x)) ? ___constant_swab16(x) : __fswab16(x))
and
	(__builtin_constant_p((__u32)(x)) ? ___constant_swab32(x) : __fswab32(x))

clang/llvm support the __builtin_constant_p GCC extension [1]:

 : Clang supports a number of builtin library functions with the same
 : syntax as GCC, including things like __builtin_nan, __builtin_constant_p,
 : __builtin_choose_expr, __builtin_types_compatible_p,
 : __builtin_assume_aligned, __sync_fetch_and_add, etc.

So clang should be able to detect swab on a compile-time constant and
optimize it.

[1] https://clang.llvm.org/docs/LanguageExtensions.html
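
A standalone sketch of that constant-folding pattern, for illustration
only. The demo_* names are hypothetical and mirror the shape of the
swab.h macros rather than reproduce them. The _Static_assert shows that
the constant variant is usable in a compile-time context.

```c
#include <stdint.h>

/* Pure expression, so it can be evaluated at compile time. */
#define demo_constant_swab32(x) ((uint32_t)(			\
	(((uint32_t)(x) & 0x000000ffU) << 24) |			\
	(((uint32_t)(x) & 0x0000ff00U) <<  8) |			\
	(((uint32_t)(x) & 0x00ff0000U) >>  8) |			\
	(((uint32_t)(x) & 0xff000000U) >> 24)))

/* Function variant for values only known at run time. */
static inline uint32_t demo_fswab32(uint32_t x)
{
	return demo_constant_swab32(x);
}

/* Pick the constant variant when the compiler can prove x is constant. */
#define demo_swab32(x)					\
	(__builtin_constant_p((uint32_t)(x)) ?		\
	 demo_constant_swab32(x) :			\
	 demo_fswab32(x))

/* Checked by the compiler, no code is generated for this. */
_Static_assert(demo_constant_swab32(0x11223344U) == 0x44332211U,
	       "constant swab32 should be a compile-time constant");
```

Both arms of the ?: still have to compile, which is why the non-constant
arm here is a plain C function rather than a builtin.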

	-ss
Yonghong Song March 21, 2019, 12:23 a.m. UTC | #8
On 3/20/19 5:03 PM, Stanislav Fomichev wrote:
> On 03/21, Daniel Borkmann wrote:
>> On 03/20/2019 11:45 PM, Yonghong Song wrote:
>>> On 3/20/19 3:27 PM, Stanislav Fomichev wrote:
>>>> On 03/20, Yonghong Song wrote:
>>>>> On 3/20/19 10:13 AM, Stanislav Fomichev wrote:
>>>>>> Tested them locally with the compiler I saw the initial issues with - all
>>>>>> fine, I don't see any errors with the older gcc.
>>>>>>
>>>>>> One last question I have is: what happens in the llvm+bpf case? Have
>>>>>> you tested that? I think LLVM has all the builtins required, but since
>>>>>> we are relying on the swab.h now (and it relies on
>>>>>> __HAVE_BUILTIN_BSWAP16__), I wonder whether this detection works
>>>>>> correctly on the llvm when targeting bpf. (sidenote: bpf_endian.h can be
>>>>>> used from both userspace and bpf programs).
>>>>>
>>>>> Inside kernel clang compiler header (linux/compiler-clang.h) does not
>>>>> define __HAVE_BUILTIN_BSWAP16__. So it will go to the "else" branch in
>>>>> the above. So I think it should work with clang + bpf.
>>>> Hm, isn't it the opposite of what we want then? I think for llvm+bpf we always
>>>> want to use the builtins to make it properly generate
>>>> BPF_TO_BE/BPF_TO_LE instructions.
>>>
>>> Okay, I see. Then this patch will not achieve that.
>>> The following are two common ways to compile a bpf program:
>>>     - "clang -target bpf ...", maybe add macro __BPF__ somewhere
>>>       to indicate builtin_bswap16 always available?
>>>     - "clang <host target> ..." and then "llc -march=bpf ..."
>>>       in this case, __BPF__ macro is not available and
>>>       we will not be able to use builtin swap for bpf program.
>>>
>>> Maybe use __clang__ macro (or gcc macro) to distinguish between clang
>>> and gcc. If it is gcc we will check builtin availability, otherwise,
>>> we assume builtin always available? This not pretty though.
>>
>> I think the way this should be fixed is the following: In case
>> of LLVM (aka compiling BPF prog), we want the code to be as-is,
>> in case if gcc is compiling the hostprog, we either want to keep
>> using __builtin_bswap16() or fall-back to something else. Thus,
>> I would suggest, we add a new feature test for tooling infra under
>> tools/build/feature/ that compiles a dummy prog with __builtin_bswap16().
>> And in the bpf_endian.h we define __bpf_ntohs(x) to __bpf_swab16(x)
>> which either resolves to __builtin_bswap16() or some fallback
>> implementation if not available. I don't think there should be much
>> of an issue and it would follow the standard way to do it.
> It's not as easy as llvm vs gcc. We can compile userland tests with
> llvm/clang as well. We really need to distinguish between the target: bpf vs
> non-bpf: always use builtins in bpf case and fallback to swab.h for
> userland (or use feature detection, but swab.h should be enough in
> theory).
> 
> Can we rely on __bpf__ define?
> 
> $ cat tmp.c
> #ifdef __bpf__
> #error a
> #else
> #error b
> #endif
> $ clang -c -target bpf tmp.c
> tmp.c:2:2: error: a
> #error a
>   ^
>   1 error generated.

Yes, you can rely on this: __bpf__, __bpf and __BPF__ are the three
macros that clang predefines for the bpf target.
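
For illustration only, here is a minimal sketch of the target-based selection
discussed above: builtins when compiling for the bpf target, a fallback
otherwise. It is not the patch under review. All sketch_* names are
hypothetical, and the shift-based fallbacks merely stand in for
__swab16()/__swab32() from swab.h so that the example builds on its own.

```c
#include <stdint.h>

/*
 * Portable shift-based byte swaps. Assumption: these are stand-ins for
 * __swab16()/__swab32() from <linux/swab.h>, used here only to keep the
 * sketch self-contained.
 */
static inline uint16_t sketch_fallback_swab16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

static inline uint32_t sketch_fallback_swab32(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8)  | ((x & 0xff000000u) >> 24);
}

#if defined(__bpf__)
/* bpf target: always use the builtins, as requested in the thread. */
# define sketch_swab16(x)	__builtin_bswap16(x)
# define sketch_swab32(x)	__builtin_bswap32(x)
#else
/* Any other target (userland tests): do not depend on the builtins. */
# define sketch_swab16(x)	sketch_fallback_swab16(x)
# define sketch_swab32(x)	sketch_fallback_swab32(x)
#endif

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define sketch_ntohs(x)	sketch_swab16(x)
# define sketch_htons(x)	sketch_swab16(x)
# define sketch_ntohl(x)	sketch_swab32(x)
# define sketch_htonl(x)	sketch_swab32(x)
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define sketch_ntohs(x)	(x)
# define sketch_htons(x)	(x)
# define sketch_ntohl(x)	(x)
# define sketch_htonl(x)	(x)
#else
# error "unknown byte order"
#endif

#ifndef __bpf__
#include <assert.h>

/* Host-only sanity check of the fallbacks and the round trip. */
static inline void sketch_selftest(void)
{
	assert(sketch_fallback_swab16(0x1234) == 0x3412);
	assert(sketch_fallback_swab32(0x11223344u) == 0x44332211u);
	assert(sketch_ntohs(sketch_htons(0xabcd)) == 0xabcd);
}
#endif
```

Keying on __bpf__ rather than on __clang__ keeps the choice tied to the
target, so userland tests built with clang still take the fallback path.
As noted earlier in the thread, the macro is not defined when a program is
built with "clang <host target>" followed by "llc -march=bpf", so that flow
would also end up on the fallback.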

> 
>>
>>>>>>>     tools/testing/selftests/bpf/bpf_endian.h | 8 ++++----
>>>>>>>     1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/tools/testing/selftests/bpf/bpf_endian.h
>>>>>>> index b25595ea4a78..1ed268b2002b 100644
>>>>>>> --- a/tools/testing/selftests/bpf/bpf_endian.h
>>>>>>> +++ b/tools/testing/selftests/bpf/bpf_endian.h
>>>>>>> @@ -20,12 +20,12 @@
>>>>>>>      * use different targets.
>>>>>>>      */
>>>>>>>     #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
>>>>>>> -# define __bpf_ntohs(x)			__builtin_bswap16(x)
>>>>>>> -# define __bpf_htons(x)			__builtin_bswap16(x)
>>>>>>> +# define __bpf_ntohs(x)			__swab16(x)
>>>>>>> +# define __bpf_htons(x)			__swab16(x)
>>>>>>>     # define __bpf_constant_ntohs(x)	___constant_swab16(x)
>>>>>>>     # define __bpf_constant_htons(x)	___constant_swab16(x)
>>>>>>> -# define __bpf_ntohl(x)			__builtin_bswap32(x)
>>>>>>> -# define __bpf_htonl(x)			__builtin_bswap32(x)
>>>>>>> +# define __bpf_ntohl(x)			__swab32(x)
>>>>>>> +# define __bpf_htonl(x)			__swab32(x)
>>>>>>>     # define __bpf_constant_ntohl(x)	___constant_swab32(x)
>>>>>>>     # define __bpf_constant_htonl(x)	___constant_swab32(x)
>>>>>>>     #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
>>>>>>> -- 
>>>>>>> 2.21.0
>>>>>>>
>>
Sergey Senozhatsky March 21, 2019, 12:34 a.m. UTC | #9
On (03/20/19 15:27), Stanislav Fomichev wrote:
[..]
> > Inside kernel clang compiler header (linux/compiler-clang.h) does not 
> > define __HAVE_BUILTIN_BSWAP16__. So it will go to the "else" branch in 
> > the above. So I think it should work with clang + bpf.
> Hm, isn't it the opposite of what we want then? I think for llvm+bpf we always
> want to use the builtins to make it properly generate
> BPF_TO_BE/BPF_TO_LE instructions.

Oh, hmm, OK. I see your point now. bpf insn set for variables.

	-ss
Alexei Starovoitov March 21, 2019, 3:24 a.m. UTC | #10
On Wed, Mar 20, 2019 at 09:53:33PM +0900, Sergey Senozhatsky wrote:
> Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
> thus not all compilers are able to compile the following code:
> 
>         (__builtin_constant_p(x) ? \
>                 ___constant_swab16(x) : __builtin_bswap16(x))
> 
> That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
> instance:

nack to fixes to support such old compilers.
Sergey Senozhatsky March 21, 2019, 5:08 a.m. UTC | #11
On (03/20/19 20:24), Alexei Starovoitov wrote:
> On Wed, Mar 20, 2019 at 09:53:33PM +0900, Sergey Senozhatsky wrote:
> > Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
> > thus not all compilers are able to compile the following code:
> > 
> >         (__builtin_constant_p(x) ? \
> >                 ___constant_swab16(x) : __builtin_bswap16(x))
> > 
> > That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
> > instance:
> 
> nack to fixes to support such old compilers.

Fair enough.

	-ss
Stanislav Fomichev March 21, 2019, 3:49 p.m. UTC | #12
On 03/21, Sergey Senozhatsky wrote:
> On (03/20/19 20:24), Alexei Starovoitov wrote:
> > On Wed, Mar 20, 2019 at 09:53:33PM +0900, Sergey Senozhatsky wrote:
> > > Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
> > > thus not all compilers are able to compile the following code:
> > > 
> > >         (__builtin_constant_p(x) ? \
> > >                 ___constant_swab16(x) : __builtin_bswap16(x))
> > > 
> > > That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
> > > instance:
> > 
> > nack to fixes to support such old compilers.
> 
> Fair enough.
What is too old? Documentation/process/changes.rst says that minimum
supported gcc is 4.6, do we lift that requirement for the tests?

> 	-ss
Sergey Senozhatsky March 22, 2019, 2:46 a.m. UTC | #13
On (03/21/19 08:49), Stanislav Fomichev wrote:
> On 03/21, Sergey Senozhatsky wrote:
> > On (03/20/19 20:24), Alexei Starovoitov wrote:
> > > On Wed, Mar 20, 2019 at 09:53:33PM +0900, Sergey Senozhatsky wrote:
> > > > Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
> > > > thus not all compilers are able to compile the following code:
> > > > 
> > > >         (__builtin_constant_p(x) ? \
> > > >                 ___constant_swab16(x) : __builtin_bswap16(x))
> > > > 
> > > > That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
> > > > instance:
> > > 
> > > nack to fixes to support such old compilers.
> > 
> > Fair enough.
> What is too old? Documentation/process/changes.rst says that minimum
> supported gcc is 4.6, do we lift that requirement for the tests?

Hmm, good point, Stanislav. I thought it was gcc 4.9 which introduced
asm goto and hence 4.9 is the minimum supported version. But it seems
that it was 4.5/4.6, so the min supported gcc version is 4.6. Which
means that those bpf defines won't work on some compilers.

Alexei, does your NACK still stand?

	-ss
Alexei Starovoitov March 22, 2019, 3:12 a.m. UTC | #14
On Fri, Mar 22, 2019 at 11:46:52AM +0900, Sergey Senozhatsky wrote:
> On (03/21/19 08:49), Stanislav Fomichev wrote:
> > On 03/21, Sergey Senozhatsky wrote:
> > > On (03/20/19 20:24), Alexei Starovoitov wrote:
> > > > On Wed, Mar 20, 2019 at 09:53:33PM +0900, Sergey Senozhatsky wrote:
> > > > > Not all compilers have __builtin_bswap16() and __builtin_bswap32(),
> > > > > thus not all compilers are able to compile the following code:
> > > > > 
> > > > >         (__builtin_constant_p(x) ? \
> > > > >                 ___constant_swab16(x) : __builtin_bswap16(x))
> > > > > 
> > > > > That's the reason why bpf_ntohl() doesn't work on GCC < 4.8, for
> > > > > instance:
> > > > 
> > > > nack to fixes to support such old compilers.
> > > 
> > > Fair enough.
> > What is too old? Documentation/process/changes.rst says that minimum
> > supported gcc is 4.6, do we lift that requirement for the tests?
> 
> Hmm, good point, Stanislav. I thought it was gcc 4.9 which introduced
> asm goto and hence 4.9 is the minimum supported version. But it seems
> that it was 4.5/4.6, so the min supported gcc version is 4.6. Which
> means that those bpf defines won't work on some compilers.
> 
> Alexei, does your NACK still stand?

yes.
bpf samples and selftests require llvm and new features like BTF
require the latest llvm which requires gcc 5.1.
Things are more or less working still with gcc 4.8, but soon will
likely start breaking.

Patch

diff --git a/tools/testing/selftests/bpf/bpf_endian.h b/tools/testing/selftests/bpf/bpf_endian.h
index b25595ea4a78..1ed268b2002b 100644
--- a/tools/testing/selftests/bpf/bpf_endian.h
+++ b/tools/testing/selftests/bpf/bpf_endian.h
@@ -20,12 +20,12 @@ 
  * use different targets.
  */
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-# define __bpf_ntohs(x)			__builtin_bswap16(x)
-# define __bpf_htons(x)			__builtin_bswap16(x)
+# define __bpf_ntohs(x)			__swab16(x)
+# define __bpf_htons(x)			__swab16(x)
 # define __bpf_constant_ntohs(x)	___constant_swab16(x)
 # define __bpf_constant_htons(x)	___constant_swab16(x)
-# define __bpf_ntohl(x)			__builtin_bswap32(x)
-# define __bpf_htonl(x)			__builtin_bswap32(x)
+# define __bpf_ntohl(x)			__swab32(x)
+# define __bpf_htonl(x)			__swab32(x)
 # define __bpf_constant_ntohl(x)	___constant_swab32(x)
 # define __bpf_constant_htonl(x)	___constant_swab32(x)
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__