Message ID | 20210504110519.16097-1-alx.manpages@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [RFC,v2] bpf.2: Use standard types and attributes | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Tue, May 4, 2021 at 4:05 AM Alejandro Colomar <alx.manpages@gmail.com> wrote: > > Some manual pages are already using C99 syntax for integral > types 'uint32_t', but some aren't. There are some using kernel > syntax '__u32'. Fix those. > > Some pages also document attributes, using GNU syntax > '__attribute__((xxx))'. Update those to use the shorter and more > portable C11 keywords such as 'alignas()' when possible, and C2x > syntax '[[gnu::xxx]]' elsewhere, which hasn't been standardized > yet, but is already implemented in GCC, and available through > either --std=c2x or any of the --std=gnu... options. > > The standard isn't very clear on how to use alignas() or > [[]]-style attributes, so the following link is useful in the case > of 'alignas()' and '[[gnu::aligned()]]': > <https://stackoverflow.com/q/67271825/6872717> > > Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com> > Cc: LKML <linux-kernel@vger.kernel.org> > Cc: glibc <libc-alpha@sourceware.org> > Cc: GCC <gcc-patches@gcc.gnu.org> > Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> > Cc: bpf <bpf@vger.kernel.org> > Cc: David Laight <David.Laight@ACULAB.COM> > Cc: Zack Weinberg <zackw@panix.com> > Cc: Joseph Myers <joseph@codesourcery.com> > --- > man2/bpf.2 | 49 ++++++++++++++++++++++++------------------------- > 1 file changed, 24 insertions(+), 25 deletions(-) > > diff --git a/man2/bpf.2 b/man2/bpf.2 > index 6e1ffa198..04b8fbcef 100644 > --- a/man2/bpf.2 > +++ b/man2/bpf.2 > @@ -186,41 +186,40 @@ commands: > .PP > .in +4n > .EX > -union bpf_attr { > +union [[gnu::aligned(8)]] bpf_attr { > struct { /* Used by BPF_MAP_CREATE */ > - __u32 map_type; > - __u32 key_size; /* size of key in bytes */ > - __u32 value_size; /* size of value in bytes */ > - __u32 max_entries; /* maximum number of entries > - in a map */ > + uint32_t map_type; > + uint32_t key_size; /* size of key in bytes */ > + uint32_t value_size; /* size of value in bytes */ > + uint32_t max_entries; /* maximum number of entries > + in a map */ For the same reasons as explained earlier: Nacked-by: Alexei Starovoitov <ast@kernel.org>
On Tue, May 04, 2021 at 07:12:01AM -0700, Alexei Starovoitov wrote: > On Tue, May 4, 2021 at 4:05 AM Alejandro Colomar <alx.manpages@gmail.com> wrote: > > > > Some manual pages are already using C99 syntax for integral > > types 'uint32_t', but some aren't. There are some using kernel > > syntax '__u32'. Fix those. > > > > Some pages also document attributes, using GNU syntax > > '__attribute__((xxx))'. Update those to use the shorter and more > > portable C11 keywords such as 'alignas()' when possible, and C2x > > syntax '[[gnu::xxx]]' elsewhere, which hasn't been standardized > > yet, but is already implemented in GCC, and available through > > either --std=c2x or any of the --std=gnu... options. > > > > The standard isn't very clear on how to use alignas() or > > [[]]-style attributes, so the following link is useful in the case > > of 'alignas()' and '[[gnu::aligned()]]': > > <https://stackoverflow.com/q/67271825/6872717> > > > > Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com> > > Cc: LKML <linux-kernel@vger.kernel.org> > > Cc: glibc <libc-alpha@sourceware.org> > > Cc: GCC <gcc-patches@gcc.gnu.org> > > Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> > > Cc: bpf <bpf@vger.kernel.org> > > Cc: David Laight <David.Laight@ACULAB.COM> > > Cc: Zack Weinberg <zackw@panix.com> > > Cc: Joseph Myers <joseph@codesourcery.com> > > --- > > man2/bpf.2 | 49 ++++++++++++++++++++++++------------------------- > > 1 file changed, 24 insertions(+), 25 deletions(-) > > > > diff --git a/man2/bpf.2 b/man2/bpf.2 > > index 6e1ffa198..04b8fbcef 100644 > > --- a/man2/bpf.2 > > +++ b/man2/bpf.2 > > @@ -186,41 +186,40 @@ commands: > > .PP > > .in +4n > > .EX > > -union bpf_attr { > > +union [[gnu::aligned(8)]] bpf_attr { > > struct { /* Used by BPF_MAP_CREATE */ > > - __u32 map_type; > > - __u32 key_size; /* size of key in bytes */ > > - __u32 value_size; /* size of value in bytes */ > > - __u32 max_entries; /* maximum number of entries > > - in a map */ > > + uint32_t map_type; > > + uint32_t key_size; /* size of key in bytes */ > > + uint32_t value_size; /* size of value in bytes */ > > + uint32_t max_entries; /* maximum number of entries > > + in a map */ > > For the same reasons as explained earlier: > Nacked-by: Alexei Starovoitov <ast@kernel.org> I agree, the two are not the same type at all, this change should not be accepted. thanks, greg k-h
Hi Greg and Alexei, > On Tue, May 04, 2021 at 07:12:01AM -0700, Alexei Starovoitov wrote: >> For the same reasons as explained earlier: >> Nacked-by: Alexei Starovoitov <ast@kernel.org> Okay, I'll add that. On 5/4/21 4:24 PM, Greg KH wrote:> I agree, the two are not the same type at all, this change should not be > accepted. I get that in the kernel you don't use the standard fixed-width types (with some exceptions), probably not to mess with code that relies on <stdint.h> not being included (I hope there's not much code that relies on this in 2021, but who knows). But, there is zero difference between these types, from the point of view of the compiler. There's 100% compatibility between those types, and you're able to mix'n'match them. See some example below. Could you please explain why the documentation, which supposedly only documents the API and not the internal implementation, should not use standard naming conventions? The standard is much easier to read for userspace programmers, which might ignore why the kernel does some things in some specific ways. BTW, just to clarify, bpf.2 is just a small sample to get reviews; the original intention was to replace __uNN by uintNN_t in all of the manual pages. Thanks, Alex ... Example: $ cat test.c #include <stdint.h> typedef int __s32; int32_t foo(void); int main(void) { return 1 - foo(); } __s32 foo(void) { return 1; } $ cc -Wall -Wextra -Werror -S -Og test.c -o test.s $ cat test.s .file "test.c" .text .globl foo .type foo, @function foo: .LFB1: .cfi_startproc movl $1, %eax ret .cfi_endproc .LFE1: .size foo, .-foo .globl main .type main, @function main: .LFB0: .cfi_startproc call foo movl %eax, %edx movl $1, %eax subl %edx, %eax ret .cfi_endproc .LFE0: .size main, .-main .ident "GCC: (Debian 10.2.1-6) 10.2.1 20210110" .section .note.GNU-stack,"",@progbits $
On Tue, May 04, 2021 at 05:53:29PM +0200, Alejandro Colomar (man-pages) wrote: > Hi Greg and Alexei, > > > On Tue, May 04, 2021 at 07:12:01AM -0700, Alexei Starovoitov wrote: > > > For the same reasons as explained earlier: > > > Nacked-by: Alexei Starovoitov <ast@kernel.org> > > Okay, I'll add that. > > > On 5/4/21 4:24 PM, Greg KH wrote:> I agree, the two are not the same type at > all, this change should not be > > accepted. > > I get that in the kernel you don't use the standard fixed-width types (with > some exceptions), probably not to mess with code that relies on <stdint.h> > not being included (I hope there's not much code that relies on this in > 2021, but who knows). > > But, there is zero difference between these types, from the point of view of > the compiler. There's 100% compatibility between those types, and you're > able to mix'n'match them. See some example below. > > Could you please explain why the documentation, which supposedly only > documents the API and not the internal implementation, should not use > standard naming conventions? The standard is much easier to read for > userspace programmers, which might ignore why the kernel does some things in > some specific ways. > > BTW, just to clarify, bpf.2 is just a small sample to get reviews; the > original intention was to replace __uNN by uintNN_t in all of the manual > pages. There's a very old post from Linus where he describes the difference between things like __u32 and uint32_t. They are not the same, they live in different namespaces, and worlds, and can not always be swapped out for each other on all arches. Dig it up if you are curious, but for user/kernel apis you HAVE to use the __uNN and can not use uintNN_t variants, so don't try to mix/match them, it's good to just follow the kernel standard please. So consider this my: Nacked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> as well. thanks, greg k-h
On 5/4/21 5:53 PM, Alejandro Colomar (man-pages) wrote: > Hi Greg and Alexei, > >> On Tue, May 04, 2021 at 07:12:01AM -0700, Alexei Starovoitov wrote: >>> For the same reasons as explained earlier: >>> Nacked-by: Alexei Starovoitov <ast@kernel.org> > > Okay, I'll add that. > > On 5/4/21 4:24 PM, Greg KH wrote:> I agree, the two are not the same type at all, this change should not be >> accepted. > > I get that in the kernel you don't use the standard fixed-width types (with some exceptions), probably not to mess with code that relies on <stdint.h> not being included (I hope there's not much code that relies on this in 2021, but who knows). > > But, there is zero difference between these types, from the point of view of the compiler. There's 100% compatibility between those types, and you're able to mix'n'match them. See some example below. > > Could you please explain why the documentation, which supposedly only documents the API and not the internal implementation, should not use standard naming conventions? The standard is much easier to read for userspace programmers, which might ignore why the kernel does some things in some specific ways. > > BTW, just to clarify, bpf.2 is just a small sample to get reviews; the original intention was to replace __uNN by uintNN_t in all of the manual pages. But what /problem/ is this really solving? Why bother to change this /now/ after so many years?! I think this is causing more confusion than solving anything, really. Moreover, what are you doing with all the __{le,be}{16,32,64} types in uapi? Anyway, NAK for bpf.2 specifically, and the idea generally.. Best, Daniel
On Tue, May 4, 2021 at 12:06 PM Greg KH <gregkh@linuxfoundation.org> wrote: > On Tue, May 04, 2021 at 05:53:29PM +0200, Alejandro Colomar (man-pages) wrote: > > On 5/4/21 4:24 PM, Greg KH wrote: > > > I agree, the two are not the same type at all, this change should not be > > > accepted. > > > > I get that in the kernel you don't use the standard fixed-width types (with > > some exceptions), probably not to mess with code that relies on <stdint.h> > > not being included (I hope there's not much code that relies on this in > > 2021, but who knows). > > > > But, there is zero difference between these types, from the point of view of > > the compiler. There's 100% compatibility between those types, and you're > > able to mix'n'match them. See some example below. ... > There's a very old post from Linus where he describes the difference > between things like __u32 and uint32_t. They are not the same, they > live in different namespaces, and worlds, and can not always be swapped > out for each other on all arches. > > Dig it up if you are curious, but for user/kernel apis you HAVE to use > the __uNN and can not use uintNN_t variants, so don't try to mix/match > them, it's good to just follow the kernel standard please. ... > Nacked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Speaking from the C library's perspective, I'm going to push back pretty hard on this NAK, for several reasons. First, this is a proposed change to the manpages, not the headers themselves. Manpage documentation of C structs is *not* expected to match the actual declaration in the headers. The documented field type is usually assignment-compatible with the actual type, but not always. There's no guarantee whatsoever that the fields are in the same order as the header, or that the listed set of fields is complete. I would say that as long as any value of type __u32 can be stored in a variable of type uint32_t without data loss, and vice versa, there is no reason why manpages should *have to* use __u32 in preference to uint32_t, and that in the absence of such a reason, the standard type should be used. Second, it's true that __u32 and uint32_t are in different namespaces, and it may well be necessary for uapi <linux/*.h> headers to use the __uNN names in order to preserve the C standard's distinction between the program and the implementation, but that's *not* a reason for documentation aimed at writers of user-space programs to use the __uNN names. In fact, it is exactly the opposite! User space program authors should, all else equal, be *discouraged* from using the __uNN names, and avoiding their use in manpages is one way to do that. Third, if there does in fact exist a situation where __uNN and uintNN_t are *not* assignment compatible, THAT IS A BUG IN THE KERNEL. Frankly, it would be such a catastrophic bug that I think Linus has to have been *wrong*. We would have noticed the problems long ago if he were right. I'm going to have to ask you to produce hard evidence for your claim that __uNN and uintNN_t are not (always) assignment compatible, and hard evidence why that can't be fixed within the kernel, or else withdraw your objection. zw
Hi Greg, Daniel, On 5/4/21 6:06 PM, Greg KH wrote: > There's a very old post from Linus where he describes the difference > between things like __u32 and uint32_t. They are not the same, they > live in different namespaces, and worlds, and can not always be swapped > out for each other on all arches.> > Dig it up if you are curious, but for user/kernel apis you HAVE to use > the __uNN and can not use uintNN_t variants, so don't try to mix/match > them, it's good to just follow the kernel standard please. I found these: * [RFC] Splitting kernel headers and deprecating __KERNEL__ <https://lore.kernel.org/lkml/Pine.LNX.4.58.0412140734340.3279@ppc970.osdl.org/T/> * coding style <https://lore.kernel.org/lkml/alpine.LFD.0.98.0706160840290.14121@woody.linux-foundation.org/> * [patch] Small input fixes for 2.5.29 <https://lore.kernel.org/lkml/Pine.LNX.4.33.0207301417190.2051-100000@penguin.transmeta.com/T/> I already knew the first one, and now found the other two. If there's any other thread that is relevant, I couldn't find it. The thing is, in all of those threads, the only reasons to avoid <stdint.h> types in the kernel (at least, the only explicitly mentioned ones) are (a bit simplified, but this is the general idea of those threads): * Possibly breaking something in such a big automated change. * Namespace collision with userspace (the C standard allows defining uint32_t for nefarious purposes as long as you don't include <stdint.h>. POSIX prohibits that, though) * Uglier But * The manual pages only document the variable size and signedness by using either '__u32' or 'uint32_t'. We state that the variable is an unsigned integer of exactly 32 bits; nothing more and nothing less. It doesn't specify that those types are defined in <linux/bpf.h> (or whatever header a specific manual page uses). In fact, in uint32_t(3) we clearly state the headers that shall provide the type. In the end, the kernel will receive a 32 bit number. I'm not exactly sure about what is wrong with this. Is there any magic in the kernel/user interface beyond what the standard and the compiler define that I ignore? * At that time (~2004), the C99 and POSIX.1-2001 standards were quite young, and it was likely to find code that defined uint32_t. Currently, it is hard to find something that compiles without C99, and even if C99 allows you to define uint32_t as long as you don't include <stdint.h>, it would be really stupid to do so. And POSIX, which completely prohibits defining uint32_t, is also very present in Linux and other UNIX systems. So we can probably guarantee that using <stdint.h> in the kernel wouldn't break anything. But yet this isn't trying to do so. This is only about the manual pages. I haven't read it in any of those threads, but suspect that the static analyzer used for the kernel might use extra information from the different 'u32'/'__u32' type names to do some extra checks. Does it? > and can not always be swapped out for each other on all arches. Really? 'uint32_t' is defined as "an unsigned integer type of a fixed width of exactly 32 bits". How is that different from '[__]u32'? Aren't the kernel types guaranteed to be unsigned integers of exactly 32 bits? AFAICT, they are 100% binary compatible; and if not, it's probably a kernel bug. Yes there are archs that don't provide 64 bit integers (I ignore if any of the archs supported by Linux does though), but if an arch doesn't provide 'uint64_t', it will neither be possible to have '__u64'. [ uintN_t Include: <stdint.h>. Alternatively, <inttypes.h>. uint8_t, uint16_t, uint32_t, uint64_t An unsigned integer type of a fixed width of ex‐ actly N bits, N being the value specified in its type name. According to the C language standard, they shall be capable of storing values in the range [0, UINTN_MAX], substituting N by the appro‐ priate number. According to POSIX, uint8_t, uint16_t, and uint32_t are required; uint64_t is only required in implementations that provide integer types with width 64; and all other types of this form are op‐ tional. ] -- uint32_t(3) > > So consider this my: > > Nacked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > > as well. Okay. On 5/4/21 6:08 PM, Daniel Borkmann wrote: > > But what /problem/ is this really solving? Why bother to change this /now/ > after so many years?! I think this is causing more confusion than solving > anything, really. Moreover, what are you doing with all the > __{le,be}{16,32,64} > types in uapi? Anyway, NAK for bpf.2 specifically, and the idea generally.. > I'm trying to clarify the manual pages as much as possible, by using standard conventions and similar structure all around the pages. Not everyone understands kernel conventions. Basically, Zack said very much what I had in mind with this patch. Thanks for your reviews! Regards, Alex -- Alejandro Colomar Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/ http://www.alejandro-colomar.es/
* Alejandro Colomar: > The thing is, in all of those threads, the only reasons to avoid > <stdint.h> types in the kernel (at least, the only explicitly > mentioned ones) are (a bit simplified, but this is the general idea of > those threads): > > * Possibly breaking something in such a big automated change. > * Namespace collision with userspace (the C standard allows defining > uint32_t for nefarious purposes as long as you don't include > <stdint.h>. POSIX prohibits that, though) > * Uglier __u64 can't be formatted with %llu on all architectures. That's not true for uint64_t, where you have to use %lu on some architectures to avoid compiler warnings (and technically undefined behavior). There are preprocessor macros to get the expected format specifiers, but they are clunky. I don't know if the problem applies to uint32_t. It does happen with size_t and ptrdiff_t on 32-bit targets (both vary between int and long). Thanks, Florian
Hi Florian, On 5/4/21 9:45 PM, Florian Weimer wrote: > * Alejandro Colomar: > >> The thing is, in all of those threads, the only reasons to avoid >> <stdint.h> types in the kernel (at least, the only explicitly >> mentioned ones) are (a bit simplified, but this is the general idea of >> those threads): >> >> * Possibly breaking something in such a big automated change. >> * Namespace collision with userspace (the C standard allows defining >> uint32_t for nefarious purposes as long as you don't include >> <stdint.h>. POSIX prohibits that, though) >> * Uglier > > __u64 can't be formatted with %llu on all architectures. That's not > true for uint64_t, where you have to use %lu on some architectures to > avoid compiler warnings (and technically undefined behavior). There are > preprocessor macros to get the expected format specifiers, but they are > clunky. I don't know if the problem applies to uint32_t. It does > happen with size_t and ptrdiff_t on 32-bit targets (both vary between > int and long). > Hmmm, that's interesting. It looks like Linux always uses long long for 64 bit types, while glibc uses 'long' as long as it's possible, and only uses 'long long' when necessary. Assignment is still 100% valid both ways and binary compatibility also 100% (AFAIK), given they're the same length and signedness, but pointers are incompatible. That's something to note, even though in this case there are no pointers involved, so no incompatibilities. Maybe the kernel and glibc could use the same rules to improve compatibility, but that's out of the scope of this. Thanks, Alex
On 5/4/21 8:54 PM, Alejandro Colomar (man-pages) wrote: > On 5/4/21 6:06 PM, Greg KH wrote: > > There's a very old post from Linus where he describes the difference > > between things like __u32 and uint32_t. They are not the same, they > > live in different namespaces, and worlds, and can not always be swapped > > out for each other on all arches.> > > Dig it up if you are curious, but for user/kernel apis you HAVE to use > > the __uNN and can not use uintNN_t variants, so don't try to mix/match > > them, it's good to just follow the kernel standard please. > I found these: > > * [RFC] Splitting kernel headers and deprecating __KERNEL__ <https://lore.kernel.org/lkml/Pine.LNX.4.58.0412140734340.3279@ppc970.osdl.org/T/> > > * coding style <https://lore.kernel.org/lkml/alpine.LFD.0.98.0706160840290.14121@woody.linux-foundation.org/> > > * [patch] Small input fixes for 2.5.29 <https://lore.kernel.org/lkml/Pine.LNX.4.33.0207301417190.2051-100000@penguin.transmeta.com/T/> > > I already knew the first one, and now found the other two. If there's any other thread that is relevant, I couldn't find it. > > The thing is, in all of those threads, the only reasons to avoid <stdint.h> types in the kernel (at least, the only explicitly mentioned ones) are (a bit simplified, but this is the general idea of those threads): > > * Possibly breaking something in such a big automated change. > * Namespace collision with userspace (the C standard allows defining uint32_t for nefarious purposes as long as you don't include <stdint.h>. POSIX prohibits that, though) > * Uglier > > But > > * The manual pages only document the variable size and signedness by using either '__u32' or 'uint32_t'. We state that the variable is an unsigned integer of exactly 32 bits; nothing more and nothing less. It doesn't specify that those types are defined in <linux/bpf.h> (or whatever header a specific manual page uses). In fact, in uint32_t(3) we clearly state the headers that shall provide the type. In the end, the kernel will receive a 32 bit number. I'm not exactly sure about what is wrong with this. Is there any magic in the kernel/user interface beyond what the standard and the compiler define that I ignore? > > * At that time (~2004), the C99 and POSIX.1-2001 standards were quite young, and it was likely to find code that defined uint32_t. Currently, it is hard to find something that compiles without C99, and even if C99 allows you to define uint32_t as long as you don't include <stdint.h>, it would be really stupid to do so. And POSIX, which completely prohibits defining uint32_t, is also very present in Linux and other UNIX systems. So we can probably guarantee that using <stdint.h> in the kernel wouldn't break anything. But yet this isn't trying to do so. This is only about the manual pages. > > I haven't read it in any of those threads, but suspect that the static analyzer used for the kernel might use extra information from the different 'u32'/'__u32' type names to do some extra checks. Does it? > > > and can not always be swapped out for each other on all arches. > > Really? 'uint32_t' is defined as "an unsigned integer type of a fixed width of exactly 32 bits". How is that different from '[__]u32'? Aren't the kernel types guaranteed to be unsigned integers of exactly 32 bits? AFAICT, they are 100% binary compatible; and if not, it's probably a kernel bug. > > Yes there are archs that don't provide 64 bit integers (I ignore if any of the archs supported by Linux does though), but if an arch doesn't provide 'uint64_t', it will neither be possible to have '__u64'. > > [ > uintN_t > Include: <stdint.h>. Alternatively, <inttypes.h>. > > uint8_t, uint16_t, uint32_t, uint64_t > > An unsigned integer type of a fixed width of ex‐ > actly N bits, N being the value specified in its > type name. According to the C language standard, > they shall be capable of storing values in the > range [0, UINTN_MAX], substituting N by the appro‐ > priate number. > > According to POSIX, uint8_t, uint16_t, and > uint32_t are required; uint64_t is only required > in implementations that provide integer types with > width 64; and all other types of this form are op‐ > tional. > > ] -- uint32_t(3) > > > > > > So consider this my: > > > > Nacked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > > > > as well. > Okay. > > On 5/4/21 6:08 PM, Daniel Borkmann wrote: > > > > But what /problem/ is this really solving? Why bother to change this /now/ > > after so many years?! I think this is causing more confusion than solving > > anything, really. Moreover, what are you doing with all the > > __{le,be}{16,32,64} > > types in uapi? Anyway, NAK for bpf.2 specifically, and the idea generally.. > > I'm trying to clarify the manual pages as much as possible, by using standard conventions and similar structure all around the pages. Not everyone understands kernel conventions. Basically, Zack said very much what I had in mind with this patch. But then are you also converting, for example, __{le,be}{16,32,64} to plain uint{16,32,64}_t in the man pages and thus removing contextual information (or inventing new equivalent types)? What about other types exposed to user space like __sum16, __wsum, or __poll_t when they are part of a man page, etc? Thanks, Daniel
Hi Daniel, On 5/4/21 10:06 PM, Daniel Borkmann wrote: >> >> On 5/4/21 6:08 PM, Daniel Borkmann wrote: >> > >> > But what /problem/ is this really solving? Why bother to change >> this /now/ >> > after so many years?! I think this is causing more confusion than >> solving >> > anything, really. Moreover, what are you doing with all the >> > __{le,be}{16,32,64} >> > types in uapi? Anyway, NAK for bpf.2 specifically, and the idea >> generally.. >> >> I'm trying to clarify the manual pages as much as possible, by using >> standard conventions and similar structure all around the pages. Not >> everyone understands kernel conventions. Basically, Zack said very >> much what I had in mind with this patch. > > But then are you also converting, for example, __{le,be}{16,32,64} to plain > uint{16,32,64}_t in the man pages and thus removing contextual information > (or inventing new equivalent types)? > > What about other types exposed to user space like __sum16, __wsum, or > __poll_t > when they are part of a man page, etc? Sorry, I forgot to address that part in my answer. If there's no standard way of naming a type without losing information, we can use the kernel naming. I have no objection to that. These are the only pages that seem to be using those: $ grep -Enr '\b__[a-z][a-z]+[0-9]+' man? man2/clone.2:44:clone, __clone2, clone3 \- create a child process man2/clone.2:1694:.BI "int __clone2(int (*" "fn" ")(void *)," man2/clone.2:1717:.BR __clone2 () man7/sock_diag.7:362: __be16 idiag_sport; man7/sock_diag.7:363: __be16 idiag_dport; man7/sock_diag.7:364: __be32 idiag_src[4]; man7/sock_diag.7:365: __be32 idiag_dst[4]; man7/bpf-helpers.7:514:.B \fBlong bpf_skb_vlan_push(struct sk_buff *\fP\fIskb\fP\fB, __be16\fP \fIvlan_proto\fP\fB, u16\fP \fIvlan_tci\fP\fB)\fP man7/bpf-helpers.7:878:.B \fBs64 bpf_csum_diff(__be32 *\fP\fIfrom\fP\fB, u32\fP \fIfrom_size\fP\fB, __be32 *\fP\fIto\fP\fB, u32\fP \fIto_size\fP\fB, __wsum\fP \fIseed\fP\fB)\fP man7/bpf-helpers.7:949:.B \fBlong bpf_skb_change_proto(struct sk_buff *\fP\fIskb\fP\fB, __be16\fP \fIproto\fP\fB, u64\fP \fIflags\fP\fB)\fP man7/system_data_types.7:473:.I __int128 man7/system_data_types.7:475:.I __int128 man7/system_data_types.7:1584:.I unsigned __int128 man7/system_data_types.7:1586:.I unsigned __int128 $ So sock_diag.7 and bpf-helpers.7 and only a handful of cases. Not much of a problem. I'd keep those untouched. Regards, Alex
On Tue, May 4, 2021 at 4:06 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > I'm trying to clarify the manual pages as much as possible, by using standard conventions and similar structure all around the pages. Not everyone understands kernel conventions. Basically, Zack said very much what I had in mind with this patch. > > But then are you also converting, for example, __{le,be}{16,32,64} to plain > uint{16,32,64}_t in the man pages and thus removing contextual information > (or inventing new equivalent types)? > > What about other types exposed to user space like __sum16, __wsum, or __poll_t > when they are part of a man page, etc? Fields that are specifically in some endianness that isn't (necessarily) the CPU's _should_ be documented as such in the manpage, but I dunno if __{le,be}{16,32,64} as a type name is the ideal way to do it. There is no off-the-shelf notation for this as far as I know. I do not know what __sum16, __wsum, and __poll_t are used for, but I want to remind everyone again that the kernel's concerns are not necessarily user space's concerns and the information that should appear in the manpages is the information that is most relevant to user space programmers. zw
On Tue, May 4, 2021 at 1:33 PM Zack Weinberg <zackw@panix.com> wrote: > the information that should > appear in the manpages is the information that is most relevant to > user space programmers. The bpf programs are compiled for the kernel and run in the kernel. Hence bpf man pages must reflect the kernel. Also there is BTF where type names are part of the verification process. if a user decides to rename a type it will be rejected by the kernel verifier.
From: Florian Weimer > Sent: 04 May 2021 20:46 > > * Alejandro Colomar: > > > The thing is, in all of those threads, the only reasons to avoid > > <stdint.h> types in the kernel (at least, the only explicitly > > mentioned ones) are (a bit simplified, but this is the general idea of > > those threads): > > > > * Possibly breaking something in such a big automated change. > > * Namespace collision with userspace (the C standard allows defining > > uint32_t for nefarious purposes as long as you don't include > > <stdint.h>. POSIX prohibits that, though) > > * Uglier > > __u64 can't be formatted with %llu on all architectures. That's not > true for uint64_t, where you have to use %lu on some architectures to > avoid compiler warnings (and technically undefined behavior). There are > preprocessor macros to get the expected format specifiers, but they are > clunky. I don't know if the problem applies to uint32_t. It does > happen with size_t and ptrdiff_t on 32-bit targets (both vary between > int and long). uint32_t can be 'randomly' either int or long on typical 32bit architectures. The correct way to print it is with eg "xxx %5.4" PRI_u32 " yyy". Typed like ptrdiff_t and size_t exist because of things like the x86 segmented model. Pointers are 32bit (segment and offset), size_t is (probably) 16 bit (nothing can be any bigger), but ptrdiff_t has to be 32bit to contain [-65535 .. 65535]. Kernel code has used u8, u16 and u32 since well before the standards body even thought about fixed width types (and well before Linux). ISTR they were considered as the standard names, but rejected and the current definitions approved. They were probably too worried about code already using u32 for a variable. (Shame they never fixed math.h) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Wed, 5 May 2021, David Laight via Libc-alpha wrote: > > __u64 can't be formatted with %llu on all architectures. That's not > > true for uint64_t, where you have to use %lu on some architectures to > > avoid compiler warnings (and technically undefined behavior). There are > > preprocessor macros to get the expected format specifiers, but they are > > clunky. I don't know if the problem applies to uint32_t. It does > > happen with size_t and ptrdiff_t on 32-bit targets (both vary between > > int and long). > > uint32_t can be 'randomly' either int or long on typical 32bit architectures. > The correct way to print it is with eg "xxx %5.4" PRI_u32 " yyy". C2X adds printf length modifiers such as "w32", so you can use a friendlier %w32u, for example. (Not yet implemented in glibc or in GCC's format checking.)
diff --git a/man2/bpf.2 b/man2/bpf.2 index 6e1ffa198..04b8fbcef 100644 --- a/man2/bpf.2 +++ b/man2/bpf.2 @@ -186,41 +186,40 @@ commands: .PP .in +4n .EX -union bpf_attr { +union [[gnu::aligned(8)]] bpf_attr { struct { /* Used by BPF_MAP_CREATE */ - __u32 map_type; - __u32 key_size; /* size of key in bytes */ - __u32 value_size; /* size of value in bytes */ - __u32 max_entries; /* maximum number of entries - in a map */ + uint32_t map_type; + uint32_t key_size; /* size of key in bytes */ + uint32_t value_size; /* size of value in bytes */ + uint32_t max_entries; /* maximum number of entries + in a map */ }; - struct { /* Used by BPF_MAP_*_ELEM and BPF_MAP_GET_NEXT_KEY - commands */ - __u32 map_fd; - __aligned_u64 key; + struct { /* Used by BPF_MAP_*_ELEM and BPF_MAP_GET_NEXT_KEY commands */ + uint32_t map_fd; + uint64_t alignas(8) key; union { - __aligned_u64 value; - __aligned_u64 next_key; + uint64_t alignas(8) value; + uint64_t alignas(8) next_key; }; - __u64 flags; + uint64_t flags; }; struct { /* Used by BPF_PROG_LOAD */ - __u32 prog_type; - __u32 insn_cnt; - __aligned_u64 insns; /* \(aqconst struct bpf_insn *\(aq */ - __aligned_u64 license; /* \(aqconst char *\(aq */ - __u32 log_level; /* verbosity level of verifier */ - __u32 log_size; /* size of user buffer */ - __aligned_u64 log_buf; /* user supplied \(aqchar *\(aq - buffer */ - __u32 kern_version; - /* checked when prog_type=kprobe - (since Linux 4.1) */ + uint32_t prog_type; + uint32_t insn_cnt; + uint64_t alignas(8) insns; /* \(aqconst struct bpf_insn *\(aq */ + uint64_t alignas(8) license; /* \(aqconst char *\(aq */ + uint32_t log_level; /* verbosity level of verifier */ + uint32_t log_size; /* size of user buffer */ + uint64_t alignas(8) log_buf; /* user supplied \(aqchar *\(aq + buffer */ + uint32_t kern_version; + /* checked when prog_type=kprobe + (since Linux 4.1) */ .\" commit 2541517c32be2531e0da59dfd7efc1ce844644f5 }; -} __attribute__((aligned(8))); +}; .EE .in .\"
Some manual pages are already using C99 syntax for integral types 'uint32_t', but some aren't. There are some using kernel syntax '__u32'. Fix those. Some pages also document attributes, using GNU syntax '__attribute__((xxx))'. Update those to use the shorter and more portable C11 keywords such as 'alignas()' when possible, and C2x syntax '[[gnu::xxx]]' elsewhere, which hasn't been standardized yet, but is already implemented in GCC, and available through either --std=c2x or any of the --std=gnu... options. The standard isn't very clear on how to use alignas() or [[]]-style attributes, so the following link is useful in the case of 'alignas()' and '[[gnu::aligned()]]': <https://stackoverflow.com/q/67271825/6872717> Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com> Cc: LKML <linux-kernel@vger.kernel.org> Cc: glibc <libc-alpha@sourceware.org> Cc: GCC <gcc-patches@gcc.gnu.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: bpf <bpf@vger.kernel.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Zack Weinberg <zackw@panix.com> Cc: Joseph Myers <joseph@codesourcery.com> --- man2/bpf.2 | 49 ++++++++++++++++++++++++------------------------- 1 file changed, 24 insertions(+), 25 deletions(-)