Message ID | 1452547845-12039-1-git-send-email-chengang@emindsoft.com.cn (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jan 12, 2016 at 05:30:45AM +0800, chengang@emindsoft.com.cn wrote: > From: Chen Gang <gang.chen.5i5j@gmail.com> > > Use bool type for all functions which return boolean value. It will not > only let code clearer, but also sometimes let gcc produce better code. What's the point of this chunk? > static enum d_walk_ret check_mount(void *data, struct dentry *dentry) > { > - int *ret = data; > + bool *ret = data; > if (d_mountpoint(dentry)) { > - *ret = 1; > + *ret = true; > return D_WALK_QUIT; > } > return D_WALK_CONTINUE; You are replacing a 1-word store with 1-byte store; if anything, that's more likely to yield _worse_ code, not better one. > -static inline int d_unhashed(const struct dentry *dentry) > +static inline bool d_unhashed(const struct dentry *dentry) > { > return hlist_bl_unhashed(&dentry->d_hash); > } > > -static inline int d_unlinked(const struct dentry *dentry) > +static inline bool d_unlinked(const struct dentry *dentry) > { > return d_unhashed(dentry) && !IS_ROOT(dentry); > } > -static inline int simple_positive(struct dentry *dentry) > +static inline bool simple_positive(struct dentry *dentry) > { > return d_really_is_positive(dentry) && !d_unhashed(dentry); > } And these three are harmless, but completely pointless... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Al Viro <viro@ZenIV.linux.org.uk> wrote: > > -static inline int d_unhashed(const struct dentry *dentry) > > +static inline bool d_unhashed(const struct dentry *dentry) > > { > > return hlist_bl_unhashed(&dentry->d_hash); > > } > > > > -static inline int d_unlinked(const struct dentry *dentry) > > +static inline bool d_unlinked(const struct dentry *dentry) > > { > > return d_unhashed(dentry) && !IS_ROOT(dentry); > > } > > > -static inline int simple_positive(struct dentry *dentry) > > +static inline bool simple_positive(struct dentry *dentry) > > { > > return d_really_is_positive(dentry) && !d_unhashed(dentry); > > } > > And these three are harmless, but completely pointless... gcc-5 does actually produce slightly smaller code when int returns are replaced by bools under some circumstances within the kernel. David -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jan 12, 2016 at 12:33:37AM +0000, David Howells wrote: > gcc-5 does actually produce slightly smaller code when int returns are > replaced by bools under some circumstances within the kernel. For inlines? Are you serious? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1/12/16 06:51, Al Viro wrote: > On Tue, Jan 12, 2016 at 05:30:45AM +0800, chengang@emindsoft.com.cn wrote: >> From: Chen Gang <gang.chen.5i5j@gmail.com> >> >> Use bool type for all functions which return boolean value. It will not >> only let code clearer, but also sometimes let gcc produce better code. > > What's the point of this chunk? > I'll explain it below this mail, please check. >> static enum d_walk_ret check_mount(void *data, struct dentry *dentry) >> { >> - int *ret = data; >> + bool *ret = data; >> if (d_mountpoint(dentry)) { >> - *ret = 1; >> + *ret = true; >> return D_WALK_QUIT; >> } >> return D_WALK_CONTINUE; > > You are replacing a 1-word store with 1-byte store; if anything, that's more > likely to yield _worse_ code, not better one. > For me, it really generates a little better code: - Both 1-word store and 1-byte store are 1 instruction, normally, they have the same execution speed (although it is not quite precise). - But 1-byte store instruction has short length under CISC archs, which can generate a little better code globally. - For most of archs, 1-word store can process bytes nonalignment cases, for check_mount() individually, the parameter data may be not word alignment, which may cause the 1-word store slower than 1-byte store. The related objdump is below: origin: 00000000 <check_mount>: 0: 8b 12 mov (%edx),%edx 2: 81 e2 00 00 01 00 and $0x10000,%edx 8: 74 16 je 20 <check_mount+0x20> a: c7 00 01 00 00 00 movl $0x1,(%eax) 10: b8 01 00 00 00 mov $0x1,%eax 15: c3 ret 16: 8d 76 00 lea 0x0(%esi),%esi 19: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi 20: 31 c0 xor %eax,%eax 22: c3 ret 23: 8d b6 00 00 00 00 lea 0x0(%esi),%esi 29: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi new: 00000000 <check_mount>: 0: 8b 12 mov (%edx),%edx 2: 81 e2 00 00 01 00 and $0x10000,%edx 8: 74 0e je 18 <check_mount+0x18> a: c6 00 01 movb $0x1,(%eax) d: b8 01 00 00 00 mov $0x1,%eax 12: c3 ret 13: 90 nop 14: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi 18: 31 c0 xor %eax,%eax 1a: c3 ret 1b: 90 nop 1c: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi [root@localhost fs]# gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-ana/configure Thread model: posix gcc version 6.0.0 20151121 (experimental) (GCC) >> -static inline int d_unhashed(const struct dentry *dentry) >> +static inline bool d_unhashed(const struct dentry *dentry) >> { >> return hlist_bl_unhashed(&dentry->d_hash); >> } >> >> -static inline int d_unlinked(const struct dentry *dentry) >> +static inline bool d_unlinked(const struct dentry *dentry) >> { >> return d_unhashed(dentry) && !IS_ROOT(dentry); >> } > >> -static inline int simple_positive(struct dentry *dentry) >> +static inline bool simple_positive(struct dentry *dentry) >> { >> return d_really_is_positive(dentry) && !d_unhashed(dentry); >> } > > And these three are harmless, but completely pointless... > For performance, please check the original reply above this mail. For me, bool can make the code a little simpler and clearer: - int can express more things: error code, handler, count ... So if we really only use one boolean variable, bool type is more clearer (it is only for boolean). - The old ANSI C compiler may not support bool type, so we have to use int type instead of. But if one header/source file has already used bool type in some part, the whole file need use bool type too. Thanks.
On Wed, Jan 13, 2016 at 05:42:20AM +0800, Chen Gang wrote: > For me, it really generates a little better code: > > - Both 1-word store and 1-byte store are 1 instruction, normally, they > have the same execution speed (although it is not quite precise). > > - But 1-byte store instruction has short length under CISC archs, which > can generate a little better code globally. > > - For most of archs, 1-word store can process bytes nonalignment cases, > for check_mount() individually, the parameter data may be not word > alignment, which may cause the 1-word store slower than 1-byte store. What the hell do you mean, unaligned? It's given an address of local variable of type int; it _will_ be aligned, or the compiler is FUBAR. As for the inlines... frankly, if gcc generates a different code from having replaced int with bool in those, it's time to do something very nasty to gcc developers. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1/13/16 06:21, Al Viro wrote: > On Wed, Jan 13, 2016 at 05:42:20AM +0800, Chen Gang wrote: > >> For me, it really generates a little better code: >> >> - Both 1-word store and 1-byte store are 1 instruction, normally, they >> have the same execution speed (although it is not quite precise). >> >> - But 1-byte store instruction has short length under CISC archs, which >> can generate a little better code globally. >> >> - For most of archs, 1-word store can process bytes nonalignment cases, >> for check_mount() individually, the parameter data may be not word >> alignment, which may cause the 1-word store slower than 1-byte store. > > What the hell do you mean, unaligned? It's given an address of local > variable of type int; it _will_ be aligned, or the compiler is FUBAR. > I guess you misunderstand my meaning, in our case, it should not happen, so I say "for check_mount() individually". For 32 bits store instruction, we need consider about the byte alignment. > As for the inlines... frankly, if gcc generates a different code from having > replaced int with bool in those, it's time to do something very nasty to > gcc developers. > Could you provide the related proof? Thanks.
On Thu, Jan 14, 2016 at 06:39:53AM +0800, Chen Gang wrote: > > As for the inlines... frankly, if gcc generates a different code from having > > replaced int with bool in those, it's time to do something very nasty to > > gcc developers. > > > > Could you provide the related proof? static inline _Bool f(.....) { return <int expression>; } ... if (f(.....)) should generate the code identical to if ((_Bool)<int expression>) which, in turn, should generate the code identical to if (<int expression> != 0) and if (<int expression>) Neither explicit nor implicit conversion to _Bool (the former by the explicit cast, the latter - by declaring f() to return _Bool) matters at all when the damn thing is inlined in a condition context. Conversion to _Bool is equivalent to comparison with 0, and so is the use in condition of if() and friends. For something not inlined you might get different code generated due to a difference in calling sequences of _Bool(...) and int(...); for inlined case having one of those variants produce a better code means that compiler has managed to miss some trivial optimization in all other variants. And I'm yet to see any proof that gcc *does* fuck up in that fashion. It might - dumb bugs happen to everyone, but I would not assume that they'd managed to do something that bogys without experimental evidence. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1/14/16 06:54, Al Viro wrote: > On Thu, Jan 14, 2016 at 06:39:53AM +0800, Chen Gang wrote: > >>> As for the inlines... frankly, if gcc generates a different code from having >>> replaced int with bool in those, it's time to do something very nasty to >>> gcc developers. >>> >> >> Could you provide the related proof? > > static inline _Bool f(.....) > { > return <int expression>; > } > > ... > if (f(.....)) > For me, your case above isn't suitable for using bool. Please check this patch, there is no any cases like you said above. - For d_unhashed() which return hlist_bl_unhashed(), it seems like your case, but in fact hlist_bl_unhashed() also need return bool (which I shall send another patch for, next). - And all the other changes of this patch are all for real, pure bool functions. Thanks. > should generate the code identical to > if ((_Bool)<int expression>) > which, in turn, should generate the code identical to > if (<int expression> != 0) > and > if (<int expression>) > > Neither explicit nor implicit conversion to _Bool (the former by the explicit > cast, the latter - by declaring f() to return _Bool) matters at all when the > damn thing is inlined in a condition context. Conversion to _Bool is > equivalent to comparison with 0, and so is the use in condition of if() and > friends. > > For something not inlined you might get different code generated due to a > difference in calling sequences of _Bool(...) and int(...); for inlined > case having one of those variants produce a better code means that compiler > has managed to miss some trivial optimization in all other variants. > > And I'm yet to see any proof that gcc *does* fuck up in that fashion. It > might - dumb bugs happen to everyone, but I would not assume that they'd > managed to do something that bogys without experimental evidence. > For your cases, what you said sounds OK to me (although I am not quite sure what you said above whether precise or not). Thanks.
Hello all: Is this patch OK? shall I send the other patch based on this one? (the other patch is v3 trivial patch for include/linux/dcache.h). And sorry for replying late: the last week, I was not in Beijing, had to be busy for analyzing a Linux kernel usb related issue for my company's customer in Guangzhou (but at last, I guess, it is not kernel issue). Thanks. On 1/14/16 23:39, Chen Gang wrote: > > On 1/14/16 06:54, Al Viro wrote: >> On Thu, Jan 14, 2016 at 06:39:53AM +0800, Chen Gang wrote: >> >>>> As for the inlines... frankly, if gcc generates a different code from having >>>> replaced int with bool in those, it's time to do something very nasty to >>>> gcc developers. >>>> >>> >>> Could you provide the related proof? >> >> static inline _Bool f(.....) >> { >> return <int expression>; >> } >> >> ... >> if (f(.....)) >> > > For me, your case above isn't suitable for using bool. Please check this > patch, there is no any cases like you said above. > > - For d_unhashed() which return hlist_bl_unhashed(), it seems like your > case, but in fact hlist_bl_unhashed() also need return bool (which I > shall send another patch for, next). > > - And all the other changes of this patch are all for real, pure bool > functions. > > Thanks. > >> should generate the code identical to >> if ((_Bool)<int expression>) >> which, in turn, should generate the code identical to >> if (<int expression> != 0) >> and >> if (<int expression>) >> >> Neither explicit nor implicit conversion to _Bool (the former by the explicit >> cast, the latter - by declaring f() to return _Bool) matters at all when the >> damn thing is inlined in a condition context. Conversion to _Bool is >> equivalent to comparison with 0, and so is the use in condition of if() and >> friends. >> >> For something not inlined you might get different code generated due to a >> difference in calling sequences of _Bool(...) and int(...); for inlined >> case having one of those variants produce a better code means that compiler >> has managed to miss some trivial optimization in all other variants. >> >> And I'm yet to see any proof that gcc *does* fuck up in that fashion. It >> might - dumb bugs happen to everyone, but I would not assume that they'd >> managed to do something that bogys without experimental evidence. >> > > For your cases, what you said sounds OK to me (although I am not quite > sure what you said above whether precise or not). > > Thanks. >
On Mon, Jan 25, 2016 at 05:19:43AM +0800, Chen Gang wrote: > Hello all: > > Is this patch OK? shall I send the other patch based on this one? (the > other patch is v3 trivial patch for include/linux/dcache.h). > > And sorry for replying late: the last week, I was not in Beijing, had to > be busy for analyzing a Linux kernel usb related issue for my company's > customer in Guangzhou (but at last, I guess, it is not kernel issue). Again, do you have _any_ evidence of improved code generation with that patch? Because if you do, I would really like to see it, so I could file bugs against gcc optimizer. Your impression of what _Bool is and what semantics does it have appears to be rather different from that described in C99, but that's a secondary issue - first and foremost, on which .config and with which gcc version do you see improvements from that change? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1/25/16 05:27, Al Viro wrote: > > Again, do you have _any_ evidence of improved code generation with that > patch? Because if you do, I would really like to see it, so I could file > bugs against gcc optimizer. > > Your impression of what _Bool is and what semantics does it have appears > to be rather different from that described in C99, but that's a secondary > issue - first and foremost, on which .config and with which gcc version > do you see improvements from that change? > For our case, the check_mount function have smaller size under x86_64 (movl for int, movb for bool, movl is longer than movb). The related objdump is below, welcome any ideas, suggestions, and discussions for it. origin (for int): 00000000 <check_mount>: 0: 8b 12 mov (%edx),%edx 2: 81 e2 00 00 01 00 and $0x10000,%edx 8: 74 16 je 20 <check_mount+0x20> a: c7 00 01 00 00 00 movl $0x1,(%eax) 10: b8 01 00 00 00 mov $0x1,%eax 15: c3 ret 16: 8d 76 00 lea 0x0(%esi),%esi 19: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi 20: 31 c0 xor %eax,%eax 22: c3 ret 23: 8d b6 00 00 00 00 lea 0x0(%esi),%esi 29: 8d bc 27 00 00 00 00 lea 0x0(%edi,%eiz,1),%edi new (for bool): 00000000 <check_mount>: 0: 8b 12 mov (%edx),%edx 2: 81 e2 00 00 01 00 and $0x10000,%edx 8: 74 0e je 18 <check_mount+0x18> a: c6 00 01 movb $0x1,(%eax) d: b8 01 00 00 00 mov $0x1,%eax 12: c3 ret 13: 90 nop 14: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi 18: 31 c0 xor %eax,%eax 1a: c3 ret 1b: 90 nop 1c: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi [root@localhost fs]# gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-ana/configure Thread model: posix gcc version 6.0.0 20151121 (experimental) (GCC)
diff --git a/fs/dcache.c b/fs/dcache.c index b4539e8..7701479 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1281,9 +1281,9 @@ rename_retry: static enum d_walk_ret check_mount(void *data, struct dentry *dentry) { - int *ret = data; + bool *ret = data; if (d_mountpoint(dentry)) { - *ret = 1; + *ret = true; return D_WALK_QUIT; } return D_WALK_CONTINUE; @@ -1296,9 +1296,9 @@ static enum d_walk_ret check_mount(void *data, struct dentry *dentry) * Return true if the parent or its subdirectories contain * a mount point */ -int have_submounts(struct dentry *parent) +bool have_submounts(struct dentry *parent) { - int ret = 0; + bool ret = false; d_walk(parent, &ret, check_mount, NULL); diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 7781ce11..880a41c 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -266,7 +266,7 @@ extern struct dentry *d_find_alias(struct inode *); extern void d_prune_aliases(struct inode *); /* test whether we have any submounts in a subdir tree */ -extern int have_submounts(struct dentry *); +extern bool have_submounts(struct dentry *); /* * This adds the entry to the hash queues. @@ -370,12 +370,12 @@ extern struct dentry *dget_parent(struct dentry *dentry); * Returns true if the dentry passed is not currently hashed. */ -static inline int d_unhashed(const struct dentry *dentry) +static inline bool d_unhashed(const struct dentry *dentry) { return hlist_bl_unhashed(&dentry->d_hash); } -static inline int d_unlinked(const struct dentry *dentry) +static inline bool d_unlinked(const struct dentry *dentry) { return d_unhashed(dentry) && !IS_ROOT(dentry); } @@ -508,7 +508,7 @@ static inline bool d_really_is_positive(const struct dentry *dentry) return dentry->d_inode != NULL; } -static inline int simple_positive(struct dentry *dentry) +static inline bool simple_positive(struct dentry *dentry) { return d_really_is_positive(dentry) && !d_unhashed(dentry); }