Message ID | 20240905170356.260300-1-andriy.shevchenko@linux.intel.com (mailing list archive) |
---|---|
State | New |
Series | [v1,1/1] x86/percpu: Cast -1 to argument type when comparing in percpu_add_op() |
On Thu, Sep 05, 2024 at 08:03:56PM +0300, Andy Shevchenko wrote:
> When percpu_add_op() is used with unsigned argument, it prevents kernel builds
> with clang, `make W=1` and CONFIG_WERROR=y:
>
> net/ipv4/tcp_output.c:187:3: error: result of comparison of constant -1 with expression of type 'u8' (aka 'unsigned char') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
>   187 |                 NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPACKCOMPRESSED,
>       |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   188 |                               tp->compressed_ack);
>       |                               ~~~~~~~~~~~~~~~~~~~
> ...
> arch/x86/include/asm/percpu.h:238:31: note: expanded from macro 'percpu_add_op'
>   238 |                               ((val) == 1 || (val) == -1)) ?          \
>       |                                ~~~~~ ^  ~~
>
> Fix this by casting -1 to the type of the parameter and then compare.

Any comments? Or can it be taken in?
Andy,

The subject here is not very informative. It explains the "what" of the
patch, but not the "why".

A better subject might have been:

	x86/percpu: Fix clang warning when dealing with unsigned types

> --- a/arch/x86/include/asm/percpu.h
> +++ b/arch/x86/include/asm/percpu.h
> @@ -234,9 +234,10 @@ do {							\
>   */
>  #define percpu_add_op(size, qual, var, val)			\
>  do {								\
> -	const int pao_ID__ = (__builtin_constant_p(val) &&	\
> -			      ((val) == 1 || (val) == -1)) ?	\
> -				(int)(val) : 0;			\
> +	const int pao_ID__ =					\
> +		(__builtin_constant_p(val) &&			\
> +			((val) == 1 ||				\
> +			 (val) == (typeof(val))-1)) ? (int)(val) : 0;	\

This doesn't _look_ right.

Let's assume 'val' is a u8. (u8)-1 is 255, right? So casting the -1
over to a u8 actually changed its value. So the comparison that you
added would actually trigger for 255:

	(val) == (typeof(val))-1))

	255 == (u8)-1
	255 == 255

That's not the end of the world because the pao_ID__ still ends up at
255 and the lower if() falls into the "add" bucket, but it isn't great
for reading the macro. It seems like it basically works on accident.

Wouldn't casting 'val' over to an int be shorter, more readable, not
have that logical false match *and* line up with the cast later on in
the expression?

	const int pao_ID__ = (__builtin_constant_p(val) &&
			      ((val) == 1 || (int)(val) == -1)) ?

				(int)(val) : 0;

Other suggestions to make it more readable would be welcome.

Since I'm making comments, I would have really appreciated some extra
info here like why you are hitting this and nobody else is. This is bog
standard code that everybody compiles. Is clang use _that_ unusual? Or
do most clang users just ignore all the warnings? Or are you using a
bleeding edge version of clang that spits out new warnings that other
clang users aren't seeing?

Another nice thing would have been to say that this produces the exact
same code with and without the patch. Or that you had tested it in
*some* way. It took me a couple of minutes to convince myself that your
version works and doesn't do something silly like a "dec" if you hand in
val==255.
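For readers following the arithmetic, here is a minimal user-space sketch of the two comparisons discussed above. It assumes `uint8_t` as a stand-in for the kernel's `u8`; the function names are hypothetical and exist only for this illustration. It shows that a `u8` promoted to `int` can never equal -1 (which is what clang reports), while converting -1 to `u8` yields 255 and therefore matches `val == 255`.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;	/* user-space stand-in for the kernel's u8 */

/* What `val == -1` computes for a u8: val is promoted to int first. */
static inline int matches_minus_one_after_promotion(u8 val)
{
	int promoted = val;	/* always in 0..255, never negative */

	return promoted == -1;
}

/* What the v1 patch computes: -1 is converted to u8 (255) before comparing. */
static inline int matches_minus_one_cast_to_u8(u8 val)
{
	u8 minus_one = (u8)-1;	/* 255 */

	return val == minus_one;
}

/* Value that pao_ID__ would take in v1 for a constant val of type u8. */
static inline int pao_id_v1(u8 val)
{
	return (val == 1 || matches_minus_one_cast_to_u8(val)) ? (int)val : 0;
}
```

With `val == 255` the v1 check matches, but `pao_id_v1()` still yields 255 rather than -1, so a later test for `pao_ID__ == -1` would not select "dec".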
On Wed, Oct 16, 2024 at 8:45 AM Dave Hansen <dave.hansen@intel.com> wrote:
> Since I'm making comments, I would have really appreciated some extra
> info here like why you are hitting this and nobody else is. This is bog
> standard code that everybody compiles. Is clang use _that_ unusual? Or
> do most clang users just ignore all the warnings? Or are you using a
> bleeding edge version of clang that spits out new warnings that other
> clang users aren't seeing?

Note the W=1 part in the commit message. That's the part people
generally don't test with, but the bots do.

On Thu, Sep 5, 2024 at 10:04 AM Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote:
>
> When percpu_add_op() is used with unsigned argument, it prevents kernel builds
> with clang, `make W=1` and CONFIG_WERROR=y:
On Wed, Oct 16, 2024 at 08:44:56AM -0700, Dave Hansen wrote:
> Andy,
>
> The subject here is not very informative. It explains the "what" of the
> patch, but not the "why".
>
> A better subject might have been:
>
> 	x86/percpu: Fix clang warning when dealing with unsigned types

Thanks, makes sense!

> > --- a/arch/x86/include/asm/percpu.h
> > +++ b/arch/x86/include/asm/percpu.h
> > @@ -234,9 +234,10 @@ do {							\
> >   */
> >  #define percpu_add_op(size, qual, var, val)			\
> >  do {								\
> > -	const int pao_ID__ = (__builtin_constant_p(val) &&	\
> > -			      ((val) == 1 || (val) == -1)) ?	\
> > -				(int)(val) : 0;			\
> > +	const int pao_ID__ =					\
> > +		(__builtin_constant_p(val) &&			\
> > +			((val) == 1 ||				\
> > +			 (val) == (typeof(val))-1)) ? (int)(val) : 0;	\
>
> This doesn't _look_ right.

But it feels right if we really want to supply unsigned types here.
Maybe some more magic is needed (like in the min() case).

> Let's assume 'val' is a u8. (u8)-1 is 255, right? So casting the -1
> over to a u8 actually changed its value. So the comparison that you
> added would actually trigger for 255:
>
> 	(val) == (typeof(val))-1))
>
> 	255 == (u8)-1
> 	255 == 255
>
> That's not the end of the world because the pao_ID__ still ends up at
> 255 and the lower if() falls into the "add" bucket, but it isn't great
> for reading the macro. It seems like it basically works on accident.

> Wouldn't casting 'val' over to an int be shorter, more readable, not
> have that logical false match *and* line up with the cast later on in
> the expression?

Maybe more readable, but wouldn't it be theoretically buggy for u64?
I'm talking about the case when u64 == UINT_MAX, which will be true
in your case and false in mine.

> 	const int pao_ID__ = (__builtin_constant_p(val) &&
> 			      ((val) == 1 || (int)(val) == -1)) ?
>
> 				(int)(val) : 0;
>
> Other suggestions to make it more readable would be welcome.
>
> Since I'm making comments, I would have really appreciated some extra
> info here like why you are hitting this and nobody else is. This is bog
> standard code that everybody compiles. Is clang use _that_ unusual?

Why are you asking me about this? I don't know...

> Or do most clang users just ignore all the warnings?

Same here. I don't know... Both Qs sound rhetorical to me.

> Or are you using a bleeding edge version of clang that spits out new warnings
> that other clang users aren't seeing?

AFAICT it's *not* even close to the bleeding edge. It's the standard
Debian supply.

> Another nice thing would have been to say that this produces the exact
> same code with and without the patch. Or that you had tested it in
> *some* way.

I have run percpu_test in both cases and also checked the code with
`bloat-o-meter` and `cmp -b`. Everything is the same. I even added a
test case for the above-mentioned situation.

> It took me a couple of minutes to convince myself that your
> version works and doesn't do something silly like a "dec" if you hand in
> val==255.

It took me much more to find the best solution, which it appears not
everyone likes :-)

P.S. And as Nick pointed out, it's a simple `make W=1`. What additional
information do you want to see here? Care to provide a template?
On Wed, Oct 16, 2024 at 09:06:13PM +0300, Andy Shevchenko wrote:
> On Wed, Oct 16, 2024 at 08:44:56AM -0700, Dave Hansen wrote:

...

> > This doesn't _look_ right.

See below.

...

> Maybe more readable, but wouldn't it be theoretically buggy for u64?
> I'm talking about the case when u64 == UINT_MAX, which will be true
> in your case and false in mine.
>
> > 	const int pao_ID__ = (__builtin_constant_p(val) &&
> > 			      ((val) == 1 || (int)(val) == -1)) ?
> >
> > 				(int)(val) : 0;

This code _is_ buggy, thanks to my new test case.

[ 66.161375] pcp -1 (0xffffffffffffffff) != expected 4294967295 (0xffffffff)

Hence, I'll send a v2 with the test case and updated Subject.
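The u64 case reported above can be reproduced in user space. The following is a minimal sketch, assuming a 32-bit `int` and the GCC/clang behaviour of keeping the low bits when an out-of-range value is converted to `int` (the C standard leaves that conversion implementation-defined). The function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

typedef uint64_t u64;	/* user-space stand-in for the kernel's u64 */

/* Check from the suggested variant: cast val to int, then compare with -1. */
static inline int is_minus_one_via_int_cast(u64 val)
{
	/* Assumption: GCC and clang keep the low 32 bits here. */
	int truncated = (int)val;

	return truncated == -1;
}

/* Check from the v1 patch: convert -1 to val's own type and compare. */
static inline int is_minus_one_via_own_type(u64 val)
{
	return val == (u64)-1;
}
```

For `val == UINT_MAX` the first check matches and the second does not, which is consistent with the test log: the add of 4294967295 was treated as a decrement and the counter ended up at -1.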
On Wed, Oct 16, 2024 at 08:44:56AM -0700, Dave Hansen wrote:
> Andy,
>
> The subject here is not very informative. It explains the "what" of the
> patch, but not the "why".
>
> A better subject might have been:
>
> 	x86/percpu: Fix clang warning when dealing with unsigned types
>
> > --- a/arch/x86/include/asm/percpu.h
> > +++ b/arch/x86/include/asm/percpu.h
> > @@ -234,9 +234,10 @@ do {							\
> >   */
> >  #define percpu_add_op(size, qual, var, val)			\
> >  do {								\
> > -	const int pao_ID__ = (__builtin_constant_p(val) &&	\
> > -			      ((val) == 1 || (val) == -1)) ?	\
> > -				(int)(val) : 0;			\
> > +	const int pao_ID__ =					\
> > +		(__builtin_constant_p(val) &&			\
> > +			((val) == 1 ||				\
> > +			 (val) == (typeof(val))-1)) ? (int)(val) : 0;	\
>
> This doesn't _look_ right.
>
> Let's assume 'val' is a u8. (u8)-1 is 255, right? So casting the -1
> over to a u8 actually changed its value. So the comparison that you
> added would actually trigger for 255:
>
> 	(val) == (typeof(val))-1))
>
> 	255 == (u8)-1
> 	255 == 255

Which is correct, no? Adding 255 to a u8 is the same as decrementing it
by one.

> That's not the end of the world because the pao_ID__ still ends up at
> 255 and the lower if() falls into the "add" bucket, but it isn't great
> for reading the macro. It seems like it basically works on accident.

You're correct in that it does not achieve the desired result (in all
cases). But this is because (int)(val) will never turn into -1 when
val == 255.

> Wouldn't casting 'val' over to an int be shorter, more readable, not
> have that logical false match *and* line up with the cast later on in
> the expression?
>
> 	const int pao_ID__ = (__builtin_constant_p(val) &&
> 			      ((val) == 1 || (int)(val) == -1)) ?
>
> 				(int)(val) : 0;
>
> Other suggestions to make it more readable would be welcome.

This is very very wrong. No u8 value when cast to int will ever equal
-1. Notably (int)(u8)255 == 255.

> Since I'm making comments, I would have really appreciated some extra
> info here like why you are hitting this and nobody else is. This is bog
> standard code that everybody compiles. Is clang use _that_ unusual? Or
> do most clang users just ignore all the warnings? Or are you using a
> bleeding edge version of clang that spits out new warnings that other
> clang users aren't seeing?

The code as is, is wrong. I don't think we'll ever end up in the dec
case for 'short' unsigned types. Clang is just clever enough to realize
this and issues a warning.

Something like so might work:

	const int pao_ID__ = __builtin_constant_p(val) ?
				((typeof(var))(val) == 1 ? 1 :
				 ((typeof(var))(val) == (typeof(var))-1 ? -1 : 0 )) : 0;

This should get, assuming typeof(var) is u8, a dec for both 255 and -1.
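The classification proposed above can be tried outside the kernel. This is a sketch only: `PAO_CLASSIFY` is a hypothetical name, `__typeof__` is used in place of the kernel's `typeof` so it builds in strict ISO modes with GCC or clang, and the result encodes 1 for "inc", -1 for "dec" and 0 for the generic "add" path.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;	/* user-space stand-ins for the kernel types */
typedef uint64_t u64;

/*
 * Hypothetical helper, not a kernel macro. The constant is converted to the
 * type of the target variable before it is compared with 1 and with -1.
 */
#define PAO_CLASSIFY(var, val)						\
	(__builtin_constant_p(val) ?					\
	 ((__typeof__(var))(val) == 1 ? 1 :				\
	  ((__typeof__(var))(val) == (__typeof__(var))-1 ? -1 : 0)) : 0)
```

With a `u8` target both 255 and -1 classify as "dec", while a `u64` target given `0xffffffffu` falls through to "add", which is the case the `(int)` cast got wrong.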
On 10/16/24 11:20, Andy Shevchenko wrote:
>> Maybe more readable, but wouldn't it be theoretically buggy for u64?
>> I'm talking about the case when u64 == UINT_MAX, which will be true
>> in your case and false in mine.
>>
>>> 	const int pao_ID__ = (__builtin_constant_p(val) &&
>>> 			      ((val) == 1 || (int)(val) == -1)) ?
>>>
>>> 				(int)(val) : 0;
> This code _is_ buggy, thanks to my new test case.
>
> [ 66.161375] pcp -1 (0xffffffffffffffff) != expected 4294967295 (0xffffffff)

Thanks for pointing that out Andy (and Peter too)!
On 10/16/24 12:20, Peter Zijlstra wrote:
> The code as is, is wrong, I don't think we'll ever end up in the dec
> case for 'short' unsigned types. Clang is just clever enough to realize
> this and issues a warning.

Ahhh, that's the key to it. Thanks, Peter.

> Something like so might work:
>
> 	const int pao_ID__ = __builtin_constant_p(val) ?
> 				((typeof(var))(val) == 1 ? 1 :
> 				 ((typeof(var))(val) == (typeof(var))-1 ? -1 : 0 )) : 0;

Would anybody hate if we broke this up a bit, like:

	const typeof(var) _val = val;
	const int paoconst = __builtin_constant_p(val);
	const int paoinc   = paoconst && ((_val) == 1);
	const int paodec   = paoconst && ((_val) == (typeof(var))-1);

and then did

	if (paoinc)
		percpu_unary_op(size, qual, "inc", var);
	...

Or even:

	#define PAOINC	1234

	const int pao_ID__ = __builtin_constant_p(val) ?
				((typeof(var))(val) == 1 ? PAOINC :
	...
	if (PAOINC)
		percpu_unary_op(size, qual, "inc", var);

Since the 1 and -1 ternary results end up just being magic numbers
anyway. Otherwise that pao_ID__ expression is pretty gnarly.
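As a rough illustration of the broken-up form, the sketch below applies the same three flags to an ordinary variable instead of a per-CPU one. `counter_add` is a hypothetical name, and plain `++`, `--` and `+=` stand in for the "inc", "dec" and "add" instruction variants, so it shows only the selection logic and not the generated code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;	/* user-space stand-in for the kernel's u8 */

/* Hypothetical user-space analogue of the split-up percpu_add_op() logic. */
#define counter_add(var, val)						\
do {									\
	const __typeof__(var) _val = (val);				\
	const int paoconst = __builtin_constant_p(val);			\
	const int paoinc = paoconst && (_val == 1);			\
	const int paodec = paoconst && (_val == (__typeof__(var))-1);	\
									\
	if (paoinc)							\
		(var)++;		/* stands in for "inc" */	\
	else if (paodec)						\
		(var)--;		/* stands in for "dec" */	\
	else								\
		(var) += _val;		/* stands in for "add" */	\
} while (0)
```

Whichever branch is taken, the stored result is the same; the flags only decide which operation performs the update.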
On Wed, Oct 16, 2024 at 12:44:18PM -0700, Dave Hansen wrote:

> Would anybody hate if we broke this up a bit, like:
>
> 	const typeof(var) _val = val;
> 	const int paoconst = __builtin_constant_p(val);
> 	const int paoinc   = paoconst && ((_val) == 1);
> 	const int paodec   = paoconst && ((_val) == (typeof(var))-1);
>
> and then did
>
> 	if (paoinc)
> 		percpu_unary_op(size, qual, "inc", var);
> 	...

I think that is an overall improvement. Proceed! :-)
On Thu, Oct 17, 2024 at 08:18:59PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 16, 2024 at 12:44:18PM -0700, Dave Hansen wrote:
>
> > Would anybody hate if we broke this up a bit, like:
> >
> > 	const typeof(var) _val = val;
> > 	const int paoconst = __builtin_constant_p(val);
> > 	const int paoinc   = paoconst && ((_val) == 1);
> > 	const int paodec   = paoconst && ((_val) == (typeof(var))-1);
> >
> > and then did
> >
> > 	if (paoinc)
> > 		percpu_unary_op(size, qual, "inc", var);
> > 	...
>
> I think that is an overall improvement. Proceed! :-)

Wouldn't typeof(var) be a regression? The val can be wider (in terms of
bits) than var, and cutting it like this might give a different result
depending on the signedness.

TL;DR: Whatever is done, please add more (corner) test cases to
percpu_test.c.
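One corner of this concern can be checked in isolation. The sketch below covers only an unsigned 8-bit target, where arithmetic is modulo 256, and compares adding a wide value directly with adding its truncation to `u8`. The function names are hypothetical, and the sketch says nothing about signed targets, where converting an out-of-range value is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t u8;	/* user-space stand-in for the kernel's u8 */

/* Add the full-width value, then store the result back into a u8. */
static inline u8 add_full_width(u8 var, uint64_t val)
{
	return (u8)(var + val);
}

/* Truncate the value to the target type first, as typeof(var) would. */
static inline u8 add_truncated(u8 var, uint64_t val)
{
	return (u8)(var + (u8)val);
}
```

For an unsigned target the two agree, for example with `val == 0x1ff`, whose low byte is 255 and which therefore behaves like a decrement. Test cases of this kind are what the request for more corner cases in percpu_test.c would cover.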
On 10/17/24 11:18, Peter Zijlstra wrote:
> On Wed, Oct 16, 2024 at 12:44:18PM -0700, Dave Hansen wrote:
>
>> Would anybody hate if we broke this up a bit, like:
>>
>> 	const typeof(var) _val = val;
>> 	const int paoconst = __builtin_constant_p(val);
>> 	const int paoinc   = paoconst && ((_val) == 1);
>> 	const int paodec   = paoconst && ((_val) == (typeof(var))-1);
>>
>> and then did
>>
>> 	if (paoinc)
>> 		percpu_unary_op(size, qual, "inc", var);
>> 	...
> I think that is an overall improvement. Proceed!
On Tue, 22 Oct 2024, Dave Hansen wrote:

> So I think Peter's version was the best. It shuts up clang and also
> preserves the existing (good) gcc 'sub' behavior. I'll send it out for
> real in a bit, but I'm thinking of something like the attached patch.

The desired behavior is a "dec". "sub" has a longer op code AFAICT.
On Tue, Oct 22, 2024 at 12:53:01PM -0700, Dave Hansen wrote:
> On 10/17/24 11:18, Peter Zijlstra wrote:
> > On Wed, Oct 16, 2024 at 12:44:18PM -0700, Dave Hansen wrote:

...

> >> Would anybody hate if we broke this up a bit, like:
> >>
> >> 	const typeof(var) _val = val;
> >> 	const int paoconst = __builtin_constant_p(val);
> >> 	const int paoinc   = paoconst && ((_val) == 1);
> >> 	const int paodec   = paoconst && ((_val) == (typeof(var))-1);
> >>
> >> and then did
> >>
> >> 	if (paoinc)
> >> 		percpu_unary_op(size, qual, "inc", var);
> >> 	...
> > I think that is an overall improvement. Proceed!
On 10/22/24 16:24, Christoph Lameter (Ampere) wrote:
> On Tue, 22 Oct 2024, Dave Hansen wrote:
>
>> So I think Peter's version was the best. It shuts up clang and also
>> preserves the existing (good) gcc 'sub' behavior. I'll send it out for
>> real in a bit, but I'm thinking of something like the attached patch.
> The desired behavior is a "dec". "sub" has a longer op code AFAICT.

Gah, yes, of course. I misspoke.

We want "inc" and "dec" for +1 and -1. "add" and "sub" are heftier and
get used for everything else.
On 10/23/24 10:15, Dave Hansen wrote:
> On 10/22/24 16:24, Christoph Lameter (Ampere) wrote:
>> On Tue, 22 Oct 2024, Dave Hansen wrote:
>>
>>> So I think Peter's version was the best. It shuts up clang and also
>>> preserves the existing (good) gcc 'sub' behavior. I'll send it out for
>>> real in a bit, but I'm thinking of something like the attached patch.
>> The desired behavior is a "dec". "sub" has a longer op code AFAICT.
>
> Gah, yes, of course. I misspoke.
>
> We want "inc" and "dec" for +1 and -1. "add" and "sub" are heftier and
> get used for everything else.

Do we really? I don't know if there are any microarchitectures where the
partial register update still matters. It is only one byte difference.

	-hpa
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index c55a79d5feae..e525cd85f999 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -234,9 +234,10 @@ do {									\
  */
 #define percpu_add_op(size, qual, var, val)				\
 do {									\
-	const int pao_ID__ = (__builtin_constant_p(val) &&		\
-			      ((val) == 1 || (val) == -1)) ?		\
-				(int)(val) : 0;				\
+	const int pao_ID__ =						\
+		(__builtin_constant_p(val) &&				\
+			((val) == 1 ||					\
+			 (val) == (typeof(val))-1)) ? (int)(val) : 0;	\
 									\
 	if (0) {							\
 		typeof(var) pao_tmp__;					\
When percpu_add_op() is used with unsigned argument, it prevents kernel builds
with clang, `make W=1` and CONFIG_WERROR=y:

net/ipv4/tcp_output.c:187:3: error: result of comparison of constant -1 with expression of type 'u8' (aka 'unsigned char') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
  187 |                 NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPACKCOMPRESSED,
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  188 |                               tp->compressed_ack);
      |                               ~~~~~~~~~~~~~~~~~~~
...
arch/x86/include/asm/percpu.h:238:31: note: expanded from macro 'percpu_add_op'
  238 |                               ((val) == 1 || (val) == -1)) ?          \
      |                                ~~~~~ ^  ~~

Fix this by casting -1 to the type of the parameter and then compare.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 arch/x86/include/asm/percpu.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)