[GIT,PULL] kbuild updates for v4.7-rc1

Message ID	5345273.JScMX9oohG@wuerfel (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-kbuild-owner@kernel.org> From: Arnd Bergmann <arnd@arndb.de> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Marek <mmarek@suse.com>, mussitantesmortem@gmail.com, nicolas.ferre@atmel.com, Nicolas Pitre <nicolas.pitre@linaro.org>, robert.jarzmik@free.fr, yamada.masahiro@socionext.com, Linux Kbuild mailing list <linux-kbuild@vger.kernel.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Bob Peterson <rpeterso@redhat.com> Subject: Re: [GIT PULL] kbuild updates for v4.7-rc1 Date: Fri, 27 May 2016 14:33:05 +0200 Message-ID: <5345273.JScMX9oohG@wuerfel> User-Agent: KMail/5.1.3 (Linux/4.4.0-22-generic; KDE/5.18.0; x86_64; ; ) In-Reply-To: <CA+55aFzQxws_ipKZhV9cFrJdV8CvKy=m9AxGe5cjM=FYceNbZQ@mail.gmail.com> References: <20160526203346.GA4734@pobox.suse.cz> <CA+55aFzQxws_ipKZhV9cFrJdV8CvKy=m9AxGe5cjM=FYceNbZQ@mail.gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8BIT Content-Type: text/plain; charset="UTF-8" Sender: linux-kbuild-owner@vger.kernel.org Precedence: bulk

Arnd Bergmann May 27, 2016, 12:33 p.m. UTC

On Thursday, May 26, 2016 10:26:50 PM CEST Linus Torvalds wrote:
> On Thu, May 26, 2016 at 1:33 PM, Michal Marek <mmarek@suse.com> wrote:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git kbuild
> 
> This pull results in new warnings.
> 
> I get new "may be uninitialized" warnings now for me allmodconfig
> build, and while I didn't look at them all, the one I looked at was
> just entirely crap:
> 
>    fs/gfs2/dir.c: In function ‘dir_split_leaf.isra.16’:
>    fs/gfs2/dir.c:1021:8: warning: ‘leaf_no’ may be used uninitialized
> in this function [-Wmaybe-uninitialized]
>      error = get_leaf(dip, leaf_no, &obh);
>            ^
> 
> yeah no, leaf_no is initialized a few lines up by
> 
>         error = get_leaf_nr(dip, index, &leaf_no);
> 
> and the fact that gcc can't follow the trivial  error handling is not
> our fault. It looks *so* trivial that I wonder why.
> 
> That said, I don't see exactly what in the pull request causes this.
>
> My reading of the diff seems to say that you are actually adding
> *more* cases of -Wno-maybe-uninitialized, not less.

I put the explanation for this into the individual patch commit logs.

> So I suspect it's almost accidental in just how the Kconfig option
> CC_OPTIMIZE_FOR_PERFORMANCE happened, which in turn probably just
> changes the options for "make allmiodconfig", and it now picks a
> non-size-optimized build that always showed those warnings and I just
> didn't see them.

Correct, the point of the CC_OPTIMIZE_FOR_PERFORMANCE change is that
we now see the -Wmaybe-uninitialized warnings by default now. I also
submitted patches for all warnings I saw a while ago, but some only
happen with certain compiler versions that I don't regularly test on,
or are architecture specific.

The specific gfs2 warning is one that I fixed before but it came
back after some back-and-forth discussions, and I forgot to post
it again but kept it in my tree of backlog warning fixes, my
mistake. 

I keep running into new valid Wmaybe-uninitialized warnings on
randconfig builds, e.g. https://lkml.org/lkml/2016/5/27/99 today,
and having them enabled in allmodconfig should make it easier for
patch authors to catch them first.

I don't currently get any other -Wmaybe-uninialized warnings using
gcc-6 on ARM, but I get two warnings on x86, in
drivers/net/wireless/wl3501_cs.c and
drivers/net/wireless/intel/iwlwifi/mvm/nvm.c

I'll have a look at those.

	Arnd

8<----
[PATCH] gfs2: avoid maybe-uninitialized warning again

gcc can not always figure out which code is only used in an error
condition an assignment to indirect argument is only done after the
use of IS_ERR() catches errors. In gfs2, this results in a warning
about correct code:

fs/gfs2/dir.c: In function 'get_first_leaf':
fs/gfs2/dir.c:802:9: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/gfs2/dir.c: In function 'dir_split_leaf':
fs/gfs2/dir.c:1021:8: warning: 'leaf_no' may be used uninitialized in this function [-Wmaybe-uninitialized]

I managed to work around this earlier using the IS_ERR_VALUE(),
but unfortunately that did not catch all configurations.
This new patch reworks the function in question to us
PTR_ERR_OR_ZERO() instead, which makes it more obvious to the
compiler what happens, and should also result in better object
code.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 67893f12e537 ("gfs2: avoid uninitialized variable warning")

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds May 27, 2016, 5:23 p.m. UTC | #1

On Fri, May 27, 2016 at 5:33 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
>
> gcc can not always figure out which code is only used in an error
> condition an assignment to indirect argument is only done after the
> use of IS_ERR() catches errors. In gfs2, this results in a warning
> about correct code:

I figured out why gcc warns.

The code is correct, but gcc isn't able to tell that, because when we
return "int" for the error case from get_leaf_nr(), gcc thinks that
that *could* be a zero.

This code:

        if (IS_ERR(hash))
                return PTR_ERR(hash);

is obviously "return non-zero" to a kernel developer (because that's
how our error codes work), but to a compiler that "return PTR_ERR()"
ends up casting a "long" to an "int" return value.

And the compiler thinks that while the "long" was clearly non-zero and
negative, the "int" it cast things to might be zero.

Yes, a very smart compiler might have figured out that the
IS_ERR_VALUE check also means that the value will be non-zero in "int"
as well, but that actually takes value range analysis, not just "we
already ttested it".

And yes, the error goes away if I turn the "int" into "long" into the
affected codepaths.

I'm not entirely happy about your patch, because I think it makes the
code worse. You fix it by effectively making gcc test the resulting
value after the type conversion. I'd really prefer to figure out some
way to let gcc actually understand this error handling model. Because
it's not just the warning, it also generates unnecessary double tests
(first comparing a 64-bit value against the error range, and then
comparing the truncated 32-bit error code against zero).

Making errors be "long" does fix not just the warning but also the
code generation. But we've traditionally used "int" for error returns,
even though the system call path then again turns it into "long" (for
somewhat related reasons, actually - we want to make sure that the
"int" error code is valid in all 64 bits).

So we *could* just start encouraging people to use "long" for error
handling. It would fix warnings and improve the results. But let me
think about this a bit more.

                    Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds May 27, 2016, 6:28 p.m. UTC | #2

On Fri, May 27, 2016 at 10:23 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This code:
>
>         if (IS_ERR(hash))
>                 return PTR_ERR(hash);
>
> is obviously "return non-zero" to a kernel developer (because that's
> how our error codes work), but to a compiler that "return PTR_ERR()"
> ends up casting a "long" to an "int" return value.

It doesn't help that gfs2 is using the IS_ERR_VALUE() macro on an
"int", whih is bogus to begin with. It happens to *work*, but by the
time you've cast an error pointer to "int", you've lost enough bits
that "IS_ERR_VALUE()" isn't a valid thing to do. You should just check
against zero (possibly against negative).

But fixing that and just checking against a non-zero int doesn't
actually fix the warning.

              Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arnd Bergmann May 27, 2016, 8:04 p.m. UTC | #3

On Friday, May 27, 2016 10:23:00 AM CEST Linus Torvalds wrote:
> On Fri, May 27, 2016 at 5:33 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> >
> > gcc can not always figure out which code is only used in an error
> > condition an assignment to indirect argument is only done after the
> > use of IS_ERR() catches errors. In gfs2, this results in a warning
> > about correct code:
> 
> I figured out why gcc warns.
> 
> The code is correct, but gcc isn't able to tell that, because when we
> return "int" for the error case from get_leaf_nr(), gcc thinks that
> that *could* be a zero.
> 
> This code:
> 
>         if (IS_ERR(hash))
>                 return PTR_ERR(hash);
> 
> is obviously "return non-zero" to a kernel developer (because that's
> how our error codes work), but to a compiler that "return PTR_ERR()"
> ends up casting a "long" to an "int" return value.
> 
> And the compiler thinks that while the "long" was clearly non-zero and
> negative, the "int" it cast things to might be zero.

Exactly, this was my conclusion back in January when I did the 67893f12e537
patch that fixed it for 32-bit architectures: gcc cannot figure out
that 'if (IS_ERR(ptr))' and the '!err' check are equivalent but it
does manage to find 'IS_ERR(ptr)' and 'IS_ERR_VAL(PTR_ERR(ptr))'
are the same, across multiple inlined functions.

> Yes, a very smart compiler might have figured out that the
> IS_ERR_VALUE check also means that the value will be non-zero in "int"
> as well, but that actually takes value range analysis, not just "we
> already ttested it".
> 
> And yes, the error goes away if I turn the "int" into "long" into the
> affected codepaths.
> 
> I'm not entirely happy about your patch, because I think it makes the
> code worse. You fix it by effectively making gcc test the resulting
> value after the type conversion. I'd really prefer to figure out some
> way to let gcc actually understand this error handling model. Because
> it's not just the warning, it also generates unnecessary double tests
> (first comparing a 64-bit value against the error range, and then
> comparing the truncated 32-bit error code against zero).

I see what you mean. On 32-bit ARM, we get the double compare for any
of the versions, while on 64-bit x86, the current version in mainline
is the worst, while the original code (with the warning) and the version
after my second patch still need the second comparison against zero.

> Making errors be "long" does fix not just the warning but also the
> code generation. But we've traditionally used "int" for error returns,
> even though the system call path then again turns it into "long" (for
> somewhat related reasons, actually - we want to make sure that the
> "int" error code is valid in all 64 bits).
> 
> So we *could* just start encouraging people to use "long" for error
> handling. It would fix warnings and improve the results. But let me
> think about this a bit more.
>
On Friday, May 27, 2016 11:28:48 AM CEST Linus Torvalds wrote:
> It doesn't help that gfs2 is using the IS_ERR_VALUE() macro on an
> "int", whih is bogus to begin with. It happens to *work*, but by the
> time you've cast an error pointer to "int", you've lost enough bits
> that "IS_ERR_VALUE()" isn't a valid thing to do. You should just check
> against zero (possibly against negative).
>
> But fixing that and just checking against a non-zero int doesn't
> actually fix the warning.

Right, removing the IS_ERR_VALUE() gets us back to the state before
my original patch, and we get the warning on all architectures, and
we certainly don't want to promote the use of IS_ERR_VALUE() further
because of the problems you mention with unsigned types that are
shorter than 'long'.

In fact, the patch that I have in my private tree that was hiding the
warning for me on x86 is one that removes all instances of IS_ERR_VALUE()
with arguments other than 'unsigned long', see http://pastebin.com/uYa2mkgC
for reference.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Linus Torvalds May 27, 2016, 8:20 p.m. UTC | #4

On Fri, May 27, 2016 at 1:04 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> In fact, the patch that I have in my private tree that was hiding the
> warning for me on x86 is one that removes all instances of IS_ERR_VALUE()
> with arguments other than 'unsigned long', see http://pastebin.com/uYa2mkgC
> for reference.

Please just send me that patch, we need to do this (and then add a
cast to pointer (and back to unsigned long) in IS_ERR_VALUE() so that
we get warnings for when that macro is mis-used.

I didn't look at the details of your patch, but I did look at several
IS_ERR_VALUE() uses in the standard kernel, and they were basically
all wrong. Even the ones that used it for the rigth reason (vm_brk()
that returns a pointer or an error in an "unsigned long") had actively
screwed up and truncated that (correct) unsigned long value to "int"
before doing the IS_ERR_VALUE(), which made it all wrong again.

And the other users just looked entirely bogus, and were all just
"zero or negative error code" that has nothing to do with
IS_ERR_VALUE(). The code should just check against zero, not use that
macro that was designed for something different.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Arnd Bergmann May 27, 2016, 9:36 p.m. UTC | #5

On Friday, May 27, 2016 1:20:29 PM CEST Linus Torvalds wrote:
> On Fri, May 27, 2016 at 1:04 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > In fact, the patch that I have in my private tree that was hiding the
> > warning for me on x86 is one that removes all instances of IS_ERR_VALUE()
> > with arguments other than 'unsigned long', see http://pastebin.com/uYa2mkgC
> > for reference.
> 
> Please just send me that patch, we need to do this (and then add a
> cast to pointer (and back to unsigned long) in IS_ERR_VALUE() so that
> we get warnings for when that macro is mis-used.

Ok, I've tried to come up with a summary of what happened so far
on this, and sent the patch your way, leaving out the modified
IS_ERR_VALUE() definition for the moment.

I've added the people on Cc whose drivers had the most invasive
changes.

> I didn't look at the details of your patch, but I did look at several
> IS_ERR_VALUE() uses in the standard kernel, and they were basically
> all wrong.

Yes, that matches what we found earlier this year when Andrzej Hajda
did some work on fixing the worst problems he found.

	Arnd

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Al Viro May 27, 2016, 9:52 p.m. UTC | #6

On Fri, May 27, 2016 at 01:20:29PM -0700, Linus Torvalds wrote:

> I didn't look at the details of your patch, but I did look at several
> IS_ERR_VALUE() uses in the standard kernel, and they were basically
> all wrong. Even the ones that used it for the rigth reason (vm_brk()
> that returns a pointer or an error in an "unsigned long") had actively
> screwed up and truncated that (correct) unsigned long value to "int"
> before doing the IS_ERR_VALUE(), which made it all wrong again.
> 
> And the other users just looked entirely bogus, and were all just
> "zero or negative error code" that has nothing to do with
> IS_ERR_VALUE(). The code should just check against zero, not use that
> macro that was designed for something different.

Both gfs2 ones and the one in fs/romfs are crap.  AFS one is simply ridiculous -
        result = generic_file_write_iter(iocb, from);
        if (IS_ERR_VALUE(result)) {
                _leave(" = %zd", result);
                return result;
        }

        _leave(" = %zd", result);
        return result;
In binfmt*.c some callers are valid (vm_mmap and vm_brk values; BTW, the
code in binfmt_flat.c checks for vm_mmap returning 0, which should never
happen, AFAICS).  Most of binfmt_flat.c ones are pure cargo-culting.

net/9p/client.c ones are results of serious mistake in design of 9P.L
extensions to 9P - they assume that numeric values of E... are
architecture-independent and send them over the wire.  The uses of
IS_ERR_VALUE there are papering over that.  Badly.

A couple of sound/* ones should be simply "is negative".

netfilter ones are caused by bad calling conventions of
xt_percpu_counter_alloc().

I hadn't looked through drivers/* users, but judging by fs/, net/ and sound/,
I would expect them to be pointless garbage in the best case and red flags
for broken code in the worst.

IMO IS_ERR_VALUE() should be renamed to something less generic and be used
a lot less than it is right now.
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[GIT,PULL] kbuild updates for v4.7-rc1

Commit Message

Comments

Patch