diff mbox series

[2/2] ci: let pedantic job compile with -Og

Message ID 351dec4a4d5a5619e7627e11a8e674e32125125e.1717655210.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series ci: detect more warnings via `-Og` | expand

Commit Message

Patrick Steinhardt June 6, 2024, 6:30 a.m. UTC
We have recently noticed that our CI does not always notice variables
that may be used uninitialized. While it is expected that compiler
warnings aren't perfect, this one was a but puzzling because it was
rather obvious that the variable can be uninitialized.

Many compiler warnings unfortunately depend on the optimization level
used by the compiler. While `-O0` for example will disable a lot of
warnings altogether because optimization passes go away, `-O2`, which is
our default optimization level used in CI, may optimize specific code
away or even double down on undefined behaviour. Interestingly, this
specific instance that triggered the investigation does get noted by GCC
when using `-Og`.

While we could adapt all jobs to compile with `-Og` now, that would
potentially mask other warnings that only get diagnosed with `-O2`.
Instead, only adapt the "pedantic" job to compile with `-Og`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 ci/run-build-and-tests.sh | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Jeff King June 6, 2024, 6:52 a.m. UTC | #1
On Thu, Jun 06, 2024 at 08:30:34AM +0200, Patrick Steinhardt wrote:

> We have recently noticed that our CI does not always notice variables
> that may be used uninitialized. While it is expected that compiler
> warnings aren't perfect, this one was a but puzzling because it was
> rather obvious that the variable can be uninitialized.
> 
> Many compiler warnings unfortunately depend on the optimization level
> used by the compiler. While `-O0` for example will disable a lot of
> warnings altogether because optimization passes go away, `-O2`, which is
> our default optimization level used in CI, may optimize specific code
> away or even double down on undefined behaviour. Interestingly, this
> specific instance that triggered the investigation does get noted by GCC
> when using `-Og`.
> 
> While we could adapt all jobs to compile with `-Og` now, that would
> potentially mask other warnings that only get diagnosed with `-O2`.
> Instead, only adapt the "pedantic" job to compile with `-Og`.

Hmm. This is the first time I've ever seen lower optimization levels
produce more warnings. It is almost always the opposite case in my
experience. So it's not clear to me that moving to "-Og" will generally
find more warning spots, and that this isn't a bit of a fluke.

As you note, we'll still compile with -O2 in other jobs. But isn't the
point of the pedantic job to enable a bunch of extra warnings that
aren't available elsewhere? We wouldn't have any coverage of those.

So for the pedantic warnings, we're left with a guess as to whether -Og
or -O2 will yield more results. And in my experience it is probably -O2.

If we want to get coverage of -Og, I'd suggest doing it in a job that is
otherwise overlapping with another (maybe linux-TEST-vars, which I think
is otherwise a duplicate build?).

> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index 98dda42045..e78e19e4a6 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -44,6 +44,15 @@ pedantic)
>  	# Don't run the tests; we only care about whether Git can be
>  	# built.
>  	export DEVOPTS=pedantic
> +	# Warnings generated by compilers are unfortunately specific to the
> +	# optimization level. With `-O0`, many warnings won't be shown at all,
> +	# whereas the optimizations performed by our default optimization level
> +	# `-O2` will mask others. We thus use `-Og` here just so that we have
> +	# at least one job with a different optimization level so that we can
> +	# overall surface more warnings.
> +	cat >config.mak <<-EOF
> +	export CFLAGS=-Og
> +	EOF

Writing config.mak is unusual, though I guess just setting CFLAGS in the
environment isn't enough, because we set it unconditionally in the
Makefile. Doing "make CFLAGS=-Og" would work, but we'd need help from
the code that actually runs "make".

I do suspect the "export" is unnecessary; it should just be used by the
Makefile recipes themselves.

Your command above also loses the "-g" and "-Wall" from the default
CFLAGS. Maybe OK, since DEVELOPER=1 implies -Wall anyway, and "-g" isn't
important. But one thing I've done for a long time in my config.mak is:

  O ?= 2
  CFLAGS = -g -Wall -O$(O)

Then you can "make O=0" or "make O=g" if you want. And I think that
setting O=g in the environment (exported) would work, as well.

-Peff
Patrick Steinhardt June 6, 2024, 7:41 a.m. UTC | #2
On Thu, Jun 06, 2024 at 02:52:36AM -0400, Jeff King wrote:
> On Thu, Jun 06, 2024 at 08:30:34AM +0200, Patrick Steinhardt wrote:
> 
> > We have recently noticed that our CI does not always notice variables
> > that may be used uninitialized. While it is expected that compiler
> > warnings aren't perfect, this one was a but puzzling because it was
> > rather obvious that the variable can be uninitialized.
> > 
> > Many compiler warnings unfortunately depend on the optimization level
> > used by the compiler. While `-O0` for example will disable a lot of
> > warnings altogether because optimization passes go away, `-O2`, which is
> > our default optimization level used in CI, may optimize specific code
> > away or even double down on undefined behaviour. Interestingly, this
> > specific instance that triggered the investigation does get noted by GCC
> > when using `-Og`.
> > 
> > While we could adapt all jobs to compile with `-Og` now, that would
> > potentially mask other warnings that only get diagnosed with `-O2`.
> > Instead, only adapt the "pedantic" job to compile with `-Og`.
> 
> Hmm. This is the first time I've ever seen lower optimization levels
> produce more warnings. It is almost always the opposite case in my
> experience. So it's not clear to me that moving to "-Og" will generally
> find more warning spots, and that this isn't a bit of a fluke.
> 
> As you note, we'll still compile with -O2 in other jobs. But isn't the
> point of the pedantic job to enable a bunch of extra warnings that
> aren't available elsewhere? We wouldn't have any coverage of those.
> 
> So for the pedantic warnings, we're left with a guess as to whether -Og
> or -O2 will yield more results. And in my experience it is probably -O2.
> 
> If we want to get coverage of -Og, I'd suggest doing it in a job that is
> otherwise overlapping with another (maybe linux-TEST-vars, which I think
> is otherwise a duplicate build?).

I don't think linux-TEST-vars would be a good candidate for this because
it uses Ubuntu 20.04. Ideally, we'd want to have a test run with an
up-to-date version of Ubuntu so that we also get a recent version of the
compiler toolchain.

I kind of wonder whether we should revamp this pedantic job in the first
place. The consequence of that job is that our codebase needs to be
compile cleanly with `-Wpedantic`. So if that is a requirement anyway,
why don't we run all jobs with `DEVOPTS=pedantic` and just drop this job
altogether? This may surface some additional warnings on platforms where
we currently don't set that, but is that a bad thing?

The only downside I can think of is that we stop compiling on Fedora,
which may have a more up-to-date GCC version than Ubuntu. But if the
goal of this job was to _really_ get an up-to-date compiler toolchain,
then we should rather pick a rolling release distro like Arch. Otherwise
I find this to be of dubious benefit.

If we merge it into the other jobs, then I'd just pick any random job
that uses "ubuntu:latest" like "linux-gcc-default" to compile with
`-Og`.

> > diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> > index 98dda42045..e78e19e4a6 100755
> > --- a/ci/run-build-and-tests.sh
> > +++ b/ci/run-build-and-tests.sh
> > @@ -44,6 +44,15 @@ pedantic)
> >  	# Don't run the tests; we only care about whether Git can be
> >  	# built.
> >  	export DEVOPTS=pedantic
> > +	# Warnings generated by compilers are unfortunately specific to the
> > +	# optimization level. With `-O0`, many warnings won't be shown at all,
> > +	# whereas the optimizations performed by our default optimization level
> > +	# `-O2` will mask others. We thus use `-Og` here just so that we have
> > +	# at least one job with a different optimization level so that we can
> > +	# overall surface more warnings.
> > +	cat >config.mak <<-EOF
> > +	export CFLAGS=-Og
> > +	EOF
> 
> Writing config.mak is unusual, though I guess just setting CFLAGS in the
> environment isn't enough, because we set it unconditionally in the
> Makefile. Doing "make CFLAGS=-Og" would work, but we'd need help from
> the code that actually runs "make".

Yeah, I went the easy route because setting it via the environment is
not enough, as you correctly mention.

> I do suspect the "export" is unnecessary; it should just be used by the
> Makefile recipes themselves.
> 
> Your command above also loses the "-g" and "-Wall" from the default
> CFLAGS. Maybe OK, since DEVELOPER=1 implies -Wall anyway, and "-g" isn't
> important. But one thing I've done for a long time in my config.mak is:
> 
>   O ?= 2
>   CFLAGS = -g -Wall -O$(O)
> 
> Then you can "make O=0" or "make O=g" if you want. And I think that
> setting O=g in the environment (exported) would work, as well.

Sure, I can do something along these lines.

Patrick
Jeff King June 6, 2024, 8:05 a.m. UTC | #3
On Thu, Jun 06, 2024 at 09:41:56AM +0200, Patrick Steinhardt wrote:

> > If we want to get coverage of -Og, I'd suggest doing it in a job that is
> > otherwise overlapping with another (maybe linux-TEST-vars, which I think
> > is otherwise a duplicate build?).
> 
> I don't think linux-TEST-vars would be a good candidate for this because
> it uses Ubuntu 20.04. Ideally, we'd want to have a test run with an
> up-to-date version of Ubuntu so that we also get a recent version of the
> compiler toolchain.

Oof, yeah, I agree that if the point is to find obscure warnings, newer
is going to be better.

> I kind of wonder whether we should revamp this pedantic job in the first
> place. The consequence of that job is that our codebase needs to be
> compile cleanly with `-Wpedantic`. So if that is a requirement anyway,
> why don't we run all jobs with `DEVOPTS=pedantic` and just drop this job
> altogether? This may surface some additional warnings on platforms where
> we currently don't set that, but is that a bad thing?

Yeah, if we always compile cleanly with pedantic, then I don't see why
it wouldn't just be the default for DEVELOPER=1. The point of that flag
is to be as picky as possible so that we catch things early. If some
platform can't handle it (let's imagine Windows or something), then I
think we should be explicitly _disabling_ pedantic there.

> The only downside I can think of is that we stop compiling on Fedora,
> which may have a more up-to-date GCC version than Ubuntu. But if the
> goal of this job was to _really_ get an up-to-date compiler toolchain,
> then we should rather pick a rolling release distro like Arch. Otherwise
> I find this to be of dubious benefit.

There may be some value in general in compiling on multiple distros, as
a sort of "unknown unknowns" thing. We don't know what we might turn up,
but exposing ourselves to more variables may lead to catching failures
before users see them.

I don't know if Fedora was specifically chosen for recent gcc there, or
if it was simply for variety.

Once again, these overlapping variables within various jobs make it hard
to reason about (but I don't propose normalizing all of them; that would
increase the amount of CPU work by a lot; I am just grumbling).

But yeah, between Arch and Fedora, I don't have an opinion. Doing both
might even be valuable, if we are shoving random variations into random
jobs. ;)

> If we merge it into the other jobs, then I'd just pick any random job
> that uses "ubuntu:latest" like "linux-gcc-default" to compile with
> `-Og`.

That would be OK with me. I also think it would be OK to do nothing for
now. We saw one case where "-Og" found a warning that "-O2" didn't. We
could wait to see if it happens twice before acting. I dunno.

-Peff
Patrick Steinhardt June 6, 2024, 8:25 a.m. UTC | #4
On Thu, Jun 06, 2024 at 04:05:52AM -0400, Jeff King wrote:
> On Thu, Jun 06, 2024 at 09:41:56AM +0200, Patrick Steinhardt wrote:
> > I kind of wonder whether we should revamp this pedantic job in the first
> > place. The consequence of that job is that our codebase needs to be
> > compile cleanly with `-Wpedantic`. So if that is a requirement anyway,
> > why don't we run all jobs with `DEVOPTS=pedantic` and just drop this job
> > altogether? This may surface some additional warnings on platforms where
> > we currently don't set that, but is that a bad thing?
> 
> Yeah, if we always compile cleanly with pedantic, then I don't see why
> it wouldn't just be the default for DEVELOPER=1. The point of that flag
> is to be as picky as possible so that we catch things early. If some
> platform can't handle it (let's imagine Windows or something), then I
> think we should be explicitly _disabling_ pedantic there.
> 
> > The only downside I can think of is that we stop compiling on Fedora,
> > which may have a more up-to-date GCC version than Ubuntu. But if the
> > goal of this job was to _really_ get an up-to-date compiler toolchain,
> > then we should rather pick a rolling release distro like Arch. Otherwise
> > I find this to be of dubious benefit.
> 
> There may be some value in general in compiling on multiple distros, as
> a sort of "unknown unknowns" thing. We don't know what we might turn up,
> but exposing ourselves to more variables may lead to catching failures
> before users see them.
> 
> I don't know if Fedora was specifically chosen for recent gcc there, or
> if it was simply for variety.

True enough. But even so, I think the better solution here would be to
drop one of the Ubuntu-based jobs and then convert the Fedora one into a
fully-fledged job that also runs tests.

That's something for a later iteration, though.

> Once again, these overlapping variables within various jobs make it hard
> to reason about (but I don't propose normalizing all of them; that would
> increase the amount of CPU work by a lot; I am just grumbling).

No arguing there, it certainly is hard to discover overall. I don't
really think there's a way around this if we want to have different
combinations while not running a full matrix of jobs, because the
combinations are always going to be arbitrary.

> But yeah, between Arch and Fedora, I don't have an opinion. Doing both
> might even be valuable, if we are shoving random variations into random
> jobs. ;)

True.

> > If we merge it into the other jobs, then I'd just pick any random job
> > that uses "ubuntu:latest" like "linux-gcc-default" to compile with
> > `-Og`.
> 
> That would be OK with me. I also think it would be OK to do nothing for
> now. We saw one case where "-Og" found a warning that "-O2" didn't. We
> could wait to see if it happens twice before acting. I dunno.

I don't think it hurts to have a job with `-Og`, and if it detects some
corner cases that we otherwise don't I think there's a real benefit. In
the best case, I'd want to give everyone the tools to avoid sending
patch series to the mailing list which are broken and where that could
have been detected earlier. In the end, it saves everybody's time.

Patrick
Junio C Hamano June 6, 2024, 4:32 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> So for the pedantic warnings, we're left with a guess as to whether -Og
> or -O2 will yield more results. And in my experience it is probably -O2.
>
> If we want to get coverage of -Og, I'd suggest doing it in a job that is
> otherwise overlapping with another (maybe linux-TEST-vars, which I think
> is otherwise a duplicate build?).

The same knee-jerk reaction came to me.

Speaking of variants, is there any interest in migrating one or some
of the existing x86-64 CI jobs to arm64 CI jobs GitHub introduced
recently?  I suspect that we won't find any endianness bugs (I
expect they are configured to do little endian just like everybody
else) and there may no longer be lurking unaligned read bugs (but
"git log --grep=unaligned" finds surprising number of them we have
seen and fixed), so the returns may be very small.

> Your command above also loses the "-g" and "-Wall" from the default
> CFLAGS. Maybe OK, since DEVELOPER=1 implies -Wall anyway, and "-g" isn't
> important. But one thing I've done for a long time in my config.mak is:
>
>   O ?= 2
>   CFLAGS = -g -Wall -O$(O)
>
> Then you can "make O=0" or "make O=g" if you want. And I think that
> setting O=g in the environment (exported) would work, as well.

I do something similar, but my $(O) replaces the whole -O2 thing, so
I can say something silly like

    make O="-O2 -g -Wall"
Patrick Steinhardt June 7, 2024, 5:10 a.m. UTC | #6
On Thu, Jun 06, 2024 at 09:32:09AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > So for the pedantic warnings, we're left with a guess as to whether -Og
> > or -O2 will yield more results. And in my experience it is probably -O2.
> >
> > If we want to get coverage of -Og, I'd suggest doing it in a job that is
> > otherwise overlapping with another (maybe linux-TEST-vars, which I think
> > is otherwise a duplicate build?).
> 
> The same knee-jerk reaction came to me.
> 
> Speaking of variants, is there any interest in migrating one or some
> of the existing x86-64 CI jobs to arm64 CI jobs GitHub introduced
> recently?  I suspect that we won't find any endianness bugs (I
> expect they are configured to do little endian just like everybody
> else) and there may no longer be lurking unaligned read bugs (but
> "git log --grep=unaligned" finds surprising number of them we have
> seen and fixed), so the returns may be very small.

Note that we already run arm64 via GitLab's macOS runners. That's not
Linux of course, but I guess that any architectural issues should still
be caught by that.

Not to say that we shouldn't adapt GitHub.

Patrick
Junio C Hamano June 7, 2024, 6:42 p.m. UTC | #7
Patrick Steinhardt <ps@pks.im> writes:

>> Speaking of variants, is there any interest in migrating one or some
>> of the existing x86-64 CI jobs to arm64 CI jobs GitHub introduced
>> recently?  I suspect that we won't find any endianness bugs (I
>> expect they are configured to do little endian just like everybody
>> else) and there may no longer be lurking unaligned read bugs (but
>> "git log --grep=unaligned" finds surprising number of them we have
>> seen and fixed), so the returns may be very small.
>
> Note that we already run arm64 via GitLab's macOS runners. That's not
> Linux of course, but I guess that any architectural issues should still
> be caught by that.

The more the merrier and it is nice to know we have extra variety
over there.

> Not to say that we shouldn't adapt GitHub.

;-)
diff mbox series

Patch

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 98dda42045..e78e19e4a6 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -44,6 +44,15 @@  pedantic)
 	# Don't run the tests; we only care about whether Git can be
 	# built.
 	export DEVOPTS=pedantic
+	# Warnings generated by compilers are unfortunately specific to the
+	# optimization level. With `-O0`, many warnings won't be shown at all,
+	# whereas the optimizations performed by our default optimization level
+	# `-O2` will mask others. We thus use `-Og` here just so that we have
+	# at least one job with a different optimization level so that we can
+	# overall surface more warnings.
+	cat >config.mak <<-EOF
+	export CFLAGS=-Og
+	EOF
 	run_tests=
 	;;
 esac