diff mbox series

[v2,2/2] ci: compile "linux-gcc-default" job with -Og

Message ID bdf0e40a770c57b63e7519983d37b97a85ce07bf.1717662814.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series ci: detect more warnings via `-Og` | expand

Commit Message

Patrick Steinhardt June 6, 2024, 9:31 a.m. UTC
We have recently noticed that our CI does not always notice variables
that may be used uninitialized. While it is expected that compiler
warnings aren't perfect, this one was a but puzzling because it was
rather obvious that the variable can be uninitialized.

Many compiler warnings unfortunately depend on the optimization level
used by the compiler. While `-O0` for example will disable a lot of
warnings altogether because optimization passes go away, `-O2`, which is
our default optimization level used in CI, may optimize specific code
away or even double down on undefined behaviour. Interestingly, this
specific instance that triggered the investigation does get noted by GCC
when using `-Og`.

While we could adapt all jobs to compile with `-Og` now, that would
potentially mask other warnings that only get diagnosed with `-O2`.
Instead, adapt the "linux-gcc-default" job to compile with `-Og`. This
job is chosen because it uses the "ubuntu:latest" image and should thus
have a comparatively recent compiler toolchain, and because we have
other jobs that use "ubuntu:latest" so that we do not loose coverage for
warnings diagnosed only on `-O2` level.

To make it easier to set up the optimization level in our CI, add
support in our Makefile to specify the level via an environment
variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---

I was a little torn whether we really want to name the variable `O`
here because it feels so easy to set it by accident. We could rename
this to `OPTIMIZATION` or `OPTIMIZATION_LEVEL`, but that's quite a
mouthful.

Alternatively, if we don't want to have this variable in the first
place, then I'm also happy to adapt the script itself to pass the
optimization level via an argument.

 Makefile                  | 3 ++-
 ci/run-build-and-tests.sh | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

Comments

Justin Tobler June 6, 2024, 3:32 p.m. UTC | #1
On 24/06/06 11:31AM, Patrick Steinhardt wrote:
> We have recently noticed that our CI does not always notice variables
> that may be used uninitialized. While it is expected that compiler
> warnings aren't perfect, this one was a but puzzling because it was

s/but/bit/

> rather obvious that the variable can be uninitialized.
> 
> Many compiler warnings unfortunately depend on the optimization level
> used by the compiler. While `-O0` for example will disable a lot of
> warnings altogether because optimization passes go away, `-O2`, which is
> our default optimization level used in CI, may optimize specific code
> away or even double down on undefined behaviour. Interestingly, this
> specific instance that triggered the investigation does get noted by GCC
> when using `-Og`.
> 
> While we could adapt all jobs to compile with `-Og` now, that would
> potentially mask other warnings that only get diagnosed with `-O2`.
> Instead, adapt the "linux-gcc-default" job to compile with `-Og`. This
> job is chosen because it uses the "ubuntu:latest" image and should thus
> have a comparatively recent compiler toolchain, and because we have
> other jobs that use "ubuntu:latest" so that we do not loose coverage for

s/loose/lose/

> warnings diagnosed only on `-O2` level.
> 
> To make it easier to set up the optimization level in our CI, add
> support in our Makefile to specify the level via an environment
> variable.
> 
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> 
> I was a little torn whether we really want to name the variable `O`
> here because it feels so easy to set it by accident. We could rename
> this to `OPTIMIZATION` or `OPTIMIZATION_LEVEL`, but that's quite a
> mouthful.
> 
> Alternatively, if we don't want to have this variable in the first
> place, then I'm also happy to adapt the script itself to pass the
> optimization level via an argument.

I think the variable itself is fine, but for a name I think that either 
`OPTIMIZATION` or `OPTIMIZATION_LEVEL` would be a better pick. We have 
some other lengthy environment varible names so I don't think its too 
bad.

Otherwise, this looks good to me :)

-Justin
Junio C Hamano June 6, 2024, 5:02 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> I was a little torn whether we really want to name the variable `O`
> here because it feels so easy to set it by accident. We could rename
> this to `OPTIMIZATION` or `OPTIMIZATION_LEVEL`, but that's quite a
> mouthful.
>
> Alternatively, if we don't want to have this variable in the first
> place, then I'm also happy to adapt the script itself to pass the
> optimization level via an argument.

The latter is much more preferrable.  It is too easy to stomp on
people's established workflow that already uses that variable for
other purposes or expects slightly different syntax.

>  Makefile                  | 3 ++-
>  ci/run-build-and-tests.sh | 9 +++++++++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/Makefile b/Makefile
> index 59d98ba688..ff57c94fdf 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1357,7 +1357,8 @@ endif
>  # tweaked by config.* below as well as the command-line, both of
>  # which'll override these defaults.
>  # Older versions of GCC may require adding "-std=gnu99" at the end.
> -CFLAGS = -g -O2 -Wall
> +O ?= 2
> +CFLAGS = -g -O$(O) -Wall
>  LDFLAGS =
>  CC_LD_DYNPATH = -Wl,-rpath,
>  BASIC_CFLAGS = -I.
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index 98dda42045..0f00dbd289 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -13,6 +13,15 @@ esac
>  run_tests=t
>  
>  case "$jobname" in
> +linux-gcc-default)
> +	# Warnings generated by compilers are unfortunately specific to the
> +	# optimization level. With `-O0`, many warnings won't be shown at all,
> +	# whereas the optimizations performed by our default optimization level
> +	# `-O2` will mask others. We thus use `-Og` here just so that we have
> +	# at least one job with a different optimization level so that we can
> +	# overall surface more warnings.
> +	export O=g
> +	;;
>  linux-gcc)
>  	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  	;;
Patrick Steinhardt June 7, 2024, 5:28 a.m. UTC | #3
On Thu, Jun 06, 2024 at 10:02:04AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > I was a little torn whether we really want to name the variable `O`
> > here because it feels so easy to set it by accident. We could rename
> > this to `OPTIMIZATION` or `OPTIMIZATION_LEVEL`, but that's quite a
> > mouthful.
> >
> > Alternatively, if we don't want to have this variable in the first
> > place, then I'm also happy to adapt the script itself to pass the
> > optimization level via an argument.
> 
> The latter is much more preferrable.  It is too easy to stomp on
> people's established workflow that already uses that variable for
> other purposes or expects slightly different syntax.

We're unlikely to break existing workflows though if we name this
variable something like `OPTIMIZATION_LEVEL`. The workflows of both you
and Peff would keep on working without any modification. Also, doesn't
the fact that both of you have similar workarounds hint that this is
something that other devs might want to have, as well?

We could also generalize this a bit and introduce `CFLAGS_APPEND`
instead. Optimization levels are last-one-wins anyway, so people can use
that to append their own flags without overriding existing values. It
would also mean that we can avoid repeating the CFLAGS that we already
have in our Makefile in our CI scripts.

I'll send a version along these lines. If people are still unhappy with
that direction, then I'll give up and just repeat the whole CFLAGS in
"ci/run-build-and-tests.sh".

Patrick
Junio C Hamano June 7, 2024, 6:45 p.m. UTC | #4
Patrick Steinhardt <ps@pks.im> writes:

> We're unlikely to break existing workflows though if we name this
> variable something like `OPTIMIZATION_LEVEL`.

Yeah, being more explicit is always good.

> We could also generalize this a bit and introduce `CFLAGS_APPEND`
> instead. Optimization levels are last-one-wins anyway, so people can use
> that to append their own flags without overriding existing values. It
> would also mean that we can avoid repeating the CFLAGS that we already
> have in our Makefile in our CI scripts.

Yup, Peff's $(O) cannot serve as such, but my $(O) is already being
used as such.  Naming the variable that gives additional CFLAGS as
such is probably a good way to go.

Thanks.
Junio C Hamano June 7, 2024, 6:48 p.m. UTC | #5
Patrick Steinhardt <ps@pks.im> writes:

> While we could adapt all jobs to compile with `-Og` now, that would
> potentially mask other warnings that only get diagnosed with `-O2`.
> Instead, adapt the "linux-gcc-default" job to compile with `-Og`.

It seems that we are already triggering for 'seen'.  I haven't
checked if that is a false positive or a real problem, though.

https://github.com/git/git/actions/runs/9407652417/job/25913907166#step:4:139

  pack-mtimes.c: In function ‘load_pack_mtimes_file’:
  Error: pack-mtimes.c:89:25: ‘mtimes_size’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
     89 |                         munmap(data, mtimes_size);
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
  make: *** [Makefile:2757: pack-mtimes.o] Error 1
  make: *** Waiting for unfinished jobs....
  pack-revindex.c: In function ‘load_revindex_from_disk’:
  Error: pack-revindex.c:260:25: ‘revindex_size’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    260 |                         munmap(data, revindex_size);
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
  make: *** [Makefile:2757: pack-revindex.o] Error 1
  cat: exit.status: No such file or directory
Junio C Hamano June 7, 2024, 8:35 p.m. UTC | #6
Junio C Hamano <gitster@pobox.com> writes:

>   pack-mtimes.c: In function ‘load_pack_mtimes_file’:
>   Error: pack-mtimes.c:89:25: ‘mtimes_size’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>      89 |                         munmap(data, mtimes_size);
>         |                         ^~~~~~~~~~~~~~~~~~~~~~~~~
>   cc1: all warnings being treated as errors
>   make: *** [Makefile:2757: pack-mtimes.o] Error 1
>   make: *** Waiting for unfinished jobs....

The use on line 89 is guarded with "if (data)" and data can become
non-NULL only after mtimes_size is computed, so this is benign.

They have excuse for a false positive because the warning is about
"maybe" uninitialized, but that does not help our annoyance factor
X-<.

>   pack-revindex.c: In function ‘load_revindex_from_disk’:
>   Error: pack-revindex.c:260:25: ‘revindex_size’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>     260 |                         munmap(data, revindex_size);
>         |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~
>   cc1: all warnings being treated as errors
>   make: *** [Makefile:2757: pack-revindex.o] Error 1
>   cat: exit.status: No such file or directory

This follows exactly the same pattern established by the other one
(or perhaps the other one copied from here).  It is another false
positive.

I am not sure what the right fix would be.  For example, if we were
interested in avoiding to incur too much resources for revindex, we
might do something like this

--- i/pack-revindex.c
+++ w/pack-revindex.c
@@ -258,6 +258,8 @@ static int load_revindex_from_disk(char *revindex_name,
 	if (ret) {
 		if (data)
 			munmap(data, revindex_size);
+		fprintf(stderr, "would have fit %d revindex in 10MB\n",
+			10 * 1024 * 1024 / revindex_size);
 	} else {
 		*len_p = revindex_size;
 		*data_p = (const uint32_t *)data;

without even guarding with "if (data)".

If we "initialize" revindex_size to a meaningless dummy value like 0
like the attached would _hide_ such a real bug from the compiler, so
I dunno.

@@ -206,7 +206,7 @@ static int load_revindex_from_disk(char *revindex_name,
 	int fd, ret = 0;
 	struct stat st;
 	void *data = NULL;
-	size_t revindex_size;
+	size_t revindex_size = 0;
 	struct revindex_header *hdr;
 
 	if (git_env_bool(GIT_TEST_REV_INDEX_DIE_ON_DISK, 0))
Jeff King June 8, 2024, 8:49 a.m. UTC | #7
On Fri, Jun 07, 2024 at 11:45:18AM -0700, Junio C Hamano wrote:

> Patrick Steinhardt <ps@pks.im> writes:
> 
> > We're unlikely to break existing workflows though if we name this
> > variable something like `OPTIMIZATION_LEVEL`.
> 
> Yeah, being more explicit is always good.

One of the reasons I used the very short "O" for mine is that I often
specify it by hand. I actually leave it as O=0 by default, since the
majority of my builds are about developing and debugging (so speed of
compilation is much more important than speed of the resulting
executable). And then when I am interested in performance, I run "make
O=2".

So OPTIMIZATION_LEVEL defeats the purpose. ;)

> > We could also generalize this a bit and introduce `CFLAGS_APPEND`
> > instead. Optimization levels are last-one-wins anyway, so people can use
> > that to append their own flags without overriding existing values. It
> > would also mean that we can avoid repeating the CFLAGS that we already
> > have in our Makefile in our CI scripts.
> 
> Yup, Peff's $(O) cannot serve as such, but my $(O) is already being
> used as such.  Naming the variable that gives additional CFLAGS as
> such is probably a good way to go.

I have something like this in my config.mak, too. ;) But I call it
EXTRA_CFLAGS. That seems less grammatically awkward to me than
CFLAGS_APPEND, but that may be entering bikeshed territory.

And you can tell from the length name that I do not use it all that
often.

-Peff
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index 59d98ba688..ff57c94fdf 100644
--- a/Makefile
+++ b/Makefile
@@ -1357,7 +1357,8 @@  endif
 # tweaked by config.* below as well as the command-line, both of
 # which'll override these defaults.
 # Older versions of GCC may require adding "-std=gnu99" at the end.
-CFLAGS = -g -O2 -Wall
+O ?= 2
+CFLAGS = -g -O$(O) -Wall
 LDFLAGS =
 CC_LD_DYNPATH = -Wl,-rpath,
 BASIC_CFLAGS = -I.
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 98dda42045..0f00dbd289 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -13,6 +13,15 @@  esac
 run_tests=t
 
 case "$jobname" in
+linux-gcc-default)
+	# Warnings generated by compilers are unfortunately specific to the
+	# optimization level. With `-O0`, many warnings won't be shown at all,
+	# whereas the optimizations performed by our default optimization level
+	# `-O2` will mask others. We thus use `-Og` here just so that we have
+	# at least one job with a different optimization level so that we can
+	# overall surface more warnings.
+	export O=g
+	;;
 linux-gcc)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	;;