diff mbox series

Fix -Wmaybe-uninitialized warnings under -O0

Message ID 33984eeaabbfbcfd4b9d3903549d8b7d6c4ced7e.1585726172.git.liu.denton@gmail.com (mailing list archive)
State New, archived
Headers show
Series Fix -Wmaybe-uninitialized warnings under -O0 | expand

Commit Message

Denton Liu April 1, 2020, 7:30 a.m. UTC
When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces
many -Wmaybe-uninitialized warnings. These are false positives since
when Git is compiled under -O2, gcc is smart enough to see that the
code paths that use these variables all initialise them beforehand.
Nonetheless, these warnings block the compilation process when
DEVELOPER=1 is enabled (which enables -Werror).

Fix these warnings by initializing these variables with dummy values (0,
-1 or NULL as appropriate).

Signed-off-by: Denton Liu <liu.denton@gmail.com>
---
 builtin/branch.c          | 2 +-
 diff.c                    | 2 +-
 fast-import.c             | 4 ++--
 http-backend.c            | 2 +-
 ref-filter.c              | 2 +-
 refs/packed-backend.c     | 2 +-
 t/helper/test-ref-store.c | 2 +-
 trace2/tr2_dst.c          | 2 +-
 trailer.c                 | 2 +-
 9 files changed, 10 insertions(+), 10 deletions(-)

Comments

Denton Liu April 1, 2020, 9:05 a.m. UTC | #1
Hi Carlos,

On Wed, Apr 01, 2020 at 01:01:41AM -0700, Carlo Arenas wrote:
> On Wed, Apr 1, 2020 at 12:31 AM Denton Liu <liu.denton@gmail.com> wrote:
> 
> > Nonetheless, these warnings block the compilation process when
> > DEVELOPER=1 is enabled (which enables -Werror).
> >
> 
> you can also alternatively use DEVOPTS=no-error

Thanks for the tip. By the way, it seems like your message was sent with
HTML so it didn't hit the list.

-Denton

> 
> Carlo
Jeff King April 1, 2020, 9:52 a.m. UTC | #2
On Wed, Apr 01, 2020 at 03:30:16AM -0400, Denton Liu wrote:

> When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces
> many -Wmaybe-uninitialized warnings. These are false positives since
> when Git is compiled under -O2, gcc is smart enough to see that the
> code paths that use these variables all initialise them beforehand.
> Nonetheless, these warnings block the compilation process when
> DEVELOPER=1 is enabled (which enables -Werror).
> 
> Fix these warnings by initializing these variables with dummy values (0,
> -1 or NULL as appropriate).

Hmph. I almost always compile with -O0 and have been using gcc 9.3.0
since it was packaged for Debian a few weeks ago, but I don't see any of
these warnings.

The current version in Debian unstable is 9.3.0-8, which picks up some
extra patches from the upstream gcc-9 branch. But even if I download a
snapshot of the original 9.3.0 release, it builds fine.

So why does your version behave differently? And if this is a temporary
state for a buggy version of gcc (that may be fixed in the next point
release), is it worth changing our source code to appease it?

-Peff
Denton Liu April 1, 2020, 2:06 p.m. UTC | #3
Hi Peff,

On Wed, Apr 01, 2020 at 05:52:55AM -0400, Jeff King wrote:
> On Wed, Apr 01, 2020 at 03:30:16AM -0400, Denton Liu wrote:
> 
> > When compiling Git under -O0, gcc (Arch Linux 9.3.0-1) 9.3.0 produces
> > many -Wmaybe-uninitialized warnings. These are false positives since
> > when Git is compiled under -O2, gcc is smart enough to see that the
> > code paths that use these variables all initialise them beforehand.
> > Nonetheless, these warnings block the compilation process when
> > DEVELOPER=1 is enabled (which enables -Werror).
> > 
> > Fix these warnings by initializing these variables with dummy values (0,
> > -1 or NULL as appropriate).
> 
> Hmph. I almost always compile with -O0 and have been using gcc 9.3.0
> since it was packaged for Debian a few weeks ago, but I don't see any of
> these warnings.
> 
> The current version in Debian unstable is 9.3.0-8, which picks up some
> extra patches from the upstream gcc-9 branch. But even if I download a
> snapshot of the original 9.3.0 release, it builds fine.
> 
> So why does your version behave differently? And if this is a temporary
> state for a buggy version of gcc (that may be fixed in the next point
> release), is it worth changing our source code to appease it?

A correction to the earlier message... It seems like I wasn't reporting
the correct settings. I was actually compiling with -Og, not -O0
(whoops!).

I tested it with gcc-8 and it seems like it also reports the same
problem. Also, -O1 reports warnings as well.

I guess the question is do we only support -O0 and -O2 or should we
support the levels in between as well?

Sorry for the confusion,

Denton

> 
> -Peff
Jeff King April 3, 2020, 2:04 p.m. UTC | #4
On Wed, Apr 01, 2020 at 10:06:43AM -0400, Denton Liu wrote:

> > So why does your version behave differently? And if this is a temporary
> > state for a buggy version of gcc (that may be fixed in the next point
> > release), is it worth changing our source code to appease it?
> 
> A correction to the earlier message... It seems like I wasn't reporting
> the correct settings. I was actually compiling with -Og, not -O0
> (whoops!).
> 
> I tested it with gcc-8 and it seems like it also reports the same
> problem. Also, -O1 reports warnings as well.

Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of
them don't trigger with -O1; just the one in ref-filter.c.

That one's interesting. We have:

  int ret = 0;
  ...
  if (...)
         ...
  else
         ret = for_each_fullref_in_pattern(...);
  ...
  return ret;

So we'd either have 0 or an assigned return. But the bug is actually in
for_each_fullref_in_pattern(), which does this:

  int ret; /* uninitialized! */

  /* a bunch of early return conditionals */
  if (...)
    return ...;

  for_each_string_list_item(...) {
    ret = for_each_fullref_in(...);
  }

  return ret;

but that will return an uninitialized value when there are no patterns.
I doubt we have such a case, but that may explain why -O0 does not
complain (it assumes "in_pattern" will return a useful value) and -O2
does not (it is able to figure out that it always does), but -O1 only
inlines part of it.

Curiously, -Og _does_ find the correct function.

Your patch silences it, but is it doing the right thing? It sets "ret =
0", but we haven't actually iterated anything. Should it be an error
instead?

-Peff
Jeff King April 3, 2020, 2:38 p.m. UTC | #5
On Fri, Apr 03, 2020 at 10:04:47AM -0400, Jeff King wrote:

> Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of
> them don't trigger with -O1; just the one in ref-filter.c.

I guess I should have been more clear since the -O1 and -Og locations
are different. -O1 complains in filter_refs().

By the way, that function's handling of filter->kind seems very sketchy
to me. It does:

  int ret = 0;
  if (!filter->kind)
                die("filter_refs: invalid type");
  else {
          /*
           * For common cases where we need only branches or remotes or tags,
           * we only iterate through those refs. If a mix of refs is needed,
           * we iterate over all refs and filter out required refs with the help
           * of filter_ref_kind().
           */
          if (filter->kind == FILTER_REFS_BRANCHES)
                  ret = for_each_fullref_in("refs/heads/", ...);
          else if (filter->kind == FILTER_REFS_REMOTES)
                  ret = for_each_fullref_in("refs/remotes/", ...);
          else if (filter->kind == FILTER_REFS_TAGS)
                  ret = for_each_fullref_in("refs/tags/", ...);
          else if (filter->kind & FILTER_REFS_ALL)
                  ret = for_each_fullref_in_pattern(filter, ...);
          if (!ret && (filter->kind & FILTER_REFS_DETACHED_HEAD))
                  head_ref(...);
  }

So filter->kind is sometimes treated like a bitfield and sometimes not.
I can set it to (ALL & DETACHED_HEAD) to get something useful, but not
(BRANCHES & HEAD). The up-front check tries to complain if you didn't
ask for anything, but there are other flags like INCLUDE_BROKEN that
would cause "!filter->kind" to be false, but still not produce any
output.

And shouldn't we be checking the return value of head_ref() like the
others?

All of this is outside the scope of our current discussion, and
untangling it might be messy (because it could touch the callers). I
just wanted to document my findings for now. :)

-Peff
Denton Liu April 4, 2020, 12:07 p.m. UTC | #6
Hi Peff,

On Fri, Apr 03, 2020 at 10:04:47AM -0400, Jeff King wrote:
> On Wed, Apr 01, 2020 at 10:06:43AM -0400, Denton Liu wrote:
> 
> > > So why does your version behave differently? And if this is a temporary
> > > state for a buggy version of gcc (that may be fixed in the next point
> > > release), is it worth changing our source code to appease it?
> > 
> > A correction to the earlier message... It seems like I wasn't reporting
> > the correct settings. I was actually compiling with -Og, not -O0
> > (whoops!).
> > 
> > I tested it with gcc-8 and it seems like it also reports the same
> > problem. Also, -O1 reports warnings as well.
> 
> Ah, OK, I can reproduce easily with -Og (up through gcc-10). Most of
> them don't trigger with -O1; just the one in ref-filter.c.
> 
> That one's interesting. We have:
> 
>   int ret = 0;
>   ...
>   if (...)
>          ...
>   else
>          ret = for_each_fullref_in_pattern(...);
>   ...
>   return ret;
> 
> So we'd either have 0 or an assigned return. But the bug is actually in
> for_each_fullref_in_pattern(), which does this:
> 
>   int ret; /* uninitialized! */
> 
>   /* a bunch of early return conditionals */
>   if (...)
>     return ...;
> 
>   for_each_string_list_item(...) {
>     ret = for_each_fullref_in(...);

This loop is missing a bit of important context:

	if (ret)
		break;

>   }
> 
>   return ret;
> 
> but that will return an uninitialized value when there are no patterns.
> I doubt we have such a case, but that may explain why -O0 does not
> complain (it assumes "in_pattern" will return a useful value) and -O2
> does not (it is able to figure out that it always does), but -O1 only
> inlines part of it.
> 
> Curiously, -Og _does_ find the correct function.
> 
> Your patch silences it, but is it doing the right thing? It sets "ret =
> 0", but we haven't actually iterated anything. Should it be an error
> instead?

I understood the semantics of for_each_fullref_in_pattern() (in the
non-early return case) to be "for each item, iterate and compute a
value; if that value is non-zero return it. If not found, return zero".
The missing context above is important since, without it, we lose the
semantics.

If I'm understanding the above correctly then, studying this function in
a vacuum, it would make sense to assign a default value of 0 since if
the for operates on an empty list, the function should consider the item
vacuously not found and we should return 0 as a result.

This was the type of analysis I applied to the other changes. I'll admit
that I studied the other functions in a vacuum as well since these
seemed to be superficial warnings (since they aren't triggered with
-O{0,2}) which indicated to me that the code was correct except for some
"cosmetic" errors. I dunno, perhaps this is my inexperience showing
through.

Thanks,

Denton

P.S. Do we want to quash the -O3 warnings as well?

> -Peff
Jeff King April 4, 2020, 2:21 p.m. UTC | #7
On Sat, Apr 04, 2020 at 08:07:43AM -0400, Denton Liu wrote:

> >   for_each_string_list_item(...) {
> >     ret = for_each_fullref_in(...);
> 
> This loop is missing a bit of important context:
> 
> 	if (ret)
> 		break;
> 

Yes, I omitted it because it's not relevant to whether ret is
initialized or not (i.e., if we enter the loop then it always is.
But I think it is to the argument you make below.

> > Your patch silences it, but is it doing the right thing? It sets "ret =
> > 0", but we haven't actually iterated anything. Should it be an error
> > instead?
> 
> I understood the semantics of for_each_fullref_in_pattern() (in the
> non-early return case) to be "for each item, iterate and compute a
> value; if that value is non-zero return it. If not found, return zero".
> The missing context above is important since, without it, we lose the
> semantics.
> 
> If I'm understanding the above correctly then, studying this function in
> a vacuum, it would make sense to assign a default value of 0 since if
> the for operates on an empty list, the function should consider the item
> vacuously not found and we should return 0 as a result.

I think the break on "ret" is better understood as "if we saw an error,
return it; otherwise keep going".

If we were given zero patterns, is that a noop success, or is it an
error? This should never happen because we cover that case earlier, so
it would be a bug in find_longest_prefixes. Perhaps:

diff --git a/ref-filter.c b/ref-filter.c
index b1812cb69a..b358567663 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1964,6 +1964,8 @@ static int for_each_fullref_in_pattern(struct ref_filter *filter,
 	}
 
 	find_longest_prefixes(&prefixes, filter->name_patterns);
+	if (!prefixes.nr)
+		BUG("no common ref prefix?");
 
 	for_each_string_list_item(prefix, &prefixes) {
 		ret = for_each_fullref_in(prefix->string, cb, cb_data, broken);

is worth doing to document that assumption. Ideally that would then let
the compiler know that we'll always enter the loop, but it doesn't seem
to be quite the clever.

So I dunno. I think it probably doesn't matter between "0" and "-1",
because this case really shouldn't be reachable.

> This was the type of analysis I applied to the other changes. I'll admit
> that I studied the other functions in a vacuum as well since these
> seemed to be superficial warnings (since they aren't triggered with
> -O{0,2}) which indicated to me that the code was correct except for some
> "cosmetic" errors. I dunno, perhaps this is my inexperience showing
> through.

Yeah, the code is correct in this case, and I don't think the
uninitialized case can be reached. But the subtle linkage here is that
patterns[0] being non-NULL means that find_longest_prefixes() will never
return a zero-length list, which means we'll always enter that loop at
least once.

> P.S. Do we want to quash the -O3 warnings as well?

They're probably worth looking at. I've periodically swept through and
fixed them, as recently as last September (try "git log --grep=-O3").
But new ones seem to have cropped up. I'm not sure if they were
introduced in the code or if the compiler got smarter (or dumber).

Just skimming the output, I see:

  In function ‘ll_binary_merge’,
      inlined from ‘ll_xdl_merge’ at ll-merge.c:115:10,
      inlined from ‘ll_union_merge’ at ll-merge.c:151:9:
  ll-merge.c:74:4: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
     74 |    warning("Cannot merge binary files: %s (%s vs. %s)",
        |    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     75 |     path, name1, name2);
        |     ~~~~~~~~~~~~~~~~~~~

This one seems likely to be accurate (and just uncovered by more
aggressive inlining). The union_merge function passes in NULL filenames,
and probably could trigger this warning on binary input.

  trace2/tr2_dst.c: In function ‘tr2_dst_get_trace_fd.part.0’:
  trace2/tr2_dst.c:296:10: error: ‘fd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    296 |  dst->fd = fd;
        |  ~~~~~~~~^~~~
  trace2/tr2_dst.c:229:6: note: ‘fd’ was declared here
    229 |  int fd;
        |      ^~

This one looks like a false positive. The tr2 code does something
unusual for our code base: on error it returns errno. And I think the
compiler is not aware that errno would not be zero after a syscall
failure. It might be worth changing the return value to match our usual
pattern (return -1, and the caller can look in errno), which would
appease the compiler as a side effect.

  revision.c: In function ‘do_add_index_objects_to_pending’:
  revision.c:322:22: error: array subscript [1, 2147483647] is outside array bounds of ‘char[1]’ [-Werror=array-bounds]
    322 |   if (0 < len && name[len] && buf.len)
        |                  ~~~~^~~~~

This one is confusing. I imagine the char[1] is the empty string we
frequently pass to add_pending_object() and friends. And the "len" here
comes from interpret_branch_name(name, ...). So somehow the compiler
doesn't quite know that the length we'll return is going to be less than
or equal to the string length we pass in.

I don't know how to fix that aside from this, which isn't great (btw, it
looks like a lot of this code could stand to switch to using ssize_t
instead of int; there are probably weird errors if you managed to
somehow feed a 2GB branch name).

diff --git a/revision.c b/revision.c
index 8136929e23..9bc398bec1 100644
--- a/revision.c
+++ b/revision.c
@@ -317,9 +317,10 @@ static void add_pending_object_with_path(struct rev_info *revs,
 		revs->no_walk = 0;
 	if (revs->reflog_info && obj->type == OBJ_COMMIT) {
 		struct strbuf buf = STRBUF_INIT;
-		int len = interpret_branch_name(name, 0, &buf, 0);
+		size_t namelen = strlen(name);
+		int len = interpret_branch_name(name, namelen, &buf, 0);
 
-		if (0 < len && name[len] && buf.len)
+		if (len == namelen && buf.len)
 			strbuf_addstr(&buf, name + len);
 		add_reflog_for_walk(revs->reflog_info,
 				    (struct commit *)obj,

-Peff
diff mbox series

Patch

diff --git a/builtin/branch.c b/builtin/branch.c
index d8297f80ff..3669fba546 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -455,7 +455,7 @@  static void print_current_branch_name(void)
 {
 	int flags;
 	const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, &flags);
-	const char *shortname;
+	const char *shortname = NULL;
 	if (!refname)
 		die(_("could not resolve HEAD"));
 	else if (!(flags & REF_ISSYMREF))
diff --git a/diff.c b/diff.c
index f2cfbf2214..99a35774d7 100644
--- a/diff.c
+++ b/diff.c
@@ -3263,7 +3263,7 @@  static void emit_binary_diff_body(struct diff_options *o,
 	void *delta;
 	void *deflated;
 	void *data;
-	unsigned long orig_size;
+	unsigned long orig_size = 0;
 	unsigned long delta_size;
 	unsigned long deflate_size;
 	unsigned long data_size;
diff --git a/fast-import.c b/fast-import.c
index b8b65a801c..0509c0b92f 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1829,7 +1829,7 @@  static void parse_original_identifier(void)
 
 static int parse_data(struct strbuf *sb, uintmax_t limit, uintmax_t *len_res)
 {
-	const char *data;
+	const char *data = NULL;
 	strbuf_reset(sb);
 
 	if (!skip_prefix(command_buf.buf, "data ", &data))
@@ -2719,7 +2719,7 @@  static void parse_new_commit(const char *arg)
 static void parse_new_tag(const char *arg)
 {
 	static struct strbuf msg = STRBUF_INIT;
-	const char *from;
+	const char *from = NULL;
 	char *tagger;
 	struct branch *s;
 	struct tag *t;
diff --git a/http-backend.c b/http-backend.c
index ec3144b444..1091cdf2cd 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -252,7 +252,7 @@  static void http_config(void)
 
 static struct rpc_service *select_service(struct strbuf *hdr, const char *name)
 {
-	const char *svc_name;
+	const char *svc_name = NULL;
 	struct rpc_service *svc = NULL;
 	int i;
 
diff --git a/ref-filter.c b/ref-filter.c
index b1812cb69a..6abd48f81a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1938,7 +1938,7 @@  static int for_each_fullref_in_pattern(struct ref_filter *filter,
 {
 	struct string_list prefixes = STRING_LIST_INIT_DUP;
 	struct string_list_item *prefix;
-	int ret;
+	int ret = 0;
 
 	if (!filter->match_as_path) {
 		/*
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 4458a0f69c..f37c6d19a6 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -627,7 +627,7 @@  static struct snapshot *create_snapshot(struct packed_ref_store *refs)
 
 	/* If the file has a header line, process it: */
 	if (snapshot->buf < snapshot->eof && *snapshot->buf == '#') {
-		char *tmp, *p, *eol;
+		char *tmp, *p = NULL, *eol;
 		struct string_list traits = STRING_LIST_INIT_NODUP;
 
 		eol = memchr(snapshot->buf, '\n',
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 799fc00aa1..d82dcb83a3 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -19,7 +19,7 @@  static unsigned int arg_flags(const char *arg, const char *name)
 
 static const char **get_store(const char **argv, struct ref_store **refs)
 {
-	const char *gitdir;
+	const char *gitdir = NULL;
 
 	if (!argv[0]) {
 		die("ref store required");
diff --git a/trace2/tr2_dst.c b/trace2/tr2_dst.c
index ae052a07fe..910301a53d 100644
--- a/trace2/tr2_dst.c
+++ b/trace2/tr2_dst.c
@@ -227,7 +227,7 @@  static int tr2_dst_try_unix_domain_socket(struct tr2_dst *dst,
 {
 	unsigned int uds_try = 0;
 	int fd;
-	int e;
+	int e = 0;
 	const char *path = NULL;
 
 	/*
diff --git a/trailer.c b/trailer.c
index 0c414f2fed..cac9d0a4d9 100644
--- a/trailer.c
+++ b/trailer.c
@@ -507,7 +507,7 @@  static int git_trailer_config(const char *conf_key, const char *value, void *cb)
 	struct arg_item *item;
 	struct conf_info *conf;
 	char *name = NULL;
-	enum trailer_info_type type;
+	enum trailer_info_type type = -1;
 	int i;
 
 	if (!skip_prefix(conf_key, "trailer.", &trailer_item))