diff mbox series

[2/3] ci: add build checking for side-effects in assert() calls

Message ID 80dcc2ba3aa0ef72abe18f8525d571ea39ac6382.1741911652.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Add a static analysis job to prevent assertions with side effects | expand

Commit Message

Elijah Newren March 14, 2025, 12:20 a.m. UTC
From: Elijah Newren <newren@gmail.com>

It is a big no-no to have side-effects in an assertion, because if the
assert() is compiled out, you don't get that side-effect, leading to the
code behaving differently.  That can be a large headache to debug.

We have roughly 566 assert() calls in our codebase (my grep might have
picked up things that aren't actually assert() calls, but most appeared
to be).  All but 9 of them can be determined by gcc to be free of side
effects with a clever redefine of assert().  The current 9 appear to be
free of side effects to me as well, but are too complicated for a
compiler/linker to figure that since each assertion involves some kind
of function call.  Add a CI job which will find and report these
possibly problematic assertions, and have the job suggest to the user
that they replace these with BUG_IF_NOT() calls.

Example output from running:

```
ERROR: The compiler could not verify the following assert()
       calls are free of side-effects.  Please replace with
       BUG_IF_NOT() calls.
/home/newren/floss/git/diffcore-rename.c:1409
	assert(!dir_rename_count || strmap_empty(dir_rename_count));
/home/newren/floss/git/merge-ort.c:1645
			assert(renames->deferred[side].trivial_merges_okay &&
			       !strset_contains(&renames->deferred[side].target_dirs,
						path));
/home/newren/floss/git/merge-ort.c:794
	assert(omittable_hint ==
	       (!starts_with(type_short_descriptions[type], "CONFLICT") &&
		!starts_with(type_short_descriptions[type], "ERROR")) ||
	       type == CONFLICT_DIR_RENAME_SUGGESTED);
/home/newren/floss/git/merge-recursive.c:1200
	assert(!merge_remote_util(commit));
/home/newren/floss/git/object-file.c:2709
	assert(would_convert_to_git_filter_fd(istate, path));
/home/newren/floss/git/parallel-checkout.c:280
	assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca));
/home/newren/floss/git/scalar.c:244
	assert(have_fsmonitor_support());
/home/newren/floss/git/scalar.c:254
	assert(have_fsmonitor_support());
/home/newren/floss/git/sequencer.c:4968
		assert(!(opts->signoff || opts->no_commit ||
			 opts->record_origin || should_edit(opts) ||
			 opts->committer_date_is_author_date ||
			 opts->ignore_date));
```

Note that if there are possibly problematic assertions, not necessarily
all of them will be shown in a single run, because the compiler errors
may include something like "ld: ... more undefined references to
`not_supposed_to_survive' follow" instead of listing each individually.
But in such cases, once you clean up a few that are shown in your first
run, subsequent runs will show (some of) the ones that remain, allowing
you to iteratively remove them all.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Makefile                      |  4 ++++
 ci/check-unsafe-assertions.sh | 18 ++++++++++++++++++
 ci/run-static-analysis.sh     |  2 ++
 git-compat-util.h             |  6 ++++++
 4 files changed, 30 insertions(+)
 create mode 100755 ci/check-unsafe-assertions.sh

Comments

Junio C Hamano March 14, 2025, 1:06 a.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +make CHECK_ASSERTION_SIDE_EFFECTS=1 >compiler_output 2>compiler_error
> +if test $? != 0
> +then
> +    echo "ERROR: The compiler could not verify the following assert()" >&2
> +    echo "       calls are free of side-effects.  Please replace with" >&2
> +    echo "       BUG_IF_NOT() calls." >&2
> +    grep undefined.reference.to..not_supposed_to_survive compiler_error \
> +      | sed -e s/:[^:]*$// | sort | uniq | tr ':' ' ' \
> +      | while read f l

A few style guides:

 - doing multiple echo into the same descriptor is easier to read if
   you have redirection near the beginning, i.e.

	echo >&2 "message one"
	echo >&2 "message two that may way be longer than the previous"
        ehco >&2 "message three"

 - multi-line pipelines are easier to follow without backslash by
   ending the previous line with a '|', i.e.

	grep ... file |
	sed -e ... |
	while read file line
	do
		...

 - I thought our "one indent one tab" standard extends to shell
   scripts as well?

> diff --git a/git-compat-util.h b/git-compat-util.h
> index c3415ad7e0a..0aefd763751 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -1584,4 +1584,10 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset)
>  	((uintptr_t)&(ptr)->member - (uintptr_t)(ptr))
>  #endif /* !__GNUC__ */
>  
> +#ifdef CHECK_ASSERTION_SIDE_EFFECTS
> +#undef assert
> +extern int not_supposed_to_survive;
> +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
> +#endif /* CHECK_ASSERTION_SIDE_EFFECTS */

Cute.  As this checking assert is in void context, the optimizing
compiler knows that the entire thing can be optimized away ONLY IF
it can somehow prove that (expr) has no side effect.  And if it does
not optimize it away, you will hit an error from the linker, saying
that the undefined variable is being used.

This requires a fairly good optimizing compiler that can peek into
(as in "inline") what is in expr to notice, so it cannot be free of
false positive, but at least when the optimization works as expected,
it is provably (modulo optimizer bugs) side-effect free.

Is this something we can use in our project?  I am just double
checking.

Thanks.
brian m. carlson March 14, 2025, 1:18 a.m. UTC | #2
On 2025-03-14 at 01:06:21, Junio C Hamano wrote:
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > diff --git a/git-compat-util.h b/git-compat-util.h
> > index c3415ad7e0a..0aefd763751 100644
> > --- a/git-compat-util.h
> > +++ b/git-compat-util.h
> > @@ -1584,4 +1584,10 @@ static inline void *container_of_or_null_offset(void *ptr, size_t offset)
> >  	((uintptr_t)&(ptr)->member - (uintptr_t)(ptr))
> >  #endif /* !__GNUC__ */
> >  
> > +#ifdef CHECK_ASSERTION_SIDE_EFFECTS
> > +#undef assert
> > +extern int not_supposed_to_survive;
> > +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
> > +#endif /* CHECK_ASSERTION_SIDE_EFFECTS */
> 
> Cute.  As this checking assert is in void context, the optimizing
> compiler knows that the entire thing can be optimized away ONLY IF
> it can somehow prove that (expr) has no side effect.  And if it does
> not optimize it away, you will hit an error from the linker, saying
> that the undefined variable is being used.

I agree this is very clever.

> This requires a fairly good optimizing compiler that can peek into
> (as in "inline") what is in expr to notice, so it cannot be free of
> false positive, but at least when the optimization works as expected,
> it is provably (modulo optimizer bugs) side-effect free.
> 
> Is this something we can use in our project?  I am just double
> checking.

I believe it's valid in C99.  Certainly some compiler might be bad at
optimizing, or a user may have compiled with -O0, but this is run in CI,
where we have known good compilers and can control the optimization
flags.  I doubt GCC, Clang, or MSVC will have problems here, and since
this is not on by default, users using something less capable (the Tiny
C Compiler, maybe?) or a vendor compiler won't even see it.

Was there some other case that you were concerned about?
Junio C Hamano March 14, 2025, 1:20 a.m. UTC | #3
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

>> Is this something we can use in our project?  I am just double
>> checking.
>
> I believe it's valid in C99.  Certainly some compiler might be bad at
> optimizing, or a user may have compiled with -O0, but this is run in CI,
> where we have known good compilers and can control the optimization
> flags.  I doubt GCC, Clang, or MSVC will have problems here, and since
> this is not on by default, users using something less capable (the Tiny
> C Compiler, maybe?) or a vendor compiler won't even see it.
>
> Was there some other case that you were concerned about?

Licensing, mostly, as clever things we see are not necessarily home
grown.  I know the patch came with DCO sign-off, but it does not
hurt to double check.
Elijah Newren March 14, 2025, 1:27 a.m. UTC | #4
On Thu, Mar 13, 2025 at 6:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>
> >> Is this something we can use in our project?  I am just double
> >> checking.
> >
> > I believe it's valid in C99.  Certainly some compiler might be bad at
> > optimizing, or a user may have compiled with -O0, but this is run in CI,
> > where we have known good compilers and can control the optimization
> > flags.  I doubt GCC, Clang, or MSVC will have problems here, and since
> > this is not on by default, users using something less capable (the Tiny
> > C Compiler, maybe?) or a vendor compiler won't even see it.
> >
> > Was there some other case that you were concerned about?
>
> Licensing, mostly, as clever things we see are not necessarily home
> grown.  I know the patch came with DCO sign-off, but it does not
> hurt to double check.

These two lines:

> +extern int not_supposed_to_survive;
> +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))

, which serve as the core trick, I had used elsewhere before.  Doing
some searches, it looks like those likely came from
https://stackoverflow.com/questions/10593492/catching-assert-with-side-effects.
And it appears that StackOverflow is CC-BY-SA-3.0 or -4.0.  Doh,
sorry.  Anyone got a clever alternative?

The rest of the patch was all written by me.
Junio C Hamano March 14, 2025, 5:29 p.m. UTC | #5
Elijah Newren <newren@gmail.com> writes:

>> Licensing, mostly, as clever things we see are not necessarily home
>> grown.  I know the patch came with DCO sign-off, but it does not
>> hurt to double check.
>
> These two lines:
>
>> +extern int not_supposed_to_survive;
>> +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
>
> , which serve as the core trick, I had used elsewhere before.

It may be arguable that it is too small to be copyrightable and
there is no other way to express the idea behind that check, but
in any case ...

> Anyone got a clever alternative?

... as I cannot unsee your patch, I cannot be the one who comes up
with a clever alternative, if we are worried about licensing with
what you posted X-<.
Elijah Newren March 16, 2025, 6:38 a.m. UTC | #6
On Fri, Mar 14, 2025 at 10:29 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> Licensing, mostly, as clever things we see are not necessarily home
> >> grown.  I know the patch came with DCO sign-off, but it does not
> >> hurt to double check.
> >
> > These two lines:
> >
> >> +extern int not_supposed_to_survive;
> >> +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
> >
> > , which serve as the core trick, I had used elsewhere before.
>
> It may be arguable that it is too small to be copyrightable and
> there is no other way to express the idea behind that check, but
> in any case ...

That's what I had been assuming, but then you, brian, and Taylor all
pointed out how clever it was making me think otherwise.

> > Anyone got a clever alternative?
>
> ... as I cannot unsee your patch, I cannot be the one who comes up
> with a clever alternative, if we are worried about licensing with
> what you posted X-<.

Turns out we don't need an alternative.  I contacted the author, who
responded and placed the two-liner into the public domain with no
warranty of any kind.  I'll send a re-roll with an updated commit
message.
Elijah Newren March 17, 2025, 3:45 p.m. UTC | #7
On Sat, Mar 15, 2025 at 11:38 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Fri, Mar 14, 2025 at 10:29 AM Junio C Hamano <gitster@pobox.com> wrote:
> >
> > Elijah Newren <newren@gmail.com> writes:
> >
> > >> Licensing, mostly, as clever things we see are not necessarily home
> > >> grown.  I know the patch came with DCO sign-off, but it does not
> > >> hurt to double check.
> > >
> > > These two lines:
> > >
> > >> +extern int not_supposed_to_survive;
> > >> +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
> > >
> > > , which serve as the core trick, I had used elsewhere before.
> >
> > It may be arguable that it is too small to be copyrightable and
> > there is no other way to express the idea behind that check, but
> > in any case ...
>
> That's what I had been assuming, but then you, brian, and Taylor all
> pointed out how clever it was making me think otherwise.
>
> > > Anyone got a clever alternative?
> >
> > ... as I cannot unsee your patch, I cannot be the one who comes up
> > with a clever alternative, if we are worried about licensing with
> > what you posted X-<.
>
> Turns out we don't need an alternative.  I contacted the author, who
> responded and placed the two-liner into the public domain with no
> warranty of any kind.  I'll send a re-roll with an updated commit
> message.

And of course, in my excitement to send a re-roll addressing that
issue, I totally spaced fixing the style issues you pointed out.  I'll
wait a couple days for any comments and then send a re-roll with those
fixes.
Junio C Hamano March 17, 2025, 10:27 p.m. UTC | #8
Elijah Newren <newren@gmail.com> writes:

>> >> +extern int not_supposed_to_survive;
>> >> +#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
>> >
>> > , which serve as the core trick, I had used elsewhere before.
>>
>> It may be arguable that it is too small to be copyrightable and
>> there is no other way to express the idea behind that check, but
>> in any case ...
>
> That's what I had been assuming, but then you, brian, and Taylor all
> pointed out how clever it was making me think otherwise.

Heh, cleverness lies in the idea, not the expression, and copyright
is about expression.

> Turns out we don't need an alternative.  I contacted the author, who
> responded and placed the two-liner into the public domain with no
> warranty of any kind.  I'll send a re-roll with an updated commit
> message.

Wonderful.  That is the best solution.

Thanks.
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index 7315507381e..57774912f18 100644
--- a/Makefile
+++ b/Makefile
@@ -2261,6 +2261,10 @@  ifdef WITH_BREAKING_CHANGES
 	BASIC_CFLAGS += -DWITH_BREAKING_CHANGES
 endif
 
+ifdef CHECK_ASSERTION_SIDE_EFFECTS
+	BASIC_CFLAGS += -DCHECK_ASSERTION_SIDE_EFFECTS
+endif
+
 ifdef INCLUDE_LIBGIT_RS
 	# Enable symbol hiding in contrib/libgit-sys/libgitpub.a without making
 	# us rebuild the whole tree every time we run a Rust build.
diff --git a/ci/check-unsafe-assertions.sh b/ci/check-unsafe-assertions.sh
new file mode 100755
index 00000000000..d66091efd22
--- /dev/null
+++ b/ci/check-unsafe-assertions.sh
@@ -0,0 +1,18 @@ 
+#!/bin/sh
+
+make CHECK_ASSERTION_SIDE_EFFECTS=1 >compiler_output 2>compiler_error
+if test $? != 0
+then
+    echo "ERROR: The compiler could not verify the following assert()" >&2
+    echo "       calls are free of side-effects.  Please replace with" >&2
+    echo "       BUG_IF_NOT() calls." >&2
+    grep undefined.reference.to..not_supposed_to_survive compiler_error \
+      | sed -e s/:[^:]*$// | sort | uniq | tr ':' ' ' \
+      | while read f l
+      do
+	printf "${f}:${l}\n  "
+	awk -v start="$l" 'NR >= start { print; if (/\);/) exit }' $f
+      done
+    exit 1
+fi
+rm compiler_output compiler_error
diff --git a/ci/run-static-analysis.sh b/ci/run-static-analysis.sh
index 0d51e5ce0e7..ae714e020ae 100755
--- a/ci/run-static-analysis.sh
+++ b/ci/run-static-analysis.sh
@@ -31,4 +31,6 @@  exit 1
 
 make check-pot
 
+${0%/*}/check-unsafe-assertions.sh
+
 save_good_tree
diff --git a/git-compat-util.h b/git-compat-util.h
index c3415ad7e0a..0aefd763751 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1584,4 +1584,10 @@  static inline void *container_of_or_null_offset(void *ptr, size_t offset)
 	((uintptr_t)&(ptr)->member - (uintptr_t)(ptr))
 #endif /* !__GNUC__ */
 
+#ifdef CHECK_ASSERTION_SIDE_EFFECTS
+#undef assert
+extern int not_supposed_to_survive;
+#define assert(expr) ((void)(not_supposed_to_survive || (expr)))
+#endif /* CHECK_ASSERTION_SIDE_EFFECTS */
+
 #endif