diff mbox series

bisect.c: remove unused includes

Message ID 20220331194436.58005-1-garrit@slashdev.space (mailing list archive)
State New, archived
Headers show
Series bisect.c: remove unused includes | expand

Commit Message

Garrit Franke March 31, 2022, 7:44 p.m. UTC
Clean up includes no longer needed by bisect.c.

Signed-off-by: Garrit Franke <garrit@slashdev.space>
---
 bisect.c | 9 ---------
 1 file changed, 9 deletions(-)

Comments

Junio C Hamano March 31, 2022, 9:29 p.m. UTC | #1
Garrit Franke <garrit@slashdev.space> writes:

> Clean up includes no longer needed by bisect.c.
>
> Signed-off-by: Garrit Franke <garrit@slashdev.space>
> ---
>  bisect.c | 9 ---------
>  1 file changed, 9 deletions(-)
>
> diff --git a/bisect.c b/bisect.c
> index 9e6a2b7f20..e07e2d215d 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -1,21 +1,12 @@
> -#include "cache.h"

cf. Documentation/CodingGuidelines

The first #include must be <git-compat-util.h>, or <cache.h> or
<builtin.h>, which are well known to include <git-compat-util.h>
first.

Including <git-compat-util.h> indirectly by <config.h> ->
<hashmap.h> -> <hash.h> -> <git-compat-util.h> does not count.

>  #include "config.h"
> -#include "commit.h"

Other headers may indirectly include <commit.h> as their
implementation detail, but what matters is that *we* in this source
file use what <commit.h> gives us ourselves, like the concrete shape
of "struct commit_list".  This change is not wanted.

I'll stop here.  There may be truly leftover "unused" includes among
those removed by the remainder of this patch, but I suspect that
some are like <commit.h> above, i.e. we directly use it, and because
we do not want to be broken by some header file's implementation
detail changing, we MUST include it ourselves.

I think this should give us a useful guideline to sift through the
rest, and an updated patch to remove truly unused ones are very much
welcome.  We may actually find some we are not directly including
ourselves but we should (e.g. I do not see <string-list.h> included
by us, but we clearly use structures and functions declared there,
and probably is depending, wrongly, on some header file we include
happens to indirectly include it).

Thanks.

> -#include "diff.h"
> -#include "revision.h"
>  #include "refs.h"
>  #include "list-objects.h"
>  #include "quote.h"
> -#include "hash-lookup.h"
>  #include "run-command.h"
>  #include "log-tree.h"
>  #include "bisect.h"
> -#include "oid-array.h"
> -#include "strvec.h"
> -#include "commit-slab.h"
>  #include "commit-reach.h"
>  #include "object-store.h"
> -#include "dir.h"
>  
>  static struct oid_array good_revs;
>  static struct oid_array skipped_revs;
Ævar Arnfjörð Bjarmason April 1, 2022, 8:07 a.m. UTC | #2
On Thu, Mar 31 2022, Junio C Hamano wrote:

[Changed $subject to make this easier to find]

> Garrit Franke <garrit@slashdev.space> writes:
>
>> Clean up includes no longer needed by bisect.c.
>>
>> Signed-off-by: Garrit Franke <garrit@slashdev.space>
>> ---
>>  bisect.c | 9 ---------
>>  1 file changed, 9 deletions(-)
>>
>> diff --git a/bisect.c b/bisect.c
>> index 9e6a2b7f20..e07e2d215d 100644
>> --- a/bisect.c
>> +++ b/bisect.c
>> @@ -1,21 +1,12 @@
>> -#include "cache.h"
>
> cf. Documentation/CodingGuidelines
>
> The first #include must be <git-compat-util.h>, or <cache.h> or
> <builtin.h>, which are well known to include <git-compat-util.h>
> first.
>
> Including <git-compat-util.h> indirectly by <config.h> ->
> <hashmap.h> -> <hash.h> -> <git-compat-util.h> does not count.

Also: Some built-ins don't include builtin.h as they should, a fix (or
even basic CI check) for that would be most welcome.

	git grep -C2 -n -F -e builtin.h -e cache.h -e git-compat-util.h -- builtin

I.e. we have this saying a lot of those are redundant:
	
	Documentation/CodingGuidelines- - The first #include in C files, except in platform specific compat/
	Documentation/CodingGuidelines-   implementations, must be either "git-compat-util.h", "cache.h" or
	Documentation/CodingGuidelines:   "builtin.h".  You do not have to include more than one of these.

But maybe it's not worth it, anyway...

>>  #include "config.h"
>> -#include "commit.h"
>
> Other headers may indirectly include <commit.h> as their
> implementation detail, but what matters is that *we* in this source
> file use what <commit.h> gives us ourselves, like the concrete shape
> of "struct commit_list".  This change is not wanted.
>
> I'll stop here.  There may be truly leftover "unused" includes among
> those removed by the remainder of this patch, but I suspect that
> some are like <commit.h> above, i.e. we directly use it, and because
> we do not want to be broken by some header file's implementation
> detail changing, we MUST include it ourselves.
>
> I think this should give us a useful guideline to sift through the
> rest, and an updated patch to remove truly unused ones are very much
> welcome.  We may actually find some we are not directly including
> ourselves but we should (e.g. I do not see <string-list.h> included
> by us, but we clearly use structures and functions declared there,
> and probably is depending, wrongly, on some header file we include
> happens to indirectly include it).

... For anyone interested in pursuing this, I think using the excellent
include-what-you-use tool would be a nice start.

We could even eventually add it to our CI if the false positive rate
isn't bad (I haven't checked much):
https://github.com/include-what-you-use/include-what-you-use

E.g. in this case (I manually omitted the rest of the output, there's
probably a iwyu option to omit it, but I didn't see how do that from
skimming the docs):

	$ sudo apt install iwyu # YMMV
	$ make bisect.o CC=include-what-you-use CFLAGS="-Xiwyu --verbose=1" 2>&1 | grep -v -E -e '^#include <' -e '^#include "(cache|git-compat-util|gettext)\.h"'
	    CC bisect.o
	
	(bisect.h has correct #includes/fwd-decls)
	
	bisect.c should add these lines:
	#include "hash.h"              // for oideq, object_id, oidcmp, oidcpy, GIT_...
	#include "object.h"            // for object, repo_clear_commit_marks
	#include "path.h"              // for GIT_PATH_FUNC, git_pathdup
	#include "pretty.h"            // for CMIT_FMT_UNSPECIFIED, format_commit_me...
	#include "repository.h"        // for repository (ptr only), the_repository
	#include "strbuf.h"            // for strbuf_release, strbuf, strbuf_getline_lf
	#include "string-list.h"       // for string_list_append, string_list_clear
	
	bisect.c should remove these lines:
	- #include "hash-lookup.h"  // lines 9-9
	- struct commit_weight;  // lines 76-76

Then if I patch it as:
	
	diff --git a/bisect.c b/bisect.c
	index 9e6a2b7f201..512430e3cc8 100644
	--- a/bisect.c
	+++ b/bisect.c
	@@ -6,7 +6,6 @@
	 #include "refs.h"
	 #include "list-objects.h"
	 #include "quote.h"
	-#include "hash-lookup.h"
	 #include "run-command.h"
	 #include "log-tree.h"
	 #include "bisect.h"
	@@ -16,6 +15,13 @@
	 #include "commit-reach.h"
	 #include "object-store.h"
	 #include "dir.h"
	+#include "hash.h"
	+#include "object.h"
	+#include "path.h"
	+#include "pretty.h"
	+#include "repository.h"
	+#include "strbuf.h"
	+#include "string-list.h"
	 
	 static struct oid_array good_revs;
	 static struct oid_array skipped_revs;

It's happier, but probably needs to be told to ignore define_commit_slab() somehow:

	$ make bisect.o CC=include-what-you-use CFLAGS="-Xiwyu --verbose=1" 2>&1 | grep -v -E -e '^#include <' -e '^#include "(cache|git-compat-util|gettext)\.h"'
	    CC bisect.o
	
	(bisect.h has correct #includes/fwd-decls)
	
	bisect.c should add these lines:
	
	bisect.c should remove these lines:
	- struct commit_weight;  // lines 82-82

That still needs to be massaged a bit, e.g. we should probably omit
hash.h and anything else in cache.h and git-compat-util.h.

Or maybe not & we should make those headers even lighter. It is rather
annoying that changing some of those things leads to a complete
re-build, but there's a trade-off there where we probably want things
like gettext.h and other used-almost-everywhere headers in included by
those.

So take all the above with a huge grain of salt. I haven't used iwyu
much, but it seems to be something that'll help us go in the direction
Junio noted above.

I think starting with:

	make -k git-objs <the CC etc. params above>

And tackling the "should remove these lines" issues first would be a
good start, e.g. for serve.c it says:
	
	serve.c should remove these lines:
	- #include "cache.h"  // lines 1-1
	- #include "strvec.h"  // lines 6-6

We don't want that first one, but it's right about the second one. It's
been orphaned since f0a35c9ce52 (serve: drop "keys" strvec, 2021-09-15),
I skimmed some of the rist and they all seem like good
suggestions. E.g. lockfile.h for builtin/apply.c, which isn't needed
since 6d058c88264 (apply: move lockfile into `apply_state`, 2017-10-05).
Ævar Arnfjörð Bjarmason April 6, 2022, 7:54 a.m. UTC | #3
On Tue, Apr 05 2022, Garrit Franke wrote:

> On 01.04.22 10:07, Ævar Arnfjörð Bjarmason wrote:

Aside: I don't think I've ever seen encoded quoted-printable go quite so
bad so fast. That went from =C3=86var to =C3=83=C6=92=C3=A2=E2=82=AC in
one reply. Whatever your E-Mail is doing with encodings seems to be
taking multiple passes through misencodings :)

Don't worry about getting the name "right" or whatever, I'm amused by
the encoding issue...

>> ... For anyone interested in pursuing this, I think using the excellent
>> include-what-you-use tool would be a nice start.
>>
>> We could even eventually add it to our CI if the false positive rate
>> isn't bad (I haven't checked much):
>> https://github.com/include-what-you-use/include-what-you-use
>
> This seems to be a really nice tool indeed. I wouldn't be comfortable
> adding it to the CI just yet, but it did make it considerably easier to
> spot includes that could safely be removed.

Re the reply I had on 1/4 I think it's probably best to drop that in its
current form, but the fixes themselves (perhaps with a re-roll for nits
I posted in reply) seem good.

I was really hoping though that if someone wanted to pursue this a bit
more we'd get to the point of being able to run "make all test" on a
source tree that iwyu would munge with all its suggestions, and then see
if it outright failed to compile, or whether it would e.g. have faster
compilation speed (or not..).

> I think we could try battle-testing this tool in the codebase to get a
> sense of how it behaves. To start, I added your reference-command to a
> script under "contrib/iwyu" and ran it against the files you noted.
> Before breaking a bulk of the files, I wanted to make sure that this
> undertaking is headed in the right direction.

Even if the patches aren't sent in making the actual changes a one-off
script to e.g. wrap the fix_includes.py script I mentioned would be very
interesting.

We could then even run that in CI with relatively little setup,
i.e. checkout <rev>, do munging, then compile.
diff mbox series

Patch

diff --git a/bisect.c b/bisect.c
index 9e6a2b7f20..e07e2d215d 100644
--- a/bisect.c
+++ b/bisect.c
@@ -1,21 +1,12 @@ 
-#include "cache.h"
 #include "config.h"
-#include "commit.h"
-#include "diff.h"
-#include "revision.h"
 #include "refs.h"
 #include "list-objects.h"
 #include "quote.h"
-#include "hash-lookup.h"
 #include "run-command.h"
 #include "log-tree.h"
 #include "bisect.h"
-#include "oid-array.h"
-#include "strvec.h"
-#include "commit-slab.h"
 #include "commit-reach.h"
 #include "object-store.h"
-#include "dir.h"
 
 static struct oid_array good_revs;
 static struct oid_array skipped_revs;