Message ID | 3bca3239cb84488a1638f2bb269392f47f78c6da.1616414951.git.avarab@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Makefile/coccicheck: fix bugs and speed it up | expand |
Am 22.03.21 um 13:11 schrieb Ævar Arnfjörð Bjarmason: > Change the coccicheck target to run on all of our *.c and *.h files > with --include-headers-for-types, instead of trusting it to find *.h > files and other includes to modify from its recursive walking of > includes as it has been doing with only --all-includes. > > The --all-includes option introduced in a9a884aea57 (coccicheck: use > --all-includes by default, 2016-09-30) is needed because we have rules > that e.g. use the "type T" construct that wouldn't match unless we > scoured our headers for the definition of the relevant type. > > But combining --all-includes it with processing N files at a time with > xargs as we've done since 960154b9c17 (coccicheck: optionally batch > spatch invocations, 2019-05-06) introduced a subtle bug with duplicate > hunks being written to the generated *.patch files. I did not dig down > to the root cause of that, but I think it's due to spatch doing (and > failing to do) some magical locking/racy mtime checking to decide if > it emits a hunk. See [1] for a way to reproduce the issue, and a > discussion of it. > > This change speeds up the runtime of "make -j8 coccicheck" from ~1m50s > to ~1m20s for me. See [2] for more timings. Without this patch, /usr/bin/time -l reports: 95.08 real 598.29 user 32.62 sys 103727104 maximum resident set size With --include-headers-for-types, I get this: 76.37 real 483.83 user 26.77 sys 97107968 maximum resident set size This is similar to your numbers. And adding that option to the demo script in your [1] gives consistent results for all xargs -n values, so that option alone fixes the subtle bug you refer to above, right? However, with the second part of this patch also applied (adding %.h to FOUND_C_SOURCES), I get this: 90.38 real 558.38 user 38.32 sys 100073472 maximum resident set size Is this still necessary? > > We could also use --no-includes for a runtime of ~55s, but that would > produce broken patches (we miss some hunks) in cases where we need the > type or other definitions from included files. > > If anyone cares there's probably an optimization opportunity here to > e.g. detect that the whole file doesn't need to consider includes, > since the rules would be unambiguous without considering them. > > Note on portability: The --include-headers-for-types option is not in > my "man spatch", but it's been part of spatch since 2014. See its > fe3a327a (include headers for types option, 2014-07-27), it was first > released with version 1.0.0 of spatch, released in April 2015. The > spatch developers are just inconsistent about updating their TeX docs > and manpages at the same time. > > 1. https://lore.kernel.org/git/20210305170724.23859-3-avarab@gmail.com/ > 2. https://lore.kernel.org/git/20210306192525.15197-1-avarab@gmail.com/ > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > Makefile | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/Makefile b/Makefile > index eef99b4705d..e43a9618df5 100644 > --- a/Makefile > +++ b/Makefile > @@ -1199,7 +1199,8 @@ SPARSE_FLAGS ?= > SP_EXTRA_FLAGS = -Wno-universal-initializer > > # For the 'coccicheck' target > -SPATCH_FLAGS = --all-includes --patch . > +SPATCH_FLAGS = --all-includes --include-headers-for-types --patch . > + > # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will > # usually result in less CPU usage at the cost of higher peak memory. > # Setting it to 0 will feed all files in a single spatch invocation. > @@ -2860,7 +2861,7 @@ check: config-list.h command-list.h > exit 1; \ > fi > > -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES))) > +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES))) > COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES)) > > %.cocci.patch: %.cocci $(COCCI_SOURCES) >
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Change the coccicheck target to run on all of our *.c and *.h files > with --include-headers-for-types, instead of trusting it to find *.h > files and other includes to modify from its recursive walking of > includes as it has been doing with only --all-includes. Meaning '--all-includes' that is fed a C source would use all the headers included (recursively) in it, but if we add the other option, --include-headers-for-types, some *.h files are missed? If so, the above explains both hunks in the patch well (although it is unclear where that need to include *.h independently comes from, e.g. if it is working around a bug in spatch that we may expect for it to be fixed someday). > diff --git a/Makefile b/Makefile > index eef99b4705d..e43a9618df5 100644 > --- a/Makefile > +++ b/Makefile > @@ -1199,7 +1199,8 @@ SPARSE_FLAGS ?= > SP_EXTRA_FLAGS = -Wno-universal-initializer > > # For the 'coccicheck' target > -SPATCH_FLAGS = --all-includes --patch . > +SPATCH_FLAGS = --all-includes --include-headers-for-types --patch . > + > # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will > # usually result in less CPU usage at the cost of higher peak memory. > # Setting it to 0 will feed all files in a single spatch invocation. > @@ -2860,7 +2861,7 @@ check: config-list.h command-list.h > exit 1; \ > fi > > -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES))) > +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES))) > COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES)) > > %.cocci.patch: %.cocci $(COCCI_SOURCES)
On Mon, Mar 22, 2021 at 07:05:06PM +0100, René Scharfe. wrote: > > This change speeds up the runtime of "make -j8 coccicheck" from ~1m50s > > to ~1m20s for me. See [2] for more timings. > > Without this patch, /usr/bin/time -l reports: > > 95.08 real 598.29 user 32.62 sys > 103727104 maximum resident set size > > With --include-headers-for-types, I get this: > > 76.37 real 483.83 user 26.77 sys > 97107968 maximum resident set size > > This is similar to your numbers. And adding that option to the demo > script in your [1] gives consistent results for all xargs -n values, > so that option alone fixes the subtle bug you refer to above, right? > > However, with the second part of this patch also applied (adding %.h > to FOUND_C_SOURCES), I get this: > > 90.38 real 558.38 user 38.32 sys > 100073472 maximum resident set size > > Is this still necessary? If I understand what the spatch options are doing (and I'm not at all sure that I do), then with --include-headers-for-types, aren't we avoiding looking for transformations in the included files? In which case not adding *.h to the list of files we pass means that we'd fail to find instances in the header files that need to be mentioned in the output. -Peff
diff --git a/Makefile b/Makefile index eef99b4705d..e43a9618df5 100644 --- a/Makefile +++ b/Makefile @@ -1199,7 +1199,8 @@ SPARSE_FLAGS ?= SP_EXTRA_FLAGS = -Wno-universal-initializer # For the 'coccicheck' target -SPATCH_FLAGS = --all-includes --patch . +SPATCH_FLAGS = --all-includes --include-headers-for-types --patch . + # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will # usually result in less CPU usage at the cost of higher peak memory. # Setting it to 0 will feed all files in a single spatch invocation. @@ -2860,7 +2861,7 @@ check: config-list.h command-list.h exit 1; \ fi -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES))) +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES))) COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES)) %.cocci.patch: %.cocci $(COCCI_SOURCES)
Change the coccicheck target to run on all of our *.c and *.h files with --include-headers-for-types, instead of trusting it to find *.h files and other includes to modify from its recursive walking of includes as it has been doing with only --all-includes. The --all-includes option introduced in a9a884aea57 (coccicheck: use --all-includes by default, 2016-09-30) is needed because we have rules that e.g. use the "type T" construct that wouldn't match unless we scoured our headers for the definition of the relevant type. But combining --all-includes it with processing N files at a time with xargs as we've done since 960154b9c17 (coccicheck: optionally batch spatch invocations, 2019-05-06) introduced a subtle bug with duplicate hunks being written to the generated *.patch files. I did not dig down to the root cause of that, but I think it's due to spatch doing (and failing to do) some magical locking/racy mtime checking to decide if it emits a hunk. See [1] for a way to reproduce the issue, and a discussion of it. This change speeds up the runtime of "make -j8 coccicheck" from ~1m50s to ~1m20s for me. See [2] for more timings. We could also use --no-includes for a runtime of ~55s, but that would produce broken patches (we miss some hunks) in cases where we need the type or other definitions from included files. If anyone cares there's probably an optimization opportunity here to e.g. detect that the whole file doesn't need to consider includes, since the rules would be unambiguous without considering them. Note on portability: The --include-headers-for-types option is not in my "man spatch", but it's been part of spatch since 2014. See its fe3a327a (include headers for types option, 2014-07-27), it was first released with version 1.0.0 of spatch, released in April 2015. The spatch developers are just inconsistent about updating their TeX docs and manpages at the same time. 1. https://lore.kernel.org/git/20210305170724.23859-3-avarab@gmail.com/ 2. https://lore.kernel.org/git/20210306192525.15197-1-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- Makefile | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)