
Makefile: fix bugs in coccicheck and speed it up

Message ID 20210302205103.12230-1-avarab@gmail.com (mailing list archive)
State New, archived

Commit Message

Ævar Arnfjörð Bjarmason March 2, 2021, 8:51 p.m. UTC
I've often wondered why "make coccicheck" takes so long. This change
speeds it up by ~2x and makes it use much less memory. Or a reduction
of a max of around ~2GB per-process (under the old
SPATCH_BATCH_SIZE=0) to around ~200MB.

Running the full "make coccicheck" now takes ~50 seconds with -j8 on
my machine, vs. ~2x of that before. I've got 64GB of memory on that
machine, or it would be much slower.

Why has it been so slow? Because I think we've always been running it
in entirely the wrong mode for what we wanted, and much of the
previous fixing of this target has involved re-arranging the deck
chairs on that particular Titanic.

What we really want to do with coccicheck is to do search/replacements
in all our *.c and *.h files. This is now what we do, and we'll
process a default of 64 files at a time.

What we were doing before was processing all our *.c files, and for
each of those *.c files we'd recursively look around for includes and
see if we needed to search/replace in those too.

That we did that dates back to [1] when we were only processing *.c
files, and it was always very redundant. We'd e.g. visit the likes of
strbuf.h lots of times since it's widely used as an include.

Then in the most recent attempt to optimize coccicheck in [2] this
anti-pattern finally turned into a bug.

Namely: before this change, if your coccicheck rule applied to
e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
hunks applying the exact same change, as concurrent spatch processes
invoked by xargs raced one another. In one instance I ended up with 27
copies of the same hunk in a strbuf.patch.
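The failure mode is just N unsynchronized writers all appending the
same hunk. A toy simulation with stand-in processes (hypothetical, not
spatch itself) shows the shape of it:

```shell
# Four concurrent "spatch" stand-ins each decide the same header needs
# the same fix and append an identical hunk to a shared patch file,
# with no locking between them.
patch_file=$(mktemp)
for i in 1 2 3 4
do
	{
		echo '@@ -1 +1 @@'
		echo '-old line in strbuf.h'
		echo '+new line in strbuf.h'
	} >>"$patch_file" &
done
wait
# Every worker appended its own copy, so we count 4 duplicate hunks:
grep -c '^@@' "$patch_file"
rm -f "$patch_file"
```

With 27 concurrent batches touching strbuf.h you get the 27 duplicate
hunks described above.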

Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
batch mitigated this. I suspect the author of [2] either mostly ran in
that mode, or didn't test on changes that impacted widely used header
files.

So since we're going to want to process all our *.c and *.h let's just
do that, and drop --all-includes for --no-includes. It's not spatch's
job to find our sources, we're doing that. If someone is manually
tweaking COCCI_SOURCES they can just tweak SPATCH_FLAGS too.

I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
"make coccicheck".

I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
this in a loop anymore (and xargs will early-exit) we can just rely on
standard V=1 for debugging issues.

1. a9a884aea5 (coccicheck: use --all-includes by default, 2016-09-30)
2. 960154b9c1 (coccicheck: optionally batch spatch invocations,
   2019-05-06)
3. f5c2bc2b96 (Makefile: detect errors in running spatch, 2017-03-10)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile | 41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

Comments

Denton Liu March 3, 2021, 9:43 a.m. UTC | #1
Hi Ævar,

On Tue, Mar 02, 2021 at 09:51:03PM +0100, Ævar Arnfjörð Bjarmason wrote:
> I've often wondered why "make coccicheck" takes so long. This change
> speeds it up by ~2x and makes it use much less memory. Or a reduction
> of a max of around ~2GB per-process (under the old
> SPATCH_BATCH_SIZE=0) to around ~200MB.
> 
> Running the full "make coccicheck" now takes ~50 seconds with -j8 on
> my machine, v.s. ~2x of that before. I've got 64GB of memory on that
> machine, or it would be much slower.
> 
> Why has it been so slow? Because I think we've always been running it
> in entirely the wrong mode for what we wanted, and much of the
> previous fixing of this target has involved re-arranging the deck
> chairs on that particular Titanic.
> 
> What we really want to do with coccicheck is to do search/replacements
> in all our *.c and *.h files. This is now what we do, and we'll
> process a default of 64 files at a time.
> 
> What we were doing before was processing all our *.c files, and for
> each of those *.c files we'd recursively look around for includes and
> see if we needed to search/replace in those too.
> 
> That we did that dates back to [1] when we were only processing *.c
> files, and it was always very redundant. We'd e.g. visit the likes of
> strbuf.h lots of times since it's widely used as an include.
> 
> Then in the most recent attempt to optimize coccicheck in [2] this
> anti-pattern finally turned into a bug.
> 
> Namely: before this change, if your coccicheck rule applied to
> e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
> hunks applying the exact same change, as concurrent spatch processes
> invoked by xargs raced one another. In one instance I ended up with 27
> copies of the same hunk in a strbuf.patch.
> 
> Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
> batch mitigated this. I suspect the author of [2] either mostly ran in
> that mode, or didn't test on changes that impacted widely used header
> files.
> 
> So since we're going to want to process all our *.c and *.h let's just
> do that, and drop --all-includes for --no-includes. It's not spatch's
> job to find our sources, we're doing that. If someone is manually
> tweaking COCCI_SOURCES they can just tweak SPATCH_FLAGS too.

This is a very clean and well-done analysis.

I've also timed my runs and the effects are really obvious:

Before:

	$ time make -j7 coccicheck
	    SPATCH contrib/coccinelle/array.cocci
	    SPATCH contrib/coccinelle/commit.cocci
	    SPATCH contrib/coccinelle/free.cocci
	    SPATCH contrib/coccinelle/hashmap.cocci
	    SPATCH contrib/coccinelle/object_id.cocci
	    SPATCH contrib/coccinelle/preincr.cocci
	    SPATCH contrib/coccinelle/strbuf.cocci
	    SPATCH contrib/coccinelle/swap.cocci
	    SPATCH contrib/coccinelle/xstrdup_or_null.cocci

	real    3m15.215s
	user    17m36.779s
	sys     1m4.262s

After:

	$ time make -j7 coccicheck
	GIT_VERSION = 2.31.0.rc1.1.gd8c3638591
	    SPATCH contrib/coccinelle/array.cocci
	    SPATCH contrib/coccinelle/commit.cocci
	    SPATCH contrib/coccinelle/flex_alloc.cocci
	    SPATCH contrib/coccinelle/free.cocci
	    SPATCH contrib/coccinelle/hashmap.cocci
	    SPATCH contrib/coccinelle/object_id.cocci
	    SPATCH contrib/coccinelle/preincr.cocci
	    SPATCH contrib/coccinelle/qsort.cocci
	    SPATCH contrib/coccinelle/strbuf.cocci
	    SPATCH contrib/coccinelle/swap.cocci
	    SPATCH contrib/coccinelle/xstrdup_or_null.cocci

	real    1m14.070s
	user    7m40.644s
	sys     0m13.689s

so this gets my

	Tested-by: Denton Liu <liu.denton@gmail.com>
 
> I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
> can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
> in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
> "make coccicheck".
> 
> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call

I'm also... what? :)

> this in a loop anymore (and xargs will early-exit) we can just rely on
> standard V=1 for debugging issues.
> 
> 1. a9a884aea5 (coccicheck: use --all-includes by default, 2016-09-30)
> 2. 960154b9c1 (coccicheck: optionally batch spatch invocations,
>    2019-05-06)
> 3. f5c2bc2b96 (Makefile: detect errors in running spatch, 2017-03-10)
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Ævar Arnfjörð Bjarmason March 3, 2021, 11:45 a.m. UTC | #2
On Tue, Mar 02 2021, Ævar Arnfjörð Bjarmason wrote:

A few notes, for a probable re-roll sometime later:

> I've often wondered why "make coccicheck" takes so long. This change
> speeds it up by ~2x and makes it use much less memory. Or a reduction
> of a max of around ~2GB per-process (under the old
> SPATCH_BATCH_SIZE=0) to around ~200MB.

I've also tested this with e.g. "strace -c -f -e trace=openat",
before/after with just contrib/coccinelle/swap.cocci.patch:

    192612
    13106

Grepping e.g. for strbuf.h shows that when running "make coccicheck" we
open it 12 times as expected (we have 12 rules):

    strace -f -e trace=openat make -j8 coccicheck 2>&1|grep 'openat.*"strbuf\.h"' | wc -l
    12

(Is there a more direct way to filter this type of thing with strace?)

Before:

    $ strace -f -e trace=openat make -j8 coccicheck 2>&1|grep 'openat.*"strbuf\.h"' | wc -l
    137

> [...]
> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
> this in a loop anymore (and xargs will early-exit) we can just rely on
> standard V=1 for debugging issues.

Noticed by Denton in the side-thread, should be:

    I'm also removing the whole "cat $@.log" feature introduced in...

> +		xargs \
> +			-n 32 $(SPATCH_XARGS_FLAGS) \

As an aside, this relies on xargs not barfing on "xargs -n 1 -n 2",
i.e. taking 2 over 1. I tested it on Linux/AIX/OpenBSD/Solaris, works
consistently on all of them.
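That last-one-wins behavior can be checked with a generic command in
place of spatch:

```shell
# With two -n options the xargs implementations tested above take the
# later one, so this batches 2 at a time, not 1:
echo a b c d | xargs -n 1 -n 2 echo
```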
Junio C Hamano March 4, 2021, 11:18 p.m. UTC | #3
Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> I've often wondered why "make coccicheck" takes so long. This change
> speeds it up by ~2x and makes it use much less memory. Or a reduction
> of a max of around ~2GB per-process (under the old
> SPATCH_BATCH_SIZE=0) to around ~200MB.

>
> Running the full "make coccicheck" now takes ~50 seconds with -j8 on
> my machine, v.s. ~2x of that before. I've got 64GB of memory on that
> machine, or it would be much slower.
>
> Why has it been so slow? Because I think we've always been running it
> in entirely the wrong mode for what we wanted, and much of the
> previous fixing of this target has involved re-arranging the deck
> chairs on that particular Titanic.
>
> What we really want to do with coccicheck is to do search/replacements
> in all our *.c and *.h files. This is now what we do, and we'll
> process a default of 64 files at a time.
>
> What we were doing before was processing all our *.c files, and for
> each of those *.c files we'd recursively look around for includes and
> see if we needed to search/replace in those too.
>
> That we did that dates back to [1] when we were only processing *.c
> files, and it was always very redundant. We'd e.g. visit the likes of
> strbuf.h lots of times since it's widely used as an include.
>
> Then in the most recent attempt to optimize coccicheck in [2] this
> anti-pattern finally turned into a bug.
>
> Namely: before this change, if your coccicheck rule applied to
> e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
> hunks applying the exact same change, as concurrent spatch processes
> invoked by xargs raced one another. In one instance I ended up with 27
> copies of the same hunk in a strbuf.patch.
>
> Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
> batch mitigated this. I suspect the author of [2] either mostly ran in
> that mode, or didn't test on changes that impacted widely used header
> files.
>
> So since we're going to want to process all our *.c and *.h let's just
> do that, and drop --all-includes for --no-includes. It's not spatch's
> job to find our sources, we're doing that. If someone is manually
> tweaking COCCI_SOURCES they can just tweak SPATCH_FLAGS too.
>
> I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
> can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
> in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
> "make coccicheck".
>
> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
> this in a loop anymore (and xargs will early-exit) we can just rely on
> standard V=1 for debugging issues.
>
> 1. a9a884aea5 (coccicheck: use --all-includes by default, 2016-09-30)
> 2. 960154b9c1 (coccicheck: optionally batch spatch invocations,
>    2019-05-06)
> 3. f5c2bc2b96 (Makefile: detect errors in running spatch, 2017-03-10)
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---

Nice, so in short, we've been redundantly running the checker code
over and over on the same header files, wasting cycles.

Even though I saw you mentioned something about preparing for a
reroll, I'll tentatively queue this version to 'seen' for now.

Thanks.
Jeff King March 5, 2021, 10:24 a.m. UTC | #4
On Tue, Mar 02, 2021 at 09:51:03PM +0100, Ævar Arnfjörð Bjarmason wrote:

> What we were doing before was processing all our *.c files, and for
> each of those *.c files we'd recursively look around for includes and
> see if we needed to search/replace in those too.
> 
> That we did that dates back to [1] when we were only processing *.c
> files, and it was always very redundant. We'd e.g. visit the likes of
> strbuf.h lots of times since it's widely used as an include.

OK. I don't offhand know if processing the includes along with the C
files buys us anything else. Coccinelle's behavior is often quite a
mystery, but if we think this produces the same results with less time,
I'm for it.

BTW, this "dates back to when we were only processing *.c files"
statement confused me a bit. Doesn't that include the current state
before this patch?  I.e., this hunk:

> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))

seems to be an important part of the change, without which moving away
from --all-includes would be wrong.

> Then in the most recent attempt to optimize coccicheck in [2] this
> anti-pattern finally turned into a bug.
>
> Namely: before this change, if your coccicheck rule applied to
> e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
> hunks applying the exact same change, as concurrent spatch processes
> invoked by xargs raced one another. In one instance I ended up with 27
> copies of the same hunk in a strbuf.patch.
> 
> Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
> batch mitigated this. I suspect the author of [2] either mostly ran in
> that mode, or didn't test on changes that impacted widely used header
> files.

This "turned into a bug" I didn't understand, though. That commit [2]
just switched from always feeding files one at a time to letting you
feed more than one.  So yes, feeding all at once (with
SPATCH_BATCH_SIZE=0) mitigated the bug. But wouldn't any bug that shows
up with SPATCH_BATCH_SIZE=1 have already existed previously?

IOW, I don't think the batch-size stuff made anything worse. It made one
specific case better, but that was not even its purpose.

That is maybe splitting hairs, but I want to make sure I understand all
of what you're claiming here. But also...

> I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
> can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
> in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
> "make coccicheck".

You hard-code this to 32 now. But I'm not sure if that's going to be the
universal best value.

Applying just part of your patch, but leaving SPATCH_BATCH_SIZE in
place, like so:

diff --git a/Makefile b/Makefile
index dfb0f1000f..88cb157547 100644
--- a/Makefile
+++ b/Makefile
@@ -1201,7 +1201,7 @@ SP_EXTRA_FLAGS = -Wno-universal-initializer
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
-SPATCH_FLAGS = --all-includes --patch .
+SPATCH_FLAGS = --no-includes --patch .
 SPATCH_BATCH_SIZE = 1
 
 include config.mak.uname
@@ -2859,7 +2859,7 @@ check: config-list.h command-list.h
 		exit 1; \
 	fi
 
-FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
+FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
 COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 %.cocci.patch: %.cocci $(COCCI_SOURCES)

Then I get this with various batch sizes (using "make -j16" on an 8-core
machine, so all of the patches are being run at the same time):

  size |  real  |  user  | sys
  -------------------------------
    1  |  1m28s | 10m00s | 0m56s
    2  |  1m03s |  7m49s | 0m33s
    4  |  0m51s |  6m28s | 0m20s
    8  |  0m45s |  6m02s | 0m14s
   16  |  0m45s |  6m17s | 0m10s
   32  |  0m44s |  6m33s | 0m07s
   64  |  0m45s |  6m48s | 0m06s
    0  |  1m08s | 10m08s | 0m03s

So there's a sweet spot at 8. Doing "32" isn't that much worse (runtime
is about the same, but we use more CPU), but it gets progressively worse
as the batch size increases.

That's the same sweet spot before your patch, too, btw. I know because I
happened to be timing it the other day, as coccinelle 1.1.0 hit debian
unstable. When I originally wrote the SPATCH_BATCH_SIZE feature, I think
I was using 1.0.4 or 1.0.7. Back then SPATCH_BATCH_SIZE=0 was a clear
win, assuming you had the memory. But in 1.0.8 it ran for many minutes
without finishing. I found back then that "2" was the sweet spot. But
now it's "8".

All of which is to say: the timing difference between my "8" and "32"
cases isn't that exciting. But the performance of spatch has been
sufficiently baffling to me that I think it's worth keeping the knob.

Your XARGS_FLAGS does accomplish something similar (and is in fact more
flexible, though at the cost of being less abstract).  I'm OK to replace
one with the other, but shouldn't it happen in a separate patch? It's
completely orthogonal to the --no-includes behavior.

> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
> this in a loop anymore (and xargs will early-exit) we can just rely on
> standard V=1 for debugging issues.

I think this is missing the point of the cat. We've redirected spatch's
stderr to the logfile. So if there's an error, you have no idea what it
was without manually figuring out which file to cat. And V=1 doesn't
help that.

We could stop doing that redirection, but the problem is that spatch is
very verbose. So the point was to store the output but only dump it
when we see an error.
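The pattern being defended here is the usual "capture the noisy
output, replay it only on failure" idiom, roughly (with a stand-in for
spatch):

```shell
# Stand-in for spatch: chatty on stderr, and failing in this case.
log=$(mktemp)
if ! sh -c 'echo "verbose spatch chatter" >&2; exit 1' 2>"$log"
then
	cat "$log"	# the captured chatter is shown only on failure
fi
rm -f "$log"
```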

So this part of the patch seems like a pure regression to me. E.g.,
introduce a syntax error like:

diff --git a/contrib/coccinelle/free.cocci b/contrib/coccinelle/free.cocci
index 4490069df9..c6c6562a0a 100644
--- a/contrib/coccinelle/free.cocci
+++ b/contrib/coccinelle/free.cocci
@@ -11,7 +11,7 @@ expression E;
   free(E);
 
 @@
-expression E;
+expression E
 @@
 - free(E);
 + FREE_AND_NULL(E);

and run "make coccicheck" before and after your patch:

  [before]
  $ make coccicheck
  GIT_VERSION = 2.31.0.rc1.8.gbe7935ed8b.dirty
      SPATCH contrib/coccinelle/free.cocci
  init_defs_builtins: /usr/bin/../lib/coccinelle/standard.h
  meta: parse error: 
    File "contrib/coccinelle/free.cocci", line 15, column 0, charpos = 99
    around = '@@',
    whole content = @@

  xargs: spatch: exited with status 255; aborting
  make: *** [Makefile:2866: contrib/coccinelle/free.cocci.patch] Error 1

  [after]
  $ make coccicheck
    SPATCH contrib/coccinelle/free.cocci
  make: *** [Makefile:2875: contrib/coccinelle/free.cocci.patch] Error 124

-Peff
Ævar Arnfjörð Bjarmason March 5, 2021, 11:17 a.m. UTC | #5
On Fri, Mar 05 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>> [...]
> Nice, so in short, we've been redundantly running the checker code
> over and over on the same header files wasting cycles.
>
> Even though I saw you mentioned something about preparing for a
> reroll, I'll tentatively queue this version to 'seen' for now.

Hopefully soon, was waiting to see if there were more comments & what
Jeff King thought of it, he's replied now. Will address that (haven't
read it yet) and re-roll as needed...
Ævar Arnfjörð Bjarmason March 5, 2021, 5:20 p.m. UTC | #6
On Fri, Mar 05 2021, Jeff King wrote:

I sent out a v2 that should address your feedback in:
https://lore.kernel.org/git/20210305170724.23859-1-avarab@gmail.com/

Just brief notes:

> On Tue, Mar 02, 2021 at 09:51:03PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> What we were doing before was processing all our *.c files, and for
>> each of those *.c files we'd recursively look around for includes and
>> see if we needed to search/replace in those too.
>> 
>> That we did that dates back to [1] when we were only processing *.c
>> files, and it was always very redundant. We'd e.g. visit the likes of
>> strbuf.h lots of times since it's widely used as an include.
>
> OK. I don't offhand know if processing the includes along with the C
> files buys us anything else. Coccinelle's behavior is often quite a
> mystery, but if we think this produces the same results with less time,
> I'm for it.
>
> BTW, this "dates back to when we were only processing *.c files"
> statement confused me a bit. Doesn't that include the current state
> before this patch?  I.e., this hunk:
>
>> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
>> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
>
> seems to be an important part of the change, without which moving away
> from --all-includes would be wrong.

Updated in the new 2/4 commit message.

>> Then in the most recent attempt to optimize coccicheck in [2] this
>> anti-pattern finally turned into a bug.
>>
>> Namely: before this change, if your coccicheck rule applied to
>> e.g. making a change in strbuf.h itself we'd get *lots* of duplicate
>> hunks applying the exact same change, as concurrent spatch processes
>> invoked by xargs raced one another. In one instance I ended up with 27
>> copies of the same hunk in a strbuf.patch.
>> 
>> Setting SPATCH_BATCH_SIZE=0 and processing all the files in one giant
>> batch mitigated this. I suspect the author of [2] either mostly ran in
>> that mode, or didn't test on changes that impacted widely used header
>> files.
>
> This "turned into a bug" I didn't understand, though. That commit [2]
> just switched from always feeding files one at a time to letting you
> feed more than one.  So yes, feeding all at once (with
> SPATCH_BATCH_SIZE=0) mitigated the bug. But wouldn't any bug that shows
> up with SPATCH_BATCH_SIZE=1 have already existed previously?
>
> IOW, I don't think the batch-size stuff made anything worse. It made one
> specific case better, but that was not even its purpose.
>
> That is maybe splitting hairs, but I want to make sure I understand all
> of what you're claiming here. But also...

Explained in the new 2/4. Your commit introduced a regression: because
spatch does racy magic background locking, running N of them at a time
applying the same rule tripped it up.

I suspect it was racy before, but we didn't have overlapping rules in
multiple *.cocci files.

>> I'm entirely removing SPATCH_BATCH_SIZE. If you want to tweak it you
>> can tweak SPATCH_XARGS_FLAGS to e.g. "-n 256", or "-P 4 -n 128". But
>> in my testing it isn't worth it to tweak SPATCH_XARGS_FLAGS for a full
>> "make coccicheck".
>
> You hard-code this to 32 now. But I'm not sure if that's going to be the
> universal best value.
>
> Applying just part of your patch, but leaving SPATCH_BATCH_SIZE in
> place, like so:
>
> diff --git a/Makefile b/Makefile
> index dfb0f1000f..88cb157547 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1201,7 +1201,7 @@ SP_EXTRA_FLAGS = -Wno-universal-initializer
>  # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>  # usually result in less CPU usage at the cost of higher peak memory.
>  # Setting it to 0 will feed all files in a single spatch invocation.
> -SPATCH_FLAGS = --all-includes --patch .
> +SPATCH_FLAGS = --no-includes --patch .
>  SPATCH_BATCH_SIZE = 1
>  
>  include config.mak.uname
> @@ -2859,7 +2859,7 @@ check: config-list.h command-list.h
>  		exit 1; \
>  	fi
>  
> -FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
> +FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
>  COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
>  
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>
> Then I get this with various batch sizes (using "make -j16" on an 8-core
> machine, so all of the patches are being run at the same time):
>
>   size |  real  |  user  | sys
>   -------------------------------
>     1  |  1m28s | 10m00s | 0m56s
>     2  |  1m03s |  7m49s | 0m33s
>     4  |  0m51s |  6m28s | 0m20s
>     8  |  0m45s |  6m02s | 0m14s
>    16  |  0m45s |  6m17s | 0m10s
>    32  |  0m44s |  6m33s | 0m07s
>    64  |  0m45s |  6m48s | 0m06s
>     0  |  1m08s | 10m08s | 0m03s
>
> So there's a sweet spot at 8. Doing "32" isn't that much worse (runtime
> is about the same, but we use more CPU), but it gets progressively worse
> as the batch size increases.
>
> That's the same sweet spot before your patch, too, btw. I know because I
> happened to be timing it the other day, as coccinelle 1.1.0 hit debian
> unstable. When I originally wrote the SPATCH_BATCH_SIZE feature, I think
> I was using 1.0.4 or 1.0.7. Back then SPATCH_BATCH_SIZE=0 was a clear
> win, assuming you had the memory. But in 1.0.8 it ran for many minutes
> without finishing. I found back then that "2" was the sweet spot. But
> now it's "8".
>
> All of which is to say: the timing difference between my "8" and "32"
> cases isn't that exciting. But the performance of spatch has been
> sufficiently baffling to me that I think it's worth keeping the knob.
>
> Your XARGS_FLAGS does accomplish something similar (and is in fact more
> flexible, though at the cost of being less abstract).  I'm OK to replace
> one with the other, but shouldn't it happen in a separate patch? It's
> completely orthogonal to the --no-includes behavior.

*nod*, set to a default of 8 in the new 4/4.

>> I'm also the whole "cat $@.log" introduced in [3]. Since we don't call
>> this in a loop anymore (and xargs will early-exit) we can just rely on
>> standard V=1 for debugging issues.
>
> I think this is missing the point of the cat. We've redirected spatch's
> stderr to the logfile. So if there's an error, you have no idea what it
> was without manually figuring out which file to cat. And V=1 doesn't
> help that.
>
> We could stop doing that redirection, but the problem is that spatch is
> very verbose. So the point was to store the output but only dump it
> when we see an error.
>
> So this part of the patch seems like a pure regression to me. E.g.,
> introduce a syntax error like:
>
> diff --git a/contrib/coccinelle/free.cocci b/contrib/coccinelle/free.cocci
> index 4490069df9..c6c6562a0a 100644
> --- a/contrib/coccinelle/free.cocci
> +++ b/contrib/coccinelle/free.cocci
> @@ -11,7 +11,7 @@ expression E;
>    free(E);
>  
>  @@
> -expression E;
> +expression E
>  @@
>  - free(E);
>  + FREE_AND_NULL(E);
>
> and run "make coccicheck" before and after your patch:
>
>   [before]
>   $ make coccicheck
>   GIT_VERSION = 2.31.0.rc1.8.gbe7935ed8b.dirty
>       SPATCH contrib/coccinelle/free.cocci
>   init_defs_builtins: /usr/bin/../lib/coccinelle/standard.h
>   meta: parse error: 
>     File "contrib/coccinelle/free.cocci", line 15, column 0, charpos = 99
>     around = '@@',
>     whole content = @@
>
>   xargs: spatch: exited with status 255; aborting
>   make: *** [Makefile:2866: contrib/coccinelle/free.cocci.patch] Error 1
>
>   [after]
>   $ make coccicheck
>     SPATCH contrib/coccinelle/free.cocci
>   make: *** [Makefile:2875: contrib/coccinelle/free.cocci.patch] Error 124

The "cat $@.log" is back, but there's still some issues with it on
master now (I didn't change this) it'll spew a lot at you with xargs
since we emit the whole error output for every file, seemingly, before
cat-ing the file:

    make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
    464

Well, it's a bit better now on my series with a default batch size of 8:

    $ make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
    88

I got tired of dealing with the combination of shell script and make
for the day. But maybe something to do as a follow-up if anyone's
interested.
Jeff King March 6, 2021, 10:59 a.m. UTC | #7
On Fri, Mar 05, 2021 at 06:20:47PM +0100, Ævar Arnfjörð Bjarmason wrote:

> >   [after]
> >   $ make coccicheck
> >     SPATCH contrib/coccinelle/free.cocci
> >   make: *** [Makefile:2875: contrib/coccinelle/free.cocci.patch] Error 124
> 
> The "cat $@.log" is back, but there's still some issues with it on
> master now (I didn't change this) it'll spew a lot at you with xargs
> since we emit the whole error output for every file, seemingly, before
> cat-ing the file:
> 
>     make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
>     464
> 
> Well, it's a bit better now on my series with a default batch size of 8:
> 
>     $ make -j1 coccicheck SPATCH_FLAGS=--blahblah V=1 2>&1|grep -i example.*use|wc -l
>     88
> 
> I got tired of dealing with the combination of shellscritp and make for
> the day. But maybe something to do as a follow-up if anyone's
> interested.

Hmm, yeah. xargs will keep going unless the command actually returns
255. I never noticed with syntax errors in the cocci files, because
those cause spatch to exit with 255!

IMHO that's likely to be the most common error, though probably "oops,
you don't have spatch installed" is a close second. But that one also
works well, because xargs (at least GNU xargs) will stop if it can't
run the program.

So it might be nice to have it bail on the first failure, but I don't
know that it's worth spending a lot of time on it.
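With GNU xargs the two cases are distinguishable by exit status alone
(illustrated with a stand-in command, not spatch):

```shell
# A command failing with 1..125: xargs runs the remaining batches
# anyway and exits 123.
echo a b c | xargs -n 1 sh -c 'exit 1' sh; echo "exit=$?"
# A command failing with 255: xargs aborts immediately and exits 124.
echo a b c | xargs -n 1 sh -c 'exit 255' sh 2>/dev/null; echo "exit=$?"
```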

-Peff

Patch

diff --git a/Makefile b/Makefile
index dd08b4ced0..2108df8913 100644
--- a/Makefile
+++ b/Makefile
@@ -1195,11 +1195,20 @@  PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
-# For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
-# usually result in less CPU usage at the cost of higher peak memory.
-# Setting it to 0 will feed all files in a single spatch invocation.
-SPATCH_FLAGS = --all-includes --patch .
-SPATCH_BATCH_SIZE = 1
+SPATCH_FLAGS = --no-includes --patch .
+# For the 'coccicheck' target; tweaking SPATCH_XARGS_FLAGS is
+# generally not necessary with a top-level -jN.
+#
+# To get concurrency when targeting a single
+# contrib/coccinelle/%.patch use e.g. "-P" if your xargs(1) supports
+# it:
+#
+#    make contrib/coccinelle/strbuf.cocci.patch SPATCH_XARGS_FLAGS="-P 8 -n 64"
+#
+# Or a combination of the two:
+#
+#    make -j4 coccicheck SPATCH_XARGS_FLAGS="-P 2 -n 64"
+SPATCH_XARGS_FLAGS =
 
 include config.mak.uname
 -include config.mak.autogen
@@ -2852,24 +2861,18 @@  check: config-list.h command-list.h
 		exit 1; \
 	fi
 
-FOUND_C_SOURCES = $(filter %.c,$(shell $(FIND_SOURCE_FILES)))
+FOUND_C_SOURCES = $(filter %.c %.h,$(shell $(FIND_SOURCE_FILES)))
 COCCI_SOURCES = $(filter-out $(THIRD_PARTY_SOURCES),$(FOUND_C_SOURCES))
 
 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	$(QUIET_SPATCH) \
-	if test $(SPATCH_BATCH_SIZE) = 0; then \
-		limit=; \
-	else \
-		limit='-n $(SPATCH_BATCH_SIZE)'; \
-	fi; \
-	if ! echo $(COCCI_SOURCES) | xargs $$limit \
-		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
-		>$@+ 2>$@.log; \
-	then \
-		cat $@.log; \
-		exit 1; \
-	fi; \
-	mv $@+ $@; \
+	$(RM) $@+ $@.log && \
+	echo $(COCCI_SOURCES) | \
+		xargs \
+			-n 32 $(SPATCH_XARGS_FLAGS) \
+			$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
+		>>$@+ 2>>$@.log && \
+	mv $@+ $@ && \
 	if test -s $@; \
 	then \
 		echo '    ' SPATCH result: $@; \