diff mbox series

coccicheck: optionally batch spatch invocations

Message ID 20190506234334.GA13296@sigill.intra.peff.net (mailing list archive)
State New, archived
Headers show
Series coccicheck: optionally batch spatch invocations | expand

Commit Message

Jeff King May 6, 2019, 11:43 p.m. UTC
On Mon, May 06, 2019 at 04:34:09PM +0700, Duy Nguyen wrote:

> > However, it comes at a cost. The RSS of each spatch process goes from
> > ~50MB to ~1500MB (and peak memory usage may be even higher if make runs
> 
> 1.5G should be fine. Trying...
> 
> Even with no -j, my htop's RES column goes up 6GB and put my laptop in
> "swap every bit of memory out, including the bits handling the screen"
> mode :( I don't think it was even the peak.

Interesting if you have a different version of spatch. I'm using 1.0.4
from Debian unstable.

I had just been eyeballing the values in "top" before, but I actually
measured more carefully. My peak was actually ~1900MB.

> It's probably a bit too much to ask, but is it possible to handle N
> files at a time (instead of all files), which consumes less memory and
> runs a bit slower, but still better than the default mode? I can see
> it already gets tricky doing complicated stuff in Makefile so "no" is
> perfectly ok.

I almost did this initially but I feared that nobody would actually use
it. :) So given at least one person who wants it, I took a look. If we
rely on xargs, then it is really not too bad (and is in fact shorter
than the current code). I also wrote up a pure-shell version, but it's
rather verbose even after taking some shortcuts with whitespace
splitting.

So here's what I think we should apply:

-- >8 --
Subject: [PATCH] coccicheck: optionally batch spatch invocations

In our "make coccicheck" rule, we currently feed each source file to its
own individual invocation of spatch. This has a few downsides:

  - it repeats any overhead spatch has for starting up and reading the
    patch file

  - any included header files may get processed from multiple
    invocations. This is slow (we see the same header files multiple
    times) and may produce a resulting patch with repeated hunks (which
    cannot be applied without further cleanup)

Ideally we'd just invoke a single instance of spatch per rule-file and
feed it all source files. But spatch can be rather memory hungry when
run in this way. I measured the peak RSS going from ~90MB for a single
file to ~1900MB for all files. Multiplied by multiple rule files being
processed at the same time (for "make -j"), this can make things slower
or even cause them to fail (e.g., this is reported to happen on our
Travis builds).

Instead, let's provide a tunable knob. We'll leave the default at "1",
but it can be cranked up to "999" for maximum CPU/memory tradeoff, or
people can find points in between that serve their particular machines.

Here are a few numbers running a single rule via:

  SIZES='1 4 16 999'
  RULE=contrib/coccinelle/object_id.cocci
  for i in $SIZES; do
    make clean
    /usr/bin/time -o $i.out --format='%e | %U | %S | %M' \
      make $RULE.patch SPATCH_BATCH_SIZE=$i
  done
  for i in $SIZES; do
    printf '%4d | %s\n' $i "$(cat $i.out)"
  done

which yields:

     1 | 97.73 | 93.38 | 4.33 | 100128
     4 | 52.80 | 51.14 | 1.69 | 135204
    16 | 35.82 | 35.09 | 0.76 | 284124
   999 | 23.30 | 23.13 | 0.20 | 1903852

The implementation is done with xargs, which should be widely available;
it's in POSIX, we rely on it already in the test suite. And "coccicheck"
is really a developer-only tool anyway, so it's not a big deal if
obscure systems can't run it.

Signed-off-by: Jeff King <peff@peff.net>
---
I left the default at 1 for safety. Probably 4 or 16 would be an OK
default, but I don't have any interest in figuring out exactly what
Travis or some hypothetical average machine can handle. I'll be setting
mine to 999. ;)

Making "0" work as "unlimited" might be nice, but xargs doesn't support
that and I didn't want to make the recipe any more unreadable than it
already is.

 Makefile | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Comments

Jacob Keller May 7, 2019, 1:41 a.m. UTC | #1
On Mon, May 6, 2019 at 4:43 PM Jeff King <peff@peff.net> wrote:
>
> On Mon, May 06, 2019 at 04:34:09PM +0700, Duy Nguyen wrote:
>
> > > However, it comes at a cost. The RSS of each spatch process goes from
> > > ~50MB to ~1500MB (and peak memory usage may be even higher if make runs
> >
> > 1.5G should be fine. Trying...
> >
> > Even with no -j, my htop's RES column goes up 6GB and put my laptop in
> > "swap every bit of memory out, including the bits handling the screen"
> > mode :( I don't think it was even the peak.
>
> Interesting if you have a different version of spatch. I'm using 1.0.4
> from Debian unstable.
>
> I had just been eyeballing the values in "top" before, but I actually
> measured more carefully. My peak was actually ~1900MB.
>
> > It's probably a bit too much to ask, but is it possible to handle N
> > files at a time (instead of all files), which consumes less memory and
> > runs a bit slower, but still better than the default mode? I can see
> > it already gets tricky doing complicated stuff in Makefile so "no" is
> > perfectly ok.
>
> I almost did this initially but I feared that nobody would actually use
> it. :) So given at least one person who wants it, I took a look. If we
> rely on xargs, then it is really not too bad (and is in fact shorter
> than the current code). I also wrote up a pure-shell version, but it's
> rather verbose even after taking some shortcuts with whitespace
> splitting.
>
> So here's what I think we should apply:
>
> -- >8 --
> Subject: [PATCH] coccicheck: optionally batch spatch invocations
>
> In our "make coccicheck" rule, we currently feed each source file to its
> own individual invocation of spatch. This has a few downsides:
>
>   - it repeats any overhead spatch has for starting up and reading the
>     patch file
>
>   - any included header files may get processed from multiple
>     invocations. This is slow (we see the same header files multiple
>     times) and may produce a resulting patch with repeated hunks (which
>     cannot be applied without further cleanup)
>
> Ideally we'd just invoke a single instance of spatch per rule-file and
> feed it all source files. But spatch can be rather memory hungry when
> run in this way. I measured the peak RSS going from ~90MB for a single
> file to ~1900MB for all files. Multiplied by multiple rule files being
> processed at the same time (for "make -j"), this can make things slower
> or even cause them to fail (e.g., this is reported to happen on our
> Travis builds).
>
> Instead, let's provide a tunable knob. We'll leave the default at "1",
> but it can be cranked up to "999" for maximum CPU/memory tradeoff, or
> people can find points in between that serve their particular machines.
>
> Here are a few numbers running a single rule via:
>
>   SIZES='1 4 16 999'
>   RULE=contrib/coccinelle/object_id.cocci
>   for i in $SIZES; do
>     make clean
>     /usr/bin/time -o $i.out --format='%e | %U | %S | %M' \
>       make $RULE.patch SPATCH_BATCH_SIZE=$i
>   done
>   for i in $SIZES; do
>     printf '%4d | %s\n' $i "$(cat $i.out)"
>   done
>
> which yields:
>
>      1 | 97.73 | 93.38 | 4.33 | 100128
>      4 | 52.80 | 51.14 | 1.69 | 135204
>     16 | 35.82 | 35.09 | 0.76 | 284124
>    999 | 23.30 | 23.13 | 0.20 | 1903852
>
> The implementation is done with xargs, which should be widely available;
> it's in POSIX, we rely on it already in the test suite. And "coccicheck"
> is really a developer-only tool anyway, so it's not a big deal if
> obscure systems can't run it.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I left the default at 1 for safety. Probably 4 or 16 would be an OK
> default, but I don't have any interest in figuring out exactly what
> Travis or some hypothetical average machine can handle. I'll be setting
> mine to 999. ;)
>
> Making "0" work as "unlimited" might be nice, but xargs doesn't support
> that and I didn't want to make the recipe any more unreadable than it
> already is.
>
>  Makefile | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 9f1b6e8926..daba958b8f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1174,8 +1174,10 @@ PTHREAD_CFLAGS =
>  SPARSE_FLAGS ?=
>  SP_EXTRA_FLAGS =
>
> -# For the 'coccicheck' target
> +# For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
> +# usually result in less CPU usage at the cost of higher peak memory.
>  SPATCH_FLAGS = --all-includes --patch .
> +SPATCH_BATCH_SIZE = 1
>
>  include config.mak.uname
>  -include config.mak.autogen
> @@ -2790,12 +2792,9 @@ endif
>
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>         @echo '    ' SPATCH $<; \
> -       ret=0; \
> -       for f in $(COCCI_SOURCES); do \
> -               $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
> -                       { ret=$$?; break; }; \
> -       done >$@+ 2>$@.log; \
> -       if test $$ret != 0; \
> +       if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
> +               $(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
> +               >$@+ 2>$@.log; \
>         then \
>                 cat $@.log; \
>                 exit 1; \
> --
> 2.21.0.1314.g224b191707
>

This looks reasonable to me :)

Thanks,
Jake
Jeff King May 7, 2019, 2:04 a.m. UTC | #2
On Mon, May 06, 2019 at 06:41:03PM -0700, Jacob Keller wrote:

> > So here's what I think we should apply:
> >
> > -- >8 --
> > Subject: [PATCH] coccicheck: optionally batch spatch invocations
> [...]
> 
> This looks reasonable to me :)

Thanks. I should have added a:

  Based-on-a-patch-by: Jacob Keller <jacob.keller@gmail.com>

trailer.

-Peff
Junio C Hamano May 7, 2019, 2:42 a.m. UTC | #3
Jeff King <peff@peff.net> writes:

> The implementation is done with xargs, which should be widely available;
> it's in POSIX, we rely on it already in the test suite. And "coccicheck"
> is really a developer-only tool anyway, so it's not a big deal if
> obscure systems can't run it.

OK.

> I left the default at 1 for safety. Probably 4 or 16 would be an OK
> default, but I don't have any interest in figuring out exactly what
> Travis or some hypothetical average machine can handle. I'll be setting
> mine to 999. ;)
>
> Making "0" work as "unlimited" might be nice, but xargs doesn't support
> that and I didn't want to make the recipe any more unreadable than it
> already is.

Sounds good.  After reading the log message, I was curious if there
is a mechanism that makes 999 special (like 0 in your hypothetical
"0 means unlimited"), but I guess it is just "any number that is
greater than the number of source files we have will do".

Thanks.
Jeff King May 7, 2019, 2:55 a.m. UTC | #4
On Tue, May 07, 2019 at 11:42:28AM +0900, Junio C Hamano wrote:

> > Making "0" work as "unlimited" might be nice, but xargs doesn't support
> > that and I didn't want to make the recipe any more unreadable than it
> > already is.
> 
> Sounds good.  After reading the log message, I was curious if there
> is a mechanism that makes 999 special (like 0 in your hypothetical
> "0 means unlimited"), but I guess it is just "any number that is
> greater than the number of source files we have will do".

Yes, 2^31-1 is probably a better number, but it's harder to write out. :)

Here's what a patch might look like to implement "0". By still using
xargs in the unlimited code path, it's not too bad. I dunno.

---
diff --git a/Makefile b/Makefile
index daba958b8f..0765a59b7a 100644
--- a/Makefile
+++ b/Makefile
@@ -2792,7 +2792,12 @@ endif
 
 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	@echo '    ' SPATCH $<; \
-	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
+	if test $(SPATCH_BATCH_SIZE) = 0; then \
+		limit=; \
+	else \
+		limit='-n $(SPATCH_BATCH_SIZE)'; \
+	fi; \
+	if ! echo $(COCCI_SOURCES) | xargs $$limit \
 		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
 		>$@+ 2>$@.log; \
 	then \

-Peff
Jacob Keller May 7, 2019, 3:04 a.m. UTC | #5
On Mon, May 6, 2019 at 7:55 PM Jeff King <peff@peff.net> wrote:
> Here's what a patch might look like to implement "0". By still using
> xargs in the unlimited code path, it's not too bad. I dunno.
>
> ---
> diff --git a/Makefile b/Makefile
> index daba958b8f..0765a59b7a 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2792,7 +2792,12 @@ endif
>
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>         @echo '    ' SPATCH $<; \
> -       if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
> +       if test $(SPATCH_BATCH_SIZE) = 0; then \
> +               limit=; \
> +       else \
> +               limit='-n $(SPATCH_BATCH_SIZE)'; \
> +       fi; \
> +       if ! echo $(COCCI_SOURCES) | xargs $$limit \
>                 $(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
>                 >$@+ 2>$@.log; \
>         then \
>
> -Peff

This doesn't seem too complicated to me..
Junio C Hamano May 7, 2019, 4:52 a.m. UTC | #6
Jeff King <peff@peff.net> writes:

> Yes, 2^31-1 is probably a better number, but it's harder to write out. :)
>
> Here's what a patch might look like to implement "0". By still using
> xargs in the unlimited code path, it's not too bad. I dunno.



As somebody who is too used to run "diff -U999" and be happy, I
cannot claim that I care enough, but the result does not look
too bad.

>
> ---
> diff --git a/Makefile b/Makefile
> index daba958b8f..0765a59b7a 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2792,7 +2792,12 @@ endif
>  
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>  	@echo '    ' SPATCH $<; \
> -	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
> +	if test $(SPATCH_BATCH_SIZE) = 0; then \
> +		limit=; \
> +	else \
> +		limit='-n $(SPATCH_BATCH_SIZE)'; \
> +	fi; \
> +	if ! echo $(COCCI_SOURCES) | xargs $$limit \
>  		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
>  		>$@+ 2>$@.log; \
>  	then \
>
> -Peff
Duy Nguyen May 7, 2019, 10:20 a.m. UTC | #7
On Tue, May 7, 2019 at 6:43 AM Jeff King <peff@peff.net> wrote:
>
> On Mon, May 06, 2019 at 04:34:09PM +0700, Duy Nguyen wrote:
>
> > > However, it comes at a cost. The RSS of each spatch process goes from
> > > ~50MB to ~1500MB (and peak memory usage may be even higher if make runs
> >
> > 1.5G should be fine. Trying...
> >
> > Even with no -j, my htop's RES column goes up 6GB and put my laptop in
> > "swap every bit of memory out, including the bits handling the screen"
> > mode :( I don't think it was even the peak.
>
> Interesting if you have a different version of spatch. I'm using 1.0.4
> from Debian unstable.

I'm using 1.0.6

> Ideally we'd just invoke a single instance of spatch per rule-file and
> feed it all source files. But spatch can be rather memory hungry when
> run in this way. I measured the peak RSS going from ~90MB for a single
> file to ~1900MB for all files. Multiplied by multiple rule files being
> processed at the same time (for "make -j"), this can make things slower
> or even cause them to fail (e.g., this is reported to happen on our
> Travis builds).
>
> Instead, let's provide a tunable knob. We'll leave the default at "1",
> but it can be cranked up to "999" for maximum CPU/memory tradeoff, or
> people can find points in between that serve their particular machines.
>
> Here are a few numbers running a single rule via:
>
>   SIZES='1 4 16 999'
>   RULE=contrib/coccinelle/object_id.cocci
>   for i in $SIZES; do
>     make clean
>     /usr/bin/time -o $i.out --format='%e | %U | %S | %M' \
>       make $RULE.patch SPATCH_BATCH_SIZE=$i
>   done
>   for i in $SIZES; do
>     printf '%4d | %s\n' $i "$(cat $i.out)"
>   done
>
> which yields:
>
>      1 | 97.73 | 93.38 | 4.33 | 100128
>      4 | 52.80 | 51.14 | 1.69 | 135204
>     16 | 35.82 | 35.09 | 0.76 | 284124
>    999 | 23.30 | 23.13 | 0.20 | 1903852

I think I'm going to settle around 32 which has peak at 2GB for me and
I can still make use of 4 hyperthread cores.

It's weird for me though that no SIZES=1 is actually the fastest. The
larger SPATCH_BATCH_SIZE is, the slower:

> ~/w/git/temp $ rm contrib/coccinelle/commit.cocci.patch
> ~/w/git/temp $ time make contrib/coccinelle/commit.cocci.patch
     SPATCH contrib/coccinelle/commit.cocci

real    1m20,242s
user    1m15,863s
sys     0m4,376s

> ~/w/git/temp $ rm contrib/coccinelle/commit.cocci.patch
> ~/w/git/temp $ time make contrib/coccinelle/commit.cocci.patch SPATCH_BATCH_SIZE=16
     SPATCH contrib/coccinelle/commit.cocci

real    3m3,131s
user    2m59,345s
sys     0m3,819s

> ~/w/git/temp $ rm contrib/coccinelle/commit.cocci.patch
> ~/w/git/temp $ time make contrib/coccinelle/commit.cocci.patch SPATCH_BATCH_SIZE=32
     SPATCH contrib/coccinelle/commit.cocci

real    5m10,966s
user    5m5,434s
sys     0m5,500s

But that might be just cocci or environmental problem. Since you and
Jacob both see good speedup, this patch is good.
SZEDER Gábor May 7, 2019, 11:19 a.m. UTC | #8
On Mon, May 06, 2019 at 07:43:34PM -0400, Jeff King wrote:
> Subject: [PATCH] coccicheck: optionally batch spatch invocations
> 
> In our "make coccicheck" rule, we currently feed each source file to its
> own individual invocation of spatch. This has a few downsides:
> 
>   - it repeats any overhead spatch has for starting up and reading the
>     patch file
> 
>   - any included header files may get processed from multiple
>     invocations. This is slow (we see the same header files multiple
>     times) and may produce a resulting patch with repeated hunks (which
>     cannot be applied without further cleanup)

I wonder what would happen (and how it would perform) if we told
spatch to ignore all include files from our C sources, but then run it
on each of our header files as well.  It still would have to parse the
semantic patches over and over again, but those are rather small
(compared to, say, 'git-compar-util.h' and 'cache.h' that are included
in most of our source files).  But I don't know how the lack of
information about preprocessor macros would affect Coccinelle's
analyzis.

> Ideally we'd just invoke a single instance of spatch per rule-file and
> feed it all source files.

Well, ideally we would invoke spatch with '--dir .' to "process all
files in directory recursively", and exclude those that we are not
interested in (parts of 'compat/', 'sha1dc')...  but alas, such an
exclude mechanism doesn't seem to exist.
Jeff King May 8, 2019, 7:07 a.m. UTC | #9
On Tue, May 07, 2019 at 01:52:32PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Yes, 2^31-1 is probably a better number, but it's harder to write out. :)
> >
> > Here's what a patch might look like to implement "0". By still using
> > xargs in the unlimited code path, it's not too bad. I dunno.
> 
> As somebody who is too used to run "diff -U999" and be happy, I
> cannot claim that I care enough, but the result does not look
> too bad.

OK. With two "not too bad" comments (plus my own similar feeling), let's
just do it. Here it is as a patch on top. It can also be squashed into
the tip of jk/cocci-batch, but then we should probably s/999/0/ in the
commit message. :)

-- >8 --
Subject: [PATCH] coccicheck: make batch size of 0 mean "unlimited"

If you have the memory to handle it, the ideal case is to run a single
spatch invocation with all of the source files. But the only way to do
so now is to pick an arbitrarily large batch size. Let's make "0" do
this, which is a little friendlier (and doesn't otherwise have a useful
meaning).

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index daba958b8f..9cea614523 100644
--- a/Makefile
+++ b/Makefile
@@ -1176,6 +1176,7 @@ SP_EXTRA_FLAGS =
 
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
+# Setting it to 0 will feed all files in a single spatch invocation.
 SPATCH_FLAGS = --all-includes --patch .
 SPATCH_BATCH_SIZE = 1
 
@@ -2792,7 +2793,12 @@ endif
 
 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	@echo '    ' SPATCH $<; \
-	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
+	if test $(SPATCH_BATCH_SIZE) = 0; then \
+		limit=; \
+	else \
+		limit='-n $(SPATCH_BATCH_SIZE)'; \
+	fi; \
+	if ! echo $(COCCI_SOURCES) | xargs $$limit \
 		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
 		>$@+ 2>$@.log; \
 	then \
Denton Liu May 8, 2019, 12:36 p.m. UTC | #10
Hi Jeff,

On Wed, May 08, 2019 at 03:07:54AM -0400, Jeff King wrote:
> On Tue, May 07, 2019 at 01:52:32PM +0900, Junio C Hamano wrote:
> 
> > Jeff King <peff@peff.net> writes:
> > 
> > > Yes, 2^31-1 is probably a better number, but it's harder to write out. :)
> > >
> > > Here's what a patch might look like to implement "0". By still using
> > > xargs in the unlimited code path, it's not too bad. I dunno.
> > 
> > As somebody who is too used to run "diff -U999" and be happy, I
> > cannot claim that I care enough, but the result does not look
> > too bad.
> 
> OK. With two "not too bad" comments (plus my own similar feeling), let's
> just do it. Here it is as a patch on top. It can also be squashed into
> the tip of jk/cocci-batch, but then we should probably s/999/0/ in the
> commit message. :)
> 
> -- >8 --
> Subject: [PATCH] coccicheck: make batch size of 0 mean "unlimited"
> 
> If you have the memory to handle it, the ideal case is to run a single
> spatch invocation with all of the source files. But the only way to do
> so now is to pick an arbitrarily large batch size. Let's make "0" do
> this, which is a little friendlier (and doesn't otherwise have a useful
> meaning).
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  Makefile | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index daba958b8f..9cea614523 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1176,6 +1176,7 @@ SP_EXTRA_FLAGS =
>  
>  # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>  # usually result in less CPU usage at the cost of higher peak memory.
> +# Setting it to 0 will feed all files in a single spatch invocation.
>  SPATCH_FLAGS = --all-includes --patch .
>  SPATCH_BATCH_SIZE = 1
>  
> @@ -2792,7 +2793,12 @@ endif
>  
>  %.cocci.patch: %.cocci $(COCCI_SOURCES)
>  	@echo '    ' SPATCH $<; \
> -	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
> +	if test $(SPATCH_BATCH_SIZE) = 0; then \
> +		limit=; \
> +	else \
> +		limit='-n $(SPATCH_BATCH_SIZE)'; \
> +	fi; \

Could we pull `limit` out of the recipe and into a make variable? You
mentioned earlier that you wanted to do this but it was too complicated
but now that it's written like this, it seem like it'd be pretty easy to
do.

> +	if ! echo $(COCCI_SOURCES) | xargs $$limit \
>  		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
>  		>$@+ 2>$@.log; \
>  	then \
> -- 
> 2.21.0.1314.g224b191707
>
Jeff King May 8, 2019, 10:39 p.m. UTC | #11
On Wed, May 08, 2019 at 08:36:10AM -0400, Denton Liu wrote:

> >  %.cocci.patch: %.cocci $(COCCI_SOURCES)
> >  	@echo '    ' SPATCH $<; \
> > -	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
> > +	if test $(SPATCH_BATCH_SIZE) = 0; then \
> > +		limit=; \
> > +	else \
> > +		limit='-n $(SPATCH_BATCH_SIZE)'; \
> > +	fi; \
> 
> Could we pull `limit` out of the recipe and into a make variable? You
> mentioned earlier that you wanted to do this but it was too complicated
> but now that it's written like this, it seem like it'd be pretty easy to
> do.

Yes, we could, because this part of it is a straight text substitution
in the recipe, and not a conditional. I don't know that it's any more
readable, though. Either that same conditional gets split out like:

  ifeq ($(SPATCH_BATCH_SIZE),0)
  SPATCH_LIMIT =
  else
  SPATCH_LIMIT = -n $(SPATCH_BATCH_SIZE)
  endif

or we do it inline, but IMHO it's still pretty cluttered:

diff --git a/Makefile b/Makefile
index 9cea614523..aedc7b80a8 100644
--- a/Makefile
+++ b/Makefile
@@ -2793,12 +2793,8 @@ endif
 
 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	@echo '    ' SPATCH $<; \
-	if test $(SPATCH_BATCH_SIZE) = 0; then \
-		limit=; \
-	else \
-		limit='-n $(SPATCH_BATCH_SIZE)'; \
-	fi; \
-	if ! echo $(COCCI_SOURCES) | xargs $$limit \
+	if ! echo $(COCCI_SOURCES) | \
+	     xargs $(if $(filter 0,$(SPATCH_BATCH_SIZE)),,-n $(SPATCH_BATCH_SIZE)) \
 		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
 		>$@+ 2>$@.log; \
 	then \

It gets a little better if we make the sentinel value the empty string
instead of "0" (you can drop the filter).

-Peff
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index 9f1b6e8926..daba958b8f 100644
--- a/Makefile
+++ b/Makefile
@@ -1174,8 +1174,10 @@  PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS =
 
-# For the 'coccicheck' target
+# For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
+# usually result in less CPU usage at the cost of higher peak memory.
 SPATCH_FLAGS = --all-includes --patch .
+SPATCH_BATCH_SIZE = 1
 
 include config.mak.uname
 -include config.mak.autogen
@@ -2790,12 +2792,9 @@  endif
 
 %.cocci.patch: %.cocci $(COCCI_SOURCES)
 	@echo '    ' SPATCH $<; \
-	ret=0; \
-	for f in $(COCCI_SOURCES); do \
-		$(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS) || \
-			{ ret=$$?; break; }; \
-	done >$@+ 2>$@.log; \
-	if test $$ret != 0; \
+	if ! echo $(COCCI_SOURCES) | xargs -n $(SPATCH_BATCH_SIZE) \
+		$(SPATCH) --sp-file $< $(SPATCH_FLAGS) \
+		>$@+ 2>$@.log; \
 	then \
 		cat $@.log; \
 		exit 1; \