diff mbox series

[v4,1/5] run-command: add duplicate_output_fn to run_processes_parallel_opts

Message ID 20221108184200.2813458-2-calvinwan@google.com (mailing list archive)
State New, archived
Headers show
Series submodule: parallelize diff | expand

Commit Message

Calvin Wan Nov. 8, 2022, 6:41 p.m. UTC
Add duplicate_output_fn as an optionally set function in
run_process_parallel_opts. If set, output from each child process is
copied and passed to the callback function whenever output from the
child process is buffered to allow for separate parsing.

Signed-off-by: Calvin Wan <calvinwan@google.com>
---
 run-command.c               | 13 +++++++++++--
 run-command.h               | 24 +++++++++++++++++++++++
 t/helper/test-run-command.c | 21 ++++++++++++++++++++
 t/t0061-run-command.sh      | 39 +++++++++++++++++++++++++++++++++++++
 4 files changed, 95 insertions(+), 2 deletions(-)

Comments

Jonathan Tan Nov. 28, 2022, 8:45 p.m. UTC | #1
Calvin Wan <calvinwan@google.com> writes:

> +	if (opts->duplicate_output && opts->ungroup)
> +		BUG("duplicate_output and ungroup are incompatible with each other");

Thanks for spotting "ungroup" - that helps me to focus my review.

> @@ -1680,8 +1683,14 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
>  	for (size_t i = 0; i < opts->processes; i++) {
>  		if (pp->children[i].state == GIT_CP_WORKING &&
>  		    pp->pfd[i].revents & (POLLIN | POLLHUP)) {
> -			int n = strbuf_read_once(&pp->children[i].err,
> -						 pp->children[i].process.err, 0);
> +			struct strbuf buf = STRBUF_INIT;
> +			int n = strbuf_read_once(&buf, pp->children[i].process.err, 0);
> +			strbuf_addbuf(&pp->children[i].err, &buf);
> +			if (opts->duplicate_output)
> +				opts->duplicate_output(&buf, &pp->children[i].err,
> +					  opts->data,
> +					  pp->children[i].data);
> +			strbuf_release(&buf);

[snip]

> Add duplicate_output_fn as an optionally set function in
> run_process_parallel_opts. If set, output from each child process is
> copied and passed to the callback function whenever output from the
> child process is buffered to allow for separate parsing.

Looking at this patch, since this new option is incompatible with "ungroup",
I would have expected that the new functionality be in a place that already
contains an "if (ungroup)", and thus would go into the "else" block. Looking at
the code, it seems like a reasonable place would be in pp_collect_finished().

Is the reason this is not there because we only want the output of the child
process, not anything that the callback functions might write to the out
strbuf? If yes, is there a reason for that? If not, I think the code would
be simpler if we did what I suggested. (Maybe this has already been discussed
previously - if that is the case, the reason for doing it this way should be in
the commit message.)

> diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
> index 3ecb830f4a..40dd329e02 100644
> --- a/t/helper/test-run-command.c
> +++ b/t/helper/test-run-command.c
> @@ -52,6 +52,21 @@ static int no_job(struct child_process *cp,
>  	return 0;
>  }
>  
> +static void duplicate_output(struct strbuf *process_out,
> +			struct strbuf *out,
> +			void *pp_cb,
> +			void *pp_task_cb)
> +{
> +	struct string_list list = STRING_LIST_INIT_DUP;
> +
> +	string_list_split(&list, process_out->buf, '\n', -1);
> +	for (size_t i = 0; i < list.nr; i++) {
> +		if (strlen(list.items[i].string) > 0)
> +			fprintf(stderr, "duplicate_output: %s\n", list.items[i].string);
> +	}
> +	string_list_clear(&list, 0);
> +}
> +
>  static int task_finished(int result,
>  			 struct strbuf *err,
>  			 void *pp_cb,
> @@ -439,6 +454,12 @@ int cmd__run_command(int argc, const char **argv)
>  		opts.ungroup = 1;
>  	}
>  
> +	if (!strcmp(argv[1], "--duplicate-output")) {
> +		argv += 1;
> +		argc -= 1;
> +		opts.duplicate_output = duplicate_output;
> +	}

In the tests, can we also write things from the callback functions? Whether we think that callback output should be
duplicated or not, we should test what happens to them.
Elijah Newren Nov. 29, 2022, 5:11 a.m. UTC | #2
On Tue, Nov 8, 2022 at 11:11 AM Calvin Wan <calvinwan@google.com> wrote:
>
> Add duplicate_output_fn as an optionally set function in
> run_process_parallel_opts. If set, output from each child process is
> copied and passed to the callback function whenever output from the
> child process is buffered to allow for separate parsing.
>
> Signed-off-by: Calvin Wan <calvinwan@google.com>
[...]
> diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
> index 3ecb830f4a..40dd329e02 100644
> --- a/t/helper/test-run-command.c
> +++ b/t/helper/test-run-command.c
> @@ -52,6 +52,21 @@ static int no_job(struct child_process *cp,
>         return 0;
>  }
>
> +static void duplicate_output(struct strbuf *process_out,
> +                       struct strbuf *out,
> +                       void *pp_cb,
> +                       void *pp_task_cb)

Should the last two parameters have an "UNUSED" annotation?
Glen Choo Nov. 29, 2022, 11:29 p.m. UTC | #3
Calvin Wan <calvinwan@google.com> writes:

> @@ -1680,8 +1683,14 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
>  	for (size_t i = 0; i < opts->processes; i++) {
>  		if (pp->children[i].state == GIT_CP_WORKING &&
>  		    pp->pfd[i].revents & (POLLIN | POLLHUP)) {
> -			int n = strbuf_read_once(&pp->children[i].err,
> -						 pp->children[i].process.err, 0);
> +			struct strbuf buf = STRBUF_INIT;
> +			int n = strbuf_read_once(&buf, pp->children[i].process.err, 0);
> +			strbuf_addbuf(&pp->children[i].err, &buf);
> +			if (opts->duplicate_output)
> +				opts->duplicate_output(&buf, &pp->children[i].err,
> +					  opts->data,
> +					  pp->children[i].data);
> +			strbuf_release(&buf);
>  			if (n == 0) {
>  				close(pp->children[i].process.err);
>  				pp->children[i].state = GIT_CP_WAIT_CLEANUP;

A common pattern is that optional behavior does not impose additional
costs on the non-optional part. Here, we unconditionally do a
strbuf_addbuf() even though we don't use the result in the "else" case.

So this might be more idiomatically written as:

        int n = strbuf_read_once(&pp->children[i].err,
        			 pp->children[i].process.err, 0);
 +      if (opts->duplicate_output) {
 +          struct strbuf buf = STRBUF_INIT;
 +          strbuf_addbuf(&buf, &pp->children[i].err);
 +        	opts->duplicate_output(&buf, &pp->children[i].err,
 +        		  opts->data,
 +        		  pp->children[i].data);
 +          strbuf_release(&buf);
 +      }

which also has the nice benefit of not touching the strbuf_read_once()
line.
Ævar Arnfjörð Bjarmason Nov. 30, 2022, 9:53 a.m. UTC | #4
On Tue, Nov 29 2022, Glen Choo wrote:

> Calvin Wan <calvinwan@google.com> writes:
>
>> @@ -1680,8 +1683,14 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
>>  	for (size_t i = 0; i < opts->processes; i++) {
>>  		if (pp->children[i].state == GIT_CP_WORKING &&
>>  		    pp->pfd[i].revents & (POLLIN | POLLHUP)) {
>> -			int n = strbuf_read_once(&pp->children[i].err,
>> -						 pp->children[i].process.err, 0);
>> +			struct strbuf buf = STRBUF_INIT;
>> +			int n = strbuf_read_once(&buf, pp->children[i].process.err, 0);
>> +			strbuf_addbuf(&pp->children[i].err, &buf);
>> +			if (opts->duplicate_output)
>> +				opts->duplicate_output(&buf, &pp->children[i].err,
>> +					  opts->data,
>> +					  pp->children[i].data);
>> +			strbuf_release(&buf);
>>  			if (n == 0) {
>>  				close(pp->children[i].process.err);
>>  				pp->children[i].state = GIT_CP_WAIT_CLEANUP;
>
> A common pattern is that optional behavior does not impose additional
> costs on the non-optional part. Here, we unconditionally do a
> strbuf_addbuf() even though we don't use the result in the "else" case.
>
> So this might be more idiomatically written as:
>
>         int n = strbuf_read_once(&pp->children[i].err,
>         			 pp->children[i].process.err, 0);
>  +      if (opts->duplicate_output) {
>  +          struct strbuf buf = STRBUF_INIT;
>  +          strbuf_addbuf(&buf, &pp->children[i].err);
>  +        	opts->duplicate_output(&buf, &pp->children[i].err,
>  +        		  opts->data,
>  +        		  pp->children[i].data);
>  +          strbuf_release(&buf);
>  +      }
>
> which also has the nice benefit of not touching the strbuf_read_once()
> line.

We should also use "size_t n" there, not "int n", which is what it
returns. It won't matter for overflow in this case, but let's not
truncate for no good reason, it makes spotting actual overflows in
compiler output harder.

And why does "&buf" exist at all? Why can't we just pass
&pp->children[i].err, and if this callback cares about the last thing we
read let's pass it an offset, so it can know what came from the
strbuf_read_once() (I don't know if it actually cares about that
either...).

That would avoid the copy entirely.
Phillip Wood Nov. 30, 2022, 10:26 a.m. UTC | #5
On 30/11/2022 09:53, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Nov 29 2022, Glen Choo wrote:
> 
>> Calvin Wan <calvinwan@google.com> writes:
>>
>>> @@ -1680,8 +1683,14 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
>>>   	for (size_t i = 0; i < opts->processes; i++) {
>>>   		if (pp->children[i].state == GIT_CP_WORKING &&
>>>   		    pp->pfd[i].revents & (POLLIN | POLLHUP)) {
>>> -			int n = strbuf_read_once(&pp->children[i].err,
>>> -						 pp->children[i].process.err, 0);
>>> +			struct strbuf buf = STRBUF_INIT;
>>> +			int n = strbuf_read_once(&buf, pp->children[i].process.err, 0);
>>> +			strbuf_addbuf(&pp->children[i].err, &buf);
>>> +			if (opts->duplicate_output)

Shouldn't we be checking if n < 0 before we do this?

>>> +				opts->duplicate_output(&buf, &pp->children[i].err,
>>> +					  opts->data,
>>> +					  pp->children[i].data);
>>> +			strbuf_release(&buf);
>>>   			if (n == 0) {
>>>   				close(pp->children[i].process.err);
>>>   				pp->children[i].state = GIT_CP_WAIT_CLEANUP;
>>
>> A common pattern is that optional behavior does not impose additional
>> costs on the non-optional part. Here, we unconditionally do a
>> strbuf_addbuf() even though we don't use the result in the "else" case.
>>
>> So this might be more idiomatically written as:
>>
>>          int n = strbuf_read_once(&pp->children[i].err,
>>          			 pp->children[i].process.err, 0);
>>   +      if (opts->duplicate_output) {
>>   +          struct strbuf buf = STRBUF_INIT;
>>   +          strbuf_addbuf(&buf, &pp->children[i].err);
>>   +        	opts->duplicate_output(&buf, &pp->children[i].err,
>>   +        		  opts->data,
>>   +        		  pp->children[i].data);
>>   +          strbuf_release(&buf);
>>   +      }
>>
> [...]
> And why does "&buf" exist at all? 

I was wondering that as too

>Why can't we just pass
> &pp->children[i].err, and if this callback cares about the last thing we
> read let's pass it an offset, so it can know what came from the
> strbuf_read_once() (I don't know if it actually cares about that
> either...).

Or we could just pass a `const char*`, `size_t` pair.

> That would avoid the copy entirely.

Is the copying quadratic at the moment? - it looks like each call to 
strbuf_read_once() appends to the buffer and we copy the whole buffer 
each time.

Best Wishes

Phillip
Phillip Wood Nov. 30, 2022, 10:28 a.m. UTC | #6
On 30/11/2022 09:53, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Nov 29 2022, Glen Choo wrote:
> 
>> Calvin Wan <calvinwan@google.com> writes:
>> So this might be more idiomatically written as:
>>
>>          int n = strbuf_read_once(&pp->children[i].err,
>>          			 pp->children[i].process.err, 0);
>>   +      if (opts->duplicate_output) {
>>   +          struct strbuf buf = STRBUF_INIT;
>>   +          strbuf_addbuf(&buf, &pp->children[i].err);
>>   +        	opts->duplicate_output(&buf, &pp->children[i].err,
>>   +        		  opts->data,
>>   +        		  pp->children[i].data);
>>   +          strbuf_release(&buf);
>>   +      }
>>
>> which also has the nice benefit of not touching the strbuf_read_once()
>> line.
> 
> We should also use "size_t n" there, not "int n", which is what it
> returns.

It returns an ssize_t not size_t, lower down we test `n < 0` so we 
certainly should not be using an unsigned type.

Best Wishes

Phillip
Ævar Arnfjörð Bjarmason Nov. 30, 2022, 10:57 a.m. UTC | #7
On Wed, Nov 30 2022, Phillip Wood wrote:

> On 30/11/2022 09:53, Ævar Arnfjörð Bjarmason wrote:
>> On Tue, Nov 29 2022, Glen Choo wrote:
>> 
>>> Calvin Wan <calvinwan@google.com> writes:
>>> So this might be more idiomatically written as:
>>>
>>>          int n = strbuf_read_once(&pp->children[i].err,
>>>          			 pp->children[i].process.err, 0);
>>>   +      if (opts->duplicate_output) {
>>>   +          struct strbuf buf = STRBUF_INIT;
>>>   +          strbuf_addbuf(&buf, &pp->children[i].err);
>>>   +        	opts->duplicate_output(&buf, &pp->children[i].err,
>>>   +        		  opts->data,
>>>   +        		  pp->children[i].data);
>>>   +          strbuf_release(&buf);
>>>   +      }
>>>
>>> which also has the nice benefit of not touching the strbuf_read_once()
>>> line.
>> We should also use "size_t n" there, not "int n", which is what it
>> returns.
>
> It returns an ssize_t not size_t, lower down we test `n < 0` so we
> certainly should not be using an unsigned type.

Good catch, I just skimmed it and missed the extra "s". In any case:
let's use the type it's returning, so ssize_t, not int.

(DEVOPTS=extra-all etc. will also catch this, depending on your compiler
etc.)
Calvin Wan Nov. 30, 2022, 6:46 p.m. UTC | #8
On Mon, Nov 28, 2022 at 12:45 PM Jonathan Tan <jonathantanmy@google.com> wrote:

> Looking at this patch, since this new option is incompatible with "ungroup",
> I would have expected that the new functionality be in a place that already
> contains an "if (ungroup)", and thus would go into the "else" block. Looking at
> the code, it seems like a reasonable place would be in pp_collect_finished().

The code lives in pp_buffer_stderr(), which if you go one level higher, you'll
notice that the call to pp_buffer_stderr() is in the "else" block of an
"if (ungroup)".

> Is the reason this is not there because we only want the output of the child
> process, not anything that the callback functions might write to the out
> strbuf? If yes, is there a reason for that? If not, I think the code would
> be simpler if we did what I suggested. (Maybe this has already been discussed
> previously - if that is the case, the reason for doing it this way should be in
> the commit message.)

Yes, inside of pp_output(), you'll see that if the process is the output_owner,
then "pp->children[i].err" is printed and reset, which is why the code
lives before
pp_output(). The caller already has access to the callback functions and knows
what will be written to the out strbuf -- the goal of this patch is to
provide access
to all of the child output.

>
> > diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
> > index 3ecb830f4a..40dd329e02 100644
> > --- a/t/helper/test-run-command.c
> > +++ b/t/helper/test-run-command.c
> > @@ -52,6 +52,21 @@ static int no_job(struct child_process *cp,
> >       return 0;
> >  }
> >
> > +static void duplicate_output(struct strbuf *process_out,
> > +                     struct strbuf *out,
> > +                     void *pp_cb,
> > +                     void *pp_task_cb)
> > +{
> > +     struct string_list list = STRING_LIST_INIT_DUP;
> > +
> > +     string_list_split(&list, process_out->buf, '\n', -1);
> > +     for (size_t i = 0; i < list.nr; i++) {
> > +             if (strlen(list.items[i].string) > 0)
> > +                     fprintf(stderr, "duplicate_output: %s\n", list.items[i].string);
> > +     }
> > +     string_list_clear(&list, 0);
> > +}
> > +
> >  static int task_finished(int result,
> >                        struct strbuf *err,
> >                        void *pp_cb,
> > @@ -439,6 +454,12 @@ int cmd__run_command(int argc, const char **argv)
> >               opts.ungroup = 1;
> >       }
> >
> > +     if (!strcmp(argv[1], "--duplicate-output")) {
> > +             argv += 1;
> > +             argc -= 1;
> > +             opts.duplicate_output = duplicate_output;
> > +     }
>
> In the tests, can we also write things from the callback functions? Whether we think that callback output should be
> duplicated or not, we should test what happens to them.

ack.
Calvin Wan Nov. 30, 2022, 6:47 p.m. UTC | #9
On Mon, Nov 28, 2022 at 9:11 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Nov 8, 2022 at 11:11 AM Calvin Wan <calvinwan@google.com> wrote:
> >
> > Add duplicate_output_fn as an optionally set function in
> > run_process_parallel_opts. If set, output from each child process is
> > copied and passed to the callback function whenever output from the
> > child process is buffered to allow for separate parsing.
> >
> > Signed-off-by: Calvin Wan <calvinwan@google.com>
> [...]
> > diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
> > index 3ecb830f4a..40dd329e02 100644
> > --- a/t/helper/test-run-command.c
> > +++ b/t/helper/test-run-command.c
> > @@ -52,6 +52,21 @@ static int no_job(struct child_process *cp,
> >         return 0;
> >  }
> >
> > +static void duplicate_output(struct strbuf *process_out,
> > +                       struct strbuf *out,
> > +                       void *pp_cb,
> > +                       void *pp_task_cb)
>
> Should the last two parameters have an "UNUSED" annotation?

Yes, good catch!
Calvin Wan Nov. 30, 2022, 7:02 p.m. UTC | #10
On Wed, Nov 30, 2022 at 2:26 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 30/11/2022 09:53, Ævar Arnfjörð Bjarmason wrote:
> >
> > On Tue, Nov 29 2022, Glen Choo wrote:
> >
> >> Calvin Wan <calvinwan@google.com> writes:
> >>
> >>> @@ -1680,8 +1683,14 @@ static void pp_buffer_stderr(struct parallel_processes *pp,
> >>>     for (size_t i = 0; i < opts->processes; i++) {
> >>>             if (pp->children[i].state == GIT_CP_WORKING &&
> >>>                 pp->pfd[i].revents & (POLLIN | POLLHUP)) {
> >>> -                   int n = strbuf_read_once(&pp->children[i].err,
> >>> -                                            pp->children[i].process.err, 0);
> >>> +                   struct strbuf buf = STRBUF_INIT;
> >>> +                   int n = strbuf_read_once(&buf, pp->children[i].process.err, 0);
> >>> +                   strbuf_addbuf(&pp->children[i].err, &buf);
> >>> +                   if (opts->duplicate_output)
>
> Shouldn't we be checking if n < 0 before we do this?

Yes we should.
>
> >>> +                           opts->duplicate_output(&buf, &pp->children[i].err,
> >>> +                                     opts->data,
> >>> +                                     pp->children[i].data);
> >>> +                   strbuf_release(&buf);
> >>>                     if (n == 0) {
> >>>                             close(pp->children[i].process.err);
> >>>                             pp->children[i].state = GIT_CP_WAIT_CLEANUP;
> >>
> >> A common pattern is that optional behavior does not impose additional
> >> costs on the non-optional part. Here, we unconditionally do a
> >> strbuf_addbuf() even though we don't use the result in the "else" case.
> >>
> >> So this might be more idiomatically written as:
> >>
> >>          int n = strbuf_read_once(&pp->children[i].err,
> >>                               pp->children[i].process.err, 0);
> >>   +      if (opts->duplicate_output) {
> >>   +          struct strbuf buf = STRBUF_INIT;
> >>   +          strbuf_addbuf(&buf, &pp->children[i].err);
> >>   +          opts->duplicate_output(&buf, &pp->children[i].err,
> >>   +                    opts->data,
> >>   +                    pp->children[i].data);
> >>   +          strbuf_release(&buf);
> >>   +      }
> >>
> > [...]
> > And why does "&buf" exist at all?
>
> I was wondering that as too
>
> >Why can't we just pass
> > &pp->children[i].err, and if this callback cares about the last thing we
> > read let's pass it an offset, so it can know what came from the
> > strbuf_read_once() (I don't know if it actually cares about that
> > either...).
>
> Or we could just pass a `const char*`, `size_t` pair.

Both you and Avar are correct that &buf isn't necessary. I think the
offset idea works better so the function calls stay similar.

>
> > That would avoid the copy entirely.
>
> Is the copying quadratic at the moment? - it looks like each call to
> strbuf_read_once() appends to the buffer and we copy the whole buffer
> each time.
>

Depends on how you're defining it, but doesn't really matter since it's
going away anyways ;)
diff mbox series

Patch

diff --git a/run-command.c b/run-command.c
index c772acd743..b8f430eb03 100644
--- a/run-command.c
+++ b/run-command.c
@@ -1560,6 +1560,9 @@  static void pp_init(struct parallel_processes *pp,
 
 	if (!opts->get_next_task)
 		BUG("you need to specify a get_next_task function");
+	
+	if (opts->duplicate_output && opts->ungroup)
+		BUG("duplicate_output and ungroup are incompatible with each other");
 
 	CALLOC_ARRAY(pp->children, n);
 	if (!opts->ungroup)
@@ -1680,8 +1683,14 @@  static void pp_buffer_stderr(struct parallel_processes *pp,
 	for (size_t i = 0; i < opts->processes; i++) {
 		if (pp->children[i].state == GIT_CP_WORKING &&
 		    pp->pfd[i].revents & (POLLIN | POLLHUP)) {
-			int n = strbuf_read_once(&pp->children[i].err,
-						 pp->children[i].process.err, 0);
+			struct strbuf buf = STRBUF_INIT;
+			int n = strbuf_read_once(&buf, pp->children[i].process.err, 0);
+			strbuf_addbuf(&pp->children[i].err, &buf);
+			if (opts->duplicate_output)
+				opts->duplicate_output(&buf, &pp->children[i].err,
+					  opts->data,
+					  pp->children[i].data);
+			strbuf_release(&buf);
 			if (n == 0) {
 				close(pp->children[i].process.err);
 				pp->children[i].state = GIT_CP_WAIT_CLEANUP;
diff --git a/run-command.h b/run-command.h
index e3e1ea01ad..dd6d6a86c2 100644
--- a/run-command.h
+++ b/run-command.h
@@ -440,6 +440,24 @@  typedef int (*start_failure_fn)(struct strbuf *out,
 				void *pp_cb,
 				void *pp_task_cb);
 
+/**
+ * This callback is called whenever output from a child process is buffered
+ * 
+ * "struct strbuf *process_out" contains the output from the child process
+ * 
+ * See run_processes_parallel() below for a discussion of the "struct
+ * strbuf *out" parameter.
+ *
+ * pp_cb is the callback cookie as passed into run_processes_parallel,
+ * pp_task_cb is the callback cookie as passed into get_next_task_fn.
+ *
+ * This function is incompatible with "ungroup"
+ */
+typedef void (*duplicate_output_fn)(struct strbuf *process_out,
+				    struct strbuf *out,
+				    void *pp_cb,
+				    void *pp_task_cb);
+
 /**
  * This callback is called on every child process that finished processing.
  *
@@ -493,6 +511,12 @@  struct run_process_parallel_opts
 	 */
 	start_failure_fn start_failure;
 
+	/**
+	 * duplicate_output: See duplicate_output_fn() above. This should be
+	 * NULL unless process specific output is needed
+	 */
+	duplicate_output_fn duplicate_output;
+
 	/**
 	 * task_finished: See task_finished_fn() above. This can be
 	 * NULL to omit any special handling.
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 3ecb830f4a..40dd329e02 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -52,6 +52,21 @@  static int no_job(struct child_process *cp,
 	return 0;
 }
 
+static void duplicate_output(struct strbuf *process_out,
+			struct strbuf *out,
+			void *pp_cb,
+			void *pp_task_cb)
+{
+	struct string_list list = STRING_LIST_INIT_DUP;
+
+	string_list_split(&list, process_out->buf, '\n', -1);
+	for (size_t i = 0; i < list.nr; i++) {
+		if (strlen(list.items[i].string) > 0)
+			fprintf(stderr, "duplicate_output: %s\n", list.items[i].string);
+	}
+	string_list_clear(&list, 0);
+}
+
 static int task_finished(int result,
 			 struct strbuf *err,
 			 void *pp_cb,
@@ -439,6 +454,12 @@  int cmd__run_command(int argc, const char **argv)
 		opts.ungroup = 1;
 	}
 
+	if (!strcmp(argv[1], "--duplicate-output")) {
+		argv += 1;
+		argc -= 1;
+		opts.duplicate_output = duplicate_output;
+	}
+
 	jobs = atoi(argv[2]);
 	strvec_clear(&proc.args);
 	strvec_pushv(&proc.args, (const char **)argv + 3);
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 7b5423eebd..130aec7c68 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -134,6 +134,15 @@  test_expect_success 'run_command runs in parallel with more jobs available than
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command runs in parallel with more jobs available than tasks --duplicate-output' '
+	test-tool run-command --duplicate-output run-command-parallel 5 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
+	test_must_be_empty out &&
+	test 4 = $(grep -c "duplicate_output: Hello" err) &&
+	test 4 = $(grep -c "duplicate_output: World" err) &&
+	sed "/duplicate_output/d" err > err1 &&
+	test_cmp expect err1
+'
+
 test_expect_success 'run_command runs ungrouped in parallel with more jobs available than tasks' '
 	test-tool run-command --ungroup run-command-parallel 5 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
 	test_line_count = 8 out &&
@@ -145,6 +154,15 @@  test_expect_success 'run_command runs in parallel with as many jobs as tasks' '
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command runs in parallel with as many jobs as tasks --duplicate-output' '
+	test-tool run-command --duplicate-output run-command-parallel 4 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
+	test_must_be_empty out &&
+	test 4 = $(grep -c "duplicate_output: Hello" err) &&
+	test 4 = $(grep -c "duplicate_output: World" err) &&
+	sed "/duplicate_output/d" err > err1 &&
+	test_cmp expect err1
+'
+
 test_expect_success 'run_command runs ungrouped in parallel with as many jobs as tasks' '
 	test-tool run-command --ungroup run-command-parallel 4 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
 	test_line_count = 8 out &&
@@ -156,6 +174,15 @@  test_expect_success 'run_command runs in parallel with more tasks than jobs avai
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command runs in parallel with more tasks than jobs available --duplicate-output' '
+	test-tool run-command --duplicate-output run-command-parallel 3 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
+	test_must_be_empty out &&
+	test 4 = $(grep -c "duplicate_output: Hello" err) &&
+	test 4 = $(grep -c "duplicate_output: World" err) &&
+	sed "/duplicate_output/d" err > err1 &&
+	test_cmp expect err1
+'
+
 test_expect_success 'run_command runs ungrouped in parallel with more tasks than jobs available' '
 	test-tool run-command --ungroup run-command-parallel 3 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
 	test_line_count = 8 out &&
@@ -176,6 +203,12 @@  test_expect_success 'run_command is asked to abort gracefully' '
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command is asked to abort gracefully --duplicate-output' '
+	test-tool run-command --duplicate-output run-command-abort 3 false >out 2>err &&
+	test_must_be_empty out &&
+	test_cmp expect err
+'
+
 test_expect_success 'run_command is asked to abort gracefully (ungroup)' '
 	test-tool run-command --ungroup run-command-abort 3 false >out 2>err &&
 	test_must_be_empty out &&
@@ -191,6 +224,12 @@  test_expect_success 'run_command outputs ' '
 	test_cmp expect actual
 '
 
+test_expect_success 'run_command outputs --duplicate-output' '
+	test-tool run-command --duplicate-output run-command-no-jobs 3 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
+	test_must_be_empty out &&
+	test_cmp expect err
+'
+
 test_expect_success 'run_command outputs (ungroup) ' '
 	test-tool run-command --ungroup run-command-no-jobs 3 sh -c "printf \"%s\n%s\n\" Hello World" >out 2>err &&
 	test_must_be_empty out &&