mbox series

[v5,0/6] submodule: parallelize diff

Message ID 20230104215415.1083526-1-calvinwan@google.com (mailing list archive)
Headers show
Series submodule: parallelize diff | expand

Message

Calvin Wan Jan. 4, 2023, 9:54 p.m. UTC
Original cover letter for context:
https://lore.kernel.org/git/20221011232604.839941-1-calvinwan@google.com/

Thank you again everyone for the numerous reviews! For this reroll, I
incorporated most of the feedback given, fixed a bug I found, and made
some stylistic refactors. I also added a new patch at the end that swaps
the serial implementation in is_submodule_modified for the new parallel
one. While I had patch 6 originally smushed with the previous one,
the diff came out not very reviewer friendly so it has been separated
out.

Changes since v4

(Patch 1)
The code in run-command.c that calls duplicate_output_fn has been
cleaned up and no longer passes a separate strbuf for the output. It
instead passes an offset that represents the starting point in the
original strbuf.

(Patch 5)
Moved status parsing from status_duplicate_output to status_finish. In
pp_buffer_stderr::run-command.c, output is gathered by strbuf_read_once
which reads 8192 bytes at once so a longer status message would error
out during status parsing since part of it would be cut off. Therefore,
status parsing must happen at the end of the process rather than in
duplicate_output_fn (and has subsequently been moved).

(Patch 6)
New patch swapping serial implementation in is_submodule_modified for
the new parallel one.

Calvin Wan (6):
  run-command: add duplicate_output_fn to run_processes_parallel_opts
  submodule: strbuf variable rename
  submodule: move status parsing into function
  diff-lib: refactor match_stat_with_submodule
  diff-lib: parallelize run_diff_files for submodules
  submodule: call parallel code from serial status

 Documentation/config/submodule.txt |  12 ++
 diff-lib.c                         | 104 ++++++++++--
 run-command.c                      |  16 +-
 run-command.h                      |  27 ++++
 submodule.c                        | 250 ++++++++++++++++++++++-------
 submodule.h                        |   9 ++
 t/helper/test-run-command.c        |  21 +++
 t/t0061-run-command.sh             |  39 +++++
 t/t4027-diff-submodule.sh          |  19 +++
 t/t7506-status-submodule.sh        |  19 +++
 10 files changed, 441 insertions(+), 75 deletions(-)

Comments

Calvin Wan Jan. 5, 2023, 11:23 p.m. UTC | #1
Apologies for the broken link to the previous versions. Looks like I had
some encoding issues with copy/paste. Here are the previous versions

v4: https://lore.kernel.org/git/20221108184200.2813458-1-calvinwan@google.com/
v3: https://lore.kernel.org/git/20221020232532.1128326-1-calvinwan@google.com/
v2: https://lore.kernel.org/git/20221011232604.839941-1-calvinwan@google.com/
v1: https://lore.kernel.org/git/20220922232947.631309-1-calvinwan@google.com/

On Wed, Jan 4, 2023 at 1:54 PM Calvin Wan <calvinwan@google.com> wrote:
>
> Original cover letter for context:
> https://lore.kernel.org/git/20221011232604.839941-1-calvinwan@google.com/
>
> Thank you again everyone for the numerous reviews! For this reroll, I
> incorporated most of the feedback given, fixed a bug I found, and made
> some stylistic refactors. I also added a new patch at the end that swaps
> the serial implementation in is_submodule_modified for the new parallel
> one. While I had patch 6 originally smushed with the previous one,
> the diff came out not very reviewer friendly so it has been separated
> out.
>
> Changes since v4
>
> (Patch 1)
> The code in run-command.c that calls duplicate_output_fn has been
> cleaned up and no longer passes a separate strbuf for the output. It
> instead passes an offset that represents the starting point in the
> original strbuf.
>
> (Patch 5)
> Moved status parsing from status_duplicate_output to status_finish. In
> pp_buffer_stderr::run-command.c, output is gathered by strbuf_read_once
> which reads 8192 bytes at once so a longer status message would error
> out during status parsing since part of it would be cut off. Therefore,
> status parsing must happen at the end of the process rather than in
> duplicate_output_fn (and has subsequently been moved).
>
> (Patch 6)
> New patch swapping serial implementation in is_submodule_modified for
> the new parallel one.
>
> Calvin Wan (6):
>   run-command: add duplicate_output_fn to run_processes_parallel_opts
>   submodule: strbuf variable rename
>   submodule: move status parsing into function
>   diff-lib: refactor match_stat_with_submodule
>   diff-lib: parallelize run_diff_files for submodules
>   submodule: call parallel code from serial status
>
>  Documentation/config/submodule.txt |  12 ++
>  diff-lib.c                         | 104 ++++++++++--
>  run-command.c                      |  16 +-
>  run-command.h                      |  27 ++++
>  submodule.c                        | 250 ++++++++++++++++++++++-------
>  submodule.h                        |   9 ++
>  t/helper/test-run-command.c        |  21 +++
>  t/t0061-run-command.sh             |  39 +++++
>  t/t4027-diff-submodule.sh          |  19 +++
>  t/t7506-status-submodule.sh        |  19 +++
>  10 files changed, 441 insertions(+), 75 deletions(-)
>
> --
> 2.39.0.314.g84b9a713c41-goog
>