[v3,0/2] diff: add -I<regex> that ignores matching changes

Message ID 20201015072406.4506-1-michal@isc.org (mailing list archive)

Message

Michał Kępień Oct. 15, 2020, 7:24 a.m. UTC
This patch series adds a new diff option that enables ignoring changes
in which every line (changed, removed, or added) matches a given regular
expression.  This is similar to the -I/--ignore-matching-lines option in
standalone diff utilities and can be used e.g. to ignore changes which
only affect code comments or to look for unrelated changes in commits
containing a large number of automatically applied modifications (e.g. a
tree-wide string replacement).  The difference between -G/-S and the new
-I option is that the latter filters output on a per-change basis.
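
To illustrate the per-change rule described above, here is a minimal
sketch in C using POSIX regular expressions.  It is not code from the
patches: change_is_ignorable() and its parameters are hypothetical names,
and it assumes that with several -I options a line counts as matching
when at least one of the expressions matches it.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patches: return 1 if every
 * line in lines[0..nr-1] matches at least one of the compiled regular
 * expressions, 0 otherwise.
 */
int change_is_ignorable(const char **lines, size_t nr,
			regex_t **regex, size_t regex_nr)
{
	size_t i, j;

	assert(nr == 0 || lines);
	for (i = 0; i < nr; i++) {
		int matched = 0;

		/* A line is covered as soon as any expression matches it. */
		for (j = 0; j < regex_nr && !matched; j++)
			matched = !regexec(regex[j], lines[i], 0, NULL, 0);
		if (!matched)
			return 0; /* one non-matching line keeps the change */
	}
	return 1;
}
```

With a single expression such as "^#", a change consisting only of
lines starting with "#" would be ignorable under this sketch, while a
change that also touches a line of code would still be shown.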

Changes from v2:

  - Add a long option for -I (--ignore-matching-lines) as it is
    commonplace in standalone diff utilities.  Update documentation and
    commit log messages accordingly.

  - Use xmalloc() instead of xcalloc() for allocating regex_t
    structures in diff_opt_ignore_regex().

  - Ensure the memory allocated in diff_opt_ignore_regex() gets
    released.

  - Use "return error(...)" instead of die() in the -I option callback.
    Make the relevant error message localizable.

  - Drastically reduce the number of -I<regex> tests due to excessive
    run time of t/t4069-diff-ignore-regex.sh from v1/v2 on some
    platforms (notably Windows).  Use a tweaked version of a test
    suggested by Johannes Schindelin (thanks!).  Given its reduction in
    size, squash patch 3 (which contained the tests) into patch 2.

  - Replace "see Documentation/diff-options.txt" with "-I<regex>" in the
    comments for the added structure fields, in order to make these
    comments more useful.
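
The allocation and error-handling points in the list above can be
sketched roughly as follows.  This is an illustration under stated
assumptions, not the code from patch 2: struct ignore_opts,
add_ignore_regex() and release_ignore_regex() are hypothetical names,
plain malloc()/realloc() stand in for git's allocation wrappers, and a
message on stderr plus a -1 return stands in for "return error(...)".
Zeroing the regex_t before regcomp() is presumably unnecessary because
regcomp() fills the structure in on success.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container; the real fields live in struct diff_options. */
struct ignore_opts {
	regex_t **regex;
	size_t nr;
};

/* Stand-in for an -I<regex> callback: compile once, let the caller decide. */
int add_ignore_regex(struct ignore_opts *opts, const char *pattern)
{
	regex_t *re;
	regex_t **grown;

	assert(opts && pattern);
	re = malloc(sizeof(*re)); /* no zeroing; regcomp() initializes it */
	if (!re)
		return -1;
	if (regcomp(re, pattern, REG_EXTENDED | REG_NEWLINE)) {
		free(re); /* compilation failed, so there is nothing to regfree() */
		fprintf(stderr, "error: invalid regex given to -I: '%s'\n",
			pattern);
		return -1; /* report instead of terminating the process */
	}
	grown = realloc(opts->regex, (opts->nr + 1) * sizeof(*grown));
	if (!grown) {
		regfree(re);
		free(re);
		return -1;
	}
	grown[opts->nr++] = re;
	opts->regex = grown;
	return 0;
}

/* Release everything add_ignore_regex() allocated. */
void release_ignore_regex(struct ignore_opts *opts)
{
	size_t i;

	for (i = 0; i < opts->nr; i++) {
		regfree(opts->regex[i]);
		free(opts->regex[i]);
	}
	free(opts->regex);
	opts->regex = NULL;
	opts->nr = 0;
}
```

Returning an error code leaves the decision about how to fail to the
option parser, and pairing the callback with a release function keeps
the compiled expressions from leaking once the diff is done.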

Changes from v1:

  - Add a new preliminary cleanup patch which ensures xpparam_t
    structures are always zero-initialized.  (This was a prerequisite
    for the next change below.)

  - Do not add a new 'xdl_opts' flag to check whether -I was used;
    instead, just check whether the array of regular expressions to
    match against is non-NULL.

  - Enable the -I option to be used multiple times.  As a consequence of
    this, regular expressions are now "pre-compiled" in the option's
    callback (and passed around as an array of regex_t structures)
    rather than deep down in xdiff code.  Add test cases exercising use
    of multiple -I options in the same git invocation.  Update
    documentation accordingly.

  - Rename xdl_mark_ignorable() to xdl_mark_ignorable_lines(), to
    indicate that it is logically a "sibling" of
    xdl_mark_ignorable_regex() rather than its "parent".

  - Optimize xdl_mark_ignorable_regex() by making it immediately skip
    changes already marked as ignored by xdl_mark_ignorable_lines().

  - Fix coding style issue in the prototype part of the definition of
    xdl_mark_ignorable_regex().

  - Add "/* see Documentation/diff-options.txt */" comments for the
    fields added to struct diff_options and xpparam_t, mimicking the
    comments used for 'anchors', 'anchors_nr', and 'anchors_alloc'.

  - Revise commit log messages to reflect all of the above.
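
As a rough illustration of why the zero-initialization cleanup matters
for the "non-NULL array" check, and of the skip optimization mentioned
above, consider the sketch below.  struct params and struct change are
hypothetical, trimmed-down stand-ins (not xpparam_t or the xdiff change
records), and each change is simplified to a single line.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the parameter structure. */
struct params {
	unsigned long flags;
	regex_t **ignore_regex; /* NULL unless -I<regex> was given */
	size_t ignore_regex_nr;
};

/* Hypothetical, simplified change record: one line per change. */
struct change {
	const char *line;
	int ignore;
};

void mark_ignorable_regex(struct change *chg, size_t nr,
			  const struct params *pp)
{
	size_t i, j;

	assert(pp && (nr == 0 || chg));
	if (!pp->ignore_regex)
		return; /* no separate flag needed: NULL means "not used" */
	for (i = 0; i < nr; i++) {
		if (chg[i].ignore)
			continue; /* already marked by an earlier pass */
		for (j = 0; j < pp->ignore_regex_nr; j++) {
			if (!regexec(pp->ignore_regex[j], chg[i].line,
				     0, NULL, 0)) {
				chg[i].ignore = 1;
				break;
			}
		}
	}
}
```

A caller that does memset(&pp, 0, sizeof(pp)) before filling in the
fields it cares about gets a NULL ignore_regex for free, which is what
makes the pointer check a safe replacement for a dedicated flag.  An
uninitialized structure would leave that pointer indeterminate.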

Michał Kępień (2):
  merge-base, xdiff: zero out xpparam_t structures
  diff: add -I<regex> that ignores matching changes

 Documentation/diff-options.txt |  5 ++++
 builtin/merge-tree.c           |  1 +
 diff.c                         | 28 ++++++++++++++++++++
 diff.h                         |  4 +++
 t/t4013-diff-various.sh        | 33 ++++++++++++++++++++++++
 xdiff/xdiff.h                  |  4 +++
 xdiff/xdiffi.c                 | 47 ++++++++++++++++++++++++++++++++--
 xdiff/xhistogram.c             |  2 ++
 xdiff/xpatience.c              |  2 ++
 9 files changed, 124 insertions(+), 2 deletions(-)

Comments

Johannes Schindelin Oct. 16, 2020, 10 a.m. UTC | #1
Hi Michał,

On Thu, 15 Oct 2020, Michał Kępień wrote:

> This patch series adds a new diff option that enables ignoring changes
> in which every line (changed, removed, or added) matches a given regular
> expression.  This is similar to the -I/--ignore-matching-lines option in
> standalone diff utilities and can be used e.g. to ignore changes which
> only affect code comments or to look for unrelated changes in commits
> containing a large number of automatically applied modifications (e.g. a
> tree-wide string replacement).  The difference between -G/-S and the new
> -I option is that the latter filters output on a per-change basis.
>
> Changes from v2:
>
>   - Add a long option for -I (--ignore-matching-lines) as it is
>     commonplace in standalone diff utilities.  Update documentation and
>     commit log messages accordingly.
>
>   - Use xmalloc() instead of xcalloc() for allocating regex_t
>     structures in diff_opt_ignore_regex().
>
>   - Ensure the memory allocated in diff_opt_ignore_regex() gets
>     released.
>
>   - Use "return error(...)" instead of die() in the -I option callback.
>     Make the relevant error message localizable.
>
>   - Drastically reduce the number of -I<regex> tests due to excessive
>     run time of t/t4069-diff-ignore-regex.sh from v1/v2 on some
>     platforms (notably Windows).  Use a tweaked version of a test
>     suggested by Johannes Schindelin (thanks!).  Given its reduction in
>     size, squash patch 3 (which contained the tests) into patch 2.
>
>   - Replace "see Documentation/diff-options.txt" with "-I<regex>" in the
>     comments for the added structure fields, in order to make these
>     comments more useful.

Thank you for this diligent work! I looked over the patches and like them
a lot!

Thanks,
Dscho