[v2,0/4] Warn about git-filter-branch usage and avoid it

Message ID 20190828002210.8862-1-newren@gmail.com (mailing list archive)

Elijah Newren Aug. 28, 2019, 12:22 a.m. UTC
Here's a series that shifts the focus slightly to warning about
git-filter-branch usage and avoiding it ourselves.  I have retained
patch 4 but left it marked as RFC for further discussion.  Folks
generally seem to agree that the first three patches are good to
include now -- assuming my small fixes correctly address their
requests and suggestions.

Changes since v1 (full range-diff below):
  * I might have had a little fun with a thesaurus (just trying to give
    reviewers something small to smile about...)
  * addressed feedback from Eric and Stolee, as detailed below
  * [Patch 2] factored out some common code
  * [Patch 3] fixed links in asciidoc documentation to make them more
    readable in both manpages and html format
  * [Patch 3] added a warning blurb to git-filter-branch itself

In particular, it'd be helpful if people could take a look at the changes
to git-filter-branch.sh in patch 3 and comment on whether an environment
variable is the right way to squelch the new warning, or whether it
should be a config setting instead.
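
To make the two options concrete, here is a rough sketch of each as a
small shell predicate.  This is not code from the series: the function
names and the config key filterbranch.squelchwarning are hypothetical,
picked only for illustration.  The environment-variable test mirrors the
one patch 3 adds to git-filter-branch.sh, leaving out the extra check it
uses to stay quiet in the test suite.

```shell
#!/bin/sh
# Sketch only; function names and the config key are assumptions.

# Option 1: environment variable, as in patch 3.
# Succeeds (exit 0) when the warning should be shown, i.e. when
# FILTER_BRANCH_SQUELCH_WARNING is unset or empty.
should_warn_env () {
	[ -z "$FILTER_BRANCH_SQUELCH_WARNING" ]
}

# Option 2: config setting (hypothetical key, not part of this series).
# "git config --bool --get" prints "true" or "false" for a set key and
# prints nothing (with a non-zero exit) when the key is unset, so an
# unset key leaves the warning enabled.
should_warn_config () {
	squelch=$(git config --bool --get filterbranch.squelchwarning 2>/dev/null)
	[ "$squelch" != "true" ]
}
```

Either predicate would guard the same warning text and pause; the
difference is only whether users silence it per invocation or shell
session (environment) or persistently per repository or user (config).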

Elijah Newren (4):
  t6006: simplify and optimize empty message test
  t3427: accelerate this test by using fast-export and fast-import
  Recommend git-filter-repo instead of git-filter-branch
  [RFC] Remove git-filter-branch, it is now external to git.git

 .gitignore                          |   1 -
 Documentation/git-fast-export.txt   |   6 +-
 Documentation/git-filter-branch.txt | 481 --------------------
 Documentation/git-gc.txt            |  17 +-
 Documentation/git-rebase.txt        |   2 +-
 Documentation/git-replace.txt       |  10 +-
 Documentation/git-svn.txt           |   4 +-
 Documentation/githooks.txt          |   7 +-
 Makefile                            |   1 -
 command-list.txt                    |   1 -
 contrib/svn-fe/svn-fe.txt           |   4 +-
 git-filter-branch.sh                | 662 ----------------------------
 t/perf/p7000-filter-branch.sh       |  24 -
 t/t3427-rebase-subtree.sh           |  22 +-
 t/t6006-rev-list-format.sh          |   5 +-
 t/t7003-filter-branch.sh            | 505 ---------------------
 t/t7009-filter-branch-null-sha1.sh  |  55 ---
 t/t9902-completion.sh               |  12 +-
 18 files changed, 47 insertions(+), 1772 deletions(-)
 delete mode 100644 Documentation/git-filter-branch.txt
 delete mode 100755 git-filter-branch.sh
 delete mode 100755 t/perf/p7000-filter-branch.sh
 delete mode 100755 t/t7003-filter-branch.sh
 delete mode 100755 t/t7009-filter-branch-null-sha1.sh

Range-diff:
1:  7ddbeea2ca = 1:  7ddbeea2ca t6006: simplify and optimize empty message test
2:  0172ca771e < -:  ---------- t3427: accelerate this test by using fast-export and fast-import
3:  b814cc7b65 < -:  ---------- git-sh-i18n: work with external scripts
-:  ---------- > 2:  f18bd7a609 t3427: accelerate this test by using fast-export and fast-import
4:  dcec36d113 ! 3:  7008c16984 Recommend git-filter-repo instead of git-filter-branch in documentation
    @@ Metadata
     Author: Elijah Newren <newren@gmail.com>
     
      ## Commit message ##
    -    Recommend git-filter-repo instead of git-filter-branch in documentation
    +    Recommend git-filter-repo instead of git-filter-branch
     
    -    filter-branch suffers from a huge number of pitfalls that can result in
    -    incorrectly rewritten history, and many of the problems can easily go
    -    undetected until the new repository is in use.  This can result in
    -    problems ranging from an even messier history than what led folks to
    -    filter-branch in the first place, to data loss or corruption.  These
    -    issues cannot be backward compatibly fixed, so add a warning to the
    -    filter-branch manpage about this and recommand that another tool (such
    -    as filter-repo) be used instead.
    +    filter-branch suffers from a deluge of disguised dangers that disfigure
    +    history rewrites (i.e. deviate from the deliberate changes).  Many of
    +    these problems are unobtrusive and can easily go undiscovered until the
    +    new repository is in use.  This can result in problems ranging from an
    +    even messier history than what led folks to filter-branch in the first
    +    place, to data loss or corruption.  These issues cannot be backward
    +    compatibly fixed, so add a warning to both filter-branch and its manpage
    +    recommending that another tool (such as filter-repo) be used instead.
     
         Also, update other manpages that referenced filter-branch.  Several of
         these needed updates even if we could continue recommending
    @@ Documentation/git-filter-branch.txt: SYNOPSIS
      
     +WARNING
     +-------
    -+'git filter-branch' has a litany of gotchas that can and will cause
    -+history to be rewritten incorrectly (in addition to abysmal
    -+performance).  These issues cannot be backward compatibly fixed and as
    -+such, its use is not recommended.  Please use an alternative history
    -+filtering tool such as 'git filter-repo'.  If you still need to use
    -+'git filter-branch', please carefully read the "Safety" section of
    -+https://public-inbox.org/git/CABPp-BEDOH-row-hxY4u_cP30ptqOpcCvPibwyZ2wBu142qUbA@mail.gmail.com/
    -+and avoid as many of the pitfalls listed there as reasonably possible.
    ++'git filter-branch' has a plethora of pitfalls that can produce non-obvious
    ++manglings of the intended history rewrite (and can leave you with little
    ++time to investigate such problems since it has such abysmal performance).
    ++These safety and performance issues cannot be backward compatibly fixed and
    ++as such, its use is not recommended.  Please use an alternative history
    ++filtering tool such as https://github.com/newren/git-filter-repo/[git
    ++filter-repo].  If you still need to use 'git filter-branch', please
    ++carefully read the "Safety" section of the message on the Git mailing list
    ++https://public-inbox.org/git/CABPp-BEDOH-row-hxY4u_cP30ptqOpcCvPibwyZ2wBu142qUbA@mail.gmail.com/[detailing
    ++the land mines of filter-branch] and vigilantly avoid as many of the
    ++hazards listed there as reasonably possible.
     +
      DESCRIPTION
      -----------
    @@ contrib/svn-fe/svn-fe.txt: The exit status does not reflect whether an error was
     -git-svn(1), svn2git(1), svk(1), git-filter-branch(1), git-fast-import(1),
     +git-svn(1), svn2git(1), svk(1), git-filter-repo(1), git-fast-import(1),
      https://svn.apache.org/repos/asf/subversion/trunk/notes/dump-load-format.txt
    +
    + ## git-filter-branch.sh (mode change 100755 => 100644) ##
    +@@ git-filter-branch.sh: set_ident () {
    + 	finish_ident COMMITTER
    + }
    + 
    ++if [ -z "$FILTER_BRANCH_SQUELCH_WARNING" -a \
    ++     -z "$GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS" ]; then
    ++	cat <<EOF
    ++WARNING: git-filter-branch has a glut of gotchas generating mangled history
    ++         rewrites.  Please use an alternative filtering tool such as 'git
    ++         filter-repo' (https://github.com/newren/git-filter-repo/) instead.
    ++         See the filter-branch manual page for more details; to squelch
    ++         this warning and pause, set FILTER_BRANCH_SQUELCH_WARNING=1.
    ++
    ++EOF
    ++	sleep 5
    ++fi
    ++
    + USAGE="[--setup <command>] [--subdirectory-filter <directory>] [--env-filter <command>]
    + 	[--tree-filter <command>] [--index-filter <command>]
    + 	[--parent-filter <command>] [--msg-filter <command>]
5:  9dec8e06ee ! 4:  ff3e04e558 Remove git-filter-branch, it is now external to git.git
    @@ Metadata
      ## Commit message ##
         Remove git-filter-branch, it is now external to git.git
     
    +    git-filter-branch still exists, still has the same regression tests,
    +    etc., but it is now being tracked in a separate repo that users will
    +    need to download separately.
    +
         Signed-off-by: Elijah Newren <newren@gmail.com>
     
      ## .gitignore ##
    @@ Documentation/git-filter-branch.txt (deleted)
     -
     -WARNING
     --------
    --'git filter-branch' has a litany of gotchas that can and will cause
    --history to be rewritten incorrectly (in addition to abysmal
    --performance).  These issues cannot be backward compatibly fixed and as
    --such, its use is not recommended.  Please use an alternative history
    --filtering tool such as 'git filter-repo'.  If you still need to use
    --'git filter-branch', please carefully read the "Safety" section of
    --https://public-inbox.org/git/CABPp-BEDOH-row-hxY4u_cP30ptqOpcCvPibwyZ2wBu142qUbA@mail.gmail.com/
    --and avoid as many of the pitfalls listed there as reasonably possible.
    +-'git filter-branch' has a plethora of pitfalls that can produce non-obvious
    +-manglings of the intended history rewrite (and can leave you with little
    +-time to investigate such problems since it has such abysmal performance).
    +-These safety and performance issues cannot be backward compatibly fixed and
    +-as such, its use is not recommended.  Please use an alternative history
    +-filtering tool such as https://github.com/newren/git-filter-repo/[git
    +-filter-repo].  If you still need to use 'git filter-branch', please
    +-carefully read the "Safety" section of the message on the Git mailing list
    +-https://public-inbox.org/git/CABPp-BEDOH-row-hxY4u_cP30ptqOpcCvPibwyZ2wBu142qUbA@mail.gmail.com/[detailing
    +-the land mines of filter-branch] and vigilantly avoid as many of the
    +-hazards listed there as reasonably possible.
     -
     -DESCRIPTION
     ------------
    @@ git-filter-branch.sh (deleted)
     -	finish_ident COMMITTER
     -}
     -
    +-if [ -z "$FILTER_BRANCH_SQUELCH_WARNING" -a \
    +-     -z "$GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS" ]; then
    +-	cat <<EOF
    +-WARNING: git-filter-branch has a glut of gotchas generating mangled history
    +-         rewrites.  Please use an alternative filtering tool such as 'git
    +-         filter-repo' (https://github.com/newren/git-filter-repo/) instead.
    +-         See the filter-branch manual page for more details; to squelch
    +-         this warning and pause, set FILTER_BRANCH_SQUELCH_WARNING=1.
    +-
    +-EOF
    +-	sleep 5
    +-fi
    +-
     -USAGE="[--setup <command>] [--subdirectory-filter <directory>] [--env-filter <command>]
     -	[--tree-filter <command>] [--index-filter <command>]
     -	[--parent-filter <command>] [--msg-filter <command>]