[RFC,v2,1/2] git-p4: inexact label detection

Message ID 54ef897fcf645d241690ce3be6867cb60d829552.1554162242.git.amazo@checkvideo.com (mailing list archive)
State New, archived
Series git-p4: inexact labels and load changelist description from file

Commit Message

Mazo, Andrey April 2, 2019, 12:13 a.m. UTC
Labels in Perforce are not global; they can be placed on a particular view/subdirectory.
This can cause difficulties when importing only part of a Perforce depot into a git repository.
For example:
 1. Depot layout is as follows:
    //depot/metaproject/branch1/subprojectA/...
    //depot/metaproject/branch1/subprojectB/...
    //depot/metaproject/branch2/subprojectA/...
    //depot/metaproject/branch2/subprojectB/...
 2. Labels are placed as follows:
    * label 1A on //depot/metaproject/branch1/subprojectA/...
    * label 1B on //depot/metaproject/branch1/subprojectB/...
    * label 2A on //depot/metaproject/branch2/subprojectA/...
    * label 2B on //depot/metaproject/branch2/subprojectB/...
 3. The goal is to import
    subprojectA into subprojectA.git and
    subprojectB into subprojectB.git
    preserving all the branches and labels.
 4. Importing subprojectA.
    Label 1A is imported fine because it's placed on a certain commit on branch1.
    However, label 1B is not imported because it's placed on a commit in another subproject:
    git-p4 says: "importing label 1B: could not find git commit for changelist ..."
    The same applies to label 2A, which is imported, and 2B, which is not.

Currently, there is no easy way (that I'm aware of) to tell git-p4 to
import an empty commit into a desired branch,
so that a label placed on that changelist could be imported as well.
It might be possible to get a similar effect by importing both subprojectA and B into a single git repo
and then running `git filter-branch --subdirectory-filter subprojectA`,
but this might produce many more irrelevant empty commits than are needed for labels
(although the imported changelists can be limited with the git-p4 --changesfile option).
Also, `git filter-branch` is harder to use for incremental imports
or when changes are submitted from git back to Perforce.

As suggested by Luke,
instead of creating an empty commit for the sole purpose of being tagged later,
teach git-p4 to search harder for the next lower changelist
corresponding to the label in question.

Do this by finding the highest changelist up to the label under all known branches,
(branches are finalized by the time importP4Labels() runs)
and using it instead of a depot-wide changelist corresponding to the label.
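
As a rough illustration only (a simplified sketch, not the patch code), the selection can be written in Python as below. The function name `pick_label_changelist`, the dictionary input, and the changelist numbers are all made up for the example; the real code asks p4 for the per-path changelists.

```python
def pick_label_changelist(latest_change_by_path, branch_paths):
    """Return the highest changelist found under the given branch paths.

    latest_change_by_path is a hypothetical stand-in for what p4 reports:
    it maps a depot path to the most recent changelist for that path at
    the label. Paths for which p4 reports nothing are simply absent.
    Returns None if no branch path has a changelist.
    """
    changes = [latest_change_by_path[p] for p in branch_paths
               if p in latest_change_by_path]
    # Different paths may have different "most recent" changelists;
    # take the newest one, since some paths were just modified later.
    return max(changes) if changes else None


# Invented data: the depot-wide changelist for the label is 120, but it
# touched only subprojectB, so it is not under any imported branch path.
latest = {
    "//depot/metaproject/branch1/subprojectA/": 117,
    "//depot/metaproject/branch2/subprojectA/": 101,
    "//depot/metaproject/branch1/subprojectB/": 120,
}
branches = [
    "//depot/metaproject/branch1/subprojectA/",
    "//depot/metaproject/branch2/subprojectA/",
]
print(pick_label_changelist(latest, branches))  # prints 117
```

With exact matching, changelist 120 in this made-up data has no commit in subprojectA.git, so the label would be skipped; the sketch falls back to 117 instead.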

This new behavior may not be desired by people
who want an exact label <-> changelist relationship.
So, add a new boolean config parameter git-p4.allowInexactLabels (defaults to false)
to explicitly enable it if needed.
Also, this behavior only appears to be useful in the case of multiple branches
(otherwise, every Perforce changelist should appear in git),
so it's not engaged when running without branch detection.

Detect and report (--verbose) "inexact" tags,
i.e. tags placed on a lower changelist than the one in Perforce.
Implement this by comparing a changelist for which a commit was found
with a changelist corresponding to the label on the whole depot.
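
The check can be sketched roughly as follows. `mapping_message` is a hypothetical helper name used only for illustration, and the label name, commit mark and changelist numbers are invented.

```python
def mapping_message(name, git_commit, depot_wide_changelist, changelist):
    """Build the verbose message, flagging tags that landed on a
    different changelist than the depot-wide one for the label."""
    if depot_wide_changelist == changelist:
        is_exact = ""
    else:
        is_exact = " inexactly"
    return "p4 label %s mapped%s to git commit %s" % (name, is_exact, git_commit)


# Exact: the commit found is for the same changelist as the label.
print(mapping_message("1A", ":117", 117, 117))
# Inexact: the label is at 120 depot-wide, but the tag goes on 117.
print(mapping_message("1B", ":117", 120, 117))
```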

Note that the new "inexact" logic is slower
than the original code when there are numerous branches,
because p4 needs to calculate the most recent change for each branch path instead of just one.

This is an alternative solution to the "alien" branches concept proposed earlier:
https://public-inbox.org/git/b02df749b9266ac8c73707617a171122156621ab.1553283214.git.amazo@checkvideo.com/

Signed-off-by: Andrey Mazo <amazo@checkvideo.com>
Suggested-by: Luke Diamand <luke@diamand.org>
---
 Documentation/git-p4.txt | 14 ++++++++++++
 git-p4.py                | 48 +++++++++++++++++++++++++++++++++++-----
 2 files changed, 56 insertions(+), 6 deletions(-)

Patch

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index 3494a1db3e..ceabab8b86 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -582,10 +582,24 @@  git-p4.importLabels::
 
 git-p4.labelImportRegexp::
 	Only p4 labels matching this regular expression will be imported. The
 	default value is '[a-zA-Z0-9_\-.]+$'.
 
+git-p4.allowInexactLabels::
+	Only has an effect if run with `--detect-branches`.
+	By default, when performing p4 label import,
+	'git p4' finds a changelist number of every label,
+	then finds a git commit corresponding to the found changelist number,
+	and then places an annotated tag on the found git commit.
+	If a git commit is not found, the label is considered unimportable
+	and is added to 'ignoredP4Labels' list.
+	If 'allowInexactLabels' is set to true,
+	'git p4' only considers changelists under branches being imported.
+	This has the effect that a tag in git might be placed on a lower changelist than in p4.
+	This might be useful when importing just a subset of the depot into git,
+	if a label would be discarded otherwise.
+
 git-p4.useClientSpec::
 	Specify that the p4 client spec should be used to identify p4
 	depot paths of interest.  This is equivalent to specifying the
 	option `--use-client-spec`.  See the "CLIENT SPEC" section above.
 	This variable is a boolean, not the name of a p4 client.
diff --git a/git-p4.py b/git-p4.py
index 96c4b78dc7..98b2b7bbca 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -3162,17 +3162,43 @@  def importP4Labels(self, stream, p4Labels):
             if name in ignoredP4Labels:
                 continue
 
             labelDetails = p4CmdList(['label', "-o", name])[0]
 
-            # get the most recent changelist for each file in this label
-            change = p4Cmd(["changes", "-m", "1"] + ["%s...@%s" % (p, name)
+            if self.detectBranches and gitConfigBool("git-p4.allowInexactLabels"):
+                doInexactLabels = True
+            else:
+                doInexactLabels = False
+
+            # get the most recent changelist in this label for the whole depot
+            depot_wide_changelist = p4Cmd(["changes", "-m", "1"] + ["%s...@%s" % (p, name)
                                 for p in self.depotPaths])
+            if 'change' in depot_wide_changelist:
+                depot_wide_changelist = int(depot_wide_changelist['change'])
+            else:
+                depot_wide_changelist = None
 
-            if 'change' in change:
+            # get the most recent changelist for each file under branches of interest in this label
+            if doInexactLabels:
+                if self.useClientSpec:
+                    paths = ["%s...@%s" % (self.clientSpecDirs.client_prefix + p + '/', name) for p in self.knownBranches]
+                else:
+                    paths = ["%s...@%s" % (self.depotPaths[0] + p + '/', name) for p in self.knownBranches]
+                changes = p4CmdList(["changes", "-m", "1"] + paths)
+                changes = [int(c['change']) for c in changes if 'change' in c]
+
+                # there may be different "most recent" changelists for different paths.
+                # take the newest since some paths were just modified later than others.
+                if changes:
+                    changelist = max(changes)
+                else:
+                    changelist = None
+            else:
+                changelist = depot_wide_changelist
+
+            if changelist:
                 # find the corresponding git commit; take the oldest commit
-                changelist = int(change['change'])
                 if changelist in self.committedChanges:
                     gitCommit = ":%d" % changelist       # use a fast-import mark
                     commitFound = True
                 else:
                     gitCommit = read_pipe(["git", "rev-list", "--max-count=1",
@@ -3192,14 +3218,24 @@  def importP4Labels(self, stream, p4Labels):
                         tmwhen = 1
 
                     when = int(time.mktime(tmwhen))
                     self.streamTag(stream, name, labelDetails, gitCommit, when)
                     if verbose:
-                        print("p4 label %s mapped to git commit %s" % (name, gitCommit))
+                        if depot_wide_changelist == changelist:
+                            isExact = ""
+                        else:
+                            isExact = " inexactly"
+                        print("p4 label %s mapped%s to git commit %s" % (name, isExact, gitCommit))
             else:
                 if verbose:
-                    print("Label %s has no changelists - possibly deleted?" % name)
+                    if depot_wide_changelist:
+                        # there is a changelist corresponding to this label,
+                        # but it's not under any branches of interest.
+                        print("Label %s has no changelists under detected branches -- ignoring" % name)
+                    else:
+                        # there is no changelist corresponding to this label in the whole depot
+                        print("Label %s has no changelists - possibly deleted?" % name)
 
             if not commitFound:
                 # We can't import this label; don't try again as it will get very
                 # expensive repeatedly fetching all the files for labels that will
                 # never be imported. If the label is moved in the future, the