[00/20] fundamentals of merge-ort implementation

Message ID 20201030034131.1479968-1-newren@gmail.com (mailing list archive)

Message

Elijah Newren Oct. 30, 2020, 3:41 a.m. UTC
This series depends on a merge of en/strmap and
en/merge-ort-api-null-impl.

The goal of this series is to show the new design and structure behind
merge-ort, particularly the bits that are completely different to how
merge-recursive operates.  There are still multiple important codepaths
that die with a "Not yet implemented" message, so the new merge
algorithm is still not very usable (however, it can handle very trivial
rebases or cherry-picks at the end of the series).

At a high level, merge-ort avoids unpack_trees() and the index, instead
using traverse_trees() and its own data structure.  After it is done
processing each path, it writes a tree.  Only after it has created a new
tree will it touch the working copy or the index.  It does so by using a
simple checkout-like step to switch from head to the newly created tree.
If there are unmerged entries, it touches up the index after the
checkout-like step to record those higher order stages.
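The per-path processing described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code from this series: `toy_oid`, `toy_path_info`, `toy_process_entry()` and `toy_process_entries()` are hypothetical names, object ids are modelled as plain strings, and the trees are assumed to be already flattened into one record per path (the job that traverse_trees() would do in the real code). Each record is resolved with the usual trivial three-way rules; anything left over is marked as conflicted, which is the case that would later need higher order stages in the index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an object id; NULL means "path absent". */
typedef const char *toy_oid;

/* One record per path, filled in while walking the three trees. */
struct toy_path_info {
	const char *path;
	toy_oid stages[3]; /* [0] = merge base, [1] = side 1, [2] = side 2 */
	toy_oid result;    /* resolved content; NULL if absent or conflicted */
	int conflicted;    /* 1 if the path would need higher order stages */
};

/* Two toy oids match if both are absent or both hold the same string. */
static int toy_oid_eq(toy_oid a, toy_oid b)
{
	if (!a || !b)
		return a == b;
	return strcmp(a, b) == 0;
}

/* Trivial three-way resolution of a single path. */
void toy_process_entry(struct toy_path_info *pi)
{
	toy_oid base = pi->stages[0];
	toy_oid side1 = pi->stages[1];
	toy_oid side2 = pi->stages[2];

	pi->conflicted = 0;
	if (toy_oid_eq(side1, side2)) {
		pi->result = side1;        /* both sides agree */
	} else if (toy_oid_eq(base, side1)) {
		pi->result = side2;        /* only side 2 changed the path */
	} else if (toy_oid_eq(base, side2)) {
		pi->result = side1;        /* only side 1 changed the path */
	} else {
		pi->result = NULL;         /* both changed it differently */
		pi->conflicted = 1;
	}
}

/* Process every path; return how many were left conflicted. */
size_t toy_process_entries(struct toy_path_info *paths, size_t nr)
{
	size_t i, unmerged = 0;

	assert(paths || nr == 0);
	for (i = 0; i < nr; i++) {
		toy_process_entry(&paths[i]);
		unmerged += (size_t)paths[i].conflicted;
	}
	return unmerged;
}
```

In this model, the non-conflicted `result` values are what a tree would be written from, and a nonzero return from `toy_process_entries()` corresponds to the case where the index needs touching up after the checkout-like step.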

In the series:
  * Patch 1 adds some basic data structures.
  * Patch 2 documents the high-level steps.
  * Patches 3-5 are some simple setup.
  * Patches 6-10 collect data from the traverse_trees() operation.
  * Patches 11-15 process the individual paths and create a tree.
  * Patches 16-19 handle checkout-and-then-write-higher-order-stages.
  * Patch 20 frees data from the merge_options_internal data structure.
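
As background for the tree-writing patches (13-15): entries in a Git tree object are stored in sorted order, and a directory entry compares as if its name ended in '/'. The following is a minimal sketch of that comparison rule, not the code from this series; `toy_tree_entry` and `toy_tree_entry_cmp()` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified tree entry: a basename plus a directory flag. */
struct toy_tree_entry {
	const char *name;
	int is_dir;
};

/*
 * Compare two entries the way tree objects are ordered: bytewise on the
 * common prefix, then on the next byte, where a directory whose name has
 * ended contributes '/' and a file whose name has ended contributes NUL.
 * Returns <0, 0 or >0, like strcmp().
 */
int toy_tree_entry_cmp(const struct toy_tree_entry *a,
		       const struct toy_tree_entry *b)
{
	size_t len_a = strlen(a->name);
	size_t len_b = strlen(b->name);
	size_t len = len_a < len_b ? len_a : len_b;
	unsigned char ca, cb;
	int cmp = memcmp(a->name, b->name, len);

	if (cmp)
		return cmp;

	/* name[len] is the terminating NUL when that name is the shorter one. */
	ca = (unsigned char)a->name[len];
	cb = (unsigned char)b->name[len];
	if (!ca && a->is_dir)
		ca = '/';
	if (!cb && b->is_dir)
		cb = '/';
	return (ca > cb) - (ca < cb);
}
```

With this rule a file "foo.c" sorts before a directory "foo" (because '.' is less than '/'), while a plain file "foo" sorts before both.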

Elijah Newren (20):
  merge-ort: setup basic internal data structures
  merge-ort: add some high-level algorithm structure
  merge-ort: port merge_start() from merge-recursive
  merge-ort: use histogram diff
  merge-ort: add an err() function similar to one from merge-recursive
  merge-ort: implement a very basic collect_merge_info()
  merge-ort: avoid repeating fill_tree_descriptor() on the same tree
  merge-ort: compute a few more useful fields for collect_merge_info
  merge-ort: record stage and auxiliary info for every path
  merge-ort: avoid recursing into identical trees
  merge-ort: add a preliminary simple process_entries() implementation
  merge-ort: have process_entries operate in a defined order
  merge-ort: step 1 of tree writing -- record basenames, modes, and oids
  merge-ort: step 2 of tree writing -- function to create tree object
  merge-ort: step 3 of tree writing -- handling subdirectories as we go
  merge-ort: basic outline for merge_switch_to_result()
  merge-ort: add implementation of checkout()
  tree: enable cmp_cache_name_compare() to be used elsewhere
  merge-ort: add implementation of record_unmerged_index_entries()
  merge-ort: free data structures in merge_finalize()

 merge-ort.c | 922 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 tree.c      |   2 +-
 tree.h      |   2 +
 3 files changed, 922 insertions(+), 4 deletions(-)

Comments

Elijah Newren Oct. 30, 2020, 3:58 a.m. UTC | #1
On Thu, Oct 29, 2020 at 8:41 PM Elijah Newren <newren@gmail.com> wrote:
>
> ...
> The goal of this series is to show the new design and structure behind
> merge-ort, particularly the bits that are completely different to how
> merge-recursive operates....
>
> At a high level, merge-ort avoids unpack_trees() and the index, instead
> using traverse_trees() and its own data structure.  After it is done
> processing each path, it writes a tree.  Only after it has created a new
> tree will it touch the working copy or the index.  It does so by using a
> simple checkout-like step to switch from head to the newly created tree.
> If there are unmerged entries, it touches up the index after the
> checkout-like step to record those higher order stages.

I didn't think anyone needed to be cc'ed on the whole series, but I
made some promises at Git Merge 2020 to give a heads up:

* Brian: Patches 13-15 create tree objects, joining other places in
the code such as fast-import, mktree, cache-tree, and notes that write
tree objects.  You mentioned something about consolidating these for
sha256 handling.
* Edward: You wanted a heads up when I started submitting the ort
merge backend.  Here it is.  It doesn't change any on-disk data
structures now or later, though, so I'm not sure libgit2 really is
affected.
* Peff: This isn't a Git Merge 2020 thing, but here's a series that
starts depending on strmap.  I'm happy to update and rebase if needed,
but opinions on strmap to prevent us from any API poisoning and such
would be great.  :-)

Elijah

> In the series:
>   * Patch 1 adds some basic data structures.
>   * Patch 2 documents the high-level steps.
>   * Patches 3-5 are some simple setup.
>   * Patches 6-10 collect data from the traverse_trees() operation.
>   * Patches 11-15 process the individual paths and create a tree.
>   * Patches 16-19 handle checkout-and-then-write-higher-order-stages.
>   * Patch 20 frees data from the merge_options_internal data structure
>
> Elijah Newren (20):
>   merge-ort: setup basic internal data structures
>   merge-ort: add some high-level algorithm structure
>   merge-ort: port merge_start() from merge-recursive
>   merge-ort: use histogram diff
>   merge-ort: add an err() function similar to one from merge-recursive
>   merge-ort: implement a very basic collect_merge_info()
>   merge-ort: avoid repeating fill_tree_descriptor() on the same tree
>   merge-ort: compute a few more useful fields for collect_merge_info
>   merge-ort: record stage and auxiliary info for every path
>   merge-ort: avoid recursing into identical trees
>   merge-ort: add a preliminary simple process_entries() implementation
>   merge-ort: have process_entries operate in a defined order
>   merge-ort: step 1 of tree writing -- record basenames, modes, and oids
>   merge-ort: step 2 of tree writing -- function to create tree object
>   merge-ort: step 3 of tree writing -- handling subdirectories as we go
>   merge-ort: basic outline for merge_switch_to_result()
>   merge-ort: add implementation of checkout()
>   tree: enable cmp_cache_name_compare() to be used elsewhere
>   merge-ort: add implementation of record_unmerged_index_entries()
>   merge-ort: free data structures in merge_finalize()
>
>  merge-ort.c | 922 +++++++++++++++++++++++++++++++++++++++++++++++++++-
>  tree.c      |   2 +-
>  tree.h      |   2 +
>  3 files changed, 922 insertions(+), 4 deletions(-)
>
> --
> 2.29.1.56.ga287c268e6.dirty