Message ID | 20210317204939.17890-9-alban.gruin@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Rewrite the remaining merge strategies from shell to C |
Hi Alban,

On Wed, 17 Mar 2021, Alban Gruin wrote: > This rewrites `git merge-one-file' from shell to C. This port is not > completely straightforward: to save precious cycles by avoiding reading > and flushing the index repeatedly or writing temporary files when an > operation can be performed in-memory, and to allow other functions to use the > rewrite without forking or worrying about the index, the calls to > external processes are replaced by calls to functions in libgit.a: > > - calls to `update-index --add --cacheinfo' are replaced by calls to > add_to_index_cacheinfo(); > > - calls to `update-index --remove' are replaced by calls to > remove_file_from_index(); > > - calls to `checkout-index -u -f' are replaced by calls to > checkout_entry(); > > - calls to `unpack-file' and `merge-file' are replaced by calls to > read_mmblob() and xdl_merge(), respectively, to merge files > in-memory; > > - calls to `checkout-index -f --stage=2' are removed, as this is needed > to have the correct permission bits on the merged file from the > script, but not in the C version; > > - calls to `update-index' are replaced by calls to add_file_to_index(). > > The bulk of the rewrite is done in a new file in libgit.a, > merge-strategies.c. This will enable the resolve and octopus strategies > to directly call it instead of forking. > > This also fixes a bug present in the original script: instead of > checking if a _regular_ file exists when a file exists in the branch to > merge, but not in our branch, the rewritten version checks if a file of > any kind (i.e. a directory, ...) exists. This fixes the test t6035.14, > where the branch to merge had a new file, `a/b', but our branch had a > directory there; it should have failed because a directory exists, but > it did not because there was no regular file called `a/b'. This test is > now marked as successful. 
> > This also teaches `merge-index' to call merge_three_way() (when invoked > with `--use=merge-one-file') without forking using a new callback, > merge_one_file_func(). > > To avoid any issue with a shrinking index because of the merge function > used (directly in the process or by forking), as described earlier, the > iterator of the loop of merge_all_index() is increased by the number of > entries with the same name, minus the difference between the number of > entries in the index before and after the merge. > > This should handle a shrinking index correctly, but could lead to issues > with a growing index. However, this case is not treated, as there is no > callback that can produce such a case.

Nice!

> diff --git a/builtin/merge-index.c b/builtin/merge-index.c > index fd5b1a5a92..04d38aa130 100644 > --- a/builtin/merge-index.c > +++ b/builtin/merge-index.c > @@ -38,7 +38,7 @@ static int merge_one_file_spawn(struct index_state *istate, > int cmd_merge_index(int argc, const char **argv, const char *prefix) > { > int i, force_file = 0, err = 0, one_shot = 0, quiet = 0; > - merge_fn merge_action = merge_one_file_spawn; > + merge_fn merge_action; > struct lock_file lock = LOCK_INIT; > struct repository *r = the_repository; > const char *use_internal = NULL; > @@ -69,10 +69,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix) > > if (skip_prefix(pgm, "--use=", &use_internal)) { > if (!strcmp(use_internal, "merge-one-file")) > - pgm = "git-merge-one-file"; > + merge_action = merge_one_file_func; > else > die(_("git merge-index: unknown internal program %s"), use_internal); > - } > + > + repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR); > + } else > + merge_action = merge_one_file_spawn;

I would have a slight preference to keep the default initializer, because that makes it easier to reason about. But if you _want_ to keep this patch as-is, I won't object.
It is a bit sad that the conversion cannot be done more incrementally, as there is a lot to unpack in the many different cases that are handled. It looks correct, though. Just one thing: > > for (; i < argc; i++) { > const char *arg = argv[i]; > diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c > new file mode 100644 > index 0000000000..ad99c6dbd4 > --- /dev/null > +++ b/builtin/merge-one-file.c > @@ -0,0 +1,94 @@ > +/* > + * Builtin "git merge-one-file" > + * > + * Copyright (c) 2020 Alban Gruin > + * > + * Based on git-merge-one-file.sh, written by Linus Torvalds. > + * > + * This is the git per-file merge utility, called with > + * > + * argv[1] - original file object name (or empty) > + * argv[2] - file in branch1 object name (or empty) > + * argv[3] - file in branch2 object name (or empty) > + * argv[4] - pathname in repository > + * argv[5] - original file mode (or empty) > + * argv[6] - file in branch1 mode (or empty) > + * argv[7] - file in branch2 mode (or empty) > + * > + * Handle some trivial cases. The _really_ trivial cases have been > + * handled already by git read-tree, but that one doesn't do any merges > + * that might change the tree layout. 
> + */ > + > +#include "cache.h" > +#include "builtin.h" > +#include "lockfile.h" > +#include "merge-strategies.h" > + > +static const char builtin_merge_one_file_usage[] = > + "git merge-one-file <orig blob> <our blob> <their blob> <path> " > + "<orig mode> <our mode> <their mode>\n\n" > + "Blob ids and modes should be empty for missing files."; > + > +static int read_mode(const char *name, const char *arg, unsigned int *mode) > +{ > + char *last; > + int ret = 0; > + > + *mode = strtol(arg, &last, 8); > + > + if (*last) > + ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last); > + else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode))) > + ret = error(_("invalid '%s' mode: %o"), name, *mode); > + > + return ret; > +} > + > +int cmd_merge_one_file(int argc, const char **argv, const char *prefix) > +{ > + struct object_id orig_blob, our_blob, their_blob, > + *p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL; > + unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0; > + struct lock_file lock = LOCK_INIT; > + struct repository *r = the_repository; > + > + if (argc != 8) > + usage(builtin_merge_one_file_usage); > + > + if (repo_read_index(r) < 0) > + die("invalid index"); > + > + repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR); > + > + if (!get_oid_hex(argv[1], &orig_blob)) { > + p_orig_blob = &orig_blob; > + ret = read_mode("orig", argv[5], &orig_mode); > + } else if (!*argv[1] && *argv[5]) > + ret = error(_("no 'orig' object id given, but a mode was still given."));

Here, it looks as if the case of an empty `argv[1]` is not handled _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then we rely on the second arm _also_ not re-assigning `orig_blob`.

I wonder whether this could be checked, and whether it would make sense to fold this, along with most of these 5 lines, into the `read_mode()` helper function (DRYing up the code even further).
As for the rest of the patch, it is totally possible that I missed a bug, but it looks correct to me, and the added regression tests give me a good feeling about the patch, too. Thanks, Dscho > + > + if (!get_oid_hex(argv[2], &our_blob)) { > + p_our_blob = &our_blob; > + ret = read_mode("our", argv[6], &our_mode); > + } else if (!*argv[2] && *argv[6]) > + ret = error(_("no 'our' object id given, but a mode was still given.")); > + > + if (!get_oid_hex(argv[3], &their_blob)) { > + p_their_blob = &their_blob; > + ret = read_mode("their", argv[7], &their_mode); > + } else if (!*argv[3] && *argv[7]) > + ret = error(_("no 'their' object id given, but a mode was still given.")); > + > + if (ret) > + return ret; > + > + ret = merge_three_way(r->index, p_orig_blob, p_our_blob, p_their_blob, > + argv[4], orig_mode, our_mode, their_mode); > + > + if (ret) { > + rollback_lock_file(&lock); > + return !!ret; > + } > + > + return write_locked_index(r->index, &lock, COMMIT_LOCK); > +} > diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh > deleted file mode 100755 > index f6d9852d2f..0000000000 > --- a/git-merge-one-file.sh > +++ /dev/null > @@ -1,167 +0,0 @@ > -#!/bin/sh > -# > -# Copyright (c) Linus Torvalds, 2005 > -# > -# This is the git per-file merge script, called with > -# > -# $1 - original file SHA1 (or empty) > -# $2 - file in branch1 SHA1 (or empty) > -# $3 - file in branch2 SHA1 (or empty) > -# $4 - pathname in repository > -# $5 - original file mode (or empty) > -# $6 - file in branch1 mode (or empty) > -# $7 - file in branch2 mode (or empty) > -# > -# Handle some trivial cases.. The _really_ trivial cases have > -# been handled already by git read-tree, but that one doesn't > -# do any merges that might change the tree layout. > - > -USAGE='<orig blob> <our blob> <their blob> <path>' > -USAGE="$USAGE <orig mode> <our mode> <their mode>" > -LONG_USAGE="usage: git merge-one-file $USAGE > - > -Blob ids and modes should be empty for missing files." 
> - > -SUBDIRECTORY_OK=Yes > -. git-sh-setup > -cd_to_toplevel > -require_work_tree > - > -if test $# != 7 > -then > - echo "$LONG_USAGE" > - exit 1 > -fi > - > -case "${1:-.}${2:-.}${3:-.}" in > -# > -# Deleted in both or deleted in one and unchanged in the other > -# > -"$1.." | "$1.$1" | "$1$1.") > - if { test -z "$6" && test "$5" != "$7"; } || > - { test -z "$7" && test "$5" != "$6"; } > - then > - echo "ERROR: File $4 deleted on one branch but had its" >&2 > - echo "ERROR: permissions changed on the other." >&2 > - exit 1 > - fi > - > - if test -n "$2" > - then > - echo "Removing $4" > - else > - # read-tree checked that index matches HEAD already, > - # so we know we do not have this path tracked. > - # there may be an unrelated working tree file here, > - # which we should just leave unmolested. Make sure > - # we do not have it in the index, though. > - exec git update-index --remove -- "$4" > - fi > - if test -f "$4" > - then > - rm -f -- "$4" && > - rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || : > - fi && > - exec git update-index --remove -- "$4" > - ;; > - > -# > -# Added in one. > -# > -".$2.") > - # the other side did not add and we added so there is nothing > - # to be done, except making the path merged. > - exec git update-index --add --cacheinfo "$6" "$2" "$4" > - ;; > -"..$3") > - echo "Adding $4" > - if test -f "$4" > - then > - echo "ERROR: untracked $4 is overwritten by the merge." >&2 > - exit 1 > - fi > - git update-index --add --cacheinfo "$7" "$3" "$4" && > - exec git checkout-index -u -f -- "$4" > - ;; > - > -# > -# Added in both, identically (check for same permissions). > -# > -".$3$2") > - if test "$6" != "$7" > - then > - echo "ERROR: File $4 added identically in both branches," >&2 > - echo "ERROR: but permissions conflict $6->$7." >&2 > - exit 1 > - fi > - echo "Adding $4" > - git update-index --add --cacheinfo "$6" "$2" "$4" && > - exec git checkout-index -u -f -- "$4" > - ;; > - > -# > -# Modified in both, but differently. 
> -# > -"$1$2$3" | ".$2$3") > - > - case ",$6,$7," in > - *,120000,*) > - echo "ERROR: $4: Not merging symbolic link changes." >&2 > - exit 1 > - ;; > - *,160000,*) > - echo "ERROR: $4: Not merging conflicting submodule changes." >&2 > - exit 1 > - ;; > - esac > - > - src1=$(git unpack-file $2) > - src2=$(git unpack-file $3) > - case "$1" in > - '') > - echo "Added $4 in both, but differently." > - orig=$(git unpack-file $(git hash-object /dev/null)) > - ;; > - *) > - echo "Auto-merging $4" > - orig=$(git unpack-file $1) > - ;; > - esac > - > - git merge-file "$src1" "$orig" "$src2" > - ret=$? > - msg= > - if test $ret != 0 || test -z "$1" > - then > - msg='content conflict' > - ret=1 > - fi > - > - # Create the working tree file, using "our tree" version from the > - # index, and then store the result of the merge. > - git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1 > - rm -f -- "$orig" "$src1" "$src2" > - > - if test "$6" != "$7" > - then > - if test -n "$msg" > - then > - msg="$msg, " > - fi > - msg="${msg}permissions conflict: $5->$6,$7" > - ret=1 > - fi > - > - if test $ret != 0 > - then > - echo "ERROR: $msg in $4" >&2 > - exit 1 > - fi > - exec git update-index -- "$4" > - ;; > - > -*) > - echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2 > - ;; > -esac > -exit 1 > diff --git a/git.c b/git.c > index 9bc077a025..95eb74efe1 100644 > --- a/git.c > +++ b/git.c > @@ -544,6 +544,7 @@ static struct cmd_struct commands[] = { > { "merge-file", cmd_merge_file, RUN_SETUP_GENTLY }, > { "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT }, > { "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT }, > + { "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT }, > { "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT }, > { "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT }, > { "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | 
NEED_WORK_TREE | NO_PARSEOPT }, > diff --git a/merge-strategies.c b/merge-strategies.c > index c80f964612..2717af51fd 100644 > --- a/merge-strategies.c > +++ b/merge-strategies.c > @@ -1,5 +1,197 @@ > #include "cache.h" > +#include "dir.h" > #include "merge-strategies.h" > +#include "xdiff-interface.h" > + > +static int add_merge_result_to_index(struct index_state *istate, unsigned int mode, > + const struct object_id *oid, const char *path, > + int checkout) > +{ > + struct cache_entry *ce; > + int res; > + > + res = add_to_index_cacheinfo(istate, mode, oid, path, 0, 1, 1, &ce); > + if (res == -1) > + return error(_("Invalid path '%s'"), path); > + else if (res == -2) > + return -1; > + > + if (checkout) { > + struct checkout state = CHECKOUT_INIT; > + > + state.istate = istate; > + state.force = 1; > + state.base_dir = ""; > + state.base_dir_len = 0; > + > + if (checkout_entry(ce, &state, NULL, NULL) < 0) > + return error(_("%s: cannot checkout file"), path); > + } > + > + return 0; > +} > + > +static int merge_one_file_deleted(struct index_state *istate, > + const struct object_id *our_blob, > + const struct object_id *their_blob, const char *path, > + unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode) > +{ > + if ((!our_blob && orig_mode != their_mode) || > + (!their_blob && orig_mode != our_mode)) > + return error(_("File %s deleted on one branch but had its " > + "permissions changed on the other."), path); > + > + if (our_blob) { > + printf(_("Removing %s\n"), path); > + > + if (file_exists(path)) > + remove_path(path); > + } > + > + if (remove_file_from_index(istate, path)) > + return error("%s: cannot remove from the index", path); > + return 0; > +} > + > +static int do_merge_one_file(struct index_state *istate, > + const struct object_id *orig_blob, > + const struct object_id *our_blob, > + const struct object_id *their_blob, const char *path, > + unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode) > +{ > + int 
ret, i, dest; > + ssize_t written; > + mmbuffer_t result = {NULL, 0}; > + mmfile_t mmfs[3]; > + xmparam_t xmp = {{0}}; > + > + if (our_mode == S_IFLNK || their_mode == S_IFLNK) > + return error(_("%s: Not merging symbolic link changes."), path); > + else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK) > + return error(_("%s: Not merging conflicting submodule changes."), path); > + > + if (orig_blob) { > + printf(_("Auto-merging %s\n"), path); > + read_mmblob(mmfs + 0, orig_blob); > + } else { > + printf(_("Added %s in both, but differently.\n"), path); > + read_mmblob(mmfs + 0, &null_oid); > + } > + > + read_mmblob(mmfs + 1, our_blob); > + read_mmblob(mmfs + 2, their_blob); > + > + xmp.level = XDL_MERGE_ZEALOUS_ALNUM; > + xmp.style = 0; > + xmp.favor = 0; > + > + ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result); > + > + for (i = 0; i < 3; i++) > + free(mmfs[i].ptr); > + > + if (ret < 0) { > + free(result.ptr); > + return error(_("Failed to execute internal merge")); > + } > + > + if (ret > 0 || !orig_blob) > + ret = error(_("content conflict in %s"), path); > + if (our_mode != their_mode) > + ret = error(_("permission conflict: %o->%o,%o in %s"), > + orig_mode, our_mode, their_mode, path); > + > + unlink(path); > + if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) { > + free(result.ptr); > + return error_errno(_("failed to open file '%s'"), path); > + } > + > + written = write_in_full(dest, result.ptr, result.size); > + close(dest); > + > + free(result.ptr); > + > + if (written < 0) > + return error_errno(_("failed to write to '%s'"), path); > + if (ret) > + return ret; > + > + return add_file_to_index(istate, path, 0); > +} > + > +int merge_three_way(struct index_state *istate, > + const struct object_id *orig_blob, > + const struct object_id *our_blob, > + const struct object_id *their_blob, const char *path, > + unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode) > +{ > + if (orig_blob && > + ((!our_blob && 
!their_blob) || > + (!their_blob && our_blob && oideq(orig_blob, our_blob)) || > + (!our_blob && their_blob && oideq(orig_blob, their_blob)))) { > + /* Deleted in both or deleted in one and unchanged in the other. */ > + return merge_one_file_deleted(istate, our_blob, their_blob, path, > + orig_mode, our_mode, their_mode); > + } else if (!orig_blob && our_blob && !their_blob) { > + /* > + * Added in ours. The other side did not add and we > + * added so there is nothing to be done, except making > + * the path merged. > + */ > + return add_merge_result_to_index(istate, our_mode, our_blob, path, 0); > + } else if (!orig_blob && !our_blob && their_blob) { > + printf(_("Adding %s\n"), path); > + > + if (file_exists(path)) > + return error(_("untracked %s is overwritten by the merge."), path); > + > + return add_merge_result_to_index(istate, their_mode, their_blob, path, 1); > + } else if (!orig_blob && our_blob && their_blob && > + oideq(our_blob, their_blob)) { > + /* Added in both, identically (check for same permissions). */ > + if (our_mode != their_mode) > + return error(_("File %s added identically in both branches, " > + "but permissions conflict %o->%o."), > + path, our_mode, their_mode); > + > + printf(_("Adding %s\n"), path); > + > + return add_merge_result_to_index(istate, our_mode, our_blob, path, 1); > + } else if (our_blob && their_blob) { > + /* Modified in both, but differently. 
*/ > + return do_merge_one_file(istate, > + orig_blob, our_blob, their_blob, path, > + orig_mode, our_mode, their_mode); > + } else { > + char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0}, > + their_hex[GIT_MAX_HEXSZ] = {0}; > + > + if (orig_blob) > + oid_to_hex_r(orig_hex, orig_blob); > + if (our_blob) > + oid_to_hex_r(our_hex, our_blob); > + if (their_blob) > + oid_to_hex_r(their_hex, their_blob); > + > + return error(_("%s: Not handling case %s -> %s -> %s"), > + path, orig_hex, our_hex, their_hex); > + } > + > + return 0; > +} > + > +int merge_one_file_func(struct index_state *istate, > + const struct object_id *orig_blob, > + const struct object_id *our_blob, > + const struct object_id *their_blob, const char *path, > + unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode, > + void *data) > +{ > + return merge_three_way(istate, > + orig_blob, our_blob, their_blob, path, > + orig_mode, our_mode, their_mode); > +} > > static int merge_entry(struct index_state *istate, int quiet, unsigned int pos, > const char *path, int *err, merge_fn fn, void *data) > @@ -54,17 +246,24 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet, > merge_fn fn, void *data) > { > int err = 0, ret; > - unsigned int i; > + unsigned int i, prev_nr; > > for (i = 0; i < istate->cache_nr; i++) { > const struct cache_entry *ce = istate->cache[i]; > if (!ce_stage(ce)) > continue; > > + prev_nr = istate->cache_nr; > ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data); > - if (ret > 0) > - i += ret - 1; > - else if (ret == -1) > + if (ret > 0) { > + /* > + * Don't bother handling an index that has > + * grown, since merge_one_file_func() can't grow > + * it, and merge_one_file_spawn() can't change > + * it. 
> + */ > + i += ret - (prev_nr - istate->cache_nr) - 1; > + } else if (ret == -1) > return -1; > > if (err && !oneshot) > diff --git a/merge-strategies.h b/merge-strategies.h > index 88f476f170..8705a550ca 100644 > --- a/merge-strategies.h > +++ b/merge-strategies.h > @@ -3,6 +3,12 @@ > > #include "object.h" > > +int merge_three_way(struct index_state *istate, > + const struct object_id *orig_blob, > + const struct object_id *our_blob, > + const struct object_id *their_blob, const char *path, > + unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode); > + > typedef int (*merge_fn)(struct index_state *istate, > const struct object_id *orig_blob, > const struct object_id *our_blob, > @@ -10,6 +16,13 @@ typedef int (*merge_fn)(struct index_state *istate, > unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode, > void *data); > > +int merge_one_file_func(struct index_state *istate, > + const struct object_id *orig_blob, > + const struct object_id *our_blob, > + const struct object_id *their_blob, const char *path, > + unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode, > + void *data); > + > int merge_index_path(struct index_state *istate, int oneshot, int quiet, > const char *path, merge_fn fn, void *data); > int merge_all_index(struct index_state *istate, int oneshot, int quiet, > diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh > index d0cdfeddc1..d9c07965dc 100755 > --- a/t/t6060-merge-index.sh > +++ b/t/t6060-merge-index.sh > @@ -72,7 +72,7 @@ test_expect_success 'merge-one-file fails without a work tree' ' > (cd bare.git && > GIT_INDEX_FILE=$PWD/merge.index && > export GIT_INDEX_FILE && > - test_must_fail git merge-index git-merge-one-file -a > + test_must_fail git merge-index --use=merge-one-file -a > ) > ' > > diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh > index 2ce104aca7..075da1f55f 100755 > --- a/t/t6415-merge-dir-to-symlink.sh > +++ 
b/t/t6415-merge-dir-to-symlink.sh > @@ -97,7 +97,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' ' > test -h a/b > ' > > -test_expect_failure 'do not lose untracked in merge (resolve)' ' > +test_expect_success 'do not lose untracked in merge (resolve)' ' > git reset --hard && > git checkout baseline^0 && > >a/b/c/e && > -- > 2.31.0 > >
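The iterator adjustment described in the commit message can be illustrated with a toy model. This is a minimal sketch, not git's code: `struct toy_index`, `toy_resolve()` and `toy_merge_all()` are hypothetical stand-ins for `struct index_state`, the merge callback and `merge_all_index()`, and the `removed_path` parameter exists only to exercise the "path removed" case. Only the `i += ret - (prev_nr - nr) - 1` line mirrors the patch. The sketch also uses a signed `int` counter for simplicity, whereas the patch uses `unsigned int`.

```c
#include <string.h>

#define TOY_MAX_ENTRIES 16

/* Hypothetical, simplified index entry: a path and its merge stage. */
struct toy_entry {
    const char *name;
    int stage; /* 0 = merged, 1..3 = unmerged stages */
};

struct toy_index {
    struct toy_entry e[TOY_MAX_ENTRIES];
    int nr;
};

/* Count the consecutive entries starting at pos that share one name. */
static int count_same_name(const struct toy_index *idx, int pos)
{
    int n = 0;

    while (pos + n < idx->nr &&
           !strcmp(idx->e[pos + n].name, idx->e[pos].name))
        n++;
    return n;
}

/*
 * Stand-in for an in-process merge: collapse the `count` staged entries
 * at `pos` into `keep` entries (1 = resolved to a single stage-0 entry,
 * 0 = path removed from the index). The index shrinks as a result.
 */
static void toy_resolve(struct toy_index *idx, int pos, int count, int keep)
{
    if (keep)
        idx->e[pos].stage = 0;
    memmove(&idx->e[pos + keep], &idx->e[pos + count],
            (size_t)(idx->nr - pos - count) * sizeof(idx->e[0]));
    idx->nr -= count - keep;
}

/* Visit every unmerged path once; returns the number of paths visited. */
static int toy_merge_all(struct toy_index *idx, const char *removed_path)
{
    int i, visited = 0;

    for (i = 0; i < idx->nr; i++) {
        int prev_nr, ret, keep;

        if (!idx->e[i].stage)
            continue;

        prev_nr = idx->nr;
        ret = count_same_name(idx, i);
        keep = strcmp(idx->e[i].name, removed_path) != 0;
        toy_resolve(idx, i, ret, keep);
        visited++;

        /*
         * Same arithmetic as the patch: skip the entries that shared
         * the name, minus however many the merge removed, minus one
         * for the loop's own i++.
         */
        i += ret - (prev_nr - idx->nr) - 1;
    }
    return visited;
}
```

With entries `a` (stages 1 to 3), `b` (stages 1 and 3), `c` (stage 0) and `d` (stages 2 and 3), and `b` as the removed path, the loop visits each of the three unmerged paths exactly once and leaves `a`, `c` and `d` at stage 0. If the callback leaves the index untouched, as with a spawned process, the adjustment reduces to the old `i += ret - 1`.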
Hi Johannes,

On 22/03/2021 at 23:20, Johannes Schindelin wrote: > Hi Alban, > > On Wed, 17 Mar 2021, Alban Gruin wrote: > >> @@ -69,10 +69,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix) >> >> if (skip_prefix(pgm, "--use=", &use_internal)) { >> if (!strcmp(use_internal, "merge-one-file")) >> - pgm = "git-merge-one-file"; >> + merge_action = merge_one_file_func; >> else >> die(_("git merge-index: unknown internal program %s"), use_internal); >> - } >> + >> + repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR); >> + } else >> + merge_action = merge_one_file_spawn; > > I would have a slight preference to keep the default initializer, because > that makes it easier to reason about. But if you _want_ to keep this patch > as-is, I won't object. >

Yeah, not sure why I did this. I'll change this.

> It is a bit sad that the conversion cannot be done more incrementally, as > there is a lot to unpack in the many different cases that are handled. It > looks correct, though. > > Just one thing: > >> >> for (; i < argc; i++) { >> const char *arg = argv[i]; >> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c >> new file mode 100644 >> index 0000000000..ad99c6dbd4 >> --- /dev/null >> +++ b/builtin/merge-one-file.c >> @@ -0,0 +1,94 @@ >> +/* >> + * Builtin "git merge-one-file" >> + * >> + * Copyright (c) 2020 Alban Gruin >> + * >> + * Based on git-merge-one-file.sh, written by Linus Torvalds. >> + * >> + * This is the git per-file merge utility, called with >> + * >> + * argv[1] - original file object name (or empty) >> + * argv[2] - file in branch1 object name (or empty) >> + * argv[3] - file in branch2 object name (or empty) >> + * argv[4] - pathname in repository >> + * argv[5] - original file mode (or empty) >> + * argv[6] - file in branch1 mode (or empty) >> + * argv[7] - file in branch2 mode (or empty) >> + * >> + * Handle some trivial cases. 
The _really_ trivial cases have been >> + * handled already by git read-tree, but that one doesn't do any merges >> + * that might change the tree layout. >> + */ >> + >> +#include "cache.h" >> +#include "builtin.h" >> +#include "lockfile.h" >> +#include "merge-strategies.h" >> + >> +static const char builtin_merge_one_file_usage[] = >> + "git merge-one-file <orig blob> <our blob> <their blob> <path> " >> + "<orig mode> <our mode> <their mode>\n\n" >> + "Blob ids and modes should be empty for missing files."; >> + >> +static int read_mode(const char *name, const char *arg, unsigned int *mode) >> +{ >> + char *last; >> + int ret = 0; >> + >> + *mode = strtol(arg, &last, 8); >> + >> + if (*last) >> + ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last); >> + else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode))) >> + ret = error(_("invalid '%s' mode: %o"), name, *mode); >> + >> + return ret; >> +} >> + >> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix) >> +{ >> + struct object_id orig_blob, our_blob, their_blob, >> + *p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL; >> + unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0; >> + struct lock_file lock = LOCK_INIT; >> + struct repository *r = the_repository; >> + >> + if (argc != 8) >> + usage(builtin_merge_one_file_usage); >> + >> + if (repo_read_index(r) < 0) >> + die("invalid index"); >> + >> + repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR); >> + >> + if (!get_oid_hex(argv[1], &orig_blob)) { >> + p_orig_blob = &orig_blob; >> + ret = read_mode("orig", argv[5], &orig_mode); >> + } else if (!*argv[1] && *argv[5]) >> + ret = error(_("no 'orig' object id given, but a mode was still given.")); > > Here, it looks as if the case of an empty `argv[1]` is not handled > _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then > we rely on the second arm _also_ not re-assigning `orig_blob`. 
> > I wonder whether this could be checked, and whether it would make sense to > fold this, along with most of these 5 lines, into the `read_mode()` helper > function (DRYing up the code even further). >

Do you mean rewriting the first condition to read like this:

	if (*argv[1] && !get_oid_hex(argv[1], &orig_blob)) {

?

In which case yes, I can do that.

BTW the last two calls to read_mode() should be like

	err |= read_mode(…);

Cheers,
Alban

> As for the rest of the patch, it is totally possible that I missed a bug, > but it looks correct to me, and the added regression tests give me a good > feeling about the patch, too. > > Thanks, > Dscho >
Hi Alban,

On Tue, 23 Mar 2021, Alban Gruin wrote: > On 22/03/2021 at 23:20, Johannes Schindelin wrote: > > > > On Wed, 17 Mar 2021, Alban Gruin wrote: > > > >> > >> for (; i < argc; i++) { > >> const char *arg = argv[i]; > >> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c > >> new file mode 100644 > >> index 0000000000..ad99c6dbd4 > >> --- /dev/null > >> +++ b/builtin/merge-one-file.c > >> @@ -0,0 +1,94 @@ > >> +/* > >> + * Builtin "git merge-one-file" > >> + * > >> + * Copyright (c) 2020 Alban Gruin > >> + * > >> + * Based on git-merge-one-file.sh, written by Linus Torvalds. > >> + * > >> + * This is the git per-file merge utility, called with > >> + * > >> + * argv[1] - original file object name (or empty) > >> + * argv[2] - file in branch1 object name (or empty) > >> + * argv[3] - file in branch2 object name (or empty) > >> + * argv[4] - pathname in repository > >> + * argv[5] - original file mode (or empty) > >> + * argv[6] - file in branch1 mode (or empty) > >> + * argv[7] - file in branch2 mode (or empty) > >> + * > >> + * Handle some trivial cases. The _really_ trivial cases have been > >> + * handled already by git read-tree, but that one doesn't do any merges > >> + * that might change the tree layout. 
> >> + */ > >> + > >> +#include "cache.h" > >> +#include "builtin.h" > >> +#include "lockfile.h" > >> +#include "merge-strategies.h" > >> + > >> +static const char builtin_merge_one_file_usage[] = > >> + "git merge-one-file <orig blob> <our blob> <their blob> <path> " > >> + "<orig mode> <our mode> <their mode>\n\n" > >> + "Blob ids and modes should be empty for missing files."; > >> + > >> +static int read_mode(const char *name, const char *arg, unsigned int *mode) > >> +{ > >> + char *last; > >> + int ret = 0; > >> + > >> + *mode = strtol(arg, &last, 8); > >> + > >> + if (*last) > >> + ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last); > >> + else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode))) > >> + ret = error(_("invalid '%s' mode: %o"), name, *mode); > >> + > >> + return ret; > >> +} > >> + > >> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix) > >> +{ > >> + struct object_id orig_blob, our_blob, their_blob, > >> + *p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL; > >> + unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0; > >> + struct lock_file lock = LOCK_INIT; > >> + struct repository *r = the_repository; > >> + > >> + if (argc != 8) > >> + usage(builtin_merge_one_file_usage); > >> + > >> + if (repo_read_index(r) < 0) > >> + die("invalid index"); > >> + > >> + repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR); > >> + > >> + if (!get_oid_hex(argv[1], &orig_blob)) { > >> + p_orig_blob = &orig_blob; > >> + ret = read_mode("orig", argv[5], &orig_mode); > >> + } else if (!*argv[1] && *argv[5]) > >> + ret = error(_("no 'orig' object id given, but a mode was still given.")); > > > > Here, it looks as if the case of an empty `argv[1]` is not handled > > _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then > > we rely on the second arm _also_ not re-assigning `orig_blob`. 
> > > > I wonder whether this could be checked, and whether it would make sense to > > fold this, along with most of these 5 lines, into the `read_mode()` helper > > function (DRYing up the code even further). > > > > Do you mean rewriting the first condition to read like this: > > if (*argv[1] && !get_oid_hex(argv[1], &orig_blob)) { > > ? > > In which case yes, I can do that.

Yes, that's what I meant. Or this instead:

	if (!*argv[1]) {
		if (*argv[5])
			ret = error(... mode was still given ...)
	} else if (!get_oid_hex(...)) {
		...
	}

> BTW the last two calls to read_mode() should be like > > err |= read_mode(…);

While this is certainly shorter than

	if (read_mode(...))
		ret = -1;

I actually prefer the latter, for clarity (we do want `read_mode()` to be called, i.e. we cannot use `||=` here, but it is also not a bit-wise "or" operation, therefore `|=` strikes me as misleading). What do you think?

Ciao,
Dscho
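The trade-off between the three ways of accumulating errors can be made concrete with a small, self-contained sketch. It is not taken from the patch: `check()` is a hypothetical stand-in for `read_mode()` that counts its invocations, and the three `validate_*()` helpers exist only for comparison.

```c
/* Number of times check() has run; lets a caller observe skipped calls. */
static int calls;

/* Hypothetical stand-in for read_mode(): fails (-1) on an empty argument. */
static int check(const char *arg)
{
    calls++;
    return *arg ? 0 : -1;
}

/*
 * Short-circuit "or": evaluation stops at the first failure, so later
 * arguments are never checked and their errors never reported.
 */
static int validate_short_circuit(const char *a, const char *b, const char *c)
{
    return (check(a) || check(b) || check(c)) ? -1 : 0;
}

/*
 * Bit-wise "or": every check runs and the result is still -1 or 0 here,
 * but the operator suggests bit manipulation rather than error handling.
 */
static int validate_bitwise(const char *a, const char *b, const char *c)
{
    int ret = 0;

    ret |= check(a);
    ret |= check(b);
    ret |= check(c);
    return ret;
}

/* Explicit form preferred in the thread: every check runs, intent is plain. */
static int validate_explicit(const char *a, const char *b, const char *c)
{
    int ret = 0;

    if (check(a))
        ret = -1;
    if (check(b))
        ret = -1;
    if (check(c))
        ret = -1;
    return ret;
}
```

Given two empty arguments followed by a valid one, all three helpers return -1, but the short-circuit version calls `check()` once while the other two call it three times. That difference matters when each call is expected to print its own error message.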
Hi Johannes,

On 24/03/2021 at 10:10, Johannes Schindelin wrote: > Hi Alban, > > On Tue, 23 Mar 2021, Alban Gruin wrote: > >> On 22/03/2021 at 23:20, Johannes Schindelin wrote: >>> >>> On Wed, 17 Mar 2021, Alban Gruin wrote: >>> >>>> >>>> for (; i < argc; i++) { >>>> const char *arg = argv[i]; >>>> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c >>>> new file mode 100644 >>>> index 0000000000..ad99c6dbd4 >>>> --- /dev/null >>>> +++ b/builtin/merge-one-file.c >>>> @@ -0,0 +1,94 @@ >>>> +/* >>>> + * Builtin "git merge-one-file" >>>> + * >>>> + * Copyright (c) 2020 Alban Gruin >>>> + * >>>> + * Based on git-merge-one-file.sh, written by Linus Torvalds. >>>> + * >>>> + * This is the git per-file merge utility, called with >>>> + * >>>> + * argv[1] - original file object name (or empty) >>>> + * argv[2] - file in branch1 object name (or empty) >>>> + * argv[3] - file in branch2 object name (or empty) >>>> + * argv[4] - pathname in repository >>>> + * argv[5] - original file mode (or empty) >>>> + * argv[6] - file in branch1 mode (or empty) >>>> + * argv[7] - file in branch2 mode (or empty) >>>> + * >>>> + * Handle some trivial cases. The _really_ trivial cases have been >>>> + * handled already by git read-tree, but that one doesn't do any merges >>>> + * that might change the tree layout. 
>>>> + */ >>>> + >>>> +#include "cache.h" >>>> +#include "builtin.h" >>>> +#include "lockfile.h" >>>> +#include "merge-strategies.h" >>>> + >>>> +static const char builtin_merge_one_file_usage[] = >>>> + "git merge-one-file <orig blob> <our blob> <their blob> <path> " >>>> + "<orig mode> <our mode> <their mode>\n\n" >>>> + "Blob ids and modes should be empty for missing files."; >>>> + >>>> +static int read_mode(const char *name, const char *arg, unsigned int *mode) >>>> +{ >>>> + char *last; >>>> + int ret = 0; >>>> + >>>> + *mode = strtol(arg, &last, 8); >>>> + >>>> + if (*last) >>>> + ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last); >>>> + else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode))) >>>> + ret = error(_("invalid '%s' mode: %o"), name, *mode); >>>> + >>>> + return ret; >>>> +} >>>> + >>>> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix) >>>> +{ >>>> + struct object_id orig_blob, our_blob, their_blob, >>>> + *p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL; >>>> + unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0; >>>> + struct lock_file lock = LOCK_INIT; >>>> + struct repository *r = the_repository; >>>> + >>>> + if (argc != 8) >>>> + usage(builtin_merge_one_file_usage); >>>> + >>>> + if (repo_read_index(r) < 0) >>>> + die("invalid index"); >>>> + >>>> + repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR); >>>> + >>>> + if (!get_oid_hex(argv[1], &orig_blob)) { >>>> + p_orig_blob = &orig_blob; >>>> + ret = read_mode("orig", argv[5], &orig_mode); >>>> + } else if (!*argv[1] && *argv[5]) >>>> + ret = error(_("no 'orig' object id given, but a mode was still given.")); >>> >>> Here, it looks as if the case of an empty `argv[1]` is not handled >>> _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then >>> we rely on the second arm _also_ not re-assigning `orig_blob`. 
>>> >>> I wonder whether this could be checked, and whether it would make sense to >>> fold this, along with most of these 5 lines, into the `read_mode()` helper >>> function (DRYing up the code even further). >>> >> >> Do you mean rewriting the first condition to read like this: >> >> if (*argv[1] && !get_oid_hex(argv[1], &orig_blob)) { >> >> ? >> >> In which case yes, I can do that. > > Yes, that's what I meant. Or this instead: > > if (!*argv[1]) { > if (*argv[5]) > ret = error(... mode was still given ...) > } else if (!get_oid_hex(...)) { > ... > } > >> BTW the two lasts calls to read_mode() should be like >> >> err |= read_mode(…); > > While this is certainly shorter than > > if (read_mode(...)) > ret = -1; > So, I folded all of this into a single function that reads the mode, convert the oid, and show an error if needed. Now, I have: if (read_param("orig", argv[1], argv[5], &orig_blob, &p_orig_blob, &orig_mode)) ret = -1; if (read_param("our", …)) ret = -1; if (read_param("their", …)) ret = -1; if (ret) return ret; > I actually prefer the latter, for clarity (we do want `read_mode()` to be > called, i.e. we cannot use `||=` here, but it is also not a bit-wise "or" > operation, therefore `|=` strikes me as misleading). What do you think? > Yes, I think it's much clearer that way. FIY, `||=' does not exist in C. Cheers, Alban > Ciao, > Dscho >
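The body of `read_param()` is not shown in this thread, so the following is only a guess at its shape, written as a self-contained sketch. Everything in it is an assumption: `struct oid_sketch`, `parse_hex_oid()` (a stand-in for `get_oid_hex()` that accepts exactly 40 hex digits), and the choice to reject a non-empty but malformed object id. The `S_ISREG()`/`S_ISDIR()`/`S_ISLNK()` check from `read_mode()` is left out to keep the sketch short.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40 /* SHA-1 hex length; an assumption for this sketch */

struct oid_sketch { char hex[HEX_LEN + 1]; };

/* Stand-in for get_oid_hex(): 0 if `s` is exactly HEX_LEN hex digits, else -1. */
static int parse_hex_oid(const char *s, struct oid_sketch *oid)
{
	size_t i;

	if (strlen(s) != HEX_LEN)
		return -1;
	for (i = 0; i < HEX_LEN; i++)
		if (!isxdigit((unsigned char)s[i]))
			return -1;
	memcpy(oid->hex, s, HEX_LEN + 1);
	return 0;
}

/* Hypothetical read_param(): validates one (object id, mode) argument pair. */
static int read_param_sketch(const char *name, const char *arg_blob,
			     const char *arg_mode, struct oid_sketch *blob,
			     struct oid_sketch **p_blob, unsigned int *mode)
{
	char *end;

	assert(name && arg_blob && arg_mode);
	*p_blob = NULL;
	*mode = 0;

	if (!*arg_blob) {
		/* Missing file: the mode has to be empty as well. */
		if (*arg_mode) {
			fprintf(stderr, "error: no '%s' object id given, "
				"but a mode was still given.\n", name);
			return -1;
		}
		return 0;
	}

	if (parse_hex_oid(arg_blob, blob)) {
		fprintf(stderr, "error: invalid '%s' object id\n", name);
		return -1;
	}

	*mode = (unsigned int)strtoul(arg_mode, &end, 8);
	if (!*arg_mode || *end) {
		fprintf(stderr, "error: invalid '%s' mode: '%s'\n", name, arg_mode);
		return -1;
	}

	*p_blob = blob; /* only set once both halves are valid */
	return 0;
}
```

Handling the empty object id first makes the "missing file" case explicit, which is what the review asked for, instead of relying on the hex parser to fail.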
diff --git a/Makefile b/Makefile
index 1b1dc49e86..e2e4389f76 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -1100,6 +1099,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index b6ce981b73..227c133036 100644
--- a/builtin.h
+++ b/builtin.h
@@ -179,6 +179,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index fd5b1a5a92..04d38aa130 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -38,7 +38,7 @@ static int merge_one_file_spawn(struct index_state *istate,
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
-	merge_fn merge_action = merge_one_file_spawn;
+	merge_fn merge_action;
 	struct lock_file lock = LOCK_INIT;
 	struct repository *r = the_repository;
 	const char *use_internal = NULL;
@@ -69,10 +69,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 
 	if (skip_prefix(pgm, "--use=", &use_internal)) {
 		if (!strcmp(use_internal, "merge-one-file"))
-			pgm = "git-merge-one-file";
+			merge_action = merge_one_file_func;
 		else
 			die(_("git merge-index: unknown internal program %s"), use_internal);
-	}
+
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+	} else
+		merge_action = merge_one_file_spawn;
 
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..ad99c6dbd4
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,94 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file object name (or empty)
+ *   argv[2] - file in branch1 object name (or empty)
+ *   argv[3] - file in branch2 object name (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_mode(const char *name, const char *arg, unsigned int *mode)
+{
+	char *last;
+	int ret = 0;
+
+	*mode = strtol(arg, &last, 8);
+
+	if (*last)
+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
+
+	return ret;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct repository *r = the_repository;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid_hex(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		ret = read_mode("orig", argv[5], &orig_mode);
+	} else if (!*argv[1] && *argv[5])
+		ret = error(_("no 'orig' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		ret = read_mode("our", argv[6], &our_mode);
+	} else if (!*argv[2] && *argv[6])
+		ret = error(_("no 'our' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		ret = read_mode("their", argv[7], &their_mode);
+	} else if (!*argv[3] && *argv[7])
+		ret = error(_("no 'their' object id given, but a mode was still given."));
+
+	if (ret)
+		return ret;
+
+	ret = merge_three_way(r->index, p_orig_blob, p_our_blob, p_their_blob,
+			      argv[4], orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(r->index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested. Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index 9bc077a025..95eb74efe1 100644
--- a/git.c
+++ b/git.c
@@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index c80f964612..2717af51fd 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,197 @@
 #include "cache.h"
+#include "dir.h"
 #include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
+				     const struct object_id *oid, const char *path,
+				     int checkout)
+{
+	struct cache_entry *ce;
+	int res;
+
+	res = add_to_index_cacheinfo(istate, mode, oid, path, 0, 1, 1, &ce);
+	if (res == -1)
+		return error(_("Invalid path '%s'"), path);
+	else if (res == -2)
+		return -1;
+
+	if (checkout) {
+		struct checkout state = CHECKOUT_INIT;
+
+		state.istate = istate;
+		state.force = 1;
+		state.base_dir = "";
+		state.base_dir_len = 0;
+
+		if (checkout_entry(ce, &state, NULL, NULL) < 0)
+			return error(_("%s: cannot checkout file"), path);
+	}
+
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((!our_blob && orig_mode != their_mode) ||
+	    (!their_blob && orig_mode != our_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	}
+
+	if (ret > 0 || !orig_blob)
+		ret = error(_("content conflict in %s"), path);
+	if (our_mode != their_mode)
+		ret = error(_("permission conflict: %o->%o,%o in %s"),
+			    orig_mode, our_mode, their_mode, path);
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+	if (ret)
+		return ret;
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_three_way(struct index_state *istate,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!our_blob && !their_blob) ||
+	     (!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(istate, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	} else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in ours. The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_merge_result_to_index(istate, our_mode, our_blob, path, 0);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		return add_merge_result_to_index(istate, their_mode, their_blob, path, 1);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		return add_merge_result_to_index(istate, our_mode, our_blob, path, 1);
+	} else if (our_blob && their_blob) {
+		/* Modified in both, but differently. */
+		return do_merge_one_file(istate,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	} else {
+		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
+			their_hex[GIT_MAX_HEXSZ] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
+
+int merge_one_file_func(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data)
+{
+	return merge_three_way(istate,
+			       orig_blob, our_blob, their_blob, path,
+			       orig_mode, our_mode, their_mode);
+}
 
 static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
 		       const char *path, int *err, merge_fn fn, void *data)
@@ -54,17 +246,24 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data)
 {
 	int err = 0, ret;
-	unsigned int i;
+	unsigned int i, prev_nr;
 
 	for (i = 0; i < istate->cache_nr; i++) {
 		const struct cache_entry *ce = istate->cache[i];
 		if (!ce_stage(ce))
 			continue;
 
+		prev_nr = istate->cache_nr;
 		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
-		if (ret > 0)
-			i += ret - 1;
-		else if (ret == -1)
+		if (ret > 0) {
+			/*
+			 * Don't bother handling an index that has
+			 * grown, since merge_one_file_func() can't grow
+			 * it, and merge_one_file_spawn() can't change
+			 * it.
+			 */
+			i += ret - (prev_nr - istate->cache_nr) - 1;
+		} else if (ret == -1)
 			return -1;
 
 		if (err && !oneshot)
diff --git a/merge-strategies.h b/merge-strategies.h
index 88f476f170..8705a550ca 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -3,6 +3,12 @@
 
 #include "object.h"
 
+int merge_three_way(struct index_state *istate,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
+
 typedef int (*merge_fn)(struct index_state *istate,
 			const struct object_id *orig_blob,
 			const struct object_id *our_blob,
@@ -10,6 +16,13 @@ typedef int (*merge_fn)(struct index_state *istate,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_func(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
 int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 		     const char *path, merge_fn fn, void *data);
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index d0cdfeddc1..d9c07965dc 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -72,7 +72,7 @@ test_expect_success 'merge-one-file fails without a work tree' '
 	(cd bare.git &&
 	 GIT_INDEX_FILE=$PWD/merge.index &&
 	 export GIT_INDEX_FILE &&
-	 test_must_fail git merge-index git-merge-one-file -a
+	 test_must_fail git merge-index --use=merge-one-file -a
 	)
 '
 
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2ce104aca7..075da1f55f 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -97,7 +97,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: to save precious cycles by avoiding reading
and flushing the index repeatedly, to avoid writing temporary files when
an operation can be performed in memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_to_index_cacheinfo();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_index();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in memory;

 - calls to `checkout-index -f --stage=2' are removed, as this is needed
   to have the correct permission bits on the merged file from the
   script, but not in the C version;

 - calls to `update-index' are replaced by calls to add_file_to_index().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.

This also fixes a bug present in the original script: instead of
checking if a _regular_ file exists when a file exists in the branch to
merge, but not in our branch, the rewritten version checks if a file of
any kind (i.e. a directory, ...) exists.  This fixes test t6035.14,
where the branch to merge had a new file, `a/b', but our branch had a
directory there; it should have failed because a directory exists, but
it did not because there was no regular file called `a/b'.  This test is
now marked as successful.

This also teaches `merge-index' to call merge_three_way() (when invoked
with `--use=merge-one-file') without forking, using a new callback,
merge_one_file_func().

To avoid any issue with a shrinking index because of the merge function
used (directly in the process or by forking), as described earlier, the
iterator of the loop of merge_all_index() is increased by the number of
entries with the same name, minus the difference between the number of
entries in the index before and after the merge.

This should handle a shrinking index correctly, but could lead to issues
with a growing index.  However, this case is not handled, as there is no
callback that can produce such a case.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   2 +-
 builtin.h                       |   1 +
 builtin/merge-index.c           |   9 +-
 builtin/merge-one-file.c        |  94 +++++++++++++++
 git-merge-one-file.sh           | 167 --------------------------
 git.c                           |   1 +
 merge-strategies.c              | 207 +++++++++++++++++++++++++++++++-
 merge-strategies.h              |  13 ++
 t/t6060-merge-index.sh          |   2 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 10 files changed, 321 insertions(+), 177 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh
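The iterator arithmetic described in the commit message can be checked in isolation. The sketch below is a standalone illustration, not part of the patch: `adjust_position()` is a hypothetical helper that mirrors the expression `i += ret - (prev_nr - istate->cache_nr) - 1` from merge_all_index(), using plain integers instead of a real index.

```c
#include <assert.h>

/*
 * i        position of the first staged entry for the path just merged
 * merged   number of staged entries sharing that path (what merge_entry()
 *          returns in the patch)
 * prev_nr  number of index entries before the merge callback ran
 * cur_nr   number of index entries after it ran
 *
 * Returns the value to store in i before the loop's own i++ runs.
 * Unsigned wrap-around is intentional: a result of UINT_MAX becomes 0
 * after the increment.
 */
static unsigned int adjust_position(unsigned int i, unsigned int merged,
				    unsigned int prev_nr, unsigned int cur_nr)
{
	assert(cur_nr <= prev_nr); /* a growing index is not handled */
	return i + merged - (prev_nr - cur_nr) - 1;
}
```

For example, assuming an in-process callback that replaces three staged entries at positions 2 to 4 with one merged entry, the index shrinks from 10 to 8 entries; `adjust_position(2, 3, 10, 8)` yields 2, and after the loop increment the iterator sits at position 3, the first entry of the next path. With a forking callback the in-memory index is unchanged, so `adjust_position(2, 3, 10, 10)` yields 4 and the loop continues at position 5.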