Message ID | 20190918202738.57273-1-jonathantanmy@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] merge-recursive: symlink's descendants not in way | expand |
On Wed, Sep 18, 2019 at 1:27 PM Jonathan Tan <jonathantanmy@google.com> wrote: > > When the working tree has: > - bar (directory) > - bar/file (file) > - foo (symlink to .) > > (note that lstat() for "foo/bar" would tell us that it is a directory) > > and the user merges a commit that deletes the foo symlink and instead > contains: > - bar (directory, as above) > - bar/file (file, as above) > - foo (directory) > - foo/bar (file) > > the merge should happen without requiring user intervention. However, > this does not happen. > > This is because dir_in_way(), when checking the working tree, thinks > that "foo/bar" is a directory. But a symlink should be treated much the > same as a file: since dir_in_way() is only checking to see if there is a > directory in the way, we don't want symlinks in leading paths to > sometimes cause dir_in_way() to return true. > > Teach dir_in_way() to also check for symlinks in leading paths before > reporting whether a directory is in the way. > > Helped-by: Elijah Newren <newren@gmail.com> > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > --- > Changes from v1: > > - Used has_symlink_leading_path(). This drastically shortens the diff. > - Updated commit message following suggestions from Junio, Szeder Gábor, > and Elijah Newren. > - Updated test to add prereq and verification that the working tree > contains what we want. > --- > merge-recursive.c | 3 ++- > t/t3030-merge-recursive.sh | 28 ++++++++++++++++++++++++++++ > 2 files changed, 30 insertions(+), 1 deletion(-) > > diff --git a/merge-recursive.c b/merge-recursive.c > index 6b812d67e3..22a12cfeba 100644 > --- a/merge-recursive.c > +++ b/merge-recursive.c > @@ -764,7 +764,8 @@ static int dir_in_way(struct index_state *istate, const char *path, > > strbuf_release(&dirpath); > return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode) && > - !(empty_ok && is_empty_dir(path)); > + !(empty_ok && is_empty_dir(path)) && > + !has_symlink_leading_path(path, strlen(path)); > } > > /* > diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh > index ff641b348a..faa8892741 100755 > --- a/t/t3030-merge-recursive.sh > +++ b/t/t3030-merge-recursive.sh > @@ -452,6 +452,34 @@ test_expect_success 'merge-recursive d/f conflict result' ' > > ' > > +test_expect_success SYMLINKS 'dir in working tree with symlink ancestor does not produce d/f conflict' ' > + git init sym && > + ( > + cd sym && > + ln -s . foo && > + mkdir bar && > + >bar/file && > + git add foo bar/file && > + git commit -m "foo symlink" && > + > + git checkout -b branch1 && > + git commit --allow-empty -m "empty commit" && > + > + git checkout master && > + git rm foo && > + mkdir foo && > + >foo/bar && > + git add foo/bar && > + git commit -m "replace foo symlink with real foo dir and foo/bar file" && > + > + git checkout branch1 && > + > + git cherry-pick master && > + test_path_is_dir foo && > + test_path_is_file foo/bar > + ) > +' > + > test_expect_success 'reset and 3-way merge' ' > > git reset --hard "$c2" && > -- Looks good to me; nice how much it has simplified. Thanks for working on this. > 2.23.0.237.gc6a4ce50a0-goog A total tangent, but what do you use the "-goog" suffix for?
Jonathan Tan <jonathantanmy@google.com> writes: > When the working tree has: > - bar (directory) > - bar/file (file) > - foo (symlink to .) > > (note that lstat() for "foo/bar" would tell us that it is a directory) > > and the user merges a commit that deletes the foo symlink and instead > contains: > - bar (directory, as above) > - bar/file (file, as above) > - foo (directory) > - foo/bar (file) > > the merge should happen without requiring user intervention. Thanks. That clears my previous confusion. It is clear that the desired outcome is that bar/file will be merged with itself, foo itself will resolve to "deleted", and foo/bar will be created. > However, this does not happen. OK. We notice that we need to newly create foo/bar but we incorrectly find that there is "foo/bar" already because of the careless use of bare lstat(2) makes "bar" visible as if it were also "foo/bar". I wonder if the current code would be confused the same way if the side branch added "foo/bar/file", or the confusion would be even worse---it is not dir_in_way() and a different codepath would be affected, no? > diff --git a/merge-recursive.c b/merge-recursive.c > index 6b812d67e3..22a12cfeba 100644 > --- a/merge-recursive.c > +++ b/merge-recursive.c > @@ -764,7 +764,8 @@ static int dir_in_way(struct index_state *istate, const char *path, > > strbuf_release(&dirpath); > return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode) && > - !(empty_ok && is_empty_dir(path)); > + !(empty_ok && is_empty_dir(path)) && > + !has_symlink_leading_path(path, strlen(path)); As the new call is fairly heavyweight compared to everything else we are doing, I think it is very sensible to have this at the end, like this patch does. Nicely done. Thanks, will queue.
> OK. We notice that we need to newly create foo/bar but we > incorrectly find that there is "foo/bar" already because of the > careless use of bare lstat(2) makes "bar" visible as if it were also > "foo/bar". I wonder if the current code would be confused the same > way if the side branch added "foo/bar/file", or the confusion would > be even worse---it is not dir_in_way() and a different codepath > would be affected, no? I don't think there is a different codepath to be affected - as far as I can tell, dir_in_way() is the only cause (at least of this particular error). I've tested this case [1] and the current code actually works - as you said, dir_in_way() will not report anything in the way (since foo/bar/file doesn't exist in the working tree), so the merge will happen as expected. > Nicely done. Thanks, will queue. Thanks. [1] diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh index faa8892741..f284aeb173 100755 --- a/t/t3030-merge-recursive.sh +++ b/t/t3030-merge-recursive.sh @@ -468,15 +468,16 @@ test_expect_success SYMLINKS 'dir in working tree with symlink ancestor does not git checkout master && git rm foo && mkdir foo && - >foo/bar && - git add foo/bar && - git commit -m "replace foo symlink with real foo dir and foo/bar file" && + mkdir foo/bar && + >foo/bar/baz && + git add foo/bar/baz && + git commit -m "replace foo symlink with real foo dir and foo/bar/baz file" && git checkout branch1 && git cherry-pick master && test_path_is_dir foo && - test_path_is_file foo/bar + test_path_is_file foo/bar/baz ) '
Jonathan Tan <jonathantanmy@google.com> writes: >> OK. We notice that we need to newly create foo/bar but we >> incorrectly find that there is "foo/bar" already because of the >> careless use of bare lstat(2) makes "bar" visible as if it were also >> "foo/bar". I wonder if the current code would be confused the same >> way if the side branch added "foo/bar/file", or the confusion would >> be even worse---it is not dir_in_way() and a different codepath >> would be affected, no? > > I don't think there is a different codepath to be affected - as far as I > can tell, dir_in_way() is the only cause (at least of this particular > error). OK, so existing code already realizes that "foo/bar/file" added in the side branch is the one that must survive, and the "bar/file" in the current branch does not fool it into thinking that "foo/bar/file" is also on our end, and needs to be merged as an add-add conflict. It was only the dir-in-the-way logic that was not careful enough? In that case, thanks for a very good news and for a careful analysis.
> Jonathan Tan <jonathantanmy@google.com> writes: > > >> OK. We notice that we need to newly create foo/bar but we > >> incorrectly find that there is "foo/bar" already because of the > >> careless use of bare lstat(2) makes "bar" visible as if it were also > >> "foo/bar". I wonder if the current code would be confused the same > >> way if the side branch added "foo/bar/file", or the confusion would > >> be even worse---it is not dir_in_way() and a different codepath > >> would be affected, no? > > > > I don't think there is a different codepath to be affected - as far as I > > can tell, dir_in_way() is the only cause (at least of this particular > > error). > > OK, so existing code already realizes that "foo/bar/file" added in > the side branch is the one that must survive, and the "bar/file" in > the current branch does not fool it into thinking that "foo/bar/file" > is also on our end, and needs to be merged as an add-add conflict. > It was only the dir-in-the-way logic that was not careful enough? Yes, that's correct. (I wrote foo/bar/baz in my other email but replaced "baz" with "file", and it still works before and after my patch.) > In that case, thanks for a very good news and for a careful analysis. You're welcome! The careful analysis should be credited to Elijah Newren [1]. [1] https://public-inbox.org/git/CABPp-BHpXWF+1hKUTfn8s-y4MJZXz+jUVS_K10eKyD6PGwo=gg@mail.gmail.com/
diff --git a/merge-recursive.c b/merge-recursive.c index 6b812d67e3..22a12cfeba 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -764,7 +764,8 @@ static int dir_in_way(struct index_state *istate, const char *path, strbuf_release(&dirpath); return check_working_copy && !lstat(path, &st) && S_ISDIR(st.st_mode) && - !(empty_ok && is_empty_dir(path)); + !(empty_ok && is_empty_dir(path)) && + !has_symlink_leading_path(path, strlen(path)); } /* diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh index ff641b348a..faa8892741 100755 --- a/t/t3030-merge-recursive.sh +++ b/t/t3030-merge-recursive.sh @@ -452,6 +452,34 @@ test_expect_success 'merge-recursive d/f conflict result' ' ' +test_expect_success SYMLINKS 'dir in working tree with symlink ancestor does not produce d/f conflict' ' + git init sym && + ( + cd sym && + ln -s . foo && + mkdir bar && + >bar/file && + git add foo bar/file && + git commit -m "foo symlink" && + + git checkout -b branch1 && + git commit --allow-empty -m "empty commit" && + + git checkout master && + git rm foo && + mkdir foo && + >foo/bar && + git add foo/bar && + git commit -m "replace foo symlink with real foo dir and foo/bar file" && + + git checkout branch1 && + + git cherry-pick master && + test_path_is_dir foo && + test_path_is_file foo/bar + ) +' + test_expect_success 'reset and 3-way merge' ' git reset --hard "$c2" &&
When the working tree has: - bar (directory) - bar/file (file) - foo (symlink to .) (note that lstat() for "foo/bar" would tell us that it is a directory) and the user merges a commit that deletes the foo symlink and instead contains: - bar (directory, as above) - bar/file (file, as above) - foo (directory) - foo/bar (file) the merge should happen without requiring user intervention. However, this does not happen. This is because dir_in_way(), when checking the working tree, thinks that "foo/bar" is a directory. But a symlink should be treated much the same as a file: since dir_in_way() is only checking to see if there is a directory in the way, we don't want symlinks in leading paths to sometimes cause dir_in_way() to return true. Teach dir_in_way() to also check for symlinks in leading paths before reporting whether a directory is in the way. Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Changes from v1: - Used has_symlink_leading_path(). This drastically shortens the diff. - Updated commit message following suggestions from Junio, Szeder Gábor, and Elijah Newren. - Updated test to add prereq and verification that the working tree contains what we want. --- merge-recursive.c | 3 ++- t/t3030-merge-recursive.sh | 28 ++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+), 1 deletion(-)