Message ID | 20190322232237.13293-3-matheus.bernardino@usp.br (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | clone: dir-iterator refactoring with tests | expand |
On 03/22, Matheus Tavares wrote: > There is currently an odd behaviour when locally clonning a repository > with symlinks at .git/objects: using --no-hardlinks all symlinks are > dereferenced but without it Git will try to hardlink the files with the > link() function, which has an OS-specific behaviour on symlinks. On OSX > and NetBSD, it creates a hardlink to the file pointed by the symlink > whilst on GNU/Linux, it creates a hardlink to the symlink itself. > > On Manjaro GNU/Linux: > $ touch a > $ ln -s a b > $ link b c > $ ls -li a b c > 155 [...] a > 156 [...] b -> a > 156 [...] c -> a > > But on NetBSD: > $ ls -li a b c > 2609160 [...] a > 2609164 [...] b -> a > 2609160 [...] c > > It's not good to have the result of a local clone to be OS-dependent and > since the behaviour on GNU/Linux may result in broken symlinks, let's > re-implement it with linkat() instead of link() using a flag to always > follow symlinks and make the hardlink be to the pointed file. With this, > besides standardizing the behaviour, no broken symlinks will be > produced. Also, add tests for symlinked files at .git/objects/. > > Note: Git won't create symlinks at .git/objects itself, but it's better > to handle this case and be friendly with users who manually create them. > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > builtin/clone.c | 2 +- > t/t5604-clone-reference.sh | 26 +++++++++++++++++++------- > 2 files changed, 20 insertions(+), 8 deletions(-) > > diff --git a/builtin/clone.c b/builtin/clone.c > index 50bde99618..b76f33c635 100644 > --- a/builtin/clone.c > +++ b/builtin/clone.c > @@ -443,7 +443,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > if (unlink(dest->buf) && errno != ENOENT) > die_errno(_("failed to unlink '%s'"), dest->buf); > if (!option_no_hardlinks) { > - if (!link(src->buf, dest->buf)) > + if (!linkat(AT_FDCWD, src->buf, AT_FDCWD, dest->buf, AT_SYMLINK_FOLLOW)) This line is starting to get a bit long, might be worth breaking it up to keep to 80 characters per line. I notice that we are currently not using 'linkat()' anywhere else in our codebase. It looks like it has been introduced in POSIX.1-2008, which sounds fairly recent by git's standards. So I wonder if this is really supported on all platforms that git is being built on. I also wonder what would need to be done on Windows if we were to introduce this. I see we define the 'link()' function in 'compat/mingw.c' for that currently, so I guess something similar would be needed for 'linkat()'. I added Dscho to Cc for Windows expertise. While I agree with the goal of consistency accross all platforms here, I don't know if it's actually worth going through the pain of doing that, especially for somewhat of an edge case in local clones. If the test in the previous patch passes on all platforms, I'd be okay with just calling the behaviour here undefined, especially as git would never actually create symlinks in the .git/objects directory.
On Thu, Mar 28 2019, Thomas Gummerer wrote: > On 03/22, Matheus Tavares wrote: >> There is currently an odd behaviour when locally clonning a repository >> with symlinks at .git/objects: using --no-hardlinks all symlinks are >> dereferenced but without it Git will try to hardlink the files with the >> link() function, which has an OS-specific behaviour on symlinks. On OSX >> and NetBSD, it creates a hardlink to the file pointed by the symlink >> whilst on GNU/Linux, it creates a hardlink to the symlink itself. >> >> On Manjaro GNU/Linux: >> $ touch a >> $ ln -s a b >> $ link b c >> $ ls -li a b c >> 155 [...] a >> 156 [...] b -> a >> 156 [...] c -> a >> >> But on NetBSD: >> $ ls -li a b c >> 2609160 [...] a >> 2609164 [...] b -> a >> 2609160 [...] c >> >> It's not good to have the result of a local clone to be OS-dependent and >> since the behaviour on GNU/Linux may result in broken symlinks, let's >> re-implement it with linkat() instead of link() using a flag to always >> follow symlinks and make the hardlink be to the pointed file. With this, >> besides standardizing the behaviour, no broken symlinks will be >> produced. Also, add tests for symlinked files at .git/objects/. >> >> Note: Git won't create symlinks at .git/objects itself, but it's better >> to handle this case and be friendly with users who manually create them. >> >> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> >> Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> >> --- >> builtin/clone.c | 2 +- >> t/t5604-clone-reference.sh | 26 +++++++++++++++++++------- >> 2 files changed, 20 insertions(+), 8 deletions(-) >> >> diff --git a/builtin/clone.c b/builtin/clone.c >> index 50bde99618..b76f33c635 100644 >> --- a/builtin/clone.c >> +++ b/builtin/clone.c >> @@ -443,7 +443,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, >> if (unlink(dest->buf) && errno != ENOENT) >> die_errno(_("failed to unlink '%s'"), dest->buf); >> if (!option_no_hardlinks) { >> - if (!link(src->buf, dest->buf)) >> + if (!linkat(AT_FDCWD, src->buf, AT_FDCWD, dest->buf, AT_SYMLINK_FOLLOW)) > > This line is starting to get a bit long, might be worth breaking it up > to keep to 80 characters per line. > > I notice that we are currently not using 'linkat()' anywhere else in > our codebase. It looks like it has been introduced in POSIX.1-2008, > which sounds fairly recent by git's standards. So I wonder if this is > really supported on all platforms that git is being built on. > > I also wonder what would need to be done on Windows if we were to > introduce this. I see we define the 'link()' function in > 'compat/mingw.c' for that currently, so I guess something similar > would be needed for 'linkat()'. I added Dscho to Cc for Windows > expertise. For better of worse this particular quest started because I pointed out (with some WIP patches) that for understanding this change we should test whatever we did now, to ensure that the refactoring didn't have unintended side-effects. But that's a separate question from whether or not we want to keep the current behavior. I think the current behavior is clearly insane, so I think we should change it with some follow-up patches. In particular options like --dissociate should clearly (in my mind at least) have behavior similar to "cp -L", and --local should hardlink to the *target* of the symlink, if anything, at least for objects/{??,pack,info} I think that changes the portability story with linkat(), since it's not something we should be planning to keep, just an intermediate step so we don't have a gigantic patch that both adds tests, refactors and changes the behavior. > While I agree with the goal of consistency accross all platforms here, > I don't know if it's actually worth going through the pain of doing > that, especially for somewhat of an edge case in local clones. Note that we explicitly clone everything under objects/, including recursively cloning unknown directories and their files. So this is not just say about how we handle symlinks that we don't expect now (nothing uses them), but if we want to make the promise that nothing in objects/ will ever use symlinks. Or more specifically, that if a new version of git starts using it that something doing local clones might produce a broken copy of such a repo. Maybe we'll still say "we don't care". Just saying it's a slightly different question... > If the test in the previous patch passes on all platforms, I'd be okay > with just calling the behaviour here undefined, especially as git > would never actually create symlinks in the .git/objects directory.
On Thu, Mar 28, 2019 at 7:10 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > On 03/22, Matheus Tavares wrote: > > There is currently an odd behaviour when locally clonning a repository > > with symlinks at .git/objects: using --no-hardlinks all symlinks are > > dereferenced but without it Git will try to hardlink the files with the > > link() function, which has an OS-specific behaviour on symlinks. On OSX > > and NetBSD, it creates a hardlink to the file pointed by the symlink > > whilst on GNU/Linux, it creates a hardlink to the symlink itself. > > > > On Manjaro GNU/Linux: > > $ touch a > > $ ln -s a b > > $ link b c > > $ ls -li a b c > > 155 [...] a > > 156 [...] b -> a > > 156 [...] c -> a > > > > But on NetBSD: > > $ ls -li a b c > > 2609160 [...] a > > 2609164 [...] b -> a > > 2609160 [...] c > > > > It's not good to have the result of a local clone to be OS-dependent and > > since the behaviour on GNU/Linux may result in broken symlinks, let's > > re-implement it with linkat() instead of link() using a flag to always > > follow symlinks and make the hardlink be to the pointed file. With this, > > besides standardizing the behaviour, no broken symlinks will be > > produced. Also, add tests for symlinked files at .git/objects/. > > > > Note: Git won't create symlinks at .git/objects itself, but it's better > > to handle this case and be friendly with users who manually create them. > > > > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > > Co-authored-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > > --- > > builtin/clone.c | 2 +- > > t/t5604-clone-reference.sh | 26 +++++++++++++++++++------- > > 2 files changed, 20 insertions(+), 8 deletions(-) > > > > diff --git a/builtin/clone.c b/builtin/clone.c > > index 50bde99618..b76f33c635 100644 > > --- a/builtin/clone.c > > +++ b/builtin/clone.c > > @@ -443,7 +443,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > > if (unlink(dest->buf) && errno != ENOENT) > > die_errno(_("failed to unlink '%s'"), dest->buf); > > if (!option_no_hardlinks) { > > - if (!link(src->buf, dest->buf)) > > + if (!linkat(AT_FDCWD, src->buf, AT_FDCWD, dest->buf, AT_SYMLINK_FOLLOW)) > > This line is starting to get a bit long, might be worth breaking it up > to keep to 80 characters per line. > > I notice that we are currently not using 'linkat()' anywhere else in > our codebase. It looks like it has been introduced in POSIX.1-2008, > which sounds fairly recent by git's standards. So I wonder if this is > really supported on all platforms that git is being built on. > > I also wonder what would need to be done on Windows if we were to > introduce this. I see we define the 'link()' function in > 'compat/mingw.c' for that currently, so I guess something similar > would be needed for 'linkat()'. I added Dscho to Cc for Windows > expertise. Ok, what if instead of using linkat() we use 'realpath(const char *path, char *resolved_path)', which will resolve any symlinks at 'path' and store the canonical path at 'resolved_path'? Then, we can still keep using link() but now, with the certainty that all platforms will have a consistent behaviour? (also, realpath() is POSIX.1-2001) Would that be a better idea? > While I agree with the goal of consistency accross all platforms here, > I don't know if it's actually worth going through the pain of doing > that, especially for somewhat of an edge case in local clones. > > If the test in the previous patch passes on all platforms, I'd be okay > with just calling the behaviour here undefined, especially as git > would never actually create symlinks in the .git/objects directory.
Hi Thomas, On Thu, 28 Mar 2019, Thomas Gummerer wrote: > On 03/22, Matheus Tavares wrote: > > > > diff --git a/builtin/clone.c b/builtin/clone.c > > index 50bde99618..b76f33c635 100644 > > --- a/builtin/clone.c > > +++ b/builtin/clone.c > > @@ -443,7 +443,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, > > if (unlink(dest->buf) && errno != ENOENT) > > die_errno(_("failed to unlink '%s'"), dest->buf); > > if (!option_no_hardlinks) { > > - if (!link(src->buf, dest->buf)) > > + if (!linkat(AT_FDCWD, src->buf, AT_FDCWD, dest->buf, AT_SYMLINK_FOLLOW)) > > [...] > > I notice that we are currently not using 'linkat()' anywhere else in > our codebase. It looks like it has been introduced in POSIX.1-2008, > which sounds fairly recent by git's standards. So I wonder if this is > really supported on all platforms that git is being built on. I bet you it isn't. > I also wonder what would need to be done on Windows if we were to > introduce this. I see we define the 'link()' function in > 'compat/mingw.c' for that currently, so I guess something similar > would be needed for 'linkat()'. I added Dscho to Cc for Windows > expertise. Indeed, `linkat()` would have to be implemented in `compat/mingw.c`. It would be a bit involved because the last parameter of that function changes behavior noticeably, but the main difficulty (to determine the path from a file descriptor) should be overcome using `HANDLE olddirhandle = _get_osfhandle(olddirfd);` and the calling `GetFinalPathNameByHandleW(olddirhandle, wbuf, sizeof(wbuf));`. So yes, this is *not* something I'd do lightly. The bigger problem will be to continue to support older Unices such as SunOS and AIX. I highly doubt that they have that function. You should find out, Matheus. Ciao, Johannes
On 03/29, Matheus Tavares Bernardino wrote: > On Thu, Mar 28, 2019 at 7:10 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > I notice that we are currently not using 'linkat()' anywhere else in > > our codebase. It looks like it has been introduced in POSIX.1-2008, > > which sounds fairly recent by git's standards. So I wonder if this is > > really supported on all platforms that git is being built on. > > > > I also wonder what would need to be done on Windows if we were to > > introduce this. I see we define the 'link()' function in > > 'compat/mingw.c' for that currently, so I guess something similar > > would be needed for 'linkat()'. I added Dscho to Cc for Windows > > expertise. > > Ok, what if instead of using linkat() we use 'realpath(const char > *path, char *resolved_path)', which will resolve any symlinks at > 'path' and store the canonical path at 'resolved_path'? Then, we can > still keep using link() but now, with the certainty that all platforms > will have a consistent behaviour? (also, realpath() is POSIX.1-2001) > Would that be a better idea? Yeah, I think that is a good idea. Note that 'realpath()' itself is not used anywhere in our codebase either, but there is 'strbuf_realpath()', that from reading the function documentation does exactly what 'realpath()' would do. So using 'strbuf_realpath()' would probably be the right thing to do here.
On 03/29, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Mar 28 2019, Thomas Gummerer wrote: > > I notice that we are currently not using 'linkat()' anywhere else in > > our codebase. It looks like it has been introduced in POSIX.1-2008, > > which sounds fairly recent by git's standards. So I wonder if this is > > really supported on all platforms that git is being built on. > > > > I also wonder what would need to be done on Windows if we were to > > introduce this. I see we define the 'link()' function in > > 'compat/mingw.c' for that currently, so I guess something similar > > would be needed for 'linkat()'. I added Dscho to Cc for Windows > > expertise. > > For better of worse this particular quest started because I pointed out > (with some WIP patches) that for understanding this change we should > test whatever we did now, to ensure that the refactoring didn't have > unintended side-effects. > > But that's a separate question from whether or not we want to keep the > current behavior. > > I think the current behavior is clearly insane, so I think we should > change it with some follow-up patches. In particular options like > --dissociate should clearly (in my mind at least) have behavior similar > to "cp -L", and --local should hardlink to the *target* of the symlink, > if anything, at least for objects/{??,pack,info} Right, I definitely agree with all of that. Adding tests for the current behaviour is definitely a good thing if we can do it in a sane way. And I also agree that the current behaviour is insane, and should be fixed, but that may not want to be part of this patch series. > I think that changes the portability story with linkat(), since it's not > something we should be planning to keep, just an intermediate step so we > don't have a gigantic patch that both adds tests, refactors and changes > the behavior. Fair enough, but that also means that this patch series necessarily has to introduce the changes in behaviour as well as switching clone to use dir-iterator. Of course we could say that the switch-over to using dir-iterator could be done as a separate patch series, but that seems a bit too much of a change in scope of this series. Now I think Matheus has actually found a nice solution to this issue using 'strbuf_readlink()', which gives us the same behaviour as using 'linkat()' in this patch would give us, so this might not be that big an issue in the end.
On Fri, Mar 29, 2019 at 5:05 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > On 03/29, Matheus Tavares Bernardino wrote: > > On Thu, Mar 28, 2019 at 7:10 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > > I notice that we are currently not using 'linkat()' anywhere else in > > > our codebase. It looks like it has been introduced in POSIX.1-2008, > > > which sounds fairly recent by git's standards. So I wonder if this is > > > really supported on all platforms that git is being built on. > > > > > > I also wonder what would need to be done on Windows if we were to > > > introduce this. I see we define the 'link()' function in > > > 'compat/mingw.c' for that currently, so I guess something similar > > > would be needed for 'linkat()'. I added Dscho to Cc for Windows > > > expertise. > > > > Ok, what if instead of using linkat() we use 'realpath(const char > > *path, char *resolved_path)', which will resolve any symlinks at > > 'path' and store the canonical path at 'resolved_path'? Then, we can > > still keep using link() but now, with the certainty that all platforms > > will have a consistent behaviour? (also, realpath() is POSIX.1-2001) > > Would that be a better idea? > > Yeah, I think that is a good idea. Note that 'realpath()' itself is > not used anywhere in our codebase either, but there is > 'strbuf_realpath()', that from reading the function documentation does > exactly what 'realpath()' would do. So using 'strbuf_realpath()' > would probably be the right thing to do here. Thanks. While I was looking for realpath() at git codebase (before I saw your email), I got a little confused: Besides strbuf_realpath() I also found real_path(), real_path_if_valid() and real_pathdup(). All these last three use strbuf_realpath() but they also initialize the struct strbuf internally and just return a 'char *', which is much convenient in some cases. What seems weird to me is that, whilst real_pathdup() releases the internally initialized struct strubuf (leaving just the returned string to be free'd by the user), the other two don't. So, if struct strbuf change in the future to have more dynamic allocated resources, these functions will also have to be modified. Also, since real_pathdup() can already do what the other two do, do you know if there is a reason to keep all of them? One last question: I found some places which don't free the string returned by, for example, real_path() (e.g., find_worktree() at worktree.c). Would it be a valid/good patch (or patches) to add free() calls in this places? (I'm currently trying to get more people here at USP to contribute to git, and maybe this could be a nice first contribution for them...)
On 03/30, Matheus Tavares Bernardino wrote: > On Fri, Mar 29, 2019 at 5:05 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > > > On 03/29, Matheus Tavares Bernardino wrote: > > > Ok, what if instead of using linkat() we use 'realpath(const char > > > *path, char *resolved_path)', which will resolve any symlinks at > > > 'path' and store the canonical path at 'resolved_path'? Then, we can > > > still keep using link() but now, with the certainty that all platforms > > > will have a consistent behaviour? (also, realpath() is POSIX.1-2001) > > > Would that be a better idea? > > > > Yeah, I think that is a good idea. Note that 'realpath()' itself is > > not used anywhere in our codebase either, but there is > > 'strbuf_realpath()', that from reading the function documentation does > > exactly what 'realpath()' would do. So using 'strbuf_realpath()' > > would probably be the right thing to do here. > > Thanks. While I was looking for realpath() at git codebase (before I > saw your email), I got a little confused: Besides strbuf_realpath() I > also found real_path(), real_path_if_valid() and real_pathdup(). All > these last three use strbuf_realpath() but they also initialize the > struct strbuf internally and just return a 'char *', which is much > convenient in some cases. Right, feel free to use whichever is most convenient for you, and whichever works in the context. > What seems weird to me is that, whilst > real_pathdup() releases the internally initialized struct strubuf > (leaving just the returned string to be free'd by the user), the other > two don't. So, if struct strbuf change in the future to have more > dynamic allocated resources, these functions will also have to be > modified. Also, since real_pathdup() can already do what the other two > do, do you know if there is a reason to keep all of them? Right, '*dup()' functions usually leave the return value to be free'd by the caller. And while 'real_pathdup()' could do what the others do already it also takes more effort to use it. Users don't need to free the return value from 'real_path()' to avoid a memory leak. This alone justifies its existence I think. > One last question: I found some places which don't free the string > returned by, for example, real_path() (e.g., find_worktree() at > worktree.c). Would it be a valid/good patch (or patches) to add free() > calls in this places? (I'm currently trying to get more people here at > USP to contribute to git, and maybe this could be a nice first > contribution for them...) Trying to plug memory leaks in the codebase is definitely something that I think is worthy of doing. Sometimes it's not worth actually free'ing the memory, for example just before the program exits, in which case we can use the UNLEAK annotation. It was introduced in 0e5bba53af ("add UNLEAK annotation for reducing leak false positives", 2017-09-08) if you want more background. That said, the memory from 'real_path()' should actually not be free'd. The strbuf there has a static lifetime, so it is valid until git exits. If we were to free the return value of the function we'd actually free an internal buffer of the strbuf, that is still valid. So if someone were to use 'real_path()' after that, the memory that strbuf still thinks it owns would actually have been free'd, which would result in undefined behaviour, and probably would make git segfault.
On Sat, Mar 30, 2019 at 4:27 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > On 03/30, Matheus Tavares Bernardino wrote: > > On Fri, Mar 29, 2019 at 5:05 PM Thomas Gummerer <t.gummerer@gmail.com> wrote: > > > > > > On 03/29, Matheus Tavares Bernardino wrote: > > > > Ok, what if instead of using linkat() we use 'realpath(const char > > > > *path, char *resolved_path)', which will resolve any symlinks at > > > > 'path' and store the canonical path at 'resolved_path'? Then, we can > > > > still keep using link() but now, with the certainty that all platforms > > > > will have a consistent behaviour? (also, realpath() is POSIX.1-2001) > > > > Would that be a better idea? > > > > > > Yeah, I think that is a good idea. Note that 'realpath()' itself is > > > not used anywhere in our codebase either, but there is > > > 'strbuf_realpath()', that from reading the function documentation does > > > exactly what 'realpath()' would do. So using 'strbuf_realpath()' > > > would probably be the right thing to do here. > > > > Thanks. While I was looking for realpath() at git codebase (before I > > saw your email), I got a little confused: Besides strbuf_realpath() I > > also found real_path(), real_path_if_valid() and real_pathdup(). All > > these last three use strbuf_realpath() but they also initialize the > > struct strbuf internally and just return a 'char *', which is much > > convenient in some cases. > > Right, feel free to use whichever is most convenient for you, and > whichever works in the context. > > > What seems weird to me is that, whilst > > real_pathdup() releases the internally initialized struct strubuf > > (leaving just the returned string to be free'd by the user), the other > > two don't. So, if struct strbuf change in the future to have more > > dynamic allocated resources, these functions will also have to be > > modified. Also, since real_pathdup() can already do what the other two > > do, do you know if there is a reason to keep all of them? > > Right, '*dup()' functions usually leave the return value to be free'd > by the caller. And while 'real_pathdup()' could do what the others do > already it also takes more effort to use it. Users don't need to free > the return value from 'real_path()' to avoid a memory leak. This > alone justifies its existence I think. > > > One last question: I found some places which don't free the string > > returned by, for example, real_path() (e.g., find_worktree() at > > worktree.c). Would it be a valid/good patch (or patches) to add free() > > calls in this places? (I'm currently trying to get more people here at > > USP to contribute to git, and maybe this could be a nice first > > contribution for them...) > > Trying to plug memory leaks in the codebase is definitely something > that I think is worthy of doing. Sometimes it's not worth actually > free'ing the memory, for example just before the program exits, in > which case we can use the UNLEAK annotation. It was introduced in > 0e5bba53af ("add UNLEAK annotation for reducing leak false positives", > 2017-09-08) if you want more background. > > That said, the memory from 'real_path()' should actually not be > free'd. The strbuf there has a static lifetime, so it is valid until > git exits. If we were to free the return value of the function we'd > actually free an internal buffer of the strbuf, that is still valid. > So if someone were to use 'real_path()' after that, the memory that > strbuf still thinks it owns would actually have been free'd, which > would result in undefined behaviour, and probably would make git > segfault. > Thanks for the great explanation, Thomas. I hadn't noticed that the strbuf variable inside real_path() is declared as static. I also took some time, now, to better understand how strbuf functions deal with the buf attribute (especially how it's realloc'ed) and now I think I understand it better. Thanks again for the help! > -- > You received this message because you are subscribed to the Google Groups "Kernel USP" group. > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-usp+unsubscribe@googlegroups.com. > To post to this group, send email to kernel-usp@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/kernel-usp/20190330192738.GQ32487%40hank.intra.tgummerer.com. > For more options, visit https://groups.google.com/d/optout.
diff --git a/builtin/clone.c b/builtin/clone.c index 50bde99618..b76f33c635 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -443,7 +443,7 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest, if (unlink(dest->buf) && errno != ENOENT) die_errno(_("failed to unlink '%s'"), dest->buf); if (!option_no_hardlinks) { - if (!link(src->buf, dest->buf)) + if (!linkat(AT_FDCWD, src->buf, AT_FDCWD, dest->buf, AT_SYMLINK_FOLLOW)) continue; if (option_local > 0) die_errno(_("failed to create link '%s'"), dest->buf); diff --git a/t/t5604-clone-reference.sh b/t/t5604-clone-reference.sh index 708b1a2c66..76d45f1187 100755 --- a/t/t5604-clone-reference.sh +++ b/t/t5604-clone-reference.sh @@ -266,7 +266,7 @@ test_expect_success 'clone a repo with garbage in objects/*' ' test_cmp expected actual ' -test_expect_success SYMLINKS 'setup repo with manually symlinked dirs and unknown files at objects/' ' +test_expect_success SYMLINKS 'setup repo with manually symlinked or unknown files at objects/' ' git init T && ( cd T && @@ -282,10 +282,18 @@ test_expect_success SYMLINKS 'setup repo with manually symlinked dirs and unknow cd .git/objects && find ?? -type d >loose-dirs && last_loose=$(tail -n 1 loose-dirs) && - rm -f loose-dirs && mv $last_loose a-loose-dir && ln -s a-loose-dir $last_loose && + first_loose=$(head -n 1 loose-dirs) && + rm -f loose-dirs && + ( + cd $first_loose && + obj=$(ls *) && + mv $obj ../an-object && + ln -s ../an-object $obj + ) && find . -type f | sort >../../../T.objects-files.raw && + find . -type l | sort >../../../T.objects-symlinks.raw && echo unknown_content> unknown_file ) ) && @@ -294,7 +302,7 @@ test_expect_success SYMLINKS 'setup repo with manually symlinked dirs and unknow ' -test_expect_success SYMLINKS 'clone repo with symlinked dirs and unknown files at objects/' ' +test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at objects/' ' for option in --local --no-hardlinks --shared --dissociate do git clone $option T T$option || return 1 && @@ -303,7 +311,8 @@ test_expect_success SYMLINKS 'clone repo with symlinked dirs and unknown files a test_cmp T.objects T$option.objects && ( cd T$option/.git/objects && - find . -type f | sort >../../../T$option.objects-files.raw + find . -type f | sort >../../../T$option.objects-files.raw && + find . -type l | sort >../../../T$option.objects-symlinks.raw ) done && @@ -317,6 +326,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked dirs and unknown files a ./Y/Z ./Y/Z ./a-loose-dir/Z + ./an-object ./Y/Z ./info/packs ./pack/pack-Z.idx @@ -326,15 +336,17 @@ test_expect_success SYMLINKS 'clone repo with symlinked dirs and unknown files a ./unknown_file EOF - for option in --local --dissociate --no-hardlinks + for option in --local --no-hardlinks --dissociate do - test_cmp expected-files T$option.objects-files.raw.de-sha || return 1 + test_cmp expected-files T$option.objects-files.raw.de-sha || return 1 && + test_must_be_empty T$option.objects-symlinks.raw.de-sha || return 1 done && cat >expected-files <<-EOF && ./info/alternates EOF - test_cmp expected-files T--shared.objects-files.raw + test_cmp expected-files T--shared.objects-files.raw && + test_must_be_empty T--shared.objects-symlinks.raw ' test_done