Message ID | YCsBRUQkrAm8l2gz@coredump.intra.peff.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] fsck: make symlinked .gitignore and .gitattributes a warning | expand |
On Tue, Feb 16 2021, Jeff King wrote: > While there are some minor security implications to having these files > be symlinks, this is out-weighed by the inconvenience of blocking > historical commits in some projects that might include them. Digging up the relevant thread that's the projects noted at https://lore.kernel.org/git/20201027033518.GH2645313@google.com/ ? I cloned the openmrn.git repository noted there, and checkout dies with: error: invalid path 'applications/clinic_app/targets/linux.x86/.gitignore' fatal: Could not reset index file to revision 'HEAD'. I'm running a recent-ish snapshot of next at d98b1dd5eaa7, so with your verify_path() change in current "seen". So this series changes nothing about the checkout, just the fsck check? I see there's your https://lore.kernel.org/git/20201027075853.GH3005508@coredump.intra.peff.net/#t to improve the !!symlink() codepath in apply.c Still, it seems like a rather jarring gap in implementation to just warn about this in fsck for the benefit of e.g. server operations, but then hard die on the current client. There seems to be no way around that hard die, and both repos in that report are ones that are just symlinking .gitignore to a ../somedir/.gitignore deep in their own tree. So aren't we both making the fsck check too loose and the client too strict? Would anyone care if this was an error on fsck if we did the "is outside repo?" check?
On Tue, Feb 16, 2021 at 01:38:30AM +0100, Ævar Arnfjörð Bjarmason wrote: > On Tue, Feb 16 2021, Jeff King wrote: > > > While there are some minor security implications to having these files > > be symlinks, this is out-weighed by the inconvenience of blocking > > historical commits in some projects that might include them. > > Digging up the relevant thread that's the projects noted at > https://lore.kernel.org/git/20201027033518.GH2645313@google.com/ ? > > I cloned the openmrn.git repository noted there, and checkout dies with: > > error: invalid path 'applications/clinic_app/targets/linux.x86/.gitignore' > fatal: Could not reset index file to revision 'HEAD'. > > I'm running a recent-ish snapshot of next at d98b1dd5eaa7, so with your > verify_path() change in current "seen". > > So this series changes nothing about the checkout, just the fsck check? Right. > I see there's your > https://lore.kernel.org/git/20201027075853.GH3005508@coredump.intra.peff.net/#t > to improve the !!symlink() codepath in apply.c That's a somewhat orthogonal approach, that tries to look at out-of-tree symlinks (rather than banning all symlinks for these files). I think it's worth banning them all, though; they don't actually work as you'd expect (i.e., whenever we read them in-repo the symlinks do nothing). > Still, it seems like a rather jarring gap in implementation to just warn > about this in fsck for the benefit of e.g. server operations, but then > hard die on the current client. It's not for the benefits of servers. It's because the solution for them is to stop symlinking, which fixes clone/checkout of new commits. But they'll still have those old trees hanging around in their history. If everybody rejects them, then it becomes difficult to push/fetch at all. That said, they'd probably want to checkout those old commits, too. So we probably do need a config override, even if it's a broad one ("trust me, this repo is OK, just allow symlinks for these special files"). > There seems to be no way around that hard die, and both repos in that > report are ones that are just symlinking .gitignore to a > ../somedir/.gitignore deep in their own tree. > > So aren't we both making the fsck check too loose and the client too > strict? Would anyone care if this was an error on fsck if we did the "is > outside repo?" check? An outside-the-repo check would probably be less invasive, but: - it still allows broken setups - it's significantly more complex. I know that the implementation I showed errs on the side of complaining in at least some cases (because it doesn't know if intermediate paths are themselves symlinks). But I'd worry there are also cases where it covers too little, nullifying the protection. -Peff
Jeff King <peff@peff.net> writes: > That said, they'd probably want to checkout those old commits, too. So > we probably do need a config override, even if it's a broad one ("trust > me, this repo is OK, just allow symlinks for these special files"). Is this about the check that is overly strict for some existing projects that kept the jk/symlinked-dotgitx-files topic in the 'seen' so far? On the fsck end, we know we can demote the error level per repository, but I wonder if we should make checkout/clone honor the same setting? I think GITMODULES_SYMLINK has been there for quite some time at "error" level and we do want to discourage it to be a symbolic link, so I am not quite sure what the demoting of these two achieves. Why aren't we having a similar issue on .gitmodules that is a symbolic link? Thanks.
On Mon, Feb 15, 2021 at 08:16:00PM -0500, Jeff King wrote: > That said, they'd probably want to checkout those old commits, too. So > we probably do need a config override, even if it's a broad one ("trust > me, this repo is OK, just allow symlinks for these special files"). Another option here is setting core.symlinks to false. That works more broadly than just the one symlink, though. It might be possible to apply that same setting (perhaps automatically, even) to just these .gitattributes, etc, metafiles. That may get tricky, though, as we'd need to do it not just on checkout, but any time we're considering the file (because we wouldn't want "git add" to re-add it as a non-symlink, nor git-diff to report it, etc). > > So aren't we both making the fsck check too loose and the client too > > strict? Would anyone care if this was an error on fsck if we did the "is > > outside repo?" check? > > An outside-the-repo check would probably be less invasive, but: > > - it still allows broken setups > > - it's significantly more complex. I know that the implementation I > showed errs on the side of complaining in at least some cases > (because it doesn't know if intermediate paths are themselves > symlinks). But I'd worry there are also cases where it covers too > little, nullifying the protection. Adding to the "complexity" point: it's also impossible to implement via fsck, where we do not have the full path of the tree entry. We could live without the fsck support if need be, though. I am beginning to wonder if just opening them all with O_NOFOLLOW (and a hacky 2-syscall fallback for portability) might be less ugly than all of this. -Peff
On Mon, Feb 15, 2021 at 05:56:50PM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > That said, they'd probably want to checkout those old commits, too. So > > we probably do need a config override, even if it's a broad one ("trust > > me, this repo is OK, just allow symlinks for these special files"). > > Is this about the check that is overly strict for some existing > projects that kept the jk/symlinked-dotgitx-files topic in the > 'seen' so far? Yes. > On the fsck end, we know we can demote the error level per > repository, but I wonder if we should make checkout/clone honor the > same setting? What would the default be? If it's permissive, then it feels like we are not really solving much, as anybody who wanted to be careful can already inspect the tree contents. This is about avoiding surprises in the default config. If it's to forbid by default, then yes, I think the "trust me this repo is OK" I gave above would be a viable path forward. > I think GITMODULES_SYMLINK has been there for quite some time at > "error" level and we do want to discourage it to be a symbolic link, > so I am not quite sure what the demoting of these two achieves. Why > aren't we having a similar issue on .gitmodules that is a symbolic > link? I think it's just less common to have symlinked .gitmodules. To be clear, I think symlinked .gitignore is also pretty uncommon. Back when we discussed this originally in 2018 I scanned most of GitHub and came up with only a handful of repositories that did so. -Peff
diff --git a/fsck.c b/fsck.c index d0a201348d..c75c7d7dc7 100644 --- a/fsck.c +++ b/fsck.c @@ -67,8 +67,6 @@ static struct oidset gitmodules_done = OIDSET_INIT; FUNC(GITMODULES_URL, ERROR) \ FUNC(GITMODULES_PATH, ERROR) \ FUNC(GITMODULES_UPDATE, ERROR) \ - FUNC(GITIGNORE_SYMLINK, ERROR) \ - FUNC(GITATTRIBUTES_SYMLINK, ERROR) \ /* warnings */ \ FUNC(BAD_FILEMODE, WARN) \ FUNC(EMPTY_NAME, WARN) \ @@ -81,6 +79,8 @@ static struct oidset gitmodules_done = OIDSET_INIT; FUNC(NUL_IN_COMMIT, WARN) \ /* infos (reported as warnings, but ignored by default) */ \ FUNC(GITMODULES_PARSE, INFO) \ + FUNC(GITIGNORE_SYMLINK, INFO) \ + FUNC(GITATTRIBUTES_SYMLINK, INFO) \ FUNC(BAD_TAG_NAME, INFO) \ FUNC(MISSING_TAGGER_ENTRY, INFO) \ /* ignored (elevated when requested) */ \ diff --git a/t/t7450-bad-dotgitx-files.sh b/t/t7450-bad-dotgitx-files.sh index 326b34e167..4b1edb150e 100755 --- a/t/t7450-bad-dotgitx-files.sh +++ b/t/t7450-bad-dotgitx-files.sh @@ -140,6 +140,16 @@ test_expect_success 'index-pack --strict works for non-repo pack' ' ' check_forbidden_symlink () { + fsck_must_fail=test_must_fail + fsck_prefix=error + case "$1" in + --fsck-warning) + fsck_must_fail= + fsck_prefix=warning + shift + ;; + esac + name=$1 type=$2 path=$3 @@ -172,8 +182,8 @@ check_forbidden_symlink () { # Check not only that we fail, but that it is due to the # symlink detector - test_must_fail git fsck 2>output && - test_i18ngrep "tree $tree: ${name}Symlink" output + $fsck_must_fail git fsck 2>output && + test_i18ngrep "$fsck_prefix.*tree $tree: ${name}Symlink" output ) ' @@ -193,13 +203,13 @@ check_forbidden_symlink gitmodules vanilla .gitmodules check_forbidden_symlink gitmodules ntfs ".gitmodules ." check_forbidden_symlink gitmodules hfs ".${u200c}gitmodules" -check_forbidden_symlink gitattributes vanilla .gitattributes -check_forbidden_symlink gitattributes ntfs ".gitattributes ." -check_forbidden_symlink gitattributes hfs ".${u200c}gitattributes" +check_forbidden_symlink --fsck-warning gitattributes vanilla .gitattributes +check_forbidden_symlink --fsck-warning gitattributes ntfs ".gitattributes ." +check_forbidden_symlink --fsck-warning gitattributes hfs ".${u200c}gitattributes" -check_forbidden_symlink gitignore vanilla .gitignore -check_forbidden_symlink gitignore ntfs ".gitignore ." -check_forbidden_symlink gitignore hfs ".${u200c}gitignore" +check_forbidden_symlink --fsck-warning gitignore vanilla .gitignore +check_forbidden_symlink --fsck-warning gitignore ntfs ".gitignore ." +check_forbidden_symlink --fsck-warning gitignore hfs ".${u200c}gitignore" test_expect_success 'fsck detects non-blob .gitmodules' ' git init non-blob &&
We recently added fsck checks to complain about symlinked .gitignore and .gitattributes files, which are no longer allowed to be checked out. This is partially to inform fsck users of the problem, but also to protect older clients from receiving them (by blocking push and fetch via transfer.fsckObjects). While there are some minor security implications to having these files be symlinks, this is out-weighed by the inconvenience of blocking historical commits in some projects that might include them. Let's loosen the fsck check to a warning. It will continue to be reported by both git-fsck and transfer.fsckObjects, but will not impact the exit code or the acceptance of objects. Note that internally in fsck.c this is called "INFO", but the word "warning" will appear in user-visible output. Signed-off-by: Jeff King <peff@peff.net> --- fsck.c | 4 ++-- t/t7450-bad-dotgitx-files.sh | 26 ++++++++++++++++++-------- 2 files changed, 20 insertions(+), 10 deletions(-)