Message ID | pull.1101.v2.git.1640114048.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Sparse checkout: fix bug with worktree of bare repo | expand |
On Tue, Dec 21, 2021 at 2:14 PM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > The root cause is that 'git sparse-checkout init' writes to the worktree > config without checking that core.bare might need to be set. This only > matters when the base repository is bare, since creating the config.worktree > file and enabling extensions.worktreeConfig will cause Git to treat the base > repo's core.bare=false as important for this worktree. I'm having trouble understanding what this is trying to say. Did you mean "true" rather than "false" in the final sentence? Anyhow, I think this description is somewhat stale. A more succinct way to describe the issue is that `git sparse-checkout init` wasn't correctly upgrading the repo to per-worktree configuration, with the result that the `core.bare=true` setting of a bare repo bled into worktree-specific configuration, which caused a bit of havoc. This patch series fixes `init` to upgrade the repo properly. > The critical bits are in Patches 3, 4, and 5 which introduce a helper for > upgrading to worktree config, a helper to write to worktree config, and then > consume that config helper in builtin/sparse-checkout.c and sparse-index.c. > > Update in v2 > ============ > > * Eric correctly pointed out that I was writing core.bare incorrectly. It > should move out of the core config and into the core repository's > worktree config. > * Patch 3 is new, separating the "upgrade" logic out of config.c, but it is > still called by the config helper to make it painless to write worktree > config. It's good to see the "upgrade to per-worktree config" split out to a standalone single-purpose utility function. I left several review comments in that patch, the most important of which is that the implementation is incomplete (because it doesn't relocate `core.worktree`), thus it can leave the repo in an inconsistent and broken state. Several of the other review comments are actionable, as well. I'm still concerned about and uncomfortable with the implementation of the new repo_config_set_worktree_gently(), but I've left ample comments about that elsewhere in this discussion, so I needn't go into it here. Thanks for working on this.
On Wed, Dec 22, 2021 at 8:00 AM Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com> wrote: > > This patch series includes a fix to the bug reported by Sean Allred [1] and > diagnosed by Eric Sunshine [2]. > > The root cause is that 'git sparse-checkout init' writes to the worktree > config without checking that core.bare might need to be set. This only > matters when the base repository is bare, since creating the config.worktree > file and enabling extensions.worktreeConfig will cause Git to treat the base > repo's core.bare=false as important for this worktree. > > This series fixes this, but also puts in place some helpers to prevent this > from happening in the future. While here, some of the config paths are > modified to take a repository struct. > > The critical bits are in Patches 3, 4, and 5 which introduce a helper for > upgrading to worktree config, a helper to write to worktree config, and then > consume that config helper in builtin/sparse-checkout.c and sparse-index.c. Based on the description I went and fetched the patch series and tried it out. This feels like a bandaid to me. In addition to fixating on core.bare (thus overlooking core.worktree), it also overlooks that people can use worktrees without using sparse-checkout. What if they do something like: git clone --bare $URL myrepo cd myrepo git worktree add foo git worktree add bar git worktree add baz ... days/weeks later ... cd foo git config extensions.worktreeConfig true git config status.showUntrackedFiles no # Or other config options ... hours/days later .. cd ../bar git status At this point the user gets "fatal: this operation must be run in a work tree". And it's much, much worse if the original clone was not bare, but had core.worktree pointing somewhere else (because then the `git status` invocation will show the differences between the *other* worktree with the HEAD of *this* worktree). I think that "git worktree add" should check if either core.bare is false or core.worktree is set, and if so then set extensions.worktreeConfig and migrate the relevant config. While there may be some concern about non-git clients not understanding extensions.worktreeConfig, I'd say that this type of situation is one where we are just flat broken without it. Without it, we're stuck in a situation that will: (a) confuse users ("Why does core.bare/core.worktree seem to get ignored or have incorrect values?"), (b) confuse non-git clients (are they really going to have the tricky logic to overrule core.bare/core.worktree when in another worktree?), (c) confuse git itself after subsequent configuration tweaks, and (d) (related to item c) lead to ever more complicated logic in core git to attempt to know when and how to overrule core.bare and core.worktree as additional concerns enter the picture (e.g. will someone contribute code to override core.bare/core.worktree when run from a worktree with extensions.worktreeConfig=true, just as someone originally wrote code to override core.bare/core.worktree when the extensions.worktreeConfig setting wasn't present?) I also think `git worktree add` should handle additional configuration items related to sparse checkouts (as we've discussed elsewhere in the past), but that's going a bit outside the scope of this series; I only mention it so that we're aware the functionality added to `git worktree add` will be getting some additions in the future.
On Wed, Dec 22, 2021 at 5:54 PM Elijah Newren <newren@gmail.com> wrote: > On Wed, Dec 22, 2021 at 8:00 AM Derrick Stolee via GitGitGadget > <gitgitgadget@gmail.com> wrote: > > This patch series includes a fix to the bug reported by Sean Allred [1] and > > diagnosed by Eric Sunshine [2]. > > This feels like a bandaid to me. In addition to fixating on core.bare > (thus overlooking core.worktree), it also overlooks that people can > use worktrees without using sparse-checkout. What if they do > something like: > > git clone --bare $URL myrepo > cd myrepo > git worktree add foo > git worktree add bar > git worktree add baz > ... days/weeks later ... > cd foo > git config extensions.worktreeConfig true > git config status.showUntrackedFiles no # Or other config options > ... hours/days later .. > cd ../bar > git status > > At this point the user gets "fatal: this operation must be run in a > work tree". Your example indeed leads to a broken state because it doesn't follow the instructions given by git-worktree.txt for enabling `extensions.worktreeConfig`, which involves additional bookkeeping operations beyond merely setting that config variable. It is exactly this sort of situation which prompted me to suggest several times[1,2,3] in the conversation following my diagnosis of the problem, as well as in my reviews of this series, that we may want to add a git-worktree subcommand which does all the necessary bookkeeping to enable `extensions.worktreeConfig` rather than expecting users to handle it all manually. In [1], I called this hypothetical command `git worktree manage --enable-worktree-config ` and in [4], I called it `git worktree config --enable-per-worktree` (not because I like either name, but because I couldn't think of anything better). > I think that "git > worktree add" should check if either core.bare is false or > core.worktree is set, and if so then set extensions.worktreeConfig and > migrate the relevant config. (I think you meant "...if either core.bare is _true_ or...".) Similar to my response to Sean in [1] and to Stolee in [2], while this may help the situation for worktrees created _after_ `extensions.worktreeConfig` is enabled, it does _not_ help existing worktrees at all. For this reason, in my opinion, `git worktree add` is simply not the correct place to be addressing this problem, and it's why I suggested a separate command for enabling the feature and doing all the necessary bookkeeping. It's also why I suggested[2] that in the long run, we may want per-worktree config to be the default (and only) behavior rather than the current (legacy) behavior of all config being shared between worktrees. Aside from that, I'm uncomfortable with the suggestion that `git worktree add` should be responsible for making these sort of dramatic changes (upgrading to version=1 and enabling `extensions.worktreeConfig`) to the repository automatically. That seems very much out of scope for what this command should be doing. On the other hand, I would have no problem with `git worktree add` protecting users by detecting whether `core.bare=true` or `core.worktree` is set in the shared .git/config file and aborting with an error if so, and giving a "HINT" telling the user to enable per-worktree config via the (hypothetical) `git worktree config --enable-per-worktree` command. Regarding your feeling that this patch series is a "band-aid", while I agree with you that we ultimately need a better approach, such as the hypothetical `git worktree config --enable-per-worktree` (or eventually making per-worktree config be the default), that better solution does not need to be implemented today, and certainly shouldn't derail _this_ patch series which is aimed at fixing a very real bug which exists presently in `git sparse-checkout init`. This patch series does need a good number of improvements and fixes before it is ready -- as indicated by my v2 review comments[4,5,6], the most obvious of which is its missing handling of `core.worktree` -- but I do think this series is headed in the correct direction by focusing on fixing the immediate problem with `git sparse-checkout init` (and paving the way for an eventual more complete solution, such as `git worktree config --enable-per-worktree `). [1]: https://lore.kernel.org/git/CAPig+cQ6U_yFw-X2OWrizB1rbCvc4bNxuSzKFzmoLNnm0GH8Eg@mail.gmail.com/ [2]: https://lore.kernel.org/git/CAPig+cQPUe9REf+wgVNjyak_nk3V361h-48rTFgk6TGC7vJgOA@mail.gmail.com/ [3]: https://lore.kernel.org/git/CAPig+cRombN-8g0t7Hs9qQypJoY41gK3+kvypH4D0G6EB4JgbQ@mail.gmail.com/ [4]: https://lore.kernel.org/git/CAPig+cQrJ9yWjkc8VMu=uyx_qtrXdL3cNnxLVafoxOo6e-r9kw@mail.gmail.com/ [5]: https://lore.kernel.org/git/CAPig+cRi2SA6+poaemY8XR5ZoMweuztfiENpcRVOCnukg3Qa7w@mail.gmail.com/ [6]: https://lore.kernel.org/git/CAPig+cRuY40RNi4bC3CBfghLLqz74VUPRbaYJYEhmF78b0GfPQ@mail.gmail.com/#t > I also think `git worktree add` should handle additional configuration > items related to sparse checkouts (as we've discussed elsewhere in the > past), but that's going a bit outside the scope of this series; I only > mention it so that we're aware the functionality added to `git > worktree add` will be getting some additions in the future. I vaguely recall some mention of this not long ago on the list but didn't follow the discussion at all. Do you have pointers or a summary?
On Mon, Dec 27, 2021 at 2:15 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > On Wed, Dec 22, 2021 at 5:54 PM Elijah Newren <newren@gmail.com> wrote: > > I think that "git > > worktree add" should check if either core.bare is false or > > core.worktree is set, and if so then set extensions.worktreeConfig and > > migrate the relevant config. > > Similar to my response to Sean in [1] and to Stolee in [2], while this > may help the situation for worktrees created _after_ > `extensions.worktreeConfig` is enabled, it does _not_ help existing > worktrees at all. For this reason, in my opinion, `git worktree add` > is simply not the correct place to be addressing this problem, and > it's why I suggested a separate command for enabling the feature and > doing all the necessary bookkeeping. It's also why I suggested[2] that > in the long run, we may want per-worktree config to be the default > (and only) behavior rather than the current (legacy) behavior of all > config being shared between worktrees. Thinking upon it further, I take back what I said about it not helping existing worktrees. Your proposal is _almost_ the same as my suggestion of eventually making per-worktree config the default. The difference is that you're only making it the default if `core.bare=true` or `core.worktree` is set. But do we need that distinction? If people are comfortable with that, then are they comfortable with simply flipping the switch and making per-worktree config the default today regardless of `core.bare` and `core.worktree`? I'm not sure that I'm comfortable with it due to the potential of breaking older versions of git which don't understand `extensions.worktreeConfig`, as well as breaking third-party tools, but maybe other people feel differently?
On Sun, Dec 26, 2021 at 11:15 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Wed, Dec 22, 2021 at 5:54 PM Elijah Newren <newren@gmail.com> wrote: > > On Wed, Dec 22, 2021 at 8:00 AM Derrick Stolee via GitGitGadget > > <gitgitgadget@gmail.com> wrote: > > > This patch series includes a fix to the bug reported by Sean Allred [1] and > > > diagnosed by Eric Sunshine [2]. > > > > This feels like a bandaid to me. In addition to fixating on core.bare > > (thus overlooking core.worktree), it also overlooks that people can > > use worktrees without using sparse-checkout. What if they do > > something like: > > > > git clone --bare $URL myrepo > > cd myrepo > > git worktree add foo > > git worktree add bar > > git worktree add baz > > ... days/weeks later ... > > cd foo > > git config extensions.worktreeConfig true > > git config status.showUntrackedFiles no # Or other config options > > ... hours/days later .. > > cd ../bar > > git status > > > > At this point the user gets "fatal: this operation must be run in a > > work tree". > > Your example indeed leads to a broken state because it doesn't follow > the instructions given by git-worktree.txt for enabling > `extensions.worktreeConfig`, which involves additional bookkeeping > operations beyond merely setting that config variable. These are instructions which neither Stolee nor I was aware of prior to your pointing it out. Not only had we often flipped that variable, we did so for many of our users (I know I did for some time prior to Stolee introducing sparse-checkout). With `git sparse-checkout` we propagated that to many more users, and now have many repositories out in the wild that have been set up for _years_ in violation of these instructions. So, even if Stolee and I are independently particularly bad about noticing the relevant documentation, we now have a situation where people can discover this misconfiguration just by looking around in their config. Once they notice it, they may well copy it elsewhere. I'd suspect that Stolee and I are actually _more_ likely to be aware of relevant documentation than the average Git user, so if we missed it, I suspect many of them will. Especially now that we've amplified their opportunities for discovering repositories set up in contravention to that documentation. So, I don't think relying on folks to read this particular piece of documentation is a reliable course of action...at least not without some changes to make it much more likely to be noticed. > It is exactly > this sort of situation which prompted me to suggest several > times[1,2,3] in the conversation following my diagnosis of the > problem, as well as in my reviews of this series, that we may want to > add a git-worktree subcommand which does all the necessary bookkeeping > to enable `extensions.worktreeConfig` rather than expecting users to > handle it all manually. In [1], I called this hypothetical command > `git worktree manage --enable-worktree-config ` and in [4], I called > it `git worktree config --enable-per-worktree` (not because I like > either name, but because I couldn't think of anything better). How would users discover this new command and use it? Is it any more reliably discoverable than the above documentation? Your suggestion sounds to me like "We know this command will break things, so we'll provide another command they can use to avoid the breakage, and hope they notice this new command and use it." I'm sure that's not your intent, and perhaps there's a way of making this suggestion robust, but to me it just sounds like it leads to inevitable breakage. I'd rather just fix the command that can break things. > > I think that "git > > worktree add" should check if either core.bare is false or > > core.worktree is set, and if so then set extensions.worktreeConfig and > > migrate the relevant config. > > (I think you meant "...if either core.bare is _true_ or...".) Yes, indeed. > Similar to my response to Sean in [1] and to Stolee in [2], while this > may help the situation for worktrees created _after_ > `extensions.worktreeConfig` is enabled, it does _not_ help existing > worktrees at all. For this reason, in my opinion, `git worktree add` > is simply not the correct place to be addressing this problem, and > it's why I suggested a separate command for enabling the feature and > doing all the necessary bookkeeping. It's also why I suggested[2] that > in the long run, we may want per-worktree config to be the default > (and only) behavior rather than the current (legacy) behavior of all > config being shared between worktrees. > > Aside from that, I'm uncomfortable with the suggestion that `git > worktree add` should be responsible for making these sort of dramatic > changes (upgrading to version=1 and enabling > `extensions.worktreeConfig`) to the repository automatically. That > seems very much out of scope for what this command should be doing. On > the other hand, I would have no problem with `git worktree add` > protecting users by detecting whether `core.bare=true` or > `core.worktree` is set in the shared .git/config file and aborting > with an error if so, and giving a "HINT" telling the user to enable > per-worktree config via the (hypothetical) `git worktree config > --enable-per-worktree` command. > > Regarding your feeling that this patch series is a "band-aid", while I > agree with you that we ultimately need a better approach, such as the > hypothetical `git worktree config --enable-per-worktree` (or > eventually making per-worktree config be the default), that better > solution does not need to be implemented today, and certainly > shouldn't derail _this_ patch series which is aimed at fixing a very > real bug which exists presently in `git sparse-checkout init`. This > patch series does need a good number of improvements and fixes before > it is ready -- as indicated by my v2 review comments[4,5,6], the most > obvious of which is its missing handling of `core.worktree` -- but I > do think this series is headed in the correct direction by focusing on > fixing the immediate problem with `git sparse-checkout init` (and > paving the way for an eventual more complete solution, such as `git > worktree config --enable-per-worktree `). Looks like you've changed your opinion a bit and it'd be better for me to respond to these parts in your follow-up email. > [1]: https://lore.kernel.org/git/CAPig+cQ6U_yFw-X2OWrizB1rbCvc4bNxuSzKFzmoLNnm0GH8Eg@mail.gmail.com/ > [2]: https://lore.kernel.org/git/CAPig+cQPUe9REf+wgVNjyak_nk3V361h-48rTFgk6TGC7vJgOA@mail.gmail.com/ > [3]: https://lore.kernel.org/git/CAPig+cRombN-8g0t7Hs9qQypJoY41gK3+kvypH4D0G6EB4JgbQ@mail.gmail.com/ > [4]: https://lore.kernel.org/git/CAPig+cQrJ9yWjkc8VMu=uyx_qtrXdL3cNnxLVafoxOo6e-r9kw@mail.gmail.com/ > [5]: https://lore.kernel.org/git/CAPig+cRi2SA6+poaemY8XR5ZoMweuztfiENpcRVOCnukg3Qa7w@mail.gmail.com/ > [6]: https://lore.kernel.org/git/CAPig+cRuY40RNi4bC3CBfghLLqz74VUPRbaYJYEhmF78b0GfPQ@mail.gmail.com/#t > > > I also think `git worktree add` should handle additional configuration > > items related to sparse checkouts (as we've discussed elsewhere in the > > past), but that's going a bit outside the scope of this series; I only > > mention it so that we're aware the functionality added to `git > > worktree add` will be getting some additions in the future. > > I vaguely recall some mention of this not long ago on the list but > didn't follow the discussion at all. Do you have pointers or a > summary? For the microsoft repositories, sparse-checkouts are needed because a full checkout is unmanageable (millions of files to check out otherwise). For other repositories, full checkouts might technically be manageable, but are annoyingly slow and users may only want to work with sparse checkouts (and for some of these, due to various mono-repoization efforts, the repository is growing towards a size where manageability of full checkouts is decreasing). The fact that `git worktree add` does a full checkout is quite painful...possibility to the point of making worktrees useless for some users. I think `git worktree add` should copy the sparsity of the worktree from which it was invoked. Addressing potential questions/objections to this proposal: * just requiring users to do a full checkout first is very unfriendly: the checkout might not even fit in available disk space, and even if it does fit, this has the performance penalty of inflating and writing all files to disk only to delete a (vast?) majority of them immediately after. Users have shown a willingness to swallow a lot of pain trying to figure out how to avoid that performance penalty. * full-checkout: If users do want a full checkout of the new worktree despite running from a sparse-checkout, it's a single command away (`git sparse-checkout disable` or `<sparsity-wrapper-script> --undo`). And in that case, the invoked commands don't do huge amounts of unnecessary work. * using --no-checkout as a proxy: This means no files checked out and no index file. The lack of an index file makes it appear that everything was manually deleted (with the deletion staged). Also, if the project is using some kind of <sparsity-wrapper-script> (e.g. for determining dependencies between directories so that appropriate 'modules' can be specified and transformed into a list of directories passed to sparse-checkout), then the sparsity-wrapper-script isn't available to them to invoke. If users try to check out just the wrapper file, then an index will be created and have just one entry and we kind of cement the fact that all other files look like they were intended to be deleted. Also, even if the user runs `git sparse-checkout init --cone`, you don't actually don't transform this no-checkout into a sparse checkout because sparse-checkout doesn't want to undo your staged deletions. Despite the fact that I'm very familiar with all the implementation internals, it was not obvious to me all the necessary additional commands needed for users to get a sparse checkout while making use of --no-checkout. Users stand little chance of figuring the necessary command invocations out without a huge amount of effort (and they've given up and come to me before asking for help, and my first response turned out to be incomplete in various cases...).
On Sun, Dec 26, 2021 at 11:34 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Mon, Dec 27, 2021 at 2:15 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Wed, Dec 22, 2021 at 5:54 PM Elijah Newren <newren@gmail.com> wrote: > > > I think that "git > > > worktree add" should check if either core.bare is false or > > > core.worktree is set, and if so then set extensions.worktreeConfig and > > > migrate the relevant config. > > > > Similar to my response to Sean in [1] and to Stolee in [2], while this > > may help the situation for worktrees created _after_ > > `extensions.worktreeConfig` is enabled, it does _not_ help existing > > worktrees at all. For this reason, in my opinion, `git worktree add` > > is simply not the correct place to be addressing this problem, and > > it's why I suggested a separate command for enabling the feature and > > doing all the necessary bookkeeping. It's also why I suggested[2] that > > in the long run, we may want per-worktree config to be the default > > (and only) behavior rather than the current (legacy) behavior of all > > config being shared between worktrees. > > Thinking upon it further, I take back what I said about it not helping > existing worktrees. > > Your proposal is _almost_ the same as my suggestion of eventually > making per-worktree config the default. The difference is that you're > only making it the default if `core.bare=true` or `core.worktree` is > set. Indeed. :-) > But do we need that distinction? If people are comfortable with > that, then are they comfortable with simply flipping the switch and > making per-worktree config the default today regardless of `core.bare` > and `core.worktree`? This is tempting, at least if we leave core.repositoryFormatVersion as 0 (see 11664196ac ("Revert "check_repository_format_gently(): refuse extensions for old repositories"", 2020-07-15)) when core.bare is false and core.worktree was unset. However, for that case: * This is a case where operating on the primary worktree was not previously problematic for older git versions or third party tools. * Interestingly, git <= 2.6.2 can continue to operate on the primary worktree (because it didn't know to error out on unknown extensions) * git >= 2.19.0 could continue to operate on the primary worktree (because it understands the extension) * git versions between that range would suddenly break, erroring out on the unknown extension (though those versions would start working again if we migrated core.bare and core.worktree but just didn't set extensions.worktreeConfig). > I'm not sure that I'm comfortable with it due to the potential of > breaking older versions of git which don't understand > `extensions.worktreeConfig`, as well as breaking third-party tools, > but maybe other people feel differently? The distinction I made was particularly written with third party tools and older versions of git in mind, to allow them to continue to safely operate when possible. But let's flesh it out a little: * core.bare = false AND core.worktree is unset (i.e. a typical non-bare clone), AND we try to add a worktree: we have _years_ of in-the-wild usage showing that old git versions and third party tools operate fine without migrating the config. Leave it in place and do not set extensions.worktreeConfig and do not upgrade core.repositoryFormatVersion. * (core.bare = true OR core.worktree is set) AND we try to add a worktree: all third party tools and all git versions (old and new) are broken anyway. Flip the switch (upgrade core.repositoryFormatVersion to 1, set extensions.worktreeConfig=true, and migrate core.bare/core.worktree for main repo and newly created worktree), which at least allows new git versions and new tools to work correctly, and will hopefully cause old tools to error out since this is a configuration they won't handle correctly. Further: * Toggling extensions.worktreeConfig=true for the first time is rather trivial for users to try; for years they have been able to do so without making _any_ other changes and expect things to continue to work (assuming new enough git versions and third-party tools). They have likely disseminated this information to other users. I certainly did. If we are going to expect more of anyone toggling this option, we need lots of documentation and perhaps code changes to help shore up the path. I'd rather just allow folks to continue to do such toggling. * Toggling either core.bare or core.worktree in an existing clone requires significant additional manual work to make things consistent. Because it requires a lot more knowledge and work, I think the burden should be on these users to know about the ramifications with existing worktrees. (I've never heard of a user other than myself attempt to toggle these; I'm sure there are some, it just seems quite rare.)
On Mon, Dec 27, 2021 at 2:35 PM Elijah Newren <newren@gmail.com> wrote: > On Sun, Dec 26, 2021 at 11:15 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > Your example indeed leads to a broken state because it doesn't follow > > the instructions given by git-worktree.txt for enabling > > `extensions.worktreeConfig`, which involves additional bookkeeping > > operations beyond merely setting that config variable. > > These are instructions which neither Stolee nor I was aware of prior > to your pointing it out. [...] > I'd suspect that Stolee and I are actually _more_ likely to be aware > of relevant documentation than the average Git user, so if we missed > it, I suspect many of them will. [...] I wasn't originally aware of the bookkeeping instructions either. In fact, for a good while, I wasn't even aware that Duy had implemented per-worktree configuration or that there was such a thing. I either must have been entirely off-list during the implementation or was not in a position to follow changes to the project at the time. I only became aware of per-worktree configuration at some point in the last two or three years when someone asked some question on the list related to the feature and I ended up diving into the documentation, the source code, and the patches themselves in order to understand it fully -- because I think I didn't understand it merely from reading the documentation which is rather sparse (no pun intended). And I had forgotten enough about it since then that I had to re-research it when Sean raised the issue on the list a few days back in relation to sparse-checkout. > So, I don't think relying on folks to read this particular piece of > documentation is a reliable course of action...at least not without > some changes to make it much more likely to be noticed. The sparsity of documentation about per-worktree configuration certainly doesn't help, nor the fact that it's fairly near the end of git-worktree.txt, at which point people may have given up reading. We could make it a bit more prominent by mentioning it early in the command description, but it still involves enough fiddly bookkeeping that it likely will continue to be problematic. Making per-worktree configuration the default does seem like the best long-term solution. Doing so should make all these problems go away. I don't know what Duy's plans were, nor whether he had some migration strategy planned. > > I vaguely recall some mention of this not long ago on the list but > > didn't follow the discussion at all. Do you have pointers or a > > summary? > > For the microsoft repositories, sparse-checkouts are needed because a > full checkout is unmanageable (millions of files to check out > otherwise). For other repositories, full checkouts might technically > be manageable, but are annoyingly slow and users may only want to work > with sparse checkouts (and for some of these, due to various > mono-repoization efforts, the repository is growing towards a size > where manageability of full checkouts is decreasing). > > The fact that `git worktree add` does a full checkout is quite > painful...possibility to the point of making worktrees useless for > some users. I think `git worktree add` should copy the sparsity of > the worktree from which it was invoked. Okay, I do recall reading a message in which you proposed this, though I didn't understand the reasoning for the suggestion since I wasn't following the discussion. The explanation you provide here makes the proposal understandable. > * using --no-checkout as a proxy: This means no files checked out > and no index file. The lack of an index file makes it appear that > everything was manually deleted (with the deletion staged). Also, if > the project is using some kind of <sparsity-wrapper-script> (e.g. for > determining dependencies between directories so that appropriate > 'modules' can be specified and transformed into a list of directories > passed to sparse-checkout), then the sparsity-wrapper-script isn't > available to them to invoke. If users try to check out just the > wrapper file, then an index will be created and have just one entry > and we kind of cement the fact that all other files look like they > were intended to be deleted. Also, even if the user runs `git > sparse-checkout init --cone`, you don't actually don't transform this > no-checkout into a sparse checkout because sparse-checkout doesn't > want to undo your staged deletions. Despite the fact that I'm very > familiar with all the implementation internals, it was not obvious to > me all the necessary additional commands needed for users to get a > sparse checkout while making use of --no-checkout. Users stand little > chance of figuring the necessary command invocations out without a > huge amount of effort (and they've given up and come to me before > asking for help, and my first response turned out to be incomplete in > various cases...). You've clearly put much more thought into this than I have (since I only just read this), so I'm not likely to have any meaningful input, but I'll write down a few thoughts/questions which popped into my head while reading what you wrote. Perhaps they've already been discussed elsewhere, so feel free to ignore (and they may not be worth responding to anyhow). When you say "copy the sparsity of the worktree from which it was invoked", do you mean that literally, such that it special-cases it and only copies sparse-checkout information? An alternative would be to allow the user to specify -- via the shared configuration (.git/config) -- exactly which config keys get inherited/copied over by `git worktree add`. Such a solution would avoid special-casing sparse-checkout and could be useful in the future for other commands which might need such functionality, though this approach may be overengineered. A more general approach might be for the new worktree to copy all the per-worktree configuration from the worktree in which the command was invoked, thus sparsity would be inherited "for free" along with other settings. This has the benefits of not requiring sparse-checkout special-cases in the code and it's easy to document ("the new worktree inherits/copies configuration settings from the worktree in which `git worktree add` was invoked") and easy to understand. I also wondered if adding some sort of `--sparse-checkout=...` option to `git worktree add` would solve this particular dilemma, thus allowing the user to configure custom sparse-checkout for the worktree as it is being created. I also very briefly wondered if this should instead be a feature of the `git sparse-checkout` command itself, such as `git sparse-checkout add-worktree`, but I think that's probably a dead-end in terms of user discoverability, whereas `git worktree add --sparse-checkout=...` is more easily discoverable for people wanting to work with worktrees.
On Mon, Dec 27, 2021 at 3:16 PM Elijah Newren <newren@gmail.com> wrote: > On Sun, Dec 26, 2021 at 11:34 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > Your proposal is _almost_ the same as my suggestion of eventually > > making per-worktree config the default. The difference is that you're > > only making it the default if `core.bare=true` or `core.worktree` is > > set. > > Indeed. :-) > > > But do we need that distinction? If people are comfortable with > > that, then are they comfortable with simply flipping the switch and > > making per-worktree config the default today regardless of `core.bare` > > and `core.worktree`? > > This is tempting, at least if we leave core.repositoryFormatVersion as > 0 (see 11664196ac ("Revert "check_repository_format_gently(): refuse > extensions for old repositories"", 2020-07-15)) when core.bare is > false and core.worktree was unset. However, for that case: I'll try to respond to this email when I can find a quiet block of time to really concentrate on what you're saying and reason through it more thoroughly; it will probably require several read-throughs.
On Mon, Dec 27, 2021 at 11:33 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Mon, Dec 27, 2021 at 2:35 PM Elijah Newren <newren@gmail.com> wrote: > > On Sun, Dec 26, 2021 at 11:15 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > > Your example indeed leads to a broken state because it doesn't follow > > > the instructions given by git-worktree.txt for enabling > > > `extensions.worktreeConfig`, which involves additional bookkeeping > > > operations beyond merely setting that config variable. > > > > These are instructions which neither Stolee nor I was aware of prior > > to your pointing it out. [...] > > I'd suspect that Stolee and I are actually _more_ likely to be aware > > of relevant documentation than the average Git user, so if we missed > > it, I suspect many of them will. [...] > > I wasn't originally aware of the bookkeeping instructions either. In > fact, for a good while, I wasn't even aware that Duy had implemented > per-worktree configuration or that there was such a thing. I either > must have been entirely off-list during the implementation or was not > in a position to follow changes to the project at the time. I only > became aware of per-worktree configuration at some point in the last > two or three years when someone asked some question on the list > related to the feature and I ended up diving into the documentation, > the source code, and the patches themselves in order to understand it > fully -- because I think I didn't understand it merely from reading > the documentation which is rather sparse (no pun intended). And I had > forgotten enough about it since then that I had to re-research it when > Sean raised the issue on the list a few days back in relation to > sparse-checkout. > > > So, I don't think relying on folks to read this particular piece of > > documentation is a reliable course of action...at least not without > > some changes to make it much more likely to be noticed. > > The sparsity of documentation about per-worktree configuration > certainly doesn't help, nor the fact that it's fairly near the end of > git-worktree.txt, at which point people may have given up reading. We > could make it a bit more prominent by mentioning it early in the > command description, but it still involves enough fiddly bookkeeping > that it likely will continue to be problematic. Further, it's not clear people even looked at git-worktree.txt at the time they learned about extensions.worktreeConfig. I believe I discovered and started using extensions.worktreeConfig from `git config --help`, which makes no mention of or even reference to the need for any extra steps. (I didn't see the original mailing list discussions around that setting either.) It never occurred to me in the ~3 years since to even look in `git worktree --help` for additional guidance around that config setting until this particular mailing list thread. > Making per-worktree configuration the default does seem like the best > long-term solution. Doing so should make all these problems go away. I > don't know what Duy's plans were, nor whether he had some migration > strategy planned. > > > > I vaguely recall some mention of this not long ago on the list but > > > didn't follow the discussion at all. Do you have pointers or a > > > summary? > > > > For the microsoft repositories, sparse-checkouts are needed because a > > full checkout is unmanageable (millions of files to check out > > otherwise). For other repositories, full checkouts might technically > > be manageable, but are annoyingly slow and users may only want to work > > with sparse checkouts (and for some of these, due to various > > mono-repoization efforts, the repository is growing towards a size > > where manageability of full checkouts is decreasing). > > > > The fact that `git worktree add` does a full checkout is quite > > painful...possibility to the point of making worktrees useless for > > some users. I think `git worktree add` should copy the sparsity of > > the worktree from which it was invoked. > > Okay, I do recall reading a message in which you proposed this, though > I didn't understand the reasoning for the suggestion since I wasn't > following the discussion. The explanation you provide here makes the > proposal understandable. > > > * using --no-checkout as a proxy: This means no files checked out > > and no index file. The lack of an index file makes it appear that > > everything was manually deleted (with the deletion staged). Also, if > > the project is using some kind of <sparsity-wrapper-script> (e.g. for > > determining dependencies between directories so that appropriate > > 'modules' can be specified and transformed into a list of directories > > passed to sparse-checkout), then the sparsity-wrapper-script isn't > > available to them to invoke. If users try to check out just the > > wrapper file, then an index will be created and have just one entry > > and we kind of cement the fact that all other files look like they > > were intended to be deleted. Also, even if the user runs `git > > sparse-checkout init --cone`, you don't actually don't transform this > > no-checkout into a sparse checkout because sparse-checkout doesn't > > want to undo your staged deletions. Despite the fact that I'm very > > familiar with all the implementation internals, it was not obvious to > > me all the necessary additional commands needed for users to get a > > sparse checkout while making use of --no-checkout. Users stand little > > chance of figuring the necessary command invocations out without a > > huge amount of effort (and they've given up and come to me before > > asking for help, and my first response turned out to be incomplete in > > various cases...). > > You've clearly put much more thought into this than I have (since I > only just read this), so I'm not likely to have any meaningful input, > but I'll write down a few thoughts/questions which popped into my head > while reading what you wrote. Perhaps they've already been discussed > elsewhere, so feel free to ignore (and they may not be worth > responding to anyhow). > > When you say "copy the sparsity of the worktree from which it was > invoked", do you mean that literally, such that it special-cases it > and only copies sparse-checkout information? An alternative would be > to allow the user to specify -- via the shared configuration > (.git/config) -- exactly which config keys get inherited/copied over > by `git worktree add`. Such a solution would avoid special-casing > sparse-checkout and could be useful in the future for other commands > which might need such functionality, though this approach may be > overengineered. > > A more general approach might be for the new worktree to copy all the > per-worktree configuration from the worktree in which the command was > invoked, thus sparsity would be inherited "for free" along with other > settings. This has the benefits of not requiring sparse-checkout > special-cases in the code and it's easy to document ("the new worktree > inherits/copies configuration settings from the worktree in which `git > worktree add` was invoked") and easy to understand. Ooh, this is a good point and I *really* like this simple solution. Thanks for pointing it out. Do note, though, that it's more than just config.worktree -- I also want the ${GITDIR}/info/sparse-checkout file copied. > I also wondered if adding some sort of `--sparse-checkout=...` option > to `git worktree add` would solve this particular dilemma, thus > allowing the user to configure custom sparse-checkout for the worktree > as it is being created. I also very briefly wondered if this should > instead be a feature of the `git sparse-checkout` command itself, such > as `git sparse-checkout add-worktree`, but I think that's probably a > dead-end in terms of user discoverability, whereas `git worktree add > --sparse-checkout=...` is more easily discoverable for people wanting > to work with worktrees. This might be a useful extra capability (we'd probably want to keep this flag in sync with git-clone's --sparse flag and whatever capabilities grow there), but I don't see it as a solution to this problem. I think the default needs to be copying the existing sparsity. Making users specify cone/non-cone mode and sparse-index/non-sparse-index and and several dozen directories by hand just doesn't sound reasonable to me. (We have a case with several hundred directories/modules, with various dependencies between them. Users can use a wrapper, `./sparsify --modules $MODULE_A $MODULE_B` which figures out the several dozen relevant directories and calls sparse-checkout set with those, but of course that wrapper won't yet be available in the new worktree until after the new worktree has been added.) An alternative that would be workable, though annoying, is giving the user a super-sparse checkout with only files in the toplevel directory present (i.e. what you'd get after `git sparse-checkout init --cone` or `git clone --sparse ...`), and then making them use the normal tools to manually specify the wanted sparsity (which probably requires switching back to the previous worktree to run some info commands to determine exactly what the sparsity was). An increasingly unworkable alternative is the current behavior of defaulting to a full checkout in all cases (and forcing users to sparsify afterwards). A full checkout is fine if the user came from one (and probably preferable in such a case), but it's increasingly problematic for us even with our repo being nowhere near the size of the microsoft repos.
On Mon, Dec 27, 2021 at 3:16 PM Elijah Newren <newren@gmail.com> wrote: > On Sun, Dec 26, 2021 at 11:34 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > Your proposal is _almost_ the same as my suggestion of eventually > > making per-worktree config the default. The difference is that you're > > only making it the default if `core.bare=true` or `core.worktree` is > > set. > > Indeed. :-) I mentioned previously[1] that I needed to find a block of time to really think through the topic before I'd be able to respond to this email. So, today I spent some time trying to reason through the various cases under discussion, and I came back and re-read this email with the intention of trying to summarize my understanding of the situation and my understanding of the points you were making. However, you did such a good job of summarizing the various cases at the very end of [2] that it probably makes more sense for me to respond to that email instead. [1]: https://lore.kernel.org/git/CAPig+cTFSDw-9Aq+=+r4sHSzTmG7s2T93Z0uqWTxHbKwGFaiYQ@mail.gmail.com/ [2]: https://lore.kernel.org/git/CABPp-BHuO3B366uJuODMQo-y449p8cAMVn0g2MTcO5di3Xa7Zg@mail.gmail.com/ > > But do we need that distinction? If people are comfortable with > > that, then are they comfortable with simply flipping the switch and > > making per-worktree config the default today regardless of `core.bare` > > and `core.worktree`? > > This is tempting, at least if we leave core.repositoryFormatVersion as > 0 (see 11664196ac ("Revert "check_repository_format_gently(): refuse > extensions for old repositories"", 2020-07-15)) when core.bare is > false and core.worktree was unset. However, for that case: I had seen 11664196ac when researching one of my earlier responses, though it took more than one read to (hopefully) fully understand what it is saying (i.e. due to an oversight, it's too late to enforce the `core.repositoryFormatVersion=1` requirement when extensions are used, as originally intended). > * This is a case where operating on the primary worktree was not > previously problematic for older git versions or third party tools. > * Interestingly, git <= 2.6.2 can continue to operate on the primary > worktree (because it didn't know to error out on unknown extensions) > * git >= 2.19.0 could continue to operate on the primary worktree > (because it understands the extension) > * git versions between that range would suddenly break, erroring out > on the unknown extension (though those versions would start working > again if we migrated core.bare and core.worktree but just didn't set > extensions.worktreeConfig). The significance of versions 2.6.2 and 2.19.0 is unclear to me. What context or criteria are you using to identify those versions as meaningful here?
On Tue, Dec 28, 2021 at 1:16 PM Elijah Newren <newren@gmail.com> wrote: > On Mon, Dec 27, 2021 at 11:33 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > The sparsity of documentation about per-worktree configuration > > certainly doesn't help, nor the fact that it's fairly near the end of > > git-worktree.txt, at which point people may have given up reading. We > > could make it a bit more prominent by mentioning it early in the > > command description, but it still involves enough fiddly bookkeeping > > that it likely will continue to be problematic. > > Further, it's not clear people even looked at git-worktree.txt at the > time they learned about extensions.worktreeConfig. I believe I > discovered and started using extensions.worktreeConfig from `git > config --help`, which makes no mention of or even reference to the > need for any extra steps. (I didn't see the original mailing list > discussions around that setting either.) It never occurred to me in > the ~3 years since to even look in `git worktree --help` for > additional guidance around that config setting until this particular > mailing list thread. That's an interesting datapoint. At the very least, we probably should update Documentation/git-config.txt to mention the extra bookkeeping required when setting `extensions.worktreeConfig=true`. > > A more general approach might be for the new worktree to copy all the > > per-worktree configuration from the worktree in which the command was > > invoked, thus sparsity would be inherited "for free" along with other > > settings. This has the benefits of not requiring sparse-checkout > > special-cases in the code and it's easy to document ("the new worktree > > inherits/copies configuration settings from the worktree in which `git > > worktree add` was invoked") and easy to understand. > > Ooh, this is a good point and I *really* like this simple solution. > Thanks for pointing it out. I do wonder, though, if there are traps waiting for us with this all-inclusive approach. I don't know what sort of worktree-specific configuration people use, so I do worry a bit that this could be casting a too-wide net, and that it might in fact be better to only copy the sparse-checkout settings (as ugly as it is to special-case that -- but we need to special-case `core.bare` and `core.worktree` anyhow[1]). [1]: https://lore.kernel.org/git/CAPig+cSUOknNC9GMyPvAqdBU0r1MVgvSpvgpSpXUmBm67HO7PQ@mail.gmail.com/ > Do note, though, that it's more than just config.worktree -- I also > want the ${GITDIR}/info/sparse-checkout file copied. Thanks for pointing that out. I'm reasonably (or completely) ignorant of sparse-checkout since I've never used it nor read the documentation, and I didn't follow the earlier discussions. > > I also wondered if adding some sort of `--sparse-checkout=...` option > > to `git worktree add` would solve this particular dilemma, thus > > allowing the user to configure custom sparse-checkout for the worktree > > as it is being created. I also very briefly wondered if this should > > instead be a feature of the `git sparse-checkout` command itself, such > > as `git sparse-checkout add-worktree`, but I think that's probably a > > dead-end in terms of user discoverability, whereas `git worktree add > > --sparse-checkout=...` is more easily discoverable for people wanting > > to work with worktrees. > > This might be a useful extra capability (we'd probably want to keep > this flag in sync with git-clone's --sparse flag and whatever > capabilities grow there), but I don't see it as a solution to this > problem. I think the default needs to be copying the existing > sparsity. Making users specify cone/non-cone mode and > sparse-index/non-sparse-index and and several dozen directories by > hand just doesn't sound reasonable to me. (We have a case with > several hundred directories/modules, with various dependencies between > them. Users can use a wrapper, `./sparsify --modules $MODULE_A > $MODULE_B` which figures out the several dozen relevant directories > and calls sparse-checkout set with those, but of course that wrapper > won't yet be available in the new worktree until after the new > worktree has been added.) Okay. > An alternative that would be workable, though annoying, is giving the > user a super-sparse checkout with only files in the toplevel directory > present (i.e. what you'd get after `git sparse-checkout init --cone` > or `git clone --sparse ...`), and then making them use the normal > tools to manually specify the wanted sparsity (which probably requires > switching back to the previous worktree to run some info commands to > determine exactly what the sparsity was). Sounds somewhat painful. > An increasingly unworkable alternative is the current behavior of > defaulting to a full checkout in all cases (and forcing users to > sparsify afterwards). A full checkout is fine if the user came from > one (and probably preferable in such a case), but it's increasingly > problematic for us even with our repo being nowhere near the size of > the microsoft repos. It feels unfortunate and a bit dirty to spread around this special-case knowledge about sparse-checkout to various parts of the system, but based upon the pain-points you describe, having a new worktree inherit the sparsity from the originating worktree does sound (given my limited knowledge of the topic) like it would ease the pain for users.
On Wed, Dec 29, 2021 at 10:22 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Mon, Dec 27, 2021 at 3:16 PM Elijah Newren <newren@gmail.com> wrote: > > On Sun, Dec 26, 2021 at 11:34 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > > Your proposal is _almost_ the same as my suggestion of eventually > > > making per-worktree config the default. The difference is that you're > > > only making it the default if `core.bare=true` or `core.worktree` is > > > set. > > > > Indeed. :-) > > I mentioned previously[1] that I needed to find a block of time to > really think through the topic before I'd be able to respond to this > email. So, today I spent some time trying to reason through the > various cases under discussion, and I came back and re-read this email > with the intention of trying to summarize my understanding of the > situation and my understanding of the points you were making. However, > you did such a good job of summarizing the various cases at the very > end of [2] that it probably makes more sense for me to respond to that > email instead. > > [1]: https://lore.kernel.org/git/CAPig+cTFSDw-9Aq+=+r4sHSzTmG7s2T93Z0uqWTxHbKwGFaiYQ@mail.gmail.com/ > [2]: https://lore.kernel.org/git/CABPp-BHuO3B366uJuODMQo-y449p8cAMVn0g2MTcO5di3Xa7Zg@mail.gmail.com/ > > > > But do we need that distinction? If people are comfortable with > > > that, then are they comfortable with simply flipping the switch and > > > making per-worktree config the default today regardless of `core.bare` > > > and `core.worktree`? > > > > This is tempting, at least if we leave core.repositoryFormatVersion as > > 0 (see 11664196ac ("Revert "check_repository_format_gently(): refuse > > extensions for old repositories"", 2020-07-15)) when core.bare is > > false and core.worktree was unset. However, for that case: > > I had seen 11664196ac when researching one of my earlier responses, > though it took more than one read to (hopefully) fully understand what > it is saying (i.e. due to an oversight, it's too late to enforce the > `core.repositoryFormatVersion=1` requirement when extensions are used, > as originally intended). > > > * This is a case where operating on the primary worktree was not > > previously problematic for older git versions or third party tools. > > * Interestingly, git <= 2.6.2 can continue to operate on the primary > > worktree (because it didn't know to error out on unknown extensions) > > * git >= 2.19.0 could continue to operate on the primary worktree > > (because it understands the extension) > > * git versions between that range would suddenly break, erroring out > > on the unknown extension (though those versions would start working > > again if we migrated core.bare and core.worktree but just didn't set > > extensions.worktreeConfig). > > The significance of versions 2.6.2 and 2.19.0 is unclear to me. What > context or criteria are you using to identify those versions as > meaningful here? We had been discussing how certain config settings "might break external tools OR old git versions", but hadn't brought up which tools or which git versions. I don't know which all external tools might be in play (though I mentioned some in use at $DAYJOB in another thread) but in this email I had just thought that I'd mention where the cutoff point was in terms of git versions which understood core.repositoryFormatVersion and extensions.worktreeConfig. If other folks also want to test how things behaved before or after these patches with "old git versions", those were the switchover points that are relevant and which I tested with.
On Wed, Dec 29, 2021 at 10:41 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Tue, Dec 28, 2021 at 1:16 PM Elijah Newren <newren@gmail.com> wrote: > > On Mon, Dec 27, 2021 at 11:33 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > > A more general approach might be for the new worktree to copy all the > > > per-worktree configuration from the worktree in which the command was > > > invoked, thus sparsity would be inherited "for free" along with other > > > settings. This has the benefits of not requiring sparse-checkout > > > special-cases in the code and it's easy to document ("the new worktree > > > inherits/copies configuration settings from the worktree in which `git > > > worktree add` was invoked") and easy to understand. > > > > Ooh, this is a good point and I *really* like this simple solution. > > Thanks for pointing it out. > > I do wonder, though, if there are traps waiting for us with this > all-inclusive approach. I don't know what sort of worktree-specific > configuration people use, so I do worry a bit that this could be > casting a too-wide net, and that it might in fact be better to only > copy the sparse-checkout settings (as ugly as it is to special-case > that -- but we need to special-case `core.bare` and `core.worktree` > anyhow[1]). > > [1]: https://lore.kernel.org/git/CAPig+cSUOknNC9GMyPvAqdBU0r1MVgvSpvgpSpXUmBm67HO7PQ@mail.gmail.com/ I could probably be persuaded either way (do users want to copy something and tweak it, or start with a clean slate?), and it might even make sense to have a flag for users to specify. My hunch, at least with the developers I work with, is that they're more likely to think in terms of "I want another worktree like this one, except that I'm going to change..." Also, another reason to prefer copying all of core.worktree (minus the always worktree-specific value of core.worktree, and core.bare), is because it's easy to explain in the documentation, and I think we'd be much less likely to obsolete user's knowledge over time. (I think additional sparse-checkout things, or new other features that also want to be copied over, are much more likely than the addition of keys that are always worktree-specific like core.worktree). ... > > An increasingly unworkable alternative is the current behavior of > > defaulting to a full checkout in all cases (and forcing users to > > sparsify afterwards). A full checkout is fine if the user came from > > one (and probably preferable in such a case), but it's increasingly > > problematic for us even with our repo being nowhere near the size of > > the microsoft repos. > > It feels unfortunate and a bit dirty to spread around this > special-case knowledge about sparse-checkout to various parts of the > system, This might makes things worse, but it's far too late to avoid that: $ git grep -il -e sparse-checkout -e skip_worktree .gitignore Documentation/RelNotes/2.19.0.txt Documentation/RelNotes/2.25.0.txt Documentation/RelNotes/2.26.0.txt Documentation/RelNotes/2.27.0.txt Documentation/RelNotes/2.28.0.txt Documentation/RelNotes/2.34.0.txt Documentation/RelNotes/2.6.0.txt Documentation/config/checkout.txt Documentation/config/core.txt Documentation/git-add.txt Documentation/git-checkout.txt Documentation/git-clone.txt Documentation/git-read-tree.txt Documentation/git-restore.txt Documentation/git-rm.txt Documentation/git-sparse-checkout.txt Documentation/git-worktree.txt Documentation/gitrepository-layout.txt Documentation/rev-list-options.txt Documentation/technical/index-format.txt Documentation/technical/parallel-checkout.txt Documentation/technical/sparse-index.txt Makefile advice.c apply.c attr.c builtin/add.c builtin/check-ignore.c builtin/checkout.c builtin/clone.c builtin/commit.c builtin/grep.c builtin/ls-files.c builtin/mv.c builtin/read-tree.c builtin/rm.c builtin/sparse-checkout.c builtin/stash.c builtin/update-index.c cache-tree.c cache.h command-list.txt contrib/completion/git-prompt.sh diff-lib.c diff.c dir.c dir.h entry.c git.c list-objects-filter.c list-objects-filter.h merge-ort.c merge-recursive.c path.c pathspec.c pathspec.h po/bg.po po/ca.po po/de.po po/el.po po/es.po po/fr.po po/git.pot po/id.po po/it.po po/ko.po po/pl.po po/pt_PT.po po/ru.po po/sv.po po/tr.po po/vi.po po/zh_CN.po po/zh_TW.po preload-index.c read-cache.c sparse-index.c sparse-index.h t/perf/p0005-status.sh t/perf/p0006-read-tree-checkout.sh t/perf/p0007-write-cache.sh t/perf/p2000-sparse-operations.sh t/perf/repos/inflate-repo.sh t/perf/repos/many-files.sh t/t0060-path-utils.sh t/t1011-read-tree-sparse-checkout.sh t/t1090-sparse-checkout-scope.sh t/t1091-sparse-checkout-builtin.sh t/t1092-sparse-checkout-compatibility.sh t/t2018-checkout-branch.sh t/t3507-cherry-pick-conflict.sh t/t3602-rm-sparse-checkout.sh t/t3705-add-sparse-checkout.sh t/t5317-pack-objects-filter-objects.sh t/t6112-rev-list-filters-objects.sh t/t6428-merge-conflicts-sparse.sh t/t6435-merge-sparse.sh t/t7002-mv-sparse-checkout.sh t/t7012-skip-worktree-writing.sh t/t7063-status-untracked-cache.sh t/t7418-submodule-sparse-gitmodules.sh t/t7519-status-fsmonitor.sh t/t7817-grep-sparse-checkout.sh unpack-trees.c wt-status.c > but based upon the pain-points you describe, having a new > worktree inherit the sparsity from the originating worktree does sound > (given my limited knowledge of the topic) like it would ease the pain > for users. :-)
On Thu, Dec 30, 2021 at 1:38 PM Elijah Newren <newren@gmail.com> wrote: > On Wed, Dec 29, 2021 at 10:41 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Tue, Dec 28, 2021 at 1:16 PM Elijah Newren <newren@gmail.com> wrote: > > > On Mon, Dec 27, 2021 at 11:33 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > > > A more general approach might be for the new worktree to copy all the > > > > per-worktree configuration from the worktree in which the command was > > > > invoked, thus sparsity would be inherited "for free" along with other > > > > settings. This has the benefits of not requiring sparse-checkout > > > > special-cases in the code and it's easy to document ("the new worktree > > > > inherits/copies configuration settings from the worktree in which `git > > > > worktree add` was invoked") and easy to understand. > > > > > > Ooh, this is a good point and I *really* like this simple solution. > > > Thanks for pointing it out. > > > > I do wonder, though, if there are traps waiting for us with this > > all-inclusive approach. I don't know what sort of worktree-specific > > configuration people use, so I do worry a bit that this could be > > casting a too-wide net, and that it might in fact be better to only > > copy the sparse-checkout settings (as ugly as it is to special-case > > that -- but we need to special-case `core.bare` and `core.worktree` > > anyhow[1]). > > I could probably be persuaded either way (do users want to copy > something and tweak it, or start with a clean slate?), and it might > even make sense to have a flag for users to specify. I also could probably be persuaded either way, and yes a flag is a possibility, though it would be nice if we could get along without it. > My hunch, at least with the developers I work with, is that they're > more likely to think in terms of "I want another worktree like this > one, except that I'm going to change..." > > Also, another reason to prefer copying all of core.worktree (minus the > always worktree-specific value of core.worktree, and core.bare), is > because it's easy to explain in the documentation, and I think we'd be > much less likely to obsolete user's knowledge over time. (I think > additional sparse-checkout things, or new other features that also > want to be copied over, are much more likely than the addition of keys > that are always worktree-specific like core.worktree). Another possible point in favor of copying all worktree-specific config to the new worktree is that if the user really does want to do some configuration specific to the new worktree, then that is going to require a certain amount of manual setup after creating the new worktree regardless of whether we copy all worktree-specific config or only a select subset (such as the sparse-checkout settings).
This patch series includes a fix to the bug reported by Sean Allred [1] and diagnosed by Eric Sunshine [2]. The root cause is that 'git sparse-checkout init' writes to the worktree config without checking that core.bare might need to be set. This only matters when the base repository is bare, since creating the config.worktree file and enabling extensions.worktreeConfig will cause Git to treat the base repo's core.bare=false as important for this worktree. This series fixes this, but also puts in place some helpers to prevent this from happening in the future. While here, some of the config paths are modified to take a repository struct. The critical bits are in Patches 3, 4, and 5 which introduce a helper for upgrading to worktree config, a helper to write to worktree config, and then consume that config helper in builtin/sparse-checkout.c and sparse-index.c. [1] https://lore.kernel.org/git/CABceR4bZmtC4rCwgxZ1BBYZP69VOUca1f_moJoP989vTUZWu9Q@mail.gmail.com/ [2] https://lore.kernel.org/git/CAPig+cQ6U_yFw-X2OWrizB1rbCvc4bNxuSzKFzmoLNnm0GH8Eg@mail.gmail.com/ Update in v2 ============ * Eric correctly pointed out that I was writing core.bare incorrectly. It should move out of the core config and into the core repository's worktree config. * Patch 3 is new, separating the "upgrade" logic out of config.c, but it is still called by the config helper to make it painless to write worktree config. Thanks, -Stolee Derrick Stolee (5): setup: use a repository when upgrading format config: make some helpers repo-aware worktree: add upgrade_to_worktree_config() config: add repo_config_set_worktree_gently() sparse-checkout: use repo_config_set_worktree_gently() builtin/sparse-checkout.c | 25 +++++----------- config.c | 39 +++++++++++++++++++++++-- config.h | 14 +++++++++ list-objects-filter-options.c | 2 +- repository.h | 2 +- setup.c | 6 ++-- sparse-index.c | 10 ++----- t/t1091-sparse-checkout-builtin.sh | 16 +++++++++- worktree.c | 47 ++++++++++++++++++++++++++++++ worktree.h | 12 ++++++++ 10 files changed, 140 insertions(+), 33 deletions(-) base-commit: 69a9c10c95e28df457e33b3c7400b16caf2e2962 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1101%2Fderrickstolee%2Fsparse-checkout%2Fbare-worktree-bug-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1101/derrickstolee/sparse-checkout/bare-worktree-bug-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/1101 Range-diff vs v1: 1: 28813703ff6 = 1: 889e69dc45d setup: use a repository when upgrading format 2: 3b549770eb9 = 2: 3e01356815a config: make some helpers repo-aware -: ----------- > 3: ed8e2a7b19d worktree: add upgrade_to_worktree_config() 3: 67993f6cff2 ! 4: 22896e9bb04 config: add repo_config_set_worktree_gently() @@ Metadata ## Commit message ## config: add repo_config_set_worktree_gently() - When adding config values to the worktree config, we might enable the - extensions.worktreeConfig setting and create the config.worktree file - for the first time. When the base repository is bare, this creates a - change of behavior for determining if the worktree is bare or not. A - worktree off of a bare repository is assumed to be non-bare when - extensions.worktreeConfig is disabled. When extensions.worktreeConfig is - enabled but config.worktree is empty, the worktree is considered bare - because the base repo's core.bare=true setting is used. + The previous change added upgrade_to_worktree_config() to assist + creating a worktree-specific config for the first time. However, this + requires every config writer to care about that upgrade before writing + to the worktree-specific config. In addition, callers need to know how + to generate the name of the config.worktree file and pass it to the + config API. - To avoid issues like this, create a helper that initializes all the - right settings in the correct order. A caller will be added in the next - change. + To assist, create a new repo_config_set_worktree_gently() method in the + config API that handles the upgrade_to_worktree_config() method in + addition to assigning the value in the worktree-specific config. This + will be consumed by an upcoming change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> ## config.c ## +@@ + #include "dir.h" + #include "color.h" + #include "refs.h" ++#include "worktree.h" + + struct config_source { + struct config_source *prev; @@ config.c: int git_config_set_gently(const char *key, const char *value) return git_config_set_multivar_gently(key, value, NULL, 0); } @@ config.c: int git_config_set_gently(const char *key, const char *value) +int repo_config_set_worktree_gently(struct repository *r, + const char *key, const char *value) +{ -+ int res; -+ const char *config_filename = repo_git_path(r, "config.worktree"); -+ -+ /* -+ * Ensure that core.bare reflects the current worktree, since the -+ * logic for is_bare_repository() changes if extensions.worktreeConfig -+ * is disabled. -+ */ -+ if ((res = git_config_set_multivar_in_file_gently(config_filename, "core.bare", -+ r->worktree ? "false" : "true", -+ NULL, 0))) { -+ error(_("unable to set core.bare setting in worktree config")); -+ return res; -+ } -+ if (upgrade_repository_format(r, 1) < 0) -+ return error(_("unable to upgrade repository format to enable worktreeConfig")); -+ if ((res = git_config_set_gently("extensions.worktreeConfig", "true"))) { -+ error(_("failed to set extensions.worktreeConfig setting")); -+ return res; -+ } -+ -+ return git_config_set_multivar_in_file_gently(config_filename, key, value, NULL, 0); ++ return upgrade_to_worktree_config(r) || ++ git_config_set_multivar_in_file_gently( ++ repo_git_path(r, "config.worktree"), ++ key, value, NULL, 0); +} + void git_config_set(const char *key, const char *value) @@ config.h: void git_config_set_in_file(const char *, const char *, const char *); +/** + * Write a config value into the config.worktree file for the current -+ * worktree. This will initialize extensions.worktreeConfig if necessary. ++ * worktree. This will initialize extensions.worktreeConfig if necessary, ++ * which might trigger some changes to the root repository's config file. + */ +int repo_config_set_worktree_gently(struct repository *, const char *, const char *); + 4: 6202f767f4a ! 5: 06457fafa78 sparse-checkout: use repo_config_set_worktree_gently() @@ t/t1091-sparse-checkout-builtin.sh: test_expect_success 'git sparse-checkout ini + ( + cd worktree && + git sparse-checkout init && -+ test_cmp_config false core.bare && ++ test_must_fail git config core.bare && + git sparse-checkout set /* -+ ) ++ ) && ++ git -C bare config --list --show-origin >actual && ++ grep "file:config.worktree core.bare=true" actual +' + test_expect_success 'git sparse-checkout list after init' '