Message ID | pull.513.git.1579029962.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Harden the sparse-checkout builtin | expand |
Hi Stolee, On Tue, Jan 14, 2020 at 07:25:54PM +0000, Derrick Stolee via GitGitGadget wrote: > This series is based on ds/sparse-list-in-cone-mode. > > This series attempts to clean up some rough edges in the sparse-checkout > feature, especially around the cone mode. > > Unfortunately, after the v2.25.0 release, we noticed an issue with the "git > clone --sparse" option when using a URL instead of a local path. This is > fixed and properly tested here. I haven't had a chance to look at the other patches (besides the one that I have already reviewed on- and off-list), so take my comments here with a grain of salt. It's too bad that 'git clone --sparse' isn't working with URLs in v2.25.0, but it happens, and I don't think that this is a grave-enough issue to warrant a new release. At least, since '--sparse' is new in v2.25.0, we're not breaking existing workflows that have already relied on it. And, since sparse-checkout is still relatively niche (at, least for now ;-)), I think that this not handling cloning from URLs is fine until v2.26.0. Of course, if there's ever another need for v2.25.1, I don't think that this would *hurt* to release then, which is to say that we definitely should have these patches in a release, soon, but I don't think that there's a terrible sense of urgency in the meantime. > Also, let's improve Git's response to these more complicated scenarios: > > 1. Running "git sparse-checkout init" in a worktree would complain because > the "info" dir doesn't exist. > 2. Tracked paths that include "*" and "" in their filenames. > 3. If a user edits the sparse-checkout file to have non-cone pattern, such > as "*" anywhere or "" in the wrong place, then we should respond > appropriately. That is: warn that the patterns are not cone-mode, then > revert to the old logic. > > Thanks, -Stolee > > Derrick Stolee (8): > t1091: use check_files to reduce boilerplate > sparse-checkout: create leading directories > clone: fix --sparse option with URLs > sparse-checkout: cone mode does not recognize "**" > sparse-checkout: detect short patterns > sparse-checkout: warn on incorrect '*' in patterns > sparse-checkout: properly match escaped characters > sparse-checkout: write escaped patterns in cone mode > > builtin/clone.c | 2 +- > builtin/sparse-checkout.c | 52 ++++- > dir.c | 69 ++++++- > dir.h | 1 + > t/t1091-sparse-checkout-builtin.sh | 320 ++++++++++++++++------------- > 5 files changed, 296 insertions(+), 148 deletions(-) > > > base-commit: 4fd683b6a35eabd23dd5183da7f654a1e1f00325 > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-513%2Fderrickstolee%2Fsparse-harden-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-513/derrickstolee/sparse-harden-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/513 > -- > gitgitgadget Thanks, Taylor
On 1/14/2020 2:34 PM, Taylor Blau wrote: > Hi Stolee, > > On Tue, Jan 14, 2020 at 07:25:54PM +0000, Derrick Stolee via GitGitGadget wrote: >> This series is based on ds/sparse-list-in-cone-mode. >> >> This series attempts to clean up some rough edges in the sparse-checkout >> feature, especially around the cone mode. >> >> Unfortunately, after the v2.25.0 release, we noticed an issue with the "git >> clone --sparse" option when using a URL instead of a local path. This is >> fixed and properly tested here. > > I haven't had a chance to look at the other patches (besides the one > that I have already reviewed on- and off-list), so take my comments here > with a grain of salt. > > It's too bad that 'git clone --sparse' isn't working with URLs in > v2.25.0, but it happens, and I don't think that this is a grave-enough > issue to warrant a new release. At least, since '--sparse' is new in > v2.25.0, we're not breaking existing workflows that have already relied > on it. > > And, since sparse-checkout is still relatively niche (at, least for now > ;-)), I think that this not handling cloning from URLs is fine until > v2.26.0. Since we've already worked out the workaround to be: git clone --no-checkout <url> <dir> cd <dir> git sparse-checkout init --cone there is no rush to fix this. Users _may_ discover the --sparse option from the clone docs and complain, but we can point them to the above directions for now. > Of course, if there's ever another need for v2.25.1, I don't think that > this would *hurt* to release then, which is to say that we definitely > should have these patches in a release, soon, but I don't think that > there's a terrible sense of urgency in the meantime. I wouldn't complain to have patches 1-3 in an otherwise-warranted .1 release. Thanks, -Stolee
On Tue, Jan 14, 2020 at 02:44:50PM -0500, Derrick Stolee wrote: > Since we've already worked out the workaround to be: > > git clone --no-checkout <url> <dir> > cd <dir> > git sparse-checkout init --cone > > there is no rush to fix this. Users _may_ discover the --sparse option > from the clone docs and complain, but we can point them to the above > directions for now. Yes, but part of the beauty of the new system is just having to say "--sparse" to make something useful happen. :) > > Of course, if there's ever another need for v2.25.1, I don't think that > > this would *hurt* to release then, which is to say that we definitely > > should have these patches in a release, soon, but I don't think that > > there's a terrible sense of urgency in the meantime. > > I wouldn't complain to have patches 1-3 in an otherwise-warranted .1 release. Agreed. 1-3 look obviously correct to me. The quoting bits I'm a little more fuzzy on, just because I haven't really looked hard into cone mode. Ditto for the "disable cone mode" checks. My gut instinct is that you should be able to deduce whether the pattern hashmap can be used purely from the patterns you see, and core.sparseCheckoutCone would not be needed (and so if you violate the rules by writing something manual, then it just gets slower; or maybe we're even able to apply the literal cone-mode rules quickly and handle the other separately). But it's much more likely I'm showing off my lack of knowledge of the details of the problem space. You can feel free to educate me, and/or roll your eyes and ignore me if this was already discussed earlier. By the way, I did notice this while poking about, which could go on top (or hopefully be lumped in with the 1-3 as "obviously correct"): -- >8 -- Subject: [PATCH] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Signed-off-by: Jeff King <peff@peff.net> --- Documentation/git-sparse-checkout.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index 974ade2238..285893b069 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -106,7 +106,7 @@ The full pattern set allows for arbitrary pattern matches and complicated inclusion/exclusion rules. These can result in O(N*M) pattern matches when updating the index, where N is the number of patterns and M is the number of paths in the index. To combat this performance issue, a more restricted -pattern set is allowed when `core.spareCheckoutCone` is enabled. +pattern set is allowed when `core.sparseCheckoutCone` is enabled. The accepted patterns in the cone pattern set are:
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > Also, let's improve Git's response to these more complicated scenarios: > > 1. Running "git sparse-checkout init" in a worktree would complain because > the "info" dir doesn't exist. > 2. Tracked paths that include "*" and "" in their filenames. > 3. If a user edits the sparse-checkout file to have non-cone pattern, such > as "*" anywhere or "" in the wrong place, then we should respond > appropriately. That is: warn that the patterns are not cone-mode, then > revert to the old logic. It seems somebody ate a letter to make "<some letter>" into "" an empty string, so I cannot quite grok the above list---two out of three bullet points are not quite readable.
On 1/15/2020 2:16 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> Also, let's improve Git's response to these more complicated scenarios: >> >> 1. Running "git sparse-checkout init" in a worktree would complain because >> the "info" dir doesn't exist. >> 2. Tracked paths that include "*" and "" in their filenames. >> 3. If a user edits the sparse-checkout file to have non-cone pattern, such >> as "*" anywhere or "" in the wrong place, then we should respond >> appropriately. That is: warn that the patterns are not cone-mode, then >> revert to the old logic. > > It seems somebody ate a letter to make "<some letter>" into "" an > empty string, so I cannot quite grok the above list---two out of > three bullet points are not quite readable. In 2., the "" should be "\". In 3., the "*" and "" should be "**" and "*" respectively. These are things that are being collapsed by GitHub's markdown processing. Sorry that this affects GitGitGadget's cover letter. By escaping appropriately, these show up correctly and hopefully will be fixed in v2. -Stolee