Message ID | 20190604021330.16130-1-felipe.contreras@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Fix fetch regression with transport helpers | expand |
On Mon, Jun 03, 2019 at 09:13:25PM -0500, Felipe Contreras wrote: > I noticed a regression with running tests for git-remote-hg; can't seem to be > able to fetch tags. > > Probably all remote helpers that use the import method are affected, if not > all. > > The following patches are meant to test for the issue, fix it, and get some > cleanups. > > I'm not exactly sure the solution is the one we want, but hopefull it gives an > idea as to what is needed. It looks good to me. Thanks for fixing this. The breakage is in v2.20, so I don't think this needs to be rushed into v2.22 (which is already at -rc3). But it should probably go to 'maint' sooner rather than later. -Peff
On Tue, Jun 4, 2019 at 9:35 AM Jeff King <peff@peff.net> wrote: > > On Mon, Jun 03, 2019 at 09:13:25PM -0500, Felipe Contreras wrote: > > I'm not exactly sure the solution is the one we want, but hopefull it gives an > > idea as to what is needed. > > It looks good to me. Thanks for fixing this. > > The breakage is in v2.20, so I don't think this needs to be rushed into > v2.22 (which is already at -rc3). But it should probably go to 'maint' > sooner rather than later. Indeed, it's probably not so urgent since people have not even been complaining (that I know of). However, I think it's clear that there's a regression and the fix is simple, so maybe just apply the obvious fix, and leave the rest of the patches for later? I'm attaching only the fix, just to temp the idea of applying that tiny thing into v2.2.
On Mon, 3 Jun 2019, Felipe Contreras wrote: > Felipe Contreras (5): > t5801 (remote-helpers): cleanup refspec stuff > t5801 (remote-helpers): add test to fetch tags > fetch: trivial cleanup > fetch: make the code more understandable > fetch: fix regression with transport helpers > > builtin/fetch.c | 28 ++++++++++++++++++---------- > t/t5801-remote-helpers.sh | 18 ++++++++++++++---- > t/t5801/git-remote-testgit | 22 +++++++++++++--------- > 3 files changed, 45 insertions(+), 23 deletions(-) This fails on macOS, in t5601, both in our osx-clang and osx-gcc jobs, as well as in the StaticAnalysis job. For details, see https://dev.azure.com/gitgitgadget/git/_build/results?buildId=10206 The t5601 failure looks like this: -- snip -- checking prerequisite: CASE_INSENSITIVE_FS mkdir -p "$TRASH_DIRECTORY/prereq-test-dir" && ( cd "$TRASH_DIRECTORY/prereq-test-dir" && echo good >CamelCase && echo bad >camelcase && test "$(cat CamelCase)" != good ) ++ mkdir -p '/Users/vsts/agent/2.150.3/work/1/s/t/trash directory.t5601-clone/prereq-test-dir' ++ cd '/Users/vsts/agent/2.150.3/work/1/s/t/trash directory.t5601-clone/prereq-test-dir' ++ echo good ++ echo bad +++ cat CamelCase ++ test bad '!=' good prerequisite CASE_INSENSITIVE_FS ok expecting success: grep X icasefs/warning && grep x icasefs/warning && test_i18ngrep "the following paths have collided" icasefs/warning ++ grep X icasefs/warning error: last command exited with $?=1 not ok 99 - colliding file detection -- snap -- My guess is that your changes remove something that was expected before, and is still expected, and that this was only tested on Linux, and only on a file system with case-sensitive file names. The StaticAnalysis job (which is admittedly a misnomer) points out another few valid issues, but that is probably because Junio applied this patch series on top of a very old commit. I, at least, could not spot any file in the Coccinelle report that was touched by this here patch series. Ciao, Johannes
On Wed, Jun 05, 2019 at 10:12:12AM +0200, Johannes Schindelin wrote: > This fails on macOS, in t5601, both in our osx-clang and osx-gcc jobs, as > well as in the StaticAnalysis job. For details, see > https://dev.azure.com/gitgitgadget/git/_build/results?buildId=10206 Hmm. I'm having a hard time seeing why (and I can't seem to reproduce it locally on a case-insensitive HFS+ filesystem under Linux). In particular, if the problem is here: > expecting success: > grep X icasefs/warning && > grep x icasefs/warning && > test_i18ngrep "the following paths have collided" icasefs/warning > > ++ grep X icasefs/warning > error: last command exited with $?=1 > not ok 99 - colliding file detection then that implies it has to do with the checkout phase, which Felipe's patch doesn't touch. Later in the log we see the actual file contents (I'm confused as to how this gets here; it looks like debugging bits that were added after the main script?): 2019-06-05T07:58:37.7961890Z Cloning into 'bogus'... 2019-06-05T07:58:37.7962430Z done. 2019-06-05T07:58:37.7963360Z warning: the following paths have collided (e.g. case-sensitive paths 2019-06-05T07:58:37.7964300Z on a case-insensitive filesystem) and only one from the same 2019-06-05T07:58:37.7964880Z colliding group is in the working tree: 2019-06-05T07:58:37.7965290Z 2019-06-05T07:58:37.7966250Z 'x' whereas a succeeding test expects us to mention both 'x' and 'X'. So we _did_ find the collision, but somehow 'X' was not reported. Looking at the code, I'm not even sure how that could happen. Given that this process does involve looking at stat data, it makes me wonder if there could be some raciness involved. But again, I'm scratching my head as to how exactly, and I couldn't reproduce it under load or with some carefully inserted sleep() calls. And it looks like it did reproduce twice on Azure. Can somebody who has osx locally reproduce this? Or is there a way to interactively access the Azure environment to dig further? > My guess is that your changes remove something that was expected before, > and is still expected, and that this was only tested on Linux, and only on > a file system with case-sensitive file names. It sounds like you're suggesting that changes to the test script subtly affected the later state. Which is indeed a common culprit. But the changes in Felipe's series were all to t5801, not the failing t5601. Am I misunderstanding what you mean? -Peff
On Wed, Jun 5, 2019 at 6:27 PM Jeff King <peff@peff.net> wrote: > > On Wed, Jun 05, 2019 at 10:12:12AM +0200, Johannes Schindelin wrote: > > > This fails on macOS, in t5601, both in our osx-clang and osx-gcc jobs, as > > well as in the StaticAnalysis job. For details, see > > https://dev.azure.com/gitgitgadget/git/_build/results?buildId=10206 > > Hmm. I'm having a hard time seeing why (and I can't seem to reproduce it > locally on a case-insensitive HFS+ filesystem under Linux). > > In particular, if the problem is here: > > > expecting success: > > grep X icasefs/warning && > > grep x icasefs/warning && > > test_i18ngrep "the following paths have collided" icasefs/warning > > > > ++ grep X icasefs/warning > > error: last command exited with $?=1 > > not ok 99 - colliding file detection > > then that implies it has to do with the checkout phase, which Felipe's > patch doesn't touch. Later in the log we see the actual file contents > (I'm confused as to how this gets here; it looks like debugging bits > that were added after the main script?): > > 2019-06-05T07:58:37.7961890Z Cloning into 'bogus'... > 2019-06-05T07:58:37.7962430Z done. > 2019-06-05T07:58:37.7963360Z warning: the following paths have collided (e.g. case-sensitive paths > 2019-06-05T07:58:37.7964300Z on a case-insensitive filesystem) and only one from the same > 2019-06-05T07:58:37.7964880Z colliding group is in the working tree: > 2019-06-05T07:58:37.7965290Z > 2019-06-05T07:58:37.7966250Z 'x' > > whereas a succeeding test expects us to mention both 'x' and 'X'. > > So we _did_ find the collision, but somehow 'X' was not reported. > Looking at the code, I'm not even sure how that could happen. Given that > this process does involve looking at stat data, it makes me wonder if It does use stat data in mark_colliding_entries() if core.checkStat is false. I think on MacOS it's actually true. I vaguely recall seeing just one 'x' once. I think last time I had a problem with truncating st_ino, but that should be fixed in e66ceca94b (clone: fix colliding file detection on APFS, 2018-11-20). So no idea how this happens again. > there could be some raciness involved. But again, I'm scratching my head > as to how exactly, and I couldn't reproduce it under load or with some > carefully inserted sleep() calls.
Hi Duy, On Wed, 5 Jun 2019, Duy Nguyen wrote: > On Wed, Jun 5, 2019 at 6:27 PM Jeff King <peff@peff.net> wrote: > > > > On Wed, Jun 05, 2019 at 10:12:12AM +0200, Johannes Schindelin wrote: > > > > > This fails on macOS, in t5601, both in our osx-clang and osx-gcc jobs, as > > > well as in the StaticAnalysis job. For details, see > > > https://dev.azure.com/gitgitgadget/git/_build/results?buildId=10206 > > > > Hmm. I'm having a hard time seeing why (and I can't seem to reproduce it > > locally on a case-insensitive HFS+ filesystem under Linux). > > > > In particular, if the problem is here: > > > > > expecting success: > > > grep X icasefs/warning && > > > grep x icasefs/warning && > > > test_i18ngrep "the following paths have collided" icasefs/warning > > > > > > ++ grep X icasefs/warning > > > error: last command exited with $?=1 > > > not ok 99 - colliding file detection > > > > then that implies it has to do with the checkout phase, which Felipe's > > patch doesn't touch. Later in the log we see the actual file contents > > (I'm confused as to how this gets here; it looks like debugging bits > > that were added after the main script?): > > > > 2019-06-05T07:58:37.7961890Z Cloning into 'bogus'... > > 2019-06-05T07:58:37.7962430Z done. > > 2019-06-05T07:58:37.7963360Z warning: the following paths have collided (e.g. case-sensitive paths > > 2019-06-05T07:58:37.7964300Z on a case-insensitive filesystem) and only one from the same > > 2019-06-05T07:58:37.7964880Z colliding group is in the working tree: > > 2019-06-05T07:58:37.7965290Z > > 2019-06-05T07:58:37.7966250Z 'x' > > > > whereas a succeeding test expects us to mention both 'x' and 'X'. > > > > So we _did_ find the collision, but somehow 'X' was not reported. > > Looking at the code, I'm not even sure how that could happen. Given that > > this process does involve looking at stat data, it makes me wonder if > > It does use stat data in mark_colliding_entries() if core.checkStat is > false. I think on MacOS it's actually true. > > I vaguely recall seeing just one 'x' once. I think last time I had a > problem with truncating st_ino, but that should be fixed in e66ceca94b > (clone: fix colliding file detection on APFS, 2018-11-20). So no idea > how this happens again. Good catch. I think the reason it happens again is simply that Junio picked a base commit that is older than the commit you referenced. Point in favor: Junio merged these here patches into `pu` and those test failures (as well as the StaticAnalysis issues) are gone. Thanks, Johannes > > > there could be some raciness involved. But again, I'm scratching my head > > as to how exactly, and I couldn't reproduce it under load or with some > > carefully inserted sleep() calls. > -- > Duy >
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> I vaguely recall seeing just one 'x' once. I think last time I had a >> problem with truncating st_ino, but that should be fixed in e66ceca94b >> (clone: fix colliding file detection on APFS, 2018-11-20). So no idea >> how this happens again. > > Good catch. I think the reason it happens again is simply that Junio > picked a base commit that is older than the commit you referenced. Yeah, that is because the patch specifically targetted a single commit as culprit, so naturally the topic that introduced that commit was the place to be "fixed" ;-) I was wondering if the base commit _before_ the fixes, i.e. e198b3a7 ("fetch: replace string-list used as a look-up table with a hashmap", 2018-09-25), failed the same test you saw problems with. It does predate e66ceca9 ("clone: fix colliding file detection on APFS", 2018-11-20), so my current theory is that it was broken already adn these patches that fixed a breakge had nothing to do with the t5601 tests failing.
Hi Junio, On Thu, 6 Jun 2019, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> I vaguely recall seeing just one 'x' once. I think last time I had a > >> problem with truncating st_ino, but that should be fixed in e66ceca94b > >> (clone: fix colliding file detection on APFS, 2018-11-20). So no idea > >> how this happens again. > > > > Good catch. I think the reason it happens again is simply that Junio > > picked a base commit that is older than the commit you referenced. > > Yeah, that is because the patch specifically targetted a single > commit as culprit, so naturally the topic that introduced that > commit was the place to be "fixed" ;-) Yes, that matches my impression. I should maybe describe a bit better what *is* tested on GitGitGadget when it runs the build & test suite for the individual branches (as opposed to the integration branches maint, master, next & pu): the Azure Pipeline obviously *cannot* be defined in `azure-pipelines.yml`, as many of those branches do not even have that file. One of the features I like most about Azure Pipelines (yes, I really like Pipelines, they save me from so much work by enabling me to automate a *lot* in Git for Windows' maintenance, such as building and packaging quite a few of the components such as OpenSSH, cURL, etc... but I digress) is that it offers *both* to define the builds via a file that is committed *and also* in a definition that is maintained outside of the source code. So what I did was to port azure-pipelines.yml from the former to the latter, and *that* is run on those individual branches. As we noticed here, this opens the door for running into regressions that have been long fixed, just not in the tested branch. Side note: many projects that want to rely on the confidence instilled by automated testing therefore change their workflow to a more "topic branches are based on master, or on the release branches' tips" one. I am not saying that you, Junio, should switch to such a workflow because you are clearly comfortable with the current one. I mention this mainly for the benefit of readers who might wonder what options they have in their own projects to deal with this. What I usually do is a hack: this "manual" job definition tries to cherry-pick all kinds of known fixes to known regressions, and this APFS-name-collision one is just not one of them. When I find the time (hopefully next week, probably not), I shall try to take care of that. > I was wondering if the base commit _before_ the fixes, i.e. e198b3a7 > ("fetch: replace string-list used as a look-up table with a > hashmap", 2018-09-25), failed the same test you saw problems with. > It does predate e66ceca9 ("clone: fix colliding file detection on > APFS", 2018-11-20), so my current theory is that it was broken > already adn these patches that fixed a breakge had nothing to do > with the t5601 tests failing. Yes, I agree. One might ask why I even bother with testing the individual branches. The reason is that bisecting breakages in `pu` is a *nightmare*. Most of those breakages are present already in the individual branches, though, so if I can catch the breakages there, I have a much easier time keeping the CI builds green. Ciao, Dscho