Message ID | Y2CD6GBl6ORqKsug@coredump.intra.peff.net (mailing list archive) |
---|---|
State | Accepted |
Commit | db8016b43fd9541c85e1a3998a5a447deb1198fe |
Headers | show |
Series | [1/2] t5516: move plaintext-password tests from t5601 and t5516 | expand |
On Mon, Oct 31 2022, Jeff King wrote: > From: Johannes Schindelin <johannes.schindelin@gmx.de> Whatever our difference of opinion about the usefulness of not asserting that a thing doesn't get worse, even if it isn't perfect (c.f.: https://lore.kernel.org/git/221101.86a65b5q9q.gmgdl@evledraar.gmail.com/) I think that... > It is unclear as to _why_, but under certain circumstances the warning > about credentials being passed as part of the URL seems to be swallowed > by the `git remote-https` helper in the Windows jobs of Git's CI builds. ..this description dosen't match what the patch is doing, okey, so there's a case where the remote helper swallows the warnings, i.e. we'll emit fewer than we expected... > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > Signed-off-by: Jeff King <peff@peff.net> > --- > t/t5551-http-fetch-smart.sh | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh > index bbe3d5721f..64c6c9f59e 100755 > --- a/t/t5551-http-fetch-smart.sh > +++ b/t/t5551-http-fetch-smart.sh > @@ -597,17 +597,17 @@ test_expect_success 'clone warns or fails when using username:password' ' > git -c transfer.credentialsInUrl=warn \ > clone $url_userpass attempt2 2>err && > grep "warning: $message" err >warnings && > - test_line_count = 2 warnings && > + test_line_count -ge 1 warnings && > > test_must_fail git -c transfer.credentialsInUrl=die \ > clone $url_userpass attempt3 2>err && > grep "fatal: $message" err >warnings && > - test_line_count = 1 warnings && > + test_line_count -ge 1 warnings && > > test_must_fail git -c transfer.credentialsInUrl=die \ > clone $url_userblank attempt4 2>err && > grep "fatal: $message" err >warnings && > - test_line_count = 1 warnings > + test_line_count -ge 1 warnings > ' > > test_expect_success 'clone does not detect username:password when it is https://username@domain:port/' ' > @@ -630,17 +630,17 @@ test_expect_success 'fetch warns or fails when using username:password' ' > > git -c transfer.credentialsInUrl=warn fetch $url_userpass 2>err && > grep "warning: $message" err >warnings && > - test_line_count = 3 warnings && > + test_line_count -ge 1 warnings && > > test_must_fail git -c transfer.credentialsInUrl=die \ > fetch $url_userpass 2>err && > grep "fatal: $message" err >warnings && > - test_line_count = 1 warnings && > + test_line_count -ge 1 warnings && > > test_must_fail git -c transfer.credentialsInUrl=die \ > fetch $url_userblank 2>err && > grep "fatal: $message" err >warnings && > - test_line_count = 1 warnings > + test_line_count -ge 1 warnings > ' > > > @@ -654,7 +654,7 @@ test_expect_success 'push warns or fails when using username:password' ' > test_must_fail git -c transfer.credentialsInUrl=die \ > push $url_userpass 2>err && > grep "fatal: $message" err >warnings && > - test_line_count = 1 warnings > + test_line_count -ge 1 warnings > ' ...but then why are we modifying these codepaths that have nothing to do with invoking the remote helper, i.e. where we die early before we get to that? And even if some of this was invoking that remote helper and we were modifying it blindly, then presumably the helper swallowing it would result in 0 some of the time, so "-ge 1" would be wrong. (That's not the case, but it's why I think the patch doesn't make much sense).
Jeff King <peff@peff.net> writes: > It is unclear as to _why_, but under certain circumstances the warning > about credentials being passed as part of the URL seems to be swallowed > by the `git remote-https` helper in the Windows jobs of Git's CI builds. > > Since it is not actually important how many times Git prints the > warning/error message, as long as it prints it at least once, ... Sorry, but I do not quite see the value of keeping the test to expect success in a weakend form. If we think we are emitting three warnings, whether we do so by mistake or by design, and some of them are lost and not shown for an unknown reason, is there a guarantee that at least one would survive? And when all three are lost, even the test in the weakened form would fail and stop the CI builds, no? If we do not know why we are losing some messages, and if we do not have resources to track down why, that is perfectly fine. Fixes can be prioritised. But wouldn't test_expect_failure be a better way to express "we think we ought to get 3 but for some reason we may not get all of them and we haven't figured it out"?
On Tue, Nov 01, 2022 at 04:29:31AM +0100, Ævar Arnfjörð Bjarmason wrote: > > It is unclear as to _why_, but under certain circumstances the warning > > about credentials being passed as part of the URL seems to be swallowed > > by the `git remote-https` helper in the Windows jobs of Git's CI builds. > > ..this description dosen't match what the patch is doing, okey, so > there's a case where the remote helper swallows the warnings, i.e. we'll > emit fewer than we expected... So I really didn't revisit this commit much at all, and was just trying to save Dscho (or Taylor) the work of having to rebase it, if we go with my patch 1. IMHO it is OK enough as it is, but if I were writing it from scratch I probably would have given the rationale that the tests are insiting on a dumb, sub-optimal behavior. And flakes or inconsistencies aside, they should be asserting only the presence or absence of the message. And probably would have left each at "grep" and dropped the test_line_count totally. It is not even clear to me that the remote-https is the one being swallowed (at least, I have not seen an argument or evidence that this is so; it does seem plausible). > > @@ -654,7 +654,7 @@ test_expect_success 'push warns or fails when using username:password' ' > > test_must_fail git -c transfer.credentialsInUrl=die \ > > push $url_userpass 2>err && > > grep "fatal: $message" err >warnings && > > - test_line_count = 1 warnings > > + test_line_count -ge 1 warnings > > ' > > ...but then why are we modifying these codepaths that have nothing to do > with invoking the remote helper, i.e. where we die early before we get > to that? If you're arguing that we should only do s/= 3/-ge 1/ for the test that is flaking, I could buy that. Though like I said, I consider the value here to be focusing the purpose of the tests as much as dealing with the flake. I really don't care that much either way, though. I'd also be fine with a separate test (marked expect_failure) that checks the number of messages. > And even if some of this was invoking that remote helper and we were > modifying it blindly, then presumably the helper swallowing it would > result in 0 some of the time, so "-ge 1" would be wrong. > > (That's not the case, but it's why I think the patch doesn't make much > sense). I thought the point is that the outer program calling the helper would consistently produce the error, always yielding at least one instance. The helper one is generally "extra" and undesired. -Peff
On Mon, Oct 31, 2022 at 09:54:23PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > It is unclear as to _why_, but under certain circumstances the warning > > about credentials being passed as part of the URL seems to be swallowed > > by the `git remote-https` helper in the Windows jobs of Git's CI builds. > > > > Since it is not actually important how many times Git prints the > > warning/error message, as long as it prints it at least once, ... > > Sorry, but I do not quite see the value of keeping the test to > expect success in a weakend form. If we think we are emitting three > warnings, whether we do so by mistake or by design, and some of them > are lost and not shown for an unknown reason, is there a guarantee > that at least one would survive? And when all three are lost, even > the test in the weakened form would fail and stop the CI builds, no? Without understanding the cause of the loss, I agree that things are a little hand-wavy. But the assumption _does_ seem to hold that we consistently produce at least one (presumably from the parent clone/fetch/push process). And if we can rely on that, there's value in the tests asserting that the message was shown to the user at least once. > If we do not know why we are losing some messages, and if we do not > have resources to track down why, that is perfectly fine. Fixes can > be prioritised. But wouldn't test_expect_failure be a better way to > express "we think we ought to get 3 but for some reason we may not > get all of them and we haven't figured it out"? Marking it as expect_failure throws out the main point of the test, though, which is that the user sees _some_ message (and that the "die" form aborts the operation). It might make sense to add a separate test in the meantime that documents the "oops, we get the wrong number" sometimes state (and eventually, if fixed, that could be folded back into the main test for efficiency / simplicity). -Peff
On Tue, Nov 01 2022, Jeff King wrote: > On Tue, Nov 01, 2022 at 04:29:31AM +0100, Ævar Arnfjörð Bjarmason wrote: > >> > It is unclear as to _why_, but under certain circumstances the warning >> > about credentials being passed as part of the URL seems to be swallowed >> > by the `git remote-https` helper in the Windows jobs of Git's CI builds. >> >> ..this description dosen't match what the patch is doing, okey, so >> there's a case where the remote helper swallows the warnings, i.e. we'll >> emit fewer than we expected... > > So I really didn't revisit this commit much at all, and was just trying > to save Dscho (or Taylor) the work of having to rebase it, if we go with > my patch 1. > > IMHO it is OK enough as it is, but if I were writing it from scratch I > probably would have given the rationale that the tests are insiting on a > dumb, sub-optimal behavior. And flakes or inconsistencies aside, they > should be asserting only the presence or absence of the message. And > probably would have left each at "grep" and dropped the test_line_count > totally. Do you mean that even if we fix the bug and consistently emit one and only one such message you'd like to have the tests not assert that that's the case? I do think that UX is important enough to test for, particularly if we've had a bug related to that that we've fixed. I.e. if something in the direction of my [1] goes in. > It is not even clear to me that the remote-https is the one being > swallowed (at least, I have not seen an argument or evidence that this > is so; it does seem plausible). It is the case, the only ones that are going to be duplicated are the "warn" ones, because for "die" we'll die right away in the parent process. Which is what I'm trying to get across here, and why I'm a bit confused. I.e. I thought you'd agree that we should test that we emit exactly one warning if & when we've fixed the underlying issue. That issue is already fixed for "die", so even if you want to loosen up the test your [2] should only keep the first line removal/addition in the first two hunks, and drop the 3rd one. >> > @@ -654,7 +654,7 @@ test_expect_success 'push warns or fails when using username:password' ' >> > test_must_fail git -c transfer.credentialsInUrl=die \ >> > push $url_userpass 2>err && >> > grep "fatal: $message" err >warnings && >> > - test_line_count = 1 warnings >> > + test_line_count -ge 1 warnings >> > ' >> >> ...but then why are we modifying these codepaths that have nothing to do >> with invoking the remote helper, i.e. where we die early before we get >> to that? > > If you're arguing that we should only do s/= 3/-ge 1/ for the test that > is flaking, I could buy that. I'm saying that if we're doing a handwaivy-fix and saying "sometimes the message gets swallowed" and fixing this blindly without checking how it works, then changing "= 1" to "-ge 1" doesn't make sense. It should be "-ge 0", i.e. surely that "one warning" can get swallowed too? Now, I know that never happens, because we'll always get at least one. I'm just saying that as soon as you stop to think about that you must also come to the conclusion that the "die" ones are OK as-is. That's because the reason we always get at least one is the same as we're always guaranteed to emit just one in the "die" case: The parent process emits it, then dies. >> And even if some of this was invoking that remote helper and we were >> modifying it blindly, then presumably the helper swallowing it would >> result in 0 some of the time, so "-ge 1" would be wrong. >> >> (That's not the case, but it's why I think the patch doesn't make much >> sense). > > I thought the point is that the outer program calling the helper would > consistently produce the error, always yielding at least one instance. > The helper one is generally "extra" and undesired. Yes, exactly. Which is what my fix[1] the root cause addresses. Anyway, I'm just trying to help here. If you/Johannes/others want to go with the "hotfix" as-is that's fine my me. I just don't see what the hurry is, it's been this way for two releases, if it's flaky that's been the case for months, I'd think we could just fix the root cause. 1. http://lore.kernel.org/git/RFC-patch-1.1-0266485bc6c-20221031T204149Z-avarab@gmail.com 2. https://lore.kernel.org/git/Y2CD6GBl6ORqKsug@coredump.intra.peff.net/
On Tue, Nov 01, 2022 at 09:15:00AM +0100, Ævar Arnfjörð Bjarmason wrote: > > So I really didn't revisit this commit much at all, and was just trying > > to save Dscho (or Taylor) the work of having to rebase it, if we go with > > my patch 1. > > > > IMHO it is OK enough as it is, but if I were writing it from scratch I > > probably would have given the rationale that the tests are insiting on a > > dumb, sub-optimal behavior. And flakes or inconsistencies aside, they > > should be asserting only the presence or absence of the message. And > > probably would have left each at "grep" and dropped the test_line_count > > totally. > > Do you mean that even if we fix the bug and consistently emit one and > only one such message you'd like to have the tests not assert that > that's the case? No, I wouldn't mind it, if that is a bug we've fixed. I just mean that the tests as written never wanted to say "3 is the absolute right number of times to write this message". They only put "3" there because it made things pass. > I do think that UX is important enough to test for, particularly if > we've had a bug related to that that we've fixed. I.e. if something in > the direction of my [1] goes in. Sure, I don't mind at all a test for it. In the short-term, if you want a test that fails, I'd prefer it be separate so that we can test the useful existing behavior that _does_ work. If the multiple-messages bug is fixed, I don't mind folding them together into a single test that passes. > > It is not even clear to me that the remote-https is the one being > > swallowed (at least, I have not seen an argument or evidence that this > > is so; it does seem plausible). > > It is the case, the only ones that are going to be duplicated are the > "warn" ones, because for "die" we'll die right away in the parent > process. Right, I understand why "die" produces only one. My question was when we produce 2 on Windows (sometimes?) but 3 elsewhere, are we sure it is the one from remote-https that is eaten, or could it ever be one of the others? In a sense we do not need to worry about "why is it sometimes eaten" if the bug is fixed to produce only the one message. But it may point to a separate and interesting problem (e.g., is stderr from remote-https not reliable?). > >> > @@ -654,7 +654,7 @@ test_expect_success 'push warns or fails when using username:password' ' > >> > test_must_fail git -c transfer.credentialsInUrl=die \ > >> > push $url_userpass 2>err && > >> > grep "fatal: $message" err >warnings && > >> > - test_line_count = 1 warnings > >> > + test_line_count -ge 1 warnings > >> > ' > >> > >> ...but then why are we modifying these codepaths that have nothing to do > >> with invoking the remote helper, i.e. where we die early before we get > >> to that? > > > > If you're arguing that we should only do s/= 3/-ge 1/ for the test that > > is flaking, I could buy that. > > I'm saying that if we're doing a handwaivy-fix and saying "sometimes the > message gets swallowed" and fixing this blindly without checking how it > works, then changing "= 1" to "-ge 1" doesn't make sense. Right, I'm fine with that (I perhaps should have said something stronger than "I could buy that"). As I said, I was mostly just rebasing Dscho's patch and I think it was good enough in the sense that it was hand-waving away the whole "there may be more than one" problem. But I do agree that we'll never see more in the "die" cases, and there is no need to change them. > > I thought the point is that the outer program calling the helper would > > consistently produce the error, always yielding at least one instance. > > The helper one is generally "extra" and undesired. > > Yes, exactly. Which is what my fix[1] the root cause addresses. > > Anyway, I'm just trying to help here. If you/Johannes/others want to go > with the "hotfix" as-is that's fine my me. > > I just don't see what the hurry is, it's been this way for two releases, > if it's flaky that's been the case for months, I'd think we could just > fix the root cause. It recently bit me twice, so maybe I am giving it more urgency than it deserves (or maybe something changed in CI to make it more likely). I do think it would be nice to fix it. I don't love your patch for the reasons I replied there (not your fault; it's inherently a crappy and complicated problem). In the meantime, I'd like to see CI fixed, as it is wasting developer's time. And that's why I called Dscho's loosening "good enough". It is hopefully a temporary state anyway. But I would be just as happy to see a similar patch which just changed the 2/3 lines to "-ge 1" (or just a straight grep). -Peff
On Tue, Nov 01 2022, Jeff King wrote: > On Tue, Nov 01, 2022 at 09:15:00AM +0100, Ævar Arnfjörð Bjarmason wrote: > >> > So I really didn't revisit this commit much at all, and was just trying >> > to save Dscho (or Taylor) the work of having to rebase it, if we go with >> > my patch 1. >> > >> > IMHO it is OK enough as it is, but if I were writing it from scratch I >> > probably would have given the rationale that the tests are insiting on a >> > dumb, sub-optimal behavior. And flakes or inconsistencies aside, they >> > should be asserting only the presence or absence of the message. And >> > probably would have left each at "grep" and dropped the test_line_count >> > totally. >> >> Do you mean that even if we fix the bug and consistently emit one and >> only one such message you'd like to have the tests not assert that >> that's the case? > > No, I wouldn't mind it, if that is a bug we've fixed. I just mean that > the tests as written never wanted to say "3 is the absolute right number > of times to write this message". They only put "3" there because it made > things pass. That's one reason, another is to assert current behavior, and to be able to answer questions like "does this patch change behavior" without having to recursively diff the trash directory output of a test run, because everything's so fuzzy. If and when it's 1 instead of 3, great, adjusting the test isn't a big deal. Anyway, we're off into general testing philosophy again, which I think is off topic here. >> I do think that UX is important enough to test for, particularly if >> we've had a bug related to that that we've fixed. I.e. if something in >> the direction of my [1] goes in. > > Sure, I don't mind at all a test for it. In the short-term, if you want > a test that fails, I'd prefer it be separate so that we can test the > useful existing behavior that _does_ work. If the multiple-messages bug > is fixed, I don't mind folding them together into a single test that > passes. Right, I'm not saying "keep the flaky test", I'm saying let's keep the ones we know aren't flaky. >> > It is not even clear to me that the remote-https is the one being >> > swallowed (at least, I have not seen an argument or evidence that this >> > is so; it does seem plausible). >> >> It is the case, the only ones that are going to be duplicated are the >> "warn" ones, because for "die" we'll die right away in the parent >> process. > > Right, I understand why "die" produces only one. My question was when we > produce 2 on Windows (sometimes?) but 3 elsewhere, are we sure it is the > one from remote-https that is eaten, or could it ever be one of the > others? I don't have a test case in front of me, and Johannes didn't provide one (or even a link to CI output). But from his description and being familiar with the code I'm pretty certain isn't not the "die" cases, those are all in-process, and it happens before we spawn sub-processes, I don't see how that would be different on Windows. >> > I thought the point is that the outer program calling the helper would >> > consistently produce the error, always yielding at least one instance. >> > The helper one is generally "extra" and undesired. >> >> Yes, exactly. Which is what my fix[1] the root cause addresses. >> >> Anyway, I'm just trying to help here. If you/Johannes/others want to go >> with the "hotfix" as-is that's fine my me. >> >> I just don't see what the hurry is, it's been this way for two releases, >> if it's flaky that's been the case for months, I'd think we could just >> fix the root cause. > > It recently bit me twice, so maybe I am giving it more urgency than it > deserves (or maybe something changed in CI to make it more likely). Bit you in GitHub Windows CI? > I do think it would be nice to fix it. I don't love your patch for the > reasons I replied there (not your fault; it's inherently a crappy and > complicated problem). In the meantime, I'd like to see CI fixed, as > it is wasting developer's time. And that's why I called Dscho's > loosening "good enough". It is hopefully a temporary state anyway. > > But I would be just as happy to see a similar patch which just changed > the 2/3 lines to "-ge 1" (or just a straight grep). Sure, if we're deciding not to care about tests that are unrelated to the flakyness problem at hand being loosened.
On Tue, Nov 01, 2022 at 03:42:59AM -0400, Jeff King wrote: > On Mon, Oct 31, 2022 at 09:54:23PM -0700, Junio C Hamano wrote: > > > Jeff King <peff@peff.net> writes: > > > > > It is unclear as to _why_, but under certain circumstances the warning > > > about credentials being passed as part of the URL seems to be swallowed > > > by the `git remote-https` helper in the Windows jobs of Git's CI builds. > > > > > > Since it is not actually important how many times Git prints the > > > warning/error message, as long as it prints it at least once, ... > > > > Sorry, but I do not quite see the value of keeping the test to > > expect success in a weakend form. If we think we are emitting three > > warnings, whether we do so by mistake or by design, and some of them > > are lost and not shown for an unknown reason, is there a guarantee > > that at least one would survive? And when all three are lost, even > > the test in the weakened form would fail and stop the CI builds, no? > > Without understanding the cause of the loss, I agree that things are a > little hand-wavy. But the assumption _does_ seem to hold that we > consistently produce at least one (presumably from the parent > clone/fetch/push process). And if we can rely on that, there's value in > the tests asserting that the message was shown to the user at least > once. Part of me wonders whether the local DNS-resolution issue you fixed in the first patch would be sufficient to get us to produce the wrong number of warnings consistently. I.e., if I queue just the first patch and drop Johannes's, would that be sufficient to get CI working consistently again? I don't know. It's frustrating to rely so much on an external environment that our feedback loop can only be as short as "push out some combination of these patches and wait for CI". That's disappointing, and TBH I would rather spend time focusing on other patches than play games with CI. The pair of patches look good to me. Perhaps we could avoid the weakened assumption, but I do not mind too much in the meantime. Especially since we already have a series[1] in the works that resolves the issue for good. Thanks, Taylor [1]: https://lore.kernel.org/git/RFC-patch-1.1-0266485bc6c-20221031T204149Z-avarab@gmail.com/
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh index bbe3d5721f..64c6c9f59e 100755 --- a/t/t5551-http-fetch-smart.sh +++ b/t/t5551-http-fetch-smart.sh @@ -597,17 +597,17 @@ test_expect_success 'clone warns or fails when using username:password' ' git -c transfer.credentialsInUrl=warn \ clone $url_userpass attempt2 2>err && grep "warning: $message" err >warnings && - test_line_count = 2 warnings && + test_line_count -ge 1 warnings && test_must_fail git -c transfer.credentialsInUrl=die \ clone $url_userpass attempt3 2>err && grep "fatal: $message" err >warnings && - test_line_count = 1 warnings && + test_line_count -ge 1 warnings && test_must_fail git -c transfer.credentialsInUrl=die \ clone $url_userblank attempt4 2>err && grep "fatal: $message" err >warnings && - test_line_count = 1 warnings + test_line_count -ge 1 warnings ' test_expect_success 'clone does not detect username:password when it is https://username@domain:port/' ' @@ -630,17 +630,17 @@ test_expect_success 'fetch warns or fails when using username:password' ' git -c transfer.credentialsInUrl=warn fetch $url_userpass 2>err && grep "warning: $message" err >warnings && - test_line_count = 3 warnings && + test_line_count -ge 1 warnings && test_must_fail git -c transfer.credentialsInUrl=die \ fetch $url_userpass 2>err && grep "fatal: $message" err >warnings && - test_line_count = 1 warnings && + test_line_count -ge 1 warnings && test_must_fail git -c transfer.credentialsInUrl=die \ fetch $url_userblank 2>err && grep "fatal: $message" err >warnings && - test_line_count = 1 warnings + test_line_count -ge 1 warnings ' @@ -654,7 +654,7 @@ test_expect_success 'push warns or fails when using username:password' ' test_must_fail git -c transfer.credentialsInUrl=die \ push $url_userpass 2>err && grep "fatal: $message" err >warnings && - test_line_count = 1 warnings + test_line_count -ge 1 warnings ' test_done