Message ID | 20200430160642.90096-1-carenas@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v9] credential-store: warn instead of fatal for bogus lines from store | expand |
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes: > With the added checks for invalid URLs in credentials, any locally > modified store files which might have empty lines or even comments > were reported[1] failing to parse as valid credentials. > > Instead of doing a hard check for credentials, do a soft one and > therefore avoid the reported fatal error. > > Warn the user indicating the filename and line number so any invalid > entries could be corrected but continue parsing until a match is > found or all valid credentials are processed. > > Make sure that the credential that we will use to match is complete by > confirming it has all fields set as expected by the updated rules. > > [1] https://stackoverflow.com/a/61420852/5005936 > > Reported-by: Dirk <dirk@ed4u.de> > Helped-by: Eric Sunshine <sunshine@sunshineco.com> > Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> > Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> > --- Looks good. > diff --git a/credential-store.c b/credential-store.c > index c010497cb2..b1f5b2dea6 100644 > --- a/credential-store.c > +++ b/credential-store.c > @@ -4,10 +4,20 @@ > #include "string-list.h" > #include "parse-options.h" > > +#define PARSE_VERBOSE 0x01 > + > static struct lock_file credential_lock; > > +static int valid_credential(struct credential *entry) > +{ > + return (entry->username && entry->password && > + entry->protocol && *entry->protocol && > + ((entry->host && *entry->host) || entry->path)); > +} OK. > static int parse_credential_file(const char *fn, > struct credential *c, > + int flags, > void (*match_cb)(struct credential *), > void (*other_cb)(struct strbuf *)) > { > @@ -15,6 +25,7 @@ static int parse_credential_file(const char *fn, > struct strbuf line = STRBUF_INIT; > struct credential entry = CREDENTIAL_INIT; > int found_credential = 0; > + int lineno = 0; > > fh = fopen(fn, "r"); > if (!fh) { > @@ -23,17 +34,24 @@ static int parse_credential_file(const char *fn, > return found_credential; > } > > - while (strbuf_getline_lf(&line, fh) != EOF) { > - credential_from_url(&entry, line.buf); > - if (entry.username && entry.password && > - credential_match(c, &entry)) { > + while (strbuf_getline(&line, fh) != EOF) { Hmph, I probably have missed some discussion that happened in the last few days, but from the use of _lf() in the original, I sense that it is very much deliberate that the file format is designed to be LF delimited records, *not* a text file in which each line is terminated with some end-of-line marker that is platform dependent. IOW, an explicit use of _lf() shouts, at least to me, that we want a sequence of LF terminated records even on Windows and Macs. So, I am not sure why this change is desirable. > + lineno++; > + > + if (credential_from_url_gently(&entry, line.buf, 1) || > + !valid_credential(&entry)) { OK, so we read a line, fed it to the parser, and if we had trouble parsing or the line did not have enough credential material, we discard and warn (when told to). > + if (flags & PARSE_VERBOSE) > + warning(_("%s:%d: ignoring invalid credential"), > + fn, lineno); > + } else if (credential_match(c, &entry)) { > found_credential = 1; > if (match_cb) { > match_cb(&entry); > break; > } > + continue; And if the credential material on a good line matches, we remember we saw a match. If there is match callback, we call it and leave the loop to return from the function. Otherwise we go to the next line. > } > + > + if (other_cb) > other_cb(&line); A malformed/underspecified line and an unmatched line is fed to the other callback. > } > > @@ -62,7 +80,7 @@ static void rewrite_credential_file(const char *fn, struct credential *c, > die_errno("unable to get credential storage lock"); > if (extra) > print_line(extra); > - parse_credential_file(fn, c, NULL, print_line); > + parse_credential_file(fn, c, 0, NULL, print_line); This helper function is called from two places. - In store_credential_file, we write the credential we were asked to store to a strbuf, and then call this function. The function first writes that new credential and then calls the parse helper without PARSE_VERBOSE. We do not give any match callback, and other callback is to just print the line read from the credential file. So, the final output would be the new credential, followed by the current contents of the credential file except for the records that match the one we are storing. We do this without warning about malformed entries at all. - In remove_credential, we go over multiple files and copy the lines in each of them, except the ones that match the credential we are given. We also do this without warning about malformed entries at all. > if (commit_lock_file(&credential_lock) < 0) > die_errno("unable to write credential store"); > } > @@ -139,7 +157,8 @@ static void lookup_credential(const struct string_list *fns, struct credential * > struct string_list_item *fn; > > for_each_string_list_item(fn, fns) > - if (parse_credential_file(fn->string, c, print_entry, NULL)) > + if (parse_credential_file(fn->string, c, PARSE_VERBOSE, > + print_entry, NULL)) > return; /* Found credential */ > } This is triggered by the "get" operation. We read until we hit the entry we are looking for, at that point the match-callback will print out "username=... / password=..." lines and we ignore the remainder of the file. While we look for the first matching credential, we do warn about malformed or underspecified lines that we are ignoring, but because we leave the loop once a matching line is found, bad lines that come after it won't be even looked at to be warned. Validating and warning about bad entries is *not* the main purpose of the "credential-store" program, so I fully agree with the design of the "get" part. I am not so sure about the other two operations (i.e. "store" and "erase") that do scan all the entries and has chance to warn about bad ones, though (note: I am not saying that we should parse verbosely---it is just that I do not know why you chose not to and I am not convinced that it is a good idea not to warn). The tests looked good. Thanks.
Junio C Hamano <gitster@pobox.com> writes: [part of the commit log message] >> Warn the user indicating the filename and line number so any invalid >> entries could be corrected but continue parsing until a match is >> found or all valid credentials are processed. We should say "Do so only during the 'get' operation; give up giving any warning for 'erase' or 'store' operation, as keeping track of the line number of the input file, while having to issue a warning that has the line number of the output file, is too hard for us" or something like that here. I suspect that it might not be "too hard", but we'd need to rename other_cb to a more specific name first and limit the possibility of repurposing parse_credential_file(). For example, other_cb can become "copy_cb", we declare that the function works in two (and no other) modes, one is to look-up (which is read-only so we report the input line number in our warning messges) and the other is to copy-out-with-filtering (which will leave a file that is different from the input, so we report the output line number). To support the copy-out-wiht-filtering mode, we pass the starting line number into the function (i.e. 'store' may store a new line, so the first line copied out to the file may be the second line in the output), and count the output lines there: if (other_cb) { lines++; other_cb(&line); } and of course we won't increment lines++ when we saw a match in the copy-out-with-filtering mode. In look-up mode (i.e. copy_cb == NULL), we count the input lines just like your code. But I suspect that it may not be worth doing. But if we decide not to do so, we should document why we chose (not) to in the commit log message. > Validating and warning about bad entries is *not* the main purpose > of the "credential-store" program, so I fully agree with the design > of the "get" part. I am not so sure about the other two operations > (i.e. "store" and "erase") that do scan all the entries and has > chance to warn about bad ones, though (note: I am not saying that we > should parse verbosely---it is just that I do not know why you chose > not to and I am not convinced that it is a good idea not to warn). I re-read the incremental log for v8 and found your explanation. Thanks.
On Thu, Apr 30, 2020 at 01:21:06PM -0700, Junio C Hamano wrote: > Carlo Marcelo Arenas Belón <carenas@gmail.com> writes: > > @@ -23,17 +34,24 @@ static int parse_credential_file(const char *fn, > > return found_credential; > > } > > > > - while (strbuf_getline_lf(&line, fh) != EOF) { > > - credential_from_url(&entry, line.buf); > > - if (entry.username && entry.password && > > - credential_match(c, &entry)) { > > + while (strbuf_getline(&line, fh) != EOF) { > > Hmph, I probably have missed some discussion that happened in the > last few days, but from the use of _lf() in the original, I sense > that it is very much deliberate that the file format is designed to > be LF delimited records, *not* a text file in which each line is > terminated with some end-of-line marker that is platform dependent. > IOW, an explicit use of _lf() shouts, at least to me, that we want a > sequence of LF terminated records even on Windows and Macs. apologize for the confusion, the only discussion was the one I had with myself late at night when I realized that specific corruption was not being detected, and might be related to issues that seemed to be common[1] the problem is that practically speaking, if users in Windows and Macs edited the file they "might" had saved lines with the wrong line ending (*) (part of the reason I added a warning about "encoding" to the documentation), and those are difficult to "see" as invalid. using the non _lf() version ensures any trailing '\r' character would get removed from the credential and allow us to use them, instead of silently failing. originally I added a explicit check for it, which I assume you would prefer but .. > So, I am not sure why this change is desirable. to give users immediate relief from what seems to be a commonly seen issue. IMHO after we get this released out and they see the updated documentation explaining the risks, and some learn how to fix their credential files ( and hopefully use an editor that lets them see the problem); we could then add also a detection for this edge case and go back to _lf() I also had to admit I might had overreacted, as I was testing before v8 with a very corrupted file and was seeing warnings twice on each operation and somehow even got myself into the original fatal, which I had to admit I can't replicate now after some needed rest. Carlo [1] https://github.com/desktop/desktop/issues/9597 (*) textedit, which comes with macOS doesn't even write an EOF record when creating files, for example, and that behaviour seems to be common for most other native editors but a credential line WITHOUT line ending works
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes: > the problem is that practically speaking, if users in Windows and Macs > edited the file they "might" had saved lines with the wrong line ending (*) > (part of the reason I added a warning about "encoding" to the documentation), > and those are difficult to "see" as invalid. > > using the non _lf() version ensures any trailing '\r' character would > get removed from the credential and allow us to use them, instead of > silently failing. You are forgetting why we are fixing credential-store, aren't you? It is primarily to help those who damaged their files by editing, and introducing cruft that cannot be parsed, from a stricter parsing introduced recently in 2.17.4 and above. Without the fix, they cannot operate with the store they already have. Now, think about those users who saved their file, after adding CR at the end of each line, but didn't do any other edit (like adding a blank line or "# comment"). It may have happened 3 years ago. What did they see from the system back then? It may have happened 3 minutes ago. What would they see with the stricter parser now? With or without the recent parser change, they would have seen that these lines no longer match the URLs they wanted to match, but the credential store does not die on them for malformed lines, no? In other words, the stricter parsing did nothing to these users. In fact, those users who added CR at the end of each line 3 years ago may have even depended on the disappearance of these entries for all these years. Maybe lines that record their ancient passwords for sites are still buried in the later parts of the file, with CR, but doing no harm because these lines do not match anything. These users may have changed their password since then and wrote new records with "credential store", and these new records are stored without CR at the end of the line, so they match the URLs. By using the non _lf() variant, you are suddenly resurrecting these old records that the users thought are already gone and have been causing no harm to them. Do we know that resurrecting these old records is a good thing to do? I don't. For example, once the user decides to "sort" the file (after all, we are talking about users who edit the file, so we cannot assume they won't do so), they would end up with duplicate records that record two passwords to the same site and they cannot tell which one is current, as you even lost the CR at the end of line that would have told you which ones are broken. In short, you wouldn't know what ramification it has by suddenly using non _lf() variant. And it has nothing to do with the fix we are trying to make. When fixing something, it is tempting to see if you can cover a bit more area with the same change. But you should make it a habit to stick to the absolute minimum first for safety. When contemplating to step out of the absolute minimum, you need to think twice to see if that extra "fix" really fixes, or if it potentially harms. And I am not convinced that turning CRLF into LF in this case is a good change. In any case, it certainly does not belong to the same commit as the one that fixes the fallout from stricter parser introduced in 2.17.4 and above. Thanks.
On Thu, Apr 30, 2020 at 6:40 PM Junio C Hamano <gitster@pobox.com> wrote: > > Carlo Marcelo Arenas Belón <carenas@gmail.com> writes: > > > the problem is that practically speaking, if users in Windows and Macs > > edited the file they "might" had saved lines with the wrong line ending (*) > > (part of the reason I added a warning about "encoding" to the documentation), > > and those are difficult to "see" as invalid. > > > > using the non _lf() version ensures any trailing '\r' character would > > get removed from the credential and allow us to use them, instead of > > silently failing. > > You are forgetting why we are fixing credential-store, aren't you? indeed, and thanks for the clarification. you are correct this was a bad idea and is better to warn them instead (even if only during get operations and for the lines that were processed) than my "solution". FWIW though, my change wasn't going to change the file on disk but only allow the line to be processed. gave up already on sneakily "fixing" corruption issues after Peff called me on it and yes, another version you will never see had a PARSER_TRIM flag added ;) Carlo PS. should we really do the warn even in store/erase operations as a followup or is the warning not useful as is, and probably then all operations should be quiet (as Jonathan suggested originally?) and we could do warn (and maybe fix) in a different change (maybe adding a fsck command of sorts)? > When fixing something, it is tempting to see if you can cover a bit > more area with the same change. But you should make it a habit to > stick to the absolute minimum first for safety. When contemplating > to step out of the absolute minimum, you need to think twice to see > if that extra "fix" really fixes, or if it potentially harms.
Carlo Arenas <carenas@gmail.com> writes: > PS. should we really do the warn even in store/erase operations as a > followup or is the warning not useful as is, and probably then all > operations should be quiet (as Jonathan suggested originally?) and we > could do warn (and maybe fix) in a different change (maybe adding a > fsck command of sorts)? Yeah, I think I like the "no warning, just ignore" much better. The implementation I suspect would become a lot simpler, right? Thanks for thinking one extra step.
On Thu, Apr 30, 2020 at 10:27:24PM -0700, Junio C Hamano wrote: > Carlo Arenas <carenas@gmail.com> writes: > > > PS. should we really do the warn even in store/erase operations as a > > followup or is the warning not useful as is, and probably then all > > operations should be quiet (as Jonathan suggested originally?) and we > > could do warn (and maybe fix) in a different change (maybe adding a > > fsck command of sorts)? > > Yeah, I think I like the "no warning, just ignore" much better. The > implementation I suspect would become a lot simpler, right? sure, it will be basically the version that Jonathan suggested[1] on top of yours and that I included in v4 but that was missing a SOB. Jonathan, would you want to submit it formally?, IMHO better if with base (from pu): 272281efcc (credential-store: document the file format a bit more, 2020-04-27) it might be still worth including some of the test cases, so I could do it instead too; but let me know what would be your preference otherwise, as they will need to be wrapped, probably better than on what became my confusing attempt to save everyone's time with the v4 series[2], and that was made obsolete by Junio's creation of jc/credential-store-file-format-doc. Carlo [1] https://lore.kernel.org/git/20200428052510.GA201501@google.com/ [2] https://lore.kernel.org/git/20200428104858.28573-1-carenas@gmail.com/ CC: trim to avoid spam, probably long overdue
Carlo Marcelo Arenas Belón <carenas@gmail.com> writes: > attempt to save everyone's time with the v4 series[2], and that was made > obsolete by Junio's creation of jc/credential-store-file-format-doc. I actually discarded that "file format only" patch after you included in your series. One fewer patch I have to keep track of is always better ;-) Thanks.
diff --git a/Documentation/git-credential-store.txt b/Documentation/git-credential-store.txt index 76b0798856..d5841cffad 100644 --- a/Documentation/git-credential-store.txt +++ b/Documentation/git-credential-store.txt @@ -95,8 +95,15 @@ https://user:pass@example.com ------------------------------ No other kinds of lines (e.g. empty lines or comment lines) are -allowed in the file, even though some may be silently ignored. Do -not view or edit the file with editors. +allowed in the file, even though historically the parser was very +lenient and some might had been silently ignored. + +Do not view or edit the file with editors as it could compromise the +validity of your credentials by sometimes subtle formatting issues, +like spaces, line wrapping or text encoding. + +An unparseable or otherwise invalid line is ignored, and a warning +message points out the problematic file and line number it appears in. When Git needs authentication for a particular URL context, credential-store will consider that context a pattern to match against diff --git a/credential-store.c b/credential-store.c index c010497cb2..b1f5b2dea6 100644 --- a/credential-store.c +++ b/credential-store.c @@ -4,10 +4,20 @@ #include "string-list.h" #include "parse-options.h" +#define PARSE_VERBOSE 0x01 + static struct lock_file credential_lock; +static int valid_credential(struct credential *entry) +{ + return (entry->username && entry->password && + entry->protocol && *entry->protocol && + ((entry->host && *entry->host) || entry->path)); +} + static int parse_credential_file(const char *fn, struct credential *c, + int flags, void (*match_cb)(struct credential *), void (*other_cb)(struct strbuf *)) { @@ -15,6 +25,7 @@ static int parse_credential_file(const char *fn, struct strbuf line = STRBUF_INIT; struct credential entry = CREDENTIAL_INIT; int found_credential = 0; + int lineno = 0; fh = fopen(fn, "r"); if (!fh) { @@ -23,17 +34,24 @@ static int parse_credential_file(const char *fn, return found_credential; } - while (strbuf_getline_lf(&line, fh) != EOF) { - credential_from_url(&entry, line.buf); - if (entry.username && entry.password && - credential_match(c, &entry)) { + while (strbuf_getline(&line, fh) != EOF) { + lineno++; + + if (credential_from_url_gently(&entry, line.buf, 1) || + !valid_credential(&entry)) { + if (flags & PARSE_VERBOSE) + warning(_("%s:%d: ignoring invalid credential"), + fn, lineno); + } else if (credential_match(c, &entry)) { found_credential = 1; if (match_cb) { match_cb(&entry); break; } + continue; } - else if (other_cb) + + if (other_cb) other_cb(&line); } @@ -62,7 +80,7 @@ static void rewrite_credential_file(const char *fn, struct credential *c, die_errno("unable to get credential storage lock"); if (extra) print_line(extra); - parse_credential_file(fn, c, NULL, print_line); + parse_credential_file(fn, c, 0, NULL, print_line); if (commit_lock_file(&credential_lock) < 0) die_errno("unable to write credential store"); } @@ -139,7 +157,8 @@ static void lookup_credential(const struct string_list *fns, struct credential * struct string_list_item *fn; for_each_string_list_item(fn, fns) - if (parse_credential_file(fn->string, c, print_entry, NULL)) + if (parse_credential_file(fn->string, c, PARSE_VERBOSE, + print_entry, NULL)) return; /* Found credential */ } diff --git a/t/t0302-credential-store.sh b/t/t0302-credential-store.sh index d6b54e8c65..05fd2785b9 100755 --- a/t/t0302-credential-store.sh +++ b/t/t0302-credential-store.sh @@ -107,7 +107,6 @@ test_expect_success 'store: if both xdg and home files exist, only store in home test_must_be_empty "$HOME/.config/git/credentials" ' - test_expect_success 'erase: erase matching credentials from both xdg and home files' ' echo "https://home-user:home-pass@example.com" >"$HOME/.git-credentials" && mkdir -p "$HOME/.config/git" && @@ -120,4 +119,93 @@ test_expect_success 'erase: erase matching credentials from both xdg and home fi test_must_be_empty "$HOME/.config/git/credentials" ' +test_expect_success 'get: credentials without scheme are invalid' ' + echo "://user:pass@example.com" >"$HOME/.git-credentials" && + cat >expect-stdout <<-\STDOUT && + protocol=https + host=example.com + username=askpass-username + password=askpass-password + STDOUT + test_config credential.helper store && + git credential fill <<-\EOF >stdout 2>stderr && + protocol=https + host=example.com + EOF + test_cmp expect-stdout stdout && + grep "askpass: Username for '\''https://example.com'\'':" stderr && + grep "askpass: Password for '\''https://askpass-username@example.com'\'':" stderr && + test_i18ngrep "ignoring invalid credential" stderr +' + +test_expect_success 'get: credentials without valid host/path are invalid' ' + echo "https://user:pass@" >"$HOME/.git-credentials" && + cat >expect-stdout <<-\STDOUT && + protocol=https + host=example.com + username=askpass-username + password=askpass-password + STDOUT + test_config credential.helper store && + git credential fill <<-\EOF >stdout 2>stderr && + protocol=https + host=example.com + EOF + test_cmp expect-stdout stdout && + grep "askpass: Username for '\''https://example.com'\'':" stderr && + grep "askpass: Password for '\''https://askpass-username@example.com'\'':" stderr && + test_i18ngrep "ignoring invalid credential" stderr +' + +test_expect_success 'get: credentials without username/password are invalid' ' + echo "https://pass@example.com" >"$HOME/.git-credentials" && + cat >expect-stdout <<-\STDOUT && + protocol=https + host=example.com + username=askpass-username + password=askpass-password + STDOUT + test_config credential.helper store && + git credential fill <<-\EOF >stdout 2>stderr && + protocol=https + host=example.com + EOF + test_cmp expect-stdout stdout && + grep "askpass: Username for '\''https://example.com'\'':" stderr && + grep "askpass: Password for '\''https://askpass-username@example.com'\'':" stderr && + test_i18ngrep "ignoring invalid credential" stderr +' + +test_expect_success 'get: store file can contain empty/bogus lines' ' + echo "" > "$HOME/.git-credentials" && + q_to_tab <<-\CONFIG >>"$HOME/.git-credentials" && + #comment + Q + https://user:pass@example.com + CONFIG + cat >expect-stdout <<-\STDOUT && + protocol=https + host=example.com + username=user + password=pass + STDOUT + test_config credential.helper store && + git credential fill <<-\EOF >stdout 2>stderr && + protocol=https + host=example.com + EOF + test_cmp expect-stdout stdout && + test_i18ngrep "ignoring invalid credential" stderr && + test_line_count = 3 stderr +' + +test_expect_success 'store: store succeeds silently in corrupted file' ' + echo "#comment" >"$HOME/.git-credentials" && + check approve store <<-\EOF && + url=https://user:pass@example.com + EOF + grep "https://user:pass@example.com" "$HOME/.git-credentials" && + test_must_be_empty stderr +' + test_done
With the added checks for invalid URLs in credentials, any locally modified store files which might have empty lines or even comments were reported[1] failing to parse as valid credentials. Instead of doing a hard check for credentials, do a soft one and therefore avoid the reported fatal error. Warn the user indicating the filename and line number so any invalid entries could be corrected but continue parsing until a match is found or all valid credentials are processed. Make sure that the credential that we will use to match is complete by confirming it has all fields set as expected by the updated rules. [1] https://stackoverflow.com/a/61420852/5005936 Reported-by: Dirk <dirk@ed4u.de> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> --- v9: * use strbuf_getline() instead to better handle files with CRLF v8: * only warn during get operations as otherwise line number might be incorrect v7: * check for protocol in helper as suggested by Junio v6: * get rid of redacter and only use line number for context in warning * make validation more strict to also catch incomplete credentials * reorder check as suggested by Junio v5: * q_to_tab this round, with a single echo to make sure empty line is covered, as that seems to be a popular issue * rebase on top of jc/credential-store-file-format-doc * implement a redacter for credentials to use on errors to avoid leaking passwords v4: * use credential_from_url_gently instead as shown by Jonathan * add documentation to clarify "comments" is not a supported feature v3: * avoid using q_to_cr as suggested by Peff * a more verbose commit message and slightly more complete documentation v2: * use a here-doc for clarity as suggested by Eric * improve commit message and include documentation Documentation/git-credential-store.txt | 11 +++- credential-store.c | 33 ++++++++-- t/t0302-credential-store.sh | 90 +++++++++++++++++++++++++- 3 files changed, 124 insertions(+), 10 deletions(-) base-commit: 272281efcc18fcedd248597b8859f343cae1c5a0