Message ID | c7f0977cabd4ba7311b8045bc57e9e30198651fd.1635288599.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | fetch-pack: redact packfile urls in traces | expand |
On Tue, Oct 26 2021, Ivan Frade via GitGitGadget wrote: > From: Ivan Frade <ifrade@google.com> > > http-fetch prints the URL after failing to fetch it. This can be > confusing to users (they cannot really do anything with it) but even > worse, they can share by accident a sensitive URL (e.g. with > credentials) while looking for help. > > Redact the URL unless the GIT_TRACE_REDACT variable is set to false. This > mimics the redaction of other sensitive information in git, like the > Authorization header in HTTP. > > Signed-off-by: Ivan Frade <ifrade@google.com> > --- > http-fetch.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/http-fetch.c b/http-fetch.c > index fa642462a9e..bbe09a6ad9f 100644 > --- a/http-fetch.c > +++ b/http-fetch.c > @@ -4,6 +4,7 @@ > #include "http.h" > #include "walker.h" > #include "strvec.h" > +#include "urlmatch.h" > > static const char http_fetch_usage[] = "git http-fetch " > "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url"; > @@ -63,8 +64,18 @@ static void fetch_single_packfile(struct object_id *packfile_hash, > if (start_active_slot(preq->slot)) { > run_active_slot(preq->slot); > if (results.curl_result != CURLE_OK) { > - die("Unable to get pack file %s\n%s", preq->url, > - curl_errorstr); > + struct url_info url; > + char *nurl = url_normalize(preq->url, &url); > + if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) { > + die("Unable to get pack file %s\n%s", preq->url, > + curl_errorstr); small nit: arrange if's from "if (cheap || expensive)", i.e. no need for getenv() if !nurl, but maybe compilers are smart enough for that... nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so: die("unable to get pack '%s': '%s'", ...) Or maybe without the second '%s', as in 3e8084f1884 (http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which I authored, but just copy/pasted the convention in the surrounding code)> > + } else { > + char *schema = xstrndup(url.url, url.scheme_len); > + char *host = xstrndup(&url.url[url.host_off], url.host_len); > + die("failed to get '%s' url from '%s' " > + "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s", > + schema, host, curl_errorstr); Hrm, I haven't tested, but aren't both of those xstrndup's redundant to using %*s instead of %s for the printf format? I.e.: die("failed to get '%*s'[...]", url.schema_len, url.url, )
On Thu, Oct 28, 2021 at 12:46 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > On Tue, Oct 26 2021, Ivan Frade via GitGitGadget wrote: > > if (results.curl_result != CURLE_OK) { > > - die("Unable to get pack file %s\n%s", preq->url, > > - curl_errorstr); > > + struct url_info url; > > + char *nurl = url_normalize(preq->url, &url); > > + if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) { > > + die("Unable to get pack file %s\n%s", preq->url, > > + curl_errorstr); > > small nit: arrange if's from "if (cheap || expensive)", i.e. no need for > getenv() if !nurl, but maybe compilers are smart enough for that... I had the same passing thought when glancing over this code (although this appears to be an error patch, thus not performance critical, so not terribly important). > nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so: > > die("unable to get pack '%s': '%s'", ...) > > Or maybe without the second '%s', as in 3e8084f1884 (http: check > CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which > I authored, but just copy/pasted the convention in the surrounding > code)> Note that this is not a new die() call; it just got indented as-is by this patch, so the changes you suggest to the message string are potentially outside the scope of this patch. Possibilities: (1) make the changes in this patch but mention them in the commit message; (2) make the changes in a preparatory patch; (3) punt on the changes for now. > > + } else { > > + char *schema = xstrndup(url.url, url.scheme_len); > > + char *host = xstrndup(&url.url[url.host_off], url.host_len); > > + die("failed to get '%s' url from '%s' " > > + "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s", > > + schema, host, curl_errorstr); > > Hrm, I haven't tested, but aren't both of those xstrndup's redundant to > using %*s instead of %s for the printf format? I.e.: > > die("failed to get '%*s'[...]", url.schema_len, url.url, ) I wondered the same when reading the patch. Thanks for mentioning it.
On Thu, Oct 28, 2021 at 9:46 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > > On Tue, Oct 26 2021, Ivan Frade via GitGitGadget wrote: > > > From: Ivan Frade <ifrade@google.com> > > ... > > + if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) { > > + die("Unable to get pack file %s\n%s", preq->url, > > + curl_errorstr); > > small nit: arrange if's from "if (cheap || expensive)", i.e. no need for > getenv() if !nurl, but maybe compilers are smart enough for that... Done > nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so: > > die("unable to get pack '%s': '%s'", ...) > > Or maybe without the second '%s', as in 3e8084f1884 (http: check > CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which > I authored, but just copy/pasted the convention in the surrounding > code)> Done > > + } else { > > + char *schema = xstrndup(url.url, url.scheme_len); > > + char *host = xstrndup(&url.url[url.host_off], url.host_len); > > + die("failed to get '%s' url from '%s' " > > + "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s", > > + schema, host, curl_errorstr); > > Hrm, I haven't tested, but aren't both of those xstrndup's redundant to > using %*s instead of %s for the printf format? I.e.: > > die("failed to get '%*s'[...]", url.schema_len, url.url, ) Indeed, "%.*s" did the trick. Thanks! Ivan
On Thu, Oct 28, 2021 at 10:26 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Thu, Oct 28, 2021 at 12:46 PM Ævar Arnfjörð Bjarmason > <avarab@gmail.com> wrote: > > > nit: die() messages should start with lower-case (in CodingGuidelines), and I think it's better to quote both, so: > > > > die("unable to get pack '%s': '%s'", ...) > > > > Or maybe without the second '%s', as in 3e8084f1884 (http: check > > CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors, 2021-09-24) (which > > I authored, but just copy/pasted the convention in the surrounding > > code)> > > Note that this is not a new die() call; it just got indented as-is by > this patch, so the changes you suggest to the message string are > potentially outside the scope of this patch. Possibilities: (1) make > the changes in this patch but mention them in the commit message; (2) > make the changes in a preparatory patch; (3) punt on the changes for > now. I went for option (1). It is a minimal change and we are moving that line in this commit. Ivan
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >> + if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) { >> + die("Unable to get pack file %s\n%s", preq->url, >> + curl_errorstr); > > small nit: arrange if's from "if (cheap || expensive)", i.e. no need for > getenv() if !nurl, but maybe compilers are smart enough for that... They typically do not see what happens inside git_env_bool() while compling this compilation unit, and cannot tell if the programmer wanted to call it first for its side effects, hence they cannot swap them safely.
On Fri, Oct 29 2021, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >>> + if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) { >>> + die("Unable to get pack file %s\n%s", preq->url, >>> + curl_errorstr); >> >> small nit: arrange if's from "if (cheap || expensive)", i.e. no need for >> getenv() if !nurl, but maybe compilers are smart enough for that... > > They typically do not see what happens inside git_env_bool() while > compling this compilation unit, and cannot tell if the programmer > wanted to call it first for its side effects, hence they cannot > swap them safely. *nod*, but since that function is just: int git_env_bool(const char *k, int def) { const char *v = getenv(k); return v ? git_config_bool(k, v) : def; } I was hedging and pondering if some compilers were smart enough these days to optimize things like that. I.e. in this case getenv() is a simple C library function, the env variable is constant, and we do a boolean test of it before calling git_config_bool(). So a sufficiently smart compiler could turn that into: /* global, probably something iterated over env already */ static int __have_seen_GIT_TRACE_REDACT = 0; ... if ((!__have_seen_GIT_TRACE_REDACT || !nurl) || (__have_seen_GIT_TRACE_REDACT && git_env_bool_without_v_bool_check(...))) But probably not, since it wolud need quite a bit of C library cooperation/hooks...
diff --git a/http-fetch.c b/http-fetch.c index fa642462a9e..bbe09a6ad9f 100644 --- a/http-fetch.c +++ b/http-fetch.c @@ -4,6 +4,7 @@ #include "http.h" #include "walker.h" #include "strvec.h" +#include "urlmatch.h" static const char http_fetch_usage[] = "git http-fetch " "[-c] [-t] [-a] [-v] [--recover] [-w ref] [--stdin | --packfile=hash | commit-id] url"; @@ -63,8 +64,18 @@ static void fetch_single_packfile(struct object_id *packfile_hash, if (start_active_slot(preq->slot)) { run_active_slot(preq->slot); if (results.curl_result != CURLE_OK) { - die("Unable to get pack file %s\n%s", preq->url, - curl_errorstr); + struct url_info url; + char *nurl = url_normalize(preq->url, &url); + if (!git_env_bool("GIT_TRACE_REDACT", 1) || !nurl) { + die("Unable to get pack file %s\n%s", preq->url, + curl_errorstr); + } else { + char *schema = xstrndup(url.url, url.scheme_len); + char *host = xstrndup(&url.url[url.host_off], url.host_len); + die("failed to get '%s' url from '%s' " + "(full URL redacted due to GIT_TRACE_REDACT setting)\n%s", + schema, host, curl_errorstr); + } } } else { die("Unable to start request");