Message ID | 1544922308-740-1-git-send-email-eedahlgren@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] Simplify handling of setup_git_directory_gently() failure cases. | expand |
Hi Erin, On Sat, 15 Dec 2018, Erin Dahlgren wrote: > diff --git a/setup.c b/setup.c > index 1be5037..e1a9e17 100644 > --- a/setup.c > +++ b/setup.c > @@ -831,16 +831,6 @@ static const char *setup_bare_git_dir(struct strbuf *cwd, int offset, > return NULL; > } > > -static const char *setup_nongit(const char *cwd, int *nongit_ok) > -{ > - if (!nongit_ok) > - die(_("not a git repository (or any of the parent directories): %s"), DEFAULT_GIT_DIR_ENVIRONMENT); > - if (chdir(cwd)) > - die_errno(_("cannot come back to cwd")); > - *nongit_ok = 1; > - return NULL; > -} > - > static dev_t get_device_or_die(const char *path, const char *prefix, int prefix_len) > { > struct stat buf; > @@ -1097,18 +1087,20 @@ const char *setup_git_directory_gently(int *nongit_ok) > prefix = setup_bare_git_dir(&cwd, dir.len, &repo_fmt, nongit_ok); > break; > case GIT_DIR_HIT_CEILING: > - prefix = setup_nongit(cwd.buf, nongit_ok); > - break; > + if (!nongit_ok) > + die(_("not a git repository (or any of the parent directories): %s"), > + DEFAULT_GIT_DIR_ENVIRONMENT); I am terribly sorry to bother you about formatting issues (in my mind, it is quite an annoying thing that we still have no fully automatic way to format Git's source code according to Git's preferred coding style, but there you go...): this `DEFAULT_GIT_DIR_ENVIRONMENT` should be aligned with the first parameter of `die()`, i.e. + if (!nongit_ok) + die(_("not a git repository (or any of the parent directories): %s"), + DEFAULT_GIT_DIR_ENVIRONMENT); > + *nongit_ok = 1; > + strbuf_release(&dir); > + return NULL; > case GIT_DIR_HIT_MOUNT_POINT: > - if (nongit_ok) { > - *nongit_ok = 1; > - strbuf_release(&cwd); > - strbuf_release(&dir); > - return NULL; > - } > - die(_("not a git repository (or any parent up to mount point %s)\n" > - "Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."), > - dir.buf); > + if (!nongit_ok) > + die(_("not a git repository (or any parent up to mount point %s)\n" > + "Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."), > + dir.buf); Likewise, `dir.buf` should be aligned with the `_` two lines above. Otherwise I think this patch is good to go! Thank you, Johannes > + *nongit_ok = 1; > + strbuf_release(&dir); > + return NULL; > default: > BUG("unhandled setup_git_directory_1() result"); > } > -- > 2.7.4 > >
On Sat, Dec 15, 2018 at 05:05:08PM -0800, Erin Dahlgren wrote: > setup_git_directory_gently() expects two types of failures to > discover a git directory (e.g. .git/): > [...] When I read your subject line, I'll admit to having a funny feeling in the pit of my stomach. This setup code has historically been full of subtle corner cases, and I expected a very tricky review. However, your explanation was so well-written that it was a pleasure to read. :) > Before this change are two misleading additional behaviors: > > - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no > apparent reason. We never had the chance to change directories > up to this point so chdir(current cwd) is pointless. That makes sense. I think this is a holdover from when we used to walk backwards via chdir(), prior to ce9b8aab5d (setup_git_directory_1(): avoid changing global state, 2017-03-13). Back then, we needed to restore the state at this point, but now we don't. > - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer > of a static struct strbuf (cwd). This is unnecessary because the > struct is static so its buffer is always reachable. This is also > misleading because nowhere else in the function is this buffer > released. Makes sense. I do have one question, though: > case GIT_DIR_HIT_CEILING: > - prefix = setup_nongit(cwd.buf, nongit_ok); > - break; > + if (!nongit_ok) > + die(_("not a git repository (or any of the parent directories): %s"), >a + DEFAULT_GIT_DIR_ENVIRONMENT); > + *nongit_ok = 1; > + strbuf_release(&dir); > + return NULL; This case used to break out of the switch, and now returns early. So we do not execute the later code which clears GIT_PREFIX_ENVIRONMENT, and zeroes the fields in startup_info. Those probably don't matter in most cases, but I wonder if there are some obscure ones where it might. Might it make sense to make GIT_DIR_HIT_MOUNT_POINT more like GIT_DIR_HIT_CEILING currently is, rather than the other way around? I.e., something like: case GIT_DIR_HIT_CEILING: if (!nongit_ok) die(...); *nongit_ok = 1; prefix = NULL; break; case GIT_DIR_HIT_MOUNT_POINT: if (!nongit_ok) die(...); *nongit_ok = 1; prefix = NULL; break; ? -Peff
Sorry Johannes for the repeat message, I failed to send to all. On Tue, Dec 18, 2018 at 4:35 AM Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote: > > Hi Erin, > > On Sat, 15 Dec 2018, Erin Dahlgren wrote: > > > diff --git a/setup.c b/setup.c > > index 1be5037..e1a9e17 100644 > > --- a/setup.c > > +++ b/setup.c > > @@ -831,16 +831,6 @@ static const char *setup_bare_git_dir(struct strbuf *cwd, int offset, > > return NULL; > > } > > > > -static const char *setup_nongit(const char *cwd, int *nongit_ok) > > -{ > > - if (!nongit_ok) > > - die(_("not a git repository (or any of the parent directories): %s"), DEFAULT_GIT_DIR_ENVIRONMENT); > > - if (chdir(cwd)) > > - die_errno(_("cannot come back to cwd")); > > - *nongit_ok = 1; > > - return NULL; > > -} > > - > > static dev_t get_device_or_die(const char *path, const char *prefix, int prefix_len) > > { > > struct stat buf; > > @@ -1097,18 +1087,20 @@ const char *setup_git_directory_gently(int *nongit_ok) > > prefix = setup_bare_git_dir(&cwd, dir.len, &repo_fmt, nongit_ok); > > break; > > case GIT_DIR_HIT_CEILING: > > - prefix = setup_nongit(cwd.buf, nongit_ok); > > - break; > > + if (!nongit_ok) > > + die(_("not a git repository (or any of the parent directories): %s"), > > + DEFAULT_GIT_DIR_ENVIRONMENT); > > I am terribly sorry to bother you about formatting issues (in my mind, it > is quite an annoying thing that we still have no fully automatic way to > format Git's source code according to Git's preferred coding style, but > there you go...): this `DEFAULT_GIT_DIR_ENVIRONMENT` should be aligned > with the first parameter of `die()`, i.e. > > + if (!nongit_ok) > + die(_("not a git repository (or any of the parent directories): %s"), > + DEFAULT_GIT_DIR_ENVIRONMENT); > > > + *nongit_ok = 1; > > + strbuf_release(&dir); > > + return NULL; > > case GIT_DIR_HIT_MOUNT_POINT: > > - if (nongit_ok) { > > - *nongit_ok = 1; > > - strbuf_release(&cwd); > > - strbuf_release(&dir); > > - return NULL; > > - } > > - die(_("not a git repository (or any parent up to mount point %s)\n" > > - "Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."), > > - dir.buf); > > + if (!nongit_ok) > > + die(_("not a git repository (or any parent up to mount point %s)\n" > > + "Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."), > > + dir.buf); > > Likewise, `dir.buf` should be aligned with the `_` two lines above. Hi Johannes, No problem, I'll make those changes. I'd be interested to hear the arguments against a "fully automatic way to format Git's source code according to Git's preferred coding style". If there aren't any then I'd be willing to take a crack at it. Should we start a separate "discussion" thread? > > Otherwise I think this patch is good to go! > > Thank you, > Johannes > > > + *nongit_ok = 1; > > + strbuf_release(&dir); > > + return NULL; > > default: > > BUG("unhandled setup_git_directory_1() result"); > > } > > -- > > 2.7.4 > > > >
Hi Peff, On Tue, Dec 18, 2018 at 9:54 AM Jeff King <peff@peff.net> wrote: > > On Sat, Dec 15, 2018 at 05:05:08PM -0800, Erin Dahlgren wrote: > > > setup_git_directory_gently() expects two types of failures to > > discover a git directory (e.g. .git/): > > [...] > > When I read your subject line, I'll admit to having a funny feeling in > the pit of my stomach. This setup code has historically been full of > subtle corner cases, and I expected a very tricky review. > > However, your explanation was so well-written that it was a pleasure to > read. :) Thanks :) > > > Before this change are two misleading additional behaviors: > > > > - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no > > apparent reason. We never had the chance to change directories > > up to this point so chdir(current cwd) is pointless. > > That makes sense. I think this is a holdover from when we used to walk > backwards via chdir(), prior to ce9b8aab5d (setup_git_directory_1(): > avoid changing global state, 2017-03-13). Back then, we needed to > restore the state at this point, but now we don't. > > > - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer > > of a static struct strbuf (cwd). This is unnecessary because the > > struct is static so its buffer is always reachable. This is also > > misleading because nowhere else in the function is this buffer > > released. > > Makes sense. > > I do have one question, though: > > > case GIT_DIR_HIT_CEILING: > > - prefix = setup_nongit(cwd.buf, nongit_ok); > > - break; > > + if (!nongit_ok) > > + die(_("not a git repository (or any of the parent directories): %s"), > >a + DEFAULT_GIT_DIR_ENVIRONMENT); > > + *nongit_ok = 1; > > + strbuf_release(&dir); > > + return NULL; > > This case used to break out of the switch, and now returns early. > > So we do not execute the later code which clears GIT_PREFIX_ENVIRONMENT, > and zeroes the fields in startup_info. Those probably don't matter in > most cases, but I wonder if there are some obscure ones where it might. > > Might it make sense to make GIT_DIR_HIT_MOUNT_POINT more like > GIT_DIR_HIT_CEILING currently is, rather than the other way around? > I.e., something like: > > case GIT_DIR_HIT_CEILING: > if (!nongit_ok) > die(...); > *nongit_ok = 1; > prefix = NULL; > break; > case GIT_DIR_HIT_MOUNT_POINT: > if (!nongit_ok) > die(...); > *nongit_ok = 1; > prefix = NULL; > break; > > ? After some digging, there seems to be a documented guarantee that "`GIT_PREFIX` is set as returned by running 'git rev-parse --show-prefix'". See Documentation/config/alias.txt. If the environment variable GIT_PREFIX is already set to something and we don't clear it when prefix is NULL, then the two can be out of sync. So that's a problem with my patch and the current handling of GIT_DIR_HIT_MOUNT_POINT. I'm not sure how important of a guarantee it is, but we should respect what's documented. Side note: One of the secondary goals of my patch was to make it really obvious that neither the GIT_DIR_HIT_CEILING nor the GIT_DIR_HIT_MOUNT_POINT case can get us into the block protected by (startup_info->have_repository || getenv(GIT_DIR_ENVIRONMENT)). With your suggestion (and it's a fair one) I don't think that's feasible without more significant refactoring. Should I just settle with a comment that explains this? Thanks, Erin > > -Peff
On Tue, Dec 18, 2018 at 12:54:02PM -0800, Erin Dahlgren wrote: > > Might it make sense to make GIT_DIR_HIT_MOUNT_POINT more like > > GIT_DIR_HIT_CEILING currently is, rather than the other way around? > > I.e., something like: > > > > case GIT_DIR_HIT_CEILING: > > if (!nongit_ok) > > die(...); > > *nongit_ok = 1; > > prefix = NULL; > > break; > > case GIT_DIR_HIT_MOUNT_POINT: > > if (!nongit_ok) > > die(...); > > *nongit_ok = 1; > > prefix = NULL; > > break; > > > > ? > > After some digging, there seems to be a documented guarantee that > "`GIT_PREFIX` is set as returned by running 'git rev-parse > --show-prefix'". See Documentation/config/alias.txt. If the > environment variable GIT_PREFIX is already set to something and we > don't clear it when prefix is NULL, then the two can be out of sync. > So that's a problem with my patch and the current handling of > GIT_DIR_HIT_MOUNT_POINT. I'm not sure how important of a guarantee it > is, but we should respect what's documented. Yeah, agreed. Another benefit of avoiding the early return is that we hit the cleanup code at the end of the function. That saves us having to release "dir" here. I assume we don't have to care about "gitdir" in these cases, but hitting the cleanup code means we don't even have to think about it. Over in: https://public-inbox.org/git/20181219154853.GC14802@sigill.intra.peff.net/ I suggested adding more cleanup to that block, too (though I _think_ in these cases it would also be a noop, it's again nice to not have to worry about it). > Side note: One of the secondary goals of my patch was to make it > really obvious that neither the GIT_DIR_HIT_CEILING nor the > GIT_DIR_HIT_MOUNT_POINT case can get us into the block protected by > (startup_info->have_repository || getenv(GIT_DIR_ENVIRONMENT)). With > your suggestion (and it's a fair one) I don't think that's feasible > without more significant refactoring. Should I just settle with a > comment that explains this? Yeah, I think a comment would probably be sufficient. Though we could also perhaps make the two cases (i.e., we found a repo or not) more clear. Something like: if (!*nongit_ok || !*nongit_ok) { startup_info->have_repository = 1; startup_info->prefix = prefix; if (prefix) ...etc... } else { start_info->have_repository = 0; startup_info->prefix = NULL; setenv(GIT_PREFIX_ENVIRONMENT, "", 1); } That may introduce some slight repetition, but I think it's probably clearer to deal with the cases separately (and it saves earlier code from make-work like setting prefix=NULL when there's no repo). The condition would also be a lot less confusing if nongit_ok were flipped, but I think that would be a big refactor due to the way we pass it around. -Peff
Jeff King <peff@peff.net> writes: > On Tue, Dec 18, 2018 at 12:54:02PM -0800, Erin Dahlgren wrote: > ... >> GIT_DIR_HIT_MOUNT_POINT. I'm not sure how important of a guarantee it >> is, but we should respect what's documented. > > Yeah, agreed. > > Another benefit of avoiding the early return is that we hit the cleanup > code at the end of the function. That saves us having to release "dir" > here. I assume we don't have to care about "gitdir" in these cases, but > hitting the cleanup code means we don't even have to think about it. > > Over in: > > https://public-inbox.org/git/20181219154853.GC14802@sigill.intra.peff.net/ > > I suggested adding more cleanup to that block, too (though I _think_ in > these cases it would also be a noop, it's again nice to not have to > worry about it). > >> Side note: One of the secondary goals of my patch was to make it >> really obvious that neither the GIT_DIR_HIT_CEILING nor the >> GIT_DIR_HIT_MOUNT_POINT case can get us into the block protected by >> (startup_info->have_repository || getenv(GIT_DIR_ENVIRONMENT)). With >> your suggestion (and it's a fair one) I don't think that's feasible >> without more significant refactoring. Should I just settle with a >> comment that explains this? > > Yeah, I think a comment would probably be sufficient. Though we could > also perhaps make the two cases (i.e., we found a repo or not) more > clear. Something like: > > if (!*nongit_ok || !*nongit_ok) { WTH is (A || A)? > startup_info->have_repository = 1; > startup_info->prefix = prefix; > if (prefix) ...etc... > } else { > start_info->have_repository = 0; > startup_info->prefix = NULL; > setenv(GIT_PREFIX_ENVIRONMENT, "", 1); > } > > That may introduce some slight repetition, but I think it's probably > clearer to deal with the cases separately (and it saves earlier code > from make-work like setting prefix=NULL when there's no repo). Sounds good. Thanks both.
On Wed, Dec 26, 2018 at 02:22:39PM -0800, Junio C Hamano wrote: > >> Side note: One of the secondary goals of my patch was to make it > >> really obvious that neither the GIT_DIR_HIT_CEILING nor the > >> GIT_DIR_HIT_MOUNT_POINT case can get us into the block protected by > >> (startup_info->have_repository || getenv(GIT_DIR_ENVIRONMENT)). With > >> your suggestion (and it's a fair one) I don't think that's feasible > >> without more significant refactoring. Should I just settle with a > >> comment that explains this? > > > > Yeah, I think a comment would probably be sufficient. Though we could > > also perhaps make the two cases (i.e., we found a repo or not) more > > clear. Something like: > > > > if (!*nongit_ok || !*nongit_ok) { > > WTH is (A || A)? Heh, I should learn to cut and paste better. This should be: if (!nongit_ok || !*nongit_ok) (which comes from the current code). -Peff
On Thu, Dec 27, 2018 at 8:24 AM Jeff King <peff@peff.net> wrote: > > On Wed, Dec 26, 2018 at 02:22:39PM -0800, Junio C Hamano wrote: > > > >> Side note: One of the secondary goals of my patch was to make it > > >> really obvious that neither the GIT_DIR_HIT_CEILING nor the > > >> GIT_DIR_HIT_MOUNT_POINT case can get us into the block protected by > > >> (startup_info->have_repository || getenv(GIT_DIR_ENVIRONMENT)). With > > >> your suggestion (and it's a fair one) I don't think that's feasible > > >> without more significant refactoring. Should I just settle with a > > >> comment that explains this? > > > > > > Yeah, I think a comment would probably be sufficient. Though we could > > > also perhaps make the two cases (i.e., we found a repo or not) more > > > clear. Something like: > > > > > > if (!*nongit_ok || !*nongit_ok) { > > > > WTH is (A || A)? > > Heh, I should learn to cut and paste better. This should be: > > if (!nongit_ok || !*nongit_ok) > > (which comes from the current code). Yep, but I think we can benefit from De Morgan's law here, where: (!nongit_ok || !*nongit_ok) == (!(nongit_ok && *nongit_ok)) PATCH v3 (just sent) uses that transformation like this: if (nongit_ok && *nongit_ok) { ... startup_info->has_repository = 0; } else { // !nongit_ok || !*nongit_ok .. startup_info->has_repository = 1; } Because IMHO (nongit_ok && *nongit_ok) is easier to read and reason about. Added brief comments as well. Thanks. - Erin > > -Peff
On Thu, Dec 27, 2018 at 03:46:10PM -0800, Erin Dahlgren wrote: > > Heh, I should learn to cut and paste better. This should be: > > > > if (!nongit_ok || !*nongit_ok) > > > > (which comes from the current code). > > Yep, but I think we can benefit from De Morgan's law here, where: > > (!nongit_ok || !*nongit_ok) == (!(nongit_ok && *nongit_ok)) > > PATCH v3 (just sent) uses that transformation like this: > > if (nongit_ok && *nongit_ok) { > ... startup_info->has_repository = 0; > } else { > // !nongit_ok || !*nongit_ok > .. startup_info->has_repository = 1; > } > > Because IMHO (nongit_ok && *nongit_ok) is easier to read and reason > about. Added brief comments as well. Ah yes, that's much better. -Peff
diff --git a/setup.c b/setup.c index 1be5037..e1a9e17 100644 --- a/setup.c +++ b/setup.c @@ -831,16 +831,6 @@ static const char *setup_bare_git_dir(struct strbuf *cwd, int offset, return NULL; } -static const char *setup_nongit(const char *cwd, int *nongit_ok) -{ - if (!nongit_ok) - die(_("not a git repository (or any of the parent directories): %s"), DEFAULT_GIT_DIR_ENVIRONMENT); - if (chdir(cwd)) - die_errno(_("cannot come back to cwd")); - *nongit_ok = 1; - return NULL; -} - static dev_t get_device_or_die(const char *path, const char *prefix, int prefix_len) { struct stat buf; @@ -1097,18 +1087,20 @@ const char *setup_git_directory_gently(int *nongit_ok) prefix = setup_bare_git_dir(&cwd, dir.len, &repo_fmt, nongit_ok); break; case GIT_DIR_HIT_CEILING: - prefix = setup_nongit(cwd.buf, nongit_ok); - break; + if (!nongit_ok) + die(_("not a git repository (or any of the parent directories): %s"), + DEFAULT_GIT_DIR_ENVIRONMENT); + *nongit_ok = 1; + strbuf_release(&dir); + return NULL; case GIT_DIR_HIT_MOUNT_POINT: - if (nongit_ok) { - *nongit_ok = 1; - strbuf_release(&cwd); - strbuf_release(&dir); - return NULL; - } - die(_("not a git repository (or any parent up to mount point %s)\n" - "Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."), - dir.buf); + if (!nongit_ok) + die(_("not a git repository (or any parent up to mount point %s)\n" + "Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."), + dir.buf); + *nongit_ok = 1; + strbuf_release(&dir); + return NULL; default: BUG("unhandled setup_git_directory_1() result"); }