Message ID | pull.985.v4.git.1645974782256.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4] untracked-cache: support '--untracked-files=all' if configured | expand |
On Sun, Feb 27 2022, Tao Klerks via GitGitGadget wrote: > From: Tao Klerks <tao@klerks.biz> > [...] > The test suite passes, but as a n00b I do have questions: > > * Is there any additional validation that could/should be done to > confirm that "-uall" untracked data can be cached safely? I haven't given this any substantial review, sorry. > * It looks safe from following the code Yes, at a glance it looks safe to me. > * It seems from discussing briefly with Elijah Newren in the thread > above that thare are no obvious red flags > * Manual testing, explicitly and implicitly through months of use, > yields no issues > * Is it OK to check the repo configuration in the body of processing? > It seems to be a rare pattern Yes, it's not very common, but it's fine. > * Can anyone think of a way to test this change? Well, if we set "flags" to 0 in your new helper the existing tests will fail, so that's something at least. > -static void new_untracked_cache(struct index_state *istate) > +static unsigned configured_default_dir_flags(struct repository *repo) > +{ > + /* > + * This logic is coordinated with the setting of these flags in > + * wt-status.c#wt_status_collect_untracked(), and the evaluation > + * of the config setting in commit.c#git_status_config() > + */ > + char *status_untracked_files_config_value; > + int config_outcome = repo_config_get_string(repo, > + "status.showuntrackedfiles", > + &status_untracked_files_config_value); > + if (!config_outcome && !strcmp(status_untracked_files_config_value, "all")) { > + return 0; > + } else { > + /* > + * The default, if "all" is not set, is "normal" - leading us here. > + * If the value is "none" then it really doesn't matter. > + */ > + return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; > + } > +} > + > [...] In reviewing this I found the addition of very long lines & indentation made this a bit harder to read. I came up with the below which should be squashable on top, and perhaps makes this easier to read. I.e. the one caller that needs custom flags passes them, others pass -1 now. For throwaway "char *" variables we usually use "char *p", "char *val" or whatever. A "status_untracked_files_config_value" is a bit much for something function local whose body is <10 lines of code. And if you drop the "int config_outcome" + rename the variable the repo_config_get_string() fits on one line: diff --git a/dir.c b/dir.c index 57a7d42482f..22a27d7780f 100644 --- a/dir.c +++ b/dir.c @@ -2746,34 +2746,33 @@ static void set_untracked_ident(struct untracked_cache *uc) strbuf_addch(&uc->ident, 0); } -static unsigned configured_default_dir_flags(struct repository *repo) +static unsigned new_untracked_cache_flags(struct index_state *istate) { + struct repository *repo = istate->repo; + char *val; + /* * This logic is coordinated with the setting of these flags in * wt-status.c#wt_status_collect_untracked(), and the evaluation * of the config setting in commit.c#git_status_config() */ - char *status_untracked_files_config_value; - int config_outcome = repo_config_get_string(repo, - "status.showuntrackedfiles", - &status_untracked_files_config_value); - if (!config_outcome && !strcmp(status_untracked_files_config_value, "all")) { + if (!repo_config_get_string(repo, "status.showuntrackedfiles", &val) && + !strcmp(val, "all")) return 0; - } else { - /* - * The default, if "all" is not set, is "normal" - leading us here. - * If the value is "none" then it really doesn't matter. - */ - return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; - } + + /* + * The default, if "all" is not set, is "normal" - leading us here. + * If the value is "none" then it really doesn't matter. + */ + return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; } -static void new_untracked_cache(struct index_state *istate, unsigned flags) +static void new_untracked_cache(struct index_state *istate, int flags) { struct untracked_cache *uc = xcalloc(1, sizeof(*uc)); strbuf_init(&uc->ident, 100); uc->exclude_per_dir = ".gitignore"; - uc->dir_flags = flags; + uc->dir_flags = flags >= 0 ? flags : new_untracked_cache_flags(istate); set_untracked_ident(uc); istate->untracked = uc; istate->cache_changed |= UNTRACKED_CHANGED; @@ -2782,13 +2781,11 @@ static void new_untracked_cache(struct index_state *istate, unsigned flags) void add_untracked_cache(struct index_state *istate) { if (!istate->untracked) { - new_untracked_cache(istate, - configured_default_dir_flags(istate->repo)); + new_untracked_cache(istate, -1); } else { if (!ident_in_untracked(istate->untracked)) { free_untracked_cache(istate->untracked); - new_untracked_cache(istate, - configured_default_dir_flags(istate->repo)); + new_untracked_cache(istate, -1); } } } @@ -2866,7 +2863,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d * the current effective flags don't match the configured * flags. */ - if (dir->flags != configured_default_dir_flags(istate->repo)) + if (dir->flags != new_untracked_cache_flags(istate)) return NULL; /*
On Mon, Feb 28, 2022 at 3:08 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > In reviewing this I found the addition of very long lines & indentation > made this a bit harder to read. I came up with the below which should be > squashable on top, and perhaps makes this easier to read. > > I.e. the one caller that needs custom flags passes them, others pass -1 > now. > > For throwaway "char *" variables we usually use "char *p", "char *val" > or whatever. A "status_untracked_files_config_value" is a bit much for > something function local whose body is <10 lines of code. > > And if you drop the "int config_outcome" + rename the variable the > repo_config_get_string() fits on one line: > Thank you, this looks much cleaner! One question I have about this patch itself is why you changed the arg to new_untracked_cache_flags / configured_default_dir_flags from "repo" back to "istate" / whether that was intentional; I had understood Junio's recommendation that the function get the minimum it needed (the repo in this case) as very sensible... especially now that we need to pass in that state less often. Another question is whether I should add a comment in new_untracked_cache_flags explaining the new meaning of "-1" for the flags argument, or whether this is sufficiently standard in this project or in C generally that I should leave it implied. A third question is whether you have an opinion on the better approach to getting these changes merged: does it make more sense to argue that this is an improvement overall, and that the case of configured "all" and argument-provided "normal", which will now have worse performance, can and should be ignored, *or* will it be easier to get this reviewed/merged if the new behavior is made conditional upon a new configuration key? Last question is about protocol/workflow in this project: When I squash your proposal in, should I add an extra "Signed-Off-By: Ævar Arnfjörð Bjarmason <avarab@gmail.com>" line, or just include the changes with mine?? (I've read SubmittingPatches but don't really understand the whole signing off business; I do understand that as your changes improve mine directly, they should indeed be "squashed" in, rather than added as an extra commit)
diff --git a/dir.c b/dir.c index d91295f2bcd..57a7d42482f 100644 --- a/dir.c +++ b/dir.c @@ -2746,13 +2746,34 @@ static void set_untracked_ident(struct untracked_cache *uc) strbuf_addch(&uc->ident, 0); } -static void new_untracked_cache(struct index_state *istate) +static unsigned configured_default_dir_flags(struct repository *repo) +{ + /* + * This logic is coordinated with the setting of these flags in + * wt-status.c#wt_status_collect_untracked(), and the evaluation + * of the config setting in commit.c#git_status_config() + */ + char *status_untracked_files_config_value; + int config_outcome = repo_config_get_string(repo, + "status.showuntrackedfiles", + &status_untracked_files_config_value); + if (!config_outcome && !strcmp(status_untracked_files_config_value, "all")) { + return 0; + } else { + /* + * The default, if "all" is not set, is "normal" - leading us here. + * If the value is "none" then it really doesn't matter. + */ + return DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; + } +} + +static void new_untracked_cache(struct index_state *istate, unsigned flags) { struct untracked_cache *uc = xcalloc(1, sizeof(*uc)); strbuf_init(&uc->ident, 100); uc->exclude_per_dir = ".gitignore"; - /* should be the same flags used by git-status */ - uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES; + uc->dir_flags = flags; set_untracked_ident(uc); istate->untracked = uc; istate->cache_changed |= UNTRACKED_CHANGED; @@ -2761,11 +2782,13 @@ static void new_untracked_cache(struct index_state *istate) void add_untracked_cache(struct index_state *istate) { if (!istate->untracked) { - new_untracked_cache(istate); + new_untracked_cache(istate, + configured_default_dir_flags(istate->repo)); } else { if (!ident_in_untracked(istate->untracked)) { free_untracked_cache(istate->untracked); - new_untracked_cache(istate); + new_untracked_cache(istate, + configured_default_dir_flags(istate->repo)); } } } @@ -2780,8 +2803,9 @@ void remove_untracked_cache(struct index_state *istate) } static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *dir, - int base_len, - const struct pathspec *pathspec) + int base_len, + const struct pathspec *pathspec, + struct index_state *istate) { struct untracked_cache_dir *root; static int untracked_cache_disabled = -1; @@ -2812,17 +2836,9 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d if (base_len || (pathspec && pathspec->nr)) return NULL; - /* Different set of flags may produce different results */ - if (dir->flags != dir->untracked->dir_flags || - /* - * See treat_directory(), case index_nonexistent. Without - * this flag, we may need to also cache .git file content - * for the resolve_gitlink_ref() call, which we don't. - */ - !(dir->flags & DIR_SHOW_OTHER_DIRECTORIES) || - /* We don't support collecting ignore files */ - (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO | - DIR_COLLECT_IGNORED))) + /* We don't support collecting ignore files */ + if (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO | + DIR_COLLECT_IGNORED)) return NULL; /* @@ -2845,6 +2861,41 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d return NULL; } + /* + * We don't support using or preparing the untracked cache if + * the current effective flags don't match the configured + * flags. + */ + if (dir->flags != configured_default_dir_flags(istate->repo)) + return NULL; + + /* + * If the untracked structure we received does not have the same flags + * as configured, but the configured flags do match the effective flags, + * then we need to reset / create a new "untracked" structure to match + * the new config. + * Keeping the saved and used untracked cache in-line with the + * configuration provides an opportunity for frequent users of + * "git status -uall" to leverage the untracked cache by aligning their + * configuration (setting "status.showuntrackedfiles" to "all" or + * "normal" as appropriate), where previously this option was + * incompatible with untracked cache and *consistently* caused + * surprisingly bad performance (with fscache and fsmonitor enabled) on + * Windows. + * + * IMPROVEMENT OPPORTUNITY: If we reworked the untracked cache storage + * to not be as bound up with the desired output in a given run, + * and instead iterated through and stored enough information to + * correctly serve both "modes", then users could get peak performance + * with or without '-uall' regardless of their + * "status.showuntrackedfiles" config. + */ + if (dir->flags != dir->untracked->dir_flags) { + free_untracked_cache(istate->untracked); + new_untracked_cache(istate, dir->flags); + dir->untracked = istate->untracked; + } + if (!dir->untracked->root) FLEX_ALLOC_STR(dir->untracked->root, name, ""); @@ -2916,7 +2967,7 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, return dir->nr; } - untracked = validate_untracked_cache(dir, len, pathspec); + untracked = validate_untracked_cache(dir, len, pathspec, istate); if (!untracked) /* * make sure untracked cache code path is disabled,