Message ID | 20250309153321.254844-1-ayu.chandekar@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | environment: move access to "core.attributesfile" into repo settings | expand |
On Sun, Mar 09, 2025 at 09:03:21PM +0530, Ayush Chandekar wrote: > Stop relying on global state to access the "core.attributesfile" > configuration. Instead, store the value in `struct repo_settings` and > retrieve it via `repo_settings_get_attributes_file_path()`. > > This prevents incorrect values from being used when a user or tool is > handling multiple repositories in the same process, each with different > attribute configurations. It also improves repository isolation and helps > progress towards libification by avoiding unnecessary global state. We typically switch the order around a bit in our commit messages: we first explain what the actual problem is, and then we say how we fix it. > diff --git a/attr.c b/attr.c > index 0bd2750528..aec4b42245 100644 > --- a/attr.c > +++ b/attr.c > @@ -879,12 +879,9 @@ const char *git_attr_system_file(void) > return system_wide; > } > > -const char *git_attr_global_file(void) > +const char *git_attr_global_file(struct repository *repo) > { > - if (!git_attributes_file) > - git_attributes_file = xdg_config_home("attributes"); > - > - return git_attributes_file; > + return repo_settings_get_attributesfile_path(repo); > } > > int git_attr_system_is_enabled(void) Hm. I wonder what the actual merit of this function is after the refactoring. Right now there isn't really any as it is a direct wrapper of `repo_settings_get_attributesfile_path()`. > diff --git a/attr.h b/attr.h > index a04a521092..c4f26b8f58 100644 > --- a/attr.h > +++ b/attr.h > @@ -213,11 +213,13 @@ void git_check_attr(struct index_state *istate, > const char *path, > struct attr_check *check); > > +struct repository; > + > /* > * Retrieve all attributes that apply to the specified path. > * check holds the attributes and their values. > */ > -void git_all_attrs(struct index_state *istate, > +void git_all_attrs(struct repository *repo, struct index_state *istate, > const char *path, struct attr_check *check); > > enum git_attr_direction { > @@ -233,7 +235,7 @@ void attr_start(void); > const char *git_attr_system_file(void); > > /* Return the global gitattributes file, if any. */ > -const char *git_attr_global_file(void); > +const char *git_attr_global_file(struct repository *repo); > > /* Return whether the system gitattributes file is enabled and should be used. */ > int git_attr_system_is_enabled(void); I think it would make sense to split out this change into a separate commit. The first commit would move the config into "repo-settings.c", the second commit would adapt functions and their callers as necessary. > @@ -283,4 +285,5 @@ struct match_attr { > struct match_attr *parse_attr_line(const char *line, const char *src, > int lineno, unsigned flags); > > + > #endif /* ATTR_H */ Extraneous newline. > diff --git a/builtin/check-attr.c b/builtin/check-attr.c > index 7cf275b893..1b8a89dfb2 100644 > --- a/builtin/check-attr.c > +++ b/builtin/check-attr.c > @@ -70,7 +70,7 @@ static void check_attr(const char *prefix, struct attr_check *check, > prefix_path(prefix, prefix ? strlen(prefix) : 0, file); > > if (collect_all) { > - git_all_attrs(the_repository->index, full_path, check); > + git_all_attrs(the_repository, the_repository->index, full_path, check); > } else { > git_check_attr(the_repository->index, full_path, check); > } > diff --git a/builtin/var.c b/builtin/var.c > index ada642a9fe..3d635c235e 100644 > --- a/builtin/var.c > +++ b/builtin/var.c > @@ -69,9 +69,9 @@ static char *git_attr_val_system(int ident_flag UNUSED) > return NULL; > } > > -static char *git_attr_val_global(int ident_flag UNUSED) > +static char *repo_git_attr_val_global(struct repository *repo, int ident_flag UNUSED) > { > - char *file = xstrdup_or_null(git_attr_global_file()); > + char *file = xstrdup_or_null(git_attr_global_file(repo)); > if (file) { > normalize_path_copy(file, file); > return file; > @@ -79,6 +79,11 @@ static char *git_attr_val_global(int ident_flag UNUSED) > return NULL; > } > > +static char *git_attr_val_global(int ident_flag) > +{ > + return repo_git_attr_val_global(the_repository, ident_flag); > +} > + > static char *git_config_val_system(int ident_flag UNUSED) > { > if (git_config_system()) { I think we should just retain `git_attr_val_global()` and plug in `the_repository`. The extra change here doesn't add anything, and "builtin/var.c" being a builtin means is not reused anywhere else, either. > diff --git a/repo-settings.c b/repo-settings.c > index 9d16d5399e..420ca72f5f 100644 > --- a/repo-settings.c > +++ b/repo-settings.c > @@ -167,3 +168,13 @@ int repo_settings_get_warn_ambiguous_refs(struct repository *repo) > &repo->settings.warn_ambiguous_refs, 1); > return repo->settings.warn_ambiguous_refs; > } > + > +const char *repo_settings_get_attributesfile_path(struct repository *repo) > +{ > + if (!repo->settings.git_attributes_file) { > + if (repo_config_get_pathname(repo, "core.attributesfile", &repo->settings.git_attributes_file)) { > + repo->settings.git_attributes_file = xdg_config_home("attributes"); > + } > + } We don't use curly braces around one-line statements. > + return repo->settings.git_attributes_file; > +} One thing I'm missing is the code to `free()` the allocated memory in `repo_settings_clear()`. > \ No newline at end of file Nit: missing newline at the end of the file. Thanks! Patrick
Hey, thanks for reviewing the patch! > We typically switch the order around a bit in our commit messages: we > first explain what the actual problem is, and then we say how we fix it. Got it. > Hm. I wonder what the actual merit of this function is after the > refactoring. Right now there isn't really any as it is a direct wrapper > of `repo_settings_get_attributesfile_path()`. I can remove the function and replace all the instances with `repo_settings_get_attributesfile_path()`. What do you think? > I think it would make sense to split out this change into a separate > commit. The first commit would move the config into "repo-settings.c", > the second commit would adapt functions and their callers as necessary. Alright. > Extraneous newline. Apologies. Will fix it. > I think we should just retain `git_attr_val_global()` and plug in > `the_repository`. The extra change here doesn't add anything, and > "builtin/var.c" being a builtin means is not reused anywhere else, > either. Makes sense. I will drop `repo_git_attr_val_globa()` and keep `git_attr_val_global()` with `the_repository`. > We don't use curly braces around one-line statements. Will fix it. > One thing I'm missing is the code to `free()` the allocated memory in > `repo_settings_clear()`. Oh right. > > \ No newline at end of file Got it. Thanks, Ayush:)
Patrick Steinhardt <ps@pks.im> writes: > On Sun, Mar 09, 2025 at 09:03:21PM +0530, Ayush Chandekar wrote: >> Stop relying on global state to access the "core.attributesfile" >> configuration. Instead, store the value in `struct repo_settings` and >> retrieve it via `repo_settings_get_attributes_file_path()`. >> >> This prevents incorrect values from being used when a user or tool is >> handling multiple repositories in the same process, each with different >> attribute configurations. It also improves repository isolation and helps >> progress towards libification by avoiding unnecessary global state. > > We typically switch the order around a bit in our commit messages: we > first explain what the actual problem is, and then we say how we fix it. A good suggestion. >> int git_attr_system_is_enabled(void) > > Hm. I wonder what the actual merit of this function is after the > refactoring. Right now there isn't really any as it is a direct wrapper > of `repo_settings_get_attributesfile_path()`. Another thing that felt awkward about this change is actually larger. The current attributes globals are built around the notion that the functions involved work on a single set of attributes at a time. Even in a single repository, when you are checking new contents into the object database and when you are checking objects out of the object database, you'd need to switch the direction manually, which means you always have two sets of attributes active that you can switch between (one is from the working tree and the other one is from the index, if I am recalling correctly). But step back and think. What does it mean to make them belong to a repository instance? Whose index and working tree does the attribute set that belongs to a repository that is not the_repository come from? So it seems to me that removing the reliance on globals is a good thing, and introducing some abstraction is a good step forward, but it is dubious if the abstracted "set of attributes" should be part of a repository instance. This is a bit similar to the_index situation, where we can have more than one in-core index instance for a given repository we are working with, an index_state knows its repository, but a repository instance only has a pointer to its primary index_state instance and does not know other index_state objects. The right way to deal with in-core index is to pass index_state, not repository, through the call chain for this reason, and I suspect that we are better off to make sure that we do the same for the attributes, i.e. pass pointer to an attribute set instance through the callchain, not a repository.
> Another thing that felt awkward about this change is actually > larger. The current attributes globals are built around the notion > that the functions involved work on a single set of attributes at a > time. Even in a single repository, when you are checking new contents > into the object database and when you are checking objects out of > the object database, you'd need to switch the direction manually, > which means you always have two sets of attributes active that you > can switch between (one is from the working tree and the other one > is from the index, if I am recalling correctly). > > But step back and think. What does it mean to make them belong to a > repository instance? Whose index and working tree does the attribute > set that belongs to a repository that is not the_repository come from? I'm trying my best to wrap my head around this. I definitely don't fully understand your review yet since I'm still quite new to the codebase. I get what you mean when you said which repo does it belong to. But in the long term, isn’t our goal to get rid of the_repository anyway? So at some point, wouldn't we need to either attach attributes to a repository or have the attribute set know about its repository? Thanks, Ayush:)
Ayush Chandekar <ayu.chandekar@gmail.com> writes: > But in the long term, isn’t our goal to get rid of the_repository anyway? > So at some point, wouldn't we need to either attach attributes to a > repository or have the attribute set know about its repository? My point is that it may not help further the cause of removing the assumption that certain operations only work on the_repository and not on an arbitrary "struct repository" instance, to muck with the attribute subsystem. If it turns out that attribute data should not belong to a repository instance, then it would not help to have the globals moved to members of "struct repository" and pass a repository instance down the code paths. Rather, it may turn out that we are better off passing a separate structure that is *NOT* a "struct repository" that represents the set(s) of attributes down the same code paths.
diff --git a/attr.c b/attr.c index 0bd2750528..aec4b42245 100644 --- a/attr.c +++ b/attr.c @@ -879,12 +879,9 @@ const char *git_attr_system_file(void) return system_wide; } -const char *git_attr_global_file(void) +const char *git_attr_global_file(struct repository *repo) { - if (!git_attributes_file) - git_attributes_file = xdg_config_home("attributes"); - - return git_attributes_file; + return repo_settings_get_attributesfile_path(repo); } int git_attr_system_is_enabled(void) @@ -906,7 +903,7 @@ static void push_stack(struct attr_stack **attr_stack_p, } } -static void bootstrap_attr_stack(struct index_state *istate, +static void bootstrap_attr_stack(struct repository *repo, struct index_state *istate, const struct object_id *tree_oid, struct attr_stack **stack) { @@ -927,8 +924,8 @@ static void bootstrap_attr_stack(struct index_state *istate, } /* home directory */ - if (git_attr_global_file()) { - e = read_attr_from_file(git_attr_global_file(), flags); + if (git_attr_global_file(repo)) { + e = read_attr_from_file(git_attr_global_file(repo), flags); push_stack(stack, e, NULL, 0); } @@ -946,7 +943,7 @@ static void bootstrap_attr_stack(struct index_state *istate, push_stack(stack, e, NULL, 0); } -static void prepare_attr_stack(struct index_state *istate, +static void prepare_attr_stack(struct repository *repo, struct index_state *istate, const struct object_id *tree_oid, const char *path, int dirlen, struct attr_stack **stack) @@ -969,7 +966,7 @@ static void prepare_attr_stack(struct index_state *istate, * .gitattributes in deeper directories to shallower ones, * and finally use the built-in set as the default. */ - bootstrap_attr_stack(istate, tree_oid, stack); + bootstrap_attr_stack(repo, istate, tree_oid, stack); /* * Pop the "info" one that is always at the top of the stack. @@ -1143,7 +1140,7 @@ static void determine_macros(struct all_attrs_item *all_attrs, * If check->check_nr is non-zero, only attributes in check[] are collected. * Otherwise all attributes are collected. */ -static void collect_some_attrs(struct index_state *istate, +static void collect_some_attrs(struct repository *repo, struct index_state *istate, const struct object_id *tree_oid, const char *path, struct attr_check *check) { @@ -1164,7 +1161,7 @@ static void collect_some_attrs(struct index_state *istate, dirlen = 0; } - prepare_attr_stack(istate, tree_oid, path, dirlen, &check->stack); + prepare_attr_stack(repo, istate, tree_oid, path, dirlen, &check->stack); all_attrs_init(&g_attr_hashmap, check); determine_macros(check->all_attrs, check->stack); @@ -1310,7 +1307,7 @@ void git_check_attr(struct index_state *istate, int i; const struct object_id *tree_oid = default_attr_source(); - collect_some_attrs(istate, tree_oid, path, check); + collect_some_attrs(the_repository, istate, tree_oid, path, check); for (i = 0; i < check->nr; i++) { unsigned int n = check->items[i].attr->attr_nr; @@ -1321,14 +1318,14 @@ void git_check_attr(struct index_state *istate, } } -void git_all_attrs(struct index_state *istate, +void git_all_attrs(struct repository *repo, struct index_state *istate, const char *path, struct attr_check *check) { int i; const struct object_id *tree_oid = default_attr_source(); attr_check_reset(check); - collect_some_attrs(istate, tree_oid, path, check); + collect_some_attrs(repo, istate, tree_oid, path, check); for (i = 0; i < check->all_attrs_nr; i++) { const char *name = check->all_attrs[i].attr->name; diff --git a/attr.h b/attr.h index a04a521092..c4f26b8f58 100644 --- a/attr.h +++ b/attr.h @@ -213,11 +213,13 @@ void git_check_attr(struct index_state *istate, const char *path, struct attr_check *check); +struct repository; + /* * Retrieve all attributes that apply to the specified path. * check holds the attributes and their values. */ -void git_all_attrs(struct index_state *istate, +void git_all_attrs(struct repository *repo, struct index_state *istate, const char *path, struct attr_check *check); enum git_attr_direction { @@ -233,7 +235,7 @@ void attr_start(void); const char *git_attr_system_file(void); /* Return the global gitattributes file, if any. */ -const char *git_attr_global_file(void); +const char *git_attr_global_file(struct repository *repo); /* Return whether the system gitattributes file is enabled and should be used. */ int git_attr_system_is_enabled(void); @@ -283,4 +285,5 @@ struct match_attr { struct match_attr *parse_attr_line(const char *line, const char *src, int lineno, unsigned flags); + #endif /* ATTR_H */ diff --git a/builtin/check-attr.c b/builtin/check-attr.c index 7cf275b893..1b8a89dfb2 100644 --- a/builtin/check-attr.c +++ b/builtin/check-attr.c @@ -70,7 +70,7 @@ static void check_attr(const char *prefix, struct attr_check *check, prefix_path(prefix, prefix ? strlen(prefix) : 0, file); if (collect_all) { - git_all_attrs(the_repository->index, full_path, check); + git_all_attrs(the_repository, the_repository->index, full_path, check); } else { git_check_attr(the_repository->index, full_path, check); } diff --git a/builtin/var.c b/builtin/var.c index ada642a9fe..3d635c235e 100644 --- a/builtin/var.c +++ b/builtin/var.c @@ -69,9 +69,9 @@ static char *git_attr_val_system(int ident_flag UNUSED) return NULL; } -static char *git_attr_val_global(int ident_flag UNUSED) +static char *repo_git_attr_val_global(struct repository *repo, int ident_flag UNUSED) { - char *file = xstrdup_or_null(git_attr_global_file()); + char *file = xstrdup_or_null(git_attr_global_file(repo)); if (file) { normalize_path_copy(file, file); return file; @@ -79,6 +79,11 @@ static char *git_attr_val_global(int ident_flag UNUSED) return NULL; } +static char *git_attr_val_global(int ident_flag) +{ + return repo_git_attr_val_global(the_repository, ident_flag); +} + static char *git_config_val_system(int ident_flag UNUSED) { if (git_config_system()) { diff --git a/config.c b/config.c index 36f76fafe5..d483f1418c 100644 --- a/config.c +++ b/config.c @@ -1432,11 +1432,6 @@ static int git_default_core_config(const char *var, const char *value, return 0; } - if (!strcmp(var, "core.attributesfile")) { - FREE_AND_NULL(git_attributes_file); - return git_config_pathname(&git_attributes_file, var, value); - } - if (!strcmp(var, "core.hookspath")) { FREE_AND_NULL(git_hooks_path); return git_config_pathname(&git_hooks_path, var, value); diff --git a/environment.c b/environment.c index e5b361bb5d..e1da5a69b7 100644 --- a/environment.c +++ b/environment.c @@ -42,7 +42,6 @@ char *git_commit_encoding; char *git_log_output_encoding; char *apply_default_whitespace; char *apply_default_ignorewhitespace; -char *git_attributes_file; char *git_hooks_path; int zlib_compression_level = Z_BEST_SPEED; int pack_compression_level = Z_DEFAULT_COMPRESSION; diff --git a/environment.h b/environment.h index 2f43340f0b..9dc6bc0f3f 100644 --- a/environment.h +++ b/environment.h @@ -159,7 +159,6 @@ extern int assume_unchanged; extern int warn_on_object_refname_ambiguity; extern char *apply_default_whitespace; extern char *apply_default_ignorewhitespace; -extern char *git_attributes_file; extern char *git_hooks_path; extern int zlib_compression_level; extern int pack_compression_level; diff --git a/repo-settings.c b/repo-settings.c index 9d16d5399e..420ca72f5f 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -4,6 +4,7 @@ #include "repository.h" #include "midx.h" #include "pack-objects.h" +#include "path.h" static void repo_cfg_bool(struct repository *r, const char *key, int *dest, int def) @@ -167,3 +168,13 @@ int repo_settings_get_warn_ambiguous_refs(struct repository *repo) &repo->settings.warn_ambiguous_refs, 1); return repo->settings.warn_ambiguous_refs; } + +const char *repo_settings_get_attributesfile_path(struct repository *repo) +{ + if (!repo->settings.git_attributes_file) { + if (repo_config_get_pathname(repo, "core.attributesfile", &repo->settings.git_attributes_file)) { + repo->settings.git_attributes_file = xdg_config_home("attributes"); + } + } + return repo->settings.git_attributes_file; +} \ No newline at end of file diff --git a/repo-settings.h b/repo-settings.h index 93ea0c3274..2a5ca5f07d 100644 --- a/repo-settings.h +++ b/repo-settings.h @@ -61,6 +61,7 @@ struct repo_settings { size_t delta_base_cache_limit; size_t packed_git_window_size; size_t packed_git_limit; + char *git_attributes_file; }; #define REPO_SETTINGS_INIT { \ .index_version = -1, \ @@ -78,5 +79,7 @@ void prepare_repo_settings(struct repository *r); enum log_refs_config repo_settings_get_log_all_ref_updates(struct repository *repo); /* Read the value for "core.warnAmbiguousRefs". */ int repo_settings_get_warn_ambiguous_refs(struct repository *repo); +/* Read the value for "core.attributesfile". */ +const char *repo_settings_get_attributesfile_path(struct repository *repo); #endif /* REPO_SETTINGS_H */
Stop relying on global state to access the "core.attributesfile" configuration. Instead, store the value in `struct repo_settings` and retrieve it via `repo_settings_get_attributes_file_path()`. This prevents incorrect values from being used when a user or tool is handling multiple repositories in the same process, each with different attribute configurations. It also improves repository isolation and helps progress towards libification by avoiding unnecessary global state. Signed-off-by: Ayush Chandekar <ayu.chandekar@gmail.com> --- attr.c | 27 ++++++++++++--------------- attr.h | 7 +++++-- builtin/check-attr.c | 2 +- builtin/var.c | 9 +++++++-- config.c | 5 ----- environment.c | 1 - environment.h | 1 - repo-settings.c | 11 +++++++++++ repo-settings.h | 3 +++ 9 files changed, 39 insertions(+), 27 deletions(-)