Message ID | f7bdad78219de6819d0403f8957e9a0c8b4218bc.1614123848.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Optimization batch 8: use file basenames even more | expand |
On 2/23/2021 6:44 PM, Elijah Newren via GitGitGadget wrote:
> + for (i=0; i<to_remove.nr; ++i)
nit: "i < to_remove.nr"
I'm really stretching to find something valuable to say about
these well-constructed patches.
-Stolee
On Wed, Feb 24 2021, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > [...] > MAYBE_UNUSED > -static void cleanup_dir_rename_info(struct dir_rename_info *info) > +static void cleanup_dir_rename_info(struct dir_rename_info *info, > + struct strset *dirs_removed, > + int keep_dir_rename_count) > { > + struct hashmap_iter iter; > + struct strmap_entry *entry; > + > if (!info->setup) > return; > > - partial_clear_dir_rename_count(info->dir_rename_count); > - strmap_clear(info->dir_rename_count, 1); > + if (!keep_dir_rename_count) { > + partial_clear_dir_rename_count(info->dir_rename_count); > + strmap_clear(info->dir_rename_count, 1); > + FREE_AND_NULL(info->dir_rename_count); > + } else { > + /* > + * Although dir_rename_count was passed in > + * diffcore_rename_extended() and we want to keep it around and > + * return it to that caller, we first want to remove any data > + * associated with directories that weren't renamed. > + */ > + struct string_list to_remove = STRING_LIST_INIT_NODUP; > + int i; > + > [...] I find the pattern in patch 02 and 03 and leading up to this 04/05 confusing to review. First we add a clear_dir_rename_count() in 02 but nothing uses it, then in 03 it's renamed to cleanup_dir_rename_info() and its code changed, but still nothing uses it. Here we're changing the function nothing uses, and then finally in 05 we make use of it, and the MAYBE_UNUSED attribute is removed. I appreciate trying to split these large and complex patches into more digestible pieces. I think that sometimes it's more readable to have a patch that adds a function and a subsequent one that uses it. But in this case where we've gone through stages of changing code that's never been used I think we're making it harder to read than not. I'd prefer just to see this cleanup_dir_rename_info() function pop into existence in 05. Just my 0.02. Style nit/preference: I think code like this is easier to read as: if (simple-case) { blah blah; return; } complex_case; Than not having the "return" and having most of the interesting logic in an indented "else" block. Or maybe just this on top of the whole thing (a -w diff, hopefully more readable, but still understandable): diff --git a/diffcore-rename.c b/diffcore-rename.c index 70a484b9b6..5a5c62ec79 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -609,11 +609,12 @@ void partial_clear_dir_rename_count(struct strmap *dir_rename_count) } static void cleanup_dir_rename_info(struct dir_rename_info *info, - struct strset *dirs_removed, - int keep_dir_rename_count) + struct strset *dirs_removed) { struct hashmap_iter iter; struct strmap_entry *entry; + struct string_list to_remove = STRING_LIST_INIT_NODUP; + int i; if (!info->setup) return; @@ -624,20 +625,12 @@ static void cleanup_dir_rename_info(struct dir_rename_info *info, /* dir_rename_guess */ strmap_clear(&info->dir_rename_guess, 1); - if (!keep_dir_rename_count) { - partial_clear_dir_rename_count(info->dir_rename_count); - strmap_clear(info->dir_rename_count, 1); - FREE_AND_NULL(info->dir_rename_count); - } else { /* * Although dir_rename_count was passed in * diffcore_rename_extended() and we want to keep it around and * return it to that caller, we first want to remove any data * associated with directories that weren't renamed. */ - struct string_list to_remove = STRING_LIST_INIT_NODUP; - int i; - strmap_for_each_entry(info->dir_rename_count, &iter, entry) { const char *source_dir = entry->key; struct strintmap *counts = entry->value; @@ -653,7 +646,6 @@ static void cleanup_dir_rename_info(struct dir_rename_info *info, to_remove.items[i].string, 1); string_list_clear(&to_remove, 0); } -} static const char *get_basename(const char *filename) { @@ -1317,7 +1309,13 @@ void diffcore_rename_extended(struct diff_options *options, if (rename_dst[i].filespec_to_free) free_filespec(rename_dst[i].filespec_to_free); - cleanup_dir_rename_info(&info, dirs_removed, dir_rename_count != NULL); + if (!dir_rename_count) { + cleanup_dir_rename_info(&info, dirs_removed); + } else { + partial_clear_dir_rename_count(info.dir_rename_count); + strmap_clear(info.dir_rename_count, 1); + FREE_AND_NULL(info.dir_rename_count); + } FREE_AND_NULL(rename_dst); rename_dst_nr = rename_dst_alloc = 0; FREE_AND_NULL(rename_src); I also wonder if that strmap_clear() wouldn't be better moved into partial_clear_dir_rename_count(): diff --git a/diffcore-rename.c b/diffcore-rename.c index 5a5c62ec790..5f6c5745d64 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -596,7 +596,8 @@ static void initialize_dir_rename_info(struct dir_rename_info *info, } } -void partial_clear_dir_rename_count(struct strmap *dir_rename_count) +void partial_clear_dir_rename_count(struct strmap *dir_rename_count, + int clear_strmap) { struct hashmap_iter iter; struct strmap_entry *entry; @@ -606,6 +607,9 @@ void partial_clear_dir_rename_count(struct strmap *dir_rename_count) strintmap_clear(counts); } strmap_partial_clear(dir_rename_count, 1); + if (clear_strmap) + strmap_clear(dir_rename_count, 1); + } static void cleanup_dir_rename_info(struct dir_rename_info *info, @@ -1312,8 +1316,7 @@ void diffcore_rename_extended(struct diff_options *options, if (!dir_rename_count) { cleanup_dir_rename_info(&info, dirs_removed); } else { - partial_clear_dir_rename_count(info.dir_rename_count); - strmap_clear(info.dir_rename_count, 1); + partial_clear_dir_rename_count(info.dir_rename_count, 1); FREE_AND_NULL(info.dir_rename_count); } FREE_AND_NULL(rename_dst); diff --git a/diffcore.h b/diffcore.h index c6ba64abd19..de8667bfa04 100644 --- a/diffcore.h +++ b/diffcore.h @@ -161,7 +161,8 @@ struct diff_filepair *diff_queue(struct diff_queue_struct *, struct diff_filespec *); void diff_q(struct diff_queue_struct *, struct diff_filepair *); -void partial_clear_dir_rename_count(struct strmap *dir_rename_count); +void partial_clear_dir_rename_count(struct strmap *dir_rename_count, + int clear_strmap); void diffcore_break(struct repository *, int); void diffcore_rename(struct diff_options *); diff --git a/merge-ort.c b/merge-ort.c index 467404cc0a3..0bbd49f0d78 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -353,9 +353,9 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti, for (i = MERGE_SIDE1; i <= MERGE_SIDE2; ++i) { strset_func(&renames->dirs_removed[i]); - partial_clear_dir_rename_count(&renames->dir_rename_count[i]); - if (!reinitialize) - strmap_clear(&renames->dir_rename_count[i], 1); + partial_clear_dir_rename_count(&renames->dir_rename_count[i], + !reinitialize); + free(&renames->dir_rename_count[i]); strmap_func(&renames->dir_renames[i], 0); }
On Thu, Feb 25 2021, Ævar Arnfjörð Bjarmason wrote: > - partial_clear_dir_rename_count(&renames->dir_rename_count[i]); > - if (!reinitialize) > - strmap_clear(&renames->dir_rename_count[i], 1); > + partial_clear_dir_rename_count(&renames->dir_rename_count[i], > + !reinitialize); > + free(&renames->dir_rename_count[i]); > > strmap_func(&renames->dir_renames[i], 0); > } That free() wasn't supposed to be in there, I was still experimenting with whether partial_clear_dir_rename_count() should also free() this itself. But now that I notice I left that in there and in the meantime ran all the tests, they all passed. So maybe this & the if/else in diffcore_rename_extended() can lose its braces by just calling either cleanup_dir_rename_info() or partial_clear_dir_rename_count(). I didn't look in any detail, and if the free() v.s. FREE_AND_NULL() distinction mattered here.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > But in this case where we've gone through stages of changing code that's > never been used I think we're making it harder to read than not. I'd > prefer just to see this cleanup_dir_rename_info() function pop into > existence in 05. I had a similar impression on parts of the earlier series where a new helper without users were given as a standalone patch. I found it a bit disorienting. > Style nit/preference: I think code like this is easier to read as: > > if (simple-case) { > blah > blah; > return; > } > complex_case; > > Than not having the "return" and having most of the interesting logic in > an indented "else" block. Or maybe just this on top of the whole thing > (a -w diff, hopefully more readable, but still understandable): Yes, that is also a good tip for a more readable patch, but that applies only for if/else at the end of the function. In general, formulating the condition so that the smaller body comes first for "if" and the larger one goes to the body of "else" would make the if/else easier to understand, as you can often hold the condition and smaller body just before "else" in your head, and after clearly understanding that part, it becomes easier to concentrate on the other side, i.e. "now we know what happens if the condition is true, what about the other case? Let's read on ...". Thanks.
diff --git a/diffcore-rename.c b/diffcore-rename.c index 7759c9a3a2ed..aa21d4e7175c 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -507,13 +507,45 @@ void partial_clear_dir_rename_count(struct strmap *dir_rename_count) } MAYBE_UNUSED -static void cleanup_dir_rename_info(struct dir_rename_info *info) +static void cleanup_dir_rename_info(struct dir_rename_info *info, + struct strset *dirs_removed, + int keep_dir_rename_count) { + struct hashmap_iter iter; + struct strmap_entry *entry; + if (!info->setup) return; - partial_clear_dir_rename_count(info->dir_rename_count); - strmap_clear(info->dir_rename_count, 1); + if (!keep_dir_rename_count) { + partial_clear_dir_rename_count(info->dir_rename_count); + strmap_clear(info->dir_rename_count, 1); + FREE_AND_NULL(info->dir_rename_count); + } else { + /* + * Although dir_rename_count was passed in + * diffcore_rename_extended() and we want to keep it around and + * return it to that caller, we first want to remove any data + * associated with directories that weren't renamed. + */ + struct string_list to_remove = STRING_LIST_INIT_NODUP; + int i; + + strmap_for_each_entry(info->dir_rename_count, &iter, entry) { + const char *source_dir = entry->key; + struct strintmap *counts = entry->value; + + if (!strset_contains(dirs_removed, source_dir)) { + string_list_append(&to_remove, source_dir); + strintmap_clear(counts); + continue; + } + } + for (i=0; i<to_remove.nr; ++i) + strmap_remove(info->dir_rename_count, + to_remove.items[i].string, 1); + string_list_clear(&to_remove, 0); + } } static const char *get_basename(const char *filename)