diff mbox series

[2/3] dir: make clear_directory() free all relevant memory

Message ID 068e097e22fa42b79e70b0346cc7460f1a3cbcff.1597561152.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Clean up some memory leaks in and around dir.c | expand

Commit Message

Linus Arver via GitGitGadget Aug. 16, 2020, 6:59 a.m. UTC
From: Elijah Newren <newren@gmail.com>

The calling convention for the dir API is supposed to end with a call to
clear_directory() to free up no longer needed memory.  However,
clear_directory() didn't free dir->entries or dir->ignored.  I believe
this was oversight, but a number of callers noticed memory leaks and
started free'ing these, but often somewhat haphazardly (sometimes
freeing the entries in the arrays, and sometimes only free'ing the
arrays themselves).  This suggests the callers weren't trying to make
sure any possible memory used might be free'd, but just the memory they
noticed their usecase definitely had allocated.  This also caused the
extra memory deallocations to be duplicated in many places.

Fix this mess by moving all the duplicated free'ing logic into
clear_directory().

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/clean.c |  6 +-----
 builtin/stash.c |  3 ---
 dir.c           | 12 ++++++++----
 dir.h           |  2 +-
 wt-status.c     |  4 ----
 5 files changed, 10 insertions(+), 17 deletions(-)

Comments

Jeff King Aug. 16, 2020, 8:54 a.m. UTC | #1
On Sun, Aug 16, 2020 at 06:59:10AM +0000, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
> 
> The calling convention for the dir API is supposed to end with a call to
> clear_directory() to free up no longer needed memory.  However,
> clear_directory() didn't free dir->entries or dir->ignored.  I believe
> this was oversight, but a number of callers noticed memory leaks and
> started free'ing these, but often somewhat haphazardly (sometimes
> freeing the entries in the arrays, and sometimes only free'ing the
> arrays themselves).  This suggests the callers weren't trying to make
> sure any possible memory used might be free'd, but just the memory they
> noticed their usecase definitely had allocated.  This also caused the
> extra memory deallocations to be duplicated in many places.
> 
> Fix this mess by moving all the duplicated free'ing logic into
> clear_directory().

Makes sense. I don't know the dir.c code very well, so my worry would be
that some caller really wanted the other fields left untouched. But
looking over the callers, it seems they're all clearing it before the
struct goes out of scope. It's possible that they could have created
other pointer references, but it seems unlikely (and I'd argue they
should stop doing that and make their own copies of the data). E.g.,
wt_status_collect_untracked() sticks names into a string_list, but it
sets strdup_strings to make its own copy, so it's good.

> @@ -3034,6 +3031,13 @@ void clear_directory(struct dir_struct *dir)
>  		free(group->pl);
>  	}
>  
> +	for (i = 0; i < dir->ignored_nr; i++)
> +		free(dir->ignored[i]);
> +	for (i = 0; i < dir->nr; i++)
> +		free(dir->entries[i]);
> +	free(dir->ignored);
> +	free(dir->entries);
> +
>  	stk = dir->exclude_stack;
>  	while (stk) {
>  		struct exclude_stack *prev = stk->prev;

In most of our "clear" functions, the struct is ready for use again
(i.e., fields are set back to the initialized state). I don't think any
caller cares at this point, but it may be worth doing while we are here
as a least-surprise thing. That would mean setting these pointers to
NULL, and probably a few others that you aren't touching here.

Perhaps even easier would be to add a dir_init() call at the end after
your next patch adds that function.

-Peff
Elijah Newren Aug. 17, 2020, 4:58 p.m. UTC | #2
On Sun, Aug 16, 2020 at 1:54 AM Jeff King <peff@peff.net> wrote:
>
> On Sun, Aug 16, 2020 at 06:59:10AM +0000, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > The calling convention for the dir API is supposed to end with a call to
> > clear_directory() to free up no longer needed memory.  However,
> > clear_directory() didn't free dir->entries or dir->ignored.  I believe
> > this was oversight, but a number of callers noticed memory leaks and
> > started free'ing these, but often somewhat haphazardly (sometimes
> > freeing the entries in the arrays, and sometimes only free'ing the
> > arrays themselves).  This suggests the callers weren't trying to make
> > sure any possible memory used might be free'd, but just the memory they
> > noticed their usecase definitely had allocated.  This also caused the
> > extra memory deallocations to be duplicated in many places.
> >
> > Fix this mess by moving all the duplicated free'ing logic into
> > clear_directory().
>
> Makes sense. I don't know the dir.c code very well, so my worry would be
> that some caller really wanted the other fields left untouched. But
> looking over the callers, it seems they're all clearing it before the
> struct goes out of scope. It's possible that they could have created
> other pointer references, but it seems unlikely (and I'd argue they
> should stop doing that and make their own copies of the data). E.g.,
> wt_status_collect_untracked() sticks names into a string_list, but it
> sets strdup_strings to make its own copy, so it's good.
>
> > @@ -3034,6 +3031,13 @@ void clear_directory(struct dir_struct *dir)
> >               free(group->pl);
> >       }
> >
> > +     for (i = 0; i < dir->ignored_nr; i++)
> > +             free(dir->ignored[i]);
> > +     for (i = 0; i < dir->nr; i++)
> > +             free(dir->entries[i]);
> > +     free(dir->ignored);
> > +     free(dir->entries);
> > +
> >       stk = dir->exclude_stack;
> >       while (stk) {
> >               struct exclude_stack *prev = stk->prev;
>
> In most of our "clear" functions, the struct is ready for use again
> (i.e., fields are set back to the initialized state). I don't think any
> caller cares at this point, but it may be worth doing while we are here
> as a least-surprise thing. That would mean setting these pointers to
> NULL, and probably a few others that you aren't touching here.
>
> Perhaps even easier would be to add a dir_init() call at the end after
> your next patch adds that function.

Make sense; I'll fix it up.
diff mbox series

Patch

diff --git a/builtin/clean.c b/builtin/clean.c
index 5a9c29a558..4ffe00dd7f 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -1021,11 +1021,7 @@  int cmd_clean(int argc, const char **argv, const char *prefix)
 		string_list_append(&del_list, rel);
 	}
 
-	for (i = 0; i < dir.nr; i++)
-		free(dir.entries[i]);
-
-	for (i = 0; i < dir.ignored_nr; i++)
-		free(dir.ignored[i]);
+	clear_directory(&dir);
 
 	if (interactive && del_list.nr > 0)
 		interactive_main_loop();
diff --git a/builtin/stash.c b/builtin/stash.c
index 10d87630cd..da48533d49 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -875,11 +875,8 @@  static int get_untracked_files(const struct pathspec *ps, int include_untracked,
 		strbuf_addstr(untracked_files, ent->name);
 		/* NUL-terminate: will be fed to update-index -z */
 		strbuf_addch(untracked_files, '\0');
-		free(ent);
 	}
 
-	free(dir.entries);
-	free(dir.ignored);
 	clear_directory(&dir);
 	return found;
 }
diff --git a/dir.c b/dir.c
index 08df469bf7..b136c037d9 100644
--- a/dir.c
+++ b/dir.c
@@ -3012,10 +3012,7 @@  int remove_path(const char *name)
 	return 0;
 }
 
-/*
- * Frees memory within dir which was allocated for exclude lists and
- * the exclude_stack.  Does not free dir itself.
- */
+/* Frees memory within dir which was allocated.  Does not free dir itself. */
 void clear_directory(struct dir_struct *dir)
 {
 	int i, j;
@@ -3034,6 +3031,13 @@  void clear_directory(struct dir_struct *dir)
 		free(group->pl);
 	}
 
+	for (i = 0; i < dir->ignored_nr; i++)
+		free(dir->ignored[i]);
+	for (i = 0; i < dir->nr; i++)
+		free(dir->entries[i]);
+	free(dir->ignored);
+	free(dir->entries);
+
 	stk = dir->exclude_stack;
 	while (stk) {
 		struct exclude_stack *prev = stk->prev;
diff --git a/dir.h b/dir.h
index 5855c065a6..7d76d0644f 100644
--- a/dir.h
+++ b/dir.h
@@ -36,7 +36,7 @@ 
  *
  * - Use `dir.entries[]`.
  *
- * - Call `clear_directory()` when none of the contained elements are no longer in use.
+ * - Call `clear_directory()` when the contained elements are no longer in use.
  *
  */
 
diff --git a/wt-status.c b/wt-status.c
index d75399085d..c00ea3e06a 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -724,18 +724,14 @@  static void wt_status_collect_untracked(struct wt_status *s)
 		struct dir_entry *ent = dir.entries[i];
 		if (index_name_is_other(istate, ent->name, ent->len))
 			string_list_insert(&s->untracked, ent->name);
-		free(ent);
 	}
 
 	for (i = 0; i < dir.ignored_nr; i++) {
 		struct dir_entry *ent = dir.ignored[i];
 		if (index_name_is_other(istate, ent->name, ent->len))
 			string_list_insert(&s->ignored, ent->name);
-		free(ent);
 	}
 
-	free(dir.entries);
-	free(dir.ignored);
 	clear_directory(&dir);
 
 	if (advice_status_u_option)