diff mbox series

[3/3] setup: add `clear_repository_format()`

Message ID 20181218072528.3870492-4-martin.agren@gmail.com (mailing list archive)
State New, archived
Series setup: add `clear_repository_format()`

Commit Message

Martin Ågren Dec. 18, 2018, 7:25 a.m. UTC
After we set up a `struct repository_format`, it owns various pieces of
allocated memory. We then either use those members, because we decide we
want to use the "candidate" repository format, or we discard the
candidate / scratch space. In the first case, we transfer ownership of
the memory to a few global variables. In the latter case, we just
silently drop the struct and end up leaking memory.

Introduce a function `clear_repository_format()` which frees the memory
the struct holds on to. Call it in the code paths where we currently
leak the memory. Also call it in the error path of
`read_repository_format()` to clean up any partial result.

For hygiene, we need to at least set the pointers that we free to NULL.
For future-proofing, let's zero the entire struct instead. It just means
that in the error path of `read_...()` we need to restore the error
sentinel in the `version` field.
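
As a rough illustration of the last point, here is a minimal sketch of an
error path that clears a partial result and then restores the sentinel.
The struct `fmt_sketch`, the helper `copy_string()` and the `fail` flag
are hypothetical stand-ins, not git's real definitions; only the
clear-then-restore order mirrors what this patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for git's struct repository_format. */
struct fmt_sketch {
	int version;     /* -1 is the error sentinel */
	char *work_tree; /* heap-allocated, owned by the struct */
};

/* Free the owned memory, then zero the whole struct. */
static void clear_fmt_sketch(struct fmt_sketch *f)
{
	free(f->work_tree);
	memset(f, 0, sizeof(*f));
}

/* Hypothetical helper: heap-allocated copy of `s`, or NULL on failure. */
static char *copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);
	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Hypothetical reader. `fail` simulates a config file that stored a
 * partial result but never produced a valid version.
 */
static void read_fmt_sketch(struct fmt_sketch *f, int fail)
{
	f->version = -1;
	f->work_tree = copy_string("/tmp/wt");
	if (!fail)
		f->version = 0;
	if (f->version == -1) {
		clear_fmt_sketch(f); /* zeroes `version` as well ... */
		f->version = -1;     /* ... so put the sentinel back */
	}
}
```

After a failed read, a caller of this sketch sees `version == -1` and a
NULL `work_tree`, so nothing is left to free.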

We could take this opportunity to stop claiming that all fields except
`version` are undefined in case of an error. On the other hand, having
them defined as zero is not much better than having them undefined. We
could define them to some fallback configuration (`is_bare = -1` and
`hash_algo = GIT_HASH_SHA1`?), but "clear()" and/or "read()" seem like
the wrong places to enforce fallback configurations. Let's leave things
as "undefined" instead to encourage users to check `version`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 The error state can always be defined later. Defining it now, then
 trying to backpedal, is probably not so fun. Filling the struct with
 non-zero values might help flush out bugs like the one fixed in the
 previous patch, but I'm wary of going that far in this patch.

 cache.h      |  6 ++++++
 repository.c |  1 +
 setup.c      | 14 ++++++++++++++
 3 files changed, 21 insertions(+)

Comments

Jeff King Dec. 19, 2018, 3:48 p.m. UTC | #1
On Tue, Dec 18, 2018 at 08:25:28AM +0100, Martin Ågren wrote:

> After we set up a `struct repository_format`, it owns various pieces of
> allocated memory. We then either use those members, because we decide we
> want to use the "candidate" repository format, or we discard the
> candidate / scratch space. In the first case, we transfer ownership of
> the memory to a few global variables. In the latter case, we just
> silently drop the struct and end up leaking memory.
> 
> Introduce a function `clear_repository_format()` which frees the memory
> the struct holds on to. Call it in the code paths where we currently
> leak the memory. Also call it in the error path of
> `read_repository_format()` to clean up any partial result.
> 
> For hygiene, we need to at least set the pointers that we free to NULL.
> For future-proofing, let's zero the entire struct instead. It just means
> that in the error path of `read_...()` we need to restore the error
> sentinel in the `version` field.

This seems reasonable, and I very much agree on the zero-ing (even
though it _shouldn't_ matter due to the "undefined" rule). That also
makes it safe to clear() multiple times, which is a nice property.
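
To spell out why the repeated clear() is safe, here is a minimal sketch
using a hypothetical, cut-down struct (`format_sketch` is not git's real
`struct repository_format`, and it leaves out the string list). The
second call only ever passes NULL to free(), which is a no-op:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct repository_format. */
struct format_sketch {
	int version;
	char *work_tree;
	char *partial_clone;
};

static void clear_format_sketch(struct format_sketch *f)
{
	free(f->work_tree);       /* free(NULL) does nothing */
	free(f->partial_clone);
	memset(f, 0, sizeof(*f)); /* leaves no dangling pointers behind */
}

static void demo_double_clear(void)
{
	struct format_sketch f = { 0 };

	f.work_tree = malloc(8);
	f.partial_clone = malloc(8);

	clear_format_sketch(&f);
	assert(!f.work_tree && !f.partial_clone);

	/* Second call sees only NULL pointers, so nothing is freed twice. */
	clear_format_sketch(&f);
	assert(!f.work_tree && !f.partial_clone);
}
```

Without the memset() (or at least NULL-ing the freed pointers), the
second call would be a double free.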

> +void clear_repository_format(struct repository_format *format)
> +{
> +	string_list_clear(&format->unknown_extensions, 0);
> +	free(format->work_tree);
> +	free(format->partial_clone);
> +	memset(format, 0, sizeof(*format));
>  }

For the callers that actually pick the values out, I think it might be a
little less error-prone if they actually copied the strings and then
called clear_repository_format(). That avoids leaks of values that they
didn't know or care about (and the cost of an extra strdup for
repository setup is not a big deal).

Something like this on top of your patch, I guess (with the idea being
that functions which return an error would clear the format, but a
"successful" one would get returned back up the stack to
setup_git_directory_gently(), which then clears it before returning).

-- >8 --
diff --git a/setup.c b/setup.c
index babe5ea156..a5699f9ee6 100644
--- a/setup.c
+++ b/setup.c
@@ -470,6 +470,7 @@ static int check_repository_format_gently(const char *gitdir, struct repository_
 			warning("%s", err.buf);
 			strbuf_release(&err);
 			*nongit_ok = -1;
+			clear_repository_format(candidate);
 			return -1;
 		}
 		die("%s", err.buf);
@@ -499,7 +500,7 @@ static int check_repository_format_gently(const char *gitdir, struct repository_
 		}
 		if (candidate->work_tree) {
 			free(git_work_tree_cfg);
-			git_work_tree_cfg = candidate->work_tree;
+			git_work_tree_cfg = xstrdup(candidate->work_tree);
 			inside_work_tree = -1;
 		}
 	} else {
@@ -1158,6 +1159,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 
 	strbuf_release(&dir);
 	strbuf_release(&gitdir);
+	clear_repository_format(&repo_fmt);
 
 	return prefix;
 }

-Peff
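
The copy-then-clear idea suggested above can be sketched in isolation as
follows. All names here (`candidate_sketch`, `work_tree_cfg_sketch`,
`adopt_candidate()`, `copy_string()`) are hypothetical stand-ins rather
than git's code, and the sketch folds the clear into the same function,
whereas the diff above does it in setup_git_directory_gently():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the candidate struct repository_format. */
struct candidate_sketch {
	char *work_tree;
	char *partial_clone;
};

/* Hypothetical stand-in for the global git_work_tree_cfg. */
static char *work_tree_cfg_sketch;

/* Hypothetical helper: heap-allocated copy of `s`, or NULL on failure. */
static char *copy_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);
	if (p)
		memcpy(p, s, n);
	return p;
}

static void clear_candidate_sketch(struct candidate_sketch *c)
{
	free(c->work_tree);
	free(c->partial_clone);
	memset(c, 0, sizeof(*c));
}

/*
 * Copy out the one value this caller cares about, then clear the whole
 * candidate. Fields the caller never looked at (here `partial_clone`)
 * are freed too, instead of leaking.
 */
static void adopt_candidate(struct candidate_sketch *c)
{
	if (c->work_tree) {
		free(work_tree_cfg_sketch);
		work_tree_cfg_sketch = copy_string(c->work_tree);
	}
	clear_candidate_sketch(c);
}
```

The global now owns its own copy, so the candidate can be cleared
unconditionally without the caller tracking which members were handed
over.
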
Martin Ågren Dec. 19, 2018, 9:49 p.m. UTC | #2
On Wed, 19 Dec 2018 at 16:48, Jeff King <peff@peff.net> wrote:
>
> On Tue, Dec 18, 2018 at 08:25:28AM +0100, Martin Ågren wrote:
>
> > +void clear_repository_format(struct repository_format *format)
> > +{
> > +     string_list_clear(&format->unknown_extensions, 0);
> > +     free(format->work_tree);
> > +     free(format->partial_clone);
> > +     memset(format, 0, sizeof(*format));
> >  }
>
> For the callers that actually pick the values out, I think it might be a
> little less error-prone if they actually copied the strings and then
> called clear_repository_format(). That avoids leaks of values that they
> didn't know or care about (and the cost of an extra strdup for
> repository setup is not a big deal).
>
> Something like this on top of your patch, I guess (with the idea being
> that functions which return an error would clear the format, but a
> "successful" one would get returned back up the stack to
> setup_git_directory_gently(), which then clears it before returning).

Thanks for the suggestion. I'll ponder 1) how to go about this
robustifying, 2) how to present the result as part of a v2 series.

To Junio on the sidelines in a cast (hope you're feeling better!):
you can expect a v2 of this series.


Martin
Patch

diff --git a/cache.h b/cache.h
index 8b9e592c65..53ac01efa7 100644
--- a/cache.h
+++ b/cache.h
@@ -979,6 +979,12 @@  struct repository_format {
  */
 void read_repository_format(struct repository_format *format, const char *path);
 
+/*
+ * Free the memory held onto by `format`, but not the struct itself.
+ * (No need to use this after `read_repository_format()` fails.)
+ */
+void clear_repository_format(struct repository_format *format);
+
 /*
  * Verify that the repository described by repository_format is something we
  * can read. If it is, return 0. Otherwise, return -1, and "err" will describe
diff --git a/repository.c b/repository.c
index 5dd1486718..efa9d1d960 100644
--- a/repository.c
+++ b/repository.c
@@ -159,6 +159,7 @@  int repo_init(struct repository *repo,
 	if (worktree)
 		repo_set_worktree(repo, worktree);
 
+	clear_repository_format(&format);
 	return 0;
 
 error:
diff --git a/setup.c b/setup.c
index 52c3c9d31f..babe5ea156 100644
--- a/setup.c
+++ b/setup.c
@@ -517,6 +517,18 @@  void read_repository_format(struct repository_format *format, const char *path)
 	format->hash_algo = GIT_HASH_SHA1;
 	string_list_init(&format->unknown_extensions, 1);
 	git_config_from_file(check_repo_format, path, format);
+	if (format->version == -1) {
+		clear_repository_format(format);
+		format->version = -1;
+	}
+}
+
+void clear_repository_format(struct repository_format *format)
+{
+	string_list_clear(&format->unknown_extensions, 0);
+	free(format->work_tree);
+	free(format->partial_clone);
+	memset(format, 0, sizeof(*format));
 }
 
 int verify_repository_format(const struct repository_format *format,
@@ -1043,9 +1055,11 @@  int discover_git_directory(struct strbuf *commondir,
 		strbuf_release(&err);
 		strbuf_setlen(commondir, commondir_offset);
 		strbuf_setlen(gitdir, gitdir_offset);
+		clear_repository_format(&candidate);
 		return -1;
 	}
 
+	clear_repository_format(&candidate);
 	return 0;
 }