diff mbox series

[13/19] revision: always store allocated strings in output encoding

Message ID 68a7d24e4a715eaf60414373636860215d27e643.1716983704.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series Compile with `-Wwrite-strings` | expand

Commit Message

Patrick Steinhardt May 29, 2024, 12:45 p.m. UTC
The `git_log_output_encoding` variable can be set via the `--encoding=`
option. When doing so, we conditionally either assign it to the passed
value, or if the value is "none" we assign it the empty string.
Depending on which of the both code paths we pick though, the variable
may end up being assigned either an allocated string or a string
constant.

This is somewhat risky and may easily lead to bugs when a different code
path may want to reassign a new value to it, freeing the previous value.
We already to this when parsing the "i18n.logoutputencoding" config in
`git_default_i18n_config()`. But because the config is typically parsed
before we parse command line options this has been fine so far.

Regardless of that, safeguard the code such that the variable always
contains an allocated string. While at it, also free the old value in
case there was any to plug a potential memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 revision.c             | 3 ++-
 t/t3900-i18n-commit.sh | 1 +
 t/t3901-i18n-patch.sh  | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

Comments

Junio C Hamano May 29, 2024, 8:23 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> The `git_log_output_encoding` variable can be set via the `--encoding=`
> option. When doing so, we conditionally either assign it to the passed
> value, or if the value is "none" we assign it the empty string.
> Depending on which of the both code paths we pick though, the variable
> may end up being assigned either an allocated string or a string
> constant.
>
> This is somewhat risky and may easily lead to bugs when a different code
> path may want to reassign a new value to it, freeing the previous value.
> We already to this when parsing the "i18n.logoutputencoding" config in
> `git_default_i18n_config()`. But because the config is typically parsed
> before we parse command line options this has been fine so far.
>
> Regardless of that, safeguard the code such that the variable always
> contains an allocated string. While at it, also free the old value in
> case there was any to plug a potential memory leak.

Nice.  Now the thing always has to be freed once we are done.
Consistency is good.
diff mbox series

Patch

diff --git a/revision.c b/revision.c
index 7ddf0f151a..2ee6886078 100644
--- a/revision.c
+++ b/revision.c
@@ -2650,10 +2650,11 @@  static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 	} else if (!strcmp(arg, "--invert-grep")) {
 		revs->grep_filter.no_body_match = 1;
 	} else if ((argcount = parse_long_opt("encoding", argv, &optarg))) {
+		free(git_log_output_encoding);
 		if (strcmp(optarg, "none"))
 			git_log_output_encoding = xstrdup(optarg);
 		else
-			git_log_output_encoding = "";
+			git_log_output_encoding = xstrdup("");
 		return argcount;
 	} else if (!strcmp(arg, "--reverse")) {
 		revs->reverse ^= 1;
diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
index f27d09cfd9..db7b403bc1 100755
--- a/t/t3900-i18n-commit.sh
+++ b/t/t3900-i18n-commit.sh
@@ -5,6 +5,7 @@ 
 
 test_description='commit and log output encodings'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 compare_with () {
diff --git a/t/t3901-i18n-patch.sh b/t/t3901-i18n-patch.sh
index 4b37f78829..5f0b9afc3f 100755
--- a/t/t3901-i18n-patch.sh
+++ b/t/t3901-i18n-patch.sh
@@ -8,6 +8,7 @@  test_description='i18n settings and format-patch | am pipe'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 check_encoding () {