mbox series

[v3,00/22] Memory leak fixes (pt.4)

Message ID cover.1723540931.git.ps@pks.im (mailing list archive)
Headers show
Series Memory leak fixes (pt.4) | expand

Message

Patrick Steinhardt Aug. 13, 2024, 9:31 a.m. UTC
Hi,

this is the third version of my fourth batch of patches that fix
various memory leaks.

Changes compared to v2:

  - Various typo fixes in commit messages.

  - Introduce `print_system_path()` as proposed by Taylor, which removes
    some of the repetition when printing system ptahs.

  - Micro-optimize one allocation for comment char strings away. Also,
    rename the variable to `comment_line_str_to_free` to better match
    how we call such variables in other places.

Thanks!

Patrick

Patrick Steinhardt (22):
  remote: plug memory leak when aliasing URLs
  git: fix leaking system paths
  object-file: fix memory leak when reading corrupted headers
  object-name: fix leaking symlink paths in object context
  bulk-checkin: fix leaking state TODO
  read-cache: fix leaking hashfile when writing index fails
  submodule-config: fix leaking name entry when traversing submodules
  config: fix leaking comment character config
  builtin/rebase: fix leaking `commit.gpgsign` value
  builtin/notes: fix leaking `struct notes_tree` when merging notes
  builtin/fast-import: plug trivial memory leaks
  builtin/fast-export: fix leaking diff options
  builtin/fast-export: plug leaking tag names
  merge-ort: unconditionally release attributes index
  sequencer: release todo list on error paths
  unpack-trees: clear index when not propagating it
  diff: fix leak when parsing invalid ignore regex option
  builtin/format-patch: fix various trivial memory leaks
  userdiff: fix leaking memory for configured diff drivers
  builtin/log: fix leak when showing converted blob contents
  diff: free state populated via options
  builtin/diff: free symmetric diff members

 builtin/commit.c                      |  7 +-
 builtin/diff.c                        | 10 ++-
 builtin/fast-export.c                 | 19 ++++--
 builtin/fast-import.c                 |  8 ++-
 builtin/log.c                         | 13 +++-
 builtin/notes.c                       |  9 ++-
 builtin/rebase.c                      |  1 +
 bulk-checkin.c                        |  2 +
 config.c                              |  3 +-
 csum-file.c                           |  2 +-
 csum-file.h                           | 10 +++
 diff.c                                | 16 ++++-
 environment.c                         |  1 +
 environment.h                         |  1 +
 git.c                                 | 13 +++-
 merge-ort.c                           |  3 +-
 object-file.c                         |  1 +
 object-name.c                         |  1 +
 range-diff.c                          |  6 +-
 read-cache.c                          | 97 ++++++++++++++++-----------
 remote.c                              |  2 +
 sequencer.c                           | 67 ++++++++++++------
 submodule-config.c                    | 18 +++--
 t/t0210-trace2-normal.sh              |  2 +-
 t/t1006-cat-file.sh                   |  1 +
 t/t1050-large.sh                      |  1 +
 t/t1450-fsck.sh                       |  1 +
 t/t1601-index-bogus.sh                |  2 +
 t/t2107-update-index-basic.sh         |  1 +
 t/t3310-notes-merge-manual-resolve.sh |  1 +
 t/t3311-notes-merge-fanout.sh         |  1 +
 t/t3404-rebase-interactive.sh         |  1 +
 t/t3435-rebase-gpg-sign.sh            |  1 +
 t/t3507-cherry-pick-conflict.sh       |  1 +
 t/t3510-cherry-pick-sequence.sh       |  1 +
 t/t3705-add-sparse-checkout.sh        |  1 +
 t/t4013-diff-various.sh               |  1 +
 t/t4014-format-patch.sh               |  1 +
 t/t4018-diff-funcname.sh              |  1 +
 t/t4030-diff-textconv.sh              |  2 +
 t/t4042-diff-textconv-caching.sh      |  2 +
 t/t4048-diff-combined-binary.sh       |  1 +
 t/t4064-diff-oidfind.sh               |  2 +
 t/t4065-diff-anchored.sh              |  1 +
 t/t4068-diff-symmetric-merge-base.sh  |  1 +
 t/t4069-remerge-diff.sh               |  1 +
 t/t4108-apply-threeway.sh             |  1 +
 t/t4209-log-pickaxe.sh                |  2 +
 t/t6421-merge-partial-clone.sh        |  1 +
 t/t6428-merge-conflicts-sparse.sh     |  1 +
 t/t7008-filter-branch-null-sha1.sh    |  1 +
 t/t7030-verify-tag.sh                 |  1 +
 t/t7817-grep-sparse-checkout.sh       |  1 +
 t/t9300-fast-import.sh                |  1 +
 t/t9304-fast-import-marks.sh          |  2 +
 t/t9351-fast-export-anonymize.sh      |  1 +
 unpack-trees.c                        |  2 +
 userdiff.c                            | 38 ++++++++---
 userdiff.h                            |  4 ++
 59 files changed, 288 insertions(+), 106 deletions(-)

Range-diff against v2:
 1:  2afa51f9ff !  1:  02f6da020f remote: plug memory leak when aliasing URLs
    @@ Commit message
         each of their URLs. The actual aliasing logic is then contained in
         `alias_url()`, which returns an allocated string that contains the new
         URL. This URL replaces the old URL that we have in the strvec that
    -    contanis all remote URLs.
    +    contains all remote URLs.
     
         We replace the remote URLs via `strvec_replace()`, which does not hand
         over ownership of the new string to the vector. Still, we didn't free
 2:  324140e4fd !  2:  f36d895948 git: fix leaking system paths
    @@ Commit message
         memory leaks looming exposed by that test suite and it thus does not yet
         pass with the memory leak checker enabled.
     
    +    Helped-by: Taylor Blau <me@ttaylorr.com>
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## git.c ##
    +@@ git.c: void setup_auto_pager(const char *cmd, int def)
    + 	commit_pager_choice();
    + }
    + 
    ++static void print_system_path(const char *path)
    ++{
    ++	char *s_path = system_path(path);
    ++	puts(s_path);
    ++	free(s_path);
    ++}
    ++
    + static int handle_options(const char ***argv, int *argc, int *envchanged)
    + {
    + 	const char **orig_argv = *argv;
     @@ git.c: static int handle_options(const char ***argv, int *argc, int *envchanged)
      				exit(0);
      			}
      		} else if (!strcmp(cmd, "--html-path")) {
     -			puts(system_path(GIT_HTML_PATH));
    -+			char *path = system_path(GIT_HTML_PATH);
    -+			puts(path);
    -+			free(path);
    ++			print_system_path(GIT_HTML_PATH);
      			trace2_cmd_name("_query_");
      			exit(0);
      		} else if (!strcmp(cmd, "--man-path")) {
     -			puts(system_path(GIT_MAN_PATH));
    -+			char *path = system_path(GIT_MAN_PATH);
    -+			puts(path);
    -+			free(path);
    ++			print_system_path(GIT_MAN_PATH);
      			trace2_cmd_name("_query_");
      			exit(0);
      		} else if (!strcmp(cmd, "--info-path")) {
     -			puts(system_path(GIT_INFO_PATH));
    -+			char *path = system_path(GIT_INFO_PATH);
    -+			puts(path);
    -+			free(path);
    ++			print_system_path(GIT_INFO_PATH);
      			trace2_cmd_name("_query_");
      			exit(0);
      		} else if (!strcmp(cmd, "-p") || !strcmp(cmd, "--paginate")) {
 3:  43a38a2281 !  3:  0415ac986d object-file: fix memory leak when reading corrupted headers
    @@ Metadata
      ## Commit message ##
         object-file: fix memory leak when reading corrupted headers
     
    -    When reading corrupt object headers in `read_loose_object()`, then we
    -    bail out immediately. This causes a memory leak though because we would
    -    have already initialized the zstream in `unpack_loose_header()`, and it
    -    is the callers responsibility to finish the zstream even on error. While
    +    When reading corrupt object headers in `read_loose_object()`, we bail
    +    out immediately. This causes a memory leak though because we would have
    +    already initialized the zstream in `unpack_loose_header()`, and it is
    +    the callers responsibility to finish the zstream even on error. While
         this feels weird, other callsites do it correctly already.
     
         Fix this leak by ending the zstream even on errors. We may want to
 4:  9d3dc145e8 =  4:  e5130e50a9 object-name: fix leaking symlink paths in object context
 5:  454139e7a4 =  5:  276c828ad1 bulk-checkin: fix leaking state TODO
 6:  f8b7195796 =  6:  ed0608e705 read-cache: fix leaking hashfile when writing index fails
 7:  762fb5aa73 !  7:  b7a7f88c7d submodule-config: fix leaking name enrty when traversing submodules
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    submodule-config: fix leaking name enrty when traversing submodules
    +    submodule-config: fix leaking name entry when traversing submodules
     
         We traverse through submodules in the tree via `tree_entry()`, passing
         to it a `struct name_entry` that it is supposed to populate with the
 8:  8fbd72a100 !  8:  9054a459a1 config: fix leaking comment character config
    @@ Commit message
         value in the first place because it may contain a string constant.
     
         Refactor the code such that we track allocated comment character strings
    -    via a separate non-constant variable `comment_line_str_allocated`. Adapt
    +    via a separate non-constant variable `comment_line_str_to_free`. Adapt
         sites that set `comment_line_str` to set both and free the old value
    -    that was stored in `comment_line_str_allocated`.
    +    that was stored in `comment_line_str_to_free`.
     
         This memory leak is being hit in t3404. As there are still other memory
         leaks in that file we cannot yet mark it as passing with leak checking
    @@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
      
      	if (!memchr(sb->buf, candidates[0], sb->len)) {
     -		comment_line_str = xstrfmt("%c", candidates[0]);
    -+		free(comment_line_str_allocated);
    -+		comment_line_str = comment_line_str_allocated =
    ++		free(comment_line_str_to_free);
    ++		comment_line_str = comment_line_str_to_free =
     +			xstrfmt("%c", candidates[0]);
      		return;
      	}
    @@ builtin/commit.c: static void adjust_comment_line_char(const struct strbuf *sb)
      		die(_("unable to select a comment character that is not used\n"
      		      "in the current commit message"));
     -	comment_line_str = xstrfmt("%c", *p);
    -+	free(comment_line_str_allocated);
    -+	comment_line_str = comment_line_str_allocated = xstrfmt("%c", *p);
    ++	free(comment_line_str_to_free);
    ++	comment_line_str = comment_line_str_to_free = xstrfmt("%c", *p);
      }
      
      static void prepare_amend_commit(struct commit *commit, struct strbuf *sb,
    @@ config.c: static int git_default_core_config(const char *var, const char *value,
      			if (strchr(value, '\n'))
      				return error(_("%s cannot contain newline"), var);
     -			comment_line_str = xstrdup(value);
    -+			free(comment_line_str_allocated);
    -+			comment_line_str = comment_line_str_allocated =
    -+				xstrdup(value);
    ++			comment_line_str = value;
    ++			FREE_AND_NULL(comment_line_str_to_free);
      			auto_comment_line_char = 0;
      		} else
      			return error(_("%s must have at least one character"), var);
    @@ environment.c: int protect_ntfs = PROTECT_NTFS_DEFAULT;
       * that is subject to stripspace.
       */
      const char *comment_line_str = "#";
    -+char *comment_line_str_allocated;
    ++char *comment_line_str_to_free;
      int auto_comment_line_char;
      
      /* Parallel index stat data preload? */
    @@ environment.h: struct strvec;
       * that is subject to stripspace.
       */
      extern const char *comment_line_str;
    -+extern char *comment_line_str_allocated;
    ++extern char *comment_line_str_to_free;
      extern int auto_comment_line_char;
      
      /*
 9:  e497b76e9c =  9:  1d3957a5eb builtin/rebase: fix leaking `commit.gpgsign` value
10:  c886b666f7 = 10:  0af1bab5a1 builtin/notes: fix leaking `struct notes_tree` when merging notes
11:  d1c757157b = 11:  30d4e9ed43 builtin/fast-import: plug trivial memory leaks
12:  fa2d5c5d6b ! 12:  9591fb7b5e builtin/fast-export: fix leaking diff options
    @@ Metadata
      ## Commit message ##
         builtin/fast-export: fix leaking diff options
     
    -    Before caling `handle_commit()` in a loop, we set `diffopt.no_free` such
    -    that its contents aren't getting freed inside of `handle_commit()`. We
    -    never unset that flag though, which means that it'll ultimately leak
    +    Before calling `handle_commit()` in a loop, we set `diffopt.no_free`
    +    such that its contents aren't getting freed inside of `handle_commit()`.
    +    We never unset that flag though, which means that it'll ultimately leak
         when calling `release_revisions()`.
     
         Fix this by unsetting the flag after the loop.
13:  d9dd860d2a = 13:  254bbb7f6f builtin/fast-export: plug leaking tag names
14:  8f6860485e = 14:  334c4ed71a merge-ort: unconditionally release attributes index
15:  ea6a350f31 = 15:  9f08a859fb sequencer: release todo list on error paths
16:  2755023742 = 16:  5d4934b1a9 unpack-trees: clear index when not propagating it
17:  edf6f148cd = 17:  e1b6a24fbe diff: fix leak when parsing invalid ignore regex option
18:  343e3bd4df = 18:  c048b54a2c builtin/format-patch: fix various trivial memory leaks
19:  be2c5b0bca = 19:  39b2921e3e userdiff: fix leaking memory for configured diff drivers
20:  7888203833 = 20:  50dea1c98a builtin/log: fix leak when showing converted blob contents
21:  245fc30afb = 21:  d5cb4ad580 diff: free state populated via options
22:  343ddcd17b ! 22:  31e38ba4e1 builtin/diff: free symmetric diff members
    @@ Commit message
     
         We populate a `struct symdiff` in case the user has requested a
         symmetric diff. Part of this is to populate a `skip` bitmap that
    -    indicates whihc commits shall be ignored in the diff. But while this
    +    indicates which commits shall be ignored in the diff. But while this
         bitmap is dynamically allocated, we never free it.
     
         Fix this by introducing and calling a new `symdiff_release()` function

Comments

Junio C Hamano Aug. 13, 2024, 4:58 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> Patrick Steinhardt (22):
>   remote: plug memory leak when aliasing URLs
>   git: fix leaking system paths
>   object-file: fix memory leak when reading corrupted headers
>   object-name: fix leaking symlink paths in object context
>   bulk-checkin: fix leaking state TODO
>   read-cache: fix leaking hashfile when writing index fails
>   submodule-config: fix leaking name entry when traversing submodules
>   config: fix leaking comment character config
>   builtin/rebase: fix leaking `commit.gpgsign` value
>   builtin/notes: fix leaking `struct notes_tree` when merging notes
>   builtin/fast-import: plug trivial memory leaks
>   builtin/fast-export: fix leaking diff options
>   builtin/fast-export: plug leaking tag names
>   merge-ort: unconditionally release attributes index
>   sequencer: release todo list on error paths
>   unpack-trees: clear index when not propagating it
>   diff: fix leak when parsing invalid ignore regex option
>   builtin/format-patch: fix various trivial memory leaks
>   userdiff: fix leaking memory for configured diff drivers
>   builtin/log: fix leak when showing converted blob contents
>   diff: free state populated via options
>   builtin/diff: free symmetric diff members

Thanks for a pleasant read.  I had a few minor comments but they
likely show more of my misreading of the patches than real problems
;-)