[00/10] unpack-trees & dir APIs: fix memory leaks

Message ID cover-00.10-00000000000-20211004T002226Z-avarab@gmail.com (mailing list archive)

Message

Ævar Arnfjörð Bjarmason Oct. 4, 2021, 12:46 a.m. UTC
This series fixes memory leaks in the unpack-trees and dir APIs for
all their callers. There's been a discussion between myself & Elijah
on his en/removing-untracked-fixes series[1] about the memory leak
fixing aspect of his series.

I've got locally queued patches that fix widespread memory leaks in
the test suite and make much of it pass under SANITIZE=leak. Once the
common leaks in revisions.c (git rev-list/show/log etc.), "checkout",
"dir" and "unpack-trees" are fixed, a lot of tests become entirely
leak-free, since much of the code that sets up various basic things
requires one of those commands.

I think that the more narrow fixes[2] to the memory leaks around
unpack-trees in Elijah's series[3] are better skipped and that series
rebased on top of this one (I'll submit an RFC version of his that is
rebased on this as a follow-up).

I.e. his solves a very small number of the memory leaks in this area,
whereas this is something I've got running as part of end-to-end
SANITIZE=leak testing, so I think that the difference in the approaches
we picked when it comes to fixing them is likely because of that.

E.g. continuing to allocate the "struct dir_struct" on the heap in his
version[4] is, I think, something that makes more sense for fixes that
haven't pulled at the thread of how much merge-recursive.c is making
that question of ownership confusing. There are also changes in his
that'll become simpler as the complexity of the underlying APIs has
gone away, e.g. [6].

1. https://lore.kernel.org/git/87ilyjviiy.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/0c74285b25311c83bb158cf89a551160a0f3a5d3.1632760428.git.gitgitgadget@gmail.com/
3. https://lore.kernel.org/git/pull.1036.v3.git.1632760428.gitgitgadget@gmail.com/
4. https://lore.kernel.org/git/0d119142778dce8617dd9b2c102b5f5bfdc9dc0f.1632760428.git.gitgitgadget@gmail.com/
6. https://lore.kernel.org/git/f1a0700e598e52d6cdb507fe8a09c4fa9291c982.1632760428.git.gitgitgadget@gmail.com/

Comments on individual patches below:

Ævar Arnfjörð Bjarmason (10):
  unpack-trees.[ch]: define and use a UNPACK_TREES_OPTIONS_INIT

I had this at the end of the v3 of my designated initializer cleanup
series[7].

I think Junio fairly commented that this in isolation looked like it
was going nowhere[8], since we didn't get past initializing "struct
unpack_trees_options" as "{ 0 }", but we'll soon get past that in this
series...
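
To illustrate why a named initializer is more than cosmetic, here is a
minimal sketch. It assumes a made-up "struct demo_options" and
DEMO_OPTIONS_INIT, not the real "struct unpack_trees_options" or the
macro from the patch. Once callers stop spelling out "{ 0 }", a
non-zero default can be added in one place:

```c
#include <stddef.h>

/* Hypothetical stand-in; not git's real options struct. */
struct demo_options {
	int verbose;
	int max_depth;	/* -1 means "no limit" */
	void *scratch;	/* owned resource, NULL until used */
};

/*
 * The single definition of a "blank" struct. Members that are not
 * named here are zero-initialized, as with "{ 0 }".
 */
#define DEMO_OPTIONS_INIT { .max_depth = -1 }

/* Relies on the -1 default that a plain "{ 0 }" would not provide. */
static int demo_depth_allowed(const struct demo_options *o, int depth)
{
	return o->max_depth < 0 || depth <= o->max_depth;
}
```

A caller would write "struct demo_options opts = DEMO_OPTIONS_INIT;"
and never needs to know which members have non-zero defaults.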

  merge-recursive.c: call a new unpack_trees_options_init() function

Details how merge-recursive.c calls unpack_trees() differently than
every other caller when it comes to "struct unpack_trees_options"
setup.
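
A rough sketch of the general idea, using hypothetical names
("struct demo_options", demo_options_init(), demo_options_new()) and
not the actual merge-recursive.c or unpack-trees.c code: an
"= ..._INIT" initializer only works at the point of declaration, so
storage that already exists (e.g. from malloc()) needs a function that
copies a blank struct into it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; not git's real options struct. */
struct demo_options {
	int verbose;
	int max_depth;	/* -1 means "no limit" */
};
#define DEMO_OPTIONS_INIT { .max_depth = -1 }

/* For storage that already exists: copy a blank struct over it. */
static void demo_options_init(struct demo_options *o)
{
	struct demo_options blank = DEMO_OPTIONS_INIT;

	memcpy(o, &blank, sizeof(*o));
}

/* A heap-allocating caller gets the same defaults as a stack one. */
static struct demo_options *demo_options_new(void)
{
	struct demo_options *o = malloc(sizeof(*o));

	if (o)
		demo_options_init(o);
	return o;
}
```

The macro stays the single source of truth for the defaults, and the
function is just a thin wrapper around it.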

  unpack-trees.[ch]: embed "dir" in "struct unpack_trees_options"

Elijah's series ends up with a "dir" still heap-allocated in "struct
unpack_trees_options", just dynamically and "privately". Here it's
allocated on the stack (or for merge-recursive.c, as defined in
UNPACK_TREES_OPTIONS_INIT), because we could untangle the
merge-recursive.c edge-case earlier.
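
As a sketch of what "embedding" buys in terms of ownership (all names
below are made up for illustration and are not the real
"struct dir_struct" or "struct unpack_trees_options"): when the member
is embedded rather than a pointer, there is no question of who
allocates or frees it, and one release function can clean up
everything it owns.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not git's real structs. */
struct demo_dir {
	char *pattern;	/* heap-allocated, owned by the struct */
};
#define DEMO_DIR_INIT { .pattern = NULL }

struct demo_options {
	int verbose;
	struct demo_dir dir;	/* embedded: same lifetime as the options */
};
#define DEMO_OPTIONS_INIT { .dir = DEMO_DIR_INIT }

/* Store a private copy of "pattern"; returns 0 on success, -1 on OOM. */
static int demo_dir_set_pattern(struct demo_dir *dir, const char *pattern)
{
	size_t len = strlen(pattern) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, pattern, len);
	free(dir->pattern);
	dir->pattern = copy;
	return 0;
}

/* Frees what the embedded member owns; safe to call more than once. */
static void demo_options_release(struct demo_options *o)
{
	free(o->dir.pattern);
	o->dir.pattern = NULL;
}
```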

  unpack-trees API: don't have clear_unpack_trees_porcelain() reset

Move merge-recursive.c special-snowflake behavior into
merge-recursive.c.

  dir.[ch]: make DIR_INIT mandatory
  dir.c: get rid of lazy initialization

The last caller not using "struct dir_struct" via DIR_INIT goes away,
allowing for untangling the mess I commented on at length in [9].
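
The following is only a sketch of the general pattern, with
hypothetical types and names rather than the actual dir.c code. While
some callers may still hand over a zeroed struct, every user of a
member has to initialize it lazily; once the initializer macro is
mandatory, that check can be dropped.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not git's real structs. */
static char demo_empty[1] = "";

struct demo_buf {
	char *buf;
};
#define DEMO_BUF_INIT { .buf = demo_empty }

struct demo_dir {
	struct demo_buf base;
};
#define DEMO_DIR_INIT { .base = DEMO_BUF_INIT }

/* Before: the struct might be all zeroes, so check before use. */
static int base_is_empty_lazy(struct demo_dir *dir)
{
	if (!dir->base.buf)		/* lazy initialization */
		dir->base.buf = demo_empty;
	return dir->base.buf[0] == '\0';
}

/* After: DEMO_DIR_INIT is mandatory, so "buf" is always valid. */
static int base_is_empty(const struct demo_dir *dir)
{
	return dir->base.buf[0] == '\0';
}
```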

  unpack-trees API: rename clear_unpack_trees_porcelain()

Just a s/clear_unpack_trees_porcelain/unpack_trees_options_release/g,
since that's what it does now.

  unpack-trees: don't leak memory in verify_clean_subdirectory()
  merge.c: avoid duplicate unpack_trees_options_release() code
  built-ins: plug memory leaks with unpack_trees_options_release()

A lot of memory leak fixes both in unpack-trees.c and its users; only
a subset of these is in Elijah's series.
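
A minimal sketch of the shape such fixes tend to take, assuming made-up
names (demo_run(), demo_options_release()) and not the code from these
patches: funnel every exit through one label, so that the release call
exists once and no early return can skip it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; not git's real options struct. */
struct demo_options {
	char *msg;	/* heap-allocated, owned by the struct */
};
#define DEMO_OPTIONS_INIT { .msg = NULL }

static void demo_options_release(struct demo_options *o)
{
	free(o->msg);
	o->msg = NULL;
}

/* Returns 0 on success, -1 on failure; "o.msg" is freed either way. */
static int demo_run(int fail_early)
{
	struct demo_options o = DEMO_OPTIONS_INIT;
	int ret = -1;

	o.msg = malloc(16);
	if (!o.msg)
		goto cleanup;
	strcpy(o.msg, "working");

	if (fail_early)
		goto cleanup;	/* the error path shares the release below */

	ret = 0;
cleanup:
	demo_options_release(&o);
	return ret;
}
```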

7. https://lore.kernel.org/git/cover-v3-0.6-00000000000-20211001T102056Z-avarab@gmail.com/
8. https://lore.kernel.org/git/xmqqk0iw4e7v.fsf@gitster.g/
9. https://lore.kernel.org/git/87sfxhohsj.fsf@evledraar.gmail.com/

 add-interactive.c         |  2 +-
 archive.c                 | 12 ++++++++----
 builtin/am.c              | 23 ++++++++++++++---------
 builtin/checkout.c        | 22 ++++++++++++----------
 builtin/clone.c           |  4 ++--
 builtin/commit.c          |  9 ++++++---
 builtin/merge.c           |  9 +++++----
 builtin/read-tree.c       | 28 +++++++++++++++-------------
 builtin/reset.c           | 16 ++++++++++------
 builtin/sparse-checkout.c |  5 ++---
 builtin/stash.c           | 20 ++++++++++++--------
 diff-lib.c                |  8 +++++---
 dir.c                     |  8 --------
 dir.h                     |  6 ++++--
 merge-ort.c               | 12 ++++--------
 merge-recursive.c         |  6 ++++--
 merge.c                   | 23 +++++++++++------------
 reset.c                   |  4 ++--
 sequencer.c               |  3 +--
 unpack-trees.c            | 24 +++++++++++++-----------
 unpack-trees.h            | 17 +++++++++++++----
 21 files changed, 144 insertions(+), 117 deletions(-)

Comments

Elijah Newren Oct. 4, 2021, 1:45 p.m. UTC | #1
On Sun, Oct 3, 2021 at 5:46 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> This series fixes memory leaks in the unpack-trees and dir APIs for
> all their callers.

There are several good fixes in this series.  Thanks for working on them!

> There's been a discussion between myself & Elijah
> on his en/removing-untracked-fixes series[1] about the memory leak
> fixing aspect of his series.

Not really, the memory leak fixing aspect of my series was patch 2;
most of our discussion was on patch 4, including your footnote link.
Patch 4 did not in any way involve fixing a memory leak, which you
yourself later acknowledged.  So our discussion was mostly
about aspects _other_ than leak fixing.

> I've got locally queued patches that fix widespread memory leaks in
> the test suite and make much of it pass under SANITIZE=leak. Once the
> common leaks in revisions.c (git rev-list/show/log etc.), "checkout",
> "dir" and "unpack-trees" are fixed, a lot of tests become entirely
> leak-free, since much of the code that sets up various basic things
> requires one of those commands.

Yaay!  This is great stuff!

> I think that the more narrow fixes[2] to the memory leaks around
> unpack-trees in Elijah's series[3] are better skipped and that series
> rebased on top of this one (I'll submit an RFC version of his that is
> rebased on this as a follow-up).

I *strongly* disagree.

> I.e. his solves a very small amount of the memory leaks in this area,
> whereas this is something I've got running as part of end-to-end
> SANITIZE=leak testing, so I think that the difference in approaches we
> picked when it comes to fixing them is likely because of that.
>
> E.g. continuing to allocate the "struct dir_struct" on the heap in his
> version[4] is, I think, something that makes more sense for fixes that
> haven't pulled at the thread of how much merge-recursive.c is making
> that question of ownership confusing. There are also changes in his
> that'll become simpler as the complexity of the underlying APIs has
> gone away, e.g. [6].

*Sigh*.  unpack_trees_options->dir is not allocated on the heap at the
end of my series.  I could understand missing that in the patches, but
I've also pointed it out to you two additional times in discussions on
the patches so far.  And you supposedly looked at all the patches
again while rebasing and adding your signed-off-by.

You also continue to refer to our discussion as though it was about
leakfixes, even though the patch we discussed in my series did not
involve any leak fixing.  I pointed that out and you said you stood
corrected (last comment at
https://lore.kernel.org/git/87k0ivpzfx.fsf@evledraar.gmail.com/), but
now you're referring to it that way again?  Even after rebasing my
series and adding your Signed-off-by, suggesting you should understand
it?  The leakfix was a different patch of the series, namely patch #2.

I agree that merge-recursive.c has confusing points.  I totally agree.
Unfortunately, both your patches that touch merge-recursive.c make it
worse; more so than the problems you were trying to fix in that file.

> 1. https://lore.kernel.org/git/87ilyjviiy.fsf@evledraar.gmail.com/
> 2. https://lore.kernel.org/git/0c74285b25311c83bb158cf89a551160a0f3a5d3.1632760428.git.gitgitgadget@gmail.com/
> 3. https://lore.kernel.org/git/pull.1036.v3.git.1632760428.gitgitgadget@gmail.com/
> 4. https://lore.kernel.org/git/0d119142778dce8617dd9b2c102b5f5bfdc9dc0f.1632760428.git.gitgitgadget@gmail.com/
> 6. https://lore.kernel.org/git/f1a0700e598e52d6cdb507fe8a09c4fa9291c982.1632760428.git.gitgitgadget@gmail.com/
>
...
>   merge-recursive.c: call a new unpack_trees_options_init() function
>
> Details how merge-recursive.c calls unpack_trees() differently than
> every other caller when it comes to "struct unpack_trees_options"
> setup.

Saying:

"merge-recursive.c has a heap-allocated unpack_trees_options, and thus
can't naturally use UNPACK_TREES_OPTIONS_INIT"

would have been shorter, and also explained things in full detail.
Your version makes it sound like it's doing something really weird and
needs a much more expansive explanation.

>   unpack-trees.[ch]: embed "dir" in "struct unpack_trees_options"
>
> Elijah's series ends up with a "dir" still heap-allocated in "struct
> unpack_trees_options", just dynamically and "privately".

As noted above, this is not true.  I'm confused why you try to claim
otherwise.  (I mean, it's really not all that important, I'm just
confused why you find it important to call out, especially when the
stack-based point was highlighted multiple times before but you still
insist on referring to it as heap-allocated.)

...
>   unpack-trees: don't leak memory in verify_clean_subdirectory()
>   merge.c: avoid duplicate unpack_trees_options_release() code
>   built-ins: plug memory leaks with unpack_trees_options_release()
>
> A lot of memory leak fixes both in unpack-trees.c and its users, only
> a subset of this is in Elijah's series.

Not sure why you feel the need to include the final phrase there; it
almost feels like you're trying to portray my series as a leakfix,
which feels misleading.  My series wasn't even about fixing leaks.  In
my first round, I knew of leaks, and intentionally attempted to avoid
fixing them because it was orthogonal to the point of my series (I
figured I could come back in a follow-on series and deal with it).  In
a subsequent round, I fixed one leak incidentally, in part because you
called it out, but more so because otherwise when I attempted to
consolidate code into one place it would appear to reviewers that the
consolidated code didn't match some of the callers.  In particular,
some of the sites had a leak and others didn't.  Adding a preparatory
leakfix (again, patch #2, NOT patch #4) made clear that the later
consolidation (in patch #4) really was just that -- moving several
identical code chunks into a single place.


....anyway, all that said, you've got some good fixes in this series.
You've also got three very problematic bits that need to be ripped
out.  And you should rebase this series on top of v3 of
en/removing-untracked-fixes.