diff mbox series

[v2,08/11] reftable/writer: unify releasing memory

Message ID 41db7414e17201f85b476af5e0183e72de450310.1712209149.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series reftable: optimize write performance | expand

Commit Message

Patrick Steinhardt April 4, 2024, 5:48 a.m. UTC
There are two code paths which release memory of the reftable writer:

  - `reftable_writer_close()` releases internal state after it has
    written data.

  - `reftable_writer_free()` releases the block that was written to and
    the writer itself.

Both code paths free different parts of the writer, and consequently the
caller must make sure to call both. And while callers mostly do this
already, this falls apart when a write failure causes the caller to skip
calling `reftable_write_close()`.

Introduce a new function `reftable_writer_release()` that releases all
internal state and call it from both paths. Like this it is fine for the
caller to not call `reftable_writer_close()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 23 +++++++++++++++--------
 1 file changed, 15 insertions(+), 8 deletions(-)

Comments

Han-Wen Nienhuys April 4, 2024, 7:08 a.m. UTC | #1
On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote:
> There are two code paths which release memory of the reftable writer:
>
>   - `reftable_writer_close()` releases internal state after it has
>     written data.
>
>   - `reftable_writer_free()` releases the block that was written to and
>     the writer itself.

The bifurcation is there so you can read the stats after closing the
writer. The new method makes it harder to misuse, but now you have two
ways to end a writer. Suggestion: drop reftable_writer_{free,close}
from reftable-writer.h (rename to remove the reftable_ prefix because
they are no longer considered public) and find another way to read out
the stats. Either pass an optional reftable_writer_stats into the
construction of the writer, return the stats from the close function,
or drop stats altogether.  IIRC They are only used in the unit tests.
Patrick Steinhardt April 4, 2024, 7:32 a.m. UTC | #2
On Thu, Apr 04, 2024 at 09:08:46AM +0200, Han-Wen Nienhuys wrote:
> On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote:
> > There are two code paths which release memory of the reftable writer:
> >
> >   - `reftable_writer_close()` releases internal state after it has
> >     written data.
> >
> >   - `reftable_writer_free()` releases the block that was written to and
> >     the writer itself.
> 
> The bifurcation is there so you can read the stats after closing the
> writer. The new method makes it harder to misuse, but now you have two
> ways to end a writer. Suggestion: drop reftable_writer_{free,close}
> from reftable-writer.h (rename to remove the reftable_ prefix because
> they are no longer considered public) and find another way to read out
> the stats. Either pass an optional reftable_writer_stats into the
> construction of the writer, return the stats from the close function,
> or drop stats altogether.  IIRC They are only used in the unit tests.

But even with these refactorings the stats remain intact after calling
`reftable_writer_close()` or `reftable_writer_release()`, right? So it
basically continues to work as expected.

It might not be the cleanest way to handle this, but I think this patch
is an improvement over the previous state because we plug a memory leak
and deduplicate the cleanup logic. So I would suggest to defer your
proposed refactorings to a later point, if you're okay with that.

Thanks!

Patrick
Han-Wen Nienhuys April 4, 2024, 9 a.m. UTC | #3
On Thu, Apr 4, 2024 at 9:32 AM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Thu, Apr 04, 2024 at 09:08:46AM +0200, Han-Wen Nienhuys wrote:
> > On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote:
> > > There are two code paths which release memory of the reftable writer:
> > >
> > >   - `reftable_writer_close()` releases internal state after it has
> > >     written data.
> > >
> > >   - `reftable_writer_free()` releases the block that was written to and
> > >     the writer itself.
> >
> > The bifurcation is there so you can read the stats after closing the
> > writer. The new method makes it harder to misuse, but now you have two
> > ways to end a writer. Suggestion: drop reftable_writer_{free,close}
> > from reftable-writer.h (rename to remove the reftable_ prefix because
> > they are no longer considered public) and find another way to read out
> > the stats. Either pass an optional reftable_writer_stats into the
> > construction of the writer, return the stats from the close function,
> > or drop stats altogether.  IIRC They are only used in the unit tests.
>
> But even with these refactorings the stats remain intact after calling
> `reftable_writer_close()` or `reftable_writer_release()`, right? So it
> basically continues to work as expected.

Right - I misinterpreted your change.

> It might not be the cleanest way to handle this, but I think this patch
> is an improvement over the previous state because we plug a memory leak
> and deduplicate the cleanup logic. So I would suggest to defer your
> proposed refactorings to a later point, if you're okay with that.

yes. Please add reftable_writer_release to reftable-writer.h for
consistency, though. Or remove the reftable_ prefix.
Patrick Steinhardt April 4, 2024, 11:43 a.m. UTC | #4
On Thu, Apr 04, 2024 at 11:00:46AM +0200, Han-Wen Nienhuys wrote:
> On Thu, Apr 4, 2024 at 9:32 AM Patrick Steinhardt <ps@pks.im> wrote:
> > On Thu, Apr 04, 2024 at 09:08:46AM +0200, Han-Wen Nienhuys wrote:
[snip]
> > It might not be the cleanest way to handle this, but I think this patch
> > is an improvement over the previous state because we plug a memory leak
> > and deduplicate the cleanup logic. So I would suggest to defer your
> > proposed refactorings to a later point, if you're okay with that.
> 
> yes. Please add reftable_writer_release to reftable-writer.h for
> consistency, though. Or remove the reftable_ prefix.

I've dropped the `reftable_` prefix locally. Will wait a bit for
additional reviews though before sending out v3.

Patrick
diff mbox series

Patch

diff --git a/reftable/writer.c b/reftable/writer.c
index d347ec4cc6..7b70c9b666 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -149,11 +149,21 @@  void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min,
 	w->max_update_index = max;
 }
 
+static void reftable_writer_release(struct reftable_writer *w)
+{
+	if (w) {
+		reftable_free(w->block);
+		w->block = NULL;
+		block_writer_release(&w->block_writer_data);
+		w->block_writer = NULL;
+		writer_clear_index(w);
+		strbuf_release(&w->last_key);
+	}
+}
+
 void reftable_writer_free(struct reftable_writer *w)
 {
-	if (!w)
-		return;
-	reftable_free(w->block);
+	reftable_writer_release(w);
 	reftable_free(w);
 }
 
@@ -643,16 +653,13 @@  int reftable_writer_close(struct reftable_writer *w)
 	}
 
 done:
-	/* free up memory. */
-	block_writer_release(&w->block_writer_data);
-	writer_clear_index(w);
-	strbuf_release(&w->last_key);
+	reftable_writer_release(w);
 	return err;
 }
 
 static void writer_clear_index(struct reftable_writer *w)
 {
-	for (size_t i = 0; i < w->index_len; i++)
+	for (size_t i = 0; w->index && i < w->index_len; i++)
 		strbuf_release(&w->index[i].last_key);
 	FREE_AND_NULL(w->index);
 	w->index_len = 0;