diff mbox series

[v2,06/11] reftable/writer: refactorings for `writer_add_record()`

Message ID 4877ab39212867e91058c60f99fe0dc2a592d583.1712209149.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series reftable: optimize write performance | expand

Commit Message

Patrick Steinhardt April 4, 2024, 5:48 a.m. UTC
Large parts of the reftable library do not conform to Git's typical code
style. Refactor `writer_add_record()` such that it conforms better to it
and add some documentation that explains some of its more intricate
behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/writer.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

Comments

Han-Wen Nienhuys April 4, 2024, 6:58 a.m. UTC | #1
On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote:
> +       /*
> +        * Try to add the record to the writer again. If this still fails then
> +        * the record does not fit into the block size.
> +        *
> +        * TODO: it would be great to have `block_writer_add()` return proper
> +        *       error codes so that we don't have to second-guess the failure
> +        *       mode here.
> +        */

The Go code returns a (size, boolean) tuple for the write routines
here, but that does not really work in the Git C style.

If you make the routines return error codes it suggests that the
in-memory write can fail for other reasons beyond "does not fit". Not
sure if that is really an improvement.
Patrick Steinhardt April 4, 2024, 7:32 a.m. UTC | #2
On Thu, Apr 04, 2024 at 08:58:08AM +0200, Han-Wen Nienhuys wrote:
> On Thu, Apr 4, 2024 at 7:48 AM Patrick Steinhardt <ps@pks.im> wrote:
> > +       /*
> > +        * Try to add the record to the writer again. If this still fails then
> > +        * the record does not fit into the block size.
> > +        *
> > +        * TODO: it would be great to have `block_writer_add()` return proper
> > +        *       error codes so that we don't have to second-guess the failure
> > +        *       mode here.
> > +        */
> 
> The Go code returns a (size, boolean) tuple for the write routines
> here, but that does not really work in the Git C style.
> 
> If you make the routines return error codes it suggests that the
> in-memory write can fail for other reasons beyond "does not fit". Not
> sure if that is really an improvement.

In reality, `block_writer_add()` already can fail because of different
reasons: it returns `REFTABLE_API_ERROR` if the passed-in record has an
empty key. This shouldn't ever happen, but it demonstrates that this is
certainly an area which needs some further cleanups.

Patrick
diff mbox series

Patch

diff --git a/reftable/writer.c b/reftable/writer.c
index 1d9ff0fbfa..0ad5eb8887 100644
--- a/reftable/writer.c
+++ b/reftable/writer.c
@@ -209,7 +209,8 @@  static int writer_add_record(struct reftable_writer *w,
 			     struct reftable_record *rec)
 {
 	struct strbuf key = STRBUF_INIT;
-	int err = -1;
+	int err;
+
 	reftable_record_key(rec, &key);
 	if (strbuf_cmp(&w->last_key, &key) >= 0) {
 		err = REFTABLE_API_ERROR;
@@ -218,27 +219,42 @@  static int writer_add_record(struct reftable_writer *w,
 
 	strbuf_reset(&w->last_key);
 	strbuf_addbuf(&w->last_key, &key);
-	if (!w->block_writer) {
+	if (!w->block_writer)
 		writer_reinit_block_writer(w, reftable_record_type(rec));
-	}
 
-	assert(block_writer_type(w->block_writer) == reftable_record_type(rec));
+	if (block_writer_type(w->block_writer) != reftable_record_type(rec))
+		BUG("record of type %d added to writer of type %d",
+		    reftable_record_type(rec), block_writer_type(w->block_writer));
 
-	if (block_writer_add(w->block_writer, rec) == 0) {
+	/*
+	 * Try to add the record to the writer. If this succeeds then we're
+	 * done. Otherwise the block writer may have hit the block size limit
+	 * and needs to be flushed.
+	 */
+	if (!block_writer_add(w->block_writer, rec)) {
 		err = 0;
 		goto done;
 	}
 
+	/*
+	 * The current block is full, so we need to flush and reinitialize the
+	 * writer to start writing the next block.
+	 */
 	err = writer_flush_block(w);
-	if (err < 0) {
+	if (err < 0)
 		goto done;
-	}
-
 	writer_reinit_block_writer(w, reftable_record_type(rec));
+
+	/*
+	 * Try to add the record to the writer again. If this still fails then
+	 * the record does not fit into the block size.
+	 *
+	 * TODO: it would be great to have `block_writer_add()` return proper
+	 *       error codes so that we don't have to second-guess the failure
+	 *       mode here.
+	 */
 	err = block_writer_add(w->block_writer, rec);
-	if (err == -1) {
-		/* we are writing into memory, so an error can only mean it
-		 * doesn't fit. */
+	if (err) {
 		err = REFTABLE_ENTRY_TOO_BIG_ERROR;
 		goto done;
 	}