[7/7] reftable/reader: add comments to `table_iter_next()`

Message ID	2f1f1dd95e1cc360bde3547bd18e227a9c326e13.1706782841.git.ps@pks.im (mailing list archive)
State	Superseded
Headers
Received: from wout5-smtp.messagingengine.com (wout5-smtp.messagingengine.com [64.147.123.21])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDBA01586E6
	for <git@vger.kernel.org>; Thu, 1 Feb 2024 10:25:25 +0000 (UTC)
Feedback-ID: i197146af:Fastmail
Date: Thu, 1 Feb 2024 11:25:22 +0100
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Subject: [PATCH 7/7] reftable/reader: add comments to `table_iter_next()`
Message-ID: <2f1f1dd95e1cc360bde3547bd18e227a9c326e13.1706782841.git.ps@pks.im>
References: <cover.1706782841.git.ps@pks.im>
Precedence: bulk
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
	protocol="application/pgp-signature"; boundary="pxcd3/tq9LlF0NPd"
Content-Disposition: inline
In-Reply-To: <cover.1706782841.git.ps@pks.im>
Series	reftable: improve ref iteration performance
[0/7] reftable: improve ref iteration performance
[1/7] reftable/record: introduce function to compare records by key
[2/7] reftable/merged: allocation-less dropping of shadowed records
[3/7] reftable/merged: skip comparison for records of the same subiter
[4/7] reftable/pq: allocation-less comparison of entry keys
[5/7] reftable/block: swap buffers instead of copying
[6/7] reftable/record: don't try to reallocate ref record name
[7/7] reftable/reader: add comments to `table_iter_next()`

Commit Message

Patrick Steinhardt Feb. 1, 2024, 10:25 a.m. UTC

While working on the optimizations in the preceding patches I stumbled
upon `table_iter_next()` multiple times. It is quite easy to miss the
fact that we don't call `table_iter_next_in_block()` twice, but that the
second call is in fact `table_iter_next_block()`.

Add comments to explain what exactly is going on here to make things
more obvious. While at it, touch up the code to conform to our code
style better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/reader.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Comments

John Cai Feb. 9, 2024, 4:01 p.m. UTC | #1

Hi Patrick,

On 1 Feb 2024, at 5:25, Patrick Steinhardt wrote:

> While working on the optimizations in the preceding patches I stumbled
> upon `table_iter_next()` multiple times. It is quite easy to miss the
> fact that we don't call `table_iter_next_in_block()` twice, but that the
> second call is in fact `table_iter_next_block()`.
>
> Add comments to explain what exactly is going on here to make things
> more obvious. While at it, touch up the code to conform to our code
> style better.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  reftable/reader.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/reftable/reader.c b/reftable/reader.c
> index 64dc366fb1..add7d57f0b 100644
> --- a/reftable/reader.c
> +++ b/reftable/reader.c
> @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
>
>  	while (1) {
>  		struct table_iter next = TABLE_ITER_INIT;
> -		int err = 0;
> -		if (ti->is_finished) {
> +		int err;
> +
> +		if (ti->is_finished)
>  			return 1;
> -		}
>
> +		/*
> +		 * Check whether the current block still has more records. If
> +		 * so, return it. If the iterator returns positive then the
> +		 * current block has been exhausted.
> +		 */
>  		err = table_iter_next_in_block(ti, rec);
> -		if (err <= 0) {
> +		if (err <= 0)
>  			return err;
> -		}
>
> +		/*
> +		 * Otherwise, we need to continue to the next block in the
> +		 * table and retry. If there are no more blocks then the
> +		 * iterator is drained.
> +		 */
>  		err = table_iter_next_block(&next, ti);
> -		if (err != 0) {
> -			ti->is_finished = 1;
> -		}
>  		table_iter_block_done(ti);
> -		if (err != 0) {
> +		if (err) {

what's the reason for moving the if statement that handles err down after
table_iter_block_done?

> +			ti->is_finished = 1;
>  			return err;
>  		}
> +
>  		table_iter_copy_from(ti, &next);
>  		block_iter_close(&next.bi);
>  	}
> -- 
> 2.43.GIT

Thanks
John

Patrick Steinhardt Feb. 12, 2024, 8:24 a.m. UTC | #2

On Fri, Feb 09, 2024 at 11:01:13AM -0500, John Cai wrote:
> 
> Hi Patrick,
> 
> On 1 Feb 2024, at 5:25, Patrick Steinhardt wrote:
> 
> > While working on the optimizations in the preceding patches I stumbled
> > upon `table_iter_next()` multiple times. It is quite easy to miss the
> > fact that we don't call `table_iter_next_in_block()` twice, but that the
> > second call is in fact `table_iter_next_block()`.
> >
> > Add comments to explain what exactly is going on here to make things
> > more obvious. While at it, touch up the code to conform to our code
> > style better.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  reftable/reader.c | 26 +++++++++++++++++---------
> >  1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/reftable/reader.c b/reftable/reader.c
> > index 64dc366fb1..add7d57f0b 100644
> > --- a/reftable/reader.c
> > +++ b/reftable/reader.c
> > @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
> >
> >  	while (1) {
> >  		struct table_iter next = TABLE_ITER_INIT;
> > -		int err = 0;
> > -		if (ti->is_finished) {
> > +		int err;
> > +
> > +		if (ti->is_finished)
> >  			return 1;
> > -		}
> >
> > +		/*
> > +		 * Check whether the current block still has more records. If
> > +		 * so, return it. If the iterator returns positive then the
> > +		 * current block has been exhausted.
> > +		 */
> >  		err = table_iter_next_in_block(ti, rec);
> > -		if (err <= 0) {
> > +		if (err <= 0)
> >  			return err;
> > -		}
> >
> > +		/*
> > +		 * Otherwise, we need to continue to the next block in the
> > +		 * table and retry. If there are no more blocks then the
> > +		 * iterator is drained.
> > +		 */
> >  		err = table_iter_next_block(&next, ti);
> > -		if (err != 0) {
> > -			ti->is_finished = 1;
> > -		}
> >  		table_iter_block_done(ti);
> > -		if (err != 0) {
> > +		if (err) {
> 
> what's the reason for moving the if statement that handles err down after
> table_iter_block_done?

Good question. Ultimately, it's a simplification: I merged the two
blocks that checked for `err != 0` into a single block. There is no
need to mark the iterator as finished before calling
`table_iter_block_done()`.

So because `table_iter_block_done()` doesn't inspect `is_finished`,
these two implementations are in the end equivalent. Before:

```
if (err)
    ti->is_finished = 1;
table_iter_block_done(ti);
if (err)
    return err;
```

After:

```
table_iter_block_done(ti);
if (err) {
    ti->is_finished = 1;
    return err;
}
```

The latter is much easier to reason about, I think. It's also more
efficient because there's one fewer branch.

Patrick
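
The equivalence argued above can be checked in isolation. The following is
a minimal sketch, not code from reftable: `struct mock_iter`,
`mock_block_done()`, `advance_before()`, `advance_after()` and
`orderings_agree()` are all hypothetical names, and the only assumption
carried over from the thread is that the cleanup function never reads
`is_finished`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `struct table_iter`; only what the demo needs. */
struct mock_iter {
	int is_finished;
	int blocks_released; /* counts calls to mock_block_done() */
};

/*
 * Hypothetical stand-in for `table_iter_block_done()`. As assumed in the
 * thread, it releases the block and never looks at `is_finished`.
 */
void mock_block_done(struct mock_iter *it)
{
	assert(it != NULL);
	it->blocks_released++;
}

/* Old ordering: mark finished, release the block, then return the error. */
int advance_before(struct mock_iter *it, int err)
{
	if (err)
		it->is_finished = 1;
	mock_block_done(it);
	if (err)
		return err;
	return 0;
}

/* New ordering: release the block first, then handle the error once. */
int advance_after(struct mock_iter *it, int err)
{
	mock_block_done(it);
	if (err) {
		it->is_finished = 1;
		return err;
	}
	return 0;
}

/*
 * Returns 1 if both orderings produce the same return value and leave
 * the iterator in the same state for the given `err`, 0 otherwise.
 */
int orderings_agree(int err)
{
	struct mock_iter a = { 0, 0 }, b = { 0, 0 };
	int ra = advance_before(&a, err);
	int rb = advance_after(&b, err);

	return ra == rb && a.is_finished == b.is_finished &&
	       a.blocks_released == b.blocks_released;
}
```

Calling `orderings_agree()` with 0, a positive value and a negative value
covers the three cases the loop distinguishes. If the cleanup function
did read `is_finished`, the two orderings could diverge and the reordering
would no longer be a pure simplification.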

diff --git a/reftable/reader.c b/reftable/reader.c
index 64dc366fb1..add7d57f0b 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -357,24 +357,32 @@  static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
 
 	while (1) {
 		struct table_iter next = TABLE_ITER_INIT;
-		int err = 0;
-		if (ti->is_finished) {
+		int err;
+
+		if (ti->is_finished)
 			return 1;
-		}
 
+		/*
+		 * Check whether the current block still has more records. If
+		 * so, return it. If the iterator returns positive then the
+		 * current block has been exhausted.
+		 */
 		err = table_iter_next_in_block(ti, rec);
-		if (err <= 0) {
+		if (err <= 0)
 			return err;
-		}
 
+		/*
+		 * Otherwise, we need to continue to the next block in the
+		 * table and retry. If there are no more blocks then the
+		 * iterator is drained.
+		 */
 		err = table_iter_next_block(&next, ti);
-		if (err != 0) {
-			ti->is_finished = 1;
-		}
 		table_iter_block_done(ti);
-		if (err != 0) {
+		if (err) {
+			ti->is_finished = 1;
 			return err;
 		}
+
 		table_iter_copy_from(ti, &next);
 		block_iter_close(&next.bi);
 	}
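
To make the control flow of the patched loop easier to follow outside of
the diff, here is a simplified, self-contained sketch. It is not the
reftable implementation: the `mock_*` types and functions are hypothetical,
records are plain integers, and blocks live in an in-memory array, so block
cleanup and iterator copying are omitted. Only the return-value convention
is taken from the patch (0 for a record, positive for "exhausted").

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block: a fixed array of integer "records". */
struct mock_block {
	const int *records;
	size_t len;
};

/* Hypothetical iterator over an array of blocks. */
struct mock_table_iter {
	const struct mock_block *blocks;
	size_t nblocks;
	size_t block_idx; /* block currently being read */
	size_t pos;       /* next record inside that block */
	int is_finished;
};

/* Returns 0 and stores a record, or 1 if the current block is exhausted. */
int mock_next_in_block(struct mock_table_iter *ti, int *rec)
{
	if (ti->block_idx >= ti->nblocks ||
	    ti->pos >= ti->blocks[ti->block_idx].len)
		return 1;
	*rec = ti->blocks[ti->block_idx].records[ti->pos++];
	return 0;
}

/* Returns 0 after moving to the next block, or 1 if there is none. */
int mock_next_block(struct mock_table_iter *ti)
{
	if (ti->block_idx + 1 >= ti->nblocks)
		return 1;
	ti->block_idx++;
	ti->pos = 0;
	return 0;
}

/* Same shape as the patched loop: try the block, else advance and retry. */
int mock_table_iter_next(struct mock_table_iter *ti, int *rec)
{
	while (1) {
		int err;

		if (ti->is_finished)
			return 1;

		/* Still records left in the current block? Return one. */
		err = mock_next_in_block(ti, rec);
		if (err <= 0)
			return err;

		/* Block exhausted: move on, or mark the iterator drained. */
		err = mock_next_block(ti);
		if (err) {
			ti->is_finished = 1;
			return err;
		}
	}
}
```

The `while (1)` matters when a block yields no records: the loop simply
advances again instead of returning early, which is why the second call
has to be the "next block" function rather than a repeated "next in block".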
