diff mbox series

[7/7] reftable/reader: add comments to `table_iter_next()`

Message ID 2f1f1dd95e1cc360bde3547bd18e227a9c326e13.1706782841.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series reftable: improve ref iteration performance | expand

Commit Message

Patrick Steinhardt Feb. 1, 2024, 10:25 a.m. UTC
While working on the optimizations in the preceding patches I stumbled
upon `table_iter_next()` multiple times. It is quite easy to miss the
fact that we don't call `table_iter_next_in_block()` twice, but that the
second call is in fact `table_iter_next_block()`.

Add comments to explain what exactly is going on here to make things
more obvious. While at it, touch up the code to conform to our code
style better.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/reader.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Comments

John Cai Feb. 9, 2024, 4:01 p.m. UTC | #1
Hi Patrick,

On 1 Feb 2024, at 5:25, Patrick Steinhardt wrote:

> While working on the optimizations in the preceding patches I stumbled
> upon `table_iter_next()` multiple times. It is quite easy to miss the
> fact that we don't call `table_iter_next_in_block()` twice, but that the
> second call is in fact `table_iter_next_block()`.
>
> Add comments to explain what exactly is going on here to make things
> more obvious. While at it, touch up the code to conform to our code
> style better.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
>  reftable/reader.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/reftable/reader.c b/reftable/reader.c
> index 64dc366fb1..add7d57f0b 100644
> --- a/reftable/reader.c
> +++ b/reftable/reader.c
> @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
>
>  	while (1) {
>  		struct table_iter next = TABLE_ITER_INIT;
> -		int err = 0;
> -		if (ti->is_finished) {
> +		int err;
> +
> +		if (ti->is_finished)
>  			return 1;
> -		}
>
> +		/*
> +		 * Check whether the current block still has more records. If
> +		 * so, return it. If the iterator returns positive then the
> +		 * current block has been exhausted.
> +		 */
>  		err = table_iter_next_in_block(ti, rec);
> -		if (err <= 0) {
> +		if (err <= 0)
>  			return err;
> -		}
>
> +		/*
> +		 * Otherwise, we need to continue to the next block in the
> +		 * table and retry. If there are no more blocks then the
> +		 * iterator is drained.
> +		 */
>  		err = table_iter_next_block(&next, ti);
> -		if (err != 0) {
> -			ti->is_finished = 1;
> -		}
>  		table_iter_block_done(ti);
> -		if (err != 0) {
> +		if (err) {

what's the reason for moving the if statement that handles err down after
table_iter_block_done?

> +			ti->is_finished = 1;
>  			return err;
>  		}
> +
>  		table_iter_copy_from(ti, &next);
>  		block_iter_close(&next.bi);
>  	}
> -- 
> 2.43.GIT

Thanks
John
Patrick Steinhardt Feb. 12, 2024, 8:24 a.m. UTC | #2
On Fri, Feb 09, 2024 at 11:01:13AM -0500, John Cai wrote:
> 
> Hi Patrick,
> 
> On 1 Feb 2024, at 5:25, Patrick Steinhardt wrote:
> 
> > While working on the optimizations in the preceding patches I stumbled
> > upon `table_iter_next()` multiple times. It is quite easy to miss the
> > fact that we don't call `table_iter_next_in_block()` twice, but that the
> > second call is in fact `table_iter_next_block()`.
> >
> > Add comments to explain what exactly is going on here to make things
> > more obvious. While at it, touch up the code to conform to our code
> > style better.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> >  reftable/reader.c | 26 +++++++++++++++++---------
> >  1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/reftable/reader.c b/reftable/reader.c
> > index 64dc366fb1..add7d57f0b 100644
> > --- a/reftable/reader.c
> > +++ b/reftable/reader.c
> > @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
> >
> >  	while (1) {
> >  		struct table_iter next = TABLE_ITER_INIT;
> > -		int err = 0;
> > -		if (ti->is_finished) {
> > +		int err;
> > +
> > +		if (ti->is_finished)
> >  			return 1;
> > -		}
> >
> > +		/*
> > +		 * Check whether the current block still has more records. If
> > +		 * so, return it. If the iterator returns positive then the
> > +		 * current block has been exhausted.
> > +		 */
> >  		err = table_iter_next_in_block(ti, rec);
> > -		if (err <= 0) {
> > +		if (err <= 0)
> >  			return err;
> > -		}
> >
> > +		/*
> > +		 * Otherwise, we need to continue to the next block in the
> > +		 * table and retry. If there are no more blocks then the
> > +		 * iterator is drained.
> > +		 */
> >  		err = table_iter_next_block(&next, ti);
> > -		if (err != 0) {
> > -			ti->is_finished = 1;
> > -		}
> >  		table_iter_block_done(ti);
> > -		if (err != 0) {
> > +		if (err) {
> 
> what's the reason for moving the if statement that handles err down after
> table_iter_block_done?

Good question. Ultimately, it's a simplification because I just merge
the two blocks which checked for `err != 0` into a single block. There
is no need to mark the iterator as finished before calling
`table_iter_block_done()`.

So becaiuse `table_iter_block_done()` doesn't inspect `is_finished`,
these two implementations are in the end equivalent. Before:

```
if (err)
    ti->is_finished = 1;
table_iter_block_done(ti);
if (err)
    return err;
```

After:

```
table_iter_block_done(ti);
if (err) {
    ti->is_finished = 1;
    return err;
}
```

The latter is much easier to reason about I think. It's also more
efficient because there's one branch less.

Patrick
diff mbox series

Patch

diff --git a/reftable/reader.c b/reftable/reader.c
index 64dc366fb1..add7d57f0b 100644
--- a/reftable/reader.c
+++ b/reftable/reader.c
@@ -357,24 +357,32 @@  static int table_iter_next(struct table_iter *ti, struct reftable_record *rec)
 
 	while (1) {
 		struct table_iter next = TABLE_ITER_INIT;
-		int err = 0;
-		if (ti->is_finished) {
+		int err;
+
+		if (ti->is_finished)
 			return 1;
-		}
 
+		/*
+		 * Check whether the current block still has more records. If
+		 * so, return it. If the iterator returns positive then the
+		 * current block has been exhausted.
+		 */
 		err = table_iter_next_in_block(ti, rec);
-		if (err <= 0) {
+		if (err <= 0)
 			return err;
-		}
 
+		/*
+		 * Otherwise, we need to continue to the next block in the
+		 * table and retry. If there are no more blocks then the
+		 * iterator is drained.
+		 */
 		err = table_iter_next_block(&next, ti);
-		if (err != 0) {
-			ti->is_finished = 1;
-		}
 		table_iter_block_done(ti);
-		if (err != 0) {
+		if (err) {
+			ti->is_finished = 1;
 			return err;
 		}
+
 		table_iter_copy_from(ti, &next);
 		block_iter_close(&next.bi);
 	}