diff mbox series

[RFC,04/10] ewah/bitmap: always allocate 2 more words

Message ID 20190913130226.7449-5-chriscool@tuxfamily.org (mailing list archive)
State New, archived
Series Rewrite packfile reuse code

Commit Message

Christian Couder Sept. 13, 2019, 1:02 p.m. UTC
From: Jeff King <peff@peff.net>

In a following patch we will allocate a variable number
of words in some bitmaps. When iterating over the words we
will need a mark to tell us when to stop iterating. Let's
always allocate 2 more words, that will always contain 0,
as that mark.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 ewah/bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jonathan Tan Oct. 10, 2019, 11:40 p.m. UTC | #1
> From: Jeff King <peff@peff.net>
> 
> In a following patch we will allocate a variable number
> of words in some bitmaps. When iterating over the words we
> will need a mark to tell us when to stop iterating. Let's
> always allocate 2 more words, that will always contain 0,
> as that mark.

[snip]

>  	if (block >= self->word_alloc) {
>  		size_t old_size = self->word_alloc;
> -		self->word_alloc = block * 2;
> +		self->word_alloc = (block + 1) * 2;
>  		REALLOC_ARRAY(self->words, self->word_alloc);
>  		memset(self->words + old_size, 0x0,
>  			(self->word_alloc - old_size) * sizeof(eword_t));

This patch set was mentioned as needing more thorough review in "What's
Cooking" [1], so I thought I'd give it a try. As Peff said [2], the
justification in the commit message looks incorrect. He suggests that it
is most likely because "block" might be 0 (which is possible because a
previous patch eliminated the minimum of 32), which makes sense to me.

In any case, the next patch does not use 0 as a sentinel mark. Iteration
stops when word_alloc is reached anyway, and since this is a regular
bitmap, 0 is a valid word and cannot be used as a sentinel. (Maybe 0 is
a valid word in a compressed EWAH bitmap too...not sure about that.)

I think this should be squashed with patch 3, adding to that commit
message "since word_alloc might be 0, we need to change the growth
function". (Or just make the minimum word_alloc be 1 or 32 or something
positive, if that's possible.)

[1] https://public-inbox.org/git/xmqq36g5444k.fsf@gitster-ct.c.googlers.com/
[2] https://public-inbox.org/git/20191002155721.GD6116@sigill.intra.peff.net/
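
The point above about 0 being a valid word can be illustrated with a minimal sketch. It assumes 64-bit words and uses hypothetical function names that do not appear in ewah/bitmap.c; it only contrasts iteration bounded by word_alloc with iteration that treats a zero word as a stop mark.

```c
#include <stddef.h>
#include <stdint.h>

/* Count set bits by walking every allocated word; a zero word is
 * just a word with no bits set, not an end marker. */
size_t count_bits_bounded(const uint64_t *words, size_t word_alloc)
{
	size_t i, n = 0;

	for (i = 0; i < word_alloc; i++) {
		uint64_t w = words[i];
		while (w) {
			w &= w - 1; /* clear the lowest set bit */
			n++;
		}
	}
	return n;
}

/* Same count, but stopping at the first zero word as if it were a
 * sentinel. Bits stored after an all-zero word are silently missed. */
size_t count_bits_until_zero(const uint64_t *words, size_t word_alloc)
{
	size_t i, n = 0;

	for (i = 0; i < word_alloc && words[i]; i++) {
		uint64_t w = words[i];
		while (w) {
			w &= w - 1;
			n++;
		}
	}
	return n;
}
```

For words {1, 0, 3}, the bounded walk sees all three set bits, while the sentinel-style walk stops at the second word and reports only one.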

Christian Couder Oct. 11, 2019, 7:49 a.m. UTC | #2
On Fri, Oct 11, 2019 at 1:40 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> > From: Jeff King <peff@peff.net>
> >
> > In a following patch we will allocate a variable number
> > of words in some bitmaps. When iterating over the words we
> > will need a mark to tell us when to stop iterating. Let's
> > always allocate 2 more words, that will always contain 0,
> > as that mark.
>
> [snip]
>
> >       if (block >= self->word_alloc) {
> >               size_t old_size = self->word_alloc;
> > -             self->word_alloc = block * 2;
> > +             self->word_alloc = (block + 1) * 2;
> >               REALLOC_ARRAY(self->words, self->word_alloc);
> >               memset(self->words + old_size, 0x0,
> >                       (self->word_alloc - old_size) * sizeof(eword_t));
>
> This patch set was mentioned as needing more thorough review in "What's
> Cooking" [1], so I thought I'd give it a try.

Thanks!

> As Peff said [2], the
> justification in the commit message looks incorrect. He suggests that it
> is most likely because "block" might be 0 (which is possible because a
> previous patch eliminated the minimum of 32), which makes sense to me.

OK, I will try to come up with a better justification, though Peff said
that he would take another look at this series and I'd rather wait
until he has done that.

> In any case, the next patch does not use 0 as a sentinel mark. Iteration
> stops when word_alloc is reached anyway, and since this is a regular
> bitmap, 0 is a valid word and cannot be used as a sentinel. (Maybe 0 is
> a valid word in a compressed EWAH bitmap too...not sure about that.)

Yeah I misread this. Hopefully Peff can shed some light on this.

> I think this should be squashed with patch 3, adding to that commit
> message "since word_alloc might be 0, we need to change the growth
> function". (Or just make the minimum word_alloc be 1 or 32 or something
> positive, if that's possible.)

Yeah, thank you for the suggestion. I still wonder why 2 is added
instead of just 1 though.

Jeff King Oct. 11, 2019, 6:05 p.m. UTC | #3
On Fri, Oct 11, 2019 at 09:49:53AM +0200, Christian Couder wrote:

> > I think this should be squashed with patch 3, adding to that commit
> > message "since word_alloc might be 0, we need to change the growth
> > function". (Or just make the minimum word_alloc be 1 or 32 or something
> > positive, if that's possible.)
> 
> Yeah, thank you for the suggestion. I still wonder why 2 is added
> instead of just 1 though.

Yeah, I think it should be squashed. I think it is not intentionally 2,
it is just that adding "1" to block makes sure we always make forward
progress. It could equally well be:

  self->word_alloc = block ? block * 2 : 1;

I think. Or probably this whole thing could be ALLOC_GROW(), as the
numbers aren't particularly important. I guess we need to make sure the
grown part is zero'd, so probably using alloc_nr() directly would make
more sense.

-Peff
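
To make the block == 0 case concrete, here is a minimal sketch comparing the three growth rules discussed in this thread. The function names are hypothetical and exist only for illustration; they are not part of ewah/bitmap.c.

```c
#include <stddef.h>

/* Growth rule before the patch: returns 0 when block is 0. */
size_t grow_old(size_t block)
{
	return block * 2;
}

/* Growth rule from the patch: strictly greater than block, even for 0. */
size_t grow_patched(size_t block)
{
	return (block + 1) * 2;
}

/* Alternative floated in the thread: special-case block == 0. */
size_t grow_alt(size_t block)
{
	return block ? block * 2 : 1;
}
```

With block == 0 and word_alloc == 0, the old rule leaves word_alloc at 0, so the following write to words[0] would land outside the allocation. The other two rules both yield a value greater than block, which is the forward progress mentioned above.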

Patch

diff --git a/ewah/bitmap.c b/ewah/bitmap.c
index 143dc71419..eac05485f1 100644
--- a/ewah/bitmap.c
+++ b/ewah/bitmap.c
@@ -41,7 +41,7 @@  void bitmap_set(struct bitmap *self, size_t pos)
 
 	if (block >= self->word_alloc) {
 		size_t old_size = self->word_alloc;
-		self->word_alloc = block * 2;
+		self->word_alloc = (block + 1) * 2;
 		REALLOC_ARRAY(self->words, self->word_alloc);
 		memset(self->words + old_size, 0x0,
 			(self->word_alloc - old_size) * sizeof(eword_t));
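
For readers who want to try the patched growth rule outside of Git, the following is a standalone sketch of the hunk above. It is an approximation under stated assumptions: eword_t is taken to be a 64-bit unsigned integer, plain realloc() with abort() stands in for REALLOC_ARRAY, and the sketch_ names are hypothetical rather than the actual ewah/bitmap.c API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t eword_t; /* assumption: 64-bit words */
#define SKETCH_BITS_IN_EWORD (sizeof(eword_t) * 8)

/* Hypothetical stand-in for struct bitmap. */
struct sketch_bitmap {
	eword_t *words;
	size_t word_alloc;
};

/* Set bit pos, growing the word array with zero-filled words as needed. */
void sketch_bitmap_set(struct sketch_bitmap *self, size_t pos)
{
	size_t block = pos / SKETCH_BITS_IN_EWORD;

	if (block >= self->word_alloc) {
		size_t old_size = self->word_alloc;
		eword_t *grown;

		/* (block + 1) * 2 exceeds block even when block is 0. */
		self->word_alloc = (block + 1) * 2;
		grown = realloc(self->words, self->word_alloc * sizeof(eword_t));
		if (!grown)
			abort(); /* stands in for REALLOC_ARRAY dying on failure */
		self->words = grown;
		memset(self->words + old_size, 0x0,
		       (self->word_alloc - old_size) * sizeof(eword_t));
	}

	/* The invariant the growth rule has to guarantee. */
	assert(block < self->word_alloc);
	self->words[block] |= (eword_t)1 << (pos % SKETCH_BITS_IN_EWORD);
}

/* Return 1 if bit pos is set, 0 otherwise (including past the allocation). */
int sketch_bitmap_get(const struct sketch_bitmap *self, size_t pos)
{
	size_t block = pos / SKETCH_BITS_IN_EWORD;

	return block < self->word_alloc &&
	       ((self->words[block] >> (pos % SKETCH_BITS_IN_EWORD)) & 1);
}
```

Starting from an empty bitmap (words == NULL, word_alloc == 0), setting bit 0 grows the array to 2 words. Swapping the growth line for block * 2 would trip the assertion in that case, since word_alloc would stay at 0.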