diff mbox series

[01/23] ewah/ewah_bitmap.c: grow buffer past 1

Message ID 36deaad366d66d10b96755dd6969bfe51123a2d4.1605123652.git.me@ttaylorr.com (mailing list archive)
State Superseded
Headers show
Series pack-bitmap: bitmap generation improvements | expand

Commit Message

Taylor Blau Nov. 11, 2020, 7:41 p.m. UTC
When the buffer size is exactly 1, we fail to grow it properly, since
the integer truncation means that 1 * 3 / 2 = 1. This can cause a bad
write on the line below.

Bandaid this by first padding the buffer by 16, and then growing it.
This still allows old blocks to fit into new ones, but fixes the case
where the block size equals 1.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 ewah/ewah_bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Junio C Hamano Nov. 22, 2020, 7:36 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> When the buffer size is exactly 1, we fail to grow it properly, since
> the integer truncation means that 1 * 3 / 2 = 1. This can cause a bad
> write on the line below.

When the buffer_size is exactly (alloc_size - 1), we can fit the new
element at the last word in the buffer array, but we still grow.  Is
this because we anticipate that we would need to add more soon?

> Bandaid this by first padding the buffer by 16, and then growing it.
> This still allows old blocks to fit into new ones, but fixes the case
> where the block size equals 1.

Adding 16 unconditionally is not "to pad".  If somebody really wants
"to pad", a likely implementation would be that the size resulting
from some computation (e.g. multiplying by 1.5) is round up to a
multiple of some number, than rounding up the original number before
multiplying it by 1.5, so the use of that verb in the explanation
did not help me understand what is going on.

Having said that, I see you used the word "bandaid" to signal that
we shouldn't worry about this being optimal or even correct and we
should be happy as long as it is not wrong ;-), but is there any
reason behind this 16 (as opposed to picking, say, 8 or 31), or is
that pulled out of thin air?

I think this probably mimics what alloc_nr() computes for ALLOC_GROW().
I wonder why buffer_grow() cannot be built around ALLOC_GROW() instead?

Nothing in the code is wrong per-se, but just what I noticed while
re-reading the patch.

Thanks.

> Co-authored-by: Jeff King <peff@peff.net>
> Signed-off-by: Jeff King <peff@peff.net>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  ewah/ewah_bitmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
> index d59b1afe3d..3fae04ad00 100644
> --- a/ewah/ewah_bitmap.c
> +++ b/ewah/ewah_bitmap.c
> @@ -45,7 +45,7 @@ static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
>  static inline void buffer_push(struct ewah_bitmap *self, eword_t value)
>  {
>  	if (self->buffer_size + 1 >= self->alloc_size)
> -		buffer_grow(self, self->buffer_size * 3 / 2);
> +		buffer_grow(self, (self->buffer_size + 16) * 3 / 2);
>  
>  	self->buffer[self->buffer_size++] = value;
>  }
Taylor Blau Nov. 23, 2020, 4:22 p.m. UTC | #2
On Sun, Nov 22, 2020 at 11:36:17AM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > When the buffer size is exactly 1, we fail to grow it properly, since
> > the integer truncation means that 1 * 3 / 2 = 1. This can cause a bad
> > write on the line below.
>
> When the buffer_size is exactly (alloc_size - 1), we can fit the new
> element at the last word in the buffer array, but we still grow.  Is
> this because we anticipate that we would need to add more soon?

Right; the check 'if (self->buffer_size + 1 >= self->alloc_size)' could
probably be written as a strict inequality, but that check dates back to
the original ewah implementation that Vicent added.

But, that is not quite the point of this patch: instead we want to stop
the integer math on that line from preventing us from growing the
buffer.

I think that this paragraph would be clarified by adding "and we need to
grow" to the end of "when the buffer size is exactly 1".

> > Bandaid this by first padding the buffer by 16, and then growing it.
> > This still allows old blocks to fit into new ones, but fixes the case
> > where the block size equals 1.
>
> Adding 16 unconditionally is not "to pad".  If somebody really wants
> "to pad", a likely implementation would be that the size resulting
> from some computation (e.g. multiplying by 1.5) is round up to a
> multiple of some number, than rounding up the original number before
> multiplying it by 1.5, so the use of that verb in the explanation
> did not help me understand what is going on.
>
> Having said that, I see you used the word "bandaid" to signal that
> we shouldn't worry about this being optimal or even correct and we
> should be happy as long as it is not wrong ;-), but is there any
> reason behind this 16 (as opposed to picking, say, 8 or 31), or is
> that pulled out of thin air?

Any phrase that more accurately states what's going on is fine by me,
but...

> I think this probably mimics what alloc_nr() computes for ALLOC_GROW().
> I wonder why buffer_grow() cannot be built around ALLOC_GROW() instead?

I think that we probably could just use ALLOC_GROW() as you suggest.
Funny enough, reading through GitHub's chat logs, apparently this is
something that Peff and I talked about. So, 16 probably came from
alloc_nr(), but we probably stopped short of realizing that we could
just use ALLOC_GROW as-is.

So, maybe something along the lines of:

diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index 3fae04ad00..9effcc0877 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -19,6 +19,7 @@
 #include "git-compat-util.h"
 #include "ewok.h"
 #include "ewok_rlw.h"
+#include "cache.h"

 static inline size_t min_size(size_t a, size_t b)
 {
@@ -33,12 +34,7 @@ static inline size_t max_size(size_t a, size_t b)
 static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 {
 	size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer;
-
-	if (self->alloc_size >= new_size)
-		return;
-
-	self->alloc_size = new_size;
-	REALLOC_ARRAY(self->buffer, self->alloc_size);
+	ALLOC_GROW(self->buffer, new_size, self->alloc_size);
 	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
 }

> Nothing in the code is wrong per-se, but just what I noticed while
> re-reading the patch.
>
> Thanks.

Thanks,
Taylor
Jeff King Nov. 24, 2020, 2:48 a.m. UTC | #3
On Mon, Nov 23, 2020 at 11:22:04AM -0500, Taylor Blau wrote:

> > I think this probably mimics what alloc_nr() computes for ALLOC_GROW().
> > I wonder why buffer_grow() cannot be built around ALLOC_GROW() instead?
> 
> I think that we probably could just use ALLOC_GROW() as you suggest.
> Funny enough, reading through GitHub's chat logs, apparently this is
> something that Peff and I talked about. So, 16 probably came from
> alloc_nr(), but we probably stopped short of realizing that we could
> just use ALLOC_GROW as-is.

That would probably be OK. It's a bit more aggressive, which could
matter if you have a large number of very small bitmaps. My original
goal of the "grow less aggressively" patch was to keep memory usage
down, knowing that I was going to be holding a lot of bitmaps in memory
at once. But even with micro-optimizations like this, it turned out to
be far too big in practice (and hence Stolee's work on top to reduce the
total number we hold at once).

> @@ -33,12 +34,7 @@ static inline size_t max_size(size_t a, size_t b)
>  static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
>  {
>  	size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer;
> -
> -	if (self->alloc_size >= new_size)
> -		return;
> -
> -	self->alloc_size = new_size;
> -	REALLOC_ARRAY(self->buffer, self->alloc_size);
> +	ALLOC_GROW(self->buffer, new_size, self->alloc_size);
>  	self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
>  }

I think the real test would be measuring the peak heap of the series as
you posted it in v2, and this version replacing this patch (and the
"grow less aggressively" one) with ALLOC_GROW(). On something big, like
repacking all of the torvalds/linux or git/git fork networks.

If there's no appreciable difference, then definitely I think it's worth
the simplicity of reusing ALLOC_GROW().

-Peff
Jeff King Nov. 24, 2020, 2:51 a.m. UTC | #4
On Mon, Nov 23, 2020 at 09:48:22PM -0500, Jeff King wrote:

> > I think that we probably could just use ALLOC_GROW() as you suggest.
> > Funny enough, reading through GitHub's chat logs, apparently this is
> > something that Peff and I talked about. So, 16 probably came from
> > alloc_nr(), but we probably stopped short of realizing that we could
> > just use ALLOC_GROW as-is.
> 
> That would probably be OK. It's a bit more aggressive, which could
> matter if you have a large number of very small bitmaps. My original
> goal of the "grow less aggressively" patch was to keep memory usage
> down, knowing that I was going to be holding a lot of bitmaps in memory
> at once. But even with micro-optimizations like this, it turned out to
> be far too big in practice (and hence Stolee's work on top to reduce the
> total number we hold at once).

Oh, sorry, I was mixing this patch up with patches 6 and 7, which touch
buffer_grow().  This is a totally separate spot, and this is a pure
bug-fix.

I think the main reason we didn't use ALLOC_GROW() here in the beginning
is that the ewah code was originally designed to be a separate library
(a port of the java ewah library), and didn't depend on Git code.

These days we pull in xmalloc, etc, so we should be fine to use
ALLOC_GROW().

Likewise...

> I think the real test would be measuring the peak heap of the series as
> you posted it in v2, and this version replacing this patch (and the
> "grow less aggressively" one) with ALLOC_GROW(). On something big, like
> repacking all of the torvalds/linux or git/git fork networks.
> 
> If there's no appreciable difference, then definitely I think it's worth
> the simplicity of reusing ALLOC_GROW().

All of this is nonsense (though it does apply to the question of using
ALLOC_GROW() in bitmap_grow(), via patch 7).

We have many fewer ewah bitmaps in memory at one time, so I don't think
it's worth micro-managing a few extra bytes of growth. Using
ALLOC_GROW() for this case would be fine.

-Peff
Taylor Blau Dec. 1, 2020, 10:56 p.m. UTC | #5
On Mon, Nov 23, 2020 at 09:51:41PM -0500, Jeff King wrote:
> On Mon, Nov 23, 2020 at 09:48:22PM -0500, Jeff King wrote:
>
> > > I think that we probably could just use ALLOC_GROW() as you suggest.
> > > Funny enough, reading through GitHub's chat logs, apparently this is
> > > something that Peff and I talked about. So, 16 probably came from
> > > alloc_nr(), but we probably stopped short of realizing that we could
> > > just use ALLOC_GROW as-is.
> >
> > That would probably be OK. It's a bit more aggressive, which could
> > matter if you have a large number of very small bitmaps. My original
> > goal of the "grow less aggressively" patch was to keep memory usage
> > down, knowing that I was going to be holding a lot of bitmaps in memory
> > at once. But even with micro-optimizations like this, it turned out to
> > be far too big in practice (and hence Stolee's work on top to reduce the
> > total number we hold at once).
>
> Oh, sorry, I was mixing this patch up with patches 6 and 7, which touch
> buffer_grow().  This is a totally separate spot, and this is a pure
> bug-fix.
>
> I think the main reason we didn't use ALLOC_GROW() here in the beginning
> is that the ewah code was originally designed to be a separate library
> (a port of the java ewah library), and didn't depend on Git code.
>
> These days we pull in xmalloc, etc, so we should be fine to use
> ALLOC_GROW().
>
> Likewise...
>
> > I think the real test would be measuring the peak heap of the series as
> > you posted it in v2, and this version replacing this patch (and the
> > "grow less aggressively" one) with ALLOC_GROW(). On something big, like
> > repacking all of the torvalds/linux or git/git fork networks.
> >
> > If there's no appreciable difference, then definitely I think it's worth
> > the simplicity of reusing ALLOC_GROW().
>
> All of this is nonsense (though it does apply to the question of using
> ALLOC_GROW() in bitmap_grow(), via patch 7).

You and I timed this a week or two ago, but I only just returned to this
topic today. Switching to ALLOC_GROW() doesn't affect the final memory
usage at all, so I changed patch 7 up to use that instead of more or
less open-coding alloc_nr().

> -Peff

Thanks,
Taylor
diff mbox series

Patch

diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index d59b1afe3d..3fae04ad00 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -45,7 +45,7 @@  static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
 static inline void buffer_push(struct ewah_bitmap *self, eword_t value)
 {
 	if (self->buffer_size + 1 >= self->alloc_size)
-		buffer_grow(self, self->buffer_size * 3 / 2);
+		buffer_grow(self, (self->buffer_size + 16) * 3 / 2);
 
 	self->buffer[self->buffer_size++] = value;
 }