diff mbox series

[1/2] mem-pool: add mem_pool_strfmt()

Message ID 20240225113947.89357-2-l.s.r@web.de (mailing list archive)
State New
Headers show
Series name-rev: use memory pool for name strings | expand

Commit Message

René Scharfe Feb. 25, 2024, 11:39 a.m. UTC
Add a function for building a string, printf style, using a memory pool.
It uses the free space in the current block in the first attempt.  If
that suffices then the result can already be used without copying or
reformatting.

For strings that are significantly shorter on average than the block
size (ca. 1 MiB by default) this is the case most of the time, leading
to a better perfomance than a solution that doesn't access mem-pool
internals.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 mem-pool.c | 39 +++++++++++++++++++++++++++++++++++++++
 mem-pool.h |  5 +++++
 2 files changed, 44 insertions(+)

--
2.44.0

Comments

Jeff King Feb. 26, 2024, 7:08 a.m. UTC | #1
On Sun, Feb 25, 2024 at 12:39:44PM +0100, René Scharfe wrote:

> +static char *mem_pool_strvfmt(struct mem_pool *pool, const char *fmt,
> +			      va_list ap)
> +{
> +	struct mp_block *block = pool->mp_block;
> +	char *next_free = block ? block->next_free : NULL;
> +	size_t available = block ? block->end - block->next_free : 0;
> +	va_list cp;
> +	int len, len2;
> +	char *ret;
> +
> +	va_copy(cp, ap);
> +	len = vsnprintf(next_free, available, fmt, cp);
> +	va_end(cp);
> +	if (len < 0)
> +		BUG("your vsnprintf is broken (returned %d)", len);
> +
> +	ret = mem_pool_alloc(pool, len + 1);  /* 1 for NUL */
> +
> +	/* Shortcut; relies on mem_pool_alloc() not touching buffer contents. */
> +	if (ret == next_free)
> +		return ret;
> +
> +	len2 = vsnprintf(ret, len + 1, fmt, ap);
> +	if (len2 != len)
> +		BUG("your vsnprintf is broken (returns inconsistent lengths)");
> +	return ret;
> +}

This is pulling heavily from strbuf_vaddf(). This might be a dumb idea,
but... would it be reasonable to instead push a global flag that causes
xmalloc() to use a memory pool instead of the regular heap?

Then you could do something like:

  push_mem_pool(pool);
  str = xstrfmt("%.*s~%d^%d", ...etc...);
  pop_mem_pool(pool);

It's a little more involved at the caller, but it means that it now
works for all allocations, not just this one string helper.

Obviously you'd want it to be a thread-local value to prevent races. But
I still wonder if it could cause havoc when some sub-function makes an
allocation that the caller does not expect.

-Peff
René Scharfe Feb. 26, 2024, 6:17 p.m. UTC | #2
Am 26.02.24 um 08:08 schrieb Jeff King:
> On Sun, Feb 25, 2024 at 12:39:44PM +0100, René Scharfe wrote:
>
>> +static char *mem_pool_strvfmt(struct mem_pool *pool, const char *fmt,
>> +			      va_list ap)
>> +{
>> +	struct mp_block *block = pool->mp_block;
>> +	char *next_free = block ? block->next_free : NULL;
>> +	size_t available = block ? block->end - block->next_free : 0;
>> +	va_list cp;
>> +	int len, len2;
>> +	char *ret;
>> +
>> +	va_copy(cp, ap);
>> +	len = vsnprintf(next_free, available, fmt, cp);
>> +	va_end(cp);
>> +	if (len < 0)
>> +		BUG("your vsnprintf is broken (returned %d)", len);
>> +
>> +	ret = mem_pool_alloc(pool, len + 1);  /* 1 for NUL */
>> +
>> +	/* Shortcut; relies on mem_pool_alloc() not touching buffer contents. */
>> +	if (ret == next_free)
>> +		return ret;
>> +
>> +	len2 = vsnprintf(ret, len + 1, fmt, ap);
>> +	if (len2 != len)
>> +		BUG("your vsnprintf is broken (returns inconsistent lengths)");
>> +	return ret;
>> +}
>
> This is pulling heavily from strbuf_vaddf(). This might be a dumb idea,
> but... would it be reasonable to instead push a global flag that causes
> xmalloc() to use a memory pool instead of the regular heap?
>
> Then you could do something like:
>
>   push_mem_pool(pool);
>   str = xstrfmt("%.*s~%d^%d", ...etc...);
>   pop_mem_pool(pool);
>
> It's a little more involved at the caller, but it means that it now
> works for all allocations, not just this one string helper.

That would allow to keep track of allocations that would otherwise leak.
We can achieve that more easily by pushing the pointer to a global array
and never freeing it.  Hmm.

It would not allow the shortcut of using the vast pool as a scratch
space to format the string with a single vsnprintf call in most cases.
Or am I missing something?

René
Jeff King Feb. 27, 2024, 7:53 a.m. UTC | #3
On Mon, Feb 26, 2024 at 07:17:00PM +0100, René Scharfe wrote:

> > This is pulling heavily from strbuf_vaddf(). This might be a dumb idea,
> > but... would it be reasonable to instead push a global flag that causes
> > xmalloc() to use a memory pool instead of the regular heap?
> >
> > Then you could do something like:
> >
> >   push_mem_pool(pool);
> >   str = xstrfmt("%.*s~%d^%d", ...etc...);
> >   pop_mem_pool(pool);
> >
> > It's a little more involved at the caller, but it means that it now
> > works for all allocations, not just this one string helper.
> 
> That would allow to keep track of allocations that would otherwise leak.
> We can achieve that more easily by pushing the pointer to a global array
> and never freeing it.  Hmm.

I see two uses for memory pools:

  1. You want to optimize allocations (either locality or per-allocation
     overhead).

  2. You want to make a bunch of allocations with the same lifetime
     without worrying about their lifetime otherwise. And then you can
     free them all in one swoop at the end.

And my impression is that you were interested in (2) here. If what
you're saying is that another way to do that is:

  str = xstrfmt(...whatever...);
  attach_to_pool(pool, str);

  ...much later...
  free_pool(pool);

then yeah, I'd agree. And that is a lot less tricky / invasive than what
I suggested.

> It would not allow the shortcut of using the vast pool as a scratch
> space to format the string with a single vsnprintf call in most cases.
> Or am I missing something?

So here it sounds like you do care about some of the performance
aspects. So no, it would not allow that. You'd be using the vast pool of
heap memory provided by malloc(), and trusting that a call to malloc()
is not that expensive in practice. I don't know how true that is, or how
much it matters for this case.

-Peff
René Scharfe Feb. 27, 2024, 5:58 p.m. UTC | #4
Am 27.02.24 um 08:53 schrieb Jeff King:
> On Mon, Feb 26, 2024 at 07:17:00PM +0100, René Scharfe wrote:
>
>>> This is pulling heavily from strbuf_vaddf(). This might be a dumb idea,
>>> but... would it be reasonable to instead push a global flag that causes
>>> xmalloc() to use a memory pool instead of the regular heap?
>>>
>>> Then you could do something like:
>>>
>>>   push_mem_pool(pool);
>>>   str = xstrfmt("%.*s~%d^%d", ...etc...);
>>>   pop_mem_pool(pool);
>>>
>>> It's a little more involved at the caller, but it means that it now
>>> works for all allocations, not just this one string helper.
>>
>> That would allow to keep track of allocations that would otherwise leak.
>> We can achieve that more easily by pushing the pointer to a global array
>> and never freeing it.  Hmm.
>
> I see two uses for memory pools:
>
>   1. You want to optimize allocations (either locality or per-allocation
>      overhead).
>
>   2. You want to make a bunch of allocations with the same lifetime
>      without worrying about their lifetime otherwise. And then you can
>      free them all in one swoop at the end.
>
> And my impression is that you were interested in (2) here. If what
> you're saying is that another way to do that is:
>
>   str = xstrfmt(...whatever...);
>   attach_to_pool(pool, str);
>
>   ...much later...
>   free_pool(pool);
>
> then yeah, I'd agree. And that is a lot less tricky / invasive than what
> I suggested.
>
>> It would not allow the shortcut of using the vast pool as a scratch
>> space to format the string with a single vsnprintf call in most cases.
>> Or am I missing something?
>
> So here it sounds like you do care about some of the performance
> aspects. So no, it would not allow that. You'd be using the vast pool of
> heap memory provided by malloc(), and trusting that a call to malloc()
> is not that expensive in practice. I don't know how true that is, or how
> much it matters for this case.

In the specific use case we can look at three cases:

1. xstrfmt() [before 1c56fc2084]
2. size calculation + pre-size + strbuf_addf() [1c56fc2084]
3. mem_pool_strfmt() [this patch]

The performance of 2 and 3 is basically the same, 1 was worse.

xstrfmt() and strbuf_addf() both wrap strbuf_vaddf(), which uses
malloc() and vsnprintf().  My conclusion is that malloc() is fast
enough, but running vsnprintf() twice is slow (first time to determine
the allocation size, second time to actually build the string).  With a
memory pool we almost always only need to call it once per string, and
that's why I use it here.

The benefit of this patch (to me) is better code readability (no more
manual pre-sizing) without sacrificing performance.

The ability to clear (or UNLEAK) all these strings in one go is just a
bonus.

René
Jeff King March 7, 2024, 9:58 a.m. UTC | #5
On Tue, Feb 27, 2024 at 06:58:26PM +0100, René Scharfe wrote:

> >> It would not allow the shortcut of using the vast pool as a scratch
> >> space to format the string with a single vsnprintf call in most cases.
> >> Or am I missing something?
> >
> > So here it sounds like you do care about some of the performance
> > aspects. So no, it would not allow that. You'd be using the vast pool of
> > heap memory provided by malloc(), and trusting that a call to malloc()
> > is not that expensive in practice. I don't know how true that is, or how
> > much it matters for this case.
> 
> In the specific use case we can look at three cases:
> 
> 1. xstrfmt() [before 1c56fc2084]
> 2. size calculation + pre-size + strbuf_addf() [1c56fc2084]
> 3. mem_pool_strfmt() [this patch]
> 
> The performance of 2 and 3 is basically the same, 1 was worse.
> 
> xstrfmt() and strbuf_addf() both wrap strbuf_vaddf(), which uses
> malloc() and vsnprintf().  My conclusion is that malloc() is fast
> enough, but running vsnprintf() twice is slow (first time to determine
> the allocation size, second time to actually build the string).  With a
> memory pool we almost always only need to call it once per string, and
> that's why I use it here.
> 
> The benefit of this patch (to me) is better code readability (no more
> manual pre-sizing) without sacrificing performance.
> 
> The ability to clear (or UNLEAK) all these strings in one go is just a
> bonus.

Ah, OK. I admit I did not read the series all that carefully. Mostly I
am just concerned about a world where there are parallel-universe
versions of every allocating function that takes a mem-pool. If it's
limited to this obvious string formatting variant that may not be too
bad, though.

-Peff
diff mbox series

Patch

diff --git a/mem-pool.c b/mem-pool.c
index c7d6256020..2078c22b09 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -107,6 +107,45 @@  void *mem_pool_alloc(struct mem_pool *pool, size_t len)
 	return r;
 }

+static char *mem_pool_strvfmt(struct mem_pool *pool, const char *fmt,
+			      va_list ap)
+{
+	struct mp_block *block = pool->mp_block;
+	char *next_free = block ? block->next_free : NULL;
+	size_t available = block ? block->end - block->next_free : 0;
+	va_list cp;
+	int len, len2;
+	char *ret;
+
+	va_copy(cp, ap);
+	len = vsnprintf(next_free, available, fmt, cp);
+	va_end(cp);
+	if (len < 0)
+		BUG("your vsnprintf is broken (returned %d)", len);
+
+	ret = mem_pool_alloc(pool, len + 1);  /* 1 for NUL */
+
+	/* Shortcut; relies on mem_pool_alloc() not touching buffer contents. */
+	if (ret == next_free)
+		return ret;
+
+	len2 = vsnprintf(ret, len + 1, fmt, ap);
+	if (len2 != len)
+		BUG("your vsnprintf is broken (returns inconsistent lengths)");
+	return ret;
+}
+
+char *mem_pool_strfmt(struct mem_pool *pool, const char *fmt, ...)
+{
+	va_list ap;
+	char *ret;
+
+	va_start(ap, fmt);
+	ret = mem_pool_strvfmt(pool, fmt, ap);
+	va_end(ap);
+	return ret;
+}
+
 void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size)
 {
 	size_t len = st_mult(count, size);
diff --git a/mem-pool.h b/mem-pool.h
index fe7507f022..d1c66413ec 100644
--- a/mem-pool.h
+++ b/mem-pool.h
@@ -47,6 +47,11 @@  void *mem_pool_calloc(struct mem_pool *pool, size_t count, size_t size);
 char *mem_pool_strdup(struct mem_pool *pool, const char *str);
 char *mem_pool_strndup(struct mem_pool *pool, const char *str, size_t len);

+/*
+ * Allocate memory from the memory pool and format a string into it.
+ */
+char *mem_pool_strfmt(struct mem_pool *pool, const char *fmt, ...);
+
 /*
  * Move the memory associated with the 'src' pool to the 'dst' pool. The 'src'
  * pool will be empty and not contain any memory. It still needs to be free'd