[1/9] io_uring: Fold allocation into alloc_cache helper

Message ID: 20241119012224.1698238-2-krisman@suse.de
State: New
Series: Clean up alloc_cache allocations

Commit Message

Gabriel Krisman Bertazi Nov. 19, 2024, 1:22 a.m. UTC
The allocation paths that use alloc_cache duplicate the same code
pattern, sometimes in a quite convoluted way.  Fold the allocation into
the cache code itself, making it just an allocator function, and keeping
the cache policy invisible to callers.  Another justification for doing
this, beyond code simplicity, is that it makes it trivial to test the
impact of disabling the cache and using slab directly, which I've used
for slab improvement experiments.

One relevant detail is that this allocates zeroed memory.  Rationale is
that it simplifies the handling of the embedded free_iov in some of the
cached objects, and the performance impact shouldn't be meaningful,
since we are supposed to be hitting the cache most of the time and the
allocation is already the slow path.

Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
 io_uring/alloc_cache.h | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Jens Axboe Nov. 19, 2024, 2:02 a.m. UTC | #1
On 11/18/24 6:22 PM, Gabriel Krisman Bertazi wrote:
> diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h
> index b7a38a2069cf..6b34e491a30a 100644
> --- a/io_uring/alloc_cache.h
> +++ b/io_uring/alloc_cache.h
> @@ -30,6 +30,13 @@ static inline void *io_alloc_cache_get(struct io_alloc_cache *cache)
>  	return NULL;
>  }
>  
> +static inline void *io_alloc_cache_alloc(struct io_alloc_cache *cache, gfp_t gfp)
> +{
> +	if (!cache->nr_cached)
> +		return kzalloc(cache->elem_size, gfp);
> +	return io_alloc_cache_get(cache);
> +}

I don't think you want to use kzalloc here. The caller will need to
clear what it needs for the cached path anyway, so has no other option
than to clear/set things twice for that case.

Gabriel Krisman Bertazi Nov. 19, 2024, 3:30 p.m. UTC | #2
Jens Axboe <axboe@kernel.dk> writes:

> On 11/18/24 6:22 PM, Gabriel Krisman Bertazi wrote:
>> diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h
>> index b7a38a2069cf..6b34e491a30a 100644
>> --- a/io_uring/alloc_cache.h
>> +++ b/io_uring/alloc_cache.h
>> @@ -30,6 +30,13 @@ static inline void *io_alloc_cache_get(struct io_alloc_cache *cache)
>>  	return NULL;
>>  }
>>  
>> +static inline void *io_alloc_cache_alloc(struct io_alloc_cache *cache, gfp_t gfp)
>> +{
>> +	if (!cache->nr_cached)
>> +		return kzalloc(cache->elem_size, gfp);
>> +	return io_alloc_cache_get(cache);
>> +}
>
> I don't think you want to use kzalloc here. The caller will need to
> clear what it needs for the cached path anyway, so has no other option
> than to clear/set things twice for that case.

Hi Jens,

The reason I do kzalloc here is to be able to trust the value of
rw->free_iov (io_rw_alloc_async) and hdr->free_iov (io_msg_alloc_async)
regardless of where the allocated memory came from, cache or slab.  In
the callers (patch 6 and 7), we do:

+	hdr = io_uring_alloc_async_data(&ctx->netmsg_cache, req);
+	if (!hdr)
+		return NULL;
+
+	/* If the async data was cached, we might have an iov cached inside. */
+	if (hdr->free_iov) {

An alternative would be to return a flag indicating whether the
allocated memory came from the cache or not, but it didn't seem elegant.
Do you see a better way?

I also considered that zeroing memory here shouldn't harm performance,
because it'll hit the cache most of the time.
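
The argument above can be modeled in userspace. This is only a sketch under stated assumptions: `one_slot_cache`, `cache_alloc_zeroed()`, `cache_put()` and `msg_alloc_async()` are hypothetical names, `calloc()` stands in for `kzalloc()`, and the cache holds a single entry to keep the example short. It shows why zeroing on a cache miss lets the caller test `free_iov` without knowing where the memory came from.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an async header that may carry a cached iov. */
struct msg_hdr {
	void *free_iov;
	int free_iov_nr;
};

/* Hypothetical one-entry cache; the real io_alloc_cache holds many entries. */
struct one_slot_cache {
	void *slot;
	size_t elem_size;
};

/* Cache hit: hand back the recycled object untouched. Miss: zeroed memory. */
static void *cache_alloc_zeroed(struct one_slot_cache *cache)
{
	void *obj = cache->slot;

	if (obj) {
		cache->slot = NULL;
		return obj;
	}
	return calloc(1, cache->elem_size);	/* models kzalloc() */
}

/* Returns 1 if the object was stored, 0 if the cache was already full. */
static int cache_put(struct one_slot_cache *cache, void *obj)
{
	assert(obj);
	if (cache->slot)
		return 0;
	cache->slot = obj;
	return 1;
}

/*
 * Returns 1 if the header came with a reusable iov, 0 if it did not,
 * and -1 on allocation failure. The caller never asks where hdr came from.
 */
static int msg_alloc_async(struct one_slot_cache *cache, struct msg_hdr **out)
{
	struct msg_hdr *hdr = cache_alloc_zeroed(cache);

	if (!hdr)
		return -1;
	*out = hdr;
	return hdr->free_iov != NULL;
}
```

With a plain `malloc()` on the miss path, the `free_iov` test would read uninitialized memory, which is the case the zeroed allocation is meant to rule out.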

Jens Axboe Nov. 19, 2024, 4:18 p.m. UTC | #3
On 11/19/24 8:30 AM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> On 11/18/24 6:22 PM, Gabriel Krisman Bertazi wrote:
>>> diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h
>>> index b7a38a2069cf..6b34e491a30a 100644
>>> --- a/io_uring/alloc_cache.h
>>> +++ b/io_uring/alloc_cache.h
>>> @@ -30,6 +30,13 @@ static inline void *io_alloc_cache_get(struct io_alloc_cache *cache)
>>>  	return NULL;
>>>  }
>>>  
>>> +static inline void *io_alloc_cache_alloc(struct io_alloc_cache *cache, gfp_t gfp)
>>> +{
>>> +	if (!cache->nr_cached)
>>> +		return kzalloc(cache->elem_size, gfp);
>>> +	return io_alloc_cache_get(cache);
>>> +}
>>
>> I don't think you want to use kzalloc here. The caller will need to
>> clear what it needs for the cached path anyway, so has no other option
>> than to clear/set things twice for that case.
> 
> Hi Jens,
> 
> The reason I do kzalloc here is to be able to trust the value of
> rw->free_iov (io_rw_alloc_async) and hdr->free_iov (io_msg_alloc_async)
> regardless of where the allocated memory came from, cache or slab.  In
> the callers (patch 6 and 7), we do:

I see, I guess that makes sense as some things are persistent in cache
and need clearing upfront if freshly allocated.

> +	hdr = io_uring_alloc_async_data(&ctx->netmsg_cache, req);
> +	if (!hdr)
> +		return NULL;
> +
> +	/* If the async data was cached, we might have an iov cached inside. */
> +	if (hdr->free_iov) {
> 
> An alternative would be to return a flag indicating whether the
> allocated memory came from the cache or not, but it didn't seem elegant.
> Do you see a better way?
> 
> I also considered that zeroing memory here shouldn't harm performance,
> because it'll hit the cache most of the time.

It should hit cache most of the time, but if we exceed the cache size,
then you will see allocations happen and churn. I don't like the idea of
the flag, then we still need to complicate the caller. We can do
something like slab where you have a hook for freshly allocated data
only? That can either be a property of the cache, or passed in via
io_alloc_cache_alloc()?

BTW, I'd probably change the name of that to io_cache_get() or
io_cache_alloc() or something like that, I don't think we need two
allocs in there.
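
One possible shape for the hook suggested above, sketched in userspace C. Everything here is an assumption for illustration, not the code that was eventually merged: `alloc_cache`, `cache_alloc()`, the `init_once` parameter and `hdr_init_once()` are hypothetical names, `malloc()` stands in for `kmalloc()`, and the `gfp` argument is dropped because it has no userspace equivalent. The hook runs only for freshly allocated objects, so recycled objects keep their persistent fields and nothing is cleared twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for the cache; field names loosely follow the patch. */
struct alloc_cache {
	void **entries;
	unsigned int nr_cached;
	unsigned int max_cached;
	size_t elem_size;
};

/* Example object: free_iov must persist across recycling, scratch need not. */
struct async_hdr {
	void *free_iov;
	int free_iov_nr;
	char scratch[64];
};

/* Returns 0 on success, -1 if the entry array could not be allocated. */
static int cache_init(struct alloc_cache *cache, unsigned int max_nr, size_t size)
{
	cache->entries = calloc(max_nr, sizeof(void *));
	if (!cache->entries)
		return -1;
	cache->nr_cached = 0;
	cache->max_cached = max_nr;
	cache->elem_size = size;
	return 0;
}

/* Returns 1 if the object was cached, 0 if the cache is full. */
static int cache_put(struct alloc_cache *cache, void *obj)
{
	assert(obj);
	if (cache->nr_cached >= cache->max_cached)
		return 0;
	cache->entries[cache->nr_cached++] = obj;
	return 1;
}

/* init_once is called only on a cache miss, never for recycled objects. */
static void *cache_alloc(struct alloc_cache *cache, void (*init_once)(void *obj))
{
	void *obj;

	if (cache->nr_cached)
		return cache->entries[--cache->nr_cached];

	obj = malloc(cache->elem_size);	/* models kmalloc(): not zeroed */
	if (obj && init_once)
		init_once(obj);
	return obj;
}

/* Clears only the fields the caller inspects before initializing them. */
static void hdr_init_once(void *obj)
{
	struct async_hdr *hdr = obj;

	hdr->free_iov = NULL;
	hdr->free_iov_nr = 0;
}
```

Passing the hook per call keeps the cache structure unchanged; storing it in the cache at init time would instead keep call sites shorter. The thread leaves that choice open.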

Patch

diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h
index b7a38a2069cf..6b34e491a30a 100644
--- a/io_uring/alloc_cache.h
+++ b/io_uring/alloc_cache.h
@@ -30,6 +30,13 @@  static inline void *io_alloc_cache_get(struct io_alloc_cache *cache)
 	return NULL;
 }
 
+static inline void *io_alloc_cache_alloc(struct io_alloc_cache *cache, gfp_t gfp)
+{
+	if (!cache->nr_cached)
+		return kzalloc(cache->elem_size, gfp);
+	return io_alloc_cache_get(cache);
+}
+
 /* returns false if the cache was initialized properly */
 static inline bool io_alloc_cache_init(struct io_alloc_cache *cache,
 				       unsigned max_nr, size_t size)