diff mbox series

slob: add size header to all allocations

Message ID 20211018033841.3027515-1-rkovhaev@gmail.com (mailing list archive)
State New
Headers show
Series slob: add size header to all allocations | expand

Commit Message

Rustam Kovhaev Oct. 18, 2021, 3:38 a.m. UTC
Let's prepend all  allocations of (PAGE_SIZE - align_offset) and less
with the size header. This way kmem_cache_alloc() memory can be freed
with kfree() and the other way around, as long as they are less than
(PAGE_SIZE - align_offset).

The main reason for this change is to simplify SLOB a little bit, make
it a bit easier to debug whenever something goes wrong.

meminfo right after the system boot, without the patch:
Slab:              35500 kB

the same, with the patch:
Slab:              36396 kB

stats for compiling glibc without the patch:
      1,493,972.89 msec task-clock                #    3.557 CPUs utilized
           317,158      context-switches          #  212.292 /sec
             8,567      cpu-migrations            #    5.734 /sec
        33,788,323      page-faults               #   22.616 K/sec
 5,267,687,400,091      cycles                    #    3.526 GHz
 4,388,201,248,601      instructions              #    0.83  insn per cycle
   885,424,236,657      branches                  #  592.664 M/sec
    14,117,492,893      branch-misses             #    1.59% of all branches

     420.051843478 seconds time elapsed

     472.784856000 seconds user
    1024.645256000 seconds sys

the same with the patch:
      1,803,990.92 msec task-clock                #    3.597 CPUs utilized
           330,110      context-switches          #  182.989 /sec
             9,170      cpu-migrations            #    5.083 /sec
        33,789,627      page-faults               #   18.730 K/sec
 6,499,753,661,134      cycles                    #    3.603 GHz
 4,564,216,028,344      instructions              #    0.70  insn per cycle
   917,120,742,440      branches                  #  508.384 M/sec
    15,068,415,552      branch-misses             #    1.64% of all branches

     501.519434175 seconds time elapsed

     495.587614000 seconds user
    1312.652833000 seconds sys

Link: https://lore.kernel.org/lkml/20210929212347.1139666-1-rkovhaev@gmail.com
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
---
 mm/slob.c | 47 ++++++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 23 deletions(-)

Comments

Vlastimil Babka Oct. 18, 2021, 9:22 a.m. UTC | #1
On 10/18/21 05:38, Rustam Kovhaev wrote:
> Let's prepend all  allocations of (PAGE_SIZE - align_offset) and less
> with the size header. This way kmem_cache_alloc() memory can be freed
> with kfree() and the other way around, as long as they are less than
> (PAGE_SIZE - align_offset).

This size limitation seems like an unnecessary gotcha. Couldn't we make
these large allocations in slob_alloc_node() (that use slob_new_pages()
directly) similar enough to large kmalloc() ones, so that kfree() can
recognize them and free properly? AFAICS it might mean just adding
__GFP_COMP to make sure there's a compound order stored, as these already
don't seem to set PageSlab.

> The main reason for this change is to simplify SLOB a little bit, make
> it a bit easier to debug whenever something goes wrong.

I would say the main reason is to simplify the slab API and guarantee that
both kmem_cache_alloc() and kmalloc() can be freed by kfree().

We should also update the comments at top of slob.c to reflect the change.
And Documentation/core-api/memory-allocation.rst (the last paragraph).

> meminfo right after the system boot, without the patch:
> Slab:              35500 kB
> 
> the same, with the patch:
> Slab:              36396 kB

2.5% increase, hopefully acceptable.

Thanks!
Rustam Kovhaev Oct. 19, 2021, 1:22 a.m. UTC | #2
On Mon, Oct 18, 2021 at 11:22:46AM +0200, Vlastimil Babka wrote:
> On 10/18/21 05:38, Rustam Kovhaev wrote:
> > Let's prepend all  allocations of (PAGE_SIZE - align_offset) and less
> > with the size header. This way kmem_cache_alloc() memory can be freed
> > with kfree() and the other way around, as long as they are less than
> > (PAGE_SIZE - align_offset).
> 
> This size limitation seems like an unnecessary gotcha. Couldn't we make
> these large allocations in slob_alloc_node() (that use slob_new_pages()
> directly) similar enough to large kmalloc() ones, so that kfree() can
> recognize them and free properly? AFAICS it might mean just adding
> __GFP_COMP to make sure there's a compound order stored, as these already
> don't seem to set PageSlab.

Thanks for the pointers, I'll send a new version.

> > The main reason for this change is to simplify SLOB a little bit, make
> > it a bit easier to debug whenever something goes wrong.
> 
> I would say the main reason is to simplify the slab API and guarantee that
> both kmem_cache_alloc() and kmalloc() can be freed by kfree().
> 
> We should also update the comments at top of slob.c to reflect the change.
> And Documentation/core-api/memory-allocation.rst (the last paragraph).

OK, thank you!

> > meminfo right after the system boot, without the patch:
> > Slab:              35500 kB
> > 
> > the same, with the patch:
> > Slab:              36396 kB
> 
> 2.5% increase, hopefully acceptable.
> 
> Thanks!
Hyeonggon Yoo Oct. 20, 2021, 11:46 a.m. UTC | #3
On Sun, Oct 17, 2021 at 08:38:41PM -0700, Rustam Kovhaev wrote:
> Let's prepend all  allocations of (PAGE_SIZE - align_offset) and less
> with the size header. This way kmem_cache_alloc() memory can be freed
> with kfree() and the other way around, as long as they are less than
> (PAGE_SIZE - align_offset).

Hello Rustam, I measured its impact on memory usage on
tiny kernel configuration as SLOB is used in very small machine.

on x86 32 bit + tinyconfig:
    Before:
    Slab:                668 kB

    After:
    Slab:                688~692 kB

it adds 20~24kB.

> 
> The main reason for this change is to simplify SLOB a little bit, make
> it a bit easier to debug whenever something goes wrong.
>

It seems acceptable But I wonder it is worth to increase memory usage
to allow freeing kmem_cache_alloc-ed objects by kfree()?

Thanks,
Hyeonggon

> meminfo right after the system boot, without the patch:
> Slab:              35500 kB
> 
> the same, with the patch:
> Slab:              36396 kB
>
Vlastimil Babka Oct. 21, 2021, 5:36 p.m. UTC | #4
On 10/20/21 13:46, Hyeonggon Yoo wrote:
> On Sun, Oct 17, 2021 at 08:38:41PM -0700, Rustam Kovhaev wrote:
>> Let's prepend all  allocations of (PAGE_SIZE - align_offset) and less
>> with the size header. This way kmem_cache_alloc() memory can be freed
>> with kfree() and the other way around, as long as they are less than
>> (PAGE_SIZE - align_offset).
> 
> Hello Rustam, I measured its impact on memory usage on
> tiny kernel configuration as SLOB is used in very small machine.
> 
> on x86 32 bit + tinyconfig:
>     Before:
>     Slab:                668 kB
> 
>     After:
>     Slab:                688~692 kB
> 
> it adds 20~24kB.

Thanks for the measurement. That's 3.5% increase.

> 
>> 
>> The main reason for this change is to simplify SLOB a little bit, make
>> it a bit easier to debug whenever something goes wrong.
>>
> 
> It seems acceptable But I wonder it is worth to increase memory usage
> to allow freeing kmem_cache_alloc-ed objects by kfree()?

Not for the reason above, but for providing a useful API guarantee
regardless of selected slab allocator IMHO yes.

> Thanks,
> Hyeonggon
> 
>> meminfo right after the system boot, without the patch:
>> Slab:              35500 kB
>> 
>> the same, with the patch:
>> Slab:              36396 kB
>> 
>
Hyeonggon Yoo Oct. 24, 2021, 10:43 a.m. UTC | #5
On Thu, Oct 21, 2021 at 07:36:26PM +0200, Vlastimil Babka wrote:
> On 10/20/21 13:46, Hyeonggon Yoo wrote:
> > On Sun, Oct 17, 2021 at 08:38:41PM -0700, Rustam Kovhaev wrote:
> >> Let's prepend all  allocations of (PAGE_SIZE - align_offset) and less
> >> with the size header. This way kmem_cache_alloc() memory can be freed
> >> with kfree() and the other way around, as long as they are less than
> >> (PAGE_SIZE - align_offset).
> > 
> > Hello Rustam, I measured its impact on memory usage on
> > tiny kernel configuration as SLOB is used in very small machine.
> > 
> > on x86 32 bit + tinyconfig:
> >     Before:
> >     Slab:                668 kB
> > 
> >     After:
> >     Slab:                688~692 kB
> > 
> > it adds 20~24kB.
> 
> Thanks for the measurement. That's 3.5% increase.
> 

You're welcome.

> > 
> >> 
> >> The main reason for this change is to simplify SLOB a little bit, make
> >> it a bit easier to debug whenever something goes wrong.
> >>
> > 
> > It seems acceptable But I wonder it is worth to increase memory usage
> > to allow freeing kmem_cache_alloc-ed objects by kfree()?
> 
> Not for the reason above, but for providing a useful API guarantee
> regardless of selected slab allocator IMHO yes.
> 

Mm.. that means some callers free kmem_cache_alloc-ed object using
kfree, and SLAB/SLUB already support that, and SLOB doesn't.

In what situations is freeing using kfree needed?
Wouldn't this make code confusing?

> > Thanks,
> > Hyeonggon
> > 
> >> meminfo right after the system boot, without the patch:
> >> Slab:              35500 kB
> >> 
> >> the same, with the patch:
> >> Slab:              36396 kB
> >> 
> > 
>
Vlastimil Babka Oct. 25, 2021, 8:19 a.m. UTC | #6
On 10/24/21 12:43, Hyeonggon Yoo wrote:
>> >> 
>> >> The main reason for this change is to simplify SLOB a little bit, make
>> >> it a bit easier to debug whenever something goes wrong.
>> >>
>> > 
>> > It seems acceptable But I wonder it is worth to increase memory usage
>> > to allow freeing kmem_cache_alloc-ed objects by kfree()?
>> 
>> Not for the reason above, but for providing a useful API guarantee
>> regardless of selected slab allocator IMHO yes.
>> 
> 
> Mm.. that means some callers free kmem_cache_alloc-ed object using
> kfree, and SLAB/SLUB already support that, and SLOB doesn't.

Exactly. Finding that out started this whole thread.

> In what situations is freeing using kfree needed?
> Wouldn't this make code confusing?

XFS seems to have good reasons - at some common freeing place objects can
appears from multiple caches, and it would be expensive to track their cache
just to free them. See
https://lore.kernel.org/all/20210930044202.GP2361455@dread.disaster.area/

IMHO it really makes sense to support this from API point of view.
kmem_cache_alloc() is basically a more specific version of the generic
kmalloc(). It makes sense if the generic kind of free, that is kfree() works
on those objects too.

>> > Thanks,
>> > Hyeonggon
>> > 
>> >> meminfo right after the system boot, without the patch:
>> >> Slab:              35500 kB
>> >> 
>> >> the same, with the patch:
>> >> Slab:              36396 kB
>> >> 
>> > 
>> 
>
diff mbox series

Patch

diff --git a/mm/slob.c b/mm/slob.c
index 74d3f6e60666..3d8fbb33f5a3 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -373,25 +373,28 @@  static void *slob_alloc(size_t size, gfp_t gfp, int align, int node,
 	}
 	if (unlikely(gfp & __GFP_ZERO))
 		memset(b, 0, size);
+	/* Write size in the header */
+	*(unsigned int *)b = size - align_offset;
+	b = (void *)b + align_offset;
 	return b;
 }
 
 /*
  * slob_free: entry point into the slob allocator.
  */
-static void slob_free(void *block, int size)
+static void slob_free(void *block)
 {
 	struct page *sp;
-	slob_t *prev, *next, *b = (slob_t *)block;
+	int align_offset = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+	void *hdr = (void *)block - align_offset;
+	unsigned int size = *(unsigned int *)hdr + align_offset;
+	slob_t *prev, *next, *b = (slob_t *)hdr;
 	slobidx_t units;
 	unsigned long flags;
 	struct list_head *slob_list;
 
-	if (unlikely(ZERO_OR_NULL_PTR(block)))
-		return;
-	BUG_ON(!size);
-
-	sp = virt_to_page(block);
+	BUG_ON(!size || size >= PAGE_SIZE);
+	sp = virt_to_page(hdr);
 	units = SLOB_UNITS(size);
 
 	spin_lock_irqsave(&slob_lock, flags);
@@ -476,7 +479,6 @@  void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 static __always_inline void *
 __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
 {
-	unsigned int *m;
 	int minalign = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
 	void *ret;
 
@@ -497,12 +499,7 @@  __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
 		if (!size)
 			return ZERO_SIZE_PTR;
 
-		m = slob_alloc(size + minalign, gfp, align, node, minalign);
-
-		if (!m)
-			return NULL;
-		*m = size;
-		ret = (void *)m + minalign;
+		ret = slob_alloc(size + minalign, gfp, align, node, minalign);
 
 		trace_kmalloc_node(caller, ret,
 				   size, size + minalign, gfp, node);
@@ -554,9 +551,7 @@  void kfree(const void *block)
 
 	sp = virt_to_page(block);
 	if (PageSlab(sp)) {
-		int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
-		unsigned int *m = (unsigned int *)(block - align);
-		slob_free(m, *m + align);
+		slob_free((void *)block);
 	} else {
 		unsigned int order = compound_order(sp);
 		mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
@@ -567,7 +562,6 @@  void kfree(const void *block)
 }
 EXPORT_SYMBOL(kfree);
 
-/* can't use ksize for kmem_cache_alloc memory, only kmalloc */
 size_t __ksize(const void *block)
 {
 	struct page *sp;
@@ -600,16 +594,17 @@  int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
 
 static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
 {
+	int minalign = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
 	void *b;
 
 	flags &= gfp_allowed_mask;
 
 	might_alloc(flags);
 
-	if (c->size < PAGE_SIZE) {
-		b = slob_alloc(c->size, flags, c->align, node, 0);
+	if (c->size < PAGE_SIZE - minalign) {
+		b = slob_alloc(c->size + minalign, flags, c->align, node, minalign);
 		trace_kmem_cache_alloc_node(_RET_IP_, b, c->object_size,
-					    SLOB_UNITS(c->size) * SLOB_UNIT,
+					    SLOB_UNITS(c->size + minalign) * SLOB_UNIT,
 					    flags, node);
 	} else {
 		b = slob_new_pages(flags, get_order(c->size), node);
@@ -649,8 +644,14 @@  EXPORT_SYMBOL(kmem_cache_alloc_node);
 
 static void __kmem_cache_free(void *b, int size)
 {
-	if (size < PAGE_SIZE)
-		slob_free(b, size);
+	struct page *sp;
+
+	if (unlikely(ZERO_OR_NULL_PTR(b)))
+		return;
+
+	sp = virt_to_page(b);
+	if (PageSlab(sp))
+		slob_free(b);
 	else
 		slob_free_pages(b, get_order(size));
 }