Message ID | 20211018033841.3027515-1-rkovhaev@gmail.com (mailing list archive)
---|---
State | New
Series | slob: add size header to all allocations
On 10/18/21 05:38, Rustam Kovhaev wrote:
> Let's prepend all allocations of (PAGE_SIZE - align_offset) and less
> with the size header. This way kmem_cache_alloc() memory can be freed
> with kfree() and the other way around, as long as they are less than
> (PAGE_SIZE - align_offset).

This size limitation seems like an unnecessary gotcha. Couldn't we make
these large allocations in slob_alloc_node() (that use slob_new_pages()
directly) similar enough to large kmalloc() ones, so that kfree() can
recognize them and free properly? AFAICS it might mean just adding
__GFP_COMP to make sure there's a compound order stored, as these already
don't seem to set PageSlab.

> The main reason for this change is to simplify SLOB a little bit, make
> it a bit easier to debug whenever something goes wrong.

I would say the main reason is to simplify the slab API and guarantee that
both kmem_cache_alloc() and kmalloc() can be freed by kfree().

We should also update the comments at top of slob.c to reflect the change.
And Documentation/core-api/memory-allocation.rst (the last paragraph).

> meminfo right after the system boot, without the patch:
> Slab: 35500 kB
>
> the same, with the patch:
> Slab: 36396 kB

2.5% increase, hopefully acceptable.

Thanks!
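For illustration only (this exact hunk was not posted in the thread, and line
numbers are omitted): the suggestion above would roughly amount to requesting
a compound page on the large-object path of slob_alloc_node(), so that kfree()
can later recover the order from the page itself, as it already does for large
kmalloc() allocations:

--- a/mm/slob.c
+++ b/mm/slob.c
@@ ... @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
 	} else {
-		b = slob_new_pages(flags, get_order(c->size), node);
+		/* store the order in a compound page, like large kmalloc() */
+		b = slob_new_pages(flags | __GFP_COMP, get_order(c->size), node);

The freeing side (kfree() and __kmem_cache_free()) would then handle such
!PageSlab pages the same way kfree() already handles large kmalloc() pages.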
On Mon, Oct 18, 2021 at 11:22:46AM +0200, Vlastimil Babka wrote:
> On 10/18/21 05:38, Rustam Kovhaev wrote:
> > Let's prepend all allocations of (PAGE_SIZE - align_offset) and less
> > with the size header. This way kmem_cache_alloc() memory can be freed
> > with kfree() and the other way around, as long as they are less than
> > (PAGE_SIZE - align_offset).
>
> This size limitation seems like an unnecessary gotcha. Couldn't we make
> these large allocations in slob_alloc_node() (that use slob_new_pages()
> directly) similar enough to large kmalloc() ones, so that kfree() can
> recognize them and free properly? AFAICS it might mean just adding
> __GFP_COMP to make sure there's a compound order stored, as these already
> don't seem to set PageSlab.

Thanks for the pointers, I'll send a new version.

> > The main reason for this change is to simplify SLOB a little bit, make
> > it a bit easier to debug whenever something goes wrong.
>
> I would say the main reason is to simplify the slab API and guarantee that
> both kmem_cache_alloc() and kmalloc() can be freed by kfree().
>
> We should also update the comments at top of slob.c to reflect the change.
> And Documentation/core-api/memory-allocation.rst (the last paragraph).

OK, thank you!

> > meminfo right after the system boot, without the patch:
> > Slab: 35500 kB
> >
> > the same, with the patch:
> > Slab: 36396 kB
>
> 2.5% increase, hopefully acceptable.
>
> Thanks!
On Sun, Oct 17, 2021 at 08:38:41PM -0700, Rustam Kovhaev wrote:
> Let's prepend all allocations of (PAGE_SIZE - align_offset) and less
> with the size header. This way kmem_cache_alloc() memory can be freed
> with kfree() and the other way around, as long as they are less than
> (PAGE_SIZE - align_offset).

Hello Rustam, I measured its impact on memory usage on a tiny kernel
configuration, as SLOB is used in very small machines.

on x86 32 bit + tinyconfig:
Before:
Slab: 668 kB

After:
Slab: 688~692 kB

it adds 20~24kB.

>
> The main reason for this change is to simplify SLOB a little bit, make
> it a bit easier to debug whenever something goes wrong.
>

It seems acceptable, but I wonder whether it is worth increasing memory
usage to allow freeing kmem_cache_alloc-ed objects by kfree()?

Thanks,
Hyeonggon

> meminfo right after the system boot, without the patch:
> Slab: 35500 kB
>
> the same, with the patch:
> Slab: 36396 kB
>
On 10/20/21 13:46, Hyeonggon Yoo wrote:
> On Sun, Oct 17, 2021 at 08:38:41PM -0700, Rustam Kovhaev wrote:
>> Let's prepend all allocations of (PAGE_SIZE - align_offset) and less
>> with the size header. This way kmem_cache_alloc() memory can be freed
>> with kfree() and the other way around, as long as they are less than
>> (PAGE_SIZE - align_offset).
>
> Hello Rustam, I measured its impact on memory usage on a tiny kernel
> configuration, as SLOB is used in very small machines.
>
> on x86 32 bit + tinyconfig:
> Before:
> Slab: 668 kB
>
> After:
> Slab: 688~692 kB
>
> it adds 20~24kB.

Thanks for the measurement. That's a 3.5% increase.

>
>>
>> The main reason for this change is to simplify SLOB a little bit, make
>> it a bit easier to debug whenever something goes wrong.
>>
>
> It seems acceptable, but I wonder whether it is worth increasing memory
> usage to allow freeing kmem_cache_alloc-ed objects by kfree()?

Not for the reason above, but for providing a useful API guarantee
regardless of the selected slab allocator, IMHO yes.

> Thanks,
> Hyeonggon
>
>> meminfo right after the system boot, without the patch:
>> Slab: 35500 kB
>>
>> the same, with the patch:
>> Slab: 36396 kB
>>
>
On Thu, Oct 21, 2021 at 07:36:26PM +0200, Vlastimil Babka wrote:
> On 10/20/21 13:46, Hyeonggon Yoo wrote:
> > On Sun, Oct 17, 2021 at 08:38:41PM -0700, Rustam Kovhaev wrote:
> >> Let's prepend all allocations of (PAGE_SIZE - align_offset) and less
> >> with the size header. This way kmem_cache_alloc() memory can be freed
> >> with kfree() and the other way around, as long as they are less than
> >> (PAGE_SIZE - align_offset).
> >
> > Hello Rustam, I measured its impact on memory usage on a tiny kernel
> > configuration, as SLOB is used in very small machines.
> >
> > on x86 32 bit + tinyconfig:
> > Before:
> > Slab: 668 kB
> >
> > After:
> > Slab: 688~692 kB
> >
> > it adds 20~24kB.
>
> Thanks for the measurement. That's a 3.5% increase.
>

You're welcome.

> >
> >>
> >> The main reason for this change is to simplify SLOB a little bit, make
> >> it a bit easier to debug whenever something goes wrong.
> >>
> >
> > It seems acceptable, but I wonder whether it is worth increasing memory
> > usage to allow freeing kmem_cache_alloc-ed objects by kfree()?
>
> Not for the reason above, but for providing a useful API guarantee
> regardless of the selected slab allocator, IMHO yes.
>

Mm.. that means some callers free kmem_cache_alloc-ed objects using
kfree(), and SLAB/SLUB already support that, but SLOB doesn't.

In what situations is freeing with kfree() needed?
Wouldn't this make the code confusing?

> > Thanks,
> > Hyeonggon
> >
> >> meminfo right after the system boot, without the patch:
> >> Slab: 35500 kB
> >>
> >> the same, with the patch:
> >> Slab: 36396 kB
> >>
> >
>
On 10/24/21 12:43, Hyeonggon Yoo wrote:
>> >>
>> >> The main reason for this change is to simplify SLOB a little bit, make
>> >> it a bit easier to debug whenever something goes wrong.
>> >>
>> >
>> > It seems acceptable, but I wonder whether it is worth increasing memory
>> > usage to allow freeing kmem_cache_alloc-ed objects by kfree()?
>>
>> Not for the reason above, but for providing a useful API guarantee
>> regardless of the selected slab allocator, IMHO yes.
>>
>
> Mm.. that means some callers free kmem_cache_alloc-ed objects using
> kfree(), and SLAB/SLUB already support that, but SLOB doesn't.

Exactly. Finding that out started this whole thread.

> In what situations is freeing with kfree() needed?
> Wouldn't this make the code confusing?

XFS seems to have good reasons - at some common freeing places objects
can appear from multiple caches, and it would be expensive to track
their cache just to free them. See
https://lore.kernel.org/all/20210930044202.GP2361455@dread.disaster.area/

IMHO it really makes sense to support this from an API point of view.
kmem_cache_alloc() is basically a more specific version of the generic
kmalloc(). It makes sense if the generic kind of free, that is kfree(),
works on those objects too.

>> > Thanks,
>> > Hyeonggon
>> >
>> >> meminfo right after the system boot, without the patch:
>> >> Slab: 35500 kB
>> >>
>> >> the same, with the patch:
>> >> Slab: 36396 kB
>> >>
>> >
>>
>
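As a concrete (entirely made-up) example of the pattern being discussed -
allocating from a dedicated cache but freeing with the generic kfree() - which
SLAB and SLUB already accept and which this series would make work on SLOB as
well; the "foo" names below are hypothetical, not taken from XFS or the thread:

#include <linux/slab.h>

struct foo {
	int id;
};

static struct kmem_cache *foo_cachep;

static int foo_demo(void)
{
	struct foo *p;

	foo_cachep = kmem_cache_create("foo", sizeof(struct foo), 0, 0, NULL);
	if (!foo_cachep)
		return -ENOMEM;

	p = kmem_cache_alloc(foo_cachep, GFP_KERNEL);
	if (!p) {
		kmem_cache_destroy(foo_cachep);
		return -ENOMEM;
	}

	/*
	 * The guarantee under discussion: generic kfree() instead of
	 * kmem_cache_free(foo_cachep, p).
	 */
	kfree(p);

	kmem_cache_destroy(foo_cachep);
	return 0;
}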
diff --git a/mm/slob.c b/mm/slob.c
index 74d3f6e60666..3d8fbb33f5a3 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -373,25 +373,28 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node,
 	}
 	if (unlikely(gfp & __GFP_ZERO))
 		memset(b, 0, size);
+	/* Write size in the header */
+	*(unsigned int *)b = size - align_offset;
+	b = (void *)b + align_offset;
 	return b;
 }
 
 /*
  * slob_free: entry point into the slob allocator.
  */
-static void slob_free(void *block, int size)
+static void slob_free(void *block)
 {
 	struct page *sp;
-	slob_t *prev, *next, *b = (slob_t *)block;
+	int align_offset = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+	void *hdr = (void *)block - align_offset;
+	unsigned int size = *(unsigned int *)hdr + align_offset;
+	slob_t *prev, *next, *b = (slob_t *)hdr;
 	slobidx_t units;
 	unsigned long flags;
 	struct list_head *slob_list;
 
-	if (unlikely(ZERO_OR_NULL_PTR(block)))
-		return;
-	BUG_ON(!size);
-
-	sp = virt_to_page(block);
+	BUG_ON(!size || size >= PAGE_SIZE);
+	sp = virt_to_page(hdr);
 	units = SLOB_UNITS(size);
 
 	spin_lock_irqsave(&slob_lock, flags);
@@ -476,7 +479,6 @@ void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct page *page)
 static __always_inline void *
 __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
 {
-	unsigned int *m;
 	int minalign = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
 	void *ret;
 
@@ -497,12 +499,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
 		if (!size)
 			return ZERO_SIZE_PTR;
 
-		m = slob_alloc(size + minalign, gfp, align, node, minalign);
-
-		if (!m)
-			return NULL;
-		*m = size;
-		ret = (void *)m + minalign;
+		ret = slob_alloc(size + minalign, gfp, align, node, minalign);
 
 		trace_kmalloc_node(caller, ret, size, size + minalign, gfp, node);
 
@@ -554,9 +551,7 @@ void kfree(const void *block)
 
 	sp = virt_to_page(block);
 	if (PageSlab(sp)) {
-		int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
-		unsigned int *m = (unsigned int *)(block - align);
-		slob_free(m, *m + align);
+		slob_free((void *)block);
 	} else {
 		unsigned int order = compound_order(sp);
 		mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
@@ -567,7 +562,6 @@ void kfree(const void *block)
 }
 EXPORT_SYMBOL(kfree);
 
-/* can't use ksize for kmem_cache_alloc memory, only kmalloc */
 size_t __ksize(const void *block)
 {
 	struct page *sp;
@@ -600,16 +594,17 @@ int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
 
 static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
 {
+	int minalign = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
 	void *b;
 
 	flags &= gfp_allowed_mask;
 
 	might_alloc(flags);
 
-	if (c->size < PAGE_SIZE) {
-		b = slob_alloc(c->size, flags, c->align, node, 0);
+	if (c->size < PAGE_SIZE - minalign) {
+		b = slob_alloc(c->size + minalign, flags, c->align, node, minalign);
 		trace_kmem_cache_alloc_node(_RET_IP_, b, c->object_size,
-					    SLOB_UNITS(c->size) * SLOB_UNIT,
+					    SLOB_UNITS(c->size + minalign) * SLOB_UNIT,
 					    flags, node);
 	} else {
 		b = slob_new_pages(flags, get_order(c->size), node);
@@ -649,8 +644,14 @@ EXPORT_SYMBOL(kmem_cache_alloc_node);
 
 static void __kmem_cache_free(void *b, int size)
 {
-	if (size < PAGE_SIZE)
-		slob_free(b, size);
+	struct page *sp;
+
+	if (unlikely(ZERO_OR_NULL_PTR(b)))
+		return;
+
+	sp = virt_to_page(b);
+	if (PageSlab(sp))
+		slob_free(b);
 	else
 		slob_free_pages(b, get_order(size));
 }
Let's prepend all allocations of (PAGE_SIZE - align_offset) and less
with the size header. This way kmem_cache_alloc() memory can be freed
with kfree() and the other way around, as long as they are less than
(PAGE_SIZE - align_offset).

The main reason for this change is to simplify SLOB a little bit, make
it a bit easier to debug whenever something goes wrong.

meminfo right after the system boot, without the patch:
Slab: 35500 kB

the same, with the patch:
Slab: 36396 kB

stats for compiling glibc without the patch:
      1,493,972.89 msec task-clock                #    3.557 CPUs utilized
           317,158      context-switches          #  212.292 /sec
             8,567      cpu-migrations            #    5.734 /sec
        33,788,323      page-faults               #   22.616 K/sec
 5,267,687,400,091      cycles                    #    3.526 GHz
 4,388,201,248,601      instructions              #    0.83  insn per cycle
   885,424,236,657      branches                  #  592.664 M/sec
    14,117,492,893      branch-misses             #    1.59% of all branches

     420.051843478 seconds time elapsed

     472.784856000 seconds user
    1024.645256000 seconds sys

the same with the patch:
      1,803,990.92 msec task-clock                #    3.597 CPUs utilized
           330,110      context-switches          #  182.989 /sec
             9,170      cpu-migrations            #    5.083 /sec
        33,789,627      page-faults               #   18.730 K/sec
 6,499,753,661,134      cycles                    #    3.603 GHz
 4,564,216,028,344      instructions              #    0.70  insn per cycle
   917,120,742,440      branches                  #  508.384 M/sec
    15,068,415,552      branch-misses             #    1.64% of all branches

     501.519434175 seconds time elapsed

     495.587614000 seconds user
    1312.652833000 seconds sys

Link: https://lore.kernel.org/lkml/20210929212347.1139666-1-rkovhaev@gmail.com
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
---
 mm/slob.c | 47 ++++++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 23 deletions(-)
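To picture the change the patch description refers to (a sketch only, not part
of the patch itself): for each allocation served by slob_alloc(), the size is
stored in a header occupying the first align_offset bytes of the block, and the
caller gets a pointer just past that header.

/*
 * p = pointer returned to the caller
 *
 *   p - align_offset                      p
 *   |                                     |
 *   v                                     v
 *   +-------------------------------------+---------------------------+
 *   | unsigned int size                   | object ("size" bytes)     |
 *   | (header padded to align_offset)     |                           |
 *   +-------------------------------------+---------------------------+
 *
 * align_offset = max(ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN); both
 * kfree() and kmem_cache_free() step back align_offset bytes from p to
 * read the stored size before handing the block to slob_free().
 */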