Message ID | 20211201181510.18784-32-vbabka@suse.cz (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Separate struct slab from struct page | expand |
On 12/10/21 17:37, Hyeonggon Yoo wrote: > On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote: >> With a struct slab definition separate from struct page, we can go further and >> define only fields that the chosen sl*b implementation uses. This means >> everything between __page_flags and __page_refcount placeholders now depends on >> the chosen CONFIG_SL*B. > > When I read this patch series first, I thought struct slab is allocated > separately from struct page. > > But after reading it again, It uses same allocated space of struct page. Yes. Allocating it elsewhere is something that can be discussed later. It's not a simple clear win - more memory used, more overhead, complicated code... > So, the code should care about fields that page allocator cares when > freeing page. (->mapping, ->refcount, ->flags, ...) > > And, we can change offset of fields between page->flags and page->refcount, > If we care about the value of page->mapping before freeing it. > > Did I get it right? Yeah. Also whatever aliases with compound_head must not have bit zero set as that means a tail page. >> Some fields exist in all implementations (slab_list) >> but can be part of a union in some, so it's simpler to repeat them than >> complicate the definition with ifdefs even more. > > Before this patch I always ran preprocessor in my brain. > now it's MUCH easier to understand than before! > >> >> The patch doesn't change physical offsets of the fields, although it could be >> done later - for example it's now clear that tighter packing in SLOB could be >> possible. >> > > Is there a benefit if we pack SLOB's struct slab tighter? I don't see any immediate benefit, except avoiding the page->mapping alias as you suggested. > ... > >> #ifdef CONFIG_MEMCG >> unsigned long memcg_data; >> @@ -47,7 +69,9 @@ struct slab { >> static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl)) >> SLAB_MATCH(flags, __page_flags); >> SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */ >> +#ifndef CONFIG_SLOB >> SLAB_MATCH(rcu_head, rcu_head); > > Because SLUB and SLAB sets slab->slab_cache = NULL (to set page->mapping = NULL), Hm, now that you mention it, maybe it would be better to do a "folio->mapping = NULL" instead as we now have a more clearer view where we operate on struct slab, and where we transition between that and a plain folio. This is IMHO part of preparing the folio for freeing, not a struct slab cleanup as struct slab doesn't need this cleanup. > What about adding this?: > > SLAB_MATCH(mapping, slab_cache); > > there was SLAB_MATCH(slab_cache, slab_cache) but removed. With the change suggested above, it wouldn't be needed as a safety check anymore. >> +#endif >> SLAB_MATCH(_refcount, __page_refcount); >> #ifdef CONFIG_MEMCG >> SLAB_MATCH(memcg_data, memcg_data); > > I couldn't find any functional problem on this patch. > but it seems there's some style issues. > > Below is what checkpatch.pl complains. > it's better to fix them! Not all checkpatch suggestions are correct and have to be followed, but I'll check what I missed. Thanks. > WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line) > #7: > With a struct slab definition separate from struct page, we can go further and > > WARNING: Possible repeated word: 'and' > #19: > implementation. Before this patch virt_to_cache() and and cache_from_obj() was > > WARNING: space prohibited between function name and open parenthesis '(' > #49: FILE: mm/kfence/core.c:432: > +#elif defined (CONFIG_SLAB) > > ERROR: "foo * bar" should be "foo *bar" > #73: FILE: mm/slab.h:20: > +void * s_mem;/* first object */ > > ERROR: "foo * bar" should be "foo *bar" > #111: FILE: mm/slab.h:53: > +void * __unused_1; > > ERROR: "foo * bar" should be "foo *bar" > #113: FILE: mm/slab.h:55: > +void * __unused_2; > > --- > Thanks, > Hyeonggon.
On Fri, Dec 10, 2021 at 07:26:11PM +0100, Vlastimil Babka wrote: > > Because SLUB and SLAB sets slab->slab_cache = NULL (to set page->mapping = NULL), > > Hm, now that you mention it, maybe it would be better to do a > "folio->mapping = NULL" instead as we now have a more clearer view where we > operate on struct slab, and where we transition between that and a plain > folio. This is IMHO part of preparing the folio for freeing, not a struct > slab cleanup as struct slab doesn't need this cleanup. Yes, I did that as part of "mm/slub: Convert slab freeing to struct slab" in my original series: - __ClearPageSlabPfmemalloc(page); + __slab_clear_pfmemalloc(slab); __ClearPageSlab(page); - /* In union with page->mapping where page allocator expects NULL */ - page->slab_cache = NULL; + page->mapping = NULL;
On Sat, Dec 11, 2021 at 11:55:27AM +0000, Hyeonggon Yoo wrote: > On Fri, Dec 10, 2021 at 07:26:11PM +0100, Vlastimil Babka wrote: > > On 12/10/21 17:37, Hyeonggon Yoo wrote: > > > On Wed, Dec 01, 2021 at 07:15:08PM +0100, Vlastimil Babka wrote: > > >> With a struct slab definition separate from struct page, we can go further and > > >> define only fields that the chosen sl*b implementation uses. This means > > >> everything between __page_flags and __page_refcount placeholders now depends on > > >> the chosen CONFIG_SL*B. > > > > > > When I read this patch series first, I thought struct slab is allocated > > > separately from struct page. > > > > > > But after reading it again, It uses same allocated space of struct page. > > > > Yes. Allocating it elsewhere is something that can be discussed later. It's > > not a simple clear win - more memory used, more overhead, complicated code... > > > > Right. That is a something that can be discussed, > But I don't think there will be much win. Oh no, there's a substantial win. If we can reduce struct page to a single pointer, that shrinks it from 64 bytes/4k to 8 bytes/4k. Set against that, you have to allocate the struct folio / struct slab / ... but then it's one _per allocation_ rather than one per page. So for an order-2 allocation, it takes 32 bytes + 64 bytes (= 96 bytes) rather than 4*64 = 256 bytes. It's an even bigger win for larger allocations, and it lets us grow the memory descriptors independently of each other. But it's also a substantial amount of work, so don't expect us to get there any time soon. Everything currently using struct page needs to be converted to use another type, and that's just the pre-requisite step. Some more thoughts on this here: https://lore.kernel.org/linux-mm/YXcLqcFhDq3uUwIj@casper.infradead.org/ > > Yeah. Also whatever aliases with compound_head must not have bit zero set as > > that means a tail page. > > > > Oh I was missing that. Thank you. > > Hmm then in struct slab, page->compound_head and slab->list_head (or > slab->rcu_head) has same offset. And list_head / rcu_head both store pointers. > > then it has a alignment requirement. (address saved in list_head/rcu_head > should be multiple of 2) > > Anyway, it was required long time before this patch, > so it is not a problem for this patch. Yes, that's why there's an assert that the list_heads all line up. This requirement will go away if we do get separately allocated memory descriptors (because that bottom bit is no longer PageTail).
diff --git a/mm/kfence/core.c b/mm/kfence/core.c index 4eb60cf5ff8b..46103a7628a6 100644 --- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -427,10 +427,11 @@ static void *kfence_guarded_alloc(struct kmem_cache *cache, size_t size, gfp_t g /* Set required slab fields. */ slab = virt_to_slab((void *)meta->addr); slab->slab_cache = cache; - if (IS_ENABLED(CONFIG_SLUB)) - slab->objects = 1; - if (IS_ENABLED(CONFIG_SLAB)) - slab->s_mem = addr; +#if defined(CONFIG_SLUB) + slab->objects = 1; +#elif defined (CONFIG_SLAB) + slab->s_mem = addr; +#endif /* Memory initialization. */ for_each_canary(meta, set_canary_byte); diff --git a/mm/slab.h b/mm/slab.h index 2d50c099a222..8c5a8c005896 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -8,9 +8,24 @@ /* Reuses the bits in struct page */ struct slab { unsigned long __page_flags; + +#if defined(CONFIG_SLAB) + + union { + struct list_head slab_list; + struct rcu_head rcu_head; + }; + struct kmem_cache *slab_cache; + void *freelist; /* array of free object indexes */ + void * s_mem; /* first object */ + unsigned int active; + +#elif defined(CONFIG_SLUB) + union { struct list_head slab_list; - struct { /* Partial pages */ + struct rcu_head rcu_head; + struct { struct slab *next; #ifdef CONFIG_64BIT int slabs; /* Nr of slabs left */ @@ -18,25 +33,32 @@ struct slab { short int slabs; #endif }; - struct rcu_head rcu_head; }; - struct kmem_cache *slab_cache; /* not slob */ + struct kmem_cache *slab_cache; /* Double-word boundary */ void *freelist; /* first free object */ union { - void *s_mem; /* slab: first object */ - unsigned long counters; /* SLUB */ - struct { /* SLUB */ + unsigned long counters; + struct { unsigned inuse:16; unsigned objects:15; unsigned frozen:1; }; }; + unsigned int __unused; + +#elif defined(CONFIG_SLOB) + + struct list_head slab_list; + void * __unused_1; + void *freelist; /* first free block */ + void * __unused_2; + int units; + +#else +#error "Unexpected slab allocator configured" +#endif - union { - unsigned int active; /* SLAB */ - int units; /* SLOB */ - }; atomic_t __page_refcount; #ifdef CONFIG_MEMCG unsigned long memcg_data; @@ -47,7 +69,9 @@ struct slab { static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl)) SLAB_MATCH(flags, __page_flags); SLAB_MATCH(compound_head, slab_list); /* Ensure bit 0 is clear */ +#ifndef CONFIG_SLOB SLAB_MATCH(rcu_head, rcu_head); +#endif SLAB_MATCH(_refcount, __page_refcount); #ifdef CONFIG_MEMCG SLAB_MATCH(memcg_data, memcg_data); @@ -623,6 +647,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, } #endif /* CONFIG_MEMCG_KMEM */ +#ifndef CONFIG_SLOB static inline struct kmem_cache *virt_to_cache(const void *obj) { struct slab *slab; @@ -669,6 +694,7 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x) print_tracking(cachep, x); return cachep; } +#endif /* CONFIG_SLOB */ static inline size_t slab_ksize(const struct kmem_cache *s) {