Message ID: 20190507040609.21746-1-willy@infradead.org (mailing list archive)
Series: Remove 'order' argument from many mm functions
On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> It's possible to save a few hundred bytes from the kernel text by moving
> the 'order' argument into the GFP flags.  I had the idea while I was
> playing with THP pagecache (notably, I didn't want to add an 'order'
> parameter to pagecache_get_page())
>
> What I got for a -tiny config for page_alloc.o (with a tinyconfig,
> x86-32) after each step:
>
>    text	   data	    bss	    dec	    hex	filename
>   21462	    349	     44	  21855	   555f	1.o
>   21447	    349	     44	  21840	   5550	2.o
>   21415	    349	     44	  21808	   5530	3.o
>   21399	    349	     44	  21792	   5520	4.o
>   21399	    349	     44	  21792	   5520	5.o
>   21367	    349	     44	  21760	   5500	6.o
>   21303	    349	     44	  21696	   54c0	7.o
>   21303	    349	     44	  21696	   54c0	8.o
>   21303	    349	     44	  21696	   54c0	9.o
>   21303	    349	     44	  21696	   54c0	A.o
>   21303	    349	     44	  21696	   54c0	B.o
>
> I assure you that the callers all shrink as well.  vmscan.o also
> shrinks, but I didn't keep detailed records.
>
> Anyway, this is just a quick POC due to me being on an aeroplane for
> most of today.  Maybe we don't want to spend five GFP bits on this.
> Some bits of this could be pulled out and applied even if we don't want
> to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> gfp_flags argument.

Over all I may just be a simpleton WRT this but I'm not sure that the
added complexity justifies the gain.  But other than the 1 patch I don't
see anything technically wrong.  So I guess...

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> Matthew Wilcox (Oracle) (11):
>   fix function alignment
>   mm: Pass order to __alloc_pages_nodemask in GFP flags
>   mm: Pass order to __get_free_pages() in GFP flags
>   mm: Pass order to prep_new_page in GFP flags
>   mm: Remove gfp_flags argument from rmqueue_pcplist
>   mm: Pass order to rmqueue in GFP flags
>   mm: Pass order to get_page_from_freelist in GFP flags
>   mm: Pass order to __alloc_pages_cpuset_fallback in GFP flags
>   mm: Pass order to prepare_alloc_pages in GFP flags
>   mm: Pass order to try_to_free_pages in GFP flags
>   mm: Pass order to node_reclaim() in GFP flags
>
>  arch/x86/Makefile_32.cpu      |  2 +
>  arch/x86/events/intel/ds.c    |  4 +-
>  arch/x86/kvm/vmx/vmx.c        |  4 +-
>  arch/x86/mm/init.c            |  3 +-
>  arch/x86/mm/pgtable.c         |  7 +--
>  drivers/base/devres.c         |  2 +-
>  include/linux/gfp.h           | 57 +++++++++++---------
>  include/linux/migrate.h       |  2 +-
>  include/linux/swap.h          |  2 +-
>  include/trace/events/vmscan.h | 28 +++++-----
>  mm/filemap.c                  |  2 +-
>  mm/gup.c                      |  4 +-
>  mm/hugetlb.c                  |  5 +-
>  mm/internal.h                 |  5 +-
>  mm/khugepaged.c               |  2 +-
>  mm/mempolicy.c                | 30 +++++------
>  mm/migrate.c                  |  2 +-
>  mm/mmu_gather.c               |  2 +-
>  mm/page_alloc.c               | 97 +++++++++++++++++------------------
>  mm/shmem.c                    |  5 +-
>  mm/slub.c                     |  2 +-
>  mm/vmscan.c                   | 26 +++++-----
>  22 files changed, 147 insertions(+), 146 deletions(-)
>
> --
> 2.20.1
On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> It's possible to save a few hundred bytes from the kernel text by moving
> the 'order' argument into the GFP flags.  I had the idea while I was
> playing with THP pagecache (notably, I didn't want to add an 'order'
> parameter to pagecache_get_page())
>
> What I got for a -tiny config for page_alloc.o (with a tinyconfig,
> x86-32) after each step:
>
>    text	   data	    bss	    dec	    hex	filename
>   21462	    349	     44	  21855	   555f	1.o
>   21447	    349	     44	  21840	   5550	2.o
>   21415	    349	     44	  21808	   5530	3.o
>   21399	    349	     44	  21792	   5520	4.o
>   21399	    349	     44	  21792	   5520	5.o
>   21367	    349	     44	  21760	   5500	6.o
>   21303	    349	     44	  21696	   54c0	7.o
>   21303	    349	     44	  21696	   54c0	8.o
>   21303	    349	     44	  21696	   54c0	9.o
>   21303	    349	     44	  21696	   54c0	A.o
>   21303	    349	     44	  21696	   54c0	B.o
>
> I assure you that the callers all shrink as well.  vmscan.o also
> shrinks, but I didn't keep detailed records.
>
> Anyway, this is just a quick POC due to me being on an aeroplane for
> most of today.  Maybe we don't want to spend five GFP bits on this.
> Some bits of this could be pulled out and applied even if we don't want
> to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> gfp_flags argument.

I like the idea. But I'm somewhat worried about running out of bits in
gfp_t. Is there anything preventing us to bump gfp_t to u64 in the
future?
On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > It's possible to save a few hundred bytes from the kernel text by moving
> > the 'order' argument into the GFP flags.  I had the idea while I was
> > playing with THP pagecache (notably, I didn't want to add an 'order'
> > parameter to pagecache_get_page())

...

> > Anyway, this is just a quick POC due to me being on an aeroplane for
> > most of today.  Maybe we don't want to spend five GFP bits on this.
> > Some bits of this could be pulled out and applied even if we don't want
> > to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> > gfp_flags argument.
>
> Over all I may just be a simpleton WRT this but I'm not sure that the added
> complexity justifies the gain.

I'm disappointed that you see it as added complexity.  I see it as
reducing complexity.  With this patch, we can simply pass GFP_PMD as a
flag to pagecache_get_page(); without it, we have to add a fifth
parameter to pagecache_get_page() and change all the callers to pass '0'.
> On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> > On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > > It's possible to save a few hundred bytes from the kernel text by
> > > moving the 'order' argument into the GFP flags.  I had the idea
> > > while I was playing with THP pagecache (notably, I didn't want to add an
> > > 'order' parameter to pagecache_get_page())
> ...
> > > Anyway, this is just a quick POC due to me being on an aeroplane for
> > > most of today.  Maybe we don't want to spend five GFP bits on this.
> > > Some bits of this could be pulled out and applied even if we don't
> > > want to go for the main objective.  eg rmqueue_pcplist() doesn't use
> > > its gfp_flags argument.
> >
> > Over all I may just be a simpleton WRT this but I'm not sure that the
> > added complexity justifies the gain.
>
> I'm disappointed that you see it as added complexity.  I see it as reducing
> complexity.  With this patch, we can simply pass GFP_PMD as a flag to
> pagecache_get_page(); without it, we have to add a fifth parameter to
> pagecache_get_page() and change all the callers to pass '0'.

I don't disagree for pagecache_get_page().

I'm not saying we should not do this.  But this seems odd to me.

Again I'm probably just being a simpleton...

Ira
On Thu, May 09, 2019 at 04:48:39PM +0000, Weiny, Ira wrote:
> > On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> > > On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > > > It's possible to save a few hundred bytes from the kernel text by
> > > > moving the 'order' argument into the GFP flags.  I had the idea
> > > > while I was playing with THP pagecache (notably, I didn't want to add an
> > > > 'order' parameter to pagecache_get_page())
> > ...
> > > > Anyway, this is just a quick POC due to me being on an aeroplane for
> > > > most of today.  Maybe we don't want to spend five GFP bits on this.
> > > > Some bits of this could be pulled out and applied even if we don't
> > > > want to go for the main objective.  eg rmqueue_pcplist() doesn't use
> > > > its gfp_flags argument.
> > >
> > > Over all I may just be a simpleton WRT this but I'm not sure that the
> > > added complexity justifies the gain.
> >
> > I'm disappointed that you see it as added complexity.  I see it as reducing
> > complexity.  With this patch, we can simply pass GFP_PMD as a flag to
> > pagecache_get_page(); without it, we have to add a fifth parameter to
> > pagecache_get_page() and change all the callers to pass '0'.
>
> I don't disagree for pagecache_get_page().
>
> I'm not saying we should not do this.  But this seems odd to me.
>
> Again I'm probably just being a simpleton...

This concerns me, though.  I see it as being a simplification, but if
other people see it as a complication, then it's not.  Perhaps I didn't
take the patches far enough for you to see benefit?  We have quite the
thicket of .*alloc_page.* functions, and I can't keep them all straight.
Between taking, or not taking, the nodeid, the gfp mask, the order, a VMA
and random other crap; not to mention the NUMA vs !NUMA implementations,
this is crying out for simplification.

It doesn't help that I screwed up the __get_free_pages patch.  I should
have grepped and realised that we had over 200 callers and it's not
worth changing them all as part of this patchset.
On Thu, May 09, 2019 at 02:07:55PM +0300, Kirill A. Shutemov wrote:
> On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > Anyway, this is just a quick POC due to me being on an aeroplane for
> > most of today.  Maybe we don't want to spend five GFP bits on this.
> > Some bits of this could be pulled out and applied even if we don't want
> > to go for the main objective.  eg rmqueue_pcplist() doesn't use its
> > gfp_flags argument.
>
> I like the idea. But I'm somewhat worried about running out of bits in
> gfp_t. Is there anything preventing us to bump gfp_t to u64 in the
> future?

It's stored in a few structs that might not appreciate it growing, like
struct address_space.  I've been vaguely wondering about how to combine
order, gfp_t and nodeid into one parameter in a way that doesn't grow
those structs, but I don't have a solid idea yet.
On Thu, May 09, 2019 at 11:29:02AM -0700, Matthew Wilcox wrote:
> On Thu, May 09, 2019 at 04:48:39PM +0000, Weiny, Ira wrote:
> > > On Wed, May 08, 2019 at 06:58:09PM -0700, Ira Weiny wrote:
> > > > On Mon, May 06, 2019 at 09:05:58PM -0700, Matthew Wilcox wrote:
> > > > > It's possible to save a few hundred bytes from the kernel text by
> > > > > moving the 'order' argument into the GFP flags.  I had the idea
> > > > > while I was playing with THP pagecache (notably, I didn't want to add an
> > > > > 'order' parameter to pagecache_get_page())
> > > ...
> > > > > Anyway, this is just a quick POC due to me being on an aeroplane for
> > > > > most of today.  Maybe we don't want to spend five GFP bits on this.
> > > > > Some bits of this could be pulled out and applied even if we don't
> > > > > want to go for the main objective.  eg rmqueue_pcplist() doesn't use
> > > > > its gfp_flags argument.
> > > >
> > > > Over all I may just be a simpleton WRT this but I'm not sure that the
> > > > added complexity justifies the gain.
> > >
> > > I'm disappointed that you see it as added complexity.  I see it as reducing
> > > complexity.  With this patch, we can simply pass GFP_PMD as a flag to
> > > pagecache_get_page(); without it, we have to add a fifth parameter to
> > > pagecache_get_page() and change all the callers to pass '0'.
> >
> > I don't disagree for pagecache_get_page().
> >
> > I'm not saying we should not do this.  But this seems odd to me.
> >
> > Again I'm probably just being a simpleton...
>
> This concerns me, though.  I see it as being a simplification, but if
> other people see it as a complication, then it's not.  Perhaps I didn't
> take the patches far enough for you to see benefit?  We have quite the
> thicket of .*alloc_page.* functions, and I can't keep them all straight.
> Between taking, or not taking, the nodeid, the gfp mask, the order, a VMA
> and random other crap; not to mention the NUMA vs !NUMA implementations,
> this is crying out for simplification.

Was there a new version of this coming?  Sorry perhaps I dropped the
ball here by not replying?

Ira

> It doesn't help that I screwed up the __get_free_pages patch.  I should
> have grepped and realised that we had over 200 callers and it's not
> worth changing them all as part of this patchset.
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

It's possible to save a few hundred bytes from the kernel text by moving
the 'order' argument into the GFP flags.  I had the idea while I was
playing with THP pagecache (notably, I didn't want to add an 'order'
parameter to pagecache_get_page())

What I got for a -tiny config for page_alloc.o (with a tinyconfig,
x86-32) after each step:

   text	   data	    bss	    dec	    hex	filename
  21462	    349	     44	  21855	   555f	1.o
  21447	    349	     44	  21840	   5550	2.o
  21415	    349	     44	  21808	   5530	3.o
  21399	    349	     44	  21792	   5520	4.o
  21399	    349	     44	  21792	   5520	5.o
  21367	    349	     44	  21760	   5500	6.o
  21303	    349	     44	  21696	   54c0	7.o
  21303	    349	     44	  21696	   54c0	8.o
  21303	    349	     44	  21696	   54c0	9.o
  21303	    349	     44	  21696	   54c0	A.o
  21303	    349	     44	  21696	   54c0	B.o

I assure you that the callers all shrink as well.  vmscan.o also
shrinks, but I didn't keep detailed records.

Anyway, this is just a quick POC due to me being on an aeroplane for
most of today.  Maybe we don't want to spend five GFP bits on this.
Some bits of this could be pulled out and applied even if we don't want
to go for the main objective.  eg rmqueue_pcplist() doesn't use its
gfp_flags argument.

Matthew Wilcox (Oracle) (11):
  fix function alignment
  mm: Pass order to __alloc_pages_nodemask in GFP flags
  mm: Pass order to __get_free_pages() in GFP flags
  mm: Pass order to prep_new_page in GFP flags
  mm: Remove gfp_flags argument from rmqueue_pcplist
  mm: Pass order to rmqueue in GFP flags
  mm: Pass order to get_page_from_freelist in GFP flags
  mm: Pass order to __alloc_pages_cpuset_fallback in GFP flags
  mm: Pass order to prepare_alloc_pages in GFP flags
  mm: Pass order to try_to_free_pages in GFP flags
  mm: Pass order to node_reclaim() in GFP flags

 arch/x86/Makefile_32.cpu      |  2 +
 arch/x86/events/intel/ds.c    |  4 +-
 arch/x86/kvm/vmx/vmx.c        |  4 +-
 arch/x86/mm/init.c            |  3 +-
 arch/x86/mm/pgtable.c         |  7 +--
 drivers/base/devres.c         |  2 +-
 include/linux/gfp.h           | 57 +++++++++++---------
 include/linux/migrate.h       |  2 +-
 include/linux/swap.h          |  2 +-
 include/trace/events/vmscan.h | 28 +++++-----
 mm/filemap.c                  |  2 +-
 mm/gup.c                      |  4 +-
 mm/hugetlb.c                  |  5 +-
 mm/internal.h                 |  5 +-
 mm/khugepaged.c               |  2 +-
 mm/mempolicy.c                | 30 +++++------
 mm/migrate.c                  |  2 +-
 mm/mmu_gather.c               |  2 +-
 mm/page_alloc.c               | 97 +++++++++++++++++------------------
 mm/shmem.c                    |  5 +-
 mm/slub.c                     |  2 +-
 mm/vmscan.c                   | 26 +++++-----
 22 files changed, 147 insertions(+), 146 deletions(-)