| Message ID | 20210630040034.1155892-15-willy@infradead.org (mailing list archive) |
|---|---|
| State | New |
| Series | Folio conversion of memcg |
On Wed 30-06-21 05:00:30, Matthew Wilcox wrote:
> This saves dozens of bytes of text by eliminating a lot of calls to
> compound_head().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/memcontrol.c | 37 +++++++++++++++++++------------------
>  1 file changed, 19 insertions(+), 18 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index b94a6122f27d..95795b65ae3e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5585,38 +5585,39 @@ static int mem_cgroup_move_account(struct page *page,
>                                     struct mem_cgroup *from,
>                                     struct mem_cgroup *to)
>  {
> +        struct folio *folio = page_folio(page);
>          struct lruvec *from_vec, *to_vec;
>          struct pglist_data *pgdat;
> -        unsigned int nr_pages = compound ? thp_nr_pages(page) : 1;
> +        unsigned int nr_pages = compound ? folio_nr_pages(folio) : 1;
>          int nid, ret;
>
>          VM_BUG_ON(from == to);
> -        VM_BUG_ON_PAGE(PageLRU(page), page);
> -        VM_BUG_ON(compound && !PageTransHuge(page));
> +        VM_BUG_ON_FOLIO(folio_lru(folio), folio);
> +        VM_BUG_ON(compound && !folio_multi(folio));
>
>          /*
>           * Prevent mem_cgroup_migrate() from looking at
>           * page's memory cgroup of its source page while we change it.
>           */
>          ret = -EBUSY;
> -        if (!trylock_page(page))
> +        if (!folio_trylock(folio))
>                  goto out;
>
>          ret = -EINVAL;
> -        if (page_memcg(page) != from)
> +        if (folio_memcg(folio) != from)
>                  goto out_unlock;
>
> -        pgdat = page_pgdat(page);
> +        pgdat = folio_pgdat(folio);
>          from_vec = mem_cgroup_lruvec(from, pgdat);
>          to_vec = mem_cgroup_lruvec(to, pgdat);
>
> -        lock_page_memcg(page);
> +        folio_memcg_lock(folio);
>
> -        if (PageAnon(page)) {
> -                if (page_mapped(page)) {
> +        if (folio_anon(folio)) {
> +                if (folio_mapped(folio)) {
>                          __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
>                          __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> -                        if (PageTransHuge(page)) {
> +                        if (folio_multi(folio)) {

Shouldn't be folio_transhuge? The resulting code is the same but
folio_transhuge is more explicit and matches the THP aspect.
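folio_multi() and folio_transhuge() are introduced by earlier patches in the
series, so their definitions are not visible in this hunk. As a purely
illustrative sketch of the distinction being discussed (written in terms of
folio_order() and HPAGE_PMD_ORDER, not the series' actual definitions),
folio_multi() is true for any compound folio, while folio_transhuge() is true
only for folios big enough to be PMD-mapped:

        /* Illustrative sketch only; not the helpers as defined in the series. */
        static inline bool folio_multi(struct folio *folio)
        {
                /* any multi-page folio, whatever its order */
                return folio_order(folio) > 0;
        }

        static inline bool folio_transhuge(struct folio *folio)
        {
                /* only folios of at least PMD size */
                return folio_order(folio) >= HPAGE_PMD_ORDER;
        }

As long as anonymous memory comes only in order-0 and PMD-order folios, the
two tests accept exactly the same folios, which is why the resulting code is
the same.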
On Wed, Jun 30, 2021 at 10:30:38AM +0200, Michal Hocko wrote:
> > -        if (PageAnon(page)) {
> > -                if (page_mapped(page)) {
> > +        if (folio_anon(folio)) {
> > +                if (folio_mapped(folio)) {
> >                          __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> >                          __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> > -                        if (PageTransHuge(page)) {
> > +                        if (folio_multi(folio)) {
>
> Shouldn't be folio_transhuge? The resulting code is the same but
> folio_transhuge is more explicit and matches the THP aspect.

I genuinely don't know. For the benefit of those reading along, the
important part of the context is:

        if (folio_mapped(folio)) {
                __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
                __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
                if (folio_multi(folio)) {
                        __mod_lruvec_state(from_vec, NR_ANON_THPS,
                                           -nr_pages);
                        __mod_lruvec_state(to_vec, NR_ANON_THPS,
                                           nr_pages);
                }
        }

We need to decide what 'NR_ANON_THPS' means in a folio-based world where
we have folios of all orders. Does it count only the number of pages
in folios >= HPAGE_PMD_SIZE? Or does it count the number of pages in
folios > PAGE_SIZE?

Similar question (and I suspect the same answer) for NR_SHMEM_THPS and
NR_FILE_THPS.

Right now, I've been accounting any multi-page folio as a THP, but I
don't have a good sense of what the answer should be.
On Wed 30-06-21 12:22:48, Matthew Wilcox wrote:
> On Wed, Jun 30, 2021 at 10:30:38AM +0200, Michal Hocko wrote:
> > > -        if (PageAnon(page)) {
> > > -                if (page_mapped(page)) {
> > > +        if (folio_anon(folio)) {
> > > +                if (folio_mapped(folio)) {
> > >                          __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
> > >                          __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
> > > -                        if (PageTransHuge(page)) {
> > > +                        if (folio_multi(folio)) {
> >
> > Shouldn't be folio_transhuge? The resulting code is the same but
> > folio_transhuge is more explicit and matches the THP aspect.
>
> I genuinely don't know. For the benefit of those reading along, the
> important part of the context is:
>
>         if (folio_mapped(folio)) {
>                 __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
>                 __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
>                 if (folio_multi(folio)) {
>                         __mod_lruvec_state(from_vec, NR_ANON_THPS,
>                                            -nr_pages);
>                         __mod_lruvec_state(to_vec, NR_ANON_THPS,
>                                            nr_pages);
>                 }
>         }
>
> We need to decide what 'NR_ANON_THPS' means in a folio-based world where
> we have folios of all orders. Does it count only the number of pages
> in folios >= HPAGE_PMD_SIZE? Or does it count the number of pages in
> folios > PAGE_SIZE?

At this stage we only have PMD based, right? I believe it would be
simpler to stick with that at the moment and change that to a more
generic way along with other places which need updating.

Wrt. counters they do count pages so in this case this shouldn't be a
problem. But we do have counters for pmd mappings and that might need
some care.
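To make the page-based accounting concrete: since these statistics are kept in
units of base pages, moving a single PMD-sized anonymous THP on x86-64 (4 kB
pages, 2 MB PMD, used here only as an example configuration) moves
nr_pages = 512, shifting 2048 kB worth of NR_ANON_MAPPED and NR_ANON_THPS from
the source lruvec to the destination one. The /proc/meminfo lines quoted in
the next message simply print those page counts in kB.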
On Wed, Jun 30, 2021 at 02:20:38PM +0200, Michal Hocko wrote:
> On Wed 30-06-21 12:22:48, Matthew Wilcox wrote:
> > We need to decide what 'NR_ANON_THPS' means in a folio-based world where
> > we have folios of all orders. Does it count only the number of pages
> > in folios >= HPAGE_PMD_SIZE? Or does it count the number of pages in
> > folios > PAGE_SIZE?
>
> At this stage we only have PMD based, right? I believe it would be
> simpler to stick with that at the moment and change that to a more
> generic way along with other places which need updating.
>
> Wrt. counters they do count pages so in this case this shouldn't be a
> problem. But we do have counters for pmd mappings and that might need
> some care.

Looking at how these are reported:

        show_val_kb(m, "AnonHugePages: ",
                    global_node_page_state(NR_ANON_THPS));
        show_val_kb(m, "ShmemHugePages: ",
                    global_node_page_state(NR_SHMEM_THPS));
        show_val_kb(m, "ShmemPmdMapped: ",
                    global_node_page_state(NR_SHMEM_PMDMAPPED));
        show_val_kb(m, "FileHugePages: ",
                    global_node_page_state(NR_FILE_THPS));
        show_val_kb(m, "FilePmdMapped: ",
                    global_node_page_state(NR_FILE_PMDMAPPED));

it specifically refers to 'HugePages', so I think we need to only
count folios with order >= PMD_ORDER.  I'll make that change to
folio_transhuge() and use folio_transhuge() here.
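With that change, the hunk under discussion would presumably end up looking
something like the following. This is only a sketch, assuming
folio_transhuge() becomes a test of folio_order() against the PMD order as
suggested above; it is not the final code:

        if (folio_mapped(folio)) {
                __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
                __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
                /* only PMD-sized or larger folios feed the THP counters */
                if (folio_transhuge(folio)) {
                        __mod_lruvec_state(from_vec, NR_ANON_THPS,
                                           -nr_pages);
                        __mod_lruvec_state(to_vec, NR_ANON_THPS,
                                           nr_pages);
                }
        }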
On Wed 30-06-21 13:31:17, Matthew Wilcox wrote:
> On Wed, Jun 30, 2021 at 02:20:38PM +0200, Michal Hocko wrote:
> > On Wed 30-06-21 12:22:48, Matthew Wilcox wrote:
> > > We need to decide what 'NR_ANON_THPS' means in a folio-based world where
> > > we have folios of all orders. Does it count only the number of pages
> > > in folios >= HPAGE_PMD_SIZE? Or does it count the number of pages in
> > > folios > PAGE_SIZE?
> >
> > At this stage we only have PMD based, right? I believe it would be
> > simpler to stick with that at the moment and change that to a more
> > generic way along with other places which need updating.
> >
> > Wrt. counters they do count pages so in this case this shouldn't be a
> > problem. But we do have counters for pmd mappings and that might need
> > some care.
>
> Looking at how these are reported:
>
>         show_val_kb(m, "AnonHugePages: ",
>                     global_node_page_state(NR_ANON_THPS));
>         show_val_kb(m, "ShmemHugePages: ",
>                     global_node_page_state(NR_SHMEM_THPS));
>         show_val_kb(m, "ShmemPmdMapped: ",
>                     global_node_page_state(NR_SHMEM_PMDMAPPED));
>         show_val_kb(m, "FileHugePages: ",
>                     global_node_page_state(NR_FILE_THPS));
>         show_val_kb(m, "FilePmdMapped: ",
>                     global_node_page_state(NR_FILE_PMDMAPPED));
>
> it specifically refers to 'HugePages', so I think we need to only
> count folios with order >= PMD_ORDER.

Why? The presented value is in kB. It gives us a cumulative number of
transparent large pages. Sure breakdown to respective orders would be
impossible in general but the same would be the case if order > PMD_ORDER.

I am not really sure how useful that information is in practice but that
is a different story.

> I'll make that change to
> folio_transhuge() and use folio_transhuge() here.

Thanks!
On Wed, Jun 30, 2021 at 02:45:33PM +0200, Michal Hocko wrote:
> On Wed 30-06-21 13:31:17, Matthew Wilcox wrote:
> > On Wed, Jun 30, 2021 at 02:20:38PM +0200, Michal Hocko wrote:
> > > On Wed 30-06-21 12:22:48, Matthew Wilcox wrote:
> > > > We need to decide what 'NR_ANON_THPS' means in a folio-based world where
> > > > we have folios of all orders. Does it count only the number of pages
> > > > in folios >= HPAGE_PMD_SIZE? Or does it count the number of pages in
> > > > folios > PAGE_SIZE?
> > >
> > > At this stage we only have PMD based, right? I believe it would be
> > > simpler to stick with that at the moment and change that to a more
> > > generic way along with other places which need updating.
> > >
> > > Wrt. counters they do count pages so in this case this shouldn't be a
> > > problem. But we do have counters for pmd mappings and that might need
> > > some care.
> >
> > Looking at how these are reported:
> >
> >         show_val_kb(m, "AnonHugePages: ",
> >                     global_node_page_state(NR_ANON_THPS));
> >         show_val_kb(m, "ShmemHugePages: ",
> >                     global_node_page_state(NR_SHMEM_THPS));
> >         show_val_kb(m, "ShmemPmdMapped: ",
> >                     global_node_page_state(NR_SHMEM_PMDMAPPED));
> >         show_val_kb(m, "FileHugePages: ",
> >                     global_node_page_state(NR_FILE_THPS));
> >         show_val_kb(m, "FilePmdMapped: ",
> >                     global_node_page_state(NR_FILE_PMDMAPPED));
> >
> > it specifically refers to 'HugePages', so I think we need to only
> > count folios with order >= PMD_ORDER.
>
> Why? The presented value is in kB. It gives us a cumulative number of
> transparent large pages. Sure breakdown to respective orders would be
> impossible in general but the same would be the case if order > PMD_ORDER.
>
> I am not really sure how useful that information is in practice but that
> is a different story.

The scenario I'm thinking about is a situation where we have gigabytes
of memory in the page cache in 16k-64k chunks and we'll see
FileHugePages:   5219348 kB
FilePmdMapped:         0 kB

which might cause the slightly-too-clever user to think there's a
problem.
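To put rough numbers on that scenario (assuming x86-64, where a PMD maps
2048 kB): 5219348 kB of page cache held in 64 kB folios is roughly 81,500
folios, and about four times as many at 16 kB. Every one of those folios is
smaller than a PMD, so all of them are counted in FileHugePages while none of
them can ever be PMD-mapped, and FilePmdMapped legitimately reads 0.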
On Wed 07-07-21 16:25:07, Matthew Wilcox wrote:
> On Wed, Jun 30, 2021 at 02:45:33PM +0200, Michal Hocko wrote:
> > On Wed 30-06-21 13:31:17, Matthew Wilcox wrote:
> > > On Wed, Jun 30, 2021 at 02:20:38PM +0200, Michal Hocko wrote:
> > > > On Wed 30-06-21 12:22:48, Matthew Wilcox wrote:
> > > > > We need to decide what 'NR_ANON_THPS' means in a folio-based world where
> > > > > we have folios of all orders. Does it count only the number of pages
> > > > > in folios >= HPAGE_PMD_SIZE? Or does it count the number of pages in
> > > > > folios > PAGE_SIZE?
> > > >
> > > > At this stage we only have PMD based, right? I believe it would be
> > > > simpler to stick with that at the moment and change that to a more
> > > > generic way along with other places which need updating.
> > > >
> > > > Wrt. counters they do count pages so in this case this shouldn't be a
> > > > problem. But we do have counters for pmd mappings and that might need
> > > > some care.
> > >
> > > Looking at how these are reported:
> > >
> > >         show_val_kb(m, "AnonHugePages: ",
> > >                     global_node_page_state(NR_ANON_THPS));
> > >         show_val_kb(m, "ShmemHugePages: ",
> > >                     global_node_page_state(NR_SHMEM_THPS));
> > >         show_val_kb(m, "ShmemPmdMapped: ",
> > >                     global_node_page_state(NR_SHMEM_PMDMAPPED));
> > >         show_val_kb(m, "FileHugePages: ",
> > >                     global_node_page_state(NR_FILE_THPS));
> > >         show_val_kb(m, "FilePmdMapped: ",
> > >                     global_node_page_state(NR_FILE_PMDMAPPED));
> > >
> > > it specifically refers to 'HugePages', so I think we need to only
> > > count folios with order >= PMD_ORDER.
> >
> > Why? The presented value is in kB. It gives us a cumulative number of
> > transparent large pages. Sure breakdown to respective orders would be
> > impossible in general but the same would be the case if order > PMD_ORDER.
> >
> > I am not really sure how useful that information is in practice but that
> > is a different story.
>
> The scenario I'm thinking about is a situation where we have gigabytes
> of memory in the page cache in 16k-64k chunks and we'll see
> FileHugePages:   5219348 kB
> FilePmdMapped:         0 kB
>
> which might cause the slightly-too-clever user to think there's a
> problem.

Well, cases like this one shouldn't be really hard to explain though so
I wouldn't be worried about this.
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b94a6122f27d..95795b65ae3e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5585,38 +5585,39 @@ static int mem_cgroup_move_account(struct page *page,
                                    struct mem_cgroup *from,
                                    struct mem_cgroup *to)
 {
+        struct folio *folio = page_folio(page);
         struct lruvec *from_vec, *to_vec;
         struct pglist_data *pgdat;
-        unsigned int nr_pages = compound ? thp_nr_pages(page) : 1;
+        unsigned int nr_pages = compound ? folio_nr_pages(folio) : 1;
         int nid, ret;
 
         VM_BUG_ON(from == to);
-        VM_BUG_ON_PAGE(PageLRU(page), page);
-        VM_BUG_ON(compound && !PageTransHuge(page));
+        VM_BUG_ON_FOLIO(folio_lru(folio), folio);
+        VM_BUG_ON(compound && !folio_multi(folio));
 
         /*
          * Prevent mem_cgroup_migrate() from looking at
          * page's memory cgroup of its source page while we change it.
          */
         ret = -EBUSY;
-        if (!trylock_page(page))
+        if (!folio_trylock(folio))
                 goto out;
 
         ret = -EINVAL;
-        if (page_memcg(page) != from)
+        if (folio_memcg(folio) != from)
                 goto out_unlock;
 
-        pgdat = page_pgdat(page);
+        pgdat = folio_pgdat(folio);
         from_vec = mem_cgroup_lruvec(from, pgdat);
         to_vec = mem_cgroup_lruvec(to, pgdat);
 
-        lock_page_memcg(page);
+        folio_memcg_lock(folio);
 
-        if (PageAnon(page)) {
-                if (page_mapped(page)) {
+        if (folio_anon(folio)) {
+                if (folio_mapped(folio)) {
                         __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
                         __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
-                        if (PageTransHuge(page)) {
+                        if (folio_multi(folio)) {
                                 __mod_lruvec_state(from_vec, NR_ANON_THPS,
                                                    -nr_pages);
                                 __mod_lruvec_state(to_vec, NR_ANON_THPS,
@@ -5627,18 +5628,18 @@ static int mem_cgroup_move_account(struct page *page,
                 __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages);
                 __mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages);
 
-                if (PageSwapBacked(page)) {
+                if (folio_swapbacked(folio)) {
                         __mod_lruvec_state(from_vec, NR_SHMEM, -nr_pages);
                         __mod_lruvec_state(to_vec, NR_SHMEM, nr_pages);
                 }
 
-                if (page_mapped(page)) {
+                if (folio_mapped(folio)) {
                         __mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages);
                         __mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages);
                 }
 
-                if (PageDirty(page)) {
-                        struct address_space *mapping = page_mapping(page);
+                if (folio_dirty(folio)) {
+                        struct address_space *mapping = folio_mapping(folio);
 
                         if (mapping_can_writeback(mapping)) {
                                 __mod_lruvec_state(from_vec, NR_FILE_DIRTY,
@@ -5649,7 +5650,7 @@ static int mem_cgroup_move_account(struct page *page,
                 }
         }
 
-        if (PageWriteback(page)) {
+        if (folio_writeback(folio)) {
                 __mod_lruvec_state(from_vec, NR_WRITEBACK, -nr_pages);
                 __mod_lruvec_state(to_vec, NR_WRITEBACK, nr_pages);
         }
@@ -5672,12 +5673,12 @@ static int mem_cgroup_move_account(struct page *page,
         css_get(&to->css);
         css_put(&from->css);
 
-        page->memcg_data = (unsigned long)to;
+        folio->memcg_data = (unsigned long)to;
 
         __memcg_unlock(from);
 
         ret = 0;
-        nid = page_to_nid(page);
+        nid = folio_nid(folio);
 
         local_irq_disable();
         mem_cgroup_charge_statistics(to, nr_pages);
@@ -5686,7 +5687,7 @@ static int mem_cgroup_move_account(struct page *page,
         memcg_check_events(from, nid);
         local_irq_enable();
 out_unlock:
-        unlock_page(page);
+        folio_unlock(folio);
 out:
         return ret;
 }
This saves dozens of bytes of text by eliminating a lot of calls to
compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memcontrol.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)