Message ID | 20201203031111.3187-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] mm/memcontrol: make the slab calculation consistent | expand |
On Thu, Dec 03, 2020 at 11:11:11AM +0800, Muchun Song wrote: > Although the ratio of the slab is one, we also should read the ratio > from the related memory_stats instead of hard-coding. And the local > variable of size is already the value of slab_unreclaimable. So we > do not need to read again. Simplify the code here. > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > Acked-by: Roman Gushchin <guro@fb.com> I agree that ignoring the ratio right now is not very pretty, but size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B); seq_buf_printf(&s, "slab %llu\n", size); is way easier to understand and more robust than using idx and idx + 1 and then requiring a series of BUG_ONs to ensure these two items are actually adjacent and in the right order. There is a redundant call to memcg_page_state(), granted, but that function is extremely cheap compared with e.g. seq_buf_printf(). > mm/memcontrol.c | 26 +++++++++++++++++++++----- > 1 file changed, 21 insertions(+), 5 deletions(-) IMO this really just complicates the code with little discernible upside. It's going to be a NAK from me, unfortunately. In retrospect, I think that memory_stats[] table was a mistake. It would probably be easier to implement this using a wrapper for memcg_page_state() that has a big switch() for unit conversion. Something like this: /* Translate stat items to the correct unit for memory.stat output */ static unsigned long memcg_page_state_output(memcg, item) { unsigned long value = memcg_page_state(memcg, item); int unit = PAGE_SIZE; switch (item) { case NR_SLAB_RECLAIMABLE_B: case NR_SLAB_UNRECLAIMABLE_B: case WORKINGSET_REFAULT_ANON: case WORKINGSET_REFAULT_FILE: case WORKINGSET_ACTIVATE_ANON: case WORKINGSET_ACTIVATE_FILE: case WORKINGSET_RESTORE_ANON: case WORKINGSET_RESTORE_FILE: case MEMCG_PERCPU_B: unit = 1; break; case NR_SHMEM_THPS: case NR_FILE_THPS: case NR_ANON_THPS: unit = HPAGE_PMD_SIZE; break; case NR_KERNEL_STACK_KB: unit = 1024; break; } return value * unit; } This would fix the ratio inconsistency, get rid of the awkward mix of static and runtime initialization of the table, is probably about the same amount of code, but simpler and more obvious overall.
On Fri, Dec 4, 2020 at 11:48 PM Johannes Weiner <hannes@cmpxchg.org> wrote: > > On Thu, Dec 03, 2020 at 11:11:11AM +0800, Muchun Song wrote: > > Although the ratio of the slab is one, we also should read the ratio > > from the related memory_stats instead of hard-coding. And the local > > variable of size is already the value of slab_unreclaimable. So we > > do not need to read again. Simplify the code here. > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > > Acked-by: Roman Gushchin <guro@fb.com> > > I agree that ignoring the ratio right now is not very pretty, but > > size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + > memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B); > seq_buf_printf(&s, "slab %llu\n", size); > > is way easier to understand and more robust than using idx and idx + 1 > and then requiring a series of BUG_ONs to ensure these two items are > actually adjacent and in the right order. > > There is a redundant call to memcg_page_state(), granted, but that > function is extremely cheap compared with e.g. seq_buf_printf(). > > > mm/memcontrol.c | 26 +++++++++++++++++++++----- > > 1 file changed, 21 insertions(+), 5 deletions(-) > > IMO this really just complicates the code with little discernible > upside. It's going to be a NAK from me, unfortunately. > > > In retrospect, I think that memory_stats[] table was a mistake. It > would probably be easier to implement this using a wrapper for > memcg_page_state() that has a big switch() for unit > conversion. Something like this: > > /* Translate stat items to the correct unit for memory.stat output */ > static unsigned long memcg_page_state_output(memcg, item) > { > unsigned long value = memcg_page_state(memcg, item); > int unit = PAGE_SIZE; > > switch (item) { > case NR_SLAB_RECLAIMABLE_B: > case NR_SLAB_UNRECLAIMABLE_B: > case WORKINGSET_REFAULT_ANON: > case WORKINGSET_REFAULT_FILE: > case WORKINGSET_ACTIVATE_ANON: > case WORKINGSET_ACTIVATE_FILE: > case WORKINGSET_RESTORE_ANON: > case WORKINGSET_RESTORE_FILE: > case MEMCG_PERCPU_B: > unit = 1; > break; > case NR_SHMEM_THPS: > case NR_FILE_THPS: > case NR_ANON_THPS: > unit = HPAGE_PMD_SIZE; > break; > case NR_KERNEL_STACK_KB: > unit = 1024; > break; > } > > return value * unit; > } > > This would fix the ratio inconsistency, get rid of the awkward mix of > static and runtime initialization of the table, is probably about the > same amount of code, but simpler and more obvious overall. Good idea. I can do that :) Thanks.
On Fri, Dec 04, 2020 at 10:46:13AM -0500, Johannes Weiner wrote: > On Thu, Dec 03, 2020 at 11:11:11AM +0800, Muchun Song wrote: > > Although the ratio of the slab is one, we also should read the ratio > > from the related memory_stats instead of hard-coding. And the local > > variable of size is already the value of slab_unreclaimable. So we > > do not need to read again. Simplify the code here. > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > > Acked-by: Roman Gushchin <guro@fb.com> > > I agree that ignoring the ratio right now is not very pretty, but > > size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + > memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B); > seq_buf_printf(&s, "slab %llu\n", size); > > is way easier to understand and more robust than using idx and idx + 1 > and then requiring a series of BUG_ONs to ensure these two items are > actually adjacent and in the right order. > > There is a redundant call to memcg_page_state(), granted, but that > function is extremely cheap compared with e.g. seq_buf_printf(). > > > mm/memcontrol.c | 26 +++++++++++++++++++++----- > > 1 file changed, 21 insertions(+), 5 deletions(-) > > IMO this really just complicates the code with little discernible > upside. It's going to be a NAK from me, unfortunately. > > > In retrospect, I think that memory_stats[] table was a mistake. It > would probably be easier to implement this using a wrapper for > memcg_page_state() that has a big switch() for unit > conversion. Something like this: +1 > > /* Translate stat items to the correct unit for memory.stat output */ > static unsigned long memcg_page_state_output(memcg, item) > { > unsigned long value = memcg_page_state(memcg, item); > int unit = PAGE_SIZE; > > switch (item) { > case NR_SLAB_RECLAIMABLE_B: > case NR_SLAB_UNRECLAIMABLE_B: > case WORKINGSET_REFAULT_ANON: > case WORKINGSET_REFAULT_FILE: > case WORKINGSET_ACTIVATE_ANON: > case WORKINGSET_ACTIVATE_FILE: > case WORKINGSET_RESTORE_ANON: > case WORKINGSET_RESTORE_FILE: > case MEMCG_PERCPU_B: > unit = 1; > break; > case NR_SHMEM_THPS: > case NR_FILE_THPS: > case NR_ANON_THPS: > unit = HPAGE_PMD_SIZE; > break; ^^^^^^^^^^^^ These can be easily converted to ordinary pages, so we can completely avoid this exception. > case NR_KERNEL_STACK_KB: > unit = 1024; > break; > } And NR_KERNEL_STACK_KB can be converted to bytes. Then we'll have everything kernel-related in bytes and everything userspace-related in PAGE_SIZE's. > > return value * unit; > } > > This would fix the ratio inconsistency, get rid of the awkward mix of > static and runtime initialization of the table, is probably about the > same amount of code, but simpler and more obvious overall. +1
diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 9922f1510956..75df129b7a52 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1545,12 +1545,22 @@ static int __init memory_stats_init(void) int i; for (i = 0; i < ARRAY_SIZE(memory_stats); i++) { + switch (memory_stats[i].idx) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE - if (memory_stats[i].idx == NR_ANON_THPS || - memory_stats[i].idx == NR_FILE_THPS || - memory_stats[i].idx == NR_SHMEM_THPS) + case NR_ANON_THPS: + case NR_FILE_THPS: + case NR_SHMEM_THPS: memory_stats[i].ratio = HPAGE_PMD_SIZE; + break; #endif + case NR_SLAB_UNRECLAIMABLE_B: + VM_BUG_ON(i < 1); + VM_BUG_ON(memory_stats[i - 1].idx != NR_SLAB_RECLAIMABLE_B); + break; + default: + break; + } + VM_BUG_ON(!memory_stats[i].ratio); VM_BUG_ON(memory_stats[i].idx >= MEMCG_NR_STAT); } @@ -1586,9 +1596,15 @@ static char *memory_stat_format(struct mem_cgroup *memcg) size *= memory_stats[i].ratio; seq_buf_printf(&s, "%s %llu\n", memory_stats[i].name, size); + /* + * We are printing reclaimable, unreclaimable of the slab + * and the sum of both. + */ if (unlikely(memory_stats[i].idx == NR_SLAB_UNRECLAIMABLE_B)) { - size = memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) + - memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B); + int idx = i - 1; + + size += memcg_page_state(memcg, memory_stats[idx].idx) * + memory_stats[idx].ratio; seq_buf_printf(&s, "slab %llu\n", size); } }