[RESEND,v2,00/12] Convert all vmstat counters to pages or bytes

Message ID 20201206101451.14706-1-songmuchun@bytedance.com (mailing list archive)

Message

Muchun Song Dec. 6, 2020, 10:14 a.m. UTC
Hi,

This patch series aims to convert all THP vmstat counters to pages
and some KiB vmstat counters to bytes.

Some vmstat counters are in units of pages, some in bytes, some in
HPAGE_PMD_NR, and some in KiB. When we want to expose these counters
to userspace, we have to know which unit each counter uses, which
makes the code complex. With so many choices, the probability of
making a mistake is greater.
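
As an illustration of the problem described above (not code from this
series), a reader that has to handle four units ends up with a conversion
like the following minimal sketch. The ex_ names and both constants are
assumptions made for the example; the real page size and HPAGE_PMD_NR
depend on the architecture and configuration.

```c
#include <assert.h>

/* Assumed values for illustration only. */
#define EX_PAGE_SIZE    4096UL	/* assumed base page size in bytes */
#define EX_HPAGE_PMD_NR 512UL	/* assumed base pages per PMD-sized THP */

/* The four units the cover letter says are currently in use. */
enum ex_unit {
	EX_UNIT_PAGES,
	EX_UNIT_BYTES,
	EX_UNIT_HPAGE_PMD_NR,
	EX_UNIT_KIB,
};

/*
 * Convert a raw counter value to bytes. Every reader needs a switch
 * like this for as long as four units coexist.
 */
static unsigned long ex_counter_to_bytes(unsigned long raw, enum ex_unit unit)
{
	switch (unit) {
	case EX_UNIT_PAGES:
		return raw * EX_PAGE_SIZE;
	case EX_UNIT_BYTES:
		return raw;
	case EX_UNIT_HPAGE_PMD_NR:
		return raw * EX_HPAGE_PMD_NR * EX_PAGE_SIZE;
	case EX_UNIT_KIB:
		return raw * 1024UL;
	}
	assert(!"unknown unit");
	return 0;
}
```

With only pages and bytes left, the switch in this sketch would shrink
to two cases.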

For example, here are two related bug fixes:
  - 7de2e9f195b9 ("mm: memcontrol: correct the NR_ANON_THPS counter of hierarchical memcg")
  - not yet committed (the first commit in this series) ("mm: memcontrol: fix NR_ANON_THPS account")

This patch series simplifies the code (161 insertions(+), 187 deletions(-))
and makes the unit of every vmstat counter either pages or bytes. Fewer
choices mean a lower probability of making mistakes :).

This was inspired by Johannes and Roman. Thanks to them.

Changes in v1 -> v2:
  - Change the series subject from "Convert all THP vmstat counters to pages"
    to "Convert all vmstat counters to pages or bytes".
  - Convert NR_KERNEL_SCS_KB account to bytes.
  - Convert vmstat slab counters to bytes.
  - Remove {global_}node_page_state_pages.

Muchun Song (12):
  mm: memcontrol: fix NR_ANON_THPS account
  mm: memcontrol: convert NR_ANON_THPS account to pages
  mm: memcontrol: convert NR_FILE_THPS account to pages
  mm: memcontrol: convert NR_SHMEM_THPS account to pages
  mm: memcontrol: convert NR_SHMEM_PMDMAPPED account to pages
  mm: memcontrol: convert NR_FILE_PMDMAPPED account to pages
  mm: memcontrol: convert kernel stack account to bytes
  mm: memcontrol: convert NR_KERNEL_SCS_KB account to bytes
  mm: memcontrol: convert vmstat slab counters to bytes
  mm: memcontrol: scale stat_threshold for byted-sized vmstat
  mm: memcontrol: make the slab calculation consistent
  mm: memcontrol: remove {global_}node_page_state_pages

 drivers/base/node.c     |  25 ++++-----
 fs/proc/meminfo.c       |  22 ++++----
 include/linux/mmzone.h  |  21 +++-----
 include/linux/vmstat.h  |  21 ++------
 kernel/fork.c           |   8 +--
 kernel/power/snapshot.c |   2 +-
 kernel/scs.c            |   4 +-
 mm/filemap.c            |   4 +-
 mm/huge_memory.c        |   9 ++--
 mm/khugepaged.c         |   4 +-
 mm/memcontrol.c         | 131 ++++++++++++++++++++++++------------------------
 mm/oom_kill.c           |   2 +-
 mm/page_alloc.c         |  17 +++----
 mm/rmap.c               |  19 ++++---
 mm/shmem.c              |   3 +-
 mm/vmscan.c             |   2 +-
 mm/vmstat.c             |  54 ++++++++------------
 17 files changed, 161 insertions(+), 187 deletions(-)

Comments

Michal Hocko Dec. 7, 2020, 1 p.m. UTC | #1
On Sun 06-12-20 18:14:39, Muchun Song wrote:
> [...]

It would be really great if you could summarize the current state and
the state after the patches so that the exceptions are clear and easier
to review. The existing situation is rather convoluted, but at least we
have the units as part of the name, so it is not too hard to notice
that. Reducing exceptions sounds nice, but I am not really sure it is
such an improvement that it is worth a lot of code churn, especially
when it comes to KB vs B. Counting THPs as regular pages sounds like a
good plan to me because we can expect that THPs will be of a different
size in the future - especially for file THPs.
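
A minimal sketch of what counting THPs in base pages buys, assuming a
simplified counter and hypothetical ex_ helpers that are not kernel
APIs: the update path passes the number of base pages, so huge pages of
different orders can share one counter and readers never need to know
the THP size.

```c
#include <assert.h>

/* Hypothetical, simplified counter; not the kernel's data structure. */
struct ex_thp_stat {
	unsigned long nr_pages;	/* accounted in base pages */
};

/* Account one huge page made of 2^order base pages. */
static void ex_account_thp(struct ex_thp_stat *stat, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	stat->nr_pages += 1UL << order;
}

/* Undo the accounting for one huge page of the same order. */
static void ex_unaccount_thp(struct ex_thp_stat *stat, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));
	assert(stat->nr_pages >= (1UL << order));
	stat->nr_pages -= 1UL << order;
}
```

In this sketch an order-9 and an order-4 huge page can be mixed freely;
a counter kept in units of HPAGE_PMD_NR could not represent the second.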

Muchun Song Dec. 7, 2020, 2:52 p.m. UTC | #2
On Mon, Dec 7, 2020 at 9:00 PM Michal Hocko <mhocko@suse.com> wrote:
> It would be really great if you could summarize the current and after
> the patch state so that exceptions are clear and easier to review. The

Agree. Will do in the next version. Thanks.


> existing situation is rather convoluted but we have at least units part
> of the name so it is not too hard to notice that. Reducing exceptions
> sounds nice but I am not really sure it is such an improvement it is
> worth a lot of code churn. Especially when it comes to KB vs B. Counting

There are two vmstat counters (NR_KERNEL_STACK_KB and
NR_KERNEL_SCS_KB) whose units are KB. If we do this, all
vmstat counter units are either pages or bytes in the end, which
makes it easy to expose those counters to userspace. You can
refer to:

    [RESEND PATCH v2 11/12] mm: memcontrol: make the slab calculation consistent

From this point of view, I think that it is worth doing this. Right?

> THPs as regular pages sounds like a good plan to me because we can
> expect that THP will be of a different size in the future - especially
> for file THPs.

It can be easy to convert.



--
Yours,
Muchun
Michal Hocko Dec. 7, 2020, 3:02 p.m. UTC | #3
On Mon 07-12-20 22:52:30, Muchun Song wrote:
> There are two vmstat counters (NR_KERNEL_STACK_KB and
> NR_KERNEL_SCS_KB) whose units are KB. If we do this, all
> vmstat counter units are either pages or bytes in the end. When
> we expose those counters to userspace, it can be easy. You can
> reference to:
> 
>     [RESEND PATCH v2 11/12] mm: memcontrol: make the slab calculation consistent
> 
> From this point of view, I think that it is worth doing this. Right?

Well, unless I am missing something, we have two counters in bytes, two
in kB, both clearly distinguishable by the B/KB suffix. Changing KB to B
will certainly reduce the different classes of units, no question about
that, but I am not really sure this is worth all the code churn. Maybe
others will think otherwise.

As I've said the THP accounting change makes more sense to me because it
allows future changes which are already undergoing so there is more
merit in those.
Randy Dunlap Dec. 7, 2020, 6:51 p.m. UTC | #4
On 12/7/20 7:02 AM, Michal Hocko wrote:
> Well, unless I am missing something, we have two counters in bytes, two
> in kB, both clearly distinguishable by the B/KB suffix. Changing KB to B
> will certainly reduce the different classes of units, no question about
> that, but I am not really sure this is worth all the code churn. Maybe
> others will think otherwise.
> 
> As I've said the THP accounting change makes more sense to me because it
> allows future changes which are already undergoing so there is more
> merit in those.
> 

Hi,

Are there any documentation changes that go with these patches?
Or are none needed?

If the patches change the output in /proc/* or /sys/* then I expect
there would need to be some doc changes.

And is there any chance of confusing userspace s/w (binary or scripts)
with these changes?

thanks.
Roman Gushchin Dec. 7, 2020, 7:51 p.m. UTC | #5
On Mon, Dec 07, 2020 at 04:02:54PM +0100, Michal Hocko wrote:
> Well, unless I am missing something, we have two counters in bytes, two
> in kB, both clearly distinguishable by the B/KB suffix. Changing KB to B
> will certainly reduce the different classes of units, no question about
> that, but I am not really sure this is worth all the code churn. Maybe
> others will think otherwise.

Even if it was me who suggested it, I do agree. It's nice to have a smaller number
of units, but if it creates a lot of hassle, then it doesn't make much sense.
I think we need to look at the final version of the patches and decide whether
it is worth it or not.

> 
> As I've said the THP accounting change makes more sense to me because it
> allows future changes which are already undergoing so there is more
> merit in those.

+1
And this part is absolutely trivial.
Hugh Dickins Dec. 7, 2020, 8:33 p.m. UTC | #6
On Mon, 7 Dec 2020, Roman Gushchin wrote:
> On Mon, Dec 07, 2020 at 04:02:54PM +0100, Michal Hocko wrote:
> > 
> > As I've said the THP accounting change makes more sense to me because it
> > allows future changes which are already undergoing so there is more
> > merit in those.
> 
> +1
> And this part is absolutely trivial.

It does need to be recognized that, with these changes, every THP stats
update overflows the per-cpu counter, resorting to atomic global updates.
And I'd like to see that mentioned in the commit message.

But this change is consistent with 4.7's 8f182270dfec ("mm/swap.c: flush
lru pvecs on compound page arrival"): we accepted greater overhead for
greater accuracy back then, so I think it's okay to do so for THP stats.
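
To make the overhead concrete, here is a simplified model of a per-cpu
counter with a spill threshold. It is a sketch, not the code in
mm/vmstat.c: the ex_ names are hypothetical, and the threshold of 125
and the 512 base pages per THP are assumed values for illustration.

```c
#include <assert.h>

/* Assumed per-cpu threshold; the kernel computes its own at runtime. */
#define EX_STAT_THRESHOLD 125L

struct ex_counter {
	long global;			/* stands in for the shared atomic counter */
	long cpu_diff;			/* stands in for one CPU's local delta */
	unsigned long global_updates;	/* how often the shared counter was touched */
};

/* Apply a delta locally; spill into the shared counter past the threshold. */
static void ex_mod_state(struct ex_counter *c, long delta)
{
	long x = c->cpu_diff + delta;

	assert(c != 0);
	if (x > EX_STAT_THRESHOLD || x < -EX_STAT_THRESHOLD) {
		c->global += x;		/* fold the local delta into the shared counter */
		c->global_updates++;
		x = 0;
	}
	c->cpu_diff = x;
}
```

With these assumed values, 100 updates of +1 (one THP each) never touch
the shared counter in this model, while 100 updates of +512 (the same
THPs counted in pages) touch it 100 times.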

Hugh
Muchun Song Dec. 8, 2020, 2:29 a.m. UTC | #7
On Tue, Dec 8, 2020 at 2:51 AM Randy Dunlap <rdunlap@infradead.org> wrote:
> Hi,
>
> Are there any documentation changes that go with these patches?
> Or are none needed?
>
> If the patches change the output in /proc/* or /sys/* then I expect
> there would need to be some doc changes.

Oh, we do not change the output. It is transparent to userspace.
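
A minimal sketch of why the change can stay invisible to userspace,
assuming a hypothetical line format and ex_ helpers (this is not the
kernel's output code): if a counter moves from KiB to bytes internally
and the reader divides by 1024 before printing, the text seen by
userspace is identical, provided the byte count is a multiple of 1024.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical line format for illustration only. */
#define EX_FMT "%-16s%8lu kB"

/* Print a line from a counter that is kept in KiB. */
static int ex_show_from_kib(char *buf, size_t len, const char *name,
			    unsigned long kib)
{
	assert(buf && len > 0);
	return snprintf(buf, len, EX_FMT, name, kib);
}

/* Print the same line from a counter kept in bytes; the reader converts. */
static int ex_show_from_bytes(char *buf, size_t len, const char *name,
			      unsigned long bytes)
{
	assert(buf && len > 0);
	return snprintf(buf, len, EX_FMT, name, bytes / 1024UL);
}

/* Return 1 if both accounting schemes produce the same text. */
static int ex_output_unchanged(const char *name, unsigned long kib)
{
	char before[64], after[64];

	ex_show_from_kib(before, sizeof(before), name, kib);
	ex_show_from_bytes(after, sizeof(after), name, kib * 1024UL);
	return strcmp(before, after) == 0;
}
```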

Thanks.

>
> And is there any chance of confusing userspace s/w (binary or scripts)
> with these changes?
>
> thanks.
> --
> ~Randy
>
Muchun Song Dec. 8, 2020, 2:40 a.m. UTC | #8
On Mon, Dec 7, 2020 at 11:02 PM Michal Hocko <mhocko@suse.com> wrote:
> Well, unless I am missing something, we have two counters in bytes, two
> in kB, both clearly distinguishable by the B/KB suffix. Changing KB to B
> will certainly reduce the different classes of units, no question about
> that, but I am not really sure this is worth all the code churn. Maybe
> others will think otherwise.
>
> As I've said the THP accounting change makes more sense to me because it
> allows future changes which are already undergoing so there is more
> merit in those.

OK, I will drop the KB-to-B conversion. Thanks.

> --
> Michal Hocko
> SUSE Labs
Muchun Song Dec. 8, 2020, 2:42 a.m. UTC | #9
On Tue, Dec 8, 2020 at 4:33 AM Hugh Dickins <hughd@google.com> wrote:
>
> On Mon, 7 Dec 2020, Roman Gushchin wrote:
> > On Mon, Dec 07, 2020 at 04:02:54PM +0100, Michal Hocko wrote:
> > >
> > > As I've said the THP accounting change makes more sense to me because it
> > > allows future changes which are already undergoing so there is more
> > > merit in those.
> >
> > +1
> > And this part is absolutely trivial.
>
> It does need to be recognized that, with these changes, every THP stats
> update overflows the per-cpu counter, resorting to atomic global updates.
> And I'd like to see that mentioned in the commit message.

Thanks for reminding me. Will add.

>
> But this change is consistent with 4.7's 8f182270dfec ("mm/swap.c: flush
> lru pvecs on compound page arrival"): we accepted greater overhead for
> greater accuracy back then, so I think it's okay to do so for THP stats.

Agree. Thanks.

>
> Hugh