[v3,03/19] mm: memcg: convert vmstat slab counters to bytes

Message ID: 20200422204708.2176080-4-guro@fb.com (mailing list archive)
State: New, archived
Series: The new cgroup slab memory controller

Commit Message

Roman Gushchin April 22, 2020, 8:46 p.m. UTC
In order to prepare for per-object slab memory accounting, convert
NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat items to bytes.

To make this obvious, rename them to NR_SLAB_RECLAIMABLE_B and
NR_SLAB_UNRECLAIMABLE_B (similar to NR_KERNEL_STACK_KB).

Internally, global and per-node counters are stored in pages, whereas
memcg and lruvec counters are stored in bytes. This scheme may look
odd, but only temporarily: once slab pages can be shared between
multiple cgroups, global and node counters will reflect the total
number of slab pages, while memcg and lruvec counters will be used
for per-memcg slab memory tracking, which accounts individual kernel
objects. Keeping global and node counters in pages helps to avoid
additional overhead.

The size of slab memory shouldn't exceed 4 GB on 32-bit machines, so
it fits into the atomic_long_t we use for vmstats.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 drivers/base/node.c     |  4 ++--
 fs/proc/meminfo.c       |  4 ++--
 include/linux/mmzone.h  | 16 +++++++++++++---
 kernel/power/snapshot.c |  2 +-
 mm/memcontrol.c         | 11 ++++-------
 mm/oom_kill.c           |  2 +-
 mm/page_alloc.c         |  8 ++++----
 mm/slab.h               | 15 ++++++++-------
 mm/slab_common.c        |  4 ++--
 mm/slob.c               | 12 ++++++------
 mm/slub.c               |  8 ++++----
 mm/vmscan.c             |  3 ++-
 mm/workingset.c         |  6 ++++--
 13 files changed, 53 insertions(+), 42 deletions(-)
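
[ Editorial note, not part of the patch: with this change, callers pass
  byte-sized deltas for the *_B items (e.g. PAGE_SIZE << order instead of
  1 << order, as in the hunks below), while the global and per-node counters
  keep page granularity internally. The generic write-side helpers, updated
  earlier in the series, therefore have to fold page-aligned byte deltas back
  into pages. A minimal sketch of that idea, with a hypothetical helper name
  and body rather than the actual kernel code:

        static inline long node_stat_delta(enum node_stat_item item, long delta)
        {
                if (vmstat_item_in_bytes(item)) {
                        /* slab pages are charged/uncharged whole at this level */
                        VM_WARN_ON_ONCE(delta & (PAGE_SIZE - 1));
                        delta >>= PAGE_SHIFT;   /* bytes -> pages */
                }
                return delta;
        }
]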

Comments

Johannes Weiner May 7, 2020, 8:41 p.m. UTC | #1
On Wed, Apr 22, 2020 at 01:46:52PM -0700, Roman Gushchin wrote:
> In order to prepare for per-object slab memory accounting, convert
> NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat items to bytes.
> 
> To make this obvious, rename them to NR_SLAB_RECLAIMABLE_B and
> NR_SLAB_UNRECLAIMABLE_B (similar to NR_KERNEL_STACK_KB).
> 
> Internally, global and per-node counters are stored in pages, whereas
> memcg and lruvec counters are stored in bytes. This scheme may look
> odd, but only temporarily: once slab pages can be shared between
> multiple cgroups, global and node counters will reflect the total
> number of slab pages, while memcg and lruvec counters will be used
> for per-memcg slab memory tracking, which accounts individual kernel
> objects. Keeping global and node counters in pages helps to avoid
> additional overhead.
> 
> The size of slab memory shouldn't exceed 4 GB on 32-bit machines, so
> it fits into the atomic_long_t we use for vmstats.
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks for splitting this out, it makes both this and the previous
patch easier to read.
Vlastimil Babka May 20, 2020, 12:25 p.m. UTC | #2
On 4/22/20 10:46 PM, Roman Gushchin wrote:
> In order to prepare for per-object slab memory accounting, convert
> NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat items to bytes.
> 
> To make this obvious, rename them to NR_SLAB_RECLAIMABLE_B and
> NR_SLAB_UNRECLAIMABLE_B (similar to NR_KERNEL_STACK_KB).
> 
> Internally, global and per-node counters are stored in pages, whereas
> memcg and lruvec counters are stored in bytes. This scheme may look
> odd, but only temporarily: once slab pages can be shared between
> multiple cgroups, global and node counters will reflect the total
> number of slab pages, while memcg and lruvec counters will be used
> for per-memcg slab memory tracking, which accounts individual kernel
> objects. Keeping global and node counters in pages helps to avoid
> additional overhead.
> 
> The size of slab memory shouldn't exceed 4 GB on 32-bit machines, so
> it fits into the atomic_long_t we use for vmstats.
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> ---
>  drivers/base/node.c     |  4 ++--
>  fs/proc/meminfo.c       |  4 ++--
>  include/linux/mmzone.h  | 16 +++++++++++++---
>  kernel/power/snapshot.c |  2 +-
>  mm/memcontrol.c         | 11 ++++-------
>  mm/oom_kill.c           |  2 +-
>  mm/page_alloc.c         |  8 ++++----
>  mm/slab.h               | 15 ++++++++-------
>  mm/slab_common.c        |  4 ++--
>  mm/slob.c               | 12 ++++++------
>  mm/slub.c               |  8 ++++----
>  mm/vmscan.c             |  3 ++-
>  mm/workingset.c         |  6 ++++--
>  13 files changed, 53 insertions(+), 42 deletions(-)


> @@ -206,7 +206,17 @@ enum node_stat_item {
>  
>  static __always_inline bool vmstat_item_in_bytes(enum node_stat_item item)
>  {
> -	return false;
> +	/*
> +	 * Global and per-node slab counters track slab pages.
> +	 * It's expected that changes are multiples of PAGE_SIZE.
> +	 * Internally values are stored in pages.
> +	 *
> +	 * Per-memcg and per-lruvec counters track memory, consumed
> +	 * by individual slab objects. These counters are actually
> +	 * byte-precise.
> +	 */
> +	return (item == NR_SLAB_RECLAIMABLE_B ||
> +		item == NR_SLAB_UNRECLAIMABLE_B);
>  }

Ok, so this is no longer a no-op, but __always_inline here and inline in
global_node_page_state() should hopefully mean that for all users of
global_node_page_state(<constant>) the compiler will eliminate the branch for
non-slab counters. But there are also functions such as si_mem_available() that
use a non-constant item. Maybe the compiler is smart enough anyway, but perhaps
it's better to use global_node_page_state_pages() in such callers?

However, __mod_node_page_state() and mod_node_state() will now always branch. I
wonder if the "API clean" goal is worth it...
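
[ Editorial note: to make the constant-vs-variable point above concrete, here
  is a hedged illustration with a hypothetical helper (not the patch code):

        static inline unsigned long node_stat_read(enum node_stat_item item)
        {
                unsigned long x = atomic_long_read(&vm_node_stat[item]);

                /*
                 * With a compile-time-constant item, __always_inline lets the
                 * compiler evaluate vmstat_item_in_bytes() and drop this branch
                 * entirely; with a variable item, such as NR_LRU_BASE + lru in
                 * si_mem_available(), the test stays in the generated code.
                 */
                if (vmstat_item_in_bytes(item))
                        x <<= PAGE_SHIFT;       /* counters are stored in pages */

                return x;
        }
]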

> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1409,9 +1409,8 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
>  		       (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) *
>  		       1024);
>  	seq_buf_printf(&s, "slab %llu\n",
> -		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) +
> -			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE)) *
> -		       PAGE_SIZE);
> +		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
> +			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B)));
>  	seq_buf_printf(&s, "sock %llu\n",
>  		       (u64)memcg_page_state(memcg, MEMCG_SOCK) *
>  		       PAGE_SIZE);
> @@ -1445,11 +1444,9 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
>  			       PAGE_SIZE);
>  
>  	seq_buf_printf(&s, "slab_reclaimable %llu\n",
> -		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) *
> -		       PAGE_SIZE);
> +		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B));
>  	seq_buf_printf(&s, "slab_unreclaimable %llu\n",
> -		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE) *
> -		       PAGE_SIZE);
> +		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B));

So here we are now printing in bytes instead of pages, right? That's fine for
OOM report, but in sysfs aren't we breaking existing users?
Roman Gushchin May 20, 2020, 7:26 p.m. UTC | #3
On Wed, May 20, 2020 at 02:25:22PM +0200, Vlastimil Babka wrote:
> On 4/22/20 10:46 PM, Roman Gushchin wrote:
> > In order to prepare for per-object slab memory accounting, convert
> > NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE vmstat items to bytes.
> > 
> > To make this obvious, rename them to NR_SLAB_RECLAIMABLE_B and
> > NR_SLAB_UNRECLAIMABLE_B (similar to NR_KERNEL_STACK_KB).
> > 
> > Internally, global and per-node counters are stored in pages, whereas
> > memcg and lruvec counters are stored in bytes. This scheme may look
> > odd, but only temporarily: once slab pages can be shared between
> > multiple cgroups, global and node counters will reflect the total
> > number of slab pages, while memcg and lruvec counters will be used
> > for per-memcg slab memory tracking, which accounts individual kernel
> > objects. Keeping global and node counters in pages helps to avoid
> > additional overhead.
> > 
> > The size of slab memory shouldn't exceed 4 GB on 32-bit machines, so
> > it fits into the atomic_long_t we use for vmstats.
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > ---
> >  drivers/base/node.c     |  4 ++--
> >  fs/proc/meminfo.c       |  4 ++--
> >  include/linux/mmzone.h  | 16 +++++++++++++---
> >  kernel/power/snapshot.c |  2 +-
> >  mm/memcontrol.c         | 11 ++++-------
> >  mm/oom_kill.c           |  2 +-
> >  mm/page_alloc.c         |  8 ++++----
> >  mm/slab.h               | 15 ++++++++-------
> >  mm/slab_common.c        |  4 ++--
> >  mm/slob.c               | 12 ++++++------
> >  mm/slub.c               |  8 ++++----
> >  mm/vmscan.c             |  3 ++-
> >  mm/workingset.c         |  6 ++++--
> >  13 files changed, 53 insertions(+), 42 deletions(-)
> 
> 
> > @@ -206,7 +206,17 @@ enum node_stat_item {
> >  
> >  static __always_inline bool vmstat_item_in_bytes(enum node_stat_item item)
> >  {
> > -	return false;
> > +	/*
> > +	 * Global and per-node slab counters track slab pages.
> > +	 * It's expected that changes are multiples of PAGE_SIZE.
> > +	 * Internally values are stored in pages.
> > +	 *
> > +	 * Per-memcg and per-lruvec counters track memory, consumed
> > +	 * by individual slab objects. These counters are actually
> > +	 * byte-precise.
> > +	 */
> > +	return (item == NR_SLAB_RECLAIMABLE_B ||
> > +		item == NR_SLAB_UNRECLAIMABLE_B);

Hello, Vlastimil!

Thank you for looking into the patchset, I appreciate it.
In the next version I'll add some comments based on your suggestions from
previous emails.

> >  }
> 
> Ok, so this is no longer a no-op, but __always_inline here and inline in
> global_node_page_state() should hopefully mean that for all users of
> global_node_page_state(<constant>) the compiler will eliminate the branch for
> non-slab counters. But there are also functions such as si_mem_available() that
> use a non-constant item. Maybe the compiler is smart enough anyway, but perhaps
> it's better to use global_node_page_state_pages() in such callers?

I'll take a look, thanks for the idea.

> 
> However, __mod_node_page_state() and mod_node_state() will now always branch. I
> wonder if the "API clean" goal is worth it...

You mean just adding a special write-side helper which performs the conversion,
and putting VM_WARN_ON_ONCE() into the generic write-side helpers?
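
[ Editorial note: read as code, with purely hypothetical names, that
  alternative would be a dedicated byte-taking helper doing the conversion,
  while the generic page-based helper merely warns if it is ever handed a
  byte-based item:

        static inline void mod_node_slab_state(struct pglist_data *pgdat,
                                               enum node_stat_item item,
                                               long bytes)
        {
                VM_WARN_ON_ONCE(bytes & (PAGE_SIZE - 1));
                mod_node_page_state(pgdat, item, bytes >> PAGE_SHIFT);
        }

        /* ...and mod_node_page_state() itself would only gain:
         *      VM_WARN_ON_ONCE(vmstat_item_in_bytes(item));
         * continuing to work purely in pages.
         */
]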

> 
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1409,9 +1409,8 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
> >  		       (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) *
> >  		       1024);
> >  	seq_buf_printf(&s, "slab %llu\n",
> > -		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) +
> > -			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE)) *
> > -		       PAGE_SIZE);
> > +		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
> > +			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B)));
> >  	seq_buf_printf(&s, "sock %llu\n",
> >  		       (u64)memcg_page_state(memcg, MEMCG_SOCK) *
> >  		       PAGE_SIZE);
> > @@ -1445,11 +1444,9 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
> >  			       PAGE_SIZE);
> >  
> >  	seq_buf_printf(&s, "slab_reclaimable %llu\n",
> > -		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) *
> > -		       PAGE_SIZE);
> > +		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B));
> >  	seq_buf_printf(&s, "slab_unreclaimable %llu\n",
> > -		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE) *
> > -		       PAGE_SIZE);
> > +		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B));
> 
> So here we are now printing in bytes instead of pages, right? That's fine for
> OOM report, but in sysfs aren't we breaking existing users?
> 

Hm, but it was in bytes previously, look at that x * PAGE_SIZE.
Or do you mean that now it might not be rounded to PAGE_SIZE?
If so, I don't think it breaks any expectations.

Thanks!
Vlastimil Babka May 21, 2020, 9:57 a.m. UTC | #4
On 5/20/20 9:26 PM, Roman Gushchin wrote:
> On Wed, May 20, 2020 at 02:25:22PM +0200, Vlastimil Babka wrote:
>> 
>> However, __mod_node_page_state() and mod_node_state() will now always branch. I
>> wonder if the "API clean" goal is worth it...
> 
> You mean just adding a special write-side helper which performs the conversion,
> and putting VM_WARN_ON_ONCE() into the generic write-side helpers?

What I mean is that maybe node/global helpers should assume page granularity,
and lruvec/memcg helpers do the check whether they should convert from bytes to
pages when calling node/global helpers. Then there would be no extra branches in
node/global helpers. But maybe it's not worth saving those branches, dunno.
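
[ Editorial note: a hedged sketch of that layout, with illustrative details.
  The lruvec/memcg layer keeps byte precision for its own counters and
  converts to pages only when forwarding the delta to the page-granular node
  counter, so the node-level helpers need no extra branch:

        void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
                                int val)
        {
                long delta = val;

                /* at this point in the series, slab deltas are whole pages */
                if (vmstat_item_in_bytes(idx))
                        delta >>= PAGE_SHIFT;

                /* node counter stays page-granular, no check inside it */
                __mod_node_page_state(lruvec_pgdat(lruvec), idx, delta);

                /* ...the byte-precise memcg/lruvec counters keep using 'val'... */
        }
]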

>> 
>> > --- a/mm/memcontrol.c
>> > +++ b/mm/memcontrol.c
>> > @@ -1409,9 +1409,8 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
>> >  		       (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) *
>> >  		       1024);
>> >  	seq_buf_printf(&s, "slab %llu\n",
>> > -		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) +
>> > -			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE)) *
>> > -		       PAGE_SIZE);
>> > +		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
>> > +			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B)));
>> >  	seq_buf_printf(&s, "sock %llu\n",
>> >  		       (u64)memcg_page_state(memcg, MEMCG_SOCK) *
>> >  		       PAGE_SIZE);
>> > @@ -1445,11 +1444,9 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
>> >  			       PAGE_SIZE);
>> >  
>> >  	seq_buf_printf(&s, "slab_reclaimable %llu\n",
>> > -		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) *
>> > -		       PAGE_SIZE);
>> > +		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B));
>> >  	seq_buf_printf(&s, "slab_unreclaimable %llu\n",
>> > -		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE) *
>> > -		       PAGE_SIZE);
>> > +		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B));
>> 
>> So here we are now printing in bytes instead of pages, right? That's fine for
>> OOM report, but in sysfs aren't we breaking existing users?
>> 
> 
> Hm, but it was in bytes previously, look at that x * PAGE_SIZE.

Yeah, that's what I managed to overlook, sorry.

> Or do you mean that now it might not be rounded to PAGE_SIZE?
> If so, I don't think it breaks any expectations.
> 
> Thanks!
>
Roman Gushchin May 21, 2020, 9:14 p.m. UTC | #5
On Thu, May 21, 2020 at 11:57:12AM +0200, Vlastimil Babka wrote:
> On 5/20/20 9:26 PM, Roman Gushchin wrote:
> > On Wed, May 20, 2020 at 02:25:22PM +0200, Vlastimil Babka wrote:
> >> 
> >> However, __mod_node_page_state() and mod_node_state() will now always branch. I
> >> wonder if the "API clean" goal is worth it...
> > 
> > You mean just adding a special write-side helper which performs the conversion,
> > and putting VM_WARN_ON_ONCE() into the generic write-side helpers?
> 
> What I mean is that maybe node/global helpers should assume page granularity,
> and lruvec/memcg helpers do the check whether they should convert from bytes to
> pages when calling node/global helpers. Then there would be no extra branches in
> node/global helpers. But maybe it's not worth saving those branches, dunno.

The problem is with helpers like mod_lruvec_state(), which modify both the
global and the memcg-level counters. Also, memcg and global counters share
indexes, so it would be confusing to have NR_SLAB_RECLAIMABLE in bytes on one
level and in pages on the other.
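
[ Editorial note: a hedged sketch of the call pattern being referred to,
  reconstructed rather than copied from the tree (lruvec_memcg() is assumed
  to return the owning memcg). The same index and the same delta feed both
  layers, so if the node counter wanted pages and the memcg counter wanted
  bytes, a conversion would still have to live here or in every caller:

        void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
                                int val)
        {
                /* node-level counter */
                __mod_node_page_state(lruvec_pgdat(lruvec), idx, val);

                /* memcg-level counter: same idx, same val */
                __mod_memcg_state(lruvec_memcg(lruvec), idx, val);
        }
]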

So, I don't know, maybe there is a better way of organizing these counters in a
less complicated manner, but I have no ideas at the moment. If you do, I'd
appreciate it.

Thanks!

Patch

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 9d6afb7d2ccd..b3d13fa715ad 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -368,8 +368,8 @@  static ssize_t node_read_meminfo(struct device *dev,
 	unsigned long sreclaimable, sunreclaimable;
 
 	si_meminfo_node(&i, nid);
-	sreclaimable = node_page_state(pgdat, NR_SLAB_RECLAIMABLE);
-	sunreclaimable = node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE);
+	sreclaimable = node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B);
+	sunreclaimable = node_page_state_pages(pgdat, NR_SLAB_UNRECLAIMABLE_B);
 	n = sprintf(buf,
 		       "Node %d MemTotal:       %8lu kB\n"
 		       "Node %d MemFree:        %8lu kB\n"
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 8c1f1bb1a5ce..0811e4100084 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -53,8 +53,8 @@  static int meminfo_proc_show(struct seq_file *m, void *v)
 		pages[lru] = global_node_page_state(NR_LRU_BASE + lru);
 
 	available = si_mem_available();
-	sreclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE);
-	sunreclaim = global_node_page_state(NR_SLAB_UNRECLAIMABLE);
+	sreclaimable = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B);
+	sunreclaim = global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B);
 
 	show_val_kb(m, "MemTotal:       ", i.totalram);
 	show_val_kb(m, "MemFree:        ", i.freeram);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 22fe65edf425..1c68c482df6f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -171,8 +171,8 @@  enum node_stat_item {
 	NR_INACTIVE_FILE,	/*  "     "     "   "       "         */
 	NR_ACTIVE_FILE,		/*  "     "     "   "       "         */
 	NR_UNEVICTABLE,		/*  "     "     "   "       "         */
-	NR_SLAB_RECLAIMABLE,
-	NR_SLAB_UNRECLAIMABLE,
+	NR_SLAB_RECLAIMABLE_B,
+	NR_SLAB_UNRECLAIMABLE_B,
 	NR_ISOLATED_ANON,	/* Temporary isolated pages from anon lru */
 	NR_ISOLATED_FILE,	/* Temporary isolated pages from file lru */
 	WORKINGSET_NODES,
@@ -206,7 +206,17 @@  enum node_stat_item {
 
 static __always_inline bool vmstat_item_in_bytes(enum node_stat_item item)
 {
-	return false;
+	/*
+	 * Global and per-node slab counters track slab pages.
+	 * It's expected that changes are multiples of PAGE_SIZE.
+	 * Internally values are stored in pages.
+	 *
+	 * Per-memcg and per-lruvec counters track memory, consumed
+	 * by individual slab objects. These counters are actually
+	 * byte-precise.
+	 */
+	return (item == NR_SLAB_RECLAIMABLE_B ||
+		item == NR_SLAB_UNRECLAIMABLE_B);
 }
 
 /*
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 659800157b17..22da1728b9cb 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1664,7 +1664,7 @@  static unsigned long minimum_image_size(unsigned long saveable)
 {
 	unsigned long size;
 
-	size = global_node_page_state(NR_SLAB_RECLAIMABLE)
+	size = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B)
 		+ global_node_page_state(NR_ACTIVE_ANON)
 		+ global_node_page_state(NR_INACTIVE_ANON)
 		+ global_node_page_state(NR_ACTIVE_FILE)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5f700fa8b78c..6cbc1f4829fc 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1409,9 +1409,8 @@  static char *memory_stat_format(struct mem_cgroup *memcg)
 		       (u64)memcg_page_state(memcg, MEMCG_KERNEL_STACK_KB) *
 		       1024);
 	seq_buf_printf(&s, "slab %llu\n",
-		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) +
-			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE)) *
-		       PAGE_SIZE);
+		       (u64)(memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B) +
+			     memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B)));
 	seq_buf_printf(&s, "sock %llu\n",
 		       (u64)memcg_page_state(memcg, MEMCG_SOCK) *
 		       PAGE_SIZE);
@@ -1445,11 +1444,9 @@  static char *memory_stat_format(struct mem_cgroup *memcg)
 			       PAGE_SIZE);
 
 	seq_buf_printf(&s, "slab_reclaimable %llu\n",
-		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE) *
-		       PAGE_SIZE);
+		       (u64)memcg_page_state(memcg, NR_SLAB_RECLAIMABLE_B));
 	seq_buf_printf(&s, "slab_unreclaimable %llu\n",
-		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE) *
-		       PAGE_SIZE);
+		       (u64)memcg_page_state(memcg, NR_SLAB_UNRECLAIMABLE_B));
 
 	/* Accumulated memory events */
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 463b3d74a64a..eb0ccb8666b0 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -184,7 +184,7 @@  static bool is_dump_unreclaim_slabs(void)
 		 global_node_page_state(NR_ISOLATED_FILE) +
 		 global_node_page_state(NR_UNEVICTABLE);
 
-	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
+	return (global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B) > nr_lru);
 }
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b48336e20bdc..a4daae53b273 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5175,8 +5175,8 @@  long si_mem_available(void)
 	 * items that are in use, and cannot be freed. Cap this estimate at the
 	 * low watermark.
 	 */
-	reclaimable = global_node_page_state(NR_SLAB_RECLAIMABLE) +
-			global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE);
+	reclaimable = global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B) +
+		global_node_page_state(NR_KERNEL_MISC_RECLAIMABLE);
 	available += reclaimable - min(reclaimable / 2, wmark_low);
 
 	if (available < 0)
@@ -5320,8 +5320,8 @@  void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		global_node_page_state(NR_FILE_DIRTY),
 		global_node_page_state(NR_WRITEBACK),
 		global_node_page_state(NR_UNSTABLE_NFS),
-		global_node_page_state(NR_SLAB_RECLAIMABLE),
-		global_node_page_state(NR_SLAB_UNRECLAIMABLE),
+		global_node_page_state_pages(NR_SLAB_RECLAIMABLE_B),
+		global_node_page_state_pages(NR_SLAB_UNRECLAIMABLE_B),
 		global_node_page_state(NR_FILE_MAPPED),
 		global_node_page_state(NR_SHMEM),
 		global_zone_page_state(NR_PAGETABLE),
diff --git a/mm/slab.h b/mm/slab.h
index 815e4e9a94cd..633eedb6bad1 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -272,7 +272,7 @@  int __kmem_cache_alloc_bulk(struct kmem_cache *, gfp_t, size_t, void **);
 static inline int cache_vmstat_idx(struct kmem_cache *s)
 {
 	return (s->flags & SLAB_RECLAIM_ACCOUNT) ?
-		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE;
+		NR_SLAB_RECLAIMABLE_B : NR_SLAB_UNRECLAIMABLE_B;
 }
 
 #ifdef CONFIG_MEMCG_KMEM
@@ -361,7 +361,7 @@  static __always_inline int memcg_charge_slab(struct page *page,
 
 	if (unlikely(!memcg || mem_cgroup_is_root(memcg))) {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    nr_pages);
+				    nr_pages << PAGE_SHIFT);
 		percpu_ref_get_many(&s->memcg_params.refcnt, nr_pages);
 		return 0;
 	}
@@ -371,7 +371,7 @@  static __always_inline int memcg_charge_slab(struct page *page,
 		goto out;
 
 	lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
-	mod_lruvec_state(lruvec, cache_vmstat_idx(s), nr_pages);
+	mod_lruvec_state(lruvec, cache_vmstat_idx(s), nr_pages << PAGE_SHIFT);
 
 	/* transer try_charge() page references to kmem_cache */
 	percpu_ref_get_many(&s->memcg_params.refcnt, nr_pages);
@@ -396,11 +396,12 @@  static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 	memcg = READ_ONCE(s->memcg_params.memcg);
 	if (likely(!mem_cgroup_is_root(memcg))) {
 		lruvec = mem_cgroup_lruvec(memcg, page_pgdat(page));
-		mod_lruvec_state(lruvec, cache_vmstat_idx(s), -nr_pages);
+		mod_lruvec_state(lruvec, cache_vmstat_idx(s),
+				 -(nr_pages << PAGE_SHIFT));
 		memcg_kmem_uncharge(memcg, nr_pages);
 	} else {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    -nr_pages);
+				    -(nr_pages << PAGE_SHIFT));
 	}
 	rcu_read_unlock();
 
@@ -484,7 +485,7 @@  static __always_inline int charge_slab_page(struct page *page,
 {
 	if (is_root_cache(s)) {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    1 << order);
+				    PAGE_SIZE << order);
 		return 0;
 	}
 
@@ -496,7 +497,7 @@  static __always_inline void uncharge_slab_page(struct page *page, int order,
 {
 	if (is_root_cache(s)) {
 		mod_node_page_state(page_pgdat(page), cache_vmstat_idx(s),
-				    -(1 << order));
+				    -(PAGE_SIZE << order));
 		return;
 	}
 
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 9e72ba224175..b578ae29c743 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1325,8 +1325,8 @@  void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
 	page = alloc_pages(flags, order);
 	if (likely(page)) {
 		ret = page_address(page);
-		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-				    1 << order);
+		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+				    PAGE_SIZE << order);
 	}
 	ret = kasan_kmalloc_large(ret, size, flags);
 	/* As ret might get tagged, call kmemleak hook after KASAN. */
diff --git a/mm/slob.c b/mm/slob.c
index ac2aecfbc7a8..7cc9805c8091 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -202,8 +202,8 @@  static void *slob_new_pages(gfp_t gfp, int order, int node)
 	if (!page)
 		return NULL;
 
-	mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-			    1 << order);
+	mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+			    PAGE_SIZE << order);
 	return page_address(page);
 }
 
@@ -214,8 +214,8 @@  static void slob_free_pages(void *b, int order)
 	if (current->reclaim_state)
 		current->reclaim_state->reclaimed_slab += 1 << order;
 
-	mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE,
-			    -(1 << order));
+	mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
+			    -(PAGE_SIZE << order));
 	__free_pages(sp, order);
 }
 
@@ -552,8 +552,8 @@  void kfree(const void *block)
 		slob_free(m, *m + align);
 	} else {
 		unsigned int order = compound_order(sp);
-		mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE,
-				    -(1 << order));
+		mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE_B,
+				    -(PAGE_SIZE << order));
 		__free_pages(sp, order);
 
 	}
diff --git a/mm/slub.c b/mm/slub.c
index 914b7261e6b6..03071ae5ff07 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3898,8 +3898,8 @@  static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 	page = alloc_pages_node(node, flags, order);
 	if (page) {
 		ptr = page_address(page);
-		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-				    1 << order);
+		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+				    PAGE_SIZE << order);
 	}
 
 	return kmalloc_large_node_hook(ptr, size, flags);
@@ -4030,8 +4030,8 @@  void kfree(const void *x)
 
 		BUG_ON(!PageCompound(page));
 		kfree_hook(object);
-		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
-				    -(1 << order));
+		mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE_B,
+				    -(PAGE_SIZE << order));
 		__free_pages(page, order);
 		return;
 	}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4c3a760c0522..88aa6656aaca 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4226,7 +4226,8 @@  int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
 	 * unmapped file backed pages.
 	 */
 	if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
-	    node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
+	    node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B) <=
+	    pgdat->min_slab_pages)
 		return NODE_RECLAIM_FULL;
 
 	/*
diff --git a/mm/workingset.c b/mm/workingset.c
index 474186b76ced..9358c1ee5bb6 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -467,8 +467,10 @@  static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 		for (pages = 0, i = 0; i < NR_LRU_LISTS; i++)
 			pages += lruvec_page_state_local(lruvec,
 							 NR_LRU_BASE + i);
-		pages += lruvec_page_state_local(lruvec, NR_SLAB_RECLAIMABLE);
-		pages += lruvec_page_state_local(lruvec, NR_SLAB_UNRECLAIMABLE);
+		pages += lruvec_page_state_local(
+			lruvec, NR_SLAB_RECLAIMABLE_B) >> PAGE_SHIFT;
+		pages += lruvec_page_state_local(
+			lruvec, NR_SLAB_UNRECLAIMABLE_B) >> PAGE_SHIFT;
 	} else
 #endif
 		pages = node_present_pages(sc->nid);