diff mbox series

[15/18] mm: memcontrol: make swap tracking an integral part of memory control

Message ID 20200420221126.341272-16-hannes@cmpxchg.org (mailing list archive)
State New, archived
Headers show
Series mm: memcontrol: charge swapin pages on instantiation | expand

Commit Message

Johannes Weiner April 20, 2020, 10:11 p.m. UTC
Without swap page tracking, users that are otherwise memory controlled
can easily escape their containment and allocate significant amounts
of memory that they're not being charged for. That's because swap does
readahead, but without the cgroup records of who owned the page at
swapout, readahead pages don't get charged until somebody actually
faults them into their page table and we can identify an owner task.
This can be maliciously exploited with MADV_WILLNEED, which triggers
arbitrary readahead allocations without charging the pages.

Make swap swap page tracking an integral part of memcg and remove the
Kconfig options. In the first place, it was only made configurable to
allow users to save some memory. But the overhead of tracking cgroup
ownership per swap page is minimal - 2 byte per page, or 512k per 1G
of swap, or 0.04%. Saving that at the expense of broken containment
semantics is not something we should present as a coequal option.

The swapaccount=0 boot option will continue to exist, and it will
eliminate the page_counter overhead and hide the swap control files,
but it won't disable swap slot ownership tracking.

This patch makes sure we always have the cgroup records at swapin
time; the next patch will fix the actual bug by charging readahead
swap pages at swapin time rather than at fault time.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 init/Kconfig     | 17 +----------------
 mm/memcontrol.c  | 48 +++++++++++++++++-------------------------------
 mm/swap_cgroup.c |  6 ------
 3 files changed, 18 insertions(+), 53 deletions(-)

Comments

Alex Shi April 21, 2020, 9:27 a.m. UTC | #1
在 2020/4/21 上午6:11, Johannes Weiner 写道:
> The swapaccount=0 boot option will continue to exist, and it will
> eliminate the page_counter overhead and hide the swap control files,
> but it won't disable swap slot ownership tracking.

May we add extra explanation for this change to user? and the default
memsw limitations?

> 
> This patch makes sure we always have the cgroup records at swapin
> time; the next patch will fix the actual bug by charging readahead
> swap pages at swapin time rather than at fault time.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Johannes Weiner April 21, 2020, 2:39 p.m. UTC | #2
Hi Alex,

thanks for your quick review so far, I'll add the tags to the patches.

On Tue, Apr 21, 2020 at 05:27:30PM +0800, Alex Shi wrote:
> 
> 
> 在 2020/4/21 上午6:11, Johannes Weiner 写道:
> > The swapaccount=0 boot option will continue to exist, and it will
> > eliminate the page_counter overhead and hide the swap control files,
> > but it won't disable swap slot ownership tracking.
> 
> May we add extra explanation for this change to user? and the default
> memsw limitations?

Can you elaborate what you think is missing and where you would like
to see it documented?

From a semantics POV, nothing changes with this patch. The memsw limit
defaults to "max", so it doesn't exert any control per default. The
only difference is whether we maintain swap records or not.
Alex Shi April 22, 2020, 3:14 a.m. UTC | #3
在 2020/4/21 下午10:39, Johannes Weiner 写道:
> Hi Alex,
> 
> thanks for your quick review so far, I'll add the tags to the patches.
> 
> On Tue, Apr 21, 2020 at 05:27:30PM +0800, Alex Shi wrote:
>>
>>
>> 在 2020/4/21 上午6:11, Johannes Weiner 写道:
>>> The swapaccount=0 boot option will continue to exist, and it will
>>> eliminate the page_counter overhead and hide the swap control files,
>>> but it won't disable swap slot ownership tracking.
>>
>> May we add extra explanation for this change to user? and the default
>> memsw limitations?
> 
> Can you elaborate what you think is missing and where you would like
> to see it documented?
> 
Maybe the following doc change is better after whole patchset? 
Guess users would would happy to know details of this change.

Also as to the RSS account name change, I don't know if it's good to polish
them in docs.

Thanks
Alex

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 0ae4f564c2d6..1fd0878089fe 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -199,11 +199,11 @@ An RSS page is unaccounted when it's fully unmapped. A PageCache page is
 unaccounted when it's removed from radix-tree. Even if RSS pages are fully
 unmapped (by kswapd), they may exist as SwapCache in the system until they
 are really freed. Such SwapCaches are also accounted.
-A swapped-in page is not accounted until it's mapped.
+A swapped-in page is accounted after adding into swapcache.

 Note: The kernel does swapin-readahead and reads multiple swaps at once.
-This means swapped-in pages may contain pages for other tasks than a task
-causing page fault. So, we avoid accounting at swap-in I/O.
+Since page's memcg recorded into swap whatever memsw enabled, the page will
+be accounted after swapin.

 At page migration, accounting information is kept.

@@ -230,10 +230,10 @@ caller of swapoff rather than the users of shmem.
 2.4 Swap Extension (CONFIG_MEMCG_SWAP)
 --------------------------------------

-Swap Extension allows you to record charge for swap. A swapped-in page is
-charged back to original page allocator if possible.
+Swap usage is always recorded for each of cgroup. Swap Extension allows you to
+read and limit it.

-When swap is accounted, following files are added.
+When swap is limited, following files are added.

  - memory.memsw.usage_in_bytes.
  - memory.memsw.limit_in_bytes.

> From a semantics POV, nothing changes with this patch. The memsw limit
> defaults to "max", so it doesn't exert any control per default. The
> only difference is whether we maintain swap records or not.
>
Johannes Weiner April 22, 2020, 1:30 p.m. UTC | #4
On Wed, Apr 22, 2020 at 11:14:40AM +0800, Alex Shi wrote:
> 
> 
> 在 2020/4/21 下午10:39, Johannes Weiner 写道:
> > Hi Alex,
> > 
> > thanks for your quick review so far, I'll add the tags to the patches.
> > 
> > On Tue, Apr 21, 2020 at 05:27:30PM +0800, Alex Shi wrote:
> >>
> >>
> >> 在 2020/4/21 上午6:11, Johannes Weiner 写道:
> >>> The swapaccount=0 boot option will continue to exist, and it will
> >>> eliminate the page_counter overhead and hide the swap control files,
> >>> but it won't disable swap slot ownership tracking.
> >>
> >> May we add extra explanation for this change to user? and the default
> >> memsw limitations?
> > 
> > Can you elaborate what you think is missing and where you would like
> > to see it documented?
> > 
> Maybe the following doc change is better after whole patchset? 
> Guess users would would happy to know details of this change.

Thanks, I stole your patch and extended/tweaked it a little. Would you
mind providing your Signed-off-by:?

From 589d3c1b505e6671b4a9b424436c9eda88a0b08c Mon Sep 17 00:00:00 2001
From: Alex Shi <alex.shi@linux.alibaba.com>
Date: Wed, 22 Apr 2020 11:14:40 +0800
Subject: [PATCH] mm: memcontrol: document the new swap control behavior

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 .../admin-guide/cgroup-v1/memory.rst          | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 0ae4f564c2d6..12757e63b26c 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -199,11 +199,11 @@ An RSS page is unaccounted when it's fully unmapped. A PageCache page is
 unaccounted when it's removed from radix-tree. Even if RSS pages are fully
 unmapped (by kswapd), they may exist as SwapCache in the system until they
 are really freed. Such SwapCaches are also accounted.
-A swapped-in page is not accounted until it's mapped.
+A swapped-in page is accounted after adding into swapcache.
 
 Note: The kernel does swapin-readahead and reads multiple swaps at once.
-This means swapped-in pages may contain pages for other tasks than a task
-causing page fault. So, we avoid accounting at swap-in I/O.
+Since page's memcg recorded into swap whatever memsw enabled, the page will
+be accounted after swapin.
 
 At page migration, accounting information is kept.
 
@@ -222,18 +222,13 @@ the cgroup that brought it in -- this will happen on memory pressure).
 But see section 8.2: when moving a task to another cgroup, its pages may
 be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
 
-Exception: If CONFIG_MEMCG_SWAP is not used.
-When you do swapoff and make swapped-out pages of shmem(tmpfs) to
-be backed into memory in force, charges for pages are accounted against the
-caller of swapoff rather than the users of shmem.
-
-2.4 Swap Extension (CONFIG_MEMCG_SWAP)
+2.4 Swap Extension
 --------------------------------------
 
-Swap Extension allows you to record charge for swap. A swapped-in page is
-charged back to original page allocator if possible.
+Swap usage is always recorded for each of cgroup. Swap Extension allows you to
+read and limit it.
 
-When swap is accounted, following files are added.
+When CONFIG_SWAP is enabled, following files are added.
 
  - memory.memsw.usage_in_bytes.
  - memory.memsw.limit_in_bytes.
Alex Shi April 22, 2020, 1:40 p.m. UTC | #5
在 2020/4/22 下午9:30, Johannes Weiner 写道:
> On Wed, Apr 22, 2020 at 11:14:40AM +0800, Alex Shi wrote:
>>
>>
>> 在 2020/4/21 下午10:39, Johannes Weiner 写道:
>>> Hi Alex,
>>>
>>> thanks for your quick review so far, I'll add the tags to the patches.
>>>
>>> On Tue, Apr 21, 2020 at 05:27:30PM +0800, Alex Shi wrote:
>>>>
>>>>
>>>> 在 2020/4/21 上午6:11, Johannes Weiner 写道:
>>>>> The swapaccount=0 boot option will continue to exist, and it will
>>>>> eliminate the page_counter overhead and hide the swap control files,
>>>>> but it won't disable swap slot ownership tracking.
>>>>
>>>> May we add extra explanation for this change to user? and the default
>>>> memsw limitations?
>>>
>>> Can you elaborate what you think is missing and where you would like
>>> to see it documented?
>>>
>> Maybe the following doc change is better after whole patchset? 
>> Guess users would would happy to know details of this change.
> 
> Thanks, I stole your patch and extended/tweaked it a little. Would you
> mind providing your Signed-off-by:?

My pleasure. :)

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>

> 
> From 589d3c1b505e6671b4a9b424436c9eda88a0b08c Mon Sep 17 00:00:00 2001
> From: Alex Shi <alex.shi@linux.alibaba.com>
> Date: Wed, 22 Apr 2020 11:14:40 +0800
> Subject: [PATCH] mm: memcontrol: document the new swap control behavior
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  .../admin-guide/cgroup-v1/memory.rst          | 19 +++++++------------
>  1 file changed, 7 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 0ae4f564c2d6..12757e63b26c 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -199,11 +199,11 @@ An RSS page is unaccounted when it's fully unmapped. A PageCache page is
>  unaccounted when it's removed from radix-tree. Even if RSS pages are fully
>  unmapped (by kswapd), they may exist as SwapCache in the system until they
>  are really freed. Such SwapCaches are also accounted.
> -A swapped-in page is not accounted until it's mapped.
> +A swapped-in page is accounted after adding into swapcache.
>  
>  Note: The kernel does swapin-readahead and reads multiple swaps at once.
> -This means swapped-in pages may contain pages for other tasks than a task
> -causing page fault. So, we avoid accounting at swap-in I/O.
> +Since page's memcg recorded into swap whatever memsw enabled, the page will
> +be accounted after swapin.
>  
>  At page migration, accounting information is kept.
>  
> @@ -222,18 +222,13 @@ the cgroup that brought it in -- this will happen on memory pressure).
>  But see section 8.2: when moving a task to another cgroup, its pages may
>  be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
>  
> -Exception: If CONFIG_MEMCG_SWAP is not used.
> -When you do swapoff and make swapped-out pages of shmem(tmpfs) to
> -be backed into memory in force, charges for pages are accounted against the
> -caller of swapoff rather than the users of shmem.
> -
> -2.4 Swap Extension (CONFIG_MEMCG_SWAP)
> +2.4 Swap Extension
>  --------------------------------------
>  
> -Swap Extension allows you to record charge for swap. A swapped-in page is
> -charged back to original page allocator if possible.
> +Swap usage is always recorded for each of cgroup. Swap Extension allows you to
> +read and limit it.
>  
> -When swap is accounted, following files are added.
> +When CONFIG_SWAP is enabled, following files are added.
>  
>   - memory.memsw.usage_in_bytes.
>   - memory.memsw.limit_in_bytes.
>
Alex Shi April 22, 2020, 1:43 p.m. UTC | #6
在 2020/4/22 下午9:30, Johannes Weiner 写道:
>> Also as to the RSS account name change, I don't know if it's good to polish
>> them in docs.
> I didn't actually change anything user-visible, just the internal name
> of the counters:
> 
> static const unsigned int memcg1_stats[] = {
> 	NR_FILE_PAGES,		/* was MEMCG_CACHE */
> 	NR_ANON_MAPPED,		/* was MEMCG_RSS */
> 	NR_ANON_THPS,		/* was MEMCG_RSS_HUGE */
> 	NR_SHMEM,
> 	NR_FILE_MAPPED,
> 	NR_FILE_DIRTY,
> 	NR_WRITEBACK,
> 	MEMCG_SWAP,
> };
> 
> static const char *const memcg1_stat_names[] = {
> 	"cache",
> 	"rss",
> 	"rss_huge",
> 	"shmem",
> 	"mapped_file",
> 	"dirty",
> 	"writeback",
> 	"swap",
> };
> 
> Or did you refer to something else?

With about 'was MEMCG_RSS' etc. I believe curious user would know where
the concept from. :)

Thanks for this comments!
Aelx
Joonsoo Kim April 24, 2020, 12:30 a.m. UTC | #7
On Mon, Apr 20, 2020 at 06:11:23PM -0400, Johannes Weiner wrote:
> Without swap page tracking, users that are otherwise memory controlled
> can easily escape their containment and allocate significant amounts
> of memory that they're not being charged for. That's because swap does
> readahead, but without the cgroup records of who owned the page at
> swapout, readahead pages don't get charged until somebody actually
> faults them into their page table and we can identify an owner task.
> This can be maliciously exploited with MADV_WILLNEED, which triggers
> arbitrary readahead allocations without charging the pages.
> 
> Make swap swap page tracking an integral part of memcg and remove the
> Kconfig options. In the first place, it was only made configurable to
> allow users to save some memory. But the overhead of tracking cgroup
> ownership per swap page is minimal - 2 byte per page, or 512k per 1G
> of swap, or 0.04%. Saving that at the expense of broken containment
> semantics is not something we should present as a coequal option.
> 
> The swapaccount=0 boot option will continue to exist, and it will
> eliminate the page_counter overhead and hide the swap control files,
> but it won't disable swap slot ownership tracking.
> 
> This patch makes sure we always have the cgroup records at swapin
> time; the next patch will fix the actual bug by charging readahead
> swap pages at swapin time rather than at fault time.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Johannes Weiner April 24, 2020, 3:01 a.m. UTC | #8
On Mon, Apr 20, 2020 at 06:11:23PM -0400, Johannes Weiner wrote:
> @@ -6884,9 +6876,6 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
>  	VM_BUG_ON_PAGE(PageLRU(page), page);
>  	VM_BUG_ON_PAGE(page_count(page), page);
>  
> -	if (!do_memsw_account())
> -		return;
> -
>  	memcg = page->mem_cgroup;
>  
>  	/* Readahead page, never charged */

I messed up here.

mem_cgroup_swapout() must not run on cgroup2, because cgroup2 uses
mem_cgroup_try_charge_swap() instead. Both record a swap entry and
running them both will trigger a VM_BUG_ON() on an existing record.

I'm actually somewhat baffled why this didn't trigger in my
MADV_PAGEOUT -> MADV_WILLNEED swap test. memory.max driven swapout
triggered it right away.

!do_memsw_account() needs to be !cgroup_subsys_on_dfl(memory_cgrp_subsys)

> @@ -6913,7 +6902,7 @@ void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
>  	if (!mem_cgroup_is_root(memcg))
>  		page_counter_uncharge(&memcg->memory, nr_entries);
>  
> -	if (memcg != swap_memcg) {
> +	if (do_memsw_account() && memcg != swap_memcg) {
>  		if (!mem_cgroup_is_root(swap_memcg))
>  			page_counter_charge(&swap_memcg->memsw, nr_entries);
>  		page_counter_uncharge(&memcg->memsw, nr_entries);

And this can be !cgroup_memory_noswap instead. It'll do the same
thing, but will be clearer.

I'll have it fixed in version 2.
diff mbox series

Patch

diff --git a/init/Kconfig b/init/Kconfig
index 9e22ee8fbd75..39cdb13168cf 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -835,24 +835,9 @@  config MEMCG
 	  Provides control over the memory footprint of tasks in a cgroup.
 
 config MEMCG_SWAP
-	bool "Swap controller"
+	bool
 	depends on MEMCG && SWAP
-	help
-	  Provides control over the swap space consumed by tasks in a cgroup.
-
-config MEMCG_SWAP_ENABLED
-	bool "Swap controller enabled by default"
-	depends on MEMCG_SWAP
 	default y
-	help
-	  Memory Resource Controller Swap Extension comes with its price in
-	  a bigger memory consumption. General purpose distribution kernels
-	  which want to enable the feature but keep it disabled by default
-	  and let the user enable it by swapaccount=1 boot command line
-	  parameter should have this option unselected.
-	  For those who want to have the feature enabled by default should
-	  select this option (if, for some reason, they need to disable it
-	  then swapaccount=0 does the trick).
 
 config MEMCG_KMEM
 	bool
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5558777023e7..1d7408a8744a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -83,14 +83,10 @@  static bool cgroup_memory_nokmem;
 
 /* Whether the swap controller is active */
 #ifdef CONFIG_MEMCG_SWAP
-#ifdef CONFIG_MEMCG_SWAP_ENABLED
 bool cgroup_memory_noswap __read_mostly;
 #else
-bool cgroup_memory_noswap __read_mostly = 1;
-#endif /* CONFIG_MEMCG_SWAP_ENABLED */
-#else
 #define cgroup_memory_noswap		1
-#endif /* CONFIG_MEMCG_SWAP */
+#endif
 
 #ifdef CONFIG_CGROUP_WRITEBACK
 static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
@@ -5290,8 +5286,7 @@  static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
 	 * we call find_get_page() with swapper_space directly.
 	 */
 	page = find_get_page(swap_address_space(ent), swp_offset(ent));
-	if (do_memsw_account())
-		entry->val = ent.val;
+	entry->val = ent.val;
 
 	return page;
 }
@@ -5325,8 +5320,7 @@  static struct page *mc_handle_file_pte(struct vm_area_struct *vma,
 		page = find_get_entry(mapping, pgoff);
 		if (xa_is_value(page)) {
 			swp_entry_t swp = radix_to_swp_entry(page);
-			if (do_memsw_account())
-				*entry = swp;
+			*entry = swp;
 			page = find_get_page(swap_address_space(swp),
 					     swp_offset(swp));
 		}
@@ -6459,6 +6453,9 @@  int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask,
 		goto out;
 
 	if (PageSwapCache(page)) {
+		swp_entry_t ent = { .val = page_private(page), };
+		unsigned short id;
+
 		/*
 		 * Every swap fault against a single page tries to charge the
 		 * page, bail as early as possible.  shmem_unuse() encounters
@@ -6470,17 +6467,12 @@  int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask,
 		if (compound_head(page)->mem_cgroup)
 			goto out;
 
-		if (!cgroup_memory_noswap) {
-			swp_entry_t ent = { .val = page_private(page), };
-			unsigned short id;
-
-			id = lookup_swap_cgroup_id(ent);
-			rcu_read_lock();
-			memcg = mem_cgroup_from_id(id);
-			if (memcg && !css_tryget_online(&memcg->css))
-				memcg = NULL;
-			rcu_read_unlock();
-		}
+		id = lookup_swap_cgroup_id(ent);
+		rcu_read_lock();
+		memcg = mem_cgroup_from_id(id);
+		if (memcg && !css_tryget_online(&memcg->css))
+			memcg = NULL;
+		rcu_read_unlock();
 	}
 
 	if (!memcg)
@@ -6497,7 +6489,7 @@  int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask,
 	memcg_check_events(memcg, page);
 	local_irq_enable();
 
-	if (do_memsw_account() && PageSwapCache(page)) {
+	if (PageSwapCache(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
 		/*
 		 * The swap entry might not get freed for a long time,
@@ -6884,9 +6876,6 @@  void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 	VM_BUG_ON_PAGE(page_count(page), page);
 
-	if (!do_memsw_account())
-		return;
-
 	memcg = page->mem_cgroup;
 
 	/* Readahead page, never charged */
@@ -6913,7 +6902,7 @@  void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
 	if (!mem_cgroup_is_root(memcg))
 		page_counter_uncharge(&memcg->memory, nr_entries);
 
-	if (memcg != swap_memcg) {
+	if (do_memsw_account() && memcg != swap_memcg) {
 		if (!mem_cgroup_is_root(swap_memcg))
 			page_counter_charge(&swap_memcg->memsw, nr_entries);
 		page_counter_uncharge(&memcg->memsw, nr_entries);
@@ -6949,7 +6938,7 @@  int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry)
 	struct mem_cgroup *memcg;
 	unsigned short oldid;
 
-	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) || cgroup_memory_noswap)
+	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
 		return 0;
 
 	memcg = page->mem_cgroup;
@@ -6965,7 +6954,7 @@  int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry)
 
 	memcg = mem_cgroup_id_get_online(memcg);
 
-	if (!mem_cgroup_is_root(memcg) &&
+	if (!cgroup_memory_noswap && !mem_cgroup_is_root(memcg) &&
 	    !page_counter_try_charge(&memcg->swap, nr_pages, &counter)) {
 		memcg_memory_event(memcg, MEMCG_SWAP_MAX);
 		memcg_memory_event(memcg, MEMCG_SWAP_FAIL);
@@ -6993,14 +6982,11 @@  void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages)
 	struct mem_cgroup *memcg;
 	unsigned short id;
 
-	if (cgroup_memory_noswap)
-		return;
-
 	id = swap_cgroup_record(entry, 0, nr_pages);
 	rcu_read_lock();
 	memcg = mem_cgroup_from_id(id);
 	if (memcg) {
-		if (!mem_cgroup_is_root(memcg)) {
+		if (!cgroup_memory_noswap && !mem_cgroup_is_root(memcg)) {
 			if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
 				page_counter_uncharge(&memcg->swap, nr_pages);
 			else
diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c
index 7aa764f09079..7f34343c075a 100644
--- a/mm/swap_cgroup.c
+++ b/mm/swap_cgroup.c
@@ -171,9 +171,6 @@  int swap_cgroup_swapon(int type, unsigned long max_pages)
 	unsigned long length;
 	struct swap_cgroup_ctrl *ctrl;
 
-	if (cgroup_memory_noswap)
-		return 0;
-
 	length = DIV_ROUND_UP(max_pages, SC_PER_PAGE);
 	array_size = length * sizeof(void *);
 
@@ -209,9 +206,6 @@  void swap_cgroup_swapoff(int type)
 	unsigned long i, length;
 	struct swap_cgroup_ctrl *ctrl;
 
-	if (cgroup_memory_noswap)
-		return;
-
 	mutex_lock(&swap_cgroup_mutex);
 	ctrl = &swap_cgroup_ctrl[type];
 	map = ctrl->map;