diff mbox series

[-mm] mm, swap: Simplify total_swapcache_pages() with get_swap_device()

Message ID 20190527082714.12151-1-ying.huang@intel.com (mailing list archive)
State New, archived
Headers show
Series [-mm] mm, swap: Simplify total_swapcache_pages() with get_swap_device() | expand

Commit Message

Huang, Ying May 27, 2019, 8:27 a.m. UTC
From: Huang Ying <ying.huang@intel.com>

total_swapcache_pages() may race with swapper_spaces[] allocation and
freeing.  Previously, this is protected with a swapper_spaces[]
specific RCU mechanism.  To simplify the logic/code complexity, it is
replaced with get/put_swap_device().  The code line number is reduced
too.  Although not so important, the swapoff() performance improves
too because one synchronize_rcu() call during swapoff() is deleted.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
---
 mm/swap_state.c | 28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)

Comments

Paul E. McKenney May 27, 2019, 10:15 a.m. UTC | #1
On Mon, May 27, 2019 at 04:27:14PM +0800, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
> 
> total_swapcache_pages() may race with swapper_spaces[] allocation and
> freeing.  Previously, this is protected with a swapper_spaces[]
> specific RCU mechanism.  To simplify the logic/code complexity, it is
> replaced with get/put_swap_device().  The code line number is reduced
> too.  Although not so important, the swapoff() performance improves
> too because one synchronize_rcu() call during swapoff() is deleted.

I am guessing that total_swapcache_pages() is not used on any
fastpaths, but must defer to others on this.  Of course, if the
performance/scalability of total_swapcache_pages() is important,
benchmarking is needed.

But where do I find get_swap_device() and put_swap_device()?  I do not
see them in current mainline.

							Thanx, Paul

> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Jérôme Glisse <jglisse@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
> Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
> ---
>  mm/swap_state.c | 28 ++++++++++------------------
>  1 file changed, 10 insertions(+), 18 deletions(-)
> 
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index f509cdaa81b1..b84c58b572ca 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -73,23 +73,19 @@ unsigned long total_swapcache_pages(void)
>  	unsigned int i, j, nr;
>  	unsigned long ret = 0;
>  	struct address_space *spaces;
> +	struct swap_info_struct *si;
> 
> -	rcu_read_lock();
>  	for (i = 0; i < MAX_SWAPFILES; i++) {
> -		/*
> -		 * The corresponding entries in nr_swapper_spaces and
> -		 * swapper_spaces will be reused only after at least
> -		 * one grace period.  So it is impossible for them
> -		 * belongs to different usage.
> -		 */
> -		nr = nr_swapper_spaces[i];
> -		spaces = rcu_dereference(swapper_spaces[i]);
> -		if (!nr || !spaces)
> +		/* Prevent swapoff to free swapper_spaces */
> +		si = get_swap_device(swp_entry(i, 1));
> +		if (!si)
>  			continue;
> +		nr = nr_swapper_spaces[i];
> +		spaces = swapper_spaces[i];
>  		for (j = 0; j < nr; j++)
>  			ret += spaces[j].nrpages;
> +		put_swap_device(si);
>  	}
> -	rcu_read_unlock();
>  	return ret;
>  }
> 
> @@ -611,20 +607,16 @@ int init_swap_address_space(unsigned int type, unsigned long nr_pages)
>  		mapping_set_no_writeback_tags(space);
>  	}
>  	nr_swapper_spaces[type] = nr;
> -	rcu_assign_pointer(swapper_spaces[type], spaces);
> +	swapper_spaces[type] = spaces;
> 
>  	return 0;
>  }
> 
>  void exit_swap_address_space(unsigned int type)
>  {
> -	struct address_space *spaces;
> -
> -	spaces = swapper_spaces[type];
> +	kvfree(swapper_spaces[type]);
>  	nr_swapper_spaces[type] = 0;
> -	rcu_assign_pointer(swapper_spaces[type], NULL);
> -	synchronize_rcu();
> -	kvfree(spaces);
> +	swapper_spaces[type] = NULL;
>  }
> 
>  static inline void swap_ra_clamp_pfn(struct vm_area_struct *vma,
> -- 
> 2.20.1
>
Andrea Parri May 27, 2019, 1:28 p.m. UTC | #2
> But where do I find get_swap_device() and put_swap_device()?  I do not
> see them in current mainline.

You should see them in the -mm tree:

  https://ozlabs.org/~akpm/mmots/broken-out/mm-swap-fix-race-between-swapoff-and-some-swap-operations.patch

or

  http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/commit/?id=87efc56527b92a59d15c5d4e4b05f875b276a59a

Thanks,
  Andrea
Huang, Ying May 28, 2019, 12:46 a.m. UTC | #3
Hi, Paul,

"Paul E. McKenney" <paulmck@linux.ibm.com> writes:

> On Mon, May 27, 2019 at 04:27:14PM +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>> 
>> total_swapcache_pages() may race with swapper_spaces[] allocation and
>> freeing.  Previously, this is protected with a swapper_spaces[]
>> specific RCU mechanism.  To simplify the logic/code complexity, it is
>> replaced with get/put_swap_device().  The code line number is reduced
>> too.  Although not so important, the swapoff() performance improves
>> too because one synchronize_rcu() call during swapoff() is deleted.
>
> I am guessing that total_swapcache_pages() is not used on any
> fastpaths, but must defer to others on this.  Of course, if the
> performance/scalability of total_swapcache_pages() is important,
> benchmarking is needed.

This patch is mostly about code cleanup instead of performance.  That
is, to make the code easier to be understand.

> But where do I find get_swap_device() and put_swap_device()?  I do not
> see them in current mainline.

They are not in mainline, but in -mm tree.  I should have made it more
clear.  Sorry about that.

Best Regards,
Huang, Ying
diff mbox series

Patch

diff --git a/mm/swap_state.c b/mm/swap_state.c
index f509cdaa81b1..b84c58b572ca 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -73,23 +73,19 @@  unsigned long total_swapcache_pages(void)
 	unsigned int i, j, nr;
 	unsigned long ret = 0;
 	struct address_space *spaces;
+	struct swap_info_struct *si;
 
-	rcu_read_lock();
 	for (i = 0; i < MAX_SWAPFILES; i++) {
-		/*
-		 * The corresponding entries in nr_swapper_spaces and
-		 * swapper_spaces will be reused only after at least
-		 * one grace period.  So it is impossible for them
-		 * belongs to different usage.
-		 */
-		nr = nr_swapper_spaces[i];
-		spaces = rcu_dereference(swapper_spaces[i]);
-		if (!nr || !spaces)
+		/* Prevent swapoff to free swapper_spaces */
+		si = get_swap_device(swp_entry(i, 1));
+		if (!si)
 			continue;
+		nr = nr_swapper_spaces[i];
+		spaces = swapper_spaces[i];
 		for (j = 0; j < nr; j++)
 			ret += spaces[j].nrpages;
+		put_swap_device(si);
 	}
-	rcu_read_unlock();
 	return ret;
 }
 
@@ -611,20 +607,16 @@  int init_swap_address_space(unsigned int type, unsigned long nr_pages)
 		mapping_set_no_writeback_tags(space);
 	}
 	nr_swapper_spaces[type] = nr;
-	rcu_assign_pointer(swapper_spaces[type], spaces);
+	swapper_spaces[type] = spaces;
 
 	return 0;
 }
 
 void exit_swap_address_space(unsigned int type)
 {
-	struct address_space *spaces;
-
-	spaces = swapper_spaces[type];
+	kvfree(swapper_spaces[type]);
 	nr_swapper_spaces[type] = 0;
-	rcu_assign_pointer(swapper_spaces[type], NULL);
-	synchronize_rcu();
-	kvfree(spaces);
+	swapper_spaces[type] = NULL;
 }
 
 static inline void swap_ra_clamp_pfn(struct vm_area_struct *vma,