[2/4] mm,page_alloc: Move the short sleep to should_reclaim_retry()
diff mbox

Message ID 1528369223-7571-2-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
State New
Headers show

Commit Message

Tetsuo Handa June 7, 2018, 11 a.m. UTC
should_reclaim_retry() should be a natural reschedule point. PF_WQ_WORKER
is a special case which needs a stronger rescheduling policy. Doing that
unconditionally seems more straightforward than depending on a zone being
a good candidate for a further reclaim.

Thus, move the short sleep when we are waiting for the owner of oom_lock
(which coincidentally also serves as a guaranteed sleep for PF_WQ_WORKER
threads) to should_reclaim_retry(). Note that it is not evaluated that
whether there is negative side effect with this change. We need to test
both real and artificial workloads for evaluation. You can compare with
and without this patch if you noticed something unexpected.

Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
---
 mm/page_alloc.c | 40 ++++++++++++++++++----------------------
 1 file changed, 18 insertions(+), 22 deletions(-)

Comments

Michal Hocko June 7, 2018, 11:13 a.m. UTC | #1
On Thu 07-06-18 20:00:21, Tetsuo Handa wrote:
> should_reclaim_retry() should be a natural reschedule point. PF_WQ_WORKER
> is a special case which needs a stronger rescheduling policy. Doing that
> unconditionally seems more straightforward than depending on a zone being
> a good candidate for a further reclaim.
> 
> Thus, move the short sleep when we are waiting for the owner of oom_lock
> (which coincidentally also serves as a guaranteed sleep for PF_WQ_WORKER
> threads) to should_reclaim_retry(). Note that it is not evaluated that
> whether there is negative side effect with this change. We need to test
> both real and artificial workloads for evaluation. You can compare with
> and without this patch if you noticed something unexpected.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Tejun Heo <tj@kernel.org>

Your s-o-b is missing here. And I suspect this should be From: /me
but I do not care all that much.

> ---
>  mm/page_alloc.c | 40 ++++++++++++++++++----------------------
>  1 file changed, 18 insertions(+), 22 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e90f152..210a476 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3914,6 +3914,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  {
>  	struct zone *zone;
>  	struct zoneref *z;
> +	bool ret = false;
>  
>  	/*
>  	 * Costly allocations might have made a progress but this doesn't mean
> @@ -3977,25 +3978,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  				}
>  			}
>  
> -			/*
> -			 * Memory allocation/reclaim might be called from a WQ
> -			 * context and the current implementation of the WQ
> -			 * concurrency control doesn't recognize that
> -			 * a particular WQ is congested if the worker thread is
> -			 * looping without ever sleeping. Therefore we have to
> -			 * do a short sleep here rather than calling
> -			 * cond_resched().
> -			 */
> -			if (current->flags & PF_WQ_WORKER)
> -				schedule_timeout_uninterruptible(1);
> -			else
> -				cond_resched();
> -
> -			return true;
> +			ret = true;
> +			goto out;
>  		}
>  	}
>  
> -	return false;
> +out:
> +	/*
> +	 * Memory allocation/reclaim might be called from a WQ
> +	 * context and the current implementation of the WQ
> +	 * concurrency control doesn't recognize that
> +	 * a particular WQ is congested if the worker thread is
> +	 * looping without ever sleeping. Therefore we have to
> +	 * do a short sleep here rather than calling
> +	 * cond_resched().
> +	 */
> +	if (current->flags & PF_WQ_WORKER)
> +		schedule_timeout_uninterruptible(1);
> +	else
> +		cond_resched();
> +	return ret;
>  }
>  
>  static inline bool
> @@ -4237,12 +4239,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  	/* Retry as long as the OOM killer is making progress */
>  	if (did_some_progress) {
>  		no_progress_loops = 0;
> -		/*
> -		 * This schedule_timeout_*() serves as a guaranteed sleep for
> -		 * PF_WQ_WORKER threads when __zone_watermark_ok() == false.
> -		 */
> -		if (!tsk_is_oom_victim(current))
> -			schedule_timeout_uninterruptible(1);
>  		goto retry;
>  	}
>  
> -- 
> 1.8.3.1
>

Patch
diff mbox

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e90f152..210a476 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3914,6 +3914,7 @@  bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 {
 	struct zone *zone;
 	struct zoneref *z;
+	bool ret = false;
 
 	/*
 	 * Costly allocations might have made a progress but this doesn't mean
@@ -3977,25 +3978,26 @@  bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 				}
 			}
 
-			/*
-			 * Memory allocation/reclaim might be called from a WQ
-			 * context and the current implementation of the WQ
-			 * concurrency control doesn't recognize that
-			 * a particular WQ is congested if the worker thread is
-			 * looping without ever sleeping. Therefore we have to
-			 * do a short sleep here rather than calling
-			 * cond_resched().
-			 */
-			if (current->flags & PF_WQ_WORKER)
-				schedule_timeout_uninterruptible(1);
-			else
-				cond_resched();
-
-			return true;
+			ret = true;
+			goto out;
 		}
 	}
 
-	return false;
+out:
+	/*
+	 * Memory allocation/reclaim might be called from a WQ
+	 * context and the current implementation of the WQ
+	 * concurrency control doesn't recognize that
+	 * a particular WQ is congested if the worker thread is
+	 * looping without ever sleeping. Therefore we have to
+	 * do a short sleep here rather than calling
+	 * cond_resched().
+	 */
+	if (current->flags & PF_WQ_WORKER)
+		schedule_timeout_uninterruptible(1);
+	else
+		cond_resched();
+	return ret;
 }
 
 static inline bool
@@ -4237,12 +4239,6 @@  bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 	/* Retry as long as the OOM killer is making progress */
 	if (did_some_progress) {
 		no_progress_loops = 0;
-		/*
-		 * This schedule_timeout_*() serves as a guaranteed sleep for
-		 * PF_WQ_WORKER threads when __zone_watermark_ok() == false.
-		 */
-		if (!tsk_is_oom_victim(current))
-			schedule_timeout_uninterruptible(1);
 		goto retry;
 	}