memcg: enable memcg oom-kill for __GFP_NOFAIL

Message ID 20210223204337.2785120-1-shakeelb@google.com (mailing list archive)
State New, archived
Series memcg: enable memcg oom-kill for __GFP_NOFAIL

Commit Message

Shakeel Butt Feb. 23, 2021, 8:43 p.m. UTC
In the era of the async memcg oom-killer, commit a0d8b00a3381 ("mm:
memcg: do not declare OOM from __GFP_NOFAIL allocations") added code
to skip the memcg oom-killer for __GFP_NOFAIL allocations. The reason was
that __GFP_NOFAIL callers would not enter the async oom synchronization
path and would keep the task marked as in memcg oom. At that time, tasks
marked as in memcg oom could bypass the memcg limits, and the oom
synchronization would only have happened on a later userspace-triggered
page fault. A task marked as under memcg oom could therefore bypass the
memcg limit for an arbitrary time.

With the synchronous memcg oom-killer (commit 29ef680ae7c21 ("memcg,
oom: move out_of_memory back to the charge path")) and with tasks marked
as under memcg oom no longer allowed to bypass the memcg limits (commit
1f14c1ac19aa4 ("mm: memcg: do not allow task about to OOM kill to bypass
the limit")), we can again allow __GFP_NOFAIL allocations to trigger the
memcg oom-kill. This brings memcg oom behavior closer to the page
allocator's oom behavior.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
 mm/memcontrol.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

Michal Hocko Feb. 24, 2021, 9:14 a.m. UTC | #1
On Tue 23-02-21 12:43:37, Shakeel Butt wrote:
> In the era of async memcg oom-killer, the commit a0d8b00a3381 ("mm:
> memcg: do not declare OOM from __GFP_NOFAIL allocations") added the code
> to skip memcg oom-killer for __GFP_NOFAIL allocations. The reason was
> that the __GFP_NOFAIL callers will not enter async oom synchronization
> path and will keep the task marked as in memcg oom. At that time the
> tasks marked in memcg oom can bypass the memcg limits and the oom
> synchronization would have happened later in the later userspace
> triggered page fault. Thus letting the task marked as under memcg oom
> bypass the memcg limit for arbitrary time.
> 
> With the synchronous memcg oom-killer (commit 29ef680ae7c21 ("memcg,
> oom: move out_of_memory back to the charge path")) and not letting the
> task marked under memcg oom to bypass the memcg limits (commit
> 1f14c1ac19aa4 ("mm: memcg: do not allow task about to OOM kill to bypass
> the limit")), we can again allow __GFP_NOFAIL allocations to trigger
> memcg oom-kill. This will make memcg oom behavior closer to page
> allocator oom behavior.

The patch is correct, I just do not follow why 1f14c1ac19aa4 is really
relevant here. The nomem label there wouldn't make any difference for
__GFP_NOFAIL requests. The code has changed quite a lot since then.
 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

This is a clear oversight from when I moved the oom handling back to the
charge path. Thanks for the fixup.
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memcontrol.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2db2aeac8a9e..dcb5665aeb69 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2797,9 +2797,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	if (gfp_mask & __GFP_RETRY_MAYFAIL)
>  		goto nomem;
>  
> -	if (gfp_mask & __GFP_NOFAIL)
> -		goto force;
> -
>  	if (fatal_signal_pending(current))
>  		goto force;
>  
> -- 
> 2.30.0.617.g56c4b15f3c-goog

Johannes Weiner Feb. 24, 2021, 8:38 p.m. UTC | #2
On Tue, Feb 23, 2021 at 12:43:37PM -0800, Shakeel Butt wrote:
> In the era of async memcg oom-killer, the commit a0d8b00a3381 ("mm:
> memcg: do not declare OOM from __GFP_NOFAIL allocations") added the code
> to skip memcg oom-killer for __GFP_NOFAIL allocations. The reason was
> that the __GFP_NOFAIL callers will not enter async oom synchronization
> path and will keep the task marked as in memcg oom. At that time the
> tasks marked in memcg oom can bypass the memcg limits and the oom
> synchronization would have happened later in the later userspace
> triggered page fault. Thus letting the task marked as under memcg oom
> bypass the memcg limit for arbitrary time.
> 
> With the synchronous memcg oom-killer (commit 29ef680ae7c21 ("memcg,
> oom: move out_of_memory back to the charge path")) and not letting the
> task marked under memcg oom to bypass the memcg limits (commit
> 1f14c1ac19aa4 ("mm: memcg: do not allow task about to OOM kill to bypass
> the limit")), we can again allow __GFP_NOFAIL allocations to trigger
> memcg oom-kill. This will make memcg oom behavior closer to page
> allocator oom behavior.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

David Rientjes Feb. 25, 2021, 9:36 p.m. UTC | #3
On Tue, 23 Feb 2021, Shakeel Butt wrote:

> In the era of async memcg oom-killer, the commit a0d8b00a3381 ("mm:
> memcg: do not declare OOM from __GFP_NOFAIL allocations") added the code
> to skip memcg oom-killer for __GFP_NOFAIL allocations. The reason was
> that the __GFP_NOFAIL callers will not enter async oom synchronization
> path and will keep the task marked as in memcg oom. At that time the
> tasks marked in memcg oom can bypass the memcg limits and the oom
> synchronization would have happened later in the later userspace
> triggered page fault. Thus letting the task marked as under memcg oom
> bypass the memcg limit for arbitrary time.
> 
> With the synchronous memcg oom-killer (commit 29ef680ae7c21 ("memcg,
> oom: move out_of_memory back to the charge path")) and not letting the
> task marked under memcg oom to bypass the memcg limits (commit
> 1f14c1ac19aa4 ("mm: memcg: do not allow task about to OOM kill to bypass
> the limit")), we can again allow __GFP_NOFAIL allocations to trigger
> memcg oom-kill. This will make memcg oom behavior closer to page
> allocator oom behavior.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>

Acked-by: David Rientjes <rientjes@google.com>

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2db2aeac8a9e..dcb5665aeb69 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2797,9 +2797,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (gfp_mask & __GFP_RETRY_MAYFAIL)
 		goto nomem;
 
-	if (gfp_mask & __GFP_NOFAIL)
-		goto force;
-
 	if (fatal_signal_pending(current))
 		goto force;
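
To make the effect of the hunk easier to see, here is a minimal userspace
sketch of just the three checks it touches. It is not kernel code: the
SK_* flag values, the charge_step enum and both function names are
hypothetical and exist only for this illustration. STEP_CONTINUE stands
for "fall through to whatever follows the hunk in try_charge()", which,
going by the commit message, is where the memcg oom-kill can now be
reached for __GFP_NOFAIL allocations.

```c
#include <stdbool.h>

/* Hypothetical bit values for this sketch only; not the kernel's. */
#define SK_GFP_RETRY_MAYFAIL (1u << 0)
#define SK_GFP_NOFAIL        (1u << 1)

/* Where control goes after the checks shown in the hunk. */
enum charge_step {
	STEP_NOMEM,    /* "goto nomem" */
	STEP_FORCE,    /* "goto force" */
	STEP_CONTINUE, /* falls through to the code after the hunk */
};

/* Check order before the patch: __GFP_NOFAIL jumps straight to force. */
enum charge_step step_before_patch(unsigned int gfp_mask, bool fatal_signal)
{
	if (gfp_mask & SK_GFP_RETRY_MAYFAIL)
		return STEP_NOMEM;
	if (gfp_mask & SK_GFP_NOFAIL)
		return STEP_FORCE;
	if (fatal_signal)
		return STEP_FORCE;
	return STEP_CONTINUE;
}

/* Check order after the patch: the __GFP_NOFAIL shortcut is gone. */
enum charge_step step_after_patch(unsigned int gfp_mask, bool fatal_signal)
{
	if (gfp_mask & SK_GFP_RETRY_MAYFAIL)
		return STEP_NOMEM;
	if (fatal_signal)
		return STEP_FORCE;
	return STEP_CONTINUE;
}
```

In this model the only input whose result changes is a __GFP_NOFAIL
request without __GFP_RETRY_MAYFAIL and without a pending fatal signal:
step_before_patch() returns STEP_FORCE for it, while step_after_patch()
returns STEP_CONTINUE.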