Message ID | 20200908185051.62420-1-jpitti@cisco.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm: memcg: yield cpu when we fail to charge pages |
On Tue, Sep 08, 2020 at 11:50:51AM -0700, Julius Hemanth Pitti wrote:
> For non root CG, in try_charge(), we keep trying
> to charge until we succeed. On non-preemptive
> kernel, when we are OOM, this results in holding
> CPU forever.
>
> On SMP systems, this doesn't create a big problem
> because oom_reaper get a change to kill victim
> and make some free pages. However on a single-core
> CPU (or cases where oom_reaper pinned to same CPU
> where try_charge is executing), oom_reaper shall
> never get scheduled and we stay in try_charge forever.
>
> Steps to repo this on non-smp:
> 1. mount -t tmpfs none /sys/fs/cgroup
> 2. mkdir /sys/fs/cgroup/memory
> 3. mount -t cgroup none /sys/fs/cgroup/memory -o memory
> 4. mkdir /sys/fs/cgroup/memory/0
> 5. echo 40M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
> 6. echo $$ > /sys/fs/cgroup/memory/0/tasks
> 7. stress -m 5 --vm-bytes 10M --vm-hang 0
>
> Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
> ---
>  mm/memcontrol.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 0d6f3ea86738..4620d70267cb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2652,6 +2652,8 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	if (fatal_signal_pending(current))
>  		goto force;
>
> +	cond_resched();
> +

Can you, please, add a short comment here?
Something like "give oom_reaper a chance on a non-SMP system"?

>  	/*
>  	 * keep retrying as long as the memcg oom killer is able to make
>  	 * a forward progress or bypass the charge if the oom killer
> --
> 2.17.1
>

The patch makes total sense to me. Please, feel free to add
Acked-by: Roman Gushchin <guro@fb.com> after adding a comment.

Thank you!
On Tue, 2020-09-08 at 12:21 -0700, Roman Gushchin wrote:
> On Tue, Sep 08, 2020 at 11:50:51AM -0700, Julius Hemanth Pitti wrote:
> > For non root CG, in try_charge(), we keep trying
> > to charge until we succeed. On non-preemptive
> > kernel, when we are OOM, this results in holding
> > CPU forever.
> >
> > On SMP systems, this doesn't create a big problem
> > because oom_reaper get a change to kill victim
> > and make some free pages. However on a single-core
> > CPU (or cases where oom_reaper pinned to same CPU
> > where try_charge is executing), oom_reaper shall
> > never get scheduled and we stay in try_charge forever.
> >
> > Steps to repo this on non-smp:
> > 1. mount -t tmpfs none /sys/fs/cgroup
> > 2. mkdir /sys/fs/cgroup/memory
> > 3. mount -t cgroup none /sys/fs/cgroup/memory -o memory
> > 4. mkdir /sys/fs/cgroup/memory/0
> > 5. echo 40M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
> > 6. echo $$ > /sys/fs/cgroup/memory/0/tasks
> > 7. stress -m 5 --vm-bytes 10M --vm-hang 0
> >
> > Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
> > ---
> >  mm/memcontrol.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 0d6f3ea86738..4620d70267cb 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2652,6 +2652,8 @@ static int try_charge(struct mem_cgroup
> > *memcg, gfp_t gfp_mask,
> >  	if (fatal_signal_pending(current))
> >  		goto force;
> >
> > +	cond_resched();
> > +
>
> Can you, please, add a short comment here?
> Something like "give oom_reaper a chance on a non-SMP system"?

Sure.

>
> >  	/*
> >  	 * keep retrying as long as the memcg oom killer is able to
> > make
> >  	 * a forward progress or bypass the charge if the oom killer
> > --
> > 2.17.1
> >
>
> The patch makes total sense to me. Please, feel free to add
> Acked-by: Roman Gushchin <guro@fb.com> after adding a comment.

Thanks, I shall add.

>
> Thank you!
On 2020/9/9 AM2:50, Julius Hemanth Pitti wrote:
> For non root CG, in try_charge(), we keep trying
> to charge until we succeed. On non-preemptive
> kernel, when we are OOM, this results in holding
> CPU forever.
>
> On SMP systems, this doesn't create a big problem
> because oom_reaper get a change to kill victim
> and make some free pages. However on a single-core
> CPU (or cases where oom_reaper pinned to same CPU
> where try_charge is executing), oom_reaper shall
> never get scheduled and we stay in try_charge forever.
>
> Steps to repo this on non-smp:
> 1. mount -t tmpfs none /sys/fs/cgroup
> 2. mkdir /sys/fs/cgroup/memory
> 3. mount -t cgroup none /sys/fs/cgroup/memory -o memory
> 4. mkdir /sys/fs/cgroup/memory/0
> 5. echo 40M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
> 6. echo $$ > /sys/fs/cgroup/memory/0/tasks
> 7. stress -m 5 --vm-bytes 10M --vm-hang 0
>
> Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
> ---
>  mm/memcontrol.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 0d6f3ea86738..4620d70267cb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2652,6 +2652,8 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	if (fatal_signal_pending(current))
>  		goto force;
>
> +	cond_resched();
> +
>  	/*
>  	 * keep retrying as long as the memcg oom killer is able to make
>  	 * a forward progress or bypass the charge if the oom killer
>

This should be fixed by:
https://lkml.org/lkml/2020/8/26/1440

Thanks,
Xunlei
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0d6f3ea86738..4620d70267cb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2652,6 +2652,8 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (fatal_signal_pending(current))
 		goto force;
 
+	cond_resched();
+
 	/*
 	 * keep retrying as long as the memcg oom killer is able to make
 	 * a forward progress or bypass the charge if the oom killer
For a non-root cgroup, try_charge() keeps retrying the charge
until it succeeds. On a non-preemptive kernel, when the cgroup
is OOM, this results in holding the CPU forever.

On SMP systems, this doesn't create a big problem
because oom_reaper gets a chance to kill the victim
and free some pages. However, on a single-core
CPU (or in cases where oom_reaper is pinned to the same CPU
on which try_charge() is executing), oom_reaper will
never get scheduled and we stay in try_charge() forever.

Steps to reproduce this on non-SMP:
1. mount -t tmpfs none /sys/fs/cgroup
2. mkdir /sys/fs/cgroup/memory
3. mount -t cgroup none /sys/fs/cgroup/memory -o memory
4. mkdir /sys/fs/cgroup/memory/0
5. echo 40M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
6. echo $$ > /sys/fs/cgroup/memory/0/tasks
7. stress -m 5 --vm-bytes 10M --vm-hang 0

Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com>
---
 mm/memcontrol.c | 2 ++
 1 file changed, 2 insertions(+)
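
The starvation described in the commit message can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: the names toy_cgroup, toy_reaper_run(), toy_try_charge_once() and toy_simulate(), and the constants LIMIT_PAGES and MAX_STEPS, are hypothetical and exist only for this illustration. The model assumes one CPU with no preemption, so the reaper stand-in runs only when the retry loop explicitly yields, which is the role cond_resched() plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: one CPU, no preemption, two tasks. */
enum { LIMIT_PAGES = 4, MAX_STEPS = 1000 };

struct toy_cgroup {
	int usage;          /* pages currently charged */
	bool victim_marked; /* a victim was selected, memory not yet freed */
};

/* Hypothetical stand-in for oom_reaper: frees the victim's pages when run. */
static void toy_reaper_run(struct toy_cgroup *cg)
{
	if (cg->victim_marked) {
		cg->usage = 0;
		cg->victim_marked = false;
	}
}

/* One charge attempt; returns true when the charge fits under the limit. */
static bool toy_try_charge_once(struct toy_cgroup *cg, int nr_pages)
{
	assert(nr_pages > 0);

	if (cg->usage + nr_pages <= LIMIT_PAGES) {
		cg->usage += nr_pages;
		return true;
	}
	cg->victim_marked = true; /* over limit: select a victim, then retry */
	return false;
}

/*
 * Try to charge one page into a cgroup that is already at its limit.
 * Returns the number of attempts needed, or -1 if the loop is still
 * spinning after MAX_STEPS attempts.
 * yield_cpu models whether the loop gives up the CPU between attempts.
 */
static int toy_simulate(bool yield_cpu)
{
	struct toy_cgroup cg = { .usage = LIMIT_PAGES, .victim_marked = false };

	for (int step = 1; step <= MAX_STEPS; step++) {
		if (toy_try_charge_once(&cg, 1))
			return step;
		/* Without preemption the reaper runs only if we yield. */
		if (yield_cpu)
			toy_reaper_run(&cg);
	}
	return -1;
}
```

In this model, toy_simulate(false) returns -1 because the reaper stand-in never runs and the charge never fits, while toy_simulate(true) returns 2: the first attempt fails and marks a victim, the yield lets the reaper free the pages, and the second attempt succeeds.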