Message ID | 1598449622-108748-1-git-send-email-xlpang@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] mm: memcg: Fix memcg reclaim soft lockup | expand |
Hi Xunlei, Xunlei Pang writes: >Add cond_resched() at the upper shrink_node_memcgs() to solve this >issue, and any other possible issue like meomry.min protection. I understand the general intent of the patch, but could you clarify your concern around memory protection?
On Wed 26-08-20 21:47:02, Xunlei Pang wrote: > We've met softlockup with "CONFIG_PREEMPT_NONE=y", when > the target memcg doesn't have any reclaimable memory. > > It can be easily reproduced as below: > watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] > CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 > Call Trace: > shrink_lruvec+0x49f/0x640 > shrink_node+0x2a6/0x6f0 > do_try_to_free_pages+0xe9/0x3e0 > try_to_free_mem_cgroup_pages+0xef/0x1f0 > try_charge+0x2c1/0x750 > mem_cgroup_charge+0xd7/0x240 > __add_to_page_cache_locked+0x2fd/0x370 > add_to_page_cache_lru+0x4a/0xc0 > pagecache_get_page+0x10b/0x2f0 > filemap_fault+0x661/0xad0 > ext4_filemap_fault+0x2c/0x40 > __do_fault+0x4d/0xf9 > handle_mm_fault+0x1080/0x1790 > > It only happens on our 1-vcpu instances, because there's no chance > for oom reaper to run to reclaim the to-be-killed process. > > Add cond_resched() at the upper shrink_node_memcgs() to solve this > issue, and any other possible issue like meomry.min protection. I would just add " This will mean that we will get a scheduling point for each memcg in the reclaimed hierarchy without any dependency on the reclaimable memory in that memcg thus making it more predictable. " > Suggested-by: Michal Hocko <mhocko@suse.com> > Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com> Acked-by: Michal Hocko <mhocko@suse.com> Thanks! > --- > mm/vmscan.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 99e1796..bbdc38b 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -2617,6 +2617,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc) > > mem_cgroup_calculate_protection(target_memcg, memcg); > > + cond_resched(); > + > if (mem_cgroup_below_min(memcg)) { > /* > * Hard protection. > -- > 1.8.3.1
On Wed, Aug 26, 2020 at 09:47:02PM +0800, Xunlei Pang wrote: > We've met softlockup with "CONFIG_PREEMPT_NONE=y", when > the target memcg doesn't have any reclaimable memory. > > It can be easily reproduced as below: > watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] > CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 > Call Trace: > shrink_lruvec+0x49f/0x640 > shrink_node+0x2a6/0x6f0 > do_try_to_free_pages+0xe9/0x3e0 > try_to_free_mem_cgroup_pages+0xef/0x1f0 > try_charge+0x2c1/0x750 > mem_cgroup_charge+0xd7/0x240 > __add_to_page_cache_locked+0x2fd/0x370 > add_to_page_cache_lru+0x4a/0xc0 > pagecache_get_page+0x10b/0x2f0 > filemap_fault+0x661/0xad0 > ext4_filemap_fault+0x2c/0x40 > __do_fault+0x4d/0xf9 > handle_mm_fault+0x1080/0x1790 > > It only happens on our 1-vcpu instances, because there's no chance > for oom reaper to run to reclaim the to-be-killed process. > > Add cond_resched() at the upper shrink_node_memcgs() to solve this > issue, and any other possible issue like meomry.min protection. > > Suggested-by: Michal Hocko <mhocko@suse.com> > Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com> This generally makes sense to me but really should have a comment: /* * This loop can become CPU-bound when there are thousands * of cgroups that aren't eligible for reclaim - either * because they don't have any pages, or because their * memory is explicitly protected. Avoid soft lockups. */ cond_resched(); The placement in the middle of the multi-part protection checks is a bit odd too. It would be better to have it either at the top of the loop, or at the end, by replacing the continues with goto next.
On Wed 26-08-20 12:43:32, Johannes Weiner wrote: > On Wed, Aug 26, 2020 at 09:47:02PM +0800, Xunlei Pang wrote: > > We've met softlockup with "CONFIG_PREEMPT_NONE=y", when > > the target memcg doesn't have any reclaimable memory. > > > > It can be easily reproduced as below: > > watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] > > CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 > > Call Trace: > > shrink_lruvec+0x49f/0x640 > > shrink_node+0x2a6/0x6f0 > > do_try_to_free_pages+0xe9/0x3e0 > > try_to_free_mem_cgroup_pages+0xef/0x1f0 > > try_charge+0x2c1/0x750 > > mem_cgroup_charge+0xd7/0x240 > > __add_to_page_cache_locked+0x2fd/0x370 > > add_to_page_cache_lru+0x4a/0xc0 > > pagecache_get_page+0x10b/0x2f0 > > filemap_fault+0x661/0xad0 > > ext4_filemap_fault+0x2c/0x40 > > __do_fault+0x4d/0xf9 > > handle_mm_fault+0x1080/0x1790 > > > > It only happens on our 1-vcpu instances, because there's no chance > > for oom reaper to run to reclaim the to-be-killed process. > > > > Add cond_resched() at the upper shrink_node_memcgs() to solve this > > issue, and any other possible issue like meomry.min protection. > > > > Suggested-by: Michal Hocko <mhocko@suse.com> > > Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com> > > This generally makes sense to me but really should have a comment: > > /* > * This loop can become CPU-bound when there are thousands > * of cgroups that aren't eligible for reclaim - either > * because they don't have any pages, or because their > * memory is explicitly protected. Avoid soft lockups. > */ > cond_resched(); > > The placement in the middle of the multi-part protection checks is a > bit odd too. It would be better to have it either at the top of the > loop, or at the end, by replacing the continues with goto next. Yes makes sense. I would stick it to the begining of the loop to make it stand out and make it obvious wrt code flow.
Chris Down writes: >Xunlei Pang writes: >>Add cond_resched() at the upper shrink_node_memcgs() to solve this >>issue, and any other possible issue like meomry.min protection. > >I understand the general intent of the patch, but could you clarify >your concern around memory protection? Oh, I see, your concern is just preemption in general rather than a fixing anything for the memory protection side. In which case, go for it, but I agree with Michael that it would be nice to send v3 with a clarifying comment. Acked-by: Chris Down <chris@chrisdown.name>
diff --git a/mm/vmscan.c b/mm/vmscan.c index 99e1796..bbdc38b 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2617,6 +2617,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc) mem_cgroup_calculate_protection(target_memcg, memcg); + cond_resched(); + if (mem_cgroup_below_min(memcg)) { /* * Hard protection.
We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg doesn't have any reclaimable memory. It can be easily reproduced as below: watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 Call Trace: shrink_lruvec+0x49f/0x640 shrink_node+0x2a6/0x6f0 do_try_to_free_pages+0xe9/0x3e0 try_to_free_mem_cgroup_pages+0xef/0x1f0 try_charge+0x2c1/0x750 mem_cgroup_charge+0xd7/0x240 __add_to_page_cache_locked+0x2fd/0x370 add_to_page_cache_lru+0x4a/0xc0 pagecache_get_page+0x10b/0x2f0 filemap_fault+0x661/0xad0 ext4_filemap_fault+0x2c/0x40 __do_fault+0x4d/0xf9 handle_mm_fault+0x1080/0x1790 It only happens on our 1-vcpu instances, because there's no chance for oom reaper to run to reclaim the to-be-killed process. Add cond_resched() at the upper shrink_node_memcgs() to solve this issue, and any other possible issue like meomry.min protection. Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com> --- mm/vmscan.c | 2 ++ 1 file changed, 2 insertions(+)