Message ID | 20201130131512.6043-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm/memcg: fix NULL pointer dereference at workingset_eviction |
On Mon 30-11-20 21:15:12, Muchun Song wrote:
> We found a case of kernel panic. The stack trace is as follows
> (omit some irrelevant information):
>
>  BUG: kernel NULL pointer dereference, address: 00000000000000c8
>  RIP: 0010:workingset_eviction+0x26b/0x450
>  Call Trace:
>   __remove_mapping+0x224/0x2b0
>   shrink_page_list+0x8c2/0x14e0
>   shrink_inactive_list+0x1bf/0x3f0
>   ? do_raw_spin_unlock+0x49/0xc0
>   ? _raw_spin_unlock+0xa/0x20
>   shrink_lruvec+0x401/0x640
>
> This was caused by commit 76761ffa9ea1 ("mm/memcg: bail out early when
> !memcg in mem_cgroup_lruvec"). When the parameter of memcg is NULL, we
> should not use the &pgdat->__lruvec. So this just reverts commit
> 76761ffa9ea1 to fix it.
>
> Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in mem_cgroup_lruvec")

I do not see any commits like that in the current Linus tree. Is this a
commit id from the linux-next? If yes, can we just fold it into the
respective patch in mmotm tree please?

> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  include/linux/memcontrol.h | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index f9a496c4eac7..a1416205507c 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -610,17 +610,20 @@ mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
>  static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>  					       struct pglist_data *pgdat)
>  {
> +	struct mem_cgroup_per_node *mz;
>  	struct lruvec *lruvec;
>
> -	if (mem_cgroup_disabled() || !memcg) {
> +	if (mem_cgroup_disabled()) {
>  		lruvec = &pgdat->__lruvec;
> -	} else {
> -		struct mem_cgroup_per_node *mz;
> -
> -		mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
> -		lruvec = &mz->lruvec;
> +		goto out;
>  	}
>
> +	if (!memcg)
> +		memcg = root_mem_cgroup;
> +
> +	mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
> +	lruvec = &mz->lruvec;
> +out:
>  	/*
>  	 * Since a node can be onlined after the mem_cgroup was created,
>  	 * we have to be prepared to initialize lruvec->pgdat here;
> --
> 2.11.0
>
On Mon, Nov 30, 2020 at 9:23 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Mon 30-11-20 21:15:12, Muchun Song wrote:
> > We found a case of kernel panic. The stack trace is as follows
> > (omit some irrelevant information):
> >
> >  BUG: kernel NULL pointer dereference, address: 00000000000000c8
> >  RIP: 0010:workingset_eviction+0x26b/0x450
> >  Call Trace:
> >   __remove_mapping+0x224/0x2b0
> >   shrink_page_list+0x8c2/0x14e0
> >   shrink_inactive_list+0x1bf/0x3f0
> >   ? do_raw_spin_unlock+0x49/0xc0
> >   ? _raw_spin_unlock+0xa/0x20
> >   shrink_lruvec+0x401/0x640
> >
> > This was caused by commit 76761ffa9ea1 ("mm/memcg: bail out early when
> > !memcg in mem_cgroup_lruvec"). When the parameter of memcg is NULL, we
> > should not use the &pgdat->__lruvec. So this just reverts commit
> > 76761ffa9ea1 to fix it.
> >
> > Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in mem_cgroup_lruvec")
>
> I do not see any commits like that in the current Linus tree. Is this a
> commit id from the linux-next? If yes, can we just fold it into the
> respective patch in mmotm tree please?

Yes. This commit is on the linux-next tree. Of course can.

>
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  include/linux/memcontrol.h | 15 +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index f9a496c4eac7..a1416205507c 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -610,17 +610,20 @@ mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
> >  static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
> >  					       struct pglist_data *pgdat)
> >  {
> > +	struct mem_cgroup_per_node *mz;
> >  	struct lruvec *lruvec;
> >
> > -	if (mem_cgroup_disabled() || !memcg) {
> > +	if (mem_cgroup_disabled()) {
> >  		lruvec = &pgdat->__lruvec;
> > -	} else {
> > -		struct mem_cgroup_per_node *mz;
> > -
> > -		mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
> > -		lruvec = &mz->lruvec;
> > +		goto out;
> >  	}
> >
> > +	if (!memcg)
> > +		memcg = root_mem_cgroup;
> > +
> > +	mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
> > +	lruvec = &mz->lruvec;
> > +out:
> >  	/*
> >  	 * Since a node can be onlined after the mem_cgroup was created,
> >  	 * we have to be prepared to initialize lruvec->pgdat here;
> > --
> > 2.11.0
> >
>
> --
> Michal Hocko
> SUSE Labs
On Mon 30-11-20 21:36:49, Muchun Song wrote:
> On Mon, Nov 30, 2020 at 9:23 PM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Mon 30-11-20 21:15:12, Muchun Song wrote:
> > > We found a case of kernel panic. The stack trace is as follows
> > > (omit some irrelevant information):
> > >
> > >  BUG: kernel NULL pointer dereference, address: 00000000000000c8
> > >  RIP: 0010:workingset_eviction+0x26b/0x450
> > >  Call Trace:
> > >   __remove_mapping+0x224/0x2b0
> > >   shrink_page_list+0x8c2/0x14e0
> > >   shrink_inactive_list+0x1bf/0x3f0
> > >   ? do_raw_spin_unlock+0x49/0xc0
> > >   ? _raw_spin_unlock+0xa/0x20
> > >   shrink_lruvec+0x401/0x640
> > >
> > > This was caused by commit 76761ffa9ea1 ("mm/memcg: bail out early when
> > > !memcg in mem_cgroup_lruvec"). When the parameter of memcg is NULL, we
> > > should not use the &pgdat->__lruvec. So this just reverts commit
> > > 76761ffa9ea1 to fix it.
> > >
> > > Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in mem_cgroup_lruvec")
> >
> > I do not see any commits like that in the current Linus tree. Is this a
> > commit id from the linux-next? If yes, can we just fold it into the
> > respective patch in mmotm tree please?
>
> Yes. This commit is on the linux-next tree.

FYI, patches coming from mmotm are constantly rebased in linux-next so
the sha1 is meaningless and shouldn't be added as a reference in the
changelog.

> Of course can.

Thanks! I believe Andrew should be able to just pick up the patch and
make it -fix patch.
On 11/30/20 2:45 PM, Michal Hocko wrote:
> On Mon 30-11-20 21:36:49, Muchun Song wrote:
>> On Mon, Nov 30, 2020 at 9:23 PM Michal Hocko <mhocko@suse.com> wrote:
>> >
>> > On Mon 30-11-20 21:15:12, Muchun Song wrote:
>> > > We found a case of kernel panic. The stack trace is as follows
>> > > (omit some irrelevant information):
>> > >
>> > >  BUG: kernel NULL pointer dereference, address: 00000000000000c8
>> > >  RIP: 0010:workingset_eviction+0x26b/0x450
>> > >  Call Trace:
>> > >   __remove_mapping+0x224/0x2b0
>> > >   shrink_page_list+0x8c2/0x14e0
>> > >   shrink_inactive_list+0x1bf/0x3f0
>> > >   ? do_raw_spin_unlock+0x49/0xc0
>> > >   ? _raw_spin_unlock+0xa/0x20
>> > >   shrink_lruvec+0x401/0x640
>> > >
>> > > This was caused by commit 76761ffa9ea1 ("mm/memcg: bail out early when
>> > > !memcg in mem_cgroup_lruvec"). When the parameter of memcg is NULL, we
>> > > should not use the &pgdat->__lruvec. So this just reverts commit
>> > > 76761ffa9ea1 to fix it.
>> > >
>> > > Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in mem_cgroup_lruvec")
>> >
>> > I do not see any commits like that in the current Linus tree. Is this a
>> > commit id from the linux-next? If yes, can we just fold it into the
>> > respective patch in mmotm tree please?
>>
>> Yes. This commit is on the linux-next tree.
>
> FYI, patches coming from mmotm are constantly rebased in linux-next so
> the sha1 is meaningless and shouldn't be added as a reference in the
> changelog.
>
>> Of course can.
>
> Thanks! I believe Andrew should be able to just pick up the patch and
> make it -fix patch.

Well the fix is a revert, so just remove the patch from mmotm/next?
BTW, looks like it wasn't sent to linux-mm [1], looks like missing To: header.

[1] https://lore.kernel.org/lkml/1606446515-36069-1-git-send-email-alex.shi@linux.alibaba.com/
On Mon, 30 Nov 2020 16:57:37 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 11/30/20 2:45 PM, Michal Hocko wrote:
> > On Mon 30-11-20 21:36:49, Muchun Song wrote:
> >> On Mon, Nov 30, 2020 at 9:23 PM Michal Hocko <mhocko@suse.com> wrote:
> >> >
> >> > On Mon 30-11-20 21:15:12, Muchun Song wrote:
> >> > > We found a case of kernel panic. The stack trace is as follows
> >> > > (omit some irrelevant information):
> >> > >
> >> > >  BUG: kernel NULL pointer dereference, address: 00000000000000c8
> >> > >  RIP: 0010:workingset_eviction+0x26b/0x450
> >> > >  Call Trace:
> >> > >   __remove_mapping+0x224/0x2b0
> >> > >   shrink_page_list+0x8c2/0x14e0
> >> > >   shrink_inactive_list+0x1bf/0x3f0
> >> > >   ? do_raw_spin_unlock+0x49/0xc0
> >> > >   ? _raw_spin_unlock+0xa/0x20
> >> > >   shrink_lruvec+0x401/0x640
> >> > >
> >> > > This was caused by commit 76761ffa9ea1 ("mm/memcg: bail out early when
> >> > > !memcg in mem_cgroup_lruvec"). When the parameter of memcg is NULL, we
> >> > > should not use the &pgdat->__lruvec. So this just reverts commit
> >> > > 76761ffa9ea1 to fix it.
> >> > >
> >> > > Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in mem_cgroup_lruvec")
> >> >
> >> > I do not see any commits like that in the current Linus tree. Is this a
> >> > commit id from the linux-next? If yes, can we just fold it into the
> >> > respective patch in mmotm tree please?
> >>
> >> Yes. This commit is on the linux-next tree.
> >
> > FYI, patches coming from mmotm are constantly rebased in linux-next so
> > the sha1 is meaningless and shouldn't be added as a reference in the
> > changelog.
> >
> >> Of course can.
> >
> > Thanks! I believe Andrew should be able to just pick up the patch and
> > make it -fix patch.
>
> Well the fix is a revert, so just remove the patch from mmotm/next?
> BTW, looks like it wasn't sent to linux-mm [1], looks like missing To: header.
>
> [1]
> https://lore.kernel.org/lkml/1606446515-36069-1-git-send-email-alex.shi@linux.alibaba.com/

Yes, I've dropped "mm/memcg: bail out early when !memcg in
mem_cgroup_lruvec", thanks.
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f9a496c4eac7..a1416205507c 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -610,17 +610,20 @@ mem_cgroup_nodeinfo(struct mem_cgroup *memcg, int nid)
 static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 					       struct pglist_data *pgdat)
 {
+	struct mem_cgroup_per_node *mz;
 	struct lruvec *lruvec;
 
-	if (mem_cgroup_disabled() || !memcg) {
+	if (mem_cgroup_disabled()) {
 		lruvec = &pgdat->__lruvec;
-	} else {
-		struct mem_cgroup_per_node *mz;
-
-		mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
-		lruvec = &mz->lruvec;
+		goto out;
 	}
 
+	if (!memcg)
+		memcg = root_mem_cgroup;
+
+	mz = mem_cgroup_nodeinfo(memcg, pgdat->node_id);
+	lruvec = &mz->lruvec;
+out:
 	/*
 	 * Since a node can be onlined after the mem_cgroup was created,
 	 * we have to be prepared to initialize lruvec->pgdat here;
We found a case of kernel panic. The stack trace is as follows
(omit some irrelevant information):

 BUG: kernel NULL pointer dereference, address: 00000000000000c8
 RIP: 0010:workingset_eviction+0x26b/0x450
 Call Trace:
  __remove_mapping+0x224/0x2b0
  shrink_page_list+0x8c2/0x14e0
  shrink_inactive_list+0x1bf/0x3f0
  ? do_raw_spin_unlock+0x49/0xc0
  ? _raw_spin_unlock+0xa/0x20
  shrink_lruvec+0x401/0x640

This was caused by commit 76761ffa9ea1 ("mm/memcg: bail out early when
!memcg in mem_cgroup_lruvec"). When the parameter of memcg is NULL, we
should not use the &pgdat->__lruvec. So this just reverts commit
76761ffa9ea1 to fix it.

Fixes: 76761ffa9ea1 ("mm/memcg: bail out early when !memcg in mem_cgroup_lruvec")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/memcontrol.h | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)
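
For readers who want to see the selection logic in isolation, here is a minimal userspace sketch of the behaviour the patch restores: a NULL memcg resolves to the root memcg's per-node lruvec, and the node's own lruvec is used only when memory cgroups are disabled. This is a simplified model, not kernel code. All names (`model_lruvec`, `memcg_disabled`, `root_memcg`, `node_lruvec`, `MAX_NODES`) are hypothetical stand-ins for `mem_cgroup_lruvec()`, `mem_cgroup_disabled()`, `root_mem_cgroup` and `pgdat->__lruvec`, and the structs carry only the fields needed for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 2	/* hypothetical node count for the model */

/* Heavily simplified stand-ins for the kernel structures. */
struct lruvec { int unused; };
struct mem_cgroup_per_node { struct lruvec lruvec; };
struct mem_cgroup { struct mem_cgroup_per_node nodeinfo[MAX_NODES]; };
struct pglist_data {
	int node_id;
	struct lruvec node_lruvec;	/* models pgdat->__lruvec */
};

static bool memcg_disabled;		/* models mem_cgroup_disabled() */
static struct mem_cgroup root_memcg;	/* models root_mem_cgroup */

/* Mirrors the control flow of the patched mem_cgroup_lruvec(). */
static struct lruvec *model_lruvec(struct mem_cgroup *memcg,
				   struct pglist_data *pgdat)
{
	/* The node-level lruvec is used only when memcg is disabled. */
	if (memcg_disabled)
		return &pgdat->node_lruvec;

	/* A NULL memcg falls back to the root memcg, not to the node. */
	if (!memcg)
		memcg = &root_memcg;

	return &memcg->nodeinfo[pgdat->node_id].lruvec;
}

#ifdef LRUVEC_DEMO_MAIN
int main(void)
{
	struct pglist_data pgdat = { .node_id = 1 };

	memcg_disabled = false;
	assert(model_lruvec(NULL, &pgdat) == &root_memcg.nodeinfo[1].lruvec);
	puts("NULL memcg resolved to the root memcg's per-node lruvec");
	return 0;
}
#endif
```

Building with `-DLRUVEC_DEMO_MAIN` enables the small demo `main()`. The model leaves out the `lruvec->pgdat` initialisation that follows the `out:` label in the real function.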