Message ID | 20200813040224.13054.96724.stgit@localhost.localdomain |
---|---|
State | New, archived |
Series | Re: [PATCH v17 14/21] mm/compaction: do page isolation first in compaction |
On 2020/8/13 12:02 PM, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>
> We can drop the need for the locked variable by making use of the
> lruvec_holds_page_lru_lock function. By doing this we can avoid some rcu
> locking ugliness for the case where the lruvec is still holding the LRU
> lock associated with the page. Instead we can just use the lruvec and if it
> is NULL we assume the lock was released.
>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> ---
>  mm/compaction.c | 45 ++++++++++++++++++++-------------------------
>  1 file changed, 20 insertions(+), 25 deletions(-)

Thanks a lot!
I don't know if the community is ok with keeping this patch standalone, following the whole patchset?
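For context on the discussion below: lruvec_holds_page_lru_lock is essentially a pointer comparison between the lruvec and the page's memcg and node. A minimal sketch of that idea, assuming the rough shape of the helper added earlier in the series; the field names (page->mem_cgroup, lruvec->pgdat, the memcg field of mem_cgroup_per_node) reflect kernels of that era and are illustrative rather than quoted from the patchset:

```c
/* Illustrative sketch only, not the series' exact code. */
static inline bool lruvec_holds_page_lru_lock(struct page *page,
					      struct lruvec *lruvec)
{
	pg_data_t *pgdat = page_pgdat(page);
	struct mem_cgroup_per_node *mz;
	struct mem_cgroup *memcg;

	if (mem_cgroup_disabled())
		return lruvec == &pgdat->__lruvec;

	/* Which memcg and node does this lruvec belong to? */
	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
	memcg = page->mem_cgroup ? page->mem_cgroup : root_mem_cgroup;

	/* The page's memcg pointer is only compared, never dereferenced. */
	return lruvec->pgdat == pgdat && mz->memcg == memcg;
}
```

Because the memcg is only compared and never dereferenced, the caller can make this check while it already holds the lruvec's lru_lock, which is what the commit message means by avoiding the rcu locking ugliness.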
On 2020/8/13 12:02 PM, Alexander Duyck wrote:
> -		rcu_read_lock();
> -		lruvec = mem_cgroup_page_lruvec(page, pgdat);
> -
>  		/* If we already hold the lock, we can skip some rechecking */
> -		if (lruvec != locked) {
> -			if (locked)
> -				unlock_page_lruvec_irqrestore(locked, flags);
> +		if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {

Oops, lruvec_holds_page_lru_lock needs rcu_read_lock.

> +			if (lruvec)
> +				unlock_page_lruvec_irqrestore(lruvec, flags);
>
> +			lruvec = mem_cgroup_page_lruvec(page, pgdat);
>  			compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
> -			locked = lruvec;
>  			rcu_read_unlock();
>

and some bugs:

[ 534.564741] CPU: 23 PID: 545 Comm: kcompactd1 Kdump: loaded Tainted: G S W 5.8.0-next-20200803-00028-g9a7ff2cd6e5c #85
[ 534.577320] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.0.PL.IP.P.027.02 05/29/2020
[ 534.587693] Call Trace:
[ 534.590522]  dump_stack+0x96/0xd0
[ 534.594231]  ___might_sleep.cold.90+0xff/0x115
[ 534.599102]  kcompactd+0x24b/0x370
[ 534.602904]  ? finish_wait+0x80/0x80
[ 534.606897]  ? kcompactd_do_work+0x3d0/0x3d0
[ 534.611566]  kthread+0x14e/0x170
[ 534.615182]  ? kthread_park+0x80/0x80
[ 534.619252]  ret_from_fork+0x1f/0x30
[ 535.629483] BUG: sleeping function called from invalid context at include/linux/freezer.h:57
[ 535.638691] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 545, name: kcompactd1
[ 535.647601] INFO: lockdep is turned off.
On Thu, Aug 13, 2020 at 12:45 AM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
>
>
> On 2020/8/13 12:02 PM, Alexander Duyck wrote:
> > -		rcu_read_lock();
> > -		lruvec = mem_cgroup_page_lruvec(page, pgdat);
> > -
> >  		/* If we already hold the lock, we can skip some rechecking */
> > -		if (lruvec != locked) {
> > -			if (locked)
> > -				unlock_page_lruvec_irqrestore(locked, flags);
> > +		if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {
>
> Oops, lruvec_holds_page_lru_lock needs rcu_read_lock.

How so? The reason I wrote lruvec_holds_page_lru_lock the way I did is
that it is simply comparing the pointers held by the page and the lruvec.
It is never actually accessing any of the values, just the pointers. As
such we should be able to compare the two since the lruvec is still
locked and the memcg and pgdat held by the lruvec should not be changed.
Likewise with the page pointers assuming the values match.

> > +			if (lruvec)
> > +				unlock_page_lruvec_irqrestore(lruvec, flags);
> >
> > +			lruvec = mem_cgroup_page_lruvec(page, pgdat);
> >  			compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
> > -			locked = lruvec;
> >  			rcu_read_unlock();
> >
>
> and some bugs:
> [ 534.564741] CPU: 23 PID: 545 Comm: kcompactd1 Kdump: loaded Tainted: G S W 5.8.0-next-20200803-00028-g9a7ff2cd6e5c #85
> [ 534.577320] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.0.PL.IP.P.027.02 05/29/2020
> [ 534.587693] Call Trace:
> [ 534.590522]  dump_stack+0x96/0xd0
> [ 534.594231]  ___might_sleep.cold.90+0xff/0x115
> [ 534.599102]  kcompactd+0x24b/0x370
> [ 534.602904]  ? finish_wait+0x80/0x80
> [ 534.606897]  ? kcompactd_do_work+0x3d0/0x3d0
> [ 534.611566]  kthread+0x14e/0x170
> [ 534.615182]  ? kthread_park+0x80/0x80
> [ 534.619252]  ret_from_fork+0x1f/0x30
> [ 535.629483] BUG: sleeping function called from invalid context at include/linux/freezer.h:57
> [ 535.638691] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 545, name: kcompactd1
> [ 535.647601] INFO: lockdep is turned off.

Ah, I see the bug now. It isn't the lruvec_holds_page_lru_lock that needs
the RCU lock. This is an issue resulting from a merge conflict. There
should have been an rcu_read_lock added before mem_cgroup_page_lruvec.
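To make the conclusion above concrete, here is a sketch of the ordering the thread converges on. It is an illustration reconstructed from the messages, not the hunk that finally landed: rcu_read_lock() has to cover mem_cgroup_page_lruvec(), which looks up the lruvec through the page's memcg, while the pointer check in lruvec_holds_page_lru_lock() can run before the RCU read lock is taken.

```c
		/* If we already hold the lock, we can skip some rechecking */
		if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {
			if (lruvec)
				unlock_page_lruvec_irqrestore(lruvec, flags);

			/*
			 * The piece missing from the posted hunk: the lruvec
			 * lookup goes through the page's memcg, so it must be
			 * done under the RCU read lock.
			 */
			rcu_read_lock();
			lruvec = mem_cgroup_page_lruvec(page, pgdat);
			compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
			rcu_read_unlock();
		}
```

The rcu_read_unlock() right after compact_lock_irqsave() mirrors the original patch, where holding lru_lock is what keeps the lruvec stable from that point on.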
On Wed, Aug 12, 2020 at 11:57 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
>
>
> On 2020/8/13 12:02 PM, Alexander Duyck wrote:
> > From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >
> > We can drop the need for the locked variable by making use of the
> > lruvec_holds_page_lru_lock function. By doing this we can avoid some rcu
> > locking ugliness for the case where the lruvec is still holding the LRU
> > lock associated with the page. Instead we can just use the lruvec and if it
> > is NULL we assume the lock was released.
> >
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > ---
> >  mm/compaction.c | 45 ++++++++++++++++++++-------------------------
> >  1 file changed, 20 insertions(+), 25 deletions(-)
>
> Thanks a lot!
> I don't know if the community is ok with keeping this patch standalone, following the whole patchset?

I am fine with you squashing it with another patch if you want. In
theory this could probably be squashed in with the earlier patch I
submitted that introduced lruvec_holds_page_lru_lock or some other
patch. It is mostly just a cleanup anyway as it gets us away from
needing to hold the RCU read lock in the case that we already have the
correct lruvec.
On 2020/8/13 10:32 PM, Alexander Duyck wrote:
>> Thanks a lot!
>> I don't know if the community is ok with keeping this patch standalone, following the whole patchset?
> I am fine with you squashing it with another patch if you want. In
> theory this could probably be squashed in with the earlier patch I
> submitted that introduced lruvec_holds_page_lru_lock or some other
> patch. It is mostly just a cleanup anyway as it gets us away from
> needing to hold the RCU read lock in the case that we already have the
> correct lruvec.

Hi Alexander,

Thanks a lot! It looks like it's better to fold it into patch 17.

Alex
```diff
diff --git a/mm/compaction.c b/mm/compaction.c
index b99c96c4862d..5021a18ef722 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -803,9 +803,8 @@ static bool too_many_isolated(pg_data_t *pgdat)
 {
 	pg_data_t *pgdat = cc->zone->zone_pgdat;
 	unsigned long nr_scanned = 0, nr_isolated = 0;
-	struct lruvec *lruvec;
+	struct lruvec *lruvec = NULL;
 	unsigned long flags = 0;
-	struct lruvec *locked = NULL;
 	struct page *page = NULL, *valid_page = NULL;
 	unsigned long start_pfn = low_pfn;
 	bool skip_on_failure = false;
@@ -866,9 +865,9 @@ static bool too_many_isolated(pg_data_t *pgdat)
 		 * a fatal signal is pending.
 		 */
 		if (!(low_pfn % SWAP_CLUSTER_MAX)) {
-			if (locked) {
-				unlock_page_lruvec_irqrestore(locked, flags);
-				locked = NULL;
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
 			}
 
 			if (fatal_signal_pending(current)) {
@@ -949,9 +948,9 @@ static bool too_many_isolated(pg_data_t *pgdat)
 			 */
 			if (unlikely(__PageMovable(page)) &&
 					!PageIsolated(page)) {
-				if (locked) {
-					unlock_page_lruvec_irqrestore(locked, flags);
-					locked = NULL;
+				if (lruvec) {
+					unlock_page_lruvec_irqrestore(lruvec, flags);
+					lruvec = NULL;
 				}
 
 				if (!isolate_movable_page(page, isolate_mode))
@@ -992,16 +991,13 @@ static bool too_many_isolated(pg_data_t *pgdat)
 		if (!TestClearPageLRU(page))
 			goto isolate_fail_put;
 
-		rcu_read_lock();
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
-
 		/* If we already hold the lock, we can skip some rechecking */
-		if (lruvec != locked) {
-			if (locked)
-				unlock_page_lruvec_irqrestore(locked, flags);
+		if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {
+			if (lruvec)
+				unlock_page_lruvec_irqrestore(lruvec, flags);
 
+			lruvec = mem_cgroup_page_lruvec(page, pgdat);
 			compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
-			locked = lruvec;
 			rcu_read_unlock();
 
 			lruvec_memcg_debug(lruvec, page);
@@ -1023,8 +1019,7 @@ static bool too_many_isolated(pg_data_t *pgdat)
 				SetPageLRU(page);
 				goto isolate_fail_put;
 			}
-		} else
-			rcu_read_unlock();
+		}
 
 		/* The whole page is taken off the LRU; skip the tail pages. */
 		if (PageCompound(page))
@@ -1057,9 +1052,9 @@ static bool too_many_isolated(pg_data_t *pgdat)
 
 isolate_fail_put:
 		/* Avoid potential deadlock in freeing page under lru_lock */
-		if (locked) {
-			unlock_page_lruvec_irqrestore(locked, flags);
-			locked = NULL;
+		if (lruvec) {
+			unlock_page_lruvec_irqrestore(lruvec, flags);
+			lruvec = NULL;
 		}
 		put_page(page);
 
@@ -1073,9 +1068,9 @@ static bool too_many_isolated(pg_data_t *pgdat)
 		 * page anyway.
 		 */
 		if (nr_isolated) {
-			if (locked) {
-				unlock_page_lruvec_irqrestore(locked, flags);
-				locked = NULL;
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
 			}
 			putback_movable_pages(&cc->migratepages);
 			cc->nr_migratepages = 0;
@@ -1102,8 +1097,8 @@ static bool too_many_isolated(pg_data_t *pgdat)
 	page = NULL;
 
 isolate_abort:
-	if (locked)
-		unlock_page_lruvec_irqrestore(locked, flags);
+	if (lruvec)
+		unlock_page_lruvec_irqrestore(lruvec, flags);
 	if (page) {
 		SetPageLRU(page);
 		put_page(page);
```