diff mbox series

[RFC,1/3] mm: Drop locked from isolate_migratepages_block

Message ID 20200813040224.13054.96724.stgit@localhost.localdomain (mailing list archive)
State New, archived
Headers show
Series Re: [PATCH v17 14/21] mm/compaction: do page isolation first in compaction | expand

Commit Message

Alexander Duyck Aug. 13, 2020, 4:02 a.m. UTC
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

We can drop the need for the locked variable by making use of the
lruvec_holds_page_lru_lock function. By doing this we can avoid some rcu
locking ugliness for the case where the lruvec is still holding the LRU
lock associated with the page. Instead we can just use the lruvec and if it
is NULL we assume the lock was released.

Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 mm/compaction.c |   45 ++++++++++++++++++++-------------------------
 1 file changed, 20 insertions(+), 25 deletions(-)

Comments

Alex Shi Aug. 13, 2020, 6:56 a.m. UTC | #1
在 2020/8/13 下午12:02, Alexander Duyck 写道:
> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> 
> We can drop the need for the locked variable by making use of the
> lruvec_holds_page_lru_lock function. By doing this we can avoid some rcu
> locking ugliness for the case where the lruvec is still holding the LRU
> lock associated with the page. Instead we can just use the lruvec and if it
> is NULL we assume the lock was released.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> ---
>  mm/compaction.c |   45 ++++++++++++++++++++-------------------------
>  1 file changed, 20 insertions(+), 25 deletions(-)

Thanks a lot!
Don't know if community is ok if we keep the patch following whole patchset alone?
Alex Shi Aug. 13, 2020, 7:44 a.m. UTC | #2
在 2020/8/13 下午12:02, Alexander Duyck 写道:
> -		rcu_read_lock();
> -		lruvec = mem_cgroup_page_lruvec(page, pgdat);
> -
>  		/* If we already hold the lock, we can skip some rechecking */
> -		if (lruvec != locked) {
> -			if (locked)
> -				unlock_page_lruvec_irqrestore(locked, flags);
> +		if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {

Ops, lruvec_holds_page_lru_lock need rcu_read_lock. 

> +			if (lruvec)
> +				unlock_page_lruvec_irqrestore(lruvec, flags);
>  
> +			lruvec = mem_cgroup_page_lruvec(page, pgdat);
>  			compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
> -			locked = lruvec;
>  			rcu_read_unlock();
>  

and some bugs:
[  534.564741] CPU: 23 PID: 545 Comm: kcompactd1 Kdump: loaded Tainted: G S      W         5.8.0-next-20200803-00028-g9a7ff2cd6e5c #85
[  534.577320] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.0.PL.IP.P.027.02 05/29/2020
[  534.587693] Call Trace:
[  534.590522]  dump_stack+0x96/0xd0
[  534.594231]  ___might_sleep.cold.90+0xff/0x115
[  534.599102]  kcompactd+0x24b/0x370
[  534.602904]  ? finish_wait+0x80/0x80
[  534.606897]  ? kcompactd_do_work+0x3d0/0x3d0
[  534.611566]  kthread+0x14e/0x170
[  534.615182]  ? kthread_park+0x80/0x80
[  534.619252]  ret_from_fork+0x1f/0x30
[  535.629483] BUG: sleeping function called from invalid context at include/linux/freezer.h:57
[  535.638691] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 545, name: kcompactd1
[  535.647601] INFO: lockdep is turned off.
Alexander Duyck Aug. 13, 2020, 2:26 p.m. UTC | #3
On Thu, Aug 13, 2020 at 12:45 AM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
>
>
> 在 2020/8/13 下午12:02, Alexander Duyck 写道:
> > -             rcu_read_lock();
> > -             lruvec = mem_cgroup_page_lruvec(page, pgdat);
> > -
> >               /* If we already hold the lock, we can skip some rechecking */
> > -             if (lruvec != locked) {
> > -                     if (locked)
> > -                             unlock_page_lruvec_irqrestore(locked, flags);
> > +             if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {
>
> Ops, lruvec_holds_page_lru_lock need rcu_read_lock.

How so? The reason I wrote lruvec_holds_page_lru_lock the way I did is
that it is simply comparing the pointers held by the page and the
lruvec. It is never actually accessing any of the values, just the
pointers. As such we should be able to compare the two since the
lruvec is still locked and the the memcg and pgdat held by the lruvec
should not be changed. Likewise with the page pointers assuming the
values match.

> > +                     if (lruvec)
> > +                             unlock_page_lruvec_irqrestore(lruvec, flags);
> >
> > +                     lruvec = mem_cgroup_page_lruvec(page, pgdat);
> >                       compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
> > -                     locked = lruvec;
> >                       rcu_read_unlock();
> >
>
> and some bugs:
> [  534.564741] CPU: 23 PID: 545 Comm: kcompactd1 Kdump: loaded Tainted: G S      W         5.8.0-next-20200803-00028-g9a7ff2cd6e5c #85
> [  534.577320] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 1.0.PL.IP.P.027.02 05/29/2020
> [  534.587693] Call Trace:
> [  534.590522]  dump_stack+0x96/0xd0
> [  534.594231]  ___might_sleep.cold.90+0xff/0x115
> [  534.599102]  kcompactd+0x24b/0x370
> [  534.602904]  ? finish_wait+0x80/0x80
> [  534.606897]  ? kcompactd_do_work+0x3d0/0x3d0
> [  534.611566]  kthread+0x14e/0x170
> [  534.615182]  ? kthread_park+0x80/0x80
> [  534.619252]  ret_from_fork+0x1f/0x30
> [  535.629483] BUG: sleeping function called from invalid context at include/linux/freezer.h:57
> [  535.638691] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 545, name: kcompactd1
> [  535.647601] INFO: lockdep is turned off.

Ah, I see the bug now. It isn't the lruvec_holds_page_lru_lock that
needs the LRU lock. This is an issue as a part of a merge conflict.
There should have been an rcu_read_lock added before
mem_cgroup_page_lruvec.
Alexander Duyck Aug. 13, 2020, 2:32 p.m. UTC | #4
On Wed, Aug 12, 2020 at 11:57 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
>
>
> 在 2020/8/13 下午12:02, Alexander Duyck 写道:
> > From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >
> > We can drop the need for the locked variable by making use of the
> > lruvec_holds_page_lru_lock function. By doing this we can avoid some rcu
> > locking ugliness for the case where the lruvec is still holding the LRU
> > lock associated with the page. Instead we can just use the lruvec and if it
> > is NULL we assume the lock was released.
> >
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > ---
> >  mm/compaction.c |   45 ++++++++++++++++++++-------------------------
> >  1 file changed, 20 insertions(+), 25 deletions(-)
>
> Thanks a lot!
> Don't know if community is ok if we keep the patch following whole patchset alone?

I am fine with you squashing it with another patch if you want. In
theory this could probably be squashed in with the earlier patch I
submitted that introduced lruvec_holds_page_lru_lock or some other
patch. It is mostly just a cleanup anyway as it gets us away from
needing to hold the RCU read lock in the case that we already have the
correct lruvec.
Alex Shi Aug. 14, 2020, 7:25 a.m. UTC | #5
在 2020/8/13 下午10:32, Alexander Duyck 写道:
>> Thanks a lot!
>> Don't know if community is ok if we keep the patch following whole patchset alone?
> I am fine with you squashing it with another patch if you want. In
> theory this could probably be squashed in with the earlier patch I
> submitted that introduced lruvec_holds_page_lru_lock or some other
> patch. It is mostly just a cleanup anyway as it gets us away from
> needing to hold the RCU read lock in the case that we already have the
> correct lruvec.

Hi Alexander,

Thanks a lot! look like it's better to fold it in patch 17.

Alex
diff mbox series

Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index b99c96c4862d..5021a18ef722 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -803,9 +803,8 @@  static bool too_many_isolated(pg_data_t *pgdat)
 {
 	pg_data_t *pgdat = cc->zone->zone_pgdat;
 	unsigned long nr_scanned = 0, nr_isolated = 0;
-	struct lruvec *lruvec;
+	struct lruvec *lruvec = NULL;
 	unsigned long flags = 0;
-	struct lruvec *locked = NULL;
 	struct page *page = NULL, *valid_page = NULL;
 	unsigned long start_pfn = low_pfn;
 	bool skip_on_failure = false;
@@ -866,9 +865,9 @@  static bool too_many_isolated(pg_data_t *pgdat)
 		 * a fatal signal is pending.
 		 */
 		if (!(low_pfn % SWAP_CLUSTER_MAX)) {
-			if (locked) {
-				unlock_page_lruvec_irqrestore(locked, flags);
-				locked = NULL;
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
 			}
 
 			if (fatal_signal_pending(current)) {
@@ -949,9 +948,9 @@  static bool too_many_isolated(pg_data_t *pgdat)
 			 */
 			if (unlikely(__PageMovable(page)) &&
 					!PageIsolated(page)) {
-				if (locked) {
-					unlock_page_lruvec_irqrestore(locked, flags);
-					locked = NULL;
+				if (lruvec) {
+					unlock_page_lruvec_irqrestore(lruvec, flags);
+					lruvec = NULL;
 				}
 
 				if (!isolate_movable_page(page, isolate_mode))
@@ -992,16 +991,13 @@  static bool too_many_isolated(pg_data_t *pgdat)
 		if (!TestClearPageLRU(page))
 			goto isolate_fail_put;
 
-		rcu_read_lock();
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
-
 		/* If we already hold the lock, we can skip some rechecking */
-		if (lruvec != locked) {
-			if (locked)
-				unlock_page_lruvec_irqrestore(locked, flags);
+		if (!lruvec || !lruvec_holds_page_lru_lock(page, lruvec)) {
+			if (lruvec)
+				unlock_page_lruvec_irqrestore(lruvec, flags);
 
+			lruvec = mem_cgroup_page_lruvec(page, pgdat);
 			compact_lock_irqsave(&lruvec->lru_lock, &flags, cc);
-			locked = lruvec;
 			rcu_read_unlock();
 
 			lruvec_memcg_debug(lruvec, page);
@@ -1023,8 +1019,7 @@  static bool too_many_isolated(pg_data_t *pgdat)
 				SetPageLRU(page);
 				goto isolate_fail_put;
 			}
-		} else
-			rcu_read_unlock();
+		}
 
 		/* The whole page is taken off the LRU; skip the tail pages. */
 		if (PageCompound(page))
@@ -1057,9 +1052,9 @@  static bool too_many_isolated(pg_data_t *pgdat)
 
 isolate_fail_put:
 		/* Avoid potential deadlock in freeing page under lru_lock */
-		if (locked) {
-			unlock_page_lruvec_irqrestore(locked, flags);
-			locked = NULL;
+		if (lruvec) {
+			unlock_page_lruvec_irqrestore(lruvec, flags);
+			lruvec = NULL;
 		}
 		put_page(page);
 
@@ -1073,9 +1068,9 @@  static bool too_many_isolated(pg_data_t *pgdat)
 		 * page anyway.
 		 */
 		if (nr_isolated) {
-			if (locked) {
-				unlock_page_lruvec_irqrestore(locked, flags);
-				locked = NULL;
+			if (lruvec) {
+				unlock_page_lruvec_irqrestore(lruvec, flags);
+				lruvec = NULL;
 			}
 			putback_movable_pages(&cc->migratepages);
 			cc->nr_migratepages = 0;
@@ -1102,8 +1097,8 @@  static bool too_many_isolated(pg_data_t *pgdat)
 	page = NULL;
 
 isolate_abort:
-	if (locked)
-		unlock_page_lruvec_irqrestore(locked, flags);
+	if (lruvec)
+		unlock_page_lruvec_irqrestore(lruvec, flags);
 	if (page) {
 		SetPageLRU(page);
 		put_page(page);