From patchwork Mon Mar 2 11:00:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415265 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 62CCA138D for ; Mon, 2 Mar 2020 11:01:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 39E6924680 for ; Mon, 2 Mar 2020 11:01:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 39E6924680 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7A1D66B0005; Mon, 2 Mar 2020 06:01:02 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 753C96B0006; Mon, 2 Mar 2020 06:01:02 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 642196B0007; Mon, 2 Mar 2020 06:01:02 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0133.hostedemail.com [216.40.44.133]) by kanga.kvack.org (Postfix) with ESMTP id 493B26B0005 for ; Mon, 2 Mar 2020 06:01:02 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 300318245578 for ; Mon, 2 Mar 2020 11:01:02 +0000 (UTC) X-FDA: 76550129964.13.mouth09_7d4cb9c00e803 X-Spam-Summary: 2,0,0,e352d14828aa7ac3,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:966:968:973:988:989:1260:1261:1345:1359:1431:1437:1534:1542:1711:1730:1747:1777:1792:1801:2196:2199:2393:2559:2562:2898:3138:3139:3140:3141:3142:3353:3865:3866:3867:3868:3872:4321:4385:4605:5007:6261:6737:8957:9010:9121:9592:10004:11026:11473:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:12986:13161:13229:13846:14181:14394:14721:14915:21060:21080:21451:21627:21987:30054:30070,0,RBL:115.124.30.130:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: mouth09_7d4cb9c00e803 X-Filterd-Recvd-Size: 4346 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:00:59 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R711e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04400;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrQmbef_1583146853; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQmbef_1583146853) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:53 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 01/20] mm/vmscan: remove unnecessary lruvec adding Date: Mon, 2 Mar 2020 19:00:11 
+0800 Message-Id: <1583146830-169516-2-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: We don't have to add a freeable page onto the LRU and then remove it again. This change saves a couple of actions and makes the page movement clearer. The SetPageLRU needs to be kept here for list integrity. Otherwise: #0 move_pages_to_lru #1 release_pages if (put_page_testzero()) if !put_page_testzero !PageLRU //skip lru_lock list_add(&page->lru,) list_add(&page->lru,) //corrupt Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Tejun Heo Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/vmscan.c | 32 +++++++++++++++++++++----------- 1 file changed, 21 insertions(+), 11 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 876370565455..dcdd33f65f43 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1838,26 +1838,29 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, while (!list_empty(list)) { page = lru_to_page(list); VM_BUG_ON_PAGE(PageLRU(page), page); + list_del(&page->lru); if (unlikely(!page_evictable(page))) { - list_del(&page->lru); spin_unlock_irq(&pgdat->lru_lock); putback_lru_page(page); spin_lock_irq(&pgdat->lru_lock); continue; } - lruvec = mem_cgroup_page_lruvec(page, pgdat); + /* + * The SetPageLRU needs to be kept here for list integrity. + * Otherwise: + * #0 move_pages_to_lru #1 release_pages + * if (put_page_testzero()) + * if !put_page_testzero + * !PageLRU //skip lru_lock + * list_add(&page->lru,) + * list_add(&page->lru,) //corrupt + */ SetPageLRU(page); - lru = page_lru(page); - - nr_pages = hpage_nr_pages(page); - update_lru_size(lruvec, lru, page_zonenum(page), nr_pages); - list_move(&page->lru, &lruvec->lists[lru]); - if (put_page_testzero(page)) { + if (unlikely(put_page_testzero(page))) { __ClearPageLRU(page); __ClearPageActive(page); - del_page_from_lru_list(page, lruvec, lru); if (unlikely(PageCompound(page))) { spin_unlock_irq(&pgdat->lru_lock); @@ -1865,9 +1868,16 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, spin_lock_irq(&pgdat->lru_lock); } else list_add(&page->lru, &pages_to_free); - } else { - nr_moved += nr_pages; + continue; } + + lruvec = mem_cgroup_page_lruvec(page, pgdat); + lru = page_lru(page); + nr_pages = hpage_nr_pages(page); + + update_lru_size(lruvec, lru, page_zonenum(page), nr_pages); + list_add(&page->lru, &lruvec->lists[lru]); + nr_moved += nr_pages; } /* From patchwork Mon Mar 2 11:00:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415285 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 06481109A for ; Mon, 2 Mar 2020 11:01:29 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D15932465E for ; Mon, 2 Mar 2020 11:01:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D15932465E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: 
mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 50B2C6B0036; Mon, 2 Mar 2020 06:01:14 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 4BCDC6B0037; Mon, 2 Mar 2020 06:01:14 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3AD1E6B006C; Mon, 2 Mar 2020 06:01:14 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0104.hostedemail.com [216.40.44.104]) by kanga.kvack.org (Postfix) with ESMTP id 20DD66B0036 for ; Mon, 2 Mar 2020 06:01:14 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id E82B8181AC9B6 for ; Mon, 2 Mar 2020 11:01:13 +0000 (UTC) X-FDA: 76550130426.11.pear35_7e907f3ea7713 X-Spam-Summary: 2,0,0,ef67506c8dc56c83,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1543:1711:1730:1747:1777:1792:2393:2553:2559:2562:3138:3139:3140:3141:3142:3354:3865:3866:3868:3870:3872:4321:5007:6261:6737:7514:7875:8957:9207:9592:10004:11026:11232:11473:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:13846:14096:14181:14394:14721:14915:21060:21080:21324:21451:21627:30054:30070:30090,0,RBL:115.124.30.54:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: pear35_7e907f3ea7713 X-Filterd-Recvd-Size: 4727 Received: from out30-54.freemail.mail.aliyun.com (out30-54.freemail.mail.aliyun.com [115.124.30.54]) by imf05.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:02 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R851e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0TrQxdHj_1583146853; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQxdHj_1583146853) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:54 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 02/20] mm/memcg: fold lock_page_lru into commit_charge Date: Mon, 2 Mar 2020 19:00:12 +0800 Message-Id: <1583146830-169516-3-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: As Konstantin Khlebnikov mentioned: Also I don't like these functions: - called lock/unlock but actually also isolates - used just once - pgdat evaluated twice Cleanup and fold these functions into commit_charge. It also reduces lock time while lrucare && !PageLRU. 
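As a rough illustration of the locking behaviour described above, here is a minimal userspace sketch in plain C11 (not kernel code; the struct, field and function names are hypothetical stand-ins for page, PageLRU and commit_charge). The lock is taken once, and when the page turns out not to be on an LRU list it is released immediately, so it is only held across the charge commit in the lrucare-and-PageLRU case:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_page {
    bool on_lru;     /* models PageLRU */
    int mem_cgroup;  /* models page->mem_cgroup */
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

static void commit_charge_model(struct fake_page *page, int memcg, bool lrucare)
{
    bool isolated = false;

    if (lrucare) {
        pthread_mutex_lock(&lru_lock);
        if (page->on_lru) {
            page->on_lru = false;   /* take the page off the list */
            isolated = true;
        } else {
            /* nothing to protect: drop the lock right away */
            pthread_mutex_unlock(&lru_lock);
        }
    }

    page->mem_cgroup = memcg;       /* the actual charge commit */

    if (lrucare && isolated) {
        page->on_lru = true;        /* put the page back on the list */
        pthread_mutex_unlock(&lru_lock);
    }
}

int main(void)
{
    struct fake_page p = { .on_lru = false, .mem_cgroup = 0 };

    commit_charge_model(&p, 1, true);   /* !PageLRU: lock dropped early */
    p.on_lru = true;
    commit_charge_model(&p, 2, true);   /* PageLRU: lock held across commit */
    printf("memcg=%d on_lru=%d\n", p.mem_cgroup, (int)p.on_lru);
    return 0;
}
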
Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Michal Hocko Cc: Konstantin Khlebnikov Cc: Vladimir Davydov Cc: Andrew Morton Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/memcontrol.c | 57 ++++++++++++++++++++------------------------------------- 1 file changed, 20 insertions(+), 37 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d09776cd6e10..875e2aebcde7 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2572,41 +2572,11 @@ static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages) css_put_many(&memcg->css, nr_pages); } -static void lock_page_lru(struct page *page, int *isolated) -{ - pg_data_t *pgdat = page_pgdat(page); - - spin_lock_irq(&pgdat->lru_lock); - if (PageLRU(page)) { - struct lruvec *lruvec; - - lruvec = mem_cgroup_page_lruvec(page, pgdat); - ClearPageLRU(page); - del_page_from_lru_list(page, lruvec, page_lru(page)); - *isolated = 1; - } else - *isolated = 0; -} - -static void unlock_page_lru(struct page *page, int isolated) -{ - pg_data_t *pgdat = page_pgdat(page); - - if (isolated) { - struct lruvec *lruvec; - - lruvec = mem_cgroup_page_lruvec(page, pgdat); - VM_BUG_ON_PAGE(PageLRU(page), page); - SetPageLRU(page); - add_page_to_lru_list(page, lruvec, page_lru(page)); - } - spin_unlock_irq(&pgdat->lru_lock); -} - static void commit_charge(struct page *page, struct mem_cgroup *memcg, bool lrucare) { - int isolated; + struct lruvec *lruvec = NULL; + pg_data_t *pgdat; VM_BUG_ON_PAGE(page->mem_cgroup, page); @@ -2614,9 +2584,17 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, * In some cases, SwapCache and FUSE(splice_buf->radixtree), the page * may already be on some other mem_cgroup's LRU. Take care of it. */ - if (lrucare) - lock_page_lru(page, &isolated); - + if (lrucare) { + pgdat = page_pgdat(page); + spin_lock_irq(&pgdat->lru_lock); + + if (PageLRU(page)) { + lruvec = mem_cgroup_page_lruvec(page, pgdat); + ClearPageLRU(page); + del_page_from_lru_list(page, lruvec, page_lru(page)); + } else + spin_unlock_irq(&pgdat->lru_lock); + } /* * Nobody should be changing or seriously looking at * page->mem_cgroup at this point: @@ -2633,8 +2611,13 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, */ page->mem_cgroup = memcg; - if (lrucare) - unlock_page_lru(page, isolated); + if (lrucare && lruvec) { + lruvec = mem_cgroup_page_lruvec(page, pgdat); + VM_BUG_ON_PAGE(PageLRU(page), page); + SetPageLRU(page); + add_page_to_lru_list(page, lruvec, page_lru(page)); + spin_unlock_irq(&pgdat->lru_lock); + } } #ifdef CONFIG_MEMCG_KMEM From patchwork Mon Mar 2 11:00:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415303 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CD1FF109A for ; Mon, 2 Mar 2020 11:01:48 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A3FB224676 for ; Mon, 2 Mar 2020 11:01:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A3FB224676 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 7CA7F6B0072; Mon, 2 Mar 2020 06:01:47 -0500 (EST) Delivered-To: 
linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 7A0B16B0073; Mon, 2 Mar 2020 06:01:47 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 690BC6B0074; Mon, 2 Mar 2020 06:01:47 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0149.hostedemail.com [216.40.44.149]) by kanga.kvack.org (Postfix) with ESMTP id 4D0B76B0072 for ; Mon, 2 Mar 2020 06:01:47 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 2F1A522BEA for ; Mon, 2 Mar 2020 11:01:47 +0000 (UTC) X-FDA: 76550131854.17.tramp85_83e65c0ac8c44 X-Spam-Summary: 2,0,0,a60942d40f47b4e7,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1541:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3352:3865:3867:3871:3874:4321:5007:6119:6261:6737:7903:10004:11026:11658:11914:12043:12048:12297:12438:12555:12895:12986:13069:13311:13357:13846:14096:14181:14384:14394:14721:14915:21060:21080:21451:21627:21990:30054:30070,0,RBL:115.124.30.54:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:1,LUA_SUMMARY:none X-HE-Tag: tramp85_83e65c0ac8c44 X-Filterd-Recvd-Size: 2686 Received: from out30-54.freemail.mail.aliyun.com (out30-54.freemail.mail.aliyun.com [115.124.30.54]) by imf14.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:45 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrQzvOW_1583146854; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQzvOW_1583146854) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:54 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 03/20] mm/page_idle: no unlikely double check for idle page counting Date: Mon, 2 Mar 2020 19:00:13 +0800 Message-Id: <1583146830-169516-4-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: As func comments mentioned, few isolated page missing be tolerated. So why not do further to drop the unlikely double check. That won't cause more idle pages, but reduce a lock contention. This is also a preparation for later new page isolation feature. 
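To make the trade-off concrete, the following is a minimal userspace sketch in plain C11 atomics (not kernel code; the names are hypothetical stand-ins). The page is pinned only with a "get the page unless its refcount is already zero" operation, and the lru_lock-protected PageLRU recheck is simply dropped, accepting that a page which is concurrently being isolated may occasionally still be reported as idle:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_page {
    atomic_int refcount;
};

/* models get_page_unless_zero(): pin the page only if it is still live */
static bool get_page_unless_zero_model(struct fake_page *page)
{
    int old = atomic_load(&page->refcount);

    while (old != 0) {
        if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
            return true;    /* pinned, safe to inspect the page */
    }
    return false;           /* already freed elsewhere, skip it */
}

int main(void)
{
    struct fake_page live = { .refcount = 2 };
    struct fake_page dead = { .refcount = 0 };

    printf("live page pinned: %d\n", (int)get_page_unless_zero_model(&live));
    printf("dead page pinned: %d\n", (int)get_page_unless_zero_model(&dead));
    return 0;
}
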
Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/page_idle.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/mm/page_idle.c b/mm/page_idle.c index 295512465065..914df63948b1 100644 --- a/mm/page_idle.c +++ b/mm/page_idle.c @@ -31,7 +31,6 @@ static struct page *page_idle_get_page(unsigned long pfn) { struct page *page; - pg_data_t *pgdat; if (!pfn_valid(pfn)) return NULL; @@ -41,13 +40,6 @@ static struct page *page_idle_get_page(unsigned long pfn) !get_page_unless_zero(page)) return NULL; - pgdat = page_pgdat(page); - spin_lock_irq(&pgdat->lru_lock); - if (unlikely(!PageLRU(page))) { - put_page(page); - page = NULL; - } - spin_unlock_irq(&pgdat->lru_lock); return page; } From patchwork Mon Mar 2 11:00:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415271 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A8711109A for ; Mon, 2 Mar 2020 11:01:11 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 7FEE920656 for ; Mon, 2 Mar 2020 11:01:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7FEE920656 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DC2DC6B000E; Mon, 2 Mar 2020 06:01:05 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id CFD596B000C; Mon, 2 Mar 2020 06:01:05 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C1B776B000D; Mon, 2 Mar 2020 06:01:05 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0081.hostedemail.com [216.40.44.81]) by kanga.kvack.org (Postfix) with ESMTP id A97F06B0008 for ; Mon, 2 Mar 2020 06:01:05 -0500 (EST) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 879E22DFC for ; Mon, 2 Mar 2020 11:01:05 +0000 (UTC) X-FDA: 76550130090.05.fish12_7dc8499f0a010 X-Spam-Summary: 2,0,0,be5176fd15a61755,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1535:1544:1605:1711:1730:1747:1777:1792:2198:2199:2393:2559:2562:2731:2890:3138:3139:3140:3141:3142:3865:3870:3871:3872:4042:4117:4321:4605:5007:6261:6737:7903:8957:9010:9592:10004:11026:11232:11473:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:12986:13161:13229:13846:14096:14181:14394:14721:14915:21060:21080:21450:21451:21627:21740:21795:21987:30001:30051:30054,0,RBL:115.124.30.133:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:26,LUA_SUMMARY:none X-HE-Tag: fish12_7dc8499f0a010 X-Filterd-Recvd-Size: 6041 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) by imf38.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 
11:01:03 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04455;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrQzvOc_1583146854; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQzvOc_1583146854) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:55 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 04/20] mm/thp: move lru_add_page_tail func to huge_memory.c Date: Mon, 2 Mar 2020 19:00:14 +0800 Message-Id: <1583146830-169516-5-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The func is only used in huge_memory.c, defining it in other file with a CONFIG_TRANSPARENT_HUGEPAGE macro restrict just looks weird. Let's move it close user. Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org --- include/linux/swap.h | 4 ++-- mm/huge_memory.c | 35 +++++++++++++++++++++++++++++++++++ mm/swap.c | 41 +---------------------------------------- 3 files changed, 38 insertions(+), 42 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index 1e99f7ac1d7e..c555e8f161ad 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -328,11 +328,11 @@ struct vma_swap_readahead { /* linux/mm/swap.c */ +extern void update_page_reclaim_stat(struct lruvec *lruvec, + int file, int rotated); extern void lru_cache_add(struct page *); extern void lru_cache_add_anon(struct page *page); extern void lru_cache_add_file(struct page *page); -extern void lru_add_page_tail(struct page *page, struct page *page_tail, - struct lruvec *lruvec, struct list_head *head); extern void activate_page(struct page *); extern void mark_page_accessed(struct page *); extern void lru_add_drain(void); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b08b199f9a11..acef164a8981 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2445,6 +2445,41 @@ static void remap_page(struct page *page) } } +void lru_add_page_tail(struct page *page, struct page *page_tail, + struct lruvec *lruvec, struct list_head *list) +{ + const int file = 0; + + VM_BUG_ON_PAGE(!PageHead(page), page); + VM_BUG_ON_PAGE(PageCompound(page_tail), page); + VM_BUG_ON_PAGE(PageLRU(page_tail), page); + lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); + + if (!list) + SetPageLRU(page_tail); + + if (likely(PageLRU(page))) + list_add_tail(&page_tail->lru, &page->lru); + else if (list) { + /* page reclaim is reclaiming a huge page */ + get_page(page_tail); + list_add_tail(&page_tail->lru, list); + } else { + /* + * Head page has not yet been counted, as an hpage, + * so we must account for each subpage individually. + * + * Put page_tail on the list at the correct position + * so they all end up in order. 
+ */ + add_page_to_lru_list_tail(page_tail, lruvec, + page_lru(page_tail)); + } + + if (!PageUnevictable(page)) + update_page_reclaim_stat(lruvec, file, PageActive(page_tail)); +} + static void __split_huge_page_tail(struct page *head, int tail, struct lruvec *lruvec, struct list_head *list) { diff --git a/mm/swap.c b/mm/swap.c index cf39d24ada2a..1ac24fc35d6b 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -262,8 +262,7 @@ void rotate_reclaimable_page(struct page *page) } } -static void update_page_reclaim_stat(struct lruvec *lruvec, - int file, int rotated) +void update_page_reclaim_stat(struct lruvec *lruvec, int file, int rotated) { struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat; @@ -885,44 +884,6 @@ void __pagevec_release(struct pagevec *pvec) } EXPORT_SYMBOL(__pagevec_release); -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -/* used by __split_huge_page_refcount() */ -void lru_add_page_tail(struct page *page, struct page *page_tail, - struct lruvec *lruvec, struct list_head *list) -{ - const int file = 0; - - VM_BUG_ON_PAGE(!PageHead(page), page); - VM_BUG_ON_PAGE(PageCompound(page_tail), page); - VM_BUG_ON_PAGE(PageLRU(page_tail), page); - lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); - - if (!list) - SetPageLRU(page_tail); - - if (likely(PageLRU(page))) - list_add_tail(&page_tail->lru, &page->lru); - else if (list) { - /* page reclaim is reclaiming a huge page */ - get_page(page_tail); - list_add_tail(&page_tail->lru, list); - } else { - /* - * Head page has not yet been counted, as an hpage, - * so we must account for each subpage individually. - * - * Put page_tail on the list at the correct position - * so they all end up in order. - */ - add_page_to_lru_list_tail(page_tail, lruvec, - page_lru(page_tail)); - } - - if (!PageUnevictable(page)) - update_page_reclaim_stat(lruvec, file, PageActive(page_tail)); -} -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ - static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec, void *arg) { From patchwork Mon Mar 2 11:00:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415277 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 32035109A for ; Mon, 2 Mar 2020 11:01:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 09D6424697 for ; Mon, 2 Mar 2020 11:01:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 09D6424697 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C2CD16B000A; Mon, 2 Mar 2020 06:01:07 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id B9FFD6B0032; Mon, 2 Mar 2020 06:01:07 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8B86E6B000C; Mon, 2 Mar 2020 06:01:07 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0081.hostedemail.com [216.40.44.81]) by kanga.kvack.org (Postfix) with ESMTP id 70B466B000C for ; Mon, 2 Mar 2020 06:01:07 -0500 (EST) Received: from smtpin27.hostedemail.com 
(10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 6444B180AD801 for ; Mon, 2 Mar 2020 11:01:07 +0000 (UTC) X-FDA: 76550130174.27.wave30_7df3b4d683401 X-Spam-Summary: 2,0,0,6cdcf1591dc495a0,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1541:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3352:3865:3867:3868:3871:3872:4605:5007:6261:6737:7903:8957:9010:10004:11026:11232:11473:11658:11914:12043:12048:12296:12297:12438:12555:12895:12986:13069:13311:13357:13846:14096:14181:14384:14394:14721:14915:21060:21080:21451:21627:30054:30070,0,RBL:115.124.30.57:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: wave30_7df3b4d683401 X-Filterd-Recvd-Size: 3242 Received: from out30-57.freemail.mail.aliyun.com (out30-57.freemail.mail.aliyun.com [115.124.30.57]) by imf43.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:05 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e07417;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrQzvOi_1583146855; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQzvOi_1583146855) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:55 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 05/20] mm/thp: clean up lru_add_page_tail Date: Mon, 2 Mar 2020 19:00:15 +0800 Message-Id: <1583146830-169516-6-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since the first parameter is only used by head page, it's better to make it stright. 
And no needs to keep head checking: VM_BUG_ON_PAGE(!PageHead(page), page); Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/huge_memory.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index acef164a8981..599367d25fca 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2445,21 +2445,20 @@ static void remap_page(struct page *page) } } -void lru_add_page_tail(struct page *page, struct page *page_tail, +void lru_add_page_tail(struct page *head, struct page *page_tail, struct lruvec *lruvec, struct list_head *list) { const int file = 0; - VM_BUG_ON_PAGE(!PageHead(page), page); - VM_BUG_ON_PAGE(PageCompound(page_tail), page); - VM_BUG_ON_PAGE(PageLRU(page_tail), page); + VM_BUG_ON_PAGE(PageCompound(page_tail), head); + VM_BUG_ON_PAGE(PageLRU(page_tail), head); lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); if (!list) SetPageLRU(page_tail); - if (likely(PageLRU(page))) - list_add_tail(&page_tail->lru, &page->lru); + if (likely(PageLRU(head))) + list_add_tail(&page_tail->lru, &head->lru); else if (list) { /* page reclaim is reclaiming a huge page */ get_page(page_tail); @@ -2476,7 +2475,7 @@ void lru_add_page_tail(struct page *page, struct page *page_tail, page_lru(page_tail)); } - if (!PageUnevictable(page)) + if (!PageUnevictable(head)) update_page_reclaim_stat(lruvec, file, PageActive(page_tail)); } From patchwork Mon Mar 2 11:00:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415267 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 262A6138D for ; Mon, 2 Mar 2020 11:01:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EDE93246D6 for ; Mon, 2 Mar 2020 11:01:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EDE93246D6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id D70936B0006; Mon, 2 Mar 2020 06:01:04 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id CFC506B0007; Mon, 2 Mar 2020 06:01:04 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BE85C6B0008; Mon, 2 Mar 2020 06:01:04 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0129.hostedemail.com [216.40.44.129]) by kanga.kvack.org (Postfix) with ESMTP id A73C46B0006 for ; Mon, 2 Mar 2020 06:01:04 -0500 (EST) Received: from smtpin01.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 7708E8245578 for ; Mon, 2 Mar 2020 11:01:04 +0000 (UTC) X-FDA: 76550130048.01.voice81_7daa42b846262 X-Spam-Summary: 
2,0,0,0186c851007bef5b,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:966:973:988:989:1260:1261:1345:1359:1431:1437:1535:1543:1711:1730:1747:1777:1792:2196:2199:2393:2559:2562:3138:3139:3140:3141:3142:3354:3865:3867:3868:4250:4385:5007:6261:6737:8957:9010:9592:10004:11026:11473:11658:11914:12043:12048:12296:12297:12438:12555:12679:12895:13161:13229:13846:14096:14181:14394:14721:14915:21060:21080:21451:21627:21987:30012:30054,0,RBL:115.124.30.42:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: voice81_7daa42b846262 X-Filterd-Recvd-Size: 5148 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by imf37.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:03 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R421e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04446;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0TrR9JV8_1583146855; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrR9JV8_1583146855) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:56 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , "Kirill A. Shutemov" , Andrea Arcangeli , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 06/20] mm/thp: narrow lru locking Date: Mon, 2 Mar 2020 19:00:16 +0800 Message-Id: <1583146830-169516-7-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Lru locking just guard the lru list and subpage's Mlocked. Including other things can't give help just delay the locking release. So narrow the locking for early lock release and better code meaning. Signed-off-by: Alex Shi Cc: Kirill A. Shutemov Cc: Andrea Arcangeli Cc: Johannes Weiner Cc: Andrew Morton Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/huge_memory.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 599367d25fca..3835f87d03fd 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2542,13 +2542,14 @@ static void __split_huge_page_tail(struct page *head, int tail, } static void __split_huge_page(struct page *page, struct list_head *list, - pgoff_t end, unsigned long flags) + pgoff_t end) { struct page *head = compound_head(page); pg_data_t *pgdat = page_pgdat(head); struct lruvec *lruvec; struct address_space *swap_cache = NULL; unsigned long offset = 0; + unsigned long flags; int i; lruvec = mem_cgroup_page_lruvec(head, pgdat); @@ -2564,6 +2565,9 @@ static void __split_huge_page(struct page *page, struct list_head *list, xa_lock(&swap_cache->i_pages); } + /* Lru list would be changed, don't care head's LRU bit. 
*/ + spin_lock_irqsave(&pgdat->lru_lock, flags); + for (i = HPAGE_PMD_NR - 1; i >= 1; i--) { __split_huge_page_tail(head, i, lruvec, list); /* Some pages can be beyond i_size: drop them from page cache */ @@ -2581,6 +2585,7 @@ static void __split_huge_page(struct page *page, struct list_head *list, head + i, 0); } } + spin_unlock_irqrestore(&pgdat->lru_lock, flags); ClearPageCompound(head); @@ -2601,8 +2606,6 @@ static void __split_huge_page(struct page *page, struct list_head *list, xa_unlock(&head->mapping->i_pages); } - spin_unlock_irqrestore(&pgdat->lru_lock, flags); - remap_page(head); for (i = 0; i < HPAGE_PMD_NR; i++) { @@ -2740,13 +2743,11 @@ bool can_split_huge_page(struct page *page, int *pextra_pins) int split_huge_page_to_list(struct page *page, struct list_head *list) { struct page *head = compound_head(page); - struct pglist_data *pgdata = NODE_DATA(page_to_nid(head)); struct deferred_split *ds_queue = get_deferred_split_queue(head); struct anon_vma *anon_vma = NULL; struct address_space *mapping = NULL; int count, mapcount, extra_pins, ret; bool mlocked; - unsigned long flags; pgoff_t end; VM_BUG_ON_PAGE(is_huge_zero_page(head), head); @@ -2812,9 +2813,6 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) if (mlocked) lru_add_drain(); - /* prevent PageLRU to go away from under us, and freeze lru stats */ - spin_lock_irqsave(&pgdata->lru_lock, flags); - if (mapping) { XA_STATE(xas, &mapping->i_pages, page_index(head)); @@ -2844,7 +2842,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) __dec_node_page_state(head, NR_FILE_THPS); } - __split_huge_page(page, list, end, flags); + __split_huge_page(page, list, end); if (PageSwapCache(head)) { swp_entry_t entry = { .val = page_private(head) }; @@ -2863,7 +2861,6 @@ int split_huge_page_to_list(struct page *page, struct list_head *list) spin_unlock(&ds_queue->split_queue_lock); fail: if (mapping) xa_unlock(&mapping->i_pages); - spin_unlock_irqrestore(&pgdata->lru_lock, flags); remap_page(head); ret = -EBUSY; } From patchwork Mon Mar 2 11:00:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415269 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0008A109A for ; Mon, 2 Mar 2020 11:01:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id BE69A2468E for ; Mon, 2 Mar 2020 11:01:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BE69A2468E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 421526B0007; Mon, 2 Mar 2020 06:01:05 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 3D28D6B0008; Mon, 2 Mar 2020 06:01:05 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2991B6B000A; Mon, 2 Mar 2020 06:01:05 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0241.hostedemail.com [216.40.44.241]) by kanga.kvack.org (Postfix) with ESMTP id 11BDB6B0007 for ; Mon, 2 Mar 2020 06:01:05 
-0500 (EST) Received: from smtpin15.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id E620C180AD801 for ; Mon, 2 Mar 2020 11:01:04 +0000 (UTC) X-FDA: 76550130048.15.hill95_7dadf3cf53b1d X-Spam-Summary: 2,0,0,7244f262e4918ed3,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1535:1544:1711:1730:1747:1777:1792:2393:2559:2562:2693:2898:3138:3139:3140:3141:3142:3355:3867:3868:4117:4321:4605:5007:6261:6737:7514:7903:8603:8957:10004:11026:11232:11473:11638:11639:11658:11914:12043:12048:12296:12297:12438:12555:12895:13846:14096:14181:14394:14721:14915:21060:21080:21451:21627:21740:21987:21990:30054:30070,0,RBL:115.124.30.45:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: hill95_7dadf3cf53b1d X-Filterd-Recvd-Size: 6434 Received: from out30-45.freemail.mail.aliyun.com (out30-45.freemail.mail.aliyun.com [115.124.30.45]) by imf07.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:03 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R531e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04396;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0TrQxdI3_1583146856; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQxdI3_1583146856) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:56 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 07/20] mm/lru: introduce TestClearPageLRU Date: Mon, 2 Mar 2020 19:00:17 +0800 Message-Id: <1583146830-169516-8-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Combined PageLRU check and ClearPageLRU into one function by new introduced func TestClearPageLRU. This function will be used as page isolation precondition. No functional change yet. 
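For illustration only, here is a minimal userspace model in plain C11 atomics (not kernel code; the flag bit and names are made up for the example) of what TestClearPageLRU provides: an atomic test-and-clear of the LRU flag, so that of two racing isolators only the one that still observes the bit set wins the right to take the page off the list:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define PG_LRU (1u << 0)    /* illustrative bit position */

struct fake_page {
    atomic_uint flags;
};

/* models TestClearPageLRU(): clear the bit and report its previous value */
static bool test_clear_page_lru_model(struct fake_page *page)
{
    unsigned int old = atomic_fetch_and(&page->flags, ~PG_LRU);

    return old & PG_LRU;
}

int main(void)
{
    struct fake_page page = { .flags = PG_LRU };

    /* the first caller wins the isolation, the second one backs off */
    printf("first  test-and-clear: %d\n", (int)test_clear_page_lru_model(&page));
    printf("second test-and-clear: %d\n", (int)test_clear_page_lru_model(&page));
    return 0;
}
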
Suggested-by: Johannes Weiner Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Andrew Morton Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org --- include/linux/page-flags.h | 1 + mm/memcontrol.c | 4 ++-- mm/mlock.c | 3 +-- mm/swap.c | 8 ++------ mm/vmscan.c | 19 +++++++++---------- 5 files changed, 15 insertions(+), 20 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 1bf83c8fcaa7..5cb155f3191e 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -318,6 +318,7 @@ static inline void page_init_poison(struct page *page, size_t size) PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD) __CLEARPAGEFLAG(Dirty, dirty, PF_HEAD) PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD) + TESTCLEARFLAG(LRU, lru, PF_HEAD) PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD) TESTCLEARFLAG(Active, active, PF_HEAD) PAGEFLAG(Workingset, workingset, PF_HEAD) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 875e2aebcde7..f8419f3436a8 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2588,9 +2588,8 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, pgdat = page_pgdat(page); spin_lock_irq(&pgdat->lru_lock); - if (PageLRU(page)) { + if (TestClearPageLRU(page)) { lruvec = mem_cgroup_page_lruvec(page, pgdat); - ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_lru(page)); } else spin_unlock_irq(&pgdat->lru_lock); @@ -2613,6 +2612,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, if (lrucare && lruvec) { lruvec = mem_cgroup_page_lruvec(page, pgdat); + VM_BUG_ON_PAGE(PageLRU(page), page); SetPageLRU(page); add_page_to_lru_list(page, lruvec, page_lru(page)); diff --git a/mm/mlock.c b/mm/mlock.c index a72c1eeded77..03b3a5d99ad7 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -108,13 +108,12 @@ void mlock_vma_page(struct page *page) */ static bool __munlock_isolate_lru_page(struct page *page, bool getpage) { - if (PageLRU(page)) { + if (TestClearPageLRU(page)) { struct lruvec *lruvec; lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); if (getpage) get_page(page); - ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_lru(page)); return true; } diff --git a/mm/swap.c b/mm/swap.c index 1ac24fc35d6b..8e71bdd04a1a 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -59,15 +59,13 @@ */ static void __page_cache_release(struct page *page) { - if (PageLRU(page)) { + if (TestClearPageLRU(page)) { pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; unsigned long flags; spin_lock_irqsave(&pgdat->lru_lock, flags); lruvec = mem_cgroup_page_lruvec(page, pgdat); - VM_BUG_ON_PAGE(!PageLRU(page), page); - __ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_off_lru(page)); spin_unlock_irqrestore(&pgdat->lru_lock, flags); } @@ -831,7 +829,7 @@ void release_pages(struct page **pages, int nr) continue; } - if (PageLRU(page)) { + if (TestClearPageLRU(page)) { struct pglist_data *pgdat = page_pgdat(page); if (pgdat != locked_pgdat) { @@ -844,8 +842,6 @@ void release_pages(struct page **pages, int nr) } lruvec = mem_cgroup_page_lruvec(page, locked_pgdat); - VM_BUG_ON_PAGE(!PageLRU(page), page); - __ClearPageLRU(page); del_page_from_lru_list(page, lruvec, page_off_lru(page)); } diff --git a/mm/vmscan.c b/mm/vmscan.c index dcdd33f65f43..8958454d50fe 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1751,21 +1751,20 @@ int isolate_lru_page(struct page *page) 
VM_BUG_ON_PAGE(!page_count(page), page); WARN_RATELIMIT(PageTail(page), "trying to isolate tail page"); - if (PageLRU(page)) { + get_page(page); + if (TestClearPageLRU(page)) { pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; + int lru = page_lru(page); - spin_lock_irq(&pgdat->lru_lock); lruvec = mem_cgroup_page_lruvec(page, pgdat); - if (PageLRU(page)) { - int lru = page_lru(page); - get_page(page); - ClearPageLRU(page); - del_page_from_lru_list(page, lruvec, lru); - ret = 0; - } + spin_lock_irq(&pgdat->lru_lock); + del_page_from_lru_list(page, lruvec, lru); spin_unlock_irq(&pgdat->lru_lock); - } + ret = 0; + } else + put_page(page); + return ret; } From patchwork Mon Mar 2 11:00:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415287 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9205D138D for ; Mon, 2 Mar 2020 11:01:31 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5D0A32468E for ; Mon, 2 Mar 2020 11:01:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5D0A32468E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 952A96B0037; Mon, 2 Mar 2020 06:01:16 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 903036B006C; Mon, 2 Mar 2020 06:01:16 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A2996B006E; Mon, 2 Mar 2020 06:01:16 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0141.hostedemail.com [216.40.44.141]) by kanga.kvack.org (Postfix) with ESMTP id 5B0DD6B0037 for ; Mon, 2 Mar 2020 06:01:16 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 45038180AD801 for ; Mon, 2 Mar 2020 11:01:16 +0000 (UTC) X-FDA: 76550130552.18.sink12_7f521336eb12a X-Spam-Summary: 2,0,0,4c407320a69eaab4,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:2:41:69:355:379:541:800:960:966:973:988:989:1260:1261:1345:1359:1431:1437:1535:1605:1730:1747:1777:1792:2196:2198:2199:2200:2393:2553:2559:2562:2693:2731:2895:2898:2899:3138:3139:3140:3141:3142:3369:3865:3867:3868:3870:3871:3872:3874:4049:4118:4250:4321:4385:4605:5007:6119:6261:6737:8603:8957:9010:9592:10004:11026:11232:11473:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:12986:13846:14394:14915:21060:21080:21451:21627:21987:21990:30012:30054:30070:30090,0,RBL:115.124.30.42:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: sink12_7f521336eb12a X-Filterd-Recvd-Size: 7774 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by imf05.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:14 +0000 (UTC) X-Alimail-AntiSpam: 
AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04426;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrQzvOy_1583146856; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQzvOy_1583146856) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:57 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 08/20] mm/lru: add page isolation precondition in __isolate_lru_page Date: Mon, 2 Mar 2020 19:00:18 +0800 Message-Id: <1583146830-169516-9-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Johannes Weiner has suggested: " So here is a crazy idea that may be worth exploring: Right now, pgdat->lru_lock protects both PageLRU *and* the lruvec's linked list. Can we make PageLRU atomic and use it to stabilize the lru_lock instead, and then use the lru_lock only to serialize list operations? ... " Yes, this patch does so for __isolate_lru_page, the core page isolation function in the compaction and shrinking paths. It moves the PageLRU clearing to before compaction takes the lru_lock, making a set PageLRU flag a necessary condition for page isolation. Hence, PageLRU may already be cleared in the shrink_inactive_list path because a page is being isolated elsewhere; if so, we can skip that page in reclaim. It's a preparation for the later per-memcg lru_lock change. Suggested-by: Johannes Weiner Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org --- include/linux/swap.h | 2 +- mm/compaction.c | 25 +++++++++++++++++-------- mm/vmscan.c | 48 ++++++++++++++++++++++++++---------------------- 3 files changed, 44 insertions(+), 31 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index c555e8f161ad..69f0794f1da3 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -351,7 +351,7 @@ extern void lru_cache_add_active_or_unevictable(struct page *page, extern unsigned long zone_reclaimable_pages(struct zone *zone); extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order, gfp_t gfp_mask, nodemask_t *mask); -extern int __isolate_lru_page(struct page *page, isolate_mode_t mode); +extern int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode); extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg, unsigned long nr_pages, gfp_t gfp_mask, diff --git a/mm/compaction.c b/mm/compaction.c index 672d3c78c6ab..1baba328d089 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -948,6 +948,23 @@ static bool too_many_isolated(pg_data_t *pgdat) if (!(cc->gfp_mask & __GFP_FS) && page_mapping(page)) goto isolate_fail; + if (__isolate_lru_page_prepare(page, isolate_mode) != 0) + goto isolate_fail; + + /* + * Be careful not to clear PageLRU until after we're + * sure the page is not being freed elsewhere -- the + * page release code relies on it. 
+ */ + if (unlikely(!get_page_unless_zero(page))) + goto isolate_fail; + + /* Try isolate the page */ + if (!TestClearPageLRU(page)) { + put_page(page); + goto isolate_fail; + } + /* If we already hold the lock, we can skip some rechecking */ if (!locked) { locked = compact_lock_irqsave(&pgdat->lru_lock, @@ -960,10 +977,6 @@ static bool too_many_isolated(pg_data_t *pgdat) goto isolate_abort; } - /* Recheck PageLRU and PageCompound under lock */ - if (!PageLRU(page)) - goto isolate_fail; - /* * Page become compound since the non-locked check, * and it's on LRU. It can only be a THP so the order @@ -977,10 +990,6 @@ static bool too_many_isolated(pg_data_t *pgdat) lruvec = mem_cgroup_page_lruvec(page, pgdat); - /* Try isolate the page */ - if (__isolate_lru_page(page, isolate_mode) != 0) - goto isolate_fail; - VM_BUG_ON_PAGE(PageCompound(page), page); /* Successfully isolated */ diff --git a/mm/vmscan.c b/mm/vmscan.c index 8958454d50fe..bc2ec3fe4f48 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1522,20 +1522,20 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone, * * returns 0 on success, -ve errno on failure. */ -int __isolate_lru_page(struct page *page, isolate_mode_t mode) +int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode) { int ret = -EINVAL; - /* Only take pages on the LRU. */ - if (!PageLRU(page)) - return ret; - /* Compaction should not handle unevictable pages but CMA can do so */ if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE)) return ret; ret = -EBUSY; + /* Only take pages on the LRU. */ + if (!PageLRU(page)) + return ret; + /* * To minimise LRU disruption, the caller can indicate that it only * wants to isolate pages it will be able to operate on without @@ -1576,20 +1576,9 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode) if ((mode & ISOLATE_UNMAPPED) && page_mapped(page)) return ret; - if (likely(get_page_unless_zero(page))) { - /* - * Be careful not to clear PageLRU until after we're - * sure the page is not being freed elsewhere -- the - * page release code relies on it. - */ - ClearPageLRU(page); - ret = 0; - } - - return ret; + return 0; } - /* * Update LRU sizes after isolating pages. The LRU size updates must * be complete before mem_cgroup_update_lru_size due to a santity check. @@ -1653,8 +1642,6 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, page = lru_to_page(src); prefetchw_prev_lru_page(page, src, flags); - VM_BUG_ON_PAGE(!PageLRU(page), page); - nr_pages = compound_nr(page); total_scan += nr_pages; @@ -1675,17 +1662,34 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan, * only when the page is being freed somewhere else. */ scan += nr_pages; - switch (__isolate_lru_page(page, mode)) { + switch (__isolate_lru_page_prepare(page, mode)) { case 0: + /* + * Be careful not to clear PageLRU until after we're + * sure the page is not being freed elsewhere -- the + * page release code relies on it. + */ + if (unlikely(!get_page_unless_zero(page))) + goto busy; + + if (!TestClearPageLRU(page)) { + /* + * This page may in other isolation path, + * but we still hold lru_lock. 
+ */ + put_page(page); + goto busy; + } + nr_taken += nr_pages; nr_zone_taken[page_zonenum(page)] += nr_pages; list_move(&page->lru, dst); break; - +busy: case -EBUSY: /* else it is being freed elsewhere */ list_move(&page->lru, src); - continue; + break; default: BUG(); From patchwork Mon Mar 2 11:00:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415307 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69554138D for ; Mon, 2 Mar 2020 11:01:51 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 376F124676 for ; Mon, 2 Mar 2020 11:01:51 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 376F124676 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AB2346B0073; Mon, 2 Mar 2020 06:01:48 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id A39306B0074; Mon, 2 Mar 2020 06:01:48 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 92A9A6B0075; Mon, 2 Mar 2020 06:01:48 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0111.hostedemail.com [216.40.44.111]) by kanga.kvack.org (Postfix) with ESMTP id 7AA396B0073 for ; Mon, 2 Mar 2020 06:01:48 -0500 (EST) Received: from smtpin06.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 6079C18ED0 for ; Mon, 2 Mar 2020 11:01:48 +0000 (UTC) X-FDA: 76550131896.06.sign12_83e2261f65d4c X-Spam-Summary: 2,0,0,26e9751206ab0c26,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:968:973:988:989:1260:1261:1345:1359:1431:1437:1534:1543:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3354:3865:3866:3867:3868:3870:3871:3872:3874:4321:5007:6261:6737:8957:9592:10004:11026:11473:11638:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:13255:13846:14181:14394:14721:14915:21060:21080:21451:21627:21987:21990:30012:30054:30064:30079,0,RBL:115.124.30.133:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: sign12_83e2261f65d4c X-Filterd-Recvd-Size: 4877 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) by imf01.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:45 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R841e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04397;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=15;SR=0;TI=SMTPD_---0TrRgpkM_1583146857; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrRgpkM_1583146857) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:57 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, 
daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , "Kirill A. Shutemov" , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 09/20] mm/mlock: ClearPageLRU before get lru lock in munlock page isolation Date: Mon, 2 Mar 2020 19:00:19 +0800 Message-Id: <1583146830-169516-10-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This is another step in splitting the PageLRU clear from the old page isolation. This patch moves the lru_lock acquisition to after TestClearPageLRU, making a held PageLRU bit a precondition for page isolation, as a preparation for the later lru_lock replacement. For that we have to unfold __munlock_isolate_lru_page. __split_huge_page_refcount no longer exists, but we still have to guard PageMlocked in __split_huge_page_tail, which makes the code a bit ugly. Anyway, we still remove 2 gotos. [lkp@intel.com: found a sleeping function bug ... at mm/rmap.c:1861] Signed-off-by: Alex Shi Cc: Kirill A. Shutemov Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/mlock.c | 35 +++++++++++++++++++++++------------ 1 file changed, 23 insertions(+), 12 deletions(-) diff --git a/mm/mlock.c b/mm/mlock.c index 03b3a5d99ad7..7ddc52ca14b1 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -181,6 +181,7 @@ static void __munlock_isolation_failed(struct page *page) unsigned int munlock_vma_page(struct page *page) { int nr_pages; + bool clearlru = false; pg_data_t *pgdat = page_pgdat(page); /* For try_to_munlock() and to serialize with page migration */ @@ -189,32 +190,42 @@ unsigned int munlock_vma_page(struct page *page) VM_BUG_ON_PAGE(PageTail(page), page); /* - * Serialize with any parallel __split_huge_page_refcount() which + * Serialize with any parallel __split_huge_page_tail() which * might otherwise copy PageMlocked to part of the tail pages before * we clear it in the head page. It also stabilizes hpage_nr_pages(). */ + get_page(page); + clearlru = TestClearPageLRU(page); spin_lock_irq(&pgdat->lru_lock); if (!TestClearPageMlocked(page)) { - /* Potentially, PTE-mapped THP: do not skip the rest PTEs */ - nr_pages = 1; - goto unlock_out; + if (clearlru) + SetPageLRU(page); + /* + * Potentially, PTE-mapped THP: do not skip the rest PTEs + * Reuse lock as memory barrier for release_pages racing.
+ */ + spin_unlock_irq(&pgdat->lru_lock); + put_page(page); + return 0; } nr_pages = hpage_nr_pages(page); __mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages); - if (__munlock_isolate_lru_page(page, true)) { + if (clearlru) { + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + del_page_from_lru_list(page, lruvec, page_lru(page)); spin_unlock_irq(&pgdat->lru_lock); __munlock_isolated_page(page); - goto out; + } else { + spin_unlock_irq(&pgdat->lru_lock); + put_page(page); + __munlock_isolation_failed(page); } - __munlock_isolation_failed(page); - -unlock_out: - spin_unlock_irq(&pgdat->lru_lock); -out: return nr_pages - 1; } @@ -323,8 +334,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) pagevec_add(&pvec_putback, pvec->pages[i]); pvec->pages[i] = NULL; } - __mod_zone_page_state(zone, NR_MLOCK, delta_munlocked); spin_unlock_irq(&zone->zone_pgdat->lru_lock); + __mod_zone_page_state(zone, NR_MLOCK, delta_munlocked); /* Now we can release pins of pages that we are not munlocking */ pagevec_release(&pvec_putback); From patchwork Mon Mar 2 11:00:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415293 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B9DFE109A for ; Mon, 2 Mar 2020 11:01:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 843B12467B for ; Mon, 2 Mar 2020 11:01:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 843B12467B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 818E96B0070; Mon, 2 Mar 2020 06:01:31 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 7A35B6B0071; Mon, 2 Mar 2020 06:01:31 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 6436F6B0072; Mon, 2 Mar 2020 06:01:31 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0165.hostedemail.com [216.40.44.165]) by kanga.kvack.org (Postfix) with ESMTP id 447CB6B0070 for ; Mon, 2 Mar 2020 06:01:31 -0500 (EST) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 15666AF81 for ; Mon, 2 Mar 2020 11:01:31 +0000 (UTC) X-FDA: 76550131182.11.key73_818dee0c7c33e X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:30054:30070,0,RBL:47.88.44.36:@linux.alibaba.com:.lbl8.mailshell.net-64.10.201.10 62.18.0.100;47.88.44.36-irl.urbl.hostedemail.com-127.0.0.175,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: key73_818dee0c7c33e X-Filterd-Recvd-Size: 8823 Received: from out4436.biz.mail.alibaba.com (out4436.biz.mail.alibaba.com [47.88.44.36]) by imf50.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:29 +0000 (UTC) X-Alimail-AntiSpam: 
AC=PASS;BC=-1|-1;BR=01201311R651e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04428;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrQzvPA_1583146857; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQzvPA_1583146857) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:58 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 10/20] mm/lru: take PageLRU first in moving page between lru lists Date: Mon, 2 Mar 2020 19:00:20 +0800 Message-Id: <1583146830-169516-11-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Current move_fn do moving with PageLRU in lru_lock protection. Moving include a lru isolation and a lru adding. As to the isolation part, we need take PageLRU before move_fn, that add a extra PageLRU guard to block other isolations. and set lru bit back after page settled on lru list. This makes TestClearPageLRU as isolation's necessary condition in page moving between lru lists. Another page moving between lru lists is check_move_unevictable_pages func, we need to take PageLRu temporarilly same as move_fn. Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/swap.c | 42 ++++++++++++++++++++++++------------------ mm/vmscan.c | 3 ++- 2 files changed, 26 insertions(+), 19 deletions(-) diff --git a/mm/swap.c b/mm/swap.c index 8e71bdd04a1a..16af7c8369fe 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -187,7 +187,7 @@ int get_kernel_page(unsigned long start, int write, struct page **pages) static void pagevec_lru_move_fn(struct pagevec *pvec, void (*move_fn)(struct page *page, struct lruvec *lruvec, void *arg), - void *arg) + void *arg, bool isolation) { int i; struct pglist_data *pgdat = NULL; @@ -198,6 +198,10 @@ static void pagevec_lru_move_fn(struct pagevec *pvec, struct page *page = pvec->pages[i]; struct pglist_data *pagepgdat = page_pgdat(page); + if (isolation && !TestClearPageLRU(page)) + continue; + + /* every page should be isolated from lru */ if (pagepgdat != pgdat) { if (pgdat) spin_unlock_irqrestore(&pgdat->lru_lock, flags); @@ -207,6 +211,9 @@ static void pagevec_lru_move_fn(struct pagevec *pvec, lruvec = mem_cgroup_page_lruvec(page, pgdat); (*move_fn)(page, lruvec, arg); + + if (isolation) + SetPageLRU(page); } if (pgdat) spin_unlock_irqrestore(&pgdat->lru_lock, flags); @@ -219,7 +226,7 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec, { int *pgmoved = arg; - if (PageLRU(page) && !PageUnevictable(page)) { + if (!PageUnevictable(page)) { del_page_from_lru_list(page, lruvec, page_lru(page)); ClearPageActive(page); add_page_to_lru_list_tail(page, lruvec, page_lru(page)); @@ -235,7 +242,7 @@ static void pagevec_move_tail(struct pagevec *pvec) { int pgmoved = 0; - pagevec_lru_move_fn(pvec, pagevec_move_tail_fn, &pgmoved); + pagevec_lru_move_fn(pvec, 
pagevec_move_tail_fn, &pgmoved, true); __count_vm_events(PGROTATED, pgmoved); } @@ -272,7 +279,7 @@ void update_page_reclaim_stat(struct lruvec *lruvec, int file, int rotated) static void __activate_page(struct page *page, struct lruvec *lruvec, void *arg) { - if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) { + if (!PageActive(page) && !PageUnevictable(page)) { int file = page_is_file_cache(page); int lru = page_lru_base_type(page); @@ -293,7 +300,7 @@ static void activate_page_drain(int cpu) struct pagevec *pvec = &per_cpu(activate_page_pvecs, cpu); if (pagevec_count(pvec)) - pagevec_lru_move_fn(pvec, __activate_page, NULL); + pagevec_lru_move_fn(pvec, __activate_page, NULL, true); } static bool need_activate_page_drain(int cpu) @@ -309,7 +316,7 @@ void activate_page(struct page *page) get_page(page); if (!pagevec_add(pvec, page) || PageCompound(page)) - pagevec_lru_move_fn(pvec, __activate_page, NULL); + pagevec_lru_move_fn(pvec, __activate_page, NULL, true); put_cpu_var(activate_page_pvecs); } } @@ -501,9 +508,6 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec, int lru, file; bool active; - if (!PageLRU(page)) - return; - if (PageUnevictable(page)) return; @@ -544,7 +548,7 @@ static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec, static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec, void *arg) { - if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) { + if (PageActive(page) && !PageUnevictable(page)) { int file = page_is_file_cache(page); int lru = page_lru_base_type(page); @@ -561,7 +565,7 @@ static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec, static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec, void *arg) { - if (PageLRU(page) && PageAnon(page) && PageSwapBacked(page) && + if (PageAnon(page) && PageSwapBacked(page) && !PageSwapCache(page) && !PageUnevictable(page)) { bool active = PageActive(page); @@ -607,15 +611,15 @@ void lru_add_drain_cpu(int cpu) pvec = &per_cpu(lru_deactivate_file_pvecs, cpu); if (pagevec_count(pvec)) - pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL); + pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL, true); pvec = &per_cpu(lru_deactivate_pvecs, cpu); if (pagevec_count(pvec)) - pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL); + pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL, true); pvec = &per_cpu(lru_lazyfree_pvecs, cpu); if (pagevec_count(pvec)) - pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL); + pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL, true); activate_page_drain(cpu); } @@ -641,7 +645,8 @@ void deactivate_file_page(struct page *page) struct pagevec *pvec = &get_cpu_var(lru_deactivate_file_pvecs); if (!pagevec_add(pvec, page) || PageCompound(page)) - pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL); + pagevec_lru_move_fn(pvec, + lru_deactivate_file_fn, NULL, true); put_cpu_var(lru_deactivate_file_pvecs); } } @@ -661,7 +666,8 @@ void deactivate_page(struct page *page) get_page(page); if (!pagevec_add(pvec, page) || PageCompound(page)) - pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL); + pagevec_lru_move_fn(pvec, + lru_deactivate_fn, NULL, true); put_cpu_var(lru_deactivate_pvecs); } } @@ -681,7 +687,7 @@ void mark_page_lazyfree(struct page *page) get_page(page); if (!pagevec_add(pvec, page) || PageCompound(page)) - pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL); + pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL, true); put_cpu_var(lru_lazyfree_pvecs); } } @@ -941,7 +947,7 @@ static 
void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec, */ void __pagevec_lru_add(struct pagevec *pvec) { - pagevec_lru_move_fn(pvec, __pagevec_lru_add_fn, NULL); + pagevec_lru_move_fn(pvec, __pagevec_lru_add_fn, NULL, false); } EXPORT_SYMBOL(__pagevec_lru_add); diff --git a/mm/vmscan.c b/mm/vmscan.c index bc2ec3fe4f48..efaa4f41044e 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4343,7 +4343,7 @@ void check_move_unevictable_pages(struct pagevec *pvec) } lruvec = mem_cgroup_page_lruvec(page, pgdat); - if (!PageLRU(page) || !PageUnevictable(page)) + if (!TestClearPageLRU(page) || !PageUnevictable(page)) continue; if (page_evictable(page)) { @@ -4354,6 +4354,7 @@ void check_move_unevictable_pages(struct pagevec *pvec) del_page_from_lru_list(page, lruvec, LRU_UNEVICTABLE); add_page_to_lru_list(page, lruvec, lru); pgrescued++; + SetPageLRU(page); } } From patchwork Mon Mar 2 11:00:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415273 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1EEE1138D for ; Mon, 2 Mar 2020 11:01:15 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id EA77A24677 for ; Mon, 2 Mar 2020 11:01:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EA77A24677 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0D1446B0008; Mon, 2 Mar 2020 06:01:06 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 054366B000D; Mon, 2 Mar 2020 06:01:05 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DC0846B0008; Mon, 2 Mar 2020 06:01:05 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0069.hostedemail.com [216.40.44.69]) by kanga.kvack.org (Postfix) with ESMTP id B94126B000A for ; Mon, 2 Mar 2020 06:01:05 -0500 (EST) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id AD2DC180AD801 for ; Mon, 2 Mar 2020 11:01:05 +0000 (UTC) X-FDA: 76550130090.18.teeth39_7dd709303bc44 X-Spam-Summary: 2,0,0,43a846f62329df91,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:981:988:989:1260:1261:1345:1359:1431:1437:1534:1541:1711:1714:1730:1747:1777:1792:2393:2553:2559:2562:3138:3139:3140:3141:3142:3351:3865:3866:5007:6261:6737:7514:8957:10004:11026:11232:11473:11638:11658:11914:12043:12048:12296:12297:12438:12555:12895:13069:13311:13357:13846:14096:14181:14384:14394:14721:14915:21060:21080:21451:21627:30054:30090,0,RBL:115.124.30.42:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:3,LUA_SUMMARY:none X-HE-Tag: teeth39_7dd709303bc44 X-Filterd-Recvd-Size: 2502 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by imf24.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:04 
+0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04420;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0TrRgpkW_1583146858; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrRgpkW_1583146858) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:58 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 11/20] mm/memcg: move SetPageLRU out of lru_lock in commit_charge Date: Mon, 2 Mar 2020 19:00:21 +0800 Message-Id: <1583146830-169516-12-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Since lru lock doesn't defense PageLRU anymore, move the setting out of lock may save a bit lock contention time. Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Andrew Morton Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/memcontrol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f8419f3436a8..7d7b861a948c 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2614,9 +2614,9 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, lruvec = mem_cgroup_page_lruvec(page, pgdat); VM_BUG_ON_PAGE(PageLRU(page), page); - SetPageLRU(page); add_page_to_lru_list(page, lruvec, page_lru(page)); spin_unlock_irq(&pgdat->lru_lock); + SetPageLRU(page); } } From patchwork Mon Mar 2 11:00:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415283 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8DCA8138D for ; Mon, 2 Mar 2020 11:01:26 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 64F5F2467B for ; Mon, 2 Mar 2020 11:01:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 64F5F2467B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AB5CE6B0032; Mon, 2 Mar 2020 06:01:12 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 99D676B0036; Mon, 2 Mar 2020 06:01:12 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8B3ED6B0037; Mon, 2 Mar 2020 06:01:12 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0034.hostedemail.com [216.40.44.34]) by kanga.kvack.org (Postfix) with 
ESMTP id 7317D6B0032 for ; Mon, 2 Mar 2020 06:01:12 -0500 (EST) Received: from smtpin10.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 5C7388245571 for ; Mon, 2 Mar 2020 11:01:12 +0000 (UTC) X-FDA: 76550130384.10.slip87_7eade34bedb46 X-Spam-Summary: 40,2.5,0,5bb16ab3206f8998,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1542:1711:1730:1747:1777:1792:2198:2199:2393:2559:2562:2898:3138:3139:3140:3141:3142:3354:3865:3867:3868:3870:3871:3872:4250:4321:5007:6261:6737:8957:9592:10011:11026:11658:11914:12043:12048:12114:12296:12297:12438:12555:12683:12895:12986:13846:14110:14181:14394:14721:14915:21060:21080:21451:21627:21987:21990:30054:30070,0,RBL:115.124.30.44:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:2:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: slip87_7eade34bedb46 X-Filterd-Recvd-Size: 3995 Received: from out30-44.freemail.mail.aliyun.com (out30-44.freemail.mail.aliyun.com [115.124.30.44]) by imf27.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:09 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R851e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrR9JVc_1583146858; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrR9JVc_1583146858) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:59 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 12/20] mm/mlock: clean up __munlock_isolate_lru_page Date: Mon, 2 Mar 2020 19:00:22 +0800 Message-Id: <1583146830-169516-13-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: clean up __munlock_isolate_lru_page func for later lru lock change. No functional change. Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Hugh Dickins Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/mlock.c | 47 ++++++++++++++++++----------------------------- 1 file changed, 18 insertions(+), 29 deletions(-) diff --git a/mm/mlock.c b/mm/mlock.c index 7ddc52ca14b1..a43b3da78541 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -103,25 +103,6 @@ void mlock_vma_page(struct page *page) } /* - * Isolate a page from LRU with optional get_page() pin. - * Assumes lru_lock already held and page already pinned. - */ -static bool __munlock_isolate_lru_page(struct page *page, bool getpage) -{ - if (TestClearPageLRU(page)) { - struct lruvec *lruvec; - - lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); - if (getpage) - get_page(page); - del_page_from_lru_list(page, lruvec, page_lru(page)); - return true; - } - - return false; -} - -/* * Finish munlock after successful page isolation * * Page must be locked. 
This is a wrapper for try_to_munlock() @@ -311,26 +292,34 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) spin_lock_irq(&zone->zone_pgdat->lru_lock); for (i = 0; i < nr; i++) { struct page *page = pvec->pages[i]; + struct lruvec *lruvec; - if (TestClearPageMlocked(page)) { - /* - * We already have pin from follow_page_mask() - * so we can spare the get_page() here. - */ - if (__munlock_isolate_lru_page(page, false)) - continue; - else - __munlock_isolation_failed(page); - } else { + if (!TestClearPageMlocked(page)) { delta_munlocked++; + goto putback; + } + + if (!TestClearPageLRU(page)) { + __munlock_isolation_failed(page); + goto putback; } /* + * Isolate this page. + * We already have pin from follow_page_mask() + * so we can spare the get_page() here. + */ + lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + del_page_from_lru_list(page, lruvec, page_lru(page)); + continue; + + /* * We won't be munlocking this page in the next phase * but we still need to release the follow_page_mask() * pin. We cannot do it under lru_lock however. If it's * the last pin, __page_cache_release() would deadlock. */ +putback: pagevec_add(&pvec_putback, pvec->pages[i]); pvec->pages[i] = NULL; } From patchwork Mon Mar 2 11:00:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415323 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A6878138D for ; Mon, 2 Mar 2020 11:02:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 563652166E for ; Mon, 2 Mar 2020 11:02:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 563652166E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 61CF16B0006; Mon, 2 Mar 2020 06:02:13 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 5F4376B000E; Mon, 2 Mar 2020 06:02:13 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4E3436B0089; Mon, 2 Mar 2020 06:02:13 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0204.hostedemail.com [216.40.44.204]) by kanga.kvack.org (Postfix) with ESMTP id 310A36B0006 for ; Mon, 2 Mar 2020 06:02:13 -0500 (EST) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 02FE72040E for ; Mon, 2 Mar 2020 11:02:13 +0000 (UTC) X-FDA: 76550132946.09.waves04_877d5d297e139 X-Spam-Summary: 
2,0,0,1e4589e566837e62,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:69:327:355:379:541:800:960:966:967:973:988:989:1260:1261:1345:1359:1431:1437:1605:1730:1747:1777:1792:2194:2195:2196:2198:2199:2200:2201:2202:2393:2525:2553:2559:2563:2682:2685:2693:2731:2859:2890:2895:2898:2901:2924:2926:2933:2937:2939:2942:2945:2947:2951:2954:3022:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:3934:3936:3938:3941:3944:3947:3950:3953:3956:3959:4042:4250:4321:4385:4605:5007:6119:6261:6737:7514:7875:7903:8957:9010:9025:9207:9592:10004:11026:11473:11657:11658:11914:12043:12048:12291:12296:12297:12438:12555:12683:12895:12986:13846:13868:14096:14394:14915:21060:21080:21324:21450:21451:21611:21627:21740:21987:21990:30001:30045:30054:30056:30070:30090,0,RBL:115.124.30.131:@linux.alibaba.com:.lbl8.mailshell.net-64.201.201.201 62.20.2.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0 :0:0,LFt X-HE-Tag: waves04_877d5d297e139 X-Filterd-Recvd-Size: 28787 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) by imf44.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:02:10 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R221e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04391;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0TrR9JVi_1583146859; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrR9JVi_1583146859) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:00:59 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 13/20] mm/lru: replace pgdat lru_lock with lruvec lock Date: Mon, 2 Mar 2020 19:00:23 +0800 Message-Id: <1583146830-169516-14-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch moves the lru_lock into the lruvec, giving each lruvec its own lru_lock and therefore one lru_lock per memcg per node. So on a large-node machine, each memcg no longer has to wait on the single per-node pgdat->lru_lock; it can proceed under its own lru_lock. We introduce the function lock_page_lruvec, which locks the page's memcg's lruvec->lru_lock. (Thanks to Johannes Weiner, Hugh Dickins and Konstantin Khlebnikov for suggestions/reminders during patch writing.) This is the key patch that replaces the per-node lru_lock with the per-memcg lruvec lock, although a few places still fall back to more frequent lru locking; a later patch will use relock_page_lruvec_xxx to close that gap. Following Daniel Jordan's suggestion, I ran 208 'dd' tasks in 104 containers on a 2-socket * 26-core * HT box with a modified case: https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice With this and later patches, the readtwice performance increases by about 80% with concurrent containers.
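As a rough illustration of how the helpers introduced below are meant to be used (this sketch is not part of the patch: example_isolate_page and its dst list are made-up names, while get_page, TestClearPageLRU, lock_page_lruvec_irq and del_page_from_lru_list are the primitives this series actually relies on), a caller pins the page, claims it by clearing the lru bit, and only then takes that page's own lruvec lock for the list manipulation, instead of the node-wide pgdat->lru_lock:

#include <linux/mm.h>
#include <linux/mm_inline.h>
#include <linux/memcontrol.h>

/* Hypothetical caller, modelled on isolate_lru_page() after this patch. */
static int example_isolate_page(struct page *page, struct list_head *dst)
{
	/* Caller is assumed to already hold a reference on the page. */
	get_page(page);
	if (TestClearPageLRU(page)) {
		struct lruvec *lruvec;

		/* Locks only this page's (memcg, node) lruvec. */
		lruvec = lock_page_lruvec_irq(page);
		del_page_from_lru_list(page, lruvec, page_lru(page));
		unlock_page_lruvec_irq(lruvec);

		list_add(&page->lru, dst);
		return 0;
	}
	/* Lost the race: another path owns the lru state; drop our pin. */
	put_page(page);
	return -EBUSY;
}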
Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Yang Shi Cc: Matthew Wilcox Cc: Konstantin Khlebnikov Cc: Hugh Dickins Cc: Tejun Heo Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: cgroups@vger.kernel.org --- include/linux/memcontrol.h | 35 ++++++++++++++++++++++ include/linux/mmzone.h | 2 ++ mm/compaction.c | 51 ++++++++++++++++++++------------ mm/huge_memory.c | 9 ++---- mm/memcontrol.c | 50 +++++++++++++++++++++++++------- mm/mlock.c | 18 +++++------- mm/mmzone.c | 1 + mm/swap.c | 72 +++++++++++++++++----------------------------- mm/vmscan.c | 57 ++++++++++++++++-------------------- 9 files changed, 170 insertions(+), 125 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index a7a0a1a5c8d5..b8a04f0a2ab8 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -424,6 +424,13 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg, struct mem_cgroup *get_mem_cgroup_from_page(struct page *page); +struct lruvec *lock_page_lruvec_irq(struct page *page); +struct lruvec *lock_page_lruvec_irqsave(struct page *page, + unsigned long *flags); + +void unlock_page_lruvec_irq(struct lruvec *lruvec); +void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, unsigned long flags); + static inline struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){ return css ? container_of(css, struct mem_cgroup, css) : NULL; @@ -926,6 +933,34 @@ static inline void mem_cgroup_put(struct mem_cgroup *memcg) { } +static inline struct lruvec *lock_page_lruvec_irq(struct page *page) +{ + struct pglist_data *pgdat = page_pgdat(page); + + spin_lock_irq(&pgdat->__lruvec.lru_lock); + return &pgdat->__lruvec; +} + +static inline struct lruvec *lock_page_lruvec_irqsave(struct page *page, + unsigned long *flagsp) +{ + struct pglist_data *pgdat = page_pgdat(page); + + spin_lock_irqsave(&pgdat->__lruvec.lru_lock, *flagsp); + return &pgdat->__lruvec; +} + +static inline void unlock_page_lruvec_irq(struct lruvec *lruvec) +{ + spin_unlock_irq(&lruvec->lru_lock); +} + +static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, + unsigned long flags) +{ + spin_unlock_irqrestore(&lruvec->lru_lock, flags); +} + static inline struct mem_cgroup * mem_cgroup_iter(struct mem_cgroup *root, struct mem_cgroup *prev, diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 462f6873905a..a7bdff94990d 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -308,6 +308,8 @@ struct lruvec { atomic_long_t inactive_age; /* Refaults at the time of last reclaim cycle */ unsigned long refaults; + /* per lruvec lru_lock for memcg */ + spinlock_t lru_lock; /* Various lruvec state flags (enum lruvec_flags) */ unsigned long flags; #ifdef CONFIG_MEMCG diff --git a/mm/compaction.c b/mm/compaction.c index 1baba328d089..4a773f0ffedf 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -786,7 +786,7 @@ static bool too_many_isolated(pg_data_t *pgdat) unsigned long nr_scanned = 0, nr_isolated = 0; struct lruvec *lruvec; unsigned long flags = 0; - bool locked = false; + struct lruvec *locked_lruvec = NULL; struct page *page = NULL, *valid_page = NULL; unsigned long start_pfn = low_pfn; bool skip_on_failure = false; @@ -846,11 +846,20 @@ static bool too_many_isolated(pg_data_t *pgdat) * contention, to give chance to IRQs. Abort completely if * a fatal signal is pending. 
*/ - if (!(low_pfn % SWAP_CLUSTER_MAX) - && compact_unlock_should_abort(&pgdat->lru_lock, - flags, &locked, cc)) { - low_pfn = 0; - goto fatal_pending; + if (!(low_pfn % SWAP_CLUSTER_MAX)) { + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, flags); + locked_lruvec = NULL; + } + + if (fatal_signal_pending(current)) { + cc->contended = true; + + low_pfn = 0; + goto fatal_pending; + } + + cond_resched(); } if (!pfn_valid_within(low_pfn)) @@ -919,10 +928,9 @@ static bool too_many_isolated(pg_data_t *pgdat) */ if (unlikely(__PageMovable(page)) && !PageIsolated(page)) { - if (locked) { - spin_unlock_irqrestore(&pgdat->lru_lock, - flags); - locked = false; + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, flags); + locked_lruvec = NULL; } if (!isolate_movable_page(page, isolate_mode)) @@ -965,10 +973,16 @@ static bool too_many_isolated(pg_data_t *pgdat) goto isolate_fail; } + lruvec = mem_cgroup_page_lruvec(page, pgdat); + /* If we already hold the lock, we can skip some rechecking */ - if (!locked) { - locked = compact_lock_irqsave(&pgdat->lru_lock, - &flags, cc); + if (lruvec != locked_lruvec) { + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, flags); + locked_lruvec = NULL; + } + compact_lock_irqsave(&lruvec->lru_lock, &flags, cc); + locked_lruvec = lruvec; /* Try get exclusive access under lock */ if (!skip_updated) { @@ -988,7 +1002,6 @@ static bool too_many_isolated(pg_data_t *pgdat) } } - lruvec = mem_cgroup_page_lruvec(page, pgdat); VM_BUG_ON_PAGE(PageCompound(page), page); @@ -1025,9 +1038,9 @@ static bool too_many_isolated(pg_data_t *pgdat) * page anyway. */ if (nr_isolated) { - if (locked) { - spin_unlock_irqrestore(&pgdat->lru_lock, flags); - locked = false; + if (locked_lruvec) { + unlock_page_lruvec_irqrestore(locked_lruvec, flags); + locked_lruvec = NULL; } putback_movable_pages(&cc->migratepages); cc->nr_migratepages = 0; @@ -1052,8 +1065,8 @@ static bool too_many_isolated(pg_data_t *pgdat) low_pfn = end_pfn; isolate_abort: - if (locked) - spin_unlock_irqrestore(&pgdat->lru_lock, flags); + if (locked_lruvec) + unlock_page_lruvec_irqrestore(locked_lruvec, flags); /* * Updated the cached scanner pfn once the pageblock has been scanned diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 3835f87d03fd..7af56d0fb044 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2452,7 +2452,7 @@ void lru_add_page_tail(struct page *head, struct page *page_tail, VM_BUG_ON_PAGE(PageCompound(page_tail), head); VM_BUG_ON_PAGE(PageLRU(page_tail), head); - lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); + lockdep_assert_held(&lruvec->lru_lock); if (!list) SetPageLRU(page_tail); @@ -2545,15 +2545,12 @@ static void __split_huge_page(struct page *page, struct list_head *list, pgoff_t end) { struct page *head = compound_head(page); - pg_data_t *pgdat = page_pgdat(head); struct lruvec *lruvec; struct address_space *swap_cache = NULL; unsigned long offset = 0; unsigned long flags; int i; - lruvec = mem_cgroup_page_lruvec(head, pgdat); - /* complete memcg works before add pages to LRU */ mem_cgroup_split_huge_fixup(head); @@ -2566,7 +2563,7 @@ static void __split_huge_page(struct page *page, struct list_head *list, } /* Lru list would be changed, don't care head's LRU bit. 
*/ - spin_lock_irqsave(&pgdat->lru_lock, flags); + lruvec = lock_page_lruvec_irqsave(head, &flags); for (i = HPAGE_PMD_NR - 1; i >= 1; i--) { __split_huge_page_tail(head, i, lruvec, list); @@ -2585,7 +2582,7 @@ static void __split_huge_page(struct page *page, struct list_head *list, head + i, 0); } } - spin_unlock_irqrestore(&pgdat->lru_lock, flags); + unlock_page_lruvec_irqrestore(lruvec, flags); ClearPageCompound(head); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7d7b861a948c..b0c156da60fe 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1219,7 +1219,7 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd goto out; } - memcg = page->mem_cgroup; + memcg = READ_ONCE(page->mem_cgroup); /* * Swapcache readahead pages are added to the LRU - and * possibly migrated - before they are charged. @@ -1240,6 +1240,37 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd return lruvec; } +/* page must be isolated */ +struct lruvec *lock_page_lruvec_irq(struct page *page) +{ + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + spin_lock_irq(&lruvec->lru_lock); + + return lruvec; +} + +struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags) +{ + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + spin_lock_irqsave(&lruvec->lru_lock, *flags); + + return lruvec; +} + +void unlock_page_lruvec_irq(struct lruvec *lruvec) +{ + spin_unlock_irq(&lruvec->lru_lock); +} + +void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, unsigned long flags) +{ + spin_unlock_irqrestore(&lruvec->lru_lock, flags); +} + /** * mem_cgroup_update_lru_size - account for adding or removing an lru page * @lruvec: mem_cgroup per zone lru vector @@ -2576,7 +2607,6 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, bool lrucare) { struct lruvec *lruvec = NULL; - pg_data_t *pgdat; VM_BUG_ON_PAGE(page->mem_cgroup, page); @@ -2585,14 +2615,12 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, * may already be on some other mem_cgroup's LRU. Take care of it. */ if (lrucare) { - pgdat = page_pgdat(page); - spin_lock_irq(&pgdat->lru_lock); - if (TestClearPageLRU(page)) { - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = lock_page_lruvec_irq(page); + del_page_from_lru_list(page, lruvec, page_lru(page)); - } else - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); + } } /* * Nobody should be changing or seriously looking at @@ -2611,11 +2639,11 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg, page->mem_cgroup = memcg; if (lrucare && lruvec) { - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = lock_page_lruvec_irq(page); VM_BUG_ON_PAGE(PageLRU(page), page); add_page_to_lru_list(page, lruvec, page_lru(page)); - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); SetPageLRU(page); } } @@ -2913,7 +2941,7 @@ void __memcg_kmem_uncharge(struct page *page, int order) /* * Because tail pages are not marked as "used", set it. We're under - * pgdat->lru_lock and migration entries setup in all page mappings. + * lruvec->lru_lock and migration entries setup in all page mappings. 
*/ void mem_cgroup_split_huge_fixup(struct page *head) { diff --git a/mm/mlock.c b/mm/mlock.c index a43b3da78541..d285348b147e 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -163,7 +163,7 @@ unsigned int munlock_vma_page(struct page *page) { int nr_pages; bool clearlru = false; - pg_data_t *pgdat = page_pgdat(page); + struct lruvec *lruvec; /* For try_to_munlock() and to serialize with page migration */ BUG_ON(!PageLocked(page)); @@ -177,7 +177,7 @@ unsigned int munlock_vma_page(struct page *page) */ get_page(page); clearlru = TestClearPageLRU(page); - spin_lock_irq(&pgdat->lru_lock); + lruvec = lock_page_lruvec_irq(page); if (!TestClearPageMlocked(page)) { if (clearlru) @@ -186,7 +186,7 @@ unsigned int munlock_vma_page(struct page *page) * Potentially, PTE-mapped THP: do not skip the rest PTEs * Reuse lock as memory barrier for release_pages racing. */ - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); put_page(page); return 0; } @@ -195,14 +195,11 @@ unsigned int munlock_vma_page(struct page *page) __mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages); if (clearlru) { - struct lruvec *lruvec; - - lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); del_page_from_lru_list(page, lruvec, page_lru(page)); - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); __munlock_isolated_page(page); } else { - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); put_page(page); __munlock_isolation_failed(page); } @@ -289,7 +286,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) pagevec_init(&pvec_putback); /* Phase 1: page isolation */ - spin_lock_irq(&zone->zone_pgdat->lru_lock); for (i = 0; i < nr; i++) { struct page *page = pvec->pages[i]; struct lruvec *lruvec; @@ -309,8 +305,9 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) * We already have pin from follow_page_mask() * so we can spare the get_page() here. 
*/ - lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + lruvec = lock_page_lruvec_irq(page); del_page_from_lru_list(page, lruvec, page_lru(page)); + unlock_page_lruvec_irq(lruvec); continue; /* @@ -323,7 +320,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) pagevec_add(&pvec_putback, pvec->pages[i]); pvec->pages[i] = NULL; } - spin_unlock_irq(&zone->zone_pgdat->lru_lock); __mod_zone_page_state(zone, NR_MLOCK, delta_munlocked); /* Now we can release pins of pages that we are not munlocking */ diff --git a/mm/mmzone.c b/mm/mmzone.c index 4686fdc23bb9..3750a90ed4a0 100644 --- a/mm/mmzone.c +++ b/mm/mmzone.c @@ -91,6 +91,7 @@ void lruvec_init(struct lruvec *lruvec) enum lru_list lru; memset(lruvec, 0, sizeof(struct lruvec)); + spin_lock_init(&lruvec->lru_lock); for_each_lru(lru) INIT_LIST_HEAD(&lruvec->lists[lru]); diff --git a/mm/swap.c b/mm/swap.c index 16af7c8369fe..50c856246f84 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -60,14 +60,12 @@ static void __page_cache_release(struct page *page) { if (TestClearPageLRU(page)) { - pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; unsigned long flags; - spin_lock_irqsave(&pgdat->lru_lock, flags); - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = lock_page_lruvec_irqsave(page, &flags); del_page_from_lru_list(page, lruvec, page_off_lru(page)); - spin_unlock_irqrestore(&pgdat->lru_lock, flags); + unlock_page_lruvec_irqrestore(lruvec, flags); } __ClearPageWaiters(page); } @@ -190,33 +188,22 @@ static void pagevec_lru_move_fn(struct pagevec *pvec, void *arg, bool isolation) { int i; - struct pglist_data *pgdat = NULL; - struct lruvec *lruvec; + struct lruvec *lruvec = NULL; unsigned long flags = 0; for (i = 0; i < pagevec_count(pvec); i++) { struct page *page = pvec->pages[i]; - struct pglist_data *pagepgdat = page_pgdat(page); if (isolation && !TestClearPageLRU(page)) continue; - /* every page should be isolated from lru */ - if (pagepgdat != pgdat) { - if (pgdat) - spin_unlock_irqrestore(&pgdat->lru_lock, flags); - pgdat = pagepgdat; - spin_lock_irqsave(&pgdat->lru_lock, flags); - } - - lruvec = mem_cgroup_page_lruvec(page, pgdat); + lruvec = lock_page_lruvec_irqsave(page, &flags); (*move_fn)(page, lruvec, arg); + unlock_page_lruvec_irqrestore(lruvec, flags); if (isolation) SetPageLRU(page); } - if (pgdat) - spin_unlock_irqrestore(&pgdat->lru_lock, flags); release_pages(pvec->pages, pvec->nr); pagevec_reinit(pvec); } @@ -328,12 +315,12 @@ static inline void activate_page_drain(int cpu) void activate_page(struct page *page) { - pg_data_t *pgdat = page_pgdat(page); + struct lruvec *lruvec; page = compound_head(page); - spin_lock_irq(&pgdat->lru_lock); - __activate_page(page, mem_cgroup_page_lruvec(page, pgdat), NULL); - spin_unlock_irq(&pgdat->lru_lock); + lruvec = lock_page_lruvec_irq(page); + __activate_page(page, lruvec, NULL); + unlock_page_lruvec_irq(lruvec); } #endif @@ -783,8 +770,7 @@ void release_pages(struct page **pages, int nr) { int i; LIST_HEAD(pages_to_free); - struct pglist_data *locked_pgdat = NULL; - struct lruvec *lruvec; + struct lruvec *lruvec = NULL; unsigned long uninitialized_var(flags); unsigned int uninitialized_var(lock_batch); @@ -794,21 +780,20 @@ void release_pages(struct page **pages, int nr) /* * Make sure the IRQ-safe lock-holding time does not get * excessive with a continuous string of pages from the - * same pgdat. The lock is held only if pgdat != NULL. + * same lruvec. The lock is held only if lruvec != NULL. 
*/ - if (locked_pgdat && ++lock_batch == SWAP_CLUSTER_MAX) { - spin_unlock_irqrestore(&locked_pgdat->lru_lock, flags); - locked_pgdat = NULL; + if (lruvec && ++lock_batch == SWAP_CLUSTER_MAX) { + unlock_page_lruvec_irqrestore(lruvec, flags); + lruvec = NULL; } if (is_huge_zero_page(page)) continue; if (is_zone_device_page(page)) { - if (locked_pgdat) { - spin_unlock_irqrestore(&locked_pgdat->lru_lock, - flags); - locked_pgdat = NULL; + if (lruvec) { + unlock_page_lruvec_irqrestore(lruvec, flags); + lruvec = NULL; } /* * ZONE_DEVICE pages that return 'false' from @@ -827,27 +812,24 @@ void release_pages(struct page **pages, int nr) continue; if (PageCompound(page)) { - if (locked_pgdat) { - spin_unlock_irqrestore(&locked_pgdat->lru_lock, flags); - locked_pgdat = NULL; + if (lruvec) { + unlock_page_lruvec_irqrestore(lruvec, flags); + lruvec = NULL; } __put_compound_page(page); continue; } if (TestClearPageLRU(page)) { - struct pglist_data *pgdat = page_pgdat(page); + struct lruvec *new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); - if (pgdat != locked_pgdat) { - if (locked_pgdat) - spin_unlock_irqrestore(&locked_pgdat->lru_lock, - flags); + if (new_lruvec != lruvec) { + if (lruvec) + unlock_page_lruvec_irqrestore(lruvec, flags); lock_batch = 0; - locked_pgdat = pgdat; - spin_lock_irqsave(&locked_pgdat->lru_lock, flags); + lruvec = lock_page_lruvec_irqsave(page, &flags); } - lruvec = mem_cgroup_page_lruvec(page, locked_pgdat); del_page_from_lru_list(page, lruvec, page_off_lru(page)); } @@ -857,8 +839,8 @@ void release_pages(struct page **pages, int nr) list_add(&page->lru, &pages_to_free); } - if (locked_pgdat) - spin_unlock_irqrestore(&locked_pgdat->lru_lock, flags); + if (lruvec) + unlock_page_lruvec_irqrestore(lruvec, flags); mem_cgroup_uncharge_list(&pages_to_free); free_unref_page_list(&pages_to_free); diff --git a/mm/vmscan.c b/mm/vmscan.c index efaa4f41044e..a58cd5ee9ea1 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1757,14 +1757,12 @@ int isolate_lru_page(struct page *page) get_page(page); if (TestClearPageLRU(page)) { - pg_data_t *pgdat = page_pgdat(page); struct lruvec *lruvec; int lru = page_lru(page); - lruvec = mem_cgroup_page_lruvec(page, pgdat); - spin_lock_irq(&pgdat->lru_lock); + lruvec = lock_page_lruvec_irq(page); del_page_from_lru_list(page, lruvec, lru); - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); ret = 0; } else put_page(page); @@ -1832,7 +1830,6 @@ static int too_many_isolated(struct pglist_data *pgdat, int file, static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, struct list_head *list) { - struct pglist_data *pgdat = lruvec_pgdat(lruvec); int nr_pages, nr_moved = 0; LIST_HEAD(pages_to_free); struct page *page; @@ -1843,9 +1840,9 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, VM_BUG_ON_PAGE(PageLRU(page), page); list_del(&page->lru); if (unlikely(!page_evictable(page))) { - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); putback_lru_page(page); - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); continue; } @@ -1866,18 +1863,16 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec, __ClearPageActive(page); if (unlikely(PageCompound(page))) { - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); (*get_compound_page_dtor(page))(page); - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); } else list_add(&page->lru, &pages_to_free); continue; } - lruvec = 
mem_cgroup_page_lruvec(page, pgdat); lru = page_lru(page); nr_pages = hpage_nr_pages(page); - update_lru_size(lruvec, lru, page_zonenum(page), nr_pages); list_add(&page->lru, &lruvec->lists[lru]); nr_moved += nr_pages; @@ -1938,7 +1933,7 @@ static int current_may_throttle(void) lru_add_drain(); - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &page_list, &nr_scanned, sc, lru); @@ -1950,7 +1945,7 @@ static int current_may_throttle(void) if (!cgroup_reclaim(sc)) __count_vm_events(item, nr_scanned); __count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned); - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); if (nr_taken == 0) return 0; @@ -1958,7 +1953,7 @@ static int current_may_throttle(void) nr_reclaimed = shrink_page_list(&page_list, pgdat, sc, 0, &stat, false); - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT; if (!cgroup_reclaim(sc)) @@ -1971,7 +1966,7 @@ static int current_may_throttle(void) __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken); - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); mem_cgroup_uncharge_list(&page_list); free_unref_page_list(&page_list); @@ -2024,7 +2019,7 @@ static void shrink_active_list(unsigned long nr_to_scan, lru_add_drain(); - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &l_hold, &nr_scanned, sc, lru); @@ -2035,7 +2030,7 @@ static void shrink_active_list(unsigned long nr_to_scan, __count_vm_events(PGREFILL, nr_scanned); __count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned); - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); while (!list_empty(&l_hold)) { cond_resched(); @@ -2081,7 +2076,7 @@ static void shrink_active_list(unsigned long nr_to_scan, /* * Move pages back to the lru list. */ - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); /* * Count referenced pages from currently used mappings as rotated, * even though only some of them are actually re-activated. 
This @@ -2099,7 +2094,7 @@ static void shrink_active_list(unsigned long nr_to_scan, __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate); __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken); - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); mem_cgroup_uncharge_list(&l_active); free_unref_page_list(&l_active); @@ -2248,7 +2243,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat; u64 fraction[2]; u64 denominator = 0; /* gcc */ - struct pglist_data *pgdat = lruvec_pgdat(lruvec); unsigned long anon_prio, file_prio; enum scan_balance scan_balance; unsigned long anon, file; @@ -2326,7 +2320,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, file = lruvec_lru_size(lruvec, LRU_ACTIVE_FILE, MAX_NR_ZONES) + lruvec_lru_size(lruvec, LRU_INACTIVE_FILE, MAX_NR_ZONES); - spin_lock_irq(&pgdat->lru_lock); + spin_lock_irq(&lruvec->lru_lock); if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) { reclaim_stat->recent_scanned[0] /= 2; reclaim_stat->recent_rotated[0] /= 2; @@ -2347,7 +2341,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, fp = file_prio * (reclaim_stat->recent_scanned[1] + 1); fp /= reclaim_stat->recent_rotated[1] + 1; - spin_unlock_irq(&pgdat->lru_lock); + spin_unlock_irq(&lruvec->lru_lock); fraction[0] = ap; fraction[1] = fp; @@ -4324,24 +4318,21 @@ int page_evictable(struct page *page) */ void check_move_unevictable_pages(struct pagevec *pvec) { - struct lruvec *lruvec; - struct pglist_data *pgdat = NULL; + struct lruvec *lruvec = NULL; int pgscanned = 0; int pgrescued = 0; int i; for (i = 0; i < pvec->nr; i++) { struct page *page = pvec->pages[i]; - struct pglist_data *pagepgdat = page_pgdat(page); + struct lruvec *new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); pgscanned++; - if (pagepgdat != pgdat) { - if (pgdat) - spin_unlock_irq(&pgdat->lru_lock); - pgdat = pagepgdat; - spin_lock_irq(&pgdat->lru_lock); + if (lruvec != new_lruvec) { + if (lruvec) + unlock_page_lruvec_irq(lruvec); + lruvec = lock_page_lruvec_irq(page); } - lruvec = mem_cgroup_page_lruvec(page, pgdat); if (!TestClearPageLRU(page) || !PageUnevictable(page)) continue; @@ -4358,10 +4349,10 @@ void check_move_unevictable_pages(struct pagevec *pvec) } } - if (pgdat) { + if (lruvec) { __count_vm_events(UNEVICTABLE_PGRESCUED, pgrescued); __count_vm_events(UNEVICTABLE_PGSCANNED, pgscanned); - spin_unlock_irq(&pgdat->lru_lock); + unlock_page_lruvec_irq(lruvec); } } EXPORT_SYMBOL_GPL(check_move_unevictable_pages); From patchwork Mon Mar 2 11:00:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415275 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC30F138D for ; Mon, 2 Mar 2020 11:01:16 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A2D4A2166E for ; Mon, 2 Mar 2020 11:01:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A2D4A2166E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 9F1586B000D; Mon, 2 Mar 2020 06:01:07 -0500 (EST) 
From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Thomas Gleixner , Andrey Ryabinin , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 14/20] mm/lru: introduce the relock_page_lruvec function Date: Mon, 2 Mar 2020 19:00:24 +0800 Message-Id: <1583146830-169516-15-git-send-email-alex.shi@linux.alibaba.com> In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> During lruvec locking, a new page's lruvec may be the same as the previous one, so we can skip the unlock/lock pair and only switch the lock iff the lruvec is new. The function is named relock_page_lruvec following Hugh Dickins' patch.
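For illustration only (not part of this patch), the caller-side pattern these helpers enable looks roughly like the sketch below; the helper and unlock names follow this series, while the walker function itself is hypothetical:

/* Hypothetical pagevec walker: keep lruvec->lru_lock held across
 * consecutive pages that share a lruvec, switching locks only when the
 * page's lruvec actually changes. */
static void example_walk_pagevec(struct pagevec *pvec)
{
	struct lruvec *lruvec = NULL;
	int i;

	for (i = 0; i < pvec->nr; i++) {
		struct page *page = pvec->pages[i];

		/* no-op when this page's lruvec is the one already locked */
		lruvec = relock_page_lruvec_irq(page, lruvec);

		/* ... operate on page under lruvec->lru_lock ... */
	}
	if (lruvec)
		unlock_page_lruvec_irq(lruvec);
}

The check_move_unevictable_pages() hunk below follows this shape.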
Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Andrew Morton Cc: Thomas Gleixner Cc: Andrey Ryabinin Cc: Matthew Wilcox Cc: Mel Gorman Cc: Konstantin Khlebnikov Cc: Hugh Dickins Cc: Tejun Heo Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org --- include/linux/memcontrol.h | 36 ++++++++++++++++++++++++++++++++++++ mm/vmscan.c | 8 ++------ 2 files changed, 38 insertions(+), 6 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index b8a04f0a2ab8..f60009580d2a 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1307,6 +1307,42 @@ static inline void dec_lruvec_page_state(struct page *page, mod_lruvec_page_state(page, idx, -1); } +/* Don't lock again iff page's lruvec locked */ +static inline struct lruvec *relock_page_lruvec_irq(struct page *page, + struct lruvec *locked_lruvec) +{ + struct pglist_data *pgdat = page_pgdat(page); + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page, pgdat); + + if (likely(locked_lruvec == lruvec)) + return lruvec; + + if (unlikely(locked_lruvec)) + unlock_page_lruvec_irq(locked_lruvec); + + return lock_page_lruvec_irq(page); +} + +/* Don't lock again iff page's lruvec locked */ +static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page, + struct lruvec *locked_lruvec, unsigned long *flags) +{ + struct pglist_data *pgdat = page_pgdat(page); + struct lruvec *lruvec; + + lruvec = mem_cgroup_page_lruvec(page, pgdat); + + if (likely(locked_lruvec == lruvec)) + return lruvec; + + if (unlikely(locked_lruvec)) + unlock_page_lruvec_irqrestore(locked_lruvec, *flags); + + return lock_page_lruvec_irqsave(page, flags); +} + #ifdef CONFIG_CGROUP_WRITEBACK struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb); diff --git a/mm/vmscan.c b/mm/vmscan.c index a58cd5ee9ea1..de925bd629eb 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4325,14 +4325,10 @@ void check_move_unevictable_pages(struct pagevec *pvec) for (i = 0; i < pvec->nr; i++) { struct page *page = pvec->pages[i]; - struct lruvec *new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); pgscanned++; - if (lruvec != new_lruvec) { - if (lruvec) - unlock_page_lruvec_irq(lruvec); - lruvec = lock_page_lruvec_irq(page); - } + + lruvec = relock_page_lruvec_irq(page, lruvec); if (!TestClearPageLRU(page) || !PageUnevictable(page)) continue; From patchwork Mon Mar 2 11:00:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415279 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6EFF5109A for ; Mon, 2 Mar 2020 11:01:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 462422467B for ; Mon, 2 Mar 2020 11:01:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 462422467B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id CD62C6B000C; Mon, 2 Mar 2020 06:01:10 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C84886B0032; Mon, 2 Mar 2020 06:01:10 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org 
(Postfix, from userid 63042) id B81886B000C; Mon, 2 Mar 2020 06:01:10 -0500 (EST) From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 15/20] mm/mlock: optimize munlock_pagevec by relocking Date: Mon, 2 Mar 2020 19:00:25 +0800 Message-Id: <1583146830-169516-16-git-send-email-alex.shi@linux.alibaba.com> In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> During pagevec locking, a new page's lruvec may be the same as the previous one, so we can skip the re-locking and only switch the lock iff the lruvec is different.
Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Hugh Dickins Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org --- mm/mlock.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/mm/mlock.c b/mm/mlock.c index d285348b147e..236a29b791f4 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -281,6 +281,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) int nr = pagevec_count(pvec); int delta_munlocked = -nr; struct pagevec pvec_putback; + struct lruvec *lruvec = NULL; int pgrescued = 0; pagevec_init(&pvec_putback); @@ -288,7 +289,6 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) /* Phase 1: page isolation */ for (i = 0; i < nr; i++) { struct page *page = pvec->pages[i]; - struct lruvec *lruvec; if (!TestClearPageMlocked(page)) { delta_munlocked++; @@ -305,9 +305,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) * We already have pin from follow_page_mask() * so we can spare the get_page() here. */ - lruvec = lock_page_lruvec_irq(page); + lruvec = relock_page_lruvec_irq(page, lruvec); del_page_from_lru_list(page, lruvec, page_lru(page)); - unlock_page_lruvec_irq(lruvec); continue; /* @@ -320,6 +319,8 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone) pagevec_add(&pvec_putback, pvec->pages[i]); pvec->pages[i] = NULL; } + if (lruvec) + unlock_page_lruvec_irq(lruvec); __mod_zone_page_state(zone, NR_MLOCK, delta_munlocked); /* Now we can release pins of pages that we are not munlocking */ From patchwork Mon Mar 2 11:00:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415281 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 19AC8138D for ; Mon, 2 Mar 2020 11:01:24 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id E4D9E2467B for ; Mon, 2 Mar 2020 11:01:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E4D9E2467B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3AC616B0010; Mon, 2 Mar 2020 06:01:12 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 3870C6B0032; Mon, 2 Mar 2020 06:01:12 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 274086B0036; Mon, 2 Mar 2020 06:01:12 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0065.hostedemail.com [216.40.44.65]) by kanga.kvack.org (Postfix) with ESMTP id 017EC6B0010 for ; Mon, 2 Mar 2020 06:01:11 -0500 (EST) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id D37CF181AC9B6 for ; Mon, 2 Mar 2020 11:01:11 +0000 (UTC) X-FDA: 76550130342.14.turn59_7e9a09f28a50c X-Spam-Summary: 
From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 16/20] mm/swap: only change the lru_lock iff page's lruvec is different Date: Mon, 2 Mar 2020 19:00:26 +0800 Message-Id: <1583146830-169516-17-git-send-email-alex.shi@linux.alibaba.com> In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> Since relock_page_lruvec was introduced, we can use it in more places to avoid repeated lru_lock release/acquire cycles.
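As a sketch only (assumed usage, not taken verbatim from the hunks below), the irqsave flavour keeps the same batched shape but carries the saved interrupt flags across the loop and restores them once, when the last held lruvec is unlocked:

/* Hypothetical batched walk with the irqsave variant. */
static void example_walk_pagevec_irqsave(struct pagevec *pvec)
{
	struct lruvec *lruvec = NULL;
	unsigned long flags;
	int i;

	for (i = 0; i < pvec->nr; i++) {
		struct page *page = pvec->pages[i];

		lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
		/* ... move or free the page under lruvec->lru_lock ... */
	}
	if (lruvec)
		unlock_page_lruvec_irqrestore(lruvec, flags);
}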
Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Johannes Weiner Cc: Konstantin Khlebnikov Cc: Hugh Dickins Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org --- mm/swap.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/mm/swap.c b/mm/swap.c index 50c856246f84..74e03589adde 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -197,13 +197,15 @@ static void pagevec_lru_move_fn(struct pagevec *pvec, if (isolation && !TestClearPageLRU(page)) continue; - lruvec = lock_page_lruvec_irqsave(page, &flags); + lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags); (*move_fn)(page, lruvec, arg); - unlock_page_lruvec_irqrestore(lruvec, flags); if (isolation) SetPageLRU(page); } + if (lruvec) + unlock_page_lruvec_irqrestore(lruvec, flags); + release_pages(pvec->pages, pvec->nr); pagevec_reinit(pvec); } @@ -821,14 +823,11 @@ void release_pages(struct page **pages, int nr) } if (TestClearPageLRU(page)) { - struct lruvec *new_lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); + struct lruvec *pre_lruvec = lruvec; - if (new_lruvec != lruvec) { - if (lruvec) - unlock_page_lruvec_irqrestore(lruvec, flags); + lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags); + if (pre_lruvec != lruvec) lock_batch = 0; - lruvec = lock_page_lruvec_irqsave(page, &flags); - } del_page_from_lru_list(page, lruvec, page_off_lru(page)); } From patchwork Mon Mar 2 11:00:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415317 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DAA79109A for ; Mon, 2 Mar 2020 11:02:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id B223824677 for ; Mon, 2 Mar 2020 11:02:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B223824677 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E59F16B0005; Mon, 2 Mar 2020 06:02:06 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E30256B0087; Mon, 2 Mar 2020 06:02:06 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D47726B0088; Mon, 2 Mar 2020 06:02:06 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0232.hostedemail.com [216.40.44.232]) by kanga.kvack.org (Postfix) with ESMTP id B86F06B0005 for ; Mon, 2 Mar 2020 06:02:06 -0500 (EST) Received: from smtpin20.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 960EE2289C for ; Mon, 2 Mar 2020 11:02:06 +0000 (UTC) X-FDA: 76550132652.20.ray75_86b7463047f53 X-Spam-Summary: 
From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 17/20] mm/pgdat: remove pgdat lru_lock Date: Mon, 2 Mar 2020 19:00:27 +0800 Message-Id: <1583146830-169516-18-git-send-email-alex.shi@linux.alibaba.com> In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> Now that pgdat.lru_lock has been replaced by the lruvec lock, it is no longer used; remove it.
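For context, a minimal sketch of where the per-lruvec replacement lock presumably gets initialized once pgdat->lru_lock is gone; that hunk belongs to an earlier patch in the series and is not shown in this mail, so treat it as an assumption:

/* Assumed (earlier in this series, not shown here): each lruvec now
 * carries and initializes its own lru_lock. */
void lruvec_init(struct lruvec *lruvec)
{
	enum lru_list lru;

	memset(lruvec, 0, sizeof(struct lruvec));
	spin_lock_init(&lruvec->lru_lock);

	for_each_lru(lru)
		INIT_LIST_HEAD(&lruvec->lists[lru]);
}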
Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Konstantin Khlebnikov Cc: Hugh Dickins Cc: Johannes Weiner Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Cc: cgroups@vger.kernel.org --- include/linux/mmzone.h | 1 - mm/page_alloc.c | 1 - 2 files changed, 2 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index a7bdff94990d..f1ff6713ac06 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -768,7 +768,6 @@ struct deferred_split { /* Write-intensive fields used by page reclaim */ ZONE_PADDING(_pad1_) - spinlock_t lru_lock; #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 3c4eb750a199..8c7df304bcd1 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6709,7 +6709,6 @@ static void __meminit pgdat_init_internals(struct pglist_data *pgdat) init_waitqueue_head(&pgdat->pfmemalloc_wait); pgdat_page_ext_init(pgdat); - spin_lock_init(&pgdat->lru_lock); lruvec_init(&pgdat->__lruvec); } From patchwork Mon Mar 2 11:00:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415291 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C8A3D109A for ; Mon, 2 Mar 2020 11:01:37 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 852922468D for ; Mon, 2 Mar 2020 11:01:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 852922468D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 4E69B6B006E; Mon, 2 Mar 2020 06:01:21 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 493C76B0070; Mon, 2 Mar 2020 06:01:21 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3D3606B0071; Mon, 2 Mar 2020 06:01:21 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0127.hostedemail.com [216.40.44.127]) by kanga.kvack.org (Postfix) with ESMTP id 224B46B006E for ; Mon, 2 Mar 2020 06:01:21 -0500 (EST) Received: from smtpin30.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 0C574180AD801 for ; Mon, 2 Mar 2020 11:01:21 +0000 (UTC) X-FDA: 76550130762.30.smash91_7fd231f136a2f X-Spam-Summary: 2,0,0,bdb061c084a7827c,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:4:41:69:355:379:541:800:960:966:968:973:988:989:1260:1261:1345:1359:1431:1437:1605:1730:1747:1777:1792:1801:1981:2194:2196:2198:2199:2200:2201:2393:2553:2559:2562:2639:2693:2731:2736:2737:2903:2916:3138:3139:3140:3141:3142:3865:3866:3867:3868:3870:3871:3872:3874:4250:4321:4385:4605:5007:6119:6261:6630:6737:7576:7875:7903:7974:8660:9010:9592:10004:11026:11232:11473:11658:11914:12043:12048:12291:12295:12296:12297:12438:12555:12679:12683:12895:12986:13148:13149:13156:13228:13230:13846:13869:13972:14096:14394:14915:21060:21067:21080:21324:21433:21451:21627:21740:21939:30005:30012:30034:30045:30051:30054:30070:30079:30085:30090,0,RBL:115.124.30.56:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 
From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Andrey Ryabinin , Jann Horn , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v9 18/20] mm/lru: revise the comments of lru_lock Date: Mon, 2 Mar 2020 19:00:28 +0800 Message-Id: <1583146830-169516-19-git-send-email-alex.shi@linux.alibaba.com> In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> From: Hugh Dickins Since we changed pgdat->lru_lock to lruvec->lru_lock, it's time to fix the now-incorrect comments in the code. Also fix some ancient zone->lru_lock comment errors. Signed-off-by: Hugh Dickins Signed-off-by: Alex Shi Cc: Andrew Morton Cc: Tejun Heo Cc: Andrey Ryabinin Cc: Jann Horn Cc: Mel Gorman Cc: Johannes Weiner Cc: Matthew Wilcox Cc: Hugh Dickins Cc: cgroups@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org --- Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 +++------------ Documentation/admin-guide/cgroup-v1/memory.rst | 8 ++++---- Documentation/trace/events-kmem.rst | 2 +- Documentation/vm/unevictable-lru.rst | 22 ++++++++-------------- include/linux/mm_types.h | 2 +- include/linux/mmzone.h | 2 +- mm/filemap.c | 4 ++-- mm/memcontrol.c | 2 +- mm/rmap.c | 2 +- mm/vmscan.c | 12 ++++++++---- 10 files changed, 30 insertions(+), 41 deletions(-) diff --git a/Documentation/admin-guide/cgroup-v1/memcg_test.rst b/Documentation/admin-guide/cgroup-v1/memcg_test.rst index 3f7115e07b5d..0b9f91589d3d 100644 --- a/Documentation/admin-guide/cgroup-v1/memcg_test.rst +++ b/Documentation/admin-guide/cgroup-v1/memcg_test.rst @@ -133,18 +133,9 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y. 8. LRU ====== - Each memcg has its own private LRU. Now, its handling is under global - VM's control (means that it's handled under global pgdat->lru_lock). - Almost all routines around memcg's LRU is called by global LRU's - list management functions under pgdat->lru_lock. - - A special function is mem_cgroup_isolate_pages(). This scans - memcg's private LRU and call __isolate_lru_page() to extract a page - from LRU. - - (By __isolate_lru_page(), the page is removed from both of global and - private LRU.)
- + Each memcg has its own vector of LRUs (inactive anon, active anon, + inactive file, active file, unevictable) of pages from each node, + each LRU handled under a single lru_lock for that memcg and node. 9. Typical Tests. ================= diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst index 0ae4f564c2d6..5a68ecfdb835 100644 --- a/Documentation/admin-guide/cgroup-v1/memory.rst +++ b/Documentation/admin-guide/cgroup-v1/memory.rst @@ -297,13 +297,13 @@ When oom event notifier is registered, event will be delivered. PG_locked. mm->page_table_lock - pgdat->lru_lock - lock_page_cgroup. + lruvec->lru_lock + lock_page_cgroup. In many cases, just lock_page_cgroup() is called. - per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by - pgdat->lru_lock, it has no lock of its own. + per-node-per-cgroup LRU (cgroup's private LRU) is just guarded by + lruvec->lru_lock, it has no lock of its own. 2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM) ----------------------------------------------- diff --git a/Documentation/trace/events-kmem.rst b/Documentation/trace/events-kmem.rst index 555484110e36..68fa75247488 100644 --- a/Documentation/trace/events-kmem.rst +++ b/Documentation/trace/events-kmem.rst @@ -69,7 +69,7 @@ When pages are freed in batch, the also mm_page_free_batched is triggered. Broadly speaking, pages are taken off the LRU lock in bulk and freed in batch with a page list. Significant amounts of activity here could indicate that the system is under memory pressure and can also indicate -contention on the zone->lru_lock. +contention on the lruvec->lru_lock. 4. Per-CPU Allocator Activity ============================= diff --git a/Documentation/vm/unevictable-lru.rst b/Documentation/vm/unevictable-lru.rst index 17d0861b0f1d..0e1490524f53 100644 --- a/Documentation/vm/unevictable-lru.rst +++ b/Documentation/vm/unevictable-lru.rst @@ -33,7 +33,7 @@ reclaim in Linux. The problems have been observed at customer sites on large memory x86_64 systems. To illustrate this with an example, a non-NUMA x86_64 platform with 128GB of -main memory will have over 32 million 4k pages in a single zone. When a large +main memory will have over 32 million 4k pages in a single node. When a large fraction of these pages are not evictable for any reason [see below], vmscan will spend a lot of time scanning the LRU lists looking for the small fraction of pages that are evictable. This can result in a situation where all CPUs are @@ -55,7 +55,7 @@ unevictable, either by definition or by circumstance, in the future. The Unevictable Page List ------------------------- -The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list +The Unevictable LRU infrastructure consists of an additional, per-node, LRU list called the "unevictable" list and an associated page flag, PG_unevictable, to indicate that the page is being managed on the unevictable list. @@ -84,15 +84,9 @@ The unevictable list does not differentiate between file-backed and anonymous, swap-backed pages. This differentiation is only important while the pages are, in fact, evictable. -The unevictable list benefits from the "arrayification" of the per-zone LRU +The unevictable list benefits from the "arrayification" of the per-node LRU lists and statistics originally proposed and posted by Christoph Lameter. -The unevictable list does not use the LRU pagevec mechanism. 
Rather, -unevictable pages are placed directly on the page's zone's unevictable list -under the zone lru_lock. This allows us to prevent the stranding of pages on -the unevictable list when one task has the page isolated from the LRU and other -tasks are changing the "evictability" state of the page. - Memory Control Group Interaction -------------------------------- @@ -101,8 +95,8 @@ The unevictable LRU facility interacts with the memory control group [aka memory controller; see Documentation/admin-guide/cgroup-v1/memory.rst] by extending the lru_list enum. -The memory controller data structure automatically gets a per-zone unevictable -list as a result of the "arrayification" of the per-zone LRU lists (one per +The memory controller data structure automatically gets a per-node unevictable +list as a result of the "arrayification" of the per-node LRU lists (one per lru_list enum element). The memory controller tracks the movement of pages to and from the unevictable list. @@ -196,7 +190,7 @@ for the sake of expediency, to leave a unevictable page on one of the regular active/inactive LRU lists for vmscan to deal with. vmscan checks for such pages in all of the shrink_{active|inactive|page}_list() functions and will "cull" such pages that it encounters: that is, it diverts those pages to the -unevictable list for the zone being scanned. +unevictable list for the node being scanned. There may be situations where a page is mapped into a VM_LOCKED VMA, but the page is not marked as PG_mlocked. Such pages will make it all the way to @@ -328,7 +322,7 @@ If the page was NOT already mlocked, mlock_vma_page() attempts to isolate the page from the LRU, as it is likely on the appropriate active or inactive list at that time. If the isolate_lru_page() succeeds, mlock_vma_page() will put back the page - by calling putback_lru_page() - which will notice that the page -is now mlocked and divert the page to the zone's unevictable list. If +is now mlocked and divert the page to the node's unevictable list. If mlock_vma_page() is unable to isolate the page from the LRU, vmscan will handle it later if and when it attempts to reclaim the page. @@ -603,7 +597,7 @@ Some examples of these unevictable pages on the LRU lists are: unevictable list in mlock_vma_page(). shrink_inactive_list() also diverts any unevictable pages that it finds on the -inactive lists to the appropriate zone's unevictable list. +inactive lists to the appropriate node's unevictable list. shrink_inactive_list() should only see SHM_LOCK'd pages that became SHM_LOCK'd after shrink_active_list() had moved them to the inactive list, or pages mapped diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index c28911c3afa8..852b8ee0da5d 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -78,7 +78,7 @@ struct page { struct { /* Page cache and anonymous pages */ /** * @lru: Pageout list, eg. active_list protected by - * pgdat->lru_lock. Sometimes used as a generic list + * lruvec->lru_lock. Sometimes used as a generic list * by the page owner. */ struct list_head lru; diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index f1ff6713ac06..bc91d4a43960 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -159,7 +159,7 @@ static inline bool free_area_empty(struct free_area *area, int migratetype) struct pglist_data; /* - * zone->lock and the zone lru_lock are two of the hottest locks in the kernel. + * zone->lock and the lru_lock are two of the hottest locks in the kernel. 
* So add a wild amount of padding here to ensure that they fall into separate * cachelines. There are very few zone structures in the machine, so space * consumption is not a concern here. diff --git a/mm/filemap.c b/mm/filemap.c index 1784478270e1..28fff98f4459 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -101,8 +101,8 @@ * ->swap_lock (try_to_unmap_one) * ->private_lock (try_to_unmap_one) * ->i_pages lock (try_to_unmap_one) - * ->pgdat->lru_lock (follow_page->mark_page_accessed) - * ->pgdat->lru_lock (check_pte_range->isolate_lru_page) + * ->lruvec->lru_lock (follow_page->mark_page_accessed) + * ->lruvec->lru_lock (check_pte_range->isolate_lru_page) * ->private_lock (page_remove_rmap->set_page_dirty) * ->i_pages lock (page_remove_rmap->set_page_dirty) * bdi.wb->list_lock (page_remove_rmap->set_page_dirty) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b0c156da60fe..099926efbb48 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2940,7 +2940,7 @@ void __memcg_kmem_uncharge(struct page *page, int order) #ifdef CONFIG_TRANSPARENT_HUGEPAGE /* - * Because tail pages are not marked as "used", set it. We're under + * Because tail pages are not marked as "used", set it. Don't need * lruvec->lru_lock and migration entries setup in all page mappings. */ void mem_cgroup_split_huge_fixup(struct page *head) diff --git a/mm/rmap.c b/mm/rmap.c index b3e381919835..39052794cb46 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -27,7 +27,7 @@ * mapping->i_mmap_rwsem * anon_vma->rwsem * mm->page_table_lock or pte_lock - * pgdat->lru_lock (in mark_page_accessed, isolate_lru_page) + * lruvec->lru_lock (in mark_page_accessed, isolate_lru_page) * swap_lock (in swap_duplicate, swap_info_get) * mmlist_lock (in mmput, drain_mmlist and others) * mapping->private_lock (in __set_page_dirty_buffers) diff --git a/mm/vmscan.c b/mm/vmscan.c index de925bd629eb..2c98352c95cc 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1601,14 +1601,16 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec, } /** - * pgdat->lru_lock is heavily contended. Some of the functions that + * Isolating page from the lruvec to fill in @dst list by nr_to_scan times. + * + * lruvec->lru_lock is heavily contended. Some of the functions that * shrink the lists perform better by taking out a batch of pages * and working on them outside the LRU lock. * * For pagecache intensive workloads, this function is the hottest * spot in the kernel (apart from copy_*_user functions). * - * Appropriate locks must be held before calling this function. + * Lru_lock must be held before calling this function. * * @nr_to_scan: The number of eligible pages to look through on the list. * @lruvec: The LRU vector to pull pages from. @@ -1809,14 +1811,16 @@ static int too_many_isolated(struct pglist_data *pgdat, int file, /* * This moves pages from @list to corresponding LRU list. + * The pages from @list is out of any lruvec, and in the end list reuses as + * pages_to_free list. * * We move them the other way if the page is referenced by one or more * processes, from rmap. * * If the pages are mostly unmapped, the processing is fast and it is - * appropriate to hold zone_lru_lock across the whole operation. But if + * appropriate to hold lru_lock across the whole operation. But if * the pages are mapped, the processing is slow (page_referenced()) so we - * should drop zone_lru_lock around each page. It's impossible to balance + * should drop lru_lock around each page. 
It's impossible to balance * this, so instead we remove the pages from the LRU while processing them. * It is safe to rely on PG_active against the non-LRU pages in here because * nobody will play with that bit on a non-LRU page. From patchwork Mon Mar 2 11:00:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415289 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8D2AC138D for ; Mon, 2 Mar 2020 11:01:34 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5ADA72467B for ; Mon, 2 Mar 2020 11:01:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5ADA72467B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 81A986B006C; Mon, 2 Mar 2020 06:01:19 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 7C9EF6B006E; Mon, 2 Mar 2020 06:01:19 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5F5366B0070; Mon, 2 Mar 2020 06:01:19 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0122.hostedemail.com [216.40.44.122]) by kanga.kvack.org (Postfix) with ESMTP id 45F246B006C for ; Mon, 2 Mar 2020 06:01:19 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 36378180AD801 for ; Mon, 2 Mar 2020 11:01:19 +0000 (UTC) X-FDA: 76550130678.24.board38_7fd0b7beaa33a X-Spam-Summary: 2,0,0,c0b0c7b8ebcbec14,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1345:1359:1431:1437:1534:1543:1711:1730:1747:1777:1792:2393:2559:2562:2901:3138:3139:3140:3141:3142:3354:3866:4321:4605:5007:6114:6119:6261:6642:6737:7903:9040:10004:11026:11473:11658:11914:12043:12048:12296:12297:12438:12555:12895:12986:13846:14096:14181:14394:14721:14915:21060:21080:21451:21627:21966:21990:30054:30070,0,RBL:115.124.30.57:@linux.alibaba.com:.lbl8.mailshell.net-62.20.2.100 64.201.201.201,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:24,LUA_SUMMARY:none X-HE-Tag: board38_7fd0b7beaa33a X-Filterd-Recvd-Size: 4619 Received: from out30-57.freemail.mail.aliyun.com (out30-57.freemail.mail.aliyun.com [115.124.30.57]) by imf40.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:14 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R611e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04400;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0TrR9JWI_1583146862; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrR9JWI_1583146862) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:01:02 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, 
lkp@intel.com Cc: Alex Shi , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 19/20] mm/lru: add debug checking for page memcg moving Date: Mon, 2 Mar 2020 19:00:29 +0800 Message-Id: <1583146830-169516-20-git-send-email-alex.shi@linux.alibaba.com> In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> This debug patch can give some clues if something has been overlooked. Hugh Dickins reported a bug in this patch, thanks! Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Hugh Dickins Cc: Andrew Morton Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- include/linux/memcontrol.h | 6 ++++++ mm/compaction.c | 2 ++ mm/memcontrol.c | 15 +++++++++++++++ 3 files changed, 23 insertions(+) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index f60009580d2a..27c7bbe82125 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -431,6 +431,8 @@ struct lruvec *lock_page_lruvec_irqsave(struct page *page, void unlock_page_lruvec_irq(struct lruvec *lruvec); void unlock_page_lruvec_irqrestore(struct lruvec *lruvec, unsigned long flags); +void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page); + static inline struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){ return css ? container_of(css, struct mem_cgroup, css) : NULL; @@ -1191,6 +1193,10 @@ static inline void count_memcg_page_event(struct page *page, void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx) { } + +static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) +{ +} #endif /* CONFIG_MEMCG */ /* idx can be of type enum memcg_stat_item or node_stat_item */ diff --git a/mm/compaction.c b/mm/compaction.c index 4a773f0ffedf..40c270bd5092 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -984,6 +984,8 @@ static bool too_many_isolated(pg_data_t *pgdat) compact_lock_irqsave(&lruvec->lru_lock, &flags, cc); locked_lruvec = lruvec; + lruvec_memcg_debug(lruvec, page); + /* Try get exclusive access under lock */ if (!skip_updated) { skip_updated = true; diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 099926efbb48..2d71e53ead88 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1240,6 +1240,19 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd return lruvec; } +void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page) +{ +#ifdef CONFIG_DEBUG_VM + if (mem_cgroup_disabled()) + return; + + if (!page->mem_cgroup) + VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != root_mem_cgroup, page); + else + VM_BUG_ON_PAGE(lruvec_memcg(lruvec) != page->mem_cgroup, page); +#endif +} + /* page must be isolated */ struct lruvec *lock_page_lruvec_irq(struct page *page) { @@ -1248,6 +1261,7 @@ struct lruvec *lock_page_lruvec_irq(struct page *page) lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); spin_lock_irq(&lruvec->lru_lock); + lruvec_memcg_debug(lruvec, page); return lruvec; } @@ -1258,6 +1272,7 @@ struct lruvec *lock_page_lruvec_irqsave(struct page *page, unsigned long *flags) lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page)); spin_lock_irqsave(&lruvec->lru_lock, *flags); + lruvec_memcg_debug(lruvec, page); return lruvec; } From patchwork Mon Mar 2 11:00:30 2020
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Shi X-Patchwork-Id: 11415295 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 68E51138D for ; Mon, 2 Mar 2020 11:01:43 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3D3442468D for ; Mon, 2 Mar 2020 11:01:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3D3442468D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id EA9786B0071; Mon, 2 Mar 2020 06:01:35 -0500 (EST) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id E59426B0072; Mon, 2 Mar 2020 06:01:35 -0500 (EST) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D988F6B0073; Mon, 2 Mar 2020 06:01:35 -0500 (EST) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0063.hostedemail.com [216.40.44.63]) by kanga.kvack.org (Postfix) with ESMTP id B90EC6B0071 for ; Mon, 2 Mar 2020 06:01:35 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 6F9E4181AC9BF for ; Mon, 2 Mar 2020 11:01:35 +0000 (UTC) X-FDA: 76550131350.13.store98_823582729ee4f X-Spam-Summary: 1,0,0,,d41d8cd98f00b204,alex.shi@linux.alibaba.com,,RULES_HIT:30054:30070,0,RBL:47.88.44.36:@linux.alibaba.com:.lbl8.mailshell.net-64.10.201.10 62.18.0.100;47.88.44.36-irl.urbl.hostedemail.com-127.0.0.175,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:none,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: store98_823582729ee4f X-Filterd-Recvd-Size: 2494 Received: from out4436.biz.mail.alibaba.com (out4436.biz.mail.alibaba.com [47.88.44.36]) by imf41.hostedemail.com (Postfix) with ESMTP for ; Mon, 2 Mar 2020 11:01:34 +0000 (UTC) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R561e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01f04391;MF=alex.shi@linux.alibaba.com;NM=1;PH=DS;RN=16;SR=0;TI=SMTPD_---0TrQmbg._1583146862; Received: from localhost(mailfrom:alex.shi@linux.alibaba.com fp:SMTPD_---0TrQmbg._1583146862) by smtp.aliyun-inc.com(127.0.0.1); Mon, 02 Mar 2020 19:01:03 +0800 From: Alex Shi To: cgroups@vger.kernel.org, akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org, hughd@google.com, khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, yang.shi@linux.alibaba.com, willy@infradead.org, hannes@cmpxchg.org, lkp@intel.com Cc: Alex Shi , Michal Hocko , Vladimir Davydov , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 20/20] mm/memcg: add debug checking in lock_page_memcg Date: Mon, 2 Mar 2020 19:00:30 +0800 Message-Id: <1583146830-169516-21-git-send-email-alex.shi@linux.alibaba.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> References: <1583146830-169516-1-git-send-email-alex.shi@linux.alibaba.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: 
owner-majordomo@kvack.org List-ID: This extra irq disable/enable and BUG_ON checking costs 5% of readtwice performance on a 2-socket * 26-core * HT box, so put it under CONFIG_PROVE_LOCKING. Signed-off-by: Alex Shi Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Andrew Morton Cc: cgroups@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org --- mm/memcontrol.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 2d71e53ead88..8d7f6336f15c 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2026,6 +2026,12 @@ struct mem_cgroup *lock_page_memcg(struct page *page) if (unlikely(!memcg)) return NULL; +#ifdef CONFIG_PROVE_LOCKING + local_irq_save(flags); + might_lock(&memcg->move_lock); + local_irq_restore(flags); +#endif + if (atomic_read(&memcg->moving_account) <= 0) return memcg;