From patchwork Fri Feb 28 03:38:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Ying"
X-Patchwork-Id: 11411467
From: "Huang, Ying"
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
    David Hildenbrand, Mel Gorman, Vlastimil Babka, Zi Yan,
    Michal Hocko, Peter Zijlstra, Dave Hansen, Minchan Kim,
    Johannes Weiner, Hugh Dickins
Subject: [RFC 3/3] mm: Discard lazily freed pages when migrating
Date: Fri, 28 Feb 2020 11:38:19 +0800
Message-Id: <20200228033819.3857058-4-ying.huang@intel.com>
X-Mailer: git-send-email 2.25.0
In-Reply-To: <20200228033819.3857058-1-ying.huang@intel.com>
References: <20200228033819.3857058-1-ying.huang@intel.com>

From: Huang Ying

MADV_FREE is a lazy free mechanism in Linux.  According to the manpage
of madvise(2), the semantics of MADV_FREE are:

  The application no longer requires the pages in the range specified
  by addr and len.  The kernel can thus free these pages, but the
  freeing could be delayed until memory pressure occurs. ...

Originally, pages freed lazily by MADV_FREE are actually freed only by
page reclaim when there is memory pressure, or when the address range
is unmapped.
In addition, there is another opportunity to actually free these
pages: when we try to migrate them.

The main benefit of doing so is to avoid creating new memory pressure
immediately, where possible.  Instead, even if the pages are required
again, they will be allocated gradually on demand.  That is, the memory
will be allocated lazily when necessary.  This follows the common
philosophy in the Linux kernel: allocate resources lazily on demand.

Signed-off-by: "Huang, Ying"
Cc: David Hildenbrand
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: Zi Yan
Cc: Michal Hocko
Cc: Peter Zijlstra
Cc: Dave Hansen
Cc: Minchan Kim
Cc: Johannes Weiner
Cc: Hugh Dickins
---
 include/linux/migrate.h |  4 ++++
 mm/huge_memory.c        | 20 +++++++++++++++-----
 mm/migrate.c            | 16 +++++++++++++++-
 mm/rmap.c               | 10 ++++++++++
 4 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 72120061b7d4..2c6cf985a8d3 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -14,8 +14,12 @@ typedef void free_page_t(struct page *page, unsigned long private);
  * Return values from addresss_space_operations.migratepage():
  * - negative errno on page migration failure;
  * - zero on page migration success;
+ *
+ * __unmap_and_move() can also return 1 to indicate the page can be
+ * discarded instead of migrated.
  */
 #define MIGRATEPAGE_SUCCESS		0
+#define MIGRATEPAGE_DISCARD		1
 
 enum migrate_reason {
 	MR_COMPACTION,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b1e069e68189..b64f356ab77e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3063,11 +3063,21 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
 	if (pmd_dirty(pmdval))
 		set_page_dirty(page);
-	entry = make_migration_entry(page, pmd_write(pmdval));
-	pmdswp = swp_entry_to_pmd(entry);
-	if (pmd_soft_dirty(pmdval))
-		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
-	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
+	/* Clean lazyfree page, discard instead of migrate */
+	if (PageLazyFree(page) && !PageDirty(page)) {
+		pmd_clear(pvmw->pmd);
+		zap_deposited_table(mm, pvmw->pmd);
+		/* Invalidate as we cleared the pmd */
+		mmu_notifier_invalidate_range(mm, address,
+					      address + HPAGE_PMD_SIZE);
+		add_mm_counter(mm, MM_ANONPAGES, -HPAGE_PMD_NR);
+	} else {
+		entry = make_migration_entry(page, pmd_write(pmdval));
+		pmdswp = swp_entry_to_pmd(entry);
+		if (pmd_soft_dirty(pmdval))
+			pmdswp = pmd_swp_mksoft_dirty(pmdswp);
+		set_pmd_at(mm, address, pvmw->pmd, pmdswp);
+	}
 	page_remove_rmap(page, true);
 	put_page(page);
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index 981f8374a6ef..b7e7d18af94c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1122,6 +1122,11 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 			goto out_unlock_both;
 		}
 		page_was_mapped = 1;
+		/* Clean lazyfree page, discard instead of migrate */
+		if (PageLazyFree(page) && !PageDirty(page)) {
+			rc = MIGRATEPAGE_DISCARD;
+			goto out_unlock_both;
+		}
 	}
 
 	if (!page_mapped(page))
@@ -1242,7 +1247,16 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 				num_poisoned_pages_inc();
 		}
 	} else {
-		if (rc != -EAGAIN) {
+		/*
+		 * If the page is discarded instead of migrated, release
+		 * the reference grabbed during isolation and free the new
+		 * page.  For the caller, this is the same as migrating
+		 * successfully.
+		 */
+		if (rc == MIGRATEPAGE_DISCARD) {
+			put_page(page);
+			rc = MIGRATEPAGE_SUCCESS;
+		} else if (rc != -EAGAIN) {
 			if (likely(!__PageMovable(page))) {
 				putback_lru_page(page);
 				goto put_new;
diff --git a/mm/rmap.c b/mm/rmap.c
index 1dcbb1771dd7..bb52883f7b2d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1569,6 +1569,16 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_entry_t entry;
 			pte_t swp_pte;
 
+			/* Clean lazyfree page, discard instead of migrate */
+			if (PageLazyFree(page) && !PageDirty(page) &&
+			    !(flags & TTU_SPLIT_FREEZE)) {
+				/* Invalidate as we cleared the pte */
+				mmu_notifier_invalidate_range(mm,
+					address, address + PAGE_SIZE);
+				dec_mm_counter(mm, MM_ANONPAGES);
+				goto discard;
+			}
+
 			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 				set_pte_at(mm, address, pvmw.pte, pteval);
 				ret = false;
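
For illustration only, the condition this patch adds to
try_to_unmap_one() can be modelled in userspace as sketched below.  It
is not kernel code: struct model_page, enum model_action,
model_unmap_action() and model_self_check() are hypothetical names
introduced here, and the two booleans merely stand in for the results
of PageLazyFree(page) and PageDirty(page).  The split_freeze argument
models the TTU_SPLIT_FREEZE flag test; note that the checks added to
__unmap_and_move() and set_pmd_migration_entry() do not test that
flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_page {
	bool lazyfree;	/* stands in for PageLazyFree(page) */
	bool dirty;	/* stands in for PageDirty(page) */
};

enum model_action {
	MODEL_MIGRATE,	/* install a migration entry, as before the patch */
	MODEL_DISCARD,	/* drop the mapping so the page can be freed */
};

/*
 * Mirrors the condition added to try_to_unmap_one(): a clean, lazily
 * freed page is discarded unless the caller asked for TTU_SPLIT_FREEZE
 * (modelled here by split_freeze).
 */
enum model_action model_unmap_action(const struct model_page *page,
				     bool split_freeze)
{
	if (page->lazyfree && !page->dirty && !split_freeze)
		return MODEL_DISCARD;
	return MODEL_MIGRATE;
}

/* Walks the four interesting cases; aborts if the model disagrees. */
void model_self_check(void)
{
	struct model_page clean = { .lazyfree = true, .dirty = false };
	struct model_page redirtied = { .lazyfree = true, .dirty = true };
	struct model_page normal = { .lazyfree = false, .dirty = false };

	assert(model_unmap_action(&clean, false) == MODEL_DISCARD);
	assert(model_unmap_action(&clean, true) == MODEL_MIGRATE);
	assert(model_unmap_action(&redirtied, false) == MODEL_MIGRATE);
	assert(model_unmap_action(&normal, false) == MODEL_MIGRATE);
}
```

In this model, a page that was written again after MADV_FREE counts as
dirty, so it takes the ordinary migration path and its contents are
kept; only a page that is still clean is dropped.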