From patchwork Fri Jun 22 03:51:37 2018
X-Patchwork-Submitter: "Huang, Ying" <ying.huang@intel.com>
X-Patchwork-Id: 10481139
From: "Huang, Ying" <ying.huang@intel.com>
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
 "Kirill A. Shutemov", Andrea Arcangeli, Michal Hocko, Johannes Weiner,
 Shaohua Li, Hugh Dickins, Minchan Kim, Rik van Riel, Dave Hansen,
 Naoya Horiguchi, Zi Yan, Daniel Jordan
Subject: [PATCH -mm -v4 07/21] mm, THP, swap: Support PMD swap mapping in split_swap_cluster()
Date: Fri, 22 Jun 2018 11:51:37 +0800
Message-Id: <20180622035151.6676-8-ying.huang@intel.com>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20180622035151.6676-1-ying.huang@intel.com>
References: <20180622035151.6676-1-ying.huang@intel.com>

From: Huang Ying <ying.huang@intel.com>

When splitting a THP in the swap cache, or when failing to allocate a
THP during swapin of a huge swap cluster, the huge swap cluster will be
split.  In addition to clearing the huge flag of the swap cluster, the
PMD swap mapping count recorded in cluster_count() will be set to 0.
But the PMD swap mappings themselves are not touched, because it may be
hard to find them all.  When the PMD swap mappings are operated on
later, it will be found that the huge swap cluster has already been
split, and the PMD swap mappings will be split at that time.

Unless splitting a THP in the swap cache (specified via the "force"
parameter), split_swap_cluster() will return -EEXIST if the
SWAP_HAS_CACHE flag is set in swap_map[offset], because this indicates
that a THP corresponds to this huge swap cluster, and splitting that
THP is not desired.

When splitting a THP in the swap cache, the call to
split_swap_cluster() is moved to before unlocking the sub-pages, so
that all sub-pages are kept locked from the time the THP is split until
the huge swap cluster is split.  This makes the code much easier to
reason about.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
Cc: Daniel Jordan
---
 include/linux/swap.h |  4 ++--
 mm/huge_memory.c     | 18 ++++++++++++------
 mm/swapfile.c        | 45 ++++++++++++++++++++++++++++++---------------
 3 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index bb9de2cb952a..878f132dabc0 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -617,10 +617,10 @@ static inline swp_entry_t get_swap_page(struct page *page)
 #endif /* CONFIG_SWAP */
 
 #ifdef CONFIG_THP_SWAP
-extern int split_swap_cluster(swp_entry_t entry);
+extern int split_swap_cluster(swp_entry_t entry, bool force);
 extern int split_swap_cluster_map(swp_entry_t entry);
 #else
-static inline int split_swap_cluster(swp_entry_t entry)
+static inline int split_swap_cluster(swp_entry_t entry, bool force)
 {
 	return 0;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2d615328d77f..586d8693b8af 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2502,6 +2502,17 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 
 	unfreeze_page(head);
 
+	/*
+	 * Split swap cluster before unlocking sub-pages.  So all
+	 * sub-pages will be kept locked from THP has been split to
+	 * swap cluster is split.
+	 */
+	if (PageSwapCache(head)) {
+		swp_entry_t entry = { .val = page_private(head) };
+
+		split_swap_cluster(entry, true);
+	}
+
 	for (i = 0; i < HPAGE_PMD_NR; i++) {
 		struct page *subpage = head + i;
 		if (subpage == page)
@@ -2728,12 +2739,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 			__dec_node_page_state(page, NR_SHMEM_THPS);
 		spin_unlock(&pgdata->split_queue_lock);
 		__split_huge_page(page, list, flags);
-		if (PageSwapCache(head)) {
-			swp_entry_t entry = { .val = page_private(head) };
-
-			ret = split_swap_cluster(entry);
-		} else
-			ret = 0;
+		ret = 0;
 	} else {
 		if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
 			pr_alert("total_mapcount: %u, page_count(): %u\n",
diff --git a/mm/swapfile.c b/mm/swapfile.c
index a0141307f3ac..5ff2da89b77c 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1410,21 +1410,6 @@ static void swapcache_free_cluster(swp_entry_t entry)
 		}
 	}
 }
-
-int split_swap_cluster(swp_entry_t entry)
-{
-	struct swap_info_struct *si;
-	struct swap_cluster_info *ci;
-	unsigned long offset = swp_offset(entry);
-
-	si = _swap_info_get(entry);
-	if (!si)
-		return -EBUSY;
-	ci = lock_cluster(si, offset);
-	cluster_clear_huge(ci);
-	unlock_cluster(ci);
-	return 0;
-}
 #else
 static inline void swapcache_free_cluster(swp_entry_t entry)
 {
@@ -4069,6 +4054,36 @@ int split_swap_cluster_map(swp_entry_t entry)
 	unlock_cluster(ci);
 	return 0;
 }
+
+int split_swap_cluster(swp_entry_t entry, bool force)
+{
+	struct swap_info_struct *si;
+	struct swap_cluster_info *ci;
+	unsigned long offset = swp_offset(entry);
+	int ret = 0;
+
+	si = get_swap_device(entry);
+	if (!si)
+		return -EINVAL;
+	ci = lock_cluster(si, offset);
+	/* The swap cluster has been split by someone else */
+	if (!cluster_is_huge(ci))
+		goto out;
+	VM_BUG_ON(!is_cluster_offset(offset));
+	VM_BUG_ON(cluster_count(ci) < SWAPFILE_CLUSTER);
+	/* If not forced, don't split swap cluster has swap cache */
+	if (!force && si->swap_map[offset] & SWAP_HAS_CACHE) {
+		ret = -EEXIST;
+		goto out;
+	}
+	cluster_set_count(ci, SWAPFILE_CLUSTER);
+	cluster_clear_huge(ci);
+
+out:
+	unlock_cluster(ci);
+	put_swap_device(si);
+	return ret;
+}
 #endif
 
 static int __init swapfile_init(void)