From patchwork Tue Sep 11 10:34:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A . Shutemov"
X-Patchwork-Id: 10595475
From: "Kirill A. Shutemov"
To: Andrew Morton
Cc: Vegard Nossum, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 "Kirill A. Shutemov", stable@vger.kernel.org, Zi Yan, Naoya Horiguchi,
 Vlastimil Babka, Andrea Arcangeli
Subject: [PATCH] mm, thp: Fix mlocking THP page with migration enabled
Date: Tue, 11 Sep 2018 13:34:03 +0300
Message-Id: <20180911103403.38086-1-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.18.0

A transparent huge page is represented by a single entry on an LRU list.
Therefore, we can only make an entire compound page unevictable, not
individual subpages.

If a user tries to mlock() part of a huge page, we want the rest of the
page to be reclaimable.

We handle this by keeping PTE-mapped huge pages on normal LRU lists: the
PMD on the border of a VM_LOCKED VMA will be split into a PTE table.

The introduction of THP migration breaks the rules around mlocking THP
pages. If there is a single PMD mapping of the page in an mlocked VMA,
the page gets mlocked regardless of any PTE mappings of the page.

For tmpfs/shmem it's easy to fix by checking PageDoubleMap() in
remove_migration_pmd().

Anon THP pages can only be shared between processes via fork(). An
mlocked page can only be shared if the parent mlocked it before forking;
otherwise CoW will be triggered on mlock().

For Anon-THP, we can fix the issue by munlocking the page when removing
a PTE migration entry for the page. PTEs for the page will always come
after the mlocked PMD: rmap walks VMAs from oldest to newest.

Test-case:

	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/wait.h>
	#include <linux/mempolicy.h>
	#include <numaif.h>

	int main(void)
	{
		unsigned long nodemask = 4;
		void *addr;

		addr = mmap((void *)0x20000000UL, 2UL << 20, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

		if (fork()) {
			wait(NULL);
			return 0;
		}

		mlock(addr, 4UL << 10);
		mbind(addr, 2UL << 20, MPOL_PREFERRED | MPOL_F_RELATIVE_NODES,
			&nodemask, 4, MPOL_MF_MOVE | MPOL_MF_MOVE_ALL);

		return 0;
	}

Signed-off-by: Kirill A. Shutemov
Reported-by: Vegard Nossum
Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path")
Cc: <stable@vger.kernel.org> [v4.14+]
Cc: Zi Yan
Cc: Naoya Horiguchi
Cc: Vlastimil Babka
Cc: Andrea Arcangeli
---
 mm/huge_memory.c | 2 +-
 mm/migrate.c     | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 533f9b00147d..00704060b7f7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2931,7 +2931,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 	else
 		page_add_file_rmap(new, true);
 	set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);
-	if (vma->vm_flags & VM_LOCKED)
+	if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new))
 		mlock_vma_page(new);
 	update_mmu_cache_pmd(vma, address, pvmw->pmd);
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index d6a2e89b086a..01dad96b25b5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -275,6 +275,9 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new))
 			mlock_vma_page(new);
 
+		if (PageTransCompound(new) && PageMlocked(page))
+			clear_page_mlock(page);
+
 		/* No need to invalidate - it was non-present before */
 		update_mmu_cache(vma, pvmw.address, pvmw.pte);
 	}
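
The combined effect of the two hunks can be sketched as a small userspace
model. This is only an illustration under stated assumptions: struct
model_page and every model_* function are hypothetical names, not kernel
APIs; the three booleans stand in for PageTransCompound(), PageDoubleMap()
and PageMlocked(); and the head page and subpage are collapsed into one
object. The model walks the mappings in the order the changelog describes
(the mlocked PMD first, PTE mappings afterwards).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the page flags the patch looks at. */
struct model_page {
	bool trans_compound;	/* models PageTransCompound() */
	bool double_map;	/* models PageDoubleMap() */
	bool mlocked;		/* models PageMlocked() */
};

/* Models the patched check in remove_migration_pmd(). */
static void model_restore_pmd(struct model_page *page, bool vma_locked)
{
	if (vma_locked && !page->double_map)
		page->mlocked = true;
}

/* Models the two checks in the patched remove_migration_pte(). */
static void model_restore_pte(struct model_page *page, bool vma_locked)
{
	if (vma_locked && !page->trans_compound)
		page->mlocked = true;

	/* New in the patch: a PTE-mapped THP must not stay mlocked. */
	if (page->trans_compound && page->mlocked)
		page->mlocked = false;
}

/*
 * Anon THP with a PMD mapping in a VM_LOCKED VMA, optionally followed by
 * a PTE mapping in another VMA. Returns the final mlocked state.
 */
static bool model_thp_mlocked_after_migration(bool pte_mapped_elsewhere)
{
	struct model_page thp = { .trans_compound = true };

	model_restore_pmd(&thp, true);
	if (pte_mapped_elsewhere)
		model_restore_pte(&thp, false);
	return thp.mlocked;
}

/* Sanity checks for the model itself. */
static void model_self_check(void)
{
	assert(model_thp_mlocked_after_migration(false));
	assert(!model_thp_mlocked_after_migration(true));
}
```

In this model a THP that is only PMD-mapped in an mlocked VMA ends up
mlocked, while one that also has a PTE mapping ends up munlocked. Setting
double_map before calling model_restore_pmd() models the tmpfs/shmem case,
where the page is never mlocked in the first place.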