From patchwork Tue Aug 21 20:59:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 10572357
From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: "Kirill A. Shutemov", Jérôme Glisse, Vlastimil Babka, Naoya Horiguchi,
    Davidlohr Bueso, Michal Hocko, Andrew Morton, Mike Kravetz
Subject: [PATCH v3 2/2] hugetlb: take PMD sharing into account when flushing tlb/caches
Date: Tue, 21 Aug 2018 13:59:02 -0700
Message-Id: <20180821205902.21223-3-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180821205902.21223-1-mike.kravetz@oracle.com>
References: <20180821205902.21223-1-mike.kravetz@oracle.com>

When fixing an issue with PMD sharing and migration, it was discovered
via code inspection that other callers of huge_pmd_unshare potentially
have an issue with cache and tlb flushing.

Use the routine huge_pmd_sharing_possible() to calculate worst case
ranges for mmu notifiers.  Ensure that this range is flushed if
huge_pmd_unshare succeeds and unmaps a PUD_SIZE area.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb.c | 53 +++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 44 insertions(+), 9 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fd155dc52117..c31d92889775 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3333,8 +3333,8 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	struct page *page;
 	struct hstate *h = hstate_vma(vma);
 	unsigned long sz = huge_page_size(h);
-	const unsigned long mmun_start = start;	/* For mmu_notifiers */
-	const unsigned long mmun_end   = end;	/* For mmu_notifiers */
+	unsigned long mmun_start = start;	/* For mmu_notifiers */
+	unsigned long mmun_end = end;		/* For mmu_notifiers */
 
 	WARN_ON(!is_vm_hugetlb_page(vma));
 	BUG_ON(start & ~huge_page_mask(h));
@@ -3346,6 +3346,11 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	 */
 	tlb_remove_check_page_size_change(tlb, sz);
 	tlb_start_vma(tlb, vma);
+
+	/*
+	 * If sharing possible, alert mmu notifiers of worst case.
+	 */
+	(void)huge_pmd_sharing_possible(vma, &mmun_start, &mmun_end);
 	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
 	address = start;
 	for (; address < end; address += sz) {
@@ -3356,6 +3361,10 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		ptl = huge_pte_lock(h, mm, ptep);
 		if (huge_pmd_unshare(mm, &address, ptep)) {
 			spin_unlock(ptl);
+			/*
+			 * We just unmapped a page of PMDs by clearing a PUD.
+			 * The caller's TLB flush range should cover this area.
+			 */
 			continue;
 		}
 
@@ -3438,12 +3447,23 @@ void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
 {
 	struct mm_struct *mm;
 	struct mmu_gather tlb;
+	unsigned long tlb_start = start;
+	unsigned long tlb_end = end;
+
+	/*
+	 * If shared PMDs were possibly used within this vma range, adjust
+	 * start/end for worst case tlb flushing.
+	 * Note that we can not be sure if PMDs are shared until we try to
+	 * unmap pages.  However, we want to make sure TLB flushing covers
+	 * the largest possible range.
+	 */
+	(void)huge_pmd_sharing_possible(vma, &tlb_start, &tlb_end);
 
 	mm = vma->vm_mm;
 
-	tlb_gather_mmu(&tlb, mm, start, end);
+	tlb_gather_mmu(&tlb, mm, tlb_start, tlb_end);
 	__unmap_hugepage_range(&tlb, vma, start, end, ref_page);
-	tlb_finish_mmu(&tlb, start, end);
+	tlb_finish_mmu(&tlb, tlb_start, tlb_end);
 }
 
 /*
@@ -4309,11 +4329,21 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	pte_t pte;
 	struct hstate *h = hstate_vma(vma);
 	unsigned long pages = 0;
+	unsigned long f_start = start;
+	unsigned long f_end = end;
+	bool shared_pmd = false;
+
+	/*
+	 * In the case of shared PMDs, the area to flush could be beyond
+	 * start/end.  Set f_start/f_end to cover the maximum possible
+	 * range if PMD sharing is possible.
+	 */
+	(void)huge_pmd_sharing_possible(vma, &f_start, &f_end);
 
 	BUG_ON(address >= end);
-	flush_cache_range(vma, address, end);
+	flush_cache_range(vma, f_start, f_end);
 
-	mmu_notifier_invalidate_range_start(mm, start, end);
+	mmu_notifier_invalidate_range_start(mm, f_start, f_end);
 	i_mmap_lock_write(vma->vm_file->f_mapping);
 	for (; address < end; address += huge_page_size(h)) {
 		spinlock_t *ptl;
@@ -4324,6 +4354,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 		if (huge_pmd_unshare(mm, &address, ptep)) {
 			pages++;
 			spin_unlock(ptl);
+			shared_pmd = true;
 			continue;
 		}
 		pte = huge_ptep_get(ptep);
@@ -4359,9 +4390,13 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare
 	 * may have cleared our pud entry and done put_page on the page table:
 	 * once we release i_mmap_rwsem, another task can do the final put_page
-	 * and that page table be reused and filled with junk.
+	 * and that page table be reused and filled with junk.  If we actually
+	 * did unshare a page of pmds, flush the range corresponding to the pud.
 	 */
-	flush_hugetlb_tlb_range(vma, start, end);
+	if (shared_pmd)
+		flush_hugetlb_tlb_range(vma, f_start, f_end);
+	else
+		flush_hugetlb_tlb_range(vma, start, end);
 	/*
 	 * No need to call mmu_notifier_invalidate_range() we are downgrading
 	 * page table protection not changing it to point to a new page.
@@ -4369,7 +4404,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * See Documentation/vm/mmu_notifier.rst
 	 */
 	i_mmap_unlock_write(vma->vm_file->f_mapping);
-	mmu_notifier_invalidate_range_end(mm, start, end);
+	mmu_notifier_invalidate_range_end(mm, f_start, f_end);
 
 	return pages << h->order;
 }
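
Editor's note: for readers without patch 1/2 of this series at hand, the
"worst case" range used above is simply the enclosing PUD_SIZE-aligned
region around [start, end): huge_pmd_unshare() works by clearing a PUD
entry, so a successful unshare can drop translations for a full PUD's
worth of address space.  The stand-alone sketch below shows only that
range arithmetic.  The PUD_SIZE value (1 GB, typical for x86_64), the
helper name, and the main() harness are illustrative assumptions, not
the kernel's huge_pmd_sharing_possible() implementation.

/* Minimal userspace sketch of the worst-case flush-range expansion. */
#include <stdio.h>

#define PUD_SIZE	(1UL << 30)		/* assumed 1 GB, as on x86_64 */
#define PUD_MASK	(~(PUD_SIZE - 1))

/* Expand [*start, *end) to the enclosing PUD_SIZE-aligned range. */
static void expand_to_pud_range(unsigned long *start, unsigned long *end)
{
	*start &= PUD_MASK;				/* round start down */
	*end = (*end + PUD_SIZE - 1) & PUD_MASK;	/* round end up */
}

int main(void)
{
	unsigned long start = 0x40200000UL;	/* hypothetical hugetlb range */
	unsigned long end   = 0x40a00000UL;

	expand_to_pud_range(&start, &end);
	printf("worst-case flush range: 0x%lx - 0x%lx\n", start, end);
	return 0;
}

The in-kernel helper presumably widens the range only when the vma's
size and alignment make PMD sharing possible at all, which is why the
callers above treat the adjusted start/end purely as a worst case for
tlb flushes and mmu notifiers.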