From patchwork Wed Sep 19 17:44:37 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Steve Capper
X-Patchwork-Id: 10606137
From: Steve Capper
To: linux-arm-kernel@lists.infradead.org
Cc: catalin.marinas@arm.com, Steve Capper, will.deacon@arm.com,
 zhang.lei@jp.fujitsu.com
Subject: [PATCH] arm64: hugetlb: Avoid unnecessary clearing in
 huge_ptep_set_access_flags
Date: Wed, 19 Sep 2018 18:44:37 +0100
Message-Id: <20180919174437.1866-1-steve.capper@arm.com>
X-Mailer: git-send-email 2.11.0

For contiguous hugetlb, huge_ptep_set_access_flags performs a
get_clear_flush (which then flushes the TLBs) even when no change of ptes
is necessary.

Unfortunately, this behaviour can lead to back-to-back page faults being
generated when running with multiple threads that access the same
contiguous huge page.

Thread 1                     |  Thread 2
-----------------------------+------------------------------
hugetlb_fault                |
huge_ptep_set_access_flags   |
  -> invalidate pte range    | hugetlb_fault
continue processing          | wait for hugetlb fault mutex
release mutex and return     | huge_ptep_set_access_flags
                             |   -> invalidate pte range
hugetlb_fault
...

This patch changes huge_ptep_set_access_flags s.t. we first read the
contiguous range of ptes (whilst preserving dirty information); the pte
range is only then invalidated where necessary and this prevents further
spurious page faults.

Fixes: d8bdcff28764 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Reported-by: Lei Zhang
Signed-off-by: Steve Capper
Tested-by: Lei Zhang
---
I was unable to test this on any hardware as I'm away from the office.

Can you please test this Lei Zhang?

Cheers,
-- 
Steve
---
 arch/arm64/mm/hugetlbpage.c | 36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 192b3ba07075..76d229eb6ba1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -131,6 +131,27 @@ static pte_t get_clear_flush(struct mm_struct *mm,
 	return orig_pte;
 }
 
+static pte_t get_contig_pte(pte_t *ptep, unsigned long pgsize,
+				unsigned long ncontig)
+{
+	unsigned long i;
+	pte_t orig_pte = huge_ptep_get(ptep);
+
+	for (i = 0; i < ncontig; i++, ptep++) {
+		pte_t pte = huge_ptep_get(ptep);
+
+		/*
+		 * If HW_AFDBM is enabled, then the HW could turn on
+		 * the dirty bit for any page in the set, so check
+		 * them all. All hugetlb entries are already young.
+		 */
+		if (pte_dirty(pte))
+			orig_pte = pte_mkdirty(orig_pte);
+	}
+
+	return orig_pte;
+}
+
 /*
  * Changing some bits of contiguous entries requires us to follow a
  * Break-Before-Make approach, breaking the whole contiguous set
@@ -324,7 +345,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 			       unsigned long addr, pte_t *ptep,
 			       pte_t pte, int dirty)
 {
-	int ncontig, i, changed = 0;
+	int ncontig, i;
 	size_t pgsize = 0;
 	unsigned long pfn = pte_pfn(pte), dpfn;
 	pgprot_t hugeprot;
@@ -336,9 +357,16 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	ncontig = find_num_contig(vma->vm_mm, addr, ptep, &pgsize);
 	dpfn = pgsize >> PAGE_SHIFT;
 
+	orig_pte = get_contig_pte(ptep, pgsize, ncontig);
+	if (pte_same(orig_pte, pte))
+		return 0;
+
+	/*
+	 * we need to get our orig_pte again as HW DBM may have happened since
+	 * above. get_clear_flush will ultimately cmpxchg with 0 to ensure
+	 * that we can't lose any dirty information.
+	 */
 	orig_pte = get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
-	if (!pte_same(orig_pte, pte))
-		changed = 1;
 
 	/* Make sure we don't lose the dirty state */
 	if (pte_dirty(orig_pte))
@@ -348,7 +376,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
 		set_pte_at(vma->vm_mm, addr, ptep, pfn_pte(pfn, hugeprot));
 
-	return changed;
+	return 1;
 }
 
 void huge_ptep_set_wrprotect(struct mm_struct *mm,
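
For anyone wanting to poke at the control flow outside the kernel, the
read-first, invalidate-only-if-needed ordering described in the commit
message can be sketched as a small user-space model. Everything here is
hypothetical and for illustration only: model_pte_t, MODEL_PTE_DIRTY and the
model_* helpers are not kernel APIs, the dirty bit position is arbitrary, per
entry pfns are ignored (every entry holds the same value), and there is no
concurrency or hardware dirty-bit management, so the re-read in the
clear/flush step cannot actually observe a change here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; not the kernel's pte_t or helpers. */
typedef uint64_t model_pte_t;
#define MODEL_PTE_DIRTY (1ULL << 55)	/* arbitrary bit chosen for the model */

/* Counts how often the range was cleared (models the TLB invalidation). */
static unsigned int model_invalidations;

/* Read the first entry and fold in the dirty bit from every entry. */
static model_pte_t model_get_contig_pte(const model_pte_t *ptep, size_t ncontig)
{
	model_pte_t orig;
	size_t i;

	assert(ncontig > 0);
	orig = ptep[0];
	for (i = 0; i < ncontig; i++)
		if (ptep[i] & MODEL_PTE_DIRTY)
			orig |= MODEL_PTE_DIRTY;
	return orig;
}

/* Stand-in for get_clear_flush: collect state, zero the range, count it. */
static model_pte_t model_get_clear_flush(model_pte_t *ptep, size_t ncontig)
{
	model_pte_t orig = model_get_contig_pte(ptep, ncontig);
	size_t i;

	for (i = 0; i < ncontig; i++)
		ptep[i] = 0;
	model_invalidations++;
	return orig;
}

/* Returns 0 with the range untouched if nothing would change, else 1. */
static int model_set_access_flags(model_pte_t *ptep, size_t ncontig,
				  model_pte_t pte)
{
	model_pte_t orig = model_get_contig_pte(ptep, ncontig);
	size_t i;

	if (orig == pte)
		return 0;	/* no clearing, no invalidation */

	/* Re-read while clearing so dirty state is carried over. */
	orig = model_get_clear_flush(ptep, ncontig);
	if (orig & MODEL_PTE_DIRTY)
		pte |= MODEL_PTE_DIRTY;

	for (i = 0; i < ncontig; i++)
		ptep[i] = pte;
	return 1;
}
```

Calling model_set_access_flags() twice in a row with the same value models
the two-thread case from the table in the commit message: the second call
finds the range already matching and returns 0 without touching
model_invalidations.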