From patchwork Thu Nov 7 21:18:09 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 11233785
From: Waiman Long
To: Mike Kravetz, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Davidlohr Bueso, Peter
 Zijlstra, Ingo Molnar, Will Deacon, Matthew Wilcox, Waiman Long
Subject: [PATCH v2] hugetlbfs: Take read_lock on i_mmap for PMD sharing
Date: Thu, 7 Nov 2019 16:18:09 -0500
Message-Id: <20191107211809.9539-1-longman@redhat.com>

A customer running an application that uses a large amount of static
hugepages (~500-1500GB) on large SMP systems (up to 16 sockets) is
experiencing random multisecond delays. These delays were caused by the
long time it took to scan the VMA interval tree with mmap_sem held.

The sharing of huge PMD does not require changes to the i_mmap at all.
Therefore, we can just take the read lock and let other threads
searching for the right VMA share it in parallel. Once the right VMA is
found, either the PMD lock (2M huge page for x86-64) or the
mm->page_table_lock will be acquired to perform the actual PMD sharing.

Lock contention, if present, will happen in the spinlock. That is much
better than contention in the rwsem, where the time needed to scan the
interval tree is indeterminate.

With this patch applied, the customer is seeing a significant
performance improvement over the unpatched kernel.

Suggested-by: Mike Kravetz
Reviewed-by: Mike Kravetz
Signed-off-by: Waiman Long
---
 mm/hugetlb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b45a95363a84..f78891f92765 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4842,7 +4842,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 	if (!vma_shareable(vma, addr))
 		return (pte_t *)pmd_alloc(mm, pud, addr);
 
-	i_mmap_lock_write(mapping);
+	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
 			continue;
@@ -4872,7 +4872,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 	spin_unlock(ptl);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
-	i_mmap_unlock_write(mapping);
+	i_mmap_unlock_read(mapping);
 	return pte;
 }
 