From patchwork Thu May 17 11:06:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Laurent Dufour X-Patchwork-Id: 10406411 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id D8132602C2 for ; Thu, 17 May 2018 11:07:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C6DCC28A07 for ; Thu, 17 May 2018 11:07:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BA01D28A5B; Thu, 17 May 2018 11:07:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9C6F128A6B for ; Thu, 17 May 2018 11:07:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 16E996B04BB; Thu, 17 May 2018 07:07:33 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 0F8A36B04BC; Thu, 17 May 2018 07:07:33 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EB56F6B04BD; Thu, 17 May 2018 07:07:32 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-wr0-f199.google.com (mail-wr0-f199.google.com [209.85.128.199]) by kanga.kvack.org (Postfix) with ESMTP id 86BBD6B04BC for ; Thu, 17 May 2018 07:07:32 -0400 (EDT) Received: by mail-wr0-f199.google.com with SMTP id p7-v6so2882536wrj.4 for ; Thu, 17 May 2018 04:07:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:in-reply-to:references:message-id; bh=WoVPsw1Fzjfj8LP5S8ATGRMlAKrcMCchsHhiX/h6DHM=; b=ZMvrPbHkhnWQfL/NObMSm2bJEHEAS9PFoiWsTtisr7DvH8TQjZ5UL7MxTaVlAOaVIl gxR4mqmVtFPMLKAbH7AhUJD5cw+rMPJy+6nwzp9iY2QB5SV2HoblQxDkv350bCaH9k8R rzq8GF+jMMsELzyy77SoMshzDisShCfSjRcW21sGcYL1xAuU9DaYcfN8vJ35yInmdk3d eAaN7n6yCOe/lMOQC5fDAZaRiuuQz8mcNixn8edNYCmc1RmfAjpj2Gf2DJr5bPB6Fiuk CN3t4ZahWwevWTNi3zUHpLb+PeT0pJLvki5BwcBM30EX7MjgSLz+rshzw5CJRRKrm3Go gyGw== X-Original-Authentication-Results: mx.google.com; spf=neutral (google.com: 148.163.158.5 is neither permitted nor denied by best guess record for domain of ldufour@linux.vnet.ibm.com) smtp.mailfrom=ldufour@linux.vnet.ibm.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com X-Gm-Message-State: ALKqPwdk7pFyOWdZIRiO9ZRdW8BkfvYputYIUvHJSoT0J0pofsOxVL6P b6ndgGr2Ww1mlehUaO6h+FHA6zQpfwJdC1iFfO+7a8P4hJiFBZen0f3qySdAewDekQycDuQpaoX xZlrkKp0+hx8w275588oVjWqYwMnQ/RvXKn0Ok3P39L7nbRRGRwpUESARYPfRS/8= X-Received: by 2002:a50:f5cd:: with SMTP id x13-v6mr6283866edm.132.1526555252057; Thu, 17 May 2018 04:07:32 -0700 (PDT) X-Google-Smtp-Source: AB8JxZqlAbru9EuH/xJOeCUxrCHJv7da51bQpntra6V2fLETzVRzYwaACKOmzrW6x64j8KICdq5y X-Received: by 2002:a50:f5cd:: with SMTP id x13-v6mr6283787edm.132.1526555251048; Thu, 17 May 2018 04:07:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1526555251; cv=none; d=google.com; s=arc-20160816; b=HS68CeophXctPKTax4JxFo0MMVq1qgREAjPWCYndfm/fqIir5xEXRgH6OsvBu8xMYl 9Fpj+STN8Tt8wlNNbCYq5NcGD7RJW37RrqfCV1/S1ogAytOJsgIZQVhzFst/4P+HEReX jqmnmGKZ5siAH1Ak+UqMm/PvgBCsrzIOx0JJN8yzuYxQowsqfwaonLzPoeVkhTPSfprQ VULU4Uxl/ysNXuZ1Wg/vbtXvQsUMS3afKXP6Sjkynzjhf9H97phYEbqNOl+4KklDr7VM +CK/7XnW1DmmMk6C2d9unSw4hdVr99xO9PDqKmmpkWc8gg3U7zKFAYd1aUZ+ec4BmJUz lDqw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:references:in-reply-to:date:subject:cc:to:from :arc-authentication-results; bh=WoVPsw1Fzjfj8LP5S8ATGRMlAKrcMCchsHhiX/h6DHM=; b=pcsRD1mc6Zm08O0OJTrc+Zt63UBbF1tctVM+O045fX5+Iqotr5lK5NVvM8cKU/sWau ccMCRGh5LiuLK6XGAl9nWWA91UM6rFJrjHwx4MwR5pufeq6rAxBQIL5kuWtskeUT4tS0 DrNpiaW+/HKjIlKfVZi3+YwoBW7W4yctm3Ye138oWlCk7/CCp/9npAv261lHVcoiCD94 6SrNcUhHIHFzD+i9r96UvuDKKFWrTx6ujLm24Ro/iAlAbNkVjkr2Dyrwl8lePZpUXiKy tJjnRcml+VWyR/ZkZs+YQ9tRkkOZxa9I5FQAFHQG2mmSJJOUlxfidwnJfa2K2ckQ8GaC N/cg== ARC-Authentication-Results: i=1; mx.google.com; spf=neutral (google.com: 148.163.158.5 is neither permitted nor denied by best guess record for domain of ldufour@linux.vnet.ibm.com) smtp.mailfrom=ldufour@linux.vnet.ibm.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com. [148.163.158.5]) by mx.google.com with ESMTPS id f9-v6si3226001edm.12.2018.05.17.04.07.30 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 17 May 2018 04:07:31 -0700 (PDT) Received-SPF: neutral (google.com: 148.163.158.5 is neither permitted nor denied by best guess record for domain of ldufour@linux.vnet.ibm.com) client-ip=148.163.158.5; Authentication-Results: mx.google.com; spf=neutral (google.com: 148.163.158.5 is neither permitted nor denied by best guess record for domain of ldufour@linux.vnet.ibm.com) smtp.mailfrom=ldufour@linux.vnet.ibm.com; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w4HB4XiC179673 for ; Thu, 17 May 2018 07:07:29 -0400 Received: from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com [195.75.94.109]) by mx0b-001b2d01.pphosted.com with ESMTP id 2j157622t9-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 17 May 2018 07:07:29 -0400 Received: from localhost by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 17 May 2018 12:07:26 +0100 Received: from b06cxnps4075.portsmouth.uk.ibm.com (9.149.109.197) by e06smtp13.uk.ibm.com (192.168.101.143) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 17 May 2018 12:07:17 +0100 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w4HB7Hhj4653474; Thu, 17 May 2018 11:07:17 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CE6584C050; Thu, 17 May 2018 11:59:05 +0100 (BST) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8FF6E4C044; Thu, 17 May 2018 11:59:04 +0100 (BST) Received: from nimbus.lab.toulouse-stg.fr.ibm.com (unknown [9.101.4.33]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTP; Thu, 17 May 2018 11:59:04 +0100 (BST) From: Laurent Dufour To: akpm@linux-foundation.org, mhocko@kernel.org, peterz@infradead.org, kirill@shutemov.name, ak@linux.intel.com, dave@stgolabs.net, jack@suse.cz, Matthew Wilcox , khandual@linux.vnet.ibm.com, aneesh.kumar@linux.vnet.ibm.com, benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org, Thomas Gleixner , Ingo Molnar , hpa@zytor.com, Will Deacon , Sergey Senozhatsky , sergey.senozhatsky.work@gmail.com, Andrea Arcangeli , Alexei Starovoitov , kemi.wang@intel.com, Daniel Jordan , David Rientjes , Jerome Glisse , Ganesh Mahendran , Minchan Kim , Punit Agrawal , vinayak menon , Yang Shi Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, haren@linux.vnet.ibm.com, npiggin@gmail.com, bsingharora@gmail.com, paulmck@linux.vnet.ibm.com, Tim Chen , linuxppc-dev@lists.ozlabs.org, x86@kernel.org Subject: [PATCH v11 18/26] mm: protect mm_rb tree with a rwlock Date: Thu, 17 May 2018 13:06:25 +0200 X-Mailer: git-send-email 2.7.4 In-Reply-To: <1526555193-7242-1-git-send-email-ldufour@linux.vnet.ibm.com> References: <1526555193-7242-1-git-send-email-ldufour@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 x-cbid: 18051711-0012-0000-0000-000005D790F0 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18051711-0013-0000-0000-00001954BBDD Message-Id: <1526555193-7242-19-git-send-email-ldufour@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-05-17_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1805170100 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP This change is inspired by the Peter's proposal patch [1] which was protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in that particular case, and it is introducing major performance degradation due to excessive scheduling operations. To allow access to the mm_rb tree without grabbing the mmap_sem, this patch is protecting it access using a rwlock. As the mm_rb tree is a O(log n) search it is safe to protect it using such a lock. The VMA cache is not protected by the new rwlock and it should not be used without holding the mmap_sem. To allow the picked VMA structure to be used once the rwlock is released, a use count is added to the VMA structure. When the VMA is allocated it is set to 1. Each time the VMA is picked with the rwlock held its use count is incremented. Each time the VMA is released it is decremented. When the use count hits zero, this means that the VMA is no more used and should be freed. This patch is preparing for 2 kind of VMA access : - as usual, under the control of the mmap_sem, - without holding the mmap_sem for the speculative page fault handler. Access done under the control the mmap_sem doesn't require to grab the rwlock to protect read access to the mm_rb tree, but access in write must be done under the protection of the rwlock too. This affects inserting and removing of elements in the RB tree. The patch is introducing 2 new functions: - vma_get() to find a VMA based on an address by holding the new rwlock. - vma_put() to release the VMA when its no more used. These services are designed to be used when access are made to the RB tree without holding the mmap_sem. When a VMA is removed from the RB tree, its vma->vm_rb field is cleared and we rely on the WMB done when releasing the rwlock to serialize the write with the RMB done in a later patch to check for the VMA's validity. When free_vma is called, the file associated with the VMA is closed immediately, but the policy and the file structure remained in used until the VMA's use count reach 0, which may happens later when exiting an in progress speculative page fault. [1] https://patchwork.kernel.org/patch/5108281/ Cc: Peter Zijlstra (Intel) Cc: Matthew Wilcox Signed-off-by: Laurent Dufour --- include/linux/mm.h | 1 + include/linux/mm_types.h | 4 ++ kernel/fork.c | 3 ++ mm/init-mm.c | 3 ++ mm/internal.h | 6 +++ mm/mmap.c | 115 +++++++++++++++++++++++++++++++++++------------ 6 files changed, 104 insertions(+), 28 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index bcebec117d4d..05cbba70104b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1314,6 +1314,7 @@ static inline void INIT_VMA(struct vm_area_struct *vma) INIT_LIST_HEAD(&vma->anon_vma_chain); #ifdef CONFIG_SPECULATIVE_PAGE_FAULT seqcount_init(&vma->vm_sequence); + atomic_set(&vma->vm_ref_count, 1); #endif } diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index fb5962308183..b16ba02f7fd6 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -337,6 +337,7 @@ struct vm_area_struct { struct vm_userfaultfd_ctx vm_userfaultfd_ctx; #ifdef CONFIG_SPECULATIVE_PAGE_FAULT seqcount_t vm_sequence; + atomic_t vm_ref_count; /* see vma_get(), vma_put() */ #endif } __randomize_layout; @@ -355,6 +356,9 @@ struct kioctx_table; struct mm_struct { struct vm_area_struct *mmap; /* list of VMAs */ struct rb_root mm_rb; +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT + rwlock_t mm_rb_lock; +#endif u32 vmacache_seqnum; /* per-thread vmacache */ #ifdef CONFIG_MMU unsigned long (*get_unmapped_area) (struct file *filp, diff --git a/kernel/fork.c b/kernel/fork.c index 99198a02efe9..f1258c2ade09 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -907,6 +907,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, mm->mmap = NULL; mm->mm_rb = RB_ROOT; mm->vmacache_seqnum = 0; +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT + rwlock_init(&mm->mm_rb_lock); +#endif atomic_set(&mm->mm_users, 1); atomic_set(&mm->mm_count, 1); init_rwsem(&mm->mmap_sem); diff --git a/mm/init-mm.c b/mm/init-mm.c index f0179c9c04c2..228134f5a336 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -17,6 +17,9 @@ struct mm_struct init_mm = { .mm_rb = RB_ROOT, +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT + .mm_rb_lock = __RW_LOCK_UNLOCKED(init_mm.mm_rb_lock), +#endif .pgd = swapper_pg_dir, .mm_users = ATOMIC_INIT(2), .mm_count = ATOMIC_INIT(1), diff --git a/mm/internal.h b/mm/internal.h index 62d8c34e63d5..fb2667b20f0a 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -40,6 +40,12 @@ void page_writeback_init(void); int do_swap_page(struct vm_fault *vmf); +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT +extern struct vm_area_struct *get_vma(struct mm_struct *mm, + unsigned long addr); +extern void put_vma(struct vm_area_struct *vma); +#endif + void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *start_vma, unsigned long floor, unsigned long ceiling); diff --git a/mm/mmap.c b/mm/mmap.c index 2450860e3f8e..54d298a67047 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -169,6 +169,27 @@ void unlink_file_vma(struct vm_area_struct *vma) } } +static void __free_vma(struct vm_area_struct *vma) +{ + if (vma->vm_file) + fput(vma->vm_file); + mpol_put(vma_policy(vma)); + kmem_cache_free(vm_area_cachep, vma); +} + +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT +void put_vma(struct vm_area_struct *vma) +{ + if (atomic_dec_and_test(&vma->vm_ref_count)) + __free_vma(vma); +} +#else +static inline void put_vma(struct vm_area_struct *vma) +{ + __free_vma(vma); +} +#endif + /* * Close a vm structure and free it, returning the next. */ @@ -179,10 +200,7 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma) might_sleep(); if (vma->vm_ops && vma->vm_ops->close) vma->vm_ops->close(vma); - if (vma->vm_file) - fput(vma->vm_file); - mpol_put(vma_policy(vma)); - kmem_cache_free(vm_area_cachep, vma); + put_vma(vma); return next; } @@ -402,6 +420,14 @@ static void validate_mm(struct mm_struct *mm) #define validate_mm(mm) do { } while (0) #endif +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT +#define mm_rb_write_lock(mm) write_lock(&(mm)->mm_rb_lock) +#define mm_rb_write_unlock(mm) write_unlock(&(mm)->mm_rb_lock) +#else +#define mm_rb_write_lock(mm) do { } while (0) +#define mm_rb_write_unlock(mm) do { } while (0) +#endif /* CONFIG_SPECULATIVE_PAGE_FAULT */ + RB_DECLARE_CALLBACKS(static, vma_gap_callbacks, struct vm_area_struct, vm_rb, unsigned long, rb_subtree_gap, vma_compute_subtree_gap) @@ -420,26 +446,37 @@ static void vma_gap_update(struct vm_area_struct *vma) } static inline void vma_rb_insert(struct vm_area_struct *vma, - struct rb_root *root) + struct mm_struct *mm) { + struct rb_root *root = &mm->mm_rb; + /* All rb_subtree_gap values must be consistent prior to insertion */ validate_mm_rb(root, NULL); rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks); } -static void __vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root) +static void __vma_rb_erase(struct vm_area_struct *vma, struct mm_struct *mm) { + struct rb_root *root = &mm->mm_rb; /* * Note rb_erase_augmented is a fairly large inline function, * so make sure we instantiate it only once with our desired * augmented rbtree callbacks. */ + mm_rb_write_lock(mm); rb_erase_augmented(&vma->vm_rb, root, &vma_gap_callbacks); + mm_rb_write_unlock(mm); /* wmb */ + + /* + * Ensure the removal is complete before clearing the node. + * Matched by vma_has_changed()/handle_speculative_fault(). + */ + RB_CLEAR_NODE(&vma->vm_rb); } static __always_inline void vma_rb_erase_ignore(struct vm_area_struct *vma, - struct rb_root *root, + struct mm_struct *mm, struct vm_area_struct *ignore) { /* @@ -447,21 +484,21 @@ static __always_inline void vma_rb_erase_ignore(struct vm_area_struct *vma, * with the possible exception of the "next" vma being erased if * next->vm_start was reduced. */ - validate_mm_rb(root, ignore); + validate_mm_rb(&mm->mm_rb, ignore); - __vma_rb_erase(vma, root); + __vma_rb_erase(vma, mm); } static __always_inline void vma_rb_erase(struct vm_area_struct *vma, - struct rb_root *root) + struct mm_struct *mm) { /* * All rb_subtree_gap values must be consistent prior to erase, * with the possible exception of the vma being erased. */ - validate_mm_rb(root, vma); + validate_mm_rb(&mm->mm_rb, vma); - __vma_rb_erase(vma, root); + __vma_rb_erase(vma, mm); } /* @@ -576,10 +613,12 @@ void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma, * immediately update the gap to the correct value. Finally we * rebalance the rbtree after all augmented values have been set. */ + mm_rb_write_lock(mm); rb_link_node(&vma->vm_rb, rb_parent, rb_link); vma->rb_subtree_gap = 0; vma_gap_update(vma); - vma_rb_insert(vma, &mm->mm_rb); + vma_rb_insert(vma, mm); + mm_rb_write_unlock(mm); } static void __vma_link_file(struct vm_area_struct *vma) @@ -655,7 +694,7 @@ static __always_inline void __vma_unlink_common(struct mm_struct *mm, { struct vm_area_struct *next; - vma_rb_erase_ignore(vma, &mm->mm_rb, ignore); + vma_rb_erase_ignore(vma, mm, ignore); next = vma->vm_next; if (has_prev) prev->vm_next = next; @@ -932,16 +971,13 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned long start, } if (remove_next) { - if (file) { + if (file) uprobe_munmap(next, next->vm_start, next->vm_end); - fput(file); - } if (next->anon_vma) anon_vma_merge(vma, next); mm->map_count--; - mpol_put(vma_policy(next)); vm_raw_write_end(next); - kmem_cache_free(vm_area_cachep, next); + put_vma(next); /* * In mprotect's case 6 (see comments on vma_merge), * we must remove another next too. It would clutter @@ -2199,15 +2235,11 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, EXPORT_SYMBOL(get_unmapped_area); /* Look up the first VMA which satisfies addr < vm_end, NULL if none. */ -struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr) +static struct vm_area_struct *__find_vma(struct mm_struct *mm, + unsigned long addr) { struct rb_node *rb_node; - struct vm_area_struct *vma; - - /* Check the cache first. */ - vma = vmacache_find(mm, addr); - if (likely(vma)) - return vma; + struct vm_area_struct *vma = NULL; rb_node = mm->mm_rb.rb_node; @@ -2225,13 +2257,40 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr) rb_node = rb_node->rb_right; } + return vma; +} + +struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr) +{ + struct vm_area_struct *vma; + + /* Check the cache first. */ + vma = vmacache_find(mm, addr); + if (likely(vma)) + return vma; + + vma = __find_vma(mm, addr); if (vma) vmacache_update(addr, vma); return vma; } - EXPORT_SYMBOL(find_vma); +#ifdef CONFIG_SPECULATIVE_PAGE_FAULT +struct vm_area_struct *get_vma(struct mm_struct *mm, unsigned long addr) +{ + struct vm_area_struct *vma = NULL; + + read_lock(&mm->mm_rb_lock); + vma = __find_vma(mm, addr); + if (vma) + atomic_inc(&vma->vm_ref_count); + read_unlock(&mm->mm_rb_lock); + + return vma; +} +#endif + /* * Same as find_vma, but also return a pointer to the previous VMA in *pprev. */ @@ -2599,7 +2658,7 @@ detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma, insertion_point = (prev ? &prev->vm_next : &mm->mmap); vma->vm_prev = NULL; do { - vma_rb_erase(vma, &mm->mm_rb); + vma_rb_erase(vma, mm); mm->map_count--; tail_vma = vma; vma = vma->vm_next;