From patchwork Sat Dec 22 22:30:12 2018
X-Patchwork-Submitter: Mike Kravetz
X-Patchwork-Id: 10741525
From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Michal Hocko, Hugh Dickins, Naoya Horiguchi, "Aneesh Kumar K.V",
    Andrea Arcangeli, "Kirill A. Shutemov", Davidlohr Bueso,
    Prakash Sangappa, Andrew Morton, Mike Kravetz, stable@vger.kernel.org
Subject: [PATCH v3 1/2] hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
Date: Sat, 22 Dec 2018 14:30:12 -0800
Message-Id: <20181222223013.22193-2-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20181222223013.22193-1-mike.kravetz@oracle.com>
References: <20181222223013.22193-1-mike.kravetz@oracle.com>

While looking at BUGs associated with invalid huge page map counts, it
was discovered that a huge pte pointer could become 'invalid' and point
to another task's page table.  Consider the following:

A task takes a page fault on a shared hugetlbfs file and calls
huge_pte_alloc to get a ptep.  Suppose the returned ptep points to a
shared pmd.

Now, another task truncates the hugetlbfs file.  As part of truncation,
it unmaps everyone who has the file mapped.  If the range being
truncated is covered by a shared pmd, huge_pmd_unshare will be called.
For all but the last user of the shared pmd, huge_pmd_unshare will
clear the pud pointing to the pmd.  If the task in the middle of the
page fault is not the last user, the ptep returned by huge_pte_alloc
now points to another task's page table or worse.  This leads to bad
things such as incorrect page map/reference counts or invalid memory
references.

To fix, expand the use of i_mmap_rwsem as follows (the fault-side
pattern is sketched after this list):
- i_mmap_rwsem is held in read mode whenever huge_pmd_share is called.
  huge_pmd_share is only called via huge_pte_alloc, so callers of
  huge_pte_alloc take i_mmap_rwsem before calling.  In addition, callers
  of huge_pte_alloc continue to hold the semaphore until finished with
  the ptep.
- i_mmap_rwsem is held in write mode whenever huge_pmd_unshare is
  called.
Cc: <stable@vger.kernel.org>
Fixes: 39dde65c9940 ("shared page table for hugetlb page")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb.c        | 67 ++++++++++++++++++++++++++++++++++-----------
 mm/memory-failure.c | 14 +++++++++-
 mm/migrate.c        | 13 ++++++++-
 mm/rmap.c           |  4 +++
 mm/userfaultfd.c    | 11 ++++++--
 5 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 309fb8c969af..2a3162030167 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3239,6 +3239,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	int cow;
 	struct hstate *h = hstate_vma(vma);
 	unsigned long sz = huge_page_size(h);
+	struct address_space *mapping = vma->vm_file->f_mapping;
 	unsigned long mmun_start;	/* For mmu_notifiers */
 	unsigned long mmun_end;		/* For mmu_notifiers */
 	int ret = 0;
@@ -3247,14 +3248,25 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	mmun_start = vma->vm_start;
 	mmun_end = vma->vm_end;
-	if (cow)
+	if (cow) {
 		mmu_notifier_invalidate_range_start(src, mmun_start, mmun_end);
+	} else {
+		/*
+		 * For shared mappings i_mmap_rwsem must be held to call
+		 * huge_pte_alloc, otherwise the returned ptep could go
+		 * away if part of a shared pmd and another thread calls
+		 * huge_pmd_unshare.
+		 */
+		i_mmap_lock_read(mapping);
+	}

 	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
 		spinlock_t *src_ptl, *dst_ptl;
+
 		src_pte = huge_pte_offset(src, addr, sz);
 		if (!src_pte)
 			continue;
+
 		dst_pte = huge_pte_alloc(dst, addr, sz);
 		if (!dst_pte) {
 			ret = -ENOMEM;
@@ -3325,6 +3337,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	if (cow)
 		mmu_notifier_invalidate_range_end(src, mmun_start, mmun_end);
+	else
+		i_mmap_unlock_read(mapping);

 	return ret;
 }
@@ -3772,14 +3786,18 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			};

 			/*
-			 * hugetlb_fault_mutex must be dropped before
-			 * handling userfault. Reacquire after handling
-			 * fault to make calling code simpler.
+			 * hugetlb_fault_mutex and i_mmap_rwsem must be
+			 * dropped before handling userfault.  Reacquire
+			 * after handling fault to make calling code simpler.
 			 */
 			hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping,
 							idx, haddr);
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			i_mmap_unlock_read(mapping);
+
 			ret = handle_userfault(&vmf, VM_UFFD_MISSING);
+
+			i_mmap_lock_read(mapping);
 			mutex_lock(&hugetlb_fault_mutex_table[hash]);
 			goto out;
 		}
@@ -3927,6 +3945,11 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (ptep) {
+		/*
+		 * Since we hold no locks, ptep could be stale.  That is
+		 * OK as we are only making decisions based on content and
+		 * not actually modifying content here.
+		 */
 		entry = huge_ptep_get(ptep);
 		if (unlikely(is_hugetlb_entry_migration(entry))) {
 			migration_entry_wait_huge(vma, mm, ptep);
@@ -3934,20 +3957,31 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 			return VM_FAULT_HWPOISON_LARGE |
 				VM_FAULT_SET_HINDEX(hstate_index(h));
-	} else {
-		ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
-		if (!ptep)
-			return VM_FAULT_OOM;
 	}

+	/*
+	 * Acquire i_mmap_rwsem before calling huge_pte_alloc and hold
+	 * until finished with ptep.  This prevents huge_pmd_unshare from
+	 * being called elsewhere and making the ptep no longer valid.
+	 *
+	 * ptep could have already been assigned via huge_pte_offset.  That
+	 * is OK, as huge_pte_alloc will return the same value unless
+	 * something changed.
+	 */
 	mapping = vma->vm_file->f_mapping;
-	idx = vma_hugecache_offset(h, vma, haddr);
+	i_mmap_lock_read(mapping);
+	ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
+	if (!ptep) {
+		i_mmap_unlock_read(mapping);
+		return VM_FAULT_OOM;
+	}

 	/*
 	 * Serialize hugepage allocation and instantiation, so that we don't
 	 * get spurious allocation failures if two CPUs race to instantiate
 	 * the same page in the page cache.
 	 */
+	idx = vma_hugecache_offset(h, vma, haddr);
 	hash = hugetlb_fault_mutex_hash(h, mm, vma, mapping, idx, haddr);
 	mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -4035,6 +4069,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 out_mutex:
 	mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+	i_mmap_unlock_read(mapping);
 	/*
 	 * Generally it's safe to hold refcount during waiting page lock. But
 	 * here we just wait to defer the next page fault to avoid busy loop and
@@ -4639,10 +4674,12 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
  * Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
  * and returns the corresponding pte. While this is not necessary for the
  * !shared pmd case because we can allocate the pmd later as well, it makes the
- * code much cleaner. pmd allocation is essential for the shared case because
- * pud has to be populated inside the same i_mmap_rwsem section - otherwise
- * racing tasks could either miss the sharing (see huge_pte_offset) or select a
- * bad pmd for sharing.
+ * code much cleaner.
+ *
+ * This routine must be called with i_mmap_rwsem held in at least read mode.
+ * For hugetlbfs, this prevents removal of any page table entries associated
+ * with the address space.  This is important as we are setting up sharing
+ * based on existing page table entries (mappings).
  */
 pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 {
@@ -4659,7 +4696,6 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 	if (!vma_shareable(vma, addr))
 		return (pte_t *)pmd_alloc(mm, pud, addr);

-	i_mmap_lock_write(mapping);
 	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
 		if (svma == vma)
 			continue;
@@ -4689,7 +4725,6 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 	spin_unlock(ptl);
 out:
 	pte = (pte_t *)pmd_alloc(mm, pud, addr);
-	i_mmap_unlock_write(mapping);
 	return pte;
 }

@@ -4700,7 +4735,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
  * indicated by page_count > 1, unmap is achieved by clearing pud and
  * decrementing the ref count. If count == 1, the pte page is not shared.
  *
- * called with page table lock held.
+ * Called with page table lock held and i_mmap_rwsem held in write mode.
  *
  * returns: 1 successfully unmapped a shared pte page
  *	    0 the underlying pte page is not shared, or it is the last user
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 0cd3de3550f0..b992d1295578 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1028,7 +1028,19 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	if (kill)
 		collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);

-	unmap_success = try_to_unmap(hpage, ttu);
+	if (!PageHuge(hpage)) {
+		unmap_success = try_to_unmap(hpage, ttu);
+	} else {
+		/*
+		 * For hugetlb pages, try_to_unmap could potentially call
+		 * huge_pmd_unshare.  Because of this, take semaphore in
+		 * write mode here and set TTU_RMAP_LOCKED to indicate we
+		 * have taken the lock at this higher level.
+		 */
+		i_mmap_lock_write(mapping);
+		unmap_success = try_to_unmap(hpage, ttu|TTU_RMAP_LOCKED);
+		i_mmap_unlock_write(mapping);
+	}
 	if (!unmap_success)
 		pr_err("Memory failure: %#lx: failed to unmap page (mapcount=%d)\n",
 		       pfn, page_mapcount(hpage));
diff --git a/mm/migrate.c b/mm/migrate.c
index 84381b55b2bd..725edaef238a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1307,8 +1307,19 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 		goto put_anon;

 	if (page_mapped(hpage)) {
+		struct address_space *mapping = page_mapping(hpage);
+
+		/*
+		 * try_to_unmap could potentially call huge_pmd_unshare.
+		 * Because of this, take semaphore in write mode here and
+		 * set TTU_RMAP_LOCKED to let lower levels know we have
+		 * taken the lock.
+		 */
+		i_mmap_lock_write(mapping);
 		try_to_unmap(hpage,
-			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+			TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
+			TTU_RMAP_LOCKED);
+		i_mmap_unlock_write(mapping);
 		page_was_mapped = 1;
 	}

diff --git a/mm/rmap.c b/mm/rmap.c
index 85b7f9423352..c566bd552535 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -25,6 +25,7 @@
 *       page->flags PG_locked (lock_page)
 *         hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share)
 *           mapping->i_mmap_rwsem
+*             hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
 *             anon_vma->rwsem
 *               mm->page_table_lock or pte_lock
 *                 zone_lru_lock (in mark_page_accessed, isolate_lru_page)
@@ -1374,6 +1375,9 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		/*
 		 * If sharing is possible, start and end will be adjusted
 		 * accordingly.
+		 *
+		 * If called for a huge page, caller must hold i_mmap_rwsem
+		 * in write mode as it is possible to call huge_pmd_unshare.
 		 */
 		adjust_range_if_pmd_sharing_possible(vma, &start, &end);
 	}
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 458acda96f20..48368589f519 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -267,10 +267,14 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		VM_BUG_ON(dst_addr & ~huge_page_mask(h));

 		/*
-		 * Serialize via hugetlb_fault_mutex
+		 * Serialize via i_mmap_rwsem and hugetlb_fault_mutex.
+		 * i_mmap_rwsem ensures the dst_pte remains valid even
+		 * in the case of shared pmds.  fault mutex prevents
+		 * races with other faulting threads.
 		 */
-		idx = linear_page_index(dst_vma, dst_addr);
 		mapping = dst_vma->vm_file->f_mapping;
+		i_mmap_lock_read(mapping);
+		idx = linear_page_index(dst_vma, dst_addr);
 		hash = hugetlb_fault_mutex_hash(h, dst_mm, dst_vma, mapping,
 						idx, dst_addr);
 		mutex_lock(&hugetlb_fault_mutex_table[hash]);
@@ -279,6 +283,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		dst_pte = huge_pte_alloc(dst_mm, dst_addr, huge_page_size(h));
 		if (!dst_pte) {
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			i_mmap_unlock_read(mapping);
 			goto out_unlock;
 		}

@@ -286,6 +291,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 		dst_pteval = huge_ptep_get(dst_pte);
 		if (!huge_pte_none(dst_pteval)) {
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			i_mmap_unlock_read(mapping);
 			goto out_unlock;
 		}

@@ -293,6 +299,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
 						dst_addr, src_addr, &page);

 		mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+		i_mmap_unlock_read(mapping);

 		vm_alloc_shared = vm_shared;

 		cond_resched();
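For completeness, the unmap-side callers changed above (memory-failure
and migration) follow a common pattern, sketched below for review
purposes only.  The sketch is not part of the patch; it assumes the
local variables of hwpoison_user_mappings() (hpage, ttu) and uses only
calls that appear in the diff.

	/*
	 * Sketch only, not part of the patch: try_to_unmap() may call
	 * huge_pmd_unshare() for hugetlb pages, so the caller takes
	 * i_mmap_rwsem in write mode around the call and passes
	 * TTU_RMAP_LOCKED so the rmap walk knows the lock is already
	 * held at this higher level.
	 */
	struct address_space *mapping = page_mapping(hpage);
	bool unmap_success;

	i_mmap_lock_write(mapping);
	unmap_success = try_to_unmap(hpage, ttu | TTU_RMAP_LOCKED);
	i_mmap_unlock_write(mapping);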