From patchwork Tue Aug 31 16:18:00 2021
X-Patchwork-Submitter: SeongJae Park
X-Patchwork-Id: 12467539
From: SeongJae Park
To: akpm@linux-foundation.org
Cc: david@redhat.com, markubo@amazon.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, SeongJae Park
Subject: [PATCH v2] mm/damon/vaddr: Safely walk page table
Date: Tue, 31 Aug 2021 16:18:00 +0000
Message-Id: <20210831161800.29419-1-sj38.park@gmail.com>
X-Mailer: git-send-email 2.17.1

From: SeongJae Park

Commit d7f647622761 ("mm/damon: implement primitives for the virtual
memory address spaces") of linux-mm[1] tries to find the PTE or PMD for
an arbitrary virtual address using 'follow_invalidate_pte()' without
proper locking[2].  This commit fixes the issue by using another page
table walk function intended for more general use cases
('walk_page_range()') under proper locking (holding the mmap read
lock).
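For reference, this is the general pattern the fix follows (a minimal
sketch, not part of the patch; the 'example_*' names are illustrative):

static int example_pmd_entry(pmd_t *pmd, unsigned long addr,
		unsigned long next, struct mm_walk *walk)
{
	/*
	 * Runs for each PMD in the walked range.  Take the proper
	 * page table lock (e.g., pmd_lock() or pte_offset_map_lock())
	 * before accessing the entry, as the real callbacks below do.
	 */
	return 0;
}

static const struct mm_walk_ops example_ops = {
	.pmd_entry = example_pmd_entry,
};

	/* caller side: walk_page_range() requires the mmap lock */
	mmap_read_lock(mm);
	walk_page_range(mm, addr, addr + 1, &example_ops, NULL);
	mmap_read_unlock(mm);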
[1] https://github.com/hnaz/linux-mm/commit/d7f647622761
[2] https://lore.kernel.org/linux-mm/3b094493-9c1e-6024-bfd5-7eca66399b7e@redhat.com

Fixes: d7f647622761 ("mm/damon: implement primitives for the virtual memory address spaces")
Reported-by: David Hildenbrand
Signed-off-by: SeongJae Park
---
Changes from v1
(https://lore.kernel.org/linux-mm/20210827150400.6305-1-sj38.park@gmail.com/)
- Hold only mmap read lock (David Hildenbrand)
- Access the PTE/PMD from the walk_page_range() callbacks (David Hildenbrand)

 mm/damon/vaddr.c | 136 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 97 insertions(+), 39 deletions(-)

diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 230db7413278..58c1fb2aafa9 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -8,10 +8,12 @@
 #define pr_fmt(fmt) "damon-va: " fmt
 
 #include <linux/damon.h>
+#include <linux/highmem.h>
 #include <linux/hugetlb.h>
 #include <linux/mm.h>
 #include <linux/mmu_notifier.h>
 #include <linux/page_idle.h>
+#include <linux/pagewalk.h>
 #include <linux/random.h>
 #include <linux/sched/mm.h>
 #include <linux/slab.h>
@@ -446,22 +448,42 @@ static void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm,
 #endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
 }
 
-static void damon_va_mkold(struct mm_struct *mm, unsigned long addr)
+static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
 {
-	pte_t *pte = NULL;
-	pmd_t *pmd = NULL;
+	pte_t *pte;
 	spinlock_t *ptl;
 
-	if (follow_invalidate_pte(mm, addr, NULL, &pte, &pmd, &ptl))
-		return;
-
-	if (pte) {
-		damon_ptep_mkold(pte, mm, addr);
-		pte_unmap_unlock(pte, ptl);
-	} else {
-		damon_pmdp_mkold(pmd, mm, addr);
+	if (pmd_huge(*pmd)) {
+		ptl = pmd_lock(walk->mm, pmd);
+		if (pmd_huge(*pmd)) {
+			damon_pmdp_mkold(pmd, walk->mm, addr);
+			spin_unlock(ptl);
+			return 0;
+		}
 		spin_unlock(ptl);
 	}
+
+	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
+		return 0;
+	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	if (!pte_present(*pte))
+		goto out;
+	damon_ptep_mkold(pte, walk->mm, addr);
+out:
+	pte_unmap_unlock(pte, ptl);
+	return 0;
+}
+
+static struct mm_walk_ops damon_mkold_ops = {
+	.pmd_entry = damon_mkold_pmd_entry,
+};
+
+static void damon_va_mkold(struct mm_struct *mm, unsigned long addr)
+{
+	mmap_read_lock(mm);
+	walk_page_range(mm, addr, addr + 1, &damon_mkold_ops, NULL);
+	mmap_read_unlock(mm);
 }
 
 /*
@@ -492,43 +514,79 @@ void damon_va_prepare_access_checks(struct damon_ctx *ctx)
 	}
 }
 
-static bool damon_va_young(struct mm_struct *mm, unsigned long addr,
-			unsigned long *page_sz)
+struct damon_young_walk_private {
+	unsigned long *page_sz;
+	bool young;
+};
+
+static int damon_young_pmd_entry(pmd_t *pmd, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
 {
-	pte_t *pte = NULL;
-	pmd_t *pmd = NULL;
+	pte_t *pte;
 	spinlock_t *ptl;
 	struct page *page;
-	bool young = false;
-
-	if (follow_invalidate_pte(mm, addr, NULL, &pte, &pmd, &ptl))
-		return false;
-
-	*page_sz = PAGE_SIZE;
-	if (pte) {
-		page = damon_get_page(pte_pfn(*pte));
-		if (page && (pte_young(*pte) || !page_is_idle(page) ||
-					mmu_notifier_test_young(mm, addr)))
-			young = true;
-		if (page)
-			put_page(page);
-		pte_unmap_unlock(pte, ptl);
-		return young;
-	}
+	struct damon_young_walk_private *priv = walk->private;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	page = damon_get_page(pmd_pfn(*pmd));
-	if (page && (pmd_young(*pmd) || !page_is_idle(page) ||
-			mmu_notifier_test_young(mm, addr)))
-		young = true;
-	if (page)
+	if (pmd_huge(*pmd)) {
+		ptl = pmd_lock(walk->mm, pmd);
+		if (!pmd_huge(*pmd)) {
+			spin_unlock(ptl);
+			goto regular_page;
+		}
+		page = damon_get_page(pmd_pfn(*pmd));
+		if (!page)
+			goto huge_out;
+		if (pmd_young(*pmd) || !page_is_idle(page) ||
+					mmu_notifier_test_young(walk->mm,
+						addr)) {
+			*priv->page_sz = ((1UL) << HPAGE_PMD_SHIFT);
+			priv->young = true;
+		}
 		put_page(page);
+huge_out:
+		spin_unlock(ptl);
+		return 0;
+	}
 
-	spin_unlock(ptl);
-	*page_sz = ((1UL) << HPAGE_PMD_SHIFT);
+regular_page:
 #endif	/* CONFIG_TRANSPARENT_HUGEPAGE */
 
-	return young;
+	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
+		return -EINVAL;
+	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	if (!pte_present(*pte))
+		goto out;
+	page = damon_get_page(pte_pfn(*pte));
+	if (!page)
+		goto out;
+	if (pte_young(*pte) || !page_is_idle(page) ||
+			mmu_notifier_test_young(walk->mm, addr)) {
+		*priv->page_sz = PAGE_SIZE;
+		priv->young = true;
+	}
+	put_page(page);
+out:
+	pte_unmap_unlock(pte, ptl);
+	return 0;
+}
+
+static struct mm_walk_ops damon_young_ops = {
+	.pmd_entry = damon_young_pmd_entry,
+};
+
+static bool damon_va_young(struct mm_struct *mm, unsigned long addr,
+		unsigned long *page_sz)
+{
+	struct damon_young_walk_private arg = {
+		.page_sz = page_sz,
+		.young = false,
+	};
+
+	mmap_read_lock(mm);
+	walk_page_range(mm, addr, addr + 1, &damon_young_ops, &arg);
+	mmap_read_unlock(mm);
+	return arg.young;
 }
 
 /*