From patchwork Sat Jun 4 00:39:50 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12869458
Subject: [PATCH v6 01/15] mm: khugepaged: don't carry huge page to the next
 loop for !CONFIG_NUMA
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Date: Fri, 3 Jun 2022 17:39:50 -0700
Message-Id: <20220604004004.954674-2-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

From: Yang Shi

For !CONFIG_NUMA builds, khugepaged has an optimization that reduces huge
page allocation calls by carrying an allocated, but failed-to-collapse,
huge page over to the next loop iteration. CONFIG_NUMA builds don't do
this, since the next iteration may try to collapse a huge page from a
different node, so carrying the page makes little sense there.

However, when NUMA=n, the huge page is allocated by
khugepaged_prealloc_page() before the address space is scanned, which
means a huge page may be allocated even though there is no suitable range
to collapse; the page is then simply freed once khugepaged has made
enough progress. As a result, a NUMA=n run can show five times as many
thp_collapse_alloc events as a NUMA=y run. Far from helping, the
optimization generates many pointless THP allocations and thereby defeats
its own purpose.

This could be fixed by carrying the huge page across scans, but that
would further complicate the code, and the page could end up being
carried indefinitely. Taking a step back, the optimization itself no
longer seems worth keeping:

* Few users build NUMA=n kernels nowadays, even when the kernel actually
  runs on a non-NUMA machine. Some small devices may run NUMA=n kernels,
  but they are unlikely to use THP.

* Since commit 44042b449872 ("mm/page_alloc: allow high-order pages to be
  stored on the per-cpu lists"), THPs can be cached on the per-cpu lists,
  which largely does the job this optimization was for.

Cc: Hugh Dickins
Cc: "Kirill A. Shutemov"
Signed-off-by: Zach O'Keefe
---
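For readability, here is a sketch of how the khugepaged_do_scan() loop
behaves with this patch applied, assembled from the hunks below (context
abbreviated; not the verbatim result):

static void khugepaged_do_scan(void)
{
	struct page *hpage = NULL;
	bool wait = true;
	...
	while (progress < pages) {
		/*
		 * Sleep after the first allocation failure; give up after
		 * a second back-to-back failure.
		 */
		if (alloc_fail_should_sleep(&hpage, &wait))
			break;

		cond_resched();
		...
	}
	/*
	 * No trailing put_page(): a huge page is no longer preallocated
	 * and carried from one loop iteration to the next.
	 */
}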
Shutemov" Signed-off-by: Zach O'Keefe --- mm/khugepaged.c | 100 ++++++++---------------------------------------- 1 file changed, 17 insertions(+), 83 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 476d79360101..cc3d6fb446d5 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -833,29 +833,30 @@ static int khugepaged_find_target_node(void) last_khugepaged_target_node = target_node; return target_node; } +#else +static int khugepaged_find_target_node(void) +{ + return 0; +} +#endif -static bool khugepaged_prealloc_page(struct page **hpage, bool *wait) +/* Sleep for the first alloc fail, break the loop for the second fail */ +static bool alloc_fail_should_sleep(struct page **hpage, bool *wait) { if (IS_ERR(*hpage)) { if (!*wait) - return false; + return true; *wait = false; *hpage = NULL; khugepaged_alloc_sleep(); - } else if (*hpage) { - put_page(*hpage); - *hpage = NULL; } - - return true; + return false; } static struct page * khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node) { - VM_BUG_ON_PAGE(*hpage, *hpage); - *hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER); if (unlikely(!*hpage)) { count_vm_event(THP_COLLAPSE_ALLOC_FAILED); @@ -867,74 +868,6 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node) count_vm_event(THP_COLLAPSE_ALLOC); return *hpage; } -#else -static int khugepaged_find_target_node(void) -{ - return 0; -} - -static inline struct page *alloc_khugepaged_hugepage(void) -{ - struct page *page; - - page = alloc_pages(alloc_hugepage_khugepaged_gfpmask(), - HPAGE_PMD_ORDER); - if (page) - prep_transhuge_page(page); - return page; -} - -static struct page *khugepaged_alloc_hugepage(bool *wait) -{ - struct page *hpage; - - do { - hpage = alloc_khugepaged_hugepage(); - if (!hpage) { - count_vm_event(THP_COLLAPSE_ALLOC_FAILED); - if (!*wait) - return NULL; - - *wait = false; - khugepaged_alloc_sleep(); - } else - count_vm_event(THP_COLLAPSE_ALLOC); - } while (unlikely(!hpage) && likely(khugepaged_enabled())); - - return hpage; -} - -static bool khugepaged_prealloc_page(struct page **hpage, bool *wait) -{ - /* - * If the hpage allocated earlier was briefly exposed in page cache - * before collapse_file() failed, it is possible that racing lookups - * have not yet completed, and would then be unpleasantly surprised by - * finding the hpage reused for the same mapping at a different offset. - * Just release the previous allocation if there is any danger of that. 
- */ - if (*hpage && page_count(*hpage) > 1) { - put_page(*hpage); - *hpage = NULL; - } - - if (!*hpage) - *hpage = khugepaged_alloc_hugepage(wait); - - if (unlikely(!*hpage)) - return false; - - return true; -} - -static struct page * -khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node) -{ - VM_BUG_ON(!*hpage); - - return *hpage; -} -#endif /* * If mmap_lock temporarily dropped, revalidate vma @@ -1188,8 +1121,10 @@ static void collapse_huge_page(struct mm_struct *mm, out_up_write: mmap_write_unlock(mm); out_nolock: - if (!IS_ERR_OR_NULL(*hpage)) + if (!IS_ERR_OR_NULL(*hpage)) { mem_cgroup_uncharge(page_folio(*hpage)); + put_page(*hpage); + } trace_mm_collapse_huge_page(mm, isolated, result); return; } @@ -1992,8 +1927,10 @@ static void collapse_file(struct mm_struct *mm, unlock_page(new_page); out: VM_BUG_ON(!list_empty(&pagelist)); - if (!IS_ERR_OR_NULL(*hpage)) + if (!IS_ERR_OR_NULL(*hpage)) { mem_cgroup_uncharge(page_folio(*hpage)); + put_page(*hpage); + } /* TODO: tracepoints */ } @@ -2243,7 +2180,7 @@ static void khugepaged_do_scan(void) lru_add_drain_all(); while (progress < pages) { - if (!khugepaged_prealloc_page(&hpage, &wait)) + if (alloc_fail_should_sleep(&hpage, &wait)) break; cond_resched(); @@ -2262,9 +2199,6 @@ static void khugepaged_do_scan(void) progress = pages; spin_unlock(&khugepaged_mm_lock); } - - if (!IS_ERR_OR_NULL(hpage)) - put_page(hpage); } static bool khugepaged_should_wakeup(void) From patchwork Sat Jun 4 00:39:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zach O'Keefe X-Patchwork-Id: 12869459 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3610BCCA47D for ; Sat, 4 Jun 2022 00:40:18 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7B8478D0005; Fri, 3 Jun 2022 20:40:17 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6F43E8D0001; Fri, 3 Jun 2022 20:40:17 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5B74E8D0005; Fri, 3 Jun 2022 20:40:17 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 46A0C8D0001 for ; Fri, 3 Jun 2022 20:40:17 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay12.hostedemail.com (Postfix) with ESMTP id 262E81210F7 for ; Sat, 4 Jun 2022 00:40:17 +0000 (UTC) X-FDA: 79538696874.19.F636766 Received: from mail-pf1-f202.google.com (mail-pf1-f202.google.com [209.85.210.202]) by imf22.hostedemail.com (Postfix) with ESMTP id 423BCC0017 for ; Sat, 4 Jun 2022 00:40:13 +0000 (UTC) Received: by mail-pf1-f202.google.com with SMTP id x18-20020a62fb12000000b0051bab667811so4685720pfm.5 for ; Fri, 03 Jun 2022 17:40:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=lqmTpORZ3MQn6/m+oz8fm9SRtE9DcdUbGqykxpzOvOY=; b=PiXCdwYe0Tt3l8cdC5KZZsLlRUc/t20KgAhhCTd78fAv+th3yrCrnEiG1qKUS0rqR3 psagYiv4PrDxghoApAWK610i5FAzwQXLrEkWpRsYLn+9UdjTJotG1EluNebbI0Hw8psP bAJRQrwc/FPIxg5snkTC10o54ItQ4c39UPl6Ka/Nw7bk4m+yhbjp7vAqBg95wA96pz4l 96VztdO9q5V+KTQrcVsR+QPXtkGt4DAyjyCL7xTymsSVqRceMsjsDXCC7Xt9+vYybhKQ 
From patchwork Sat Jun 4 00:39:51 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12869459
Subject: [PATCH v6 02/15] mm/khugepaged: record SCAN_PMD_MAPPED when
 scan_pmd() finds THP
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Date: Fri, 3 Jun 2022 17:39:51 -0700
Message-Id: <20220604004004.954674-3-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

When scanning an anon pmd to see if it's eligible for collapse, return
SCAN_PMD_MAPPED if the pmd already maps a THP. Note that SCAN_PMD_MAPPED
is different from SCAN_PAGE_COMPOUND used in the file-collapse path,
since the latter might identify pte-mapped compound pages.

This is required by MADV_COLLAPSE, which needs to know which
hugepage-aligned/sized regions are already pmd-mapped.

Signed-off-by: Zach O'Keefe
---
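A sketch of how an anon-scan caller distinguishes the new result codes
(hypothetical caller shown for illustration; the real call site is in the
khugepaged_scan_pmd() hunk below):

	pmd_t *pmd;

	switch (find_pmd_or_thp_or_none(mm, address, &pmd)) {
	case SCAN_SUCCEED:	/* pmd maps a normal page table */
		break;
	case SCAN_PMD_MAPPED:	/* range already backed by a huge pmd */
		return SCAN_PMD_MAPPED;	/* nothing to collapse here */
	case SCAN_PMD_NULL:	/* no pmd, or pmd not present */
	default:		/* SCAN_FAIL: pmd_bad() */
		goto out;
	}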
 include/trace/events/huge_memory.h |  1 +
 mm/internal.h                      |  1 +
 mm/khugepaged.c                    | 32 ++++++++++++++++++++++++++----
 mm/rmap.c                          | 15 ++++++++++++--
 4 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index d651f3437367..55392bf30a03 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -11,6 +11,7 @@
 	EM( SCAN_FAIL,			"failed")			\
 	EM( SCAN_SUCCEED,		"succeeded")			\
 	EM( SCAN_PMD_NULL,		"pmd_null")			\
+	EM( SCAN_PMD_MAPPED,		"page_pmd_mapped")		\
 	EM( SCAN_EXCEED_NONE_PTE,	"exceed_none_pte")		\
 	EM( SCAN_EXCEED_SWAP_PTE,	"exceed_swap_pte")		\
 	EM( SCAN_EXCEED_SHARED_PTE,	"exceed_shared_pte")		\
diff --git a/mm/internal.h b/mm/internal.h
index 6e14749ad1e5..f768c7fae668 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -188,6 +188,7 @@ extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason
 /*
  * in mm/rmap.c:
  */
+pmd_t *mm_find_pmd_raw(struct mm_struct *mm, unsigned long address);
 extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
 
 /*
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index cc3d6fb446d5..7a914ca19e96 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -28,6 +28,7 @@ enum scan_result {
 	SCAN_FAIL,
 	SCAN_SUCCEED,
 	SCAN_PMD_NULL,
+	SCAN_PMD_MAPPED,
 	SCAN_EXCEED_NONE_PTE,
 	SCAN_EXCEED_SWAP_PTE,
 	SCAN_EXCEED_SHARED_PTE,
@@ -901,6 +902,31 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	return 0;
 }
 
+static int find_pmd_or_thp_or_none(struct mm_struct *mm,
+				   unsigned long address,
+				   pmd_t **pmd)
+{
+	pmd_t pmde;
+
+	*pmd = mm_find_pmd_raw(mm, address);
+	if (!*pmd)
+		return SCAN_PMD_NULL;
+
+	pmde = pmd_read_atomic(*pmd);
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	/* See comments in pmd_none_or_trans_huge_or_clear_bad() */
+	barrier();
+#endif
+	if (!pmd_present(pmde))
+		return SCAN_PMD_NULL;
+	if (pmd_trans_huge(pmde))
+		return SCAN_PMD_MAPPED;
+	if (pmd_bad(pmde))
+		return SCAN_FAIL;
+	return SCAN_SUCCEED;
+}
+
 /*
  * Bring missing pages in from swap, to complete THP collapse.
  * Only done if khugepaged_scan_pmd believes it is worthwhile.
@@ -1146,11 +1172,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
-	pmd = mm_find_pmd(mm, address);
-	if (!pmd) {
-		result = SCAN_PMD_NULL;
+	result = find_pmd_or_thp_or_none(mm, address, &pmd);
+	if (result != SCAN_SUCCEED)
 		goto out;
-	}
 
 	memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load));
 	pte = pte_offset_map_lock(mm, pmd, address, &ptl);
diff --git a/mm/rmap.c b/mm/rmap.c
index 04fac1af870b..c9979c6ad7a1 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -767,13 +767,12 @@ unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
 	return vma_address(page, vma);
 }
 
-pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
+pmd_t *mm_find_pmd_raw(struct mm_struct *mm, unsigned long address)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd = NULL;
-	pmd_t pmde;
 
 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -788,6 +787,18 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 		goto out;
 
 	pmd = pmd_offset(pud, address);
+out:
+	return pmd;
+}
+
+pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
+{
+	pmd_t pmde;
+	pmd_t *pmd;
+
+	pmd = mm_find_pmd_raw(mm, address);
+	if (!pmd)
+		goto out;
 	/*
 	 * Some THP functions use the sequence pmdp_huge_clear_flush(), set_pmd_at()
	 * without holding anon_vma lock for write.  So when looking for a
	 * genuine pmde (in which to find pte), test present and !THP together.
From patchwork Sat Jun 4 00:39:52 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12869460
Subject: [PATCH v6 03/15] mm/khugepaged: add struct collapse_control
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Date: Fri, 3 Jun 2022 17:39:52 -0700
Message-Id: <20220604004004.954674-4-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Modularize hugepage collapse by introducing struct collapse_control.
This structure describes the properties of the requested collapse and
also serves as a local scratchpad used during the collapse itself.

Start by moving the global per-node khugepaged statistics into this new
structure, and stack-allocate one for khugepaged's collapse context.

Signed-off-by: Zach O'Keefe
Reported-by: kernel test robot
---
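A sketch of the intended usage pattern, assembled from the hunks below:
each collapse context stack-allocates its own scratchpad instead of
touching the old file-level globals (later patches in the series add
further contexts beyond khugepaged's):

static int khugepaged(void *none)
{
	/* khugepaged's private collapse context, one per kthread */
	struct collapse_control cc = {
		.last_target_node = NUMA_NO_NODE,
	};
	...
	while (!kthread_should_stop()) {
		khugepaged_do_scan(&cc);	/* cc threaded through the scan */
		khugepaged_wait_work();
	}
	...
}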
 mm/khugepaged.c | 87 ++++++++++++++++++++++++++++---------------------
 1 file changed, 49 insertions(+), 38 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7a914ca19e96..907d0b2bd4bd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -86,6 +86,14 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
 
 #define MAX_PTE_MAPPED_THP 8
 
+struct collapse_control {
+	/* Num pages scanned per node */
+	int node_load[MAX_NUMNODES];
+
+	/* Last target selected in khugepaged_find_target_node() */
+	int last_target_node;
+};
+
 /**
  * struct mm_slot - hash lookup from mm to mm_slot
  * @hash: hash collision list
@@ -777,9 +785,7 @@ static void khugepaged_alloc_sleep(void)
 	remove_wait_queue(&khugepaged_wait, &wait);
 }
 
-static int khugepaged_node_load[MAX_NUMNODES];
-
-static bool khugepaged_scan_abort(int nid)
+static bool khugepaged_scan_abort(int nid, struct collapse_control *cc)
 {
 	int i;
 
@@ -791,11 +797,11 @@ static bool khugepaged_scan_abort(int nid)
 		return false;
 
 	/* If there is a count for this node already, it must be acceptable */
-	if (khugepaged_node_load[nid])
+	if (cc->node_load[nid])
 		return false;
 
 	for (i = 0; i < MAX_NUMNODES; i++) {
-		if (!khugepaged_node_load[i])
+		if (!cc->node_load[i])
 			continue;
 		if (node_distance(nid, i) > node_reclaim_distance)
 			return true;
@@ -810,32 +816,31 @@ static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void)
 }
 
 #ifdef CONFIG_NUMA
-static int khugepaged_find_target_node(void)
+static int khugepaged_find_target_node(struct collapse_control *cc)
 {
-	static int last_khugepaged_target_node = NUMA_NO_NODE;
 	int nid, target_node = 0, max_value = 0;
 
 	/* find first node with max normal pages hit */
 	for (nid = 0; nid < MAX_NUMNODES; nid++)
-		if (khugepaged_node_load[nid] > max_value) {
-			max_value = khugepaged_node_load[nid];
+		if (cc->node_load[nid] > max_value) {
+			max_value = cc->node_load[nid];
 			target_node = nid;
 		}
 
 	/* do some balance if several nodes have the same hit record */
-	if (target_node <= last_khugepaged_target_node)
-		for (nid = last_khugepaged_target_node + 1; nid < MAX_NUMNODES;
-		     nid++)
-			if (max_value == khugepaged_node_load[nid]) {
+	if (target_node <= cc->last_target_node)
+		for (nid = cc->last_target_node + 1; nid < MAX_NUMNODES;
+		     nid++)
+			if (max_value == cc->node_load[nid]) {
 				target_node = nid;
 				break;
 			}
 
-	last_khugepaged_target_node = target_node;
+	cc->last_target_node = target_node;
 	return target_node;
 }
 #else
-static int khugepaged_find_target_node(void)
+static int khugepaged_find_target_node(struct collapse_control *cc)
 {
 	return 0;
 }
@@ -1155,10 +1160,9 @@ static void collapse_huge_page(struct mm_struct *mm,
 	return;
 }
 
-static int khugepaged_scan_pmd(struct mm_struct *mm,
-			       struct vm_area_struct *vma,
-			       unsigned long address,
-			       struct page **hpage)
+static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
+			       unsigned long address, struct page **hpage,
+			       struct collapse_control *cc)
 {
 	pmd_t *pmd;
 	pte_t *pte, *_pte;
@@ -1176,7 +1180,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	if (result != SCAN_SUCCEED)
 		goto out;
 
-	memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load));
+	memset(cc->node_load, 0, sizeof(cc->node_load));
 	pte = pte_offset_map_lock(mm, pmd, address, &ptl);
 	for (_address = address, _pte = pte; _pte < pte+HPAGE_PMD_NR;
	     _pte++, _address += PAGE_SIZE) {
@@ -1242,16 +1246,16 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 
 		/*
 		 * Record which node the original page is from and save this
-		 * information to khugepaged_node_load[].
+		 * information to cc->node_load[].
 		 * Khugepaged will allocate hugepage from the node has the max
 		 * hit record.
 		 */
 		node = page_to_nid(page);
-		if (khugepaged_scan_abort(node)) {
+		if (khugepaged_scan_abort(node, cc)) {
 			result = SCAN_SCAN_ABORT;
 			goto out_unmap;
 		}
-		khugepaged_node_load[node]++;
+		cc->node_load[node]++;
 		if (!PageLRU(page)) {
 			result = SCAN_PAGE_LRU;
 			goto out_unmap;
@@ -1302,7 +1306,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret) {
-		node = khugepaged_find_target_node();
+		node = khugepaged_find_target_node(cc);
 		/* collapse_huge_page will return with the mmap_lock released */
 		collapse_huge_page(mm, address, hpage, node, referenced,
 				   unmapped);
@@ -1958,8 +1962,9 @@ static void collapse_file(struct mm_struct *mm,
 	/* TODO: tracepoints */
 }
 
-static void khugepaged_scan_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start, struct page **hpage)
+static void khugepaged_scan_file(struct mm_struct *mm, struct file *file,
+				 pgoff_t start, struct page **hpage,
+				 struct collapse_control *cc)
 {
 	struct page *page = NULL;
 	struct address_space *mapping = file->f_mapping;
@@ -1970,7 +1975,7 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 
 	present = 0;
 	swap = 0;
-	memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load));
+	memset(cc->node_load, 0, sizeof(cc->node_load));
 	rcu_read_lock();
 	xas_for_each(&xas, page, start + HPAGE_PMD_NR - 1) {
 		if (xas_retry(&xas, page))
@@ -1995,11 +2000,11 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 		}
 
 		node = page_to_nid(page);
-		if (khugepaged_scan_abort(node)) {
+		if (khugepaged_scan_abort(node, cc)) {
 			result = SCAN_SCAN_ABORT;
 			break;
 		}
-		khugepaged_node_load[node]++;
+		cc->node_load[node]++;
 
 		if (!PageLRU(page)) {
 			result = SCAN_PAGE_LRU;
@@ -2032,7 +2037,7 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
-			node = khugepaged_find_target_node();
+			node = khugepaged_find_target_node(cc);
 			collapse_file(mm, file, start, hpage, node);
 		}
 	}
@@ -2040,8 +2045,9 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 	/* TODO: tracepoints */
 }
 #else
-static void khugepaged_scan_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start, struct page **hpage)
+static void khugepaged_scan_file(struct mm_struct *mm, struct file *file,
+				 pgoff_t start, struct page **hpage,
+				 struct collapse_control *cc)
 {
 	BUILD_BUG();
 }
@@ -2052,7 +2058,8 @@ static void khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
 #endif
 
 static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
-					    struct page **hpage)
+					    struct page **hpage,
+					    struct collapse_control *cc)
 	__releases(&khugepaged_mm_lock)
 	__acquires(&khugepaged_mm_lock)
 {
@@ -2133,12 +2140,13 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 				mmap_read_unlock(mm);
 				ret = 1;
-				khugepaged_scan_file(mm, file, pgoff, hpage);
+				khugepaged_scan_file(mm, file, pgoff, hpage,
+						     cc);
 				fput(file);
 			} else {
 				ret = khugepaged_scan_pmd(mm, vma,
 						khugepaged_scan.address,
-						hpage);
+						hpage, cc);
 			}
 			/* move to next address */
 			khugepaged_scan.address += HPAGE_PMD_SIZE;
@@ -2194,7 +2202,7 @@ static int khugepaged_wait_event(void)
 		kthread_should_stop();
 }
 
-static void khugepaged_do_scan(void)
+static void khugepaged_do_scan(struct collapse_control *cc)
 {
 	struct page *hpage = NULL;
 	unsigned int progress = 0, pass_through_head = 0;
@@ -2218,7 +2226,7 @@ static void khugepaged_do_scan(void)
 		if (khugepaged_has_work() &&
 		    pass_through_head < 2)
 			progress += khugepaged_scan_mm_slot(pages - progress,
-							    &hpage);
+							    &hpage, cc);
 		else
 			progress = pages;
 		spin_unlock(&khugepaged_mm_lock);
@@ -2254,12 +2262,15 @@ static void khugepaged_wait_work(void)
 static int khugepaged(void *none)
 {
 	struct mm_slot *mm_slot;
+	struct collapse_control cc = {
+		.last_target_node = NUMA_NO_NODE,
+	};
 
 	set_freezable();
 	set_user_nice(current, MAX_NICE);
 
 	while (!kthread_should_stop()) {
-		khugepaged_do_scan();
+		khugepaged_do_scan(&cc);
 		khugepaged_wait_work();
 	}
From patchwork Sat Jun 4 00:39:53 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12869461
Subject: [PATCH v6 04/15] mm/khugepaged: dedup and simplify hugepage alloc
 and charging
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Date: Fri, 3 Jun 2022 17:39:53 -0700
Message-Id: <20220604004004.954674-5-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

The following code is duplicated in collapse_huge_page() and
collapse_file():

	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;

	new_page = khugepaged_alloc_page(hpage, gfp, node);
	if (!new_page) {
		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
		goto out;
	}

	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
		result = SCAN_CGROUP_CHARGE_FAIL;
		goto out;
	}
	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);

Also, "node" is passed as an argument to both collapse_huge_page() and
collapse_file() and obtained the same way, via
khugepaged_find_target_node().

Move all this into a new helper, alloc_charge_hpage(), and remove the
duplicate code from collapse_huge_page() and collapse_file(). Also,
simplify khugepaged_alloc_page() by returning a bool indicating
allocation success instead of a copy of the allocated struct page.

Suggested-by: Peter Xu
Reviewed-by: Yang Shi
Reviewed-by: Peter Xu
Signed-off-by: Zach O'Keefe
---
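After this patch, both call sites reduce to the same few lines (a sketch
assembled from the hunks below; collapse_file() jumps to "out" rather
than "out_nolock"):

	result = alloc_charge_hpage(hpage, mm, cc);
	if (result != SCAN_SUCCEED)	/* SCAN_ALLOC_HUGE_PAGE_FAIL or
					 * SCAN_CGROUP_CHARGE_FAIL */
		goto out_nolock;

	new_page = *hpage;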
 mm/khugepaged.c | 77 ++++++++++++++++++++++---------------------------
 1 file changed, 34 insertions(+), 43 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 907d0b2bd4bd..38488d114073 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -860,19 +860,18 @@ static bool alloc_fail_should_sleep(struct page **hpage, bool *wait)
 	return false;
 }
 
-static struct page *
-khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
+static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
 {
 	*hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
 	if (unlikely(!*hpage)) {
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		*hpage = ERR_PTR(-ENOMEM);
-		return NULL;
+		return false;
 	}
 
 	prep_transhuge_page(*hpage);
 	count_vm_event(THP_COLLAPSE_ALLOC);
-	return *hpage;
+	return true;
 }
 
 /*
@@ -995,10 +994,23 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	return true;
 }
 
-static void collapse_huge_page(struct mm_struct *mm,
-			       unsigned long address,
-			       struct page **hpage,
-			       int node, int referenced, int unmapped)
+static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
+			      struct collapse_control *cc)
+{
+	gfp_t gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
+	int node = khugepaged_find_target_node(cc);
+
+	if (!khugepaged_alloc_page(hpage, gfp, node))
+		return SCAN_ALLOC_HUGE_PAGE_FAIL;
+	if (unlikely(mem_cgroup_charge(page_folio(*hpage), mm, gfp)))
+		return SCAN_CGROUP_CHARGE_FAIL;
+	count_memcg_page_event(*hpage, THP_COLLAPSE_ALLOC);
+	return SCAN_SUCCEED;
+}
+
+static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
+			       struct page **hpage, int referenced,
+			       int unmapped, struct collapse_control *cc)
 {
 	LIST_HEAD(compound_pagelist);
 	pmd_t *pmd, _pmd;
@@ -1009,13 +1021,9 @@ static void collapse_huge_page(struct mm_struct *mm,
 	int isolated = 0, result = 0;
 	struct vm_area_struct *vma;
 	struct mmu_notifier_range range;
-	gfp_t gfp;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
-	/* Only allocate from the target node */
-	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
-
 	/*
 	 * Before allocating the hugepage, release the mmap_lock read lock.
 	 * The allocation can take potentially a long time if it involves
	 * sync compaction, and we do not need to hold the mmap_lock during
	 * that. We will recheck the vma after taking it again in write mode.
 	 */
 	mmap_read_unlock(mm);
-	new_page = khugepaged_alloc_page(hpage, gfp, node);
-	if (!new_page) {
-		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
-		goto out_nolock;
-	}
-	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
-		result = SCAN_CGROUP_CHARGE_FAIL;
+	result = alloc_charge_hpage(hpage, mm, cc);
+	if (result != SCAN_SUCCEED)
 		goto out_nolock;
-	}
-	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
+
+	new_page = *hpage;
 
 	mmap_read_lock(mm);
 	result = hugepage_vma_revalidate(mm, address, &vma);
@@ -1306,10 +1309,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret) {
-		node = khugepaged_find_target_node(cc);
 		/* collapse_huge_page will return with the mmap_lock released */
-		collapse_huge_page(mm, address, hpage, node,
-				   referenced, unmapped);
+		collapse_huge_page(mm, address, hpage, referenced, unmapped,
+				   cc);
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
@@ -1578,7 +1580,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * @file: file that collapse on
  * @start: collapse start address
  * @hpage: new allocated huge page for collapse
- * @node: appointed node the new huge page allocate from
+ * @cc: collapse context and scratchpad
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
@@ -1595,12 +1597,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *  + restore gaps in the page cache;
  *  + unlock and free huge page;
  */
-static void collapse_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start,
-		struct page **hpage, int node)
+static void collapse_file(struct mm_struct *mm, struct file *file,
+			  pgoff_t start, struct page **hpage,
+			  struct collapse_control *cc)
 {
 	struct address_space *mapping = file->f_mapping;
-	gfp_t gfp;
 	struct page *new_page;
 	pgoff_t index, end = start + HPAGE_PMD_NR;
 	LIST_HEAD(pagelist);
@@ -1612,20 +1613,11 @@ static void collapse_file(struct mm_struct *mm,
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
-	/* Only allocate from the target node */
-	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
-
-	new_page = khugepaged_alloc_page(hpage, gfp, node);
-	if (!new_page) {
-		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
+	result = alloc_charge_hpage(hpage, mm, cc);
+	if (result != SCAN_SUCCEED)
 		goto out;
-	}
-
-	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
-		result = SCAN_CGROUP_CHARGE_FAIL;
-		goto out;
-	}
-	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
+
+	new_page = *hpage;
 
 	/*
	 * Ensure we have slots for all the pages in the range.
@@ -2037,8 +2029,7 @@ static void khugepaged_scan_file(struct mm_struct *mm, struct file *file,
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
-			node = khugepaged_find_target_node(cc);
-			collapse_file(mm, file, start, hpage, node);
+			collapse_file(mm, file, start, hpage, cc);
 		}
 	}
From patchwork Sat Jun 4 00:39:54 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12869462
Subject: [PATCH v6 05/15] mm/khugepaged: make allocation semantics
 context-specific
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Date: Fri, 3 Jun 2022 17:39:54 -0700
Message-Id: <20220604004004.954674-6-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Add a gfp_t flags member to struct collapse_control that allows contexts
to specify their own allocation semantics. This decouples the allocation
semantics from /sys/kernel/mm/transparent_hugepage/khugepaged/defrag.

khugepaged updates this member for every hugepage processed, since the
sysfs setting might change at any time.

Signed-off-by: Zach O'Keefe
---
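A sketch of the two usage patterns this enables. khugepaged refreshes
cc->gfp before each hugepage so a concurrent sysfs write takes effect,
while a hypothetical synchronous context (such as the MADV_COLLAPSE
caller added later in this series) could pin its own semantics once;
GFP_TRANSHUGE below is illustrative, not taken from this patch:

	/* khugepaged: re-read the sysfs-controlled defrag semantics */
	cc->gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;

	/* hypothetical synchronous caller: fixed allocation semantics */
	struct collapse_control cc = {
		.last_target_node = NUMA_NO_NODE,
		.gfp = GFP_TRANSHUGE | __GFP_THISNODE,
	};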
 mm/khugepaged.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 38488d114073..ba722347bebd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -92,6 +92,9 @@ struct collapse_control {
 
 	/* Last target selected in khugepaged_find_target_node() */
 	int last_target_node;
+
+	/* gfp used for allocation and memcg charging */
+	gfp_t gfp;
 };
 
 /**
@@ -994,15 +997,14 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	return true;
 }
 
-static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
+static int alloc_charge_hpage(struct mm_struct *mm, struct page **hpage,
 			      struct collapse_control *cc)
 {
-	gfp_t gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
 	int node = khugepaged_find_target_node(cc);
 
-	if (!khugepaged_alloc_page(hpage, gfp, node))
+	if (!khugepaged_alloc_page(hpage, cc->gfp, node))
 		return SCAN_ALLOC_HUGE_PAGE_FAIL;
-	if (unlikely(mem_cgroup_charge(page_folio(*hpage), mm, gfp)))
+	if (unlikely(mem_cgroup_charge(page_folio(*hpage), mm, cc->gfp)))
 		return SCAN_CGROUP_CHARGE_FAIL;
 	count_memcg_page_event(*hpage, THP_COLLAPSE_ALLOC);
 	return SCAN_SUCCEED;
@@ -1032,7 +1034,7 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 */
 	mmap_read_unlock(mm);
 
-	result = alloc_charge_hpage(hpage, mm, cc);
+	result = alloc_charge_hpage(mm, hpage, cc);
 	if (result != SCAN_SUCCEED)
 		goto out_nolock;
 
@@ -1613,7 +1615,7 @@ static void collapse_file(struct mm_struct *mm, struct file *file,
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
-	result = alloc_charge_hpage(hpage, mm, cc);
+	result = alloc_charge_hpage(mm, hpage, cc);
 	if (result != SCAN_SUCCEED)
 		goto out;
 
@@ -2037,8 +2039,7 @@ static void khugepaged_scan_file(struct mm_struct *mm, struct file *file,
 }
 #else
 static void khugepaged_scan_file(struct mm_struct *mm, struct file *file,
-				 pgoff_t start, struct page **hpage,
-				 struct collapse_control *cc)
+				 pgoff_t start, struct collapse_control *cc)
 {
 	BUILD_BUG();
 }
@@ -2121,6 +2122,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 		if (unlikely(khugepaged_test_exit(mm)))
 			goto breakouterloop;
 
+		/* reset gfp flags since sysfs settings might change */
+		cc->gfp = alloc_hugepage_khugepaged_gfpmask() |
+			  __GFP_THISNODE;
 		VM_BUG_ON(khugepaged_scan.address < hstart ||
 			  khugepaged_scan.address + HPAGE_PMD_SIZE >
 			  hend);
@@ -2255,6 +2259,7 @@ static int khugepaged(void *none)
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
 		.last_target_node = NUMA_NO_NODE,
+		/* .gfp set later */
 	};
 
 	set_freezable();
From patchwork Sat Jun 4 00:39:55 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12869463
Subject: [PATCH v6 06/15] mm/khugepaged: pipe enum scan_result codes back
 to callers
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Date: Fri, 3 Jun 2022 17:39:55 -0700
Message-Id: <20220604004004.954674-7-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Pipe enum scan_result codes back through the return values of functions
downstream of khugepaged_scan_file() and khugepaged_scan_pmd() to tell
callers whether the operation was successful and, if not, why.

Since khugepaged_scan_pmd()'s return value already has a specific meaning
(whether mmap_lock was unlocked or not), add a bool* argument to
khugepaged_scan_pmd() to retrieve this information instead.

Change khugepaged to take action based on the return values of
khugepaged_scan_file() and khugepaged_scan_pmd() rather than acting deep
within the collapsing functions themselves.

Remove the dependency on error pointers to tell khugepaged that
allocation failed and it should sleep; instead, just use the result of
the scan (SCAN_ALLOC_HUGE_PAGE_FAIL if allocation fails).

Signed-off-by: Zach O'Keefe
Reviewed-by: Yang Shi
---
struct collapse_control *cc) { LIST_HEAD(compound_pagelist); pmd_t *pmd, _pmd; pte_t *pte; pgtable_t pgtable; - struct page *new_page; + struct page *hpage; spinlock_t *pmd_ptl, *pte_ptl; - int isolated = 0, result = 0; + int result = SCAN_FAIL; struct vm_area_struct *vma; struct mmu_notifier_range range; @@ -1034,12 +1032,10 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address, */ mmap_read_unlock(mm); - result = alloc_charge_hpage(mm, hpage, cc); + result = alloc_charge_hpage(mm, &hpage, cc); if (result != SCAN_SUCCEED) goto out_nolock; - new_page = *hpage; - mmap_read_lock(mm); result = hugepage_vma_revalidate(mm, address, &vma); if (result) { @@ -1100,11 +1096,11 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address, mmu_notifier_invalidate_range_end(&range); spin_lock(pte_ptl); - isolated = __collapse_huge_page_isolate(vma, address, pte, - &compound_pagelist); + result = __collapse_huge_page_isolate(vma, address, pte, + &compound_pagelist); spin_unlock(pte_ptl); - if (unlikely(!isolated)) { + if (unlikely(result != SCAN_SUCCEED)) { pte_unmap(pte); spin_lock(pmd_ptl); BUG_ON(!pmd_none(*pmd)); @@ -1116,7 +1112,6 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address, pmd_populate(mm, pmd, pmd_pgtable(_pmd)); spin_unlock(pmd_ptl); anon_vma_unlock_write(vma->anon_vma); - result = SCAN_FAIL; goto out_up_write; } @@ -1126,8 +1121,8 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address, */ anon_vma_unlock_write(vma->anon_vma); - __collapse_huge_page_copy(pte, new_page, vma, address, pte_ptl, - &compound_pagelist); + __collapse_huge_page_copy(pte, hpage, vma, address, pte_ptl, + &compound_pagelist); pte_unmap(pte); /* * spin_lock() below is not the equivalent of smp_wmb(), but @@ -1135,43 +1130,42 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address, * avoid the copy_huge_page writes to become visible after * the set_pmd_at() write. 
*/ - __SetPageUptodate(new_page); + __SetPageUptodate(hpage); pgtable = pmd_pgtable(_pmd); - _pmd = mk_huge_pmd(new_page, vma->vm_page_prot); + _pmd = mk_huge_pmd(hpage, vma->vm_page_prot); _pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma); spin_lock(pmd_ptl); BUG_ON(!pmd_none(*pmd)); - page_add_new_anon_rmap(new_page, vma, address); - lru_cache_add_inactive_or_unevictable(new_page, vma); + page_add_new_anon_rmap(hpage, vma, address); + lru_cache_add_inactive_or_unevictable(hpage, vma); pgtable_trans_huge_deposit(mm, pmd, pgtable); set_pmd_at(mm, address, pmd, _pmd); update_mmu_cache_pmd(vma, address, pmd); spin_unlock(pmd_ptl); - *hpage = NULL; + hpage = NULL; - khugepaged_pages_collapsed++; result = SCAN_SUCCEED; out_up_write: mmap_write_unlock(mm); out_nolock: - if (!IS_ERR_OR_NULL(*hpage)) { - mem_cgroup_uncharge(page_folio(*hpage)); - put_page(*hpage); + if (hpage) { + mem_cgroup_uncharge(page_folio(hpage)); + put_page(hpage); } - trace_mm_collapse_huge_page(mm, isolated, result); - return; + trace_mm_collapse_huge_page(mm, result == SCAN_SUCCEED, result); + return result; } static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, - unsigned long address, struct page **hpage, + unsigned long address, bool *mmap_locked, struct collapse_control *cc) { pmd_t *pmd; pte_t *pte, *_pte; - int ret = 0, result = 0, referenced = 0; + int result = SCAN_FAIL, referenced = 0; int none_or_zero = 0, shared = 0; struct page *page = NULL; unsigned long _address; @@ -1306,19 +1300,19 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, result = SCAN_LACK_REFERENCED_PAGE; } else { result = SCAN_SUCCEED; - ret = 1; } out_unmap: pte_unmap_unlock(pte, ptl); - if (ret) { + if (result == SCAN_SUCCEED) { /* collapse_huge_page will return with the mmap_lock released */ - collapse_huge_page(mm, address, hpage, referenced, unmapped, - cc); + *mmap_locked = false; + result = collapse_huge_page(mm, address, referenced, + unmapped, cc); } out: trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced, none_or_zero, result, unmapped); - return ret; + return result; } static void collect_mm_slot(struct mm_slot *mm_slot) @@ -1581,7 +1575,6 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff) * @mm: process address space where collapse happens * @file: file that collapse on * @start: collapse start address - * @hpage: new allocated huge page for collapse * @cc: collapse context and scratchpad * * Basic scheme is simple, details are more complex: @@ -1599,12 +1592,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff) * + restore gaps in the page cache; * + unlock and free huge page; */ -static void collapse_file(struct mm_struct *mm, struct file *file, - pgoff_t start, struct page **hpage, - struct collapse_control *cc) +static int collapse_file(struct mm_struct *mm, struct file *file, + pgoff_t start, struct collapse_control *cc) { struct address_space *mapping = file->f_mapping; - struct page *new_page; + struct page *hpage; pgoff_t index, end = start + HPAGE_PMD_NR; LIST_HEAD(pagelist); XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER); @@ -1615,12 +1607,10 @@ static void collapse_file(struct mm_struct *mm, struct file *file, VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem); VM_BUG_ON(start & (HPAGE_PMD_NR - 1)); - result = alloc_charge_hpage(mm, hpage, cc); + result = alloc_charge_hpage(mm, &hpage, cc); if (result != SCAN_SUCCEED) goto out; - new_page = *hpage; - /* * Ensure 
we have slots for all the pages in the range. This is * almost certainly a no-op because most of the pages must be present @@ -1637,14 +1627,14 @@ static void collapse_file(struct mm_struct *mm, struct file *file, } } while (1); - __SetPageLocked(new_page); + __SetPageLocked(hpage); if (is_shmem) - __SetPageSwapBacked(new_page); - new_page->index = start; - new_page->mapping = mapping; + __SetPageSwapBacked(hpage); + hpage->index = start; + hpage->mapping = mapping; /* - * At this point the new_page is locked and not up-to-date. + * At this point the hpage is locked and not up-to-date. * It's safe to insert it into the page cache, because nobody would * be able to map it or use it in another way until we unlock it. */ @@ -1672,7 +1662,7 @@ static void collapse_file(struct mm_struct *mm, struct file *file, result = SCAN_FAIL; goto xa_locked; } - xas_store(&xas, new_page); + xas_store(&xas, hpage); nr_none++; continue; } @@ -1814,19 +1804,19 @@ static void collapse_file(struct mm_struct *mm, struct file *file, list_add_tail(&page->lru, &pagelist); /* Finally, replace with the new page. */ - xas_store(&xas, new_page); + xas_store(&xas, hpage); continue; out_unlock: unlock_page(page); put_page(page); goto xa_unlocked; } - nr = thp_nr_pages(new_page); + nr = thp_nr_pages(hpage); if (is_shmem) - __mod_lruvec_page_state(new_page, NR_SHMEM_THPS, nr); + __mod_lruvec_page_state(hpage, NR_SHMEM_THPS, nr); else { - __mod_lruvec_page_state(new_page, NR_FILE_THPS, nr); + __mod_lruvec_page_state(hpage, NR_FILE_THPS, nr); filemap_nr_thps_inc(mapping); /* * Paired with smp_mb() in do_dentry_open() to ensure @@ -1837,21 +1827,21 @@ static void collapse_file(struct mm_struct *mm, struct file *file, smp_mb(); if (inode_is_open_for_write(mapping->host)) { result = SCAN_FAIL; - __mod_lruvec_page_state(new_page, NR_FILE_THPS, -nr); + __mod_lruvec_page_state(hpage, NR_FILE_THPS, -nr); filemap_nr_thps_dec(mapping); goto xa_locked; } } if (nr_none) { - __mod_lruvec_page_state(new_page, NR_FILE_PAGES, nr_none); + __mod_lruvec_page_state(hpage, NR_FILE_PAGES, nr_none); if (is_shmem) - __mod_lruvec_page_state(new_page, NR_SHMEM, nr_none); + __mod_lruvec_page_state(hpage, NR_SHMEM, nr_none); } /* Join all the small entries into a single multi-index entry */ xas_set_order(&xas, start, HPAGE_PMD_ORDER); - xas_store(&xas, new_page); + xas_store(&xas, hpage); xa_locked: xas_unlock_irq(&xas); xa_unlocked: @@ -1873,11 +1863,11 @@ static void collapse_file(struct mm_struct *mm, struct file *file, index = start; list_for_each_entry_safe(page, tmp, &pagelist, lru) { while (index < page->index) { - clear_highpage(new_page + (index % HPAGE_PMD_NR)); + clear_highpage(hpage + (index % HPAGE_PMD_NR)); index++; } - copy_highpage(new_page + (page->index % HPAGE_PMD_NR), - page); + copy_highpage(hpage + (page->index % HPAGE_PMD_NR), + page); list_del(&page->lru); page->mapping = NULL; page_ref_unfreeze(page, 1); @@ -1888,23 +1878,23 @@ static void collapse_file(struct mm_struct *mm, struct file *file, index++; } while (index < end) { - clear_highpage(new_page + (index % HPAGE_PMD_NR)); + clear_highpage(hpage + (index % HPAGE_PMD_NR)); index++; } - SetPageUptodate(new_page); - page_ref_add(new_page, HPAGE_PMD_NR - 1); + SetPageUptodate(hpage); + page_ref_add(hpage, HPAGE_PMD_NR - 1); if (is_shmem) - set_page_dirty(new_page); - lru_cache_add(new_page); + set_page_dirty(hpage); + lru_cache_add(hpage); /* * Remove pte page tables, so we can re-fault the page as huge. 
*/ retract_page_tables(mapping, start); - *hpage = NULL; - - khugepaged_pages_collapsed++; + unlock_page(hpage); + hpage = NULL; + goto out; } else { struct page *page; @@ -1943,22 +1933,22 @@ static void collapse_file(struct mm_struct *mm, struct file *file, VM_BUG_ON(nr_none); xas_unlock_irq(&xas); - new_page->mapping = NULL; + hpage->mapping = NULL; } - unlock_page(new_page); + unlock_page(hpage); out: VM_BUG_ON(!list_empty(&pagelist)); - if (!IS_ERR_OR_NULL(*hpage)) { - mem_cgroup_uncharge(page_folio(*hpage)); - put_page(*hpage); + if (hpage) { + mem_cgroup_uncharge(page_folio(hpage)); + put_page(hpage); } /* TODO: tracepoints */ + return result; } -static void khugepaged_scan_file(struct mm_struct *mm, struct file *file, - pgoff_t start, struct page **hpage, - struct collapse_control *cc) +static int khugepaged_scan_file(struct mm_struct *mm, struct file *file, + pgoff_t start, struct collapse_control *cc) { struct page *page = NULL; struct address_space *mapping = file->f_mapping; @@ -2031,15 +2021,16 @@ static void khugepaged_scan_file(struct mm_struct *mm, struct file *file, result = SCAN_EXCEED_NONE_PTE; count_vm_event(THP_SCAN_EXCEED_NONE_PTE); } else { - collapse_file(mm, file, start, hpage, cc); + result = collapse_file(mm, file, start, cc); } } /* TODO: tracepoints */ + return result; } #else -static void khugepaged_scan_file(struct mm_struct *mm, struct file *file, - pgoff_t start, struct collapse_control *cc) +static int khugepaged_scan_file(struct mm_struct *mm, struct file *file, pgoff_t start, + struct collapse_control *cc) { BUILD_BUG(); } @@ -2049,8 +2040,7 @@ static void khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot) } #endif -static unsigned int khugepaged_scan_mm_slot(unsigned int pages, - struct page **hpage, +static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, struct collapse_control *cc) __releases(&khugepaged_mm_lock) __acquires(&khugepaged_mm_lock) @@ -2064,6 +2054,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, VM_BUG_ON(!pages); lockdep_assert_held(&khugepaged_mm_lock); + *result = SCAN_FAIL; if (khugepaged_scan.mm_slot) mm_slot = khugepaged_scan.mm_slot; @@ -2117,7 +2108,8 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, goto skip; while (khugepaged_scan.address < hend) { - int ret; + bool mmap_locked = true; + cond_resched(); if (unlikely(khugepaged_test_exit(mm))) goto breakouterloop; @@ -2134,20 +2126,28 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, khugepaged_scan.address); mmap_read_unlock(mm); - ret = 1; - khugepaged_scan_file(mm, file, pgoff, hpage, - cc); + mmap_locked = false; + *result = khugepaged_scan_file(mm, file, pgoff, + cc); fput(file); } else { - ret = khugepaged_scan_pmd(mm, vma, - khugepaged_scan.address, - hpage, cc); + *result = khugepaged_scan_pmd(mm, vma, + khugepaged_scan.address, + &mmap_locked, cc); } + if (*result == SCAN_SUCCEED) + ++khugepaged_pages_collapsed; /* move to next address */ khugepaged_scan.address += HPAGE_PMD_SIZE; progress += HPAGE_PMD_NR; - if (ret) - /* we released mmap_lock so break loop */ + if (!mmap_locked) + /* + * We released mmap_lock so break loop. Note + * that we drop mmap_lock before all hugepage + * allocations, so if allocation fails, we are + * guaranteed to break here and report the + * correct result back to caller. 
+ */ goto breakouterloop_mmap_lock; if (progress >= pages) goto breakouterloop; @@ -2199,15 +2199,15 @@ static int khugepaged_wait_event(void) static void khugepaged_do_scan(struct collapse_control *cc) { - struct page *hpage = NULL; unsigned int progress = 0, pass_through_head = 0; unsigned int pages = READ_ONCE(khugepaged_pages_to_scan); bool wait = true; + int result = SCAN_SUCCEED; lru_add_drain_all(); while (progress < pages) { - if (alloc_fail_should_sleep(&hpage, &wait)) + if (alloc_fail_should_sleep(result, &wait)) break; cond_resched(); @@ -2221,7 +2221,7 @@ static void khugepaged_do_scan(struct collapse_control *cc) if (khugepaged_has_work() && pass_through_head < 2) progress += khugepaged_scan_mm_slot(pages - progress, - &hpage, cc); + &result, cc); else progress = pages; spin_unlock(&khugepaged_mm_lock);
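The control flow this patch introduces is easier to see in isolation.
Below is a minimal userspace analogue (not kernel code: collapse() is a
toy stand-in for collapse_huge_page(), and only three scan_result values
are modeled) of how result codes now drive both the caller's bookkeeping
and the allocation-failure backoff that previously relied on
ERR_PTR-encoded *hpage state:

#include <stdbool.h>
#include <stdio.h>

enum scan_result {
	SCAN_FAIL,
	SCAN_SUCCEED,
	SCAN_ALLOC_HUGE_PAGE_FAIL,
};

/* Toy stand-in for collapse_huge_page(): reports why it failed. */
static enum scan_result collapse(int i)
{
	return (i % 3) ? SCAN_SUCCEED : SCAN_ALLOC_HUGE_PAGE_FAIL;
}

/* Mirrors alloc_fail_should_sleep(): sleep once, bail on the second miss. */
static bool alloc_fail_should_sleep(enum scan_result result, bool *wait)
{
	if (result == SCAN_ALLOC_HUGE_PAGE_FAIL) {
		if (!*wait)
			return true;
		*wait = false;
		puts("allocation failed once; sleeping");
	}
	return false;
}

int main(void)
{
	bool wait = true;
	enum scan_result result = SCAN_SUCCEED;

	for (int i = 0; i < 8; i++) {
		if (alloc_fail_should_sleep(result, &wait))
			break;		/* second failure: stop scanning */
		result = collapse(i);
		if (result == SCAN_SUCCEED)
			printf("collapsed region %d\n", i);
	}
	return 0;
}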
From patchwork Sat Jun 4 00:39:56 2022
Date: Fri, 3 Jun 2022 17:39:56 -0700
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>
Message-Id: <20220604004004.954674-8-zokeefe@google.com>
Subject: [PATCH v6 07/15] mm/khugepaged: add flag to ignore khugepaged heuristics
From: "Zach O'Keefe"

Add an enforce_page_heuristics flag to struct collapse_control that
allows a context to ignore heuristics originally designed to guide
khugepaged:

1) the sysfs-controlled knobs khugepaged_max_ptes_[none|swap|shared]
2) the requirement that some pages in the region being collapsed be
   young or referenced

This flag is set in the khugepaged collapse context to preserve
existing khugepaged behavior. It will be left unset when the madvise
collapse context is introduced later in the series: there, the user
presumably has reason to believe the collapse will be beneficial, and
khugepaged's heuristics shouldn't tell the user they are wrong.
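The pattern applied at each heuristic check in the diff below can be
reduced to a few lines. This compilable toy (MAX_PTES_NONE is a
stand-in for the sysfs knob khugepaged_max_ptes_none; nothing here is
kernel code) shows that the counter is still maintained, but it can
only veto the collapse when enforce_page_heuristics is set:

#include <stdbool.h>
#include <stdio.h>

struct collapse_control {
	bool enforce_page_heuristics;
};

#define MAX_PTES_NONE 4	/* stand-in for khugepaged_max_ptes_none */

/* Returns true if the scan may keep going past an empty pte. */
static bool none_pte_ok(struct collapse_control *cc, int *none_or_zero)
{
	/*
	 * Same shape as the none-pte check in the patch: always bump
	 * the counter, but only let it fail the scan when heuristics
	 * are being enforced.
	 */
	return ++*none_or_zero <= MAX_PTES_NONE ||
	       !cc->enforce_page_heuristics;
}

int main(void)
{
	struct collapse_control khugepaged = { .enforce_page_heuristics = true };
	struct collapse_control madvise = { .enforce_page_heuristics = false };
	int a = 0, b = 0;

	for (int i = 0; i < 6; i++)
		printf("pte %d: khugepaged=%d madvise=%d\n", i,
		       none_pte_ok(&khugepaged, &a),
		       none_pte_ok(&madvise, &b));
	return 0;
}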
Signed-off-by: Zach O'Keefe Reviewed-by: Yang Shi --- mm/khugepaged.c | 55 +++++++++++++++++++++++++++++++++---------------- 1 file changed, 37 insertions(+), 18 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 03e0da0008f1..c3589b3e238d 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -87,6 +87,13 @@ static struct kmem_cache *mm_slot_cache __read_mostly; #define MAX_PTE_MAPPED_THP 8 struct collapse_control { + /* + * Heuristics: + * - khugepaged_max_ptes_[none|swap|shared] + * - require memory to be young / referenced + */ + bool enforce_page_heuristics; + /* Num pages scanned per node */ int node_load[MAX_NUMNODES]; @@ -604,6 +611,7 @@ static bool is_refcount_suitable(struct page *page) static int __collapse_huge_page_isolate(struct vm_area_struct *vma, unsigned long address, pte_t *pte, + struct collapse_control *cc, struct list_head *compound_pagelist) { struct page *page = NULL; @@ -617,7 +625,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, if (pte_none(pteval) || (pte_present(pteval) && is_zero_pfn(pte_pfn(pteval)))) { if (!userfaultfd_armed(vma) && - ++none_or_zero <= khugepaged_max_ptes_none) { + (++none_or_zero <= khugepaged_max_ptes_none || + !cc->enforce_page_heuristics)) { continue; } else { result = SCAN_EXCEED_NONE_PTE; @@ -637,8 +646,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, VM_BUG_ON_PAGE(!PageAnon(page), page); - if (page_mapcount(page) > 1 && - ++shared > khugepaged_max_ptes_shared) { + if (cc->enforce_page_heuristics && page_mapcount(page) > 1 && + ++shared > khugepaged_max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out; @@ -705,9 +714,10 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, list_add_tail(&page->lru, compound_pagelist); next: /* There should be enough young pte to collapse the page */ - if (pte_young(pteval) || - page_is_young(page) || PageReferenced(page) || - mmu_notifier_test_young(vma->vm_mm, address)) + if (cc->enforce_page_heuristics && + (pte_young(pteval) || page_is_young(page) || + PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm, + address))) referenced++; if (pte_write(pteval)) @@ -716,7 +726,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma, if (unlikely(!writable)) { result = SCAN_PAGE_RO; - } else if (unlikely(!referenced)) { + } else if (unlikely(cc->enforce_page_heuristics && !referenced)) { result = SCAN_LACK_REFERENCED_PAGE; } else { result = SCAN_SUCCEED; @@ -1096,7 +1106,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, mmu_notifier_invalidate_range_end(&range); spin_lock(pte_ptl); - result = __collapse_huge_page_isolate(vma, address, pte, + result = __collapse_huge_page_isolate(vma, address, pte, cc, &compound_pagelist); spin_unlock(pte_ptl); @@ -1185,7 +1195,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, _pte++, _address += PAGE_SIZE) { pte_t pteval = *_pte; if (is_swap_pte(pteval)) { - if (++unmapped <= khugepaged_max_ptes_swap) { + if (++unmapped <= khugepaged_max_ptes_swap || + !cc->enforce_page_heuristics) { /* * Always be strict with uffd-wp * enabled swap entries. 
Please see @@ -1204,7 +1215,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, } if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) { if (!userfaultfd_armed(vma) && - ++none_or_zero <= khugepaged_max_ptes_none) { + (++none_or_zero <= khugepaged_max_ptes_none || + !cc->enforce_page_heuristics)) { continue; } else { result = SCAN_EXCEED_NONE_PTE; @@ -1234,8 +1246,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, goto out_unmap; } - if (page_mapcount(page) > 1 && - ++shared > khugepaged_max_ptes_shared) { + if (cc->enforce_page_heuristics && + page_mapcount(page) > 1 && + ++shared > khugepaged_max_ptes_shared) { result = SCAN_EXCEED_SHARED_PTE; count_vm_event(THP_SCAN_EXCEED_SHARED_PTE); goto out_unmap; @@ -1289,14 +1302,17 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, result = SCAN_PAGE_COUNT; goto out_unmap; } - if (pte_young(pteval) || - page_is_young(page) || PageReferenced(page) || - mmu_notifier_test_young(vma->vm_mm, address)) + if (cc->enforce_page_heuristics && + (pte_young(pteval) || page_is_young(page) || + PageReferenced(page) || mmu_notifier_test_young(vma->vm_mm, + address))) referenced++; } if (!writable) { result = SCAN_PAGE_RO; - } else if (!referenced || (unmapped && referenced < HPAGE_PMD_NR/2)) { + } else if (cc->enforce_page_heuristics && + (!referenced || + (unmapped && referenced < HPAGE_PMD_NR / 2))) { result = SCAN_LACK_REFERENCED_PAGE; } else { result = SCAN_SUCCEED; @@ -1966,7 +1982,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file, continue; if (xa_is_value(page)) { - if (++swap > khugepaged_max_ptes_swap) { + if (cc->enforce_page_heuristics && + ++swap > khugepaged_max_ptes_swap) { result = SCAN_EXCEED_SWAP_PTE; count_vm_event(THP_SCAN_EXCEED_SWAP_PTE); break; @@ -2017,7 +2034,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file, rcu_read_unlock(); if (result == SCAN_SUCCEED) { - if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) { + if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none && + cc->enforce_page_heuristics) { result = SCAN_EXCEED_NONE_PTE; count_vm_event(THP_SCAN_EXCEED_NONE_PTE); } else { @@ -2258,6 +2276,7 @@ static int khugepaged(void *none) { struct mm_slot *mm_slot; struct collapse_control cc = { + .enforce_page_heuristics = true, .last_target_node = NUMA_NO_NODE, /* .gfp set later */ };
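For orientation: after this patch, each collapse context carries its
policy in struct collapse_control. The fragment below sketches the two
configurations under discussion (a reduced model of the struct, not the
kernel definition; the MADV_COLLAPSE context only arrives later in the
series):

#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/* Reduced model of the kernel struct; other fields omitted. */
struct collapse_control {
	bool enforce_page_heuristics;
	int last_target_node;
};

/* khugepaged: background collapse keeps every historical heuristic. */
static const struct collapse_control khugepaged_cc = {
	.enforce_page_heuristics = true,
	.last_target_node = NUMA_NO_NODE,
};

/*
 * madvise(MADV_COLLAPSE), added later in the series: the caller asked
 * for the collapse explicitly, so the heuristics stand down.
 */
static const struct collapse_control madvise_cc = {
	.enforce_page_heuristics = false,
	.last_target_node = NUMA_NO_NODE,
};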
From patchwork Sat Jun 4 00:39:57 2022
Date: Fri, 3 Jun 2022 17:39:57 -0700
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>
Message-Id: <20220604004004.954674-9-zokeefe@google.com>
Subject: [PATCH v6 08/15] mm/khugepaged: add flag to ignore THP sysfs enabled
From: "Zach O'Keefe"
Shutemov" , Matt Turner , Max Filippov , Miaohe Lin , Minchan Kim , Patrick Xia , Pavel Begunkov , Thomas Bogendoerfer , "Zach O'Keefe" Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=NK07lq92; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf22.hostedemail.com: domain of 3-amaYgcKCNINC822324CC492.0CA96BIL-AA8Jy08.CF4@flex--zokeefe.bounces.google.com designates 209.85.210.202 as permitted sender) smtp.mailfrom=3-amaYgcKCNINC822324CC492.0CA96BIL-AA8Jy08.CF4@flex--zokeefe.bounces.google.com X-Stat-Signature: 1pyhx86ro8hz1y1y8m7iphzreimkhpg4 X-Rspam-User: X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 016ADC000C X-HE-Tag: 1654303222-654895 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add enforce_thp_enabled flag to struct collapse_control that allows context to ignore constraints imposed by /sys/kernel/transparent_hugepage/enabled. This flag is set in khugepaged collapse context to preserve existing khugepaged behavior. This flag will be used (unset) when introducing madvise collapse context since the desired THP semantics of MADV_COLLAPSE aren't coupled to sysfs THP settings. Most notably, for the purpose of eventual madvise_collapse(2) support, this allows userspace to trigger THP collapse on behalf of another processes, without adding support to meddle with the VMA flags of said process, or change sysfs THP settings. For now, limit this flag to /sys/kernel/transparent_hugepage/enabled, but it can be expanded to include /sys/kernel/transparent_hugepage/shmem_enabled later. Link: https://lore.kernel.org/linux-mm/CAAa6QmQxay1_=Pmt8oCX2-Va18t44FV-Vs-WsQt_6+qBks4nZA@mail.gmail.com/ Signed-off-by: Zach O'Keefe --- mm/khugepaged.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index c3589b3e238d..4ad04f552347 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -94,6 +94,11 @@ struct collapse_control { */ bool enforce_page_heuristics; + /* Enforce constraints of + * /sys/kernel/mm/transparent_hugepage/enabled + */ + bool enforce_thp_enabled; + /* Num pages scanned per node */ int node_load[MAX_NUMNODES]; @@ -893,10 +898,12 @@ static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node) */ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address, - struct vm_area_struct **vmap) + struct vm_area_struct **vmap, + struct collapse_control *cc) { struct vm_area_struct *vma; unsigned long hstart, hend; + unsigned long vma_flags; if (unlikely(khugepaged_test_exit(mm))) return SCAN_ANY_PROCESS; @@ -909,7 +916,18 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address, hend = vma->vm_end & HPAGE_PMD_MASK; if (address < hstart || address + HPAGE_PMD_SIZE > hend) return SCAN_ADDRESS_RANGE; - if (!hugepage_vma_check(vma, vma->vm_flags)) + + /* + * If !cc->enforce_thp_enabled, set VM_HUGEPAGE so that + * hugepage_vma_check() can pass even if + * TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG is set (i.e. "madvise" mode). + * Note that hugepage_vma_check() doesn't enforce that + * TRANSPARENT_HUGEPAGE_FLAG or TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG + * must be set (i.e. "never" mode). + */ + vma_flags = cc->enforce_thp_enabled ? 
vma->vm_flags + : vma->vm_flags | VM_HUGEPAGE; + if (!hugepage_vma_check(vma, vma_flags)) return SCAN_VMA_CHECK; /* Anon VMA expected */ if (!vma->anon_vma || !vma_is_anonymous(vma)) @@ -953,7 +971,8 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm, static bool __collapse_huge_page_swapin(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long haddr, pmd_t *pmd, - int referenced) + int referenced, + struct collapse_control *cc) { int swapped_in = 0; vm_fault_t ret = 0; @@ -980,7 +999,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm, /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */ if (ret & VM_FAULT_RETRY) { mmap_read_lock(mm); - if (hugepage_vma_revalidate(mm, haddr, &vma)) { + if (hugepage_vma_revalidate(mm, haddr, &vma, cc)) { /* vma is no longer available, don't continue to swapin */ trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0); return false; @@ -1047,7 +1066,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, goto out_nolock; mmap_read_lock(mm); - result = hugepage_vma_revalidate(mm, address, &vma); + result = hugepage_vma_revalidate(mm, address, &vma, cc); if (result) { mmap_read_unlock(mm); goto out_nolock; @@ -1066,7 +1085,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, * Continuing to collapse causes inconsistency. */ if (unmapped && !__collapse_huge_page_swapin(mm, vma, address, - pmd, referenced)) { + pmd, referenced, cc)) { mmap_read_unlock(mm); goto out_nolock; } @@ -1078,7 +1097,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, * handled by the anon_vma lock + PG_lock. */ mmap_write_lock(mm); - result = hugepage_vma_revalidate(mm, address, &vma); + result = hugepage_vma_revalidate(mm, address, &vma, cc); if (result) goto out_up_write; /* check if the pmd is still valid */ @@ -2277,6 +2296,7 @@ static int khugepaged(void *none) struct mm_slot *mm_slot; struct collapse_control cc = { .enforce_page_heuristics = true, + .enforce_thp_enabled = true, .last_target_node = NUMA_NO_NODE, /* .gfp set later */ };
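The vma_flags trick above can be modeled in a few lines of userspace C.
This is a toy model of hugepage_vma_check()'s flag handling, with
arbitrary flag values, not the kernel function: OR-ing in VM_HUGEPAGE
satisfies "madvise" mode without touching the real VMA, while an
explicit VM_NOHUGEPAGE still refuses the collapse:

#include <stdbool.h>
#include <stdio.h>

#define VM_HUGEPAGE	0x01UL
#define VM_NOHUGEPAGE	0x02UL

static bool thp_madvise_mode = true;	/* stand-in for the sysfs knob */

/* Toy model of hugepage_vma_check()'s flag handling. */
static bool hugepage_vma_check(unsigned long vm_flags)
{
	if (vm_flags & VM_NOHUGEPAGE)
		return false;		/* explicit opt-out always wins */
	if (thp_madvise_mode && !(vm_flags & VM_HUGEPAGE))
		return false;		/* "madvise" mode wants an opt-in */
	return true;
}

int main(void)
{
	bool enforce_thp_enabled = false;	/* MADV_COLLAPSE context */
	unsigned long vm_flags = 0;		/* VMA was never madvised */
	unsigned long checked;

	/* The patch's trick: pretend the VMA carries MADV_HUGEPAGE... */
	checked = enforce_thp_enabled ? vm_flags : vm_flags | VM_HUGEPAGE;
	printf("plain vma:     %s\n", hugepage_vma_check(checked) ? "pass" : "fail");

	/* ...but an explicit opt-out still refuses the collapse. */
	checked = (vm_flags | VM_NOHUGEPAGE) | VM_HUGEPAGE;
	printf("VM_NOHUGEPAGE: %s\n", hugepage_vma_check(checked) ? "pass" : "fail");
	return 0;
}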
From patchwork Sat Jun 4 00:39:58 2022
Date: Fri, 3 Jun 2022 17:39:58 -0700
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>
Message-Id: <20220604004004.954674-10-zokeefe@google.com>
Subject: [PATCH v6 09/15] mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse
From: "Zach O'Keefe"

This idea was introduced by David Rientjes[1].
Introduce a new madvise mode, MADV_COLLAPSE, that allows users to
request a synchronous collapse of memory at their own expense. The
benefits of this approach are:

* CPU is charged to the process that wants to spend the cycles for the
  THP
* Avoids the unpredictable timing of khugepaged collapse

Immediate users of this new functionality are malloc() implementations
that manage memory in hugepage-sized chunks, but sometimes subrelease
memory back to the system in native-sized chunks via MADV_DONTNEED,
zapping the pmd. Later, when the memory is hot, the implementation
could madvise(MADV_COLLAPSE) to re-back the memory with THPs to regain
hugepage coverage and dTLB performance. TCMalloc is one implementation
that could benefit from this[2].

Only privately-mapped anon memory is supported for now, but file and
shmem support are expected to be added later to cover the use case of
backing executable text by THPs. The current support provided by
CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large system,
which might impair services from serving at their full rated load after
(re)starting. Tricks like mremap(2)'ing text onto anonymous memory to
immediately realize iTLB performance prevent page sharing and demand
paging, and losing both increases the steady-state memory footprint.
With MADV_COLLAPSE, we get the best of both worlds: peak upfront
performance and a lower RAM footprint.

This call is independent of the system-wide THP sysfs settings, but
will fail for memory marked VM_NOHUGEPAGE. THP allocation may enter
direct reclaim and/or compaction.

[1] https://lore.kernel.org/linux-mm/d098c392-273a-36a4-1a29-59731cdf5d3d@google.com/
[2] https://github.com/google/tcmalloc/tree/master/tcmalloc

Suggested-by: David Rientjes
Signed-off-by: Zach O'Keefe
---
 arch/alpha/include/uapi/asm/mman.h     |   2 +
 arch/mips/include/uapi/asm/mman.h      |   2 +
 arch/parisc/include/uapi/asm/mman.h    |   2 +
 arch/xtensa/include/uapi/asm/mman.h    |   2 +
 include/linux/huge_mm.h                |  12 +++
 include/uapi/asm-generic/mman-common.h |   2 +
 mm/khugepaged.c                        | 124 +++++++++++++++++++++++++
 mm/madvise.c                           |   5 +
 8 files changed, 151 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h index 4aa996423b0d..763929e814e9 100644 --- a/arch/alpha/include/uapi/asm/mman.h +++ b/arch/alpha/include/uapi/asm/mman.h @@ -76,6 +76,8 @@ #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */ +#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h index 1be428663c10..c6e1fc77c996 100644 --- a/arch/mips/include/uapi/asm/mman.h +++ b/arch/mips/include/uapi/asm/mman.h @@ -103,6 +103,8 @@ #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */ +#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h index a7ea3204a5fa..22133a6a506e 100644 --- a/arch/parisc/include/uapi/asm/mman.h +++ b/arch/parisc/include/uapi/asm/mman.h @@ -70,6 +70,8 @@ #define MADV_WIPEONFORK 71 /* Zero memory on fork, child only */ #define MADV_KEEPONFORK 72 /* Undo MADV_WIPEONFORK */ +#define MADV_COLLAPSE 73 /* Synchronous hugepage collapse */ + #define MADV_HWPOISON 100 /* poison a page for testing */ #define MADV_SOFT_OFFLINE 101 /* soft offline page for testing */ diff --git a/arch/xtensa/include/uapi/asm/mman.h
b/arch/xtensa/include/uapi/asm/mman.h index 7966a58af472..1ff0c858544f 100644 --- a/arch/xtensa/include/uapi/asm/mman.h +++ b/arch/xtensa/include/uapi/asm/mman.h @@ -111,6 +111,8 @@ #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */ +#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 648cb3ce7099..2ca2f3b41fc8 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -240,6 +240,9 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice); +int madvise_collapse(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end); void vma_adjust_trans_huge(struct vm_area_struct *vma, unsigned long start, unsigned long end, long adjust_next); spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma); @@ -395,6 +398,15 @@ static inline int hugepage_madvise(struct vm_area_struct *vma, BUG(); return 0; } + +static inline int madvise_collapse(struct vm_area_struct *vma, + struct vm_area_struct **prev, + unsigned long start, unsigned long end) +{ + BUG(); + return 0; +} + static inline void vma_adjust_trans_huge(struct vm_area_struct *vma, unsigned long start, unsigned long end, diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h index 6c1aa92a92e4..6ce1f1ceb432 100644 --- a/include/uapi/asm-generic/mman-common.h +++ b/include/uapi/asm-generic/mman-common.h @@ -77,6 +77,8 @@ #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */ +#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */ + /* compatibility flags */ #define MAP_FILE 0 diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 4ad04f552347..073d6bb03b37 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -2404,3 +2404,127 @@ void khugepaged_min_free_kbytes_update(void) set_recommended_min_free_kbytes(); mutex_unlock(&khugepaged_mutex); } + +static int madvise_collapse_errno(enum scan_result r) +{ + switch (r) { + case SCAN_PMD_NULL: + case SCAN_ADDRESS_RANGE: + case SCAN_VMA_NULL: + case SCAN_PTE_NON_PRESENT: + case SCAN_PAGE_NULL: + /* + * Addresses in the specified range are not currently mapped, + * or are outside the AS of the process. + */ + return -ENOMEM; + case SCAN_ALLOC_HUGE_PAGE_FAIL: + case SCAN_CGROUP_CHARGE_FAIL: + /* A kernel resource was temporarily unavailable. */ + return -EAGAIN; + default: + return -EINVAL; + } +} + +int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, + unsigned long start, unsigned long end) +{ + struct collapse_control cc = { + .enforce_page_heuristics = false, + .enforce_thp_enabled = false, + .last_target_node = NUMA_NO_NODE, + .gfp = GFP_TRANSHUGE | __GFP_THISNODE, + }; + struct mm_struct *mm = vma->vm_mm; + unsigned long hstart, hend, addr; + int thps = 0, last_fail = SCAN_FAIL; + bool mmap_locked = true; + + BUG_ON(vma->vm_start > start); + BUG_ON(vma->vm_end < end); + + *prev = vma; + + /* TODO: Support file/shmem */ + if (!vma->anon_vma || !vma_is_anonymous(vma)) + return -EINVAL; + + hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK; + hend = end & HPAGE_PMD_MASK; + + /* + * Set VM_HUGEPAGE so that hugepage_vma_check() can pass even if + * TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG is set (i.e. "madvise" mode). 
+ * Note that hugepage_vma_check() doesn't enforce that + * TRANSPARENT_HUGEPAGE_FLAG or TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG + * must be set (i.e. "never" mode) + */ + if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE)) + return -EINVAL; + + mmgrab(mm); + lru_add_drain(); + + for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) { + int result = SCAN_FAIL; + bool retry = true; /* Allow one retry per hugepage */ +retry: + if (!mmap_locked) { + cond_resched(); + mmap_read_lock(mm); + mmap_locked = true; + result = hugepage_vma_revalidate(mm, addr, &vma, &cc); + if (result) { + last_fail = result; + goto out_nolock; + } + } + mmap_assert_locked(mm); + memset(cc.node_load, 0, sizeof(cc.node_load)); + result = khugepaged_scan_pmd(mm, vma, addr, &mmap_locked, &cc); + if (!mmap_locked) + *prev = NULL; /* Tell caller we dropped mmap_lock */ + + switch (result) { + case SCAN_SUCCEED: + case SCAN_PMD_MAPPED: + ++thps; + break; + /* Whitelisted set of results where continuing OK */ + case SCAN_PMD_NULL: + case SCAN_PTE_NON_PRESENT: + case SCAN_PTE_UFFD_WP: + case SCAN_PAGE_RO: + case SCAN_LACK_REFERENCED_PAGE: + case SCAN_PAGE_NULL: + case SCAN_PAGE_COUNT: + case SCAN_PAGE_LOCK: + case SCAN_PAGE_COMPOUND: + last_fail = result; + break; + case SCAN_PAGE_LRU: + if (retry) { + lru_add_drain_all(); + retry = false; + goto retry; + } + fallthrough; + default: + last_fail = result; + /* Other error, exit */ + goto out_maybelock; + } + } + +out_maybelock: + /* Caller expects us to hold mmap_lock on return */ + if (!mmap_locked) + mmap_read_lock(mm); +out_nolock: + mmap_assert_locked(mm); + mmdrop(mm); + + return thps == ((hend - hstart) >> HPAGE_PMD_SHIFT) ? 0 + : madvise_collapse_errno(last_fail); +} diff --git a/mm/madvise.c b/mm/madvise.c index 46feb62ce163..eccac2620226 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -59,6 +59,7 @@ static int madvise_need_mmap_write(int behavior) case MADV_FREE: case MADV_POPULATE_READ: case MADV_POPULATE_WRITE: + case MADV_COLLAPSE: return 0; default: /* be safe, default to 1. list exceptions explicitly */ @@ -1057,6 +1058,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma, if (error) goto out; break; + case MADV_COLLAPSE: + return madvise_collapse(vma, prev, start, end); } anon_name = anon_vma_name(vma); @@ -1150,6 +1153,7 @@ madvise_behavior_valid(int behavior) #ifdef CONFIG_TRANSPARENT_HUGEPAGE case MADV_HUGEPAGE: case MADV_NOHUGEPAGE: + case MADV_COLLAPSE: #endif case MADV_DONTDUMP: case MADV_DODUMP: @@ -1339,6 +1343,7 @@ int madvise_set_anon_name(struct mm_struct *mm, unsigned long start, * MADV_NOHUGEPAGE - mark the given range as not worth being backed by * transparent huge pages so the existing pages will not be * coalesced into THP and new pages will not be allocated as THP. + * MADV_COLLAPSE - synchronously coalesce pages into new THP. * MADV_DONTDUMP - the application wants to prevent pages in the given range * from being included in its core dump. * MADV_DODUMP - cancel MADV_DONTDUMP: no longer exclude from core dump. 
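A minimal userspace caller of the new interface might look as follows.
The MADV_COLLAPSE fallback define mirrors the asm-generic uapi hunk
above (parisc uses a different value) for libcs that don't carry it
yet, and the 2MiB length assumes a pmd-sized THP as on x86_64:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25	/* from the asm-generic hunk above */
#endif

#define LEN (2UL << 20)		/* one pmd-sized region on x86_64 */

int main(void)
{
	char *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	memset(p, 1, LEN);	/* fault the range in as small pages */

	/*
	 * Ask the kernel to collapse the range into a THP now, on this
	 * process's CPU time.  Returns 0 when every hugepage-aligned/
	 * sized region in [p, p + LEN) is pmd-mapped afterwards, else
	 * -1 with errno set (e.g. EAGAIN for a transient allocation
	 * failure, per madvise_collapse_errno() above).
	 */
	if (madvise(p, LEN, MADV_COLLAPSE))
		perror("madvise(MADV_COLLAPSE)");
	return 0;
}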
From patchwork Sat Jun 4 00:39:59 2022
Date: Fri, 3 Jun 2022 17:39:59 -0700
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>
Message-Id: <20220604004004.954674-11-zokeefe@google.com>
Subject: [PATCH v6 10/15] mm/khugepaged: rename prefix of shared collapse functions
From: "Zach O'Keefe"

The following functions/tracepoints are shared between khugepaged and
madvise collapse contexts. Replace the "khugepaged_" prefix with the
generic "hpage_collapse_" prefix in such cases:

khugepaged_test_exit() -> hpage_collapse_test_exit()
khugepaged_scan_abort() -> hpage_collapse_scan_abort()
khugepaged_scan_pmd() -> hpage_collapse_scan_pmd()
khugepaged_find_target_node() -> hpage_collapse_find_target_node()
khugepaged_alloc_page() -> hpage_collapse_alloc_page()
huge_memory:mm_khugepaged_scan_pmd -> huge_memory:mm_hpage_collapse_scan_pmd

Signed-off-by: Zach O'Keefe
Reviewed-by: Yang Shi
---
 include/trace/events/huge_memory.h |  2 +-
 mm/khugepaged.c                    | 71 ++++++++++++++++--------------
 2 files changed, 38 insertions(+), 35 deletions(-)

diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h index 55392bf30a03..fb6c73632ff3 100644 --- a/include/trace/events/huge_memory.h +++ b/include/trace/events/huge_memory.h @@ -48,7 +48,7 @@ SCAN_STATUS #define EM(a, b) {a, b}, #define EMe(a, b) {a, b} -TRACE_EVENT(mm_khugepaged_scan_pmd, +TRACE_EVENT(mm_hpage_collapse_scan_pmd, TP_PROTO(struct mm_struct *mm, struct page *page, bool writable, int referenced, int none_or_zero, int status, int unmapped), diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 073d6bb03b37..119c1bc84af7 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -102,7 +102,7 @@ struct collapse_control { /* Num pages scanned per node */ int node_load[MAX_NUMNODES]; - /* Last target selected in khugepaged_find_target_node() */ + /* Last target selected in hpage_collapse_find_target_node() */ int last_target_node; /* gfp used for allocation and memcg charging */ @@ -456,7 +456,7 @@ static void insert_to_mm_slots_hash(struct mm_struct *mm, hash_add(mm_slots_hash, &mm_slot->hash, (long)mm); } -static inline int khugepaged_test_exit(struct mm_struct *mm) +static inline int hpage_collapse_test_exit(struct mm_struct *mm) { return atomic_read(&mm->mm_users) == 0; } @@ -508,7 +508,7 @@ void __khugepaged_enter(struct mm_struct *mm) return; /* __khugepaged_exit() must not run from under us */ - VM_BUG_ON_MM(khugepaged_test_exit(mm), mm); + VM_BUG_ON_MM(hpage_collapse_test_exit(mm), mm); if
(unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) { free_mm_slot(mm_slot); return; @@ -562,11 +562,10 @@ void __khugepaged_exit(struct mm_struct *mm) } else if (mm_slot) { /* * This is required to serialize against - * khugepaged_test_exit() (which is guaranteed to run - * under mmap sem read mode). Stop here (after we - * return all pagetables will be destroyed) until - * khugepaged has finished working on the pagetables - * under the mmap_lock. + * hpage_collapse_test_exit() (which is guaranteed to run + * under mmap sem read mode). Stop here (after we return all + * pagetables will be destroyed) until khugepaged has finished + * working on the pagetables under the mmap_lock. */ mmap_write_lock(mm); mmap_write_unlock(mm); @@ -803,7 +802,7 @@ static void khugepaged_alloc_sleep(void) remove_wait_queue(&khugepaged_wait, &wait); } -static bool khugepaged_scan_abort(int nid, struct collapse_control *cc) +static bool hpage_collapse_scan_abort(int nid, struct collapse_control *cc) { int i; @@ -834,7 +833,7 @@ static inline gfp_t alloc_hugepage_khugepaged_gfpmask(void) } #ifdef CONFIG_NUMA -static int khugepaged_find_target_node(struct collapse_control *cc) +static int hpage_collapse_find_target_node(struct collapse_control *cc) { int nid, target_node = 0, max_value = 0; @@ -858,7 +857,7 @@ static int khugepaged_find_target_node(struct collapse_control *cc) return target_node; } #else -static int khugepaged_find_target_node(struct collapse_control *cc) +static int hpage_collapse_find_target_node(struct collapse_control *cc) { return 0; } @@ -877,7 +876,7 @@ static bool alloc_fail_should_sleep(int result, bool *wait) return false; } -static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node) +static bool hpage_collapse_alloc_page(struct page **hpage, gfp_t gfp, int node) { *hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER); if (unlikely(!*hpage)) { @@ -905,7 +904,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address, unsigned long hstart, hend; unsigned long vma_flags; - if (unlikely(khugepaged_test_exit(mm))) + if (unlikely(hpage_collapse_test_exit(mm))) return SCAN_ANY_PROCESS; *vmap = vma = find_vma(mm, address); @@ -962,7 +961,7 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm, /* * Bring missing pages in from swap, to complete THP collapse. - * Only done if khugepaged_scan_pmd believes it is worthwhile. + * Only done if hpage_collapse_scan_pmd believes it is worthwhile. * * Called and returns without pte mapped or spinlocks held, * but with mmap_lock held to protect against vma changes. 
@@ -1027,9 +1026,9 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm, static int alloc_charge_hpage(struct mm_struct *mm, struct page **hpage, struct collapse_control *cc) { - int node = khugepaged_find_target_node(cc); + int node = hpage_collapse_find_target_node(cc); - if (!khugepaged_alloc_page(hpage, cc->gfp, node)) + if (!hpage_collapse_alloc_page(hpage, cc->gfp, node)) return SCAN_ALLOC_HUGE_PAGE_FAIL; if (unlikely(mem_cgroup_charge(page_folio(*hpage), mm, cc->gfp))) return SCAN_CGROUP_CHARGE_FAIL; @@ -1188,9 +1187,10 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address, return result; } -static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, - unsigned long address, bool *mmap_locked, - struct collapse_control *cc) +static int hpage_collapse_scan_pmd(struct mm_struct *mm, + struct vm_area_struct *vma, + unsigned long address, bool *mmap_locked, + struct collapse_control *cc) { pmd_t *pmd; pte_t *pte, *_pte; @@ -1282,7 +1282,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, * hit record. */ node = page_to_nid(page); - if (khugepaged_scan_abort(node, cc)) { + if (hpage_collapse_scan_abort(node, cc)) { result = SCAN_SCAN_ABORT; goto out_unmap; } @@ -1345,8 +1345,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma, unmapped, cc); } out: - trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced, - none_or_zero, result, unmapped); + trace_mm_hpage_collapse_scan_pmd(mm, page, writable, referenced, + none_or_zero, result, unmapped); return result; } @@ -1356,7 +1356,7 @@ static void collect_mm_slot(struct mm_slot *mm_slot) lockdep_assert_held(&khugepaged_mm_lock); - if (khugepaged_test_exit(mm)) { + if (hpage_collapse_test_exit(mm)) { /* free mm_slot */ hash_del(&mm_slot->hash); list_del(&mm_slot->mm_node); @@ -1530,7 +1530,7 @@ static void khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot) if (!mmap_write_trylock(mm)) return; - if (unlikely(khugepaged_test_exit(mm))) + if (unlikely(hpage_collapse_test_exit(mm))) goto out; for (i = 0; i < mm_slot->nr_pte_mapped_thp; i++) @@ -1593,7 +1593,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff) * it'll always mapped in small page size for uffd-wp * registered ranges. 
*/ - if (!khugepaged_test_exit(mm) && !userfaultfd_wp(vma)) + if (!hpage_collapse_test_exit(mm) && + !userfaultfd_wp(vma)) collapse_and_free_pmd(mm, vma, addr, pmd); mmap_write_unlock(mm); } else { @@ -2020,7 +2021,7 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file, } node = page_to_nid(page); - if (khugepaged_scan_abort(node, cc)) { + if (hpage_collapse_scan_abort(node, cc)) { result = SCAN_SCAN_ABORT; break; } @@ -2114,7 +2115,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, goto breakouterloop_mmap_lock; progress++; - if (unlikely(khugepaged_test_exit(mm))) + if (unlikely(hpage_collapse_test_exit(mm))) goto breakouterloop; address = khugepaged_scan.address; @@ -2123,7 +2124,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, unsigned long hstart, hend; cond_resched(); - if (unlikely(khugepaged_test_exit(mm))) { + if (unlikely(hpage_collapse_test_exit(mm))) { progress++; break; } @@ -2148,7 +2149,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, bool mmap_locked = true; cond_resched(); - if (unlikely(khugepaged_test_exit(mm))) + if (unlikely(hpage_collapse_test_exit(mm))) goto breakouterloop; /* reset gfp flags since sysfs settings might change */ @@ -2168,9 +2169,10 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, cc); fput(file); } else { - *result = khugepaged_scan_pmd(mm, vma, - khugepaged_scan.address, - &mmap_locked, cc); + *result = hpage_collapse_scan_pmd(mm, vma, + khugepaged_scan.address, + &mmap_locked, + cc); } if (*result == SCAN_SUCCEED) ++khugepaged_pages_collapsed; @@ -2200,7 +2202,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, * Release the current mm_slot if this mm is about to die, or * if we scanned all vmas of this mm. 
*/ - if (khugepaged_test_exit(mm) || !vma) { + if (hpage_collapse_test_exit(mm) || !vma) { /* * Make sure that if mm_users is reaching zero while * khugepaged runs here, khugepaged_exit will find @@ -2482,7 +2484,8 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, } mmap_assert_locked(mm); memset(cc.node_load, 0, sizeof(cc.node_load)); - result = khugepaged_scan_pmd(mm, vma, addr, &mmap_locked, &cc); + result = hpage_collapse_scan_pmd(mm, vma, addr, &mmap_locked, + &cc); if (!mmap_locked) *prev = NULL; /* Tell caller we dropped mmap_lock */ From patchwork Sat Jun 4 00:40:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zach O'Keefe X-Patchwork-Id: 12869468 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5AF0DC43334 for ; Sat, 4 Jun 2022 00:40:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DE29B8D000E; Fri, 3 Jun 2022 20:40:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D6A9C8D000C; Fri, 3 Jun 2022 20:40:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C35708D000E; Fri, 3 Jun 2022 20:40:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id AE07A8D000C for ; Fri, 3 Jun 2022 20:40:33 -0400 (EDT) Received: from smtpin04.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 8E0CF35141 for ; Sat, 4 Jun 2022 00:40:33 +0000 (UTC) X-FDA: 79538697546.04.3DBB290 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) by imf01.hostedemail.com (Postfix) with ESMTP id 9F15840005 for ; Sat, 4 Jun 2022 00:40:24 +0000 (UTC) Received: by mail-pj1-f73.google.com with SMTP id mh12-20020a17090b4acc00b001e32eb45751so7999284pjb.9 for ; Fri, 03 Jun 2022 17:40:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=IEIcsbesdwgjyLSnjhy6rJjaQh02HrF984U0ropJC2o=; b=LYN/ksS+4YaGhN39iKQ36tutvy8UlGcAXYH/lMRBLudOX79Y7melvYm0cwj4wLEYGR vRB3TOLIjywBB14bfHdTT+f88T/hSY71wmLK1xdWi+QK2CpBgHK+yNJ3iHILvdsHLxjv YiyCJNvhIZWWeaEsdN8QN3rmbcDDRKDSkwSMXIOOyeSQrnZ1xZmT14pM7OVRgONOn7Sm bWP8JTlmLG8ffn5QXh2pqheTFlKDDK1v+QEcZsLN8NZ8Tj+nBJXsLomqN6rWswsnZfB1 Yf1b+QOqY7hvlUe/LAMjniyqljrbgH7m+vT2MULnVFAXGZLIV6mWrjLI+OhTff0tg/mz eqWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=IEIcsbesdwgjyLSnjhy6rJjaQh02HrF984U0ropJC2o=; b=6lGOBxz8Nm2TnjngHEZJ58ELQRzYPzUKILGCI1U/TQQCh6jOBA6YBSogVKKb9hJpxN DqPh9bF/RLyvqAyqAoP8alNMBXHBEvdmvf54tSGbY1b0mPPwD6p24QShKTU+LD5oEkps i64sznvwOLWVK/Zf1/QCN6O3tFAs5sw7IhtlTegtUCRGRb4pzIhLRPkFjY2VKheVTyKj p5GwNvexVKBv+4PjS/EUTzVk5JUOxVVNnV6ayzFeFZKKIhJr0jUtpNND4d1FrsJfAZ9o ssCRQuW97HjqkmmkWpTZszmcuT7r8bCHJJe9gDrPyiV4w3AfV5ey98LX3mzZv+nr2fJT HJ6Q== X-Gm-Message-State: AOAM531/vWu4Uv0tS+dV26JWZih9vciv9hid3NwfKBARRUMWFSeYYAlJ HEGpS8FvgwiPPNFCuiSEi97VGWPjBKpm X-Google-Smtp-Source: ABdhPJxYaAFfAc4d82J0Wwd0FnaP+CDLLMowH6hKLwkrA7fjGudj2bsg/b++dxaiApAADpMHGpD4UIVZQEhT X-Received: from 
zokeefe3.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:1b6]) (user=zokeefe job=sendgmr) by 2002:a17:902:cec2:b0:163:e44a:c678 with SMTP id d2-20020a170902cec200b00163e44ac678mr12381407plg.137.1654303232178; Fri, 03 Jun 2022 17:40:32 -0700 (PDT) Date: Fri, 3 Jun 2022 17:40:00 -0700 In-Reply-To: <20220604004004.954674-1-zokeefe@google.com> Message-Id: <20220604004004.954674-12-zokeefe@google.com> Mime-Version: 1.0 References: <20220604004004.954674-1-zokeefe@google.com> X-Mailer: git-send-email 2.36.1.255.ge46751e96f-goog Subject: [PATCH v6 11/15] mm/madvise: add MADV_COLLAPSE to process_madvise() From: "Zach O'Keefe" To: Alex Shi , David Hildenbrand , David Rientjes , Matthew Wilcox , Michal Hocko , Pasha Tatashin , Peter Xu , Rongwei Wang , SeongJae Park , Song Liu , Vlastimil Babka , Yang Shi , Zi Yan , linux-mm@kvack.org Cc: Andrea Arcangeli , Andrew Morton , Arnd Bergmann , Axel Rasmussen , Chris Kennelly , Chris Zankel , Helge Deller , Hugh Dickins , Ivan Kokshaysky , "James E.J. Bottomley" , Jens Axboe , "Kirill A. Shutemov" , Matt Turner , Max Filippov , Miaohe Lin , Minchan Kim , Patrick Xia , Pavel Begunkov , Thomas Bogendoerfer , "Zach O'Keefe" X-Stat-Signature: hoogeaj9aztehzxbao9xbymhwfa1x8bu X-Rspam-User: Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b="LYN/ksS+"; spf=pass (imf01.hostedemail.com: domain of 3AKqaYgcKCNkUJF99A9BJJBG9.7JHGDIPS-HHFQ57F.JMB@flex--zokeefe.bounces.google.com designates 209.85.216.73 as permitted sender) smtp.mailfrom=3AKqaYgcKCNkUJF99A9BJJBG9.7JHGDIPS-HHFQ57F.JMB@flex--zokeefe.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9F15840005 X-HE-Tag: 1654303224-286291 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Allow MADV_COLLAPSE behavior for process_madvise(2) if caller has CAP_SYS_ADMIN or is requesting collapse of it's own memory. This is useful for the development of userspace agents that seek to optimize THP utilization system-wide by using userspace signals to prioritize what memory is most deserving of being THP-backed. 
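As a rough illustration (not part of this patch), a userspace agent
could drive a collapse in a target process along the following lines.
This is a minimal sketch assuming a kernel with this series applied,
where MADV_COLLAPSE is advice value 25; raw syscall(2) invocations are
used since libc may not yet provide wrappers, and error handling is
trimmed:

    #include <sys/syscall.h>
    #include <sys/uio.h>
    #include <unistd.h>

    #ifndef MADV_COLLAPSE
    #define MADV_COLLAPSE 25    /* assumed value, from this series */
    #endif

    /* Ask the kernel to collapse [addr, addr + len) in the target pid. */
    static int collapse_remote(pid_t pid, void *addr, size_t len)
    {
        struct iovec iov = { .iov_base = addr, .iov_len = len };
        int pidfd = syscall(SYS_pidfd_open, pid, 0);
        ssize_t ret;

        if (pidfd < 0)
            return -1;
        /* Needs CAP_SYS_ADMIN unless the target is the caller itself. */
        ret = syscall(SYS_process_madvise, pidfd, &iov, 1,
                      MADV_COLLAPSE, 0);
        close(pidfd);
        return ret < 0 ? -1 : 0;
    }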
Signed-off-by: Zach O'Keefe
---
 mm/madvise.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index eccac2620226..b19e2f4b924c 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1171,13 +1171,15 @@ madvise_behavior_valid(int behavior)
 }

 static bool
-process_madvise_behavior_valid(int behavior)
+process_madvise_behavior_valid(int behavior, struct task_struct *task)
 {
     switch (behavior) {
     case MADV_COLD:
     case MADV_PAGEOUT:
     case MADV_WILLNEED:
         return true;
+    case MADV_COLLAPSE:
+        return task == current || capable(CAP_SYS_ADMIN);
     default:
         return false;
     }
@@ -1455,7 +1457,7 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
         goto free_iov;
     }

-    if (!process_madvise_behavior_valid(behavior)) {
+    if (!process_madvise_behavior_valid(behavior, task)) {
         ret = -EINVAL;
         goto release_task;
     }

From patchwork Sat Jun 4 00:40:01 2022
Subject: [PATCH v6 12/15] selftests/vm: modularize collapse selftests
From: "Zach O'Keefe"
Date: Fri, 3 Jun 2022 17:40:01 -0700
Message-Id: <20220604004004.954674-13-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Modularize the collapse action of the khugepaged collapse selftests by
introducing a struct collapse_context, which specifies how to collapse
a given memory range and the expected semantics of the collapse.  This
can be reused later to test other collapse contexts.

Additionally, all tests have logic that checks whether a collapse
occurred by reading /proc/self/smaps, and reports if the outcome
differs from what was expected.  Move this logic into the per-context
->collapse() hook instead of repeating it in every test; a sketch of
the resulting pattern follows.
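To illustrate the shape of the refactor (simplified from the actual
patch below; alloc_mapping(), fill_memory(), validate_memory() and
hpage_pmd_size are helpers from this selftest):

    /* How to collapse a range, and what the collapse should achieve. */
    struct collapse_context {
        void (*collapse)(const char *msg, char *p, bool expect);
        bool enforce_pte_scan_limits;
    };

    /* Each test only states its expectation; the hook does the check. */
    static void collapse_full(struct collapse_context *c)
    {
        void *p = alloc_mapping();

        fill_memory(p, 0, hpage_pmd_size);
        c->collapse("Collapse fully populated PTE table", p, true);
        validate_memory(p, 0, hpage_pmd_size);
        munmap(p, hpage_pmd_size);
    }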
Signed-off-by: Zach O'Keefe
---
 tools/testing/selftests/vm/khugepaged.c | 315 +++++++++++-------------
 1 file changed, 142 insertions(+), 173 deletions(-)

diff --git a/tools/testing/selftests/vm/khugepaged.c b/tools/testing/selftests/vm/khugepaged.c
index 155120b67a16..24a8715363be 100644
--- a/tools/testing/selftests/vm/khugepaged.c
+++ b/tools/testing/selftests/vm/khugepaged.c
@@ -23,6 +23,11 @@ static int hpage_pmd_nr;
 #define THP_SYSFS "/sys/kernel/mm/transparent_hugepage/"
 #define PID_SMAPS "/proc/self/smaps"

+struct collapse_context {
+    void (*collapse)(const char *msg, char *p, bool expect);
+    bool enforce_pte_scan_limits;
+};
+
 enum thp_enabled {
     THP_ALWAYS,
     THP_MADVISE,
@@ -469,38 +474,6 @@ static void validate_memory(int *p, unsigned long start, unsigned long end)
     }
 }

-#define TICK 500000
-static bool wait_for_scan(const char *msg, char *p)
-{
-    int full_scans;
-    int timeout = 6; /* 3 seconds */
-
-    /* Sanity check */
-    if (check_huge(p)) {
-        printf("Unexpected huge page\n");
-        exit(EXIT_FAILURE);
-    }
-
-    madvise(p, hpage_pmd_size, MADV_HUGEPAGE);
-
-    /* Wait until the second full_scan completed */
-    full_scans = read_num("khugepaged/full_scans") + 2;
-
-    printf("%s...", msg);
-    while (timeout--) {
-        if (check_huge(p))
-            break;
-        if (read_num("khugepaged/full_scans") >= full_scans)
-            break;
-        printf(".");
-        usleep(TICK);
-    }
-
-    madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE);
-
-    return timeout == -1;
-}
-
 static void alloc_at_fault(void)
 {
     struct settings settings = default_settings;
@@ -528,53 +501,39 @@ static void alloc_at_fault(void)
     munmap(p, hpage_pmd_size);
 }

-static void collapse_full(void)
+static void collapse_full(struct collapse_context *c)
 {
     void *p;

     p = alloc_mapping();
     fill_memory(p, 0, hpage_pmd_size);
-    if (wait_for_scan("Collapse fully populated PTE table", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
+    c->collapse("Collapse fully populated PTE table", p, true);
     validate_memory(p, 0, hpage_pmd_size);
     munmap(p, hpage_pmd_size);
 }

-static void collapse_empty(void)
+static void collapse_empty(struct collapse_context *c)
 {
     void *p;

     p = alloc_mapping();
-    if (wait_for_scan("Do not collapse empty PTE table", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        fail("Fail");
-    else
-        success("OK");
+    c->collapse("Do not collapse empty PTE table", p, false);
     munmap(p, hpage_pmd_size);
 }

-static void collapse_single_pte_entry(void)
+static void collapse_single_pte_entry(struct collapse_context *c)
 {
     void *p;

     p = alloc_mapping();
     fill_memory(p, 0, page_size);
-    if (wait_for_scan("Collapse PTE table with single PTE entry present", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
+    c->collapse("Collapse PTE table with single PTE entry present", p,
+                true);
     validate_memory(p, 0, page_size);
     munmap(p, hpage_pmd_size);
 }

-static void collapse_max_ptes_none(void)
+static void collapse_max_ptes_none(struct collapse_context *c)
 {
     int max_ptes_none = hpage_pmd_nr / 2;
     struct settings settings = default_settings;
@@ -586,28 +545,22 @@ static void collapse_max_ptes_none(void)

     p = alloc_mapping();
     fill_memory(p, 0, (hpage_pmd_nr - max_ptes_none - 1) * page_size);
-    if (wait_for_scan("Do not collapse with max_ptes_none exceeded", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        fail("Fail");
-    else
-        success("OK");
+    c->collapse("Maybe collapse with max_ptes_none exceeded", p,
+                !c->enforce_pte_scan_limits);
     validate_memory(p, 0, (hpage_pmd_nr - max_ptes_none - 1) * page_size);

-    fill_memory(p, 0, (hpage_pmd_nr - max_ptes_none) * page_size);
-    if (wait_for_scan("Collapse with max_ptes_none PTEs empty", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
-    validate_memory(p, 0, (hpage_pmd_nr - max_ptes_none) * page_size);
+    if (c->enforce_pte_scan_limits) {
+        fill_memory(p, 0, (hpage_pmd_nr - max_ptes_none) * page_size);
+        c->collapse("Collapse with max_ptes_none PTEs empty", p, true);
+        validate_memory(p, 0,
+                        (hpage_pmd_nr - max_ptes_none) * page_size);
+    }

     munmap(p, hpage_pmd_size);
     write_settings(&default_settings);
 }

-static void collapse_swapin_single_pte(void)
+static void collapse_swapin_single_pte(struct collapse_context *c)
 {
     void *p;
     p = alloc_mapping();
@@ -625,18 +578,13 @@ static void collapse_swapin_single_pte(void)
         goto out;
     }

-    if (wait_for_scan("Collapse with swapping in single PTE entry", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
+    c->collapse("Collapse with swapping in single PTE entry", p, true);
     validate_memory(p, 0, hpage_pmd_size);
 out:
     munmap(p, hpage_pmd_size);
 }

-static void collapse_max_ptes_swap(void)
+static void collapse_max_ptes_swap(struct collapse_context *c)
 {
     int max_ptes_swap = read_num("khugepaged/max_ptes_swap");
     void *p;
@@ -656,39 +604,34 @@ static void collapse_max_ptes_swap(void)
         goto out;
     }

-    if (wait_for_scan("Do not collapse with max_ptes_swap exceeded", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        fail("Fail");
-    else
-        success("OK");
+    c->collapse("Maybe collapse with max_ptes_swap exceeded", p,
+                !c->enforce_pte_scan_limits);
     validate_memory(p, 0, hpage_pmd_size);

-    fill_memory(p, 0, hpage_pmd_size);
-    printf("Swapout %d of %d pages...", max_ptes_swap, hpage_pmd_nr);
-    if (madvise(p, max_ptes_swap * page_size, MADV_PAGEOUT)) {
-        perror("madvise(MADV_PAGEOUT)");
-        exit(EXIT_FAILURE);
-    }
-    if (check_swap(p, max_ptes_swap * page_size)) {
-        success("OK");
-    } else {
-        fail("Fail");
-        goto out;
-    }
+    if (c->enforce_pte_scan_limits) {
+        fill_memory(p, 0, hpage_pmd_size);
+        printf("Swapout %d of %d pages...", max_ptes_swap,
+               hpage_pmd_nr);
+        if (madvise(p, max_ptes_swap * page_size, MADV_PAGEOUT)) {
+            perror("madvise(MADV_PAGEOUT)");
+            exit(EXIT_FAILURE);
+        }
+        if (check_swap(p, max_ptes_swap * page_size)) {
+            success("OK");
+        } else {
+            fail("Fail");
+            goto out;
+        }

-    if (wait_for_scan("Collapse with max_ptes_swap pages swapped out", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
-    validate_memory(p, 0, hpage_pmd_size);
+        c->collapse("Collapse with max_ptes_swap pages swapped out", p,
+                    true);
+        validate_memory(p, 0, hpage_pmd_size);
+    }
 out:
     munmap(p, hpage_pmd_size);
 }

-static void collapse_single_pte_entry_compound(void)
+static void collapse_single_pte_entry_compound(struct collapse_context *c)
 {
     void *p;

@@ -710,17 +653,13 @@ static void collapse_single_pte_entry_compound(void)
     else
         fail("Fail");

-    if (wait_for_scan("Collapse PTE table with single PTE mapping compound page", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
+    c->collapse("Collapse PTE table with single PTE mapping compound page",
+                p, true);
     validate_memory(p, 0, page_size);
     munmap(p, hpage_pmd_size);
 }

-static void collapse_full_of_compound(void)
+static void collapse_full_of_compound(struct collapse_context *c)
 {
     void *p;

@@ -742,17 +681,12 @@ static void collapse_full_of_compound(void)
     else
         fail("Fail");

-    if (wait_for_scan("Collapse PTE table full of compound pages", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
+    c->collapse("Collapse PTE table full of compound pages", p, true);
     validate_memory(p, 0, hpage_pmd_size);
     munmap(p, hpage_pmd_size);
 }

-static void collapse_compound_extreme(void)
+static void collapse_compound_extreme(struct collapse_context *c)
 {
     void *p;
     int i;
@@ -798,18 +732,14 @@ static void collapse_compound_extreme(void)
     else
         fail("Fail");

-    if (wait_for_scan("Collapse PTE table full of different compound pages", p))
-        fail("Timeout");
-    else if (check_huge(p))
-        success("OK");
-    else
-        fail("Fail");
+    c->collapse("Collapse PTE table full of different compound pages", p,
+                true);

     validate_memory(p, 0, hpage_pmd_size);
     munmap(p, hpage_pmd_size);
 }

-static void collapse_fork(void)
+static void collapse_fork(struct collapse_context *c)
 {
     int wstatus;
     void *p;
@@ -835,13 +765,8 @@ static void collapse_fork(void)
             fail("Fail");

         fill_memory(p, page_size, 2 * page_size);
-
-        if (wait_for_scan("Collapse PTE table with single page shared with parent process", p))
-            fail("Timeout");
-        else if (check_huge(p))
-            success("OK");
-        else
-            fail("Fail");
+        c->collapse("Collapse PTE table with single page shared with parent process",
+                    p, true);

         validate_memory(p, 0, page_size);
         munmap(p, hpage_pmd_size);
@@ -860,7 +785,7 @@ static void collapse_fork(void)
     munmap(p, hpage_pmd_size);
 }

-static void collapse_fork_compound(void)
+static void collapse_fork_compound(struct collapse_context *c)
 {
     int wstatus;
     void *p;
@@ -896,14 +821,10 @@ static void collapse_fork_compound(void)
         fill_memory(p, 0, page_size);

         write_num("khugepaged/max_ptes_shared", hpage_pmd_nr - 1);
-        if (wait_for_scan("Collapse PTE table full of compound pages in child", p))
-            fail("Timeout");
-        else if (check_huge(p))
-            success("OK");
-        else
-            fail("Fail");
+        c->collapse("Collapse PTE table full of compound pages in child",
+                    p, true);
         write_num("khugepaged/max_ptes_shared",
-                  default_settings.khugepaged.max_ptes_shared);
+                  default_settings.khugepaged.max_ptes_shared);

         validate_memory(p, 0, hpage_pmd_size);
         munmap(p, hpage_pmd_size);
@@ -922,7 +843,7 @@ static void collapse_fork_compound(void)
     munmap(p, hpage_pmd_size);
 }

-static void collapse_max_ptes_shared()
+static void collapse_max_ptes_shared(struct collapse_context *c)
 {
     int max_ptes_shared = read_num("khugepaged/max_ptes_shared");
     int wstatus;
@@ -957,28 +878,22 @@ static void collapse_max_ptes_shared()
         else
             fail("Fail");

-        if (wait_for_scan("Do not collapse with max_ptes_shared exceeded", p))
-            fail("Timeout");
-        else if (!check_huge(p))
-            success("OK");
-        else
-            fail("Fail");
-
-        printf("Trigger CoW on page %d of %d...",
-               hpage_pmd_nr - max_ptes_shared, hpage_pmd_nr);
-        fill_memory(p, 0, (hpage_pmd_nr - max_ptes_shared) * page_size);
-        if (!check_huge(p))
-            success("OK");
-        else
-            fail("Fail");
-
-
-        if (wait_for_scan("Collapse with max_ptes_shared PTEs shared", p))
-            fail("Timeout");
-        else if (check_huge(p))
-            success("OK");
-        else
-            fail("Fail");
+        c->collapse("Maybe collapse with max_ptes_shared exceeded", p,
+                    !c->enforce_pte_scan_limits);
+
+        if (c->enforce_pte_scan_limits) {
+            printf("Trigger CoW on page %d of %d...",
+                   hpage_pmd_nr - max_ptes_shared, hpage_pmd_nr);
+            fill_memory(p, 0, (hpage_pmd_nr - max_ptes_shared) *
+                        page_size);
+            if (!check_huge(p))
+                success("OK");
+            else
+                fail("Fail");
+
+            c->collapse("Collapse with max_ptes_shared PTEs shared",
+                        p, true);
+        }

         validate_memory(p, 0, hpage_pmd_size);
         munmap(p, hpage_pmd_size);
@@ -997,8 +912,57 @@ static void collapse_max_ptes_shared()
     munmap(p, hpage_pmd_size);
 }

+#define TICK 500000
+static bool wait_for_scan(const char *msg, char *p)
+{
+    int full_scans;
+    int timeout = 6; /* 3 seconds */
+
+    /* Sanity check */
+    if (check_huge(p)) {
+        printf("Unexpected huge page\n");
+        exit(EXIT_FAILURE);
+    }
+
+    madvise(p, hpage_pmd_size, MADV_HUGEPAGE);
+
+    /* Wait until the second full_scan completed */
+    full_scans = read_num("khugepaged/full_scans") + 2;
+
+    printf("%s...", msg);
+    while (timeout--) {
+        if (check_huge(p))
+            break;
+        if (read_num("khugepaged/full_scans") >= full_scans)
+            break;
+        printf(".");
+        usleep(TICK);
+    }
+
+    madvise(p, hpage_pmd_size, MADV_NOHUGEPAGE);
+
+    return timeout == -1;
+}
+
+static void khugepaged_collapse(const char *msg, char *p, bool expect)
+{
+    if (wait_for_scan(msg, p)) {
+        if (expect)
+            fail("Timeout");
+        else
+            success("OK");
+        return;
+    } else if (check_huge(p) == expect) {
+        success("OK");
+    } else {
+        fail("Fail");
+    }
+}
+
 int main(void)
 {
+    struct collapse_context c;
+
     setbuf(stdout, NULL);

     page_size = getpagesize();
@@ -1014,18 +978,23 @@ int main(void)
     adjust_settings();

     alloc_at_fault();
-    collapse_full();
-    collapse_empty();
-    collapse_single_pte_entry();
-    collapse_max_ptes_none();
-    collapse_swapin_single_pte();
-    collapse_max_ptes_swap();
-    collapse_single_pte_entry_compound();
-    collapse_full_of_compound();
-    collapse_compound_extreme();
-    collapse_fork();
-    collapse_fork_compound();
-    collapse_max_ptes_shared();
+
+    printf("\n*** Testing context: khugepaged ***\n");
+    c.collapse = &khugepaged_collapse;
+    c.enforce_pte_scan_limits = true;
+
+    collapse_full(&c);
+    collapse_empty(&c);
+    collapse_single_pte_entry(&c);
+    collapse_max_ptes_none(&c);
+    collapse_swapin_single_pte(&c);
+    collapse_max_ptes_swap(&c);
+    collapse_single_pte_entry_compound(&c);
+    collapse_full_of_compound(&c);
+    collapse_compound_extreme(&c);
+    collapse_fork(&c);
+    collapse_fork_compound(&c);
+    collapse_max_ptes_shared(&c);

     restore_settings(0);
 }

From patchwork Sat Jun 4 00:40:02 2022
Subject: [PATCH v6 13/15] selftests/vm: add MADV_COLLAPSE collapse context to selftests
From: "Zach O'Keefe"
Date: Fri, 3 Jun 2022 17:40:02 -0700
Message-Id: <20220604004004.954674-14-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Add a madvise collapse context to the hugepage collapse selftests.
This context is tested with /sys/kernel/mm/transparent_hugepage/enabled
set to "never" in order to avoid unwanted interaction with khugepaged
during testing.
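For context (an illustrative sketch, not part of the patch): the point
being exercised is that MADV_COLLAPSE is a synchronous request that the
madvise_collapse() hook below expects to succeed even under "never".
Assuming MADV_COLLAPSE is advice value 25 from this series and a 2M PMD
size, the behavior it tests looks roughly like:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #ifndef MADV_COLLAPSE
    #define MADV_COLLAPSE 25    /* assumed value, from this series */
    #endif

    int main(void)
    {
        /* echo never > /sys/kernel/mm/transparent_hugepage/enabled */
        size_t huge = 2UL << 20;    /* PMD size on x86-64 */
        char *buf = mmap(NULL, 2 * huge, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        /* Round up to a PMD-aligned address within the mapping. */
        char *p = (char *)(((uintptr_t)buf + huge - 1) & ~(huge - 1));

        if (buf == MAP_FAILED)
            return 1;
        memset(p, 1, huge);         /* populate the PTE table */
        /* Collapse synchronously; no khugepaged involvement needed. */
        if (madvise(p, huge, MADV_COLLAPSE))
            perror("madvise(MADV_COLLAPSE)");
        return 0;
    }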
Signed-off-by: Zach O'Keefe
---
 tools/testing/selftests/vm/khugepaged.c | 55 +++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/tools/testing/selftests/vm/khugepaged.c b/tools/testing/selftests/vm/khugepaged.c
index 24a8715363be..5207930b34a4 100644
--- a/tools/testing/selftests/vm/khugepaged.c
+++ b/tools/testing/selftests/vm/khugepaged.c
@@ -14,6 +14,9 @@
 #ifndef MADV_PAGEOUT
 #define MADV_PAGEOUT 21
 #endif
+#ifndef MADV_COLLAPSE
+#define MADV_COLLAPSE 25
+#endif

 #define BASE_ADDR ((void *)(1UL << 30))
 static unsigned long hpage_pmd_size;
@@ -108,6 +111,7 @@ static struct settings default_settings = {
 };

 static struct settings saved_settings;
+static struct settings current_settings;
 static bool skip_settings_restore;

 static int exit_status;
@@ -282,6 +286,8 @@ static void write_settings(struct settings *settings)
     write_num("khugepaged/max_ptes_swap", khugepaged->max_ptes_swap);
     write_num("khugepaged/max_ptes_shared", khugepaged->max_ptes_shared);
     write_num("khugepaged/pages_to_scan", khugepaged->pages_to_scan);
+
+    current_settings = *settings;
 }

 static void restore_settings(int sig)
@@ -912,6 +918,38 @@ static void collapse_max_ptes_shared(struct collapse_context *c)
     munmap(p, hpage_pmd_size);
 }

+static void madvise_collapse(const char *msg, char *p, bool expect)
+{
+    int ret;
+    struct settings old_settings = current_settings;
+    struct settings settings = old_settings;
+
+    printf("%s...", msg);
+    /* Sanity check */
+    if (check_huge(p)) {
+        printf("Unexpected huge page\n");
+        exit(EXIT_FAILURE);
+    }
+
+    /*
+     * Prevent khugepaged interference and tests that MADV_COLLAPSE
+     * ignores /sys/kernel/mm/transparent_hugepage/enabled
+     */
+    settings.thp_enabled = THP_NEVER;
+    write_settings(&settings);
+
+    madvise(p, hpage_pmd_size, MADV_HUGEPAGE);
+    ret = madvise(p, hpage_pmd_size, MADV_COLLAPSE);
+    if (((bool)ret) == expect)
+        fail("Fail: Bad return value");
+    else if (check_huge(p) != expect)
+        fail("Fail: check_huge()");
+    else
+        success("OK");
+
+    write_settings(&old_settings);
+}
+
 #define TICK 500000
 static bool wait_for_scan(const char *msg, char *p)
 {
@@ -996,5 +1034,22 @@ int main(void)
     collapse_fork_compound(&c);
     collapse_max_ptes_shared(&c);

+    printf("\n*** Testing context: madvise ***\n");
+    c.collapse = &madvise_collapse;
+    c.enforce_pte_scan_limits = false;
+
+    collapse_full(&c);
+    collapse_empty(&c);
+    collapse_single_pte_entry(&c);
+    collapse_max_ptes_none(&c);
+    collapse_swapin_single_pte(&c);
+    collapse_max_ptes_swap(&c);
+    collapse_single_pte_entry_compound(&c);
+    collapse_full_of_compound(&c);
+    collapse_compound_extreme(&c);
+    collapse_fork(&c);
+    collapse_fork_compound(&c);
+    collapse_max_ptes_shared(&c);
+
     restore_settings(0);
 }

From patchwork Sat Jun 4 00:40:03 2022
Subject: [PATCH v6 14/15] selftests/vm: add selftest to verify recollapse of THPs
From: "Zach O'Keefe"
Date: Fri, 3 Jun 2022 17:40:03 -0700
Message-Id: <20220604004004.954674-15-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Add a selftest, specific to the madvise collapse context, that checks
that MADV_COLLAPSE is "successful" when a hugepage-aligned/sized region
is already pmd-mapped.

This test also verifies that MADV_COLLAPSE can collapse memory into
THPs even in "madvise" THP mode when the memory isn't marked
VM_HUGEPAGE.

Signed-off-by: Zach O'Keefe
---
 tools/testing/selftests/vm/khugepaged.c | 31 +++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tools/testing/selftests/vm/khugepaged.c b/tools/testing/selftests/vm/khugepaged.c
index 5207930b34a4..eeea84b0cd35 100644
--- a/tools/testing/selftests/vm/khugepaged.c
+++ b/tools/testing/selftests/vm/khugepaged.c
@@ -918,6 +918,36 @@ static void collapse_max_ptes_shared(struct collapse_context *c)
     munmap(p, hpage_pmd_size);
 }

+static void madvise_collapse_existing_thps(void)
+{
+    void *p;
+    int err;
+
+    p = alloc_mapping();
+    fill_memory(p, 0, hpage_pmd_size);
+
+    printf("Collapse fully populated PTE table...");
+    /*
+     * Note that we don't set MADV_HUGEPAGE here, which
+     * also tests that VM_HUGEPAGE isn't required for
+     * MADV_COLLAPSE in "madvise" mode.
+     */
+    err = madvise(p, hpage_pmd_size, MADV_COLLAPSE);
+    if (err == 0 && check_huge(p)) {
+        success("OK");
+        printf("Re-collapse PMD-mapped hugepage");
+        err = madvise(p, hpage_pmd_size, MADV_COLLAPSE);
+        if (err == 0 && check_huge(p))
+            success("OK");
+        else
+            fail("Fail");
+    } else {
+        fail("Fail");
+    }
+    validate_memory(p, 0, hpage_pmd_size);
+    munmap(p, hpage_pmd_size);
+}
+
 static void madvise_collapse(const char *msg, char *p, bool expect)
 {
     int ret;
@@ -1050,6 +1080,7 @@ int main(void)
     collapse_fork(&c);
     collapse_fork_compound(&c);
     collapse_max_ptes_shared(&c);
+    madvise_collapse_existing_thps();

     restore_settings(0);
 }

From patchwork Sat Jun 4 00:40:04 2022
Subject: [PATCH v6 15/15] tools headers uapi: add MADV_COLLAPSE madvise mode to tools
From: "Zach O'Keefe"
Date: Fri, 3 Jun 2022 17:40:04 -0700
Message-Id: <20220604004004.954674-16-zokeefe@google.com>
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>

Tools are now able to translate MADV_COLLAPSE advice to a
human-readable string:

$ tools/perf/trace/beauty/madvise_behavior.sh
static const char *madvise_advices[] = {
    [0] = "NORMAL",
    [1] = "RANDOM",
    [2] = "SEQUENTIAL",
    [3] = "WILLNEED",
    [4] = "DONTNEED",
    [8] = "FREE",
    [9] = "REMOVE",
    [10] = "DONTFORK",
    [11] = "DOFORK",
    [12] = "MERGEABLE",
    [13] = "UNMERGEABLE",
    [14] = "HUGEPAGE",
    [15] = "NOHUGEPAGE",
    [16] = "DONTDUMP",
    [17] = "DODUMP",
    [18] = "WIPEONFORK",
    [19] = "KEEPONFORK",
    [20] = "COLD",
    [21] = "PAGEOUT",
    [22] = "POPULATE_READ",
    [23] = "POPULATE_WRITE",
    [24] = "DONTNEED_LOCKED",
    [25] = "COLLAPSE",
    [100] = "HWPOISON",
    [101] = "SOFT_OFFLINE",
};

Signed-off-by: Zach O'Keefe
---
 tools/include/uapi/asm-generic/mman-common.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 6c1aa92a92e4..6ce1f1ceb432 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -77,6 +77,8 @@
 #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */

+#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */
+
 /* compatibility flags */
 #define MAP_FILE 0