From patchwork Wed May 4 21:44:30 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12838667
Date: Wed, 4 May 2022 14:44:30 -0700
In-Reply-To: <20220504214437.2850685-1-zokeefe@google.com>
Message-Id: <20220504214437.2850685-7-zokeefe@google.com>
References: <20220504214437.2850685-1-zokeefe@google.com>
X-Mailer: git-send-email 2.36.0.464.gb9c8b46e94-goog
Subject: [PATCH v5 06/13] mm/khugepaged: add flag to ignore khugepaged_max_ptes_*
From: "Zach O'Keefe" <zokeefe@google.com>
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
    Michal Hocko, Pasha Tatashin, Peter Xu, SeongJae Park, Song Liu,
    Vlastimil Babka, Yang Shi, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
    Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
    Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
    "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
    Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer,
    "Zach O'Keefe"

Add a new flag, enforce_pte_scan_limits, to struct collapse_control,
allowing a collapse context to ignore the sysfs-controlled knobs
khugepaged_max_ptes_[none|swap|shared]. Set this flag in the khugepaged
collapse context to preserve existing khugepaged behavior.

This flag will be used (unset) when the madvise collapse context is
introduced later in the series: there, the user presumably has reason to
believe the collapse will be beneficial, and khugepaged's heuristics
shouldn't tell the user they are wrong.
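For readers jumping into the series here: every limit check in the diff
below follows the same shape, ORing the threshold test with the new flag
where staying under the limit lets the scan continue, or ANDing with the
flag where exceeding the limit aborts. A minimal userspace sketch of that
gating pattern (illustrative only: the constant value, the helper name,
and main() are not kernel code):

#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-ins for the kernel's type and sysfs knob. */
struct collapse_control {
	bool enforce_pte_scan_limits;	/* respect max_ptes_* knobs */
};

static const int khugepaged_max_ptes_none = 511;	/* typical default */

/* Mirrors the "++none_or_zero <= max || !cc->enforce..." shape. */
static bool none_pte_ok(struct collapse_control *cc, int *none_or_zero)
{
	return ++(*none_or_zero) <= khugepaged_max_ptes_none ||
	       !cc->enforce_pte_scan_limits;
}

int main(void)
{
	struct collapse_control khugepaged = { .enforce_pte_scan_limits = true };
	struct collapse_control madvise = { .enforce_pte_scan_limits = false };
	int a = khugepaged_max_ptes_none;	/* already at the limit */
	int b = khugepaged_max_ptes_none;

	printf("khugepaged proceeds: %d\n", none_pte_ok(&khugepaged, &a)); /* 0 */
	printf("madvise proceeds:    %d\n", none_pte_ok(&madvise, &b));    /* 1 */
	return 0;
}

The same short-circuit recurs in each hunk, with the operands arranged so
that in the "exceeding aborts" checks the cheap flag test also guards the
counter increment.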
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Acked-by: David Rientjes
---
 mm/khugepaged.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1314caed65b0..ca730aec0e3e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -87,6 +87,9 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
 #define MAX_PTE_MAPPED_THP 8
 
 struct collapse_control {
+	/* Respect khugepaged_max_ptes_[none|swap|shared] */
+	bool enforce_pte_scan_limits;
+
 	/* Num pages scanned per node */
 	int node_load[MAX_NUMNODES];
 
@@ -614,6 +617,7 @@ static bool is_refcount_suitable(struct page *page)
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
+					struct collapse_control *cc,
 					struct list_head *compound_pagelist)
 {
 	struct page *page = NULL;
@@ -627,7 +631,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		if (pte_none(pteval) || (pte_present(pteval) &&
 				is_zero_pfn(pte_pfn(pteval)))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -647,8 +652,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 		VM_BUG_ON_PAGE(!PageAnon(page), page);
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits && page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out;
@@ -1187,7 +1192,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	mmu_notifier_invalidate_range_end(&range);
 
 	spin_lock(pte_ptl);
-	result = __collapse_huge_page_isolate(vma, address, pte,
+	result = __collapse_huge_page_isolate(vma, address, pte, cc,
 					      &compound_pagelist);
 	spin_unlock(pte_ptl);
 
@@ -1275,7 +1280,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap) {
+			if (++unmapped <= khugepaged_max_ptes_swap ||
+			    !cc->enforce_pte_scan_limits) {
 				/*
 				 * Always be strict with uffd-wp
 				 * enabled swap entries.  Please see
@@ -1294,7 +1300,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -1324,8 +1331,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 			goto out_unmap;
 		}
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits &&
+		    page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out_unmap;
@@ -2051,7 +2059,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file,
 			continue;
 
 		if (xa_is_value(page)) {
-			if (++swap > khugepaged_max_ptes_swap) {
+			if (cc->enforce_pte_scan_limits &&
+			    ++swap > khugepaged_max_ptes_swap) {
 				result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				break;
@@ -2102,7 +2111,8 @@ static int khugepaged_scan_file(struct mm_struct *mm, struct file *file,
 	rcu_read_unlock();
 
 	if (result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
+		    cc->enforce_pte_scan_limits) {
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
@@ -2332,6 +2342,7 @@ static int khugepaged(void *none)
 {
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = true,
 		.last_target_node = NUMA_NO_NODE,
 		.alloc_charge_hpage = &alloc_charge_hpage,
 	};
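To make the intended usage concrete: khugepaged (in the hunk directly
above) initializes the flag to true, while the madvise collapse context
promised in the commit message would leave it false. A hedged sketch; the
madvise_cc initializer below is hypothetical and not part of this patch:

#include <stdbool.h>

/* Trimmed to the new field; other members as in mm/khugepaged.c. */
struct collapse_control {
	bool enforce_pte_scan_limits;
	/* int node_load[MAX_NUMNODES]; ... */
};

/* khugepaged: preserve existing behavior (this patch). */
static struct collapse_control khugepaged_cc = {
	.enforce_pte_scan_limits = true,
};

/* MADV_COLLAPSE (hypothetical, added later in the series): the user
 * explicitly asked for the collapse, so the max_ptes_* heuristics
 * are skipped. */
static struct collapse_control madvise_cc = {
	.enforce_pte_scan_limits = false,
};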