From patchwork Tue Dec 22 07:46:59 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liang Li
X-Patchwork-Id: 11986095
From: Liang Li
Date: Tue, 22 Dec 2020 02:46:59 -0500
To: Alexander Duyck, Mel Gorman, Andrew Morton, Andrea Arcangeli,
 Dan Williams, "Michael S. Tsirkin", David Hildenbrand, Jason Wang,
 Dave Hansen, Michal Hocko, Liang Li, Mike Kravetz, Liang Li
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org
Subject: [RFC PATCH 1/3] mm: support hugetlb free page reporting
Message-ID: <20201222074656.GA30035@open-light-1.localdomain>

Free page reporting only supports buddy pages; it can't report the
free pages reserved for hugetlbfs. On the other hand, hugetlbfs is a
good choice for a system with a huge amount of RAM, because it can
help to reduce the memory management overhead and improve system
performance.
This patch adds support for reporting huge pages on the hugetlb free
list. It can be used by the virtio_balloon driver for memory
overcommit and to pre-zero free pages to speed up memory population.

Cc: Alexander Duyck
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Dan Williams
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Michal Hocko
Cc: Andrew Morton
Cc: Alex Williamson
Cc: Michael S. Tsirkin
Cc: Jason Wang
Cc: Mike Kravetz
Cc: Liang Li
Signed-off-by: Liang Li
---
 include/linux/hugetlb.h        |   3 +
 include/linux/page_reporting.h |   5 +
 mm/hugetlb.c                   |  29 ++++
 mm/page_reporting.c            | 287 +++++++++++++++++++++++++++++++++
 mm/page_reporting.h            |  34 ++++
 5 files changed, 358 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ebca2ef02212..a72ad25501d3 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 
 struct ctl_table;
 struct user_struct;
@@ -114,6 +115,8 @@ int hugetlb_treat_movable_handler(struct ctl_table *, int, void *, size_t *,
 int hugetlb_mempolicy_sysctl_handler(struct ctl_table *, int, void *, size_t *,
 		loff_t *);
 
+bool isolate_free_huge_page(struct page *page, struct hstate *h, int nid);
+void putback_isolate_huge_page(struct hstate *h, struct page *page);
 int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
 long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
 			 struct page **, struct vm_area_struct **,
diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
index 63e1e9fbcaa2..0da3d1a6f0cc 100644
--- a/include/linux/page_reporting.h
+++ b/include/linux/page_reporting.h
@@ -7,6 +7,7 @@
 
 /* This value should always be a power of 2, see page_reporting_cycle() */
 #define PAGE_REPORTING_CAPACITY		32
+#define HUGEPAGE_REPORTING_CAPACITY	1
 
 struct page_reporting_dev_info {
 	/* function that alters pages to make them "reported" */
@@ -26,4 +27,8 @@ struct page_reporting_dev_info {
 /* Tear-down and bring-up for page reporting devices */
 void page_reporting_unregister(struct page_reporting_dev_info *prdev);
 int page_reporting_register(struct page_reporting_dev_info *prdev);
+
+/* Tear-down and bring-up for hugepage reporting devices */
+void hugepage_reporting_unregister(struct page_reporting_dev_info *prdev);
+int hugepage_reporting_register(struct page_reporting_dev_info *prdev);
 #endif /*_LINUX_PAGE_REPORTING_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cbf32d2824fd..de6ce147dfe2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include "page_reporting.h"
 #include "internal.h"
 
 int hugetlb_max_hstate __read_mostly;
@@ -1028,6 +1029,11 @@ static void enqueue_huge_page(struct hstate *h, struct page *page)
 	list_move(&page->lru, &h->hugepage_freelists[nid]);
 	h->free_huge_pages++;
 	h->free_huge_pages_node[nid]++;
+	if (hugepage_reported(page)) {
+		__ClearPageReported(page);
+		pr_info("%s, free_huge_pages=%ld\n", __func__, h->free_huge_pages);
+	}
+	hugepage_reporting_notify_free(h->order);
 }
 
 static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -5531,6 +5537,29 @@ follow_huge_pgd(struct mm_struct *mm, unsigned long address, pgd_t *pgd, int fla
 	return pte_page(*(pte_t *)pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
 }
 
+bool isolate_free_huge_page(struct page *page, struct hstate *h, int nid)
+{
+	bool ret = true;
+
+	VM_BUG_ON_PAGE(!PageHead(page), page);
+
+	list_move(&page->lru, &h->hugepage_activelist);
+	set_page_refcounted(page);
+	h->free_huge_pages--;
+	h->free_huge_pages_node[nid]--;
+
+	return ret;
+}
+
+void putback_isolate_huge_page(struct hstate *h, struct page *page)
+{
+	int nid = page_to_nid(page);
+	pr_info("%s, free_huge_pages=%ld\n", __func__, h->free_huge_pages);
+	list_move(&page->lru, &h->hugepage_freelists[nid]);
+	h->free_huge_pages++;
+	h->free_huge_pages_node[nid]++;
+}
+
 bool isolate_huge_page(struct page *page, struct list_head *list)
 {
 	bool ret = true;
diff --git a/mm/page_reporting.c b/mm/page_reporting.c
index 20ec3fb1afc4..15d4b5372df8 100644
--- a/mm/page_reporting.c
+++ b/mm/page_reporting.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 
 #include "page_reporting.h"
 #include "internal.h"
@@ -16,6 +17,10 @@ static struct page_reporting_dev_info __rcu *pr_dev_info __read_mostly;
 int page_report_mini_order = pageblock_order;
 unsigned long page_report_batch_size = 32 * 1024 * 1024;
 
+static struct page_reporting_dev_info __rcu *hgpr_dev_info __read_mostly;
+int hugepage_report_mini_order = pageblock_order;
+unsigned long hugepage_report_batch_size = 64 * 1024 * 1024;
+
 enum {
 	PAGE_REPORTING_IDLE = 0,
 	PAGE_REPORTING_REQUESTED,
@@ -67,6 +72,24 @@ void __page_reporting_notify(void)
 	rcu_read_unlock();
 }
 
+/* notify prdev of free hugepage reporting request */
+void __hugepage_reporting_notify(void)
+{
+	struct page_reporting_dev_info *prdev;
+
+	/*
+	 * We use RCU to protect the pr_dev_info pointer. In almost all
+	 * cases this should be present, however in the unlikely case of
+	 * a shutdown this will be NULL and we should exit.
+	 */
+	rcu_read_lock();
+	prdev = rcu_dereference(hgpr_dev_info);
+	if (likely(prdev))
+		__page_reporting_request(prdev);
+
+	rcu_read_unlock();
+}
+
 static void
 page_reporting_drain(struct page_reporting_dev_info *prdev,
 		     struct scatterlist *sgl, unsigned int nents, bool reported)
@@ -103,6 +126,213 @@ page_reporting_drain(struct page_reporting_dev_info *prdev,
 	sg_init_table(sgl, nents);
 }
 
+static void
+hugepage_reporting_drain(struct page_reporting_dev_info *prdev,
+			 struct hstate *h, struct scatterlist *sgl,
+			 unsigned int nents, bool reported)
+{
+	struct scatterlist *sg = sgl;
+
+	/*
+	 * Drain the now reported pages back into their respective
+	 * free lists/areas. We assume at least one page is populated.
+	 */
+	do {
+		struct page *page = sg_page(sg);
+
+		putback_isolate_huge_page(h, page);
+
+		/* If the pages were not reported due to error skip flagging */
+		if (!reported)
+			continue;
+
+		__SetPageReported(page);
+	} while ((sg = sg_next(sg)));
+
+	/* reinitialize scatterlist now that it is empty */
+	sg_init_table(sgl, nents);
+}
+
+/*
+ * The page reporting cycle consists of 4 stages, fill, report, drain, and
+ * idle. We will cycle through the first 3 stages until we cannot obtain a
+ * full scatterlist of pages, in that case we will switch to idle.
+ */
+static int
+hugepage_reporting_cycle(struct page_reporting_dev_info *prdev,
+			 struct hstate *h, unsigned int nid,
+			 struct scatterlist *sgl, unsigned int *offset)
+{
+	struct list_head *list = &h->hugepage_freelists[nid];
+	unsigned int page_len = PAGE_SIZE << h->order;
+	struct page *page, *next;
+	long budget;
+	int ret = 0, scan_cnt = 0;
+
+	/*
+	 * Perform early check, if free area is empty there is
+	 * nothing to process so we can skip this free_list.
+	 */
+	if (list_empty(list))
+		return ret;
+
+	spin_lock_irq(&hugetlb_lock);
+
+	if (huge_page_order(h) > MAX_ORDER)
+		budget = HUGEPAGE_REPORTING_CAPACITY;
+	else
+		budget = HUGEPAGE_REPORTING_CAPACITY * 32;
+
+	/* loop through free list adding unreported pages to sg list */
+	list_for_each_entry_safe(page, next, list, lru) {
+		/* We are going to skip over the reported pages. */
+		if (PageReported(page)) {
+			if (++scan_cnt >= MAX_SCAN_NUM) {
+				ret = scan_cnt;
+				break;
+			}
+			continue;
+		}
+
+		/*
+		 * If we fully consumed our budget then update our
+		 * state to indicate that we are requesting additional
+		 * processing and exit this list.
+		 */
+		if (budget < 0) {
+			atomic_set(&prdev->state, PAGE_REPORTING_REQUESTED);
+			next = page;
+			break;
+		}
+
+		/* Attempt to pull page from list and place in scatterlist */
+		if (*offset) {
+			isolate_free_huge_page(page, h, nid);
+			/* Add page to scatter list */
+			--(*offset);
+			sg_set_page(&sgl[*offset], page, page_len, 0);
+
+			continue;
+		}
+
+		/*
+		 * Make the first non-processed page in the free list
+		 * the new head of the free list before we release the
+		 * zone lock.
+		 */
+		if (&page->lru != list && !list_is_first(&page->lru, list))
+			list_rotate_to_front(&page->lru, list);
+
+		/* release lock before waiting on report processing */
+		spin_unlock_irq(&hugetlb_lock);
+
+		/* begin processing pages in local list */
+		ret = prdev->report(prdev, sgl, HUGEPAGE_REPORTING_CAPACITY);
+
+		/* reset offset since the full list was reported */
+		*offset = HUGEPAGE_REPORTING_CAPACITY;
+
+		/* update budget to reflect call to report function */
+		budget--;
+
+		/* reacquire zone lock and resume processing */
+		spin_lock_irq(&hugetlb_lock);
+
+		/* flush reported pages from the sg list */
+		hugepage_reporting_drain(prdev, h, sgl,
+					 HUGEPAGE_REPORTING_CAPACITY, !ret);
+
+		/*
+		 * Reset next to first entry, the old next isn't valid
+		 * since we dropped the lock to report the pages
+		 */
+		next = list_first_entry(list, struct page, lru);
+
+		/* exit on error */
+		if (ret)
+			break;
+	}
+
+	/* Rotate any leftover pages to the head of the freelist */
+	if (&next->lru != list && !list_is_first(&next->lru, list))
+		list_rotate_to_front(&next->lru, list);
+
+	spin_unlock_irq(&hugetlb_lock);
+
+	return ret;
+}
+
+static int
+hugepage_reporting_process_hstate(struct page_reporting_dev_info *prdev,
+				  struct scatterlist *sgl, struct hstate *h)
+{
+	unsigned int leftover, offset = HUGEPAGE_REPORTING_CAPACITY;
+	int ret = 0, nid;
+
+	for (nid = 0; nid < MAX_NUMNODES; nid++) {
+		ret = hugepage_reporting_cycle(prdev, h, nid, sgl, &offset);
+
+		if (ret < 0)
+			return ret;
+	}
+
+	/* report the leftover pages before going idle */
+	leftover = HUGEPAGE_REPORTING_CAPACITY - offset;
+	if (leftover) {
+		sgl = &sgl[offset];
+		ret = prdev->report(prdev, sgl, leftover);
+
+		/* flush any remaining pages out from the last report */
+		spin_lock_irq(&hugetlb_lock);
+		hugepage_reporting_drain(prdev, h, sgl, leftover, !ret);
+		spin_unlock_irq(&hugetlb_lock);
+	}
+
+	return ret;
+}
+
+static void hugepage_reporting_process(struct work_struct *work)
+{
+	struct delayed_work *d_work = to_delayed_work(work);
+	struct page_reporting_dev_info *prdev = container_of(d_work,
+					struct page_reporting_dev_info, work);
+	int err = 0, state = PAGE_REPORTING_ACTIVE;
+	struct scatterlist *sgl;
+	struct hstate *h;
+
+	/*
+	 * Change the state to "Active" so that we can track if there is
+	 * anyone requests page reporting after we complete our pass. If
+	 * the state is not altered by the end of the pass we will switch
+	 * to idle and quit scheduling reporting runs.
+	 */
+	atomic_set(&prdev->state, state);
+
+	/* allocate scatterlist to store pages being reported on */
+	sgl = kmalloc_array(HUGEPAGE_REPORTING_CAPACITY, sizeof(*sgl), GFP_KERNEL);
+	if (!sgl)
+		goto err_out;
+
+	sg_init_table(sgl, HUGEPAGE_REPORTING_CAPACITY);
+
+	for_each_hstate(h) {
+		err = hugepage_reporting_process_hstate(prdev, sgl, h);
+		if (err)
+			break;
+	}
+
+	kfree(sgl);
+err_out:
+	/*
+	 * If the state has reverted back to requested then there may be
+	 * additional pages to be processed. We will defer for 2s to allow
+	 * more pages to accumulate.
+	 */
+	state = atomic_cmpxchg(&prdev->state, state, PAGE_REPORTING_IDLE);
+	if (state == PAGE_REPORTING_REQUESTED)
+		schedule_delayed_work(&prdev->work, prdev->delay_jiffies);
+}
+
 /*
  * The page reporting cycle consists of 4 stages, fill, report, drain, and
  * idle. We will cycle through the first 3 stages until we cannot obtain a
@@ -341,6 +571,9 @@ static void page_reporting_process(struct work_struct *work)
 static DEFINE_MUTEX(page_reporting_mutex);
 DEFINE_STATIC_KEY_FALSE(page_reporting_enabled);
 
+static DEFINE_MUTEX(hugepage_reporting_mutex);
+DEFINE_STATIC_KEY_FALSE(hugepage_reporting_enabled);
+
 int page_reporting_register(struct page_reporting_dev_info *prdev)
 {
 	int err = 0;
@@ -395,3 +628,57 @@ void page_reporting_unregister(struct page_reporting_dev_info *prdev)
 	mutex_unlock(&page_reporting_mutex);
 }
 EXPORT_SYMBOL_GPL(page_reporting_unregister);
+
+int hugepage_reporting_register(struct page_reporting_dev_info *prdev)
+{
+	int err = 0;
+
+	mutex_lock(&hugepage_reporting_mutex);
+
+	/* nothing to do if already in use */
+	if (rcu_access_pointer(hgpr_dev_info)) {
+		err = -EBUSY;
+		goto err_out;
+	}
+
+	/* initialize state and work structures */
+	atomic_set(&prdev->state, PAGE_REPORTING_IDLE);
+	INIT_DELAYED_WORK(&prdev->work, &hugepage_reporting_process);
+
+	/* Begin initial flush of zones */
+	__page_reporting_request(prdev);
+
+	/* Assign device to allow notifications */
+	rcu_assign_pointer(hgpr_dev_info, prdev);
+
+	hugepage_report_mini_order = prdev->mini_order;
+	hugepage_report_batch_size = prdev->batch_size;
+
+	/* enable hugepage reporting notification */
+	if (!static_key_enabled(&hugepage_reporting_enabled)) {
+		static_branch_enable(&hugepage_reporting_enabled);
+		pr_info("Free hugepage reporting enabled\n");
+	}
+err_out:
+	mutex_unlock(&hugepage_reporting_mutex);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(hugepage_reporting_register);
+
+void hugepage_reporting_unregister(struct page_reporting_dev_info *prdev)
+{
+	mutex_lock(&hugepage_reporting_mutex);
+
+	if (rcu_access_pointer(hgpr_dev_info) == prdev) {
+		/* Disable page reporting notification */
+		RCU_INIT_POINTER(hgpr_dev_info, NULL);
+		synchronize_rcu();
+
+		/* Flush any existing work, and lock it out */
+		cancel_delayed_work_sync(&prdev->work);
+	}
+
+	mutex_unlock(&hugepage_reporting_mutex);
+}
+EXPORT_SYMBOL_GPL(hugepage_reporting_unregister);
diff --git a/mm/page_reporting.h b/mm/page_reporting.h
index 86ac6ffad970..271c64c3c3cb 100644
--- a/mm/page_reporting.h
+++ b/mm/page_reporting.h
@@ -18,12 +18,24 @@ extern unsigned long page_report_batch_size;
 DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
 void __page_reporting_notify(void);
 
+extern int hugepage_report_mini_order;
+extern unsigned long hugepage_report_batch_size;
+
+DECLARE_STATIC_KEY_FALSE(hugepage_reporting_enabled);
+void __hugepage_reporting_notify(void);
+
 static inline bool page_reported(struct page *page)
 {
 	return static_branch_unlikely(&page_reporting_enabled) &&
 	       PageReported(page);
 }
 
+static inline bool hugepage_reported(struct page *page)
+{
+	return static_branch_unlikely(&hugepage_reporting_enabled) &&
+	       PageReported(page);
+}
+
 /**
  * page_reporting_notify_free - Free page notification to start page processing
  *
@@ -52,11 +64,33 @@ static inline void page_reporting_notify_free(unsigned int order)
 		__page_reporting_notify();
 	}
 }
+
+static inline void hugepage_reporting_notify_free(unsigned int order)
+{
+	static long batch_size = 0;
+
+	if (!static_branch_unlikely(&hugepage_reporting_enabled))
+		return;
+
+	/* Determine if we have crossed reporting threshold */
+	if (order < hugepage_report_mini_order)
+		return;
+
+	batch_size += (1 << order) << PAGE_SHIFT;
+	if (batch_size >= hugepage_report_batch_size) {
+		batch_size = 0;
+		__hugepage_reporting_notify();
+	}
+}
 #else /* CONFIG_PAGE_REPORTING */
 #define page_reported(_page)	false
 
 static inline void page_reporting_notify_free(unsigned int order)
 {
 }
+
+static inline void hugepage_reporting_notify_free(unsigned int order)
+{
+}
 #endif /* CONFIG_PAGE_REPORTING */
 #endif /*_MM_PAGE_REPORTING_H */

From patchwork Tue Dec 22 07:48:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liang Li
X-Patchwork-Id: 11986097
Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.2 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A89C8C433E0 for ; Tue, 22 Dec 2020 07:48:19 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3CA8422D57 for ; Tue, 22 Dec 2020 07:48:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3CA8422D57 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A4AD26B0083; Tue, 22 Dec 2020 02:48:18 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A21A46B0085; Tue, 22 Dec 2020 02:48:18 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 93AE16B0087; Tue, 22 Dec 2020 02:48:18 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0072.hostedemail.com [216.40.44.72]) by kanga.kvack.org (Postfix) with ESMTP id 7D8676B0083 for ; Tue, 22 Dec 2020 02:48:18 -0500 (EST) Received: from smtpin03.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 405BC249C for ; Tue, 22 Dec 2020 07:48:18 +0000 (UTC) X-FDA: 77620140276.03.ship95_17096432745e Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin03.hostedemail.com (Postfix) with ESMTP id 22CD528A4E9 for ; Tue, 22 Dec 2020 
07:48:18 +0000 (UTC) X-HE-Tag: ship95_17096432745e X-Filterd-Recvd-Size: 10710 Received: from mail-pj1-f49.google.com (mail-pj1-f49.google.com [209.85.216.49]) by imf04.hostedemail.com (Postfix) with ESMTP for ; Tue, 22 Dec 2020 07:48:17 +0000 (UTC) Received: by mail-pj1-f49.google.com with SMTP id v1so886627pjr.2 for ; Mon, 21 Dec 2020 23:48:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:date:to:cc:subject:message-id:mail-followup-to:mime-version :content-disposition:user-agent; bh=i3bfHEMTyy3RMCDYIydmrpz6KtdROg7RvoHxWojHkr0=; b=D1GHHgB6ojK9kFQC/HqZtz1VG49st4P/x1r7Gk2gUQTRxEFN899dkPc7TLy4YHxKwt UyZA9j9NI9zEHmeu2BOHZbb59/+OmLAzjp6tU9y7bHvcku937QY7HLIdHqr6azr0VZp2 Hvi+wQQXKbxX4gBZNp3ZAZ/u8Q8KNSkv3a9EjKiuKOvVMHDGkq32mMcvVB3ktpdNUiJ4 v4UnYrMYZANTliJLq/rOqgBGCxDghtjPlHSABDP7j4ez9OqB+toNiEbojmRIihIWPPd4 SSD6YYG+N1Myh0gfkdPkSOPdl/C3NSmvu6/JB8IqIlEp3q662kKLVkik5X5aaTrO8Lpn PEZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:date:to:cc:subject:message-id :mail-followup-to:mime-version:content-disposition:user-agent; bh=i3bfHEMTyy3RMCDYIydmrpz6KtdROg7RvoHxWojHkr0=; b=R288PKf6EqSXPjkAV/YdwGZ8yKwbdE+14LxvpWmfQUKI9H/5VslmWgg+LYFrrmj5sK 26THhKXXQqa/c+4Ef5f1dWMzwfpTVqouO6nfZyyNMdRNrv0GoveBjBtu1W532877cwQ/ 7r0OW2tZdibzpkBYV4U9MhFkaKCSwaElTqGqIiHtQU8T1bnYj86/iQMCmmPqkYDFnAEM fXPT4Tjn4LzSGDZoCT8tqe8ZlL7eguzxx0oWk3cOVoTsPKvNxxXVxh2qB08lzGz8YxJU HuiKQD72oA0eUPAksXN0WeS8OORFwiBApNlhZ1oKHwTNUMbROznkizOz88H0KY8b5FoL l4ng== X-Gm-Message-State: AOAM532lgjWlb7PannjHSNmwIVgfoAE2RVxd7YkafNBwOJm7j0o2DBQZ 0sq+xgt2vCKLYEj2sDkm3XBMpzp3j64= X-Google-Smtp-Source: ABdhPJze6WndpyzBv8oA+rCboDsQWgmQOwnmfLMGSWq7A6Jn6aYr6PdovCpLTkyvj0c+KuPIFMKxaQ== X-Received: by 2002:a17:902:6b89:b029:da:fc41:baec with SMTP id p9-20020a1709026b89b02900dafc41baecmr20012241plk.39.1608623296624; Mon, 21 Dec 2020 23:48:16 -0800 (PST) Received: from open-light-1.localdomain (66.98.113.28.16clouds.com. 
[66.98.113.28]) by smtp.gmail.com with ESMTPSA id 36sm12872029pgr.56.2020.12.21.23.48.15 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 21 Dec 2020 23:48:16 -0800 (PST) From: Liang Li X-Google-Original-From: Liang Li Date: Tue, 22 Dec 2020 02:48:13 -0500 To: Alexander Duyck , Mel Gorman , Andrew Morton , Andrea Arcangeli , Dan Williams , "Michael S. Tsirkin" , David Hildenbrand , Jason Wang , Dave Hansen , Michal Hocko , Liang Li , Mike Kravetz , Liang Li Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org Subject: [RFC PATCH 2/3] virtio-balloon: add support for providing free huge page reports to host Message-ID: <20201222074810.GA30047@open-light-1.localdomain> Mail-Followup-To: Alexander Duyck , Mel Gorman , Andrew Morton , Andrea Arcangeli , Dan Williams , "Michael S. Tsirkin" , David Hildenbrand , Jason Wang , Dave Hansen , Michal Hocko , Liang Li , Mike Kravetz , Liang Li , linux-mm@kvack.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Free page reporting only supports buddy pages, it can't report the free pages reserved for hugetlbfs case. On the other hand, hugetlbfs is a good choice for a system with a huge amount of RAM, because it can help to reduce the memory management overhead and improve system performance. This patch add support for reporting free hugepage to host when guest use hugetlbfs. A new feature bit and a new vq is added for this new feature. Cc: Alexander Duyck Cc: Mel Gorman Cc: Andrea Arcangeli Cc: Dan Williams Cc: Dave Hansen Cc: David Hildenbrand Cc: Michal Hocko Cc: Andrew Morton Cc: Alex Williamson Cc: Michael S. 
Tsirkin Cc: Jason Wang Cc: Mike Kravetz Cc: Liang Li Signed-off-by: Liang Li --- drivers/virtio/virtio_balloon.c | 61 +++++++++++++++++++++++++++++ include/uapi/linux/virtio_balloon.h | 1 + 2 files changed, 62 insertions(+) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index a298517079bb..61363dfd3c2d 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -52,6 +52,7 @@ enum virtio_balloon_vq { VIRTIO_BALLOON_VQ_STATS, VIRTIO_BALLOON_VQ_FREE_PAGE, VIRTIO_BALLOON_VQ_REPORTING, + VIRTIO_BALLOON_VQ_HPG_REPORTING, VIRTIO_BALLOON_VQ_MAX }; @@ -126,6 +127,10 @@ struct virtio_balloon { /* Free page reporting device */ struct virtqueue *reporting_vq; struct page_reporting_dev_info pr_dev_info; + + /* Free hugepage reporting device */ + struct virtqueue *hpg_reporting_vq; + struct page_reporting_dev_info hpr_dev_info; }; static const struct virtio_device_id id_table[] = { @@ -192,6 +197,33 @@ static int virtballoon_free_page_report(struct page_reporting_dev_info *pr_dev_i return 0; } +static int virtballoon_free_hugepage_report(struct page_reporting_dev_info *hpr_dev_info, + struct scatterlist *sg, unsigned int nents) +{ + struct virtio_balloon *vb = + container_of(hpr_dev_info, struct virtio_balloon, hpr_dev_info); + struct virtqueue *vq = vb->hpg_reporting_vq; + unsigned int unused, err; + + /* We should always be able to add these buffers to an empty queue. */ + err = virtqueue_add_inbuf(vq, sg, nents, vb, GFP_NOWAIT | __GFP_NOWARN); + + /* + * In the extremely unlikely case that something has occurred and we + * are able to trigger an error we will simply display a warning + * and exit without actually processing the pages. 
+ */ + if (WARN_ON_ONCE(err)) + return err; + + virtqueue_kick(vq); + + /* When host has read buffer, this completes via balloon_ack */ + wait_event(vb->acked, virtqueue_get_buf(vq, &unused)); + + return 0; +} + static void set_page_pfns(struct virtio_balloon *vb, __virtio32 pfns[], struct page *page) { @@ -515,6 +547,7 @@ static int init_vqs(struct virtio_balloon *vb) callbacks[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL; names[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL; names[VIRTIO_BALLOON_VQ_REPORTING] = NULL; + names[VIRTIO_BALLOON_VQ_HPG_REPORTING] = NULL; if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) { names[VIRTIO_BALLOON_VQ_STATS] = "stats"; @@ -531,6 +564,11 @@ static int init_vqs(struct virtio_balloon *vb) callbacks[VIRTIO_BALLOON_VQ_REPORTING] = balloon_ack; } + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HPG_REPORTING)) { + names[VIRTIO_BALLOON_VQ_HPG_REPORTING] = "hpg_reporting_vq"; + callbacks[VIRTIO_BALLOON_VQ_HPG_REPORTING] = balloon_ack; + } + err = vb->vdev->config->find_vqs(vb->vdev, VIRTIO_BALLOON_VQ_MAX, vqs, callbacks, names, NULL, NULL); if (err) @@ -566,6 +604,8 @@ static int init_vqs(struct virtio_balloon *vb) if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_REPORTING)) vb->reporting_vq = vqs[VIRTIO_BALLOON_VQ_REPORTING]; + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HPG_REPORTING)) + vb->hpg_reporting_vq = vqs[VIRTIO_BALLOON_VQ_HPG_REPORTING]; return 0; } @@ -1001,6 +1041,24 @@ static int virtballoon_probe(struct virtio_device *vdev) goto out_unregister_oom; } + vb->hpr_dev_info.report = virtballoon_free_hugepage_report; + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HPG_REPORTING)) { + unsigned int capacity; + + capacity = virtqueue_get_vring_size(vb->hpg_reporting_vq); + if (capacity < PAGE_REPORTING_CAPACITY) { + err = -ENOSPC; + goto out_unregister_oom; + } + + vb->hpr_dev_info.mini_order = 0; + vb->hpr_dev_info.batch_size = 2 * 1024 * 1024; /* 2M */ + vb->hpr_dev_info.delay_jiffies = 1 * HZ; /* 1 seconds */ + err = 
hugepage_reporting_register(&vb->hpr_dev_info); + if (err) + goto out_unregister_oom; + } + virtio_device_ready(vdev); if (towards_target(vb)) @@ -1053,6 +1111,8 @@ static void virtballoon_remove(struct virtio_device *vdev) if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_REPORTING)) page_reporting_unregister(&vb->pr_dev_info); + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HPG_REPORTING)) + hugepage_reporting_unregister(&vb->hpr_dev_info); if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) unregister_oom_notifier(&vb->oom_nb); if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) @@ -1133,6 +1193,7 @@ static unsigned int features[] = { VIRTIO_BALLOON_F_FREE_PAGE_HINT, VIRTIO_BALLOON_F_PAGE_POISON, VIRTIO_BALLOON_F_REPORTING, + VIRTIO_BALLOON_F_HPG_REPORTING, }; static struct virtio_driver virtio_balloon_driver = { diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h index ddaa45e723c4..8ca8f89d95c6 100644 --- a/include/uapi/linux/virtio_balloon.h +++ b/include/uapi/linux/virtio_balloon.h @@ -37,6 +37,7 @@ #define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ #define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ #define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */ +#define VIRTIO_BALLOON_F_HPG_REPORTING 6 /* Huge page reporting virtqueue */ /* Size of a PFN in the balloon interface. 
*/
 #define VIRTIO_BALLOON_PFN_SHIFT 12

From patchwork Tue Dec 22 07:49:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liang Li
X-Patchwork-Id: 11986099
From: Liang Li
Date: Tue, 22 Dec 2020 02:49:13 -0500
To: Alexander Duyck, Mel Gorman, Andrew Morton, Andrea Arcangeli,
 Dan Williams, "Michael S. Tsirkin", David Hildenbrand, Jason Wang,
 Dave Hansen, Michal Hocko, Liang Li, Mike Kravetz, Liang Li
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 virtualization@lists.linux-foundation.org, qemu-devel@nongnu.org
Subject: [RFC PATCH 3/3] mm: support free hugepage pre zero out
Message-ID: <20201222074910.GA30051@open-light-1.localdomain>

This patch adds support for pre-zeroing free hugepages. This feature can
be used to speed up page population and page fault handling.

Cc: Alexander Duyck
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc: Dan Williams
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Michal Hocko
Cc: Andrew Morton
Cc: Alex Williamson
Cc: Michael S. Tsirkin
Cc: Jason Wang
Cc: Mike Kravetz
Cc: Liang Li
Signed-off-by: Liang Li
---
 mm/page_prezero.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/page_prezero.c b/mm/page_prezero.c
index c8ce720bfc54..dff4e0adf402 100644
--- a/mm/page_prezero.c
+++ b/mm/page_prezero.c
@@ -26,6 +26,7 @@ static unsigned long delay_millisecs = 1000;
 static unsigned long zeropage_enable __read_mostly;
 static DEFINE_MUTEX(kzeropaged_mutex);
 static struct page_reporting_dev_info zero_page_dev_info;
+static struct page_reporting_dev_info zero_hugepage_dev_info;
 
 inline void clear_zero_page_flag(struct page *page, int order)
 {
@@ -69,9 +70,17 @@ static int start_kzeropaged(void)
 		zero_page_dev_info.delay_jiffies = msecs_to_jiffies(delay_millisecs);
 
 		err = page_reporting_register(&zero_page_dev_info);
+
+		zero_hugepage_dev_info.report = zero_free_pages;
+		zero_hugepage_dev_info.mini_order = mini_page_order;
+		zero_hugepage_dev_info.batch_size = batch_size;
+		zero_hugepage_dev_info.delay_jiffies = msecs_to_jiffies(delay_millisecs);
+
+		err |= hugepage_reporting_register(&zero_hugepage_dev_info);
 		pr_info("Zero page enabled\n");
 	} else {
 		page_reporting_unregister(&zero_page_dev_info);
+		hugepage_reporting_unregister(&zero_hugepage_dev_info);
 		pr_info("Zero page disabled\n");
 	}
 
@@ -90,7 +99,15 @@ static int restart_kzeropaged(void)
 		zero_page_dev_info.batch_size = batch_size;
 		zero_page_dev_info.delay_jiffies = msecs_to_jiffies(delay_millisecs);
 
+		hugepage_reporting_unregister(&zero_hugepage_dev_info);
+
+		zero_hugepage_dev_info.report = zero_free_pages;
+		zero_hugepage_dev_info.mini_order = mini_page_order;
+		zero_hugepage_dev_info.batch_size = batch_size;
+		zero_hugepage_dev_info.delay_jiffies = msecs_to_jiffies(delay_millisecs);
+
 		err = page_reporting_register(&zero_page_dev_info);
+		err |= hugepage_reporting_register(&zero_hugepage_dev_info);
 		pr_info("Zero page enabled\n");
 	}
 
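
A side note on the `err |= hugepage_reporting_register(...)` lines in the hunks above: OR-ing two negative errno values generally yields neither of them. The sketch below is a standalone userspace illustration, not kernel code and not part of the patch. `combine_or()` and `first_error()` are hypothetical helper names introduced only for this note, and the first-error-wins variant is one common alternative rather than anything the series proposes.

```c
#include <assert.h>

/* Hypothetical helper: mirrors the `err = a; err |= b;` pattern. */
int combine_or(int first, int second)
{
	return first | second;
}

/*
 * Hypothetical helper: keep the first failure, otherwise return the
 * second result. Both inputs are assumed to follow the kernel
 * convention of 0 on success or a negative errno on failure.
 */
int first_error(int first, int second)
{
	assert(first <= 0 && second <= 0);
	return first ? first : second;
}
```

With Linux errno numbering, `combine_or(-16, -22)` (that is, `-EBUSY | -EINVAL`) evaluates to -6, which a caller would read as `-ENXIO`, while `first_error(-16, -22)` returns -16 unchanged. When only one of the two calls fails, both helpers return the same value.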