From patchwork Thu May 30 21:53:41 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969267
Subject: [RFC PATCH 01/11] mm: Move MAX_ORDER definition closer to pageblock_order
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com,
    dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com,
    konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com,
    aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com,
    alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:53:41 -0700
Message-ID: <20190530215341.13974.19456.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: kvm@vger.kernel.org

From: Alexander Duyck

Having the definition of MAX_ORDER live in mmzone.h is problematic for
code that only needs access to things like pageblock_order: on some
architectures pageblock_order is defined in terms of MAX_ORDER, yet
MAX_ORDER is not visible from pageblock-flags.h.

Move the definition of MAX_ORDER into pageblock-flags.h so that it is
defined in the same header as pageblock_order and no longer requires
including mmzone.h. MAX_ORDER remains accessible to any file that
includes mmzone.h, since mmzone.h includes pageblock-flags.h.
Signed-off-by: Alexander Duyck
---
 include/linux/mmzone.h          |    8 --------
 include/linux/pageblock-flags.h |    8 ++++++++
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 70394cabaf4e..a6bdff538437 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -22,14 +22,6 @@
 #include
 #include

-/* Free memory management - zoned buddy allocator. */
-#ifndef CONFIG_FORCE_MAX_ZONEORDER
-#define MAX_ORDER 11
-#else
-#define MAX_ORDER CONFIG_FORCE_MAX_ZONEORDER
-#endif
-#define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))
-
 /*
  * PAGE_ALLOC_COSTLY_ORDER is the order at which allocations are deemed
  * costly to service. That is between allocation orders which should
diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
index 06a66327333d..e9e8006ccae1 100644
--- a/include/linux/pageblock-flags.h
+++ b/include/linux/pageblock-flags.h
@@ -40,6 +40,14 @@ enum pageblock_bits {
 	NR_PAGEBLOCK_BITS
 };

+/* Free memory management - zoned buddy allocator. */
+#ifndef CONFIG_FORCE_MAX_ZONEORDER
+#define MAX_ORDER 11
+#else
+#define MAX_ORDER CONFIG_FORCE_MAX_ZONEORDER
+#endif
+#define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))
+
 #ifdef CONFIG_HUGETLB_PAGE
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE

From patchwork Thu May 30 21:53:49 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969275
Subject: [RFC PATCH 02/11] mm: Adjust shuffle code to allow for future coalescing
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com,
    dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com,
    konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com,
    aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com,
    alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:53:49 -0700
Message-ID: <20190530215349.13974.25544.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: kvm@vger.kernel.org

From: Alexander Duyck

Move the head/tail list-insertion logic out of the shuffle code and into
__free_one_page, since that is ultimately where it is needed. Doing so
should reduce overhead and consolidates all of the list-addition paths
in one spot.
Signed-off-by: Alexander Duyck
---
 include/linux/mmzone.h |   12 --------
 mm/page_alloc.c        |   70 +++++++++++++++++++++++++++---------------------
 mm/shuffle.c           |   24 ----------------
 mm/shuffle.h           |   35 ++++++++++++++++++++++++
 4 files changed, 74 insertions(+), 67 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a6bdff538437..297edb45071a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -108,18 +108,6 @@ static inline void add_to_free_area_tail(struct page *page, struct free_area *ar
 	area->nr_free++;
 }

-#ifdef CONFIG_SHUFFLE_PAGE_ALLOCATOR
-/* Used to preserve page allocation order entropy */
-void add_to_free_area_random(struct page *page, struct free_area *area,
-		int migratetype);
-#else
-static inline void add_to_free_area_random(struct page *page,
-		struct free_area *area, int migratetype)
-{
-	add_to_free_area(page, area, migratetype);
-}
-#endif
-
 /* Used for pages which are on another list */
 static inline void move_to_free_area(struct page *page, struct free_area *area,
 		int migratetype)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c061f66c2d0c..2fa5bbb372bb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -851,6 +851,36 @@ static inline struct capture_control *task_capc(struct zone *zone)
 #endif /* CONFIG_COMPACTION */

 /*
+ * If this is not the largest possible page, check if the buddy
+ * of the next-highest order is free. If it is, it's possible
+ * that pages are being freed that will coalesce soon. In case,
+ * that is happening, add the free page to the tail of the list
+ * so it's less likely to be used soon and more likely to be merged
+ * as a higher order page
+ */
+static inline bool
+buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
+		   struct page *page, unsigned int order)
+{
+	struct page *higher_page, *higher_buddy;
+	unsigned long combined_pfn;
+
+	if (is_shuffle_order(order) || order >= (MAX_ORDER - 2))
+		return false;
+
+	if (!pfn_valid_within(buddy_pfn))
+		return false;
+
+	combined_pfn = buddy_pfn & pfn;
+	higher_page = page + (combined_pfn - pfn);
+	buddy_pfn = __find_buddy_pfn(combined_pfn, order + 1);
+	higher_buddy = higher_page + (buddy_pfn - combined_pfn);
+
+	return pfn_valid_within(buddy_pfn) &&
+	       page_is_buddy(higher_page, higher_buddy, order + 1);
+}
+
+/*
  * Freeing function for a buddy system allocator.
  *
  * The concept of a buddy system is to maintain direct-mapped table
@@ -879,11 +909,12 @@ static inline void __free_one_page(struct page *page,
 		struct zone *zone, unsigned int order,
 		int migratetype)
 {
-	unsigned long combined_pfn;
+	struct capture_control *capc = task_capc(zone);
 	unsigned long uninitialized_var(buddy_pfn);
-	struct page *buddy;
+	unsigned long combined_pfn;
+	struct free_area *area;
 	unsigned int max_order;
-	struct capture_control *capc = task_capc(zone);
+	struct page *buddy;

 	max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1);
@@ -952,35 +983,12 @@ static inline void __free_one_page(struct page *page,

 done_merging:
 	set_page_order(page, order);
-	/*
-	 * If this is not the largest possible page, check if the buddy
-	 * of the next-highest order is free. If it is, it's possible
-	 * that pages are being freed that will coalesce soon. In case,
-	 * that is happening, add the free page to the tail of the list
-	 * so it's less likely to be used soon and more likely to be merged
-	 * as a higher order page
-	 */
-	if ((order < MAX_ORDER-2) && pfn_valid_within(buddy_pfn)
-			&& !is_shuffle_order(order)) {
-		struct page *higher_page, *higher_buddy;
-		combined_pfn = buddy_pfn & pfn;
-		higher_page = page + (combined_pfn - pfn);
-		buddy_pfn = __find_buddy_pfn(combined_pfn, order + 1);
-		higher_buddy = higher_page + (buddy_pfn - combined_pfn);
-		if (pfn_valid_within(buddy_pfn) &&
-		    page_is_buddy(higher_page, higher_buddy, order + 1)) {
-			add_to_free_area_tail(page, &zone->free_area[order],
-					      migratetype);
-			return;
-		}
-	}
-
-	if (is_shuffle_order(order))
-		add_to_free_area_random(page, &zone->free_area[order],
-				migratetype);
+	area = &zone->free_area[order];
+	if (buddy_merge_likely(pfn, buddy_pfn, page, order) ||
+	    is_shuffle_tail_page(order))
+		add_to_free_area_tail(page, area, migratetype);
 	else
-		add_to_free_area(page, &zone->free_area[order], migratetype);
-
+		add_to_free_area(page, area, migratetype);
 }

 /*
diff --git a/mm/shuffle.c b/mm/shuffle.c
index 3ce12481b1dc..55d592e62526 100644
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -4,7 +4,6 @@
 #include
 #include
 #include
-#include
 #include
 #include "internal.h"
 #include "shuffle.h"
@@ -182,26 +181,3 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat)
 	for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
 		shuffle_zone(z);
 }
-
-void add_to_free_area_random(struct page *page, struct free_area *area,
-		int migratetype)
-{
-	static u64 rand;
-	static u8 rand_bits;
-
-	/*
-	 * The lack of locking is deliberate. If 2 threads race to
-	 * update the rand state it just adds to the entropy.
-	 */
-	if (rand_bits == 0) {
-		rand_bits = 64;
-		rand = get_random_u64();
-	}
-
-	if (rand & 1)
-		add_to_free_area(page, area, migratetype);
-	else
-		add_to_free_area_tail(page, area, migratetype);
-	rand_bits--;
-	rand >>= 1;
-}
diff --git a/mm/shuffle.h b/mm/shuffle.h
index 777a257a0d2f..3f4edb60a453 100644
--- a/mm/shuffle.h
+++ b/mm/shuffle.h
@@ -3,6 +3,7 @@
 #ifndef _MM_SHUFFLE_H
 #define _MM_SHUFFLE_H
 #include
+#include

 /*
  * SHUFFLE_ENABLE is called from the command line enabling path, or by
@@ -43,6 +44,35 @@ static inline bool is_shuffle_order(int order)
 		return false;
 	return order >= SHUFFLE_ORDER;
 }
+
+static inline bool is_shuffle_tail_page(int order)
+{
+	static u64 rand;
+	static u8 rand_bits;
+	u64 rand_old;
+
+	if (!is_shuffle_order(order))
+		return false;
+
+	/*
+	 * The lack of locking is deliberate. If 2 threads race to
+	 * update the rand state it just adds to the entropy.
+	 */
+	if (rand_bits-- == 0) {
+		rand_bits = 64;
+		rand = get_random_u64();
+	}
+
+	/*
+	 * Test highest order bit while shifting our random value. This
+	 * should result in us testing for the carry flag following the
+	 * shift.
+	 */
+	rand_old = rand;
+	rand <<= 1;
+
+	return rand < rand_old;
+}
 #else
 static inline void shuffle_free_memory(pg_data_t *pgdat)
 {
@@ -60,5 +90,10 @@ static inline bool is_shuffle_order(int order)
 {
 	return false;
 }
+
+static inline bool is_shuffle_tail_page(int order)
+{
+	return false;
+}
 #endif
 #endif /* _MM_SHUFFLE_H */

From patchwork Thu May 30 21:53:56 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969271
Subject: [RFC PATCH 03/11] mm: Add support for Treated Buddy pages
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com,
    dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com,
    konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com,
    aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com,
    alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:53:56 -0700
Message-ID: <20190530215356.13974.95767.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: kvm@vger.kernel.org

From: Alexander Duyck

Add support for flagging pages as "Treated" within the buddy allocator.
If memory aeration is not enabled, the flag always reads as false and
the set/clear operations have no effect.
Signed-off-by: Alexander Duyck
---
 include/linux/mmzone.h     |    1 +
 include/linux/page-flags.h |   32 ++++++++++++++++++++++++++++++++
 mm/page_alloc.c            |    5 +++++
 3 files changed, 38 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 297edb45071a..0263d5bf0b84 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -127,6 +127,7 @@ static inline void del_page_from_free_area(struct page *page,
 {
 	list_del(&page->lru);
 	__ClearPageBuddy(page);
+	__ResetPageTreated(page);
 	set_page_private(page, 0);
 	area->nr_free--;
 }
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 9f8712a4b1a5..1f8ccb98dd69 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -722,12 +722,32 @@ static inline int page_has_type(struct page *page)
 	VM_BUG_ON_PAGE(!PageType(page, 0), page);		\
 	page->page_type &= ~PG_##lname;				\
 }								\
+static __always_inline void __ResetPage##uname(struct page *page) \
+{								\
+	VM_BUG_ON_PAGE(!PageType(page, 0), page);		\
+	page->page_type |= PG_##lname;				\
+}								\
 static __always_inline void __ClearPage##uname(struct page *page) \
 {								\
 	VM_BUG_ON_PAGE(!Page##uname(page), page);		\
 	page->page_type |= PG_##lname;				\
 }

+#define PAGE_TYPE_OPS_DISABLED(uname)				\
+static __always_inline int Page##uname(struct page *page)	\
+{								\
+	return false;						\
+}								\
+static __always_inline void __SetPage##uname(struct page *page) \
+{								\
+}								\
+static __always_inline void __ResetPage##uname(struct page *page) \
+{								\
+}								\
+static __always_inline void __ClearPage##uname(struct page *page) \
+{								\
+}
+
 /*
  * PageBuddy() indicates that the page is free and in the buddy system
  * (see mm/page_alloc.c).
@@ -744,6 +764,18 @@ static inline int page_has_type(struct page *page)
 PAGE_TYPE_OPS(Offline, offline)

 /*
+ * PageTreated() is an alias for Offline, however it is not meant to be an
+ * exclusive value. It should be combined with PageBuddy() when seen as it
+ * is meant to indicate that the page has been scrubbed while waiting in
+ * the buddy system.
+ */
+#ifdef CONFIG_AERATION
+PAGE_TYPE_OPS(Treated, offline)
+#else
+PAGE_TYPE_OPS_DISABLED(Treated)
+#endif
+
+/*
  * If kmemcg is enabled, the buddy allocator will set PageKmemcg() on
  * pages allocated with __GFP_ACCOUNT. It gets cleared on page free.
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2fa5bbb372bb..2894990862bd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -942,6 +942,11 @@ static inline void __free_one_page(struct page *page,
 			goto done_merging;
 		if (!page_is_buddy(page, buddy, order))
 			goto done_merging;
+
+		/* If buddy is not treated, then do not mark page treated */
+		if (!PageTreated(buddy))
+			__ResetPageTreated(page);
+
 		/*
 		 * Our buddy is free or it is CONFIG_DEBUG_PAGEALLOC guard page,
 		 * merge with it and move up one order.

From patchwork Thu May 30 21:54:04 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969277
Subject: [RFC PATCH 04/11] mm: Split nr_free into nr_free_raw and nr_free_treated
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com,
    dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com,
    konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com,
    aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com,
    alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:04 -0700
Message-ID: <20190530215404.13974.27449.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty
X-Mailing-List: kvm@vger.kernel.org

From: Alexander Duyck

Split the nr_free value into two counters that track how pages were
inserted into the free list. The idea is that we can use this later to
distinguish pages that were treated and added to the free list from raw
pages that were simply added to the head of the list.
Signed-off-by: Alexander Duyck --- include/linux/mmzone.h | 36 ++++++++++++++++++++++++++++++++---- mm/compaction.c | 4 ++-- mm/page_alloc.c | 14 +++++++++----- mm/vmstat.c | 5 +++-- 4 files changed, 46 insertions(+), 13 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 0263d5bf0b84..988c3094b686 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -89,7 +89,8 @@ static inline bool is_migrate_movable(int mt) struct free_area { struct list_head free_list[MIGRATE_TYPES]; - unsigned long nr_free; + unsigned long nr_free_raw; + unsigned long nr_free_treated; }; /* Used for pages not on another list */ @@ -97,7 +98,7 @@ static inline void add_to_free_area(struct page *page, struct free_area *area, int migratetype) { list_add(&page->lru, &area->free_list[migratetype]); - area->nr_free++; + area->nr_free_raw++; } /* Used for pages not on another list */ @@ -105,13 +106,31 @@ static inline void add_to_free_area_tail(struct page *page, struct free_area *ar int migratetype) { list_add_tail(&page->lru, &area->free_list[migratetype]); - area->nr_free++; + area->nr_free_raw++; } /* Used for pages which are on another list */ static inline void move_to_free_area(struct page *page, struct free_area *area, int migratetype) { + /* + * Since we are moving the page out of one migrate type and into + * another the page will be added to the head of the new list. + * + * To avoid creating an island of raw pages floating between two + * sections of treated pages we should reset the page type and + * just re-treat the page when we process the destination. + * + * No need to trigger a notification for this since the page itself + * is actually treated and we are just doing this for logistical + * reasons. 
+ */ + if (PageTreated(page)) { + __ResetPageTreated(page); + area->nr_free_treated--; + area->nr_free_raw++; + } + list_move(&page->lru, &area->free_list[migratetype]); } @@ -125,11 +144,15 @@ static inline struct page *get_page_from_free_area(struct free_area *area, static inline void del_page_from_free_area(struct page *page, struct free_area *area) { + if (PageTreated(page)) + area->nr_free_treated--; + else + area->nr_free_raw--; + list_del(&page->lru); __ClearPageBuddy(page); __ResetPageTreated(page); set_page_private(page, 0); - area->nr_free--; } static inline bool free_area_empty(struct free_area *area, int migratetype) @@ -137,6 +160,11 @@ static inline bool free_area_empty(struct free_area *area, int migratetype) return list_empty(&area->free_list[migratetype]); } +static inline unsigned long nr_pages_in_free_area(struct free_area *area) +{ + return area->nr_free_raw + area->nr_free_treated; +} + struct pglist_data; /* diff --git a/mm/compaction.c b/mm/compaction.c index 9febc8cc84e7..f5a27d5dccdf 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1318,7 +1318,7 @@ static int next_search_order(struct compact_control *cc, int order) unsigned long flags; unsigned int order_scanned = 0; - if (!area->nr_free) + if (!nr_pages_in_free_area(area)) continue; spin_lock_irqsave(&cc->zone->lock, flags); @@ -1674,7 +1674,7 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc) unsigned long flags; struct page *freepage; - if (!area->nr_free) + if (!nr_pages_in_free_area(area)) continue; spin_lock_irqsave(&cc->zone->lock, flags); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 2894990862bd..10eaea762627 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2418,7 +2418,7 @@ int find_suitable_fallback(struct free_area *area, unsigned int order, int i; int fallback_mt; - if (area->nr_free == 0) + if (!nr_pages_in_free_area(area)) return -1; *can_steal = false; @@ -3393,7 +3393,7 @@ bool __zone_watermark_ok(struct zone *z, unsigned int 
order, unsigned long mark, struct free_area *area = &z->free_area[o]; int mt; - if (!area->nr_free) + if (!nr_pages_in_free_area(area)) continue; for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) { @@ -5325,7 +5325,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask) struct free_area *area = &zone->free_area[order]; int type; - nr[order] = area->nr_free; + nr[order] = nr_pages_in_free_area(area); total += nr[order] << order; types[order] = 0; @@ -5944,9 +5944,13 @@ void __ref memmap_init_zone_device(struct zone *zone, static void __meminit zone_init_free_lists(struct zone *zone) { unsigned int order, t; - for_each_migratetype_order(order, t) { + + for_each_migratetype_order(order, t) INIT_LIST_HEAD(&zone->free_area[order].free_list[t]); - zone->free_area[order].nr_free = 0; + + for (order = MAX_ORDER; order--; ) { + zone->free_area[order].nr_free_raw = 0; + zone->free_area[order].nr_free_treated = 0; } } diff --git a/mm/vmstat.c b/mm/vmstat.c index fd7e16ca6996..aa822fda4250 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1031,7 +1031,7 @@ static void fill_contig_page_info(struct zone *zone, unsigned long blocks; /* Count number of free blocks */ - blocks = zone->free_area[order].nr_free; + blocks = nr_pages_in_free_area(&zone->free_area[order]); info->free_blocks_total += blocks; /* Count free base pages */ @@ -1353,7 +1353,8 @@ static void frag_show_print(struct seq_file *m, pg_data_t *pgdat, seq_printf(m, "Node %d, zone %8s ", pgdat->node_id, zone->name); for (order = 0; order < MAX_ORDER; ++order) - seq_printf(m, "%6lu ", zone->free_area[order].nr_free); + seq_printf(m, "%6lu ", + nr_pages_in_free_area(&zone->free_area[order])); seq_putc(m, '\n'); }

From patchwork Thu May 30 21:54:11 2019
Subject: [RFC PATCH 05/11] mm: Propagate Treated bit when splitting
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:11 -0700
Message-ID: <20190530215411.13974.73205.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>
From: Alexander Duyck

When we are going to call "expand" to split a page into subpages we should mark those subpages as being "Treated" if the parent page was a "Treated" page. By doing this we can avoid potentially providing hints on a page that was already hinted at a larger page size as being unused.

Signed-off-by: Alexander Duyck --- include/linux/mmzone.h | 8 ++++++-- mm/page_alloc.c | 18 +++++++++++++++--- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 988c3094b686..a55fe6d2f63c 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -97,16 +97,20 @@ struct free_area { static inline void add_to_free_area(struct page *page, struct free_area *area, int migratetype) { + if (PageTreated(page)) + area->nr_free_treated++; + else + area->nr_free_raw++; + list_add(&page->lru, &area->free_list[migratetype]); - area->nr_free_raw++; } /* Used for pages not on another list */ static inline void add_to_free_area_tail(struct page *page, struct free_area *area, int migratetype) { - list_add_tail(&page->lru, &area->free_list[migratetype]); area->nr_free_raw++; + list_add_tail(&page->lru, &area->free_list[migratetype]); } /* Used for pages which are on another list */ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 10eaea762627..f6c067c6c784 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1965,7 +1965,7 @@ void __init init_cma_reserved_pageblock(struct page *page) */ static inline void expand(struct zone *zone, struct page *page, int low, int high, struct free_area *area, - int migratetype) + int migratetype, bool treated) { unsigned long size = 1 << high; @@ -1984,8 +1984,17 @@ static inline void expand(struct zone *zone, struct page *page, if (set_page_guard(zone, &page[size], high, migratetype)) continue; - add_to_free_area(&page[size], area, migratetype); set_page_order(&page[size], high); + if (treated) +
__SetPageTreated(&page[size]); + + /* + * The list we are placing this page in should be empty + * so it should be safe to place it here without worrying + * about creating a block of raw pages floating in between + * two blocks of treated pages. + */ + add_to_free_area(&page[size], area, migratetype); } } @@ -2122,6 +2131,7 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, unsigned int current_order; struct free_area *area; struct page *page; + bool treated; /* Find a page of the appropriate size in the preferred list */ for (current_order = order; current_order < MAX_ORDER; ++current_order) { @@ -2129,8 +2139,10 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, page = get_page_from_free_area(area, migratetype); if (!page) continue; + treated = PageTreated(page); del_page_from_free_area(page, area); - expand(zone, page, order, current_order, area, migratetype); + expand(zone, page, order, current_order, area, migratetype, + treated); set_pcppage_migratetype(page, migratetype); return page; }

From patchwork Thu May 30 21:54:18 2019
Subject: [RFC PATCH 06/11] mm: Add membrane to free area to use as divider between treated and raw pages
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:18 -0700
Message-ID: <20190530215418.13974.63493.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>

From: Alexander Duyck

Add a pointer we shall call "membrane" which represents the upper boundary between the "raw" and "treated" pages. The general idea is that in order for a page to cross from one side of the membrane to the other it will need to go through the aeration treatment. By doing this we should be able to make certain that we keep the treated pages as one contiguous block within each free list.
While the pages are being treated there may briefly be two such blocks, but the two should merge into one before we complete the migratetype and allow it to fall back into the "settling" state.

Signed-off-by: Alexander Duyck --- include/linux/mmzone.h | 38 ++++++++++++++++++++++++++++++++++++++ mm/page_alloc.c | 14 ++++++++++++-- 2 files changed, 50 insertions(+), 2 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index a55fe6d2f63c..be996e8ca6b5 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -87,10 +87,28 @@ static inline bool is_migrate_movable(int mt) get_pfnblock_flags_mask(page, page_to_pfn(page), \ PB_migrate_end, MIGRATETYPE_MASK) +/* + * The treatment state indicates the current state of the region pointed to + * by the treatment_mt and the membrane pointer. The general idea is that + * when we are in the "SETTLING" state the treatment area is contiguous and + * it is safe to move on to treating another migratetype. If we are in the + * "AERATING" state then the region is being actively processed and we + * would cause issues such as potentially isolating a section of raw pages + * between two sections of treated pages if we were to move on to another + * migratetype.
+ */ +enum treatment_state { + TREATMENT_SETTLING, + TREATMENT_AERATING, +}; + struct free_area { struct list_head free_list[MIGRATE_TYPES]; unsigned long nr_free_raw; unsigned long nr_free_treated; + struct list_head *membrane; + u8 treatment_mt; + u8 treatment_state; }; /* Used for pages not on another list */ @@ -113,6 +131,19 @@ static inline void add_to_free_area_tail(struct page *page, struct free_area *ar list_add_tail(&page->lru, &area->free_list[migratetype]); } +static inline void +add_to_free_area_treated(struct page *page, struct free_area *area, + int migratetype) +{ + area->nr_free_treated++; + + BUG_ON(area->treatment_mt != migratetype); + + /* Insert page above membrane, then move membrane to the page */ + list_add_tail(&page->lru, area->membrane); + area->membrane = &page->lru; +} + /* Used for pages which are on another list */ static inline void move_to_free_area(struct page *page, struct free_area *area, int migratetype) @@ -135,6 +166,10 @@ static inline void move_to_free_area(struct page *page, struct free_area *area, area->nr_free_raw++; } + /* push membrane back if we removed the upper boundary */ + if (area->membrane == &page->lru) + area->membrane = page->lru.next; + list_move(&page->lru, &area->free_list[migratetype]); } @@ -153,6 +188,9 @@ static inline void del_page_from_free_area(struct page *page, else area->nr_free_raw--; + if (area->membrane == &page->lru) + area->membrane = page->lru.next; + list_del(&page->lru); __ClearPageBuddy(page); __ResetPageTreated(page); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index f6c067c6c784..f4a629b6af96 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -989,6 +989,11 @@ static inline void __free_one_page(struct page *page, set_page_order(page, order); area = &zone->free_area[order]; + if (PageTreated(page)) { + add_to_free_area_treated(page, area, migratetype); + return; + } + if (buddy_merge_likely(pfn, buddy_pfn, page, order) || is_shuffle_tail_page(order)) add_to_free_area_tail(page, 
area, migratetype); @@ -5961,8 +5966,13 @@ static void __meminit zone_init_free_lists(struct zone *zone) INIT_LIST_HEAD(&zone->free_area[order].free_list[t]); for (order = MAX_ORDER; order--; ) { - zone->free_area[order].nr_free_raw = 0; - zone->free_area[order].nr_free_treated = 0; + struct free_area *area = &zone->free_area[order]; + + area->nr_free_raw = 0; + area->nr_free_treated = 0; + area->treatment_mt = 0; + area->treatment_state = TREATMENT_SETTLING; + area->membrane = &area->free_list[0]; } }

From patchwork Thu May 30 21:54:26 2019
Subject: [RFC PATCH 07/11] mm: Add support for acquiring first free "raw" or "untreated" page in zone
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:26 -0700
Message-ID: <20190530215426.13974.82813.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>

From: Alexander Duyck

In order to be able to "treat" memory in an asynchronous fashion we need a way to acquire a block of memory that isn't already treated, and then flush that back in a way that we will not pick it back up again. To achieve that, this patch adds a pair of functions: one to fill a list with pages to be treated, and another to flush the list back to the buddy allocator.
Signed-off-by: Alexander Duyck --- include/linux/gfp.h | 6 +++ mm/page_alloc.c | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 113 insertions(+) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index fb07b503dc45..407a089d861f 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -559,6 +559,12 @@ extern void *page_frag_alloc(struct page_frag_cache *nc, void drain_all_pages(struct zone *zone); void drain_local_pages(struct zone *zone); +#ifdef CONFIG_AERATION +struct page *get_raw_pages(struct zone *zone, unsigned int order, + int migratetype); +void free_treated_page(struct page *page); +#endif + void page_alloc_init_late(void); /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index f4a629b6af96..e79c65413dc9 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2155,6 +2155,113 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order, return NULL; } +#ifdef CONFIG_AERATION +static struct page *get_raw_page_from_free_area(struct free_area *area, + int migratetype) +{ + struct list_head *head = &area->free_list[migratetype]; + struct page *page; + + /* If we have not worked in this free_list before, reset the membrane */ + if (area->treatment_mt != migratetype) { + area->treatment_mt = migratetype; + area->membrane = head; + } + + /* Try to pull in any untreated pages above the membrane */ + page = list_last_entry(area->membrane, struct page, lru); + list_for_each_entry_from_reverse(page, head, lru) { + /* + * If the page in front of the membrane is treated then try + * skimming the top to see if we have any untreated pages + * up there. + */ + if (PageTreated(page)) { + page = list_first_entry(head, struct page, lru); + if (PageTreated(page)) + break; + } + + /* update state of treatment */ + area->treatment_state = TREATMENT_AERATING; + + return page; + } + + /* + * At this point there are no longer any untreated pages between + * the membrane and the first entry of the list.
So we can safely + * set the membrane to the top of the treated region and will mark + * the current migratetype as complete for now. + */ + area->membrane = &page->lru; + area->treatment_state = TREATMENT_SETTLING; + + return NULL; +} + +/** + * get_raw_pages - Provide a "raw" page for treatment by the aerator + * @zone: Zone to draw pages from + * @order: Order to draw pages from + * @migratetype: Migratetype to draw pages from + * + * This function will obtain a page that does not have the Treated value + * set in the page type field. It will attempt to fetch a "raw" page from + * just above the "membrane" and if that is not available it will attempt + * to pull a "raw" page from the head of the free list. + * + * The page will have the migrate type and order stored in the page + * metadata. + * + * Return: page pointer if raw page found, otherwise NULL + */ +struct page *get_raw_pages(struct zone *zone, unsigned int order, + int migratetype) +{ + struct free_area *area = &(zone->free_area[order]); + struct page *page; + + /* Find a page of the appropriate size in the preferred list */ + page = get_raw_page_from_free_area(area, migratetype); + if (page) { + del_page_from_free_area(page, area); + + /* record migratetype and order within page */ + set_pcppage_migratetype(page, migratetype); + set_page_private(page, order); + __mod_zone_freepage_state(zone, -(1 << order), migratetype); + } + + return page; +} +EXPORT_SYMBOL_GPL(get_raw_pages); + +/** + * free_treated_page - Return a now-treated "raw" page back where we got it + * @page: Previously "raw" page that can now be returned after treatment + * + * This function will pull the zone, migratetype, and order information out + * of the page and attempt to return it where it found it. We default to + * using free_one_page to return the page as it is possible that the + * pageblock might have been switched to an isolate migratetype during + * treatment. 
+ */ +void free_treated_page(struct page *page) +{ + unsigned int order, mt; + struct zone *zone; + + zone = page_zone(page); + mt = get_pcppage_migratetype(page); + order = page_private(page); + + set_page_private(page, 0); + + free_one_page(zone, page, page_to_pfn(page), order, mt); +} +EXPORT_SYMBOL_GPL(free_treated_page); +#endif /* CONFIG_AERATION */ /* * This array describes the order lists are fallen back to when

From patchwork Thu May 30 21:54:33 2019
Subject: [RFC PATCH 08/11] mm: Add support for creating memory aeration
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:33 -0700
Message-ID: <20190530215433.13974.43219.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>

From: Alexander Duyck

Add support for "aerating" memory in a guest by pushing individual pages out. This patch is meant to add generic support for this by adding a common framework that can be used later by drivers such as virtio-balloon.
Signed-off-by: Alexander Duyck --- include/linux/memory_aeration.h | 54 +++++++ mm/Kconfig | 5 + mm/Makefile | 1 mm/aeration.c | 320 +++++++++++++++++++++++++++++++++++++++ 4 files changed, 380 insertions(+) create mode 100644 include/linux/memory_aeration.h create mode 100644 mm/aeration.c diff --git a/include/linux/memory_aeration.h b/include/linux/memory_aeration.h new file mode 100644 index 000000000000..5ba0e634f240 --- /dev/null +++ b/include/linux/memory_aeration.h @@ -0,0 +1,54 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_MEMORY_AERATION_H +#define _LINUX_MEMORY_AERATION_H + +#include +#include +#include + +struct zone; + +#define AERATOR_MIN_ORDER pageblock_order + +struct aerator_dev_info { + unsigned long capacity; + struct list_head batch_reactor; + atomic_t refcnt; + void (*react)(struct aerator_dev_info *a_dev_info); +}; + +extern struct static_key aerator_notify_enabled; + +void aerator_cycle(void); +void __aerator_notify(struct zone *zone, int order); + +/** + * aerator_notify_free - Free page notification that will start page processing + * @page: Last page processed + * @zone: Pointer to current zone of last page processed + * @order: Order of last page added to zone + * + * This function is meant to act as a screener for __aerator_notify which + * will determine if a given zone has crossed over the high-water mark that + * will justify us beginning page treatment. If we have crossed that + * threshold then it will start the process of pulling some pages and + * placing them in the batch_reactor list for treatment.
+ */ +static inline void +aerator_notify_free(struct page *page, struct zone *zone, int order) +{ + if (!static_key_false(&aerator_notify_enabled)) + return; + + if (order < AERATOR_MIN_ORDER) + return; + + __aerator_notify(zone, order); +} + +void aerator_shutdown(void); +int aerator_startup(struct aerator_dev_info *sdev); + +#define AERATOR_ZONE_BITS (BITS_TO_LONGS(MAX_NR_ZONES) * BITS_PER_LONG) +#define AERATOR_HWM_BITS (AERATOR_ZONE_BITS * MAX_NUMNODES) +#endif /*_LINUX_MEMORY_AERATION_H */ diff --git a/mm/Kconfig b/mm/Kconfig index f0c76ba47695..34680214cefa 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -236,6 +236,11 @@ config COMPACTION linux-mm@kvack.org. # +# support for memory aeration +config AERATION + bool + +# # support for page migration # config MIGRATION diff --git a/mm/Makefile b/mm/Makefile index ac5e5ba78874..26c2fcd2b89d 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -104,3 +104,4 @@ obj-$(CONFIG_HARDENED_USERCOPY) += usercopy.o obj-$(CONFIG_PERCPU_STATS) += percpu-stats.o obj-$(CONFIG_HMM) += hmm.o obj-$(CONFIG_MEMFD_CREATE) += memfd.o +obj-$(CONFIG_AERATION) += aeration.o diff --git a/mm/aeration.c b/mm/aeration.c new file mode 100644 index 000000000000..aaf8af8d822f --- /dev/null +++ b/mm/aeration.c @@ -0,0 +1,320 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include + +static unsigned long *aerator_hwm; +static struct aerator_dev_info *a_dev_info; +struct static_key aerator_notify_enabled; + +void aerator_shutdown(void) +{ + static_key_slow_dec(&aerator_notify_enabled); + + while (atomic_read(&a_dev_info->refcnt)) + msleep(20); + + kfree(aerator_hwm); + aerator_hwm = NULL; + + a_dev_info = NULL; +} +EXPORT_SYMBOL_GPL(aerator_shutdown); + +int aerator_startup(struct aerator_dev_info *sdev) +{ + size_t size = BITS_TO_LONGS(AERATOR_HWM_BITS) * sizeof(unsigned long); + unsigned long *hwm; + + if (a_dev_info || aerator_hwm) + return -EBUSY; + + a_dev_info = sdev; + + atomic_set(&sdev->refcnt, 0); + + 
hwm = kzalloc(size, GFP_KERNEL); + if (!hwm) { + aerator_shutdown(); + return -ENOMEM; + } + + aerator_hwm = hwm; + + static_key_slow_inc(&aerator_notify_enabled); + + return 0; +} +EXPORT_SYMBOL_GPL(aerator_startup); + +static inline unsigned long *get_aerator_hwm(int nid) +{ + if (!aerator_hwm) + return NULL; + + return aerator_hwm + (BITS_TO_LONGS(MAX_NR_ZONES) * nid); +} + +static int __aerator_fill(struct zone *zone, unsigned int size) +{ + struct list_head *batch = &a_dev_info->batch_reactor; + unsigned long nr_raw = 0; + unsigned int len = 0; + unsigned int order; + + for (order = MAX_ORDER; order-- != AERATOR_MIN_ORDER;) { + struct free_area *area = &(zone->free_area[order]); + int mt = area->treatment_mt; + + /* + * If there are no untreated pages to pull + * then we might as well skip the area. + */ + while (area->nr_free_raw) { + unsigned int count = 0; + struct page *page; + + /* + * If we completed aeration we can let the current + * free list work on settling so that a batch of + * new raw pages can build. In the meantime move on + * to the next migratetype. + */ + if (++mt >= MIGRATE_TYPES) + mt = 0; + + /* + * Pull pages from free list until we have drained + * it or we have filled the batch reactor. + */ + while ((page = get_raw_pages(zone, order, mt))) { + list_add(&page->lru, batch); + + if (++count == (size - len)) + return size; + } + + /* + * If we pulled any pages from this migratetype then + * we must move on to a new free area as we cannot + * move the membrane until after we have decanted the + * pages currently being aerated. + */ + if (count) { + len += count; + break; + } + } + + /* + * Keep a running total of the raw pages we have left + * behind. We will use this to determine if we should + * clear the HWM flag. + */ + nr_raw += area->nr_free_raw; + } + + /* + * If there are no longer enough free pages to fully populate + * the aerator, then we can just shut it down for this zone. 
+ */ + if (nr_raw < a_dev_info->capacity) { + unsigned long *hwm = get_aerator_hwm(zone_to_nid(zone)); + + clear_bit(zone_idx(zone), hwm); + atomic_dec(&a_dev_info->refcnt); + } + + return len; +} + +static unsigned int aerator_fill(int nid, int zid, int budget) +{ + pg_data_t *pgdat = NODE_DATA(nid); + struct zone *zone = &pgdat->node_zones[zid]; + unsigned long flags; + int len; + + spin_lock_irqsave(&zone->lock, flags); + + /* fill aerator with "raw" pages */ + len = __aerator_fill(zone, budget); + + spin_unlock_irqrestore(&zone->lock, flags); + + return len; +} + +static void aerator_fill_and_react(void) +{ + int budget = a_dev_info->capacity; + int nr; + + /* + * We should never be calling this function while there are already + * pages in the reactor being aerated. If we are called under such + * a circumstance report an error. + */ + BUG_ON(!list_empty(&a_dev_info->batch_reactor)); +retry: + /* + * We want to hold one additional reference against the number of + * active hints as we may clear the hint that originally brought us + * here. We will clear it after we have either vaporized the content + * of the pages, or if we discover all pages were stolen out from + * under us. + */ + atomic_inc(&a_dev_info->refcnt); + + for_each_set_bit(nr, aerator_hwm, AERATOR_HWM_BITS) { + int node_id = nr / AERATOR_ZONE_BITS; + int zone_id = nr % AERATOR_ZONE_BITS; + + budget -= aerator_fill(node_id, zone_id, budget); + if (!budget) + goto start_aerating; + } + + if (unlikely(list_empty(&a_dev_info->batch_reactor))) { + /* + * If we never generated any pages, and we were holding the + * only remaining reference to active hints then we can + * just let this go for now and go idle. + */ + if (atomic_dec_and_test(&a_dev_info->refcnt)) + return; + + /* + * There must be a bit populated somewhere, try going + * back through and finding it. 
+ */ + goto retry; + } + +start_aerating: + a_dev_info->react(a_dev_info); +} + +void aerator_decant(void) +{ + struct list_head *list = &a_dev_info->batch_reactor; + struct page *page; + + /* + * This function should never be called on an empty list. If so it + * points to a bug as we should never be running the aerator when + * the list is empty. + */ + WARN_ON(list_empty(&a_dev_info->batch_reactor)); + + while ((page = list_first_entry_or_null(list, struct page, lru))) { + list_del(&page->lru); + + __SetPageTreated(page); + + free_treated_page(page); + } +} + +/** + * aerator_cycle - drain, fill, and start aerating another batch of pages + * + * This function is at the heart of the aerator. It should be called after + * the previous batch of pages has finished being processed by the aerator. + * It will drain the aerator, refill it, and start the next set of pages + * being processed. + */ +void aerator_cycle(void) +{ + aerator_decant(); + + /* + * Now that the pages have been flushed we can drop our reference to + * the active hints list. If there are no further hints that need to + * be processed we can simply go idle. + */ + if (atomic_dec_and_test(&a_dev_info->refcnt)) + return; + + aerator_fill_and_react(); +} +EXPORT_SYMBOL_GPL(aerator_cycle); + +static void __aerator_fill_and_react(struct zone *zone) +{ + /* + * We should never be calling this function while there are already + * pages in the list being aerated. If we are called under such a + * circumstance report an error. + */ + BUG_ON(!list_empty(&a_dev_info->batch_reactor)); + + /* + * We want to hold one additional reference against the number of + * active hints as we may clear the hint that originally brought us + * here. We will clear it after we have either vaporized the content + * of the pages, or if we discover all pages were stolen out from + * under us. 
+ */ + atomic_inc(&a_dev_info->refcnt); + + __aerator_fill(zone, a_dev_info->capacity); + + if (unlikely(list_empty(&a_dev_info->batch_reactor))) { + /* + * If we never generated any pages, and we were holding the + * only remaining reference to active hints then we can just + * let this go for now and go idle. + */ + if (atomic_dec_and_test(&a_dev_info->refcnt)) + return; + + /* + * Another zone must have populated some raw pages that + * need to be processed. Release the zone lock and process + * that zone instead. + */ + spin_unlock(&zone->lock); + aerator_fill_and_react(); + } else { + /* Release the zone lock and begin the page aerator */ + spin_unlock(&zone->lock); + a_dev_info->react(a_dev_info); + } + + /* Reacquire lock so we can resume processing this zone */ + spin_lock(&zone->lock); +} + +void __aerator_notify(struct zone *zone, int order) +{ + int node_id = zone_to_nid(zone); + int zone_id = zone_idx(zone); + unsigned long *hwm; + + if (zone->free_area[order].nr_free_raw < (2 * a_dev_info->capacity)) + return; + + hwm = get_aerator_hwm(node_id); + + /* + * We can use separate test and set operations here as there + * is nothing else that can set or clear this bit while we are + * holding the zone lock. The advantage to doing it this way is + * that we don't have to dirty the cacheline unless we are + * changing the value. 
+ */ + if (test_bit(zone_id, hwm)) + return; + set_bit(zone_id, hwm); + + if (atomic_fetch_inc(&a_dev_info->refcnt)) + return; + + __aerator_fill_and_react(zone); +} +EXPORT_SYMBOL_GPL(__aerator_notify); +

From patchwork Thu May 30 21:54:41 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969297
Subject: [RFC PATCH 09/11] mm: Count isolated pages as "treated"
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:41 -0700
Message-ID: <20190530215441.13974.33609.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>

Treat isolated pages as though they have already been treated, so that we avoid trying to treat pages that have been marked for isolation. We do not want to treat a page only to find, when we put it back, that it has been moved into the isolated migratetype; nor do we want to pull pages out of the isolated migratetype only to find they now belong to a different migratetype. To avoid both problems we specifically mark all isolated pages as "treated" and skip special-case handling for them: since they will never be merged anyway, we can simply add them to the head of the free_list. In addition, we skip over the isolate migratetype when getting raw pages.
Signed-off-by: Alexander Duyck --- include/linux/mmzone.h | 7 +++++++ mm/aeration.c | 8 ++++++-- mm/page_alloc.c | 2 +- 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index be996e8ca6b5..f749ccfcc62a 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -137,6 +137,13 @@ static inline void add_to_free_area_tail(struct page *page, struct free_area *ar { area->nr_free_treated++; +#ifdef CONFIG_MEMORY_ISOLATION + /* Bypass membrane for isolated pages, all are considered "treated" */ + if (migratetype == MIGRATE_ISOLATE) { + list_add(&page->lru, &area->free_list[migratetype]); + return; + } +#endif BUG_ON(area->treatment_mt != migratetype); /* Insert page above membrane, then move membrane to the page */ diff --git a/mm/aeration.c b/mm/aeration.c index aaf8af8d822f..f921295ed3ae 100644 --- a/mm/aeration.c +++ b/mm/aeration.c @@ -1,6 +1,8 @@ // SPDX-License-Identifier: GPL-2.0 #include +#include #include +#include #include #include #include @@ -83,8 +85,10 @@ static int __aerator_fill(struct zone *zone, unsigned int size) * new raw pages can build. In the meantime move on * to the next migratetype. 
*/ - if (++mt >= MIGRATE_TYPES) - mt = 0; + do { + if (++mt >= MIGRATE_TYPES) + mt = 0; + } while (is_migrate_isolate(mt)); /* * Pull pages from free list until we have drained diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e79c65413dc9..e3800221414b 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -989,7 +989,7 @@ static inline void __free_one_page(struct page *page, set_page_order(page, order); area = &zone->free_area[order]; - if (PageTreated(page)) { + if (is_migrate_isolate(migratetype) || PageTreated(page)) { add_to_free_area_treated(page, area, migratetype); return; }

From patchwork Thu May 30 21:54:48 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969299
Subject: [RFC PATCH 10/11] virtio-balloon: Add support for aerating memory via bubble hinting
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:48 -0700
Message-ID: <20190530215448.13974.59362.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>

Add support for aerating memory using the bubble hinting feature provided by virtio-balloon. Bubble hinting differs from the regular balloon functionality in that it is much less durable than a standard memory balloon. Instead of creating a list of pages that cannot be accessed, the pages are only inaccessible while they are being indicated to the virtio interface. Once the interface has acknowledged them they are placed back into their respective free lists and are once again accessible by the guest system.
Signed-off-by: Alexander Duyck --- drivers/virtio/Kconfig | 1 drivers/virtio/virtio_balloon.c | 89 +++++++++++++++++++++++++++++++++++ include/uapi/linux/virtio_balloon.h | 1 3 files changed, 90 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig index 023fc3bc01c6..9cdaccf92c3a 100644 --- a/drivers/virtio/Kconfig +++ b/drivers/virtio/Kconfig @@ -47,6 +47,7 @@ config VIRTIO_BALLOON tristate "Virtio balloon driver" depends on VIRTIO select MEMORY_BALLOON + select AERATION ---help--- This driver supports increasing and decreasing the amount of memory within a KVM guest. diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index 44339fc87cc7..e1399991bc1f 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -18,6 +18,7 @@ #include #include #include +#include /* * Balloon device works in 4K page units. So each page is pointed to by @@ -45,6 +46,7 @@ enum virtio_balloon_vq { VIRTIO_BALLOON_VQ_DEFLATE, VIRTIO_BALLOON_VQ_STATS, VIRTIO_BALLOON_VQ_FREE_PAGE, + VIRTIO_BALLOON_VQ_HINTING, VIRTIO_BALLOON_VQ_MAX }; @@ -52,9 +54,16 @@ enum virtio_balloon_config_read { VIRTIO_BALLOON_CONFIG_READ_CMD_ID = 0, }; +#define VIRTIO_BUBBLE_ARRAY_HINTS_MAX 32 +struct virtio_bubble_page_hint { + __virtio32 pfn; + __virtio32 size; +}; + struct virtio_balloon { struct virtio_device *vdev; - struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq; + struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq, + *hinting_vq; /* Balloon's own wq for cpu-intensive work items */ struct workqueue_struct *balloon_wq; @@ -107,6 +116,11 @@ struct virtio_balloon { unsigned int num_pfns; __virtio32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX]; + /* The array of PFNs we are hinting on */ + unsigned int num_hints; + struct virtio_bubble_page_hint hints[VIRTIO_BUBBLE_ARRAY_HINTS_MAX]; + struct aerator_dev_info a_dev_info; + /* Memory statistics */ struct virtio_balloon_stat 
stats[VIRTIO_BALLOON_S_NR]; @@ -151,6 +165,54 @@ static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq) } +void virtballoon_aerator_react(struct aerator_dev_info *a_dev_info) +{ + struct virtio_balloon *vb = container_of(a_dev_info, + struct virtio_balloon, + a_dev_info); + struct virtqueue *vq = vb->hinting_vq; + struct scatterlist sg; + unsigned int unused; + struct page *page; + + vb->num_hints = 0; + + list_for_each_entry(page, &a_dev_info->batch_reactor, lru) { + struct virtio_bubble_page_hint *hint; + unsigned int size; + + hint = &vb->hints[vb->num_hints++]; + hint->pfn = cpu_to_virtio32(vb->vdev, + page_to_balloon_pfn(page)); + size = VIRTIO_BALLOON_PAGES_PER_PAGE << page_private(page); + hint->size = cpu_to_virtio32(vb->vdev, size); + } + + /* We shouldn't have been called if there is nothing to process */ + if (WARN_ON(vb->num_hints == 0)) + return; + + /* Detach all the used buffers from the vq */ + while (virtqueue_get_buf(vq, &unused)) + ; + + sg_init_one(&sg, vb->hints, + sizeof(vb->hints[0]) * vb->num_hints); + + /* + * We should always be able to add one buffer to an + * empty queue. 
+ */ + virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL); + virtqueue_kick(vq); +} + +static void aerator_settled(struct virtqueue *vq) +{ + /* Drain the current aerator contents, refill, and start next cycle */ + aerator_cycle(); +} + static void set_page_pfns(struct virtio_balloon *vb, __virtio32 pfns[], struct page *page) { @@ -475,6 +537,7 @@ static int init_vqs(struct virtio_balloon *vb) names[VIRTIO_BALLOON_VQ_DEFLATE] = "deflate"; names[VIRTIO_BALLOON_VQ_STATS] = NULL; names[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL; + names[VIRTIO_BALLOON_VQ_HINTING] = NULL; if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) { names[VIRTIO_BALLOON_VQ_STATS] = "stats"; @@ -486,11 +549,19 @@ static int init_vqs(struct virtio_balloon *vb) callbacks[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL; } + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HINTING)) { + names[VIRTIO_BALLOON_VQ_HINTING] = "hinting_vq"; + callbacks[VIRTIO_BALLOON_VQ_HINTING] = aerator_settled; + } + err = vb->vdev->config->find_vqs(vb->vdev, VIRTIO_BALLOON_VQ_MAX, vqs, callbacks, names, NULL, NULL); if (err) return err; + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HINTING)) + vb->hinting_vq = vqs[VIRTIO_BALLOON_VQ_HINTING]; + vb->inflate_vq = vqs[VIRTIO_BALLOON_VQ_INFLATE]; vb->deflate_vq = vqs[VIRTIO_BALLOON_VQ_DEFLATE]; if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) { @@ -929,12 +1000,25 @@ static int virtballoon_probe(struct virtio_device *vdev) if (err) goto out_del_balloon_wq; } + + vb->a_dev_info.react = virtballoon_aerator_react; + vb->a_dev_info.capacity = VIRTIO_BUBBLE_ARRAY_HINTS_MAX; + INIT_LIST_HEAD(&vb->a_dev_info.batch_reactor); + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HINTING)) { + err = aerator_startup(&vb->a_dev_info); + if (err) + goto out_unregister_shrinker; + } + virtio_device_ready(vdev); if (towards_target(vb)) virtballoon_changed(vdev); return 0; +out_unregister_shrinker: + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) + 
virtio_balloon_unregister_shrinker(vb); out_del_balloon_wq: if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) destroy_workqueue(vb->balloon_wq); @@ -963,6 +1047,8 @@ static void virtballoon_remove(struct virtio_device *vdev) { struct virtio_balloon *vb = vdev->priv; + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_HINTING)) + aerator_shutdown(); if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) virtio_balloon_unregister_shrinker(vb); spin_lock_irq(&vb->stop_update_lock); @@ -1032,6 +1118,7 @@ static int virtballoon_validate(struct virtio_device *vdev) VIRTIO_BALLOON_F_DEFLATE_ON_OOM, VIRTIO_BALLOON_F_FREE_PAGE_HINT, VIRTIO_BALLOON_F_PAGE_POISON, + VIRTIO_BALLOON_F_HINTING, }; static struct virtio_driver virtio_balloon_driver = { diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h index a1966cd7b677..2b0f62814e22 100644 --- a/include/uapi/linux/virtio_balloon.h +++ b/include/uapi/linux/virtio_balloon.h @@ -36,6 +36,7 @@ #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ #define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ #define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */ +#define VIRTIO_BALLOON_F_HINTING 5 /* Page hinting virtqueue */ /* Size of a PFN in the balloon interface. 
*/ #define VIRTIO_BALLOON_PFN_SHIFT 12

From patchwork Thu May 30 21:54:55 2019
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10969305
Subject: [RFC PATCH 11/11] mm: Add free page notification hook
From: Alexander Duyck
To: nitesh@redhat.com, kvm@vger.kernel.org, david@redhat.com, mst@redhat.com, dave.hansen@intel.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: yang.zhang.wz@gmail.com, pagupta@redhat.com, riel@surriel.com, konrad.wilk@oracle.com, lcapitulino@redhat.com, wei.w.wang@intel.com, aarcange@redhat.com, pbonzini@redhat.com, dan.j.williams@intel.com, alexander.h.duyck@linux.intel.com
Date: Thu, 30 May 2019 14:54:55 -0700
Message-ID: <20190530215455.13974.87717.stgit@localhost.localdomain>
In-Reply-To: <20190530215223.13974.22445.stgit@localhost.localdomain>
References: <20190530215223.13974.22445.stgit@localhost.localdomain>

Add a hook so that we are notified when a new page is available. We will use this hook to notify the virtio aeration system when we have accumulated enough free higher-order pages to justify pulling some pages and hinting on them.
Signed-off-by: Alexander Duyck --- arch/x86/include/asm/page.h | 11 +++++++++++ include/linux/gfp.h | 4 ++++ mm/page_alloc.c | 2 ++ 3 files changed, 17 insertions(+) diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h index 7555b48803a8..dfd546230120 100644 --- a/arch/x86/include/asm/page.h +++ b/arch/x86/include/asm/page.h @@ -18,6 +18,17 @@ struct page; +#ifdef CONFIG_AERATION +#include + +#define HAVE_ARCH_FREE_PAGE_NOTIFY +static inline void +arch_free_page_notify(struct page *page, struct zone *zone, int order) +{ + aerator_notify_free(page, zone, order); +} + +#endif #include extern struct range pfn_mapped[]; extern int nr_pfn_mapped; diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 407a089d861f..d975e7eabbf8 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -459,6 +459,10 @@ static inline struct zonelist *node_zonelist(int nid, gfp_t flags) #ifndef HAVE_ARCH_FREE_PAGE static inline void arch_free_page(struct page *page, int order) { } #endif +#ifndef HAVE_ARCH_FREE_PAGE_NOTIFY +static inline void +arch_free_page_notify(struct page *page, struct zone *zone, int order) { } +#endif #ifndef HAVE_ARCH_ALLOC_PAGE static inline void arch_alloc_page(struct page *page, int order) { } #endif diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e3800221414b..104763034ce3 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -999,6 +999,8 @@ static inline void __free_one_page(struct page *page, add_to_free_area_tail(page, area, migratetype); else add_to_free_area(page, area, migratetype); + + arch_free_page_notify(page, zone, order); } /*