diff mbox series

[v9,09/11] mm/hugetlb: Introduce nr_free_vmemmap_pages in the struct hstate

Message ID 20201213154534.54826-10-songmuchun@bytedance.com (mailing list archive)
State New, archived
Series Free some vmemmap pages of HugeTLB page

Commit Message

Muchun Song Dec. 13, 2020, 3:45 p.m. UTC
All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
field in the hstate to indicate how many vmemmap pages associated with
a HugeTLB page that we can free to buddy allocator. And initialize it
in the hugetlb_vmemmap_init(). This patch is actual enablement of the
feature.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            |  1 +
 mm/hugetlb_vmemmap.c    | 29 +++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.h    | 10 ++++++----
 4 files changed, 39 insertions(+), 4 deletions(-)
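
The arithmetic that hugetlb_vmemmap_init() performs in this patch can be
checked outside the kernel. The following is a minimal userspace sketch, not
the kernel code itself. The function name and its parameters are made up for
illustration, and RESERVE_VMEMMAP_NR is assumed to be 2, matching the comment
in the patch that the head page and the first tail page are kept.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: two vmemmap pages (head + first tail) always stay mapped. */
#define RESERVE_VMEMMAP_NR 2U

/*
 * Userspace model of the computation in hugetlb_vmemmap_init().
 *   nr_pages         - base pages per huge page
 *   struct_page_size - stand-in for sizeof(struct page)
 *   page_shift       - log2 of the base page size
 */
static unsigned int nr_free_vmemmap_pages(unsigned long nr_pages,
					  size_t struct_page_size,
					  unsigned int page_shift)
{
	unsigned long vmemmap_pages;

	assert(page_shift < 8 * sizeof(unsigned long));
	vmemmap_pages = (nr_pages * struct_page_size) >> page_shift;

	/* Without this guard the unsigned subtraction would wrap around. */
	if (vmemmap_pages > RESERVE_VMEMMAP_NR)
		return (unsigned int)(vmemmap_pages - RESERVE_VMEMMAP_NR);
	return 0;
}
```

Assuming a 64-byte struct page and 4 KiB base pages, a 2 MiB huge page
(512 base pages) needs 8 vmemmap pages, so 6 can be freed; a 1 GiB huge page
needs 4096, so 4094 can be freed. With 64 KiB base pages and a huge page made
of only 32 base pages, the struct pages occupy less than one vmemmap page and
nothing can be freed, which is presumably the case the comment about aarch64
has in mind.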

Comments

Oscar Salvador Dec. 16, 2020, 1:43 p.m. UTC | #1
On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> field in the hstate to indicate how many vmemmap pages associated with
> a HugeTLB page that we can free to buddy allocator. And initialize it
"can be freed to buddy allocator"

> in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> feature.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>

With below nits addressed you can add:

Reviewed-by: Oscar Salvador <osalvador@suse.de>

>  static int __init early_hugetlb_free_vmemmap_param(char *buf)
>  {
> +	/* We cannot optimize if a "struct page" crosses page boundaries. */
> +	if (!is_power_of_2(sizeof(struct page)))
> +		return 0;
> +

I wonder if we should report a warning in case someone wants to enable this
feature and the struct page size is not a power of 2.
In case someone wonders why it does not work for them.

> +void __init hugetlb_vmemmap_init(struct hstate *h)
> +{
> +	unsigned int nr_pages = pages_per_huge_page(h);
> +	unsigned int vmemmap_pages;
> +
> +	if (!hugetlb_free_vmemmap_enabled)
> +		return;
> +
> +	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> +	/*
> +	 * The head page and the first tail page are not to be freed to buddy
> +	 * system, the others page will map to the first tail page. So there
> +	 * are the remaining pages that can be freed.
"the other pages will map to the first tail page, so they can be freed."
> +	 *
> +	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> +	 * on some architectures (e.g. aarch64). See Documentation/arm64/
> +	 * hugetlbpage.rst for more details.
> +	 */
> +	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> +		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> +
> +	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> +		h->name);

Maybe specify this is hugetlb code:

pr_info("%s: blabla", __func__, ...)
or
pr_info("hugetlb: blalala", ...);

although I am not sure whether we need that at all, or maybe just use
pr_debug().
Muchun Song Dec. 16, 2020, 1:56 p.m. UTC | #2
On Wed, Dec 16, 2020 at 9:44 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> > All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> > field in the hstate to indicate how many vmemmap pages associated with
> > a HugeTLB page that we can free to buddy allocator. And initialize it
> "can be freed to buddy allocator"
>
> > in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> > feature.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> With below nits addressed you can add:
>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks.

>
> >  static int __init early_hugetlb_free_vmemmap_param(char *buf)
> >  {
> > +     /* We cannot optimize if a "struct page" crosses page boundaries. */
> > +     if (!is_power_of_2(sizeof(struct page)))
> > +             return 0;
> > +
>
> I wonder if we should report a warning in case someone wants to enable this
> feature and the struct page size is not a power of 2.
> In case someone wonders why it does not work for them.
>
> > +void __init hugetlb_vmemmap_init(struct hstate *h)
> > +{
> > +     unsigned int nr_pages = pages_per_huge_page(h);
> > +     unsigned int vmemmap_pages;
> > +
> > +     if (!hugetlb_free_vmemmap_enabled)
> > +             return;
> > +
> > +     vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> > +     /*
> > +      * The head page and the first tail page are not to be freed to buddy
> > +      * system, the others page will map to the first tail page. So there
> > +      * are the remaining pages that can be freed.
> "the other pages will map to the first tail page, so they can be freed."
> > +      *
> > +      * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> > +      * on some architectures (e.g. aarch64). See Documentation/arm64/
> > +      * hugetlbpage.rst for more details.
> > +      */
> > +     if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> > +             h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> > +
> > +     pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> > +             h->name);
>
> Maybe specify this is hugetlb code:
>
> pr_info("%s: blabla", __func__, ...)
> or
> pr_info("hugetlb: blalala", ...);
>
> although I am not sure whether we need that at all, or maybe just use
> pr_debug().

The pr_info can tell the user whether the feature is enabled. From this
point of view, it makes sense. Right?

Thanks.

>
> --
> Oscar Salvador
> SUSE L3
Oscar Salvador Dec. 16, 2020, 10:12 p.m. UTC | #3
On Wed, Dec 16, 2020 at 09:56:47PM +0800, Muchun Song wrote:
> The pr_info can tell the user whether the feature is enabled. From this
> point of view, it makes sense. Right?

Well, I guess so.
Anyway, it is not as if we are going to flood the logs, so it is ok.
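
For reference, the usual kernel way to tag messages with a subsystem name is
to define pr_fmt() before the pr_*() calls. The following userspace sketch
only imitates that mechanism with snprintf(). The "HugeTLB: " prefix, the
pr_info_to() macro and format_vmemmap_msg() are assumptions made up for
illustration and are not part of this patch. It also prints the count with %u,
since nr_free_vmemmap_pages is an unsigned int.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>	/* for strcmp() when checking the result */

/* Kernel convention: pr_fmt() prepends a tag to every pr_*() format. */
#define pr_fmt(fmt) "HugeTLB: " fmt

/* Userspace stand-in for pr_info(): formats into a caller-supplied buffer. */
#define pr_info_to(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), __VA_ARGS__)

/* Hypothetical helper: builds the message from the patch, with the prefix. */
static const char *format_vmemmap_msg(unsigned int nr_free,
				      const char *hstate_name)
{
	static char line[128];

	assert(hstate_name != NULL);
	pr_info_to(line, sizeof(line), "can free %u vmemmap pages for %s\n",
		   nr_free, hstate_name);
	return line;
}
```

With these assumptions, format_vmemmap_msg(6, "hugepages-2048kB") yields
"HugeTLB: can free 6 vmemmap pages for hugepages-2048kB\n".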
Muchun Song Dec. 17, 2020, 8:34 a.m. UTC | #4
On Wed, Dec 16, 2020 at 9:44 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Sun, Dec 13, 2020 at 11:45:32PM +0800, Muchun Song wrote:
> > All the infrastructure is ready, so we introduce nr_free_vmemmap_pages
> > field in the hstate to indicate how many vmemmap pages associated with
> > a HugeTLB page that we can free to buddy allocator. And initialize it
> "can be freed to buddy allocator"
>
> > in the hugetlb_vmemmap_init(). This patch is actual enablement of the
> > feature.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> With below nits addressed you can add:
>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>
> >  static int __init early_hugetlb_free_vmemmap_param(char *buf)
> >  {
> > +     /* We cannot optimize if a "struct page" crosses page boundaries. */
> > +     if (!is_power_of_2(sizeof(struct page)))
> > +             return 0;
> > +
>
> I wonder if we should report a warning in case someone wants to enable this
> feature and the struct page size is not a power of 2.
> In case someone wonders why it does not work for them.

Agree. I think that we should add a warning message here.
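
A rough userspace sketch of what such a check could look like follows. It is
only an illustration of the idea, not the code that will go into the next
version: is_pow2() mirrors the test done by the kernel's is_power_of_2(),
while vmemmap_param_warning() and the message text are hypothetical. In the
kernel the message would presumably be emitted with pr_warn() from
early_hugetlb_free_vmemmap_param().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same test as the kernel's is_power_of_2(): exactly one bit is set. */
static bool is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: returns NULL when the optimization is possible,
 * otherwise the warning text the caller should log.
 */
static const char *vmemmap_param_warning(size_t struct_page_size)
{
	assert(struct_page_size > 0);	/* sizeof() is never zero */

	if (is_pow2(struct_page_size))
		return NULL;

	return "cannot free vmemmap pages: \"struct page\" size is not a power of 2";
}
```

For example, a 64-byte struct page passes the check, while a 56-byte or
80-byte one would produce the warning, because such a struct page can straddle
a page boundary.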

>
> > +void __init hugetlb_vmemmap_init(struct hstate *h)
> > +{
> > +     unsigned int nr_pages = pages_per_huge_page(h);
> > +     unsigned int vmemmap_pages;
> > +
> > +     if (!hugetlb_free_vmemmap_enabled)
> > +             return;
> > +
> > +     vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
> > +     /*
> > +      * The head page and the first tail page are not to be freed to buddy
> > +      * system, the others page will map to the first tail page. So there
> > +      * are the remaining pages that can be freed.
> "the other pages will map to the first tail page, so they can be freed."
> > +      *
> > +      * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
> > +      * on some architectures (e.g. aarch64). See Documentation/arm64/
> > +      * hugetlbpage.rst for more details.
> > +      */
> > +     if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
> > +             h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
> > +
> > +     pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
> > +             h->name);
>
> Maybe specify this is hugetlb code:
>
> pr_info("%s: blabla", __func__, ...)
> or
> pr_info("hugetlb: blalala", ...);
>
> although I am not sure whether we need that at all, or maybe just use
> pr_debug().
>
> --
> Oscar Salvador
> SUSE L3

Patch

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7f47f0eeca3b..66d82ae7b712 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -492,6 +492,9 @@  struct hstate {
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+	unsigned int nr_free_vmemmap_pages;
+#endif
 #ifdef CONFIG_CGROUP_HUGETLB
 	/* cgroup control files */
 	struct cftype cgroup_files_dfl[7];
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b0847b2ce01d..2b45235a70e9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3323,6 +3323,7 @@  void __init hugetlb_add_hstate(unsigned int order)
 	h->next_nid_to_free = first_memory_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
+	hugetlb_vmemmap_init(h);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 64ad929cac61..d3b4c39f67c0 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -184,6 +184,10 @@  bool hugetlb_free_vmemmap_enabled;
 
 static int __init early_hugetlb_free_vmemmap_param(char *buf)
 {
+	/* We cannot optimize if a "struct page" crosses page boundaries. */
+	if (!is_power_of_2(sizeof(struct page)))
+		return 0;
+
 	if (!buf)
 		return -EINVAL;
 
@@ -222,3 +226,28 @@  void free_huge_page_vmemmap(struct hstate *h, struct page *head)
 	vmemmap_remap_reuse(vmemmap_addr + RESERVE_VMEMMAP_SIZE,
 			    free_vmemmap_pages_size_per_hpage(h));
 }
+
+void __init hugetlb_vmemmap_init(struct hstate *h)
+{
+	unsigned int nr_pages = pages_per_huge_page(h);
+	unsigned int vmemmap_pages;
+
+	if (!hugetlb_free_vmemmap_enabled)
+		return;
+
+	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
+	/*
+	 * The head page and the first tail page are not to be freed to buddy
+	 * system, the others page will map to the first tail page. So there
+	 * are the remaining pages that can be freed.
+	 *
+	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
+	 * on some architectures (e.g. aarch64). See Documentation/arm64/
+	 * hugetlbpage.rst for more details.
+	 */
+	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
+		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
+
+	pr_info("can free %d vmemmap pages for %s\n", h->nr_free_vmemmap_pages,
+		h->name);
+}
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index b2c8d2f11d48..8fd9ae113dbd 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -13,17 +13,15 @@ 
 #ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
 void alloc_huge_page_vmemmap(struct hstate *h, struct page *head);
 void free_huge_page_vmemmap(struct hstate *h, struct page *head);
+void hugetlb_vmemmap_init(struct hstate *h);
 
 /*
  * How many vmemmap pages associated with a HugeTLB page that can be freed
  * to the buddy allocator.
- *
- * Todo: Returns zero for now, which means the feature is disabled. We will
- * enable it once all the infrastructure is there.
  */
 static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
-	return 0;
+	return h->nr_free_vmemmap_pages;
 }
 #else
 static inline void alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
@@ -38,5 +36,9 @@  static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
 	return 0;
 }
+
+static inline void hugetlb_vmemmap_init(struct hstate *h)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP */
 #endif /* _LINUX_HUGETLB_VMEMMAP_H */