diff mbox series

[v9,11/11] mm/hugetlb: Optimize the code with the help of the compiler

Message ID 20201213154534.54826-12-songmuchun@bytedance.com (mailing list archive)
State New, archived
Headers show
Series Free some vmemmap pages of HugeTLB page | expand

Commit Message

Muchun Song Dec. 13, 2020, 3:45 p.m. UTC
We cannot optimize if a "struct page" crosses page boundaries. If
it is true, we can optimize the code with the help of a compiler.
When free_vmemmap_pages_per_hpage() returns zero, most functions are
optimized by the compiler.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/hugetlb.h | 3 ++-
 mm/hugetlb_vmemmap.c    | 7 +++++++
 mm/hugetlb_vmemmap.h    | 5 +++--
 3 files changed, 12 insertions(+), 3 deletions(-)

Comments

Oscar Salvador Dec. 17, 2020, 10:31 a.m. UTC | #1
On Sun, Dec 13, 2020 at 11:45:34PM +0800, Muchun Song wrote:
>  static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
>  {
> -	return h->nr_free_vmemmap_pages;
> +	return h->nr_free_vmemmap_pages && is_power_of_2(sizeof(struct page));

This is wrong as it will return either true or false, but not what we want:

	static inline unsigned long free_vmemmap_pages_size_per_hpage(struct hstate *h)
	{
	        return (unsigned long)free_vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
	}

the above will compute to 4096, which is wrong for obvious reasons.
Muchun Song Dec. 17, 2020, 10:42 a.m. UTC | #2
On Thu, Dec 17, 2020 at 6:32 PM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Sun, Dec 13, 2020 at 11:45:34PM +0800, Muchun Song wrote:
> >  static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
> >  {
> > -     return h->nr_free_vmemmap_pages;
> > +     return h->nr_free_vmemmap_pages && is_power_of_2(sizeof(struct page));
>
> This is wrong as it will return either true or false, but not what we want:

Yeah, very thanks for pointing that out.

>
>         static inline unsigned long free_vmemmap_pages_size_per_hpage(struct hstate *h)
>         {
>                 return (unsigned long)free_vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
>         }
>
> the above will compute to 4096, which is wrong for obvious reasons.

You are right. It is my mistake. Thanks Oscar.

>
> --
> Oscar Salvador
> SUSE L3
diff mbox series

Patch

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7295f6b3d55e..adc17765e0e9 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -791,7 +791,8 @@  extern bool hugetlb_free_vmemmap_enabled;
 
 static inline bool is_hugetlb_free_vmemmap_enabled(void)
 {
-	return hugetlb_free_vmemmap_enabled;
+	return hugetlb_free_vmemmap_enabled &&
+	       is_power_of_2(sizeof(struct page));
 }
 #else
 static inline bool is_hugetlb_free_vmemmap_enabled(void)
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index bbcefd5fb7d1..e83c48c63a7b 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -240,6 +240,13 @@  void __init hugetlb_vmemmap_init(struct hstate *h)
 	BUILD_BUG_ON(NR_USED_SUBPAGE >=
 		     RESERVE_VMEMMAP_SIZE / sizeof(struct page));
 
+	/*
+	 * The compiler can help us to optimize this function to null
+	 * when the size of the struct page is not power of 2.
+	 */
+	if (!is_power_of_2(sizeof(struct page)))
+		return;
+
 	if (!hugetlb_free_vmemmap_enabled)
 		return;
 
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 8fd9ae113dbd..1a29a80f9fe1 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -17,11 +17,12 @@  void hugetlb_vmemmap_init(struct hstate *h);
 
 /*
  * How many vmemmap pages associated with a HugeTLB page that can be freed
- * to the buddy allocator.
+ * to the buddy allocator. The checking of the is_power_of_2() aims to let
+ * the compiler help us optimize the code as much as possible.
  */
 static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
 {
-	return h->nr_free_vmemmap_pages;
+	return h->nr_free_vmemmap_pages && is_power_of_2(sizeof(struct page));
 }
 #else
 static inline void alloc_huge_page_vmemmap(struct hstate *h, struct page *head)