
[v3,04/21] mm/hugetlb: Introduce nr_free_vmemmap_pages in the struct hstate

Message ID 20201108141113.65450-5-songmuchun@bytedance.com (mailing list archive)
State New, archived
Series Free some vmemmap pages of hugetlb page

Commit Message

Muchun Song Nov. 8, 2020, 2:10 p.m. UTC
If the size of a hugetlb page is 2MB, 512 struct page structures
(8 pages) are associated with it. As far as I know, we only use the
first 4 struct page structures; this use of the first 4 comes from
HUGETLB_CGROUP_MIN_ORDER.

For tail pages, the value of compound_head is the same. So we can reuse
the first page of tail page structs: we map the virtual addresses of the
remaining 6 pages of tail page structs to the first tail page struct,
and then free those 6 pages. Therefore, we need to reserve at least 2
pages as vmemmap areas.

So we introduce a new nr_free_vmemmap_pages field in the hstate to
indicate how many vmemmap pages associated with a hugetlb page can be
freed to the buddy system.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

Comments

Oscar Salvador Nov. 9, 2020, 4:48 p.m. UTC | #1
On Sun, Nov 08, 2020 at 10:10:56PM +0800, Muchun Song wrote:
> +#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
> +/*
> + * There are 512 struct page structs(8 pages) associated with each 2MB
> + * hugetlb page. For tail pages, the value of compound_dtor is the same.
I guess you meant "For tail pages, the value of compound_head ...", right?

> + * So we can reuse first page of tail page structs. We map the virtual
> + * addresses of the remaining 6 pages of tail page structs to the first
> + * tail page struct, and then free these 6 pages. Therefore, we need to
> + * reserve at least 2 pages as vmemmap areas.
> + */
> +#define RESERVE_VMEMMAP_NR	2U
> +
> +static void __init hugetlb_vmemmap_init(struct hstate *h)
> +{
> +	unsigned int order = huge_page_order(h);
> +	unsigned int vmemmap_pages;
> +
> +	vmemmap_pages = ((1 << order) * sizeof(struct page)) >> PAGE_SHIFT;
> +	/*
> +	 * The head page and the first tail page not free to buddy system,

"The head page and the first tail page are not to be freed to..." better?


> +	 * the others page will map to the first tail page. So there are
> +	 * (@vmemmap_pages - RESERVE_VMEMMAP_NR) pages can be freed.
						      ^^^
                                                      that

> +	else
> +		h->nr_free_vmemmap_pages = 0;

I would specify that this is not expected to happen.
(At least I could not come up with a real scenario unless the system is
corrupted)
So, I would drop a brief comment pointing out that it is only a safety
net.


Unrelated to this patch but related in general, I am not sure about Mike but
would it be cleaner to move all the vmemmap functions to hugetlb_vmemmap.c?
hugetlb code is quite tricky, so I am not sure about stuffing more code
in there.
Muchun Song Nov. 10, 2020, 2:42 a.m. UTC | #2
On Tue, Nov 10, 2020 at 12:48 AM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Sun, Nov 08, 2020 at 10:10:56PM +0800, Muchun Song wrote:
> > +#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
> > +/*
> > + * There are 512 struct page structs(8 pages) associated with each 2MB
> > + * hugetlb page. For tail pages, the value of compound_dtor is the same.
> I guess you meant "For tail pages, the value of compound_head ...", right?

Yeah, thanks.

>
> > + * So we can reuse first page of tail page structs. We map the virtual
> > + * addresses of the remaining 6 pages of tail page structs to the first
> > + * tail page struct, and then free these 6 pages. Therefore, we need to
> > + * reserve at least 2 pages as vmemmap areas.
> > + */
> > +#define RESERVE_VMEMMAP_NR   2U
> > +
> > +static void __init hugetlb_vmemmap_init(struct hstate *h)
> > +{
> > +     unsigned int order = huge_page_order(h);
> > +     unsigned int vmemmap_pages;
> > +
> > +     vmemmap_pages = ((1 << order) * sizeof(struct page)) >> PAGE_SHIFT;
> > +     /*
> > +      * The head page and the first tail page not free to buddy system,
>
> "The head page and the first tail page are not to be freed to..." better?

Yeah, sorry for my poor English :).

>
>
> > +      * the others page will map to the first tail page. So there are
> > +      * (@vmemmap_pages - RESERVE_VMEMMAP_NR) pages can be freed.
>                                                       ^^^
>                                                       that
>
> > +     else
> > +             h->nr_free_vmemmap_pages = 0;
>
> I would specify that this is not expected to happen.
> (At least I could not come up with a real scenario unless the system is
> corrupted)
> So, I would drop a brief comment pointing out that it is only a safety
> net.

I will add a comment to point out this.

>
>
> Unrelated to this patch but related in general, I am not sure about Mike but
> would it be cleaner to move all the vmemmap functions to hugetlb_vmemmap.c?
> hugetlb code is quite tricky, so I am not sure about stuffing more code
> in there.
>

I also think you are right; moving all the vmemmap functions to
hugetlb_vmemmap.c may make the code cleaner.

Hi Mike, what's your opinion?

Thanks.

> --
> Oscar Salvador
> SUSE L3
Mike Kravetz Nov. 10, 2020, 7:38 p.m. UTC | #3
On 11/9/20 6:42 PM, Muchun Song wrote:
> On Tue, Nov 10, 2020 at 12:48 AM Oscar Salvador <osalvador@suse.de> wrote:
>>
>> On Sun, Nov 08, 2020 at 10:10:56PM +0800, Muchun Song wrote:
>>
>> Unrelated to this patch but related in general, I am not sure about Mike but
>> would it be cleaner to move all the vmemmap functions to hugetlb_vmemmap.c?
>> hugetlb code is quite tricky, so I am not sure about stuffing more code
>> in there.
>>
> 
> I also think you are right; moving all the vmemmap functions to
> hugetlb_vmemmap.c may make the code cleaner.
> 
> Hi Mike, what's your opinion?

I would be happy to see this in a separate file.  As Oscar mentions, the
hugetlb.c file/code is already somewhat difficult to read and understand.
Muchun Song Nov. 11, 2020, 3:22 a.m. UTC | #4
On Wed, Nov 11, 2020 at 3:40 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 11/9/20 6:42 PM, Muchun Song wrote:
> > On Tue, Nov 10, 2020 at 12:48 AM Oscar Salvador <osalvador@suse.de> wrote:
> >>
> >> On Sun, Nov 08, 2020 at 10:10:56PM +0800, Muchun Song wrote:
> >>
> >> Unrelated to this patch but related in general, I am not sure about Mike but
> >> would it be cleaner to move all the vmemmap functions to hugetlb_vmemmap.c?
> >> hugetlb code is quite tricky, so I am not sure about stuffing more code
> >> in there.
> >>
> >
> > I also think you are right; moving all the vmemmap functions to
> > hugetlb_vmemmap.c may make the code cleaner.
> >
> > Hi Mike, what's your opinion?
>
> I would be happy to see this in a separate file.  As Oscar mentions, the
> hugetlb.c file/code is already somewhat difficult to read and understand.

Got it. I will do this. Thanks.

> --
> Mike Kravetz

Patch

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d5cc5f802dd4..eed3dd3bd626 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -492,6 +492,9 @@  struct hstate {
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+	unsigned int nr_free_vmemmap_pages;
+#endif
 #ifdef CONFIG_CGROUP_HUGETLB
 	/* cgroup control files */
 	struct cftype cgroup_files_dfl[7];
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 81a41aa080a5..a0007902fafb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1292,6 +1292,42 @@  static inline void destroy_compound_gigantic_page(struct page *page,
 						unsigned int order) { }
 #endif
 
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+/*
+ * There are 512 struct page structs(8 pages) associated with each 2MB
+ * hugetlb page. For tail pages, the value of compound_dtor is the same.
+ * So we can reuse first page of tail page structs. We map the virtual
+ * addresses of the remaining 6 pages of tail page structs to the first
+ * tail page struct, and then free these 6 pages. Therefore, we need to
+ * reserve at least 2 pages as vmemmap areas.
+ */
+#define RESERVE_VMEMMAP_NR	2U
+
+static void __init hugetlb_vmemmap_init(struct hstate *h)
+{
+	unsigned int order = huge_page_order(h);
+	unsigned int vmemmap_pages;
+
+	vmemmap_pages = ((1 << order) * sizeof(struct page)) >> PAGE_SHIFT;
+	/*
+	 * The head page and the first tail page not free to buddy system,
+	 * the others page will map to the first tail page. So there are
+	 * (@vmemmap_pages - RESERVE_VMEMMAP_NR) pages can be freed.
+	 */
+	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
+		h->nr_free_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
+	else
+		h->nr_free_vmemmap_pages = 0;
+
+	pr_debug("HugeTLB: can free %d vmemmap pages for %s\n",
+		 h->nr_free_vmemmap_pages, h->name);
+}
+#else
+static inline void hugetlb_vmemmap_init(struct hstate *h)
+{
+}
+#endif
+
 static void update_and_free_page(struct hstate *h, struct page *page)
 {
 	int i;
@@ -3285,6 +3321,8 @@  void __init hugetlb_add_hstate(unsigned int order)
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
 
+	hugetlb_vmemmap_init(h);
+
 	parsed_hstate = h;
 }