Message ID | 1561350068-8966-1-git-send-email-kernelfans@gmail.com (mailing list archive)
---|---
State | New, archived
Series | mm/hugetlb: allow gigantic page allocation to migrate away smaller huge page
On Mon, Jun 24, 2019 at 12:21:08PM +0800, Pingfan Liu wrote:
> The current pfn_range_valid_gigantic() rejects a pud huge page allocation
> if there is a pmd huge page inside the candidate range.
>
> But pud huge pages are a scarcer resource, since they must be aligned on
> 1GB on x86. It is worth migrating pmd huge pages away to make room for a
> pud huge page.
>
> The same logic applies to pgd and pud huge pages.

I'm sorry but I don't quite understand why we should do this. Is this a bug or
an optimization? It sounds like an optimization.

>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/hugetlb.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ac843d3..02d1978 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>  			unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	unsigned long i, end_pfn = start_pfn + nr_pages;
> -	struct page *page;
> +	struct page *page = pfn_to_page(start_pfn);
> +
> +	if (PageHuge(page))
> +		if (compound_order(compound_head(page)) >= nr_pages)

I don't think you want compound_order() here.

Ira

> +			return false;
>
>  	for (i = start_pfn; i < end_pfn; i++) {
>  		if (!pfn_valid(i))
> @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>  		if (page_count(page) > 0)
>  			return false;
>
> -		if (PageHuge(page))
> -			return false;
>  	}
>
>  	return true;
> --
> 2.7.5
>
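For reference, compound_order() returns the order of a compound page (log2 of its page count), not a page count, so comparing it directly against nr_pages mixes units. A minimal sketch of a corrected check, assuming the intent of the posted hunk is to bail out when the range already starts with a huge page at least as large as the requested one:

```c
/*
 * Sketch only, under the stated assumption; not the author's posted
 * code. Reject the candidate range when it already begins with a huge
 * page that is as large as (or larger than) the requested range.
 */
struct page *head = compound_head(page);

/* compound_order() is log2(pages); shift it into a page count first. */
if (PageHuge(page) && (1UL << compound_order(head)) >= nr_pages)
	return false;
```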
On 06/24/2019 09:51 AM, Pingfan Liu wrote:
> The current pfn_range_valid_gigantic() rejects a pud huge page allocation
> if there is a pmd huge page inside the candidate range.
>
> But pud huge pages are a scarcer resource, since they must be aligned on
> 1GB on x86. It is worth migrating pmd huge pages away to make room for a
> pud huge page.
>
> The same logic applies to pgd and pud huge pages.

The huge page in the range can be either a THP or a HugeTLB page, and
migrating them has different costs and chances of success. THP migration
will involve splitting if THP migration is not enabled, plus all the
related TLB costs. Are you sure that a PUD HugeTLB allocation should
really go through all this? Is there any guarantee that after migrating
multiple PMD-sized THP/HugeTLB pages in the given range, the allocation
request for the PUD will succeed?

>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/hugetlb.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ac843d3..02d1978 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>  			unsigned long start_pfn, unsigned long nr_pages)
>  {
>  	unsigned long i, end_pfn = start_pfn + nr_pages;
> -	struct page *page;
> +	struct page *page = pfn_to_page(start_pfn);
> +
> +	if (PageHuge(page))
> +		if (compound_order(compound_head(page)) >= nr_pages)
> +			return false;
>
>  	for (i = start_pfn; i < end_pfn; i++) {
>  		if (!pfn_valid(i))
> @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
>  		if (page_count(page) > 0)
>  			return false;
>
> -		if (PageHuge(page))
> -			return false;
>  	}
>
>  	return true;
>

So except in the case where there is a bigger huge page in the range, this
will attempt migrating everything on the way. As mentioned before, if at all
this is a good idea, it needs to differentiate between HugeTLB and THP, and
take the costs of migration and the chances of a subsequent allocation
attempt succeeding into account.
On Mon, Jun 24, 2019 at 1:03 PM Ira Weiny <ira.weiny@intel.com> wrote:
>
> On Mon, Jun 24, 2019 at 12:21:08PM +0800, Pingfan Liu wrote:
> > The current pfn_range_valid_gigantic() rejects a pud huge page allocation
> > if there is a pmd huge page inside the candidate range.
> >
> > But pud huge pages are a scarcer resource, since they must be aligned on
> > 1GB on x86. It is worth migrating pmd huge pages away to make room for a
> > pud huge page.
> >
> > The same logic applies to pgd and pud huge pages.
>
> I'm sorry but I don't quite understand why we should do this. Is this a bug or
> an optimization? It sounds like an optimization.

Yes, an optimization. It can help us succeed in allocating a 1GB hugetlb
page when some 2MB hugetlb pages sit in the candidate range. Allocating a
1GB hugetlb page has a tougher requirement: not just a contiguous 1GB
range, but one aligned on 1GB, while allocating a 2MB range is much easier.

>
> >
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Mike Kravetz <mike.kravetz@oracle.com>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  mm/hugetlb.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index ac843d3..02d1978 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
> >  			unsigned long start_pfn, unsigned long nr_pages)
> >  {
> >  	unsigned long i, end_pfn = start_pfn + nr_pages;
> > -	struct page *page;
> > +	struct page *page = pfn_to_page(start_pfn);
> > +
> > +	if (PageHuge(page))
> > +		if (compound_order(compound_head(page)) >= nr_pages)
>
> I don't think you want compound_order() here.

Yes, you are right.

Thanks,
Pingfan

>
> Ira
>
> > +			return false;
> >
> >  	for (i = start_pfn; i < end_pfn; i++) {
> >  		if (!pfn_valid(i))
> > @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
> >  		if (page_count(page) > 0)
> >  			return false;
> >
> > -		if (PageHuge(page))
> > -			return false;
> >  	}
> >
> >  	return true;
> > --
> > 2.7.5
> >
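To illustrate the alignment point: the zone scan for a gigantic page advances in gigantic-page-sized, aligned strides, so only a handful of candidate ranges exist per zone. A sketch assuming 4K base pages and the general zone-scanning shape of alloc_gigantic_page() around that time (the `zone` variable and loop bounds here are illustrative):

```c
/*
 * Illustrative sketch: a 1GB gigantic page needs a pfn range that is
 * both contiguous and 1GB-aligned, so the scan advances one aligned
 * stride at a time instead of one pfn at a time.
 */
unsigned long nr_pages = 1UL << (30 - PAGE_SHIFT);	/* 1GB of 4K pages */
unsigned long pfn = ALIGN(zone->zone_start_pfn, nr_pages);

while (pfn + nr_pages <= zone_end_pfn(zone)) {
	if (pfn_range_valid_gigantic(zone, pfn, nr_pages))
		break;	/* aligned, contiguous candidate found */
	pfn += nr_pages;
}
```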
On Mon, Jun 24, 2019 at 1:16 PM Anshuman Khandual
<anshuman.khandual@arm.com> wrote:
>
>
>
> On 06/24/2019 09:51 AM, Pingfan Liu wrote:
> > The current pfn_range_valid_gigantic() rejects a pud huge page allocation
> > if there is a pmd huge page inside the candidate range.
> >
> > But pud huge pages are a scarcer resource, since they must be aligned on
> > 1GB on x86. It is worth migrating pmd huge pages away to make room for a
> > pud huge page.
> >
> > The same logic applies to pgd and pud huge pages.
>
> The huge page in the range can be either a THP or a HugeTLB page, and
> migrating them has different costs and chances of success. THP migration
> will involve splitting if THP migration is not enabled, plus all the
> related TLB costs. Are you sure that a PUD HugeTLB allocation should
> really go through all this? Is there any

The current code already drives PMD THP out of the range. This patch just
wants to let a PUD hugetlb allocation survive PMD hugetlb pages as well.

> guarantee that after migrating multiple PMD-sized THP/HugeTLB pages in the
> given range, the allocation request for the PUD will succeed?

Migration is complicated, but to my understanding, if there is no gup pin
in the range and there is enough memory, including swap, then it will
succeed.

>
> >
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Mike Kravetz <mike.kravetz@oracle.com>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  mm/hugetlb.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index ac843d3..02d1978 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
> >  			unsigned long start_pfn, unsigned long nr_pages)
> >  {
> >  	unsigned long i, end_pfn = start_pfn + nr_pages;
> > -	struct page *page;
> > +	struct page *page = pfn_to_page(start_pfn);
> > +
> > +	if (PageHuge(page))
> > +		if (compound_order(compound_head(page)) >= nr_pages)
> > +			return false;
> >
> >  	for (i = start_pfn; i < end_pfn; i++) {
> >  		if (!pfn_valid(i))
> > @@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
> >  		if (page_count(page) > 0)
> >  			return false;
> >
> > -		if (PageHuge(page))
> > -			return false;
> >  	}
> >
> >  	return true;
> >
>
> So except in the case where there is a bigger huge page in the range, this
> will attempt migrating everything on the way. As mentioned before, if at all
> this is a good idea, it needs to differentiate between HugeTLB and THP, and
> take the costs of migration and the chances of a subsequent allocation
> attempt succeeding into account.

Sorry, but I think this logic is only for hugetlb. The caller
alloc_gigantic_page() is only used inside mm/hugetlb.c, not by
huge_memory.c.

Thanks,
Pingfan
On 06/24/2019 11:40 AM, Pingfan Liu wrote:
> On Mon, Jun 24, 2019 at 1:16 PM Anshuman Khandual
> <anshuman.khandual@arm.com> wrote:
>>
>>
>>
>> On 06/24/2019 09:51 AM, Pingfan Liu wrote:
>>> The current pfn_range_valid_gigantic() rejects a pud huge page allocation
>>> if there is a pmd huge page inside the candidate range.
>>>
>>> But pud huge pages are a scarcer resource, since they must be aligned on
>>> 1GB on x86. It is worth migrating pmd huge pages away to make room for a
>>> pud huge page.
>>>
>>> The same logic applies to pgd and pud huge pages.
>>
>> The huge page in the range can be either a THP or a HugeTLB page, and
>> migrating them has different costs and chances of success. THP migration
>> will involve splitting if THP migration is not enabled, plus all the
>> related TLB costs. Are you sure that a PUD HugeTLB allocation should
>> really go through all this? Is there any
> The current code already drives PMD THP out of the range. This patch just
> wants to let a PUD hugetlb allocation survive PMD hugetlb pages as well.

You are right. PageHuge() is true only for HugeTLB pages, unlike
PageTransHuge(), which is true for both HugeTLB and THP pages. So the
current code does migrate THP out in order to allocate a gigantic HugeTLB
page. While here, just wondering: should we not exclude THP as well unless
the architecture supports ARCH_HAS_THP_MIGRATION?
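A hedged sketch of the check being suggested here. The predicates are real kernel helpers, and the mainline Kconfig symbol for this capability is ARCH_ENABLE_THP_MIGRATION; rejecting the range outright (rather than falling back to splitting the THP) is an assumption of this sketch, not something the thread settles:

```c
/*
 * Sketch only: skip a candidate range that contains a THP when the
 * architecture cannot migrate THP directly, since migrating it would
 * first require splitting.
 */
struct page *head = compound_head(page);

if (!IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION) &&
    PageTransHuge(head) && !PageHuge(head))
	return false;
```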
```diff
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ac843d3..02d1978 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1081,7 +1081,11 @@ static bool pfn_range_valid_gigantic(struct zone *z,
 			unsigned long start_pfn, unsigned long nr_pages)
 {
 	unsigned long i, end_pfn = start_pfn + nr_pages;
-	struct page *page;
+	struct page *page = pfn_to_page(start_pfn);
+
+	if (PageHuge(page))
+		if (compound_order(compound_head(page)) >= nr_pages)
+			return false;
 
 	for (i = start_pfn; i < end_pfn; i++) {
 		if (!pfn_valid(i))
@@ -1098,8 +1102,6 @@ static bool pfn_range_valid_gigantic(struct zone *z,
 		if (page_count(page) > 0)
 			return false;
 
-		if (PageHuge(page))
-			return false;
 	}
 
 	return true;
```
The current pfn_range_valid_gigantic() rejects a pud huge page allocation
if there is a pmd huge page inside the candidate range.

But pud huge pages are a scarcer resource, since they must be aligned on
1GB on x86. It is worth migrating pmd huge pages away to make room for a
pud huge page.

The same logic applies to pgd and pud huge pages.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
---
 mm/hugetlb.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)