diff mbox series

[5/5] mm/hugetl.c: warn out if expected count of huge pages adjustment is not achieved

Message ID 20200720062623.13135-6-bhe@redhat.com (mailing list archive)
State New, archived
Headers show
Series mm/hugetlb: Small cleanup and improvement | expand

Commit Message

Baoquan He July 20, 2020, 6:26 a.m. UTC
A customer complained that no any message is printed out when failed to
allocate explicitly specified number of persistent huge pages. That
specifying can be done by writing into /proc/sys/vm/nr_hugepages to
increase the persisten huge pages.

In the current code, it takes the best effort way to allocate the expected
number of huge pages. If only succeeding to get part of them, no any
information is printed out.

Here try to send out warning message if the expected number of huge pages
adjustment is not achieved, including increasing and decreasing the count
of persistent huge pages.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/hugetlb.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Comments

Mike Kravetz July 21, 2020, 12:38 a.m. UTC | #1
On 7/19/20 11:26 PM, Baoquan He wrote:
> A customer complained that no any message is printed out when failed to
> allocate explicitly specified number of persistent huge pages. That
> specifying can be done by writing into /proc/sys/vm/nr_hugepages to
> increase the persisten huge pages.
> 
> In the current code, it takes the best effort way to allocate the expected
> number of huge pages. If only succeeding to get part of them, no any
> information is printed out.
> 
> Here try to send out warning message if the expected number of huge pages
> adjustment is not achieved, including increasing and decreasing the count
> of persistent huge pages.

Perhaps change the wording a bit,

A customer complained that no message is logged when the number of
persistent huge pages is not changed to the exact value written to
the sysfs or proc nr_hugepages file.

In the current code, a best effort is made to satisfy requests made
via the nr_hugepages file.  However, requests may be only partially
satisfied.

Log a message if the code was unsuccessful in fully satisfying a
request.  This includes both increasing and decreasing the number
of persistent huge pages.

> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/hugetlb.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)

I am not opposed to this patch.  However, I believe the best way for a user
to determine if their request was successful is to compare the value of
nr_hugepages to the value which was written.

> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 467894d8332a..1dfb5d9e4e06 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
>  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  			      nodemask_t *nodes_allowed)
>  {
> -	unsigned long min_count, ret;
> +	unsigned long min_count, ret, old_max;
>  	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
>  
>  	/*
> @@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  	 * pool might be one hugepage larger than it needs to be, but
>  	 * within all the constraints specified by the sysctls.
>  	 */
> +	old_max = persistent_huge_pages(h);
>  	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
>  		if (!adjust_pool_surplus(h, nodes_allowed, -1))
>  			break;
> @@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  	}
>  out:
>  	h->max_huge_pages = persistent_huge_pages(h);
> +	if (count != h->max_huge_pages) {
> +		char buf[32];
> +
> +		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> +		pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
> +			count > old_max ? "increasing" : "decreasing",
> +			abs(count - old_max), buf,
> +			count > old_max ? "increased" : "decreased",
> +			abs(old_max - h->max_huge_pages));
> +	}
>  	spin_unlock(&hugetlb_lock);

I would prefer if we drop the lock before logging the message.  That would
involve grabbing the value of h->max_huge_pages before dropping the lock.
Baoquan He July 21, 2020, 9:55 a.m. UTC | #2
On 07/20/20 at 05:38pm, Mike Kravetz wrote:
> On 7/19/20 11:26 PM, Baoquan He wrote:
> > A customer complained that no any message is printed out when failed to
> > allocate explicitly specified number of persistent huge pages. That
> > specifying can be done by writing into /proc/sys/vm/nr_hugepages to
> > increase the persisten huge pages.
> > 
> > In the current code, it takes the best effort way to allocate the expected
> > number of huge pages. If only succeeding to get part of them, no any
> > information is printed out.
> > 
> > Here try to send out warning message if the expected number of huge pages
> > adjustment is not achieved, including increasing and decreasing the count
> > of persistent huge pages.
> 
> Perhaps change the wording a bit,
> 
> A customer complained that no message is logged when the number of
> persistent huge pages is not changed to the exact value written to
> the sysfs or proc nr_hugepages file.
> 
> In the current code, a best effort is made to satisfy requests made
> via the nr_hugepages file.  However, requests may be only partially
> satisfied.
> 
> Log a message if the code was unsuccessful in fully satisfying a
> request.  This includes both increasing and decreasing the number
> of persistent huge pages.

Thanks, sounds much better, I will use these to replace the old log.

> 
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/hugetlb.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> I am not opposed to this patch.  However, I believe the best way for a user
> to determine if their request was successful is to compare the value of
> nr_hugepages to the value which was written.

Agree. While from our customer's request, they told the log can help
'Easily detect and analyse previous reservation failures'.

> 
> > 
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 467894d8332a..1dfb5d9e4e06 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
> >  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> >  			      nodemask_t *nodes_allowed)
> >  {
> > -	unsigned long min_count, ret;
> > +	unsigned long min_count, ret, old_max;
> >  	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
> >  
> >  	/*
> > @@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> >  	 * pool might be one hugepage larger than it needs to be, but
> >  	 * within all the constraints specified by the sysctls.
> >  	 */
> > +	old_max = persistent_huge_pages(h);
> >  	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
> >  		if (!adjust_pool_surplus(h, nodes_allowed, -1))
> >  			break;
> > @@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> >  	}
> >  out:
> >  	h->max_huge_pages = persistent_huge_pages(h);
> > +	if (count != h->max_huge_pages) {
> > +		char buf[32];
> > +
> > +		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> > +		pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
> > +			count > old_max ? "increasing" : "decreasing",
> > +			abs(count - old_max), buf,
> > +			count > old_max ? "increased" : "decreased",
> > +			abs(old_max - h->max_huge_pages));
> > +	}
> >  	spin_unlock(&hugetlb_lock);
> 
> I would prefer if we drop the lock before logging the message.  That would
> involve grabbing the value of h->max_huge_pages before dropping the lock.

Sure, will change. We should try to release the lock's burden. Thanks.
Baoquan He July 22, 2020, 8:49 a.m. UTC | #3
Hi Mike,

On 07/20/20 at 05:38pm, Mike Kravetz wrote:
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 467894d8332a..1dfb5d9e4e06 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
> >  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> >  			      nodemask_t *nodes_allowed)
> >  {
> > -	unsigned long min_count, ret;
> > +	unsigned long min_count, ret, old_max;
> >  	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
> >  
> >  	/*
> > @@ -2723,6 +2723,7 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> >  	 * pool might be one hugepage larger than it needs to be, but
> >  	 * within all the constraints specified by the sysctls.
> >  	 */
> > +	old_max = persistent_huge_pages(h);
> >  	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
> >  		if (!adjust_pool_surplus(h, nodes_allowed, -1))
> >  			break;
> > @@ -2779,6 +2780,16 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
> >  	}
> >  out:
> >  	h->max_huge_pages = persistent_huge_pages(h);
> > +	if (count != h->max_huge_pages) {
> > +		char buf[32];
> > +
> > +		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> > +		pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
> > +			count > old_max ? "increasing" : "decreasing",
> > +			abs(count - old_max), buf,
> > +			count > old_max ? "increased" : "decreased",
> > +			abs(old_max - h->max_huge_pages));
> > +	}
> >  	spin_unlock(&hugetlb_lock);
> 
> I would prefer if we drop the lock before logging the message.  That would
> involve grabbing the value of h->max_huge_pages before dropping the lock.

Do you think the below change is OK to you to move the message logging
after lock dropping? If yes, I will repost with updated patches.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a9b7556ce5b..b5aa32a13569 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
 static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 			      nodemask_t *nodes_allowed)
 {
-	unsigned long min_count, ret, old_max;
+	unsigned long min_count, ret, old_max, new_max;
 	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
 
 	/*
@@ -2780,7 +2780,10 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	}
 out:
 	h->max_huge_pages = persistent_huge_pages(h);
-	if (count != h->max_huge_pages) {
+	new_max = h->max_huge_pages;
+	spin_unlock(&hugetlb_lock);
+
+	if (count != new_max) {
 		char buf[32];
 
 		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
@@ -2788,9 +2791,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 			count > old_max ? "increasing" : "decreasing",
 			abs(count - old_max), buf,
 			count > old_max ? "increased" : "decreased",
-			abs(old_max - h->max_huge_pages));
+			abs(old_max - new_max));
 	}
-	spin_unlock(&hugetlb_lock);
 
 	NODEMASK_FREE(node_alloc_noretry);
Mike Kravetz July 22, 2020, 4:15 p.m. UTC | #4
On 7/22/20 1:49 AM, Baoquan He wrote:
> On 07/20/20 at 05:38pm, Mike Kravetz wrote:
>>> +	if (count != h->max_huge_pages) {
>>> +		char buf[32];
>>> +
>>> +		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
>>> +		pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
>>> +			count > old_max ? "increasing" : "decreasing",
>>> +			abs(count - old_max), buf,
>>> +			count > old_max ? "increased" : "decreased",
>>> +			abs(old_max - h->max_huge_pages));
>>> +	}
>>>  	spin_unlock(&hugetlb_lock);
>>
>> I would prefer if we drop the lock before logging the message.  That would
>> involve grabbing the value of h->max_huge_pages before dropping the lock.
> 
> Do you think the below change is OK to you to move the message logging
> after lock dropping? If yes, I will repost with updated patches.
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 6a9b7556ce5b..b5aa32a13569 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2661,7 +2661,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
>  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  			      nodemask_t *nodes_allowed)
>  {
> -	unsigned long min_count, ret, old_max;
> +	unsigned long min_count, ret, old_max, new_max;
>  	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
>  
>  	/*
> @@ -2780,7 +2780,10 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  	}
>  out:
>  	h->max_huge_pages = persistent_huge_pages(h);
> -	if (count != h->max_huge_pages) {
> +	new_max = h->max_huge_pages;
> +	spin_unlock(&hugetlb_lock);
> +
> +	if (count != new_max) {
>  		char buf[32];
>  
>  		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> @@ -2788,9 +2791,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
>  			count > old_max ? "increasing" : "decreasing",
>  			abs(count - old_max), buf,
>  			count > old_max ? "increased" : "decreased",
> -			abs(old_max - h->max_huge_pages));
> +			abs(old_max - new_max));
>  	}
> -	spin_unlock(&hugetlb_lock);
>  
>  	NODEMASK_FREE(node_alloc_noretry);

Yes, that looks better.  Thank you.
diff mbox series

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 467894d8332a..1dfb5d9e4e06 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2661,7 +2661,7 @@  static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
 static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 			      nodemask_t *nodes_allowed)
 {
-	unsigned long min_count, ret;
+	unsigned long min_count, ret, old_max;
 	NODEMASK_ALLOC(nodemask_t, node_alloc_noretry, GFP_KERNEL);
 
 	/*
@@ -2723,6 +2723,7 @@  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	 * pool might be one hugepage larger than it needs to be, but
 	 * within all the constraints specified by the sysctls.
 	 */
+	old_max = persistent_huge_pages(h);
 	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
 		if (!adjust_pool_surplus(h, nodes_allowed, -1))
 			break;
@@ -2779,6 +2780,16 @@  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	}
 out:
 	h->max_huge_pages = persistent_huge_pages(h);
+	if (count != h->max_huge_pages) {
+		char buf[32];
+
+		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
+		pr_warn("HugeTLB: %s %lu of page size %s failed. Only %s %lu hugepages.\n",
+			count > old_max ? "increasing" : "decreasing",
+			abs(count - old_max), buf,
+			count > old_max ? "increased" : "decreased",
+			abs(old_max - h->max_huge_pages));
+	}
 	spin_unlock(&hugetlb_lock);
 
 	NODEMASK_FREE(node_alloc_noretry);