[v5,2/2] mm/page_alloc: remove software prefetching in __free_pages_core

Message ID 1538727006-5727-2-git-send-email-arunks@codeaurora.org
State New, archived
Series [v5,1/2] memory_hotplug: Free pages as higher order

Commit Message

Arun KS Oct. 5, 2018, 8:10 a.m. UTC
These software prefetches not only increase the code footprint, they
actually make things slower rather than faster. Remove them, as
contemporary hardware doesn't need any hint.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arun KS <arunks@codeaurora.org>
---
 mm/page_alloc.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

Comments

Michal Hocko Oct. 9, 2018, 9:30 a.m. UTC | #1
On Fri 05-10-18 13:40:06, Arun KS wrote:
> These software prefetches not only increase the code footprint, they
> actually make things slower rather than faster. Remove them, as
> contemporary hardware doesn't need any hint.

I agree with the change but it is much better to add some numbers
whenever arguing about performance impact.
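
For reference, numbers for a change like this can be approximated even
outside the kernel. The sketch below is hypothetical and is not the
methodology used for this series: it times a fixed-stride clearing loop
over 64-byte records (standing in for struct page), once with and once
without an explicit write prefetch, using GCC's __builtin_prefetch() in
place of the kernel's prefetchw().

/*
 * Hypothetical userspace approximation, not part of the posted series.
 * Build: gcc -O2 prefetch-bench.c -o prefetch-bench
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define NR_RECS (1UL << 20)		/* ~1M records, think "struct page"s */

struct rec {
	unsigned long flags;
	int count;
	char pad[52];			/* pad to 64 bytes (LP64), ~one cache line */
};

static long long run_us(struct rec *r, int use_prefetch)
{
	struct timespec t0, t1;
	unsigned long i;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < NR_RECS; i++) {
		if (use_prefetch)
			__builtin_prefetch(&r[i + 1], 1);	/* prefetch for write */
		r[i].flags = 0;		/* stands in for __ClearPageReserved() */
		r[i].count = 0;		/* stands in for set_page_count() */
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (t1.tv_sec - t0.tv_sec) * 1000000LL +
	       (t1.tv_nsec - t0.tv_nsec) / 1000;
}

int main(void)
{
	struct rec *r = calloc(NR_RECS + 1, sizeof(*r));

	if (!r)
		return 1;
	/* fault the pages in so neither timed run pays that cost */
	memset(r, 1, (NR_RECS + 1) * sizeof(*r));

	printf("no sw prefetch:   %lld us\n", run_us(r, 0));
	printf("with sw prefetch: %lld us\n", run_us(r, 1));
	free(r);
	return 0;
}

Results will of course vary by CPU; the point is only that a measurement
like this is cheap to produce and to include in the changelog.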

> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Arun KS <arunks@codeaurora.org>
> ---
>  mm/page_alloc.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7ab5274..90db431 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1258,14 +1258,10 @@ void __free_pages_core(struct page *page, unsigned int order)
>  	struct page *p = page;
>  	unsigned int loop;
>  
> -	prefetchw(p);
> -	for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
> -		prefetchw(p + 1);
> +	for (loop = 0; loop < nr_pages ; loop++, p++) {
>  		__ClearPageReserved(p);
>  		set_page_count(p, 0);
>  	}
> -	__ClearPageReserved(p);
> -	set_page_count(p, 0);
>  
>  	page_zone(page)->managed_pages += nr_pages;
>  	set_page_refcounted(page);
> -- 
> 1.9.1
>
Vlastimil Babka Oct. 10, 2018, 4:36 p.m. UTC | #2
On 10/5/18 10:10 AM, Arun KS wrote:
> These software prefetches not only increase the code footprint, they
> actually make things slower rather than faster. Remove them, as
> contemporary hardware doesn't need any hint.
> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Arun KS <arunks@codeaurora.org>

Yeah, a tight loop with fixed stride is a trivial case for hw prefetcher.
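
To illustrate (a userspace sketch, with __builtin_prefetch() standing in
for the kernel's prefetchw() and a 64-byte record standing in for struct
page): the removed pattern was software-pipelined, prefetching one element
ahead and peeling off the final iteration so it never prefetched past the
end of the array. The simplified loop is a plain fixed-stride walk, which
is exactly the access pattern a hardware stride prefetcher picks up on its
own; the explicit hints mostly add instructions without changing which
cache lines get touched.

struct rec {
	unsigned long flags;
	int count;
	char pad[52];			/* pad to 64 bytes (LP64), ~one cache line */
};

/* old shape: explicit write prefetch one ahead, final iteration peeled off */
static void clear_with_sw_prefetch(struct rec *p, unsigned long n)
{
	unsigned long i;

	__builtin_prefetch(p, 1);
	for (i = 0; i < n - 1; i++, p++) {	/* assumes n >= 1, as in the kernel */
		__builtin_prefetch(p + 1, 1);
		p->flags = 0;
		p->count = 0;
	}
	p->flags = 0;			/* last record, nothing left to prefetch */
	p->count = 0;
}

/* new shape: plain fixed-stride walk, hardware prefetcher follows it */
static void clear_plain(struct rec *p, unsigned long n)
{
	unsigned long i;

	for (i = 0; i < n; i++, p++) {
		p->flags = 0;
		p->count = 0;
	}
}

int main(void)
{
	static struct rec a[1024], b[1024];

	clear_with_sw_prefetch(a, 1024);
	clear_plain(b, 1024);
	return 0;
}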

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7ab5274..90db431 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1258,14 +1258,10 @@ void __free_pages_core(struct page *page, unsigned int order)
>  	struct page *p = page;
>  	unsigned int loop;
>  
> -	prefetchw(p);
> -	for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
> -		prefetchw(p + 1);
> +	for (loop = 0; loop < nr_pages ; loop++, p++) {
>  		__ClearPageReserved(p);
>  		set_page_count(p, 0);
>  	}
> -	__ClearPageReserved(p);
> -	set_page_count(p, 0);
>  
>  	page_zone(page)->managed_pages += nr_pages;
>  	set_page_refcounted(page);
>

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7ab5274..90db431 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1258,14 +1258,10 @@ void __free_pages_core(struct page *page, unsigned int order)
 	struct page *p = page;
 	unsigned int loop;
 
-	prefetchw(p);
-	for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-		prefetchw(p + 1);
+	for (loop = 0; loop < nr_pages ; loop++, p++) {
 		__ClearPageReserved(p);
 		set_page_count(p, 0);
 	}
-	__ClearPageReserved(p);
-	set_page_count(p, 0);
 
 	page_zone(page)->managed_pages += nr_pages;
 	set_page_refcounted(page);