diff mbox series

[RFC] mm: Micro-optimisation: Save two branches on hot - page allocation path

Message ID 20200304142230.8753-1-mateusznosek0@gmail.com (mailing list archive)
State New, archived
Headers show
Series [RFC] mm: Micro-optimisation: Save two branches on hot - page allocation path | expand

Commit Message

Mateusz Nosek March 4, 2020, 2:22 p.m. UTC
From: Mateusz Nosek <mateusznosek0@gmail.com>

This patch makes ALLOC_KSWAPD
equal to __GFP_KSWAPD_RACLAIM (cast to 'int').

Thanks to that code like:
```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
		alloc_flags |= ALLOC_KSWAPD;```
can be changed to:
```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```
Thanks to this one branch less is generated in the assembly.

In case of ALLOC_KSWAPD flag two branches are saved,
first one in code that always executes in the beggining of page allocation
and the second one in loop in page allocator slowpath.

Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
---
 mm/internal.h   |  2 +-
 mm/page_alloc.c | 23 +++++++++++++++--------
 2 files changed, 16 insertions(+), 9 deletions(-)

Comments

Vlastimil Babka March 4, 2020, 2:47 p.m. UTC | #1
Let's CC Mel.

On 3/4/20 3:22 PM, mateusznosek0@gmail.com wrote:
> From: Mateusz Nosek <mateusznosek0@gmail.com>
> 
> This patch makes ALLOC_KSWAPD
> equal to __GFP_KSWAPD_RACLAIM (cast to 'int').
> 
> Thanks to that code like:
> ```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> 		alloc_flags |= ALLOC_KSWAPD;```
> can be changed to:
> ```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```
> Thanks to this one branch less is generated in the assembly.
> 
> In case of ALLOC_KSWAPD flag two branches are saved,
> first one in code that always executes in the beggining of page allocation
> and the second one in loop in page allocator slowpath.
> 
> Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>

I think it's fine and in line with similar optimisations done by Mel in the past.

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Some nits below.

> ---
>  mm/internal.h   |  2 +-
>  mm/page_alloc.c | 23 +++++++++++++++--------
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 86372d164476..7fb724977743 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -535,7 +535,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>  #else
>  #define ALLOC_NOFRAGMENT	  0x0
>  #endif
> -#define ALLOC_KSWAPD		0x200 /* allow waking of kswapd */
> +#define ALLOC_KSWAPD		0x800 /* allow waking of kswapd */

Add a comment that the value has to match the GFP flag?

>  
>  enum ttu_flags;
>  struct tlbflush_unmap_batch;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 79e950d76ffc..73afd883eab5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3609,10 +3609,14 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  static inline unsigned int
>  alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
>  {
> -	unsigned int alloc_flags = 0;
> +	unsigned int alloc_flags;
>  
> -	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> -		alloc_flags |= ALLOC_KSWAPD;
> +	/*
> +	 * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> +	 * to save a branch.
> +	 */
> +	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);

I think one BUILD_BUG_ON is enough...

> +	alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
>  
>  #ifdef CONFIG_ZONE_DMA32
>  	if (!zone)
> @@ -4248,8 +4252,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  {
>  	unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
>  
> -	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
> +	/*
> +	 * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
> +	 * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> +	 * to save two branches.
> +	 */
>  	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> +	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);

... and this would be the better one of the two.

Thanks.

>  
>  	/*
>  	 * The caller may dip into page reserves a bit more if the caller
> @@ -4257,7 +4266,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	 * policy or is asking for __GFP_HIGH memory.  GFP_ATOMIC requests will
>  	 * set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
>  	 */
> -	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
> +	alloc_flags |= (__force int)
> +		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
>  
>  	if (gfp_mask & __GFP_ATOMIC) {
>  		/*
> @@ -4274,9 +4284,6 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	} else if (unlikely(rt_task(current)) && !in_interrupt())
>  		alloc_flags |= ALLOC_HARDER;
>  
> -	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> -		alloc_flags |= ALLOC_KSWAPD;
> -
>  #ifdef CONFIG_CMA
>  	if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
>  		alloc_flags |= ALLOC_CMA;
>
Mel Gorman March 4, 2020, 3:15 p.m. UTC | #2
On Wed, Mar 04, 2020 at 03:22:30PM +0100, mateusznosek0@gmail.com wrote:
> From: Mateusz Nosek <mateusznosek0@gmail.com>
> 
> This patch makes ALLOC_KSWAPD
> equal to __GFP_KSWAPD_RACLAIM (cast to 'int').
> 

s/RACLAIM/RECLAIM/

Very strange word wrapping

> Thanks to that code like:
> ```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> 		alloc_flags |= ALLOC_KSWAPD;```
> can be changed to:
> ```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```

Not sure what the multiple ``` are about. It does not appear to be an
encoding error but I think you can drop them. Maybe some mail readers
render this differently but it looks weird in plain text.

> Thanks to this one branch less is generated in the assembly.
> 
> In case of ALLOC_KSWAPD flag two branches are saved,
> first one in code that always executes in the beggining of page allocation

s/beggining/beginning/

> and the second one in loop in page allocator slowpath.

Also strange word wrapping.

> 
> Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
> ---
>  mm/internal.h   |  2 +-
>  mm/page_alloc.c | 23 +++++++++++++++--------
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 86372d164476..7fb724977743 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -535,7 +535,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>  #else
>  #define ALLOC_NOFRAGMENT	  0x0
>  #endif
> -#define ALLOC_KSWAPD		0x200 /* allow waking of kswapd */
> +#define ALLOC_KSWAPD		0x800 /* allow waking of kswapd */
>  
>  enum ttu_flags;
>  struct tlbflush_unmap_batch;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 79e950d76ffc..73afd883eab5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3609,10 +3609,14 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  static inline unsigned int
>  alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
>  {
> -	unsigned int alloc_flags = 0;
> +	unsigned int alloc_flags;
>  
> -	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> -		alloc_flags |= ALLOC_KSWAPD;
> +	/*
> +	 * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> +	 * to save a branch.
> +	 */
> +	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
> +	alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
>  
>  #ifdef CONFIG_ZONE_DMA32
>  	if (!zone)

Ok, that looks reasonable.

> @@ -4248,8 +4252,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  {
>  	unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
>  
> -	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
> +	/*
> +	 * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
> +	 * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> +	 * to save two branches.
> +	 */
>  	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> +	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
>  
>  	/*
>  	 * The caller may dip into page reserves a bit more if the caller

As Vlastimil pointed out, you do not need to make two compiler-based
checks. This seems like a reasonable location given that gfp_to_alloc_flags
is the most obvious place to document tricks with the flags.

> @@ -4257,7 +4266,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	 * policy or is asking for __GFP_HIGH memory.  GFP_ATOMIC requests will
>  	 * set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
>  	 */
> -	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
> +	alloc_flags |= (__force int)
> +		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
>  
>  	if (gfp_mask & __GFP_ATOMIC) {
>  		/*

Slightly harder to read but it does potentially generate better code.

If you correct the typos in the changelog, the slightly strange formatting
of the changelog and drop one of the build checks then you can add

Acked-by: Mel Gorman <mgorman@techsingularity.net>
diff mbox series

Patch

diff --git a/mm/internal.h b/mm/internal.h
index 86372d164476..7fb724977743 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -535,7 +535,7 @@  unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 #else
 #define ALLOC_NOFRAGMENT	  0x0
 #endif
-#define ALLOC_KSWAPD		0x200 /* allow waking of kswapd */
+#define ALLOC_KSWAPD		0x800 /* allow waking of kswapd */
 
 enum ttu_flags;
 struct tlbflush_unmap_batch;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 79e950d76ffc..73afd883eab5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3609,10 +3609,14 @@  static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
 static inline unsigned int
 alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
 {
-	unsigned int alloc_flags = 0;
+	unsigned int alloc_flags;
 
-	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
-		alloc_flags |= ALLOC_KSWAPD;
+	/*
+	 * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
+	 * to save a branch.
+	 */
+	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
+	alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
 
 #ifdef CONFIG_ZONE_DMA32
 	if (!zone)
@@ -4248,8 +4252,13 @@  gfp_to_alloc_flags(gfp_t gfp_mask)
 {
 	unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
 
-	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
+	/*
+	 * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
+	 * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
+	 * to save two branches.
+	 */
 	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
+	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
 
 	/*
 	 * The caller may dip into page reserves a bit more if the caller
@@ -4257,7 +4266,8 @@  gfp_to_alloc_flags(gfp_t gfp_mask)
 	 * policy or is asking for __GFP_HIGH memory.  GFP_ATOMIC requests will
 	 * set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
 	 */
-	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
+	alloc_flags |= (__force int)
+		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
 
 	if (gfp_mask & __GFP_ATOMIC) {
 		/*
@@ -4274,9 +4284,6 @@  gfp_to_alloc_flags(gfp_t gfp_mask)
 	} else if (unlikely(rt_task(current)) && !in_interrupt())
 		alloc_flags |= ALLOC_HARDER;
 
-	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
-		alloc_flags |= ALLOC_KSWAPD;
-
 #ifdef CONFIG_CMA
 	if (gfpflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
 		alloc_flags |= ALLOC_CMA;