diff mbox

[v6,09/15] sparc64: optimized struct page zeroing

Message ID 1502138329-123460-10-git-send-email-pasha.tatashin@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Pavel Tatashin Aug. 7, 2017, 8:38 p.m. UTC
Add an optimized mm_zero_struct_page(), so struct page's are zeroed without
calling memset(). We do eight to tent regular stores based on the size of
struct page. Compiler optimizes out the conditions of switch() statement.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
---
 arch/sparc/include/asm/pgtable_64.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Comments

Michal Hocko Aug. 11, 2017, 12:53 p.m. UTC | #1
On Mon 07-08-17 16:38:43, Pavel Tatashin wrote:
> Add an optimized mm_zero_struct_page(), so struct page's are zeroed without
> calling memset(). We do eight to tent regular stores based on the size of
> struct page. Compiler optimizes out the conditions of switch() statement.

Again, this doesn't explain why we need this. You have mentioned those
reasons in some previous emails but be explicit here please.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
> Reviewed-by: Bob Picco <bob.picco@oracle.com>
> ---
>  arch/sparc/include/asm/pgtable_64.h | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
> index 6fbd931f0570..cee5cc7ccc51 100644
> --- a/arch/sparc/include/asm/pgtable_64.h
> +++ b/arch/sparc/include/asm/pgtable_64.h
> @@ -230,6 +230,36 @@ extern unsigned long _PAGE_ALL_SZ_BITS;
>  extern struct page *mem_map_zero;
>  #define ZERO_PAGE(vaddr)	(mem_map_zero)
>  
> +/* This macro must be updated when the size of struct page grows above 80
> + * or reduces below 64.
> + * The idea that compiler optimizes out switch() statement, and only
> + * leaves clrx instructions
> + */
> +#define	mm_zero_struct_page(pp) do {					\
> +	unsigned long *_pp = (void *)(pp);				\
> +									\
> +	 /* Check that struct page is either 64, 72, or 80 bytes */	\
> +	BUILD_BUG_ON(sizeof(struct page) & 7);				\
> +	BUILD_BUG_ON(sizeof(struct page) < 64);				\
> +	BUILD_BUG_ON(sizeof(struct page) > 80);				\
> +									\
> +	switch (sizeof(struct page)) {					\
> +	case 80:							\
> +		_pp[9] = 0;	/* fallthrough */			\
> +	case 72:							\
> +		_pp[8] = 0;	/* fallthrough */			\
> +	default:							\
> +		_pp[7] = 0;						\
> +		_pp[6] = 0;						\
> +		_pp[5] = 0;						\
> +		_pp[4] = 0;						\
> +		_pp[3] = 0;						\
> +		_pp[2] = 0;						\
> +		_pp[1] = 0;						\
> +		_pp[0] = 0;						\
> +	}								\
> +} while (0)
> +
>  /* PFNs are real physical page numbers.  However, mem_map only begins to record
>   * per-page information starting at pfn_base.  This is to handle systems where
>   * the first physical page in the machine is at some huge physical address,
> -- 
> 2.14.0
Pavel Tatashin Aug. 11, 2017, 4:04 p.m. UTC | #2
>> Add an optimized mm_zero_struct_page(), so struct page's are zeroed without
>> calling memset(). We do eight to tent regular stores based on the size of
>> struct page. Compiler optimizes out the conditions of switch() statement.
> 
> Again, this doesn't explain why we need this. You have mentioned those
> reasons in some previous emails but be explicit here please.
> 

I will add performance data to this patch as well.

Thank you,
Pasha
diff mbox

Patch

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 6fbd931f0570..cee5cc7ccc51 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -230,6 +230,36 @@  extern unsigned long _PAGE_ALL_SZ_BITS;
 extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)	(mem_map_zero)
 
+/* This macro must be updated when the size of struct page grows above 80
+ * or reduces below 64.
+ * The idea that compiler optimizes out switch() statement, and only
+ * leaves clrx instructions
+ */
+#define	mm_zero_struct_page(pp) do {					\
+	unsigned long *_pp = (void *)(pp);				\
+									\
+	 /* Check that struct page is either 64, 72, or 80 bytes */	\
+	BUILD_BUG_ON(sizeof(struct page) & 7);				\
+	BUILD_BUG_ON(sizeof(struct page) < 64);				\
+	BUILD_BUG_ON(sizeof(struct page) > 80);				\
+									\
+	switch (sizeof(struct page)) {					\
+	case 80:							\
+		_pp[9] = 0;	/* fallthrough */			\
+	case 72:							\
+		_pp[8] = 0;	/* fallthrough */			\
+	default:							\
+		_pp[7] = 0;						\
+		_pp[6] = 0;						\
+		_pp[5] = 0;						\
+		_pp[4] = 0;						\
+		_pp[3] = 0;						\
+		_pp[2] = 0;						\
+		_pp[1] = 0;						\
+		_pp[0] = 0;						\
+	}								\
+} while (0)
+
 /* PFNs are real physical page numbers.  However, mem_map only begins to record
  * per-page information starting at pfn_base.  This is to handle systems where
  * the first physical page in the machine is at some huge physical address,