[v2,1/5] pmem, dax: clean up clear_pmem()

Message ID 20151022171021.38343.65959.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
State Superseded

Commit Message

Dan Williams Oct. 22, 2015, 5:10 p.m. UTC
Both __dax_pmd_fault() and clear_pmem() were taking special steps to
clear memory a page at a time to take advantage of non-temporal
clear_page() implementations.  However, x86_64 does not use
non-temporal instructions for clear_page(), and arch_clear_pmem() was
always incurring the cost of __arch_wb_cache_pmem().

Drop the assumption that calling clear_pmem() a page at a time is
faster, and clear the whole range with a single call instead.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/include/asm/pmem.h |    7 +------
 fs/dax.c                    |    4 +---
 2 files changed, 2 insertions(+), 9 deletions(-)

Comments

Jeff Moyer Oct. 22, 2015, 8:48 p.m. UTC | #1
Dan Williams <dan.j.williams@intel.com> writes:

> Both, __dax_pmd_fault, and clear_pmem() were taking special steps to
> clear memory a page at a time to take advantage of non-temporal
> clear_page() implementations.  However, x86_64 does not use
> non-temporal instructions for clear_page(), and arch_clear_pmem() was
> always incurring the cost of __arch_wb_cache_pmem().
>
> Clean up the assumption that doing clear_pmem() a page at a time is more
> performant.

Wouldn't another solution be to actually use non-temporal stores?  Why
did you choose to punt?

Cheers,
Jeff

>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/x86/include/asm/pmem.h |    7 +------
>  fs/dax.c                    |    4 +---
>  2 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/include/asm/pmem.h b/arch/x86/include/asm/pmem.h
> index d8ce3ec816ab..1544fabcd7f9 100644
> --- a/arch/x86/include/asm/pmem.h
> +++ b/arch/x86/include/asm/pmem.h
> @@ -132,12 +132,7 @@ static inline void arch_clear_pmem(void __pmem *addr, size_t size)
>  {
>  	void *vaddr = (void __force *)addr;
>  
> -	/* TODO: implement the zeroing via non-temporal writes */
> -	if (size == PAGE_SIZE && ((unsigned long)vaddr & ~PAGE_MASK) == 0)
> -		clear_page(vaddr);
> -	else
> -		memset(vaddr, 0, size);
> -
> +	memset(vaddr, 0, size);
>  	__arch_wb_cache_pmem(vaddr, size);
>  }
>  
> diff --git a/fs/dax.c b/fs/dax.c
> index a86d3cc2b389..5dc33d788d50 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -623,9 +623,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
>  			goto fallback;
>  
>  		if (buffer_unwritten(&bh) || buffer_new(&bh)) {
> -			int i;
> -			for (i = 0; i < PTRS_PER_PMD; i++)
> -				clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
> +			clear_pmem(kaddr, HPAGE_SIZE);
>  			wmb_pmem();
>  			count_vm_event(PGMAJFAULT);
>  			mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);

Dan Williams Oct. 22, 2015, 10:29 p.m. UTC | #2
On Thu, Oct 22, 2015 at 1:48 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> Both, __dax_pmd_fault, and clear_pmem() were taking special steps to
>> clear memory a page at a time to take advantage of non-temporal
>> clear_page() implementations.  However, x86_64 does not use
>> non-temporal instructions for clear_page(), and arch_clear_pmem() was
>> always incurring the cost of __arch_wb_cache_pmem().
>>
>> Clean up the assumption that doing clear_pmem() a page at a time is more
>> performant.
>
> Wouldn't another solution be to actually use non-temporal stores?

Sure.

> Why did you choose to punt?

Just a priority call at this point.  Patches welcome of course ;-).

Ross Zwisler Oct. 27, 2015, 5:31 p.m. UTC | #3
On Thu, Oct 22, 2015 at 01:10:21PM -0400, Dan Williams wrote:
> Both, __dax_pmd_fault, and clear_pmem() were taking special steps to
> clear memory a page at a time to take advantage of non-temporal
> clear_page() implementations.  However, x86_64 does not use
> non-temporal instructions for clear_page(), and arch_clear_pmem() was
> always incurring the cost of __arch_wb_cache_pmem().
> 
> Clean up the assumption that doing clear_pmem() a page at a time is more
> performant.
> 
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Jeff Moyer Oct. 28, 2015, 9:01 p.m. UTC | #4
Dan Williams <dan.j.williams@intel.com> writes:

> On Thu, Oct 22, 2015 at 1:48 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
>> Dan Williams <dan.j.williams@intel.com> writes:
>>
>>> Both, __dax_pmd_fault, and clear_pmem() were taking special steps to
>>> clear memory a page at a time to take advantage of non-temporal
>>> clear_page() implementations.  However, x86_64 does not use
>>> non-temporal instructions for clear_page(), and arch_clear_pmem() was
>>> always incurring the cost of __arch_wb_cache_pmem().
>>>
>>> Clean up the assumption that doing clear_pmem() a page at a time is more
>>> performant.
>>
>> Wouldn't another solution be to actually use non-temporal stores?
>
> Sure.
>
>> Why did you choose to punt?
>
> Just a priority call at this point.  Patches welcome of course ;-).

OK.  Patch is harmless.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
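
The alternative raised in this thread, zeroing with non-temporal stores
instead of memset() followed by a cache write-back, was not implemented
by this patch.  As a rough illustration only, a userspace version might
look like the sketch below.  It is not kernel code: the name
sketch_zero_nontemporal() is hypothetical, it assumes an x86 compiler
that provides the SSE2 intrinsics from <emmintrin.h>, and it falls back
to plain memset() when SSE2 is unavailable or the range is not 16-byte
aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>	/* _mm_setzero_si128, _mm_stream_si128, _mm_sfence */
#endif

/*
 * Hypothetical helper (not part of the kernel or of this patch):
 * zero [addr, addr + size) with non-temporal stores where possible.
 *
 * Returns 1 if the streaming path was used, 0 if it fell back to memset().
 */
static int sketch_zero_nontemporal(void *addr, size_t size)
{
	assert(addr != NULL);

#if defined(__SSE2__)
	/* _mm_stream_si128() requires a 16-byte aligned destination. */
	if (((uintptr_t)addr % 16) == 0 && (size % 16) == 0) {
		__m128i zero = _mm_setzero_si128();
		__m128i *p = (__m128i *)addr;
		size_t i;

		/* Each store writes 16 bytes with a non-temporal hint. */
		for (i = 0; i < size / 16; i++)
			_mm_stream_si128(p + i, zero);

		/* Order the streaming stores before any later stores. */
		_mm_sfence();
		return 1;
	}
#endif
	/* Fallback: ordinary cached stores. */
	memset(addr, 0, size);
	return 0;
}
```

Whether such a routine would actually beat memset() plus write-back for
pmem was not measured anywhere in this thread.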

Patch

diff --git a/arch/x86/include/asm/pmem.h b/arch/x86/include/asm/pmem.h
index d8ce3ec816ab..1544fabcd7f9 100644
--- a/arch/x86/include/asm/pmem.h
+++ b/arch/x86/include/asm/pmem.h
@@ -132,12 +132,7 @@  static inline void arch_clear_pmem(void __pmem *addr, size_t size)
 {
 	void *vaddr = (void __force *)addr;
 
-	/* TODO: implement the zeroing via non-temporal writes */
-	if (size == PAGE_SIZE && ((unsigned long)vaddr & ~PAGE_MASK) == 0)
-		clear_page(vaddr);
-	else
-		memset(vaddr, 0, size);
-
+	memset(vaddr, 0, size);
 	__arch_wb_cache_pmem(vaddr, size);
 }
 
diff --git a/fs/dax.c b/fs/dax.c
index a86d3cc2b389..5dc33d788d50 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -623,9 +623,7 @@  int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 			goto fallback;
 
 		if (buffer_unwritten(&bh) || buffer_new(&bh)) {
-			int i;
-			for (i = 0; i < PTRS_PER_PMD; i++)
-				clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
+			clear_pmem(kaddr, HPAGE_SIZE);
 			wmb_pmem();
 			count_vm_event(PGMAJFAULT);
 			mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
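
The fs/dax.c hunk relies on the removed loop and the single call covering
the same bytes: PTRS_PER_PMD pages of PAGE_SIZE each add up to
HPAGE_SIZE.  The userspace sketch below works through that arithmetic.
It assumes the usual x86_64 values (4 KiB pages, 512 entries per PMD,
so 2 MiB); all sketch_* and SKETCH_* names are hypothetical stand-ins,
and the counter only models how many times the write-back helper would
be called, not what each call costs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed x86_64 values; the kernel derives these from its page-table layout. */
#define SKETCH_PAGE_SIZE    4096UL
#define SKETCH_PTRS_PER_PMD 512UL
#define SKETCH_HPAGE_SIZE   (SKETCH_PAGE_SIZE * SKETCH_PTRS_PER_PMD) /* 2 MiB */

/* Counts calls that stand in for __arch_wb_cache_pmem(). */
static unsigned long sketch_flush_calls;

/* Hypothetical stand-in for clear_pmem(): zero the range, record one flush call. */
static void sketch_clear(void *addr, size_t size)
{
	assert(addr != NULL);
	memset(addr, 0, size);
	sketch_flush_calls++;
}

/* Shape of the code before the patch: one call per 4 KiB page. */
static void sketch_clear_per_page(unsigned char *kaddr)
{
	unsigned long i;

	for (i = 0; i < SKETCH_PTRS_PER_PMD; i++)
		sketch_clear(kaddr + i * SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
}

/* Shape of the code after the patch: one call for the whole range. */
static void sketch_clear_whole(unsigned char *kaddr)
{
	sketch_clear(kaddr, SKETCH_HPAGE_SIZE);
}
```

Both variants leave the same 2 MiB zeroed; under these assumptions the
per-page form makes 512 calls where the single-call form makes one.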