diff mbox series

[RFC,v2,1/5] mm: Enable fault around for shared file page fault

Message ID 20230201081737.2330141-2-fengwei.yin@intel.com (mailing list archive)
State New
Headers show
Series folio based filemap_map_pages() | expand

Commit Message

Yin, Fengwei Feb. 1, 2023, 8:17 a.m. UTC
The shared fault handler can also benefit from fault-around.  While it
is uncommon to write to MAP_SHARED files, applications that do will see
a huge benefit with will-it-scale:page_fault3 (shared file write fault)
improving by 375%.

Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memory.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Kirill A . Shutemov Feb. 1, 2023, 2:34 p.m. UTC | #1
On Wed, Feb 01, 2023 at 04:17:33PM +0800, Yin Fengwei wrote:
> The shared fault handler can also benefit from fault-around.  While it
> is uncommon to write to MAP_SHARED files, applications that do will see
> a huge benefit with will-it-scale:page_fault3 (shared file write fault)
> improving by 375%.
> 
> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/memory.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 7a04a1130ec1..51c04bb60724 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4546,6 +4546,17 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>  	struct vm_area_struct *vma = vmf->vma;
>  	vm_fault_t ret, tmp;
>  
> +	/*
> +	 * Let's call ->map_pages() first and use ->fault() as fallback
> +	 * if page by the offset is not ready to be mapped (cold cache or
> +	 * something).
> +	 */
> +	if (should_fault_around(vmf)) {
> +		ret = do_fault_around(vmf);
> +		if (ret)
> +			return ret;
> +	}
> +

I believe it bypasses ->page_mkwrite() completely, no?

So you get a writable PTEs without notifying the filesystem. Smells like a
data loss.

>  	ret = __do_fault(vmf);
>  	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
>  		return ret;
> -- 
> 2.30.2
> 
>
Yin, Fengwei Feb. 2, 2023, 1:54 a.m. UTC | #2
On 2/1/2023 10:34 PM, Kirill A. Shutemov wrote:
> On Wed, Feb 01, 2023 at 04:17:33PM +0800, Yin Fengwei wrote:
>> The shared fault handler can also benefit from fault-around.  While it
>> is uncommon to write to MAP_SHARED files, applications that do will see
>> a huge benefit with will-it-scale:page_fault3 (shared file write fault)
>> improving by 375%.
>>
>> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
>> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>> ---
>>  mm/memory.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 7a04a1130ec1..51c04bb60724 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4546,6 +4546,17 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>>  	struct vm_area_struct *vma = vmf->vma;
>>  	vm_fault_t ret, tmp;
>>  
>> +	/*
>> +	 * Let's call ->map_pages() first and use ->fault() as fallback
>> +	 * if page by the offset is not ready to be mapped (cold cache or
>> +	 * something).
>> +	 */
>> +	if (should_fault_around(vmf)) {
>> +		ret = do_fault_around(vmf);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
> 
> I believe it bypasses ->page_mkwrite() completely, no?
> 
> So you get a writable PTEs without notifying the filesystem. Smells like a
> data loss.
Yes. You are right. This may be the reason why fault around is not enabled
for shared file write fault? I will drop this patch. Thanks.

Regards
Yin, Fengwei

> 
>>  	ret = __do_fault(vmf);
>>  	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
>>  		return ret;
>> -- 
>> 2.30.2
>>
>>
>
diff mbox series

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 7a04a1130ec1..51c04bb60724 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4546,6 +4546,17 @@  static vm_fault_t do_shared_fault(struct vm_fault *vmf)
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret, tmp;
 
+	/*
+	 * Let's call ->map_pages() first and use ->fault() as fallback
+	 * if page by the offset is not ready to be mapped (cold cache or
+	 * something).
+	 */
+	if (should_fault_around(vmf)) {
+		ret = do_fault_around(vmf);
+		if (ret)
+			return ret;
+	}
+
 	ret = __do_fault(vmf);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
 		return ret;