[RFC,v3,5/5] x86: check VM_DEAD flag in page fault

Message ID: 1530311985-31251-6-git-send-email-yang.shi@linux.alibaba.com (mailing list archive)
State: New, archived
From: Yang Shi <yang.shi@linux.alibaba.com>
To: mhocko@kernel.org, willy@infradead.org, ldufour@linux.vnet.ibm.com, akpm@linux-foundation.org, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org, tglx@linutronix.de, hpa@zytor.com
Cc: yang.shi@linux.alibaba.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC v3 PATCH 5/5] x86: check VM_DEAD flag in page fault
Date: Sat, 30 Jun 2018 06:39:45 +0800
In-Reply-To: <1530311985-31251-1-git-send-email-yang.shi@linux.alibaba.com>
References: <1530311985-31251-1-git-send-email-yang.shi@linux.alibaba.com>

Yang Shi June 29, 2018, 10:39 p.m. UTC

Check the VM_DEAD flag of the vma in the page fault handler; if it is set,
trigger SIGSEGV.

Cc: Michal Hocko <mhocko@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 arch/x86/mm/fault.c | 4 ++++
 1 file changed, 4 insertions(+)

Laurent Dufour July 2, 2018, 8:45 a.m. UTC | #1

On 30/06/2018 00:39, Yang Shi wrote:
> Check VM_DEAD flag of vma in page fault handler, if it is set, trigger
> SIGSEGV.
> 
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> ---
>  arch/x86/mm/fault.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 9a84a0d..3fd2da5 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1357,6 +1357,10 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
>  		bad_area(regs, error_code, address);
>  		return;
>  	}
> +	if (unlikely(vma->vm_flags & VM_DEAD)) {
> +		bad_area(regs, error_code, address);
> +		return;
> +	}

This will have to be done for all the supported architectures. What about doing
this check in handle_mm_fault() and returning VM_FAULT_SIGSEGV?

>  	if (error_code & X86_PF_USER) {
>  		/*
>  		 * Accessing the stack below %sp is always a bug.
>
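
The suggestion above, doing the check once in generic code rather than in each architecture's fault handler, can be sketched as a small userspace model. This is only an illustration under stated assumptions: every `arch_demo_*` name, the struct layout, and the flag and return values are hypothetical stand-ins, not kernel definitions, and the real handle_mm_fault() takes different arguments.

```c
/*
 * Userspace model only. Every arch_demo_* name and every numeric value
 * here is a hypothetical stand-in, not a kernel definition.
 */
#define ARCH_DEMO_VM_DEAD	0x1UL	/* models the proposed VM_DEAD vma flag */

#define ARCH_DEMO_FAULT_NONE	0	/* fault may proceed */
#define ARCH_DEMO_FAULT_SIGSEGV	1	/* models VM_FAULT_SIGSEGV */

struct arch_demo_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/*
 * Models the generic fault path that all architectures call into.
 * The dead-vma test lives here exactly once.
 */
static int arch_demo_handle_mm_fault(const struct arch_demo_vma *vma,
				     unsigned long address)
{
	(void)address;	/* unused in this model */

	if (vma->vm_flags & ARCH_DEMO_VM_DEAD)
		return ARCH_DEMO_FAULT_SIGSEGV;

	return ARCH_DEMO_FAULT_NONE;
}

/* Two stand-ins for arch-specific handlers; neither repeats the flag test. */
static int arch_demo_x86_fault(const struct arch_demo_vma *vma,
			       unsigned long address)
{
	return arch_demo_handle_mm_fault(vma, address);
}

static int arch_demo_powerpc_fault(const struct arch_demo_vma *vma,
				   unsigned long address)
{
	return arch_demo_handle_mm_fault(vma, address);
}
```

In this model both arch stand-ins refuse a fault on a vma marked dead without containing any VM_DEAD logic of their own, which is the property the suggestion is after.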

Michal Hocko July 2, 2018, 12:15 p.m. UTC | #2

On Mon 02-07-18 10:45:03, Laurent Dufour wrote:
> On 30/06/2018 00:39, Yang Shi wrote:
> > Check VM_DEAD flag of vma in page fault handler, if it is set, trigger
> > SIGSEGV.
> > 
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> > ---
> >  arch/x86/mm/fault.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> > index 9a84a0d..3fd2da5 100644
> > --- a/arch/x86/mm/fault.c
> > +++ b/arch/x86/mm/fault.c
> > @@ -1357,6 +1357,10 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
> >  		bad_area(regs, error_code, address);
> >  		return;
> >  	}
> > +	if (unlikely(vma->vm_flags & VM_DEAD)) {
> > +		bad_area(regs, error_code, address);
> > +		return;
> > +	}
> 
> This will have to be done for all the supported architectures, what about doing
> this check in handle_mm_fault() and return VM_FAULT_SIGSEGV ?

We already do have a model for that. Have a look at MMF_UNSTABLE.

Laurent Dufour July 2, 2018, 12:26 p.m. UTC | #3

On 02/07/2018 14:15, Michal Hocko wrote:
> On Mon 02-07-18 10:45:03, Laurent Dufour wrote:
>> On 30/06/2018 00:39, Yang Shi wrote:
>>> Check VM_DEAD flag of vma in page fault handler, if it is set, trigger
>>> SIGSEGV.
>>>
>>> Cc: Michal Hocko <mhocko@kernel.org>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>> ---
>>>  arch/x86/mm/fault.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>>> index 9a84a0d..3fd2da5 100644
>>> --- a/arch/x86/mm/fault.c
>>> +++ b/arch/x86/mm/fault.c
>>> @@ -1357,6 +1357,10 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
>>>  		bad_area(regs, error_code, address);
>>>  		return;
>>>  	}
>>> +	if (unlikely(vma->vm_flags & VM_DEAD)) {
>>> +		bad_area(regs, error_code, address);
>>> +		return;
>>> +	}
>>
>> This will have to be done for all the supported architectures, what about doing
>> this check in handle_mm_fault() and return VM_FAULT_SIGSEGV ?
> 
> We already do have a model for that. Have a look at MMF_UNSTABLE.

MMF_UNSTABLE is an mm flag; here it is a VMA flag that is being checked.

Michal Hocko July 2, 2018, 12:45 p.m. UTC | #4

On Mon 02-07-18 14:26:09, Laurent Dufour wrote:
> On 02/07/2018 14:15, Michal Hocko wrote:
> > On Mon 02-07-18 10:45:03, Laurent Dufour wrote:
> >> On 30/06/2018 00:39, Yang Shi wrote:
> >>> Check VM_DEAD flag of vma in page fault handler, if it is set, trigger
> >>> SIGSEGV.
> >>>
> >>> Cc: Michal Hocko <mhocko@kernel.org>
> >>> Cc: Thomas Gleixner <tglx@linutronix.de>
> >>> Cc: Ingo Molnar <mingo@redhat.com>
> >>> Cc: "H. Peter Anvin" <hpa@zytor.com>
> >>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> >>> ---
> >>>  arch/x86/mm/fault.c | 4 ++++
> >>>  1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> >>> index 9a84a0d..3fd2da5 100644
> >>> --- a/arch/x86/mm/fault.c
> >>> +++ b/arch/x86/mm/fault.c
> >>> @@ -1357,6 +1357,10 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
> >>>  		bad_area(regs, error_code, address);
> >>>  		return;
> >>>  	}
> >>> +	if (unlikely(vma->vm_flags & VM_DEAD)) {
> >>> +		bad_area(regs, error_code, address);
> >>> +		return;
> >>> +	}
> >>
> >> This will have to be done for all the supported architectures, what about doing
> >> this check in handle_mm_fault() and return VM_FAULT_SIGSEGV ?
> > 
> > We already do have a model for that. Have a look at MMF_UNSTABLE.
> 
> MMF_UNSTABLE is a mm's flag, here this is a VMA's flag which is checked.

Yeah, and we have the VMA ready in all the places where we check the
flag. check_stable_address_space() can be made to take a vma rather than an mm.
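
A minimal sketch of that idea, as a userspace model rather than kernel code: a single helper receives the vma, checks the mm-wide flag through it, and then checks the per-vma flag. All `stable_demo_*` names and numeric values are hypothetical; the SIGBUS-style result for the mm-wide flag and the SIGSEGV-style result for VM_DEAD are assumptions made for this illustration.

```c
/*
 * Userspace model only. Every stable_demo_* name and every numeric value
 * here is a hypothetical stand-in, not a kernel definition.
 */
#define STABLE_DEMO_MMF_UNSTABLE	(1UL << 0)	/* models the mm-wide MMF_UNSTABLE bit */
#define STABLE_DEMO_VM_DEAD		(1UL << 1)	/* models the proposed per-vma VM_DEAD bit */

#define STABLE_DEMO_FAULT_NONE		0
#define STABLE_DEMO_FAULT_SIGBUS	1
#define STABLE_DEMO_FAULT_SIGSEGV	2

struct stable_demo_mm {
	unsigned long flags;		/* mm-wide flags */
};

struct stable_demo_vma {
	struct stable_demo_mm *vm_mm;	/* owning address space */
	unsigned long vm_flags;		/* per-vma flags */
};

/*
 * Models a check_stable_address_space() variant that takes the vma.
 * The mm is still reachable through vma->vm_mm, so both the mm-wide
 * and the per-vma condition can be tested in one place.
 */
static int stable_demo_check(const struct stable_demo_vma *vma)
{
	if (vma->vm_mm->flags & STABLE_DEMO_MMF_UNSTABLE)
		return STABLE_DEMO_FAULT_SIGBUS;

	if (vma->vm_flags & STABLE_DEMO_VM_DEAD)
		return STABLE_DEMO_FAULT_SIGSEGV;

	return STABLE_DEMO_FAULT_NONE;
}
```

Passing the vma loses nothing compared with passing the mm, since the mm is one pointer dereference away, and callers that already hold the vma need no extra lookup.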

Laurent Dufour July 2, 2018, 1:33 p.m. UTC | #5

On 02/07/2018 14:45, Michal Hocko wrote:
> On Mon 02-07-18 14:26:09, Laurent Dufour wrote:
>> On 02/07/2018 14:15, Michal Hocko wrote:
>>> On Mon 02-07-18 10:45:03, Laurent Dufour wrote:
>>>> On 30/06/2018 00:39, Yang Shi wrote:
>>>>> Check VM_DEAD flag of vma in page fault handler, if it is set, trigger
>>>>> SIGSEGV.
>>>>>
>>>>> Cc: Michal Hocko <mhocko@kernel.org>
>>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>>>>> ---
>>>>>  arch/x86/mm/fault.c | 4 ++++
>>>>>  1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>>>>> index 9a84a0d..3fd2da5 100644
>>>>> --- a/arch/x86/mm/fault.c
>>>>> +++ b/arch/x86/mm/fault.c
>>>>> @@ -1357,6 +1357,10 @@ static inline bool smap_violation(int error_code, struct pt_regs *regs)
>>>>>  		bad_area(regs, error_code, address);
>>>>>  		return;
>>>>>  	}
>>>>> +	if (unlikely(vma->vm_flags & VM_DEAD)) {
>>>>> +		bad_area(regs, error_code, address);
>>>>> +		return;
>>>>> +	}
>>>>
>>>> This will have to be done for all the supported architectures, what about doing
>>>> this check in handle_mm_fault() and return VM_FAULT_SIGSEGV ?
>>>
>>> We already do have a model for that. Have a look at MMF_UNSTABLE.
>>
>> MMF_UNSTABLE is a mm's flag, here this is a VMA's flag which is checked.
> 
> Yeah, and we have the VMA ready for all places where we do check the
> flag. check_stable_address_space can be made to get vma rather than mm.

Yeah, it would have been more efficient to check that flag at the beginning
of the page fault handler rather than at the end, but this way it will be
easier to handle the speculative page fault too ;)

Michal Hocko July 2, 2018, 1:37 p.m. UTC | #6

On Mon 02-07-18 15:33:11, Laurent Dufour wrote:
> 
> 
> On 02/07/2018 14:45, Michal Hocko wrote:
> > On Mon 02-07-18 14:26:09, Laurent Dufour wrote:
> >> On 02/07/2018 14:15, Michal Hocko wrote:
[...]
> >>> We already do have a model for that. Have a look at MMF_UNSTABLE.
> >>
> >> MMF_UNSTABLE is a mm's flag, here this is a VMA's flag which is checked.
> > 
> > Yeah, and we have the VMA ready for all places where we do check the
> > flag. check_stable_address_space can be made to get vma rather than mm.
> 
> Yeah, this would have been more efficient to check that flag at the beginning
> of the page fault handler rather than the end, but this way it will be easier
> to handle the speculative page fault too ;)

The thing is that it doesn't really need to be called earlier. You are
not risking data corruption on file backed mappings.

Yang Shi July 2, 2018, 5:24 p.m. UTC | #7

On 7/2/18 6:37 AM, Michal Hocko wrote:
> On Mon 02-07-18 15:33:11, Laurent Dufour wrote:
>>
>> On 02/07/2018 14:45, Michal Hocko wrote:
>>> On Mon 02-07-18 14:26:09, Laurent Dufour wrote:
>>>> On 02/07/2018 14:15, Michal Hocko wrote:
> [...]
>>>>> We already do have a model for that. Have a look at MMF_UNSTABLE.
>>>> MMF_UNSTABLE is a mm's flag, here this is a VMA's flag which is checked.
>>> Yeah, and we have the VMA ready for all places where we do check the
>>> flag. check_stable_address_space can be made to get vma rather than mm.
>> Yeah, this would have been more efficient to check that flag at the beginning
>> of the page fault handler rather than the end, but this way it will be easier
>> to handle the speculative page fault too ;)
> The thing is that it doesn't really need to be called earlier. You are
> not risking data corruption on file backed mappings.

OK, I just think it could save a few cycles to check the flag earlier.

If nobody thinks it is necessary, we could definitely re-use
check_stable_address_space(): just return VM_FAULT_SIGSEGV for a VM_DEAD
vma, and check for both shared and non-shared mappings.

Thanks,
Yang

Michal Hocko July 2, 2018, 5:57 p.m. UTC | #8

On Mon 02-07-18 10:24:27, Yang Shi wrote:
> 
> 
> On 7/2/18 6:37 AM, Michal Hocko wrote:
> > On Mon 02-07-18 15:33:11, Laurent Dufour wrote:
> > > 
> > > On 02/07/2018 14:45, Michal Hocko wrote:
> > > > On Mon 02-07-18 14:26:09, Laurent Dufour wrote:
> > > > > On 02/07/2018 14:15, Michal Hocko wrote:
> > [...]
> > > > > > We already do have a model for that. Have a look at MMF_UNSTABLE.
> > > > > MMF_UNSTABLE is a mm's flag, here this is a VMA's flag which is checked.
> > > > Yeah, and we have the VMA ready for all places where we do check the
> > > > flag. check_stable_address_space can be made to get vma rather than mm.
> > > Yeah, this would have been more efficient to check that flag at the beginning
> > > of the page fault handler rather than the end, but this way it will be easier
> > > to handle the speculative page fault too ;)
> > The thing is that it doesn't really need to be called earlier. You are
> > not risking data corruption on file backed mappings.
> 
> OK, I just think it could save a few cycles to check the flag earlier.

This should be an extremely rare case. Just think about it. It should
only ever happen when an access races with munmap which itself is
questionable if not an outright bug.

> If nobody think it is necessary, we definitely could re-use
> check_stable_address_space(),

If we really need this whole VM_DEAD thing then it should be better
handled at the same place rather than some ad-hoc places.

> just return VM_FAULT_SIGSEGV for VM_DEAD vma,
> and check for both shared and non-shared.

Why would you even care about shared mappings?

Yang Shi July 2, 2018, 6:10 p.m. UTC | #9

On 7/2/18 10:57 AM, Michal Hocko wrote:
> On Mon 02-07-18 10:24:27, Yang Shi wrote:
>>
>> On 7/2/18 6:37 AM, Michal Hocko wrote:
>>> On Mon 02-07-18 15:33:11, Laurent Dufour wrote:
>>>> On 02/07/2018 14:45, Michal Hocko wrote:
>>>>> On Mon 02-07-18 14:26:09, Laurent Dufour wrote:
>>>>>> On 02/07/2018 14:15, Michal Hocko wrote:
>>> [...]
>>>>>>> We already do have a model for that. Have a look at MMF_UNSTABLE.
>>>>>> MMF_UNSTABLE is a mm's flag, here this is a VMA's flag which is checked.
>>>>> Yeah, and we have the VMA ready for all places where we do check the
>>>>> flag. check_stable_address_space can be made to get vma rather than mm.
>>>> Yeah, this would have been more efficient to check that flag at the beginning
>>>> of the page fault handler rather than the end, but this way it will be easier
>>>> to handle the speculative page fault too ;)
>>> The thing is that it doesn't really need to be called earlier. You are
>>> not risking data corruption on file backed mappings.
>> OK, I just think it could save a few cycles to check the flag earlier.
> This should be an extremely rare case. Just think about it. It should
> only ever happen when an access races with munmap which itself is
> questionable if not an outright bug.
>
>> If nobody think it is necessary, we definitely could re-use
>> check_stable_address_space(),
> If we really need this whole VM_DEAD thing then it should be better
> handled at the same place rather than some ad-hoc places.
>
>> just return VM_FAULT_SIGSEGV for VM_DEAD vma,
>> and check for both shared and non-shared.
> Why would you even care about shared mappings?

I just thought that, since we are dealing with VM_DEAD, the vma will be
torn down soon regardless of whether it is shared or non-shared.

MMF_UNSTABLE doesn't care about !shared case.

Michal Hocko July 3, 2018, 6:17 a.m. UTC | #10

On Mon 02-07-18 11:10:23, Yang Shi wrote:
> On 7/2/18 10:57 AM, Michal Hocko wrote:
[...]
> > Why would you even care about shared mappings?
> 
> Just thought about we are dealing with VM_DEAD, which means the vma will be
> tore down soon regardless it is shared or non-shared.
> 
> MMF_UNSTABLE doesn't care about !shared case.

Let me clarify some more. MMF_UNSTABLE is there to prevent unexpected
page faults when the mm is torn down by the oom reaper. And the oom
reaper only cares about private mappings because we do not touch
shared ones. Disk based shared mappings should be a non-issue for
VM_DEAD because even if you race and refault a page back, you know
it is the same one you have seen before. Memory backed shared mappings
are a different story because you can get a fresh new page. The oom
reaper doesn't care because it doesn't tear those down. You would have
to, but my primary point was that we already have MMF_UNSTABLE, so all
you need is to extend it to memory backed shared mappings (shmem and
hugetlb).
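
The case analysis above can be summarized in a small userspace model. It is a sketch under assumptions, not kernel code: the `map_demo_*` names and the enum are hypothetical, and the treatment of private mappings as always unsafe is a simplification made for this illustration.

```c
#include <stdbool.h>

/*
 * Userspace model only. Every map_demo_* name is a hypothetical
 * stand-in, not a kernel definition.
 */
enum map_demo_backing {
	MAP_DEMO_ANON,		/* anonymous memory */
	MAP_DEMO_DISK_FILE,	/* regular file on disk */
	MAP_DEMO_SHMEM,		/* memory backed: shmem */
	MAP_DEMO_HUGETLB,	/* memory backed: hugetlb */
};

struct map_demo_vma {
	bool shared;
	enum map_demo_backing backing;
};

/*
 * Returns true when a fault racing with teardown could hand back a
 * fresh page instead of the data seen before, so the fault has to be
 * refused.
 */
static bool map_demo_refault_is_unsafe(const struct map_demo_vma *vma)
{
	/* Private mappings: the case the existing mm-wide flag covers. */
	if (!vma->shared)
		return true;

	switch (vma->backing) {
	case MAP_DEMO_DISK_FILE:
		/* A refault brings back the same page from the file. */
		return false;
	case MAP_DEMO_SHMEM:
	case MAP_DEMO_HUGETLB:
		/* Memory backed: a refault can produce a fresh new page. */
		return true;
	default:
		/* Anything else is treated conservatively in this model. */
		return true;
	}
}
```

Only the shared, disk based case returns false here, which matches the point that the existing mechanism would need extending just for shmem and hugetlb.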

Yang Shi July 3, 2018, 4:50 p.m. UTC | #11

On 7/2/18 11:17 PM, Michal Hocko wrote:
> On Mon 02-07-18 11:10:23, Yang Shi wrote:
>> On 7/2/18 10:57 AM, Michal Hocko wrote:
> [...]
>>> Why would you even care about shared mappings?
>> Just thought about we are dealing with VM_DEAD, which means the vma will be
>> tore down soon regardless it is shared or non-shared.
>>
>> MMF_UNSTABLE doesn't care about !shared case.

Sorry, this is a typo, it should be "shared".

> Let me clarify some more. MMF_UNSTABLE is there to prevent from
> unexpected page faults when the mm is torn down by the oom reaper. And
> oom reaper only cares about private mappings because we do not touch
> shared ones. Disk based shared mappings should be a non-issue for
> VM_DEAD because even if you race and refault a page back then you know
> it is the same one you have seen before. Memory backed shared mappings
> are a different story because you can get a fresh new page. oom_reaper
> doesn't care because it doesn't tear those down. You would have to but
> my primary point was that we already have MMF_UNSTABLE so all you need
> is to extend it to memory backed shared mappings (shmem and hugetlb).

Yes, sure. I think I got your point. Thanks for the elaboration.

Yang
