diff mbox series

[04/16] mm/migration: reduce the rcu lock duration

Message ID 20220304093409.25829-5-linmiaohe@huawei.com (mailing list archive)
State New
Headers show
Series A few cleanup and fixup patches for migration | expand

Commit Message

Miaohe Lin March 4, 2022, 9:33 a.m. UTC
rcu_read_lock is required by grabbing the task refcount but it's not
needed for ptrace_may_access. So we could release the rcu lock after
task refcount is successfully grabbed to reduce the rcu holding time.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/migrate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Muchun Song March 4, 2022, 12:16 p.m. UTC | #1
On Fri, Mar 4, 2022 at 5:35 PM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> rcu_read_lock is required by grabbing the task refcount but it's not
> needed for ptrace_may_access. So we could release the rcu lock after
> task refcount is successfully grabbed to reduce the rcu holding time.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Also simple.

Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Huang, Ying March 7, 2022, 2:32 a.m. UTC | #2
Miaohe Lin <linmiaohe@huawei.com> writes:

> rcu_read_lock is required by grabbing the task refcount but it's not
> needed for ptrace_may_access. So we could release the rcu lock after
> task refcount is successfully grabbed to reduce the rcu holding time.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/migrate.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index da5a81052468..26943bd819e8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1907,17 +1907,16 @@ static struct mm_struct *find_mm_struct(pid_t pid, nodemask_t *mem_nodes)
>  		return ERR_PTR(-ESRCH);
>  	}
>  	get_task_struct(task);
> +	rcu_read_unlock();
>  
>  	/*
>  	 * Check if this process has the right to modify the specified
>  	 * process. Use the regular "ptrace_may_access()" checks.
>  	 */
>  	if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
> -		rcu_read_unlock();
>  		mm = ERR_PTR(-EPERM);
>  		goto out;
>  	}
> -	rcu_read_unlock();
>  
>  	mm = ERR_PTR(security_task_movememory(task));
>  	if (IS_ERR(mm))

Digged some history via `git blame`, found that the RCU read lock is
extended in the following commit,

"
3268c63eded4612a3d07b56d1e02ce7731e6608e
Author:     Christoph Lameter <cl@linux.com>
AuthorDate: Wed Mar 21 16:34:06 2012 -0700
Commit:     Linus Torvalds <torvalds@linux-foundation.org>
CommitDate: Wed Mar 21 17:54:58 2012 -0700

mm: fix move/migrate_pages() race on task struct

Migration functions perform the rcu_read_unlock too early.  As a result
the task pointed to may change from under us.  This can result in an oops,
as reported by Dave Hansen in https://lkml.org/lkml/2012/2/23/302.

The following patch extend the period of the rcu_read_lock until after the
permissions checks are done.  We also take a refcount so that the task
reference is stable when calling security check functions and performing
cpuset node validation (which takes a mutex).

The refcount is dropped before actual page migration occurs so there is no
change to the refcounts held during page migration.

Also move the determination of the mm of the task struct to immediately
before the do_migrate*() calls so that it is clear that we switch from
handling the task during permission checks to the mm for the actual
migration.  Since the determination is only done once and we then no
longer use the task_struct we can be sure that we operate on a specific
address space that will not change from under us.
"

After that, the permission checking has been changed from __task_cred()
to ptrace_may_access().  So the situation may change somewhat.  Cced
some names found in git history to verify.

Best Regards,
Huang, Ying
Miaohe Lin March 8, 2022, 12:09 p.m. UTC | #3
On 2022/3/7 10:32, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
> 
>> rcu_read_lock is required by grabbing the task refcount but it's not
>> needed for ptrace_may_access. So we could release the rcu lock after
>> task refcount is successfully grabbed to reduce the rcu holding time.
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/migrate.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index da5a81052468..26943bd819e8 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1907,17 +1907,16 @@ static struct mm_struct *find_mm_struct(pid_t pid, nodemask_t *mem_nodes)
>>  		return ERR_PTR(-ESRCH);
>>  	}
>>  	get_task_struct(task);
>> +	rcu_read_unlock();
>>  
>>  	/*
>>  	 * Check if this process has the right to modify the specified
>>  	 * process. Use the regular "ptrace_may_access()" checks.
>>  	 */
>>  	if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
>> -		rcu_read_unlock();
>>  		mm = ERR_PTR(-EPERM);
>>  		goto out;
>>  	}
>> -	rcu_read_unlock();
>>  
>>  	mm = ERR_PTR(security_task_movememory(task));
>>  	if (IS_ERR(mm))
> 
> Digged some history via `git blame`, found that the RCU read lock is
> extended in the following commit,
> 
> "
> 3268c63eded4612a3d07b56d1e02ce7731e6608e
> Author:     Christoph Lameter <cl@linux.com>
> AuthorDate: Wed Mar 21 16:34:06 2012 -0700
> Commit:     Linus Torvalds <torvalds@linux-foundation.org>
> CommitDate: Wed Mar 21 17:54:58 2012 -0700
> 
> mm: fix move/migrate_pages() race on task struct
> 
> Migration functions perform the rcu_read_unlock too early.  As a result
> the task pointed to may change from under us.  This can result in an oops,
> as reported by Dave Hansen in https://lkml.org/lkml/2012/2/23/302.
> 
> The following patch extend the period of the rcu_read_lock until after the
> permissions checks are done.  We also take a refcount so that the task
> reference is stable when calling security check functions and performing
> cpuset node validation (which takes a mutex).
> 
> The refcount is dropped before actual page migration occurs so there is no
> change to the refcounts held during page migration.
> 
> Also move the determination of the mm of the task struct to immediately
> before the do_migrate*() calls so that it is clear that we switch from
> handling the task during permission checks to the mm for the actual
> migration.  Since the determination is only done once and we then no
> longer use the task_struct we can be sure that we operate on a specific
> address space that will not change from under us.
> "
> 
> After that, the permission checking has been changed from __task_cred()
> to ptrace_may_access().  So the situation may change somewhat.  Cced

In ptrace_may_access, __task_cred is access while holding the rcu read lock.
It seems this is ensured by the ptrace_may_access itself.

> some names found in git history to verify.

Thanks for your carefulness.

> 
> Best Regards,
> Huang, Ying
> .
>
Huang, Ying March 9, 2022, 1:02 a.m. UTC | #4
Miaohe Lin <linmiaohe@huawei.com> writes:

> On 2022/3/7 10:32, Huang, Ying wrote:
>> Miaohe Lin <linmiaohe@huawei.com> writes:
>> 
>>> rcu_read_lock is required by grabbing the task refcount but it's not
>>> needed for ptrace_may_access. So we could release the rcu lock after
>>> task refcount is successfully grabbed to reduce the rcu holding time.
>>>
>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>> ---
>>>  mm/migrate.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index da5a81052468..26943bd819e8 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1907,17 +1907,16 @@ static struct mm_struct *find_mm_struct(pid_t pid, nodemask_t *mem_nodes)
>>>  		return ERR_PTR(-ESRCH);
>>>  	}
>>>  	get_task_struct(task);
>>> +	rcu_read_unlock();
>>>  
>>>  	/*
>>>  	 * Check if this process has the right to modify the specified
>>>  	 * process. Use the regular "ptrace_may_access()" checks.
>>>  	 */
>>>  	if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
>>> -		rcu_read_unlock();
>>>  		mm = ERR_PTR(-EPERM);
>>>  		goto out;
>>>  	}
>>> -	rcu_read_unlock();
>>>  
>>>  	mm = ERR_PTR(security_task_movememory(task));
>>>  	if (IS_ERR(mm))
>> 
>> Digged some history via `git blame`, found that the RCU read lock is
>> extended in the following commit,
>> 
>> "
>> 3268c63eded4612a3d07b56d1e02ce7731e6608e
>> Author:     Christoph Lameter <cl@linux.com>
>> AuthorDate: Wed Mar 21 16:34:06 2012 -0700
>> Commit:     Linus Torvalds <torvalds@linux-foundation.org>
>> CommitDate: Wed Mar 21 17:54:58 2012 -0700
>> 
>> mm: fix move/migrate_pages() race on task struct
>> 
>> Migration functions perform the rcu_read_unlock too early.  As a result
>> the task pointed to may change from under us.  This can result in an oops,
>> as reported by Dave Hansen in https://lkml.org/lkml/2012/2/23/302.
>> 
>> The following patch extend the period of the rcu_read_lock until after the
>> permissions checks are done.  We also take a refcount so that the task
>> reference is stable when calling security check functions and performing
>> cpuset node validation (which takes a mutex).
>> 
>> The refcount is dropped before actual page migration occurs so there is no
>> change to the refcounts held during page migration.
>> 
>> Also move the determination of the mm of the task struct to immediately
>> before the do_migrate*() calls so that it is clear that we switch from
>> handling the task during permission checks to the mm for the actual
>> migration.  Since the determination is only done once and we then no
>> longer use the task_struct we can be sure that we operate on a specific
>> address space that will not change from under us.
>> "
>> 
>> After that, the permission checking has been changed from __task_cred()
>> to ptrace_may_access().  So the situation may change somewhat.  Cced
>
> In ptrace_may_access, __task_cred is access while holding the rcu read lock.
> It seems this is ensured by the ptrace_may_access itself.

Please read the patch above.  Before extending rcu_read_lock protected
region, __task_cred() is protected by rcu_read_lock already.  The patch
above combines 2 regions into 1.

Best Regards,
Huang, Ying

>> some names found in git history to verify.
>
> Thanks for your carefulness.
>
>> 
>> Best Regards,
>> Huang, Ying
>> .
>>
Miaohe Lin March 9, 2022, 8:28 a.m. UTC | #5
On 2022/3/9 9:02, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
> 
>> On 2022/3/7 10:32, Huang, Ying wrote:
>>> Miaohe Lin <linmiaohe@huawei.com> writes:
>>>
>>>> rcu_read_lock is required by grabbing the task refcount but it's not
>>>> needed for ptrace_may_access. So we could release the rcu lock after
>>>> task refcount is successfully grabbed to reduce the rcu holding time.
>>>>
>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>> ---
>>>>  mm/migrate.c | 3 +--
>>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index da5a81052468..26943bd819e8 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -1907,17 +1907,16 @@ static struct mm_struct *find_mm_struct(pid_t pid, nodemask_t *mem_nodes)
>>>>  		return ERR_PTR(-ESRCH);
>>>>  	}
>>>>  	get_task_struct(task);
>>>> +	rcu_read_unlock();
>>>>  
>>>>  	/*
>>>>  	 * Check if this process has the right to modify the specified
>>>>  	 * process. Use the regular "ptrace_may_access()" checks.
>>>>  	 */
>>>>  	if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
>>>> -		rcu_read_unlock();
>>>>  		mm = ERR_PTR(-EPERM);
>>>>  		goto out;
>>>>  	}
>>>> -	rcu_read_unlock();
>>>>  
>>>>  	mm = ERR_PTR(security_task_movememory(task));
>>>>  	if (IS_ERR(mm))
>>>
>>> Digged some history via `git blame`, found that the RCU read lock is
>>> extended in the following commit,
>>>
>>> "
>>> 3268c63eded4612a3d07b56d1e02ce7731e6608e
>>> Author:     Christoph Lameter <cl@linux.com>
>>> AuthorDate: Wed Mar 21 16:34:06 2012 -0700
>>> Commit:     Linus Torvalds <torvalds@linux-foundation.org>
>>> CommitDate: Wed Mar 21 17:54:58 2012 -0700
>>>
>>> mm: fix move/migrate_pages() race on task struct
>>>
>>> Migration functions perform the rcu_read_unlock too early.  As a result
>>> the task pointed to may change from under us.  This can result in an oops,
>>> as reported by Dave Hansen in https://lkml.org/lkml/2012/2/23/302.
>>>
>>> The following patch extend the period of the rcu_read_lock until after the
>>> permissions checks are done.  We also take a refcount so that the task
>>> reference is stable when calling security check functions and performing
>>> cpuset node validation (which takes a mutex).
>>>
>>> The refcount is dropped before actual page migration occurs so there is no
>>> change to the refcounts held during page migration.
>>>
>>> Also move the determination of the mm of the task struct to immediately
>>> before the do_migrate*() calls so that it is clear that we switch from
>>> handling the task during permission checks to the mm for the actual
>>> migration.  Since the determination is only done once and we then no
>>> longer use the task_struct we can be sure that we operate on a specific
>>> address space that will not change from under us.
>>> "
>>>
>>> After that, the permission checking has been changed from __task_cred()
>>> to ptrace_may_access().  So the situation may change somewhat.  Cced
>>
>> In ptrace_may_access, __task_cred is access while holding the rcu read lock.
>> It seems this is ensured by the ptrace_may_access itself.
> 
> Please read the patch above.  Before extending rcu_read_lock protected
> region, __task_cred() is protected by rcu_read_lock already.  The patch
> above combines 2 regions into 1.
> 

Yep, you're right. Thanks.

> Best Regards,
> Huang, Ying
> 
>>> some names found in git history to verify.
>>
>> Thanks for your carefulness.
>>
>>>
>>> Best Regards,
>>> Huang, Ying
>>> .
>>>
> .
>
diff mbox series

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index da5a81052468..26943bd819e8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1907,17 +1907,16 @@  static struct mm_struct *find_mm_struct(pid_t pid, nodemask_t *mem_nodes)
 		return ERR_PTR(-ESRCH);
 	}
 	get_task_struct(task);
+	rcu_read_unlock();
 
 	/*
 	 * Check if this process has the right to modify the specified
 	 * process. Use the regular "ptrace_may_access()" checks.
 	 */
 	if (!ptrace_may_access(task, PTRACE_MODE_READ_REALCREDS)) {
-		rcu_read_unlock();
 		mm = ERR_PTR(-EPERM);
 		goto out;
 	}
-	rcu_read_unlock();
 
 	mm = ERR_PTR(security_task_movememory(task));
 	if (IS_ERR(mm))