diff mbox series

proc: add locking checks in proc_inode_is_dead

Message ID 20201128175850.19484-1-wenyang@linux.alibaba.com (mailing list archive)
State New, archived
Headers show
Series proc: add locking checks in proc_inode_is_dead | expand

Commit Message

Wen Yang Nov. 28, 2020, 5:58 p.m. UTC
The proc_inode_is_dead function might race with __unhash_process.
This will result in a whole bunch of stale proc entries being cached.
To prevent that, add the required locking.

Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
---
 fs/proc/base.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Eric W. Biederman Nov. 30, 2020, 6:41 p.m. UTC | #1
Wen Yang <wenyang@linux.alibaba.com> writes:

> The proc_inode_is_dead function might race with __unhash_process.
> This will result in a whole bunch of stale proc entries being cached.
> To prevent that, add the required locking.

I assume you are talking about during proc_task_readdir?

It is completely possible for the proc_inode_is_dead to report
the inode is still alive and then for unhash_process to
happen when afterwards.

Have you been able to trigger this race in practice?


Ouch!!!!  Oleg I just looked the introduction of proc_inode_is_dead in
d855a4b79f49 ("proc: don't (ab)use ->group_leader in proc_task_readdir()
paths") introduced a ``regression''.

Breaking the logic introduced in 7d8952440f40 ("[PATCH] procfs: Fix
listing of /proc/NOT_A_TGID/task") to keep those directory listings not
showing up.

Given that it has been 6 years and no one has cared it doesn't look like
an actual regression but it does suggest the proc_inode_is_dead can be
removed entirely as it does not achieve anything in proc_task_readdir.



As for removing the race.  I expect the thing to do is to modify
proc_pid_is_alive to verify the that the pid is still alive.
Oh but look get_task_pid verifies that thread_pid is not NULL
and unhash_process sets thread_pid to NULL.


My brain is too fuzzy right now to tell if it is possible for
get_task_pid to return a pid and then for that pid to pass through
unhash_process, while the code is still in proc_pid_make_inode.

proc_inode_is_dead is definitely not the place to look to close races.

Eric


> Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Christian Brauner <christian@brauner.io>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> ---
>  fs/proc/base.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 1bc9bcd..59720bc 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1994,7 +1994,13 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags)
>  
>  static inline bool proc_inode_is_dead(struct inode *inode)
>  {
> -	return !proc_pid(inode)->tasks[PIDTYPE_PID].first;
> +	bool has_task;
> +
> +	read_lock(&tasklist_lock);
> +	has_task = pid_has_task(proc_pid(inode), PIDTYPE_PID);
> +	read_unlock(&tasklist_lock);
> +
> +	return !has_task;
>  }
>  
>  int pid_delete_dentry(const struct dentry *dentry)
Oleg Nesterov Dec. 1, 2020, 12:35 p.m. UTC | #2
On 11/30, Eric W. Biederman wrote:
>
> Ouch!!!!  Oleg I just looked the introduction of proc_inode_is_dead in
> d855a4b79f49 ("proc: don't (ab)use ->group_leader in proc_task_readdir()
> paths") introduced a ``regression''.
>
> Breaking the logic introduced in 7d8952440f40 ("[PATCH] procfs: Fix
> listing of /proc/NOT_A_TGID/task") to keep those directory listings not
> showing up.

Sorry, I don't understand...

Do you mean that "ls /proc/pid/task" can see an empty dir? Afaics this
was possible before d855a4b79f49 too.

Or what?

Oleg.
Eric W. Biederman Dec. 1, 2020, 3:06 p.m. UTC | #3
Oleg Nesterov <oleg@redhat.com> writes:

> On 11/30, Eric W. Biederman wrote:
>>
>> Ouch!!!!  Oleg I just looked the introduction of proc_inode_is_dead in
>> d855a4b79f49 ("proc: don't (ab)use ->group_leader in proc_task_readdir()
>> paths") introduced a ``regression''.
>>
>> Breaking the logic introduced in 7d8952440f40 ("[PATCH] procfs: Fix
>> listing of /proc/NOT_A_TGID/task") to keep those directory listings not
>> showing up.
>
> Sorry, I don't understand...
>
> Do you mean that "ls /proc/pid/task" can see an empty dir? Afaics this
> was possible before d855a4b79f49 too.
>
> Or what?

Bah. Brain fart on my part.

I read 7d8952440f40 too fast.  I thought it was attempting to make
it so that "ls /proc/tid/task/" would see an empty dir while "ls
/proc/tgid/task/" would see the complete set of threads.

Where tgid is the pid of the thread group leader and tid
is the pid of some thread in the thread group.


But 7d8952440f40 was just attempting to ensure that no thread was
listed more than once in "/proc/xxx/task".

My apologies for the confusion.

Eric
diff mbox series

Patch

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 1bc9bcd..59720bc 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1994,7 +1994,13 @@  static int pid_revalidate(struct dentry *dentry, unsigned int flags)
 
 static inline bool proc_inode_is_dead(struct inode *inode)
 {
-	return !proc_pid(inode)->tasks[PIDTYPE_PID].first;
+	bool has_task;
+
+	read_lock(&tasklist_lock);
+	has_task = pid_has_task(proc_pid(inode), PIDTYPE_PID);
+	read_unlock(&tasklist_lock);
+
+	return !has_task;
 }
 
 int pid_delete_dentry(const struct dentry *dentry)