Message ID | 20201128175850.19484-1-wenyang@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | proc: add locking checks in proc_inode_is_dead | expand |
Wen Yang <wenyang@linux.alibaba.com> writes: > The proc_inode_is_dead function might race with __unhash_process. > This will result in a whole bunch of stale proc entries being cached. > To prevent that, add the required locking. I assume you are talking about during proc_task_readdir? It is completely possible for the proc_inode_is_dead to report the inode is still alive and then for unhash_process to happen when afterwards. Have you been able to trigger this race in practice? Ouch!!!! Oleg I just looked the introduction of proc_inode_is_dead in d855a4b79f49 ("proc: don't (ab)use ->group_leader in proc_task_readdir() paths") introduced a ``regression''. Breaking the logic introduced in 7d8952440f40 ("[PATCH] procfs: Fix listing of /proc/NOT_A_TGID/task") to keep those directory listings not showing up. Given that it has been 6 years and no one has cared it doesn't look like an actual regression but it does suggest the proc_inode_is_dead can be removed entirely as it does not achieve anything in proc_task_readdir. As for removing the race. I expect the thing to do is to modify proc_pid_is_alive to verify the that the pid is still alive. Oh but look get_task_pid verifies that thread_pid is not NULL and unhash_process sets thread_pid to NULL. My brain is too fuzzy right now to tell if it is possible for get_task_pid to return a pid and then for that pid to pass through unhash_process, while the code is still in proc_pid_make_inode. proc_inode_is_dead is definitely not the place to look to close races. Eric > Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> > Cc: Oleg Nesterov <oleg@redhat.com> > Cc: "Eric W. Biederman" <ebiederm@xmission.com> > Cc: Alexey Dobriyan <adobriyan@gmail.com> > Cc: Christian Brauner <christian@brauner.io> > Cc: linux-kernel@vger.kernel.org > Cc: linux-fsdevel@vger.kernel.org > --- > fs/proc/base.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/fs/proc/base.c b/fs/proc/base.c > index 1bc9bcd..59720bc 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -1994,7 +1994,13 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags) > > static inline bool proc_inode_is_dead(struct inode *inode) > { > - return !proc_pid(inode)->tasks[PIDTYPE_PID].first; > + bool has_task; > + > + read_lock(&tasklist_lock); > + has_task = pid_has_task(proc_pid(inode), PIDTYPE_PID); > + read_unlock(&tasklist_lock); > + > + return !has_task; > } > > int pid_delete_dentry(const struct dentry *dentry)
On 11/30, Eric W. Biederman wrote: > > Ouch!!!! Oleg I just looked the introduction of proc_inode_is_dead in > d855a4b79f49 ("proc: don't (ab)use ->group_leader in proc_task_readdir() > paths") introduced a ``regression''. > > Breaking the logic introduced in 7d8952440f40 ("[PATCH] procfs: Fix > listing of /proc/NOT_A_TGID/task") to keep those directory listings not > showing up. Sorry, I don't understand... Do you mean that "ls /proc/pid/task" can see an empty dir? Afaics this was possible before d855a4b79f49 too. Or what? Oleg.
Oleg Nesterov <oleg@redhat.com> writes: > On 11/30, Eric W. Biederman wrote: >> >> Ouch!!!! Oleg I just looked the introduction of proc_inode_is_dead in >> d855a4b79f49 ("proc: don't (ab)use ->group_leader in proc_task_readdir() >> paths") introduced a ``regression''. >> >> Breaking the logic introduced in 7d8952440f40 ("[PATCH] procfs: Fix >> listing of /proc/NOT_A_TGID/task") to keep those directory listings not >> showing up. > > Sorry, I don't understand... > > Do you mean that "ls /proc/pid/task" can see an empty dir? Afaics this > was possible before d855a4b79f49 too. > > Or what? Bah. Brain fart on my part. I read 7d8952440f40 too fast. I thought it was attempting to make it so that "ls /proc/tid/task/" would see an empty dir while "ls /proc/tgid/task/" would see the complete set of threads. Where tgid is the pid of the thread group leader and tid is the pid of some thread in the thread group. But 7d8952440f40 was just attempting to ensure that no thread was listed more than once in "/proc/xxx/task". My apologies for the confusion. Eric
diff --git a/fs/proc/base.c b/fs/proc/base.c index 1bc9bcd..59720bc 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -1994,7 +1994,13 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags) static inline bool proc_inode_is_dead(struct inode *inode) { - return !proc_pid(inode)->tasks[PIDTYPE_PID].first; + bool has_task; + + read_lock(&tasklist_lock); + has_task = pid_has_task(proc_pid(inode), PIDTYPE_PID); + read_unlock(&tasklist_lock); + + return !has_task; } int pid_delete_dentry(const struct dentry *dentry)
The proc_inode_is_dead function might race with __unhash_process. This will result in a whole bunch of stale proc entries being cached. To prevent that, add the required locking. Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian@brauner.io> Cc: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org --- fs/proc/base.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)