From patchwork Fri Nov 20 23:14:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-21.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1380C63798 for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4EF352240B for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728407AbgKTXQG (ORCPT ); Fri, 20 Nov 2020 18:16:06 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:33402 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728123AbgKTXQG (ORCPT ); Fri, 20 Nov 2020 18:16:06 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdH-006Tmb-H0; Fri, 20 Nov 2020 16:15:59 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdG-00EG00-3B; Fri, 20 Nov 2020 16:15:59 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:18 -0600 Message-Id: <20201120231441.29911-1-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdG-00EG00-3B;;;mid=<20201120231441.29911-1-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/Dk4BkOwsyT/FgOO6bO2TIPLXxxbBOHLc= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 01/24] exec: Move unshare_files to fix posix file locking during exec X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Many moons ago the binfmts were doing some very questionable things with file descriptors and an unsharing of the file descriptor table was added to make things better[1][2]. The helper steal_lockss was added to avoid breaking the userspace programs[3][4][6]. Unfortunately it turned out that steal_locks did not work for network file systems[5], so it was removed to see if anyone would complain[7][8]. It was thought at the time that NPTL would not be affected as the unshare_files happened after the other threads were killed[8]. Unfortunately because there was an unshare_files in binfmt_elf.c before the threads were killed this analysis was incorrect. This unshare_files in binfmt_elf.c resulted in the unshares_files happening whenever threads were present. Which led to unshare_files being moved to the start of do_execve[9]. Later the problems were rediscovered and the suggested approach was to readd steal_locks under a different name[10]. I happened to be reviewing patches and I noticed that this approach was a step backwards[11]. I proposed simply moving unshare_files[12] and it was pointed out that moving unshare_files without auditing the code was also unsafe[13]. There were then several attempts to solve this[14][15][16] and I even posted this set of changes[17]. Unfortunately because auditing all of execve is time consuming this change did not make it in at the time. Well now that I am cleaning up exec I have made the time to read through all of the binfmts and the only playing with file descriptors is either the security modules closing them in security_bprm_committing_creds or is in the generic code in fs/exec.c. None of it happens before begin_new_exec is called. So move unshare_files into begin_new_exec, after the point of no return. If memory is very very very low and the application calling exec is sharing file descriptor tables between processes we might fail past the point of no return. Which is unfortunate but no different than any of the other places where we allocate memory after the point of no return. This movement allows another process that shares the file table, or another thread of the same process and that closes files or changes their close on exec behavior and races with execve to cause some unexpected things to happen. There is only one time of check to time of use race and it is just there so that execve fails instead of an interpreter failing when it tries to open the file it is supposed to be interpreting. Failing later if userspace is being silly is not a problem. With this change it the following discription from the removal of steal_locks[8] finally becomes true. Apps using NPTL are not affected, since all other threads are killed before execve. Apps using LinuxThreads are only affected if they - have multiple threads during exec (LinuxThreads doesn't kill other threads, the app may do it with pthread_kill_other_threads_np()) - rely on POSIX locks being inherited across exec Both conditions are documented, but not their interaction. Apps using clone() natively are affected if they - use clone(CLONE_FILES) - rely on POSIX locks being inherited across exec I have investigated some paths to make it possible to solve this without moving unshare_files but they all look more complicated[18]. Reported-by: Daniel P. BerrangĂ© Reported-by: Jeff Layton History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git [1] 02cda956de0b ("[PATCH] unshare_files" [2] 04e9bcb4d106 ("[PATCH] use new unshare_files helper") [3] 088f5d7244de ("[PATCH] add steal_locks helper") [4] 02c541ec8ffa ("[PATCH] use new steal_locks helper") [5] https://lkml.kernel.org/r/E1FLIlF-0007zR-00@dorka.pomaz.szeredi.hu [6] https://lkml.kernel.org/r/0060321191605.GB15997@sorel.sous-sol.org [7] https://lkml.kernel.org/r/E1FLwjC-0000kJ-00@dorka.pomaz.szeredi.hu [8] c89681ed7d0e ("[PATCH] remove steal_locks()") [9] fd8328be874f ("[PATCH] sanitize handling of shared descriptor tables in failing execve()") [10] https://lkml.kernel.org/r/20180317142520.30520-1-jlayton@kernel.org [11] https://lkml.kernel.org/r/87r2nwqk73.fsf@xmission.com [12] https://lkml.kernel.org/r/87bmfgvg8w.fsf@xmission.com [13] https://lkml.kernel.org/r/20180322111424.GE30522@ZenIV.linux.org.uk [14] https://lkml.kernel.org/r/20180827174722.3723-1-jlayton@kernel.org [15] https://lkml.kernel.org/r/20180830172423.21964-1-jlayton@kernel.org [16] https://lkml.kernel.org/r/20180914105310.6454-1-jlayton@kernel.org [17] https://lkml.kernel.org/r/87a7ohs5ow.fsf@xmission.com [18] https://lkml.kernel.org/r/87pn8c1uj6.fsf_-_@x220.int.ebiederm.org Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-1-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/exec.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 547a2390baf5..fa788988efe9 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1238,6 +1238,7 @@ void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec) int begin_new_exec(struct linux_binprm * bprm) { struct task_struct *me = current; + struct files_struct *displaced; int retval; /* Once we are committed compute the creds */ @@ -1257,6 +1258,13 @@ int begin_new_exec(struct linux_binprm * bprm) if (retval) goto out; + /* Ensure the files table is not shared. */ + retval = unshare_files(&displaced); + if (retval) + goto out; + if (displaced) + put_files_struct(displaced); + /* * Must be called _before_ exec_mmap() as bprm->mm is * not visibile until then. This also enables the update @@ -1776,7 +1784,6 @@ static int bprm_execve(struct linux_binprm *bprm, int fd, struct filename *filename, int flags) { struct file *file; - struct files_struct *displaced; int retval; /* @@ -1784,13 +1791,9 @@ static int bprm_execve(struct linux_binprm *bprm, */ io_uring_task_cancel(); - retval = unshare_files(&displaced); - if (retval) - return retval; - retval = prepare_bprm_creds(bprm); if (retval) - goto out_files; + return retval; check_unsafe_exec(bprm); current->in_execve = 1; @@ -1805,8 +1808,12 @@ static int bprm_execve(struct linux_binprm *bprm, bprm->file = file; /* * Record that a name derived from an O_CLOEXEC fd will be - * inaccessible after exec. Relies on having exclusive access to - * current->files (due to unshare_files above). + * inaccessible after exec. This allows the code in exec to + * choose to fail when the executable is not mmaped into the + * interpreter and an open file descriptor is not passed to + * the interpreter. This makes for a better user experience + * than having the interpreter start and then immediately fail + * when it finds the executable is inaccessible. */ if (bprm->fdpath && close_on_exec(fd, rcu_dereference_raw(current->files->fdt))) @@ -1827,8 +1834,6 @@ static int bprm_execve(struct linux_binprm *bprm, rseq_execve(current); acct_update_integrals(current); task_numa_free(current, false); - if (displaced) - put_files_struct(displaced); return retval; out: @@ -1845,10 +1850,6 @@ static int bprm_execve(struct linux_binprm *bprm, current->fs->in_exec = 0; current->in_execve = 0; -out_files: - if (displaced) - reset_files_struct(displaced); - return retval; } From patchwork Fri Nov 20 23:14:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922843 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A300C5519F for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 250CC22470 for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728761AbgKTXQI (ORCPT ); Fri, 20 Nov 2020 18:16:08 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:33444 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726719AbgKTXQG (ORCPT ); Fri, 20 Nov 2020 18:16:06 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdN-006Tng-AZ; Fri, 20 Nov 2020 16:16:05 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdM-00EG00-61; Fri, 20 Nov 2020 16:16:04 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:19 -0600 Message-Id: <20201120231441.29911-2-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdM-00EG00-61;;;mid=<20201120231441.29911-2-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX196u8wLHWhkRGE/DETQH7hBBNOxVYgWlQo= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 02/24] exec: Simplify unshare_files X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that exec no longer needs to return the unshared files to their previous value there is no reason to return displaced. Instead when unshare_fd creates a copy of the file table, call put_files_struct before returning from unshare_files. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-2-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/coredump.c | 5 +---- fs/exec.c | 5 +---- include/linux/fdtable.h | 2 +- kernel/fork.c | 12 ++++++------ 4 files changed, 9 insertions(+), 15 deletions(-) diff --git a/fs/coredump.c b/fs/coredump.c index 0cd9056d79cc..abf807235262 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -585,7 +585,6 @@ void do_coredump(const kernel_siginfo_t *siginfo) int ispipe; size_t *argv = NULL; int argc = 0; - struct files_struct *displaced; /* require nonrelative corefile path and be extra careful */ bool need_suid_safe = false; bool core_dumped = false; @@ -791,11 +790,9 @@ void do_coredump(const kernel_siginfo_t *siginfo) } /* get us an unshared descriptor table; almost always a no-op */ - retval = unshare_files(&displaced); + retval = unshare_files(); if (retval) goto close_fail; - if (displaced) - put_files_struct(displaced); if (!dump_interrupted()) { /* * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would diff --git a/fs/exec.c b/fs/exec.c index fa788988efe9..14fae2ec1c9d 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1238,7 +1238,6 @@ void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec) int begin_new_exec(struct linux_binprm * bprm) { struct task_struct *me = current; - struct files_struct *displaced; int retval; /* Once we are committed compute the creds */ @@ -1259,11 +1258,9 @@ int begin_new_exec(struct linux_binprm * bprm) goto out; /* Ensure the files table is not shared. */ - retval = unshare_files(&displaced); + retval = unshare_files(); if (retval) goto out; - if (displaced) - put_files_struct(displaced); /* * Must be called _before_ exec_mmap() as bprm->mm is diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index a32bf47c593e..f46a084b60b2 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -109,7 +109,7 @@ struct task_struct; struct files_struct *get_files_struct(struct task_struct *); void put_files_struct(struct files_struct *fs); void reset_files_struct(struct files_struct *); -int unshare_files(struct files_struct **); +int unshare_files(void); struct files_struct *dup_fd(struct files_struct *, unsigned, int *) __latent_entropy; void do_close_on_exec(struct files_struct *); int iterate_fd(struct files_struct *, unsigned, diff --git a/kernel/fork.c b/kernel/fork.c index 32083db7a2a2..837b546528c8 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -3023,21 +3023,21 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags) * the exec layer of the kernel. */ -int unshare_files(struct files_struct **displaced) +int unshare_files(void) { struct task_struct *task = current; - struct files_struct *copy = NULL; + struct files_struct *old, *copy = NULL; int error; error = unshare_fd(CLONE_FILES, NR_OPEN_MAX, ©); - if (error || !copy) { - *displaced = NULL; + if (error || !copy) return error; - } - *displaced = task->files; + + old = task->files; task_lock(task); task->files = copy; task_unlock(task); + put_files_struct(old); return 0; } From patchwork Fri Nov 20 23:14:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC916C6379D for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A7FC62242A for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728884AbgKTXQM (ORCPT ); Fri, 20 Nov 2020 18:16:12 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:33494 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726719AbgKTXQK (ORCPT ); Fri, 20 Nov 2020 18:16:10 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdQ-006ToN-Or; Fri, 20 Nov 2020 16:16:08 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdP-00EG00-LR; Fri, 20 Nov 2020 16:16:08 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:20 -0600 Message-Id: <20201120231441.29911-3-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdP-00EG00-LR;;;mid=<20201120231441.29911-3-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19kKA42rj5rpnQD9hm96V1uNw7V7vJkWCI= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 03/24] exec: Remove reset_files_struct X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that exec no longer needs to restore the previous value of current->files on error there are no more callers of reset_files_struct so remove it. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-3-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 12 ------------ include/linux/fdtable.h | 1 - 2 files changed, 13 deletions(-) diff --git a/fs/file.c b/fs/file.c index 4559b5fec3bd..beae7c55c84c 100644 --- a/fs/file.c +++ b/fs/file.c @@ -436,18 +436,6 @@ void put_files_struct(struct files_struct *files) } } -void reset_files_struct(struct files_struct *files) -{ - struct task_struct *tsk = current; - struct files_struct *old; - - old = tsk->files; - task_lock(tsk); - tsk->files = files; - task_unlock(tsk); - put_files_struct(old); -} - void exit_files(struct task_struct *tsk) { struct files_struct * files = tsk->files; diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index f46a084b60b2..7cc9885044d9 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -108,7 +108,6 @@ struct task_struct; struct files_struct *get_files_struct(struct task_struct *); void put_files_struct(struct files_struct *fs); -void reset_files_struct(struct files_struct *); int unshare_files(void); struct files_struct *dup_fd(struct files_struct *, unsigned, int *) __latent_entropy; void do_close_on_exec(struct files_struct *); From patchwork Fri Nov 20 23:14:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.9 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 087CBC64E75 for ; Fri, 20 Nov 2020 23:16:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CB3282240B for ; Fri, 20 Nov 2020 23:16:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729053AbgKTXQP (ORCPT ); Fri, 20 Nov 2020 18:16:15 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:48144 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726719AbgKTXQN (ORCPT ); Fri, 20 Nov 2020 18:16:13 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdU-006Xbh-6t; Fri, 20 Nov 2020 16:16:12 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdT-00EG00-4X; Fri, 20 Nov 2020 16:16:11 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:21 -0600 Message-Id: <20201120231441.29911-4-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdT-00EG00-4X;;;mid=<20201120231441.29911-4-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/hV0hyPS0zo8CxNcnpJ9dbcaqQXH40V9k= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 04/24] kcmp: In kcmp_epoll_target use fget_task X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Use the helper fget_task and simplify the code. As well as simplifying the code this removes one unnecessary increment of struct files_struct. This unnecessary increment of files_struct.count can result in exec unnecessarily unsharing files_struct and breaking posix locks, and it can result in fget_light having to fallback to fget reducing performance. Suggested-by: Oleg Nesterov Reviewed-by: Cyrill Gorcunov v1: https://lkml.kernel.org/r/20200817220425.9389-4-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- kernel/kcmp.c | 20 ++++---------------- 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/kernel/kcmp.c b/kernel/kcmp.c index b3ff9288c6cc..87c48c0104ad 100644 --- a/kernel/kcmp.c +++ b/kernel/kcmp.c @@ -107,7 +107,6 @@ static int kcmp_epoll_target(struct task_struct *task1, { struct file *filp, *filp_epoll, *filp_tgt; struct kcmp_epoll_slot slot; - struct files_struct *files; if (copy_from_user(&slot, uslot, sizeof(slot))) return -EFAULT; @@ -116,23 +115,12 @@ static int kcmp_epoll_target(struct task_struct *task1, if (!filp) return -EBADF; - files = get_files_struct(task2); - if (!files) + filp_epoll = fget_task(task2, slot.efd); + if (!filp_epoll) return -EBADF; - spin_lock(&files->file_lock); - filp_epoll = fcheck_files(files, slot.efd); - if (filp_epoll) - get_file(filp_epoll); - else - filp_tgt = ERR_PTR(-EBADF); - spin_unlock(&files->file_lock); - put_files_struct(files); - - if (filp_epoll) { - filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); - fput(filp_epoll); - } + filp_tgt = get_epoll_tfile_raw_ptr(filp_epoll, slot.tfd, slot.toff); + fput(filp_epoll); if (IS_ERR(filp_tgt)) return PTR_ERR(filp_tgt); From patchwork Fri Nov 20 23:14:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D490C64E7A for ; Fri, 20 Nov 2020 23:16:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 30A452240B for ; Fri, 20 Nov 2020 23:16:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729145AbgKTXQR (ORCPT ); Fri, 20 Nov 2020 18:16:17 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:46004 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726719AbgKTXQR (ORCPT ); Fri, 20 Nov 2020 18:16:17 -0500 X-Greylist: delayed 302 seconds by postgrey-1.27 at vger.kernel.org; Fri, 20 Nov 2020 18:16:16 EST Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdX-002RDz-J2; Fri, 20 Nov 2020 16:16:15 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdW-00EG00-K1; Fri, 20 Nov 2020 16:16:15 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:22 -0600 Message-Id: <20201120231441.29911-5-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdW-00EG00-K1;;;mid=<20201120231441.29911-5-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+Xhwlps+6lJ3cvlbGlrhm0qEn5witM5cY= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 05/24] bpf: In bpf_task_fd_query use fget_task X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Use the helper fget_task to simplify bpf_task_fd_query. As well as simplifying the code this removes one unnecessary increment of struct files_struct. This unnecessary increment of files_struct.count can result in exec unnecessarily unsharing files_struct and breaking posix locks, and it can result in fget_light having to fallback to fget reducing performance. This simplification comes from the observation that none of the callers of get_files_struct actually need to call get_files_struct that was made when discussing[1] exec and posix file locks. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov v1: https://lkml.kernel.org/r/20200817220425.9389-5-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- kernel/bpf/syscall.c | 20 +++----------------- 1 file changed, 3 insertions(+), 17 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 8f50c9c19f1b..6d49c2e1634c 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -3878,7 +3878,6 @@ static int bpf_task_fd_query(const union bpf_attr *attr, pid_t pid = attr->task_fd_query.pid; u32 fd = attr->task_fd_query.fd; const struct perf_event *event; - struct files_struct *files; struct task_struct *task; struct file *file; int err; @@ -3896,23 +3895,11 @@ static int bpf_task_fd_query(const union bpf_attr *attr, if (!task) return -ENOENT; - files = get_files_struct(task); - put_task_struct(task); - if (!files) - return -ENOENT; - err = 0; - spin_lock(&files->file_lock); - file = fcheck_files(files, fd); + file = fget_task(task, fd); + put_task_struct(task); if (!file) - err = -EBADF; - else - get_file(file); - spin_unlock(&files->file_lock); - put_files_struct(files); - - if (err) - goto out; + return -EBADF; if (file->f_op == &bpf_link_fops) { struct bpf_link *link = file->private_data; @@ -3952,7 +3939,6 @@ static int bpf_task_fd_query(const union bpf_attr *attr, err = -ENOTSUPP; put_file: fput(file); -out: return err; } From patchwork Fri Nov 20 23:14:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3458FC64E8A for ; Fri, 20 Nov 2020 23:16:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D70A42240B for ; Fri, 20 Nov 2020 23:16:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729218AbgKTXQV (ORCPT ); Fri, 20 Nov 2020 18:16:21 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:33568 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729180AbgKTXQU (ORCPT ); Fri, 20 Nov 2020 18:16:20 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdb-006TpF-0e; Fri, 20 Nov 2020 16:16:19 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdZ-00EG00-Uv; Fri, 20 Nov 2020 16:16:18 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:23 -0600 Message-Id: <20201120231441.29911-6-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdZ-00EG00-Uv;;;mid=<20201120231441.29911-6-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19C41iZMJEug1eAQ9SvOh64q2n0pWvaffg= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 06/24] proc/fd: In proc_fd_link use fget_task X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Simplifying proc_fd_link is a little bit tricky. It is necessary to know that there is a reference to fd_f ile while path_get is running. This reference can either be guaranteed to exist either by locking the fdtable as the code currently does or by taking a reference on the file in question. Use fget_task to remove the need for get_files_struct and to take a reference to file in question. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov v1: https://lkml.kernel.org/r/20200817220425.9389-8-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/proc/fd.c | 13 +++---------- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/fs/proc/fd.c b/fs/proc/fd.c index 81882a13212d..d58960f6ee52 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -146,29 +146,22 @@ static const struct dentry_operations tid_fd_dentry_operations = { static int proc_fd_link(struct dentry *dentry, struct path *path) { - struct files_struct *files = NULL; struct task_struct *task; int ret = -ENOENT; task = get_proc_task(d_inode(dentry)); if (task) { - files = get_files_struct(task); - put_task_struct(task); - } - - if (files) { unsigned int fd = proc_fd(d_inode(dentry)); struct file *fd_file; - spin_lock(&files->file_lock); - fd_file = fcheck_files(files, fd); + fd_file = fget_task(task, fd); if (fd_file) { *path = fd_file->f_path; path_get(&fd_file->f_path); ret = 0; + fput(fd_file); } - spin_unlock(&files->file_lock); - put_files_struct(files); + put_task_struct(task); } return ret; From patchwork Fri Nov 20 23:14:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6B90C71156 for ; Fri, 20 Nov 2020 23:16:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 867412240B for ; Fri, 20 Nov 2020 23:16:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729245AbgKTXQY (ORCPT ); Fri, 20 Nov 2020 18:16:24 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:48200 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729180AbgKTXQX (ORCPT ); Fri, 20 Nov 2020 18:16:23 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFde-006XcK-FR; Fri, 20 Nov 2020 16:16:22 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdd-00EG00-Cj; Fri, 20 Nov 2020 16:16:21 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:24 -0600 Message-Id: <20201120231441.29911-7-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdd-00EG00-Cj;;;mid=<20201120231441.29911-7-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19gFZpO2kzio9Bgb51WV6i/yMyxVgnYUHc= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 07/24] file: Rename __fcheck_files to files_lookup_fd_raw X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The function fcheck despite it's comment is poorly named as it has no callers that only check it's return value. All of fcheck's callers use the returned file descriptor. The same is true for fcheck_files and __fcheck_files. A new less confusing name is needed. In addition the names of these functions are confusing as they do not report the kind of locks that are needed to be held when these functions are called making error prone to use them. To remedy this I am making the base functio name lookup_fd and will and prefixes and sufficies to indicate the rest of the context. Name the function (previously called __fcheck_files) that proceeds from a struct files_struct, looks up the struct file of a file descriptor, and requires it's callers to verify all of the appropriate locks are held files_lookup_fd_raw. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 2 +- include/linux/fdtable.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/file.c b/fs/file.c index beae7c55c84c..b5591efb87f5 100644 --- a/fs/file.c +++ b/fs/file.c @@ -887,7 +887,7 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask) struct file *file; if (atomic_read(&files->count) == 1) { - file = __fcheck_files(files, fd); + file = files_lookup_fd_raw(files, fd); if (!file || unlikely(file->f_mode & mask)) return 0; return (unsigned long)file; diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index 7cc9885044d9..639933f37da9 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -80,7 +80,7 @@ struct dentry; /* * The caller must ensure that fd table isn't shared or hold rcu or file lock */ -static inline struct file *__fcheck_files(struct files_struct *files, unsigned int fd) +static inline struct file *files_lookup_fd_raw(struct files_struct *files, unsigned int fd) { struct fdtable *fdt = rcu_dereference_raw(files->fdt); @@ -96,7 +96,7 @@ static inline struct file *fcheck_files(struct files_struct *files, unsigned int RCU_LOCKDEP_WARN(!rcu_read_lock_held() && !lockdep_is_held(&files->file_lock), "suspicious rcu_dereference_check() usage"); - return __fcheck_files(files, fd); + return files_lookup_fd_raw(files, fd); } /* From patchwork Fri Nov 20 23:14:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86472C5519F for ; Fri, 20 Nov 2020 23:17:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 36BF52242A for ; Fri, 20 Nov 2020 23:17:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729266AbgKTXQ2 (ORCPT ); Fri, 20 Nov 2020 18:16:28 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:33648 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729241AbgKTXQ1 (ORCPT ); Fri, 20 Nov 2020 18:16:27 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdi-006TqE-8D; Fri, 20 Nov 2020 16:16:26 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdg-00EG00-RF; Fri, 20 Nov 2020 16:16:25 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:25 -0600 Message-Id: <20201120231441.29911-8-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdg-00EG00-RF;;;mid=<20201120231441.29911-8-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX18edY7AsNzDid9bgwd2GH9BMu//oTEEyxM= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 08/24] file: Factor files_lookup_fd_locked out of fcheck_files X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org To make it easy to tell where files->file_lock protection is being used when looking up a file create files_lookup_fd_locked. Only allow this function to be called with the file_lock held. Update the callers of fcheck and fcheck_files that are called with the files->file_lock held to call files_lookup_fd_locked instead. Hopefully this makes it easier to quickly understand what is going on. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 2 +- fs/locks.c | 14 ++++++++------ fs/proc/fd.c | 2 +- include/linux/fdtable.h | 7 +++++++ 4 files changed, 17 insertions(+), 8 deletions(-) diff --git a/fs/file.c b/fs/file.c index b5591efb87f5..9d0e91168be1 100644 --- a/fs/file.c +++ b/fs/file.c @@ -1098,7 +1098,7 @@ static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags) spin_lock(&files->file_lock); err = expand_files(files, newfd); - file = fcheck(oldfd); + file = files_lookup_fd_locked(files, oldfd); if (unlikely(!file)) goto Ebadf; if (unlikely(err < 0)) { diff --git a/fs/locks.c b/fs/locks.c index 1f84a03601fe..148197c1b547 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -2539,14 +2539,15 @@ int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd, */ if (!error && file_lock->fl_type != F_UNLCK && !(file_lock->fl_flags & FL_OFDLCK)) { + struct files_struct *files = current->files; /* * We need that spin_lock here - it prevents reordering between * update of i_flctx->flc_posix and check for it done in * close(). rcu_read_lock() wouldn't do. */ - spin_lock(¤t->files->file_lock); - f = fcheck(fd); - spin_unlock(¤t->files->file_lock); + spin_lock(&files->file_lock); + f = files_lookup_fd_locked(files, fd); + spin_unlock(&files->file_lock); if (f != filp) { file_lock->fl_type = F_UNLCK; error = do_lock_file_wait(filp, cmd, file_lock); @@ -2670,14 +2671,15 @@ int fcntl_setlk64(unsigned int fd, struct file *filp, unsigned int cmd, */ if (!error && file_lock->fl_type != F_UNLCK && !(file_lock->fl_flags & FL_OFDLCK)) { + struct files_struct *files = current->files; /* * We need that spin_lock here - it prevents reordering between * update of i_flctx->flc_posix and check for it done in * close(). rcu_read_lock() wouldn't do. */ - spin_lock(¤t->files->file_lock); - f = fcheck(fd); - spin_unlock(¤t->files->file_lock); + spin_lock(&files->file_lock); + f = files_lookup_fd_locked(files, fd); + spin_unlock(&files->file_lock); if (f != filp) { file_lock->fl_type = F_UNLCK; error = do_lock_file_wait(filp, cmd, file_lock); diff --git a/fs/proc/fd.c b/fs/proc/fd.c index d58960f6ee52..2cca9bca3b3a 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -35,7 +35,7 @@ static int seq_show(struct seq_file *m, void *v) unsigned int fd = proc_fd(m->private); spin_lock(&files->file_lock); - file = fcheck_files(files, fd); + file = files_lookup_fd_locked(files, fd); if (file) { struct fdtable *fdt = files_fdtable(files); diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index 639933f37da9..fda4b81dd735 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -91,6 +91,13 @@ static inline struct file *files_lookup_fd_raw(struct files_struct *files, unsig return NULL; } +static inline struct file *files_lookup_fd_locked(struct files_struct *files, unsigned int fd) +{ + RCU_LOCKDEP_WARN(!lockdep_is_held(&files->file_lock), + "suspicious rcu_dereference_check() usage"); + return files_lookup_fd_raw(files, fd); +} + static inline struct file *fcheck_files(struct files_struct *files, unsigned int fd) { RCU_LOCKDEP_WARN(!rcu_read_lock_held() && From patchwork Fri Nov 20 23:14:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922859 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CBABC64E69 for ; Fri, 20 Nov 2020 23:17:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DD3562240B for ; Fri, 20 Nov 2020 23:17:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729318AbgKTXQc (ORCPT ); Fri, 20 Nov 2020 18:16:32 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:48262 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729299AbgKTXQb (ORCPT ); Fri, 20 Nov 2020 18:16:31 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdl-006XdA-P4; Fri, 20 Nov 2020 16:16:29 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdk-00EG00-Kf; Fri, 20 Nov 2020 16:16:29 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:26 -0600 Message-Id: <20201120231441.29911-9-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdk-00EG00-Kf;;;mid=<20201120231441.29911-9-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/iyicyOORGjJR68+3TCVNRFyQygIkeLVU= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 09/24] file: Replace fcheck_files with files_lookup_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This change renames fcheck_files to files_lookup_fd_rcu. All of the remaining callers take the rcu_read_lock before calling this function so the _rcu suffix is appropriate. This change also tightens up the debug check to verify that all callers hold the rcu_read_lock. All callers that used to call files_check with the files->file_lock held have now been changed to call files_lookup_fd_locked. This change of name has helped remind me of which locks and which guarantees are in place helping me to catch bugs later in the patchset. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Signed-off-by: "Eric W. Biederman" --- Documentation/filesystems/files.rst | 6 +++--- fs/file.c | 4 ++-- fs/proc/fd.c | 4 ++-- include/linux/fdtable.h | 7 +++---- kernel/bpf/task_iter.c | 2 +- kernel/kcmp.c | 2 +- 6 files changed, 12 insertions(+), 13 deletions(-) diff --git a/Documentation/filesystems/files.rst b/Documentation/filesystems/files.rst index cbf8e57376bf..ea75acdb632c 100644 --- a/Documentation/filesystems/files.rst +++ b/Documentation/filesystems/files.rst @@ -62,7 +62,7 @@ the fdtable structure - be held. 4. To look up the file structure given an fd, a reader - must use either fcheck() or fcheck_files() APIs. These + must use either fcheck() or files_lookup_fd_rcu() APIs. These take care of barrier requirements due to lock-free lookup. An example:: @@ -84,7 +84,7 @@ the fdtable structure - on ->f_count:: rcu_read_lock(); - file = fcheck_files(files, fd); + file = files_lookup_fd_rcu(files, fd); if (file) { if (atomic_long_inc_not_zero(&file->f_count)) *fput_needed = 1; @@ -104,7 +104,7 @@ the fdtable structure - lock-free, they must be installed using rcu_assign_pointer() API. If they are looked up lock-free, rcu_dereference() must be used. However it is advisable to use files_fdtable() - and fcheck()/fcheck_files() which take care of these issues. + and fcheck()/files_lookup_fd_rcu() which take care of these issues. 7. While updating, the fdtable pointer must be looked up while holding files->file_lock. If ->file_lock is dropped, then diff --git a/fs/file.c b/fs/file.c index 9d0e91168be1..5861c4f89419 100644 --- a/fs/file.c +++ b/fs/file.c @@ -814,7 +814,7 @@ static struct file *__fget_files(struct files_struct *files, unsigned int fd, rcu_read_lock(); loop: - file = fcheck_files(files, fd); + file = files_lookup_fd_rcu(files, fd); if (file) { /* File object ref couldn't be taken. * dup2() atomicity guarantee is the reason @@ -1127,7 +1127,7 @@ SYSCALL_DEFINE2(dup2, unsigned int, oldfd, unsigned int, newfd) int retval = oldfd; rcu_read_lock(); - if (!fcheck_files(files, oldfd)) + if (!files_lookup_fd_rcu(files, oldfd)) retval = -EBADF; rcu_read_unlock(); return retval; diff --git a/fs/proc/fd.c b/fs/proc/fd.c index 2cca9bca3b3a..3dec44d7c5c5 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -90,7 +90,7 @@ static bool tid_fd_mode(struct task_struct *task, unsigned fd, fmode_t *mode) return false; rcu_read_lock(); - file = fcheck_files(files, fd); + file = files_lookup_fd_rcu(files, fd); if (file) *mode = file->f_mode; rcu_read_unlock(); @@ -243,7 +243,7 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, char name[10 + 1]; unsigned int len; - f = fcheck_files(files, fd); + f = files_lookup_fd_rcu(files, fd); if (!f) continue; data.mode = f->f_mode; diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index fda4b81dd735..fa8c402a7790 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -98,10 +98,9 @@ static inline struct file *files_lookup_fd_locked(struct files_struct *files, un return files_lookup_fd_raw(files, fd); } -static inline struct file *fcheck_files(struct files_struct *files, unsigned int fd) +static inline struct file *files_lookup_fd_rcu(struct files_struct *files, unsigned int fd) { - RCU_LOCKDEP_WARN(!rcu_read_lock_held() && - !lockdep_is_held(&files->file_lock), + RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "suspicious rcu_dereference_check() usage"); return files_lookup_fd_raw(files, fd); } @@ -109,7 +108,7 @@ static inline struct file *fcheck_files(struct files_struct *files, unsigned int /* * Check whether the specified fd has an open file. */ -#define fcheck(fd) fcheck_files(current->files, fd) +#define fcheck(fd) files_lookup_fd_rcu(current->files, fd) struct task_struct; diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index 5b6af30bfbcd..5ab2ccfb96cb 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -183,7 +183,7 @@ task_file_seq_get_next(struct bpf_iter_seq_task_file_info *info, for (; curr_fd < max_fds; curr_fd++) { struct file *f; - f = fcheck_files(curr_files, curr_fd); + f = files_lookup_fd_rcu(curr_files, curr_fd); if (!f) continue; if (!get_file_rcu(f)) diff --git a/kernel/kcmp.c b/kernel/kcmp.c index 87c48c0104ad..990717c1aed3 100644 --- a/kernel/kcmp.c +++ b/kernel/kcmp.c @@ -67,7 +67,7 @@ get_file_raw_ptr(struct task_struct *task, unsigned int idx) rcu_read_lock(); if (task->files) - file = fcheck_files(task->files, idx); + file = files_lookup_fd_rcu(task->files, idx); rcu_read_unlock(); task_unlock(task); From patchwork Fri Nov 20 23:14:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922861 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7444C64E7C for ; Fri, 20 Nov 2020 23:17:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8C04F2240B for ; Fri, 20 Nov 2020 23:17:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729339AbgKTXQh (ORCPT ); Fri, 20 Nov 2020 18:16:37 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:48302 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729271AbgKTXQf (ORCPT ); Fri, 20 Nov 2020 18:16:35 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdp-006XdW-K3; Fri, 20 Nov 2020 16:16:33 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdo-00EG00-57; Fri, 20 Nov 2020 16:16:33 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:27 -0600 Message-Id: <20201120231441.29911-10-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdo-00EG00-57;;;mid=<20201120231441.29911-10-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/+yR4JtgeSRtgRNcLI225rYxafT0I4Doo= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 10/24] file: Rename fcheck lookup_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Also remove the confusing comment about checking if a fd exists. I could not find one instance in the entire kernel that still matches the description or the reason for the name fcheck. The need for better names became apparent in the last round of discussion of this set of changes[1]. [1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com Signed-off-by: "Eric W. Biederman" --- Documentation/filesystems/files.rst | 6 +++--- arch/powerpc/platforms/cell/spufs/coredump.c | 2 +- fs/notify/dnotify/dnotify.c | 2 +- include/linux/fdtable.h | 8 ++++---- 4 files changed, 9 insertions(+), 9 deletions(-) diff --git a/Documentation/filesystems/files.rst b/Documentation/filesystems/files.rst index ea75acdb632c..bcf84459917f 100644 --- a/Documentation/filesystems/files.rst +++ b/Documentation/filesystems/files.rst @@ -62,7 +62,7 @@ the fdtable structure - be held. 4. To look up the file structure given an fd, a reader - must use either fcheck() or files_lookup_fd_rcu() APIs. These + must use either lookup_fd_rcu() or files_lookup_fd_rcu() APIs. These take care of barrier requirements due to lock-free lookup. An example:: @@ -70,7 +70,7 @@ the fdtable structure - struct file *file; rcu_read_lock(); - file = fcheck(fd); + file = lookup_fd_rcu(fd); if (file) { ... } @@ -104,7 +104,7 @@ the fdtable structure - lock-free, they must be installed using rcu_assign_pointer() API. If they are looked up lock-free, rcu_dereference() must be used. However it is advisable to use files_fdtable() - and fcheck()/files_lookup_fd_rcu() which take care of these issues. + and lookup_fd_rcu()/files_lookup_fd_rcu() which take care of these issues. 7. While updating, the fdtable pointer must be looked up while holding files->file_lock. If ->file_lock is dropped, then diff --git a/arch/powerpc/platforms/cell/spufs/coredump.c b/arch/powerpc/platforms/cell/spufs/coredump.c index 026c181a98c5..60b5583e9eaf 100644 --- a/arch/powerpc/platforms/cell/spufs/coredump.c +++ b/arch/powerpc/platforms/cell/spufs/coredump.c @@ -74,7 +74,7 @@ static struct spu_context *coredump_next_context(int *fd) *fd = n - 1; rcu_read_lock(); - file = fcheck(*fd); + file = lookup_fd_rcu(*fd); ctx = SPUFS_I(file_inode(file))->i_ctx; get_spu_context(ctx); rcu_read_unlock(); diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c index 5dcda8f20c04..5486aaca60b0 100644 --- a/fs/notify/dnotify/dnotify.c +++ b/fs/notify/dnotify/dnotify.c @@ -327,7 +327,7 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned long arg) } rcu_read_lock(); - f = fcheck(fd); + f = lookup_fd_rcu(fd); rcu_read_unlock(); /* if (f != filp) means that we lost a race and another task/thread diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index fa8c402a7790..2a4a8fed536e 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -105,10 +105,10 @@ static inline struct file *files_lookup_fd_rcu(struct files_struct *files, unsig return files_lookup_fd_raw(files, fd); } -/* - * Check whether the specified fd has an open file. - */ -#define fcheck(fd) files_lookup_fd_rcu(current->files, fd) +static inline struct file *lookup_fd_rcu(unsigned int fd) +{ + return files_lookup_fd_rcu(current->files, fd); +} struct task_struct; From patchwork Fri Nov 20 23:14:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922877 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 035FDC64E7D for ; Fri, 20 Nov 2020 23:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BEC322242A for ; Fri, 20 Nov 2020 23:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729486AbgKTXUF (ORCPT ); Fri, 20 Nov 2020 18:20:05 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:36046 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729426AbgKTXUE (ORCPT ); Fri, 20 Nov 2020 18:20:04 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFhC-006UZN-77; Fri, 20 Nov 2020 16:20:02 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdr-00EG00-VT; Fri, 20 Nov 2020 16:16:36 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:28 -0600 Message-Id: <20201120231441.29911-11-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdr-00EG00-VT;;;mid=<20201120231441.29911-11-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/h0QmcCLCV6HVHgw+OmRdeEJ6Or8V526k= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 11/24] file: Implement task_lookup_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org As a companion to lookup_fd_rcu implement task_lookup_fd_rcu for querying an arbitrary process about a specific file. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200818103713.aw46m7vprsy4vlve@wittgenstein Signed-off-by: "Eric W. Biederman" --- fs/file.c | 15 +++++++++++++++ include/linux/fdtable.h | 2 ++ 2 files changed, 17 insertions(+) diff --git a/fs/file.c b/fs/file.c index 5861c4f89419..6448523ca29e 100644 --- a/fs/file.c +++ b/fs/file.c @@ -865,6 +865,21 @@ struct file *fget_task(struct task_struct *task, unsigned int fd) return file; } +struct file *task_lookup_fd_rcu(struct task_struct *task, unsigned int fd) +{ + /* Must be called with rcu_read_lock held */ + struct files_struct *files; + struct file *file = NULL; + + task_lock(task); + files = task->files; + if (files) + file = files_lookup_fd_rcu(files, fd); + task_unlock(task); + + return file; +} + /* * Lightweight file lookup - no refcnt increment if fd table isn't shared. * diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index 2a4a8fed536e..a0558ae9b40c 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -110,6 +110,8 @@ static inline struct file *lookup_fd_rcu(unsigned int fd) return files_lookup_fd_rcu(current->files, fd); } +struct file *task_lookup_fd_rcu(struct task_struct *task, unsigned int fd); + struct task_struct; struct files_struct *get_files_struct(struct task_struct *); From patchwork Fri Nov 20 23:14:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922869 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E99C3C71155 for ; Fri, 20 Nov 2020 23:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A9F5A2242A for ; Fri, 20 Nov 2020 23:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728578AbgKTXUO (ORCPT ); Fri, 20 Nov 2020 18:20:14 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:36008 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728872AbgKTXUD (ORCPT ); Fri, 20 Nov 2020 18:20:03 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFhB-006UZ1-ND; Fri, 20 Nov 2020 16:20:01 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdv-00EG00-Bh; Fri, 20 Nov 2020 16:16:39 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:29 -0600 Message-Id: <20201120231441.29911-12-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdv-00EG00-Bh;;;mid=<20201120231441.29911-12-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/ToDHw4ikimbGSvYMEodDbQxjX7BmZcnc= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 12/24] proc/fd: In tid_fd_mode use task_lookup_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Instead of manually coding finding the files struct for a task and then calling files_lookup_fd_rcu, use the helper task_lookup_fd_rcu that combines those to steps. Making the code simpler and removing the need to get a reference on a files_struct. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-7-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/proc/fd.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/fs/proc/fd.c b/fs/proc/fd.c index 3dec44d7c5c5..c1a984f3c4df 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -83,18 +83,13 @@ static const struct file_operations proc_fdinfo_file_operations = { static bool tid_fd_mode(struct task_struct *task, unsigned fd, fmode_t *mode) { - struct files_struct *files = get_files_struct(task); struct file *file; - if (!files) - return false; - rcu_read_lock(); - file = files_lookup_fd_rcu(files, fd); + file = task_lookup_fd_rcu(task, fd); if (file) *mode = file->f_mode; rcu_read_unlock(); - put_files_struct(files); return !!file; } From patchwork Fri Nov 20 23:14:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922879 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89271C56202 for ; Fri, 20 Nov 2020 23:20:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 50C1922470 for ; Fri, 20 Nov 2020 23:20:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729335AbgKTXUV (ORCPT ); Fri, 20 Nov 2020 18:20:21 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:46984 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729241AbgKTXUA (ORCPT ); Fri, 20 Nov 2020 18:20:00 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh8-002RbD-NX; Fri, 20 Nov 2020 16:19:58 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFdy-00EG00-Lx; Fri, 20 Nov 2020 16:16:43 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:30 -0600 Message-Id: <20201120231441.29911-13-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFdy-00EG00-Lx;;;mid=<20201120231441.29911-13-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX194cx44q+KvkyQRjsxLSGAq1iGzNebFjbg= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 13/24] kcmp: In get_file_raw_ptr use task_lookup_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Modify get_file_raw_ptr to use task_lookup_fd_rcu. The helper task_lookup_fd_rcu does the work of taking the task lock and verifying that task->files != NULL and then calls files_lookup_fd_rcu. So let use the helper to make a simpler implementation of get_file_raw_ptr. Signed-off-by: "Eric W. Biederman" Acked-by: Cyrill Gorcunov --- kernel/kcmp.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/kernel/kcmp.c b/kernel/kcmp.c index 990717c1aed3..36e58eb5a11d 100644 --- a/kernel/kcmp.c +++ b/kernel/kcmp.c @@ -61,16 +61,11 @@ static int kcmp_ptr(void *v1, void *v2, enum kcmp_type type) static struct file * get_file_raw_ptr(struct task_struct *task, unsigned int idx) { - struct file *file = NULL; + struct file *file; - task_lock(task); rcu_read_lock(); - - if (task->files) - file = files_lookup_fd_rcu(task->files, idx); - + file = task_lookup_fd_rcu(task, idx); rcu_read_unlock(); - task_unlock(task); return file; } From patchwork Fri Nov 20 23:14:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922887 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43128C64EBC for ; Fri, 20 Nov 2020 23:21:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0CCA42240C for ; Fri, 20 Nov 2020 23:21:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729526AbgKTXU3 (ORCPT ); Fri, 20 Nov 2020 18:20:29 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:49144 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728864AbgKTXT7 (ORCPT ); Fri, 20 Nov 2020 18:19:59 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh7-006Xzz-Kx; Fri, 20 Nov 2020 16:19:57 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFe2-00EG00-1l; Fri, 20 Nov 2020 16:16:46 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:31 -0600 Message-Id: <20201120231441.29911-14-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFe2-00EG00-1l;;;mid=<20201120231441.29911-14-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/WaiBJd3Nq6/ZzikkysKyBD91aHVCzXs0= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 14/24] file: Implement task_lookup_next_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org As a companion to fget_task and task_lookup_fd_rcu implement task_lookup_next_fd_rcu that will return the struct file for the first file descriptor number that is equal or greater than the fd argument value, or NULL if there is no such struct file. This allows file descriptors of foreign processes to be iterated through safely, without needed to increment the count on files_struct. Some concern[1] has been expressed that this function takes the task_lock for each iteration and thus for each file descriptor. This place where this function will be called in a commonly used code path is for listing /proc//fd. I did some small benchmarks and did not see any measurable performance differences. For ordinary users ls is likely to stat each of the directory entries and tid_fd_mode called from tid_fd_revalidae has always taken the task lock for each file descriptor. So this does not look like it will be a big change in practice. At some point is will probably be worth changing put_files_struct to free files_struct after an rcu grace period so that task_lock won't be needed at all. [1] https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com v1: https://lkml.kernel.org/r/20200817220425.9389-9-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 21 +++++++++++++++++++++ include/linux/fdtable.h | 1 + 2 files changed, 22 insertions(+) diff --git a/fs/file.c b/fs/file.c index 6448523ca29e..23b888a4acbe 100644 --- a/fs/file.c +++ b/fs/file.c @@ -880,6 +880,27 @@ struct file *task_lookup_fd_rcu(struct task_struct *task, unsigned int fd) return file; } +struct file *task_lookup_next_fd_rcu(struct task_struct *task, unsigned int *ret_fd) +{ + /* Must be called with rcu_read_lock held */ + struct files_struct *files; + unsigned int fd = *ret_fd; + struct file *file = NULL; + + task_lock(task); + files = task->files; + if (files) { + for (; fd < files_fdtable(files)->max_fds; fd++) { + file = files_lookup_fd_rcu(files, fd); + if (file) + break; + } + } + task_unlock(task); + *ret_fd = fd; + return file; +} + /* * Lightweight file lookup - no refcnt increment if fd table isn't shared. * diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index a0558ae9b40c..cf6c52dae3a1 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -111,6 +111,7 @@ static inline struct file *lookup_fd_rcu(unsigned int fd) } struct file *task_lookup_fd_rcu(struct task_struct *task, unsigned int fd); +struct file *task_lookup_next_fd_rcu(struct task_struct *task, unsigned int *fd); struct task_struct; From patchwork Fri Nov 20 23:14:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922873 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F934C8300F for ; Fri, 20 Nov 2020 23:20:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 773E62240C for ; Fri, 20 Nov 2020 23:20:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729327AbgKTXUA (ORCPT ); Fri, 20 Nov 2020 18:20:00 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:49098 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728872AbgKTXT5 (ORCPT ); Fri, 20 Nov 2020 18:19:57 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh6-006Xzd-1h; Fri, 20 Nov 2020 16:19:56 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFe5-00EG00-JV; Fri, 20 Nov 2020 16:16:50 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" , Andy Lavr Date: Fri, 20 Nov 2020 17:14:32 -0600 Message-Id: <20201120231441.29911-15-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFe5-00EG00-JV;;;mid=<20201120231441.29911-15-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+q5zppczWnIGjDjmGmiwzeRnfW/aWDbwE= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 15/24] proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Using task_lookup_next_fd_rcu simplifies proc_readfd_common, by moving the checking for the maximum file descritor into the generic code, and by remvoing the need for capturing and releasing a reference on files_struct. As task_lookup_fd_rcu may update the fd ctx->pos has been changed to be the fd +2 after task_lookup_fd_rcu returns. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Tested-by: Andy Lavr v1: https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com Signed-off-by: Eric W. Biederman --- fs/proc/fd.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/fs/proc/fd.c b/fs/proc/fd.c index c1a984f3c4df..72c1525b4b3e 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -217,7 +217,6 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, instantiate_t instantiate) { struct task_struct *p = get_proc_task(file_inode(file)); - struct files_struct *files; unsigned int fd; if (!p) @@ -225,22 +224,18 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, if (!dir_emit_dots(file, ctx)) goto out; - files = get_files_struct(p); - if (!files) - goto out; rcu_read_lock(); - for (fd = ctx->pos - 2; - fd < files_fdtable(files)->max_fds; - fd++, ctx->pos++) { + for (fd = ctx->pos - 2;; fd++) { struct file *f; struct fd_data data; char name[10 + 1]; unsigned int len; - f = files_lookup_fd_rcu(files, fd); + f = task_lookup_next_fd_rcu(p, &fd); + ctx->pos = fd + 2LL; if (!f) - continue; + break; data.mode = f->f_mode; rcu_read_unlock(); data.fd = fd; @@ -249,13 +244,11 @@ static int proc_readfd_common(struct file *file, struct dir_context *ctx, if (!proc_fill_cache(file, ctx, name, len, instantiate, p, &data)) - goto out_fd_loop; + goto out; cond_resched(); rcu_read_lock(); } rcu_read_unlock(); -out_fd_loop: - put_files_struct(files); out: put_task_struct(p); return 0; From patchwork Fri Nov 20 23:14:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66606C8300B for ; Fri, 20 Nov 2020 23:20:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0BD622240C for ; Fri, 20 Nov 2020 23:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729440AbgKTXUD (ORCPT ); Fri, 20 Nov 2020 18:20:03 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:47076 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729358AbgKTXUC (ORCPT ); Fri, 20 Nov 2020 18:20:02 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFhA-002Rc7-V7; Fri, 20 Nov 2020 16:20:01 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFe9-00EG00-8Y; Fri, 20 Nov 2020 16:16:54 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:33 -0600 Message-Id: <20201120231441.29911-16-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFe9-00EG00-8Y;;;mid=<20201120231441.29911-16-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/UiNkGFJP+cwxvIGMAiHgWENa406BND64= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 16/24] bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Using task_lookup_next_fd_rcu simplifies task_file_seq_get_next, by moving the checking for the maximum file descritor into the generic code, and by remvoing the need for capturing and releasing a reference on files_struct. As the reference count of files_struct no longer needs to be maintained bpf_iter_seq_task_file_info can have it's files member removed and task_file_seq_get_next no longer needs it's fstruct argument. The curr_fd local variable does need to become unsigned to be used with fnext_task. As curr_fd is assigned from and assigned a u32 making curr_fd an unsigned int won't cause problems and might prevent them. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov v1: https://lkml.kernel.org/r/20200817220425.9389-11-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- kernel/bpf/task_iter.c | 44 ++++++++++-------------------------------- 1 file changed, 10 insertions(+), 34 deletions(-) diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index 5ab2ccfb96cb..4ec63170c741 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -130,45 +130,33 @@ struct bpf_iter_seq_task_file_info { */ struct bpf_iter_seq_task_common common; struct task_struct *task; - struct files_struct *files; u32 tid; u32 fd; }; static struct file * task_file_seq_get_next(struct bpf_iter_seq_task_file_info *info, - struct task_struct **task, struct files_struct **fstruct) + struct task_struct **task) { struct pid_namespace *ns = info->common.ns; - u32 curr_tid = info->tid, max_fds; - struct files_struct *curr_files; + u32 curr_tid = info->tid; struct task_struct *curr_task; - int curr_fd = info->fd; + unsigned int curr_fd = info->fd; /* If this function returns a non-NULL file object, - * it held a reference to the task/files_struct/file. + * it held a reference to the task/file. * Otherwise, it does not hold any reference. */ again: if (*task) { curr_task = *task; - curr_files = *fstruct; curr_fd = info->fd; } else { curr_task = task_seq_get_next(ns, &curr_tid, true); if (!curr_task) return NULL; - curr_files = get_files_struct(curr_task); - if (!curr_files) { - put_task_struct(curr_task); - curr_tid = ++(info->tid); - info->fd = 0; - goto again; - } - - /* set *fstruct, *task and info->tid */ - *fstruct = curr_files; + /* set *task and info->tid */ *task = curr_task; if (curr_tid == info->tid) { curr_fd = info->fd; @@ -179,13 +167,11 @@ task_file_seq_get_next(struct bpf_iter_seq_task_file_info *info, } rcu_read_lock(); - max_fds = files_fdtable(curr_files)->max_fds; - for (; curr_fd < max_fds; curr_fd++) { + for (;; curr_fd++) { struct file *f; - - f = files_lookup_fd_rcu(curr_files, curr_fd); + f = task_lookup_next_fd_rcu(curr_task, &curr_fd); if (!f) - continue; + break; if (!get_file_rcu(f)) continue; @@ -197,10 +183,8 @@ task_file_seq_get_next(struct bpf_iter_seq_task_file_info *info, /* the current task is done, go to the next task */ rcu_read_unlock(); - put_files_struct(curr_files); put_task_struct(curr_task); *task = NULL; - *fstruct = NULL; info->fd = 0; curr_tid = ++(info->tid); goto again; @@ -209,13 +193,11 @@ task_file_seq_get_next(struct bpf_iter_seq_task_file_info *info, static void *task_file_seq_start(struct seq_file *seq, loff_t *pos) { struct bpf_iter_seq_task_file_info *info = seq->private; - struct files_struct *files = NULL; struct task_struct *task = NULL; struct file *file; - file = task_file_seq_get_next(info, &task, &files); + file = task_file_seq_get_next(info, &task); if (!file) { - info->files = NULL; info->task = NULL; return NULL; } @@ -223,7 +205,6 @@ static void *task_file_seq_start(struct seq_file *seq, loff_t *pos) if (*pos == 0) ++*pos; info->task = task; - info->files = files; return file; } @@ -231,22 +212,19 @@ static void *task_file_seq_start(struct seq_file *seq, loff_t *pos) static void *task_file_seq_next(struct seq_file *seq, void *v, loff_t *pos) { struct bpf_iter_seq_task_file_info *info = seq->private; - struct files_struct *files = info->files; struct task_struct *task = info->task; struct file *file; ++*pos; ++info->fd; fput((struct file *)v); - file = task_file_seq_get_next(info, &task, &files); + file = task_file_seq_get_next(info, &task); if (!file) { - info->files = NULL; info->task = NULL; return NULL; } info->task = task; - info->files = files; return file; } @@ -295,9 +273,7 @@ static void task_file_seq_stop(struct seq_file *seq, void *v) (void)__task_file_seq_show(seq, v, true); } else { fput((struct file *)v); - put_files_struct(info->files); put_task_struct(info->task); - info->files = NULL; info->task = NULL; } } From patchwork Fri Nov 20 23:14:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922881 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0F5AC6379F for ; Fri, 20 Nov 2020 23:20:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 99D712242A for ; Fri, 20 Nov 2020 23:20:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729552AbgKTXUa (ORCPT ); Fri, 20 Nov 2020 18:20:30 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:35918 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729153AbgKTXT6 (ORCPT ); Fri, 20 Nov 2020 18:19:58 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh7-006UXu-5g; Fri, 20 Nov 2020 16:19:57 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFeC-00EG00-P7; Fri, 20 Nov 2020 16:16:57 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:34 -0600 Message-Id: <20201120231441.29911-17-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFeC-00EG00-P7;;;mid=<20201120231441.29911-17-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+YX8Zsom0xDVILDApQN2Ck0BtX/4RimOo= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 17/24] proc/fd: In fdinfo seq_show don't use get_files_struct X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Instead hold task_lock for the duration that task->files needs to be stable in seq_show. The task_lock was already taken in get_files_struct, and so skipping get_files_struct performs less work overall, and avoids the problems with the files_struct reference count. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-12-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/proc/fd.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/proc/fd.c b/fs/proc/fd.c index 72c1525b4b3e..cb51763ed554 100644 --- a/fs/proc/fd.c +++ b/fs/proc/fd.c @@ -28,9 +28,8 @@ static int seq_show(struct seq_file *m, void *v) if (!task) return -ENOENT; - files = get_files_struct(task); - put_task_struct(task); - + task_lock(task); + files = task->files; if (files) { unsigned int fd = proc_fd(m->private); @@ -47,8 +46,9 @@ static int seq_show(struct seq_file *m, void *v) ret = 0; } spin_unlock(&files->file_lock); - put_files_struct(files); } + task_unlock(task); + put_task_struct(task); if (ret) return ret; @@ -57,6 +57,7 @@ static int seq_show(struct seq_file *m, void *v) (long long)file->f_pos, f_flags, real_mount(file->f_path.mnt)->mnt_id); + /* show_fd_locks() never deferences files so a stale value is safe */ show_fd_locks(m, file, files); if (seq_has_overflowed(m)) goto out; From patchwork Fri Nov 20 23:14:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922867 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68191C64EBC for ; Fri, 20 Nov 2020 23:20:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 303532240C for ; Fri, 20 Nov 2020 23:20:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729474AbgKTXUE (ORCPT ); Fri, 20 Nov 2020 18:20:04 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:36090 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729453AbgKTXUE (ORCPT ); Fri, 20 Nov 2020 18:20:04 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFhC-006UZk-Ow; Fri, 20 Nov 2020 16:20:02 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFeG-00EG00-6u; Fri, 20 Nov 2020 16:17:00 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:35 -0600 Message-Id: <20201120231441.29911-18-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFeG-00EG00-6u;;;mid=<20201120231441.29911-18-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/WN8YoHfI/03jv7EktJCZELxhOW1eGhBc= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 18/24] file: Merge __fd_install into fd_install X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The function __fd_install was added to support binder[1]. With binder fixed[2] there are no more users. As fd_install just calls __fd_install with "files=current->files", merge them together by transforming the files parameter into a local variable initialized to current->files. [1] f869e8a7f753 ("expose a low-level variant of fd_install() for binder") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Acked-by: Christian Brauner v1:https://lkml.kernel.org/r/20200817220425.9389-14-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 25 ++++++------------------- include/linux/fdtable.h | 2 -- 2 files changed, 6 insertions(+), 21 deletions(-) diff --git a/fs/file.c b/fs/file.c index 23b888a4acbe..0d4ec0fa23b3 100644 --- a/fs/file.c +++ b/fs/file.c @@ -158,7 +158,7 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr) spin_unlock(&files->file_lock); new_fdt = alloc_fdtable(nr); - /* make sure all __fd_install() have seen resize_in_progress + /* make sure all fd_install() have seen resize_in_progress * or have finished their rcu_read_lock_sched() section. */ if (atomic_read(&files->count) > 1) @@ -181,7 +181,7 @@ static int expand_fdtable(struct files_struct *files, unsigned int nr) rcu_assign_pointer(files->fdt, new_fdt); if (cur_fdt != &files->fdtab) call_rcu(&cur_fdt->rcu, free_fdtable_rcu); - /* coupled with smp_rmb() in __fd_install() */ + /* coupled with smp_rmb() in fd_install() */ smp_wmb(); return 1; } @@ -584,17 +584,13 @@ EXPORT_SYMBOL(put_unused_fd); * It should never happen - if we allow dup2() do it, _really_ bad things * will follow. * - * NOTE: __fd_install() variant is really, really low-level; don't - * use it unless you are forced to by truly lousy API shoved down - * your throat. 'files' *MUST* be either current->files or obtained - * by get_files_struct(current) done by whoever had given it to you, - * or really bad things will happen. Normally you want to use - * fd_install() instead. + * This consumes the "file" refcount, so callers should treat it + * as if they had called fput(file). */ -void __fd_install(struct files_struct *files, unsigned int fd, - struct file *file) +void fd_install(unsigned int fd, struct file *file) { + struct files_struct *files = current->files; struct fdtable *fdt; rcu_read_lock_sched(); @@ -616,15 +612,6 @@ void __fd_install(struct files_struct *files, unsigned int fd, rcu_read_unlock_sched(); } -/* - * This consumes the "file" refcount, so callers should treat it - * as if they had called fput(file). - */ -void fd_install(unsigned int fd, struct file *file) -{ - __fd_install(current->files, fd, file); -} - EXPORT_SYMBOL(fd_install); static struct file *pick_file(struct files_struct *files, unsigned fd) diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index cf6c52dae3a1..a5ec736d74a5 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -126,8 +126,6 @@ int iterate_fd(struct files_struct *, unsigned, extern int __alloc_fd(struct files_struct *files, unsigned start, unsigned end, unsigned flags); -extern void __fd_install(struct files_struct *files, - unsigned int fd, struct file *file); extern int __close_fd(struct files_struct *files, unsigned int fd); extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags); From patchwork Fri Nov 20 23:14:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922863 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C10D3C5519F for ; Fri, 20 Nov 2020 23:20:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 890AB2242A for ; Fri, 20 Nov 2020 23:20:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728807AbgKTXT5 (ORCPT ); Fri, 20 Nov 2020 18:19:57 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:46934 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728864AbgKTXT4 (ORCPT ); Fri, 20 Nov 2020 18:19:56 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh5-002Ral-ID; Fri, 20 Nov 2020 16:19:55 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFeJ-00EG00-Oz; Fri, 20 Nov 2020 16:17:04 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:36 -0600 Message-Id: <20201120231441.29911-19-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFeJ-00EG00-Oz;;;mid=<20201120231441.29911-19-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/07Js/AbSwO/7LHKe+DUR0IrwFWqtok48= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 19/24] file: In f_dupfd read RLIMIT_NOFILE once. X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Simplify the code, and remove the chance of races by reading RLIMIT_NOFILE only once in f_dupfd. Pass the read value of RLIMIT_NOFILE into alloc_fd which is the other location the rlimit was read in f_dupfd. As f_dupfd is the only caller of alloc_fd this changing alloc_fd is trivially safe. Further this causes alloc_fd to take all of the same arguments as __alloc_fd except for the files_struct argument. Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-15-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/file.c b/fs/file.c index 0d4ec0fa23b3..07e25f1b9dfd 100644 --- a/fs/file.c +++ b/fs/file.c @@ -538,9 +538,9 @@ int __alloc_fd(struct files_struct *files, return error; } -static int alloc_fd(unsigned start, unsigned flags) +static int alloc_fd(unsigned start, unsigned end, unsigned flags) { - return __alloc_fd(current->files, start, rlimit(RLIMIT_NOFILE), flags); + return __alloc_fd(current->files, start, end, flags); } int __get_unused_fd_flags(unsigned flags, unsigned long nofile) @@ -1175,10 +1175,11 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes) int f_dupfd(unsigned int from, struct file *file, unsigned flags) { + unsigned long nofile = rlimit(RLIMIT_NOFILE); int err; - if (from >= rlimit(RLIMIT_NOFILE)) + if (from >= nofile) return -EINVAL; - err = alloc_fd(from, flags); + err = alloc_fd(from, nofile, flags); if (err >= 0) { get_file(file); fd_install(err, file); From patchwork Fri Nov 20 23:14:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922875 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 142E5C83011 for ; Fri, 20 Nov 2020 23:20:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C96AE2240C for ; Fri, 20 Nov 2020 23:20:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729202AbgKTXUU (ORCPT ); Fri, 20 Nov 2020 18:20:20 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:47022 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729249AbgKTXUA (ORCPT ); Fri, 20 Nov 2020 18:20:00 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out02.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh9-002RbQ-B6; Fri, 20 Nov 2020 16:19:59 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFeN-00EG00-Df; Fri, 20 Nov 2020 16:17:07 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:37 -0600 Message-Id: <20201120231441.29911-20-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFeN-00EG00-Df;;;mid=<20201120231441.29911-20-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/hX9lwaOnMXrM0YEb54E/d3ReYkLxNKoQ= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 20/24] file: Merge __alloc_fd into alloc_fd X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The function __alloc_fd was added to support binder[1]. With binder fixed[2] there are no more users. As alloc_fd just calls __alloc_fd with "files=current->files", merge them together by transforming the files parameter into a local variable initialized to current->files. [1] dcfadfa4ec5a ("new helper: __alloc_fd()") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-16-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 11 +++-------- include/linux/fdtable.h | 2 -- 2 files changed, 3 insertions(+), 10 deletions(-) diff --git a/fs/file.c b/fs/file.c index 07e25f1b9dfd..621563701bd9 100644 --- a/fs/file.c +++ b/fs/file.c @@ -480,9 +480,9 @@ static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start) /* * allocate a file descriptor, mark it busy. */ -int __alloc_fd(struct files_struct *files, - unsigned start, unsigned end, unsigned flags) +static int alloc_fd(unsigned start, unsigned end, unsigned flags) { + struct files_struct *files = current->files; unsigned int fd; int error; struct fdtable *fdt; @@ -538,14 +538,9 @@ int __alloc_fd(struct files_struct *files, return error; } -static int alloc_fd(unsigned start, unsigned end, unsigned flags) -{ - return __alloc_fd(current->files, start, end, flags); -} - int __get_unused_fd_flags(unsigned flags, unsigned long nofile) { - return __alloc_fd(current->files, 0, nofile, flags); + return alloc_fd(0, nofile, flags); } int get_unused_fd_flags(unsigned flags) diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index a5ec736d74a5..dc476ae92f56 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -124,8 +124,6 @@ int iterate_fd(struct files_struct *, unsigned, int (*)(const void *, struct file *, unsigned), const void *); -extern int __alloc_fd(struct files_struct *files, - unsigned start, unsigned end, unsigned flags); extern int __close_fd(struct files_struct *files, unsigned int fd); extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags); From patchwork Fri Nov 20 23:14:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922889 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A39ABC71155 for ; Fri, 20 Nov 2020 23:21:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 64DF62240C for ; Fri, 20 Nov 2020 23:21:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728932AbgKTXT5 (ORCPT ); Fri, 20 Nov 2020 18:19:57 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:49062 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728807AbgKTXT4 (ORCPT ); Fri, 20 Nov 2020 18:19:56 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh4-006Xz9-SC; Fri, 20 Nov 2020 16:19:54 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFeQ-00EG00-MH; Fri, 20 Nov 2020 16:17:11 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:38 -0600 Message-Id: <20201120231441.29911-21-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFeQ-00EG00-MH;;;mid=<20201120231441.29911-21-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX18yZsJQg7Sob1TmifEv7F2Q+rtBQX59b5c= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 21/24] file: Rename __close_fd to close_fd and remove the files parameter X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The function __close_fd was added to support binder[1]. Now that binder has been fixed to no longer need __close_fd[2] all calls to __close_fd pass current->files. Therefore transform the files parameter into a local variable initialized to current->files, and rename __close_fd to close_fd to reflect this change, and keep it in sync with the similar changes to __alloc_fd, and __fd_install. This removes the need for callers to care about the extra care that needs to be take if anything except current->files is passed, by limiting the callers to only operation on current->files. [1] 483ce1d4b8c3 ("take descriptor-related part of close() to file.c") [2] 44d8047f1d87 ("binder: use standard functions to allocate fds") Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-17-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 10 ++++------ fs/open.c | 2 +- include/linux/fdtable.h | 3 +-- include/linux/syscalls.h | 6 +++--- 4 files changed, 9 insertions(+), 12 deletions(-) diff --git a/fs/file.c b/fs/file.c index 621563701bd9..987ea51630b4 100644 --- a/fs/file.c +++ b/fs/file.c @@ -629,11 +629,9 @@ static struct file *pick_file(struct files_struct *files, unsigned fd) return file; } -/* - * The same warnings as for __alloc_fd()/__fd_install() apply here... - */ -int __close_fd(struct files_struct *files, unsigned fd) +int close_fd(unsigned fd) { + struct files_struct *files = current->files; struct file *file; file = pick_file(files, fd); @@ -642,7 +640,7 @@ int __close_fd(struct files_struct *files, unsigned fd) return filp_close(file, files); } -EXPORT_SYMBOL(__close_fd); /* for ksys_close() */ +EXPORT_SYMBOL(close_fd); /* for ksys_close() */ /** * __close_range() - Close all file descriptors in a given range. @@ -1027,7 +1025,7 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags) struct files_struct *files = current->files; if (!file) - return __close_fd(files, fd); + return close_fd(fd); if (fd >= rlimit(RLIMIT_NOFILE)) return -EBADF; diff --git a/fs/open.c b/fs/open.c index 9af548fb841b..581a674d7eee 100644 --- a/fs/open.c +++ b/fs/open.c @@ -1292,7 +1292,7 @@ EXPORT_SYMBOL(filp_close); */ SYSCALL_DEFINE1(close, unsigned int, fd) { - int retval = __close_fd(current->files, fd); + int retval = close_fd(fd); /* can't restart close syscall because file table entry was cleared */ if (unlikely(retval == -ERESTARTSYS || diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index dc476ae92f56..dad4a488ce60 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -124,8 +124,7 @@ int iterate_fd(struct files_struct *, unsigned, int (*)(const void *, struct file *, unsigned), const void *); -extern int __close_fd(struct files_struct *files, - unsigned int fd); +extern int close_fd(unsigned int fd); extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags); extern int __close_fd_get_file(unsigned int fd, struct file **res); extern int unshare_fd(unsigned long unshare_flags, unsigned int max_fds, diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 37bea07c12f2..9e055a0579f8 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1295,16 +1295,16 @@ static inline long ksys_ftruncate(unsigned int fd, loff_t length) return do_sys_ftruncate(fd, length, 1); } -extern int __close_fd(struct files_struct *files, unsigned int fd); +extern int close_fd(unsigned int fd); /* * In contrast to sys_close(), this stub does not check whether the syscall * should or should not be restarted, but returns the raw error codes from - * __close_fd(). + * close_fd(). */ static inline int ksys_close(unsigned int fd) { - return __close_fd(current->files, fd); + return close_fd(fd); } extern long do_sys_truncate(const char __user *pathname, loff_t length); From patchwork Fri Nov 20 23:14:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922883 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C7B0C64E75 for ; Fri, 20 Nov 2020 23:21:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BDEE82240C for ; Fri, 20 Nov 2020 23:20:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729556AbgKTXUb (ORCPT ); Fri, 20 Nov 2020 18:20:31 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:35878 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728975AbgKTXT6 (ORCPT ); Fri, 20 Nov 2020 18:19:58 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh6-006UXX-JE; Fri, 20 Nov 2020 16:19:56 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFeU-00EG00-K5; Fri, 20 Nov 2020 16:17:18 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" , Christoph Hellwig Date: Fri, 20 Nov 2020 17:14:39 -0600 Message-Id: <20201120231441.29911-22-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFeU-00EG00-K5;;;mid=<20201120231441.29911-22-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19zSWrLrPbiAZCwM7yR70rMJTXOnXxmZFY= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 22/24] file: Replace ksys_close with close_fd X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Now that ksys_close is exactly identical to close_fd replace the one caller of ksys_close with close_fd. [1] https://lkml.kernel.org/r/20200818112020.GA17080@infradead.org Suggested-by: Christoph Hellwig Signed-off-by: "Eric W. Biederman" --- fs/autofs/dev-ioctl.c | 5 +++-- include/linux/syscalls.h | 12 ------------ 2 files changed, 3 insertions(+), 14 deletions(-) diff --git a/fs/autofs/dev-ioctl.c b/fs/autofs/dev-ioctl.c index 322b7dfb4ea0..5bf781ea6d67 100644 --- a/fs/autofs/dev-ioctl.c +++ b/fs/autofs/dev-ioctl.c @@ -4,9 +4,10 @@ * Copyright 2008 Ian Kent */ +#include #include #include -#include +#include #include #include @@ -289,7 +290,7 @@ static int autofs_dev_ioctl_closemount(struct file *fp, struct autofs_sb_info *sbi, struct autofs_dev_ioctl *param) { - return ksys_close(param->ioctlfd); + return close_fd(param->ioctlfd); } /* diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 9e055a0579f8..0f72f380db72 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -1295,18 +1295,6 @@ static inline long ksys_ftruncate(unsigned int fd, loff_t length) return do_sys_ftruncate(fd, length, 1); } -extern int close_fd(unsigned int fd); - -/* - * In contrast to sys_close(), this stub does not check whether the syscall - * should or should not be restarted, but returns the raw error codes from - * close_fd(). - */ -static inline int ksys_close(unsigned int fd) -{ - return close_fd(fd); -} - extern long do_sys_truncate(const char __user *pathname, loff_t length); static inline long ksys_truncate(const char __user *pathname, loff_t length) From patchwork Fri Nov 20 23:14:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922885 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DBDEBC64E90 for ; Fri, 20 Nov 2020 23:21:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B307C2240C for ; Fri, 20 Nov 2020 23:21:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729534AbgKTXUa (ORCPT ); Fri, 20 Nov 2020 18:20:30 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:35962 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729228AbgKTXT7 (ORCPT ); Fri, 20 Nov 2020 18:19:59 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out01.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh8-006UYI-74; Fri, 20 Nov 2020 16:19:58 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFea-00EG00-TR; Fri, 20 Nov 2020 16:17:21 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:40 -0600 Message-Id: <20201120231441.29911-23-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFea-00EG00-TR;;;mid=<20201120231441.29911-23-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+WPgiefUzQYgmtCM2bY9TtjITWdux5+J8= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 23/24] file: Rename __close_fd_get_file close_fd_get_file X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The function close_fd_get_file is explicitly a variant of __close_fd[1]. Now that __close_fd has been renamed close_fd, rename close_fd_get_file to be consistent with close_fd. When __alloc_fd, __close_fd and __fd_install were introduced the double underscore indicated that the function took a struct files_struct parameter. The function __close_fd_get_file never has so the naming has always been inconsistent. This just cleans things up so there are not any lingering mentions or references __close_fd left in the code. [1] 80cd795630d6 ("binder: fix use-after-free due to ksys_close() during fdget()") Signed-off-by: Eric W. Biederman --- drivers/android/binder.c | 2 +- fs/file.c | 4 ++-- fs/io_uring.c | 2 +- include/linux/fdtable.h | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index b5117576792b..5b2e3721ac27 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -2226,7 +2226,7 @@ static void binder_deferred_fd_close(int fd) if (!twcb) return; init_task_work(&twcb->twork, binder_do_fd_close); - __close_fd_get_file(fd, &twcb->file); + close_fd_get_file(fd, &twcb->file); if (twcb->file) { filp_close(twcb->file, current->files); task_work_add(current, &twcb->twork, TWA_RESUME); diff --git a/fs/file.c b/fs/file.c index 987ea51630b4..947ac6d5602f 100644 --- a/fs/file.c +++ b/fs/file.c @@ -721,11 +721,11 @@ int __close_range(unsigned fd, unsigned max_fd, unsigned int flags) } /* - * variant of __close_fd that gets a ref on the file for later fput. + * variant of close_fd that gets a ref on the file for later fput. * The caller must ensure that filp_close() called on the file, and then * an fput(). */ -int __close_fd_get_file(unsigned int fd, struct file **res) +int close_fd_get_file(unsigned int fd, struct file **res) { struct files_struct *files = current->files; struct file *file; diff --git a/fs/io_uring.c b/fs/io_uring.c index b42dfa0243bf..2e6e51292eff 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4206,7 +4206,7 @@ static int io_close(struct io_kiocb *req, bool force_nonblock, /* might be already done during nonblock submission */ if (!close->put_file) { - ret = __close_fd_get_file(close->fd, &close->put_file); + ret = close_fd_get_file(close->fd, &close->put_file); if (ret < 0) return (ret == -ENOENT) ? -EBADF : ret; } diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index dad4a488ce60..4ed3589f9294 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -126,7 +126,7 @@ int iterate_fd(struct files_struct *, unsigned, extern int close_fd(unsigned int fd); extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags); -extern int __close_fd_get_file(unsigned int fd, struct file **res); +extern int close_fd_get_file(unsigned int fd, struct file **res); extern int unshare_fd(unsigned long unshare_flags, unsigned int max_fds, struct files_struct **new_fdp); From patchwork Fri Nov 20 23:14:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11922865 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 736F2C64E69 for ; Fri, 20 Nov 2020 23:20:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2531C2240C for ; Fri, 20 Nov 2020 23:20:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729455AbgKTXUD (ORCPT ); Fri, 20 Nov 2020 18:20:03 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:49194 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729345AbgKTXUC (ORCPT ); Fri, 20 Nov 2020 18:20:02 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFh9-006Y0K-Rr; Fri, 20 Nov 2020 16:19:59 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.int.ebiederm.org) by in02.mta.xmission.com with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93) (envelope-from ) id 1kgFee-00EG00-LB; Fri, 20 Nov 2020 16:17:25 -0700 From: "Eric W. Biederman" To: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, criu@openvz.org, bpf@vger.kernel.org, Linus Torvalds , Alexander Viro , Christian Brauner , Oleg Nesterov , Cyrill Gorcunov , Jann Horn , Kees Cook , =?utf-8?q?Da?= =?utf-8?q?niel_P_=2E_Berrang=C3=A9?= , Jeff Layton , Miklos Szeredi , Matthew Wilcox , "J. Bruce Fields" , Trond Myklebust , Chris Wright , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "Eric W. Biederman" Date: Fri, 20 Nov 2020 17:14:41 -0600 Message-Id: <20201120231441.29911-24-ebiederm@xmission.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <87r1on1v62.fsf@x220.int.ebiederm.org> References: <87r1on1v62.fsf@x220.int.ebiederm.org> MIME-Version: 1.0 X-XM-SPF: eid=1kgFee-00EG00-LB;;;mid=<20201120231441.29911-24-ebiederm@xmission.com>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/pcMQqBBQUv0JpdvGdw8YX55VohZwtBzg= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: [PATCH v2 24/24] file: Remove get_files_struct X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When discussing[1] exec and posix file locks it was realized that none of the callers of get_files_struct fundamentally needed to call get_files_struct, and that by switching them to helper functions instead it will both simplify their code and remove unnecessary increments of files_struct.count. Those unnecessary increments can result in exec unnecessarily unsharing files_struct which breaking posix locks, and it can result in fget_light having to fallback to fget reducing system performance. Now that get_files_struct has no more users and can not cause the problems for posix file locking and fget_light remove get_files_struct so that it does not gain any new users. [1] https://lkml.kernel.org/r/20180915160423.GA31461@redhat.com Suggested-by: Oleg Nesterov Acked-by: Christian Brauner v1: https://lkml.kernel.org/r/20200817220425.9389-13-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" --- fs/file.c | 13 ------------- include/linux/fdtable.h | 1 - 2 files changed, 14 deletions(-) diff --git a/fs/file.c b/fs/file.c index 947ac6d5602f..412033d8cfdf 100644 --- a/fs/file.c +++ b/fs/file.c @@ -411,19 +411,6 @@ static struct fdtable *close_files(struct files_struct * files) return fdt; } -struct files_struct *get_files_struct(struct task_struct *task) -{ - struct files_struct *files; - - task_lock(task); - files = task->files; - if (files) - atomic_inc(&files->count); - task_unlock(task); - - return files; -} - void put_files_struct(struct files_struct *files) { if (atomic_dec_and_test(&files->count)) { diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index 4ed3589f9294..d0e78174874a 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -115,7 +115,6 @@ struct file *task_lookup_next_fd_rcu(struct task_struct *task, unsigned int *fd) struct task_struct; -struct files_struct *get_files_struct(struct task_struct *); void put_files_struct(struct files_struct *fs); int unshare_files(void); struct files_struct *dup_fd(struct files_struct *, unsigned, int *) __latent_entropy;