[v2,01/24] exec: Move unshare_files to fix posix file locking during exec

Many moons ago the binfmts were doing some very questionable things
with file descriptors and an unsharing of the file descriptor table
was added to make things better[1][2].  The helper steal_locks was
added to avoid breaking the userspace programs[3][4][6].

Unfortunately it turned out that steal_locks did not work for network
file systems[5], so it was removed to see if anyone would
complain[7][8].  It was thought at the time that NPTL would not be
affected as the unshare_files happened after the other threads were
killed[8].  Unfortunately because there was an unshare_files in
binfmt_elf.c before the threads were killed this analysis was
incorrect.

This unshare_files in binfmt_elf.c resulted in the unshare_files
happening whenever threads were present, which led to unshare_files
being moved to the start of do_execve[9].

Later the problems were rediscovered and the suggested approach was to
re-add steal_locks under a different name[10].  I happened to be
reviewing patches and I noticed that this approach was a step
backwards[11].

I proposed simply moving unshare_files[12] and it was pointed
out that moving unshare_files without auditing the code was
also unsafe[13].

There were then several attempts to solve this[14][15][16] and I even
posted this set of changes[17].  Unfortunately because auditing all of
execve is time consuming this change did not make it in at the time.

Well now that I am cleaning up exec I have made the time to read
through all of the binfmts, and the only code that manipulates file
descriptors is either the security modules closing them in
security_bprm_committing_creds or the generic code in fs/exec.c.
None of it happens before begin_new_exec is called.

So move unshare_files into begin_new_exec, after the point of no
return.  If memory is very very very low and the application calling
exec is sharing file descriptor tables between processes we might fail
past the point of no return, which is unfortunate but no different
than any of the other places where we allocate memory after the point
of no return.

This movement allows another process that shares the file table, or
another thread of the same process, that closes files or changes
their close-on-exec behavior while racing with execve to cause some
unexpected things to happen.  There is only one time-of-check to
time-of-use race, and it exists only so that execve fails instead of
an interpreter failing when it tries to open the file it is supposed
to be interpreting.  Failing later if userspace is being silly is
not a problem.

With this change the following description from the removal
of steal_locks[8] finally becomes true.

    Apps using NPTL are not affected, since all other threads are killed before
    execve.

    Apps using LinuxThreads are only affected if they

      - have multiple threads during exec (LinuxThreads doesn't kill other
        threads, the app may do it with pthread_kill_other_threads_np())
      - rely on POSIX locks being inherited across exec

    Both conditions are documented, but not their interaction.

    Apps using clone() natively are affected if they

      - use clone(CLONE_FILES)
      - rely on POSIX locks being inherited across exec
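
The conditions quoted above all come down to whether a POSIX lock is
still owned by the process once execve has finished.  The following
is a minimal userspace sketch, not part of this patch, of how a test
program might probe that.  The helper names set_posix_lock and
lock_held_by_other are hypothetical.  The probe relies on the fact
that a forked child does not inherit its parent's POSIX locks, so
F_GETLK issued from the child reports the parent's lock as a conflict.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: take (F_WRLCK) or drop (F_UNLCK) a whole-file
 * POSIX lock on fd.  Returns 0 on success, -1 on error.
 */
int set_posix_lock(int fd, short type)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,	/* 0 means "through end of file" */
	};

	return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: fork a child and ask, from the child, whether
 * a write lock on fd would conflict with a lock held by some other
 * process (for example the caller).  Returns 1 if a conflicting lock
 * exists, 0 if the file is unlocked, -1 on error.
 */
int lock_held_by_other(int fd)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		struct flock fl = {
			.l_type = F_WRLCK,
			.l_whence = SEEK_SET,
			.l_start = 0,
			.l_len = 0,
		};

		/* F_GETLK only tests for a conflict, it takes no lock. */
		if (fcntl(fd, F_GETLK, &fl) == -1)
			_exit(2);
		_exit(fl.l_type == F_UNLCK ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

A test program could call set_posix_lock(fd, F_WRLCK), start a second
thread, exec itself with fd left open, and then call
lock_held_by_other(fd) in the new image.  In the NPTL case a result
of 0 there would be expected to mean the lock did not survive the exec.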

I have investigated some paths to make it possible to solve this
without moving unshare_files but they all look more complicated[18].

Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Reported-by: Jeff Layton <jlayton@redhat.com>
History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
[1] 02cda956de0b ("[PATCH] unshare_files")
[2] 04e9bcb4d106 ("[PATCH] use new unshare_files helper")
[3] 088f5d7244de ("[PATCH] add steal_locks helper")
[4] 02c541ec8ffa ("[PATCH] use new steal_locks helper")
[5] https://lkml.kernel.org/r/E1FLIlF-0007zR-00@dorka.pomaz.szeredi.hu
[6] https://lkml.kernel.org/r/0060321191605.GB15997@sorel.sous-sol.org
[7] https://lkml.kernel.org/r/E1FLwjC-0000kJ-00@dorka.pomaz.szeredi.hu
[8] c89681ed7d0e ("[PATCH] remove steal_locks()")
[9] fd8328be874f ("[PATCH] sanitize handling of shared descriptor tables in failing execve()")
[10] https://lkml.kernel.org/r/20180317142520.30520-1-jlayton@kernel.org
[11] https://lkml.kernel.org/r/87r2nwqk73.fsf@xmission.com
[12] https://lkml.kernel.org/r/87bmfgvg8w.fsf@xmission.com
[13] https://lkml.kernel.org/r/20180322111424.GE30522@ZenIV.linux.org.uk
[14] https://lkml.kernel.org/r/20180827174722.3723-1-jlayton@kernel.org
[15] https://lkml.kernel.org/r/20180830172423.21964-1-jlayton@kernel.org
[16] https://lkml.kernel.org/r/20180914105310.6454-1-jlayton@kernel.org
[17] https://lkml.kernel.org/r/87a7ohs5ow.fsf@xmission.com
[18] https://lkml.kernel.org/r/87pn8c1uj6.fsf_-_@x220.int.ebiederm.org
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-1-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/exec.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

Message ID	20201120231441.29911-1-ebiederm@xmission.com (mailing list archive)
Date	Fri, 20 Nov 2020 17:14:18 -0600
