From patchwork Sun Oct 30 21:46:31 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jann Horn
X-Patchwork-Id: 9404729
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 84E626022E
	for ; Sun, 30 Oct 2016 21:46:53 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 822D828E58
	for ; Sun, 30 Oct 2016 21:46:53 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 7702228E73; Sun, 30 Oct 2016 21:46:53 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ED96E28E58
	for ; Sun, 30 Oct 2016 21:46:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753223AbcJ3Vqv (ORCPT ); Sun, 30 Oct 2016 17:46:51 -0400
Received: from thejh.net ([37.221.195.125]:54316 "EHLO thejh.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752136AbcJ3Vqs (ORCPT ); Sun, 30 Oct 2016 17:46:48 -0400
Received: from pc.thejh.net (pc.vpn [192.168.44.2])
	by thejh.net (Postfix) with ESMTPSA id C4BCB18129B;
	Sun, 30 Oct 2016 22:46:45 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=thejh.net; s=s2016;
	t=1477864006; bh=qJMu0/2H+6trSfpTykEPU4Tpw6RDDLv/QAj0LP+565E=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=Qm95QEjItIu4vohqOwg0NacArKtxPBvO/nrGt5p4e6x8N82f2+q/3+iBu2UAw2y7p
	 d1HhuN0/Eo90u1hCdpkiL8UIfGixrBal3DFoGzcRbhRBLmbpMQcOmWRtb3njPfTnPE
	 3HwSor2gGPnQhh7prb/TNwXn8JPFD61Hj2mhaLvXd6EsmUGp6XbnZA6olE/J0uwQt8
	 P2GNZBBRPyporZ5t++nwPSxDzmkdhm1c0+zeIiiRqjXPq6ndSy06xbrJL5fTIgLJJD
	 V8GRJ0Xc2E57Hy0EfzlNzciWDdfM9K58TF1KMDEpnIjjqiCTiK+aML9/21C2bgKIwa
	 a373KFp0WdRwg==
From: Jann Horn
To: Alexander Viro, Roland McGrath, Oleg Nesterov, John Johansen,
	James Morris, "Serge E. Hallyn", Paul Moore, Stephen Smalley,
	Eric Paris, Casey Schaufler, Kees Cook, Andrew Morton,
	Janis Danisevskis, Seth Forshee, "Eric W. Biederman",
	Thomas Gleixner, Benjamin LaHaise, Ben Hutchings, Andy Lutomirski,
	Linus Torvalds, Krister Johansen
Cc: linux-fsdevel@vger.kernel.org, linux-security-module@vger.kernel.org,
	security@kernel.org
Subject: [PATCH v3 1/8] exec: introduce cred_guard_light
Date: Sun, 30 Oct 2016 22:46:31 +0100
Message-Id: <1477863998-3298-2-git-send-email-jann@thejh.net>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1477863998-3298-1-git-send-email-jann@thejh.net>
References: <1477863998-3298-1-git-send-email-jann@thejh.net>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

This is a new per-threadgroup lock that can often be taken instead of
cred_guard_mutex and has less deadlock potential. I'm doing this because
Oleg Nesterov mentioned the potential for deadlocks, in particular if a
debugged task is stuck in execve, trying to get rid of a ptrace-stopped
thread, and the debugger attempts to inspect procfs files of the debugged
task.

The binfmt handlers (in particular for elf_fdpic and flat) might still
call VFS read and mmap operations on the binary with the lock held, but not
open operations (as is the case with cred_guard_mutex).

An rwlock would be more appropriate here, but apparently those don't
have _killable variants of the locking functions?

This is a preparation patch for using proper locking in more places.

Reported-by: Oleg Nesterov
Signed-off-by: Jann Horn
---
 fs/exec.c                 | 15 ++++++++++++++-
 include/linux/init_task.h |  1 +
 include/linux/sched.h     | 10 ++++++++++
 kernel/fork.c             |  1 +
 kernel/ptrace.c           | 10 ++++++++++
 5 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/fs/exec.c b/fs/exec.c
index 4e497b9ee71e..67b76cb319d8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1243,6 +1243,10 @@ int flush_old_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out;
 
+	retval = mutex_lock_killable(&current->signal->cred_guard_light);
+	if (retval)
+		goto out;
+
 	/*
 	 * Must be called _before_ exec_mmap() as bprm->mm is
 	 * not visibile until then. This also enables the update
@@ -1256,7 +1260,7 @@ int flush_old_exec(struct linux_binprm * bprm)
 	acct_arg_size(bprm, 0);
 	retval = exec_mmap(bprm->mm);
 	if (retval)
-		goto out;
+		goto out_unlock;
 
 	bprm->mm = NULL;		/* We're using it now */
 
@@ -1268,6 +1272,8 @@ int flush_old_exec(struct linux_binprm * bprm)
 
 	return 0;
 
+out_unlock:
+	mutex_unlock(&current->signal->cred_guard_light);
 out:
 	return retval;
 }
@@ -1391,6 +1397,7 @@ void install_exec_creds(struct linux_binprm *bprm)
 	 * credentials; any time after this it may be unlocked.
 	 */
 	security_bprm_committed_creds(bprm);
+	mutex_unlock(&current->signal->cred_guard_light);
 	mutex_unlock(&current->signal->cred_guard_mutex);
 }
 EXPORT_SYMBOL(install_exec_creds);
@@ -1758,6 +1765,12 @@ static int do_execveat_common(int fd, struct filename *filename,
 	return retval;
 
 out:
+	if (!bprm->mm && bprm->cred) {
+		/* failure after flush_old_exec(), but before
+		 * install_exec_creds()
+		 */
+		mutex_unlock(&current->signal->cred_guard_light);
+	}
 	if (bprm->mm) {
 		acct_arg_size(bprm, 0);
 		mmput(bprm->mm);
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 325f649d77ff..c6819468e79a 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -60,6 +60,7 @@ extern struct fs_struct init_fs;
 	INIT_PREV_CPUTIME(sig)						\
 	.cred_guard_mutex =						\
 		 __MUTEX_INITIALIZER(sig.cred_guard_mutex),		\
+	.cred_guard_light = __MUTEX_INITIALIZER(sig.cred_guard_light)	\
 }
 
 extern struct nsproxy init_nsproxy;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b0ec92..0ccb379895b3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -812,6 +812,16 @@ struct signal_struct {
 	struct mutex cred_guard_mutex;	/* guard against foreign influences on
 					 * credential calculations
 					 * (notably. ptrace) */
+	/*
+	 * Lightweight version of cred_guard_mutex; used to prevent race
+	 * conditions where a user can gain information about the post-execve
+	 * state of a task to which access should only be granted pre-execve.
+	 * Hold this mutex while performing remote task inspection associated
+	 * with a security check.
+	 * This mutex MUST NOT be used in cases where anything changes about
+	 * the security properties of a running execve().
+	 */
+	struct mutex cred_guard_light;
 };
 
 /*
diff --git a/kernel/fork.c b/kernel/fork.c
index 623259fc794d..d0e1d6fa4d00 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1361,6 +1361,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 				   current->signal->is_child_subreaper;
 
 	mutex_init(&sig->cred_guard_mutex);
+	mutex_init(&sig->cred_guard_light);
 
 	return 0;
 }
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index e6474f7272ec..c3312e9e0078 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -285,6 +285,16 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	return security_ptrace_access_check(task, mode);
 }
 
+/*
+ * NOTE: When you call this function, you need to ensure that the target task
+ * can't acquire (via setuid execve) credentials between the ptrace access
+ * check and the privileged access. The recommended way to do this is to hold
+ * one of task->signal->{cred_guard_mutex,cred_guard_light} while calling this
+ * function and performing the requested access.
+ *
+ * This function may only be used if access is requested in the name of
+ * current_cred().
+ */
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
 	int err;
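
The locking rule in the ptrace_may_access() comment (hold the lock across
both the access check and the privileged access) can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
kernel code: struct model_task, model_may_access(), model_setuid_exec() and
model_inspect() are hypothetical names invented for the illustration, a
pthread mutex stands in for signal->cred_guard_light, and a plain uid
comparison stands in for the real ptrace access check.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the relevant parts of a task. */
struct model_task {
	pthread_mutex_t cred_guard_light; /* models signal->cred_guard_light */
	unsigned int uid;                 /* models the task's credentials */
	int secret;                       /* models state that is sensitive post-execve */
};

/* Models the access check: here simply "same uid" (an assumption). */
static bool model_may_access(const struct model_task *t, unsigned int caller_uid)
{
	return t->uid == caller_uid;
}

/* Models a setuid execve(): credentials and state change under the lock. */
static void model_setuid_exec(struct model_task *t, unsigned int new_uid,
			      int new_secret)
{
	pthread_mutex_lock(&t->cred_guard_light);
	t->uid = new_uid;
	t->secret = new_secret;
	pthread_mutex_unlock(&t->cred_guard_light);
}

/*
 * Models remote inspection: the check and the read happen under one lock
 * hold, so a concurrent model_setuid_exec() cannot slip in between them.
 * Returns 0 and stores the value in *out, or -1 if access is denied.
 */
static int model_inspect(struct model_task *t, unsigned int caller_uid, int *out)
{
	int ret = -1;

	assert(out != NULL);
	pthread_mutex_lock(&t->cred_guard_light);
	if (model_may_access(t, caller_uid)) {
		*out = t->secret;
		ret = 0;
	}
	pthread_mutex_unlock(&t->cred_guard_light);
	return ret;
}
```

A caller would initialise the mutex with pthread_mutex_init() and then call
model_inspect(). If the check and the read were done under separate lock
holds (or with no lock), model_setuid_exec() could run between them and the
caller would read post-exec state after passing a pre-exec check, which is
the race the comment warns about.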