From: Djalal Harouni
To: Kees Cook, Andy Lutomirski, Andrew Morton, "Luis R. Rodriguez", James Morris, Ben Hutchings, Solar Designer, Serge Hallyn, Jessica Yu, Rusty Russell, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org, kernel-hardening@lists.openwall.com
Cc: Jonathan Corbet, Ingo Molnar, "David S. Miller", netdev@vger.kernel.org, Peter Zijlstra, Linus Torvalds, Djalal Harouni
Date: Mon, 27 Nov 2017 18:18:37 +0100
Message-Id: <1511803118-2552-5-git-send-email-tixxdz@gmail.com>
In-Reply-To: <1511803118-2552-1-git-send-email-tixxdz@gmail.com>
References: <1511803118-2552-1-git-send-email-tixxdz@gmail.com>
Subject: [kernel-hardening] [PATCH v5 next 4/5] modules:capabilities: add a per-task modules auto-load mode
X-Patchwork-Submitter: Djalal Harouni
X-Patchwork-Id: 10077645

Previous patches added the global sysctl "modules_autoload_mode".
This patch makes it possible to support process trees, containers, and sandboxes by providing an inherited per-task "modules_autoload_mode" flag that cannot be re-enabled once disabled. This improves control over automatic module loading without affecting the rest of the system.

Why do we need this?

Usually a request for a kernel feature that is implemented by a module that is not loaded may trigger the automatic module loading feature, transparently satisfying userspace and providing numerous features as they are needed. In this case an implicit kernel module load operation happens.

In most cases, loading or unloading a kernel module is an explicit operation, and programs are required to have the CAP_SYS_MODULE capability to perform it. However, with implicit module loading, in general no capabilities are required, as automatic module loading is one of the most important and transparent operations of Linux.

Recent vulnerabilities showed that automatic module loading can be abused in order to expose more bugs. Some of these vulnerabilities are:

* DCCP use after free CVE-2017-6074 [1] [2]
  Unprivileged to local root PoC.

* XFRM framework CVE-2017-7184 [3]
  As advertised, it seems it was used to break Ubuntu at a security contest.

* n_hdlc CVE-2017-2636 [4] [5]
  Local privilege escalation.

* L2TPv3 CVE-2016-10200

Currently most Linux code is in the form of modules, and not all modules are written or maintained in the same way. In a container or sandbox world, apps can be moved from one context to another or from one Linux system to another; the ability to restrict some of these apps from loading extra kernel modules will prevent exposing kernel interfaces that have not been updated within such systems. The DCCP vulnerability CVE-2017-6074, which can be triggered by unprivileged users, and CVE-2017-7184 in the XFRM framework are some recent real examples. CVE-2017-7184 was used to break Ubuntu at a security contest.
Ubuntu is more of a desktop distro; using a global switch to disable automatic module loading will harm users. In practice such a switch will always end up being ignored by systems of this kind, which need to offer a competitive and interactive solution for their users.

From this and from observing how apps are being run, this patch introduces a per-task "modules_autoload_mode" to restrict automatic module loading. This offers the following advantages:

1) It can be abstracted in userspace as something like:

   DenyNewFeatures=yes

2) Automatic module loading is still available to the rest of the system.

3) It is easy to use in containers and sandboxes. The DCCP example could have been used to escape containers. The XFRM framework CVE-2017-7184 needs CAP_NET_ADMIN, but attackers may start to target CAP_NET_ADMIN; a per-task flag will make it harder.

4) Suitable for desktop and more interactive Linux systems.

5) It will allow a per-user policy to be implemented in the future. The user database format is old and not extensible; as discussed, maybe with a modern format we may achieve the following:

   User=djalal
   DenyNewFeatures=no

   This means that an interactive user will be allowed to load extra Linux features. Others, such as volatile accounts or guests, can easily be blocked from doing so.

6) CAP_NET_ADMIN is useful and handles a lot of operations; at the same time it has started to look more like CAP_SYS_ADMIN, which is overloaded. We need CAP_NET_ADMIN, containers need it, but at the same time maybe we do not want programs running with it to load 'netdev-%s' modules. Having an extra per-task flag takes this burden off CAP_NET_ADMIN and other capabilities; it is clearly targeted at automatic module loading operations and, from a higher view, at a 'load new kernel features schema'.
Usage:
------

To set the per-task "modules_autoload_mode":

  prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);

When a module auto-load request is triggered by the current task, the operation first has to satisfy the per-task access mode before attempting to implicitly load the module. Once set, this setting is inherited across fork, clone and execve.

Prior to use, the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) or run with CAP_SYS_ADMIN privileges in its namespace. If neither is true, -EACCES will be returned. This requirement ensures that unprivileged programs cannot affect the behaviour of, or surprise, privileged children.

The per-task "modules_autoload_mode" supports the following values:

0  There are no restrictions, usually the default unless set by parent.

1  The task must have CAP_SYS_MODULE to be able to trigger a module auto-load operation, or CAP_NET_ADMIN for modules with a 'netdev-%s' alias.

2  Automatic module loading is disabled for the current task.

The mode may only be increased, never decreased, thus ensuring that once applied, processes can never relax their setting. This makes it easy for developers and users to handle.

Note that even if the per-task "modules_autoload_mode" allows the corresponding modules to be auto-loaded, automatic module loading may still fail due to the global sysctl "modules_autoload_mode". For more details please see Documentation/sysctl/kernel.txt, section "modules_autoload_mode".

When a request for a kernel module is denied, the module name, together with the corresponding process name and its pid, is logged. Administrators can use this information to explicitly load the appropriate modules.
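The rules in the usage section (valid range, the no_new_privs or CAP_SYS_ADMIN precondition, the increase-only property, and the interaction with the global sysctl) can be summarised in a small userspace model. This is only a sketch that mirrors the checks as described; `struct task_model`, `model_set_mode()` and `model_effective_mode()` are hypothetical names used for illustration and are not part of the kernel or of this patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Mode values as described in the usage text. */
enum autoload_mode {
	AUTOLOAD_ALLOWED    = 0,	/* no restrictions */
	AUTOLOAD_PRIVILEGED = 1,	/* capability required */
	AUTOLOAD_DISABLED   = 2,	/* auto-loading disabled */
};

/* Hypothetical stand-in for the task state the checks depend on. */
struct task_model {
	unsigned int mode;	/* per-task "modules_autoload_mode" */
	bool no_new_privs;	/* task has set no_new_privs */
	bool cap_sys_admin;	/* task has CAP_SYS_ADMIN in its namespace */
};

/* Model of setting the per-task mode; returns 0 or a negative errno. */
int model_set_mode(struct task_model *t, unsigned long value)
{
	if (value > AUTOLOAD_DISABLED)
		return -EINVAL;		/* unknown mode */
	if (!t->no_new_privs && !t->cap_sys_admin)
		return -EACCES;		/* precondition not met */
	if (t->mode > value)
		return -EPERM;		/* the mode may never be relaxed */
	t->mode = (unsigned int)value;
	return 0;
}

/* The stricter of the global sysctl and the per-task mode applies. */
unsigned int model_effective_mode(unsigned int global_mode,
				  const struct task_model *t)
{
	return global_mode > t->mode ? global_mode : t->mode;
}
```

In this model, a task that has set no_new_privs can move from 0 to 1 to 2 but never back, and a task left at mode 0 is still restricted when the global sysctl is stricter.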
Testing per-task or per container setup
---------------------------------------

The following tool can be used to test the feature:
https://gist.githubusercontent.com/tixxdz/cf567e4275714199a32c4a80de4ea63a/raw/13e52ea0ee65772871bcf10fb6c94fedd349f5c1/pr_modules_autoload_mode_test.c

Example 1)

Before patch:
$ lsmod | grep ipip -
$ sudo ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
$ lsmod | grep ipip -
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep ipip -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
# ./pr_modules_autoload_mode_test 2
# grep Modules /proc/self/status
ModulesAutoloadMode:    2
# ip tunnel add mytun mode ipip remote 10.0.2.100 local 10.0.2.15 ttl 255
add tunnel "tunl0" failed: No such device
...
[  634.954652] module: automatic module loading of netdev-tunl0 by "ip"[1560] was denied
[  634.955775] module: automatic module loading of tunl0 by "ip"[1560] was denied
...

Example 2)

Sample with XFRM tunnel mode.

Before patch:
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ sudo ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
$ lsmod | grep xfrm
xfrm4_mode_tunnel      16384  2

After patch:
Set task "modules_autoload_mode" to disabled.
$ lsmod | grep xfrm -
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
# ./pr_modules_autoload_mode_test 2
# grep Modules /proc/self/status
ModulesAutoloadMode:    2
# ip xfrm state add src 10.0.2.100 dst 10.0.1.100 proto esp spi $id1 \
> reqid $id2 mode tunnel auth "hmac(sha256)" $key1 enc "cbc(aes)" $key2
RTNETLINK answers: Protocol not supported
...
[ 3458.139490] module: automatic module loading of xfrm-mode-2-1 by "ip"[1506] was denied
...
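The examples above rely on a small helper that sets the per-task mode before further commands are run. The linked test tool is not reproduced here; the following is only a minimal sketch of how such a helper might look, assuming a kernel built with this patch series. The `PR_SET_MODULES_AUTOLOAD_MODE` and `PR_GET_MODULES_AUTOLOAD_MODE` values are taken from this patch's `include/uapi/linux/prctl.h` hunk and are not provided by mainline headers, where the same option numbers may mean something else. `parse_mode()` and `apply_mode()` are hypothetical names.

```c
#include <sys/prctl.h>

/* Values from this patch; only meaningful on a kernel with it applied. */
#ifndef PR_SET_MODULES_AUTOLOAD_MODE
#define PR_SET_MODULES_AUTOLOAD_MODE 50
#define PR_GET_MODULES_AUTOLOAD_MODE 51
#endif

/* Accept exactly "0", "1" or "2"; return -1 for any other input. */
int parse_mode(const char *s)
{
	if (!s || s[0] < '0' || s[0] > '2' || s[1] != '\0')
		return -1;
	return s[0] - '0';
}

/*
 * Set no_new_privs first so that CAP_SYS_ADMIN is not required, then
 * request the given mode. Returns the mode reported back by the kernel,
 * or -1 with errno set if any prctl() call fails.
 */
int apply_mode(int mode)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	if (prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0) != 0)
		return -1;
	return prctl(PR_GET_MODULES_AUTOLOAD_MODE, 0, 0, 0, 0);
}
```

A caller would typically run `apply_mode(parse_mode(argv[1]))` and then exec a shell or command, since the setting is inherited across fork, clone and execve.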
Example 3)

Here we use DCCP as an example since the public PoC was against it.

DCCP use after free CVE-2017-6074 (unprivileged to local root):
The code path can be triggered by unprivileged users, using the trigger.c program for the DCCP use after free [2], which was fixed by commit 5edabca9d4cff7f "dccp: fix freeing skb too early for IPV6_RECVPKTINFO".

Before patch:
$ lsmod | grep dccp
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = 3
...
$ lsmod | grep dccp
dccp_ipv6              24576  5
dccp_ipv4              24576  5 dccp_ipv6
dccp                  102400  2 dccp_ipv6,dccp_ipv4
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0

After patch:
Set task "modules_autoload_mode" to 1, privileged mode.
$ lsmod | grep dccp
$ ./pr_set_no_new_privs
$ grep NoNewPrivs /proc/self/status
NoNewPrivs:     1
$ ./pr_modules_autoload_mode_test 1
$ grep Modules /proc/self/status
ModulesAutoloadMode:    1
$ strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
$ lsmod | grep dccp
$ dmesg
...
[ 4662.171994] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.177284] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied
[ 4662.180181] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1759] was denied
[ 4662.181709] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1759] was denied

Now set task "modules_autoload_mode" to 2, disabled mode.
$ lsmod | grep dccp
$ grep Modules /proc/self/status
ModulesAutoloadMode:    0
$ su - root
# ./pr_modules_autoload_mode_test 2
# grep Modules /proc/self/status
ModulesAutoloadMode:    2
# strace ./dccp_trigger
...
socket(AF_INET6, SOCK_DCCP, IPPROTO_IP) = -1 ESOCKTNOSUPPORT (Socket type not supported)
...
...
[ 5154.218740] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.219828] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied
[ 5154.221814] module: automatic module loading of net-pf-10-proto-0-type-6 by "dccp_trigger"[1873] was denied
[ 5154.222731] module: automatic module loading of net-pf-10-proto-0 by "dccp_trigger"[1873] was denied

As shown, this blocks automatic module loading per task. This makes it possible to provide a usable system where only some sandboxed apps or containers are restricted from triggering automatic module loading, while other parts of the system can continue to use the feature as it is, which is the case for desktop and user-friendly machines.

[1] http://www.openwall.com/lists/oss-security/2017/02/22/3
[2] https://github.com/xairy/kernel-exploits/tree/master/CVE-2017-6074
[3] http://www.openwall.com/lists/oss-security/2017/03/29/2
[4] http://www.openwall.com/lists/oss-security/2017/03/07/6
[5] https://a13xp0p0v.github.io/2017/03/24/CVE-2017-2636.html

Cc: Ben Hutchings
Cc: Rusty Russell
Cc: James Morris
Cc: Serge Hallyn
Cc: Solar Designer
Cc: Andy Lutomirski
Cc: Kees Cook
Signed-off-by: Djalal Harouni
---
 Documentation/filesystems/proc.txt                 |   3 +
 Documentation/userspace-api/index.rst              |   1 +
 .../userspace-api/modules_autoload_mode.rst        | 116 +++++++++++++++++++++
 fs/proc/array.c                                    |   6 ++
 include/linux/init_task.h                          |   8 ++
 include/linux/module.h                             |  20 ++++
 include/linux/sched.h                              |   5 +
 include/uapi/linux/prctl.h                         |   8 ++
 kernel/module.c                                    |  83 ++++++++++++---
 security/commoncap.c                               |  36 +++++++
 10 files changed, 270 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/userspace-api/modules_autoload_mode.rst

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2a84bb3..1974cb6 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -195,6 +195,7 @@ read the file /proc/PID/status:
   CapBnd: ffffffffffffffff
   NoNewPrivs:     0
   Seccomp:        0
+  ModulesAutoloadMode:    0
   voluntary_ctxt_switches:        0
   nonvoluntary_ctxt_switches:     1
 
@@ -269,6 +270,8 @@ Table 1-2: Contents of the status files (as of 4.8)
  CapBnd                      bitmap of capabilities bounding set
  NoNewPrivs                  no_new_privs, like prctl(PR_GET_NO_NEW_PRIV, ...)
  Seccomp                     seccomp mode, like prctl(PR_GET_SECCOMP, ...)
+ ModulesAutoloadMode         modules auto-load mode, like
+                             prctl(PR_GET_MODULES_AUTOLOAD_MODE, ...)
  Cpus_allowed                mask of CPUs on which this process may run
  Cpus_allowed_list           Same as previous, but in "list format"
  Mems_allowed                mask of memory nodes allowed to this process
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 7b2eb1b..bfd51b7 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -17,6 +17,7 @@ place where this information is gathered.
    :maxdepth: 2
 
    no_new_privs
+   modules_autoload_mode
    seccomp_filter
    unshare
 
diff --git a/Documentation/userspace-api/modules_autoload_mode.rst b/Documentation/userspace-api/modules_autoload_mode.rst
new file mode 100644
index 0000000..1153c35
--- /dev/null
+++ b/Documentation/userspace-api/modules_autoload_mode.rst
@@ -0,0 +1,116 @@
+======================================
+Per-task module auto-load restrictions
+======================================
+
+
+Introduction
+============
+
+Usually a request to a kernel feature that is implemented by a module
+that is not loaded may trigger automatic module loading feature, allowing
+to transparently satisfy userspace, and provide numerous other features
+as they are needed. In this case an implicit kernel module load
+operation happens.
+
+In most cases to load or unload a kernel module, an explicit operation
+happens where programs are required to have ``CAP_SYS_MODULE`` capability
+to perform so. However, with implicit module loading, no capabilities are
+required, or only ``CAP_NET_ADMIN`` in rare cases where the module has the
+'netdev-%s' alias. Historically this was always the case as automatic
+module loading is one of the most important and transparent operations
+of Linux, users expect that their programs just work, yet, recent cases
+showed that this can be abused by unprivileged users or attackers to load
+modules that were not updated, or modules that contain bugs and
+vulnerabilities.
+
+Currently most of Linux code is in a form of modules, hence, allowing to
+control automatic module loading in some cases is as important as the
+operation itself, especially in the context where Linux is used in
+different appliances.
+
+Restricting automatic module loading allows administrators to have the
+appropriate time to update or deny module autoloading in advance. In a
+container or sandbox world where apps can be moved from one context to
+another, the ability to restrict some containers or apps to load extra
+kernel modules will prevent exposing some kernel interfaces that may not
+receive the same care as some other parts of the core. The DCCP vulnerability
+CVE-2017-6074 that can be triggered by unprivileged, or CVE-2017-7184
+in the XFRM framework are some real examples where users or programs are
+able to expose such kernel interfaces and escape their sandbox.
+
+The per-task ``modules_autoload_mode`` allows restricting automatic module
+loading per task, preventing the kernel from exposing more of its
+interface. This is particularly useful for containers and sandboxes as
+noted above, they are restricted from affecting the rest of the system
+without affecting its functionality, automatic module loading is still
+available for others.
+
+
+Usage
+=====
+
+When the kernel is compiled with modules support ``CONFIG_MODULES``, then:
+
+``PR_SET_MODULES_AUTOLOAD_MODE``:
+    Set the current task ``modules_autoload_mode``. When a module
+    auto-load request is triggered by current task, then the
+    operation has first to satisfy the per-task access mode before
+    attempting to implicitly load the module. As an example,
+    automatic loading of modules that contain bugs or vulnerabilities
+    can be restricted and unprivileged users can no longer abuse such
+    interfaces. Once set, this setting is inherited across ``fork(2)``,
+    ``clone(2)`` and ``execve(2)``.
+
+    Prior to use, the task must call ``prctl(PR_SET_NO_NEW_PRIVS, 1)``
+    or run with ``CAP_SYS_ADMIN`` privileges in its namespace. If
+    these are not true, ``-EACCES`` will be returned. This requirement
+    ensures that unprivileged programs cannot affect the behaviour or
+    surprise privileged children.
+
+    Usage:
+        ``prctl(PR_SET_MODULES_AUTOLOAD_MODE, mode, 0, 0, 0);``
+
+    The 'mode' argument supports the following values:
+        0   There are no restrictions, usually the default unless set
+            by parent.
+        1   The task must have ``CAP_SYS_MODULE`` to be able to trigger a
+            module auto-load operation, or ``CAP_NET_ADMIN`` for modules
+            with a 'netdev-%s' alias.
+        2   Automatic modules loading is disabled for the current task.
+
+    The mode may only be increased, never decreased, thus ensuring
+    that once applied, processes can never relax their setting.
+
+
+    Returned values:
+        0               On success.
+        ``-EINVAL``     If 'mode' is not valid, or the operation is not
+                        supported.
+        ``-EACCES``     If task does not have ``CAP_SYS_ADMIN`` in its namespace
+                        or is not running with ``no_new_privs``.
+        ``-EPERM``      If 'mode' is less strict than current task
+                        ``modules_autoload_mode``.
+
+
+    Note that even if the per-task ``modules_autoload_mode`` allows to
+    auto-load the corresponding modules, automatic module loading
+    may still fail due to the global sysctl ``modules_autoload_mode``.
+    The default mode of ``modules_autoload_mode`` is to always allow
+    automatic module loading. For more details, please see
+    Documentation/sysctl/kernel.txt, section "modules_autoload_mode".
+
+
+    When a request to a kernel module is denied, the module name with the
+    corresponding process name and its pid are logged. Administrators can
+    use such information to explicitly load the appropriate modules.
+
+
+``PR_GET_MODULES_AUTOLOAD_MODE``:
+    Return the current task ``modules_autoload_mode``.
+
+    Usage:
+        ``prctl(PR_GET_MODULES_AUTOLOAD_MODE, 0, 0, 0, 0);``
+
+    Returned values:
+        mode            The task's ``modules_autoload_mode``
+        ``-ENOSYS``     If the kernel was compiled without ``CONFIG_MODULES``.
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 79375fc..57b6cc5 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -90,6 +90,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -343,10 +344,15 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
 
 static inline void task_seccomp(struct seq_file *m, struct task_struct *p)
 {
+	int autoload = task_modules_autoload_mode(p);
+
 	seq_put_decimal_ull(m, "NoNewPrivs:\t", task_no_new_privs(p));
 #ifdef CONFIG_SECCOMP
 	seq_put_decimal_ull(m, "\nSeccomp:\t", p->seccomp.mode);
 #endif
+	if (autoload != -ENOSYS)
+		seq_put_decimal_ull(m, "\nModulesAutoloadMode:\t", autoload);
+
 	seq_putc(m, '\n');
 }
 
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6a53262..f564b41 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -153,6 +153,13 @@ extern struct cred init_cred;
 # define INIT_CGROUP_SCHED(tsk)
 #endif
 
+#ifdef CONFIG_MODULES
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)				\
+	.modules_autoload_mode = 0,
+#else
+# define INIT_MODULES_AUTOLOAD_MODE(tsk)
+#endif
+
 #ifdef CONFIG_PERF_EVENTS
 # define INIT_PERF_EVENTS(tsk)						\
 	.perf_event_mutex =						\
@@ -250,6 +257,7 @@ extern struct cred init_cred;
 	.tasks		= LIST_HEAD_INIT(tsk.tasks),			\
 	INIT_PUSHABLE_TASKS(tsk)					\
 	INIT_CGROUP_SCHED(tsk)						\
+	INIT_MODULES_AUTOLOAD_MODE(tsk)					\
 	.ptraced	= LIST_HEAD_INIT(tsk.ptraced),			\
 	.ptrace_entry	= LIST_HEAD_INIT(tsk.ptrace_entry),		\
 	.real_parent	= &tsk,						\
diff --git a/include/linux/module.h b/include/linux/module.h
index c36aed8..1d742d3 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -510,6 +511,15 @@ bool is_module_text_address(unsigned long addr);
 int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix);
 
+/* Set 'modules_autoload_mode' of current task */
+int task_set_modules_autoload_mode(unsigned long value);
+
+/* Read task's 'modules_autoload_mode' */
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return task->modules_autoload_mode;
+}
+
 static inline bool within_module_core(unsigned long addr,
 				      const struct module *mod)
 {
@@ -662,6 +672,16 @@ static inline int may_autoload_module(char *kmod_name, int required_cap,
 	return -ENOSYS;
 }
 
+static inline int task_set_modules_autoload_mode(unsigned long value)
+{
+	return -ENOSYS;
+}
+
+static inline int task_modules_autoload_mode(struct task_struct *task)
+{
+	return -ENOSYS;
+}
+
 static inline struct module *__module_address(unsigned long addr)
 {
 	return NULL;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e5a2fbc..1b8cf78 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -658,6 +658,11 @@ struct task_struct {
 
 	struct restart_block		restart_block;
 
+#ifdef CONFIG_MODULES
+	/* per-task modules auto-load mode */
+	unsigned			modules_autoload_mode:2;
+#endif
+
 	pid_t				pid;
 	pid_t				tgid;
 
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 3165863..5baf9ae 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -211,4 +211,12 @@ struct prctl_mm_map {
 #define PR_SET_PDEATHSIG_PROC		48
 #define PR_GET_PDEATHSIG_PROC		49
 
+/*
+ * Control the per-task modules auto-load mode
+ *
+ * See Documentation/userspace-api/modules_autoload_mode.rst for more details.
+ */
+#define PR_SET_MODULES_AUTOLOAD_MODE	50
+#define PR_GET_MODULES_AUTOLOAD_MODE	51
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/module.c b/kernel/module.c
index a7205fb..5c24ac4b 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -4345,6 +4345,7 @@ EXPORT_SYMBOL_GPL(__module_text_address);
 /**
  * may_autoload_module - Determine whether a module auto-load operation
  * is permitted
+ *
  * @kmod_name: The module name
  * @required_cap: if positive, may allow to auto-load the module if this
  * capability is set
@@ -4362,47 +4363,51 @@ EXPORT_SYMBOL_GPL(__module_text_address);
  * loading.
  *
  * However even if the caller has the required capability, the operation can
- * still be denied due to the global "modules_autoload_mode" sysctl mode. Unless
- * set by enduser, the operation is always allowed which is the default.
+ * still be denied due to the per-task "modules_autoload_mode" mode and the
+ * global "modules_autoload_mode" sysctl one. Unless set by enduser, the
+ * operation is always allowed which is the default.
  *
  * The permission check is performed in this order:
- * 1) If the global sysctl "modules_autoload_mode" is set to 'disabled', then
- * operation is denied.
+ * 1) We calculate the strict mode of both:
+ * per-task 'modules_autoload_mode' and global sysctl 'modules_autoload_mode'
+ *
+ * We follow up with the result mode as "modules_autoload_mode":
  *
- * 2) If the global sysctl "modules_autoload_mode" is set to 'privileged', then:
+ * 2) If "modules_autoload_mode" is set to 'disabled', then operation is denied.
  *
- * 2.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
+ * 3) If "modules_autoload_mode" is set to 'privileged', then:
+ *
+ * 3.1) If "@required_cap" is positive and "@kmod_prefix" is set, then
  * if the caller has the capability, the operation is allowed.
  *
- * 2.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ * 3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  * fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  * allowed.
  *
- * 2.3) If caller passes "@required_cap" as a negative then we fallback to
+ * 3.3) If caller passes "@required_cap" as a negative then we fallback to
  * check if caller has CAP_SYS_MODULE, if so, operation is allowed.
  *
  * We require capabilities to autoload modules here, and CAP_SYS_MODULE here is
  * the default.
  *
- * 2.4) Otherwise operation is denied.
+ * 3.4) Otherwise operation is denied.
  *
- * 3) If the global sysctl "modules_autoload_mode" is set to 'allowed' which is
- * the default, then:
+ * 4) If "modules_autoload_mode" is set to 'allowed' which is the default, then:
  *
- * 3.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
+ * 4.1) If "@required_cap" is positive and "@kmod_prefix" is set, we check if
  * caller has the capability, if so, operation is allowed.
  * In this case the calling subsystem requires the capability to be set before
  * allowing modules autoload operations and we have to honor that.
  *
- * 3.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
+ * 4.2) If "@required_cap" is positive and "@kmod_prefix" is NULL, then we
  * fallback to check if caller has CAP_SYS_MODULE, if so, operation is
  * allowed.
  *
- * 3.3) If caller passes "@required_cap" as a negative then operation is
+ * 4.3) If caller passes "@required_cap" as a negative then operation is
  * allowed. This is the most common case as it is used now by
  * request_module() function.
  *
- * 3.4) Otherwise operation is denied.
+ * 4.4) Otherwise operation is denied.
  *
  * Returns 0 if the module request is allowed or -EPERM if not.
  */
@@ -4410,7 +4415,8 @@ int may_autoload_module(char *kmod_name, int required_cap,
 			const char *kmod_prefix)
 {
 	int module_require_cap = CAP_SYS_MODULE;
-	unsigned int autoload = modules_autoload_mode;
+	unsigned int autoload = max_t(unsigned int, modules_autoload_mode,
+				      current->modules_autoload_mode);
 
 	/* Short-cut for most use cases where kmod auto-loading is allowed */
 	if (autoload == MODULES_AUTOLOAD_ALLOWED && required_cap < 0)
@@ -4442,6 +4448,51 @@ int may_autoload_module(char *kmod_name, int required_cap,
 	return -EPERM;
 }
 
+/**
+ * task_set_modules_autoload_mode - Set per-task modules auto-load mode
+ * @value: Value to set "modules_autoload_mode" of current task
+ *
+ * Set current task "modules_autoload_mode". The task has to have
+ * CAP_SYS_ADMIN in its namespace or be running with no_new_privs. This
+ * avoids scenarios where unprivileged tasks can affect the behaviour of
+ * privileged children by restricting module or kernel features.
+ *
+ * The task's "modules_autoload_mode" may only be increased, never decreased.
+ *
+ * Returns 0 on success, -EINVAL if @value is not valid, -EACCES if task does
+ * not have CAP_SYS_ADMIN in its namespace or is not running with no_new_privs,
+ * and finally -EPERM if @value is less strict than current task
+ * "modules_autoload_mode".
+ *
+ */
+int task_set_modules_autoload_mode(unsigned long value)
+{
+	if (value > MODULES_AUTOLOAD_DISABLED)
+		return -EINVAL;
+
+	/*
+	 * To set task "modules_autoload_mode" requires that the task has
+	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
+	 * This avoids scenarios where unprivileged tasks can affect the
+	 * behaviour of privileged children by restricting module features.
+	 */
+	if (!task_no_new_privs(current) &&
+	    security_capable_noaudit(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN) != 0)
+		return -EACCES;
+
+	/*
+	 * The "modules_autoload_mode" may only be increased, never decreased,
+	 * ensuring that once applied, processes can never relax their settings.
+	 */
+	if (current->modules_autoload_mode > value)
+		return -EPERM;
+	else if (current->modules_autoload_mode < value)
+		current->modules_autoload_mode = value;
+
+	return 0;
+}
+
 /* Don't grab lock, we're oopsing. */
 void print_modules(void)
 {
diff --git a/security/commoncap.c b/security/commoncap.c
index 236e573..67a235c 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -1157,6 +1157,36 @@ static int cap_prctl_drop(unsigned long cap)
 	return commit_creds(new);
 }
 
+/*
+ * Implement PR_SET_MODULES_AUTOLOAD_MODE.
+ *
+ * Returns 0 on success, -ve on error.
+ */
+static int pr_set_modules_autoload_mode(unsigned long arg2, unsigned long arg3,
+					unsigned long arg4, unsigned long arg5)
+{
+	if (arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_set_modules_autoload_mode(arg2);
+}
+
+/*
+ * Implement PR_GET_MODULES_AUTOLOAD_MODE.
+ *
+ * Return current task "modules_autoload_mode", -ve on error.
+ */
+static inline int pr_get_modules_autoload_mode(unsigned long arg2,
+					       unsigned long arg3,
+					       unsigned long arg4,
+					       unsigned long arg5)
+{
+	if (arg2 || arg3 || arg4 || arg5)
+		return -EINVAL;
+
+	return task_modules_autoload_mode(current);
+}
+
 /**
  * cap_task_prctl - Implement process control functions for this security module
  * @option: The process control function requested
@@ -1287,6 +1317,12 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
 		return commit_creds(new);
 	}
 
+	case PR_SET_MODULES_AUTOLOAD_MODE:
+		return pr_set_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
+	case PR_GET_MODULES_AUTOLOAD_MODE:
+		return pr_get_modules_autoload_mode(arg2, arg3, arg4, arg5);
+
 	default:
 		/* No functionality available - continue with default */
 		return -ENOSYS;