From patchwork Wed Jan 17 16:14:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Josh Poimboeuf X-Patchwork-Id: 13521987 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B7E7C47DAF for ; Wed, 17 Jan 2024 16:15:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id F10196B0106; Wed, 17 Jan 2024 11:15:13 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id EC17C6B0107; Wed, 17 Jan 2024 11:15:13 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D2D666B0108; Wed, 17 Jan 2024 11:15:13 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id AB6756B0107 for ; Wed, 17 Jan 2024 11:15:13 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 7C29A1613C7 for ; Wed, 17 Jan 2024 16:15:13 +0000 (UTC) X-FDA: 81689302506.15.7B04A2B Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by imf14.hostedemail.com (Postfix) with ESMTP id CC1C6100021 for ; Wed, 17 Jan 2024 16:15:10 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=WGlSJQ3j; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf14.hostedemail.com: domain of jpoimboe@kernel.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=jpoimboe@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1705508111; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: 
content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=o3K4TPl78GOzKHIjcRo4cqwYWgTS0PNWTuRXct9Lkh0=; b=ap03psUy6S6ifxAJPkyaVX334Yy13QkPEckq8Z/GjS6D1hbcuw8CT+miUNNcLeBJtNhCbV 8CcawLdTNlUxu8HLz+Q8WonMPNXQAM0tbLNwItG5729/wJW5vUSairXXww0ZHQOMrWWKrA kSQs+dU71o2GpFvsepZilmIWx4pQ7Ys= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=WGlSJQ3j; dmarc=pass (policy=none) header.from=kernel.org; spf=pass (imf14.hostedemail.com: domain of jpoimboe@kernel.org designates 145.40.68.75 as permitted sender) smtp.mailfrom=jpoimboe@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1705508111; a=rsa-sha256; cv=none; b=iYcKPurWrMuexZ5wX/S7e4T8fwWbrZubwbHIUHEJ4OAd9n0NCCzQZsIBH5QH9XIXjPejUD U0cX0hAbYcRIGTju1fPpZcX/icIrtF6twwK2EKu1U1wakN28Ao1PnBtfUkqSYYLnnf62M+ DumS//tVNcTDIYOjms2AFlzzb/Me1OQ= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by ams.source.kernel.org (Postfix) with ESMTP id BA8E6B810C3; Wed, 17 Jan 2024 16:15:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4ED50C43394; Wed, 17 Jan 2024 16:15:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1705508108; bh=eP1aE7eGJrI56wu/JUZuR/oHJxD8hIQm2thQjg7z0tE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WGlSJQ3jQyZPQFC++EdqLNjVx8Yadpn7xElFro4BLyKUqFFpQuAncr1EOLlt3uyuI +sxXmuKxtLjUOA+r4tkrjzwa0w2fXlZGffuba9K/H5yHuoLqmQz6tLk8/uuRsnNwvp 0OESLveyoKN0OwCncqn6LD6mHP7NTY6OGGx7qOg0g8puA1fNuqUsAJCkH0E6nK2ru8 j4UkgHz40xh0P0O1yVFYkRg0lALH6UXwp8Ff3UgZgnlfnXSn/RGd8P6ZwFHx957AUc 3JfjzKdG1G9q/wZM6RZ1qlXsBSPYbFSAjWGhKG1QGsZTEw1T2W7rDmqsCvi4cGD8dl B1HUqFkOztYmA== From: Josh Poimboeuf To: Linus Torvalds , Jeff Layton , Chuck Lever , Shakeel Butt , Roman Gushchin , Johannes Weiner , Michal Hocko Cc: linux-kernel@vger.kernel.org, Jens Axboe , Tejun Heo , Vasily Averin , Michal Koutny , Waiman 
Long , Muchun Song , Jiri Kosina , cgroups@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again Date: Wed, 17 Jan 2024 08:14:43 -0800 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Queue-Id: CC1C6100021 X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: 8m1xeh5437ptscg95wcw783y7tcdrre5 X-HE-Tag: 1705508110-170131 X-HE-Meta: U2FsdGVkX18wElrqMOTIWDkzeGUiVdhrQcX+y/F4qlnb9WXPD3CJ2pW06MKTqLfCDw03zQKvDxncfxb81od3tmlSflxZWAX3d2B7QiJUL3+/wpvxl2o9juCQ5c1Jvl4FARJxOVnW8DUIvmsieLY6R9YlMtlQ4xwC2/Ti5Hh7h2M1I/yI0PipOhEtZDsj3e5s9r06Y9Gim+ALLqgp8R1EHfBriHeOY99FpJIDKkfnroeGnmclArB/P6BQl0rqiltdYCnkPp8xYZK8htN7gf8b3g+WHaoyJkkhvR2O32s3y6f/h4Q+gQPPdIh4Bs6D8TZBrjzRPztf/DDh+qFJz7zeGbJnQZVtqTeV1sIa0bp8fYBKyAet/K0an4qQfP4aBV1Tdva+/A4dO/3pJeA3p564jIsgKXbzatmRB7qsdBl/LTWfesxmj+sWwhxbO1kTX8UzmhHiZsjSr+pdcHYW8JGfBc1TIEE5AoQfVWwN5bISaKvrWpPf8mNAhSB5hJLqdqjh4cdtaLFlhMXmNav+qSxmXKEq24VQVqiz0Zp9X0Vx0l0vwB7uaRA7NeYKQWNzjZ8ygxAcgdbxWgaLb64j6qCqdu0HTK7BWmJDxXCf4GUydFqapVzLCFY5KETGYTNLOY6GllWZpWrciSOnIz9nxKKxnuVz3L9JuP+5DtO7Pigib2XfN6S3YO2VC35+hUAWxSa20lQSdA4qQjT6NQzu8NMgsi2IPK0g60a629lqNaGYMpHKfiCe081VjVIpxzm9+ctUsh9Np94xxjI50W75QyVpGMXL/QzdBaAj5F2ksnuD56kEosaPbk8yb/4ZR9OVYCxFh/WBuIXMDwzYv3FjPEK7So1cqwkVeZOJxnSFYieNnext3NwF6jGdSTkgNeNgOSoFy6YfgdVdiSifiT930wSB6FqvuD+gOEmsP5pUTrfn8GaE7CB+rtWh9w== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: A container can exceed its memcg limits by allocating a bunch of file locks. This bug was originally fixed by commit 0f12156dff28 ("memcg: enable accounting for file lock caches"), but was later reverted by commit 3754707bcc3e ("Revert "memcg: enable accounting for file lock caches"") due to performance issues. 
Unfortunately those performance issues were never addressed and the bug
has remained unfixed for over two years.

Fix it by default but allow users to disable it with a cmdline option
(flock_accounting=off).

Signed-off-by: Josh Poimboeuf
---
 .../admin-guide/kernel-parameters.txt         | 17 +++++++++++
 fs/locks.c                                    | 30 +++++++++++++++++--
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6ee0f9a5da70..91987b06bc52 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1527,6 +1527,23 @@
 			See Documentation/admin-guide/sysctl/net.rst for
 			fb_tunnels_only_for_init_ns

+	flock_accounting=
+			[KNL] Enable/disable accounting for kernel
+			memory allocations related to file locks.
+			Format: { on | off }
+			Default: on
+			on:  Enable kernel memory accounting for file
+			     locks. This prevents task groups from
+			     exceeding their memcg allocation limits.
+			     However, it may cause slowdowns in the
+			     flock() system call.
+			off: Disable kernel memory accounting for
+			     file locks. This may allow a rogue task
+			     to DoS the system by forcing the kernel
+			     to allocate memory beyond the task
+			     group's memcg limits. Not recommended
+			     unless you have trusted user space.
+
 	floppy=		[HW]
 			See Documentation/admin-guide/blockdev/floppy.rst.
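As a reading aid for the documentation hunk above, the on/off semantics can be sketched in plain userspace C. This is only an illustration under stated assumptions, not code from the patch: the `sketch_` function names and the `SKETCH_SLAB_*` flag values are hypothetical stand-ins (the kernel's real `SLAB_PANIC` and `SLAB_ACCOUNT` have different definitions). Only the parsing rules ("on", "off", anything else rejected with -EINVAL) and the idea of adding the accounting flag when the option is on are taken from the series.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's slab flags; values are illustrative. */
#define SKETCH_SLAB_PANIC   0x1u
#define SKETCH_SLAB_ACCOUNT 0x2u

/*
 * Parse a flock_accounting= value. Accepts exactly "on" or "off";
 * a missing or unknown value is rejected and *enabled is left untouched.
 */
static int sketch_parse_flock_accounting(const char *str, bool *enabled)
{
	if (!str)
		return -EINVAL;

	if (!strcmp(str, "off"))
		*enabled = false;
	else if (!strcmp(str, "on"))
		*enabled = true;
	else
		return -EINVAL;

	return 0;
}

/* Choose cache-creation flags: the accounting flag is set only when enabled. */
static unsigned int sketch_cache_flags(bool enabled)
{
	unsigned int flags = SKETCH_SLAB_PANIC;

	if (enabled)
		flags |= SKETCH_SLAB_ACCOUNT;

	return flags;
}
```

Because the default is "on", a caller in this sketch would initialise `enabled` to true before parsing, so an absent or rejected option keeps accounting enabled.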
diff --git a/fs/locks.c b/fs/locks.c
index cc7c117ee192..235ac56c557d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2905,15 +2905,41 @@ static int __init proc_locks_init(void)
 fs_initcall(proc_locks_init);
 #endif

+static bool flock_accounting __ro_after_init = true;
+
+static int __init flock_accounting_cmdline(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off"))
+		flock_accounting = false;
+	else if (!strcmp(str, "on"))
+		flock_accounting = true;
+	else
+		return -EINVAL;
+
+	return 0;
+}
+early_param("flock_accounting", flock_accounting_cmdline);
+
+#define FLOCK_ACCOUNTING_MSG "WARNING: File lock accounting is disabled, container-triggered host memory exhaustion possible!\n"
+
 static int __init filelock_init(void)
 {
 	int i;
+	slab_flags_t flags = SLAB_PANIC;
+
+	if (!flock_accounting)
+		pr_err(FLOCK_ACCOUNTING_MSG);
+	else
+		flags |= SLAB_ACCOUNT;

 	flctx_cache = kmem_cache_create("file_lock_ctx",
-			sizeof(struct file_lock_context), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock_context), 0, flags, NULL);

 	filelock_cache = kmem_cache_create("file_lock_cache",
-			sizeof(struct file_lock), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock), 0, flags, NULL);

 	for_each_possible_cpu(i) {
 		struct file_lock_list_struct *fll = per_cpu_ptr(&file_lock_list, i);

From patchwork Wed Jan 17 16:14:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 13521989
From: Josh Poimboeuf
To: Linus Torvalds, Jeff Layton, Chuck Lever, Shakeel Butt, Roman Gushchin, Johannes Weiner, Michal Hocko
Cc: linux-kernel@vger.kernel.org, Jens Axboe, Tejun Heo, Vasily Averin, Michal Koutny, Waiman Long, Muchun Song, Jiri Kosina, cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH RFC 2/4] fs/locks: Add CONFIG_FLOCK_ACCOUNTING
Date: Wed, 17 Jan 2024 08:14:44 -0800
X-Mailer: git-send-email 2.43.0

Allow flock cache accounting to be disabled at build time.

Signed-off-by: Josh Poimboeuf
---
 fs/Kconfig | 15 +++++++++++++++
 fs/locks.c |  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index a3159831ba98..591f54a03059 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -129,6 +129,21 @@ config FILE_LOCKING
 	  for filesystems like NFS and for the flock() system call.
 	  Disabling this option saves about 11k.

+config FLOCK_ACCOUNTING
+	bool "Enable kernel memory accounting for file locks" if EXPERT
+	depends on FILE_LOCKING
+	default y
+	help
+	  This option enables kernel memory accounting for file locks. This
+	  prevents task groups from exceeding their memcg allocation limits.
+	  However, it may cause slowdowns in the flock() system call.
+
+	  Disabling this option is not recommended as it may allow a rogue task
+	  to DoS the system by forcing the kernel to allocate memory beyond the
+	  task group's memcg limits.
+
+	  If unsure, say Y.
+
 source "fs/crypto/Kconfig"

 source "fs/verity/Kconfig"
diff --git a/fs/locks.c b/fs/locks.c
index 235ac56c557d..e2799a18c4e8 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2905,7 +2905,7 @@ static int __init proc_locks_init(void)
 fs_initcall(proc_locks_init);
 #endif

-static bool flock_accounting __ro_after_init = true;
+static bool flock_accounting __ro_after_init = IS_ENABLED(CONFIG_FLOCK_ACCOUNTING);

 static int __init flock_accounting_cmdline(char *str)
 {

From patchwork Wed Jan 17 16:14:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 13521986
From: Josh Poimboeuf
To: Linus Torvalds, Jeff Layton, Chuck Lever, Shakeel Butt, Roman Gushchin, Johannes Weiner, Michal Hocko
Cc: linux-kernel@vger.kernel.org, Jens Axboe, Tejun Heo, Vasily Averin, Michal Koutny, Waiman Long, Muchun Song, Jiri Kosina, cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH RFC 3/4] mitigations: Expand 'mitigations=off' to include optional software mitigations
Date: Wed, 17 Jan 2024 08:14:45 -0800
X-Mailer: git-send-email 2.43.0

The 'mitigations=off' cmdline option disables all CPU mitigations at
runtime. It's intended for users who are running with trusted user
space and don't want the performance impact associated with all the
mitigations.

Up until now, it was only used for CPU mitigations. However, there can
also be optional software mitigations which have performance impact.

Expand 'mitigations=' to include optional software mitigations.
After all there's nothing in the "mitigations" name which limits it to
CPU vulnerabilities.

In theory we could introduce separate {cpu,sw}_mitigations= options, but
for the time being there's no need to separate them out. It's simpler
to have them combined since the use case of "I have trusted user space
and don't want the performance impacts of unneeded mitigations" is the
same, regardless of the source of the bug.

Move the interfaces around and rename them to reflect the new broader
impact of mitigations=off. No functional changes.

Signed-off-by: Josh Poimboeuf
---
 .../admin-guide/kernel-parameters.txt         | 27 ++++++----
 arch/arm64/kernel/cpufeature.c                |  2 +-
 arch/arm64/kernel/proton-pack.c               |  6 +--
 arch/powerpc/kernel/security.c                | 14 +++---
 arch/s390/kernel/nospec-branch.c              |  2 +-
 arch/x86/kernel/cpu/bugs.c                    | 35 ++++++-------
 arch/x86/kvm/mmu/mmu.c                        |  2 +-
 arch/x86/mm/pti.c                             |  3 +-
 include/linux/bpf.h                           |  5 +-
 include/linux/cpu.h                           |  3 --
 include/linux/mitigations.h                   |  4 ++
 kernel/Makefile                               |  3 +-
 kernel/cpu.c                                  | 43 ----------------
 kernel/mitigations.c                          | 50 +++++++++++++++++++
 14 files changed, 109 insertions(+), 90 deletions(-)
 create mode 100644 include/linux/mitigations.h
 create mode 100644 kernel/mitigations.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91987b06bc52..24e873351368 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3391,16 +3391,23 @@
 			https://repo.or.cz/w/linux-2.6/mini2440.git

 	mitigations=
-			[X86,PPC,S390,ARM64] Control optional mitigations for
-			CPU vulnerabilities. This is a set of curated,
-			arch-independent options, each of which is an
-			aggregation of existing arch-specific options.
+			[KNL] Control optional mitigations for CPU
+			vulnerabilities and performance-impacting
+			software vulnerabilities. This is a set of
+			curated, arch-independent options, each of which
+			is an aggregation of existing arch-specific
+			options.

 			off
-				Disable all optional CPU mitigations. This
-				improves system performance, but it may also
-				expose users to several CPU vulnerabilities.
-				Equivalent to: if nokaslr then kpti=0 [ARM64]
+				Disable all optional mitigations. This
+				improves system performance, but may also
+				expose users to several vulnerabilities.
+
+				Equivalent to:
+
+				CPU mitigations:
+				----------------
+				if nokaslr then kpti=0 [ARM64]
 					       gather_data_sampling=off [X86]
 					       kvm.nx_huge_pages=off [X86]
 					       l1tf=off [X86]
@@ -3426,7 +3433,7 @@
 					       kvm.nx_huge_pages=force.

 			auto (default)
-				Mitigate all CPU vulnerabilities, but leave SMT
+				Enable all optional mitigations, but leave SMT
 				enabled, even if it's vulnerable. This is for
 				users who don't want to be surprised by SMT
 				getting disabled across kernel upgrades, or who
@@ -3434,7 +3441,7 @@
 				Equivalent to: (default behavior)

 			auto,nosmt
-				Mitigate all CPU vulnerabilities, disabling SMT
+				Enable all optional mitigations, disabling SMT
 				if needed. This is for users who always want to
 				be fully mitigated, even if it means losing SMT.
 				Equivalent to: l1tf=flush,nosmt [X86]
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 01a4c1d7fc09..ae37898e5b1a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1719,7 +1719,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
 		}
 	}

-	if (cpu_mitigations_off() && !__kpti_forced) {
+	if (mitigations_off() && !__kpti_forced) {
 		str = "mitigations=off";
 		__kpti_forced = -1;
 	}
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index 6268a13a1d58..00242edf1885 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -91,7 +91,7 @@ early_param("nospectre_v2", parse_spectre_v2_param);

 static bool spectre_v2_mitigations_off(void)
 {
-	bool ret = __nospectre_v2 || cpu_mitigations_off();
+	bool ret = __nospectre_v2 || mitigations_off();

 	if (ret)
 		pr_info_once("spectre-v2 mitigation disabled by command line option\n");
@@ -421,7 +421,7 @@ early_param("ssbd", parse_spectre_v4_param);
  */
 static bool spectre_v4_mitigations_off(void)
 {
-	bool ret = cpu_mitigations_off() ||
+	bool ret = mitigations_off() ||
 		   __spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DISABLED;

 	if (ret)
@@ -1000,7 +1000,7 @@ void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
 		/* No point mitigating Spectre-BHB alone. */
 	} else if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY)) {
 		pr_info_once("spectre-bhb mitigation disabled by compile time option\n");
-	} else if (cpu_mitigations_off() || __nospectre_bhb) {
+	} else if (mitigations_off() || __nospectre_bhb) {
 		pr_info_once("spectre-bhb mitigation disabled by command line option\n");
 	} else if (supports_ecbhb(SCOPE_LOCAL_CPU)) {
 		state = SPECTRE_MITIGATED;
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 4856e1a5161c..52cf79b5d87a 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -64,7 +64,7 @@ void __init setup_barrier_nospec(void)
 	enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
 		 security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);

-	if (!no_nospec && !cpu_mitigations_off())
+	if (!no_nospec && !mitigations_off())
 		enable_barrier_nospec(enable);
 }

@@ -135,7 +135,7 @@ early_param("nospectre_v2", handle_nospectre_v2);
 #ifdef CONFIG_PPC_E500
 void __init setup_spectre_v2(void)
 {
-	if (no_spectrev2 || cpu_mitigations_off())
+	if (no_spectrev2 || mitigations_off())
 		do_btb_flush_fixups();
 	else
 		btb_flush_enabled = true;
@@ -331,7 +331,7 @@ void setup_stf_barrier(void)

 	stf_enabled_flush_types = type;

-	if (!no_stf_barrier && !cpu_mitigations_off())
+	if (!no_stf_barrier && !mitigations_off())
 		stf_barrier_enable(enable);
 }

@@ -530,7 +530,7 @@ void setup_count_cache_flush(void)
 {
 	bool enable = true;

-	if (no_spectrev2 || cpu_mitigations_off()) {
+	if (no_spectrev2 || mitigations_off()) {
 		if (security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED) ||
 		    security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED))
 			pr_warn("Spectre v2 mitigations not fully under software control, can't disable\n");
@@ -700,13 +700,13 @@ void setup_rfi_flush(enum l1d_flush_type types, bool enable)

 	enabled_flush_types = types;

-	if (!cpu_mitigations_off() && !no_rfi_flush)
+	if (!mitigations_off() && !no_rfi_flush)
 		rfi_flush_enable(enable);
 }

 void setup_entry_flush(bool enable)
 {
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;

 	if (!no_entry_flush)
@@ -715,7 +715,7 @@ void setup_entry_flush(bool enable)

 void setup_uaccess_flush(bool enable)
 {
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;

 	if (!no_uaccess_flush)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c
index d1b16d83e49a..75ec4ad4198b 100644
--- a/arch/s390/kernel/nospec-branch.c
+++ b/arch/s390/kernel/nospec-branch.c
@@ -59,7 +59,7 @@ early_param("nospectre_v2", nospectre_v2_setup_early);

 void __init nospec_auto_detect(void)
 {
-	if (test_facility(156) || cpu_mitigations_off()) {
+	if (test_facility(156) || mitigations_off()) {
 		/*
 		 * The machine supports etokens.
 		 * Disable expolines and disable nobp.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bb0ab8466b91..45d4c2664011 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include

 #include
 #include
@@ -243,7 +244,7 @@ static const char * const mds_strings[] = {

 static void __init mds_select_mitigation(void)
 {
-	if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) {
+	if (!boot_cpu_has_bug(X86_BUG_MDS) || mitigations_off()) {
 		mds_mitigation = MDS_MITIGATION_OFF;
 		return;
 	}
@@ -255,7 +256,7 @@ static void __init mds_select_mitigation(void)
 		static_branch_enable(&mds_user_clear);

 		if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) &&
-		    (mds_nosmt || cpu_mitigations_auto_nosmt()))
+		    (mds_nosmt || mitigations_auto_nosmt()))
 			cpu_smt_disable(false);
 	}
 }
@@ -317,7 +318,7 @@ static void __init taa_select_mitigation(void)
 		return;
 	}

-	if (cpu_mitigations_off()) {
+	if (mitigations_off()) {
 		taa_mitigation = TAA_MITIGATION_OFF;
 		return;
 	}
@@ -358,7 +359,7 @@ static void __init taa_select_mitigation(void)
 	 */
 	static_branch_enable(&mds_user_clear);

-	if (taa_nosmt || cpu_mitigations_auto_nosmt())
+	if (taa_nosmt || mitigations_auto_nosmt())
 		cpu_smt_disable(false);
 }

@@ -408,7 +409,7 @@ static void __init mmio_select_mitigation(void)

 	if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
 	     boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN) ||
-	     cpu_mitigations_off()) {
+	     mitigations_off()) {
 		mmio_mitigation = MMIO_MITIGATION_OFF;
 		return;
 	}
@@ -451,7 +452,7 @@ static void __init mmio_select_mitigation(void)
 	else
 		mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;

-	if (mmio_nosmt || cpu_mitigations_auto_nosmt())
+	if (mmio_nosmt || mitigations_auto_nosmt())
 		cpu_smt_disable(false);
 }

@@ -481,7 +482,7 @@ early_param("mmio_stale_data", mmio_stale_data_parse_cmdline);

 static void __init md_clear_update_mitigation(void)
 {
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;

 	if (!static_key_enabled(&mds_user_clear))
@@ -611,7 +612,7 @@ static void __init srbds_select_mitigation(void)
 		srbds_mitigation = SRBDS_MITIGATION_HYPERVISOR;
 	else if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL))
 		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
-	else if (cpu_mitigations_off() || srbds_off)
+	else if (mitigations_off() || srbds_off)
 		srbds_mitigation = SRBDS_MITIGATION_OFF;

 	update_srbds_msr();
@@ -742,7 +743,7 @@ static void __init gds_select_mitigation(void)
 		goto out;
 	}

-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		gds_mitigation = GDS_MITIGATION_OFF;
 	/* Will verify below that mitigation _can_ be disabled */

@@ -841,7 +842,7 @@ static bool smap_works_speculatively(void)

 static void __init spectre_v1_select_mitigation(void)
 {
-	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1) || cpu_mitigations_off()) {
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1) || mitigations_off()) {
 		spectre_v1_mitigation = SPECTRE_V1_MITIGATION_NONE;
 		return;
 	}
@@ -974,7 +975,7 @@ static void __init retbleed_select_mitigation(void)
 {
 	bool mitigate_smt = false;

-	if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || cpu_mitigations_off())
+	if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || mitigations_off())
 		return;

 	switch (retbleed_cmd) {
@@ -1068,7 +1069,7 @@ static void __init retbleed_select_mitigation(void)
 	}

 	if (mitigate_smt && !boot_cpu_has(X86_FEATURE_STIBP) &&
-	    (retbleed_nosmt || cpu_mitigations_auto_nosmt()))
+	    (retbleed_nosmt || mitigations_auto_nosmt()))
 		cpu_smt_disable(false);

 	/*
@@ -1391,7 +1392,7 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 	int ret, i;

 	if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") ||
-	    cpu_mitigations_off())
+	    mitigations_off())
 		return SPECTRE_V2_CMD_NONE;

 	ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
@@ -1885,7 +1886,7 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
 	int ret, i;

 	if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable") ||
-	    cpu_mitigations_off()) {
+	    mitigations_off()) {
 		return SPEC_STORE_BYPASS_CMD_NONE;
 	} else {
 		ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
@@ -2283,9 +2284,9 @@ static void __init l1tf_select_mitigation(void)
 	if (!boot_cpu_has_bug(X86_BUG_L1TF))
 		return;

-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		l1tf_mitigation = L1TF_MITIGATION_OFF;
-	else if (cpu_mitigations_auto_nosmt())
+	else if (mitigations_auto_nosmt())
 		l1tf_mitigation = L1TF_MITIGATION_FLUSH_NOSMT;

 	override_cache_bits(&boot_cpu_data);
@@ -2410,7 +2411,7 @@ static void __init srso_select_mitigation(void)
 {
 	bool has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE);

-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;

 	if (!boot_cpu_has_bug(X86_BUG_SRSO)) {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0b1f991b9a31..f0d105f740ed 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6819,7 +6819,7 @@ static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp)
 static bool get_nx_auto_mode(void)
 {
 	/* Return true when CPU has the bug, and mitigations are ON */
-	return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !cpu_mitigations_off();
+	return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !mitigations_off();
 }

 static void __set_nx_huge_pages(bool val)
diff --git
a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c index 669ba1c345b3..16a63c241e1e 100644 --- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -84,7 +85,7 @@ void __init pti_check_boottime_disable(void) return; } - if (cpu_mitigations_off()) + if (mitigations_off()) pti_mode = PTI_FORCE_OFF; if (pti_mode == PTI_FORCE_OFF) { pti_print_if_insecure("disabled on command line."); diff --git a/include/linux/bpf.h b/include/linux/bpf.h index e30100597d0a..04356b9fa82a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -30,6 +30,7 @@ #include #include #include +#include struct bpf_verifier_env; struct bpf_verifier_log; @@ -2214,12 +2215,12 @@ static inline bool bpf_allow_uninit_stack(void) static inline bool bpf_bypass_spec_v1(void) { - return cpu_mitigations_off() || perfmon_capable(); + return mitigations_off() || perfmon_capable(); } static inline bool bpf_bypass_spec_v4(void) { - return cpu_mitigations_off() || perfmon_capable(); + return mitigations_off() || perfmon_capable(); } int bpf_map_new_fd(struct bpf_map *map, int flags); diff --git a/include/linux/cpu.h b/include/linux/cpu.h index fc8094419084..b8c81d924a62 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -212,7 +212,4 @@ void cpuhp_report_idle_dead(void); static inline void cpuhp_report_idle_dead(void) { } #endif /* #ifdef CONFIG_HOTPLUG_CPU */ -extern bool cpu_mitigations_off(void); -extern bool cpu_mitigations_auto_nosmt(void); - #endif /* _LINUX_CPU_H_ */ diff --git a/include/linux/mitigations.h b/include/linux/mitigations.h new file mode 100644 index 000000000000..5acc80d49230 --- /dev/null +++ b/include/linux/mitigations.h @@ -0,0 +1,4 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +extern bool mitigations_off(void); +extern bool mitigations_auto_nosmt(void); diff --git a/kernel/Makefile b/kernel/Makefile index ce105a5558fc..d1514432bbc7 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -10,7 +10,8 @@ obj-y = fork.o 
exec_domain.o panic.o \ extable.o params.o \ kthread.o sys_ni.o nsproxy.o \ notifier.o ksysfs.o cred.o reboot.o \ - async.o range.o smpboot.o ucount.o regset.o ksyms_common.o + async.o range.o smpboot.o ucount.o regset.o ksyms_common.o \ + mitigations.o obj-$(CONFIG_USERMODE_DRIVER) += usermode_driver.o obj-$(CONFIG_MULTIUSER) += groups.o diff --git a/kernel/cpu.c b/kernel/cpu.c index e6ec3ba4950b..e273478cd437 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -3195,46 +3195,3 @@ void __init boot_cpu_hotplug_init(void) this_cpu_write(cpuhp_state.state, CPUHP_ONLINE); this_cpu_write(cpuhp_state.target, CPUHP_ONLINE); } - -/* - * These are used for a global "mitigations=" cmdline option for toggling - * optional CPU mitigations. - */ -enum cpu_mitigations { - CPU_MITIGATIONS_OFF, - CPU_MITIGATIONS_AUTO, - CPU_MITIGATIONS_AUTO_NOSMT, -}; - -static enum cpu_mitigations cpu_mitigations __ro_after_init = - CPU_MITIGATIONS_AUTO; - -static int __init mitigations_parse_cmdline(char *arg) -{ - if (!strcmp(arg, "off")) - cpu_mitigations = CPU_MITIGATIONS_OFF; - else if (!strcmp(arg, "auto")) - cpu_mitigations = CPU_MITIGATIONS_AUTO; - else if (!strcmp(arg, "auto,nosmt")) - cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT; - else - pr_crit("Unsupported mitigations=%s, system may still be vulnerable\n", - arg); - - return 0; -} -early_param("mitigations", mitigations_parse_cmdline); - -/* mitigations=off */ -bool cpu_mitigations_off(void) -{ - return cpu_mitigations == CPU_MITIGATIONS_OFF; -} -EXPORT_SYMBOL_GPL(cpu_mitigations_off); - -/* mitigations=auto,nosmt */ -bool cpu_mitigations_auto_nosmt(void) -{ - return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT; -} -EXPORT_SYMBOL_GPL(cpu_mitigations_auto_nosmt); diff --git a/kernel/mitigations.c b/kernel/mitigations.c new file mode 100644 index 000000000000..2828a755a719 --- /dev/null +++ b/kernel/mitigations.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include + +enum mitigations 
{ + MITIGATIONS_OFF, + MITIGATIONS_AUTO, + MITIGATIONS_AUTO_NOSMT, +}; + +static enum mitigations mitigations __ro_after_init = + MITIGATIONS_AUTO; + +/* + * The "mitigations=" cmdline option is for toggling optional CPU or software + * mitigations which may impact performance. Mitigations should only be turned + * off if user space and VMs are running trusted code. + */ +static int __init mitigations_parse_cmdline(char *arg) +{ + if (!strcmp(arg, "off")) + mitigations = MITIGATIONS_OFF; + else if (!strcmp(arg, "auto")) + mitigations = MITIGATIONS_AUTO; + else if (!strcmp(arg, "auto,nosmt")) + mitigations = MITIGATIONS_AUTO_NOSMT; + else + pr_crit("Unsupported mitigations=%s, system may still be vulnerable\n", arg); + + return 0; +} +early_param("mitigations", mitigations_parse_cmdline); + +/* mitigations=off */ +bool mitigations_off(void) +{ + return mitigations == MITIGATIONS_OFF; +} +EXPORT_SYMBOL_GPL(mitigations_off); + +/* mitigations=auto,nosmt */ +bool mitigations_auto_nosmt(void) +{ + return mitigations == MITIGATIONS_AUTO_NOSMT; +} +EXPORT_SYMBOL_GPL(mitigations_auto_nosmt);
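
Not part of the patch: for anyone who wants to poke at the option handling outside the kernel, the logic in kernel/mitigations.c can be approximated in userspace. The sketch below rests on several assumptions. parse_mitigations() is a hypothetical stand-in for mitigations_parse_cmdline(); it reports an unrecognized value through its return value and stderr rather than pr_crit(), and __ro_after_init, early_param() and EXPORT_SYMBOL_GPL() are dropped because they have no userspace equivalent.

```c
/* Userspace sketch only; mirrors the mode selection in kernel/mitigations.c. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum mitigations {
	MITIGATIONS_OFF,
	MITIGATIONS_AUTO,
	MITIGATIONS_AUTO_NOSMT,
};

/* Same default as the patch: mitigations stay on unless told otherwise. */
static enum mitigations mitigations = MITIGATIONS_AUTO;

/*
 * Hypothetical stand-in for mitigations_parse_cmdline(). Returns true when
 * the value is recognized. The kernel handler always returns 0 and only
 * logs; the boolean result is an assumption made for easier testing.
 */
static bool parse_mitigations(const char *arg)
{
	assert(arg != NULL);

	if (!strcmp(arg, "off")) {
		mitigations = MITIGATIONS_OFF;
	} else if (!strcmp(arg, "auto")) {
		mitigations = MITIGATIONS_AUTO;
	} else if (!strcmp(arg, "auto,nosmt")) {
		mitigations = MITIGATIONS_AUTO_NOSMT;
	} else {
		/* Unknown value: keep the current mode, as the patch does. */
		fprintf(stderr,
			"Unsupported mitigations=%s, system may still be vulnerable\n",
			arg);
		return false;
	}

	return true;
}

/* mitigations=off */
static bool mitigations_off(void)
{
	return mitigations == MITIGATIONS_OFF;
}

/* mitigations=auto,nosmt */
static bool mitigations_auto_nosmt(void)
{
	return mitigations == MITIGATIONS_AUTO_NOSMT;
}
```

With this sketch, parse_mitigations("auto,nosmt") makes mitigations_auto_nosmt() return true, and an unrecognized value leaves the previously selected mode in place, which matches the fall-through else branch in the patch.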