From patchwork Tue Jul 19 19:56:23 2022
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 12922971
Date: Tue, 19 Jul 2022 12:56:23 -0700
Message-Id: <20220719195628.3415852-1-axelrasmussen@google.com>
Subject: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained access control
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Dave Hansen, "Dmitry V. Levin",
 Gleb Fotengauer-Malinovskiy, Hugh Dickins, Jan Kara, Jonathan Corbet,
 Mel Gorman, Mike Kravetz, Mike Rapoport, Nadav Amit, Peter Xu,
 Shuah Khan, Suren Baghdasaryan, Vlastimil Babka, zhangyi
Cc: Axel Rasmussen, linux-doc@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-kselftest@vger.kernel.org

This series is based on torvalds/master.

The series is split up like so:
- Patch 1 is a simple fixup which we should take in any case (even by itself).
- Patches 2-5 add the feature, configurable selftest support, and docs.

Why not ...?
============

- Why not /proc/[pid]/userfaultfd? The proposed use case for this is for one
  process to open a userfaultfd which can intercept another process' page
  faults. This seems to me like exactly what CAP_SYS_PTRACE is for, though, so
  I think this use case can simply use a syscall without the powers
  CAP_SYS_PTRACE grants being "too much".

- Why not use a syscall? Access to syscalls is generally controlled by
  capabilities. We don't have a capability which grants userfaultfd access
  without also granting other, broader permissions, and adding a new
  capability was rejected [1].

- It's possible an LSM could be used to control access instead. I suspect
  adding a brand new one just for this would be rejected, but I think some
  existing ones like SELinux can be used to filter syscall access. Enabling
  SELinux for large production deployments which don't already use it is
  likely to be a huge undertaking though, and I don't think this use case by
  itself is enough to motivate that kind of architectural change.

Changelog
=========

v3->v4:
- Picked up an Acked-by on 5/5.
- Updated cover letter to cover "why not ...".
- Refactored userfaultfd_allowed() into userfaultfd_syscall_allowed(). [Peter]
- Removed obsolete comment from a previous version. [Peter]
- Refactored userfaultfd_open() in selftest.
  [Peter]
- Reworded admin-guide documentation. [Mike, Peter]
- Squashed 2 commits adding /dev/userfaultfd to selftest and making selftest
  configurable. [Peter]
- Added "syscall" test modifier (the default behavior) to selftest. [Peter]

v2->v3:
- Rebased onto linux-next/akpm-base, in order to be based on top of the
  run_vmtests.sh refactor which was merged previously.
- Picked up some Reviewed-by's.
- Fixed ioctl definition (_IO instead of _IOWR), and stopped using
  compat_ptr_ioctl since it is unneeded for ioctls which don't take a pointer.
- Removed the "handle_kernel_faults" bool, simplifying the code. The result is
  logically equivalent, but simpler.
- Fixed userfaultfd selftest so it returns KSFT_SKIP appropriately.
- Reworded documentation per Shuah's feedback on v2.
- Improved example usage for userfaultfd selftest.

v1->v2:
- Add documentation update.
- Test *both* userfaultfd(2) and /dev/userfaultfd via the selftest.

[1]: https://lore.kernel.org/lkml/686276b9-4530-2045-6bd8-170e5943abe4@schaufler-ca.com/T/

Axel Rasmussen (5):
  selftests: vm: add hugetlb_shared userfaultfd test to run_vmtests.sh
  userfaultfd: add /dev/userfaultfd for fine grained access control
  userfaultfd: selftests: modify selftest to use /dev/userfaultfd
  userfaultfd: update documentation to describe /dev/userfaultfd
  selftests: vm: add /dev/userfaultfd test cases to run_vmtests.sh

 Documentation/admin-guide/mm/userfaultfd.rst | 41 +++++++++++-
 Documentation/admin-guide/sysctl/vm.rst      |  3 +
 fs/userfaultfd.c                             | 69 ++++++++++++++++----
 include/uapi/linux/userfaultfd.h             |  4 ++
 tools/testing/selftests/vm/run_vmtests.sh    | 11 +++-
 tools/testing/selftests/vm/userfaultfd.c     | 69 +++++++++++++++++---
 6 files changed, 169 insertions(+), 28 deletions(-)

---
2.37.0.170.g444d1eabd0-goog