From patchwork Mon Dec 4 17:53:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 13478858
From: Keith Busch
CC: Keith Busch
Subject: [PATCH 1/2] iouring: one capable call per iouring instance
Date: Mon, 4 Dec 2023 09:53:41 -0800
Message-ID: <20231204175342.3418422-1-kbusch@meta.com>
X-Mailer: git-send-email 2.34.1
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

From: Keith Busch

The uring_cmd operation is often used for privileged actions, so drivers
subscribing to this interface check capable() for each command. The
capable() function is not fast path friendly for many kernel configs,
and this can really harm performance. Stash the capable sys admin
attribute in the io_uring context and set a new issue_flag for the
uring_cmd interface.

Signed-off-by: Keith Busch
---
 include/linux/io_uring_types.h | 4 ++++
 io_uring/io_uring.c            | 1 +
 io_uring/uring_cmd.c           | 2 ++
 3 files changed, 7 insertions(+)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index bebab36abce89..d64d6916753f0 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -36,6 +36,9 @@ enum io_uring_cmd_flags {
 	/* set when uring wants to cancel a previously issued command */
 	IO_URING_F_CANCEL = (1 << 11),
 	IO_URING_F_COMPAT = (1 << 12),
+
+	/* ring validated as CAP_SYS_ADMIN capable */
+	IO_URING_F_SYS_ADMIN = (1 << 13),
 };
 
 struct io_wq_work_node {
@@ -240,6 +243,7 @@ struct io_ring_ctx {
 		unsigned int poll_activated: 1;
 		unsigned int drain_disabled: 1;
 		unsigned int compat: 1;
+		unsigned int sys_admin: 1;
 
 		struct task_struct *submitter_task;
 		struct io_rings *rings;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 1d254f2c997de..4aa10b64f539e 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3980,6 +3980,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 		ctx->syscall_iopoll = 1;
 
 	ctx->compat = in_compat_syscall();
+	ctx->sys_admin = capable(CAP_SYS_ADMIN);
 	if (!ns_capable_noaudit(&init_user_ns, CAP_IPC_LOCK))
 		ctx->user = get_uid(current_user());
 
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 8a38b9f75d841..764f0e004aa00 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -164,6 +164,8 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
 		issue_flags |= IO_URING_F_CQE32;
 	if (ctx->compat)
 		issue_flags |= IO_URING_F_COMPAT;
+	if (ctx->sys_admin)
+		issue_flags |= IO_URING_F_SYS_ADMIN;
 	if (ctx->flags & IORING_SETUP_IOPOLL) {
 		if (!file->f_op->uring_cmd_iopoll)
 			return -EOPNOTSUPP;

From patchwork Mon Dec 4 17:53:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Keith Busch
X-Patchwork-Id: 13478857
From: Keith Busch
CC: Keith Busch
Subject: [PATCH 2/2] nvme: use uring_cmd sys_admin flag
Date: Mon, 4 Dec 2023 09:53:42 -0800
Message-ID: <20231204175342.3418422-2-kbusch@meta.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231204175342.3418422-1-kbusch@meta.com>
References: <20231204175342.3418422-1-kbusch@meta.com>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

From: Keith Busch

The nvme passthrough interface through io_uring is intended to be fast,
so we should avoid calling capable() on every I/O. Checking other
permissions first helped reduce this overhead, but it's still called for
many commands. Use the new uring_cmd sys admin issue_flag to see if we
can skip additional checks. The ioctl path won't be able to use this
optimization, but that wasn't considered a fast path anyway.

Signed-off-by: Keith Busch
---
 drivers/nvme/host/ioctl.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 6c5ae820bc0fc..83c0a1170505c 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -11,6 +11,7 @@
 enum {
 	NVME_IOCTL_VEC = (1 << 0),
 	NVME_IOCTL_PARTITION = (1 << 1),
+	NVME_IOCTL_SYS_ADMIN = (1 << 2),
 };
 
 static bool nvme_cmd_allowed(struct nvme_ns *ns, struct nvme_command *c,
@@ -18,6 +19,9 @@ static bool nvme_cmd_allowed(struct nvme_ns *ns, struct nvme_command *c,
 {
 	u32 effects;
 
+	if (flags & NVME_IOCTL_SYS_ADMIN)
+		return true;
+
 	/*
 	 * Do not allow unprivileged passthrough on partitions, as that allows an
 	 * escape from the containment of the partition.
@@ -445,7 +449,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	struct request *req;
 	blk_opf_t rq_flags = REQ_ALLOC_CACHE;
 	blk_mq_req_flags_t blk_flags = 0;
-	int ret;
+	int ret, flags = 0;
 
 	c.common.opcode = READ_ONCE(cmd->opcode);
 	c.common.flags = READ_ONCE(cmd->flags);
@@ -468,7 +472,11 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 	c.common.cdw14 = cpu_to_le32(READ_ONCE(cmd->cdw14));
 	c.common.cdw15 = cpu_to_le32(READ_ONCE(cmd->cdw15));
 
-	if (!nvme_cmd_allowed(ns, &c, 0, ioucmd->file->f_mode & FMODE_WRITE))
+	if (issue_flags & IO_URING_F_SYS_ADMIN)
+		flags |= NVME_IOCTL_SYS_ADMIN;
+
+	if (!nvme_cmd_allowed(ns, &c, flags,
+			      ioucmd->file->f_mode & FMODE_WRITE))
 		return -EACCES;
 
 	d.metadata = READ_ONCE(cmd->metadata);
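
For readers who want to see the idea in the series outside the kernel tree, the following is a minimal userspace sketch, not kernel code. It models the pattern the two patches describe: run the expensive privilege check once when the context is created, cache the result in a one-bit field, and forward it to the driver as an issue flag. Every name here (ring_ctx, ring_create, ring_issue_flags, driver_cmd_allowed, slow_capable, ISSUE_F_SYS_ADMIN) is hypothetical and only stands in for its kernel counterpart. The fallback path is deliberately simplified compared with nvme_cmd_allowed().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical issue flag, standing in for IO_URING_F_SYS_ADMIN. */
enum {
	ISSUE_F_SYS_ADMIN = 1u << 0,
};

/* Counts how often the stand-in for capable() runs. */
static unsigned int capable_calls;

/*
 * Stand-in for capable(CAP_SYS_ADMIN). The real check consults the
 * caller's credentials; here the answer is passed in by the caller.
 */
static bool slow_capable(bool privileged)
{
	capable_calls++;
	return privileged;
}

/* Stand-in for struct io_ring_ctx: caches the check result in one bit. */
struct ring_ctx {
	unsigned int sys_admin : 1;
};

/* Mirrors the io_uring_create() change: check once at setup time. */
static void ring_create(struct ring_ctx *ctx, bool privileged)
{
	assert(ctx);
	ctx->sys_admin = slow_capable(privileged);
}

/* Mirrors the io_uring_cmd() change: turn the cached bit into a flag. */
static unsigned int ring_issue_flags(const struct ring_ctx *ctx)
{
	unsigned int issue_flags = 0;

	if (ctx->sys_admin)
		issue_flags |= ISSUE_F_SYS_ADMIN;
	return issue_flags;
}

/*
 * Mirrors the driver side: return early when the flag is set, otherwise
 * fall back to the per-command check (simplified here to one call).
 */
static bool driver_cmd_allowed(unsigned int issue_flags, bool privileged)
{
	if (issue_flags & ISSUE_F_SYS_ADMIN)
		return true;
	return slow_capable(privileged);
}
```

In this model, a context created by a privileged caller pays for slow_capable() once, however many commands it later issues, while an unprivileged context still pays on every command. Note that the cached bit reflects the task that created the context, which is the same trade-off the series makes by sampling capable() in io_uring_create().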