From patchwork Thu May 30 15:23:38 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13680558
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/7] io_uring/msg_ring: split fd installing into a helper
Date: Thu, 30 May 2024 09:23:38 -0600
Message-ID: <20240530152822.535791-3-axboe@kernel.dk>
In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk>
References: <20240530152822.535791-2-axboe@kernel.dk>

No functional changes in this patch, just in preparation for needing to
complete the fd install with the ctx lock already held.

Signed-off-by: Jens Axboe
---
 io_uring/msg_ring.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index 81c4a9d43729..feff2b0822cf 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -173,25 +173,23 @@ static struct file *io_msg_grab_file(struct io_kiocb *req, unsigned int issue_fl
 	return file;
 }
 
-static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flags)
+static int __io_msg_install_complete(struct io_kiocb *req)
 {
 	struct io_ring_ctx *target_ctx = req->file->private_data;
 	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
 	struct file *src_file = msg->src_file;
 	int ret;
 
-	if (unlikely(io_double_lock_ctx(target_ctx, issue_flags)))
-		return -EAGAIN;
-
 	ret = __io_fixed_fd_install(target_ctx, src_file, msg->dst_fd);
 	if (ret < 0)
-		goto out_unlock;
+		return ret;
 
 	msg->src_file = NULL;
 	req->flags &= ~REQ_F_NEED_CLEANUP;
 
 	if (msg->flags & IORING_MSG_RING_CQE_SKIP)
-		goto out_unlock;
+		return ret;
+
 	/*
 	 * If this fails, the target still received the file descriptor but
 	 * wasn't notified of the fact. This means that if this request
@@ -199,8 +197,20 @@ static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flag
 	 * later IORING_OP_MSG_RING delivers the message.
 	 */
 	if (!io_post_aux_cqe(target_ctx, msg->user_data, ret, 0))
-		ret = -EOVERFLOW;
-out_unlock:
+		return -EOVERFLOW;
+
+	return ret;
+}
+
+static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_ring_ctx *target_ctx = req->file->private_data;
+	int ret;
+
+	if (unlikely(io_double_lock_ctx(target_ctx, issue_flags)))
+		return -EAGAIN;
+
+	ret = __io_msg_install_complete(req);
 	io_double_unlock_ctx(target_ctx);
 	return ret;
 }

From patchwork Thu May 30 15:23:39 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13680559
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/7] io_uring/msg_ring: tighten requirement for remote posting
Date: Thu, 30 May 2024 09:23:39 -0600
Message-ID: <20240530152822.535791-4-axboe@kernel.dk>
In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk>
References: <20240530152822.535791-2-axboe@kernel.dk>

Currently this is gated on whether or not the target ring needs a local
completion - and if so, whether or not we're running on the right task.
The use case for same thread cross posting is probably a lot less
relevant than remote posting. And since we're going to improve this
situation anyway, just gate it on local posting and ignore what task
we're currently running on.
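
As a rough illustration of the gating change (a user-space sketch, not
kernel code): `struct ring_model` and the integer task ids below are
hypothetical stand-ins for `struct io_ring_ctx` and `current`, assumed
only for this example. The old check took the remote path only when the
target needed local completion and the caller was not the target's
submitter; the new check ignores the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two io_ring_ctx fields the check reads. */
struct ring_model {
	bool task_complete;	/* target ring needs local completion */
	int submitter_task;	/* id of the task that owns the target ring */
};

/* Old gate: remote only if local completion is needed AND we are a foreign task. */
static bool need_remote_old(const struct ring_model *target, int current_task)
{
	if (!target->task_complete)
		return false;
	return current_task != target->submitter_task;
}

/* New gate: remote whenever local completion is needed, whoever is calling. */
static bool need_remote_new(const struct ring_model *target)
{
	return target->task_complete;
}
```

Under this model the two checks differ in exactly one case: a sender
running on the target's own submitter task, with `task_complete` set,
previously posted directly and now takes the remote path.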

Signed-off-by: Jens Axboe
---
 io_uring/msg_ring.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index feff2b0822cf..15e7bda77d0d 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -68,9 +68,7 @@ void io_msg_ring_cleanup(struct io_kiocb *req)
 
 static inline bool io_msg_need_remote(struct io_ring_ctx *target_ctx)
 {
-	if (!target_ctx->task_complete)
-		return false;
-	return current != target_ctx->submitter_task;
+	return target_ctx->task_complete;
 }
 
 static int io_msg_exec_remote(struct io_kiocb *req, task_work_func_t func)

From patchwork Thu May 30 15:23:40 2024
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13680560
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 3/7] io_uring/msg_ring: avoid double indirection task_work for data messages
Date: Thu, 30 May 2024 09:23:40 -0600
Message-ID: <20240530152822.535791-5-axboe@kernel.dk>
In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk>
References: <20240530152822.535791-2-axboe@kernel.dk>

If IORING_SETUP_DEFER_TASKRUN is set, then we can't post CQEs remotely
to the target ring. Instead, task_work is queued for the target ring,
which is used to post the CQE. To make matters worse, once the target
CQE has been posted, task_work is then queued with the originator to
fill the completion.

This obviously adds a bunch of overhead and latency. Instead of relying
on generic kernel task_work for this, fill an overflow entry on the
target ring and flag it as such that the target ring will flush it. This
avoids both the task_work for posting the CQE, and it means that the
originator CQE can be filled inline as well.

In local testing, this reduces the latency on the sender side by 5-6x.
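
The overflow-entry approach described above can be sketched in user
space roughly as follows. This is a simplified model under stated
assumptions, not the kernel implementation: `struct ocqe_model`,
`struct target_model`, `fill_remote()` and `flush_overflow()` are
hypothetical names, and the completion lock and the wakeup of the
target task are left out. The sender appends an entry to the target's
overflow list and raises a flag on the empty to non-empty transition;
the target later drains the list itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of one overflow CQE queued for the target ring. */
struct ocqe_model {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
	struct ocqe_model *next;
};

/* Hypothetical model of the target ring state the sender touches. */
struct target_model {
	struct ocqe_model *head;
	struct ocqe_model *tail;
	bool check_overflow;	/* stands in for IO_CHECK_CQ_OVERFLOW_BIT */
};

/* Sender side: queue an entry on the target instead of using task_work. */
static int fill_remote(struct target_model *t, uint64_t user_data,
		       int32_t res, uint32_t flags)
{
	struct ocqe_model *e = malloc(sizeof(*e));

	if (!e)
		return -1;	/* the patch returns -ENOMEM in this case */

	/* Flag the target only on the empty -> non-empty transition. */
	if (!t->head)
		t->check_overflow = true;

	e->user_data = user_data;
	e->res = res;
	e->flags = flags;
	e->next = NULL;
	if (t->tail)
		t->tail->next = e;
	else
		t->head = e;
	t->tail = e;
	return 0;
}

/* Target side: drain the list in FIFO order and clear the flag. */
static unsigned int flush_overflow(struct target_model *t)
{
	unsigned int flushed = 0;

	while (t->head) {
		struct ocqe_model *e = t->head;

		t->head = e->next;
		free(e);	/* a real flush copies the entry into the CQ ring first */
		flushed++;
	}
	t->tail = NULL;
	t->check_overflow = false;
	return flushed;
}
```

In this model the sender never waits on the target: `fill_remote()`
returns as soon as the entry is queued, which is the property that lets
the originator's own completion be filled inline.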

Signed-off-by: Jens Axboe
---
 io_uring/msg_ring.c | 87 ++++++++++++++++++++++++++++-----------------
 1 file changed, 55 insertions(+), 32 deletions(-)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index 15e7bda77d0d..bdb935ef7aa2 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -87,38 +87,61 @@ static int io_msg_exec_remote(struct io_kiocb *req, task_work_func_t func)
 	return IOU_ISSUE_SKIP_COMPLETE;
 }
 
-static void io_msg_tw_complete(struct callback_head *head)
+static struct io_overflow_cqe *io_alloc_overflow(struct io_ring_ctx *target_ctx)
 {
-	struct io_msg *msg = container_of(head, struct io_msg, tw);
-	struct io_kiocb *req = cmd_to_io_kiocb(msg);
-	struct io_ring_ctx *target_ctx = req->file->private_data;
-	int ret = 0;
-
-	if (current->flags & PF_EXITING) {
-		ret = -EOWNERDEAD;
-	} else {
-		u32 flags = 0;
-
-		if (msg->flags & IORING_MSG_RING_FLAGS_PASS)
-			flags = msg->cqe_flags;
-
-		/*
-		 * If the target ring is using IOPOLL mode, then we need to be
-		 * holding the uring_lock for posting completions. Other ring
-		 * types rely on the regular completion locking, which is
-		 * handled while posting.
-		 */
-		if (target_ctx->flags & IORING_SETUP_IOPOLL)
-			mutex_lock(&target_ctx->uring_lock);
-		if (!io_post_aux_cqe(target_ctx, msg->user_data, msg->len, flags))
-			ret = -EOVERFLOW;
-		if (target_ctx->flags & IORING_SETUP_IOPOLL)
-			mutex_unlock(&target_ctx->uring_lock);
+	bool is_cqe32 = target_ctx->flags & IORING_SETUP_CQE32;
+	size_t cqe_size = sizeof(struct io_overflow_cqe);
+	struct io_overflow_cqe *ocqe;
+
+	if (is_cqe32)
+		cqe_size += sizeof(struct io_uring_cqe);
+
+	ocqe = kmalloc(cqe_size, GFP_ATOMIC | __GFP_ACCOUNT);
+	if (!ocqe)
+		return NULL;
+
+	if (is_cqe32)
+		ocqe->cqe.big_cqe[0] = ocqe->cqe.big_cqe[1] = 0;
+
+	return ocqe;
+}
+
+/*
+ * Entered with the target uring_lock held, and will drop it before
+ * returning. Adds a previously allocated ocqe to the overflow list on
+ * the target, and marks it appropriately for flushing.
+ */ +static void io_msg_add_overflow(struct io_msg *msg, + struct io_ring_ctx *target_ctx, + struct io_overflow_cqe *ocqe, int ret, + u32 flags) + __releases(&target_ctx->completion_lock) +{ + if (list_empty(&target_ctx->cq_overflow_list)) { + set_bit(IO_CHECK_CQ_OVERFLOW_BIT, &target_ctx->check_cq); + atomic_or(IORING_SQ_TASKRUN, &target_ctx->rings->sq_flags); } - if (ret < 0) - req_set_fail(req); - io_req_queue_tw_complete(req, ret); + ocqe->cqe.user_data = msg->user_data; + ocqe->cqe.res = ret; + ocqe->cqe.flags = flags; + list_add_tail(&ocqe->list, &target_ctx->cq_overflow_list); + spin_unlock(&target_ctx->completion_lock); + wake_up_state(target_ctx->submitter_task, TASK_INTERRUPTIBLE); +} + +static int io_msg_fill_remote(struct io_msg *msg, unsigned int issue_flags, + struct io_ring_ctx *target_ctx, u32 flags) +{ + struct io_overflow_cqe *ocqe; + + ocqe = io_alloc_overflow(target_ctx); + if (!ocqe) + return -ENOMEM; + + spin_lock(&target_ctx->completion_lock); + io_msg_add_overflow(msg, target_ctx, ocqe, msg->len, flags); + return 0; } static int io_msg_ring_data(struct io_kiocb *req, unsigned int issue_flags) @@ -135,12 +158,12 @@ static int io_msg_ring_data(struct io_kiocb *req, unsigned int issue_flags) if (target_ctx->flags & IORING_SETUP_R_DISABLED) return -EBADFD; - if (io_msg_need_remote(target_ctx)) - return io_msg_exec_remote(req, io_msg_tw_complete); - if (msg->flags & IORING_MSG_RING_FLAGS_PASS) flags = msg->cqe_flags; + if (io_msg_need_remote(target_ctx)) + return io_msg_fill_remote(msg, issue_flags, target_ctx, flags); + ret = -EOVERFLOW; if (target_ctx->flags & IORING_SETUP_IOPOLL) { if (unlikely(io_double_lock_ctx(target_ctx, issue_flags))) From patchwork Thu May 30 15:23:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13680561 Received: from mail-oi1-f171.google.com (mail-oi1-f171.google.com [209.85.167.171]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1765176253 for ; Thu, 30 May 2024 15:28:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082922; cv=none; b=LEuV+k98qo8qvTZ+sTdz+uZgzwUUt9GyASU6QckJuvXNKrcQMxOw39GAFBLel3YMEcdvJvILqM0wbRSqPWblRrTXIRGhf8mcjX5ZD6wKsgJiI057Q0eqkDW6n6Ru0mC52Xd1gUZNqzmsQtgYWYyaWomGndejKf5EaRF2aDU5sW8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082922; c=relaxed/simple; bh=n+m+CwAvSa1kXq2eCZNOoaL2fkYdDcUOpkvs/F1yy/o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bQKv5S86b00WaRH2i+c7srVkFT4bhOwzghoE3lXEnO8OEqE6d+Y2Yq/ocpYVFULkRqVmOupNvn9v6T09Cg5jkctdM7vhrncLS5iS1jwmee2pbHsItwt9WyRD0sCKC8SXWXu8ijhkiFvIOlFE1Mdsv+QvUejqQksObGoAPad3aV8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=nsc4Uql2; arc=none smtp.client-ip=209.85.167.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="nsc4Uql2" Received: by mail-oi1-f171.google.com with SMTP id 5614622812f47-3d1dbf0d2deso114939b6e.0 for ; Thu, 30 May 2024 08:28:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1717082919; x=1717687719; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=f8xHpPp6VwiyLi53MsP31sWZpcYRDe2M7wr7hfpOP6E=; b=nsc4Uql2VyTXxXit2bW4syXu1yzzUrMd+HwRsb8lqFrVOvFnMbQyDflRvkcwzpCkk0 /3IZNNtJyqBKwlq1NR6/oEncWuTnBKHmvD/wy9tEHvt/aPBFqYJKjc3NGwm2MrYbD3pi ucUtOnmnKomDecYe9pz2BV5ZAMhS1SOKr88EtL6q6vs/EDE4riDL72NrsuvWqUBsh+XT 0KKZsTmjzz04q/01DXaE+RGl4RWThI996ihuLQwFqYqKR3y/gqsXRwQh9EwdkQB8GKzO zYzrq7jm6ADcsxrw+MkgHRS5ZRvev8xKyBQz/yyp4hb1hNJh8r6vyjVZij5D9fgJ9O0O zyUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717082919; x=1717687719; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=f8xHpPp6VwiyLi53MsP31sWZpcYRDe2M7wr7hfpOP6E=; b=mIuMHYlODOqLtFsagJBLfhO9wIDbY1kDemB9EEAuLyGtO6xWtPcFHeYCh7J9Hwcy/x 3sNwH4lYNdkczdbiyNUfcEKqbF/1wSI2/x8+5nQn+NJYSGw60UY/KP6OHtxt1YZxWIpZ 6NGVf8bxXOhD/CCvGgojvLunt/eZRc3jyjgzSRehrp05v1w7yTFaiQcLnitP+SicSAZj vJbBwymS33urujuEJk3Dzp6rYA6o1s5BylluJrLcjNI19Tv/s68ld17qPE3bUMekII7T wY6q8E0Tj55OB1hQ4QzW+EAiSH0ynxILr1xvXMk5CCNXgQgRXWgDh6WdbFJBXcbLr7MU WudA== X-Gm-Message-State: AOJu0Yzx3rFLDh+MblphWuUZfk/b1uT5NxZQbizO22wVCsnNUQRuENKN KZQu66SVlrTu7JVGk2vDQfmFpXEzia/Kvd8gxdz6sDqyFRfurY7SS+1btzGeJeic4LzNA2yU6Ni n X-Google-Smtp-Source: AGHT+IFfDHyYZBU2oqNxcVfNkKT0Lwe+19Fw4h5BRC84kFPff4DMGcBpUNWDEFz4Ax2xTHyjHhkg/w== X-Received: by 2002:a05:6808:23cb:b0:3c8:2be1:a65b with SMTP id 5614622812f47-3d1dcbe62c5mr2799134b6e.0.1717082918525; Thu, 30 May 2024 08:28:38 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id 5614622812f47-3d1b3682381sm2008136b6e.2.2024.05.30.08.28.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 May 2024 08:28:37 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 4/7] io_uring/msg_ring: avoid double indirection task_work for fd passing Date: Thu, 30 May 2024 
09:23:41 -0600 Message-ID: <20240530152822.535791-6-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk> References: <20240530152822.535791-2-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Like what was done for MSG_RING data passing avoiding a double task_work roundtrip for IORING_SETUP_DEFER_TASKRUN, implement the same model for fd passing. File descriptor passing is separately locked anyway, so the only remaining issue is CQE posting, just like it was for data passing. And for that, we can use the same approach. Signed-off-by: Jens Axboe --- io_uring/msg_ring.c | 58 +++++++++++++++++++++++++-------------------- 1 file changed, 32 insertions(+), 26 deletions(-) diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c index bdb935ef7aa2..74590e66d7f7 100644 --- a/io_uring/msg_ring.c +++ b/io_uring/msg_ring.c @@ -71,22 +71,6 @@ static inline bool io_msg_need_remote(struct io_ring_ctx *target_ctx) return target_ctx->task_complete; } -static int io_msg_exec_remote(struct io_kiocb *req, task_work_func_t func) -{ - struct io_ring_ctx *ctx = req->file->private_data; - struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg); - struct task_struct *task = READ_ONCE(ctx->submitter_task); - - if (unlikely(!task)) - return -EOWNERDEAD; - - init_task_work(&msg->tw, func); - if (task_work_add(task, &msg->tw, TWA_SIGNAL)) - return -EOWNERDEAD; - - return IOU_ISSUE_SKIP_COMPLETE; -} - static struct io_overflow_cqe *io_alloc_overflow(struct io_ring_ctx *target_ctx) { bool is_cqe32 = target_ctx->flags & IORING_SETUP_CQE32; @@ -236,17 +220,39 @@ static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flag return ret; } -static void io_msg_tw_fd_complete(struct callback_head *head) +static int io_msg_install_remote(struct io_kiocb *req, unsigned int issue_flags, + struct io_ring_ctx *target_ctx) { - struct io_msg *msg = 
container_of(head, struct io_msg, tw); - struct io_kiocb *req = cmd_to_io_kiocb(msg); - int ret = -EOWNERDEAD; + struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg); + bool skip_cqe = msg->flags & IORING_MSG_RING_CQE_SKIP; + struct io_overflow_cqe *ocqe = NULL; + int ret; - if (!(current->flags & PF_EXITING)) - ret = io_msg_install_complete(req, IO_URING_F_UNLOCKED); - if (ret < 0) - req_set_fail(req); - io_req_queue_tw_complete(req, ret); + if (!skip_cqe) { + ocqe = io_alloc_overflow(target_ctx); + if (!ocqe) + return -ENOMEM; + } + + if (unlikely(io_double_lock_ctx(target_ctx, issue_flags))) { + kfree(ocqe); + return -EAGAIN; + } + + ret = __io_fixed_fd_install(target_ctx, msg->src_file, msg->dst_fd); + mutex_unlock(&target_ctx->uring_lock); + + if (ret >= 0) { + msg->src_file = NULL; + req->flags &= ~REQ_F_NEED_CLEANUP; + if (!skip_cqe) { + spin_lock(&target_ctx->completion_lock); + io_msg_add_overflow(msg, target_ctx, ocqe, ret, 0); + return 0; + } + } + kfree(ocqe); + return ret; } static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags) @@ -271,7 +277,7 @@ static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags) } if (io_msg_need_remote(target_ctx)) - return io_msg_exec_remote(req, io_msg_tw_fd_complete); + return io_msg_install_remote(req, issue_flags, target_ctx); return io_msg_install_complete(req, issue_flags); } From patchwork Thu May 30 15:23:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13680562 Received: from mail-oi1-f173.google.com (mail-oi1-f173.google.com [209.85.167.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14AB9176257 for ; Thu, 30 May 2024 15:28:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.173 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1717082923; cv=none; b=o4no0UgT+DmJyw5dhdbtEIu5oktDTO1s4JVFwYZPOH573YNygdpCOXtSXGrHKucz8Unixtsh8kizIEZrSAau3cG+LbQ8Rgsb9j6oLFT+FgLgbsQbRxlx6Zv7GSGem7OZ0iiQqSafa+8vPOabcdEJd9cai3dFGY6lgw/yItQOEKc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082923; c=relaxed/simple; bh=5aRBLTLFFjZCBnQRaWREwbsqDiAByQrqRTQa26ZH0ZU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eP4ODm7X42E19VmHsEHPBHCPn6PGXnwo8aq8jR1GLuGRPpVjI4qgnzZ+AjCwY5sTasd2SZAfTtOZPBbS8cD0ynTEeSQhf6bS4od/4o+2yeTwoXfSEQVzjCqtsj6tSwLSqUAS+NoQF5qq3gc8lkvPWhQuVv6tqY6zsMEDkLSlkFg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=oGNB6Bay; arc=none smtp.client-ip=209.85.167.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="oGNB6Bay" Received: by mail-oi1-f173.google.com with SMTP id 5614622812f47-3d1dbf0d2deso114941b6e.0 for ; Thu, 30 May 2024 08:28:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1717082919; x=1717687719; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/EA1Comx7gvP1IGy42avlgdpDFbynZPpuWfpbA6DKD8=; b=oGNB6Baym2kJEHbZhZO5djvK3rNvBcN3R1bFkuF0lgfkA8EhXOt14MD213us5QiiTA ZqVwQWwQqyCLUGCU6xlw5j5zaLCnXBFl9C06CS1rWk5MJypQ5COmuMfYcU9ir8HFNZ9D 
/1W99s9MSgodjhdldbanYB22rNlEep4rr6qbUkVDnTAA3b4iL1NLYRazpkROqTFxvXqu 2lz+HCyIXhuZ3wxXObz7IEqlKszGxFTfCPLb54QJ3m+T+zSuYe0+XO39Or0PxAQ3HnUW 8zAnnzwJ2h1RM/N4BPNUvboLyKXi5Y8JjzVZLtlVbESvO5Y9a7Mcg6P3BUQBUdRs/Jlq +UoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717082919; x=1717687719; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/EA1Comx7gvP1IGy42avlgdpDFbynZPpuWfpbA6DKD8=; b=CoWqrJtVwrKMWGOTanIhGs16GDEEOKFUvTWSjsm+vmZJzAvEtSjWfeClnfe/VeodvF Q2pBCrvs+SDtV+QdPffNlcquwLJwudl8SqK8aLC2FRRtrRXhT9NvGkaB9ouAxRQqAsQh KnsKre1HmfbnDBshMBDN5wUgMTRQ98vpz5u8uBpaca4WXyvglE10JI8y99BOuuqq//nK afzQ1xBhrZNG2/PmnL8TClte5+MOWx8LJm7hwVzdnc0snoArHxqKdRyvDMHMGunxD7n+ AFDvskJo1rt9kem0t/7/crfmoKnZ2vBIbGwGIjzLjH5lptDYY6O5eGAzvAhUsJUFGkh2 lLkg== X-Gm-Message-State: AOJu0YynFD3H2rwE+WfDshB/6YCNCcYPkmnpNRhuiY3P7jzPFmAcDPMG YXr6zV3YqurToHiKJkO26pZHvn46F9bVsrVuiF8s1VYoI3kvteIu26jjZYzDZ89bp7LDnkLhZRQ N X-Google-Smtp-Source: AGHT+IEEH5Zw4vDaBQbPn/IL+fEFxkAwz45fRS2a26sFCdrgFrkNjbzUmA3kWm2WsHEicGoC5O1csA== X-Received: by 2002:a05:6808:18a7:b0:3d1:e162:10a7 with SMTP id 5614622812f47-3d1e1622b79mr116694b6e.3.1717082919547; Thu, 30 May 2024 08:28:39 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id 5614622812f47-3d1b3682381sm2008136b6e.2.2024.05.30.08.28.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 May 2024 08:28:38 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 5/7] io_uring/msg_ring: add an alloc cache for CQE entries Date: Thu, 30 May 2024 09:23:42 -0600 Message-ID: <20240530152822.535791-7-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk> References: <20240530152822.535791-2-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 io_uring accounts the memory allocated, which is quite expensive. Wrap the allocations and frees in the provided alloc cache framework. The target ctx needs to be locked anyway for posting the overflow entry, so just move the overflow alloc inside that section. Flushing the entries has it locked as well, so io_alloc_cache_put() can be used. In a simple test, most of the overhead of DEFER_TASKRUN message passing ends up being accounting for allocation and free, and with this change it's completely gone. Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 7 ++++ io_uring/io_uring.c | 7 +++- io_uring/msg_ring.c | 67 +++++++++++++++++++++++----------- io_uring/msg_ring.h | 3 ++ 4 files changed, 62 insertions(+), 22 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 91224bbcfa73..0f8fc6070b12 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -357,6 +357,13 @@ struct io_ring_ctx { struct io_alloc_cache futex_cache; #endif + /* + * Unlike the other caches, this one is used by the sender of messages + * to this ring, not by the ring itself. As such, protection for this + * cache is under ->completion_lock, not ->uring_lock. 
+ */ + struct io_alloc_cache msg_cache; + const struct cred *sq_creds; /* cred used for __io_sq_thread() */ struct io_sq_data *sq_data; /* if using sq thread polling */ diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 816e93e7f949..bdb2636dc939 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -95,6 +95,7 @@ #include "futex.h" #include "napi.h" #include "uring_cmd.h" +#include "msg_ring.h" #include "memmap.h" #include "timeout.h" @@ -315,6 +316,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) ret |= io_alloc_cache_init(&ctx->uring_cache, IO_ALLOC_CACHE_MAX, sizeof(struct uring_cache)); ret |= io_futex_cache_init(ctx); + ret |= io_msg_cache_init(ctx); if (ret) goto err; init_completion(&ctx->ref_comp); @@ -351,6 +353,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free); io_alloc_cache_free(&ctx->uring_cache, kfree); io_futex_cache_free(ctx); + io_msg_cache_free(ctx); kfree(ctx->cancel_table.hbs); kfree(ctx->cancel_table_locked.hbs); xa_destroy(&ctx->io_bl_xa); @@ -695,7 +698,8 @@ static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool dying) memcpy(cqe, &ocqe->cqe, cqe_size); } list_del(&ocqe->list); - kfree(ocqe); + if (!io_alloc_cache_put(&ctx->msg_cache, ocqe)) + kfree(ocqe); } if (list_empty(&ctx->cq_overflow_list)) { @@ -2649,6 +2653,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) io_alloc_cache_free(&ctx->rw_cache, io_rw_cache_free); io_alloc_cache_free(&ctx->uring_cache, kfree); io_futex_cache_free(ctx); + io_msg_cache_free(ctx); io_destroy_buffers(ctx); mutex_unlock(&ctx->uring_lock); if (ctx->sq_creds) diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c index 74590e66d7f7..392763f3f090 100644 --- a/io_uring/msg_ring.c +++ b/io_uring/msg_ring.c @@ -11,6 +11,7 @@ #include "io_uring.h" #include "rsrc.h" #include "filetable.h" +#include "alloc_cache.h" #include "msg_ring.h" @@ -73,19 
+74,24 @@ static inline bool io_msg_need_remote(struct io_ring_ctx *target_ctx) static struct io_overflow_cqe *io_alloc_overflow(struct io_ring_ctx *target_ctx) { - bool is_cqe32 = target_ctx->flags & IORING_SETUP_CQE32; - size_t cqe_size = sizeof(struct io_overflow_cqe); struct io_overflow_cqe *ocqe; - if (is_cqe32) - cqe_size += sizeof(struct io_uring_cqe); + ocqe = io_alloc_cache_get(&target_ctx->msg_cache); + if (!ocqe) { + bool is_cqe32 = target_ctx->flags & IORING_SETUP_CQE32; + size_t cqe_size = sizeof(struct io_overflow_cqe); - ocqe = kmalloc(cqe_size, GFP_ATOMIC | __GFP_ACCOUNT); - if (!ocqe) - return NULL; + if (is_cqe32) + cqe_size += sizeof(struct io_uring_cqe); - if (is_cqe32) - ocqe->cqe.big_cqe[0] = ocqe->cqe.big_cqe[1] = 0; + ocqe = kmalloc(cqe_size, GFP_ATOMIC | __GFP_ACCOUNT); + if (!ocqe) + return NULL; + + /* just init at alloc time, won't change */ + if (is_cqe32) + ocqe->cqe.big_cqe[0] = ocqe->cqe.big_cqe[1] = 0; + } return ocqe; } @@ -119,13 +125,16 @@ static int io_msg_fill_remote(struct io_msg *msg, unsigned int issue_flags, { struct io_overflow_cqe *ocqe; + spin_lock(&target_ctx->completion_lock); + ocqe = io_alloc_overflow(target_ctx); - if (!ocqe) - return -ENOMEM; + if (ocqe) { + io_msg_add_overflow(msg, target_ctx, ocqe, msg->len, flags); + return 0; + } - spin_lock(&target_ctx->completion_lock); - io_msg_add_overflow(msg, target_ctx, ocqe, msg->len, flags); - return 0; + spin_unlock(&target_ctx->completion_lock); + return -ENOMEM; } static int io_msg_ring_data(struct io_kiocb *req, unsigned int issue_flags) @@ -228,17 +237,16 @@ static int io_msg_install_remote(struct io_kiocb *req, unsigned int issue_flags, struct io_overflow_cqe *ocqe = NULL; int ret; + if (unlikely(io_double_lock_ctx(target_ctx, issue_flags))) + return -EAGAIN; + if (!skip_cqe) { + spin_lock(&target_ctx->completion_lock); ocqe = io_alloc_overflow(target_ctx); if (!ocqe) return -ENOMEM; } - if (unlikely(io_double_lock_ctx(target_ctx, issue_flags))) { - kfree(ocqe); 
- return -EAGAIN; - } - ret = __io_fixed_fd_install(target_ctx, msg->src_file, msg->dst_fd); mutex_unlock(&target_ctx->uring_lock); @@ -246,12 +254,14 @@ static int io_msg_install_remote(struct io_kiocb *req, unsigned int issue_flags, msg->src_file = NULL; req->flags &= ~REQ_F_NEED_CLEANUP; if (!skip_cqe) { - spin_lock(&target_ctx->completion_lock); io_msg_add_overflow(msg, target_ctx, ocqe, ret, 0); return 0; } } - kfree(ocqe); + if (ocqe) { + spin_unlock(&target_ctx->completion_lock); + kfree(ocqe); + } return ret; } @@ -331,3 +341,18 @@ int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags) io_req_set_res(req, ret, 0); return IOU_OK; } + +int io_msg_cache_init(struct io_ring_ctx *ctx) +{ + size_t size = sizeof(struct io_overflow_cqe); + + if (ctx->flags & IORING_SETUP_CQE32) + size += sizeof(struct io_uring_cqe); + + return io_alloc_cache_init(&ctx->msg_cache, IO_ALLOC_CACHE_MAX, size); +} + +void io_msg_cache_free(struct io_ring_ctx *ctx) +{ + io_alloc_cache_free(&ctx->msg_cache, kfree); +} diff --git a/io_uring/msg_ring.h b/io_uring/msg_ring.h index 3987ee6c0e5f..94f5716d522e 100644 --- a/io_uring/msg_ring.h +++ b/io_uring/msg_ring.h @@ -3,3 +3,6 @@ int io_msg_ring_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags); void io_msg_ring_cleanup(struct io_kiocb *req); + +int io_msg_cache_init(struct io_ring_ctx *ctx); +void io_msg_cache_free(struct io_ring_ctx *ctx); From patchwork Thu May 30 15:23:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13680563 Received: from mail-oi1-f174.google.com (mail-oi1-f174.google.com [209.85.167.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CAD6917625C for ; Thu, 30 May 2024 15:28:42 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082924; cv=none; b=ZDGCcayyTWRN7H11Ege9y4jIZHpkDd4GSvpb65tAEq8P7wOfh14fo5nobwSsxyGuB3Mtjsg+pwuBx3ht/vld46MzCTWEnHuTJb8kKJUSMX/nx70brDm+lltC0lxPc39egZAxmRRJLpWJ9ZN8OxWwP09iRFZ1rM/Uh+ANnY3KaUQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082924; c=relaxed/simple; bh=2fTRsBnz9AVyev+x2Mw9m3Z+5ustOqERm/bvZl6mXL0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uFRhJgTtT0EeNCNEwJZ9rmWksY4KV4cHkTW7o0rUN6JSrIbHddx6efyG63zbTzVOsQjQYOa/xXL+JlIiBzsAGg5Z9XnSxYtnhr/7lXOTpnQQ73lcvC8Xg99X076If7Ii6eaw4kRMQG/FpVburvUYPfqlxYY1y8xhuy+l9+Cp9vc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=oqftRavX; arc=none smtp.client-ip=209.85.167.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="oqftRavX" Received: by mail-oi1-f174.google.com with SMTP id 5614622812f47-3c9bcd57524so92407b6e.3 for ; Thu, 30 May 2024 08:28:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1717082921; x=1717687721; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=W8MtRuHfTNaN4XZ7n4pXixV9Qfzo0rOmpWw0qaqWXrs=; b=oqftRavXv1XT+j3dcfesfMweekX1awcEInJZ/krPEBT9afCTUaFxJc+CGuYvXX3zix 
QRU06FpxAnrDcfcJ/wlBtTq9aAj8poXUfBFoLJ2kHEqATecJ+FPTwC+64bMRXBHGNv+J 9OliSaIoIQX+p6QK3LFWIuFeJTEWRteTh5gY2Yq1Yp92WulRyAyZ4eG00JpbqtIXO+Y3 oe/UPqJa+hcZSueUtDaNgTiHPY01BTqPXLjo6chCkkNweITClkUjAYNsVRrrDVig7g7H IvxsIDMdNvIshGLyKKrUg48n1dNBziij32JshQJnv5AX3V7cO6slw4MJBZVCbthJCbmO 2xIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717082921; x=1717687721; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=W8MtRuHfTNaN4XZ7n4pXixV9Qfzo0rOmpWw0qaqWXrs=; b=Vs0zfvrSFzbYLNnr0pKxhWsEvGvsMjMkp0GcQ8vUrXfu10CRvcDSIFArgVwpMlICrD wfA3b6D1awd+hw2Cwwx/+KY1hGAQ2173/QbcJrTGvrMabxLdsNmVKD87aL8l2d0ovXll jjb/PBBAYO4ot5hu8/h7BZgYP7INlYQXe1Gm0GHdLyoxhccOqMzX+4YpWEI27Xz9pr/r f2Jum8oetcMXNYt9C4QvPrS2onSIx8fBOYYNFDhSc9y/vqMvxCMj7YyOpUck3VJqo/1Q v9DHaZu+RRrI47qzHgoLASUTk9sHKtMIlN+omEA1Lqww7XgjRMOP0hQ0w9Lt6GKXZZq2 FZXA== X-Gm-Message-State: AOJu0Yx7usV0wE775fzRc5LoeHmS6ztIfslO5U1sr5xNIfdnFKFODkzJ 8ozHFTdOCvCXhG+YC4Mr73vedQ5fWW9scvbv+F4jS1ISkJ1vJpepmMLmqWPo3xlPV0nZBgsc4Ce C X-Google-Smtp-Source: AGHT+IF1F9ap2dy5VaM2608Z7NumpOJj6e8alrAV2K37UjbJJmJywaAKzrh+sS1le6Az/h4xoITKNA== X-Received: by 2002:a05:6808:1403:b0:3c9:96cd:5bbf with SMTP id 5614622812f47-3d1dcc97a1cmr2998524b6e.1.1717082921368; Thu, 30 May 2024 08:28:41 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id 5614622812f47-3d1b3682381sm2008136b6e.2.2024.05.30.08.28.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 May 2024 08:28:39 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 6/7] io_uring/msg_ring: remove callback_head from struct io_msg Date: Thu, 30 May 2024 09:23:43 -0600 Message-ID: <20240530152822.535791-8-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk> References: <20240530152822.535791-2-axboe@kernel.dk> 
Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This is now unused, get rid of it. Signed-off-by: Jens Axboe --- io_uring/msg_ring.c | 1 - 1 file changed, 1 deletion(-) diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c index 392763f3f090..5264ba346df8 100644 --- a/io_uring/msg_ring.c +++ b/io_uring/msg_ring.c @@ -22,7 +22,6 @@ struct io_msg { struct file *file; struct file *src_file; - struct callback_head tw; u64 user_data; u32 len; u32 cmd; From patchwork Thu May 30 15:23:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13680564 Received: from mail-oi1-f176.google.com (mail-oi1-f176.google.com [209.85.167.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9327C17624F for ; Thu, 30 May 2024 15:28:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082926; cv=none; b=jVLPgaiMzQbL+e3R3g5DiEXguedu59KPILjCJn8cPZTvIRC752ewzjfgigjqrefY1NRj5/RL9Lo2z+70bO6SyQ5KI9Joe8612SK7bBZu1li/0I9Ips7WeXnL3c+svrutNPoLj9109zodBVIUrkqi+AsItYEecSmELeU7QoAKPYQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717082926; c=relaxed/simple; bh=R2nR5aqjT40TN62fl+jJCzr0JgVCInPf3GEawhRpgzc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DbhM/xb39sJQaWg4bhwuLbF7CzNlY/8czhYc79qHWxa25uBpOM1DZ4//tqNXuOFzn2I3XRUSAoRgCDU2I1Dxx24vy8JWBzGOVzuW7CaGmztY6vt+XjQkH2UP8Hi3eiDTRnit+2bwhQNEdwHBYlcPH0gqcERRROfm5AgpWVcl6sY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com 
header.i=@kernel-dk.20230601.gappssmtp.com header.b=YQQcnAv7; arc=none smtp.client-ip=209.85.167.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="YQQcnAv7" Received: by mail-oi1-f176.google.com with SMTP id 5614622812f47-3d1b5f32065so37670b6e.2 for ; Thu, 30 May 2024 08:28:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1717082923; x=1717687723; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=mL4ig77LMuOf22qhBfVwnNckeSTb3q7YGNu4PwzjitM=; b=YQQcnAv7xD/D5i31VRkh9Bt4J8GLxSJZV+FSuKp8mzd2SuZ8PLx8Mr7fRkZvkQde/f pBI29zZFPaffdA9zUlydQoS0EoucDE80KO1WsCYOrIyi+TFRv0GwcBDKfHe4a+aGBBm8 nbsiXGbWDoZ2cyryySauvrc3L0VaTrziNMrn+6oFqvpKgLEInQm2mxkCs8mFI11bDDI8 H8DUBDriJ2Y4AJ7Qnw1VRC6GhlHMj07BBWZ09MFb3jueMUTN2hJ5gifQfHn2K+pBmNqT 0cg5ypTwU2zyv8yja/6JTqU34eUFTRTzJdsm86DLWSgXPh3xKgU3EYA23jl/GXpnstWy rP7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717082923; x=1717687723; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mL4ig77LMuOf22qhBfVwnNckeSTb3q7YGNu4PwzjitM=; b=K1xrTq9PNg4fmBmFr/I77hmcZTCVYDNIA6yOR3WlWbzNJEeQz0yVBc7U0VuuiOh1n9 bJ25MGFYJeh/4dPMaWsEOstbTJkClacyMNbwc/rPFc29ftJCnqqJAufyLYI+JRKfxh2u xKITqe2yNXrsGzcqV6CuUynEdh4ORZoUlEg/a6di3ajW4AZvbUl6gd/fzuPpeVMAPv5j qBNbI2uOZIhqtNeWfgewwlIVM8Nts/SoGeKJkGZEtx0ODQf2ngPH+9075kbMP5FEy4PL nsKD6lGMSy9nggYLgDsSgPPr5Ez+YmowqjGw3zwGbHxA9a7s9ZTa7G8hMqHsSaupIe44 nolw== X-Gm-Message-State: 
AOJu0YzsPPowGMwApEKSIhZLXBdFskpiMmysOKMzM9jQ1J4ITTY6F0hT /fPG89uSBlf2z1JGE7I0/KGhik/nycGoLaAm6hTfIz7SSk2q2Q2FnKhrsQJrsMwjFNFyJHOqKBI c X-Google-Smtp-Source: AGHT+IEZNpd0Op4s16qP6oP5y5XRAGp9QmySDH0EeCgV5483qGc3SlLjp2RPD4At/cxqapHnQJxc9A== X-Received: by 2002:a05:6808:bd5:b0:3c9:9474:cfda with SMTP id 5614622812f47-3d1dcbea465mr2455724b6e.0.1717082922903; Thu, 30 May 2024 08:28:42 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id 5614622812f47-3d1b3682381sm2008136b6e.2.2024.05.30.08.28.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 May 2024 08:28:41 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 7/7] io_uring/msg_ring: remove non-remote message passing Date: Thu, 30 May 2024 09:23:44 -0600 Message-ID: <20240530152822.535791-9-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240530152822.535791-2-axboe@kernel.dk> References: <20240530152822.535791-2-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Now that the overflow approach works well, there's no need to retain the double locking for direct CQ posting on the target ring. Just have any kind of target ring use the same messaging mechanism. 
Signed-off-by: Jens Axboe --- io_uring/msg_ring.c | 78 +++++---------------------------------------- 1 file changed, 8 insertions(+), 70 deletions(-) diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c index 5264ba346df8..e966ec8757cb 100644 --- a/io_uring/msg_ring.c +++ b/io_uring/msg_ring.c @@ -33,11 +33,6 @@ struct io_msg { u32 flags; }; -static void io_double_unlock_ctx(struct io_ring_ctx *octx) -{ - mutex_unlock(&octx->uring_lock); -} - static int io_double_lock_ctx(struct io_ring_ctx *octx, unsigned int issue_flags) { @@ -66,11 +61,6 @@ void io_msg_ring_cleanup(struct io_kiocb *req) msg->src_file = NULL; } -static inline bool io_msg_need_remote(struct io_ring_ctx *target_ctx) -{ - return target_ctx->task_complete; -} - static struct io_overflow_cqe *io_alloc_overflow(struct io_ring_ctx *target_ctx) { struct io_overflow_cqe *ocqe; @@ -106,6 +96,8 @@ static void io_msg_add_overflow(struct io_msg *msg, u32 flags) __releases(&target_ctx->completion_lock) { + struct task_struct *task = READ_ONCE(target_ctx->submitter_task); + if (list_empty(&target_ctx->cq_overflow_list)) { set_bit(IO_CHECK_CQ_OVERFLOW_BIT, &target_ctx->check_cq); atomic_or(IORING_SQ_TASKRUN, &target_ctx->rings->sq_flags); @@ -116,7 +108,10 @@ static void io_msg_add_overflow(struct io_msg *msg, ocqe->cqe.flags = flags; list_add_tail(&ocqe->list, &target_ctx->cq_overflow_list); spin_unlock(&target_ctx->completion_lock); - wake_up_state(target_ctx->submitter_task, TASK_INTERRUPTIBLE); + if (task) + wake_up_state(task, TASK_INTERRUPTIBLE); + else if (wq_has_sleeper(&target_ctx->cq_wait)) + wake_up(&target_ctx->cq_wait); } static int io_msg_fill_remote(struct io_msg *msg, unsigned int issue_flags, @@ -141,7 +136,6 @@ static int io_msg_ring_data(struct io_kiocb *req, unsigned int issue_flags) struct io_ring_ctx *target_ctx = req->file->private_data; struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg); u32 flags = 0; - int ret; if (msg->src_fd || msg->flags & ~IORING_MSG_RING_FLAGS_PASS) return 
-EINVAL; @@ -153,19 +147,7 @@ static int io_msg_ring_data(struct io_kiocb *req, unsigned int issue_flags) if (msg->flags & IORING_MSG_RING_FLAGS_PASS) flags = msg->cqe_flags; - if (io_msg_need_remote(target_ctx)) - return io_msg_fill_remote(msg, issue_flags, target_ctx, flags); - - ret = -EOVERFLOW; - if (target_ctx->flags & IORING_SETUP_IOPOLL) { - if (unlikely(io_double_lock_ctx(target_ctx, issue_flags))) - return -EAGAIN; - } - if (io_post_aux_cqe(target_ctx, msg->user_data, msg->len, flags)) - ret = 0; - if (target_ctx->flags & IORING_SETUP_IOPOLL) - io_double_unlock_ctx(target_ctx); - return ret; + return io_msg_fill_remote(msg, issue_flags, target_ctx, flags); } static struct file *io_msg_grab_file(struct io_kiocb *req, unsigned int issue_flags) @@ -186,48 +168,6 @@ static struct file *io_msg_grab_file(struct io_kiocb *req, unsigned int issue_fl return file; } -static int __io_msg_install_complete(struct io_kiocb *req) -{ - struct io_ring_ctx *target_ctx = req->file->private_data; - struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg); - struct file *src_file = msg->src_file; - int ret; - - ret = __io_fixed_fd_install(target_ctx, src_file, msg->dst_fd); - if (ret < 0) - return ret; - - msg->src_file = NULL; - req->flags &= ~REQ_F_NEED_CLEANUP; - - if (msg->flags & IORING_MSG_RING_CQE_SKIP) - return ret; - - /* - * If this fails, the target still received the file descriptor but - * wasn't notified of the fact. This means that if this request - * completes with -EOVERFLOW, then the sender must ensure that a - * later IORING_OP_MSG_RING delivers the message. 
- */ - if (!io_post_aux_cqe(target_ctx, msg->user_data, ret, 0)) - return -EOVERFLOW; - - return ret; -} - -static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flags) -{ - struct io_ring_ctx *target_ctx = req->file->private_data; - int ret; - - if (unlikely(io_double_lock_ctx(target_ctx, issue_flags))) - return -EAGAIN; - - ret = __io_msg_install_complete(req); - io_double_unlock_ctx(target_ctx); - return ret; -} - static int io_msg_install_remote(struct io_kiocb *req, unsigned int issue_flags, struct io_ring_ctx *target_ctx) { @@ -285,9 +225,7 @@ static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags) req->flags |= REQ_F_NEED_CLEANUP; } - if (io_msg_need_remote(target_ctx)) - return io_msg_install_remote(req, issue_flags, target_ctx); - return io_msg_install_complete(req, issue_flags); + return io_msg_install_remote(req, issue_flags, target_ctx); } int io_msg_ring_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)