From patchwork Fri Sep 16 21:36:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Metzmacher X-Patchwork-Id: 12978854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1245FECAAA1 for ; Fri, 16 Sep 2022 21:38:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229471AbiIPViK (ORCPT ); Fri, 16 Sep 2022 17:38:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59958 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229655AbiIPViJ (ORCPT ); Fri, 16 Sep 2022 17:38:09 -0400 Received: from hr2.samba.org (hr2.samba.org [IPv6:2a01:4f8:192:486::2:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95BC0BB00C for ; Fri, 16 Sep 2022 14:38:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=samba.org; s=42; h=Message-Id:Date:Cc:To:From; bh=0SwHugQVIVz/sTe+HezYfVSXyJOfkEsY28doul00OeI=; b=GEI6pWdKyA82hSpBHvvbVY6rmo 5sY2aXjBYhutRpMmr9XY/R+4A4K4jbiANCxwo8LBMrIEIykEdwFJBgDujlnqvgbdZ/v8/+T/0Y/MX cR/9d3bXMAz5sqJ+rZxlvS8r/X8aOmq9gnt0JksnYUwHGuezKU0rORCGy5hJXycERiBOpTGOtdVCP kEYQ0b8nrjXG4+lbbalkC1HWrwwqpAwCqNAuDbrTvljBLJbfB89zGwR5HjxJHkWfrDO1j/P6ZVgWm K5BM73Ph/SxIF1uad3J0sK9DuDuVEFSccb0X3K5Fd3SR7/P5JhKVYa9ynZN+vMz2fOcGSj1OCc23k p4mt7MKNMzg7uXnHcj5C2kqaTKcsoYoSzWFNPS2Qta1Yx9dYeSMiNc46M3m+ELczDW+yvQBIFG4E/ udKwTX1AElVvgZi96sJAYfi+2WCGtdQK+DEFThsG430KzDaEHYMsrxVdoUQSOUCbt51m3KuqnibDG FhKsfmm7/e153G5gVGIArf5K; Received: from [127.0.0.2] (localhost [127.0.0.1]) by hr2.samba.org with esmtpsa (TLS1.3:ECDHE_SECP256R1__ECDSA_SECP256R1_SHA256__CHACHA20_POLY1305:256) (Exim) id 1oZJ2E-000j67-9S; Fri, 16 Sep 2022 21:38:06 +0000 From: Stefan Metzmacher To: io-uring@vger.kernel.org, axboe@kernel.dk, 
asml.silence@gmail.com Cc: Stefan Metzmacher Subject: [PATCH 1/5] io_uring/opdef: rename SENDZC_NOTIF to SEND_ZC Date: Fri, 16 Sep 2022 23:36:25 +0200 Message-Id: <8e5cd8616919c92b6c3c7b6ea419fdffd5b97f3c.1663363798.git.metze@samba.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org It's confusing to see the string SENDZC_NOTIF in ftrace output when using IORING_OP_SEND_ZC. Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API") Signed-off-by: Stefan Metzmacher Cc: Pavel Begunkov Cc: Jens Axboe Cc: io-uring@vger.kernel.org Reviewed-by: Pavel Begunkov --- io_uring/opdef.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/io_uring/opdef.c b/io_uring/opdef.c index c61494e0a602..c4dddd0fd709 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -471,7 +471,7 @@ const struct io_op_def io_op_defs[] = { .prep_async = io_uring_cmd_prep_async, }, [IORING_OP_SEND_ZC] = { - .name = "SENDZC_NOTIF", + .name = "SEND_ZC", .needs_file = 1, .unbound_nonreg_file = 1, .pollout = 1, From patchwork Fri Sep 16 21:36:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Metzmacher X-Patchwork-Id: 12978855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D380ECAAD8 for ; Fri, 16 Sep 2022 21:38:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229450AbiIPVio (ORCPT ); Fri, 16 Sep 2022 17:38:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229655AbiIPVin (ORCPT ); Fri, 16 Sep 2022 17:38:43 -0400 Received: from hr2.samba.org (hr2.samba.org 
[IPv6:2a01:4f8:192:486::2:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3F4B1AB430 for ; Fri, 16 Sep 2022 14:38:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=samba.org; s=42; h=Message-Id:Date:Cc:To:From; bh=5+4hNpMItLNugz0oZGYnGAzDfersrmpF1YEyKQa/3/4=; b=R3uQkQT8jt27VsilIotg/xryuj rZ7ZoM4s6YCDXFWLBvoqqas4fu7EmTE40dQGtgaGpg+sEjWI+KzeFc/8avodtDNgeZzw8CRhons/D rsdZexw0QzUU6+DVKay1sZbZzbUl3brL3gpuGHGkxRSqEz6bmP+wqZMIHk8TSo4W6U4TWm4SMb3qG SrlZ34oIdBHjeqqTgEa1IuPolX3q+ayPyB11Q8xuXBKss+VMnqRfWLwdhOdN7/SPpZgLSWSr54FrD /XhoUQrm6asy1dGCx/FkH6IJ98hC79j0B9/QpjURzs7zIzf4zMbqmvCu4vvZI/bTpuGNy/CJRlzHo ebg7f3szVBRJkqCxKmHdohLtPyZuG4SfCBg+GM5bSrJZ16pDxWL5fXTmWzaxyz7CbAOWr9CpKscPm VpOBTKXb9/8bl6Zp4yzmR81UTAjkG2wKFdYFqLqF7krpsoBQS+Toxw85CFvgBt9Vb2Jw9Mbvhk6ef t/Ng42UK82JfntZWOo3gC/xo; Received: from [127.0.0.2] (localhost [127.0.0.1]) by hr2.samba.org with esmtpsa (TLS1.3:ECDHE_SECP256R1__ECDSA_SECP256R1_SHA256__CHACHA20_POLY1305:256) (Exim) id 1oZJ2l-000j6T-RB; Fri, 16 Sep 2022 21:38:40 +0000 From: Stefan Metzmacher To: io-uring@vger.kernel.org, axboe@kernel.dk, asml.silence@gmail.com Cc: Stefan Metzmacher Subject: [PATCH 2/5] io_uring/core: move io_cqe->fd over from io_cqe->flags to io_cqe->res Date: Fri, 16 Sep 2022 23:36:26 +0200 Message-Id: X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Both flags and res have the same lifetime currently. io_req_set_res() sets both of them. Callers of io_req_set_res() like req_fail_link_node(), io_req_complete_failed() and io_req_task_queue_fail() set io_cqe->res to their caller's value and force flags to 0. The motivation for this change is the next commit, which will let us keep io_cqe->flags even on error. For IORING_OP_SEND_ZC, IORING_CQE_F_MORE needs to be kept even on a generic failure, because userspace needs to know that an IORING_CQE_F_NOTIF will follow. Otherwise the buffers might be reused too early.
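For illustration only, here is a minimal userspace sketch of the struct io_cqe layout this patch proposes. It assumes fixed-width stdint types in place of the kernel's __u64/__s32/__u32, and every sketch_ name is hypothetical. The point it tries to show is that fd and res share storage while flags gets a field of its own, so flags is no longer clobbered when res takes over the slot that held fd.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the post-patch struct io_cqe. */
struct sketch_io_cqe {
	uint64_t user_data;
	/* fd initially, then res for completion */
	union {
		int32_t fd;
		int32_t res;
	};
	/* Separate storage: no longer aliased with fd. */
	uint32_t flags;
};

/* The overall size is unchanged: 8 + 4 + 4 bytes. */
static_assert(sizeof(struct sketch_io_cqe) == 16, "unexpected layout");

/*
 * Models the reset done right before def->issue(): the fd has been
 * resolved already, so its alias res is set to -1 to make sure the
 * stale fd value is not mistaken for a result.
 */
static inline void sketch_reset_res(struct sketch_io_cqe *cqe)
{
	assert(cqe);
	cqe->res = -1;
}
```

Writing res through sketch_reset_res() changes what fd reads back, but leaves flags alone, which is the property the following patch relies on.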
Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API") Signed-off-by: Stefan Metzmacher Cc: Pavel Begunkov Cc: Jens Axboe Cc: io-uring@vger.kernel.org --- include/linux/io_uring_types.h | 6 +++--- io_uring/io_uring.c | 10 ++++++++-- 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 677a25d44d7f..37925db42ae9 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -473,12 +473,12 @@ struct io_task_work { struct io_cqe { __u64 user_data; - __s32 res; - /* fd initially, then cflags for completion */ + /* fd initially, then res for completion */ union { - __u32 flags; int fd; + __s32 res; }; + __u32 flags; }; /* diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index b9640ad5069f..ae69cff94664 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -837,8 +837,6 @@ static void io_preinit_req(struct io_kiocb *req, struct io_ring_ctx *ctx) req->ctx = ctx; req->link = NULL; req->async_data = NULL; - /* not necessary, but safer to zero */ - req->cqe.res = 0; } static void io_flush_cached_locked_reqs(struct io_ring_ctx *ctx, @@ -1574,6 +1572,12 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags) if (!def->audit_skip) audit_uring_entry(req->opcode); + /* + * req->cqe.fd was resolved by io_assign_file + * now make sure its alias req->cqe.res is reset, + * so we don't use that value by accident. 
+ */ + req->cqe.res = -1; ret = def->issue(req, issue_flags); if (!def->audit_skip) @@ -1902,6 +1906,8 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, state->need_plug = false; blk_start_plug_nr_ios(&state->plug, state->submit_nr); } + } else { + req->cqe.fd = -1; } personality = READ_ONCE(sqe->personality); From patchwork Fri Sep 16 21:36:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Metzmacher X-Patchwork-Id: 12978856 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9903ECAAD8 for ; Fri, 16 Sep 2022 21:39:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229655AbiIPVjT (ORCPT ); Fri, 16 Sep 2022 17:39:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229642AbiIPVjS (ORCPT ); Fri, 16 Sep 2022 17:39:18 -0400 Received: from hr2.samba.org (hr2.samba.org [IPv6:2a01:4f8:192:486::2:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67677AB430 for ; Fri, 16 Sep 2022 14:39:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=samba.org; s=42; h=Message-Id:Date:Cc:To:From; bh=ZoSZ9iwEKr6Q7TA8DRWCeyki7nGJWBiXI/c3WJ2ARhY=; b=vN42wpXoAlXhPDNFUEd5tHK9ng /lwuQjv2E7CTfKS8RyjQFkS8Vki0XZBJtQsxMDrBetJ/HvOiPlUb+LoRX7djzDkXv/YGbBchN8vwW 1K2ImtxtZN52mEM6jH2f4WuZ+Dazsj6IpjCbIErvBdthT/XdS/Y6VTDG+/fHgt2xmu9TwUiITy3tV nhPoZ8IBstLOSyaxgG3ip5vAnArqxIQoinl76lRAQTDKtA5eE3brE/KiZMx4rIf8WXRzvNz5Fh6LI Gp2bg8ne81hVYN2nunE1rCDP7X4hWbtrqzT+i1TbTNgM89xHfX4g22rtKCOrx2pZIaYrqZ1kdBvET h6T9/9lH8BOv9qC6CLsF6d2J6bJpIjTNrNBVxBQKLKun0hkpWqxgSa65BPUdZ1lxThJQ/SMZB6fw7 
uZmurOGTBW1OUYHghh+wOOCP1rg/HdEMVFicPNXiQ05w4p71HQjj5oUCT5dLQ7vXBf5Sri5FTU5j9 dEjGkQhY3HNd44B8qMYbZqM6; Received: from [127.0.0.2] (localhost [127.0.0.1]) by hr2.samba.org with esmtpsa (TLS1.3:ECDHE_SECP256R1__ECDSA_SECP256R1_SHA256__CHACHA20_POLY1305:256) (Exim) id 1oZJ3K-000j7J-VW; Fri, 16 Sep 2022 21:39:15 +0000 From: Stefan Metzmacher To: io-uring@vger.kernel.org, axboe@kernel.dk, asml.silence@gmail.com Cc: Stefan Metzmacher Subject: [PATCH 3/5] io_uring/core: keep req->cqe.flags on generic errors Date: Fri, 16 Sep 2022 23:36:27 +0200 Message-Id: <5df304b3cb6eeb412b758ce638a5e129c4d6f6da.1663363798.git.metze@samba.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Soon IORING_OP_SEND_ZC will add IORING_CQE_F_MORE to req->cqe.flags before calling into sock_sendmsg(). Once it has done that, we have to keep that flag even if we hit some generic error later, e.g. in the partial io retry case. Hopefully, passing req->cqe.flags to the inline io_req_set_res() allows the compiler to optimize out the effective req->cqe.flags = req->cqe.flags.
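As a rough model of the behaviour change, the sketch below contrasts a failure helper that forces flags to 0 with one that carries the existing flags over. It is not kernel code: struct sketch_req and the sketch_ functions are hypothetical stand-ins for struct io_kiocb, io_req_set_res() and the generic failure paths, and SKETCH_CQE_F_MORE is a local constant assumed to mirror IORING_CQE_F_MORE.

```c
#include <assert.h>
#include <stdint.h>

/* Local constant assumed to mirror IORING_CQE_F_MORE. */
#define SKETCH_CQE_F_MORE (1U << 1)

/* Hypothetical, heavily reduced stand-in for struct io_kiocb. */
struct sketch_req {
	int32_t res;
	uint32_t flags;
	int failed;
};

/* Stand-in for io_req_set_res(): always writes both fields. */
static inline void sketch_set_res(struct sketch_req *req, int32_t res,
				  uint32_t cflags)
{
	assert(req);
	req->res = res;
	req->flags = cflags;
}

/* Old behaviour: a generic failure drops whatever flags were set. */
static inline void sketch_fail_old(struct sketch_req *req, int32_t res)
{
	req->failed = 1;
	sketch_set_res(req, res, 0);
}

/* New behaviour: a generic failure keeps the flags set so far. */
static inline void sketch_fail_new(struct sketch_req *req, int32_t res)
{
	req->failed = 1;
	sketch_set_res(req, res, req->flags);
}
```

With the old helper, a request that already carried the MORE flag would complete with flags == 0, and userspace would not expect a second cqe. With the new helper the flag survives the error.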
Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API") Signed-off-by: Stefan Metzmacher Cc: Pavel Begunkov Cc: Jens Axboe Cc: io-uring@vger.kernel.org --- io_uring/io_uring.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index ae69cff94664..062edbc04168 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -212,7 +212,7 @@ bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task, static inline void req_fail_link_node(struct io_kiocb *req, int res) { req_set_fail(req); - io_req_set_res(req, res, 0); + io_req_set_res(req, res, req->cqe.flags); } static inline void io_req_add_to_cache(struct io_kiocb *req, struct io_ring_ctx *ctx) @@ -824,7 +824,8 @@ inline void __io_req_complete(struct io_kiocb *req, unsigned issue_flags) void io_req_complete_failed(struct io_kiocb *req, s32 res) { req_set_fail(req); - io_req_set_res(req, res, io_put_kbuf(req, IO_URING_F_UNLOCKED)); + req->cqe.flags |= io_put_kbuf(req, IO_URING_F_UNLOCKED); + io_req_set_res(req, res, req->cqe.flags); io_req_complete_post(req); } @@ -1106,7 +1107,7 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked) void io_req_task_queue_fail(struct io_kiocb *req, int ret) { - io_req_set_res(req, ret, 0); + io_req_set_res(req, ret, req->cqe.flags); req->io_task_work.func = io_req_task_cancel; io_req_task_work_add(req); } @@ -1847,6 +1848,7 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, /* same numerical values with corresponding REQ_F_*, safe to copy */ req->flags = sqe_flags = READ_ONCE(sqe->flags); req->cqe.user_data = READ_ONCE(sqe->user_data); + req->cqe.flags = 0; req->file = NULL; req->rsrc_node = NULL; req->task = current; From patchwork Fri Sep 16 21:36:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Metzmacher X-Patchwork-Id: 12978857 Return-Path: X-Spam-Checker-Version: SpamAssassin 
3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3733EECAAD8 for ; Fri, 16 Sep 2022 21:39:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229642AbiIPVjz (ORCPT ); Fri, 16 Sep 2022 17:39:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229735AbiIPVjy (ORCPT ); Fri, 16 Sep 2022 17:39:54 -0400 Received: from hr2.samba.org (hr2.samba.org [IPv6:2a01:4f8:192:486::2:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5AC35183A0 for ; Fri, 16 Sep 2022 14:39:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=samba.org; s=42; h=Message-Id:Date:Cc:To:From; bh=7r/3CRTEJ4Ti2LV8Jee1aCkQwPQjB3aQ4I+qL2BWcc0=; b=EarU7R68DsdiHPZoM65ZA7RKV5 XqZls+pW6u+xZuPwdmhjl8JwCwTlr/Kx8GYrW1S04fQxNlGvK5881zSC4p4TRHRAUWzumkPMrl3w9 1IsTuSscl3K9ICwjXN0HOVq8KUzwDyhrpsjuXFX/FyPD26S/FU67Q5re8fnowDZin438G3AOkQDDg 0lzq8TifPd9H5YipX6tkOZVxWyTN0sSUrAayvjgbRUp7RZUYkUWrlRClNJrb+JoKNfdeDjnd+JWis YOZ4g76LrNnq4hXuP/AaNKpGV5XQ2AKl/ilc2mFqh2IEW0Ex91lhW4Zh39Ce/JB7bADjwBH0PYHA6 8MKnxRw7OlgkPUCqQI3Ffx4wandrWUfGc4eA7YZ4KqQVkXEEPJ0OrpPx3ffb7I2uObDqZ5457dW5h 1GRVpKBI+Pd8X/vMz414/Z2QKlNlCXTWKz+ho/YH0SpPvbIKl+k1ou9OGURx0wda97YjV6yC8Q6vc wyOVzG659kNXxOaENf/eWGez; Received: from [127.0.0.2] (localhost [127.0.0.1]) by hr2.samba.org with esmtpsa (TLS1.3:ECDHE_SECP256R1__ECDSA_SECP256R1_SHA256__CHACHA20_POLY1305:256) (Exim) id 1oZJ3u-000j7V-QP; Fri, 16 Sep 2022 21:39:51 +0000 From: Stefan Metzmacher To: io-uring@vger.kernel.org, axboe@kernel.dk, asml.silence@gmail.com Cc: Stefan Metzmacher Subject: [PATCH 4/5] io_uring/net: let io_sendzc set IORING_CQE_F_MORE before sock_sendmsg() Date: Fri, 16 Sep 2022 23:36:28 +0200 Message-Id: <88c6e27ee0b4a945ccbf347d354cccf862936f55.1663363798.git.metze@samba.org> 
X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org sock_sendmsg() can take references to the passed buffers even on failure! So we need to make sure we'll set IORING_CQE_F_MORE before calling sock_sendmsg(). As REQ_F_CQE_SKIP for notif and IORING_CQE_F_MORE for the main request go hand in hand, let's simplify the REQ_F_CQE_SKIP logic too. We just start with REQ_F_CQE_SKIP set and reset it when we set IORING_CQE_F_MORE on the main request in order to have the transition in one isolated place. In the future we might be able to revert IORING_CQE_F_MORE and !REQ_F_CQE_SKIP again if we find out that no reference was taken by the network layer. But that's a change for another day. The important thing would just be that the documentation for IORING_OP_SEND_ZC would indicate that the kernel may decide to return just a single cqe without IORING_CQE_F_MORE, even in the success case, so that userspace would not break when we add such an optimization at a later point.
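To make the userspace contract described above concrete, here is a hedged sketch of how an application might track buffer ownership for one IORING_OP_SEND_ZC request. The types and function are hypothetical, and the two flag constants are local definitions assumed to mirror IORING_CQE_F_MORE and IORING_CQE_F_NOTIF. The only rule it encodes is the one stated above: a first cqe with MORE set means a NOTIF cqe will follow, regardless of whether res is an error, and a first cqe without MORE is the only cqe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local constants assumed to mirror the io_uring uapi flags. */
#define SKETCH_CQE_F_MORE  (1U << 1)
#define SKETCH_CQE_F_NOTIF (1U << 3)

/* Hypothetical reduced view of a completion entry. */
struct sketch_cqe {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

/* Hypothetical per-request bookkeeping kept by the application. */
struct sketch_send_state {
	bool buf_in_flight; /* kernel may still reference the buffer */
	int32_t send_res;   /* result of the send itself */
};

/*
 * Feed one cqe of a SEND_ZC request into the state.
 * Returns true once the buffer may be reused or freed.
 */
static bool sketch_handle_send_zc_cqe(struct sketch_send_state *st,
				      const struct sketch_cqe *cqe)
{
	assert(st && cqe);

	if (cqe->flags & SKETCH_CQE_F_NOTIF) {
		/* Second cqe: the kernel dropped its buffer references. */
		st->buf_in_flight = false;
		return true;
	}

	/* First cqe: record the result, even if it is an error. */
	st->send_res = cqe->res;
	/* MORE decides whether a NOTIF follows, not the sign of res. */
	st->buf_in_flight = (cqe->flags & SKETCH_CQE_F_MORE) != 0;
	return !st->buf_in_flight;
}
```

Keying the decision on the MORE flag rather than on res >= 0 also keeps such code working if the kernel later returns a single cqe without MORE in the success case.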
Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API") Signed-off-by: Stefan Metzmacher Cc: Pavel Begunkov Cc: Jens Axboe Cc: io-uring@vger.kernel.org --- io_uring/net.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/io_uring/net.c b/io_uring/net.c index e9efed40cf3d..61e6194b01b7 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -883,7 +883,6 @@ void io_sendzc_cleanup(struct io_kiocb *req) { struct io_sendzc *zc = io_kiocb_to_cmd(req, struct io_sendzc); - zc->notif->flags |= REQ_F_CQE_SKIP; io_notif_flush(zc->notif); zc->notif = NULL; } @@ -920,6 +919,8 @@ int io_sendzc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) notif->cqe.user_data = req->cqe.user_data; notif->cqe.res = 0; notif->cqe.flags = IORING_CQE_F_NOTIF; + /* skip the notif cqe until we call sock_sendmsg() */ + notif->flags |= REQ_F_CQE_SKIP; req->flags |= REQ_F_NEED_CLEANUP; zc->buf = u64_to_user_ptr(READ_ONCE(sqe->addr)); @@ -1000,7 +1001,7 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags) struct msghdr msg; struct iovec iov; struct socket *sock; - unsigned msg_flags, cflags; + unsigned msg_flags; int ret, min_ret = 0; sock = sock_from_file(req->file); @@ -1055,6 +1056,15 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags) msg.msg_flags = msg_flags; msg.msg_ubuf = &io_notif_to_data(zc->notif)->uarg; msg.sg_from_iter = io_sg_from_iter; + + /* + * Now that we call sock_sendmsg, + * we need to assume that the data is referenced + * even on failure! 
+ * So we need to force a NOTIF cqe + */ + zc->notif->flags &= ~REQ_F_CQE_SKIP; + req->cqe.flags |= IORING_CQE_F_MORE; ret = sock_sendmsg(sock, &msg); if (unlikely(ret < min_ret)) { @@ -1068,8 +1078,6 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags) req->flags |= REQ_F_PARTIAL_IO; return io_setup_async_addr(req, addr, issue_flags); } - if (ret < 0 && !zc->done_io) - zc->notif->flags |= REQ_F_CQE_SKIP; if (ret == -ERESTARTSYS) ret = -EINTR; req_set_fail(req); @@ -1082,8 +1090,7 @@ int io_sendzc(struct io_kiocb *req, unsigned int issue_flags) io_notif_flush(zc->notif); req->flags &= ~REQ_F_NEED_CLEANUP; - cflags = ret >= 0 ? IORING_CQE_F_MORE : 0; - io_req_set_res(req, ret, cflags); + io_req_set_res(req, ret, req->cqe.flags); return IOU_OK; } From patchwork Fri Sep 16 21:36:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Metzmacher X-Patchwork-Id: 12978858 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3246ECAAA1 for ; Fri, 16 Sep 2022 21:40:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229675AbiIPVkb (ORCPT ); Fri, 16 Sep 2022 17:40:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35516 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229735AbiIPVka (ORCPT ); Fri, 16 Sep 2022 17:40:30 -0400 Received: from hr2.samba.org (hr2.samba.org [IPv6:2a01:4f8:192:486::2:0]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08C382F009; Fri, 16 Sep 2022 14:40:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=samba.org; s=42; h=Message-Id:Date:Cc:To:From; bh=7/bK9/j4+KDJ0CqnTjC8j/mEzvOMjp40NmmYbeModGU=; b=gV+hgdr99PEWfLxDGTYE1h9OWC 
vnCZRE1zKY1SatyZ7/7QlQlBeVCD4GTuVrbBmCNbNJlUuLzdysxcnCUWdu8U5L2jgAQUnkneYvJXn SFW6eTsYyuEDz6PY5hco0+AkXIvaCj7QptHJa9tZRzXc+8qnDXLGh+37/98HSgXpW8qrrwCB/dQ+j ZXG9nKnMpqge3h1MMYILkkG6qinSYIcSe5PrcSiN4qV1WsuCjvO+pWt2x/9ulUdkTzqcs5eanHVaB A2MXiV+tOpbxCQ+12ZGejd/1gtbKPfAu50wxcF524tS2k536825AVq5RT+duzjtXBDwWLdlTHFyO+ ZEncZ+e4a9x0Hu6lbGEY6WGZ/TfVyPRvhK8SnEWdFzyNMZ9F7xAjRThjJ1IiYUK+MXxx5ppGQ7GdG H3MSAxB0olFfIQv41MlWtdEYyI5We7zp4nP2x2Rfu8Isx0VnNluQ+tVO5oN7685KOgL60Dx66A67l QEVB2EgQ2qBxStGPtU8fk6Dn; Received: from [127.0.0.2] (localhost [127.0.0.1]) by hr2.samba.org with esmtpsa (TLS1.3:ECDHE_SECP256R1__ECDSA_SECP256R1_SHA256__CHACHA20_POLY1305:256) (Exim) id 1oZJ4T-000j7f-KT; Fri, 16 Sep 2022 21:40:26 +0000 From: Stefan Metzmacher To: io-uring@vger.kernel.org, axboe@kernel.dk, asml.silence@gmail.com Cc: Stefan Metzmacher , Jakub Kicinski , netdev@vger.kernel.org Subject: [PATCH 5/5] io_uring/notif: let userspace know how effective the zero copy usage was Date: Fri, 16 Sep 2022 23:36:29 +0200 Message-Id: <76cdd53f618e2793e1ec298c837bb17c3b9f12ee.1663363798.git.metze@samba.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org The 2nd cqe for IORING_OP_SEND_ZC has IORING_CQE_F_NOTIF set in cqe->flags and it will now have the number of successfully completed io_uring_tx_zerocopy_callback() callbacks in the lower 31 bits of cqe->res; the high bit (0x80000000) is set when io_uring_tx_zerocopy_callback() was called with success=false. If cqe->res is still 0, zero copy wasn't used at all. These values give userspace a chance to adjust its strategy when choosing between IORING_OP_SEND_ZC and IORING_OP_SEND. And it's a bit richer than just a simple SO_EE_CODE_ZEROCOPY_COPIED indication.
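For illustration, a userspace consumer could decode the notif cqe->res encoding proposed here roughly as follows. This is only a sketch of the encoding as described in this commit message (lower 31 bits as success count, high bit as the "copied at least once" marker); struct sketch_zc_usage and both functions are hypothetical names, not part of any existing API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the notif cqe->res value. */
struct sketch_zc_usage {
	uint32_t zc_count; /* callbacks that completed with success=true */
	bool copied;       /* at least one callback saw success=false */
};

/* Split res into the 31-bit success count and the high-bit marker. */
static struct sketch_zc_usage sketch_decode_notif_res(int32_t res)
{
	uint32_t raw = (uint32_t)res;
	struct sketch_zc_usage u = {
		.zc_count = raw & 0x7fffffffU,
		.copied = (raw & 0x80000000U) != 0,
	};
	return u;
}

/* res == 0 is described above as "zero copy wasn't used at all". */
static bool sketch_zc_unused(int32_t res)
{
	return res == 0;
}
```

An application could, for example, fall back to IORING_OP_SEND for a connection when copied is set on most notifications, since the zero copy setup cost is then not paying off.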
Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API") Fixes: eb315a7d1396b ("tcp: support externally provided ubufs") Fixes: 1fd3ae8c906c0 ("ipv6/udp: support externally provided ubufs") Fixes: c445f31b3cfaa ("ipv4/udp: support externally provided ubufs") Signed-off-by: Stefan Metzmacher Cc: Pavel Begunkov Cc: Jens Axboe Cc: io-uring@vger.kernel.org Cc: Jakub Kicinski Cc: netdev@vger.kernel.org --- io_uring/notif.c | 18 ++++++++++++++++++ net/ipv4/ip_output.c | 3 ++- net/ipv4/tcp.c | 2 ++ net/ipv6/ip6_output.c | 3 ++- 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/io_uring/notif.c b/io_uring/notif.c index e37c6569d82e..b07d2a049931 100644 --- a/io_uring/notif.c +++ b/io_uring/notif.c @@ -28,7 +28,24 @@ static void io_uring_tx_zerocopy_callback(struct sk_buff *skb, struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg); struct io_kiocb *notif = cmd_to_io_kiocb(nd); + uarg->zerocopy = uarg->zerocopy & success; + + if (success && notif->cqe.res < S32_MAX) + notif->cqe.res++; + if (refcount_dec_and_test(&uarg->refcnt)) { + /* + * If we hit at least one case that + * was not able to use zero copy, + * we set the high bit 0x80000000 + * so that notif->cqe.res < 0, means it was + * at least copied once. + * + * The other 31 bits are the success count.
+ */ + if (!uarg->zerocopy) + notif->cqe.res |= S32_MIN; + notif->io_task_work.func = __io_notif_complete_tw; io_req_task_work_add(notif); } @@ -53,6 +70,7 @@ struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx) nd = io_notif_to_data(notif); nd->account_pages = 0; + nd->uarg.zerocopy = 1; nd->uarg.flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN; nd->uarg.callback = io_uring_tx_zerocopy_callback; refcount_set(&nd->uarg.refcnt, 1); diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index d7bd1daf022b..4bdea7a4b2f7 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1032,7 +1032,8 @@ static int __ip_append_data(struct sock *sk, paged = true; zc = true; uarg = msg->msg_ubuf; - } + } else + msg->msg_ubuf->zerocopy = 0; } else if (sock_flag(sk, SOCK_ZEROCOPY)) { uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb)); if (!uarg) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 970e9a2cca4a..27a22d470741 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1231,6 +1231,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) uarg = msg->msg_ubuf; net_zcopy_get(uarg); zc = sk->sk_route_caps & NETIF_F_SG; + if (!zc) + uarg->zerocopy = 0; } else if (sock_flag(sk, SOCK_ZEROCOPY)) { uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb)); if (!uarg) { diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index f152e51242cb..d85036e91cf7 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1556,7 +1556,8 @@ static int __ip6_append_data(struct sock *sk, paged = true; zc = true; uarg = msg->msg_ubuf; - } + } else + msg->msg_ubuf->zerocopy = 0; } else if (sock_flag(sk, SOCK_ZEROCOPY)) { uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb)); if (!uarg)