From patchwork Thu Nov 24 09:35:55 2022
X-Patchwork-Submitter: Dylan Yudaken
X-Patchwork-Id: 13054743
From: Dylan Yudaken <dylany@meta.com>
To: Jens Axboe, Pavel Begunkov
Cc: io-uring@vger.kernel.org, Dylan Yudaken
Subject: [PATCH for-next v3 5/9] io_uring: add io_aux_cqe which allows deferred completion
Date: Thu, 24 Nov 2022 01:35:55 -0800
Message-ID: <20221124093559.3780686-6-dylany@meta.com>
In-Reply-To: <20221124093559.3780686-1-dylany@meta.com>
References: <20221124093559.3780686-1-dylany@meta.com>
X-Mailing-List: io-uring@vger.kernel.org

Use the just introduced deferred post CQE completion state when possible
in io_aux_cqe. If that is not possible, fall back to io_post_aux_cqe.

This introduces a complication because of allow_overflow.
For deferred completions we cannot know, without taking completion_lock,
whether the CQE will overflow (and even with the lock held, another post
could sneak in and push this CQE into overflow). However, since overflow
protection is mostly a best-effort defence in depth to prevent infinite
loops of CQEs for poll, just checking the overflow bit is good enough and
will result in at most 16 (the array size of the deferred CQEs) overflows.

Suggested-by: Pavel Begunkov
Signed-off-by: Dylan Yudaken <dylany@meta.com>
---
 io_uring/io_uring.c | 34 ++++++++++++++++++++++++++++++++++
 io_uring/io_uring.h |  2 ++
 io_uring/net.c      |  7 ++++---
 io_uring/poll.c     |  4 ++--
 4 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 53b61b5cde80..4f48e8a919a2 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -830,6 +830,40 @@ bool io_post_aux_cqe(struct io_ring_ctx *ctx,
 	return filled;
 }
 
+bool io_aux_cqe(struct io_ring_ctx *ctx, bool defer, u64 user_data, s32 res, u32 cflags,
+		bool allow_overflow)
+{
+	struct io_uring_cqe *cqe;
+	unsigned int length;
+
+	if (!defer)
+		return io_post_aux_cqe(ctx, user_data, res, cflags, allow_overflow);
+
+	length = ARRAY_SIZE(ctx->submit_state.cqes);
+
+	lockdep_assert_held(&ctx->uring_lock);
+
+	if (ctx->submit_state.cqes_count == length) {
+		io_cq_lock(ctx);
+		__io_flush_post_cqes(ctx);
+		/* no need to flush - flush is deferred */
+		spin_unlock(&ctx->completion_lock);
+	}
+
+	/* For deferred completions this is not as strict as it is otherwise,
+	 * however its main job is to prevent unbounded posted completions,
+	 * and in that it works just as well.
+	 */
+	if (!allow_overflow && test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq))
+		return false;
+
+	cqe = &ctx->submit_state.cqes[ctx->submit_state.cqes_count++];
+	cqe->user_data = user_data;
+	cqe->res = res;
+	cqe->flags = cflags;
+	return true;
+}
+
 static void __io_req_complete_post(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index a26d5aa7f3f3..dd02adf3d0df 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -36,6 +36,8 @@ bool io_post_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags
 		     bool allow_overflow);
 bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags,
 		     bool allow_overflow);
+bool io_aux_cqe(struct io_ring_ctx *ctx, bool defer, u64 user_data, s32 res, u32 cflags,
+		bool allow_overflow);
 void __io_commit_cqring_flush(struct io_ring_ctx *ctx);
 
 static inline void io_req_complete_post_tw(struct io_kiocb *req, bool *locked)
diff --git a/io_uring/net.c b/io_uring/net.c
index 0de6f78ad978..90342dcb6b1d 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -601,8 +601,8 @@ static inline bool io_recv_finish(struct io_kiocb *req, int *ret,
 	}
 
 	if (!mshot_finished) {
-		if (io_post_aux_cqe(req->ctx, req->cqe.user_data, *ret,
-				    cflags | IORING_CQE_F_MORE, true)) {
+		if (io_aux_cqe(req->ctx, issue_flags & IO_URING_F_COMPLETE_DEFER,
+			       req->cqe.user_data, *ret, cflags | IORING_CQE_F_MORE, true)) {
 			io_recv_prep_retry(req);
 			return false;
 		}
@@ -1320,7 +1320,8 @@ int io_accept(struct io_kiocb *req, unsigned int issue_flags)
 
 	if (ret < 0)
 		return ret;
-	if (io_post_aux_cqe(ctx, req->cqe.user_data, ret, IORING_CQE_F_MORE, true))
+	if (io_aux_cqe(ctx, issue_flags & IO_URING_F_COMPLETE_DEFER,
+		       req->cqe.user_data, ret, IORING_CQE_F_MORE, true))
 		goto retry;
 
 	return -ECANCELED;
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 4bd43e6f5b72..922c1a366c41 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -252,8 +252,8 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 			__poll_t mask = mangle_poll(req->cqe.res &
 						    req->apoll_events);
 
-			if (!io_post_aux_cqe(ctx, req->cqe.user_data,
-					     mask, IORING_CQE_F_MORE, false)) {
+			if (!io_aux_cqe(ctx, *locked, req->cqe.user_data,
+					mask, IORING_CQE_F_MORE, false)) {
 				io_req_set_res(req, mask, 0);
 				return IOU_POLL_REMOVE_POLL_USE_RES;
 			}
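
For context, the aux CQEs this patch defers are the per-event completions
userspace observes from multishot requests such as accept and recv: one SQE
produces many CQEs, each flagged with IORING_CQE_F_MORE until the final one.
That userspace contract is unchanged by the patch; only the kernel-side
posting path differs. Below is a minimal, hedged userspace sketch of that
contract using multishot accept. It assumes liburing >= 2.2 and a kernel with
multishot accept support (>= 5.19); the port number and error handling are
illustrative only and not taken from the patch.

	/* build with: cc mshot_accept.c -luring */
	#include <liburing.h>
	#include <netinet/in.h>
	#include <sys/socket.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		struct io_uring ring;
		struct io_uring_sqe *sqe;
		struct io_uring_cqe *cqe;
		struct sockaddr_in addr;
		int fd;

		fd = socket(AF_INET, SOCK_STREAM, 0);
		memset(&addr, 0, sizeof(addr));
		addr.sin_family = AF_INET;
		addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
		addr.sin_port = htons(8080);	/* arbitrary example port */
		bind(fd, (struct sockaddr *)&addr, sizeof(addr));
		listen(fd, SOMAXCONN);

		io_uring_queue_init(8, &ring, 0);

		/* One SQE arms the multishot accept; the kernel then posts
		 * an aux CQE for every accepted connection. */
		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_multishot_accept(sqe, fd, NULL, NULL, 0);
		io_uring_submit(&ring);

		for (;;) {
			if (io_uring_wait_cqe(&ring, &cqe))
				break;
			if (cqe->res >= 0)
				printf("accepted fd %d\n", cqe->res);
			/* IORING_CQE_F_MORE clear means the multishot request
			 * has terminated and will post no further CQEs. */
			if (!(cqe->flags & IORING_CQE_F_MORE)) {
				io_uring_cqe_seen(&ring, cqe);
				break;
			}
			io_uring_cqe_seen(&ring, cqe);
		}

		io_uring_queue_exit(&ring);
		close(fd);
		return 0;
	}

Whether each of those CQEs is posted immediately (io_post_aux_cqe) or staged
in the 16-entry submit_state.cqes array and flushed later is exactly the
defer decision io_aux_cqe makes above.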