From patchwork Wed Dec 7 03:53:26 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066593
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 01/12] io_uring: dont remove file from msg_ring reqs
Date: Wed, 7 Dec 2022 03:53:26 +0000

We should not be messing with req->file outside of core paths. Clearing
it also makes msg_ring non-reentrant: as it happens, io_msg_send_fd()
fails the request when io_double_lock_ctx() fails, but it was clearly
intended to retry instead.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 2 +-
 io_uring/msg_ring.c | 4 ----
 io_uring/opdef.c    | 7 +++++++
 io_uring/opdef.h    | 2 ++
 4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 436b1ac8f6d0..62372a641add 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1810,7 +1810,7 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
 		return ret;
 
 	/* If the op doesn't have a file, we're not polling for it */
-	if ((req->ctx->flags & IORING_SETUP_IOPOLL) && req->file)
+	if ((req->ctx->flags & IORING_SETUP_IOPOLL) && def->iopoll_queue)
 		io_iopoll_req_issued(req, issue_flags);
 
 	return 0;
diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index afb543aab9f6..615c85e164ab 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -167,9 +167,5 @@ int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
 	if (ret < 0)
 		req_set_fail(req);
 	io_req_set_res(req, ret, 0);
-	/* put file to avoid an attempt to IOPOLL the req */
-	if (!(req->flags & REQ_F_FIXED_FILE))
-		io_put_file(req->file);
-	req->file = NULL;
 	return IOU_OK;
 }
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 83dc0f9ad3b2..04dd2c983fce 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -63,6 +63,7 @@ const struct io_op_def io_op_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= sizeof(struct io_async_rw),
 		.name			= "READV",
 		.prep			= io_prep_rw,
@@ -80,6 +81,7 @@ const struct io_op_def io_op_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= sizeof(struct io_async_rw),
 		.name			= "WRITEV",
 		.prep			= io_prep_rw,
@@ -103,6 +105,7 @@ const struct io_op_def io_op_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= sizeof(struct io_async_rw),
 		.name			= "READ_FIXED",
 		.prep			= io_prep_rw,
@@ -118,6 +121,7 @@ const struct io_op_def io_op_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= sizeof(struct io_async_rw),
 		.name			= "WRITE_FIXED",
 		.prep			= io_prep_rw,
@@ -277,6 +281,7 @@ const struct io_op_def io_op_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= sizeof(struct io_async_rw),
 		.name			= "READ",
 		.prep			= io_prep_rw,
@@ -292,6 +297,7 @@ const struct io_op_def io_op_defs[] = {
 		.audit_skip		= 1,
 		.ioprio			= 1,
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= sizeof(struct io_async_rw),
 		.name			= "WRITE",
 		.prep			= io_prep_rw,
@@ -481,6 +487,7 @@ const struct io_op_def io_op_defs[] = {
 		.plug			= 1,
 		.name			= "URING_CMD",
 		.iopoll			= 1,
+		.iopoll_queue		= 1,
 		.async_size		= uring_cmd_pdu_size(1),
 		.prep			= io_uring_cmd_prep,
 		.issue			= io_uring_cmd,
diff --git a/io_uring/opdef.h b/io_uring/opdef.h
index 3efe06d25473..df7e13d9bfba 100644
--- a/io_uring/opdef.h
+++ b/io_uring/opdef.h
@@ -25,6 +25,8 @@ struct io_op_def {
 	unsigned		ioprio : 1;
 	/* supports iopoll */
 	unsigned		iopoll : 1;
+	/* have to be put into the iopoll list */
+	unsigned		iopoll_queue : 1;
 	/* opcode specific path will handle ->async_data allocation if needed */
 	unsigned		manual_alloc : 1;
 	/* size of async data needed, if any */
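For reference, the mode this new flag feeds is IORING_SETUP_IOPOLL:
requests whose opcode is marked for polling are added to the iopoll list
at issue time and reaped by polling the device rather than by interrupt.
A minimal liburing sketch of that setup (the file path, block size and
queue depth are arbitrary placeholders):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	void *buf;
	int fd;

	if (io_uring_queue_init(8, &ring, IORING_SETUP_IOPOLL))
		return 1;
	/* IOPOLL needs O_DIRECT I/O on a device/fs that supports it */
	fd = open("/tmp/testfile", O_RDONLY | O_DIRECT);
	if (fd < 0 || posix_memalign(&buf, 4096, 4096))
		return 1;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read(sqe, fd, buf, 4096, 0);
	io_uring_submit(&ring);

	/* with IOPOLL, waiting actively polls for the completion */
	if (!io_uring_wait_cqe(&ring, &cqe))
		io_uring_cqe_seen(&ring, cqe);
	io_uring_queue_exit(&ring);
	return 0;
}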
From patchwork Wed Dec 7 03:53:27 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066594
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 02/12] io_uring: improve io_double_lock_ctx fail handling
Date: Wed, 7 Dec 2022 03:53:27 +0000
Message-Id: <4697f05afcc37df5c8f89e2fe6d9c7c19f0241f9.1670384893.git.asml.silence@gmail.com>

msg_ring currently fails the request if it can't lock the rings;
instead, punt it to io-wq as was originally intended.
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov
---
 io_uring/msg_ring.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index 615c85e164ab..c7d6586164ca 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -164,6 +164,8 @@ int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
 	}
 
 done:
+	if (ret == -EAGAIN)
+		return -EAGAIN;
 	if (ret < 0)
 		req_set_fail(req);
 	io_req_set_res(req, ret, 0);
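For context, IORING_OP_MSG_RING posts a CQE into another ring's CQ;
after this change a contended target uring_lock means the request gets
retried from io-wq instead of surfacing -EAGAIN to userspace. A minimal
sketch, assuming liburing 2.2+ for io_uring_prep_msg_ring() (the
user_data value 0x1234 is an arbitrary example):

#include <liburing.h>

static int msg_other_ring(struct io_uring *src, struct io_uring *dst)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(src);
	struct io_uring_cqe *cqe;
	int ret;

	if (!sqe)
		return -1;
	/* posts a CQE with res=0 and user_data=0x1234 into *dst */
	io_uring_prep_msg_ring(sqe, dst->ring_fd, 0, 0x1234, 0);
	io_uring_submit(src);

	/* the MSG_RING request itself also completes on the source ring */
	ret = io_uring_wait_cqe(src, &cqe);
	if (ret)
		return ret;
	ret = cqe->res;
	io_uring_cqe_seen(src, cqe);
	return ret < 0 ? ret : 0;
}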
From patchwork Wed Dec 7 03:53:28 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066595
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 03/12] io_uring: skip overflow CQE posting for dying ring
Date: Wed, 7 Dec 2022 03:53:28 +0000
Message-Id: <26d13751155a735a3029e24f8d9ca992f810419d.1670384893.git.asml.silence@gmail.com>

After io_ring_ctx_wait_and_kill() is called there should be no users
poking into the rings, and so there is no need to post CQEs. So, instead
of trying to post overflowed CQEs into the CQ, drop them. Also do it in
io_ring_exit_work() in a loop: that reduces the number of contexts it
can be executed from, and even when the work struggles to quiesce the
ring we won't leave memory allocated for longer than needed.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 62372a641add..5c0b3ba6059e 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -611,12 +611,30 @@ void io_cq_unlock_post(struct io_ring_ctx *ctx)
 }
 
 /* Returns true if there are no backlogged entries after the flush */
-static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
+static void io_cqring_overflow_kill(struct io_ring_ctx *ctx)
+{
+	struct io_overflow_cqe *ocqe;
+	LIST_HEAD(list);
+
+	io_cq_lock(ctx);
+	list_splice_init(&ctx->cq_overflow_list, &list);
+	clear_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq);
+	io_cq_unlock(ctx);
+
+	while (!list_empty(&list)) {
+		ocqe = list_first_entry(&list, struct io_overflow_cqe, list);
+		list_del(&ocqe->list);
+		kfree(ocqe);
+	}
+}
+
+/* Returns true if there are no backlogged entries after the flush */
+static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx)
 {
 	bool all_flushed;
 	size_t cqe_size = sizeof(struct io_uring_cqe);
 
-	if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
+	if (__io_cqring_events(ctx) == ctx->cq_entries)
 		return false;
 
 	if (ctx->flags & IORING_SETUP_CQE32)
@@ -627,15 +645,11 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
 		struct io_uring_cqe *cqe = io_get_cqe_overflow(ctx, true);
 		struct io_overflow_cqe *ocqe;
 
-		if (!cqe && !force)
+		if (!cqe)
 			break;
 		ocqe = list_first_entry(&ctx->cq_overflow_list,
 					struct io_overflow_cqe, list);
-		if (cqe)
-			memcpy(cqe, &ocqe->cqe, cqe_size);
-		else
-			io_account_cq_overflow(ctx);
-
+		memcpy(cqe, &ocqe->cqe, cqe_size);
 		list_del(&ocqe->list);
 		kfree(ocqe);
 	}
@@ -658,7 +672,7 @@ static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
 		/* iopoll syncs against uring_lock, not completion_lock */
 		if (ctx->flags & IORING_SETUP_IOPOLL)
 			mutex_lock(&ctx->uring_lock);
-		ret = __io_cqring_overflow_flush(ctx, false);
+		ret = __io_cqring_overflow_flush(ctx);
 		if (ctx->flags & IORING_SETUP_IOPOLL)
 			mutex_unlock(&ctx->uring_lock);
 	}
@@ -1478,7 +1492,7 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
 	check_cq = READ_ONCE(ctx->check_cq);
 	if (unlikely(check_cq)) {
 		if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT))
-			__io_cqring_overflow_flush(ctx, false);
+			__io_cqring_overflow_flush(ctx);
 
 		/*
 		 * Similarly do not spin if we have not informed the user of any
 		 * dropped CQE.
@@ -2646,8 +2660,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 		__io_sqe_buffers_unregister(ctx);
 	if (ctx->file_data)
 		__io_sqe_files_unregister(ctx);
-	if (ctx->rings)
-		__io_cqring_overflow_flush(ctx, true);
+	io_cqring_overflow_kill(ctx);
 	io_eventfd_unregister(ctx);
 	io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free);
 	io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free);
@@ -2788,6 +2801,12 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	 * as nobody else will be looking for them.
 	 */
 	do {
+		if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) {
+			mutex_lock(&ctx->uring_lock);
+			io_cqring_overflow_kill(ctx);
+			mutex_unlock(&ctx->uring_lock);
+		}
+
 		if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
 			io_move_task_work_from_local(ctx);
 
@@ -2853,8 +2872,6 @@ static __cold void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
 	mutex_lock(&ctx->uring_lock);
 	percpu_ref_kill(&ctx->refs);
-	if (ctx->rings)
-		__io_cqring_overflow_flush(ctx, true);
 	xa_for_each(&ctx->personalities, index, creds)
 		io_unregister_personality(ctx, index);
 	if (ctx->rings)
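For context, overflowed CQEs are kept in a kernel-side list until
userspace frees CQ space and re-enters the kernel; with this change only
a dying ring drops them outright. A minimal userspace sketch of reaping
and then forcing a flush, assuming a liburing recent enough to provide
io_uring_cq_has_overflow() and io_uring_get_events():

#include <liburing.h>

static void drain_cq(struct io_uring *ring)
{
	struct io_uring_cqe *cqe;
	unsigned head, seen = 0;

	/* make room in the CQ first */
	io_uring_for_each_cqe(ring, head, cqe)
		seen++;
	io_uring_cq_advance(ring, seen);

	/* overflowed CQEs are only copied back on entering the kernel */
	if (io_uring_cq_has_overflow(ring))
		io_uring_get_events(ring);
}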
From patchwork Wed Dec 7 03:53:29 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066597
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 04/12] io_uring: don't check overflow flush failures
Date: Wed, 7 Dec 2022 03:53:29 +0000
Message-Id: <6b720a45c03345655517f8202cbd0bece2848fb2.1670384893.git.asml.silence@gmail.com>

The only way flushing overflowed CQEs can fail is when the CQ is fully
packed. There is one place checking for flush failures,
io_cqring_wait(), but we limit the number of events to wait for by the
CQ size, so a flush failure automatically means that we're done waiting.
Don't check for failures; rare as they are, they might spuriously fail
CQ waiting with -EBUSY.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 24 ++++++------------------
 1 file changed, 6 insertions(+), 18 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 5c0b3ba6059e..7bebca5ed950 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -629,13 +629,12 @@ static void io_cqring_overflow_kill(struct io_ring_ctx *ctx)
 }
 
 /* Returns true if there are no backlogged entries after the flush */
-static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx)
+static void __io_cqring_overflow_flush(struct io_ring_ctx *ctx)
 {
-	bool all_flushed;
 	size_t cqe_size = sizeof(struct io_uring_cqe);
 
 	if (__io_cqring_events(ctx) == ctx->cq_entries)
-		return false;
+		return;
 
 	if (ctx->flags & IORING_SETUP_CQE32)
 		cqe_size <<= 1;
@@ -654,30 +653,23 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx)
 		kfree(ocqe);
 	}
 
-	all_flushed = list_empty(&ctx->cq_overflow_list);
-	if (all_flushed) {
+	if (list_empty(&ctx->cq_overflow_list)) {
 		clear_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq);
 		atomic_andnot(IORING_SQ_CQ_OVERFLOW, &ctx->rings->sq_flags);
 	}
-
 	io_cq_unlock_post(ctx);
-	return all_flushed;
 }
 
-static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx)
+static void io_cqring_overflow_flush(struct io_ring_ctx *ctx)
 {
-	bool ret = true;
-
 	if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) {
 		/* iopoll syncs against uring_lock, not completion_lock */
 		if (ctx->flags & IORING_SETUP_IOPOLL)
 			mutex_lock(&ctx->uring_lock);
-		ret = __io_cqring_overflow_flush(ctx);
+		__io_cqring_overflow_flush(ctx);
 		if (ctx->flags & IORING_SETUP_IOPOLL)
 			mutex_unlock(&ctx->uring_lock);
 	}
-
-	return ret;
 }
 
 void __io_put_task(struct task_struct *task, int nr)
@@ -2505,11 +2497,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 	trace_io_uring_cqring_wait(ctx, min_events);
 	do {
-		/* if we can't even flush overflow, don't wait for more */
-		if (!io_cqring_overflow_flush(ctx)) {
-			ret = -EBUSY;
-			break;
-		}
+		io_cqring_overflow_flush(ctx);
 		prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
 						TASK_INTERRUPTIBLE);
 		ret = io_cqring_wait_schedule(ctx, &iowq, timeout);
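The user-visible effect is that CQ waiting no longer spuriously fails
with -EBUSY. A defensive loop of the kind applications could use on
older kernels, shown only as a sketch (it assumes CQEs are reaped
elsewhere so that retrying can make progress):

#include <errno.h>
#include <liburing.h>

static int wait_one(struct io_uring *ring, struct io_uring_cqe **cqe)
{
	int ret;

	do {
		ret = io_uring_wait_cqe(ring, cqe);
	} while (ret == -EBUSY || ret == -EINTR);
	return ret;
}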
From patchwork Wed Dec 7 03:53:30 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066596
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 05/12] io_uring: complete all requests in task context
Date: Wed, 7 Dec 2022 03:53:30 +0000
Message-Id: <21ece72953f76bb2e77659a72a14326227ab6460.1670384893.git.asml.silence@gmail.com>

This patch adds a ctx->task_complete flag. If set, we'll complete all
requests in the context of the original task. Note, this extends to
completion CQE posting only, not io_kiocb cleanup / free: e.g. io-wq may
free the requests in the free callback. This flag will be used later for
optimisation purposes.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring.h       |  2 ++
 include/linux/io_uring_types.h |  2 ++
 io_uring/io_uring.c            | 14 +++++++++++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 29e519752da4..934e5dd4ccc0 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -11,6 +11,8 @@ enum io_uring_cmd_flags {
 	IO_URING_F_UNLOCKED		= 2,
 	/* the request is executed from poll, it should not be freed */
 	IO_URING_F_MULTISHOT		= 4,
+	/* executed by io-wq */
+	IO_URING_F_IOWQ			= 8,
 	/* int's last bit, sign checks are usually faster than a bit test */
 	IO_URING_F_NONBLOCK		= INT_MIN,
 
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index accdfecee953..6be1e1359c89 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -208,6 +208,8 @@ struct io_ring_ctx {
 		unsigned int		drain_disabled: 1;
 		unsigned int		has_evfd: 1;
 		unsigned int		syscall_iopoll: 1;
+		/* all CQEs should be posted only by the submitter task */
+		unsigned int		task_complete: 1;
 	} ____cacheline_aligned_in_smp;
 
 	/* submission data */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 7bebca5ed950..52ea83f241c6 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -932,8 +932,11 @@ static void __io_req_complete_post(struct io_kiocb *req)
 
 void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
 {
-	if (!(issue_flags & IO_URING_F_UNLOCKED) ||
-	    !(req->ctx->flags & IORING_SETUP_IOPOLL)) {
+	if (req->ctx->task_complete && (issue_flags & IO_URING_F_IOWQ)) {
+		req->io_task_work.func = io_req_task_complete;
+		io_req_task_work_add(req);
+	} else if (!(issue_flags & IO_URING_F_UNLOCKED) ||
+		   !(req->ctx->flags & IORING_SETUP_IOPOLL)) {
 		__io_req_complete_post(req);
 	} else {
 		struct io_ring_ctx *ctx = req->ctx;
@@ -1841,7 +1844,7 @@ void io_wq_submit_work(struct io_wq_work *work)
 {
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 	const struct io_op_def *def = &io_op_defs[req->opcode];
-	unsigned int issue_flags = IO_URING_F_UNLOCKED;
+	unsigned int issue_flags = IO_URING_F_UNLOCKED | IO_URING_F_IOWQ;
 	bool needs_poll = false;
 	int ret = 0, err = -ECANCELED;
 
@@ -3501,6 +3504,11 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	if (!ctx)
 		return -ENOMEM;
 
+	if ((ctx->flags & IORING_SETUP_DEFER_TASKRUN) &&
+	    !(ctx->flags & IORING_SETUP_IOPOLL) &&
+	    !(ctx->flags & IORING_SETUP_SQPOLL))
+		ctx->task_complete = true;
+
 	/*
 	 * When SETUP_IOPOLL and SETUP_SQPOLL are both enabled, user space
 	 * applications don't need to do io completion events
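The conditions under which ->task_complete gets set correspond to a ring
created roughly as below; a minimal sketch (queue depth arbitrary).
IORING_SETUP_DEFER_TASKRUN requires IORING_SETUP_SINGLE_ISSUER, and the
task that sets the ring up must also be the one waiting on it:

#include <liburing.h>

static int make_ring(struct io_uring *ring)
{
	/* no IOPOLL and no SQPOLL, matching the task_complete conditions */
	unsigned flags = IORING_SETUP_SINGLE_ISSUER |
			 IORING_SETUP_DEFER_TASKRUN;

	/* returns -EINVAL on kernels without these flags */
	return io_uring_queue_init(64, ring, flags);
}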
From patchwork Wed Dec 7 03:53:31 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066598
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 06/12] io_uring: force multishot CQEs into task context
Date: Wed, 7 Dec 2022 03:53:31 +0000

Multishot requests post CQEs outside of the normal request completion
path, which is usually done from within a task work handler. However,
that might not be the case when such a request is yet to be polled but
has instead been punted to io-wq. Make it abide by ->task_complete and
push it to the polling path when executed by io-wq.

Signed-off-by: Pavel Begunkov
---
 io_uring/net.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/io_uring/net.c b/io_uring/net.c
index 90342dcb6b1d..f276f6dd5b09 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -67,6 +67,19 @@ struct io_sr_msg {
 	struct io_kiocb 		*notif;
 };
 
+static inline bool io_check_multishot(struct io_kiocb *req,
+				      unsigned int issue_flags)
+{
+	/*
+	 * When ->locked_cq is set we only allow to post CQEs from the original
+	 * task context. Usual request completions will be handled in other
+	 * generic paths but multipoll may decide to post extra cqes.
+	 */
+	return !(issue_flags & IO_URING_F_IOWQ) ||
+		!(issue_flags & IO_URING_F_MULTISHOT) ||
+		!req->ctx->task_complete;
+}
+
 int io_shutdown_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_shutdown *shutdown = io_kiocb_to_cmd(req, struct io_shutdown);
@@ -730,6 +743,9 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 	    (sr->flags & IORING_RECVSEND_POLL_FIRST))
 		return io_setup_async_msg(req, kmsg, issue_flags);
 
+	if (!io_check_multishot(req, issue_flags))
+		return io_setup_async_msg(req, kmsg, issue_flags);
+
 retry_multishot:
 	if (io_do_buffer_select(req)) {
 		void __user *buf;
@@ -829,6 +845,9 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 	    (sr->flags & IORING_RECVSEND_POLL_FIRST))
 		return -EAGAIN;
 
+	if (!io_check_multishot(req, issue_flags))
+		return -EAGAIN;
+
 	sock = sock_from_file(req->file);
 	if (unlikely(!sock))
 		return -ENOTSOCK;
@@ -1280,6 +1299,8 @@ int io_accept(struct io_kiocb *req, unsigned int issue_flags)
 	struct file *file;
 	int ret, fd;
 
+	if (!io_check_multishot(req, issue_flags))
+		return -EAGAIN;
 retry:
 	if (!fixed) {
 		fd = __get_unused_fd_flags(accept->flags, accept->nofile);
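For context, a multishot request arms once and then posts a CQE per
event, which is how its CQEs can be generated outside the usual
completion path. A minimal sketch of multishot accept, assuming liburing
2.3+ for io_uring_prep_multishot_accept():

#include <liburing.h>

static void accept_loop(struct io_uring *ring, int listen_fd)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
	struct io_uring_cqe *cqe;

	io_uring_prep_multishot_accept(sqe, listen_fd, NULL, NULL, 0);
	io_uring_submit(ring);

	while (!io_uring_wait_cqe(ring, &cqe)) {
		int more = cqe->flags & IORING_CQE_F_MORE;

		if (cqe->res >= 0) {
			/* cqe->res is an accepted fd; hand it off here */
		}
		io_uring_cqe_seen(ring, cqe);
		/* no IORING_CQE_F_MORE means the multishot arm terminated */
		if (!more)
			break;
	}
}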
From patchwork Wed Dec 7 03:53:32 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066599
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 07/12] io_uring: use tw for putting rsrc
Date: Wed, 7 Dec 2022 03:53:32 +0000

Use task_work for completing rsrc removals; it'll be needed later for
spinlock optimisations.
Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  1 +
 io_uring/io_uring.c            |  1 +
 io_uring/rsrc.c                | 19 +++++++++++++++++--
 io_uring/rsrc.h                |  1 +
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 6be1e1359c89..dcd8a563ab52 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -328,6 +328,7 @@ struct io_ring_ctx {
 	struct io_rsrc_data		*buf_data;
 
 	struct delayed_work		rsrc_put_work;
+	struct callback_head		rsrc_put_tw;
 	struct llist_head		rsrc_put_llist;
 	struct list_head		rsrc_ref_list;
 	spinlock_t			rsrc_ref_lock;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 52ea83f241c6..3a422a7b7132 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -326,6 +326,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	spin_lock_init(&ctx->rsrc_ref_lock);
 	INIT_LIST_HEAD(&ctx->rsrc_ref_list);
 	INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
+	init_task_work(&ctx->rsrc_put_tw, io_rsrc_put_tw);
 	init_llist_head(&ctx->rsrc_put_llist);
 	init_llist_head(&ctx->work_llist);
 	INIT_LIST_HEAD(&ctx->tctx_list);
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index d25309400a45..18de10c68a15 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -204,6 +204,14 @@ void io_rsrc_put_work(struct work_struct *work)
 	}
 }
 
+void io_rsrc_put_tw(struct callback_head *cb)
+{
+	struct io_ring_ctx *ctx = container_of(cb, struct io_ring_ctx,
+					       rsrc_put_tw);
+
+	io_rsrc_put_work(&ctx->rsrc_put_work.work);
+}
+
 void io_wait_rsrc_data(struct io_rsrc_data *data)
 {
 	if (data && !atomic_dec_and_test(&data->refs))
@@ -242,8 +250,15 @@ static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
 	}
 	spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
 
-	if (first_add)
-		mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
+	if (!first_add)
+		return;
+
+	if (ctx->submitter_task) {
+		if (!task_work_add(ctx->submitter_task, &ctx->rsrc_put_tw,
+				   ctx->notify_method))
+			return;
+	}
+	mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
 }
 
 static struct io_rsrc_node *io_rsrc_node_alloc(void)
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index 81445a477622..2b8743645efc 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -53,6 +53,7 @@ struct io_mapped_ubuf {
 	struct bio_vec	bvec[];
 };
 
+void io_rsrc_put_tw(struct callback_head *cb);
 void io_rsrc_put_work(struct work_struct *work);
 void io_rsrc_refs_refill(struct io_ring_ctx *ctx);
 void io_wait_rsrc_data(struct io_rsrc_data *data);
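For context, the put path being rerouted here runs when registered
resources go away, e.g. when a registration is dropped or replaced. A
minimal userspace sketch that exercises it with a fixed buffer (size and
alignment are arbitrary):

#include <stdlib.h>
#include <sys/uio.h>
#include <liburing.h>

static int exercise_rsrc(struct io_uring *ring)
{
	struct iovec iov = { .iov_len = 4096 };
	int ret;

	if (posix_memalign(&iov.iov_base, 4096, iov.iov_len))
		return -1;

	ret = io_uring_register_buffers(ring, &iov, 1);
	if (!ret)
		/* tearing down the registration puts the rsrc node */
		ret = io_uring_unregister_buffers(ring);
	free(iov.iov_base);
	return ret;
}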
From patchwork Wed Dec 7 03:53:33 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066600
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 08/12] io_uring: never run tw and fallback in parallel
Date: Wed, 7 Dec 2022 03:53:33 +0000
Message-Id: <96f4987265c4312f376f206511c6af3e77aaf5ac.1670384893.git.asml.silence@gmail.com>

Once we fall back a tw we want all requests of that task to be given to
the fallback wq so we don't run them in parallel with the last (i.e.
post-PF_EXITING) tw run of the task.
Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 3a422a7b7132..0e424d8721ab 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -149,6 +149,7 @@ static void io_clean_op(struct io_kiocb *req);
 static void io_queue_sqe(struct io_kiocb *req);
 static void io_move_task_work_from_local(struct io_ring_ctx *ctx);
 static void __io_submit_flush_completions(struct io_ring_ctx *ctx);
+static __cold void io_fallback_tw(struct io_uring_task *tctx);
 
 static struct kmem_cache *req_cachep;
 
@@ -1160,10 +1161,17 @@ void tctx_task_work(struct callback_head *cb)
 	struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
 						  task_work);
 	struct llist_node fake = {};
-	struct llist_node *node = io_llist_xchg(&tctx->task_list, &fake);
+	struct llist_node *node;
 	unsigned int loops = 1;
-	unsigned int count = handle_tw_list(node, &ctx, &uring_locked, NULL);
+	unsigned int count;
+
+	if (unlikely(current->flags & PF_EXITING)) {
+		io_fallback_tw(tctx);
+		return;
+	}
 
+	node = io_llist_xchg(&tctx->task_list, &fake);
+	count = handle_tw_list(node, &ctx, &uring_locked, NULL);
 	node = io_llist_cmpxchg(&tctx->task_list, &fake, NULL);
 	while (node != &fake) {
 		loops++;
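The loop in tctx_task_work() that this reshuffles is built on the
sentinel ("fake node") trick for lock-free lists: the consumer swaps a
sentinel into the head so it can detect producers that race in, and only
retires it once the list stays empty. A stripped-down user-space sketch
of the same pattern with C11 atomics (all names here are made up for
illustration; the kernel uses the llist helpers instead):

#include <stdatomic.h>
#include <stddef.h>

struct node { struct node *next; };

static _Atomic(struct node *) list_head;

static void produce(struct node *n)
{
	struct node *old = atomic_load(&list_head);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&list_head, &old, n));
}

static void consume_all(void (*handle)(struct node *))
{
	struct node fake = { NULL };
	struct node *node = atomic_exchange(&list_head, &fake);

	for (;;) {
		struct node *expected = &fake;

		/* the chain may end at the sentinel from a previous round */
		while (node && node != &fake) {
			struct node *next = node->next;

			handle(node);
			node = next;
		}
		/* retire the sentinel; this fails if producers raced in */
		if (atomic_compare_exchange_strong(&list_head, &expected,
						   NULL))
			break;
		node = atomic_exchange(&list_head, &fake);
	}
}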
From patchwork Wed Dec 7 03:53:34 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066601
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 09/12] io_uring: get rid of double locking
Date: Wed, 7 Dec 2022 03:53:34 +0000

We don't need to take both uring_locks at once: msg_ring can be split in
two parts, first getting a file from the filetable of the first ring and
then installing it into the second one.

Signed-off-by: Pavel Begunkov
---
 io_uring/msg_ring.c | 85 ++++++++++++++++++++++++++-------------------
 io_uring/msg_ring.h |  1 +
 io_uring/opdef.c    |  1 +
 3 files changed, 51 insertions(+), 36 deletions(-)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index c7d6586164ca..387c45a58654 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -15,6 +15,7 @@
 
 struct io_msg {
 	struct file			*file;
+	struct file			*src_file;
 	u64 user_data;
 	u32 len;
 	u32 cmd;
@@ -23,6 +24,17 @@ struct io_msg {
 	u32 flags;
 };
 
+void io_msg_ring_cleanup(struct io_kiocb *req)
+{
+	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
+
+	if (WARN_ON_ONCE(!msg->src_file))
+		return;
+
+	fput(msg->src_file);
+	msg->src_file = NULL;
+}
+
 static int io_msg_ring_data(struct io_kiocb *req)
 {
 	struct io_ring_ctx *target_ctx = req->file->private_data;
@@ -37,17 +49,13 @@ static int io_msg_ring_data(struct io_kiocb *req)
 	return -EOVERFLOW;
 }
 
-static void io_double_unlock_ctx(struct io_ring_ctx *ctx,
-				 struct io_ring_ctx *octx,
+static void io_double_unlock_ctx(struct io_ring_ctx *octx,
 				 unsigned int issue_flags)
 {
-	if (issue_flags & IO_URING_F_UNLOCKED)
-		mutex_unlock(&ctx->uring_lock);
 	mutex_unlock(&octx->uring_lock);
 }
 
-static int io_double_lock_ctx(struct io_ring_ctx *ctx,
-			      struct io_ring_ctx *octx,
+static int io_double_lock_ctx(struct io_ring_ctx *octx,
 			      unsigned int issue_flags)
 {
 	/*
@@ -60,17 +68,28 @@ static int io_double_lock_ctx(struct io_ring_ctx *ctx,
 			return -EAGAIN;
 		return 0;
 	}
+	mutex_lock(&octx->uring_lock);
+	return 0;
+}
 
-	/* Always grab smallest value ctx first. We know ctx != octx. */
-	if (ctx < octx) {
-		mutex_lock(&ctx->uring_lock);
-		mutex_lock(&octx->uring_lock);
-	} else {
-		mutex_lock(&octx->uring_lock);
-		mutex_lock(&ctx->uring_lock);
+static struct file *io_msg_grab_file(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
+	struct io_ring_ctx *ctx = req->ctx;
+	struct file *file = NULL;
+	unsigned long file_ptr;
+	int idx = msg->src_fd;
+
+	io_ring_submit_lock(ctx, issue_flags);
+	if (likely(idx < ctx->nr_user_files)) {
+		idx = array_index_nospec(idx, ctx->nr_user_files);
+		file_ptr = io_fixed_file_slot(&ctx->file_table, idx)->file_ptr;
+		file = (struct file *) (file_ptr & FFS_MASK);
+		if (file)
+			get_file(file);
 	}
-
-	return 0;
+	io_ring_submit_unlock(ctx, issue_flags);
+	return file;
 }
 
@@ -78,34 +97,27 @@ static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_ring_ctx *target_ctx = req->file->private_data;
 	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
 	struct io_ring_ctx *ctx = req->ctx;
-	unsigned long file_ptr;
-	struct file *src_file;
+	struct file *src_file = msg->src_file;
 	int ret;
 
 	if (target_ctx == ctx)
 		return -EINVAL;
+	if (!src_file) {
+		src_file = io_msg_grab_file(req, issue_flags);
+		if (!src_file)
+			return -EBADF;
+		msg->src_file = src_file;
+		req->flags |= REQ_F_NEED_CLEANUP;
+	}
 
-	ret = io_double_lock_ctx(ctx, target_ctx, issue_flags);
-	if (unlikely(ret))
-		return ret;
-
-	ret = -EBADF;
-	if (unlikely(msg->src_fd >= ctx->nr_user_files))
-		goto out_unlock;
-
-	msg->src_fd = array_index_nospec(msg->src_fd, ctx->nr_user_files);
-	file_ptr = io_fixed_file_slot(&ctx->file_table, msg->src_fd)->file_ptr;
-	if (!file_ptr)
-		goto out_unlock;
-
-	src_file = (struct file *) (file_ptr & FFS_MASK);
-	get_file(src_file);
+	if (unlikely(io_double_lock_ctx(target_ctx, issue_flags)))
+		return -EAGAIN;
 
 	ret = __io_fixed_fd_install(target_ctx, src_file, msg->dst_fd);
-	if (ret < 0) {
-		fput(src_file);
+	if (ret < 0)
 		goto out_unlock;
-	}
+
+	msg->src_file = NULL;
+	req->flags &= ~REQ_F_NEED_CLEANUP;
 
 	if (msg->flags & IORING_MSG_RING_CQE_SKIP)
 		goto out_unlock;
@@ -119,7 +131,7 @@ static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
 	if (!io_post_aux_cqe(target_ctx, msg->user_data, msg->len, 0))
 		ret = -EOVERFLOW;
 out_unlock:
-	io_double_unlock_ctx(ctx, target_ctx, issue_flags);
+	io_double_unlock_ctx(target_ctx, issue_flags);
 	return ret;
 }
 
@@ -130,6 +142,7 @@ int io_msg_ring_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	if (unlikely(sqe->buf_index || sqe->personality))
 		return -EINVAL;
 
+	msg->src_file = NULL;
 	msg->user_data = READ_ONCE(sqe->off);
 	msg->len = READ_ONCE(sqe->len);
 	msg->cmd = READ_ONCE(sqe->addr);
diff --git a/io_uring/msg_ring.h b/io_uring/msg_ring.h
index fb9601f202d0..3987ee6c0e5f 100644
--- a/io_uring/msg_ring.h
+++ b/io_uring/msg_ring.h
@@ -2,3 +2,4 @@
 
 int io_msg_ring_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags);
+void io_msg_ring_cleanup(struct io_kiocb *req);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 04dd2c983fce..3aa0d65c50e3 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -445,6 +445,7 @@ const struct io_op_def io_op_defs[] = {
 		.name			= "MSG_RING",
 		.prep			= io_msg_ring_prep,
 		.issue			= io_msg_ring,
+		.cleanup		= io_msg_ring_cleanup,
 	},
 	[IORING_OP_FSETXATTR] = {
 		.needs_file = 1,
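For context, this split maps onto the fixed-file passing flow of
IORING_OP_MSG_RING: take a file from the source ring's fixed table, then
lock only the target ring to install it. A minimal sketch, assuming a
liburing new enough to provide io_uring_prep_msg_ring_fd() and a target
ring with a registered file table (slot 0 and the user_data value are
arbitrary):

#include <liburing.h>

static int pass_fixed_fd(struct io_uring *src, struct io_uring *dst,
			 int src_fixed_idx)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(src);
	struct io_uring_cqe *cqe;
	int ret;

	if (!sqe)
		return -1;
	/* install source slot src_fixed_idx into slot 0 of the target */
	io_uring_prep_msg_ring_fd(sqe, dst->ring_fd, src_fixed_idx, 0,
				  0xcafe, 0);
	io_uring_submit(src);

	ret = io_uring_wait_cqe(src, &cqe);
	if (ret)
		return ret;
	ret = cqe->res;
	io_uring_cqe_seen(src, cqe);
	return ret;
}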
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13066603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECF7EC63706 for ; Wed, 7 Dec 2022 03:54:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229534AbiLGDys (ORCPT ); Tue, 6 Dec 2022 22:54:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229846AbiLGDyn (ORCPT ); Tue, 6 Dec 2022 22:54:43 -0500 Received: from mail-ed1-x533.google.com (mail-ed1-x533.google.com [IPv6:2a00:1450:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9D40652158 for ; Tue, 6 Dec 2022 19:54:42 -0800 (PST) Received: by mail-ed1-x533.google.com with SMTP id d20so23234328edn.0 for ; Tue, 06 Dec 2022 19:54:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CQ9KJKd7U+9IDFMcUWamULroPRymnHOqPsKek3NuXMM=; b=C9jFDLvDjPZIC5ksjJJFom+cC9zQ0raq5uLSf4vKE/mV4HPzIqks0BTaMNoWWFoWUi bxZbUI2wTTDHv/9yVT3vIjaHsDMAPdtJaYuH+Or1nFxa2SI7jVyE3o0j8peCYRXRiEai T2TMkFHfUkzIVb3UX9uCGDFeVucqXbbBpjunRvP0XxwHYwSsH03i9E2onBwDcuSSUiM1 surojpdbGw7tUhDpdVfjN45bDJzeXSf8TEHyWOrJZAWQjJieZn/lDuRo+cTeVsh7ONyL dmSBNZi6JulkiAQKxcgJA6dEuc3rFV66+RoY7ZXTxXjjKUgnEpyriUiK9NSGo7cl6WFL fRVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CQ9KJKd7U+9IDFMcUWamULroPRymnHOqPsKek3NuXMM=; b=clQqtFOGy1a8NgTogB0HNiwuMuQcg0vv3Gub+3bGdnEVZjatRNsqO49CEf07wGv39i 6Otl/3ieS+ojm9IBcBDJ5aQbu+3L1fbByRoA7SwP2Ao6ntSsEZZ1D8ZmGF7ZgqX6K9BA VDzifAYgKglNuHPFtG2E9BQpkpXurtGHTXp18QLNELgRAZv/dSksSU/wG1VufUefcIKl tBbd7z6zlLlNhqBbMOY/OAM2jk0ZjrZ8SUBX4MgZL2RNnWMvKz4B2FNZt7Vn6fbnrKbM ouu6QngbWpvr2iQ9MFDhsDyZ9PFqFsQ5VQEkwAKhr5o2YuRLx5m/tiafCwEDncu7V+tH FUmA== X-Gm-Message-State: ANoB5pl0QGoq6wHcipdpFVKbaS9EdgU9EAiJjWs2h74Zmo8krO/A2aRv 9fyzKfaZBuTTc3X6MfzNH4QiypIhfvY= X-Google-Smtp-Source: AA0mqf7/9qBMMFy87ZWzndKGsXOTJVEfjmKR/ljZN1xCPqQky4EAnwybCYfhS6EU1yeYOlTqUmo7dA== X-Received: by 2002:a05:6402:e04:b0:469:e6ef:9164 with SMTP id h4-20020a0564020e0400b00469e6ef9164mr31999116edh.185.1670385281035; Tue, 06 Dec 2022 19:54:41 -0800 (PST) Received: from 127.0.0.1localhost (94.196.241.58.threembb.co.uk. 
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 10/12] io_uring: extract a io_msg_install_complete helper
Date: Wed, 7 Dec 2022 03:53:35 +0000
Message-Id: <1500ca1054cc4286a3ee1c60aacead57fcdfa02a.1670384893.git.asml.silence@gmail.com>

Extract a helper called io_msg_install_complete() from io_msg_send_fd();
it will be used later.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/msg_ring.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index 387c45a58654..525063ac3dab 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -92,36 +92,25 @@ static struct file *io_msg_grab_file(struct io_kiocb *req, unsigned int issue_flags)
 	return file;
 }
 
-static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
+static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_ring_ctx *target_ctx = req->file->private_data;
 	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
-	struct io_ring_ctx *ctx = req->ctx;
 	struct file *src_file = msg->src_file;
 	int ret;
 
-	if (target_ctx == ctx)
-		return -EINVAL;
-	if (!src_file) {
-		src_file = io_msg_grab_file(req, issue_flags);
-		if (!src_file)
-			return -EBADF;
-		msg->src_file = src_file;
-		req->flags |= REQ_F_NEED_CLEANUP;
-	}
-
 	if (unlikely(io_double_lock_ctx(target_ctx, issue_flags)))
 		return -EAGAIN;
 
 	ret = __io_fixed_fd_install(target_ctx, src_file, msg->dst_fd);
 	if (ret < 0)
 		goto out_unlock;
+
 	msg->src_file = NULL;
 	req->flags &= ~REQ_F_NEED_CLEANUP;
 
 	if (msg->flags & IORING_MSG_RING_CQE_SKIP)
 		goto out_unlock;
-
 	/*
 	 * If this fails, the target still received the file descriptor but
 	 * wasn't notified of the fact. This means that if this request
@@ -135,6 +124,25 @@ static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
 	return ret;
 }
 
+static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_ring_ctx *target_ctx = req->file->private_data;
+	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
+	struct io_ring_ctx *ctx = req->ctx;
+	struct file *src_file = msg->src_file;
+
+	if (target_ctx == ctx)
+		return -EINVAL;
+	if (!src_file) {
+		src_file = io_msg_grab_file(req, issue_flags);
+		if (!src_file)
+			return -EBADF;
+		msg->src_file = src_file;
+		req->flags |= REQ_F_NEED_CLEANUP;
+	}
+	return io_msg_install_complete(req, issue_flags);
+}
+
 int io_msg_ring_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_msg *msg = io_kiocb_to_cmd(req, struct io_msg);
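For readers tracking the series from userspace: MSG_RING has two sub-commands, IORING_MSG_DATA (handled by io_msg_ring_data()) and IORING_MSG_SEND_FD (the io_msg_send_fd() path refactored above). As a rough illustration of the data variant, here is a minimal sketch; it assumes liburing 2.2 or later for io_uring_prep_msg_ring() and elides most error handling, so treat it as an outline rather than canonical usage:

#include <liburing.h>
#include <stdio.h>

/* Post a CQE carrying (user_data, res) into 'target' by submitting an
 * IORING_OP_MSG_RING SQE on 'sender'. */
static int msg_ring_ping(struct io_uring *sender, struct io_uring *target)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(sender);
	struct io_uring_cqe *cqe;
	int ret;

	if (!sqe)
		return -EBUSY;
	io_uring_prep_msg_ring(sqe, target->ring_fd, /*len -> cqe res*/ 42,
			       /*data -> cqe user_data*/ 0xcafe, 0);
	ret = io_uring_submit(sender);
	if (ret < 0)
		return ret;

	/* The message surfaces as a CQE on the target ring. */
	ret = io_uring_wait_cqe(target, &cqe);
	if (ret < 0)
		return ret;
	printf("user_data=%llu res=%d\n",
	       (unsigned long long)cqe->user_data, cqe->res);
	io_uring_cqe_seen(target, cqe);
	return 0;
}

The fd variant is driven the same way, but fills the source/destination slot fields of the SQE; the exact field layout is defined by io_msg_ring_prep() above.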
From patchwork Wed Dec 7 03:53:36 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066602
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 11/12] io_uring: do msg_ring in target task via tw
Date: Wed, 7 Dec 2022 03:53:36 +0000
Message-Id: <4d76c7b28ed5d71b520de4482fbb7f660f21cd80.1670384893.git.asml.silence@gmail.com>

While executing in the context of one io_uring instance, msg_ring
manipulates another ring. Since we're trying to keep CQE posting
contained in the context of the ring-owner task, use task_work to send
the request to the target ring's task whenever we're modifying its CQ
or trying to install a file into it. Note that we can't safely use the
io_uring task_work infrastructure here and have to use task_work
directly.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/msg_ring.c | 56 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 53 insertions(+), 3 deletions(-)

diff --git a/io_uring/msg_ring.c b/io_uring/msg_ring.c
index 525063ac3dab..24e6cc477515 100644
--- a/io_uring/msg_ring.c
+++ b/io_uring/msg_ring.c
@@ -16,6 +16,7 @@
 struct io_msg {
 	struct file *file;
 	struct file *src_file;
+	struct callback_head tw;
 	u64 user_data;
 	u32 len;
 	u32 cmd;
@@ -35,6 +36,23 @@ void io_msg_ring_cleanup(struct io_kiocb *req)
 	msg->src_file = NULL;
 }
 
+static void io_msg_tw_complete(struct callback_head *head)
+{
+	struct io_msg *msg = container_of(head, struct io_msg, tw);
+	struct io_kiocb *req = cmd_to_io_kiocb(msg);
+	struct io_ring_ctx *target_ctx = req->file->private_data;
+	int ret = 0;
+
+	if (current->flags & PF_EXITING)
+		ret = -EOWNERDEAD;
+	else if (!io_post_aux_cqe(target_ctx, msg->user_data, msg->len, 0))
+		ret = -EOVERFLOW;
+
+	if (ret < 0)
+		req_set_fail(req);
+	io_req_queue_tw_complete(req, ret);
+}
+
 static int io_msg_ring_data(struct io_kiocb *req)
 {
 	struct io_ring_ctx *target_ctx = req->file->private_data;
@@ -43,6 +61,15 @@ static int io_msg_ring_data(struct io_kiocb *req)
 	if (msg->src_fd || msg->dst_fd || msg->flags)
 		return -EINVAL;
 
+	if (target_ctx->task_complete && current != target_ctx->submitter_task) {
+		init_task_work(&msg->tw, io_msg_tw_complete);
+		if (task_work_add(target_ctx->submitter_task, &msg->tw,
+				  TWA_SIGNAL))
+			return -EOWNERDEAD;
+
+		return IOU_ISSUE_SKIP_COMPLETE;
+	}
+
 	if (io_post_aux_cqe(target_ctx, msg->user_data, msg->len, 0))
 		return 0;
@@ -124,6 +151,19 @@ static int io_msg_install_complete(struct io_kiocb *req, unsigned int issue_flags)
 	return ret;
 }
 
+static void io_msg_tw_fd_complete(struct callback_head *head)
+{
+	struct io_msg *msg = container_of(head, struct io_msg, tw);
+	struct io_kiocb *req = cmd_to_io_kiocb(msg);
+	int ret = -EOWNERDEAD;
+
+	if (!(current->flags & PF_EXITING))
+		ret = io_msg_install_complete(req, IO_URING_F_UNLOCKED);
+	if (ret < 0)
+		req_set_fail(req);
+	io_req_queue_tw_complete(req, ret);
+}
+
 static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
 {
 	struct io_ring_ctx *target_ctx = req->file->private_data;
@@ -140,6 +180,15 @@ static int io_msg_send_fd(struct io_kiocb *req, unsigned int issue_flags)
 		msg->src_file = src_file;
 		req->flags |= REQ_F_NEED_CLEANUP;
 	}
+
+	if (target_ctx->task_complete && current != target_ctx->submitter_task) {
+		init_task_work(&msg->tw, io_msg_tw_fd_complete);
+		if (task_work_add(target_ctx->submitter_task, &msg->tw,
+				  TWA_SIGNAL))
+			return -EOWNERDEAD;
+
+		return IOU_ISSUE_SKIP_COMPLETE;
+	}
 	return io_msg_install_complete(req, issue_flags);
 }
@@ -185,10 +234,11 @@ int io_msg_ring(struct io_kiocb *req, unsigned int issue_flags)
 	}
 
 done:
-	if (ret == -EAGAIN)
-		return -EAGAIN;
-	if (ret < 0)
+	if (ret < 0) {
+		if (ret == -EAGAIN || ret == IOU_ISSUE_SKIP_COMPLETE)
+			return ret;
 		req_set_fail(req);
+	}
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
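A consequence worth noting for userspace: the task_work path above engages when the target ring was created with flags that, in this series, make ->task_complete true. A minimal sketch of such a target ring, assuming a kernel new enough for these uapi flags (DEFER_TASKRUN requires SINGLE_ISSUER):

#include <liburing.h>

/* Sketch: create a ring whose completions are task-local, so a
 * msg_ring sender targeting it goes through the task_work path
 * added above. */
static int make_private_ring(struct io_uring *ring)
{
	return io_uring_queue_init(8, ring,
				   IORING_SETUP_SINGLE_ISSUER |
				   IORING_SETUP_DEFER_TASKRUN);
}

With such a target, the CQE only materialises once the target's submitter task next enters the kernel (io_uring_wait_cqe() and friends run the queued task_work); if that task is exiting, the sender's request fails with -EOWNERDEAD, matching io_msg_tw_complete() above.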
From patchwork Wed Dec 7 03:53:37 2022
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13066604
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com
Subject: [PATCH for-next v2 12/12] io_uring: skip spinlocking for ->task_complete
Date: Wed, 7 Dec 2022 03:53:37 +0000
Message-Id: <2a8c91fd82cfcdcc1d2e5bac7051fe2c183bda73.1670384893.git.asml.silence@gmail.com>

->task_complete was added to serialise CQE posting by doing it from the
task context only (or the fallback wq when the task is dead), and now
we can use that to avoid taking ->completion_lock while filling CQ
entries. The patch skips spinlocking only in two spots,
__io_submit_flush_completions() and flushing in io_aux_cqe(); that is
safer and covers all the cases we care about. Extra care is taken to
force taking the lock while queueing overflow entries.

It fundamentally relies on SINGLE_ISSUER to have only one task posting
events. It also needs to take into account overflowed CQEs, the
flushing of which happens in the CQ wait path, and so this
implementation also needs DEFER_TASKRUN to limit waiters. For the same
reason we disable it for SQPOLL, and for IOPOLL as well, since it
wouldn't benefit from it in any case. The DEFER_TASKRUN, SQPOLL and
IOPOLL requirements may be relaxed in the future.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c | 71 +++++++++++++++++++++++++++++++++------------
 io_uring/io_uring.h | 12 ++++++--
 2 files changed, 62 insertions(+), 21 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 0e424d8721ab..529ea5942dea 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -595,13 +595,25 @@ static inline void io_cq_unlock(struct io_ring_ctx *ctx)
 	spin_unlock(&ctx->completion_lock);
 }
 
+static inline void __io_cq_lock(struct io_ring_ctx *ctx)
+	__acquires(ctx->completion_lock)
+{
+	if (!ctx->task_complete)
+		spin_lock(&ctx->completion_lock);
+}
+
+static inline void __io_cq_unlock(struct io_ring_ctx *ctx)
+{
+	if (!ctx->task_complete)
+		spin_unlock(&ctx->completion_lock);
+}
+
 /* keep it inlined for io_submit_flush_completions() */
-static inline void io_cq_unlock_post_inline(struct io_ring_ctx *ctx)
+static inline void __io_cq_unlock_post(struct io_ring_ctx *ctx)
 	__releases(ctx->completion_lock)
 {
 	io_commit_cqring(ctx);
-	spin_unlock(&ctx->completion_lock);
-
+	__io_cq_unlock(ctx);
 	io_commit_cqring_flush(ctx);
 	io_cqring_wake(ctx);
 }
@@ -609,7 +621,10 @@ static inline void io_cq_unlock_post_inline(struct io_ring_ctx *ctx)
 void io_cq_unlock_post(struct io_ring_ctx *ctx)
 	__releases(ctx->completion_lock)
 {
-	io_cq_unlock_post_inline(ctx);
+	io_commit_cqring(ctx);
+	spin_unlock(&ctx->completion_lock);
+	io_commit_cqring_flush(ctx);
+	io_cqring_wake(ctx);
 }
 
 /* Returns true if there are no backlogged entries after the flush */
@@ -796,12 +811,13 @@ struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow)
 	return &rings->cqes[off];
 }
 
-static bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags,
-			    bool allow_overflow)
+static bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res,
+			    u32 cflags)
 {
 	struct io_uring_cqe *cqe;
 
-	lockdep_assert_held(&ctx->completion_lock);
+	if (!ctx->task_complete)
+		lockdep_assert_held(&ctx->completion_lock);
 
 	ctx->cq_extra++;
@@ -824,10 +840,6 @@ static bool io_fill_cqe_aux(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags,
 		}
 		return true;
 	}
-
-	if (allow_overflow)
-		return io_cqring_event_overflow(ctx, user_data, res, cflags, 0, 0);
-
 	return false;
 }
@@ -841,7 +853,17 @@ static void __io_flush_post_cqes(struct io_ring_ctx *ctx)
 	for (i = 0; i < state->cqes_count; i++) {
 		struct io_uring_cqe *cqe = &state->cqes[i];
 
-		io_fill_cqe_aux(ctx, cqe->user_data, cqe->res, cqe->flags, true);
+		if (!io_fill_cqe_aux(ctx, cqe->user_data, cqe->res, cqe->flags)) {
+			if (ctx->task_complete) {
+				spin_lock(&ctx->completion_lock);
+				io_cqring_event_overflow(ctx, cqe->user_data,
+							cqe->res, cqe->flags, 0, 0);
+				spin_unlock(&ctx->completion_lock);
+			} else {
+				io_cqring_event_overflow(ctx, cqe->user_data,
+							cqe->res, cqe->flags, 0, 0);
+			}
+		}
 	}
 	state->cqes_count = 0;
 }
@@ -852,7 +874,10 @@ static bool __io_post_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags,
 	bool filled;
 
 	io_cq_lock(ctx);
-	filled = io_fill_cqe_aux(ctx, user_data, res, cflags, allow_overflow);
+	filled = io_fill_cqe_aux(ctx, user_data, res, cflags);
+	if (!filled && allow_overflow)
+		filled = io_cqring_event_overflow(ctx, user_data, res, cflags, 0, 0);
+
 	io_cq_unlock_post(ctx);
 	return filled;
 }
@@ -876,10 +901,10 @@ bool io_aux_cqe(struct io_ring_ctx *ctx, bool defer, u64 user_data, s32 res, u32 cflags,
 	lockdep_assert_held(&ctx->uring_lock);
 
 	if (ctx->submit_state.cqes_count == length) {
-		io_cq_lock(ctx);
+		__io_cq_lock(ctx);
 		__io_flush_post_cqes(ctx);
 		/* no need to flush - flush is deferred */
-		io_cq_unlock(ctx);
+		__io_cq_unlock_post(ctx);
 	}
 
 	/* For defered completions this is not as strict as it is otherwise,
@@ -1414,7 +1439,7 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 	struct io_wq_work_node *node, *prev;
 	struct io_submit_state *state = &ctx->submit_state;
 
-	io_cq_lock(ctx);
+	__io_cq_lock(ctx);
 	/* must come first to preserve CQE ordering in failure cases */
 	if (state->cqes_count)
 		__io_flush_post_cqes(ctx);
@@ -1422,10 +1447,18 @@ static void __io_submit_flush_completions(struct io_ring_ctx *ctx)
 		struct io_kiocb *req = container_of(node, struct io_kiocb,
 						    comp_list);
 
-		if (!(req->flags & REQ_F_CQE_SKIP))
-			io_fill_cqe_req(ctx, req);
+		if (!(req->flags & REQ_F_CQE_SKIP) &&
+		    unlikely(!__io_fill_cqe_req(ctx, req))) {
+			if (ctx->task_complete) {
+				spin_lock(&ctx->completion_lock);
+				io_req_cqe_overflow(req);
+				spin_unlock(&ctx->completion_lock);
+			} else {
+				io_req_cqe_overflow(req);
+			}
+		}
 	}
-	io_cq_unlock_post_inline(ctx);
+	__io_cq_unlock_post(ctx);
 
 	if (!wq_list_empty(&ctx->submit_state.compl_reqs)) {
 		io_free_batch_list(ctx, state->compl_reqs.first);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 62227ec3260c..c117e029c8dc 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -110,7 +110,7 @@ static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
 	return io_get_cqe_overflow(ctx, false);
 }
 
-static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
+static inline bool __io_fill_cqe_req(struct io_ring_ctx *ctx,
 				   struct io_kiocb *req)
 {
 	struct io_uring_cqe *cqe;
@@ -122,7 +122,7 @@ static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
 	 */
 	cqe = io_get_cqe(ctx);
 	if (unlikely(!cqe))
-		return io_req_cqe_overflow(req);
+		return false;
 
 	trace_io_uring_complete(req->ctx, req, req->cqe.user_data,
 				req->cqe.res, req->cqe.flags,
@@ -145,6 +145,14 @@ static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
 	return true;
 }
 
+static inline bool io_fill_cqe_req(struct io_ring_ctx *ctx,
+				   struct io_kiocb *req)
+{
+	if (likely(__io_fill_cqe_req(ctx, req)))
+		return true;
+	return io_req_cqe_overflow(req);
+}
+
 static inline void req_set_fail(struct io_kiocb *req)
 {
 	req->flags |= REQ_F_FAIL;
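To close the series out, the locking rule this last patch establishes can be distilled into a few lines. The following is a standalone model for illustration, with pthreads standing in for kernel primitives and invented names; it is not kernel code:

#include <pthread.h>
#include <stdbool.h>

struct cq {
	pthread_spinlock_t lock;
	bool task_complete;	/* exactly one task posts CQEs */
};

/* Fast path: the single designated poster may fill the CQ locklessly. */
static void cq_lock(struct cq *cq)
{
	if (!cq->task_complete)
		pthread_spin_lock(&cq->lock);
}

static void cq_unlock(struct cq *cq)
{
	if (!cq->task_complete)
		pthread_spin_unlock(&cq->lock);
}

/* Overflow entries are also consumed from the CQ-wait path, i.e. by
 * other contexts, so queueing one takes the lock unconditionally even
 * when the fast path above elided it. */
static void cq_queue_overflow(struct cq *cq)
{
	pthread_spin_lock(&cq->lock);
	/* ... add entry to the overflow list ... */
	pthread_spin_unlock(&cq->lock);
}

The invariant is that the CQ tail has exactly one writer, the submitter task, which SINGLE_ISSUER plus DEFER_TASKRUN guarantees, so the fast path needs no lock at all, while the overflow list keeps the lock because other contexts flush it.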