From patchwork Thu Apr 6 13:20:07 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203320
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/8] io_uring: move pinning out of io_req_local_work_add
Date: Thu, 6 Apr 2023 14:20:07 +0100
Message-Id: <49c0dbed390b0d6d04cb942dd3592879fd5bfb1b.1680782017.git.asml.silence@gmail.com>

Move ctx pinning from io_req_local_work_add() to the caller; it looks
better and makes working with the code a bit easier.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ae90d2753e0d..29a0516ee5ce 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1306,17 +1306,15 @@ static void io_req_local_work_add(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
 
-	percpu_ref_get(&ctx->refs);
-
 	if (!llist_add(&req->io_task_work.node, &ctx->work_llist))
-		goto put_ref;
+		return;
 
 	/* needed for the following wake up */
 	smp_mb__after_atomic();
 
 	if (unlikely(atomic_read(&req->task->io_uring->in_cancel))) {
 		io_move_task_work_from_local(ctx);
-		goto put_ref;
+		return;
 	}
 
 	if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
@@ -1326,9 +1324,6 @@ static void io_req_local_work_add(struct io_kiocb *req)
 
 	if (READ_ONCE(ctx->cq_waiting))
 		wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);
-
-put_ref:
-	percpu_ref_put(&ctx->refs);
 }
 
 void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
@@ -1337,7 +1332,9 @@ void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
+		percpu_ref_get(&ctx->refs);
 		io_req_local_work_add(req);
+		percpu_ref_put(&ctx->refs);
 		return;
 	}
 

From patchwork Thu Apr 6 13:20:08 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203323
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/8] io_uring: optimise local tw add ctx pinning
Date: Thu, 6 Apr 2023 14:20:08 +0100

We currently pin the ctx for io_req_local_work_add() with
percpu_ref_get/put, which imply two rcu_read_lock/unlock pairs and some
extra overhead on top in the fast path. Replace it with a pure RCU read
section and let io_ring_exit_work() synchronise against it.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 29a0516ee5ce..fb7215b543cd 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1332,9 +1332,9 @@ void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
 	struct io_ring_ctx *ctx = req->ctx;
 
 	if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
-		percpu_ref_get(&ctx->refs);
+		rcu_read_lock();
 		io_req_local_work_add(req);
-		percpu_ref_put(&ctx->refs);
+		rcu_read_unlock();
 		return;
 	}
 
@@ -3052,6 +3052,10 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	spin_lock(&ctx->completion_lock);
 	spin_unlock(&ctx->completion_lock);
 
+	/* pairs with RCU read section in io_req_local_work_add() */
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+		synchronize_rcu();
+
 	io_ring_ctx_free(ctx);
 }
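
The pattern above is a general one: a short, non-sleeping fast path can pin
an object with an RCU read-side section instead of a reference count, as
long as the teardown path waits for readers before freeing. A minimal
sketch of that pattern (illustrative only, with hypothetical helpers
do_fast_work() and obj_needs_rcu_sync(); not the io_uring code itself):

	/* fast path: pin via RCU instead of bumping a refcount */
	rcu_read_lock();
	do_fast_work(obj);	/* must not sleep inside the section */
	rcu_read_unlock();

	/* teardown path: wait out every such reader, then free */
	if (obj_needs_rcu_sync(obj))
		synchronize_rcu();
	kfree(obj);

Since percpu_ref_get()/put() already imply RCU read sections internally,
the swap removes that extra layer on the hot path at the cost of one
synchronize_rcu() in the (cold) exit path.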

From patchwork Thu Apr 6 13:20:09 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203324
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 3/8] io_uring: refactor __io_cq_unlock_post_flush()
Date: Thu, 6 Apr 2023 14:20:09 +0100
Message-Id: <662ee5d898168ac206be06038525e97b64072a46.1680782017.git.asml.silence@gmail.com>

Instead of smp_mb() + __io_cqring_wake() in __io_cq_unlock_post_flush()
use the equivalent io_cqring_wake(). With that we can clean it up
further and remove __io_cqring_wake().

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c |  6 ++----
 io_uring/io_uring.h | 11 ++---------
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index fb7215b543cd..d4ac62de2113 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -640,10 +640,8 @@ static inline void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
 	 * it will re-check the wakeup conditions once we return we can safely
 	 * skip waking it up.
 	 */
-	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
-		smp_mb();
-		__io_cqring_wake(ctx);
-	}
+	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
+		io_cqring_wake(ctx);
 }
 
 void io_cq_unlock_post(struct io_ring_ctx *ctx)
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 193b2db39fe8..24d8196bbca3 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -228,8 +228,7 @@ static inline void io_poll_wq_wake(struct io_ring_ctx *ctx)
 			poll_to_key(EPOLL_URING_WAKE | EPOLLIN));
 }
 
-/* requires smb_mb() prior, see wq_has_sleeper() */
-static inline void __io_cqring_wake(struct io_ring_ctx *ctx)
+static inline void io_cqring_wake(struct io_ring_ctx *ctx)
 {
 	/*
 	 * Trigger waitqueue handler on all waiters on our waitqueue. This
@@ -241,17 +240,11 @@ static inline void __io_cqring_wake(struct io_ring_ctx *ctx)
 	 * waitqueue handlers, we know we have a dependency between eventfd or
 	 * epoll and should terminate multishot poll at that point.
 	 */
-	if (waitqueue_active(&ctx->cq_wait))
+	if (wq_has_sleeper(&ctx->cq_wait))
 		__wake_up(&ctx->cq_wait, TASK_NORMAL, 0,
 			  poll_to_key(EPOLL_URING_WAKE | EPOLLIN));
 }
 
-static inline void io_cqring_wake(struct io_ring_ctx *ctx)
-{
-	smp_mb();
-	__io_cqring_wake(ctx);
-}
-
 static inline bool io_sqring_full(struct io_ring_ctx *ctx)
 {
 	struct io_rings *r = ctx->rings;
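
For reference, the reason the open-coded smp_mb() can be folded away is
that wq_has_sleeper() is essentially a full barrier followed by the
waitqueue emptiness check, roughly (paraphrased from include/linux/wait.h):

	static inline bool wq_has_sleeper(struct wait_queue_head *wq_head)
	{
		/*
		 * Pairs with the barrier a waiter executes between queueing
		 * itself and re-checking its wait condition, so the waker
		 * either sees the waiter or the waiter sees the new state.
		 */
		smp_mb();
		return waitqueue_active(wq_head);
	}

Switching the check from waitqueue_active() to wq_has_sleeper() therefore
keeps the ordering and makes the old smp_mb() + __io_cqring_wake() wrapper
unnecessary.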

From patchwork Thu Apr 6 13:20:10 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203322
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 4/8] io_uring: add tw add flags
Date: Thu, 6 Apr 2023 14:20:10 +0100
Message-Id: <4c0f01e7ef4e6feebfb199093cc995af7a19befa.1680782017.git.asml.silence@gmail.com>

We pass 'allow_local' into io_req_task_work_add() but will need more
flags. Replace it with a flags bitfield and give the allow_local flag a
proper name.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 7 ++++---
 io_uring/io_uring.h | 9 +++++++--
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index d4ac62de2113..6f175fe682e4 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1324,12 +1324,13 @@ static void io_req_local_work_add(struct io_kiocb *req)
 		wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);
 }
 
-void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
+void __io_req_task_work_add(struct io_kiocb *req, unsigned flags)
 {
 	struct io_uring_task *tctx = req->task->io_uring;
 	struct io_ring_ctx *ctx = req->ctx;
 
-	if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
+	if (!(flags & IOU_F_TWQ_FORCE_NORMAL) &&
+	    (ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
 		rcu_read_lock();
 		io_req_local_work_add(req);
 		rcu_read_unlock();
@@ -1359,7 +1360,7 @@ static void __cold io_move_task_work_from_local(struct io_ring_ctx *ctx)
 						    io_task_work.node);
 
 		node = node->next;
-		__io_req_task_work_add(req, false);
+		__io_req_task_work_add(req, IOU_F_TWQ_FORCE_NORMAL);
 	}
 }
 
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 24d8196bbca3..cb4309a2acdc 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -15,6 +15,11 @@
 #include
 #endif
 
+enum {
+	/* don't use deferred task_work */
+	IOU_F_TWQ_FORCE_NORMAL	= 1,
+};
+
 enum {
 	IOU_OK				= 0,
 	IOU_ISSUE_SKIP_COMPLETE		= -EIOCBQUEUED,
@@ -48,7 +53,7 @@ static inline bool io_req_ffs_set(struct io_kiocb *req)
 	return req->flags & REQ_F_FIXED_FILE;
 }
 
-void __io_req_task_work_add(struct io_kiocb *req, bool allow_local);
+void __io_req_task_work_add(struct io_kiocb *req, unsigned flags);
 bool io_is_uring_fops(struct file *file);
 bool io_alloc_async_data(struct io_kiocb *req);
 void io_req_task_queue(struct io_kiocb *req);
@@ -93,7 +98,7 @@ bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
 
 static inline void io_req_task_work_add(struct io_kiocb *req)
 {
-	__io_req_task_work_add(req, true);
+	__io_req_task_work_add(req, 0);
 }
 
 #define io_for_each_link(pos, head) \

From patchwork Thu Apr 6 13:20:11 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203325
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 5/8] io_uring: inline llist_add()
Date: Thu, 6 Apr 2023 14:20:11 +0100

We'll need to grab some information from the previous request in the tw
list. Inline llist_add(); it'll be used in the following patch.

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6f175fe682e4..786ecfa01c54 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1303,8 +1303,15 @@ static __cold void io_fallback_tw(struct io_uring_task *tctx)
 static void io_req_local_work_add(struct io_kiocb *req)
 {
 	struct io_ring_ctx *ctx = req->ctx;
+	struct llist_node *first;
 
-	if (!llist_add(&req->io_task_work.node, &ctx->work_llist))
+	first = READ_ONCE(ctx->work_llist.first);
+	do {
+		req->io_task_work.node.next = first;
+	} while (!try_cmpxchg(&ctx->work_llist.first, &first,
+			      &req->io_task_work.node));
+
+	if (first)
 		return;
 
 	/* needed for the following wake up */
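
The open-coded loop is llist_add() spelled out, with the difference that
the function now keeps the previous list head around for the next patch to
inspect. As a standalone sketch (hypothetical helper name, not an existing
llist.h API):

	/* push 'node' onto a lock-free llist and report the old head */
	static struct llist_node *llist_push_get_prev(struct llist_node *node,
						      struct llist_head *head)
	{
		struct llist_node *first = READ_ONCE(head->first);

		do {
			node->next = first;
			/* on failure try_cmpxchg() reloads 'first', so just retry */
		} while (!try_cmpxchg(&head->first, &first, node));

		/* NULL means the list was empty, i.e. what !llist_add() reported */
		return first;
	}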

From patchwork Thu Apr 6 13:20:12 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203327
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 6/8] io_uring: reduce scheduling due to tw
Date: Thu, 6 Apr 2023 14:20:12 +0100

Every task_work will try to wake the task to be executed, which causes
excessive scheduling and additional overhead. For some tw it's
justified, but others won't do much more than post a single CQE.

When a task waits for multiple CQEs, every such task_work will wake it
up. Instead, the task may give a hint about how many CQEs it waits for;
io_req_local_work_add() will compare against it and skip wake-ups if
#cqes + #tw is not enough to satisfy the waiting condition. Task_work
that uses the optimisation should be simple enough and never post more
than one CQE. It's also ignored for non-DEFER_TASKRUN rings.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  3 +-
 io_uring/io_uring.c            | 68 +++++++++++++++++++++++-----------
 io_uring/io_uring.h            |  9 +++++
 io_uring/notif.c               |  2 +-
 io_uring/notif.h               |  2 +-
 io_uring/rw.c                  |  2 +-
 6 files changed, 61 insertions(+), 25 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 4a6ce03a4903..fa621a508a01 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -296,7 +296,7 @@ struct io_ring_ctx {
 	spinlock_t		completion_lock;
 
 	bool			poll_multi_queue;
-	bool			cq_waiting;
+	atomic_t		cq_wait_nr;
 
 	/*
 	 * ->iopoll_list is protected by the ctx->uring_lock for
@@ -566,6 +566,7 @@ struct io_kiocb {
 	atomic_t			refs;
 	atomic_t			poll_refs;
 	struct io_task_work		io_task_work;
+	unsigned			nr_tw;
 	/* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
 	union {
 		struct hlist_node	hash_node;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 786ecfa01c54..8a327a81beaf 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1300,35 +1300,59 @@ static __cold void io_fallback_tw(struct io_uring_task *tctx)
 	}
 }
 
-static void io_req_local_work_add(struct io_kiocb *req)
+static void io_req_local_work_add(struct io_kiocb *req, unsigned flags)
 {
 	struct io_ring_ctx *ctx = req->ctx;
+	unsigned nr_wait, nr_tw, nr_tw_prev;
 	struct llist_node *first;
 
+	if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
+		flags &= ~IOU_F_TWQ_LAZY_WAKE;
+
 	first = READ_ONCE(ctx->work_llist.first);
 	do {
+		nr_tw_prev = 0;
+		if (first) {
+			struct io_kiocb *first_req = container_of(first,
+							struct io_kiocb,
+							io_task_work.node);
+			/*
+			 * Might be executed at any moment, rely on
+			 * SLAB_TYPESAFE_BY_RCU to keep it alive.
+			 */
+			nr_tw_prev = READ_ONCE(first_req->nr_tw);
+		}
+		nr_tw = nr_tw_prev + 1;
+		/* Large enough to fail the nr_wait comparison below */
+		if (!(flags & IOU_F_TWQ_LAZY_WAKE))
+			nr_tw = -1U;
+
+		req->nr_tw = nr_tw;
 		req->io_task_work.node.next = first;
 	} while (!try_cmpxchg(&ctx->work_llist.first, &first,
 			      &req->io_task_work.node));
 
-	if (first)
-		return;
-
-	/* needed for the following wake up */
-	smp_mb__after_atomic();
-
-	if (unlikely(atomic_read(&req->task->io_uring->in_cancel))) {
-		io_move_task_work_from_local(ctx);
-		return;
+	if (!first) {
+		if (unlikely(atomic_read(&req->task->io_uring->in_cancel))) {
+			io_move_task_work_from_local(ctx);
+			return;
+		}
+		if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
+			atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
+		if (ctx->has_evfd)
+			io_eventfd_signal(ctx);
 	}
 
-	if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
-		atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
-	if (ctx->has_evfd)
-		io_eventfd_signal(ctx);
-
-	if (READ_ONCE(ctx->cq_waiting))
-		wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);
+	nr_wait = atomic_read(&ctx->cq_wait_nr);
+	/* no one is waiting */
+	if (!nr_wait)
+		return;
+	/* either not enough or the previous add has already woken it up */
+	if (nr_wait > nr_tw || nr_tw_prev >= nr_wait)
+		return;
+	/* pairs with set_current_state() in io_cqring_wait() */
+	smp_mb__after_atomic();
+	wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);
 }
 
 void __io_req_task_work_add(struct io_kiocb *req, unsigned flags)
@@ -1339,7 +1363,7 @@ void __io_req_task_work_add(struct io_kiocb *req, unsigned flags)
 	if (!(flags & IOU_F_TWQ_FORCE_NORMAL) &&
 	    (ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
 		rcu_read_lock();
-		io_req_local_work_add(req);
+		io_req_local_work_add(req, flags);
 		rcu_read_unlock();
 		return;
 	}
@@ -2625,7 +2649,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 		unsigned long check_cq;
 
 		if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
-			WRITE_ONCE(ctx->cq_waiting, 1);
+			int nr_wait = (int) iowq.cq_tail - READ_ONCE(ctx->rings->cq.tail);
+
+			atomic_set(&ctx->cq_wait_nr, nr_wait);
 			set_current_state(TASK_INTERRUPTIBLE);
 		} else {
 			prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
@@ -2634,7 +2660,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
 
 		ret = io_cqring_wait_schedule(ctx, &iowq);
 		__set_current_state(TASK_RUNNING);
-		WRITE_ONCE(ctx->cq_waiting, 0);
+		atomic_set(&ctx->cq_wait_nr, 0);
 
 		if (ret < 0)
 			break;
@@ -4517,7 +4543,7 @@ static int __init io_uring_init(void)
 	io_uring_optable_init();
 
 	req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
-				SLAB_ACCOUNT);
+				SLAB_ACCOUNT | SLAB_TYPESAFE_BY_RCU);
 
 	return 0;
 };
 __initcall(io_uring_init);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index cb4309a2acdc..ef449e43d493 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -18,6 +18,15 @@
 enum {
 	/* don't use deferred task_work */
 	IOU_F_TWQ_FORCE_NORMAL	= 1,
+
+	/*
+	 * A hint to not wake right away but delay until there are enough of
+	 * tw's queued to match the number of CQEs the task is waiting for.
+	 *
+	 * Must not be used with requests generating more than one CQE.
+	 * It's also ignored unless IORING_SETUP_DEFER_TASKRUN is set.
+	 */
+	IOU_F_TWQ_LAZY_WAKE	= 2,
 };
 
 enum {
diff --git a/io_uring/notif.c b/io_uring/notif.c
index 172105eb347d..e1846a25dde1 100644
--- a/io_uring/notif.c
+++ b/io_uring/notif.c
@@ -31,7 +31,7 @@ static void io_tx_ubuf_callback(struct sk_buff *skb, struct ubuf_info *uarg,
 	struct io_kiocb *notif = cmd_to_io_kiocb(nd);
 
 	if (refcount_dec_and_test(&uarg->refcnt))
-		io_req_task_work_add(notif);
+		__io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE);
 }
 
 static void io_tx_ubuf_callback_ext(struct sk_buff *skb, struct ubuf_info *uarg,
diff --git a/io_uring/notif.h b/io_uring/notif.h
index c88c800cd89d..6dd1b30a468f 100644
--- a/io_uring/notif.h
+++ b/io_uring/notif.h
@@ -33,7 +33,7 @@ static inline void io_notif_flush(struct io_kiocb *notif)
 
 	/* drop slot's master ref */
 	if (refcount_dec_and_test(&nd->uarg.refcnt))
-		io_req_task_work_add(notif);
+		__io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE);
 }
 
 static inline int io_notif_account_mem(struct io_kiocb *notif, unsigned len)
diff --git a/io_uring/rw.c b/io_uring/rw.c
index f14868624f41..6c7d2654770e 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -304,7 +304,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res)
 		return;
 	io_req_set_res(req, io_fixup_rw_res(req, res), 0);
 	req->io_task_work.func = io_req_rw_complete;
-	io_req_task_work_add(req);
+	__io_req_task_work_add(req, IOU_F_TWQ_LAZY_WAKE);
 }
 
 static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
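
The wake-up decision added at the end of io_req_local_work_add() boils
down to a small predicate over three counters; a sketch with a hypothetical
helper name (the patch open-codes this inline):

	/*
	 * nr_wait:    CQEs the waiter announced it needs (0 if nobody waits)
	 * nr_tw:      lazily queued items including this one (-1U forces a wake)
	 * nr_tw_prev: items that were already queued before this one
	 */
	static bool lazy_wake_needed(unsigned nr_wait, unsigned nr_tw,
				     unsigned nr_tw_prev)
	{
		if (!nr_wait)			/* no one is waiting */
			return false;
		if (nr_wait > nr_tw)		/* not enough queued work yet */
			return false;
		if (nr_tw_prev >= nr_wait)	/* an earlier add already woke it */
			return false;
		return true;
	}

The waiter publishes its target via atomic_set(&ctx->cq_wait_nr, ...) in
io_cqring_wait(), so the producer only pays for a wake-up when this
particular item is the one that satisfies the wait.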

From patchwork Thu Apr 6 13:20:13 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203326
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 7/8] io_uring: refactor __io_cq_unlock_post_flush()
Date: Thu, 6 Apr 2023 14:20:13 +0100

Separate out the ->task_complete path in __io_cq_unlock_post_flush().

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 8a327a81beaf..0ea50c46f27f 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -627,21 +627,23 @@ static inline void __io_cq_unlock_post(struct io_ring_ctx *ctx)
 	io_cqring_wake(ctx);
 }
 
-static inline void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
+static void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
 	__releases(ctx->completion_lock)
 {
 	io_commit_cqring(ctx);
-	__io_cq_unlock(ctx);
-	io_commit_cqring_flush(ctx);
 
-	/*
-	 * As ->task_complete implies that the ring is single tasked, cq_wait
-	 * may only be waited on by the current in io_cqring_wait(), but since
-	 * it will re-check the wakeup conditions once we return we can safely
-	 * skip waking it up.
-	 */
-	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
+	if (ctx->task_complete) {
+		/*
+		 * ->task_complete implies that only current might be waiting
+		 * for CQEs, and obviously, we currently don't. No one is
+		 * waiting, wakeups are futile, skip them.
+		 */
+		io_commit_cqring_flush(ctx);
+	} else {
+		__io_cq_unlock(ctx);
+		io_commit_cqring_flush(ctx);
 		io_cqring_wake(ctx);
+	}
 }
 
 void io_cq_unlock_post(struct io_ring_ctx *ctx)

From patchwork Thu Apr 6 13:20:14 2023
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13203328
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, linux-kernel@vger.kernel.org
Subject: [PATCH v2 8/8] io_uring: optimise io_req_local_work_add
Date: Thu, 6 Apr 2023 14:20:14 +0100

Chains of memory accesses are never good for performance. The
req->task->io_uring->in_cancel check in io_req_local_work_add() is there
so that, when a task is exiting via io_uring_try_cancel_requests() and
starts waiting for completions, it gets woken up by every new task_work
item queued.

Do a little trick by announcing waiting in io_uring_try_cancel_requests(),
making io_req_local_work_add() wake us up. We also need to check for
deferred tw items after prepare_to_wait(TASK_INTERRUPTIBLE).

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 0ea50c46f27f..9bbf58297a0e 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1335,10 +1335,6 @@ static void io_req_local_work_add(struct io_kiocb *req, unsigned flags)
 			      &req->io_task_work.node));
 
 	if (!first) {
-		if (unlikely(atomic_read(&req->task->io_uring->in_cancel))) {
-			io_move_task_work_from_local(ctx);
-			return;
-		}
 		if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
 			atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
 		if (ctx->has_evfd)
@@ -3205,6 +3201,12 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 	enum io_wq_cancel cret;
 	bool ret = false;
 
+	/* set it so io_req_local_work_add() would wake us up */
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
+		atomic_set(&ctx->cq_wait_nr, 1);
+		smp_mb();
+	}
+
 	/* failed during ring init, it couldn't have issued any requests */
 	if (!ctx->rings)
 		return false;
@@ -3259,6 +3261,8 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
 {
 	struct io_uring_task *tctx = current->io_uring;
 	struct io_ring_ctx *ctx;
+	struct io_tctx_node *node;
+	unsigned long index;
 	s64 inflight;
 	DEFINE_WAIT(wait);
 
@@ -3280,9 +3284,6 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
 			break;
 
 		if (!sqd) {
-			struct io_tctx_node *node;
-			unsigned long index;
-
 			xa_for_each(&tctx->xa, index, node) {
 				/* sqpoll task will cancel all its requests */
 				if (node->ctx->sq_data)
@@ -3305,7 +3306,13 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
 		prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
 		io_run_task_work();
 		io_uring_drop_tctx_refs(current);
-
+		xa_for_each(&tctx->xa, index, node) {
+			if (!llist_empty(&node->ctx->work_llist)) {
+				WARN_ON_ONCE(node->ctx->submitter_task &&
+					     node->ctx->submitter_task != current);
+				goto end_wait;
+			}
+		}
 		/*
 		 * If we've seen completions, retry without waiting. This
 		 * avoids a race where a completion comes in before we did
@@ -3313,6 +3320,7 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
 		 */
 		if (inflight == tctx_inflight(tctx, !cancel_all))
 			schedule();
+end_wait:
 		finish_wait(&tctx->wait, &wait);
 	} while (1);
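
The resulting discipline is the usual one for avoiding lost wakeups: the
canceling task publishes its interest before sleeping and re-checks the
work list after prepare_to_wait(), while a producer queues its item before
reading cq_wait_nr. A condensed, single-ring sketch of the waiter side
(simplified from the hunks above, not a drop-in function):

	/* cancellation path: announce, queue ourselves, re-check, then sleep */
	atomic_set(&ctx->cq_wait_nr, 1);	/* "wake me for any new tw item" */
	smp_mb();				/* publish before the checks below */

	prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
	if (llist_empty(&ctx->work_llist))	/* re-check after queueing ourselves */
		schedule();
	finish_wait(&tctx->wait, &wait);

Either the producer sees cq_wait_nr set and issues the wake-up, or the
deferred item is already visible when the waiter re-checks work_llist, so
the wake-up cannot be lost between the two.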