From patchwork Sun Nov 20 17:28:04 2022
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13050112
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe, stable@vger.kernel.org
Subject: [PATCH 1/4] eventpoll: add EPOLL_URING wakeup flag
Date: Sun, 20 Nov 2022 10:28:04 -0700
Message-Id: <20221120172807.358868-2-axboe@kernel.dk>
In-Reply-To: <20221120172807.358868-1-axboe@kernel.dk>
References: <20221120172807.358868-1-axboe@kernel.dk>
We can have dependencies between epoll and io_uring. Consider an epoll
context, identified by the epfd file descriptor, and an io_uring file
descriptor identified by iofd. If we add iofd to the epfd context, and
arm a multishot poll request for epfd with iofd, then the multishot
poll request will repeatedly trigger and generate events until
terminated by CQ ring overflow. This isn't a desired behavior.

Add EPOLL_URING so that io_uring can pass it in as part of the poll
wakeup key, and io_uring can check for that to detect a potential
recursive invocation.

Cc: stable@vger.kernel.org # 6.0
Signed-off-by: Jens Axboe
---
 fs/eventpoll.c                 | 18 ++++++++++--------
 include/uapi/linux/eventpoll.h |  6 ++++++
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 52954d4637b5..e864256001e8 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -491,7 +491,8 @@ static inline void ep_set_busy_poll_napi_id(struct epitem *epi)
  */
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
-static void ep_poll_safewake(struct eventpoll *ep, struct epitem *epi)
+static void ep_poll_safewake(struct eventpoll *ep, struct epitem *epi,
+			     unsigned pollflags)
 {
 	struct eventpoll *ep_src;
 	unsigned long flags;
@@ -522,16 +523,17 @@ static void ep_poll_safewake(struct eventpoll *ep, struct epitem *epi)
 	}
 	spin_lock_irqsave_nested(&ep->poll_wait.lock, flags, nests);
 	ep->nests = nests + 1;
-	wake_up_locked_poll(&ep->poll_wait, EPOLLIN);
+	wake_up_locked_poll(&ep->poll_wait, EPOLLIN | pollflags);
 	ep->nests = 0;
 	spin_unlock_irqrestore(&ep->poll_wait.lock, flags);
 }
 
 #else
 
-static void ep_poll_safewake(struct eventpoll *ep, struct epitem *epi)
+static void ep_poll_safewake(struct eventpoll *ep, struct epitem *epi,
+			     unsigned pollflags)
 {
-	wake_up_poll(&ep->poll_wait, EPOLLIN);
+	wake_up_poll(&ep->poll_wait, EPOLLIN | pollflags);
 }
 
 #endif
@@ -742,7 +744,7 @@ static void ep_free(struct eventpoll *ep)
 
 	/* We need to release all tasks waiting for these file */
 	if (waitqueue_active(&ep->poll_wait))
-		ep_poll_safewake(ep, NULL);
+		ep_poll_safewake(ep, NULL, 0);
 
 	/*
 	 * We need to lock this because we could be hit by
@@ -1208,7 +1210,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 
 	/* We have to call this outside the lock */
 	if (pwake)
-		ep_poll_safewake(ep, epi);
+		ep_poll_safewake(ep, epi, pollflags & EPOLL_URING);
 
 	if (!(epi->event.events & EPOLLEXCLUSIVE))
 		ewake = 1;
@@ -1553,7 +1555,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
 
 	/* We have to call this outside the lock */
 	if (pwake)
-		ep_poll_safewake(ep, NULL);
+		ep_poll_safewake(ep, NULL, 0);
 
 	return 0;
 }
@@ -1629,7 +1631,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
 
 	/* We have to call this outside the lock */
 	if (pwake)
-		ep_poll_safewake(ep, NULL);
+		ep_poll_safewake(ep, NULL, 0);
 
 	return 0;
 }
diff --git a/include/uapi/linux/eventpoll.h b/include/uapi/linux/eventpoll.h
index 8a3432d0f0dc..532cc7fa75c0 100644
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -41,6 +41,12 @@
 #define EPOLLMSG	(__force __poll_t)0x00000400
 #define EPOLLRDHUP	(__force __poll_t)0x00002000
 
+/*
+ * Internal flag - wakeup generated by io_uring, used to detect recursion back
+ * into the io_uring poll handler.
+ */
+#define EPOLL_URING	((__force __poll_t)(1U << 27))
+
 /* Set exclusive wakeup mode for the target file descriptor */
 #define EPOLLEXCLUSIVE	((__force __poll_t)(1U << 28))
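The scenario described in the patch above can be set up from userspace along
these lines. This is only an illustrative sketch, not part of the patch; it
assumes liburing (io_uring_prep_poll_multishot is available in liburing 2.2
and newer) and omits all error handling.

/* Illustrative sketch: create the circular epoll <-> io_uring dependency
 * described above. Assumes liburing; error handling omitted.
 */
#include <liburing.h>
#include <sys/epoll.h>
#include <poll.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct epoll_event ev = { .events = EPOLLIN };
	int epfd = epoll_create1(0);

	io_uring_queue_init(8, &ring, 0);

	/* add the io_uring fd (iofd) to the epoll context (epfd) */
	ev.data.fd = ring.ring_fd;
	epoll_ctl(epfd, EPOLL_CTL_ADD, ring.ring_fd, &ev);

	/* arm a multishot poll request for epfd through io_uring */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_poll_multishot(sqe, epfd, POLLIN);
	io_uring_submit(&ring);

	/* Each CQE posted to the ring makes epfd report the ring fd as
	 * readable, which completes the multishot poll and posts another
	 * CQE, and so on until the CQ ring overflows. */
	return 0;
}

With EPOLL_URING set in the wakeup key, the poll handler can recognize that a
wakeup came from io_uring's own CQ posting and stop re-arming the multishot
request.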
From patchwork Sun Nov 20 17:28:05 2022
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13050113
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe, stable@vger.kernel.org
Subject: [PATCH 2/4] eventfd: provide a eventfd_signal_mask() helper
Date: Sun, 20 Nov 2022 10:28:05 -0700
Message-Id: <20221120172807.358868-3-axboe@kernel.dk>
In-Reply-To: <20221120172807.358868-1-axboe@kernel.dk>
References: <20221120172807.358868-1-axboe@kernel.dk>

This is identical to eventfd_signal(), but it allows the caller to pass
in a mask to be used for the poll wakeup key. The use case is avoiding
repeated multishot triggers if we have a dependency between eventfd and
io_uring.

If we set up an eventfd context and register that as the io_uring eventfd,
and at the same time queue a multishot poll request for the eventfd
context, then any CQE posted will repeatedly trigger the multishot request
until it terminates when the CQ ring overflows.

In preparation for io_uring detecting this circular dependency, add the
mentioned helper so that io_uring can pass in EPOLL_URING as part of the
poll wakeup key.

Cc: stable@vger.kernel.org # 6.0
Signed-off-by: Jens Axboe
---
 fs/eventfd.c            | 37 +++++++++++++++++++++----------------
 include/linux/eventfd.h |  1 +
 2 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/fs/eventfd.c b/fs/eventfd.c
index c0ffee99ad23..249ca6c0b784 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -43,21 +43,7 @@ struct eventfd_ctx {
 	int id;
 };
 
-/**
- * eventfd_signal - Adds @n to the eventfd counter.
- * @ctx: [in] Pointer to the eventfd context.
- * @n: [in] Value of the counter to be added to the eventfd internal counter.
- *          The value cannot be negative.
- *
- * This function is supposed to be called by the kernel in paths that do not
- * allow sleeping. In this function we allow the counter to reach the ULLONG_MAX
- * value, and we signal this as overflow condition by returning a EPOLLERR
- * to poll(2).
- *
- * Returns the amount by which the counter was incremented. This will be less
- * than @n if the counter has overflowed.
- */
-__u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
+__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, unsigned mask)
 {
 	unsigned long flags;
 
@@ -78,12 +64,31 @@ __u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
 		n = ULLONG_MAX - ctx->count;
 	ctx->count += n;
 	if (waitqueue_active(&ctx->wqh))
-		wake_up_locked_poll(&ctx->wqh, EPOLLIN);
+		wake_up_locked_poll(&ctx->wqh, EPOLLIN | mask);
 	current->in_eventfd = 0;
 	spin_unlock_irqrestore(&ctx->wqh.lock, flags);
 
 	return n;
 }
+
+/**
+ * eventfd_signal - Adds @n to the eventfd counter.
+ * @ctx: [in] Pointer to the eventfd context.
+ * @n: [in] Value of the counter to be added to the eventfd internal counter.
+ *          The value cannot be negative.
+ *
+ * This function is supposed to be called by the kernel in paths that do not
+ * allow sleeping. In this function we allow the counter to reach the ULLONG_MAX
+ * value, and we signal this as overflow condition by returning a EPOLLERR
+ * to poll(2).
+ *
+ * Returns the amount by which the counter was incremented. This will be less
+ * than @n if the counter has overflowed.
+ */
+__u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
+{
+	return eventfd_signal_mask(ctx, n, 0);
+}
 EXPORT_SYMBOL_GPL(eventfd_signal);
 
 static void eventfd_free_ctx(struct eventfd_ctx *ctx)
diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
index 30eb30d6909b..e849329ce1a8 100644
--- a/include/linux/eventfd.h
+++ b/include/linux/eventfd.h
@@ -40,6 +40,7 @@ struct file *eventfd_fget(int fd);
 struct eventfd_ctx *eventfd_ctx_fdget(int fd);
 struct eventfd_ctx *eventfd_ctx_fileget(struct file *file);
 __u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n);
+__u64 eventfd_signal_mask(struct eventfd_ctx *ctx, __u64 n, unsigned mask);
 int eventfd_ctx_remove_wait_queue(struct eventfd_ctx *ctx, wait_queue_entry_t *wait,
 				  __u64 *cnt);
 void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt);
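The eventfd flavor of the dependency described in the patch above can be set
up from userspace roughly as follows. Again an illustrative sketch, not part
of the patch; it assumes liburing and omits error handling.

/* Illustrative sketch: the eventfd registered for CQE notifications is also
 * the target of a multishot poll request, creating the circular dependency
 * described above. Assumes liburing; error handling omitted.
 */
#include <liburing.h>
#include <sys/eventfd.h>
#include <poll.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	int efd = eventfd(0, EFD_CLOEXEC);

	io_uring_queue_init(8, &ring, 0);

	/* every CQE posted to this ring signals efd */
	io_uring_register_eventfd(&ring, efd);

	/* ... and a multishot poll on efd posts a CQE whenever efd is
	 * signaled, which signals efd again, and so on. */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_poll_multishot(sqe, efd, POLLIN);
	io_uring_submit(&ring);

	return 0;
}

eventfd_signal_mask() lets io_uring tag these signals with EPOLL_URING so its
poll handler can recognize them, as the next patch does.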
From patchwork Sun Nov 20 17:28:06 2022
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13050114
b12-20020a170902d88c00b00186a7f18d2bmr8029594plz.137.1668965293255; Sun, 20 Nov 2022 09:28:13 -0800 (PST) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id d12-20020a170903230c00b0017e9b820a1asm7876953plh.100.2022.11.20.09.28.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 20 Nov 2022 09:28:12 -0800 (PST) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe , stable@vger.kernel.org Subject: [PATCH 3/4] io_uring: pass in EPOLL_URING as part of eventfd signaling and wakeups Date: Sun, 20 Nov 2022 10:28:06 -0700 Message-Id: <20221120172807.358868-4-axboe@kernel.dk> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20221120172807.358868-1-axboe@kernel.dk> References: <20221120172807.358868-1-axboe@kernel.dk> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org Pass in EPOLL_URING when signaling eventfd or doing poll related wakups, so that we can check for a circular event dependency between eventfd and epoll. If this flag is set when our wakeup handlers are called, then we know we have a dependency that needs to terminate multishot requests. eventfd and epoll are the only such possible dependencies. Cc: stable@vger.kernel.org # 6.0 Signed-off-by: Jens Axboe --- io_uring/io_uring.c | 4 ++-- io_uring/io_uring.h | 15 +++++++++++---- io_uring/poll.c | 8 ++++++++ 3 files changed, 21 insertions(+), 6 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 8840cf3e20f2..53d0043b77a5 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -495,7 +495,7 @@ static void io_eventfd_ops(struct rcu_head *rcu) int ops = atomic_xchg(&ev_fd->ops, 0); if (ops & BIT(IO_EVENTFD_OP_SIGNAL_BIT)) - eventfd_signal(ev_fd->cq_ev_fd, 1); + eventfd_signal_mask(ev_fd->cq_ev_fd, 1, EPOLL_URING); /* IO_EVENTFD_OP_FREE_BIT may not be set here depending on callback * ordering in a race but if references are 0 we know we have to free @@ -531,7 +531,7 @@ static void io_eventfd_signal(struct io_ring_ctx *ctx) goto out; if (likely(eventfd_signal_allowed())) { - eventfd_signal(ev_fd->cq_ev_fd, 1); + eventfd_signal_mask(ev_fd->cq_ev_fd, 1, EPOLL_URING); } else { atomic_inc(&ev_fd->refs); if (!atomic_fetch_or(BIT(IO_EVENTFD_OP_SIGNAL_BIT), &ev_fd->ops)) diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h index cef5ff924e63..f6cf74cd692b 100644 --- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -4,6 +4,7 @@ #include #include #include +#include #include "io-wq.h" #include "slist.h" #include "filetable.h" @@ -207,12 +208,18 @@ static inline void io_commit_cqring(struct io_ring_ctx *ctx) static inline void __io_cqring_wake(struct io_ring_ctx *ctx) { /* - * wake_up_all() may seem excessive, but io_wake_function() and - * io_should_wake() handle the termination of the loop and only - * wake as many waiters as we need to. + * Trigger waitqueue handler on all waiters on our waitqueue. This + * won't necessarily wake up all the tasks, io_should_wake() will make + * that decision. + * + * Pass in EPOLLIN|EPOLL_URING as the poll wakeup key. The latter set + * in the mask so that if we recurse back into our own poll waitqueue + * handlers, we know we have a dependency between eventfd or epoll and + * should terminate multishot poll at that point. 
From patchwork Sun Nov 20 17:28:07 2022
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13050115
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 4/4] Revert "io_uring: disallow self-propelled ring polling"
Date: Sun, 20 Nov 2022 10:28:07 -0700
Message-Id: <20221120172807.358868-5-axboe@kernel.dk>
In-Reply-To: <20221120172807.358868-1-axboe@kernel.dk>
References: <20221120172807.358868-1-axboe@kernel.dk>

This reverts commit 7fdbc5f014c3f71bc44673a2d6c5bb2d12d45f25.

This patch dealt with a subset of the real problem, which is a potential
circular dependency on the wakeup path for io_uring itself. Outside of
io_uring, eventfd can also trigger this (see details in 8c881e87feae)
and so can epoll (see details in 426930308abf). Now that we have a
generic solution to this problem, get rid of the io_uring specific
work-around.

Signed-off-by: Jens Axboe
---
 io_uring/poll.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index b5d9426c60f6..5733ed334738 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -246,8 +246,6 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
 			continue;
 		if (req->apoll_events & EPOLLONESHOT)
 			return IOU_POLL_DONE;
-		if (io_is_uring_fops(req->file))
-			return IOU_POLL_DONE;
 
 		/* multishot, just fill a CQE and proceed */
 		if (!(req->flags & REQ_F_APOLL_MULTISHOT)) {