From patchwork Fri Feb 7 17:32:24 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965534
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 1/7] eventpoll: abstract out ep_try_send_events() helper
Date: Fri, 7 Feb 2025 10:32:24 -0700
Message-ID: <20250207173639.884745-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for reusing this helper in another epoll setup helper,
abstract it out.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 7c0980db77b3..67d1808fda0e 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1980,6 +1980,22 @@ static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
 	return ret;
 }
 
+static int ep_try_send_events(struct eventpoll *ep,
+			      struct epoll_event __user *events, int maxevents)
+{
+	int res;
+
+	/*
+	 * Try to transfer events to user space. In case we get 0 events and
+	 * there's still timeout left over, we go trying again in search of
+	 * more luck.
+	 */
+	res = ep_send_events(ep, events, maxevents);
+	if (res > 0)
+		ep_suspend_napi_irqs(ep);
+	return res;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2031,17 +2047,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 
 	while (1) {
 		if (eavail) {
-			/*
-			 * Try to transfer events to user space. In case we get
-			 * 0 events and there's still timeout left over, we go
-			 * trying again in search of more luck.
-			 */
-			res = ep_send_events(ep, events, maxevents);
-			if (res) {
-				if (res > 0)
-					ep_suspend_napi_irqs(ep);
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
 				return res;
-			}
 		}
 
 		if (timed_out)

From patchwork Fri Feb 7 17:32:25 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965535
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 2/7] eventpoll: abstract out parameter sanity checking
Date: Fri, 7 Feb 2025 10:32:25 -0700
Message-ID: <20250207173639.884745-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Add a helper that checks the validity of the file descriptor and other
parameters passed in to epoll_wait().

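For readers following along outside the kernel tree, the order of the
checks in the new helper can be sketched in userspace. This is an
illustrative model only: model_check_params(), MODEL_MAX_EVENTS and the
two boolean arguments standing in for access_ok() and is_file_epoll()
are hypothetical names, not part of this patch or of any kernel API.

```c
#include <assert.h>	/* only needed when exercising the model with assert() */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for EP_MAX_EVENTS; the real limit lives in fs/eventpoll.c. */
#define MODEL_MAX_EVENTS 1024

/*
 * Userspace model of the check order in ep_check_params():
 * range check first, then buffer writability, then file type.
 */
static int model_check_params(bool file_is_epoll, bool buf_writable,
			      int maxevents)
{
	/* The event count must be positive and within the limit */
	if (maxevents <= 0 || maxevents > MODEL_MAX_EVENTS)
		return -EINVAL;

	/* Stand-in for the access_ok() check on the user buffer */
	if (!buf_writable)
		return -EFAULT;

	/* Stand-in for the is_file_epoll() check */
	if (!file_is_epoll)
		return -EINVAL;

	return 0;
}
```

Because the range check comes first, a call with both an out-of-range
maxevents and an unwritable buffer reports -EINVAL rather than -EFAULT
in this model.
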
Signed-off-by: Jens Axboe
---
 fs/eventpoll.c | 39 +++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 67d1808fda0e..14466765b85d 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2453,6 +2453,27 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 	return do_epoll_ctl(epfd, op, fd, &epds, false);
 }
 
+static int ep_check_params(struct file *file, struct epoll_event __user *evs,
+			   int maxevents)
+{
+	/* The maximum number of event must be greater than zero */
+	if (maxevents <= 0 || maxevents > EP_MAX_EVENTS)
+		return -EINVAL;
+
+	/* Verify that the area passed by the user is writeable */
+	if (!access_ok(evs, maxevents * sizeof(struct epoll_event)))
+		return -EFAULT;
+
+	/*
+	 * We have to check that the file structure underneath the fd
+	 * the user passed to us _is_ an eventpoll file.
+	 */
+	if (!is_file_epoll(file))
+		return -EINVAL;
+
+	return 0;
+}
+
 /*
  * Implement the event wait interface for the eventpoll file. It is the kernel
  * part of the user space epoll_wait(2).
@@ -2461,26 +2482,16 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
 			 int maxevents, struct timespec64 *to)
 {
 	struct eventpoll *ep;
-
-	/* The maximum number of event must be greater than zero */
-	if (maxevents <= 0 || maxevents > EP_MAX_EVENTS)
-		return -EINVAL;
-
-	/* Verify that the area passed by the user is writeable */
-	if (!access_ok(events, maxevents * sizeof(struct epoll_event)))
-		return -EFAULT;
+	int ret;
 
 	/* Get the "struct file *" for the eventpoll file */
 	CLASS(fd, f)(epfd);
 	if (fd_empty(f))
 		return -EBADF;
 
-	/*
-	 * We have to check that the file structure underneath the fd
-	 * the user passed to us _is_ an eventpoll file.
-	 */
-	if (!is_file_epoll(fd_file(f)))
-		return -EINVAL;
+	ret = ep_check_params(fd_file(f), events, maxevents);
+	if (unlikely(ret))
+		return ret;
 
 	/*
 	 * At this point it is safe to assume that the "private_data" contains

From patchwork Fri Feb 7 17:32:26 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965536
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 3/7] eventpoll: add epoll_queue() interface
Date: Fri, 7 Feb 2025 10:32:26 -0700
Message-ID: <20250207173639.884745-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Basic interface that takes a wait_queue_entry rather than posting one on
the stack, which can be a persistent callback for when new events
arrive. Works like regular epoll_wait(), except it doesn't block. If
events are available, they are returned. If none are available, the
passed in wait_queue_entry is added to the callback list.

The wait_queue_entry must be previously initialized, and the callback
provided will be called when events are added to the epoll context.

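The non-blocking contract described above can be illustrated with a
small userspace model. It is a sketch under stated assumptions, not
kernel code: struct model_ep, struct model_wait, model_epoll_queue(),
model_event_arrives() and model_count_wakeup() are hypothetical names,
the wait queue is reduced to a single slot, and locking is omitted.
MODEL_EIOCBQUEUED mirrors the kernel-internal EIOCBQUEUED value (529),
which userspace headers do not export.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the kernel-internal EIOCBQUEUED; defined here for the model only. */
#define MODEL_EIOCBQUEUED 529

/* Hypothetical stand-in for a wait_queue_entry with a persistent callback. */
struct model_wait {
	bool queued;				/* true while on the model wait queue */
	void (*func)(struct model_wait *);	/* called when an event arrives */
	int wakeups;				/* how often func has run */
};

/* Hypothetical stand-in for struct eventpoll. */
struct model_ep {
	int ready;			/* number of ready events */
	struct model_wait *waiter;	/* single-slot "wait queue" */
};

/* Reap events if any are ready; otherwise queue the entry. Never blocks. */
static int model_epoll_queue(struct model_ep *ep, int maxevents,
			     struct model_wait *wait)
{
	assert(ep && wait && maxevents > 0);

	if (ep->ready > 0) {
		int n = ep->ready < maxevents ? ep->ready : maxevents;

		ep->ready -= n;
		return n;
	}
	/* Nothing ready: add the entry unless it is already queued */
	if (!wait->queued) {
		wait->queued = true;
		ep->waiter = wait;
	}
	return -MODEL_EIOCBQUEUED;
}

/* A new event arrives: record it and run the queued callback, if any. */
static void model_event_arrives(struct model_ep *ep)
{
	ep->ready++;
	if (ep->waiter)
		ep->waiter->func(ep->waiter);
}

/* Example callback: only counts how often it was invoked. */
static void model_count_wakeup(struct model_wait *wait)
{
	wait->wakeups++;
}
```

In this model a first call on an empty context returns
-MODEL_EIOCBQUEUED and leaves the entry queued; once
model_event_arrives() has run the callback, a second call returns the
one ready event.
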
Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 39 +++++++++++++++++++++++++++++++++++++++
 include/linux/eventpoll.h |  4 ++++
 2 files changed, 43 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 14466765b85d..d3ac466ad415 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1996,6 +1996,33 @@ static int ep_try_send_events(struct eventpoll *ep,
 	return res;
 }
 
+static int ep_poll_queue(struct eventpoll *ep,
+			 struct epoll_event __user *events, int maxevents,
+			 struct wait_queue_entry *wait)
+{
+	int res = 0, eavail;
+
+	/* See ep_poll() for commentary */
+	eavail = ep_events_available(ep);
+	while (1) {
+		if (eavail) {
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
+				return res;
+		}
+		if (!list_empty_careful(&wait->entry))
+			break;
+		write_lock_irq(&ep->lock);
+		eavail = ep_events_available(ep);
+		if (!eavail)
+			__add_wait_queue_exclusive(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+		if (!eavail)
+			break;
+	}
+	return -EIOCBQUEUED;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2474,6 +2501,18 @@ static int ep_check_params(struct file *file, struct epoll_event __user *evs,
 	return 0;
 }
 
+int epoll_queue(struct file *file, struct epoll_event __user *events,
+		int maxevents, struct wait_queue_entry *wait)
+{
+	int ret;
+
+	ret = ep_check_params(file, events, maxevents);
+	if (unlikely(ret))
+		return ret;
+
+	return ep_poll_queue(file->private_data, events, maxevents, wait);
+}
+
 /*
  * Implement the event wait interface for the eventpoll file. It is the kernel
  * part of the user space epoll_wait(2).
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 0c0d00fcd131..8de16374b8fe 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -25,6 +25,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long t
 /* Used to release the epoll bits inside the "struct file" */
 void eventpoll_release_file(struct file *file);
 
+/* Use to reap events, and/or queue for a callback on new events */
+int epoll_queue(struct file *file, struct epoll_event __user *events,
+		int maxevents, struct wait_queue_entry *wait);
+
 /*
  * This is called from inside fs/file_table.c:__fput() to unlink files
  * from the eventpoll interface. We need to have this facility to cleanup

From patchwork Fri Feb 7 17:32:27 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965537
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 4/7] eventpoll: add helper to remove wait entry from wait queue head
Date: Fri, 7 Feb 2025 10:32:27 -0700
Message-ID: <20250207173639.884745-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

__epoll_wait_remove() is the core helper; it kills a given
wait_queue_entry from the eventpoll wait_queue_head. Use it internally,
and provide an overall helper, epoll_wait_remove(), which takes a struct
file and provides the same functionality.

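The return value of the core helper can be modelled in userspace as
follows. This is a single-threaded sketch with hypothetical names
(struct model_entry, model_wait_remove()); it leaves out the ep->lock
handling and the race in which a wakeup removes the entry between the
lockless check and taking the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a wait_queue_entry: only tracks queue membership. */
struct model_entry {
	bool on_queue;
};

/*
 * Model of the return value of __epoll_wait_remove(): returns 1 ("go and
 * harvest events") unless the waiter timed out while still on the wait
 * queue, in which case nothing woke it and 0 is returned. The entry is
 * always off the queue on return.
 */
static int model_wait_remove(struct model_entry *wait, bool timed_out)
{
	int eavail = 1;

	assert(wait);
	if (wait->on_queue) {
		/* Still queued at timeout: no wakeup happened */
		if (timed_out)
			eavail = 0;
		wait->on_queue = false;
	}
	return eavail;
}
```

With timed_out set to false, which is what epoll_wait_remove() passes
in the patch below, the model always returns 1 and simply takes the
entry off the queue.
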
Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 58 +++++++++++++++++++++++++--------------
 include/linux/eventpoll.h |  3 ++
 2 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d3ac466ad415..b96cc9193517 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2023,6 +2023,42 @@ static int ep_poll_queue(struct eventpoll *ep,
 	return -EIOCBQUEUED;
 }
 
+static int __epoll_wait_remove(struct eventpoll *ep,
+			       struct wait_queue_entry *wait, int timed_out)
+{
+	int eavail;
+
+	/*
+	 * We were woken up, thus go and try to harvest some events. If timed
+	 * out and still on the wait queue, recheck eavail carefully under
+	 * lock, below.
+	 */
+	eavail = 1;
+
+	if (!list_empty_careful(&wait->entry)) {
+		write_lock_irq(&ep->lock);
+		/*
+		 * If the thread timed out and is not on the wait queue, it
+		 * means that the thread was woken up after its timeout expired
+		 * before it could reacquire the lock. Thus, when wait.entry is
+		 * empty, it needs to harvest events.
+		 */
+		if (timed_out)
+			eavail = list_empty(&wait->entry);
+		__remove_wait_queue(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+	}
+
+	return eavail;
+}
+
+int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait)
+{
+	if (is_file_epoll(file))
+		return __epoll_wait_remove(file->private_data, wait, false);
+	return -EINVAL;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2135,27 +2171,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 							      HRTIMER_MODE_ABS);
 		__set_current_state(TASK_RUNNING);
 
-		/*
-		 * We were woken up, thus go and try to harvest some events.
-		 * If timed out and still on the wait queue, recheck eavail
-		 * carefully under lock, below.
-		 */
-		eavail = 1;
-
-		if (!list_empty_careful(&wait.entry)) {
-			write_lock_irq(&ep->lock);
-			/*
-			 * If the thread timed out and is not on the wait queue,
-			 * it means that the thread was woken up after its
-			 * timeout expired before it could reacquire the lock.
-			 * Thus, when wait.entry is empty, it needs to harvest
-			 * events.
-			 */
-			if (timed_out)
-				eavail = list_empty(&wait.entry);
-			__remove_wait_queue(&ep->wq, &wait);
-			write_unlock_irq(&ep->lock);
-		}
+		eavail = __epoll_wait_remove(ep, &wait, timed_out);
 	}
 }
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 8de16374b8fe..6c088d5e945b 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -29,6 +29,9 @@ void eventpoll_release_file(struct file *file);
 int epoll_queue(struct file *file, struct epoll_event __user *events,
 		int maxevents, struct wait_queue_entry *wait);
 
+/* Remove wait entry */
+int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait);
+
 /*
  * This is called from inside fs/file_table.c:__fput() to unlink files
  * from the eventpoll interface. We need to have this facility to cleanup

From patchwork Fri Feb 7 17:32:28 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965538
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 5/7] io_uring/epoll: remove CONFIG_EPOLL guards
Date: Fri, 7 Feb 2025 10:32:28 -0700
Message-ID: <20250207173639.884745-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Just have the Makefile add the object if epoll is enabled, then it's
not necessary to guard the entire epoll.c file inside a CONFIG_EPOLL
ifdef.

Signed-off-by: Jens Axboe
---
 io_uring/Makefile | 9 +++++----
 io_uring/epoll.c  | 2 --
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/io_uring/Makefile b/io_uring/Makefile
index d695b60dba4f..7114a6dbd439 100644
--- a/io_uring/Makefile
+++ b/io_uring/Makefile
@@ -11,9 +11,10 @@ obj-$(CONFIG_IO_URING) += io_uring.o opdef.o kbuf.o rsrc.o notif.o \
 					eventfd.o uring_cmd.o openclose.o \
 					sqpoll.o xattr.o nop.o fs.o splice.o \
 					sync.o msg_ring.o advise.o openclose.o \
-					epoll.o statx.o timeout.o fdinfo.o \
-					cancel.o waitid.o register.o \
-					truncate.o memmap.o alloc_cache.o
+					statx.o timeout.o fdinfo.o cancel.o \
+					waitid.o register.o truncate.o \
+					memmap.o alloc_cache.o
 obj-$(CONFIG_IO_WQ)		+= io-wq.o
 obj-$(CONFIG_FUTEX)		+= futex.o
-obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o
+obj-$(CONFIG_EPOLL)		+= epoll.o
+obj-$(CONFIG_NET_RX_BUSY_POLL)	+= napi.o
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 89bff2068a19..7848d9cc073d 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -12,7 +12,6 @@
 #include "io_uring.h"
 #include "epoll.h"
 
-#if defined(CONFIG_EPOLL)
 struct io_epoll {
 	struct file			*file;
 	int				epfd;
@@ -58,4 +57,3 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
-#endif

From patchwork Fri Feb 7 17:32:29 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965539
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 6/7] io_uring/poll: pull ownership handling into poll.h
Date: Fri, 7 Feb 2025 10:32:29 -0700
Message-ID: <20250207173639.884745-7-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for using it from somewhere else. Rather than try and
duplicate the functionality, just make it generically available to
io_uring opcodes.

Note: would have to be used carefully, cannot be used by opcodes that
can trigger poll logic.

Signed-off-by: Jens Axboe
---
 io_uring/poll.c | 30 +-----------------------------
 io_uring/poll.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index bb1c0cd4f809..5e44ac562491 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -41,16 +41,6 @@ struct io_poll_table {
 	__poll_t result_mask;
 };
 
-#define IO_POLL_CANCEL_FLAG	BIT(31)
-#define IO_POLL_RETRY_FLAG	BIT(30)
-#define IO_POLL_REF_MASK	GENMASK(29, 0)
-
-/*
- * We usually have 1-2 refs taken, 128 is more than enough and we want to
- * maximise the margin between this amount and the moment when it overflows.
- */
-#define IO_POLL_REF_BIAS	128
-
 #define IO_WQE_F_DOUBLE		1
 
 static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
@@ -70,7 +60,7 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
 	return priv & IO_WQE_F_DOUBLE;
 }
 
-static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
+bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
 {
 	int v;
 
@@ -85,24 +75,6 @@ static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
 	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
 }
 
-/*
- * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
- * bump it and acquire ownership. It's disallowed to modify requests while not
- * owning it, that prevents from races for enqueueing task_work's and b/w
- * arming poll and wakeups.
- */
-static inline bool io_poll_get_ownership(struct io_kiocb *req)
-{
-	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
-		return io_poll_get_ownership_slowpath(req);
-	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
-}
-
-static void io_poll_mark_cancelled(struct io_kiocb *req)
-{
-	atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
-}
-
 static struct io_poll *io_poll_get_double(struct io_kiocb *req)
 {
 	/* pure poll stashes this in ->async_data, poll driven retry elsewhere */
diff --git a/io_uring/poll.h b/io_uring/poll.h
index 04ede93113dc..2f416cd3be13 100644
--- a/io_uring/poll.h
+++ b/io_uring/poll.h
@@ -21,6 +21,18 @@ struct async_poll {
 	struct io_poll		*double_poll;
 };
 
+#define IO_POLL_CANCEL_FLAG	BIT(31)
+#define IO_POLL_RETRY_FLAG	BIT(30)
+#define IO_POLL_REF_MASK	GENMASK(29, 0)
+
+bool io_poll_get_ownership_slowpath(struct io_kiocb *req);
+
+/*
+ * We usually have 1-2 refs taken, 128 is more than enough and we want to
+ * maximise the margin between this amount and the moment when it overflows.
+ */
+#define IO_POLL_REF_BIAS	128
+
 /*
  * Must only be called inside issue_flags & IO_URING_F_MULTISHOT, or
  * potentially other cases where we already "own" this poll request.
@@ -30,6 +42,25 @@ static inline void io_poll_multishot_retry(struct io_kiocb *req)
 	atomic_inc(&req->poll_refs);
 }
 
+/*
+ * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
+ * bump it and acquire ownership. It's disallowed to modify requests while not
+ * owning it, that prevents from races for enqueueing task_work's and b/w
+ * arming poll and wakeups.
+ */
+static inline bool io_poll_get_ownership(struct io_kiocb *req)
+{
+	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
+		return io_poll_get_ownership_slowpath(req);
+	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+}
+
+static inline void io_poll_mark_cancelled(struct io_kiocb *req)
+{
+	atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
+}
+
+
 int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_poll_add(struct io_kiocb *req, unsigned int issue_flags);
 

From patchwork Fri Feb 7 17:32:30 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13965540
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 7/7] io_uring/epoll: add support for IORING_OP_EPOLL_WAIT
Date: Fri, 7 Feb 2025 10:32:30 -0700
Message-ID: <20250207173639.884745-8-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250207173639.884745-1-axboe@kernel.dk>
References: <20250207173639.884745-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

For existing epoll event loops that can't fully convert to io_uring, the
usual approach is to add the io_uring fd to the epoll instance and use
epoll_wait() to wait on both "legacy" and io_uring events. While this
works, it isn't optimal as:

1) epoll_wait() is pretty limited in what it can do. It does not support
   partial reaping of events, or waiting on a batch of events.
2) When an io_uring ring is added to an epoll instance, it activates the
   io_uring "I'm being polled" logic which slows things down.

Rather than use this approach, with EPOLL_WAIT support added to io_uring,
event loops can use the normal io_uring wait logic for everything, as
long as an epoll wait request has been armed with io_uring.

Note that IORING_OP_EPOLL_WAIT does NOT take a timeout value, as this
is an async request. Waiting on io_uring events in general has various
timeout parameters, and those are the ones that should be used when
waiting on any kind of request. If events are immediately available for
reaping, then this opcode will return those immediately. If none are
available, then it will post an async completion when they become
available.

cqe->res will contain either an error code (< 0 value) for a malformed
request, invalid epoll instance, etc., or a positive result indicating
how many events were reaped.

IORING_OP_EPOLL_WAIT requests may be canceled using the normal io_uring
cancelation infrastructure. The poll logic for managing ownership is
adopted to guard the epoll side too.

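As a rough illustration of the request layout described above, the sketch below mirrors which SQE fields the prep handler in the diff that follows reads (fd, addr, len) and which it requires to be zero, plus the cqe->res convention. It is a userspace model only: `struct model_sqe` and the `model_*` helpers are hypothetical names made up for this example, not the real `struct io_uring_sqe` or any liburing API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct epoll_event;	/* defined in <sys/epoll.h>; only a pointer is needed here */

/* Hypothetical, reduced mirror of the SQE fields io_epoll_wait_prep() looks at. */
struct model_sqe {
	int32_t  fd;		/* epoll instance to wait on */
	uint64_t off;		/* must be 0 */
	uint64_t addr;		/* user pointer to the struct epoll_event array */
	uint32_t len;		/* maxevents */
	uint32_t rw_flags;	/* must be 0 */
	uint16_t buf_index;	/* must be 0 */
	int32_t  splice_fd_in;	/* must be 0 */
};

/* Fill the model SQE the way a submitter would for an epoll wait request. */
static void model_prep_epoll_wait(struct model_sqe *sqe, int epfd,
				  struct epoll_event *events,
				  unsigned int maxevents)
{
	memset(sqe, 0, sizeof(*sqe));
	sqe->fd = epfd;
	sqe->addr = (uint64_t)(uintptr_t)events;
	sqe->len = maxevents;
}

/* Same unused-field check as the kernel-side prep handler in this patch. */
static int model_check_sqe(const struct model_sqe *sqe)
{
	if (sqe->off || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
		return -EINVAL;
	return 0;
}

/* cqe->res: a negative value is an error code, otherwise the events reaped. */
static int model_events_reaped(int res)
{
	return res < 0 ? 0 : res;
}
```

Note that there is no timeout field anywhere in the model, matching the commit message: any timeout would come from the regular io_uring wait call, not from the request itself.
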
Signed-off-by: Jens Axboe
---
 include/linux/io_uring_types.h |   4 +
 include/uapi/linux/io_uring.h  |   1 +
 io_uring/cancel.c              |   5 ++
 io_uring/epoll.c               | 143 +++++++++++++++++++++++++++++++++
 io_uring/epoll.h               |  22 +++++
 io_uring/io_uring.c            |   5 ++
 io_uring/opdef.c               |  14 ++++
 7 files changed, 194 insertions(+)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index e2fef264ff8b..031ba708a81d 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -369,6 +369,10 @@ struct io_ring_ctx {
 	struct io_alloc_cache	futex_cache;
 #endif
 
+#ifdef CONFIG_EPOLL
+	struct hlist_head	epoll_list;
+#endif
+
 	const struct cred	*sq_creds;	/* cred used for __io_sq_thread() */
 	struct io_sq_data	*sq_data;	/* if using sq thread polling */
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e11c82638527..a559e1e1544a 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -278,6 +278,7 @@ enum io_uring_op {
 	IORING_OP_FTRUNCATE,
 	IORING_OP_BIND,
 	IORING_OP_LISTEN,
+	IORING_OP_EPOLL_WAIT,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
diff --git a/io_uring/cancel.c b/io_uring/cancel.c
index 0870060bac7c..d1af9496d9b3 100644
--- a/io_uring/cancel.c
+++ b/io_uring/cancel.c
@@ -17,6 +17,7 @@
 #include "timeout.h"
 #include "waitid.h"
 #include "futex.h"
+#include "epoll.h"
 #include "cancel.h"
 
 struct io_cancel {
@@ -128,6 +129,10 @@ int io_try_cancel(struct io_uring_task *tctx, struct io_cancel_data *cd,
 	if (ret != -ENOENT)
 		return ret;
 
+	ret = io_epoll_wait_cancel(ctx, cd, issue_flags);
+	if (ret != -ENOENT)
+		return ret;
+
 	spin_lock(&ctx->completion_lock);
 	if (!(cd->flags & IORING_ASYNC_CANCEL_FD))
 		ret = io_timeout_cancel(ctx, cd);
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 7848d9cc073d..8f54bb1c39de 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -11,6 +11,7 @@
 
 #include "io_uring.h"
 #include "epoll.h"
+#include "poll.h"
 
 struct io_epoll {
 	struct file			*file;
@@ -20,6 +21,13 @@ struct io_epoll {
 	struct epoll_event		event;
 };
 
+struct io_epoll_wait {
+	struct file			*file;
+	int				maxevents;
+	struct epoll_event __user	*events;
+	struct wait_queue_entry		wait;
+};
+
 int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_epoll *epoll = io_kiocb_to_cmd(req, struct io_epoll);
@@ -57,3 +65,138 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
+
+static void __io_epoll_finish(struct io_kiocb *req, int res)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	lockdep_assert_held(&req->ctx->uring_lock);
+
+	epoll_wait_remove(req->file, &iew->wait);
+	hlist_del_init(&req->hash_node);
+	io_req_set_res(req, res, 0);
+	req->io_task_work.func = io_req_task_complete;
+	io_req_task_work_add(req);
+}
+
+static void __io_epoll_cancel(struct io_kiocb *req)
+{
+	__io_epoll_finish(req, -ECANCELED);
+}
+
+static bool __io_epoll_wait_cancel(struct io_kiocb *req)
+{
+	io_poll_mark_cancelled(req);
+	if (io_poll_get_ownership(req))
+		__io_epoll_cancel(req);
+	return true;
+}
+
+bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx, struct io_uring_task *tctx,
+			      bool cancel_all)
+{
+	return io_cancel_remove_all(ctx, tctx, &ctx->epoll_list, cancel_all, __io_epoll_wait_cancel);
+}
+
+int io_epoll_wait_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
+			 unsigned int issue_flags)
+{
+	return io_cancel_remove(ctx, cd, issue_flags, &ctx->epoll_list, __io_epoll_wait_cancel);
+}
+
+static void io_epoll_retry(struct io_kiocb *req, struct io_tw_state *ts)
+{
+	int v;
+
+	do {
+		v = atomic_read(&req->poll_refs);
+		if (unlikely(v != 1)) {
+			if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
+				return;
+			if (v & IO_POLL_CANCEL_FLAG) {
+				__io_epoll_cancel(req);
+				return;
+			}
+		}
+		v &= IO_POLL_REF_MASK;
+	} while (atomic_sub_return(v, &req->poll_refs) & IO_POLL_REF_MASK);
+
+	io_req_task_submit(req, ts);
+}
+
+static int io_epoll_execute(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	list_del_init_careful(&iew->wait.entry);
+	if (io_poll_get_ownership(req)) {
+		req->io_task_work.func = io_epoll_retry;
+		io_req_task_work_add(req);
+	}
+
+	return 1;
+}
+
+static __cold int io_epoll_pollfree_wake(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	io_poll_mark_cancelled(req);
+	list_del_init_careful(&iew->wait.entry);
+	io_epoll_execute(req);
+	return 1;
+}
+
+static int io_epoll_wait_fn(struct wait_queue_entry *wait, unsigned mode,
+			    int sync, void *key)
+{
+	struct io_kiocb *req = wait->private;
+	__poll_t mask = key_to_poll(key);
+
+	if (unlikely(mask & POLLFREE))
+		return io_epoll_pollfree_wake(req);
+
+	return io_epoll_execute(req);
+}
+
+int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	if (sqe->off || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+		return -EINVAL;
+
+	iew->maxevents = READ_ONCE(sqe->len);
+	iew->events = u64_to_user_ptr(READ_ONCE(sqe->addr));
+
+	iew->wait.flags = 0;
+	iew->wait.private = req;
+	iew->wait.func = io_epoll_wait_fn;
+	INIT_LIST_HEAD(&iew->wait.entry);
+	INIT_HLIST_NODE(&req->hash_node);
+	atomic_set(&req->poll_refs, 0);
+	return 0;
+}
+
+int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+	struct io_ring_ctx *ctx = req->ctx;
+	int ret;
+
+	io_ring_submit_lock(ctx, issue_flags);
+
+	ret = epoll_queue(req->file, iew->events, iew->maxevents, &iew->wait);
+	if (ret == -EIOCBQUEUED) {
+		if (hlist_unhashed(&req->hash_node))
+			hlist_add_head(&req->hash_node, &ctx->epoll_list);
+		io_ring_submit_unlock(ctx, issue_flags);
+		return IOU_ISSUE_SKIP_COMPLETE;
+	} else if (ret < 0) {
+		req_set_fail(req);
+	}
+	hlist_del_init(&req->hash_node);
+	io_ring_submit_unlock(ctx, issue_flags);
+	io_req_set_res(req, ret, 0);
+	return IOU_OK;
+}
diff --git a/io_uring/epoll.h b/io_uring/epoll.h
index 870cce11ba98..296940d89063 100644
--- a/io_uring/epoll.h
+++ b/io_uring/epoll.h
@@ -1,6 +1,28 @@
 // SPDX-License-Identifier: GPL-2.0
 
+#include "cancel.h"
+
 #if defined(CONFIG_EPOLL)
+int io_epoll_wait_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
+			 unsigned int issue_flags);
+bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx, struct io_uring_task *tctx,
+			      bool cancel_all);
+
 int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags);
+int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags);
+#else
+static inline bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx,
+					    struct io_uring_task *tctx,
+					    bool cancel_all)
+{
+	return false;
+}
+static inline int io_epoll_wait_cancel(struct io_ring_ctx *ctx,
+				       struct io_cancel_data *cd,
+				       unsigned int issue_flags)
+{
+	return 0;
+}
 #endif
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ec98a0ec6f34..73b9246eaa50 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -93,6 +93,7 @@
 #include "notif.h"
 #include "waitid.h"
 #include "futex.h"
+#include "epoll.h"
 #include "napi.h"
 #include "uring_cmd.h"
 #include "msg_ring.h"
@@ -356,6 +357,9 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	INIT_HLIST_HEAD(&ctx->waitid_list);
 #ifdef CONFIG_FUTEX
 	INIT_HLIST_HEAD(&ctx->futex_list);
+#endif
+#ifdef CONFIG_EPOLL
+	INIT_HLIST_HEAD(&ctx->epoll_list);
 #endif
 	INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
 	INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
@@ -3079,6 +3083,7 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 	ret |= io_poll_remove_all(ctx, tctx, cancel_all);
 	ret |= io_waitid_remove_all(ctx, tctx, cancel_all);
 	ret |= io_futex_remove_all(ctx, tctx, cancel_all);
+	ret |= io_epoll_wait_remove_all(ctx, tctx, cancel_all);
 	ret |= io_uring_try_cancel_uring_cmd(ctx, tctx, cancel_all);
 	mutex_unlock(&ctx->uring_lock);
 	ret |= io_kill_timeouts(ctx, tctx, cancel_all);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index e8baef4e5146..44553a657476 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -514,6 +514,17 @@ const struct io_issue_def io_issue_defs[] = {
 		.async_size		= sizeof(struct io_async_msghdr),
 #else
 		.prep			= io_eopnotsupp_prep,
+#endif
+	},
+	[IORING_OP_EPOLL_WAIT] = {
+		.needs_file		= 1,
+		.unbound_nonreg_file	= 1,
+		.audit_skip		= 1,
+#if defined(CONFIG_EPOLL)
+		.prep			= io_epoll_wait_prep,
+		.issue			= io_epoll_wait,
+#else
+		.prep			= io_eopnotsupp_prep,
 #endif
 	},
 };
@@ -745,6 +756,9 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_LISTEN] = {
 		.name			= "LISTEN",
 	},
+	[IORING_OP_EPOLL_WAIT] = {
+		.name			= "EPOLL_WAIT",
+	},
 };
 
 const char *io_uring_get_opcode(u8 opcode)
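
For readers who want to experiment with the ownership scheme that patch 6/7 moves into poll.h and that this patch reuses for epoll waits, here is a minimal userspace model. It is a sketch under stated assumptions: it uses C11 atomics instead of the kernel's atomic_t, the `model_*` names are hypothetical, and it leaves out the IO_POLL_REF_BIAS slow path and the WARN_ON_ONCE sanity check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros in poll.h. */
#define MODEL_CANCEL_FLAG (UINT32_C(1) << 31)		/* IO_POLL_CANCEL_FLAG */
#define MODEL_REF_MASK    ((UINT32_C(1) << 30) - 1)	/* IO_POLL_REF_MASK */

/* Hypothetical stand-in for the poll_refs member of struct io_kiocb. */
struct model_req {
	atomic_uint poll_refs;
};

/* Fast path of io_poll_get_ownership(): the caller that bumps refs from 0 owns it. */
static bool model_get_ownership(struct model_req *req)
{
	return !(atomic_fetch_add(&req->poll_refs, 1) & MODEL_REF_MASK);
}

/* Like io_poll_mark_cancelled(): set the flag without touching the ref count. */
static void model_mark_cancelled(struct model_req *req)
{
	atomic_fetch_or(&req->poll_refs, MODEL_CANCEL_FLAG);
}

/*
 * Shaped after the loop in io_epoll_retry(): the owner drops every reference
 * it observed and loops if more arrived in the meantime. Returns false if the
 * request was cancelled (the kernel code completes it with -ECANCELED),
 * true if it can be resubmitted.
 */
static bool model_release(struct model_req *req)
{
	unsigned int v;

	do {
		v = atomic_load(&req->poll_refs);
		if (v & MODEL_CANCEL_FLAG)
			return false;
		v &= MODEL_REF_MASK;
		/* fetch_sub returns the old value; subtract v to get the new one */
	} while ((atomic_fetch_sub(&req->poll_refs, v) - v) & MODEL_REF_MASK);

	return true;
}
```

The point of the design is that a wakeup or cancel arriving while someone else owns the request only leaves a reference behind; the owner notices it when releasing and handles the event on behalf of the late arrival.
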