From patchwork Mon Feb 3 16:23:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957789
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 1/9] eventpoll: abstract out main epoll reaper into a function
Date: Mon, 3 Feb 2025 09:23:39 -0700
Message-ID: <20250203163114.124077-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Add epoll_wait(), which takes a struct file and the number of events etc
to reap. This can then be called by do_epoll_wait(), and used by io_uring
as well.

No intended functional changes in this patch.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 31 ++++++++++++++++++-------------
 include/linux/eventpoll.h |  4 ++++
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 7c0980db77b3..73b639caed3d 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2445,12 +2445,8 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 	return do_epoll_ctl(epfd, op, fd, &epds, false);
 }
 
-/*
- * Implement the event wait interface for the eventpoll file. It is the kernel
- * part of the user space epoll_wait(2).
- */
-static int do_epoll_wait(int epfd, struct epoll_event __user *events,
-			 int maxevents, struct timespec64 *to)
+int epoll_wait(struct file *file, struct epoll_event __user *events,
+	       int maxevents, struct timespec64 *to)
 {
 	struct eventpoll *ep;
 
@@ -2462,28 +2458,37 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
 	if (!access_ok(events, maxevents * sizeof(struct epoll_event)))
 		return -EFAULT;
 
-	/* Get the "struct file *" for the eventpoll file */
-	CLASS(fd, f)(epfd);
-	if (fd_empty(f))
-		return -EBADF;
-
 	/*
 	 * We have to check that the file structure underneath the fd
 	 * the user passed to us _is_ an eventpoll file.
 	 */
-	if (!is_file_epoll(fd_file(f)))
+	if (!is_file_epoll(file))
 		return -EINVAL;
 
 	/*
 	 * At this point it is safe to assume that the "private_data" contains
 	 * our own data structure.
 	 */
-	ep = fd_file(f)->private_data;
+	ep = file->private_data;
 
 	/* Time to fish for events ... */
 	return ep_poll(ep, events, maxevents, to);
 }
 
+/*
+ * Implement the event wait interface for the eventpoll file. It is the kernel
+ * part of the user space epoll_wait(2).
+ */
+static int do_epoll_wait(int epfd, struct epoll_event __user *events,
+			 int maxevents, struct timespec64 *to)
+{
+	/* Get the "struct file *" for the eventpoll file */
+	CLASS(fd, f)(epfd);
+	if (!fd_empty(f))
+		return epoll_wait(fd_file(f), events, maxevents, to);
+	return -EBADF;
+}
+
 SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
 		int, maxevents, int, timeout)
 {
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 0c0d00fcd131..f37fea931c44 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -25,6 +25,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long t
 /* Used to release the epoll bits inside the "struct file" */
 void eventpoll_release_file(struct file *file);
 
+/* Use to reap events */
+int epoll_wait(struct file *file, struct epoll_event __user *events,
+	       int maxevents, struct timespec64 *to);
+
 /*
  * This is called from inside fs/file_table.c:__fput() to unlink files
  * from the eventpoll interface. We need to have this facility to cleanup

From patchwork Mon Feb 3 16:23:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957790
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 2/9] eventpoll: add helper to remove wait entry from wait queue head
Date: Mon, 3 Feb 2025 09:23:40 -0700
Message-ID: <20250203163114.124077-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

__epoll_wait_remove() is the core helper, it kills a given
wait_queue_entry from the eventpoll wait_queue_head. Use it internally,
and provide an overall helper, epoll_wait_remove(), which takes a struct
file and provides the same functionality.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 58 +++++++++++++++++++++++++--------------
 include/linux/eventpoll.h |  3 ++
 2 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 73b639caed3d..01edbee5c766 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1980,6 +1980,42 @@ static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
 	return ret;
 }
 
+static int __epoll_wait_remove(struct eventpoll *ep,
+			       struct wait_queue_entry *wait, int timed_out)
+{
+	int eavail;
+
+	/*
+	 * We were woken up, thus go and try to harvest some events. If timed
+	 * out and still on the wait queue, recheck eavail carefully under
+	 * lock, below.
+	 */
+	eavail = 1;
+
+	if (!list_empty_careful(&wait->entry)) {
+		write_lock_irq(&ep->lock);
+		/*
+		 * If the thread timed out and is not on the wait queue, it
+		 * means that the thread was woken up after its timeout expired
+		 * before it could reacquire the lock. Thus, when wait.entry is
+		 * empty, it needs to harvest events.
+		 */
+		if (timed_out)
+			eavail = list_empty(&wait->entry);
+		__remove_wait_queue(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+	}
+
+	return eavail;
+}
+
+int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait)
+{
+	if (is_file_epoll(file))
+		return __epoll_wait_remove(file->private_data, wait, false);
+	return -EINVAL;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2100,27 +2136,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 							      HRTIMER_MODE_ABS);
 		__set_current_state(TASK_RUNNING);
 
-		/*
-		 * We were woken up, thus go and try to harvest some events.
-		 * If timed out and still on the wait queue, recheck eavail
-		 * carefully under lock, below.
-		 */
-		eavail = 1;
-
-		if (!list_empty_careful(&wait.entry)) {
-			write_lock_irq(&ep->lock);
-			/*
-			 * If the thread timed out and is not on the wait queue,
-			 * it means that the thread was woken up after its
-			 * timeout expired before it could reacquire the lock.
-			 * Thus, when wait.entry is empty, it needs to harvest
-			 * events.
-			 */
-			if (timed_out)
-				eavail = list_empty(&wait.entry);
-			__remove_wait_queue(&ep->wq, &wait);
-			write_unlock_irq(&ep->lock);
-		}
+		eavail = __epoll_wait_remove(ep, &wait, timed_out);
 	}
 }
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index f37fea931c44..1301fc74aca0 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -29,6 +29,9 @@ void eventpoll_release_file(struct file *file);
 int epoll_wait(struct file *file, struct epoll_event __user *events,
 	       int maxevents, struct timespec64 *to);
 
+/* Remove wait entry */
+int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait);
+
 /*
  * This is called from inside fs/file_table.c:__fput() to unlink files
  * from the eventpoll interface. We need to have this facility to cleanup

From patchwork Mon Feb 3 16:23:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957791
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 3/9] eventpoll: abstract out ep_try_send_events() helper
Date: Mon, 3 Feb 2025 09:23:41 -0700
Message-ID: <20250203163114.124077-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

In preparation for reusing this helper in another epoll setup helper,
abstract it out.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 01edbee5c766..3cbd290503c7 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2016,6 +2016,22 @@ int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait)
 	return -EINVAL;
 }
 
+static int ep_try_send_events(struct eventpoll *ep,
+			      struct epoll_event __user *events, int maxevents)
+{
+	int res;
+
+	/*
+	 * Try to transfer events to user space. In case we get 0 events and
+	 * there's still timeout left over, we go trying again in search of
+	 * more luck.
+	 */
+	res = ep_send_events(ep, events, maxevents);
+	if (res > 0)
+		ep_suspend_napi_irqs(ep);
+	return res;
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2067,17 +2083,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 
 	while (1) {
 		if (eavail) {
-			/*
-			 * Try to transfer events to user space. In case we get
-			 * 0 events and there's still timeout left over, we go
-			 * trying again in search of more luck.
-			 */
-			res = ep_send_events(ep, events, maxevents);
-			if (res) {
-				if (res > 0)
-					ep_suspend_napi_irqs(ep);
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
 				return res;
-			}
 		}
 
 		if (timed_out)

From patchwork Mon Feb 3 16:23:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957793
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 4/9] eventpoll: add struct wait_queue_entry argument to epoll_wait()
Date: Mon, 3 Feb 2025 09:23:42 -0700
Message-ID: <20250203163114.124077-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

In preparation for allowing an outside caller to add itself to the epoll
waitqueue, pass in a struct wait_queue_entry. Unused in its current form,
but will be utilized shortly.

No intended functional changes in this patch.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c            | 5 +++--
 include/linux/eventpoll.h | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 3cbd290503c7..ecaa5591f4be 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2470,7 +2470,8 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 }
 
 int epoll_wait(struct file *file, struct epoll_event __user *events,
-	       int maxevents, struct timespec64 *to)
+	       int maxevents, struct timespec64 *to,
+	       struct wait_queue_entry *wait)
 {
 	struct eventpoll *ep;
 
@@ -2509,7 +2510,7 @@ static int do_epoll_wait(int epfd, struct epoll_event __user *events,
 	/* Get the "struct file *" for the eventpoll file */
 	CLASS(fd, f)(epfd);
 	if (!fd_empty(f))
-		return epoll_wait(fd_file(f), events, maxevents, to);
+		return epoll_wait(fd_file(f), events, maxevents, to, NULL);
 	return -EBADF;
 }
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 1301fc74aca0..24f9344df5a3 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -27,7 +27,8 @@ void eventpoll_release_file(struct file *file);
 
 /* Use to reap events */
 int epoll_wait(struct file *file, struct epoll_event __user *events,
-	       int maxevents, struct timespec64 *to);
+	       int maxevents, struct timespec64 *to,
+	       struct wait_queue_entry *wait);
 
 /* Remove wait entry */
 int epoll_wait_remove(struct file *file, struct wait_queue_entry *wait);

From patchwork Mon Feb 3 16:23:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957792
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 5/9] eventpoll: add ep_poll_queue() loop
Date: Mon, 3 Feb 2025 09:23:43 -0700
Message-ID: <20250203163114.124077-6-axboe@kernel.dk>
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>

If a wait_queue_entry is passed in to epoll_wait(), then utilize this
new helper for reaping events and/or adding to the epoll waitqueue
rather than calling the potentially sleeping ep_poll(). It works like
ep_poll(), except it doesn't block - it either returns the events that
are already available, or it adds the specified entry to the struct
eventpoll waitqueue to get a callback when events are triggered. It
returns -EIOCBQUEUED for that case.

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index ecaa5591f4be..a8be0c7110e4 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2032,6 +2032,39 @@ static int ep_try_send_events(struct eventpoll *ep,
 	return res;
 }
 
+static int ep_poll_queue(struct eventpoll *ep,
+			 struct epoll_event __user *events, int maxevents,
+			 struct wait_queue_entry *wait)
+{
+	int res, eavail;
+
+	/* See ep_poll() for commentary */
+	eavail = ep_events_available(ep);
+	while (1) {
+		if (eavail) {
+			res = ep_try_send_events(ep, events, maxevents);
+			if (res)
+				return res;
+		}
+
+		eavail = ep_busy_loop(ep, true);
+		if (eavail)
+			continue;
+
+		if (!list_empty_careful(&wait->entry))
+			return -EIOCBQUEUED;
+
+		write_lock_irq(&ep->lock);
+		eavail = ep_events_available(ep);
+		if (!eavail)
+			__add_wait_queue_exclusive(&ep->wq, wait);
+		write_unlock_irq(&ep->lock);
+
+		if (!eavail)
+			return -EIOCBQUEUED;
+	}
+}
+
 /**
  * ep_poll - Retrieves ready events, and delivers them to the caller-supplied
  *           event buffer.
@@ -2497,7 +2530,9 @@ int epoll_wait(struct file *file, struct epoll_event __user *events,
 	ep = file->private_data;
 
 	/* Time to fish for events ... */
-	return ep_poll(ep, events, maxevents, to);
+	if (!wait)
+		return ep_poll(ep, events, maxevents, to);
+	return ep_poll_queue(ep, events, maxevents, wait);
 }
 
 /*

From patchwork Mon Feb 3 16:23:44 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957794
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 6/9] io_uring/epoll: remove CONFIG_EPOLL guards
Date: Mon, 3 Feb 2025 09:23:44 -0700
Message-ID: <20250203163114.124077-7-axboe@kernel.dk>
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>

Just have the Makefile add the object if epoll is enabled, then it's not
necessary to guard the entire epoll.c file inside a CONFIG_EPOLL ifdef.

Signed-off-by: Jens Axboe
---
 io_uring/Makefile | 9 +++++----
 io_uring/epoll.c  | 2 --
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/io_uring/Makefile b/io_uring/Makefile
index d695b60dba4f..7114a6dbd439 100644
--- a/io_uring/Makefile
+++ b/io_uring/Makefile
@@ -11,9 +11,10 @@ obj-$(CONFIG_IO_URING) += io_uring.o opdef.o kbuf.o rsrc.o notif.o \
 					eventfd.o uring_cmd.o openclose.o \
 					sqpoll.o xattr.o nop.o fs.o splice.o \
 					sync.o msg_ring.o advise.o openclose.o \
-					epoll.o statx.o timeout.o fdinfo.o \
-					cancel.o waitid.o register.o \
-					truncate.o memmap.o alloc_cache.o
+					statx.o timeout.o fdinfo.o cancel.o \
+					waitid.o register.o truncate.o \
+					memmap.o alloc_cache.o
 obj-$(CONFIG_IO_WQ) += io-wq.o
 obj-$(CONFIG_FUTEX) += futex.o
-obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o
+obj-$(CONFIG_EPOLL) += epoll.o
+obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 89bff2068a19..7848d9cc073d 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -12,7 +12,6 @@
 #include "io_uring.h"
 #include "epoll.h"
 
-#if defined(CONFIG_EPOLL)
 struct io_epoll {
 	struct file *file;
 	int epfd;
@@ -58,4 +57,3 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
-#endif

From patchwork Mon Feb 3 16:23:45 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957795
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 7/9] io_uring/poll: pull ownership handling into poll.h
Date: Mon, 3 Feb 2025 09:23:45 -0700
Message-ID: <20250203163114.124077-8-axboe@kernel.dk>
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>

In preparation for using it from somewhere else. Rather than try and
duplicate the functionality, just make it generically available to
io_uring opcodes.

Note: would have to be used carefully, cannot be used by opcodes that
can trigger poll logic.

Signed-off-by: Jens Axboe
---
 io_uring/poll.c | 30 +-----------------------------
 io_uring/poll.h | 31 +++++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/io_uring/poll.c b/io_uring/poll.c
index bb1c0cd4f809..5e44ac562491 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -41,16 +41,6 @@ struct io_poll_table {
 	__poll_t result_mask;
 };
 
-#define IO_POLL_CANCEL_FLAG BIT(31)
-#define IO_POLL_RETRY_FLAG BIT(30)
-#define IO_POLL_REF_MASK GENMASK(29, 0)
-
-/*
- * We usually have 1-2 refs taken, 128 is more than enough and we want to
- * maximise the margin between this amount and the moment when it overflows.
- */
-#define IO_POLL_REF_BIAS 128
-
 #define IO_WQE_F_DOUBLE 1
 
 static int io_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
@@ -70,7 +60,7 @@ static inline bool wqe_is_double(struct wait_queue_entry *wqe)
 	return priv & IO_WQE_F_DOUBLE;
 }
 
-static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
+bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
 {
 	int v;
 
@@ -85,24 +75,6 @@ static bool io_poll_get_ownership_slowpath(struct io_kiocb *req)
 	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
 }
 
-/*
- * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
- * bump it and acquire ownership. It's disallowed to modify requests while not
- * owning it, that prevents from races for enqueueing task_work's and b/w
- * arming poll and wakeups.
- */
-static inline bool io_poll_get_ownership(struct io_kiocb *req)
-{
-	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
-		return io_poll_get_ownership_slowpath(req);
-	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
-}
-
-static void io_poll_mark_cancelled(struct io_kiocb *req)
-{
-	atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
-}
-
 static struct io_poll *io_poll_get_double(struct io_kiocb *req)
 {
 	/* pure poll stashes this in ->async_data, poll driven retry elsewhere */
diff --git a/io_uring/poll.h b/io_uring/poll.h
index 04ede93113dc..2f416cd3be13 100644
--- a/io_uring/poll.h
+++ b/io_uring/poll.h
@@ -21,6 +21,18 @@ struct async_poll {
 	struct io_poll *double_poll;
 };
 
+#define IO_POLL_CANCEL_FLAG BIT(31)
+#define IO_POLL_RETRY_FLAG BIT(30)
+#define IO_POLL_REF_MASK GENMASK(29, 0)
+
+bool io_poll_get_ownership_slowpath(struct io_kiocb *req);
+
+/*
+ * We usually have 1-2 refs taken, 128 is more than enough and we want to
+ * maximise the margin between this amount and the moment when it overflows.
+ */
+#define IO_POLL_REF_BIAS 128
+
 /*
  * Must only be called inside issue_flags & IO_URING_F_MULTISHOT, or
  * potentially other cases where we already "own" this poll request.
@@ -30,6 +42,25 @@ static inline void io_poll_multishot_retry(struct io_kiocb *req)
 	atomic_inc(&req->poll_refs);
 }
 
+/*
+ * If refs part of ->poll_refs (see IO_POLL_REF_MASK) is 0, it's free. We can
+ * bump it and acquire ownership. It's disallowed to modify requests while not
+ * owning it, that prevents from races for enqueueing task_work's and b/w
+ * arming poll and wakeups.
+ */
+static inline bool io_poll_get_ownership(struct io_kiocb *req)
+{
+	if (unlikely(atomic_read(&req->poll_refs) >= IO_POLL_REF_BIAS))
+		return io_poll_get_ownership_slowpath(req);
+	return !(atomic_fetch_inc(&req->poll_refs) & IO_POLL_REF_MASK);
+}
+
+static inline void io_poll_mark_cancelled(struct io_kiocb *req)
+{
+	atomic_or(IO_POLL_CANCEL_FLAG, &req->poll_refs);
+}
+
+
 int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_poll_add(struct io_kiocb *req, unsigned int issue_flags);
 

From patchwork Mon Feb 3 16:23:46 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957796
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 8/9] io_uring/epoll: add support for IORING_OP_EPOLL_WAIT
Date: Mon, 3 Feb 2025 09:23:46 -0700
Message-ID: <20250203163114.124077-9-axboe@kernel.dk>
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>

For existing epoll event loops that can't fully convert to io_uring,
the usual approach is to add the io_uring fd to the epoll instance and
use epoll_wait() to wait on both "legacy" and io_uring events. While
this works, it isn't optimal as:

1) epoll_wait() is pretty limited in what it can do. It does not support
   partial reaping of events, or waiting on a batch of events.
2) When an io_uring ring is added to an epoll instance, it activates the
   io_uring "I'm being polled" logic which slows things down.

Rather than use this approach, with EPOLL_WAIT support added to io_uring,
event loops can use the normal io_uring wait logic for everything, as
long as an epoll wait request has been armed with io_uring.

Note that IORING_OP_EPOLL_WAIT does NOT take a timeout value, as this
is an async request. Waiting on io_uring events in general has various
timeout parameters, and those are the ones that should be used when
waiting on any kind of request. If events are immediately available for
reaping, then this opcode will return those immediately. If none are
available, then it will post an async completion when they become
available.

cqe->res will contain either an error code (< 0 value) for a malformed
request, invalid epoll instance, etc., or a positive result indicating
how many events were reaped.

IORING_OP_EPOLL_WAIT requests may be canceled using the normal io_uring
cancelation infrastructure. The poll logic for managing ownership is
adopted to guard the epoll side too.

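As an illustration of the ownership scheme mentioned above, here is a
minimal userspace sketch, not kernel code. It models only the fast path
of io_poll_get_ownership() and io_poll_mark_cancelled() with C11
atomics. struct model_req, struct model_result, the model_* helpers and
the MODEL_* macros are hypothetical names made up for this example, and
the slowpath and IO_POLL_RETRY_FLAG handling are left out on purpose.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins for the flag/mask macros in io_uring/poll.h. */
#define MODEL_POLL_CANCEL_FLAG	(1u << 31)		/* BIT(31) */
#define MODEL_POLL_REF_MASK	((1u << 30) - 1)	/* GENMASK(29, 0) */

/* Hypothetical request model: only the refcount word is kept. */
struct model_req {
	atomic_uint poll_refs;
};

/* What each side observed, plus the final value of poll_refs. */
struct model_result {
	bool wake_owns;
	bool cancel_owns;
	unsigned int refs;
};

/* Fast path only: whoever bumps the ref part from 0 becomes the owner. */
static bool model_get_ownership(struct model_req *req)
{
	return !(atomic_fetch_add(&req->poll_refs, 1) & MODEL_POLL_REF_MASK);
}

/* Flag the request as cancelled without touching the ref part. */
static void model_mark_cancelled(struct model_req *req)
{
	atomic_fetch_or(&req->poll_refs, MODEL_POLL_CANCEL_FLAG);
}

/*
 * Replays a wakeup followed by a cancel attempt: the wakeup side takes
 * ownership first, so the cancel side only leaves the flag (and one
 * extra ref) behind for the owner to notice.
 */
static struct model_result model_demo(void)
{
	struct model_req req;
	struct model_result r;

	atomic_init(&req.poll_refs, 0);

	/* wakeup callback: ref part goes 0 -> 1, this side is the owner */
	r.wake_owns = model_get_ownership(&req);

	/* cancel path: set the flag, then try (and fail) to take ownership */
	model_mark_cancelled(&req);
	r.cancel_owns = model_get_ownership(&req);

	r.refs = atomic_load(&req.poll_refs);
	return r;
}
```

In this model the second caller sees a non-zero ref part and backs off.
That mirrors how __io_epoll_wait_cancel() in this patch only cancels the
request itself when io_poll_get_ownership() succeeds; otherwise the
current owner is expected to notice IO_POLL_CANCEL_FLAG, as
io_epoll_retry() does.
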
Signed-off-by: Jens Axboe
---
 include/linux/io_uring_types.h |   4 +
 include/uapi/linux/io_uring.h  |   1 +
 io_uring/cancel.c              |   5 +
 io_uring/epoll.c               | 168 +++++++++++++++++++++++++++++++++
 io_uring/epoll.h               |  22 +++++
 io_uring/io_uring.c            |   5 +
 io_uring/opdef.c               |  14 +++
 7 files changed, 219 insertions(+)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 3def525a1da3..ee56992d31d5 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -370,6 +370,10 @@ struct io_ring_ctx {
 	struct io_alloc_cache futex_cache;
 #endif
 
+#ifdef CONFIG_EPOLL
+	struct hlist_head epoll_list;
+#endif
+
 	const struct cred *sq_creds; /* cred used for __io_sq_thread() */
 	struct io_sq_data *sq_data; /* if using sq thread polling */
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e11c82638527..a559e1e1544a 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -278,6 +278,7 @@ enum io_uring_op {
 	IORING_OP_FTRUNCATE,
 	IORING_OP_BIND,
 	IORING_OP_LISTEN,
+	IORING_OP_EPOLL_WAIT,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
diff --git a/io_uring/cancel.c b/io_uring/cancel.c
index 484193567839..9cebd0145cb4 100644
--- a/io_uring/cancel.c
+++ b/io_uring/cancel.c
@@ -17,6 +17,7 @@
 #include "timeout.h"
 #include "waitid.h"
 #include "futex.h"
+#include "epoll.h"
 #include "cancel.h"
 
 struct io_cancel {
@@ -128,6 +129,10 @@ int io_try_cancel(struct io_uring_task *tctx, struct io_cancel_data *cd,
 	if (ret != -ENOENT)
 		return ret;
 
+	ret = io_epoll_wait_cancel(ctx, cd, issue_flags);
+	if (ret != -ENOENT)
+		return ret;
+
 	spin_lock(&ctx->completion_lock);
 	if (!(cd->flags & IORING_ASYNC_CANCEL_FD))
 		ret = io_timeout_cancel(ctx, cd);
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 7848d9cc073d..2a9c679516c8 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -11,6 +11,7 @@
 
 #include "io_uring.h"
 #include "epoll.h"
+#include "poll.h"
 
 struct io_epoll {
 	struct file *file;
@@ -20,6 +21,13 @@ struct io_epoll {
 	struct epoll_event event;
 };
 
+struct io_epoll_wait {
+	struct file *file;
+	int maxevents;
+	struct epoll_event __user *events;
+	struct wait_queue_entry wait;
+};
+
 int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_epoll *epoll = io_kiocb_to_cmd(req, struct io_epoll);
@@ -57,3 +65,163 @@ int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags)
 	io_req_set_res(req, ret, 0);
 	return IOU_OK;
 }
+
+static void __io_epoll_cancel(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	epoll_wait_remove(req->file, &iew->wait);
+	hlist_del_init(&req->hash_node);
+	io_req_set_res(req, -ECANCELED, 0);
+	req->io_task_work.func = io_req_task_complete;
+	io_req_task_work_add(req);
+}
+
+static void __io_epoll_wait_cancel(struct io_kiocb *req)
+{
+	io_poll_mark_cancelled(req);
+	if (io_poll_get_ownership(req)) {
+		__io_epoll_cancel(req);
+		io_req_set_res(req, -ECANCELED, 0);
+		req->io_task_work.func = io_req_task_complete;
+		io_req_task_work_add(req);
+	}
+}
+
+bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx, struct io_uring_task *tctx,
+			      bool cancel_all)
+{
+	struct hlist_node *tmp;
+	struct io_kiocb *req;
+	bool found = false;
+
+	lockdep_assert_held(&ctx->uring_lock);
+
+	hlist_for_each_entry_safe(req, tmp, &ctx->epoll_list, hash_node) {
+		if (!io_match_task_safe(req, tctx, cancel_all))
+			continue;
+		__io_epoll_wait_cancel(req);
+		found = true;
+	}
+
+	return found;
+}
+
+int io_epoll_wait_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
+			 unsigned int issue_flags)
+{
+	struct hlist_node *tmp;
+	struct io_kiocb *req;
+	int nr = 0;
+
+	io_ring_submit_lock(ctx, issue_flags);
+	hlist_for_each_entry_safe(req, tmp, &ctx->epoll_list, hash_node) {
+		if (!io_cancel_req_match(req, cd))
+			continue;
+		__io_epoll_wait_cancel(req);
+		nr++;
+	}
+	io_ring_submit_unlock(ctx, issue_flags);
+	return nr ?: -ENOENT;
+}
+
+static void io_epoll_retry(struct io_kiocb *req, struct io_tw_state *ts)
+{
+	int v;
+
+	do {
+		v = atomic_read(&req->poll_refs);
+		if (unlikely(v != 1)) {
+			if (WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))
+				return;
+			if (v & IO_POLL_CANCEL_FLAG) {
+				__io_epoll_cancel(req);
+				return;
+			}
+		}
+		v &= IO_POLL_REF_MASK;
+	} while (atomic_sub_return(v, &req->poll_refs) & IO_POLL_REF_MASK);
+
+	io_req_task_submit(req, ts);
+}
+
+static int io_epoll_execute(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	if (io_poll_get_ownership(req)) {
+		list_del_init_careful(&iew->wait.entry);
+		req->io_task_work.func = io_epoll_retry;
+		io_req_task_work_add(req);
+		return 1;
+	}
+
+	return 0;
+}
+
+static __cold int io_epoll_pollfree_wake(struct io_kiocb *req)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	io_poll_mark_cancelled(req);
+	list_del_init_careful(&iew->wait.entry);
+	io_epoll_execute(req);
+	return 1;
+}
+
+static int io_epoll_wait_fn(struct wait_queue_entry *wait, unsigned mode,
+			    int sync, void *key)
+{
+	struct io_kiocb *req = wait->private;
+	__poll_t mask = key_to_poll(key);
+
+	if (unlikely(mask & POLLFREE))
+		return io_epoll_pollfree_wake(req);
+
+	return io_epoll_execute(req);
+}
+
+int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+
+	if (sqe->off || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+		return -EINVAL;
+
+	iew->maxevents = READ_ONCE(sqe->len);
+	iew->events = u64_to_user_ptr(READ_ONCE(sqe->addr));
+
+	iew->wait.flags = 0;
+	iew->wait.private = req;
+	iew->wait.func = io_epoll_wait_fn;
+	INIT_LIST_HEAD(&iew->wait.entry);
+	atomic_set(&req->poll_refs, 0);
+	return 0;
+}
+
+int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
+	struct io_ring_ctx *ctx = req->ctx;
+	int ret;
+
+	io_ring_submit_lock(ctx, issue_flags);
+	hlist_add_head(&req->hash_node, &ctx->epoll_list);
+	io_ring_submit_unlock(ctx, issue_flags);
+
+	/*
+	 * Timeout is fake here, it doesn't indicate any kind of sleep time.
+	 * It's just set to something that is non-zero, so that wait queue
+	 * wakeup is armed if no events are available.
+	 */
+	ret = epoll_wait(req->file, iew->events, iew->maxevents, NULL, &iew->wait);
+	if (ret == -EIOCBQUEUED)
+		return IOU_ISSUE_SKIP_COMPLETE;
+	else if (ret < 0)
+		req_set_fail(req);
+	io_ring_submit_lock(ctx, issue_flags);
+	hlist_del_init(&req->hash_node);
+	io_ring_submit_unlock(ctx, issue_flags);
+	io_req_set_res(req, ret, 0);
+	return IOU_OK;
+}
diff --git a/io_uring/epoll.h b/io_uring/epoll.h
index 870cce11ba98..296940d89063 100644
--- a/io_uring/epoll.h
+++ b/io_uring/epoll.h
@@ -1,6 +1,28 @@
 // SPDX-License-Identifier: GPL-2.0
 
+#include "cancel.h"
+
 #if defined(CONFIG_EPOLL)
+int io_epoll_wait_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
+			 unsigned int issue_flags);
+bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx, struct io_uring_task *tctx,
+			      bool cancel_all);
+
 int io_epoll_ctl_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_epoll_ctl(struct io_kiocb *req, unsigned int issue_flags);
+int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags);
+#else
+static inline bool io_epoll_wait_remove_all(struct io_ring_ctx *ctx,
+					    struct io_uring_task *tctx,
+					    bool cancel_all)
+{
+	return false;
+}
+static inline int io_epoll_wait_cancel(struct io_ring_ctx *ctx,
+				       struct io_cancel_data *cd,
+				       unsigned int issue_flags)
+{
+	return 0;
+}
 #endif
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 2f311aeb536f..a17abdbae7ee 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -93,6 +93,7 @@
 #include "notif.h"
 #include "waitid.h"
 #include "futex.h"
+#include "epoll.h"
 #include "napi.h"
 #include "uring_cmd.h"
 #include "msg_ring.h"
@@ -358,6 +359,9 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	INIT_HLIST_HEAD(&ctx->waitid_list);
 #ifdef CONFIG_FUTEX
 	INIT_HLIST_HEAD(&ctx->futex_list);
+#endif
+#ifdef CONFIG_EPOLL
+	INIT_HLIST_HEAD(&ctx->epoll_list);
 #endif
 	INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func);
 	INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
@@ -3095,6 +3099,7 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
 	ret |= io_poll_remove_all(ctx, tctx, cancel_all);
 	ret |= io_waitid_remove_all(ctx, tctx, cancel_all);
 	ret |= io_futex_remove_all(ctx, tctx, cancel_all);
+	ret |= io_epoll_wait_remove_all(ctx, tctx, cancel_all);
 	ret |= io_uring_try_cancel_uring_cmd(ctx, tctx, cancel_all);
 	mutex_unlock(&ctx->uring_lock);
 	ret |= io_kill_timeouts(ctx, tctx, cancel_all);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index e8baef4e5146..44553a657476 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -514,6 +514,17 @@ const struct io_issue_def io_issue_defs[] = {
 		.async_size = sizeof(struct io_async_msghdr),
 #else
 		.prep = io_eopnotsupp_prep,
+#endif
+	},
+	[IORING_OP_EPOLL_WAIT] = {
+		.needs_file = 1,
+		.unbound_nonreg_file = 1,
+		.audit_skip = 1,
+#if defined(CONFIG_EPOLL)
+		.prep = io_epoll_wait_prep,
+		.issue = io_epoll_wait,
+#else
+		.prep = io_eopnotsupp_prep,
 #endif
 	},
 };
@@ -745,6 +756,9 @@ const struct io_cold_def io_cold_defs[] = {
 	[IORING_OP_LISTEN] = {
 		.name = "LISTEN",
 	},
+	[IORING_OP_EPOLL_WAIT] = {
+		.name = "EPOLL_WAIT",
+	},
 };
 
 const char *io_uring_get_opcode(u8 opcode)

From patchwork Mon Feb 3 16:23:47 2025
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13957797
-0800 (PST)
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, brauner@kernel.org, Jens Axboe
Subject: [PATCH 9/9] io_uring/epoll: add multishot support for IORING_OP_EPOLL_WAIT
Date: Mon, 3 Feb 2025 09:23:47 -0700
Message-ID: <20250203163114.124077-10-axboe@kernel.dk>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250203163114.124077-1-axboe@kernel.dk>
References: <20250203163114.124077-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

As with other multishot requests, submitting a multishot epoll wait
request will keep it re-armed post the initial trigger. This allows
multiple epoll wait completions per request submitted, every time
events are available. If more completions are expected for this
epoll wait request, then IORING_CQE_F_MORE will be set in the posted
cqe->flags.

For multishot, the request remains on the epoll callback waitqueue
head. This means that epoll doesn't need to juggle the ep->lock
writelock (and disable/enable IRQs) for each invocation of the
reaping loop. That should translate into nice efficiency gains.

Use by setting IORING_EPOLL_WAIT_MULTISHOT in the sqe->epoll_flags
member.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h |  6 ++++++
 io_uring/epoll.c              | 40 ++++++++++++++++++++++++++---------
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index a559e1e1544a..93f504b6d4ec 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -73,6 +73,7 @@ struct io_uring_sqe {
 		__u32 futex_flags;
 		__u32 install_fd_flags;
 		__u32 nop_flags;
+		__u32 epoll_flags;
 	};
 	__u64 user_data; /* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -405,6 +406,11 @@ enum io_uring_op {
 #define IORING_ACCEPT_DONTWAIT (1U << 1)
 #define IORING_ACCEPT_POLL_FIRST (1U << 2)
 
+/*
+ * epoll_wait flags, stored in sqe->epoll_flags
+ */
+#define IORING_EPOLL_WAIT_MULTISHOT (1U << 0)
+
 /*
  * IORING_OP_MSG_RING command types, stored in sqe->addr
  */
diff --git a/io_uring/epoll.c b/io_uring/epoll.c
index 2a9c679516c8..730f4b729f5b 100644
--- a/io_uring/epoll.c
+++ b/io_uring/epoll.c
@@ -24,6 +24,7 @@ struct io_epoll {
 struct io_epoll_wait {
 	struct file *file;
 	int maxevents;
+	int flags;
 	struct epoll_event __user *events;
 	struct wait_queue_entry wait;
 };
@@ -145,12 +146,15 @@ static void io_epoll_retry(struct io_kiocb *req, struct io_tw_state *ts)
 	io_req_task_submit(req, ts);
 }
 
-static int io_epoll_execute(struct io_kiocb *req)
+static int io_epoll_execute(struct io_kiocb *req, __poll_t mask)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
 
 	if (io_poll_get_ownership(req)) {
-		list_del_init_careful(&iew->wait.entry);
+		if (mask & EPOLL_URING_WAKE)
+			req->flags &= ~REQ_F_APOLL_MULTISHOT;
+		if (!(req->flags & REQ_F_APOLL_MULTISHOT))
+			list_del_init_careful(&iew->wait.entry);
 		req->io_task_work.func = io_epoll_retry;
 		io_req_task_work_add(req);
 		return 1;
@@ -159,13 +163,13 @@ static int io_epoll_execute(struct io_kiocb *req)
 	return 0;
 }
 
-static __cold int io_epoll_pollfree_wake(struct io_kiocb *req)
+static __cold int io_epoll_pollfree_wake(struct io_kiocb *req, __poll_t mask)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
 
 	io_poll_mark_cancelled(req);
 	list_del_init_careful(&iew->wait.entry);
-	io_epoll_execute(req);
+	io_epoll_execute(req, mask);
 	return 1;
 }
 
@@ -176,18 +180,23 @@ static int io_epoll_wait_fn(struct wait_queue_entry *wait, unsigned mode,
 	__poll_t mask = key_to_poll(key);
 
 	if (unlikely(mask & POLLFREE))
-		return io_epoll_pollfree_wake(req);
+		return io_epoll_pollfree_wake(req, mask);
 
-	return io_epoll_execute(req);
+	return io_epoll_execute(req, mask);
 }
 
 int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_epoll_wait *iew = io_kiocb_to_cmd(req, struct io_epoll_wait);
 
-	if (sqe->off || sqe->rw_flags || sqe->buf_index || sqe->splice_fd_in)
+	if (sqe->off || sqe->buf_index || sqe->splice_fd_in)
 		return -EINVAL;
 
+	iew->flags = READ_ONCE(sqe->epoll_flags);
+	if (iew->flags & ~IORING_EPOLL_WAIT_MULTISHOT)
+		return -EINVAL;
+	else if (iew->flags & IORING_EPOLL_WAIT_MULTISHOT)
+		req->flags |= REQ_F_APOLL_MULTISHOT;
 	iew->maxevents = READ_ONCE(sqe->len);
 	iew->events = u64_to_user_ptr(READ_ONCE(sqe->addr));
 
@@ -195,6 +204,7 @@ int io_epoll_wait_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	iew->wait.private = req;
 	iew->wait.func = io_epoll_wait_fn;
 	INIT_LIST_HEAD(&iew->wait.entry);
+	INIT_HLIST_NODE(&req->hash_node);
 	atomic_set(&req->poll_refs, 0);
 	return 0;
 }
@@ -205,9 +215,11 @@ int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_ring_ctx *ctx = req->ctx;
 	int ret;
 
-	io_ring_submit_lock(ctx, issue_flags);
-	hlist_add_head(&req->hash_node, &ctx->epoll_list);
-	io_ring_submit_unlock(ctx, issue_flags);
+	if (hlist_unhashed(&req->hash_node)) {
+		io_ring_submit_lock(ctx, issue_flags);
+		hlist_add_head(&req->hash_node, &ctx->epoll_list);
+		io_ring_submit_unlock(ctx, issue_flags);
+	}
 
 	/*
 	 * Timeout is fake here, it doesn't indicate any kind of sleep time.
@@ -219,9 +231,17 @@ int io_epoll_wait(struct io_kiocb *req, unsigned int issue_flags)
 		return IOU_ISSUE_SKIP_COMPLETE;
 	else if (ret < 0)
 		req_set_fail(req);
+
+	if (ret >= 0 && req->flags & REQ_F_APOLL_MULTISHOT &&
+	    io_req_post_cqe(req, ret, IORING_CQE_F_MORE))
+		return IOU_ISSUE_SKIP_COMPLETE;
+
 	io_ring_submit_lock(ctx, issue_flags);
 	hlist_del_init(&req->hash_node);
 	io_ring_submit_unlock(ctx, issue_flags);
 	io_req_set_res(req, ret, 0);
+
+	if (issue_flags & IO_URING_F_MULTISHOT)
+		return IOU_STOP_MULTISHOT;
 	return IOU_OK;
 }
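
For readers on the application side, the flag handling described in the 9/9 commit message can be sketched in plain C. This is only an illustration under stated assumptions, not code from the series: the two helper names are hypothetical, nothing here touches a real ring, and the constants are defined locally so the snippet builds without patched headers. IORING_EPOLL_WAIT_MULTISHOT uses the value from the uapi hunk above, and IORING_CQE_F_MORE uses its existing uapi value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value taken from the uapi hunk in this patch. */
#define IORING_EPOLL_WAIT_MULTISHOT	(1U << 0)
/* Existing uapi CQE flag: more completions will follow for this request. */
#define IORING_CQE_F_MORE		(1U << 1)

/*
 * Hypothetical helper mirroring the flag check in io_epoll_wait_prep():
 * any bit other than IORING_EPOLL_WAIT_MULTISHOT is rejected with -EINVAL,
 * otherwise *multishot reports whether multishot mode was requested.
 */
static int epoll_wait_check_flags(uint32_t epoll_flags, bool *multishot)
{
	assert(multishot != NULL);

	if (epoll_flags & ~IORING_EPOLL_WAIT_MULTISHOT)
		return -EINVAL;

	*multishot = (epoll_flags & IORING_EPOLL_WAIT_MULTISHOT) != 0;
	return 0;
}

/*
 * Hypothetical application-side check: a multishot epoll wait request is
 * still armed only while the posted CQE carries IORING_CQE_F_MORE. Once
 * the flag is absent, the application has to submit a new request.
 */
static bool epoll_wait_still_armed(uint32_t cqe_flags)
{
	return (cqe_flags & IORING_CQE_F_MORE) != 0;
}
```

A consumer would presumably call something like epoll_wait_still_armed() on every CQE for the request and resubmit when it returns false, the same way other multishot requests are handled.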