From patchwork Fri Aug 16 20:38:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13766806
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/5] io_uring: encapsulate extraneous wait flags into a separate struct
Date: Fri, 16 Aug 2024 14:38:12 -0600
Message-ID: <20240816204302.85938-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240816204302.85938-2-axboe@kernel.dk>
References: <20240816204302.85938-2-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Rather than need to pass in 2 or 3 separate arguments, add a struct
to encapsulate the timeout and sigset_t parts of waiting. In preparation
for adding another argument for waiting.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 5e75672525df..f90a4490908b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2384,13 +2384,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	return ret;
 }
 
+struct ext_arg {
+	size_t argsz;
+	struct __kernel_timespec __user *ts;
+	const sigset_t __user *sig;
+};
+
 /*
  * Wait until events become available, if we don't already have some. The
  * application must reap them itself, as they reside on the shared cq ring.
  */
 static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
-			  const sigset_t __user *sig, size_t sigsz,
-			  struct __kernel_timespec __user *uts)
+			  struct ext_arg *ext_arg)
 {
 	struct io_wait_queue iowq;
 	struct io_rings *rings = ctx->rings;
@@ -2416,10 +2421,10 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	iowq.timeout = KTIME_MAX;
 	iowq.no_iowait = flags & IORING_ENTER_NO_IOWAIT;
 
-	if (uts) {
+	if (ext_arg->ts) {
 		struct timespec64 ts;
 
-		if (get_timespec64(&ts, uts))
+		if (get_timespec64(&ts, ext_arg->ts))
 			return -EFAULT;
 
 		iowq.timeout = timespec64_to_ktime(ts);
@@ -2427,14 +2432,14 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 			iowq.timeout = ktime_add(iowq.timeout, io_get_time(ctx));
 	}
 
-	if (sig) {
+	if (ext_arg->sig) {
 #ifdef CONFIG_COMPAT
 		if (in_compat_syscall())
-			ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
-						      sigsz);
+			ret = set_compat_user_sigmask((const compat_sigset_t __user *)ext_arg->sig,
+						      ext_arg->argsz);
 		else
 #endif
-			ret = set_user_sigmask(sig, sigsz);
+			ret = set_user_sigmask(ext_arg->sig, ext_arg->argsz);
 
 		if (ret)
 			return ret;
@@ -3113,9 +3118,8 @@ static int io_validate_ext_arg(unsigned flags, const void __user *argp, size_t a
 	return 0;
 }
 
-static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz,
-			  struct __kernel_timespec __user **ts,
-			  const sigset_t __user **sig)
+static int io_get_ext_arg(unsigned flags, const void __user *argp,
+			  struct ext_arg *ext_arg)
 {
 	struct io_uring_getevents_arg arg;
 
@@ -3124,8 +3128,8 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz
 	 * is just a pointer to the sigset_t.
 	 */
 	if (!(flags & IORING_ENTER_EXT_ARG)) {
-		*sig = (const sigset_t __user *) argp;
-		*ts = NULL;
+		ext_arg->sig = (const sigset_t __user *) argp;
+		ext_arg->ts = NULL;
 		return 0;
 	}
 
@@ -3133,15 +3137,15 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz
 	 * EXT_ARG is set - ensure we agree on the size of it and copy in our
 	 * timespec and sigset_t pointers if good.
 	 */
-	if (*argsz != sizeof(arg))
+	if (ext_arg->argsz != sizeof(arg))
 		return -EINVAL;
 	if (copy_from_user(&arg, argp, sizeof(arg)))
 		return -EFAULT;
 	if (arg.pad)
 		return -EINVAL;
-	*sig = u64_to_user_ptr(arg.sigmask);
-	*argsz = arg.sigmask_sz;
-	*ts = u64_to_user_ptr(arg.ts);
+	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
+	ext_arg->argsz = arg.sigmask_sz;
+	ext_arg->ts = u64_to_user_ptr(arg.ts);
 	return 0;
 }
 
@@ -3247,15 +3251,14 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			}
 			mutex_unlock(&ctx->uring_lock);
 		} else {
-			const sigset_t __user *sig;
-			struct __kernel_timespec __user *ts;
+			struct ext_arg ext_arg = { .argsz = argsz };
 
-			ret2 = io_get_ext_arg(flags, argp, &argsz, &ts, &sig);
+			ret2 = io_get_ext_arg(flags, argp, &ext_arg);
 			if (likely(!ret2)) {
 				min_complete = min(min_complete,
 						   ctx->cq_entries);
 				ret2 = io_cqring_wait(ctx, min_complete, flags,
-						      sig, argsz, ts);
+						      &ext_arg);
 			}
 		}
 

From patchwork Fri Aug 16 20:38:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13766807
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/5] io_uring: move schedule wait logic into helper
Date: Fri, 16 Aug 2024 14:38:13 -0600
Message-ID: <20240816204302.85938-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240816204302.85938-2-axboe@kernel.dk>
References: <20240816204302.85938-2-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for expanding
how we handle waits, move the actual schedule and schedule_timeout()
handling into a helper.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index f90a4490908b..2bdb66902f58 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2350,22 +2350,10 @@ static bool current_pending_io(void)
 	return percpu_counter_read_positive(&tctx->inflight);
 }
 
-/* when returns >0, the caller should retry */
-static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
-					  struct io_wait_queue *iowq)
+static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+				     struct io_wait_queue *iowq)
 {
-	int ret;
-
-	if (unlikely(READ_ONCE(ctx->check_cq)))
-		return 1;
-	if (unlikely(!llist_empty(&ctx->work_llist)))
-		return 1;
-	if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL)))
-		return 1;
-	if (unlikely(task_sigpending(current)))
-		return -EINTR;
-	if (unlikely(io_should_wake(iowq)))
-		return 0;
+	int ret = 0;
 
 	/*
 	 * Mark us as being in io_wait if we have pending requests, so cpufreq
@@ -2374,7 +2362,6 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (!iowq->no_iowait && current_pending_io())
 		current->in_iowait = 1;
-	ret = 0;
 	if (iowq->timeout == KTIME_MAX)
 		schedule();
 	else if (!schedule_hrtimeout_range_clock(&iowq->timeout, 0,
@@ -2384,6 +2371,24 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	return ret;
 }
 
+/* If this returns > 0, the caller should retry */
+static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+					  struct io_wait_queue *iowq)
+{
+	if (unlikely(READ_ONCE(ctx->check_cq)))
+		return 1;
+	if (unlikely(!llist_empty(&ctx->work_llist)))
+		return 1;
+	if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL)))
+		return 1;
+	if (unlikely(task_sigpending(current)))
+		return -EINTR;
+	if (unlikely(io_should_wake(iowq)))
+		return 0;
+
+	return __io_cqring_wait_schedule(ctx, iowq);
+}
+
 struct ext_arg {
 	size_t argsz;
 	struct __kernel_timespec __user *ts;

From patchwork Fri Aug 16 20:38:14 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13766808
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 3/5] io_uring: implement our own schedule timeout handling
Date: Fri, 16 Aug 2024 14:38:14 -0600
Message-ID: <20240816204302.85938-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240816204302.85938-2-axboe@kernel.dk>
References: <20240816204302.85938-2-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for having two distinct timeouts, and to avoid waking
the task if we don't need to.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 41 ++++++++++++++++++++++++++++++++++++-----
 io_uring/io_uring.h |  2 ++
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 2bdb66902f58..6e53ebd58aab 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2322,7 +2322,7 @@ static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
 	 * Cannot safely flush overflowed CQEs from here, ensure we wake up
 	 * the task, and the next invocation will do it.
 	 */
-	if (io_should_wake(iowq) || io_has_work(iowq->ctx))
+	if (io_should_wake(iowq) || io_has_work(iowq->ctx) || iowq->hit_timeout)
 		return autoremove_wake_function(curr, mode, wake_flags, key);
 	return -1;
 }
@@ -2350,6 +2350,38 @@ static bool current_pending_io(void)
 	return percpu_counter_read_positive(&tctx->inflight);
 }
 
+static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer)
+{
+	struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t);
+	struct io_ring_ctx *ctx = iowq->ctx;
+
+	WRITE_ONCE(iowq->hit_timeout, 1);
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+		wake_up_process(ctx->submitter_task);
+	else
+		io_cqring_wake(ctx);
+	return HRTIMER_NORESTART;
+}
+
+static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
+				      clockid_t clock_id)
+{
+	iowq->hit_timeout = 0;
+	hrtimer_init_on_stack(&iowq->t, clock_id, HRTIMER_MODE_ABS);
+	iowq->t.function = io_cqring_timer_wakeup;
+	hrtimer_set_expires_range_ns(&iowq->t, iowq->timeout, 0);
+	hrtimer_start_expires(&iowq->t, HRTIMER_MODE_ABS);
+
+	if (!READ_ONCE(iowq->hit_timeout))
+		schedule();
+
+	hrtimer_cancel(&iowq->t);
+	destroy_hrtimer_on_stack(&iowq->t);
+	__set_current_state(TASK_RUNNING);
+
+	return READ_ONCE(iowq->hit_timeout) ? -ETIME : 0;
+}
+
 static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 				     struct io_wait_queue *iowq)
 {
@@ -2362,11 +2394,10 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (!iowq->no_iowait && current_pending_io())
 		current->in_iowait = 1;
-	if (iowq->timeout == KTIME_MAX)
+	if (iowq->timeout != KTIME_MAX)
+		ret = io_cqring_schedule_timeout(iowq, ctx->clockid);
+	else
 		schedule();
-	else if (!schedule_hrtimeout_range_clock(&iowq->timeout, 0,
-						 HRTIMER_MODE_ABS, ctx->clockid))
-		ret = -ETIME;
 	current->in_iowait = 0;
 	return ret;
 }
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index e0868b79774c..bac830a2d6ec 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -40,7 +40,9 @@ struct io_wait_queue {
 	struct io_ring_ctx *ctx;
 	unsigned cq_tail;
 	unsigned nr_timeouts;
+	int hit_timeout;
 	ktime_t timeout;
+	struct hrtimer t;
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	ktime_t napi_busy_poll_dt;
 	bool napi_prefer_busy_poll;

From patchwork Fri Aug 16 20:38:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13766809
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 4/5] io_uring: add support for batch wait timeout
Date: Fri, 16 Aug 2024 14:38:15 -0600
Message-ID: <20240816204302.85938-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240816204302.85938-2-axboe@kernel.dk>
References: <20240816204302.85938-2-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Waiting for events with io_uring has two knobs that can be set:

1) The number of events to wake for
2) The timeout associated with the event

Waiting will abort when either of those conditions is met, as expected.

This adds support for a third event, which is associated with the number
of events to wait for.
Applications generally like to handle batches of completions, and right
now they'd set a number of events to wait for and the timeout for that.
If no events have been received but the timeout triggers, control is
returned to the application and it can wait again. However, if the
application doesn't have anything to do until events are reaped, then
it's possible to make this waiting more efficient.

For example, the application may have a latency time of 50 usecs and
want to handle a batch of 8 requests at a time. If it uses 50 usecs as
the timeout, then it'll be doing 20K context switches per second even
if nothing is happening.

This introduces the notion of min batch wait time. If the min batch wait
time expires, then we'll return to userspace if we have any events at
all. If none are available, the general wait time is applied. Any
request arriving after the min batch wait time will cause waiting to
stop and return control to the application.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 75 +++++++++++++++++++++++++++++++++++++++------
 io_uring/io_uring.h |  2 ++
 2 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6e53ebd58aab..27d949ff84a3 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2363,13 +2363,62 @@ static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
+/*
+ * Doing min_timeout portion. If we saw any timeouts, events, or have work,
+ * wake up. If not, and we have a normal timeout, switch to that and keep
+ * sleeping.
+ */
+static enum hrtimer_restart io_cqring_min_timer_wakeup(struct hrtimer *timer)
+{
+	struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t);
+	struct io_ring_ctx *ctx = iowq->ctx;
+
+	/* no general timeout, or shorter, we are done */
+	if (iowq->timeout == KTIME_MAX ||
+	    ktime_after(iowq->min_timeout, iowq->timeout))
+		goto out_wake;
+	/* work we may need to run, wake function will see if we need to wake */
+	if (io_has_work(ctx))
+		goto out_wake;
+	/* got events since we started waiting, min timeout is done */
+	if (iowq->cq_min_tail != READ_ONCE(ctx->rings->cq.tail))
+		goto out_wake;
+	/* if we have any events and min timeout expired, we're done */
+	if (io_cqring_events(ctx))
+		goto out_wake;
+
+	/*
+	 * If using deferred task_work running and application is waiting on
+	 * more than one request, ensure we reset it now where we are switching
+	 * to normal sleeps. Any request completion post min_wait should wake
+	 * the task and return.
+	 */
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+		atomic_set(&ctx->cq_wait_nr, 1);
+
+	iowq->t.function = io_cqring_timer_wakeup;
+	hrtimer_set_expires(timer, iowq->timeout);
+	return HRTIMER_RESTART;
+out_wake:
+	return io_cqring_timer_wakeup(timer);
+}
+
 static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
-				      clockid_t clock_id)
+				      clockid_t clock_id, ktime_t start_time)
 {
+	ktime_t timeout;
+
 	iowq->hit_timeout = 0;
 	hrtimer_init_on_stack(&iowq->t, clock_id, HRTIMER_MODE_ABS);
-	iowq->t.function = io_cqring_timer_wakeup;
-	hrtimer_set_expires_range_ns(&iowq->t, iowq->timeout, 0);
+	if (iowq->min_timeout) {
+		timeout = ktime_add_ns(iowq->min_timeout, start_time);
+		iowq->t.function = io_cqring_min_timer_wakeup;
+	} else {
+		timeout = iowq->timeout;
+		iowq->t.function = io_cqring_timer_wakeup;
+	}
+
+	hrtimer_set_expires_range_ns(&iowq->t, timeout, 0);
 	hrtimer_start_expires(&iowq->t, HRTIMER_MODE_ABS);
 
 	if (!READ_ONCE(iowq->hit_timeout))
@@ -2383,7 +2432,8 @@ static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
 }
 
 static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
-				     struct io_wait_queue *iowq)
+				     struct io_wait_queue *iowq,
+				     ktime_t start_time)
 {
 	int ret = 0;
 
@@ -2394,8 +2444,8 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (!iowq->no_iowait && current_pending_io())
 		current->in_iowait = 1;
-	if (iowq->timeout != KTIME_MAX)
-		ret = io_cqring_schedule_timeout(iowq, ctx->clockid);
+	if (iowq->timeout != KTIME_MAX || iowq->min_timeout != KTIME_MAX)
+		ret = io_cqring_schedule_timeout(iowq, ctx->clockid, start_time);
 	else
 		schedule();
 	current->in_iowait = 0;
@@ -2404,7 +2454,8 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 
 /* If this returns > 0, the caller should retry */
 static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
-					  struct io_wait_queue *iowq)
+					  struct io_wait_queue *iowq,
+					  ktime_t start_time)
 {
 	if (unlikely(READ_ONCE(ctx->check_cq)))
 		return 1;
@@ -2417,7 +2468,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	if (unlikely(io_should_wake(iowq)))
 		return 0;
 
-	return __io_cqring_wait_schedule(ctx, iowq);
+	return __io_cqring_wait_schedule(ctx, iowq, start_time);
 }
 
 struct ext_arg {
@@ -2435,6 +2486,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 {
 	struct io_wait_queue iowq;
 	struct io_rings *rings = ctx->rings;
+	ktime_t start_time;
 	int ret;
 
 	if (!io_allowed_run_tw(ctx))
@@ -2453,9 +2505,12 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	INIT_LIST_HEAD(&iowq.wq.entry);
 	iowq.ctx = ctx;
 	iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
+	iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail);
 	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
+	iowq.min_timeout = 0;
 	iowq.timeout = KTIME_MAX;
 	iowq.no_iowait = flags & IORING_ENTER_NO_IOWAIT;
+	start_time = io_get_time(ctx);
 
 	if (ext_arg->ts) {
 		struct timespec64 ts;
@@ -2465,7 +2520,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 
 		iowq.timeout = timespec64_to_ktime(ts);
 		if (!(flags & IORING_ENTER_ABS_TIMER))
-			iowq.timeout = ktime_add(iowq.timeout, io_get_time(ctx));
+			iowq.timeout = ktime_add(iowq.timeout, start_time);
 	}
 
 	if (ext_arg->sig) {
@@ -2496,7 +2551,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 							TASK_INTERRUPTIBLE);
 		}
 
-		ret = io_cqring_wait_schedule(ctx, &iowq);
+		ret = io_cqring_wait_schedule(ctx, &iowq, start_time);
 		__set_current_state(TASK_RUNNING);
 		atomic_set(&ctx->cq_wait_nr, IO_CQ_WAKE_INIT);
 
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index bac830a2d6ec..24ecd31c81e9 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -39,8 +39,10 @@ struct io_wait_queue {
 	struct wait_queue_entry wq;
 	struct io_ring_ctx *ctx;
 	unsigned cq_tail;
+	unsigned cq_min_tail;
 	unsigned nr_timeouts;
 	int hit_timeout;
+	ktime_t min_timeout;
 	ktime_t timeout;
 	struct hrtimer t;
 #ifdef CONFIG_NET_RX_BUSY_POLL

From patchwork Fri Aug 16 20:38:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13766810
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723841056; x=1724445856; h=content-transfer-encoding:mime-version:references:in-reply-to
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4PLbe43VndWrCSAbyZINoMHbTMY+BQS7gvEu5PBtTVs=; b=Kg1q14wm1ufzM5ebruujxmejwGrwUIerYr4W3yPh9K+kqvgwUu40zbIDdQKghazWY0 Yq15sX7wgwmlfCLwjASycZUnPJpFiQVLEGzjxjvdCUmDaRpJIk4Xe9Cn+avWzYmF/F4x TG7pbqLU10VJ7ImzvDDtnzd5caMEwsmtH5Tybh3YelNIzJb69fzZJ8xYCART+XX9BFJw RUlO4H17CthP58NS43eQkPzGojkdKQ1+SUVBpyZ9IAQ5NkLmIuNZjn0+e69S1JIHVPU9 8BolH6l7y6eWdqNe2JjpYwKKJWML9/bSnKTOii8jqmPfHZ7yecJSEXnu50qAslFbpYFN RREQ== X-Gm-Message-State: AOJu0YwiWFMMlnGbjJZZB1d9YZiJoH0fkgFd7DaW2S0+nOrsiUlw3Nb+ BNp7MDO9w+Q4MW05WnFjDpFAG2hkaxOZxiJ4+F/cobGO/7JVGrNX7EKlPvhl3uN3K7WDFvqc4Qe f X-Google-Smtp-Source: AGHT+IG836ZUkszCjcwwE2JcM+g3tIw4Lgj1BFbs9EfPbXHvv8JISY7Ao1dVVyDx8JMpKR1dwwGdBQ== X-Received: by 2002:a17:902:f681:b0:1fd:a0b9:2be6 with SMTP id d9443c01a7336-20206081e3emr24300765ad.2.1723841056546; Fri, 16 Aug 2024 13:44:16 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-201f038a3d7sm29190995ad.186.2024.08.16.13.44.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 16 Aug 2024 13:44:15 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 5/5] io_uring: wire up min batch wake timeout Date: Fri, 16 Aug 2024 14:38:16 -0600 Message-ID: <20240816204302.85938-7-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240816204302.85938-2-axboe@kernel.dk> References: <20240816204302.85938-2-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Expose min_wait_usec in io_uring_getevents_arg, replacing the pad member that is currently in there. The value is in usecs, which is explained in the name as well. Note that if min_wait_usec and a normal timeout is used in conjunction, the normal timeout is still relative to the base time. 
For example, if min_wait_usec is set to 100 and the normal timeout is
1000, the max total time waited is still 1000. This also means that if
the normal timeout is shorter than min_wait_usec, then only the
min_wait_usec will take effect.

See previous commit for an explanation of how this works.

IORING_FEAT_MIN_TIMEOUT is added as a feature flag for this, as
applications doing submit_and_wait_timeout() style operations will
generally not see the -EINVAL from the wait side as they return the
number of IOs submitted. Only if no IOs are submitted will the -EINVAL
bubble back up to the application.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h | 3 ++-
 io_uring/io_uring.c           | 9 +++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 3a94afa8665e..da0d472d5ec7 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -545,6 +545,7 @@ struct io_uring_params {
 #define IORING_FEAT_REG_REG_RING	(1U << 13)
 #define IORING_FEAT_RECVSEND_BUNDLE	(1U << 14)
 #define IORING_FEAT_IOWAIT_TOGGLE	(1U << 15)
+#define IORING_FEAT_MIN_TIMEOUT		(1U << 16)
 
 /*
  * io_uring_register(2) opcodes and arguments
@@ -768,7 +769,7 @@ enum io_uring_register_restriction_op {
 struct io_uring_getevents_arg {
 	__u64	sigmask;
 	__u32	sigmask_sz;
-	__u32	pad;
+	__u32	min_wait_usec;
 	__u64	ts;
 };
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 27d949ff84a3..ecebcea5cbd7 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2475,6 +2475,7 @@ struct ext_arg {
 	size_t argsz;
 	struct __kernel_timespec __user *ts;
 	const sigset_t __user *sig;
+	ktime_t min_time;
 };
 
 /*
@@ -2507,7 +2508,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
 	iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail);
 	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
-	iowq.min_timeout = 0;
+	iowq.min_timeout = ext_arg->min_time;
 	iowq.timeout = KTIME_MAX;
 	iowq.no_iowait = flags & IORING_ENTER_NO_IOWAIT;
 	start_time = io_get_time(ctx);
@@ -3232,8 +3233,7 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
 		return -EINVAL;
 	if (copy_from_user(&arg, argp, sizeof(arg)))
 		return -EFAULT;
-	if (arg.pad)
-		return -EINVAL;
+	ext_arg->min_time = arg.min_wait_usec * NSEC_PER_USEC;
 	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
 	ext_arg->argsz = arg.sigmask_sz;
 	ext_arg->ts = u64_to_user_ptr(arg.ts);
@@ -3634,7 +3634,8 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 			IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
 			IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
 			IORING_FEAT_LINKED_FILE | IORING_FEAT_REG_REG_RING |
-			IORING_FEAT_RECVSEND_BUNDLE | IORING_FEAT_IOWAIT_TOGGLE;
+			IORING_FEAT_RECVSEND_BUNDLE | IORING_FEAT_IOWAIT_TOGGLE |
+			IORING_FEAT_MIN_TIMEOUT;
 
 	if (copy_to_user(params, p, sizeof(*p))) {
 		ret = -EFAULT;
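
The userspace-visible side of the patch above can be modeled with a small
standalone sketch. It is not part of the patch and does not call into the
kernel or liburing. The struct is a hypothetical local mirror of
io_uring_getevents_arg as changed above, the SKETCH_* constants copy values
from the diff, and the helper names (has_min_timeout, min_wait_ns,
max_total_wait_usec) are made up for illustration. The last helper encodes
only what the commit message states for the case where both a minimum wait
and a normal timeout are given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the diff above, redefined locally for the sketch. */
#define SKETCH_FEAT_MIN_TIMEOUT	(1U << 16)
#define SKETCH_NSEC_PER_USEC	1000ULL

/*
 * Hypothetical local mirror of struct io_uring_getevents_arg after the
 * patch. Real code should use the definition from <linux/io_uring.h>.
 */
struct getevents_arg_sketch {
	uint64_t sigmask;
	uint32_t sigmask_sz;
	uint32_t min_wait_usec;	/* occupies the slot that used to be 'pad' */
	uint64_t ts;
};

/* Reusing the pad slot leaves the size and field offsets unchanged. */
static_assert(sizeof(struct getevents_arg_sketch) == 24, "size unchanged");
static_assert(offsetof(struct getevents_arg_sketch, min_wait_usec) == 12,
	      "min_wait_usec sits where pad was");

/* Only rely on min_wait_usec if the kernel advertises the feature flag. */
static inline int has_min_timeout(uint32_t features)
{
	return (features & SKETCH_FEAT_MIN_TIMEOUT) != 0;
}

/* Mirrors the usec to nsec conversion done in io_get_ext_arg(). */
static inline uint64_t min_wait_ns(uint32_t min_wait_usec)
{
	return (uint64_t)min_wait_usec * SKETCH_NSEC_PER_USEC;
}

/*
 * Longest total wait when both values are set, per the commit message:
 * both run from the same start time, and a normal timeout shorter than
 * the minimum wait gives way to the minimum wait.
 */
static inline uint64_t max_total_wait_usec(uint32_t min_wait_usec,
					   uint64_t timeout_usec)
{
	return timeout_usec < min_wait_usec ? min_wait_usec : timeout_usec;
}
```

With the numbers from the commit message, max_total_wait_usec(100, 1000)
evaluates to 1000: the minimum wait does not extend the normal timeout.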